Git Replacing LF With CRLF

Why does Git want to correct my line endings to CRLF, even though I want them to be in LF?

Let's look at this in several parts:

  • !eol has no function here. This sets eol to unspecified, but that's already the default, and an unspecified value of eol does not disable LF-to-CRLF translation.

  • Since you did specify text=auto, Git will check whether the contents of .gitattributes appear to be text or binary, and of course they should appear to be text.

Hence this particular entry tells Git that it should perform translations on .gitattributes.
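To make the text=auto decision concrete, here is a rough Python sketch of a text-versus-binary guess. It is only loosely modeled on the kind of heuristic Git applies (NUL bytes, lone CRs, and a high share of control characters count against "text"); the function name and the exact thresholds are assumptions for illustration, not Git's actual code.

```python
def looks_like_text(data: bytes) -> bool:
    """Rough approximation of a text-vs-binary guess; not Git's exact logic."""
    if b"\x00" in data:
        return False  # NUL bytes strongly suggest binary content
    # A CR that is not part of a CRLF pair also suggests non-text data.
    if b"\r" in data.replace(b"\r\n", b""):
        return False
    # Control characters that commonly appear in ordinary text files.
    allowed_controls = {0x08, 0x09, 0x0A, 0x0C, 0x0D, 0x1B}
    nonprintable = sum(
        1 for b in data if (b < 32 and b not in allowed_controls) or b == 127
    )
    printable = len(data) - nonprintable
    # Tolerate only a tiny fraction of odd control characters.
    return (printable >> 7) >= nonprintable


print(looks_like_text(b"* text=auto\n"))
print(looks_like_text(b"\x89PNG\r\n\x1a\n\x00\x00"))
```

A .gitattributes file is plain ASCII, so under any heuristic of this shape it is classified as text and becomes subject to line-ending conversion.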

Meanwhile, it's useful to realize that line-ending transforms are a special case of the general clean-and-smudge-filter concept. VonC's accepted answer at your third link has a nice drawing of the way the smudge filter works, but lacks one for how the clean filter works, so let's dive into this, with a bit of background.

Git-ified ("freeze-dried") vs work-tree ("rehydrated") files, and the index

Git's normal[1] atomic unit of storage is the commit. A commit holds a full snapshot of your source tree (plus the commit metadata that I won't go into here). For many good reasons, the files within a commit are kept in a compressed, frozen, read-only, and Git-only storage format. I've lately taken to calling these files freeze-dried. This helps to distinguish them from files that you actually work with / on.

Like everything inside Git's internal key-value object database, these commits and their files are all read-only. That means they're preserved forever (or as long as the commit itself continues to exist), which is great for archival, but completely useless for getting any new work done. So Git has to provide a way to "rehydrate" the files, turning them into ordinary files you can work with.

Your work-tree is where Git puts the rehydrated files. They have their ordinary form, in ordinary files under ordinary names. Every program on your computer can deal with them, and you can manipulate them as you please.

Git could stop here: you'd have your frozen committed files, and your malleable work-tree files, and Git would build new commits from the work-tree. Mercurial, which in many ways is quite similar to Git, does stop here. But Git doesn't stop here. Instead, it goes on to throw into the mix an intermediary, sitting between the current frozen commit and the work-tree. This intermediary is Git's index. Git sometimes calls this the staging area, or the cache, depending on who / which part of Git documentation is doing the calling. All three are names for the same entity, though.

The index / staging-area simply holds an extra copy of every file. The format of this extra copy is the freeze-dried, internal, Git-only storage format. Files in this format are automatically shared across all commits that have the same file, so this means that when the copy that's in the index is the same as the copy in any commit, it's actually shared with that commit.
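The sharing described above follows from content addressing: a freeze-dried file is named by a hash of its contents, so identical contents always map to the same stored object. The sketch below computes a blob ID the way Git does for the default SHA-1 object format (repositories created with the SHA-256 format would differ); the helper name blob_id and the sample contents are made up for illustration.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Object ID of a blob, assuming the default SHA-1 object format."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


lf_copy = b"line one\nline two\n"
same_as_lf = b"line one\nline two\n"
crlf_copy = b"line one\r\nline two\r\n"

# Identical bytes give the identical ID, so the object is stored once.
print(blob_id(lf_copy) == blob_id(same_as_lf))
# Different line endings are different bytes, hence a different object.
print(blob_id(lf_copy) == blob_id(crlf_copy))
```

This is also why line endings matter at freeze-dry time: an LF copy and a CRLF copy of the "same" text are two unrelated objects as far as the object database is concerned.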

This also means that git commit, which has to freeze-dry each file to store it forever, really has almost zero work to do: the files are already freeze-dried! The freeze-drying process took place earlier, when you ran git add. That's what gets Git much of its speed. It's also why Git keeps requiring that you git add all the time.[2] Note that it means that when you run git commit, Git doesn't even need to look at your work-tree. (It still does a quick half-of-git status run by default though, to create the comment text for your commit message.)


[1] I say normal here because Git also offers low-level access to simple key-value storage through what it calls blob objects. To use this, though, you must resort to using some of the so-called plumbing commands, rather than the ones that are, at least in theory, user-friendly. :-)

[2] Mercurial, which uses the work-tree as the proposed next commit, doesn't require you to keep hg add-ing your files. Once you've done the initial hg add, an hg commit scans your work-tree and commits whatever you have changed. This is much friendlier to newcomers, but it also means that in a big project, when you run hg commit, be prepared to wait.


The role of the index / staging area in line-ending transformations

Remember that the index stores freeze-dried, Git-ified copies of each file. This means that the index-to-work-tree "rehydration" step is a great place to do any transformations you want done. This is where the smudge filters in the linked answer come in: the smudge filter can modify the committed text so that the work-tree text is more useful.

Likewise, the work-tree-to-index "freeze-dry" step—the one that occurs when you run git add—is a great place to do any transformations you want done. This is where the clean filters come in: the clean filters can remove stuff that shouldn't go into the actual commit in the repository.

Line ending transformations, in Git, are just special cases of clean and smudge filters. A freeze-dried, in-repository file can have any line endings you like.[3] When we have Git copy that file from the index / staging area, to the work-tree, during a git checkout, we can have Git change those line endings from LF-only to CRLF, for instance. When we have Git copy that file from the work-tree, to the index / staging area, we can have Git change those line endings from CRLF to LF-only.

And that's the default for CRLF transformations for a text file. Those transformations will change LF-only freeze-dried files to CRLF rehydrated files, and will change CRLF rehydrated files to LF-only freeze-dried files.
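As a minimal model of those two directions, the following Python sketch treats the clean step as CRLF-to-LF and the smudge step as LF-to-CRLF. The function names are invented for illustration, and real Git applies extra checks (for instance, whether the content looks like text) that are left out here.

```python
def clean_to_lf(worktree: bytes) -> bytes:
    """Work-tree -> index ("freeze-dry"): store LF-only line endings."""
    return worktree.replace(b"\r\n", b"\n")


def smudge_to_crlf(stored: bytes) -> bytes:
    """Index -> work-tree ("rehydrate"): produce CRLF line endings."""
    # Normalize first so an existing CRLF does not become CR CR LF.
    return stored.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


stored = clean_to_lf(b"alpha\r\nbeta\r\n")
print(stored)
print(smudge_to_crlf(stored))
```

A file that is consistently CRLF in the work-tree survives the round trip unchanged; a file that is LF-only in the work-tree does not, which is the situation the warning discussed next is about.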

You are supposed to get a warning whenever Git can detect that this might do something different from what is already being done. So, suppose that the .gitattributes file in your work-tree right now has LF-only line endings. Suppose further that the freeze-dried copy in the commit and/or in the index/staging-area also has LF-only line endings. And suppose the directives say that index -> work-tree should change LF-only to CRLF: why, then, something's hinky, and Git should warn.

I have found that these warnings are sometimes a little trigger-happy. I can't pin that to specific cases in specific Git versions, because I myself do my best to never, ever let Git fiddle with my data. I want the work-tree copy to match the freeze-dried copy, all the way through, every time, because I avoid OSes that require silly line-ending special-ness. But the above is the general rule, and the warning you are getting now makes sense: the actual freeze-dried files and the work-tree files all have LF-only line endings right now, but your settings tell Git that text from .gitattributes should have been converted to have CRLF line endings in your work-tree.
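The general rule can be sketched as a round-trip check: freeze-dry the current work-tree bytes, rehydrate them under the configured settings, and warn if the result differs from what is on disk now. The Python below is an illustrative model only; the function name, the checkout_eol parameter, and the returned strings are assumptions, and Git's real check has more cases.

```python
from typing import Optional


def round_trip_warning(worktree: bytes, checkout_eol: str) -> Optional[str]:
    """Return a warning label if add + checkout would alter the work-tree file.

    checkout_eol is "crlf" or "lf" (a hypothetical stand-in for the settings).
    """
    stored = worktree.replace(b"\r\n", b"\n")  # what the index would hold
    if checkout_eol == "crlf":
        next_checkout = stored.replace(b"\n", b"\r\n")
    else:
        next_checkout = stored
    if next_checkout == worktree:
        return None  # round trip is lossless, nothing to warn about
    if checkout_eol == "crlf":
        return "LF will be replaced by CRLF"
    return "CRLF will be replaced by LF"


print(round_trip_warning(b"* text=auto\n", "crlf"))
print(round_trip_warning(b"* text=auto\r\n", "crlf"))
```

The first call models the case in the question: the file is LF-only on disk, yet the settings say the work-tree copy should be CRLF, so the round trip would change it.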


[3] And Linus Torvalds demands that you shall like LF-only line endings. :-) Kidding aside, Git sort of prefers this. If you disable all transformations (by not enabling CRLF at all, or by marking all files as -text), Git will store, permanently, whatever line ending you say. If you then change your mind, you are stuck with the line endings you already froze because nothing in any commit can ever be changed. If those commits are wrong, the only thing you can do is stop using them. You can make new, improved, corrected ones and use those instead.

I think it's these "frozen committed copy is wrong because it has CRLF endings" cases that usually trigger bogus CRLF line ending warning issues. Since I don't actually use the line-ending-transforming code myself, it's hard to be sure about that.

How do I force git to use LF instead of CR+LF under windows?

The OP added in his question:

the files checked out using msysgit are using CR+LF and I want to force msysgit to get them with LF

A first simple step would still be in a .gitattributes file:

# 2010
*.txt -crlf

# 2020
*.txt text eol=lf

(as noted in the comments by grandchild, referring to .gitattributes End-of-line conversion), to avoid any CRLF conversion for files with correct eol.

And I have always recommended git config --global core.autocrlf false to disable any conversion (which would apply to all versioned files)

See Best practices for cross platform git config?

Since Git 2.16 (Q1 2018), you can use git add --renormalize . to apply those .gitattributes settings immediately.


But a second, more powerful step involves a gitattributes filter driver with a smudge step.


Whenever you would update your working tree, a script could, only for the files you have specified in the .gitattributes, force the LF eol and any other formatting option you want to enforce.

If the "clean" script doesn't do anything, you will have (after commit) transformed your files, applying exactly the format you need them to follow.
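As a sketch of what such a smudge script could look like: a Git filter command receives the file content on standard input and writes the transformed content to standard output. The Python below keeps the stream handling in a function so it can be exercised directly; the names force_lf and run_filter, and the idea of registering it under a filter called "forcelf", are hypothetical.

```python
import io
from typing import BinaryIO


def force_lf(data: bytes) -> bytes:
    """Rewrite CRLF line endings as LF; other bytes are left untouched."""
    return data.replace(b"\r\n", b"\n")


def run_filter(source: BinaryIO, sink: BinaryIO) -> None:
    """Read the whole input, apply the transformation, write the result."""
    sink.write(force_lf(source.read()))


# As a real filter script this would be called as:
#     run_filter(sys.stdin.buffer, sys.stdout.buffer)
# Here it is exercised with in-memory streams instead.
demo_out = io.BytesIO()
run_filter(io.BytesIO(b"first\r\nsecond\r\n"), demo_out)
print(demo_out.getvalue())  # b'first\nsecond\n'
```

To wire it up, the script would be named in the filter.forcelf.smudge configuration key and the chosen paths would carry filter=forcelf in .gitattributes; both names here are placeholders.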

LF will be replaced by CRLF in git - What is that and is it important?

In Unix systems the end of a line is represented with a line feed (LF). In Windows the end of a line is represented with a carriage return (CR) followed by a line feed (LF), hence CRLF. When you get code from Git that was uploaded from a Unix system, the lines will end with only an LF.
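At the byte level, LF is 0x0A and CR is 0x0D, so the two conventions are easy to tell apart. The following Python sketch classifies a buffer by the line endings it contains; the function name and the category labels are made up for this example.

```python
def detect_eol(data: bytes) -> str:
    """Classify line endings as "lf", "crlf", "mixed", or "none"."""
    crlf = data.count(b"\r\n")
    lone_lf = data.count(b"\n") - crlf  # LFs not preceded by a CR
    if crlf and lone_lf:
        return "mixed"
    if crlf:
        return "crlf"
    if lone_lf:
        return "lf"
    return "none"


print(detect_eol(b"unix line\n"))
print(detect_eol(b"windows line\r\n"))
print(detect_eol(b"one\r\ntwo\n"))
```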

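If you are a single developer working on a Windows machine and you don't mind Git automatically converting LF to CRLF, you can opt in to that conversion by typing the following on the Git command line. Note that this setting enables the conversion rather than silencing the message; the warning itself is governed by core.safecrlf.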

git config core.autocrlf true

If you want to make an intelligent decision how git should handle this, read the documentation

Here is a snippet

Formatting and Whitespace

Formatting and whitespace issues are some of the more frustrating and
subtle problems that many developers encounter when collaborating,
especially cross-platform. It’s very easy for patches or other
collaborated work to introduce subtle whitespace changes because
editors silently introduce them, and if your files ever touch a
Windows system, their line endings might be replaced. Git has a few
configuration options to help with these issues.

core.autocrlf

If you’re programming on Windows and working with people who are not
(or vice-versa), you’ll probably run into line-ending issues at some
point. This is because Windows uses both a carriage-return character
and a linefeed character for newlines in its files, whereas Mac and
Linux systems use only the linefeed character. This is a subtle but
incredibly annoying fact of cross-platform work; many editors on
Windows silently replace existing LF-style line endings with CRLF, or
insert both line-ending characters when the user hits the enter key.

Git can handle this by auto-converting CRLF line endings into LF when
you add a file to the index, and vice versa when it checks out code
onto your filesystem. You can turn on this functionality with the
core.autocrlf setting. If you’re on a Windows machine, set it to true
– this converts LF endings into CRLF when you check out code:

$ git config --global core.autocrlf true

If you’re on a Linux or Mac system that uses LF line endings, then you
don’t want Git to automatically convert them when you check out files;
however, if a file with CRLF endings accidentally gets introduced,
then you may want Git to fix it. You can tell Git to convert CRLF to
LF on commit but not the other way around by setting core.autocrlf to
input:

$ git config --global core.autocrlf input

This setup should leave you with CRLF endings in Windows checkouts,
but LF endings on Mac and Linux systems and in the repository.

If you’re a Windows programmer doing a Windows-only project, then you
can turn off this functionality, recording the carriage returns in the
repository by setting the config value to false:

$ git config --global core.autocrlf false
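The three settings quoted above can be summarized in a small Python model of what happens when a file is added and when it is checked out. This is a simplification under stated assumptions: the function names are invented, every file is treated as text, and Git's additional safeguards (such as leaving alone content that already contains CRLF in the repository) are ignored.

```python
def on_add(data: bytes, autocrlf: str) -> bytes:
    """Work-tree -> index for autocrlf values "true", "input", or "false"."""
    if autocrlf in ("true", "input"):
        return data.replace(b"\r\n", b"\n")  # CRLF becomes LF when adding
    return data  # "false": bytes are recorded exactly as they are


def on_checkout(data: bytes, autocrlf: str) -> bytes:
    """Index -> work-tree for autocrlf values "true", "input", or "false"."""
    if autocrlf == "true":
        # Only "true" converts on the way out; normalize first to avoid CR CR LF.
        return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return data  # "input" and "false" leave the stored bytes alone


for setting in ("true", "input", "false"):
    stored = on_add(b"x\r\ny\r\n", setting)
    print(setting, stored, on_checkout(stored, setting))
```

Read row by row, the output shows "true" storing LF and checking out CRLF, "input" storing and checking out LF, and "false" recording the carriage returns unchanged.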

Understanding the warning LF will be replaced by CRLF

I guess you'd say that internally git prefers LF. To say it "uses LF internally" isn't quite accurate, because you can configure it not to. (For example, if everyone on your team uses Windows so you all turn eol normalization off, then git will have the CRLF eol markers.)

What the warning is saying is, you're checking this in with linefeed normalization on, but it already has LF. If you check it back out with autocrlf in effect, you'll get CRLF endings. BUT since you have a local copy in your work tree, it will stay as it is.

So "where" would typically be any new repo using autocrlf=true, or any colleague's repo, if checking out from Git creates the file for them locally.

Working on Windows, but getting LF will be replaced by CRLF when committing in Git

Git has two places that line feeds can be controlled:

  • In the global config settings on your system
  • In the .gitattributes file that applies per repo/project. These settings will override the user's configuration settings.

In your Git settings you have core.autocrlf=true, which tells Git to convert line endings to CRLF on checkout. You can change this to see if Git stops trying to change the line endings.

git config --global core.autocrlf input

A better approach may be to set the proper line endings in the .gitattributes file. This file is committed at the root of the repository and overrides each user's individual settings, which ensures that all users committing to the repo will have the proper line endings. Because it sounds like you are working on a *nix-based project, it would probably be prudent to set the line endings to line feed. In the .gitattributes file you could have something like this.

# Set all files to have LF line endings
* text eol=lf
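Before or after adding such a rule, it can help to see which work-tree files still contain CRLF. The Python sketch below walks a directory and lists them; it is a hypothetical helper, not part of Git, and it does not try to tell text from binary files, so binary files that happen to contain the byte pair will also be listed.

```python
import os
import tempfile
from typing import List


def files_with_crlf(root: str) -> List[str]:
    """Return paths (relative to root) of files containing a CRLF pair."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                if b"\r\n" in handle.read():
                    hits.append(os.path.relpath(path, root))
    return sorted(hits)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_root:
    with open(os.path.join(demo_root, "unix.txt"), "wb") as handle:
        handle.write(b"a\nb\n")
    with open(os.path.join(demo_root, "dos.txt"), "wb") as handle:
        handle.write(b"a\r\nb\r\n")
    demo_hits = files_with_crlf(demo_root)

print(demo_hits)  # ['dos.txt']
```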

This link has a more detailed explanation of the options you can set in the file: https://help.github.com/articles/dealing-with-line-endings/

EDIT 1: For reference, the Git documentation describes core.autocrlf here: https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration


