Git Forces Refresh Index After Switching Between Windows and Linux

You are completely correct here:

  • The thing you're using here, which Git variously calls the index, the staging area, or the cache, does in fact contain cache data.

  • The cache data that it contains is the result of system calls.

  • The system call data returned by a Linux system is different from the system call data returned by a Windows system.

Hence, an OS switch completely invalidates all the cache data.
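If you want to look at that cached system-call data yourself, one option is git ls-files --debug, which prints the stat fields Git saved for each index entry. The sketch below builds a throwaway repository; the file name notes.txt is made up for illustration. Git's documentation notes that the exact output format may change, so treat it as something to read, not to parse.

```shell
# Throwaway repository; notes.txt is a hypothetical file name.
repo=$(mktemp -d) && cd "$repo" && git init -q
printf 'one\ntwo\n' > notes.txt
git add notes.txt

# Show the stat data (ctime, mtime, dev, ino, uid, gid, size) cached in the index.
stat_info=$(git ls-files --debug notes.txt)
printf '%s\n' "$stat_info"
```

Running the same command on the same work-tree from the other OS is a quick way to see which of these fields differ.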

... how can I use set the index file for different system?

Your best bet here is not to do this at all. Make two different work-trees, or perhaps even two different repositories. But, if that's more painful than this other alternative, try out these ideas:

The actual index file that Git uses merely defaults to .git/index. You can specify a different file by setting GIT_INDEX_FILE to some other (relative or absolute) path. So you could have .git/index-linux and .git/index-windows, and set GIT_INDEX_FILE based on whichever OS you're using.
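As a rough sketch of that idea, the snippet below picks an index file name from the output of uname -s and exports GIT_INDEX_FILE. The helper name index_name_for and the uname patterns are my own assumptions; check what uname -s prints in your Windows shell before relying on them.

```shell
# Map an OS name (as printed by `uname -s`) to a per-OS index file name.
# index_name_for is a hypothetical helper, not part of Git.
index_name_for() {
  case "$1" in
    Linux*)               echo index-linux ;;
    MINGW*|MSYS*|CYGWIN*) echo index-windows ;;
    *)                    echo index ;;
  esac
}

# Only export the variable when we are inside a repository.
if git_dir=$(git rev-parse --absolute-git-dir 2>/dev/null); then
  GIT_INDEX_FILE="$git_dir/$(index_name_for "$(uname -s)")"
  export GIT_INDEX_FILE
fi
```

Something like this could go in a shell startup file. Note that a per-OS index file that does not exist yet acts as an empty index, so the first use on each OS probably needs a git reset to fill it in from HEAD.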

Some Git commands use a temporary index. They do this by setting GIT_INDEX_FILE themselves. If they un-set it afterward, they may accidentally use .git/index at this point. So another option is to rename .git/index out of the way when switching OSes. Keep a .git/index-windows and .git/index-linux as before, but rename whichever one is in use to .git/index while it's in use, then rename it to .git/index-name before switching to the other system.

Again, I don't recommend attempting either of these methods, but they are likely to work, more or less.

How to change the line ending in a Git repo dynamically

But my files are still having the CRLF line ending.

If a file inside a commit inside the repository has CRLF line endings, that version of that file is stuck that way forever. No part of any existing commit can ever be changed.

If a file inside a commit inside the repository has LF-only line endings, that version of that file is stuck that way forever. However, you can choose the ending you want to have Git place in your work-tree when you extract that file.

If you already extracted the file, Git has already done the conversion. Git now thinks everything is fine, even if you just now changed the conversion setting.

Thus, if you change the conversion setting, you must force Git to re-extract the file. The easiest way to do this consistently in all versions of Git is to remove the file from your work-tree, then run git checkout -- path/to/file. Because the file is gone from the work-tree, Git will be forced to extract it again. The updated EOL-conversion will be applied this time.
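Here is a minimal sketch of that remove-and-re-extract sequence in a throwaway repository. The file name notes.txt and the *.txt pattern are assumptions chosen for the example; substitute your own paths and settings.

```shell
# Throwaway repository; notes.txt is a hypothetical file name.
repo=$(mktemp -d) && cd "$repo" && git init -q
printf 'one\ntwo\n' > notes.txt          # LF-only content
git add notes.txt                        # index copy now has LF endings

# Change the conversion setting after the file has already been extracted.
echo '*.txt text eol=crlf' > .gitattributes

# Remove the work-tree copy so Git has to extract it again.
rm notes.txt
git checkout -- notes.txt

# Count the lines that now end with a carriage return.
crlf_lines=$(grep -c "$(printf '\r')\$" notes.txt)
echo "lines ending in CRLF: $crlf_lines"
```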

(Another way to do it is to alter the file, then run the same git checkout, or in Git 2.23 or later, to use git restore. By telling Git that Git should discard your version of the file, and Git seeing that your version of the file is indeed "wrong" in that it doesn't match the index copy because you changed it, Git will be forced to re-extract the index copy.)

That may suffice for your case, or may not. If it does not, read on.

What to know about Git's end-of-line conversions

I'm a firm believer in the "never use Windows at all so that you never need to have your version control system muck with line endings" philosophy myself, but there are a few things to know if you are in some other camp and do want Git to muck with line endings. The most important of these is this: What you store in Git, and what you use when you work with files you got out of Git, are not necessarily the same thing.

To see how this works, remember that Git stores commits rather than directly storing files. The files inside those commits come from Git's index, not from your work-tree. The format of an index-copy of a file is the same as the internal format that Git uses for frozen-for-all-time commits: the data are pre-compressed. So the copy of each file that's in the index is already significantly different from the copy you use in your work-tree, in that the one in your work-tree is not a Git blob object, and generally not zlib-compressed.

Git reads commits into the index before copying them out to your work-tree. Running git add on a file compresses and blob-ifies the file in order to store it in Git's index. Right at this point of conversion, while Git is compressing and Git-ifying a file (git add) or de-Git-ifying and decompressing a file (git checkout-index or equivalent), it's trivial for Git to insert additional conversion operations.

Git therefore does its thing at this point. The things that Git can do—the only things built in directly—are that, on the way out of the index, Git can replace \n-only line endings with \r\n line endings, and on the way into the index, Git can replace \r\n line endings with \n-only line endings.

In other words, you can arrange for Git to throw away some carriage returns before storing a file, and to add some carriage returns when extracting a file. If you do both of these, you get CRLF line endings in your work-tree and newline-only line endings in the commits.

You can, if you like, have Git do only one of these: in particular, with core.autocrlf set to input (or the older crlf=input attribute, which Git now treats as eol=lf), you can tell Git: do just one conversion, on the work-tree-to-index copy operation.

If you choose to have Git do conversions when extracting files, the only conversion available here is to turn LF-only into CRLF. You cannot turn CRLF endings into LF-only endings. If the in-Git committed file has CRLF endings, the in-work-tree extracted file will have CRLF endings.

Again, each of these conversions happens in just one direction:

  • index → work-tree: optionally, replace \n with \r\n
  • work-tree → index: optionally, replace \r\n with \n
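To see the two directions in action, this sketch compares the size of a work-tree file with the size of its index copy. It assumes a throwaway repository, a made-up file name (notes.txt), and an eol=crlf attribute picked just for the demonstration.

```shell
# Throwaway repository; notes.txt is a hypothetical file name.
repo=$(mktemp -d) && cd "$repo" && git init -q
echo '*.txt text eol=crlf' > .gitattributes
printf 'one\r\ntwo\r\n' > notes.txt      # CRLF endings in the work-tree
git add .gitattributes notes.txt         # work-tree -> index: \r\n becomes \n

# Compare byte counts: the work-tree copy keeps its two carriage returns,
# while the index copy (named with the :path syntax) has lost them.
worktree_bytes=$(wc -c < notes.txt | tr -d ' ')
index_bytes=$(git cat-file -p :notes.txt | wc -c | tr -d ' ')
echo "work-tree: $worktree_bytes bytes, index: $index_bytes bytes"
```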

The things you choose with core.autocrlf or .gitattributes directives are:

  • text, -text, and/or core.autocrlf: which files
  • eol=... and/or core.eol: get which treatment(s)
  • crlf=input: on which operation(s)

Once a file has been treated and converted—by copying it to or from the index—Git marks the index's copy as "matches the work-tree's copy" by grabbing key data from the OS: the file's size and other lstat system call values. The precise details here vary because different OSes store different data with different granularity.

The easy way to force a new conversion is to remove one or the other copy of the file: rm file or git rm --cached file destroys the work-tree or index copy respectively, so now a git checkout -- file or git add file will make a new one.

When you run git commit, whatever bytes are in the index copy of the file go into the new commit that Git makes. This new commit is now frozen for all time: the bytes that were in the index are now in the commit, forever (or for as long as the commit itself continues to exist). Nothing and no one can change them.

Consequences of the above

What the above means is that if you do plan to have your version control system (i.e., Git) muck about with line endings, the line endings you can, and thus probably should, always use for every index copy, and therefore every committed copy, of every text file are LF-only line endings. These can always be converted to CRLF endings in a work-tree file, through an appropriate .gitattributes setting or core.* settings. If you've done such a conversion, that work-tree file can be converted back to LF-only line endings on git add operations.

If you ever do commit a file with CRLF line endings, that commit is stuck that way for all time, and extracting that commit will give you a work-tree copy that has CRLF line endings, every time, because Git has no built-in index → work-tree operation that will change this. The only built-in CRLF-to-LF operation that Git has works in the other direction, index ← work-tree.

If you'd like to make a new and improved commit in which the committed copy of that file has LF-only line endings, you have these two options:

  1. make sure your index ← work-tree settings do that, then force Git to add the file (e.g., change it in the work-tree or use git rm --cached on the index copy, and git add it); or
  2. use any command that changes the work-tree copy to have LF-only line endings, e.g., run dos2unix on it or similar, then git add it.

The advantage to method 2 is that you can see the effect immediately (in your work-tree file) and it's hard to get it wrong. The problem with method 1 is that you can't see it, and it's easy to get it totally wrong: e.g., you might accidentally use git rm instead of git rm --cached, which deletes both the index and work-tree copies.
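A sketch of method 2 in a throwaway repository follows. It uses tr -d '\r' as a stand-in for dos2unix (which may not be installed); unlike dos2unix, tr removes every carriage return, not only those before a newline. The file name and commit messages are invented for the example.

```shell
# Throwaway repository; notes.txt and the identity are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"
git config core.autocrlf false           # keep Git from converting anything here
printf 'one\r\ntwo\r\n' > notes.txt
git add notes.txt && git commit -q -m "commit notes.txt with CRLF endings"

# Method 2: fix the work-tree copy yourself, then add and commit it.
tr -d '\r' < notes.txt > notes.tmp && mv notes.tmp notes.txt
git add notes.txt && git commit -q -m "convert notes.txt to LF-only endings"

# The old commit still holds the CRLF bytes; the new one holds LF-only bytes.
old_size=$(git cat-file -s HEAD~1:notes.txt)
new_size=$(git cat-file -s HEAD:notes.txt)
echo "old blob: $old_size bytes, new blob: $new_size bytes"
```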

Why error switching branches after squashing? Nothing to commit

TL;DR

Follow Git's advice: for each file that it names, move that file somewhere else (out of the way, perhaps out of the project entirely), or commit it. Then do the checkout, and see what file(s) you got that replaced those files, and decide whether to keep the replacements, or to use the saved copies you made before the checkout.

Be careful with git update-index --assume-unchanged or git update-index --skip-worktree: these work well for some cases, but set you up for this particular trap.

Since you are on Windows, which defaults to conflating files named (e.g.) readme with other different files named README—Windows can't store both; it just clobbers one of them—be careful with case-sensitive file names, usually made by some Linux programmer. :-)

Long

It's my understanding that the error occurs when you have local changes ...

That's not really right. You get that error when the operation—in this case, git checkout—would overwrite some state.

Git isn't about changes at all. Git is mostly about commits, and commits save state—a snapshot of all of your files, along with your metadata: your name and email address, the time you made the commit, and your log message as to why you made the commit, for instance. (Included in this metadata is another critical item, the parent commit hash ID, but we can ignore that for this particular problem.)

The difference between state and changes is like talking about the weather: saying it's warmer today than yesterday tells you one thing, but not everything, about the temperature. Saying that it was 15˚C (59˚F) yesterday, and is 20 / 68 today, tells you everything about the temperature. (Well, about this one temperature, anyway.) Note that it took two states to come up with the change: we have to subtract yesterday's temperature from today's to see how much warmer or colder it might be.

Anyway, commits store state: a full, complete copy of every file that was committed, as of the time it was committed. This copy actually comes out of Git's index, but we get to ignore that fine distinction for the moment. It's about to crop up in a moment, though. So each commit is very much independent of every other commit.

Your work-tree, on the other hand, is not something Git saves (at all, really, because of the index). You use it to work on your files, because the committed copies are in a special, frozen, compressed (sometimes very compressed), Git-only format. To make these useful, Git needs to expand them out into ordinary-format files, that you can use and change if you like. Those expanded copies go in your work-tree.

Now, a thing about the work-tree is that it's allowed to contain files that you won't commit. These are what Git calls untracked files. Normally, if there is a file in your work-tree that's untracked (that won't be committed), Git will complain about that file. You can make Git shut up about it by listing the untracked file in .gitignore, but this is trickier than it looks. This is where Git smacks you in the face with the existence of the index, again.

The index is a weird and wonderful, but also obnoxious, thing that is pretty much unique to Git. In between the commits, which store files in a frozen Git-only compressed form, and the work-tree, which lets you work on your files, Git puts a third copy of every file. The index copy of each file is in the special Git-only format, but instead of being frozen, it's merely ready to freeze: kind of slushy, if you will. The point is that you can change this copy, and that's what git add does: it copies a file from the work-tree, into the index.

It's actually the presence of the index copy that determines whether or not a file is tracked. If the file is in the index, it's tracked; if not, it's untracked. Listing a file in .gitignore means: if it's not in the index, and is in the work-tree, don't complain. But it has a second side effect, which is: it gives Git permission to destroy the file, in some cases.
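You can watch a file move between untracked and tracked by asking git ls-files, which lists what is in the index. This is only a sketch in a throwaway repository, with a made-up file name (kept.txt).

```shell
# Throwaway repository; kept.txt is a hypothetical file name.
repo=$(mktemp -d) && cd "$repo" && git init -q
echo data > kept.txt

tracked_before=$(git ls-files kept.txt)      # empty: not in the index, so untracked
git add kept.txt
tracked_after_add=$(git ls-files kept.txt)   # "kept.txt": in the index, so tracked
git rm -q --cached kept.txt                  # remove the index copy only
tracked_after_rm=$(git ls-files kept.txt)    # empty again: untracked once more

ls kept.txt                                  # the work-tree copy is still there
```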

Filename case issues

Linux programmers happily write and commit two different files, one named README and another named readme or Readme. Or they do the same with header files: ip.h and IP.h (in older Linux kernel trees). When someone using a Mac or a Windows box tries to work with these commits, they get bitten by the fact that the work-tree on these systems can't put both files into place. (Git's index handles it just fine, because the index is actually a file, .git/index.)
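One way to spot such names before they bite is to lowercase every path and look for duplicates. The helper below, case_collisions, is a hypothetical name of my own; it reads paths on standard input, and the sample input just reuses the names mentioned above.

```shell
# Read path names on stdin and print the lowercased form of every name that
# occurs more than once when case is ignored.
case_collisions() {
  tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Sample input using the file names from the text.
collisions=$(printf '%s\n' README readme ip.h IP.h Makefile | case_collisions)
printf '%s\n' "$collisions"
```

In a real repository you could feed it the paths of a commit, for instance with git ls-tree -r --name-only HEAD | case_collisions.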

If you are switching from a commit that has a file named README to one that has Readme, or that has both, Git will sometimes get a little discombobulated by this, and not know what to do. (Git needs to be smarter about this, someday.)

assume-unchanged and skip-worktree

In any case, suppose a file is in the index. If you change the work-tree copy, Git will tell you that you have a modified tracked file. If you don't want Git to keep reminding you about this, you can use git update-index --assume-unchanged or git update-index --skip-worktree to mark that file specially.

When you do this—and I think you probably did—Git stops comparing the index copy of the file to the work-tree copy, for git status commands, and does not copy the work-tree copy of the file over top of the index copy, for git add commands. This means that you can take a configuration file, modify it for some reason, and yet have new commits—which use the index copy of the file—store the original version of the file, the one that came out of the commit and went into your work-tree before you set the assume-unchanged or skip-worktree bit.
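If you are not sure whether you set one of these bits, git ls-files -v marks skip-worktree entries with an S and assume-unchanged entries with a lowercase letter. The sketch below sets and clears the skip-worktree bit in a throwaway repository; the file name app.conf and its contents are invented for the example.

```shell
# Throwaway repository; app.conf and the identity are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo 'debug=false' > app.conf
git add app.conf && git commit -q -m "add app.conf"

git update-index --skip-worktree app.conf
echo 'debug=true' > app.conf                    # local edit that Git now ignores

flagged=$(git ls-files -v app.conf)             # "S app.conf" while the bit is set
status_while_flagged=$(git status --porcelain)  # nothing reported

git update-index --no-skip-worktree app.conf    # clear the bit again
status_after_clearing=$(git status --porcelain) # the edit shows up as modified
```

The matching commands for the other bit are git update-index --assume-unchanged and --no-assume-unchanged.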

But git checkout must, when it goes to switch to some other commit, replace the index copy of that file with the (different) committed copy in the commit you're going to switch to. When this happens, Git will not only update the index copy, it will also overwrite the work-tree copy. So if you have a file marked with either of these two bits, you can get that error when you use git checkout.

Is this a problem? Maybe so, maybe not. If you force git checkout to check out that other commit, Git will overwrite the index entry with the file from the other commit, and replace the work-tree copy of that file with the one from the other commit. It's up to you to decide whether this is OK, and if not, whether you need to move the file out of the way first, or clear those bits and go ahead and add-and-commit the file.

There are also corner cases with half-ignored files

Suppose, on the other hand, you didn't set those index bits with git update-index. You could still have an ordinary, untracked file, perhaps even one listed in a .gitignore to keep Git quiet about it. But some other commit might have (a different version of) that same file, and if you have Git switch to that commit, Git will have to replace your untracked work-tree file with the version out of the commit where it's a tracked file.

In this case, git checkout will sometimes—but not always—also complain. Usually it will say that the checkout would overwrite an untracked file. If the file is listed in .gitignore, this will give some parts of Git permission to clobber it. Fortunately git checkout is usually pretty careful about these things.
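The untracked-file case, together with the move-it-out-of-the-way advice from the TL;DR, can be reproduced in a throwaway repository. The branch name feature, the file name notes.txt, and the save location are all assumptions made for this sketch.

```shell
# Throwaway repository; names and identity are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"
git commit -q --allow-empty -m "initial commit"
base=$(git symbolic-ref --short HEAD)        # whatever the default branch is called

git checkout -q -b feature
echo 'from feature' > notes.txt
git add notes.txt && git commit -q -m "track notes.txt"
git checkout -q "$base"                      # notes.txt leaves the work-tree

echo 'my local notes' > notes.txt            # untracked on this branch
if git checkout -q feature 2>/dev/null; then
  first_try=switched
else
  first_try=refused                          # Git will not clobber the untracked file
fi

saved="$repo.notes.saved"                    # a spot outside the work-tree
mv notes.txt "$saved"
git checkout -q feature                      # now the checkout goes through
```

Afterwards you can compare notes.txt with the saved copy and decide which one to keep.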

Unstaged changes left after git reset --hard

Okay, I've kind of solved the problem.

It seemed that the .gitattributes file, containing:

*.sln        eol=crlf
*.vcproj eol=crlf
*.vcxproj* eol=crlf

made the project files appear unstaged. I am clueless why that is, and I'm really hoping that someone privy to the ways of git will give us a nice explanation.

My fix was to remove these files, and to add autocrlf = false under [core] in .git/config.

This does not amount to exactly the same thing as the previous configuration, as it requires every dev to have autocrlf = false. I'd like to find a better fix.

EDIT:

I commented the incriminating lines, uncommented them and it worked. What the ... I don't even ... !

Force LF eol in git repo and working copy

Without a bit of information about what files are in your repository (pure source code, images, executables, ...), it's a bit hard to answer the question :)

Besides this, I'll assume that you want to default to LF line endings in your working directory because you want to make sure that text files have LF line endings in your .git repository whether you work on Windows or Linux. Indeed, better safe than sorry...

However, there's a better alternative: Benefit from LF line endings in your Linux workdir, CRLF line endings in your Windows workdir AND LF line endings in your repository.

As you're partially working on Linux and Windows, make sure core.eol is set to native and core.autocrlf is set to true.

Then, replace the content of your .gitattributes file with the following

* text=auto

This will let Git handle the automagic line endings conversion for you, on commits and checkouts. Binary files won't be altered, files detected as being text files will see the line endings converted on the fly.

However, as you know the content of your repository, you may give Git a hand and help it tell text files from binary files.

Provided you work on a C based image processing project, replace the content of your .gitattributes file with the following

* text=auto
*.txt text
*.c text
*.h text
*.jpg binary

This will make sure files whose extension is c, h, or txt will be stored with LF line endings in your repo and will have native line endings in the working directory. Jpeg files won't be touched. All of the others will benefit from the same automagic filtering as seen above.
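To double check which rule wins for a given path, you can ask git check-attr. The sketch below writes the same .gitattributes content into a throwaway repository; the path names main.c, photo.jpg, and Makefile are made up and do not need to exist.

```shell
# Throwaway repository with the .gitattributes content shown above.
repo=$(mktemp -d) && cd "$repo" && git init -q
cat > .gitattributes <<'EOF'
* text=auto
*.txt text
*.c text
*.h text
*.jpg binary
EOF

# check-attr only matches patterns, so the files themselves are not needed.
c_attr=$(git check-attr text -- main.c)         # explicitly text
jpg_attr=$(git check-attr text -- photo.jpg)    # binary unsets text
other_attr=$(git check-attr text -- Makefile)   # falls back to auto-detection
printf '%s\n' "$c_attr" "$jpg_attr" "$other_attr"
```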

To get a deeper understanding of the inner details of all this, I'd suggest you dive into this very good post, "Mind the end of your line", from Tim Clem, a Githubber.

As a real world example, you can also peek at this commit where those changes to a .gitattributes file are demonstrated.

UPDATE to the answer considering the following comment

I actually don't want CRLF in my Windows directories, because my Linux environment is actually a VirtualBox sharing the Windows directory

Makes sense. Thanks for the clarification. In this specific context, the .gitattributes file by itself won't be enough.

Run the following commands against your repository

$ git config core.eol lf
$ git config core.autocrlf input

As your repository is shared between your Linux and Windows environments, this will update the local config file for both environments. core.eol will make sure text files bear LF line endings on checkouts. core.autocrlf will ensure potential CRLF in text files (resulting from a copy/paste operation for instance) will be converted to LF in your repository.

Optionally, you can help Git distinguish what is a text file by creating a .gitattributes file containing something similar to the following:

# Autodetect text files
* text=auto

# ...Unless the name matches the following
# overriding patterns

# Definitively text files
*.txt text
*.c text
*.h text

# Ensure those won't be messed up with
*.jpg binary
*.data binary

If you decided to create a .gitattributes file, commit it.

Lastly, ensure git status mentions "nothing to commit (working directory clean)", then perform the following operation

$ git checkout-index --force --all

This will recreate your files in your working directory, taking into account your config changes and the .gitattributes file and replacing any potential overlooked CRLF in your text files.

Once this is done, every text file in your working directory WILL bear LF line endings and git status should still consider the workdir as clean.
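If you'd like to verify the result, git ls-files --eol (available in reasonably recent Git versions) reports the line endings of the index copy (i/...) and the working directory copy (w/...) of each file. The sketch below replays the steps in a throwaway repository, with a made-up file name and a deliberately planted CRLF copy standing in for an overlooked one.

```shell
# Throwaway repository; notes.txt is a hypothetical file name.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config core.eol lf
git config core.autocrlf input
printf 'one\ntwo\n' > notes.txt
git add notes.txt                            # index copy has LF endings

# Simulate an overlooked CRLF copy in the working directory.
printf 'one\r\ntwo\r\n' > notes.txt
before=$(git ls-files --eol notes.txt)       # reports i/lf and w/crlf

git checkout-index --force --all             # recreate files from the index
after=$(git ls-files --eol notes.txt)        # reports i/lf and w/lf
printf '%s\n' "$before" "$after"
```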


