Prevent Git Checkout from Overwriting a File

How to prevent checked-in files from overwriting local versions?

This is possibly a duplicate of "git pull keeping local changes"; the first answer there in particular seems to be the best solution if the platform-specific changes aren't meant to be committed.

Prevent git from overwriting file owner upon git pull

You are pretty close to the correct solution.

You need to enable the following hooks:

  • post-merge, called after a successful git pull
  • post-checkout, called after a successful git checkout

If you are sure you will only ever use git pull, the post-merge hook is enough.

Enabling both hooks guarantees that the script always runs, at no extra cost.

The content of the hook should be like:

#!/bin/sh

# default owner user
OWNER="www-data:www-data"

# web repository directory
REPO_DIR="/var/www/html/wp-content/themes/quorum-theme"

echo
echo "---"
echo "--- Resetting ownership to ${OWNER} on ${REPO_DIR}"

# quote the variables so paths containing spaces are handled correctly
sudo chown -R "$OWNER" "$REPO_DIR"

echo "--- Done"
echo "---"

The script resets the ownership of all files and directories inside REPO_DIR to OWNER.

I have copied the values from your post; change them to suit your needs.

To enable the hook you should:

  • create a file named post-merge with the script above
  • move it inside the directory .git/hooks/ of your repo
  • give it the executable permission with chmod +x post-merge

If needed, repeat these steps for the post-checkout hook, which should be identical to the post-merge hook.

If your user is not root, remember to run sudo git pull. Because all the files and directories in the target directory are owned by www-data, git pull must run with superuser privileges or it will fail.
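The installation steps can be sketched roughly as follows. This is only an illustration in a throwaway repository: the hook body just appends to a log file (a stand-in for the chown script, so no sudo is needed), and the file name hook.log and the branch name feature are made up for the demo.

```shell
#!/bin/sh
# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
mkdir -p .git/hooks

# Write the hook once. The body is a harmless stand-in for the chown script.
cat > .git/hooks/post-merge <<'EOF'
#!/bin/sh
echo "hook ran: $(basename "$0")" >> "$(git rev-parse --git-dir)/hook.log"
EOF
chmod +x .git/hooks/post-merge

# Reuse the same file for post-checkout so both hooks stay identical.
cp -p .git/hooks/post-merge .git/hooks/post-checkout

# Trigger post-checkout: make a commit, then switch to a new branch.
git commit -q --allow-empty -m "initial commit"
git checkout -q -b feature

cat .git/hook.log
```

If the log file shows a line for post-checkout, the hook is installed and executable. In the real setup you would put the chown script in place of the echo line.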

Preventing a file overwrite with Git

You can do exactly what you're asking for with hooks, but I don't know if that solves your real problem.

If it's as simple as your in-progress files getting munged, that's easy: Never pull/merge/rebase with uncommitted changes. Always commit before bringing in any other code.
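One way to make the "commit first" rule hard to forget is to check for a clean working tree before pulling. The sketch below is an assumption about how you might do that; tree_is_clean is a helper name I made up, and the repository is a temporary one created only for the demo.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when there are no staged, unstaged,
# or untracked changes.
tree_is_clean() {
    [ -z "$(git status --porcelain)" ]
}

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Simulate in-progress work, then check.
echo "work in progress" >> notes.txt
if tree_is_clean; then before="clean"; else before="dirty"; fi

# Commit the work, then check again; now it is safe to pull/merge/rebase.
git commit -q -am "save work in progress"
if tree_is_clean; then after="clean"; else after="dirty"; fi

echo "before commit: $before, after commit: $after"
```

In day-to-day use you could run something like tree_is_clean && git pull, so the pull only happens when nothing is uncommitted.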

Prevent newly ignored files from changing on git checkout

If you can coordinate with all users of your repository, the most suitable approach is probably to rewrite the whole repository history, erasing the file completely. Use git filter-branch, something like this:

git filter-branch --tree-filter 'rm -f .project && \
if ! test -f .gitignore || ! grep -q "^\.project$" .gitignore; \
then \
echo .project >>.gitignore; \
fi' -- --all

If your repository is large, you will probably want to use --index-filter instead of --tree-filter, because --index-filter operates directly on the index without checking out every commit in the repository. The script for --index-filter is more complex and cumbersome, though.

Once the repository is rewritten and you have checked that every commit in every branch received the desired changes, all developers should re-fetch the repo. It is better to ask them to push all their local changes to the repository before you start, to minimize the rebasing work when they receive the modified repository.
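For reference, a minimal --index-filter variant might look like the sketch below. It only removes .project from every commit; unlike the --tree-filter script above it does not add the .gitignore entry, which is the part that gets cumbersome. The repository and the file main.txt are made up for the demo.

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Two commits that both contain the unwanted .project file.
echo "ide settings" > .project
echo "code" > main.txt
git add .project main.txt
git commit -q -m "initial commit"
echo "more code" >> main.txt
git commit -q -am "second commit"

# Skip the warning and delay that newer Git versions show for filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Remove .project from the index of every commit, without checking files out.
git filter-branch --index-filter \
    'git rm --cached --ignore-unmatch -q .project' -- --all >/dev/null 2>&1

# List what the rewritten tip commit contains.
git ls-tree -r --name-only HEAD
```

Note that filter-branch keeps the old commits reachable under refs/original/, so check the rewritten branches (not --all) when verifying the result.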

To git checkout without overwriting data

Git is warning you that forms/answers.php has changes in your working copy or index that have not been committed.

You can use git stash to save your changes, then git stash apply to restore them.

The common use case of git stash is that you are working on changes but then must temporarily check out a different branch to make a bug fix. So you can stash the changes in your index and working copy, check out the other branch, make the bug fix, commit, check out the original branch, and run git stash apply to restore your changes and pick up where you left off.
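That workflow, end to end, could look something like this. It is a sketch in a temporary repository; the branch name feature and the files app.txt and fix.txt are invented for the example.

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name
git config user.name "Example User"
git config user.email "user@example.com"

echo "line 1" > app.txt
git add app.txt
git commit -q -m "initial commit"

# Start some work on a feature branch and leave it uncommitted.
git checkout -q -b feature
echo "unfinished feature work" >> app.txt

# Save the uncommitted changes and go fix the bug elsewhere.
git stash -q
git checkout -q master
echo "bug fix" > fix.txt
git add fix.txt
git commit -q -m "fix bug"

# Come back and restore the work in progress.
git checkout -q feature
git stash apply -q

cat app.txt
```

git stash apply leaves the entry in the stash list; use git stash drop afterwards (or git stash pop instead of apply) if you no longer need it.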

`git checkout branch -- .`, without overwriting existing files

I am not sure this is exactly what you need but instead of a partial checkout you may want to use a merge, which is the standard way to resolve conflicting commits.

Considering that you are on an orphaned branch, you cannot use a simple git merge but you have to specify the option --allow-unrelated-histories. Since you want to script this, there cannot be any conflict: this should be straightforward with the right merging strategy, which could be ours given that you are in config/<archetype>. You also want your orphaned branch to remain untouched, still possible with --no-commit and a following git reset --hard HEAD. In the end, the merge command from config/<archetype> could be something like this (untested):

git merge --allow-unrelated-histories -s recursive -Xours --no-commit master

Avoid files from being overwritten during git merge

The question is - how do I prevent the JenkinsFile from being overwritten by any merge? I want the JenkinsFile to remain intact and not be affected by any merge. Is there a way to "lock" these files?

No.

There is a completely different way to go about this, though, that sidesteps the entire problem. In fact, there are multiple ways, but I'll show just one. There's an unfortunate problem in terms of getting to the state where things all work as desired, but once you do get there, you're good. The end goal here is to not have a committed file named Jenkinsfile (or JenkinsFile, but I've used the lowercase-F spelling below) whose content is branch-dependent. Instead, just have an uncommitted work-tree-only file whose name is Jenkins[Ff]ile and whose content is branch-dependent. Make the committed files have other names.

Background

Fundamentally, git merge works by combining work done, i.e., combining the changes to some file(s) since some common starting point. But Git doesn't store changes; Git stores snapshots. This creates a problem for git merge, and the solution requires that you understand how Git's commit graph works.

Almost every commit in a Git repository has at least one parent commit, which is that commit's immediate predecessor. Most have exactly one parent; commits of type "merge" have at least two, and usually exactly two. In fact, the presence of more than one parent is what defines a commit to be a merge commit. The other common special case is that the very first commit in a repository has no parent, because it can't have one, because it was the first commit. (Commits with three or more parents are called octopus merges but they do nothing you can't do with regular merges, so they're mainly for showing off. :-) )

These links, in which a commit stores the hash ID of its parent(s)—remember that each commit is found by its unique hash ID, that Git assigned to the commit when you made the commit—form backwards chains. These backwards chains are the history in the repository. History is commits; commits are history. A branch name simply identifies the (single) last commit that we wish to claim to be part of that branch:

... <-F <-G <-H   <--master

Here, instead of actual hash IDs, I've drawn in single uppercase letters that stand in for each commit. The name master holds the actual hash ID of commit H. We say that master points to H. H holds the hash ID of its parent G, so H points to G, which points to F, and so on, backwards down the line.

Nothing inside any commit can ever change, so we don't need the internal arrows, we just have to remember that they go backwards. It's actually very hard to go forwards, in Git: almost all operations start at the end(s) and work backwards. Once we have more than one branch, this gives us a picture that looks like this:

          G--H   <-- master
         /
...--E--F
         \
          I--J   <-- develop
              \
               K   <-- test

To git checkout a branch means extract the snapshot from the tip commit of that branch. So git checkout master extracts the snapshot from commit H, while git checkout develop or git checkout test extracts those snapshots in turn. Also, doing a git checkout of some branch name attaches the special name HEAD to that branch. This is how Git knows which branch (and commit) is the current one.

When you run git merge, you give Git the name of some other commit. That doesn't have to be a branch name—any name for a commit will serve—but giving it a branch name works fine, since that names the tip commit of that branch. So if you git checkout master and then run git merge develop, you start with:

          G--H   <-- master (HEAD)
         /
...--E--F
         \
          I--J   <-- develop
              \
               K   <-- test

and Git finds commit J. Git then works backwards from both the current commit H and the named commit J to find the merge base of these two commits.

The merge base is, loosely, the first commit we get to from both tips. That's a commit that's on both branches, and in this case, that's obviously commit F. The idea of a merge base is crucial to understanding how merge works. The goal of the merge is to combine work, and that work can be found by comparing the snapshot in commit F, one comparison at a time, to each of the two tip commits H and J:

git diff --find-renames <hash-of-F> <hash-of-H>   # what we changed
git diff --find-renames <hash-of-F> <hash-of-J>   # what they changed

To combine the changes, Git starts with all the files from F, and looks at which files we changed and which ones they changed. If we both changed different files, Git takes ours or theirs as appropriate. If we both changed the same file—this eventually brings up a philosophical problem which we'll get back to in a moment—Git attempts to smash our changes together with their changes, by assuming that if we touched some source line and they didn't, it should take ours, and if they touched some source line and we didn't, it should take theirs too. If we both touched the same lines of the same file, then either we did the exact same thing to those lines—in which case, Git takes one copy of that change—or there's a conflict.
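You can reproduce the merge-base lookup and the two comparisons yourself. The sketch below builds a tiny stand-in for the F, H, and J commits in a temporary repository; the file names a.txt and b.txt are made up, and --name-only is added just to keep the output short.

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name
git config user.name "Example User"
git config user.email "user@example.com"

# Commit F: the common starting point.
echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "F"
f_commit=$(git rev-parse HEAD)

# Commit J on develop: "they" change b.txt.
git checkout -q -b develop
echo "their change" >> b.txt
git commit -q -am "J"

# Commit H on master: "we" change a.txt.
git checkout -q master
echo "our change" >> a.txt
git commit -q -am "H"

# Find the merge base, then compare it to each tip.
base=$(git merge-base master develop)
ours=$(git diff --find-renames --name-only "$base" master)
theirs=$(git diff --find-renames --name-only "$base" develop)

echo "merge base: $base"
echo "we changed: $ours"
echo "they changed: $theirs"
```

Here each side touched a different file, so a merge would simply take our a.txt and their b.txt.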

If there are no conflicts, Git applies these combined changes to the snapshot in the merge base—in F, here—and uses the resulting files to write out a new snapshot. That new snapshot is a commit of type merge commit, having two parents. The first parent is the commit we were on before, H, and the second is the one we named with our argument, J, so the merge looks like this:

          G--H
         /    \
...--E--F      L   <-- master (HEAD)
         \    /
          I--J   <-- develop
              \
               K   <-- test

Note that nothing happens to any existing commit, nor to any other branch name. Only our own branch name, master (to which HEAD is attached), moves; master now points to the new merge commit that Git just made.
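To see the two parents and the unmoved develop name for yourself, something like the following works. It is a sketch in a temporary repository with made-up files; HEAD^1 and HEAD^2 are Git's standard names for the first and second parent of the current commit.

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name
git config user.name "Example User"
git config user.email "user@example.com"

echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "F"

git checkout -q -b develop
echo "their change" >> b.txt
git commit -q -am "J"
j_commit=$(git rev-parse HEAD)

git checkout -q master
echo "our change" >> a.txt
git commit -q -am "H"
h_commit=$(git rev-parse HEAD)

# Make the merge commit L on master.
git merge -q --no-edit develop

first_parent=$(git rev-parse HEAD^1)    # the commit we were on
second_parent=$(git rev-parse HEAD^2)   # the commit we named
develop_tip=$(git rev-parse develop)    # should not have moved

echo "first parent:  $first_parent"
echo "second parent: $second_parent"
echo "develop tip:   $develop_tip"
```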

If the merge goes badly, due to merge conflicts, Git will leave a mess behind. The index, which I'm not going to get into here, will contain all the conflicting input files, and the work-tree will contain Git's attempt at merge, along with conflict markers. Your job is to clean up the mess, fix up the index, and finish the merge (with git merge --continue or git commit—the --continue just runs commit) by hand.
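A conflicted merge and its clean-up can be sketched like this. The repository and the file f.txt are invented, and keeping our side with git checkout --ours is only a choice made for the demo; in a real conflict you would edit the file to whatever the correct result is.

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > f.txt
git add f.txt
git commit -q -m "base"

# Both sides rewrite the same line differently.
git checkout -q -b develop
echo "theirs" > f.txt
git commit -q -am "their change"
git checkout -q master
echo "ours" > f.txt
git commit -q -am "our change"

# The merge stops with a conflict and a non-zero exit status.
if git merge --no-edit develop >/dev/null 2>&1; then
    merge_status="clean"
else
    merge_status="conflict"
fi

# List the files that still need resolving.
conflicted=$(git diff --name-only --diff-filter=U)
echo "merge: $merge_status, unresolved: $conflicted"

# Resolve (here: keep our version), mark it resolved, and finish the merge.
git checkout --ours -- f.txt
git add f.txt
git commit -q --no-edit
```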

Your problem: Jenkinsfile

Suppose that in commit F, the merge base, there is a file named Jenkinsfile. This same file, with this same name, appears in commits H and J. The copies in H and J differ—you said they do, so we'll assume that they do. Therefore at least one differs from F, and perhaps both differ from F.

Git is going to assume that the file that is named Jenkinsfile in both branch tips is the same file that is named Jenkinsfile in F. Obviously, it's not quite the same file—the contents differ—but Git will assume that it is, and that you're trying to combine work done on it.

So, Git will diff the version of Jenkinsfile in F against that in H, and then diff it again, against the version in J. There will be some changes. If both branch tips have changes, Git will combine them (or declare a conflict). Result: bad. Otherwise, Git will take the version of the file from whichever "side" changed it. Is that the side you want? If so, result: good. If not, result: bad.

In summary, for this scenario, there are three possible results:

  • Base vs HEAD is the only change: the result is fine.
  • Base vs theirs is the only change: result is bad.
  • Base vs HEAD and base vs theirs both have changes: result is probably bad.

It is of course possible that merge base commit F has no file named Jenkinsfile. It's also possible that one or both tip commits have no such file. In this case, it gets a little trickier. We'll get to that in a moment.

The solution (and some issues getting there)

The solution here is to avoid having a single, fixed-name file, such as Jenkinsfile, in all commits when that file is intended to be branch-dependent. Suppose, instead, that commit F contains Jenkinsfile.master and Jenkinsfile.develop and Jenkinsfile.test. Then commit H will have a Jenkinsfile.master and Jenkinsfile.develop and Jenkinsfile.test too, and the changes from F to H in Jenkinsfile.master will be the ones you want to keep. Since commit J is in branch develop, it should always either have the same changes—imported from master at some point—or no changes at all. Git's merge will therefore do the right thing, in both cases.

The same logic applies to each of the other such files. Note that at this point, the commits identified by all branch tips should have no file named Jenkinsfile (without a suffix) at all. This is, of course, an idealized goal-state: to get there, you must actually make new commits in each branch, renaming the existing Jenkinsfile. But this will have no effect at all on any existing commits. All of that history in your repository is frozen for all time. This means that at some point, you'll run git merge and git merge will locate a merge base commit that has only Jenkinsfile, not Jenkinsfile.master, and not Jenkinsfile.develop or any other suffix.
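The renaming commits themselves are simple. A sketch, assuming just two branches named master and develop in a temporary repository (adjust the list to your own branches):

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name
git config user.name "Example User"
git config user.email "user@example.com"

# Historic state: one shared Jenkinsfile on every branch.
echo "pipeline steps" > Jenkinsfile
git add Jenkinsfile
git commit -q -m "add Jenkinsfile"
git branch develop

# On each branch, make a new commit that renames the file.
for branch in master develop; do
    git checkout -q "$branch"
    git mv Jenkinsfile "Jenkinsfile.$branch"
    git commit -q -m "Rename Jenkinsfile to Jenkinsfile.$branch"
done

git ls-tree --name-only master
git ls-tree --name-only develop
```

The older commits still contain the plain Jenkinsfile, which is why the first merge after this step needs the extra care described next.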

Let's assume now that in H and J, you have already done this renaming, but in merge base F, you have not—obviously, since it's a historic commit. So F has a Jenkinsfile and no renamed files, while H and J have no Jenkinsfile but do have the renamed files.

Now, remember above where we showed the git diffs that git merge runs, to figure out what has changed since the merge base. One of the arguments is --find-renames. This directs Git to guess whether the file Jenkinsfile in F is "the same" file as Jenkinsfile.master in H, when comparing F and H. The same goes for the comparison of F vs J: is the old Jenkinsfile the same file as the new Jenkinsfile.develop?

If you followed the link to https://en.wikipedia.org/wiki/Ship_of_Theseus you will see that there's no philosophical right answer to the question of identity-over-time. But Git has its right answer, which is: If the file has a similarity index of 50% or better, it's the same file. We don't need to worry here about how Git computes this similarity index (it's a bit complicated); chances are very good that Git will detect the rename in both cases.

What this means in practice is that the first time you run this git merge, Git will immediately declare a merge conflict, of the type I like to call a high level conflict. That is, Git will say that Jenkinsfile was renamed in both branches, but to two different names. Git doesn't know whether to use the master version, or the develop version, or both, or neither, or what. It will just stop with a merge conflict. This is OK because it gives you a chance to resolve the conflict, which you should do by selecting the Jenkinsfile.master file as it appears in the master or --ours branch, and selecting the Jenkinsfile.develop file as it appears in the develop or --theirs branch, as your merged results. Put these two files into the index while removing the original name:

git rm --cached Jenkinsfile
git checkout --ours Jenkinsfile.master
git checkout --theirs Jenkinsfile.develop
git add Jenkinsfile.master Jenkinsfile.develop

You have now resolved the conflict by choosing to keep both files as they appear in both branch tips. You can now commit the result.

Every time you do a merge that uses one of the historic, single-Jenkinsfile commits, you'll need to check that the merge result is correct, or resolve any conflicts. (If it's not correct, immediately after merging, you can fix it in place and use git commit --amend to push the original merge aside and choose a new result as the merge commit. If you don't notice a bad merge, it's a bit more painful, but the recovery is similar in the end anyway. Remember how Git does merges, and work through the two git diffs, to see how putting the right result in any tip commit gets you where you need to go.)

Last, now there's no Jenkinsfile

Now that there's no file named Jenkinsfile, you'll have to redirect any software that wants to use such a file. There are multiple solutions (depending on the software and your OS), including making a symbolic link from Jenkinsfile to the correct per-branch checkout. (Make sure the symbolic link does not get committed, or you'll be right back to the same merge issue when Git tries to merge two potential symlink target changes.)
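One possible way to set up such a link is sketched below. It assumes a system with symbolic links and uses .git/info/exclude to keep the link out of commits; the repository is a temporary one made for the demo.

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name
git config user.name "Example User"
git config user.email "user@example.com"

echo "master pipeline" > Jenkinsfile.master
git add Jenkinsfile.master
git commit -q -m "add Jenkinsfile.master"

# Point Jenkinsfile at the file for the branch that is checked out.
branch=$(git symbolic-ref --short HEAD)
ln -sf "Jenkinsfile.$branch" Jenkinsfile

# Ignore the link locally so it can never be committed by accident.
mkdir -p .git/info
echo "/Jenkinsfile" >> .git/info/exclude

readlink Jenkinsfile
git status --porcelain   # prints nothing: the link is ignored
```

The ln line could also live in a post-checkout hook so the link is refreshed whenever you switch branches.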

Prevent git from removing files on checkout

Try to skip the file (git update-index):

git update-index --skip-worktree -- myfile.conf

That should be preserved during checkout as I mentioned here.

If that doesn't work, try the alternative:

git update-index --assume-unchanged -- myfile.conf
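To check which files carry the flag, and to undo it later, something like this should work. It is a sketch in a temporary repository; in git ls-files -v output an S in the first column marks a skip-worktree file.

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "setting=default" > myfile.conf
git add myfile.conf
git commit -q -m "add default config"

# Mark the file, then change it locally.
git update-index --skip-worktree -- myfile.conf
echo "setting=local" > myfile.conf

# Git no longer reports the local change.
status_while_skipped=$(git status --porcelain)

# Show the flag: the first column is "S" for skip-worktree files.
flag=$(git ls-files -v -- myfile.conf | cut -c1)
echo "flag: $flag"

# Undo the flag; the local change becomes visible again.
git update-index --no-skip-worktree -- myfile.conf
status_after=$(git status --porcelain)
echo "status after undo: $status_after"
```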

