Git and Hard Links

Git and hard links

A 'tree' object, which represents a directory in Git, stores each file's name and a subset of its permissions. It does not store the inode number or any other kind of file id. Therefore hard links cannot be represented in Git, at least not without third-party tools such as metastore or git-cache-meta (and I am not sure it is possible even with those tools).
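To see what a tree entry actually records, here is a minimal sketch, assuming git is installed and the temporary directory supports hard links. The file names a.txt and b.txt are made up for illustration.

```shell
#!/usr/bin/env bash
set -eu
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

echo 'hello' > a.txt
ln a.txt b.txt   # b.txt is a second name for the same inode (hypothetical file names)

git add a.txt b.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'two names, one inode'

# Each tree entry is "<mode> <type> <object id><TAB><name>"; there is no inode column.
tree_listing="$(git ls-tree HEAD)"
printf '%s\n' "$tree_listing"
```

Both entries should carry the same blob id because the content is identical, but nothing in the listing says the two names were one file on disk.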

Git tries not to touch files that it doesn't need to update, but it makes no attempt to preserve hard links, so Git can break them.


About symbolic links pointing outside the repository: Git has no problem with them and should preserve their contents (the target path). Their usefulness is dubious to me, though, because whether such a symlink resolves depends on the filesystem layout outside the repository, which is not under Git's control.
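As a rough illustration of what "preserve contents" means for a symlink, the sketch below commits a link whose target lies outside the repository and reads back what Git stored. It assumes git is installed and a filesystem with symlink support; the target path is a made-up placeholder and does not need to exist.

```shell
#!/usr/bin/env bash
set -eu
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

# Hypothetical target outside the repository; a dangling link is fine for this demo.
ln -s /path/outside/repo/target outside-link

git add outside-link
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add symlink'

# Symlinks are stored with mode 120000; the blob content is the target path itself.
entry_mode="$(git ls-tree HEAD outside-link | cut -d' ' -f1)"
stored_target="$(git cat-file -p HEAD:outside-link)"
echo "mode: $entry_mode, stored target: $stored_target"
```

Git records only the path text, so whether the link works after checkout depends entirely on the machine it is checked out on.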

Linux Hard Link (ln) and GIT

When an update performed by Git involves removing the file, which can easily happen, the connection between the file in the repository and the hard link outside the repository ceases to exist.

In other words, using hard links with Git is not a good idea. Also note that when you use a hard link within a Git repository, Git would complain about it.
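The breakage is easy to reproduce. This is a minimal sketch, assuming git is installed and everything lives on one filesystem; the names config.txt, outside-link and the branch other are invented for the demo.

```shell
#!/usr/bin/env bash
set -eu
work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
mkdir "$work/repo"
cd "$work/repo"
git init -q .

# Small helper so the demo does not depend on a configured Git identity.
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

echo 'version 1' > config.txt
git add config.txt
commit 'v1'
first_branch="$(git rev-parse --abbrev-ref HEAD)"

git checkout -q -b other
echo 'version 2' > config.txt
git add config.txt
commit 'v2'

# Hard link from outside the repository to the tracked file.
ln config.txt "$work/outside-link"
if [ config.txt -ef "$work/outside-link" ]; then before=linked; else before=separate; fi

# Switching branches makes Git replace config.txt rather than edit it in place.
git checkout -q "$first_branch"
if [ config.txt -ef "$work/outside-link" ]; then after=linked; else after=separate; fi

echo "before checkout: $before, after checkout: $after"
```

After the checkout the outside link still holds the old content, while the file in the repository is a new inode with the other branch's content.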

git and hardlink in linux

You can't make¹ hard links from outside a Git repository to files inside it. You have several choices:

  • Make ~/.zshrc a symbolic link to my-home-git-checkout/zshrc.
  • Keep a git checkout in your home directory.
  • Copy the file from your git checkout to your home directory, perhaps automatically upon a commit or checkout.

¹ Yeah, ok, quibble: you can make them, you just can't keep them.
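The first option can be sketched as follows. To stay harmless, this uses a throwaway directory as a stand-in for the real home directory (fake_home is an assumption of the demo), and it simulates Git replacing the file with a write-then-rename.

```shell
#!/usr/bin/env bash
set -eu
sandbox="$(mktemp -d)"
trap 'rm -rf "$sandbox"' EXIT

fake_home="$sandbox/home"                       # stand-in for $HOME
checkout="$fake_home/my-home-git-checkout"
mkdir -p "$checkout"
echo 'export EDITOR=vim' > "$checkout/zshrc"

# The symlink stores a path, not an inode, so it follows whatever file is at that path.
ln -s "$checkout/zshrc" "$fake_home/.zshrc"

# Simulate Git replacing the file: write a new file, then rename it over the old one.
echo 'export EDITOR=nano' > "$checkout/zshrc.new"
mv "$checkout/zshrc.new" "$checkout/zshrc"

seen_through_link="$(cat "$fake_home/.zshrc")"
echo "$seen_through_link"   # prints: export EDITOR=nano
```

A hard link in the same position would still show the old content after the rename, which is why the symlink is the safer choice here.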

Does git use hardlinks for a remote on the same disk?

Testing with the following script suggests that pseudo-cloning via git init + git remote add + git fetch doesn't create hardlinks to the source repository:

hardlinktest:

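#!/usr/bin/env bash
# Compare object-file inodes between a local clone and an init + fetch "pseudo-clone".
tmpdir="$(mktemp -d)"
# Single quotes defer expansion until the trap fires; the inner quotes protect unusual paths.
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1
set -x
git clone https://github.com/dictcp/awesome-git testrepo
# Clone by local path.
git clone testrepo testrepo.localclone
# "Pseudo-clone": empty repository plus a remote pointing at the same local path.
mkdir testrepo.pseudoclone
cd testrepo.pseudoclone
git init
git remote add sibling ../testrepo
git fetch sibling
cd ..
# -i prints inode numbers; the a0 object directory is specific to this repository.
ls -1 -i testrepo*/.git/objects/a0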

Relevant part of the output:

$ ls -1 -i testrepo*/.git/objects/a0
testrepo/.git/objects/a0:
417590 cdfa472f2bf8212a02a3edeb941868d651749d

testrepo.localclone/.git/objects/a0:
417590 cdfa472f2bf8212a02a3edeb941868d651749d

testrepo.pseudoclone/.git/objects/a0:
537341 cdfa472f2bf8212a02a3edeb941868d651749d

This means that the file testrepo.localclone/.git/objects/a0/cdfa472f2bf8212a02a3edeb941868d651749d is a hardlink to testrepo/.git/objects/a0/cdfa472f2bf8212a02a3edeb941868d651749d - their inode values are the same (417590 for my test run, but your mileage will of course vary). The inode value (537341) of the corresponding file in the testrepo.pseudoclone repository tells us that it is an independent copy.

Full output:

$ ./hardlinktest 
+ git clone https://github.com/dictcp/awesome-git testrepo
Cloning into 'testrepo'...
remote: Counting objects: 58, done.
remote: Total 58 (delta 0), reused 0 (delta 0), pack-reused 58
Unpacking objects: 100% (58/58), done.
Checking connectivity... done.
+ git clone testrepo testrepo.localclone
Cloning into 'testrepo.localclone'...
done.
+ mkdir testrepo.pseudoclone
+ cd testrepo.pseudoclone
+ git init
Initialized empty Git repository in /tmp/tmp.ZWoH0OTA1P/testrepo.pseudoclone/.git/
+ git remote add sibling ../testrepo
+ git fetch sibling
remote: Counting objects: 58, done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 58 (delta 17), reused 0 (delta 0)
Unpacking objects: 100% (58/58), done.
From ../testrepo
* [new branch] master -> sibling/master
+ cd ..
+ ls -1 -i testrepo/.git/objects/a0 testrepo.localclone/.git/objects/a0 testrepo.pseudoclone/.git/objects/a0
testrepo/.git/objects/a0:
417590 cdfa472f2bf8212a02a3edeb941868d651749d

testrepo.localclone/.git/objects/a0:
417590 cdfa472f2bf8212a02a3edeb941868d651749d

testrepo.pseudoclone/.git/objects/a0:
537341 cdfa472f2bf8212a02a3edeb941868d651749d
+ rm -rf /tmp/tmp.ZWoH0OTA1P
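Instead of comparing inode numbers by eye, the check can be scripted. The following is a sketch, assuming git is installed and all repositories sit on one filesystem; it builds a small local repository rather than cloning from the network, and the directory names source, linked and copied are made up. It also shows git clone --no-hardlinks, which asks a local clone to copy object files instead of linking them.

```shell
#!/usr/bin/env bash
set -eu
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir"

git init -q source
(
  cd source
  echo 'data' > file.txt
  git add file.txt
  git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'
)

git clone -q source linked                  # default local clone
git clone -q --no-hardlinks source copied   # force independent copies

# Pick any loose object file from the source repository.
obj="$(cd source/.git/objects && find . -type f -path './[0-9a-f][0-9a-f]/*' | head -n 1)"

# -ef is true when both paths refer to the same device and inode.
same_inode() { if [ "$1" -ef "$2" ]; then echo yes; else echo no; fi; }

linked_shares="$(same_inode "source/.git/objects/$obj" "linked/.git/objects/$obj")"
copied_shares="$(same_inode "source/.git/objects/$obj" "copied/.git/objects/$obj")"

echo "default local clone shares inode: $linked_shares"
echo "--no-hardlinks clone shares inode: $copied_shares"
```

On a setup like the one in the test run above, the first line should say yes and the second no.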

Using hardlink to .git instead of git worktree

You probably don't want to do this. A worktree includes its own version of HEAD, the HEAD reflog, and the index. This is required because you have two separate branches checked out, and you can stage files in each worktree independently of the other one.

If you hardlinked .git into another directory, you'll actually have the same HEAD, so you'll be on the same branch. Also, you'll have the same index, so as soon as you run git status in one directory, running it in the other directory will cause every file to be re-read. That's in addition to the fact that if you run git add in one directory, it will be reflected in the other as well.

As a result, this is likely to lead to a bunch of unhappiness and possibly some repository corruption. If you want to continue to use Eclipse, use separate clones, or you can use a different editor if you want to use worktrees.
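For comparison, here is roughly what a real worktree gives you. This is a sketch, assuming a git version that has the worktree command; the directory names main-checkout and second-checkout and the branch name feature are invented.

```shell
#!/usr/bin/env bash
set -eu
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir"

git init -q main-checkout
cd main-checkout
echo 'data' > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# Second working directory on a new branch, sharing the same object store.
git worktree add -b feature ../second-checkout >/dev/null 2>&1

main_head="$(git rev-parse --abbrev-ref HEAD)"
second_head="$(git -C ../second-checkout rev-parse --abbrev-ref HEAD)"

# Stage a change in the second worktree only.
echo 'more' >> ../second-checkout/file.txt
git -C ../second-checkout add file.txt

main_staged="$(git diff --cached --name-only)"
second_staged="$(git -C ../second-checkout diff --cached --name-only)"

echo "HEAD here: $main_head, HEAD there: $second_head"
echo "staged here: '${main_staged}', staged there: '${second_staged}'"
```

Each directory keeps its own HEAD and index, so staging in one does not show up in the other. A hard-linked or shared .git cannot give you that separation.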

Can hard links get broken?

This can happen if the original file (~/work/genDocs/bibs/SKM.bib) is recreated instead of being modified in-place. A new inode will be created, but your link will still point to the old inode. You can fix the issue by creating symbolic links with ln -s instead of hard links with link. See What is the difference between a symbolic link and a hard link?
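The difference between modifying in place and recreating can be shown with a few commands. This sketch uses throwaway files in a temporary directory; the names original.bib, hard.bib and soft.bib are placeholders, not the paths from the question.

```shell
#!/usr/bin/env bash
set -eu
dir="$(mktemp -d)"
trap 'rm -rf "$dir"' EXIT
cd "$dir"

echo 'v1' > original.bib
ln original.bib hard.bib       # hard link: second name for the same inode
ln -s original.bib soft.bib    # symbolic link: stores the path

# In-place modification: the redirect truncates and rewrites the existing inode.
echo 'v2' > original.bib
hard_after_inplace="$(cat hard.bib)"

# Recreation: write a new file, then rename it over the original (new inode).
echo 'v3' > original.bib.tmp
mv original.bib.tmp original.bib
hard_after_recreate="$(cat hard.bib)"
soft_after_recreate="$(cat soft.bib)"

echo "hard link after in-place edit: $hard_after_inplace"    # prints v2
echo "hard link after recreation:    $hard_after_recreate"   # prints v2 (stale)
echo "symlink after recreation:      $soft_after_recreate"   # prints v3
```

Many editors and tools save files the second way, which is why the hard link silently goes stale while the symlink keeps working.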


