Check If Local Git Repo Is Ahead/Behind Remote

git ahead/behind info between master and branch?

Part 1

To answer your question 1, here's a trick I found to compare two branches and show how many commits each branch is ahead of the other (a more general answer to that question):

For local branches:
git rev-list --left-right --count master...test-branch

For remote branches:
git rev-list --left-right --count origin/master...origin/test-branch

This gives output like the following:

2 1

This output means: "Compared to master, test-branch is 1 commit ahead and 2 commits behind."

You can also compare local branches with remote branches, e.g. origin/master...master to find out how many commits a local branch (here master) is ahead/behind its remote counterpart.
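If you want the two numbers spelled out, here is a minimal sketch that wraps the command above. The function names describe_counts and ahead_behind are made up for this example, and it assumes the output is always two whitespace-separated numbers.

```shell
#!/bin/sh
# describe_counts BASE OTHER LEFT RIGHT
# LEFT  = commits reachable only from BASE  (first number printed by git)
# RIGHT = commits reachable only from OTHER (second number printed by git)
describe_counts() {
    printf '%s is %s ahead of and %s behind %s\n' "$2" "$4" "$3" "$1"
}

# ahead_behind BASE OTHER: run the rev-list command and describe the result.
ahead_behind() {
    counts=$(git rev-list --left-right --count "$1...$2") || return 1
    # Left unquoted on purpose: git separates the two numbers with a tab,
    # and word splitting turns them into two separate arguments.
    describe_counts "$1" "$2" $counts
}
```

Calling ahead_behind origin/master master then reports how far the local master is ahead of and behind its remote counterpart.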

Part 2

To answer the second part of your question, the solution depends on what exactly you want to achieve.

To view commits

In order to have git rev-list return the exact list of commits unique on either side, replace the --count argument with something like --pretty=oneline, making the complete command to execute:

git rev-list --left-right --pretty=oneline master...test-branch

This will generate output like this:

<bba27b56ad7072e281d529d4845e4edf877eb7d7 unique commit 2 on master
<dad0b69ec50ea57b076bfecabf2cc7c8a652bb6f unique commit 1 on master
>4bfad52fbcf0e60d78d06661d5c06b59c98ac8fd unique commit 1 on test-branch

Here every commit sha is preceded by < or > to indicate which branch it can be found on (left or right, here master or test-branch respectively).
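If a script only needs one side of that list, the marker can be filtered on. This is just a sketch: only_on is a hypothetical helper name, and it assumes one commit per line as in the oneline output above.

```shell
#!/bin/sh
# only_on SIDE: read `git rev-list --left-right --pretty=oneline A...B`
# output on stdin and print the lines for one side, without the marker.
# SIDE is "<" (left, A) or ">" (right, B).
only_on() {
    while IFS= read -r line; do
        case $line in
            "$1"*) printf '%s\n' "${line#?}" ;;   # strip the one-character marker
        esac
    done
}
```

For example, git rev-list --left-right --pretty=oneline master...test-branch | only_on '>' would print only the commits unique to test-branch.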

To view code

If you want to view a diff of all new commits only found on either branch, you'll need to do this in two steps:

  1. find the most recent common ancestor
$ git merge-base master test-branch
c22faff7468d6d5caef217ac6b82f3ed95e9d902

  2. diff either branch against the commit sha obtained above (short format will usually do)

To show the diff of all commits only found on master

git diff c22faff7..master

To show the diff of all commits only found on test-branch

git diff c22faff7..test-branch
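The two steps can also be folded into one helper. The sketch below uses a made-up function name, diff_since_fork, and assumes the two branches share a single merge base.

```shell
#!/bin/sh
# diff_since_fork BASE BRANCH: diff BRANCH against its merge base with BASE,
# i.e. show only what BRANCH changed since the two branches forked.
diff_since_fork() {
    base=$(git merge-base "$1" "$2") || return 1
    git diff "$base" "$2"
}
```

Git also documents a shorthand for the same comparison: git diff master...test-branch (three dots) is equivalent to diffing the merge base against test-branch.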

Check if local git repo is ahead/behind remote

In the end, I implemented this in my C++11 git-ws plugin.

// run() is the plugin's helper; assumed here to return the command's stdout as a std::string without the trailing newline.
string currentBranch = run("git rev-parse --abbrev-ref HEAD");
// Any output means tracked files differ from HEAD, so there is something to commit.
bool canCommit = !run("git diff-index --name-only --ignore-submodules HEAD --").empty();
// Count only the commits that HEAD has and origin/<branch> lacks.
bool canPush = stoi(run("git rev-list --count origin/" + currentBranch + "..HEAD")) > 0;

Seems to work so far. canPull still needs to be tested and implemented.

Explanation:

  • currentBranch gets the console output, which is the name of the current branch
  • canCommit is true when the command prints something (a difference between the current changes and HEAD, ignoring submodules)
  • canPush gets the count of commits on the local branch that origin/currentBranch does not have yet - if > 0, the local repo can be pushed

How do you programmatically check if the local copy is behind the remote?

To get, programmatically, a count of commits that are different on the current branch vs its upstream, use git rev-list --count --left-right HEAD...@{upstream}, or git rev-list --count master...master@{upstream} for instance. Note the three dots here, which separate the branch name or HEAD from branch@{upstream}. This is how git status or git branch -vv prints ahead 1 or behind 2 or up to date or whatever.

Note that this assumes that you are on a branch in the first place, and that the branch has an upstream to be ahead and/or behind. If the upstream is a remote-tracking name like origin/master, this assumes that the value stored in the remote-tracking name is the one you want stored in it.
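As a rough sketch of how a script might guard those assumptions: the function below (upstream_counts is a name invented for this example) first checks that @{upstream} resolves at all, then prints the ahead and behind counts. It does not fetch, so it trusts whatever the remote-tracking name currently holds.

```shell
#!/bin/sh
# upstream_counts: print "AHEAD BEHIND" for the current branch versus its
# upstream, or fail if there is no upstream (or no branch) to compare with.
upstream_counts() {
    # Resolving @{upstream} fails on a detached HEAD or a branch without one.
    git rev-parse --abbrev-ref '@{upstream}' >/dev/null 2>&1 || return 1
    counts=$(git rev-list --count --left-right 'HEAD...@{upstream}') || return 1
    # $1 = commits only on HEAD (ahead), $2 = commits only on upstream (behind)
    set -- $counts
    printf '%s %s\n' "$1" "$2"
}
```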

There is a lot more to know

If you are scripting this stuff, it's important to know (or define) precisely what you mean by up to date.

Purely locally—i.e., within one repository + work-tree combination—there are three entities to think about:

  • The current commit, aka HEAD.

    This may be a detached HEAD, where HEAD contains a raw hash ID, or the opposite, on a branch, where HEAD contains the name of the branch itself. When on a branch, the branch name, e.g. master, contains the raw hash ID of the current commit. Either way, HEAD always refers to the current commit.[1]

    The current commit itself is read-only (entirely) and permanent (mostly—you can deliberately abandon commits, after which they eventually get removed). You can change which commit is the current commit (e.g., git checkout different-commit), but you cannot change the commits themselves. Since the commit cannot change, it's never "out of date" by definition: it is whatever it is. Like any commit, the current commit has some metadata (who made it, when, etc.) along with a complete snapshot of every file.

    Files stored inside commits are in a special, Git-only format (and of course are read-only).

  • The work-tree, which is simply where you do your work.

    Here, you can read and write every file. These files are in their ordinary format, neither compressed nor Git-specific. You can also have files here that are not known to Git, but before we can talk about this properly, we need to cover the third entity.

  • The index, also called the staging area or sometimes the cache.

    The index has several uses (hence the multiple names) but I think it is best described as the next commit you would make, if you made a commit right now. That is, the index (which is actually just a file) holds all the information Git needs to make a new snapshot, to put into a new commit. Hence the index holds all the files that will go into the next commit you make.

    Files in the index are compressed, and in a Git-only format, just like files in commits. The crucial difference for our purposes here, though, is that the files in the index can be changed. You can put new files into the index, or remove existing files from the index, as well.

    All that git add file really does is to copy a file from the work-tree, into the index. This replaces the previous version in the index, so that the index now matches the work-tree. Or, if you wish to remove a file, git rm file removes that file from both the index and the work-tree.


[1] A new repository has no commits at all, so there is an exception to this rule: HEAD can refer to a branch name that simply does not yet exist. That's the case in a brand new repository: HEAD says that the current branch is master, yet master does not actually exist until you make the first commit.

(The git checkout --orphan command can re-create this special "on a branch that does not exist yet" state for another branch. This is not something most people will do most of the time, but it can come up in programs that examine the state.)


What git status does

Since the index and work-tree are both writable, both can be "dirty" or cause something to be "out of date" in some way. If you consider the work-tree file to be the newest, it may be the index copy that's out of date, because it does not match the work-tree copy. Once the work-tree file is copied into the index, the index no longer matches the HEAD commit, and a new commit will be needed at some point.

What git status does, besides running git rev-list --count --left-right with the branch and its upstream and getting those numbers,[2] is that it runs, in effect, two git diffs (with --name-status since it's not interested in a detailed patch):

  1. Compare HEAD to index. Whatever is different here, these are the changes that are staged for commit, because if you made a commit now, Git would snapshot the entire index, and that snapshot would differ from the current commit in precisely these files.

  2. Compare index to work-tree. Whatever is different here, these are the changes that are not staged for commit. Once you run git add on these files, the index copy will match the work-tree copy, but no longer match the HEAD copy, so now those will be changes that are staged for commit.


[2] Note that git status first checks that you're on a branch, and if so, that the branch has an upstream setting. Also, this is all built into it, so it does not have to run a separate program, but the principle is the same.
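The two comparisons can be run by hand as well. Here is a small sketch; the function names are invented for this example, and git status itself does this internally rather than by calling git diff.

```shell
#!/bin/sh
# staged_changes: HEAD vs index, i.e. "changes to be committed".
staged_changes() {
    git diff --cached --name-status
}

# unstaged_changes: index vs work-tree, i.e. "changes not staged for commit".
unstaged_changes() {
    git diff --name-status
}

# tree_is_clean: succeed only if both comparisons come back empty.
tree_is_clean() {
    [ -z "$(staged_changes)" ] && [ -z "$(unstaged_changes)" ]
}
```

Note that neither diff reports untracked files, so tree_is_clean in this sketch says nothing about them.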


Untracked and maybe ignored

We can now properly define what it means for a file to be untracked, too. An untracked file is, quite simply, a file that is not in the index. That is, if we remove a file from the index (only) with git rm --cached, or if we create a file in the work-tree without creating a corresponding file in the index, we have a work-tree file that has nothing of the same name in the index. That's an untracked file.

If a file is untracked, git status normally whines about it: the diff it runs that compares the index to the work-tree says ah, here is a file in the work-tree that is not in the index, and Git would tell you that it is untracked. If it is untracked on purpose, you can have git status shut up about it, by listing that file (or a path-name pattern that matches it) in a .gitignore file. Essentially, just before complaining that some file is untracked, Git looks at the ignore directives.[3] But if the file is in the index, Git never looks for its name in any .gitignore.


[3] The ignore directives also tell git add that any en-masse "add everything" should avoid adding that file, if it's currently untracked.
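For scripting, git ls-files can list these files directly instead of parsing git status. A minimal sketch, with function names made up for this example:

```shell
#!/bin/sh
# untracked_files: work-tree files with no index entry, minus ignored ones.
untracked_files() {
    git ls-files --others --exclude-standard
}

# ignored_files: untracked files that an ignore rule is currently hiding.
ignored_files() {
    git ls-files --others --ignored --exclude-standard
}
```

Here --exclude-standard applies the usual ignore sources (.gitignore files, .git/info/exclude and the user's global excludes file).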


Upstreams and remotes

An upstream for a branch can be a remote-tracking name, like origin/master. These names are your Git's way of remembering some other Git's branches. To update the remote-tracking names for the remote origin, you simply run git fetch origin.

Note that you can have more than one remote! If you add a second remote fred at some second URL, git fetch fred will call up the Git at that URL, and update your fred/master and so on. So it's important to run git fetch to the right remote.

Running git fetch with no additional name will fetch from the remote for the current branch's upstream, or from origin if the current branch has no upstream or there is no current branch, so this is usually just a matter of running git fetch.

Submodules

Submodules are really just references to another Git repository, but this throws a whole new wrinkle into the general plan. Each Git repository has its own HEAD, work-tree, and index. These can be clean or dirty as before, and if the submodule is not in detached-HEAD state, the submodule's branch can be ahead of and/or behind its upstream.

Submodule repositories are, however, normally in detached-HEAD state. Each commit in the superproject lists the specific commit to which your Git should detach that submodule Git. When the superproject Git checks out the commit, the superproject Git stores the hash ID for the submodule into the superproject's index. That way each new superproject commit records the correct hash ID.

To change the hash ID, git add in the superproject copies the current hash ID of the actual checked-out submodule, into the index in the repository for the superproject (whew!). So if you've moved the submodule (via git checkout there), you navigate back to the superproject, run git add on the submodule path, and now the superproject's index records the correct hash ID, ready for the next superproject commit.

(Testing whether the submodule is on the commit desired by the superproject's index is more difficult.)
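One way to approach that test is git submodule status, which prefixes a line with + when the submodule's checked-out commit does not match the hash in the superproject's index (and with - when the submodule is not initialized). The sketch below filters for the + lines; moved_submodules is a hypothetical name, and it assumes submodule paths contain no whitespace.

```shell
#!/bin/sh
# moved_submodules: read `git submodule status` output on stdin and print the
# paths whose checked-out commit differs from the one recorded in the
# superproject's index (lines whose first field starts with "+").
moved_submodules() {
    # read strips the leading space of unmarked lines; the marker, if any,
    # stays attached to the hash in the first field.
    while read -r hash path rest; do
        case $hash in
            +*) printf '%s\n' "$path" ;;
        esac
    done
}
```

Run from the superproject, git submodule status | moved_submodules would then list the submodules you still need to git add (or check out again).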

Git - check whether git notes branch is behind remote

This line:

fetch = +refs/notes/*:refs/notes/*

tells your Git: Take their refs/notes/* names and copy them to my refs/notes/* names. You never make any origin/refs/notes/commits name at all, you just copy the hash ID in their refs/notes/commits to your own refs/notes/commits.

This means that if you add your own commit notes, the next git fetch wipes them out, because it takes their latest notes, throwing away (force-updating) your own refs/notes/commits.

If you want to maintain your own notes, you'll need to put their notes in a different place or store your notes in a different place (either one). The git notes system isn't terribly good at all this—it's not really user-friendly at all here.

I'd suggest storing your notes under your own name, and using the standard name for their notes. I think you can do this with:

git config core.notesRef refs/notes/mynotes

for instance. Then make sure fetch does not overwrite this notes ref, by narrowing the refspec to:

fetch = +refs/notes/commits:refs/notes/commits

so that you only overwrite your own refs/notes/commits from their refs/notes/commits.

You can now tell whether your notes and their notes don't match using:

git rev-parse refs/notes/commits
git rev-parse refs/notes/mynotes

These two commands will produce two hash IDs. If the hash IDs don't match, you have different commits as the tipmost notes commit.

(Note that, as the documentation shows, you can temporarily override the notes name in multiple ways, whether or not you've overridden the default-default refs/notes/commits name using core.notesRef as above. The term "default-default" here means that refs/notes/commits is the default if core.notesRef is not set. Setting core.notesRef sets a new default.)
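In a script, the comparison of the two hash IDs might look like the sketch below. same_tip is a name invented for this example; it only says whether the tips are identical, not which one is ahead.

```shell
#!/bin/sh
# same_tip REF_A REF_B: succeed if both refs resolve to the same object.
# Returns 2 if either ref does not exist.
same_tip() {
    a=$(git rev-parse --verify --quiet "$1") || return 2
    b=$(git rev-parse --verify --quiet "$2") || return 2
    [ "$a" = "$b" ]
}
```

For the setup above you would call same_tip refs/notes/commits refs/notes/mynotes.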

How to see how many commits ahead or behind my local master is compared to my remote master in git

You can use git log to see the difference in commits between the two branches. In your case you can do:

git fetch origin master           # this will fetch the remote master
git log master..origin/master     # this will give commits not in master but in origin/master

Git: How to check if a local repo is up to date?

Try git fetch --dry-run
The manual (git help fetch) says:

--dry-run
Show what would be done, without making any changes.

Check if pull needed in Git

First use git remote update, to bring your remote refs up to date. Then you can do one of several things, such as:

  1. git status -uno will tell you whether the branch you are tracking is ahead, behind or has diverged. If it says nothing, the local and remote are the same.

  2. git show-branch *master will show you the commits in all of the branches whose names end in 'master' (eg master and origin/master).

If you use -v with git remote update (git remote -v update) you can see which branches got updated, so you don't really need any further commands.

However, it looks like you want to do this in a script or program and end up with a true/false value. If so, there are ways to check the relationship between your current HEAD commit and the head of the branch you're tracking, although since there are four possible outcomes you can't reduce it to a yes/no answer. However, if you're prepared to do a pull --rebase then you can treat "local is behind" and "local has diverged" as "need to pull", and the other two ("local is ahead" and "same") as "don't need to pull".

You can get the commit id of any ref using git rev-parse <ref>, so you can do this for master and origin/master and compare them. If they're equal, the branches are the same. If they're unequal, you want to know which is ahead of the other. Using git merge-base master origin/master will tell you the common ancestor of both branches, and if they haven't diverged this will be the same as one or the other. If you get three different ids, the branches have diverged.

To do this properly, eg in a script, you need to be able to refer to the current branch, and the remote branch it's tracking. The bash prompt-setting function in /etc/bash_completion.d has some useful code for getting branch names. However, you probably don't actually need to get the names. Git has some neat shorthands for referring to branches and commits (as documented in git rev-parse --help). In particular, you can use @ for the current branch (assuming you're not in a detached-head state) and @{u} for its upstream branch (eg origin/master). So git merge-base @ @{u} will return the (hash of the) commit at which the current branch and its upstream diverge and git rev-parse @ and git rev-parse @{u} will give you the hashes of the two tips. This can be summarized in the following script:

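#!/bin/sh

# Optional first argument: the upstream to compare against (default: @{u}).
UPSTREAM=${1:-'@{u}'}
LOCAL=$(git rev-parse @)
REMOTE=$(git rev-parse "$UPSTREAM")
BASE=$(git merge-base @ "$UPSTREAM")

# Quote the hashes so an empty value (failed command) cannot break the test.
if [ "$LOCAL" = "$REMOTE" ]; then
    echo "Up-to-date"
elif [ "$LOCAL" = "$BASE" ]; then
    echo "Need to pull"
elif [ "$REMOTE" = "$BASE" ]; then
    echo "Need to push"
else
    echo "Diverged"
fi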

Note: older versions of git didn't allow @ on its own, so you may have to use @{0} instead.

The line UPSTREAM=${1:-'@{u}'} allows you optionally to pass an upstream branch explicitly, in case you want to check against a different remote branch than the one configured for the current branch. This would typically be of the form remotename/branchname. If no parameter is given, the value defaults to @{u}.

The script assumes that you've done a git fetch or git remote update first, to bring the tracking branches up to date. I didn't build this into the script because it's more flexible to be able to do the fetching and the comparing as separate operations, for example if you want to compare without fetching because you already fetched recently.
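If you are prepared to treat "behind" and "diverged" the same way, as discussed earlier, the check collapses to a yes/no answer. A minimal sketch, with needs_pull as a made-up name; like the script, it assumes you have fetched first.

```shell
#!/bin/sh
# needs_pull [UPSTREAM]: succeed if UPSTREAM (default @{u}) has commits that
# HEAD lacks, which covers both "behind" and "diverged".
# Returns 2 if the count cannot be obtained (e.g. no upstream configured).
needs_pull() {
    upstream=${1:-'@{u}'}
    behind=$(git rev-list --count "HEAD..$upstream") || return 2
    [ "$behind" -gt 0 ]
}
```

Usage could be as simple as: if needs_pull; then git pull --rebase; fi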


