Difference Between Patch and Diff Files

Difference between patch and diff files

What matters is the content of the file, not the extension. Both the .patch and .diff extensions imply that some sort of diff utility (diff, git diff, git format-patch, svn diff) produced the output.

Many diff utilities produce output that can be applied by the patch command. You will frequently need the -p and -d options to patch to get the paths to match up (-p strips a leading path prefix, -d names the target directory). If you see one of those extensions on a file distributed online, it almost certainly indicates that the file is compatible with patch.
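As a rough illustration of the two options, the sketch below builds a throwaway tree in a temporary directory, creates a unified diff with diff -ru, and applies it with -p1 and -d. All file and directory names are made up for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical project: "old" is the original tree, "new" the edited copy,
# "target" is a separate checkout that should receive the change.
mkdir -p old/src new/src target/src
printf 'alpha\nbeta\ngamma\n' > old/src/greeting.txt
printf 'alpha\nBETA\ngamma\n' > new/src/greeting.txt
cp old/src/greeting.txt target/src/greeting.txt

# diff exits with status 1 when the trees differ; only treat other codes as errors.
diff -ru old new > change.patch || [ $? -eq 1 ]

# Paths in the patch look like old/src/greeting.txt and new/src/greeting.txt.
# -p1 strips the first component (old/ or new/), -d runs patch inside target/.
patch -p1 -d target < change.patch

cat target/src/greeting.txt
```

Without -p1, patch would look for old/src/greeting.txt relative to the target directory and not find it.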

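Git's diff output is compatible with patch. svn diff also emits unified diffs that patch can usually apply (typically with -p0), although Subversion-specific property changes are not something patch understands. Of course, plain patches generated by git diff are probably best applied by git apply, and patches generated by git format-patch are designed for use with git am.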

What differences are in the patches/files created by diff and git diff?

  1. On what does it depend whether you can use git diff and diff for the same file?

If the file is in the working tree of a git repo, you can use git diff to show the changes to that file (against the version recorded in the repo, such as the index or a blob object).

This differs from diff, which compares files: you need two files, not just one as with git diff inside a git repo.

As hvd points out in the comments:

You can use git diff outside any work tree and pass it two files.

So you can use git diff in pretty much any situation where you can use diff, for example:

 git diff --color-words --no-index file1 file2

The reverse is not true: diff on its own cannot compare a file against the index or a commit.


  2. And what are the differences between the formats?

git diff can emulate any diff format (unified, raw, ...).

It has git-specific formats as well (--summary, --stat, ...).

See also:

  • Documentation/diff-format.txt
  • "How to read the output from git diff?"

A git diff includes a git header: a "diff --git" line, an "index" line and, for renames or copies, a "similarity index".

The hunks displayed for each chunk of changes are very similar to those of diff -u.
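To see the two formats side by side, here is a minimal sketch that compares the same pair of files with diff -u and with git diff --no-index. The file names and contents are placeholders.

```shell
set -e
work=$(mktemp -d)
cd "$work"

printf 'one\ntwo\nthree\n' > file1
printf 'one\n2\nthree\n' > file2

# Both commands exit with status 1 when the files differ.
diff -u file1 file2 > plain.diff || [ $? -eq 1 ]
git diff --no-index file1 file2 > git.diff || [ $? -eq 1 ]

echo "=== diff -u ===";  cat plain.diff
echo "=== git diff ==="; cat git.diff
```

The git output starts with a "diff --git" line and an "index" line; the hunks that follow use the same @@ notation as diff -u.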



  3. If you cannot exchange the commands (see 1.), how can you convert the files into the other format so that you can use them with the other command?

You can have git diff emit the raw format, or the patch together with the raw format: --patch-with-raw.

The reverse is possible: you can apply a diff to a git repo.
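A small sketch of that reverse direction, assuming a patch produced by plain diff -u between two hypothetical directories a/ and b/, applied to a repository with git apply:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Two plain directories, compared with ordinary diff (no git involved).
mkdir a b
printf 'red\ngreen\nblue\n' > a/colors.txt
printf 'red\nGREEN\nblue\n' > b/colors.txt
diff -u a/colors.txt b/colors.txt > plain.diff || [ $? -eq 1 ]

# A git repository that tracks the old version of the file.
git init -q repo
cd repo
cp ../a/colors.txt colors.txt
git add colors.txt
git -c user.name="Example" -c user.email="example@example.com" commit -q -m "Add colors"

# git apply strips one leading path component by default (like patch -p1),
# which turns a/colors.txt and b/colors.txt into colors.txt.
git apply --check ../plain.diff
git apply ../plain.diff
git status --short
```

git apply --check only verifies that the patch would apply; the second call actually modifies the working tree.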



  4. If you can exchange the commands (see 1.): is it even recommended to do so?

It is if you don't have git installed (see the previous example).



  5. Are there any other notable differences in the files created by the two commands?

No: the result of applying a patch generated by a diff or a git diff should be the same.

What is the difference between 'git format-patch' and 'git diff'?

A patch created with git format-patch also includes some meta-information about the commit (author, date, commit message, ...) and contains diffs of binary data. Everything is formatted as a mail, so that it can easily be sent. The person who receives it can then recreate the corresponding commit with git am, and all metadata will be intact. It can also be applied with git apply, as it is a superset of a simple diff.

A patch created with git diff is a simple diff with context (think diff -u). It can also be applied with git apply, but the metadata will not be recreated (as it is not present).

In summary, git format-patch is useful to transmit a commit, while git diff is useful to get a diff between two trees.
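The following sketch exports the same commit both ways and replays the format-patch version with git am. The repository names, identities and commit messages are placeholders chosen for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Source repository with two commits.
git init -q src
cd src
git config user.name "Example Author"
git config user.email "author@example.com"
echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git clone -q . ../dst                       # dst only has the first commit
echo "two" >> notes.txt
git commit -q -am "Extend notes"

# The same change, exported two ways.
git format-patch -1 -o "$work/out" > /dev/null   # mail-formatted, with metadata
git diff HEAD~1 HEAD > "$work/plain.diff"        # bare diff, no metadata

# git am recreates the commit in dst, including author and message.
cd ../dst
git config user.name "Example Committer"
git config user.email "committer@example.com"
git am -q "$work"/out/*.patch
git log -1 --format='%an | %s'
```

The file written by format-patch carries From:, Date: and Subject: headers, which is where git am takes the author and message from; plain.diff has none of them, so applying it with git apply would leave you to write the commit yourself.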

Diff between two patches

Your second workflow (apply each patch to the same base and diff the results) is the answer. Here is a possible alias:

[alias]
    diffdiff = !git stash save -u -q && \
        git apply $1 && git add -A && git commit -q -m"1" && \
        git reset --hard HEAD@{1} -q && \
        git apply $2 && git add -A && git commit -q -m"2" && \
        git reset --hard HEAD@{1} -q && git stash pop -q && \
        git diff HEAD@{3} HEAD@{1} && :

Call it on the branch the diffs should be applied to. The diff files must be located outside of the repository.

git diffdiff <diff_file_1> <diff_file_2>

This is not a multiplatform solution; it was tested on OS X with the zsh and bash shells.
If you are okay with doing it manually, it is quite a bit simpler; much of the complication here is there to provide a kind-of-foolproof solution. Also, the first line of output is noise: it seems git stash pop doesn't obey the --quiet parameter.

(Edit: There is no need for the temporary branch, so I simplified the alias.)
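For comparison, the manual route can be sketched without git at all. This is not the alias above but a swapped-in variant that uses plain diff and patch on copies of a directory; the base tree and both patches are generated inside the script purely so that it is self-contained.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical base tree plus two patches that both modify it.
mkdir base v1 v2
printf 'a\nb\nc\n' > base/file.txt
printf 'a\nB\nc\n' > v1/file.txt
printf 'a\nB\nC\n' > v2/file.txt
diff -ru base v1 > first.diff  || [ $? -eq 1 ]
diff -ru base v2 > second.diff || [ $? -eq 1 ]

# Apply each patch to its own copy of the base, then compare the results.
cp -R base with-first
cp -R base with-second
patch -s -p1 -d with-first  < first.diff
patch -s -p1 -d with-second < second.diff
diff -ru with-first with-second > between.diff || [ $? -eq 1 ]
cat between.diff
```

The resulting between.diff only contains what the second patch changes beyond the first one, which is usually easier to read than a diff of the two patch files themselves.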

How do diff/patch work and how safe are they?

What will trigger a merge conflict?

Let's look at a simplified model of merging first: when merging two branches, say a and b, that have a common ancestor c, git creates a patch to go from commit c to the commit at the head of a and tries to apply that patch to the tree at the head of b. If the patch fails, that's a merge conflict.

git by default uses a 3-way merge (the recursive strategy; since Git 2.34 its reimplementation, ort, is the default). The general idea is the same: if the 3-way merge fails because two commits from different branches changed the same lines, that's a merge conflict.
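A minimal way to provoke such a conflict, using the branch names a and b from the description (file name and contents are invented for the example):

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example"
git config user.email "example@example.com"

# Common ancestor c.
printf 'line 1\nline 2\nline 3\n' > file.txt
git add file.txt
git commit -q -m "c: common ancestor"
git branch a
git branch b

# Branches a and b both change line 2, in different ways.
git checkout -q a
printf 'line 1\nline 2 from a\nline 3\n' > file.txt
git commit -q -am "a: change line 2"
git checkout -q b
printf 'line 1\nline 2 from b\nline 3\n' > file.txt
git commit -q -am "b: change line 2"

# Merging a into b cannot pick a winner for line 2, so git reports a conflict
# and leaves conflict markers in the file.
if git merge a > "$work/merge.log" 2>&1; then
    echo "merged cleanly"
else
    echo "merge conflict"
fi
cat file.txt
```

If the two branches had changed different, non-adjacent lines instead, the same merge would go through without a conflict.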

Is the context also used by the tools in order to apply the patch?

Yes. If a hunk does not apply at the exact line number stored in the diff file, patch uses the context lines to search for the right place a few lines before or after it.
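This behaviour is easy to observe. In the sketch below (file names are placeholders) the patch is created against one file and then applied to a copy that has three extra lines at the top, so the recorded line numbers no longer match.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Original file and an edited copy that changes line 5.
for i in 1 2 3 4 5 6 7 8 9 10; do echo "line $i"; done > orig.txt
sed 's/^line 5$/LINE 5/' orig.txt > new.txt
diff -u orig.txt new.txt > change.diff || [ $? -eq 1 ]

# The target has three extra lines at the top, so every line number in the
# patch is off by three.
{ printf 'extra 1\nextra 2\nextra 3\n'; cat orig.txt; } > target.txt

# patch locates the hunk by its context lines and reports the offset it used.
patch target.txt < change.diff
sed -n '8p' target.txt
```

The changed line ends up on line 8 of target.txt (5 plus the 3 inserted lines), and patch mentions the offset in its output.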

How do they deal with changes that do not actually modify source code behavior? For example, swapping function definition places.

patch is not intelligent; it cannot differentiate between such changes. It regards a moved function as a couple of added and a couple of deleted lines. If a commit on one branch alters a function and a commit on another branch moves it unaltered, then an attempt to merge will always give you a merge conflict.

Are there any caveats/limitations regarding the tools that the user should be aware of?

As for patch and diff: no. diff dates from the 1970s and patch from the mid-1980s, and both are quite robust. As long as they don't complain, you can be fairly certain that they did what you intended.

That being said: git merge tries to resolve merge conflicts on its own. In some rare cases, things can go wrong here - this page has an example close to its end.

Have the algorithms been proven to not generate wrong results?
If not, are there implementations/papers proposing integration testing that at least prove them to be error-free empirically?

"Wrong results" is a fairly unspecific term in this context; I'd claim it cannot be proven. What is empirically proven is that applying a patch generated by diff a b to file a will in any case produce file b.
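That round-trip property can be checked directly. A minimal sketch with two made-up files a and b:

```shell
set -e
work=$(mktemp -d)
cd "$work"

printf 'first\nsecond\nthird\n' > a
printf 'first\n2nd\nthird\nfourth\n' > b
cp a a.saved

# Patch generated by "diff a b" ...
diff -u a b > a-to-b.diff || [ $? -eq 1 ]

# ... applied to a must yield a byte-for-byte copy of b.
patch -s a < a-to-b.diff
cmp a b && echo "a now equals b"

# The same patch applied in reverse (-R) restores the original a.
patch -s -R a < a-to-b.diff
cmp a a.saved && echo "a restored"
```

cmp exits with a non-zero status on the first differing byte, so with set -e the script would stop if either direction produced anything other than an exact copy.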

Source code, even when changed, will not change much (especially because of the algorithm implemented and syntax restrictions), but can the safety be generalized to generic text files?

Again, diff, patch and git do not differentiate between source code and other text files. git works as well on generic text files as it does on source code.

I'm pretty sure that Git is fully reliable, since it has the full history of commits and can traverse that history. What I would like is some pointers to academic research and references regarding this, if they exist.

Commits in git are snapshots of the tree with metadata, not diffs to the adjacent versions. Patch and diff are not involved in revision traversal at all. (But one level below the surface, git organizes blobs in pack files that do use a delta compression algorithm. Errors here would be easy to spot because git internally uses SHA-1 sums to identify objects, and the sum would change if an error occurred.)
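You can look at this yourself with git's plumbing commands. The sketch below uses a throwaway repository with a hypothetical file.txt; it prints a commit object, the IDs of two versions of the same file, and finally lets git verify its object store.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example"
git config user.email "example@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "first"
echo "version 2" > file.txt
git commit -q -am "second"

# A commit object names a complete tree (a snapshot), plus parent and metadata.
git cat-file -p HEAD

# Each version of the file is stored as its own blob, identified by the hash
# of its content; any change to the content changes the ID.
git rev-parse HEAD~1:file.txt
git rev-parse HEAD:file.txt

# Either version can be read back in full, no patching involved.
git cat-file -p HEAD~1:file.txt

# Check connectivity and validity of the stored objects.
git fsck
```

The commit printed by cat-file contains a "tree" line and a "parent" line but no diff; diffs are computed on demand when you ask for them.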

Why is LCS used instead of other string metric algorithms?

git uses Myers' algorithm by default. The original paper explains why it works the way it does. (It's not purely LCS.)

How to get patch or diff file - Git comparison between branches

Try:

git diff development..test > patch_name.patch

This creates a patch containing the changes that turn the tree of development into the tree of test. You can then apply the patch wherever you want.
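A sketch of the full round trip, assuming a throwaway repository whose branches are named development and test as in the command, with an invented file app.txt:

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example"
git config user.email "example@example.com"

# Two branches that differ in one file.
echo "stable" > app.txt
git add app.txt
git commit -q -m "initial"
git checkout -q -b development
git checkout -q -b test
echo "experimental" > app.txt
git commit -q -am "change on test"

# Everything needed to turn the development tree into the test tree.
git diff development..test > "$work/patch_name.patch"

# Check that the patch applies cleanly before touching any files.
git checkout -q development
git apply --check "$work/patch_name.patch"
git apply "$work/patch_name.patch"
cat app.txt
```

Note that git apply only changes the working tree; the change still has to be committed on development afterwards.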

Get diff (patch-file) that contains differences between local and remote repository

Assuming you are in the master branch:

$ git diff --no-prefix origin/master > save.patch
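Because --no-prefix drops the a/ and b/ path prefixes, such a patch has to be applied with -p0. The sketch below simulates the remote with a local repository; the directory names remote, local and other and the file readme.txt are assumptions made for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the remote repository, with a branch called master.
git init -q remote
cd remote
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo "upstream" > readme.txt
git add readme.txt
git commit -q -m "initial"
cd ..

# Local clone with an uncommitted change.
git clone -q remote local
cd local
echo "local edit" >> readme.txt
git diff --no-prefix origin/master > "$work/save.patch"
cd ..

# The paths in save.patch carry no a/ or b/ prefix, so nothing must be
# stripped when applying: use -p0 with either git apply or patch.
git clone -q remote other
cd other
git apply -p0 "$work/save.patch"    # or: patch -p0 < "$work/save.patch"
cat readme.txt
```

Without --no-prefix the default -p1 of git apply would be the right choice instead.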

