Given Two Directory Trees, How to Find Out Which Files Differ by Content

Given two directory trees, how can I find out which files differ by content?

Try:

diff --brief --recursive dir1/ dir2/

Or alternatively, with the short flags -qr:

diff -qr dir1/ dir2/

If you also want to compare files that exist in only one of the directories (a missing file is then treated as empty):

diff --brief --recursive --new-file dir1/ dir2/  # with long options
diff -qrN dir1/ dir2/ # with short flag aliases
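To see what the two invocations report, here is a minimal sketch that runs them on a throwaway tree. The file names (a.txt, sub/b.txt, only1.txt) are made up for illustration, and the exact message wording assumes GNU diff in the C locale.

```shell
#!/usr/bin/env bash
# Build two small throwaway trees (hypothetical file names).
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1/sub" "$tmp/dir2/sub"
echo "same"  > "$tmp/dir1/a.txt";     echo "same" > "$tmp/dir2/a.txt"
echo "old"   > "$tmp/dir1/sub/b.txt"; echo "new"  > "$tmp/dir2/sub/b.txt"
echo "extra" > "$tmp/dir1/only1.txt"   # exists only in dir1

cd "$tmp" || exit 1

# diff exits 0 if the trees match, 1 if they differ, 2 on trouble.
brief=$(LC_ALL=C diff -qr dir1/ dir2/) && status=0 || status=$?
echo "$brief"
echo "exit status: $status"

# With -N, a file missing on one side is treated as empty,
# so it is reported as differing instead of as "Only in ...".
brief_new=$(LC_ALL=C diff -qrN dir1/ dir2/) || true
echo "$brief_new"
```

The identical a.txt is not mentioned in either listing; only sub/b.txt and only1.txt show up.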

Given two directory trees, how to find which filenames are the same, considering only filenames satisfying a condition?

This is untested, but I'd try something like:

comm -12 <(cd dir1 && ls E*) <(cd dir2 && ls E*)

Basic idea:

  • Generate a list of filenames in dir1 that satisfy our condition. This can be done with ls E* because we're only dealing with a flat list of files. For subdirectories and recursion we'd use find instead (e.g. find . -name 'E*' -type f).

  • Put the filenames in a canonical order (e.g. by sorting them). We don't have to do anything here because E* expands in sorted order anyway. With find we might have to pipe the output into sort first.

  • Do the same thing to dir2.

  • Only output lines that are common to both lists, which can be done with comm -12.

    comm expects to be passed two filenames on the command line, so we use bash's <( ... ) process substitution to spawn a subprocess and connect its output to a named pipe (or a /dev/fd entry, depending on the system); the name of that pipe can then be given to comm.
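The steps above, extended to subdirectories with find and sort, could look roughly like this. It is a sketch under assumptions: the helper name list_matching and the sample file names are hypothetical, bash is required for <( ... ), and filenames containing newlines would break the line-based comparison.

```shell
#!/usr/bin/env bash
# Throwaway trees with a few files, some starting with "E" (hypothetical names).
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1/sub" "$tmp/dir2/sub"
touch "$tmp/dir1/Eone" "$tmp/dir1/Etwo"   "$tmp/dir1/sub/Edeep" "$tmp/dir1/other"
touch "$tmp/dir2/Eone" "$tmp/dir2/Ethree" "$tmp/dir2/sub/Edeep" "$tmp/dir2/other"

# Print the relative paths of regular files matching the condition,
# sorted in the same (C) collation order that comm is run with.
list_matching() {
    (cd "$1" && find . -type f -name 'E*' | LC_ALL=C sort)
}

# -12 suppresses lines unique to either list, leaving the common ones.
common=$(LC_ALL=C comm -12 <(list_matching "$tmp/dir1") <(list_matching "$tmp/dir2"))
echo "$common"
```

Running find from inside each directory keeps the paths relative, so the same file in both trees yields the same line.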

Given two directory trees, how to find which files are the same?

Well, I found the answer myself. I had tried it before, but I thought it did not work.

diff -srq dir1/ dir2/ | grep identical

What does -srq mean? From diff --help:

-s  --report-identical-files  Report when two files are the same.
-r  --recursive               Recursively compare any subdirectories found.
-q  --brief                   Output only whether files differ.
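A quick way to check the behaviour is to run the pipeline on a throwaway tree, as in this sketch. The file names are hypothetical, and LC_ALL=C is an addition here so that the word "identical" appears regardless of the system language.

```shell
#!/usr/bin/env bash
# Two throwaway directories: a.txt matches, b.txt does not (hypothetical names).
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1" "$tmp/dir2"
echo "same" > "$tmp/dir1/a.txt"; echo "same" > "$tmp/dir2/a.txt"
echo "old"  > "$tmp/dir1/b.txt"; echo "new"  > "$tmp/dir2/b.txt"

cd "$tmp" || exit 1

# -s makes diff report identical pairs; grep keeps only those lines.
identical=$(LC_ALL=C diff -srq dir1/ dir2/ | grep identical) || true
echo "$identical"
```

If a filename itself could contain the word "identical", anchoring the pattern (for example grep ' are identical$') is a safer filter.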

Diff files present in two different directories

You can use the diff command for that:

diff -bur folder1/ folder2/

This will output a recursive diff that ignores changes in the amount of whitespace, in unified format:

  • b flag means ignoring changes in the amount of whitespace (use w to ignore all whitespace)
  • u flag means a unified context (3 lines before and after)
  • r flag means recursive
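As an illustration of the three flags together, the sketch below compares two throwaway folders in which one file differs only in spacing and another differs in content. The file names and contents are invented for the example.

```shell
#!/usr/bin/env bash
# Throwaway folders (hypothetical file names and contents).
tmp=$(mktemp -d)
mkdir -p "$tmp/folder1" "$tmp/folder2"
printf 'int x = 1;\n'    > "$tmp/folder1/ws.c"
printf 'int  x  =  1;\n' > "$tmp/folder2/ws.c"      # only the spacing changed
printf 'hello\n'   > "$tmp/folder1/msg.txt"
printf 'goodbye\n' > "$tmp/folder2/msg.txt"         # real content change

cd "$tmp" || exit 1

# diff exits 1 when differences are found, so guard the assignment.
patch_text=$(diff -bur folder1/ folder2/) || true
echo "$patch_text"
```

Only msg.txt appears in the output, as a unified hunk with -hello and +goodbye; ws.c is skipped because -b treats the spacing change as no change.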

How do I compare two source trees in Linux?

You can try Meld. It is a wonderful visual diff tool ;-)


