Compare md5 sums in bash script
The problem you're seeing appears to be that the format of the md5sum.txt file you create doesn't match the format of the .md5 file you download, which is what you need to check your calculated value against.
The following would be closer to my version of the script. (Explanation below.)
#!/bin/bash
if ! cd /home/example/public_html/exampledomain.com/billing/system/; then
echo "Can't find work directory" >&2
exit 1
fi
rm -f GeoLiteCity.dat
curl -L https://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz | gunzip > GeoLiteCity.dat
curl -L https://geolite.maxmind.com/download/geoip/database/GeoLite2-City.mmdb.gz | gunzip > GeoLite2-City.dat
curl -O https://geolite.maxmind.com/download/geoip/database/GeoLite2-City.md5
md5sum < GeoLite2-City.dat | cut -d' ' -f1 > md5sum.txt
file1="md5sum.txt"
file2="GeoLite2-City.md5"
if ! cmp --silent "$file1" "$file2"; then
mail -s "Results of GeoLite Updates" email@address.com <<< "md5sum for GeoLite2-City failed. Please check the md5sum. File may possibly be corrupted."
fi
The major differences here are:
- rm -f GeoLiteCity.dat instead of -rf. Let's not reach farther than we need to.
- md5sum takes standard input rather than processing the file by name. The effect is that the output does not include a filename. Unfortunately, because of limitations of the Linux md5sum command, this still doesn't match the .md5 file you download from Maxmind, so:
- cut is used to modify the resultant output, leaving only the calculated md5.
- using cmp instead of subshells, per comments on your question.
The second and third points are perhaps the most important ones for you.
Another option for creating your md5sum.txt file would be to do it on the fly as you download. For example:
curl -L https://geolite.maxmind.com/download/geoip/database/GeoLite2-City.mmdb.gz \
| gunzip | tee GeoLite2-City.dat | md5sum | cut -d' ' -f1 > md5sum.txt
This uses the tee command to split the decompressed stream into its "save" location and another pipe, which goes through md5sum (and then cut, to keep only the hash) to generate your .txt file.
Might save you a minute that would otherwise be eaten by the md5sum that runs afterwards. And it'll take better advantage of SMP. :)
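The comparison of the calculated hash against the downloaded .md5 file can also be done without a temporary md5sum.txt. The following is only a sketch: verify_md5 is a hypothetical helper name, it assumes GNU md5sum is available, and it assumes the published .md5 file carries the hash as its first whitespace-separated field.

```shell
#!/bin/bash
# Sketch: compare a file's MD5 against the hash stored in a separate .md5 file.
# verify_md5 is a hypothetical helper; it assumes GNU md5sum is installed.
verify_md5() {
    local data_file=$1 sum_file=$2
    local actual expected
    # Reading from stdin keeps the file name out of md5sum's output;
    # the first space-separated field is the hash itself.
    actual=$(md5sum < "$data_file" | cut -d' ' -f1)
    # Use only the first field of the published file, lower-cased, so that a
    # missing trailing newline or an extra file name column does not matter.
    expected=$(awk 'NR==1 {print tolower($1)}' "$sum_file")
    [[ -n $expected && $actual == "$expected" ]]
}
```

With the file names from the script above, it could be called as: verify_md5 GeoLite2-City.dat GeoLite2-City.md5 || echo "md5 mismatch" >&2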
Comparing content of 2 files with md5sum
You need a program or built-in that evaluates the comparison. Usually you would use test, [ or [[ to do so. With these, -eq compares decimal numbers, so use the string comparison = instead.
[[ "$(md5sum < file_1.sql)" = "$(md5sum < file_2.sql)" ]]
The exit code $? of this command tells you whether the two strings were equal.
However, you may want to use cmp instead. This program compares the files directly, should be faster because it doesn't have to compute anything, and is also safer because, unlike a hash comparison, it cannot give false positives.
cmp file_1.sql file_2.sql
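In a script you usually want to branch on the result rather than read cmp's message. A minimal sketch, where same_content is a hypothetical wrapper name:

```shell
#!/bin/bash
# Sketch: report whether two files are byte-for-byte identical using cmp.
# same_content is a hypothetical helper name.
same_content() {
    # -s suppresses all output. The exit status is 0 when the files are
    # identical, 1 when they differ, and greater than 1 on error.
    cmp -s -- "$1" "$2"
}

# Usage, with the file names from the example above:
#   if same_content file_1.sql file_2.sql; then
#       echo "identical"
#   else
#       echo "different (or unreadable)"
#   fi
```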
compare files in shell script with md5sum and create csv for the changed file
Food for thought, maybe.
This script checks whether the two files differ and, if so, prints the differing lines reduced to the fields you indicated you wished to save to CSV.
#!/bin/bash
# Check whether the files differ by grepping diff's summary for "differ".
# diff -q would normally print: Files file2 and file1 differ
# grep flags: -F fixed string, -w match only whole words,
# -q quiet, i.e. no output to stdout (screen)
if diff -q "$2" "$1" | grep -Fwq "differ"
then
    # Collect the lines diff marks with ">" (present only in the
    # second file given to diff), then let awk select the fields
    # you want and join them with commas
    changeSyn=$(diff "$2" "$1" | awk '$1 ~ /^ *>/' | awk '{print $2","$5","$7 }')
    # Same again for the lines diff marks with "<"
    addedSyn=$(diff "$2" "$1" | awk '$1 ~ /^ *</' | awk '{print $2","$5","$7 }')
    echo "$changeSyn"
    echo "$addedSyn"
else
    echo "No change"
fi
Bash - Compare 2 lists of files with their md5 check sums
An attempt using Awk, which is the right tool for this:
awk -F"/" 'FNR==NR{filearray[$1]=$NF; next }!($1 in filearray){printf "%s has a different md5sum\n",$NF}' file2 file1
file-4.php has a different md5sum
Where file2 and file1 are as follows:
$ cat file1
df7a0edcb7994581430379db56d8d53b /home/user/vanila/file-1.php
e1af39e94239a944440ab2925393ae60 /home/user/vanila/file-2.php
ce74e43d24d9c36cd579e932ee94b152 /home/user/vanila/file-3.php
95b7d47ed7134912270f8d3059100e8c /home/user/vanila/file-4.php
$ cat file2
df7a0edcb7994581430379db56d8d53b /home/user/file-1.php
94b2a24a1fc9883246fc103f22818930 /home/user/file-1.1.php
e1af39e94239a944440ab2925393ae60 /home/user/file-2.php
ce74e43d24d9c36cd579e932ee94b152 /home/user/file-3.php
f5233ee990c50aade7c4e3ab9b4fe524 /home/user/file-4.php
To find a file that is present in one list but not in the other,
awk -F"/" 'FNR==NR{filelist[$NF]=$NF; next;}!($NF in filelist){printf "%s is an extra file\n",$NF}' file1 file2
file-1.1.php is an extra file
How to compare md5 hash values on a condition on shell script?
The issue you are having with the (updated) posted code is that you are using a for loop where a while loop is needed.
The following code works for me. I simply changed the for loop to a while loop.
#!/bin/sh
check() {
    dir="$1"
    chsum1=$(find ~/NASAtest -type f -exec cat {} \; | md5)
    chsum2=$chsum1
    # Quote both operands and keep the spaces inside the brackets;
    # = is the portable string comparison in sh.
    while [ "$chsum1" = "$chsum2" ]
    do
        echo "hello"
        sleep 10
        chsum2=$(find ~/NASAtest -type f -exec cat {} \; | md5)
    done
    echo "hello"
    #eval $2
}
check "$@"
The reason the while loop wasn't working is because you were missing spaces between the square brackets and the expression.
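One thing the loop above relies on is that find lists the files in the same order on every run. A hedged sketch of a checksum helper that sorts the list first follows. dir_checksum and wait_for_change are hypothetical names, and the sketch assumes bash with GNU md5sum rather than the BSD md5 used above, and file names without embedded newlines.

```shell
#!/bin/bash
# Sketch: one MD5 over the contents of every regular file in a directory.
# dir_checksum is a hypothetical helper; assumes md5sum and newline-free names.
dir_checksum() {
    local dir=$1
    # Sort the file list so the concatenation order is stable between runs.
    find "$dir" -type f -print | LC_ALL=C sort |
        while IFS= read -r path; do
            cat -- "$path"
        done | md5sum | cut -d' ' -f1
}

# Block until the directory's checksum changes, polling every $2 seconds
# (default 10). Note the quoted operands and the spaces inside [ ].
wait_for_change() {
    local dir=$1 interval=${2:-10} before after
    before=$(dir_checksum "$dir")
    after=$before
    while [ "$before" = "$after" ]; do
        sleep "$interval"
        after=$(dir_checksum "$dir")
    done
}
```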
How to detect only the different files in my bash shell script?
Here is your script corrected:
while IFS= read -r filename;
do
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# inspecting the digest of each file individually #
# shows many files are identical and so are the digests #
# It also prints MD5 (full file path) = md5_signature! #
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
md5 "old/$filename" # please use double quotes
md5 "new/$filename"
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# Using -q eliminates all output from md5 except the sig #
# Your script now works correctly #
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
[[ $(md5 -q "old/$filename") == $(md5 -q "new/$filename") ]] || echo differs; # differs
done < files.txt
Problems:
- You had a typo of new/$fullfile rather than new/$filename
- You should use "new/$filename" (ie, use double quotes) around the file name expansions
- Use md5 -q to compare output of md5 on different files. Otherwise md5, by default, prints the input file path in the form of MD5 (full_path/base_name) = 2504fcc0c0a57d14aa6b4193b5efaf94. Since these paths are guaranteed to be different in two different directories, the different path names will cause the failure in the string comparison.
The comments above assume you are using md5 on BSD or, likely, on macOS.
Here is an alternate solution that works both on Linux with md5sum and BSD with md5. Just feed the content of the file to the stdin of either program and no file path is printed:
$ md5 <new/file.pdf
2504fcc0c0a57d14aa6b4193b5efaf94
vs if you use the file name, both the path and the MD5 hash are printed:
$ md5 new/file.pdf
MD5 (new/file.pdf) = 2504fcc0c0a57d14aa6b4193b5efaf94
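The same holds true for md5sum on Linux or GNU core utilities, except that md5sum prints "-" in place of the file name when it reads from stdin. That placeholder is identical for both files, so the string comparison still works.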
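The stdin approach can be wrapped so a script does not need to know which tool is installed. This is a sketch only: md5_of is a hypothetical helper name, and it assumes that one of md5sum or md5 is on the PATH.

```shell
#!/bin/sh
# Sketch: print only the MD5 hash of a file on both GNU and BSD systems.
# md5_of is a hypothetical helper name.
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        # GNU md5sum prints "<hash>  -" for stdin; keep only the hash.
        md5sum < "$1" | cut -d' ' -f1
    elif command -v md5 >/dev/null 2>&1; then
        # BSD/macOS md5 prints only the hash when reading stdin.
        md5 < "$1"
    else
        echo "no md5 tool found" >&2
        return 1
    fi
}

# Usage inside the loop from the script above:
#   [ "$(md5_of "old/$filename")" = "$(md5_of "new/$filename")" ] || echo differs
```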
MD5 comparison between two text files
I don't know if such a command exists, but I've taken the liberty of writing you a sorting mechanism in Bash. Although it's optimised, I suggest you recreate it in a language of your own choice.
#! /bin/bash
# Sets the array delimiter to a newline
IFS=$'\n'
# If $1 is empty, default to 'file1.txt'. Same for $2.
FILE1=${1:-file1.txt}
FILE2=${2:-file2.txt}
DELETED=()
ADDED=()
CHANGED=()
# Loop over array $1 and print content
function array_print {
# -n creates a "pointer" to an array. This
# way you can pass large arrays to functions.
local -n array=$1
echo "$1: "
for i in "${array[@]}"; do
echo "$i"
done
}
# This function loops over the entries in file_in and checks
# if they exist in file_tst. Unless doubles are found, a
# callback is executed.
function array_sort {
local file_in="$1"
local file_tst="$2"
local callback=${3:-true}
local -n arr0=$4
local -n arr1=$5
while read -r line; do
tst_hash=$(grep -Eo '^[^ ]+' <<< "$line")
tst_name=$(grep -Eo '[^ ]+$' <<< "$line")
hit=$(grep -F -- "$tst_name" "$file_tst")
# If found, skip. Nothing is changed.
[[ $hit != "$line" ]] || continue
# Run callback
$callback "$hit" "$line" arr0 arr1
done < "$file_in"
}
# If tst is empty, line will be added to not_found. For file 1 this
# means that file doesn't exist in file2, thus is deleted. Otherwise
# the file is changed.
function callback_file1 {
local tst=$1
local line=$2
local -n not_found=$3
local -n found=$4
if [[ -z $tst ]]; then
not_found+=($line)
else
found+=($line)
fi
}
# If tst is empty, line will be added to not_found. For file 2 this
# means that file doesn't exist in file1, thus is added. Since the
# callback for file 1 already filled all the changed files, we do
# nothing with the fourth parameter.
function callback_file2 {
local tst=$1
local line=$2
local -n not_found=$3
if [[ -z $tst ]]; then
not_found+=($line)
fi
}
array_sort "$FILE1" "$FILE2" callback_file1 DELETED CHANGED
array_sort "$FILE2" "$FILE1" callback_file2 ADDED CHANGED
array_print ADDED
array_print DELETED
array_print CHANGED
exit 0
Since it might be hard to understand the code above, I've written it out. I hope it helps :-)
while read -r line; do
tst_hash=$(grep -Eo '^[^ ]+' <<< "$line")
tst_name=$(grep -Eo '[^ ]+$' <<< "$line")
hit=$(grep -F -- "$tst_name" "$FILE2")
# If found, skip. Nothing is changed.
[[ $hit != "$line" ]] || continue
# If name does not occur, it's deleted (exists in
# file1, but not in file2)
if [[ -z $hit ]]; then
DELETED+=($line)
else
# If name occurs, it's changed. Otherwise it would
# not come here due to previous if-statement.
CHANGED+=($line)
fi
done < "$FILE1"
while read -r line; do
tst_hash=$(grep -Eo '^[^ ]+' <<< "$line")
tst_name=$(grep -Eo '[^ ]+$' <<< "$line")
hit=$(grep -F -- "$tst_name" "$FILE1")
# If found, skip. Nothing is changed.
[[ $hit != "$line" ]] || continue
# If name does not occur, it's added. (exists in
# file2, but not in file1)
if [[ -z $hit ]]; then
ADDED+=($line)
fi
done < "$FILE2"
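The loops above run grep once per line, which gets slow for long lists. As an alternative sketch (not the method used above), the same added/deleted/changed classification can be done in a single pass per file with an associative array. compare_lists is a hypothetical name, bash 4 or later is assumed, and each input line is assumed to have the form "hash name".

```shell
#!/bin/bash
# Sketch: classify entries of two "hash name" lists as ADDED, CHANGED or DELETED.
# compare_lists is a hypothetical helper; requires bash 4+ (associative arrays).
compare_lists() {
    local old_list=$1 new_list=$2 hash name
    local -A old_hashes=() seen=()

    # Remember the hash recorded for every name in the old list.
    while read -r hash name || [[ -n $hash ]]; do
        [[ -n $name ]] || continue
        old_hashes[$name]=$hash
    done < "$old_list"

    # Walk the new list: unknown names are added, known names with a
    # different hash are changed.
    while read -r hash name || [[ -n $hash ]]; do
        [[ -n $name ]] || continue
        seen[$name]=1
        if [[ -z ${old_hashes[$name]+set} ]]; then
            echo "ADDED $name"
        elif [[ ${old_hashes[$name]} != "$hash" ]]; then
            echo "CHANGED $name"
        fi
    done < "$new_list"

    # Names from the old list that never appeared in the new one are deleted.
    for name in "${!old_hashes[@]}"; do
        [[ -n ${seen[$name]+set} ]] || echo "DELETED $name"
    done
}
```

It would be called as compare_lists file1.txt file2.txt. The DELETED lines come out in no particular order, so pipe the output through sort if a stable order matters.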
Bash script md5sum
#! /bin/bash
while read -r user passwd ; do
md5=$(printf %s "$passwd" | md5sum | cut -c1-32)
printf '%s %s %s\n' "$user" "$passwd" "$md5"
done