How to Check If Sed Has Changed a File

You could use awk instead:

awk '$0 ~ p { gsub(p, r); t=1} 1; END{ exit (!t) }' p="$pattern" r="$repl"

I'm ignoring sed's -i feature: you can use the shell to do redirections as necessary.

Sigh. Many comments below ask for a basic tutorial on the shell. You can use the above command as follows:

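if awk '$0 ~ p { gsub(p, r); t=1} 1; END{ exit (!t) }' \
    p="$pattern" r="$repl" "$filename" > "${filename}.new"; then
    # At least one line matched: copy the result over the original
    cat "${filename}.new" > "${filename}"
    # DO SOME OTHER STUFF HERE
else
    # No line matched: the original file is left untouched
    # DO SOME OTHER STUFF HERE
    :   # placeholder, because an else branch with only comments is a syntax error
fi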

It is not clear to me if "DO SOME OTHER STUFF HERE" is the same in each case. Any similar code in the two blocks should be refactored accordingly.
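
For readers who want this as a reusable piece, here is a minimal sketch of the same awk approach wrapped in a shell function. The function name replace_if_matched and the use of mktemp for the temporary file are assumptions made for illustration, not part of the answer above.

```shell
#!/bin/sh
# replace_if_matched PATTERN REPLACEMENT FILE   (hypothetical helper name)
# Returns 0 if at least one line matched and FILE was rewritten, 1 otherwise.
replace_if_matched() {
    pattern=$1
    repl=$2
    file=$3
    tmp=$(mktemp) || return 2
    if awk '$0 ~ p { gsub(p, r); t = 1 } 1; END { exit (!t) }' \
        p="$pattern" r="$repl" "$file" > "$tmp"; then
        cat "$tmp" > "$file"    # overwrite in place, keeping inode and permissions
        rm -f "$tmp"
        return 0
    fi
    rm -f "$tmp"
    return 1
}

# Usage on a throwaway file
demo=$(mktemp)
printf 'foo one\nplain\n' > "$demo"
if replace_if_matched foo bar "$demo"; then
    echo "changed"
else
    echo "unchanged"
fi
rm -f "$demo"
```

Note that any awk failure (for example an unreadable file) is also reported as "not matched" in this sketch; a stricter version would inspect the exit status more closely.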

How to check if the sed command replaced some string?

sed is not the right tool if you need to count the substitutions; awk will fit your needs better:

awk -v OLD=foo -v NEW=bar '
    ($0 ~ OLD) { gsub(OLD, NEW); count++ } 1
    END { printf "%d substitutions occurred.\n", count }
' "$source_filename"

This solution counts only the number of lines on which a substitution took place. The next snippet counts every substitution, using perl. It has the advantage of being clearer than the awk version, and it keeps sed's substitution syntax:

OLD=foo NEW=bar perl -pe '
    $count += s/$ENV{OLD}/$ENV{NEW}/g;
    END{print "$count substitutions occurred.\n"}
' "$source_filename"

Edit

Thanks to William, who found the $count += s///g trick to count the number of substitutions (whether or not they are on the same line).
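
awk can count every substitution as well, because gsub returns the number of replacements it made. A minimal sketch, using the hypothetical function name count_substitutions; it prints only the count and leaves the file alone:

```shell
#!/bin/sh
# count_substitutions OLD NEW FILE   (hypothetical helper name)
# Prints how many times OLD (an awk regular expression) would be replaced.
count_substitutions() {
    awk -v OLD="$1" -v NEW="$2" '
        { count += gsub(OLD, NEW) }   # gsub returns the number of replacements
        END { print count + 0 }       # "+ 0" prints 0 instead of an empty line
    ' "$3"
}

demo=$(mktemp)
printf 'foo foo\nfoo\nbar\n' > "$demo"
count_substitutions foo bar "$demo"    # prints 3
rm -f "$demo"
```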

Find out if file has changed

Michael, by "changed", are you asking if the file has been touched (i.e. datestamp is newer), or are you asking if the content is different?

If the former, you can test this with find or test. For example, in shell:

#!/bin/sh
touch file1
sleep 1
touch file2
if [ "file1" -nt "file2" ]; then
    echo "This will never be seen."
else
    echo "Sure enough, file1 is older."
fi

If what you're looking for is a test of the contents, then your operating system probably includes something that will test the hash of a file.

[ghoti@pc ~]$ date > testfile
[ghoti@pc ~]$ md5 testfile
MD5 (testfile) = 1b2faf8be02641f37e6d87b15444417d
[ghoti@pc ~]$ cksum testfile
3778116869 29 testfile
[ghoti@pc ~]$ sha1 testfile
SHA1 (testfile) = 5f4076a3828bc23a050be4867549996180c2a09a
[ghoti@pc ~]$ sha256 testfile
SHA256 (testfile) = f083afc28880319bc31417c08344d6160356d0f449f572e78b343772dcaa72aa
[ghoti@pc ~]$

I'm in FreeBSD. If you're in Linux, then you probably have "md5sum" instead of "md5".

To put this into a script, you'd need to walk through your list of files, store their hashes, then have a mechanism to test current files against their stored hashes. This is easy enough to script:

[ghoti@pc ~]$ find /bin -type f -exec md5 {} \; > /tmp/md5list
[ghoti@pc ~]$ head -5 /tmp/md5list
MD5 (/bin/uuidgen) = 5aa7621056ee5e7f1fe26d8abb750e7a
MD5 (/bin/pax) = 7baf4514814f79c1ff6e5195daadc1fe
MD5 (/bin/cat) = f1401b32ed46802735769ec99963a322
MD5 (/bin/echo) = 5a06125f527c7896806fc3e1f6f9f334
MD5 (/bin/rcp) = 84d96f7e196c10692d5598a06968b0a5

You can store this list in a predictable location (and instead of /bin, run it against whatever is important, perhaps /), then write a quick script to check a file against its stored hash:

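#!/bin/sh

sumfile=/tmp/md5list

# Require an argument that names an existing regular file
if [ -z "$1" ] || [ ! -f "$1" ]; then
    echo "I need a file."
    exit 1
# -F treats the path as a fixed string, so dots in it are not regex wildcards
elif ! grep -qF "($1)" "$sumfile"; then
    echo "ERROR: Unknown file: $1."
    exit 1
fi

# Current hash line, in the same "MD5 (path) = hash" format as the stored list
newsum=$(md5 "$1")

# -x requires the whole stored line to match
if grep -qxF "$newsum" "$sumfile"; then
    echo "$1 matches"
else
    echo "$1 IS MODIFIED"
fi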

This kind of script is what tools like tripwire provide.
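
Applied to the original sed question, the same idea works on a single file: hash it before and after the edit and compare. The sketch below uses cksum because it is available on both FreeBSD and Linux. The function name file_changed_by is hypothetical, and sed -i.bak is used in the demo because the attached-suffix form is accepted by both GNU and BSD sed.

```shell
#!/bin/sh
# file_changed_by FILE COMMAND [ARGS...]   (hypothetical helper name)
# Runs COMMAND and returns 0 if the content of FILE changed, 1 if it did not.
file_changed_by() {
    target=$1
    shift
    before=$(cksum < "$target")   # checksum and size before the command
    "$@"
    after=$(cksum < "$target")    # checksum and size after the command
    [ "$before" != "$after" ]
}

demo=$(mktemp)
printf 'foo\n' > "$demo"
if file_changed_by "$demo" sed -i.bak 's/foo/bar/' "$demo"; then
    echo "content changed"
else
    echo "content unchanged"
fi
rm -f "$demo" "$demo.bak"
```

cksum is a CRC rather than a cryptographic hash, which is enough for spotting an edit but not for detecting deliberate tampering.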

Return value of sed for no match

As @cnicutar commented, the return code of a command tells you whether the command itself executed successfully. It has nothing to do with the logic you implemented in your code or scripts.

so if you have:

echo "foo"|sed '/bar/ s/a/b/'

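sed will return 0. But if the script contains a syntax or expression error, or the input file doesn't exist, sed cannot carry out your request and returns a non-zero status (GNU sed documents 1 for an invalid command or syntax and 2 for an input file that cannot be found).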

workaround

This is not actually a workaround: sed has a q command (from the man page):

 q [exit-code]

Here you can set exit-code to whatever you want (the exit-code argument is a GNU extension). For example, '/foo/!{q100}; {s/f/b/}' exits with code 100 if foo isn't present on the line; otherwise it performs the substitution f->b and exits with code 0.

Matched case:

kent$  echo "foo" | sed  '/foo/!{q100}; {s/f/b/}'
boo
kent$ echo $?
0

Unmatched case:

kent$ echo "trash" | sed  '/foo/!{q100}; {s/f/b/}'
trash
kent$ echo $?
100

I hope this answers your question.

edit

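I must add that the above example only covers one-line input: with more lines, q100 fires on the first line that doesn't contain foo, and sed stops there. I don't know your exact requirement, i.e. whether you want a non-zero exit code as soon as one line doesn't match, or only when nothing in the whole file matches. For the whole-file case, you may consider awk, or even run grep before your text processing...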
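
For the whole-file case there is also a sed-only route that the answer above does not mention: the w flag of the s command writes every line on which a substitution happened to a file, so a non-empty log means something was replaced. A minimal sketch, with the hypothetical function name sed_sub_changed; it assumes the pattern and replacement contain no slashes, and it uses -i.bak because that form works with both GNU and BSD sed:

```shell
#!/bin/sh
# sed_sub_changed PATTERN REPLACEMENT FILE   (hypothetical helper name)
# Returns 0 if at least one substitution was made in FILE, 1 otherwise.
sed_sub_changed() {
    log=$(mktemp) || return 2
    # "w $log" must come last: the file name runs to the end of the expression
    sed -i.bak "s/$1/$2/gw $log" "$3"
    if [ -s "$log" ]; then rc=0; else rc=1; fi   # non-empty log: a line was substituted
    rm -f "$log" "$3.bak"
    return "$rc"
}

demo=$(mktemp)
printf 'foo\ntrash\n' > "$demo"
if sed_sub_changed f b "$demo"; then
    echo "changed"
else
    echo "unchanged"
fi
rm -f "$demo"
```

One caveat of this design: a substitution that replaces text with identical text still counts as a change, because sed logs the line whenever the pattern matched.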

Check if file has been modified

One option is to check whether the file has been modified. You can do that by adding a backup-file extension to the -i option:

perl -pi.orig -e 's/contoso/'"$hostname"'/g' /etc/inet/hosts

This command stores the original content of /etc/inet/hosts in /etc/inet/hosts.orig and then runs the specified substitution. Afterwards you can check whether the two files differ, for example with the cmp command:

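# cmp -s is silent and exits non-zero when the files differ
if ! cmp -s foo.txt foo.txt.orig; then
    echo OK       # files differ: the substitution changed something
else
    echo ERROR    # files are identical: nothing was replaced
fi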

Remove the .orig file after that.
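
The same backup-and-compare pattern carries over to sed, whose -i option also accepts a backup suffix. A minimal sketch on a throwaway file; the function name edit_and_compare and the sample host entry are made up for illustration:

```shell
#!/bin/sh
# edit_and_compare SED_SCRIPT FILE   (hypothetical helper name)
# Edits FILE in place, removes the backup, and returns 0 if FILE changed, 1 if not.
edit_and_compare() {
    sed -i.orig "$1" "$2" || return 2
    if cmp -s "$2" "$2.orig"; then rc=1; else rc=0; fi
    rm -f "$2.orig"
    return "$rc"
}

demo=$(mktemp)
printf '10.0.0.1 contoso\n' > "$demo"
if edit_and_compare 's/contoso/example/g' "$demo"; then
    echo OK
else
    echo ERROR
fi
rm -f "$demo"
```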

The other option is to modify the script to read the content of the file, replace the required entry, check whether a change actually happened, and return a proper status at the end, which you can then verify in the shell using $?. You have been given a solution in this answer.

Does sed have an option just for checking existence of string in file?

You can set the exit code using the q command in sed.

q [exit-code]

Immediately quit the sed script without processing any more input, except that if auto-print is not disabled the current pattern space will be printed. The exit code argument is a GNU extension.

See also this answer for a clue.

EDIT:

As @cbuckley kindly linked to, you can find directions here.

How to do sed in-place replacement without a backup file if the original file is not changed?

I learned, as I suspected after reading the manual, that what I was asking for is not possible with sed alone.

I solved it by guarding the replacement with an "if grep ..." test on the expression to be replaced.
The sed command is run if and only if the expression exists in the file.

Thanks to the people who commented.
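
A minimal sketch of that guard, assuming a basic-regular-expression pattern without slashes; the function name replace_if_present and the .bak suffix are illustrative choices. Because sed only runs when grep finds the pattern, a backup file is created only when the file is actually going to change:

```shell
#!/bin/sh
# replace_if_present PATTERN REPLACEMENT FILE   (hypothetical helper name)
# Runs sed (keeping a .bak backup) only if PATTERN occurs in FILE.
replace_if_present() {
    if grep -q -e "$1" "$3"; then
        sed -i.bak "s/$1/$2/g" "$3"
    else
        return 1    # nothing to replace: sed is not run, no backup is written
    fi
}

demo=$(mktemp)
printf 'foo\n' > "$demo"
replace_if_present zzz x "$demo" || echo "not changed, no backup"
replace_if_present foo bar "$demo" && echo "changed, backup kept in $demo.bak"
rm -f "$demo" "$demo.bak"
```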


