Bash, Linux, Need to Remove Lines from One File Based on Matching Content from Another File

Deleting lines from one file which are in another file

grep -v -x -f f2 f1 should do the trick.

Explanation:

  • -v to select non-matching lines
  • -x to match whole lines only
  • -f f2 to get patterns from f2

One can instead use grep -F (or the older fgrep) to match fixed strings from f2 rather than patterns, in case you want to remove the lines in a "what you see is what you get" manner rather than treating the lines in f2 as regex patterns.
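
To make the difference between pattern matching and fixed-string matching concrete, here is a minimal sketch. The file contents are made up for illustration; only the names f1 and f2 come from the answer above.

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'a.c' 'abc' 'keep me' > "$tmp/f1"
printf '%s\n' 'a.c' > "$tmp/f2"

# Regex mode: the dot in a.c matches any character, so the line abc is removed too.
regex_result=$(grep -v -x -f "$tmp/f2" "$tmp/f1")

# Fixed-string mode: only the literal line a.c is removed.
fixed_result=$(grep -v -x -F -f "$tmp/f2" "$tmp/f1")

printf 'regex mode:\n%s\n\nfixed mode:\n%s\n' "$regex_result" "$fixed_result"
rm -rf "$tmp"
```

In regex mode only "keep me" survives, while in fixed-string mode both "abc" and "keep me" do.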

How to remove the lines which appear on file B from another file A?

If the files are sorted (they are in your example):

comm -23 file1 file2

-2 suppresses the lines that are only in file2 and -3 suppresses the lines that are in both files, so -23 leaves only the lines unique to file1. If the files are not sorted, run them through sort first...

See the comm man page (man comm) for details.
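
For unsorted input, one possible approach is to sort each file into a temporary copy before calling comm. This is only a sketch; the sample lines are invented for illustration.

```shell
# Hypothetical unsorted sample data.
tmp=$(mktemp -d)
printf '%s\n' 'pear' 'apple' 'fig' > "$tmp/file1"
printf '%s\n' 'fig' 'banana' > "$tmp/file2"

# comm expects sorted input, so sort each file into a temporary copy first.
sort "$tmp/file1" > "$tmp/file1.sorted"
sort "$tmp/file2" > "$tmp/file2.sorted"

# -2 hides lines found only in file2, -3 hides lines common to both.
only_in_file1=$(comm -23 "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$only_in_file1"
rm -rf "$tmp"
```

Keep in mind that the output comes back in sorted order, so the original line order of file1 is lost. If the order matters, the grep or awk approaches are a better fit.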

How to remove lines based on another file?

This job suits awk:

awk 'NR == FNR {a[$1]; next} !($1 in a)' file2.txt file1.txt

john  12  65  0
lee 9 15 0

Details:

NR == FNR {      # while processing the first file (file2.txt)
    a[$1]        # store the first field as a key in array a
    next         # move to the next line
}
!($1 in a)       # while processing the second file (file1.txt):
                 # print the line if its first field is not a key in a
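
The input files for this answer are not shown, so the following sketch uses made-up contents for file1.txt and file2.txt that would produce output of the same shape.

```shell
# Hypothetical inputs: file2.txt lists the names to drop from file1.txt.
tmp=$(mktemp -d)
printf '%s\n' 'john 12 65 0' 'paul 8 20 1' 'lee 9 15 0' > "$tmp/file1.txt"
printf '%s\n' 'paul' > "$tmp/file2.txt"

# First pass stores the keys from file2.txt; second pass prints lines of
# file1.txt whose first field is not among those keys.
kept=$(awk 'NR == FNR {a[$1]; next} !($1 in a)' "$tmp/file2.txt" "$tmp/file1.txt")
printf '%s\n' "$kept"
rm -rf "$tmp"
```

One caveat with the NR == FNR idiom: if the first file is empty, the condition is also true for every line of the second file, so nothing is printed.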

How to delete from a text file, all lines that contain a specific string?

To remove the line and print the output to standard out:

sed '/pattern to match/d' ./infile

To directly modify the file – does not work with BSD sed:

sed -i '/pattern to match/d' ./infile

Same, but for BSD sed (Mac OS X and FreeBSD) – does not work with GNU sed:

sed -i '' '/pattern to match/d' ./infile

To directly modify the file (and create a backup) – works with BSD and GNU sed:

sed -i.bak '/pattern to match/d' ./infile
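
If you would rather avoid -i altogether, a temporary file works the same way with either sed. This is a sketch under the assumption that you can write to the directory holding the file; the sample contents are invented.

```shell
# Hypothetical sample file.
tmp=$(mktemp -d)
printf '%s\n' 'keep this' 'pattern to match here' 'keep that' > "$tmp/infile"

# Filter into a temporary file, then replace the original only if sed succeeded.
sed '/pattern to match/d' "$tmp/infile" > "$tmp/infile.new" &&
    mv "$tmp/infile.new" "$tmp/infile"

result=$(cat "$tmp/infile")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Unlike sed -i, this creates a new file, so permissions and ownership of the original may not carry over.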

How can I remove lines in one file that exist in another?

You could likely use grep with the -v (invert-match) and -f (file) options:

grep -v -f oldfile newfile > newstrip

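It prints the lines of newfile that match none of the patterns in oldfile and saves them to newstrip. Note that each line of oldfile is treated as a regular expression and may match anywhere in a line; add -x and -F if you want exact whole-line matches only. If you are happy with the results, you can afterward run: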

mv newstrip newfile

This will overwrite newfile with newstrip (removing newstrip).
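
The two steps can also be combined with a check on grep's exit status, so that newfile is not overwritten when grep fails. The sketch below uses invented sample data and adds -F and -x, assuming exact whole-line matching is what you want.

```shell
# Hypothetical sample data.
tmp=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' 'gamma' > "$tmp/newfile"
printf '%s\n' 'beta' > "$tmp/oldfile"

grep -v -F -x -f "$tmp/oldfile" "$tmp/newfile" > "$tmp/newstrip"
status=$?

# grep exits 0 when lines were selected, 1 when none were, and >1 on error.
if [ "$status" -le 1 ]; then
    mv "$tmp/newstrip" "$tmp/newfile"
fi

remaining=$(cat "$tmp/newfile")
printf '%s\n' "$remaining"
rm -rf "$tmp"
```

A plain && between grep and mv would skip the mv when every line was removed, because grep reports status 1 in that case.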

How to delete line with matching pattern from another file?

The most generic solution is:

$ grep -vf file2 file1

Note that any substring match on any field will count. If you want to restrict it to an exact match on a specific field (here assumed to be the last):

$ awk 'NR==FNR{a[$1]; next} !($NF in a)' file2 file1
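
A small comparison shows why the distinction matters. The data below is invented; file2 holds the value to drop and file1 holds lines whose last field may or may not equal it.

```shell
# Hypothetical sample data.
tmp=$(mktemp -d)
printf '%s\n' 'a 10' 'b 100' 'c 7' > "$tmp/file1"
printf '%s\n' '10' > "$tmp/file2"

# Substring match: 10 also occurs inside 100, so both lines are removed.
grep_result=$(grep -vf "$tmp/file2" "$tmp/file1")

# Exact match on the last field: only the line ending in 10 is removed.
awk_result=$(awk 'NR==FNR{a[$1]; next} !($NF in a)' "$tmp/file2" "$tmp/file1")

printf 'grep:\n%s\n\nawk:\n%s\n' "$grep_result" "$awk_result"
rm -rf "$tmp"
```

Here grep keeps only "c 7", while awk keeps both "b 100" and "c 7".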

Delete line containing pattern found anywhere in another file

If I understand correctly,
you want to exclude from the first file lines that would match any IP address in the second file.

This simple and admittedly a bit lazy solution might be good enough for your purpose:

grep -v file1 -Fwf <(awk '{ print $3 }' file2)

The Awk extracts the 3rd column with IP addresses,
and grep will use those as fixed patterns (-F) and only match complete words (-w).

If the IP address is not always the 3rd column,
then you could extract them by using pattern matching with grep,
as @tripleee suggested:

grep -v file1 -Fwf <(grep -owE '[1-9][0-9]{0,2}(\.[0-9]{1,3}){3}' file2)
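
Process substitution needs bash or a similar shell. In a plain POSIX sh, roughly the same thing can be done with a temporary pattern file. The sketch below uses invented file contents with documentation-range addresses, and puts the options before the file name, which should also work with grep implementations that do not reorder arguments.

```shell
# Hypothetical sample data: file2 carries the IP address in its 3rd column.
tmp=$(mktemp -d)
printf '%s\n' 'GET /index 192.0.2.10' 'GET /about 192.0.2.100' 'GET /home 198.51.100.7' > "$tmp/file1"
printf '%s\n' 'host1 up 192.0.2.10' > "$tmp/file2"

# Extract the addresses into a pattern file instead of using <( ... ).
awk '{ print $3 }' "$tmp/file2" > "$tmp/patterns"

# -F treats each address literally; -w stops 192.0.2.10 from matching 192.0.2.100.
kept=$(grep -v -F -w -f "$tmp/patterns" "$tmp/file1")
printf '%s\n' "$kept"
rm -rf "$tmp"
```

Only the line containing exactly 192.0.2.10 is dropped; the line with 192.0.2.100 stays because of -w.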

Delete lines from one file based on the contents of another

All you need is this, using any awk in any shell on every Unix box:

awk 'NR==FNR{a[$0]; next} !(substr($0,1,20) in a)' file1 file2

and with files such as you described on a reasonable processor it'll run in a couple of seconds rather than 4 hours.

Just make sure file1 only contains the numbers you want to match on, not a sed script using those numbers, e.g.:

$ head file?
==> file1 <==
20606516000100070004
20630555000100030001
20636222000800050001

==> file2 <==
20606516000100070004XXXXXXX19.202107.04.202105.03.202101.11.202001.11.2020WWREABBXBOU
99906516000100070004XXXXXXX19.202107.04.202105.03.202101.11.202001.11.2020WWREABBXBOU


$ awk 'NR==FNR{a[$0]; next} !(substr($0,1,20) in a)' file1 file2
99906516000100070004XXXXXXX19.202107.04.202105.03.202101.11.202001.11.2020WWREABBXBOU

Delete lines based on another file

When on an older Solaris server, use the tools in /usr/xpg4/bin whenever possible.

$ /usr/bin/grep
Usage: grep -hblcnsviw pattern file . . .
$ /usr/xpg4/bin/grep
Usage: grep [-E|-F] [-c|-l|-q] [-bhinsvwx] [file ...]
grep [-E|-F] [-c|-l|-q] [-bhinsvwx] -e pattern... [-f pattern_file]...[file...]
grep [-E|-F] [-c|-l|-q] [-bhinsvwx] [-e pattern]... -f pattern_file [file...]
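
In a script that has to run on both older Solaris and other systems, one option is to pick the grep binary at run time. This is only a sketch: the variable name grep_cmd and the sample files are made up for illustration.

```shell
# Prefer the XPG4 grep when it exists (older Solaris); otherwise use the grep on PATH.
if [ -x /usr/xpg4/bin/grep ]; then
    grep_cmd=/usr/xpg4/bin/grep
else
    grep_cmd=grep
fi

# Hypothetical sample data.
tmp=$(mktemp -d)
printf '%s\n' 'one' 'two' 'three' > "$tmp/f1"
printf '%s\n' 'two' > "$tmp/f2"

# -f is only accepted by the XPG4 grep on such systems, per the usage text above.
result=$("$grep_cmd" -v -x -F -f "$tmp/f2" "$tmp/f1")
printf '%s\n' "$result"
rm -rf "$tmp"
```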

