Finding Contents of One File in Another File

Find content of one file from another file in UNIX

One way with awk:

awk -v FS="[ =]" 'NR==FNR{rows[$1]++;next}(substr($NF,1,length($NF)-1) in rows)' File1 File2

This should be pretty quick. On my machine, it took under 2 seconds to create a lookup of 1 million entries and compare it against 3 million lines.

Machine Specs:

Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (8 cores)
98 GB RAM
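
If the one-liner looks dense, here is a rough Python sketch of the same two-pass idea: the first field of each File1 line becomes a lookup key, and a File2 line is kept when its last field, minus its final character, is one of those keys. The sample lines and the name filter_by_lookup are made up for illustration, since the real input format is not shown here.

```python
import re

def filter_by_lookup(file1_lines, file2_lines):
    """Sketch of: awk -v FS="[ =]" 'NR==FNR{rows[$1]++;next}
    (substr($NF,1,length($NF)-1) in rows)' File1 File2"""
    split_fields = re.compile(r"[ =]").split  # same separators as FS="[ =]"
    # Pass 1: remember the first field of every File1 line.
    keys = {split_fields(line)[0] for line in file1_lines}
    kept = []
    # Pass 2: keep File2 lines whose last field (last char dropped) is a key.
    for line in file2_lines:
        last_field = split_fields(line)[-1]
        if last_field[:-1] in keys:
            kept.append(line)
    return kept

# Hypothetical sample data, not taken from the original question.
file1 = ["id1 foo", "id2 bar"]
file2 = ["name=x key=id1;", "name=y key=id9;"]
print(filter_by_lookup(file1, file2))  # ['name=x key=id1;']
```

The set lookup is what keeps this fast: each File2 line costs one hash probe instead of a scan over File1.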

Finding contents of one file in another file

grep itself is able to do so. Simply use the flag -f:

grep -f <patterns> <file>

<patterns> is a file containing one pattern in each line; and <file> is the file in which you want to search things.

Note that grep treats each line of the patterns file as a regular expression by default. To force it to treat each line as a literal string, even if the line looks like a regular expression, use the -F (--fixed-strings) flag:

grep -F -f <patterns> <file>
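
To see why -F matters, here is a small Python sketch contrasting the two modes. It is only an approximation: Python's re syntax is not identical to grep's basic regular expressions, and the function names and sample strings are made up for illustration.

```python
import re

def grep_regex(patterns, lines):
    """Roughly like grep -f: keep lines where any pattern matches as a regex."""
    compiled = [re.compile(p) for p in patterns]
    return [ln for ln in lines if any(c.search(ln) for c in compiled)]

def grep_fixed(patterns, lines):
    """Roughly like grep -F -f: keep lines containing any pattern literally."""
    return [ln for ln in lines if any(p in ln for p in patterns)]

patterns = ["a.c"]             # the dot is a wildcard in regex mode
lines = ["abc", "a.c", "xyz"]
print(grep_regex(patterns, lines))  # ['abc', 'a.c']
print(grep_fixed(patterns, lines))  # ['a.c']
```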

If your patterns file is a CSV, as you said, you can turn it into one pattern per line first (this uses bash process substitution):

grep -f <(tr ',' '\n' < data.csv) <file>

As an example, consider the file "a.txt", with the following lines:

alpha
0891234
beta

Now, the file "b.txt", with the lines:

Alpha
0808080
0891234
bEtA

The following command prints only the line that matches exactly, since grep is case-sensitive by default:

grep -f "a.txt" "b.txt"
0891234

You don't need a for-loop here at all; grep itself offers this feature.
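
A short Python sketch of the a.txt / b.txt example may help show why only one line is printed. It treats patterns as literal strings (fine here, since none of them contain regex metacharacters); grep_lines and its ignore_case parameter are hypothetical names, with ignore_case standing in for grep's -i.

```python
def grep_lines(patterns, lines, ignore_case=False):
    """Keep lines containing any pattern as a substring."""
    if ignore_case:
        patterns = [p.lower() for p in patterns]
    matched = []
    for line in lines:
        haystack = line.lower() if ignore_case else line
        if any(p in haystack for p in patterns):
            matched.append(line)
    return matched

a_txt = ["alpha", "0891234", "beta"]
b_txt = ["Alpha", "0808080", "0891234", "bEtA"]

print(grep_lines(a_txt, b_txt))                    # ['0891234']
print(grep_lines(a_txt, b_txt, ignore_case=True))  # ['Alpha', '0891234', 'bEtA']
```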


Now using your file names:

#!/bin/bash
patterns="/home/nimish/contents.txt"
search="/home/nimish/another_file.csv"
grep -f <(tr ',' '\n' < "${patterns}") "${search}"

You may change ',' to the separator you have in your file.
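
One thing to watch for with the tr approach: an empty field (for example from "a,,b" or a trailing comma) becomes an empty pattern line, and an empty pattern matches every line. The sketch below shows the splitting step in Python and, unlike tr, drops empty fields. The function name and the sample string are assumptions for illustration.

```python
def patterns_from_delimited(text, sep=","):
    """Split separator-delimited text into one pattern per field.

    Deliberately differs from `tr ',' '\\n'`: empty fields are dropped,
    because an empty pattern would match every line.
    """
    fields = text.replace(sep, "\n").splitlines()
    return [field for field in fields if field]

print(patterns_from_delimited("alpha,0891234,,beta\n"))
# ['alpha', '0891234', 'beta']
```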

Query the contents of a file using another file in AWK

One way:

awk 'NR==FNR{a[i]=$2;b[i++]=$3;next}{for(j=0;j<i;j++){if ($3>=a[j] && $3<=b[j]){print;}}}' i=0 file2 file1
AAA BBB 1500
EEE FFF 2000

This reads file2 and stores its second and third columns in the arrays a and b. Then, for each line of file1, it checks whether the third column falls between a[j] and b[j] for any stored range, and prints the line if it does.
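
The same range check as a Python sketch, assuming whitespace-separated columns with integer bounds in columns 2 and 3 of file2 and the value in column 3 of file1. The sample rows are invented to reproduce the output above; the original input files are not shown.

```python
def lines_in_ranges(range_lines, data_lines):
    """Keep data lines whose 3rd column lies inside any [low, high] range."""
    ranges = []
    for line in range_lines:
        fields = line.split()
        ranges.append((int(fields[1]), int(fields[2])))
    hits = []
    for line in data_lines:
        value = int(line.split()[2])
        for low, high in ranges:
            if low <= value <= high:
                # Like the awk version, a line is emitted once per matching range.
                hits.append(line)
    return hits

# Hypothetical sample data.
file2 = ["X 1000 1600", "Y 1900 2100"]
file1 = ["AAA BBB 1500", "CCC DDD 1700", "EEE FFF 2000"]
print(lines_in_ranges(file2, file1))  # ['AAA BBB 1500', 'EEE FFF 2000']
```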

One more option:

awk 'NR==FNR{for(i=$2;i<=$3;i++)a[i];next}($3 in a)' file2 file1
AAA BBB 1500
EEE FFF 2000

File2 is read, and every number in each range is expanded and stored as a key in the associative array a. When file1 is read, we only need to look up its third column in a.
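
A Python sketch of this second variant, under the same assumptions (integer bounds, whitespace-separated columns, invented sample rows). It trades memory for speed: every integer in every range is stored, so very wide ranges get expensive.

```python
def lines_in_expanded_ranges(range_lines, data_lines):
    """Expand every range into a set, then test membership per data line."""
    allowed = set()
    for line in range_lines:
        fields = line.split()
        allowed.update(range(int(fields[1]), int(fields[2]) + 1))  # inclusive
    return [line for line in data_lines if int(line.split()[2]) in allowed]

# Hypothetical sample data.
file2 = ["X 1000 1600", "Y 1900 2100"]
file1 = ["AAA BBB 1500", "CCC DDD 1700", "EEE FFF 2000"]
print(lines_in_expanded_ranges(file2, file1))  # ['AAA BBB 1500', 'EEE FFF 2000']
```

Unlike the loop over ranges, this prints a line at most once even when ranges overlap.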

Read one file to search another file and print out missing lines

Use grep with the following options:

grep -Fvf b.txt a.txt

The key is to use -v:

-v, --invert-match
Invert the sense of matching, to select non-matching lines.

When reading patterns from a file, I recommend using the -F option unless you explicitly want the patterns to be treated as regular expressions.

-F, --fixed-strings
Interpret PATTERN as a list of fixed strings (instead of regular expressions), separated by newlines, any of which is to be matched.
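
In Python terms, grep -Fvf b.txt a.txt behaves roughly like the sketch below: keep every line of a.txt that contains none of the lines of b.txt as a substring. The function name and sample data are assumptions for illustration.

```python
def missing_lines(patterns, lines):
    """Roughly like grep -Fvf: keep lines that contain none of the patterns."""
    return [line for line in lines if not any(p in line for p in patterns)]

# Hypothetical sample data.
b_txt = ["apple", "pear"]
a_txt = ["apple pie", "banana", "pear"]
print(missing_lines(b_txt, a_txt))  # ['banana']
```

Note that matching is by substring, so "apple" also excludes "apple pie". If you want whole-line comparison instead, grep's -x option restricts matches to entire lines.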

How to find words from one file in another file?

You can use grep -f:

grep -Ff "first-file" "second-file"

Or, to match whole words only:

grep -w -Ff "first-file" "second-file"

UPDATE: As per the comments:

awk 'FNR==NR{a[$1]; next} ($1 in a){delete a[$1]; print $1}' file1 file2
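
What the awk version does, sketched in Python: it collects the first word of each file1 line, then walks file2 and reports a word only the first time it shows up, because the entry is deleted once it has matched. The function name and sample lines are hypothetical.

```python
def first_matches(file1_lines, file2_lines):
    """Report each wanted first-column word once, at its first hit in file2."""
    def first_field(line):
        fields = line.split()
        return fields[0] if fields else ""

    wanted = {first_field(line) for line in file1_lines}
    found = []
    for line in file2_lines:
        word = first_field(line)
        if word in wanted:
            wanted.discard(word)  # mirrors `delete a[$1]`: never report it again
            found.append(word)
    return found

# Hypothetical sample data.
file1 = ["foo", "bar"]
file2 = ["foo 1", "baz 2", "foo 3", "bar 4"]
print(first_matches(file1, file2))  # ['foo', 'bar']
```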

Search pattern from one file in another file and write the line after the match into a third file

If you have GNU grep:

grep --no-group-separator -A1 -Ff file1 file2
  • -A1 will tell grep to print the matching line as well as the next line
  • by default, the output groups will be separated by --, so use --no-group-separator if you wish to avoid this line
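
For comparison, a minimal Python sketch of the "match plus following line" behaviour, using literal substring matching as -F does and no group separator. The name grep_with_next_line, its after parameter, and the sample log lines are assumptions for illustration.

```python
def grep_with_next_line(patterns, lines, after=1):
    """Keep each matching line plus up to `after` lines that follow it."""
    out = []
    remaining = 0  # context lines still owed after the most recent match
    for line in lines:
        if any(p in line for p in patterns):
            out.append(line)
            remaining = after
        elif remaining > 0:
            out.append(line)
            remaining -= 1
    return out

# Hypothetical sample data.
file1 = ["ERROR"]
file2 = ["ok", "ERROR one", "detail one", "ok", "ERROR two", "detail two"]
print(grep_with_next_line(file1, file2))
# ['ERROR one', 'detail one', 'ERROR two', 'detail two']
```

Writing the result to a third file is then a matter of redirecting the output, as with any other grep command.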

bash text search: find if the content of one file exists in another file

Try grep (no need to pipe through cat, since grep can read the file directly):

grep -f a.txt b.txt

Find entries of one text file in another file in python

Strip the entries of newlines

Python includes newlines when you read lines - your first entry is read as 1223232\n. Strip the newline and it will work.

def readA():
    with open('A.txt') as bondNumberFile:
        for line in bondNumberFile:
            readB(line.rstrip())
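
To make the newline issue concrete, here is a self-contained sketch that uses io.StringIO in place of real files. Only 1223232 comes from the question; the other entries and the variable names are made up.

```python
import io

# Stand-ins for A.txt and B.txt so the example runs without any files.
a_file = io.StringIO("1223232\n4445556\n")
b_file = io.StringIO("1223232\n9999999\n")

raw = [line for line in a_file]
print(raw)                        # ['1223232\n', '4445556\n']
print(raw[0] == "1223232")        # False: the trailing newline is still there

a_file.seek(0)
wanted = {line.rstrip("\n") for line in a_file}
found = [line.rstrip("\n") for line in b_file if line.rstrip("\n") in wanted]
print(found)                      # ['1223232']
```

Putting the stripped entries in a set also avoids rescanning the second file for every entry in the first.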

grep lines that appear partly in one file to another

grep -iFf file1 file2 > file

You need to tell grep to run in fgrep (fixed-string) mode with the -F option; -f then specifies the file to read the patterns from.

Note that I have changed your >> redirect (append) to > (create).
You'll trip yourself up in testing if you use >>, because the output of your first tests will always stay at the top of the file, and if you're rushing you'll think the command isn't working. Use > for development, and if you really need append mode, add it once you are certain your basic command works as required.

Finally, I'd use the -i (ignore case) option sparingly. If you really need to match lower case versions of your target strings, better to include those in your file1, so your process is self documenting.

IHTH

How to check whether one file's values are contained in another text file? (Perl script)

Based on my understanding, I wrote this code:

use strict;
use warnings;
#use ReadWrite;
use Array::Utils qw(:all);

use vars qw($my1file $myfile1cnt $my2file $myfile2cnt @output);

$my1file = "did1.txt"; $my2file = "did2.txt";

First, read both files (did1.txt and did2.txt) into strings.

readFileinString($my1file, \$myfile1cnt); readFileinString($my2file, \$myfile2cnt);

As the OP requested, the first four characters of each line in the first file are matched against the second file; if they match, the rest of that line is compared with the corresponding line in the second file.

while($myfile1cnt=~m/^((\w){4})\:([^\n]+)$/mig)
{
    print "<LineStart>";
    my $lineChk = $1; my $full_Line = $3; #print ": $full_Line\n";
    my @First_values = split /\:/, $full_Line; #print join "\n", @First_values;

If the first four characters match a line in the second file, then:

    if($myfile2cnt=~m/^$lineChk\:([^\n]+)$/m)
    {

Store the rest of the matching line from the second file and split it on colons to get the values to compare with the first file's values.

        my $FullLine = $1;  my @second_values = split /:/, $FullLine;

Then check each value from the first file's line against the values from the matching line in the second file.

        foreach my $sngletter(@First_values)
        {

If a value appears in both files, it is printed.

            if( grep {$_ eq "$sngletter"} @second_values)
            {
                print "Matched: $sngletter\t";
            }
        }
    }
    else { print "Not Matched..."; }

This just marks the end of the output for the current line.

    print "<LineEnd>\n"
}

#------------------>Reading a file
sub readFileinString
#------------------>
{
    my $File = shift;
    my $string = shift;
    use File::Basename;
    my $filenames = basename($File);

    open(FILE1, "<$File") or die "\nFailed Reading File: [$File]\n\tReason: $!";
    read(FILE1, $$string, -s $File, 0);
    close(FILE1);
}
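
For readers more comfortable with Python, the matching logic of the script above can be sketched as follows. This is a simplified restatement, not a port: it returns a dictionary instead of printing, and the line format ("KEY1:v1:v2:...") and sample lines are assumptions based on the regular expressions in the script.

```python
import re

def compare_by_key(file1_lines, file2_lines):
    """Map each 4-character key in file1 to the values it shares with file2.

    A key with no matching line in file2 maps to None ("Not Matched").
    """
    second = {}
    for line in file2_lines:
        key, _, rest = line.partition(":")
        if rest:
            # Keep the first line per key, like the script's single regex search.
            second.setdefault(key, rest.split(":"))
    report = {}
    for line in file1_lines:
        key, _, rest = line.partition(":")
        if not re.fullmatch(r"\w{4}", key) or not rest:
            continue  # the script only handles lines of the form XXXX:...
        if key in second:
            report[key] = [v for v in rest.split(":") if v in second[key]]
        else:
            report[key] = None
    return report

# Hypothetical sample data.
did1 = ["AB12:x:y:z", "CD34:p:q"]
did2 = ["AB12:y:z:w", "EF56:p"]
print(compare_by_key(did1, did2))  # {'AB12': ['y', 'z'], 'CD34': None}
```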

