Match Specific Column with Grep Command

Match specific column with grep command

Try this:

$ awk '$2 == 866' test.txt

No need to add {print}, the default behaviour of awk is to print on a true condition.
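As a quick check of the default-print behaviour, here is a minimal sketch. The sample rows are invented for illustration, since the contents of test.txt are not shown.

```shell
# Hypothetical sample data standing in for test.txt.
sample=$(mktemp)
printf '%s\n' 'alice 866 x' 'bob 8660 y' 'carol 866 z' > "$sample"

# A bare condition with no action block prints every line where it is true.
implicit=$(awk '$2 == 866' "$sample")
# The same filter with the print action spelled out.
explicit=$(awk '$2 == 866 {print}' "$sample")

printf '%s\n' "$implicit"
rm -f "$sample"
```

Both forms select the same lines, and the row with 8660 is skipped because the comparison is on the whole field.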

With grep:

$ grep -P '^\S+\s+866\b' *

awk can also print the filename and is more robust than grep here:

$ awk '$2 == 866{print FILENAME":"$0; nextfile}' *
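Because nextfile skips to the next file after the first hit, the command above reports at most one line per file. If every matching line is wanted, one option is to drop nextfile. A minimal sketch, using two invented files:

```shell
# Hypothetical input files created in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'a 866' 'b 1' 'c 866' > "$dir/one.txt"
printf '%s\n' 'd 2' 'e 866' > "$dir/two.txt"

# Without nextfile, awk keeps scanning and prefixes every match with its filename.
all_matches=$(cd "$dir" && awk '$2 == 866 {print FILENAME ":" $0}' one.txt two.txt)

printf '%s\n' "$all_matches"
rm -rf "$dir"
```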

Extract column using grep

First, work out the column number from the header line:

columnname=C
sed -n "1 s/${columnname}.*//p" datafile | sed 's/[^\t]//g' | wc -c

Once you know the number, use cut:

cut -f1,3 < datafile

Combined into one command:

cut -f1,$(sed -n "1 s/${columnname}.*//p" datafile |
sed 's/[^\t]//g' | wc -c) < datafile

This is not quite finished: when one header is a substring of another, the first sed command can cut the header line at the wrong place and report the wrong column. To fix that, include the surrounding tabs in the match and put them back in the replacement string.
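One way to sidestep the substring problem entirely is to compare each header field for exact equality instead of editing the header line with sed. The sketch below swaps in awk for that lookup; it is not the sed fix described above, and the header names and rows are made up for the example.

```shell
# Hypothetical tab-separated file where header "C" is a substring of "AC".
datafile=$(mktemp)
printf 'A\tAC\tC\n1\t2\t3\n4\t5\t6\n' > "$datafile"
columnname=C

# Scan the header row and print the index of the field that equals the name exactly.
colnum=$(awk -F'\t' -v name="$columnname" '
  NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) { print i; exit } }
' "$datafile")

# Extract the first column plus the column that was found.
extracted=$(cut -f1,"$colnum" < "$datafile")
printf '%s\n' "$extracted"
rm -f "$datafile"
```

With this header, a plain substring match on C would stop inside AC and point at column 2, while the exact comparison finds column 3.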

grep file matching specific column

You can use grep together with sed: sed turns each line of uniq.txt into a pattern anchored to the column you want, and grep reads those patterns with -f:

grep -Ef <(sed -e 's/^/^(\\S+\\s+){2}/;s/$/\\s*/' uniq.txt) result.txt

To match the nth column, replace 2 in the command above with n-1.

Output:

A00260:70:HJM2YDSXX:4:1111:15519:16720 NC_000011.10 9606 169 0 28 151 1
A00260:70:HJM2YDSXX:3:1536:9805:14841 NW_021160017.1 9606 81 0 24 151 1
A00260:70:HJM2YDSXX:3:1366:27181:24330 NC_014803.1 234831 121 121 26 151 3
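If the keys should match a column exactly rather than by regular expression, a common alternative is to load them into an awk array and test the field directly. This is a sketch of that different technique, not the grep command above; the file contents are invented and much shorter than the real data.

```shell
# Hypothetical key list and data file (the third column holds the key).
keys=$(mktemp)
data=$(mktemp)
printf '%s\n' '9606' '234831' > "$keys"
printf '%s\n' 'r1 NC_1 9606 169' 'r2 NC_2 96061 81' 'r3 NC_3 234831 121' > "$data"

# First file: remember each key. Second file: print rows whose 3rd field is a known key.
exact=$(awk 'NR == FNR { wanted[$1]; next } $3 in wanted' "$keys" "$data")

printf '%s\n' "$exact"
rm -f "$keys" "$data"
```

To target another column, change $3 to the field you need. The row with 96061 is not selected, because the lookup compares whole fields.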

using only 'grep' command to get specific column

You can use GNU grep with a PCRE pattern:

grep -Po '^([^,]*,){2}\K[^,]*' file

Here,

  • ^ - start of the line
  • ([^,]*,){2} - two occurrences of any zero or more chars other than , and then a ,
  • \K - match reset operator discarding all text matched so far
  • [^,]* - zero or more chars other than a comma.
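To see the pattern in action, here is a small sketch on invented comma-separated input. It assumes GNU grep built with PCRE support (the -P option); the cut line is included only as a cross-check of the third field.

```shell
# Hypothetical CSV input.
csv=$(mktemp)
printf '%s\n' 'id,name,city,zip' '1,ann,Oslo,0150' '2,bob,Rome,00100' > "$csv"

# \K drops the two leading fields from the reported match, leaving only field 3.
third_grep=$(grep -Po '^([^,]*,){2}\K[^,]*' "$csv")
# cut selects the same field by position.
third_cut=$(cut -d, -f3 "$csv")

printf '%s\n' "$third_grep"
rm -f "$csv"
```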

Grep and returning only column of match

grep itself can do this, with no additional tools, by using the -o/--only-matching switch. You should be able to just do:

grep -o '\<age:[0-9]\+'

To explain the less common parts of the regex:

  • \< is a zero-width assertion that you're at the beginning of a word (that is, age is preceded by a non-word character or occurs at the beginning of the line, but it's not actually matching that non-word character); this prevents you from matching, say image:123. It doesn't technically require whitespace, so it would match :age: or the like; if that's a problem, match \t itself and use cut or tr to remove it later.
  • \+ means "match 1 or more occurrences of the preceding character class" (which is [0-9], so it matches one or more digits). \+ is equivalent to repeating the class twice, with the second copy followed by *, e.g. [0-9][0-9]*, except it's shorter, and some regex engines can optimize \+ better.
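The following sketch runs the command on one invented tab-separated record. It assumes GNU grep, since \< and \+ are GNU extensions to basic regular expressions, and the field names other than age are hypothetical.

```shell
# Hypothetical tab-separated record containing a decoy field, image:123.
record=$(printf 'name:bob\tage:42\timage:123')

# -o prints only the matched text; \< keeps "age" from matching inside "image".
age_field=$(printf '%s\n' "$record" | grep -o '\<age:[0-9]\+')
# Strip the "age:" label if only the number is needed.
age_value=$(printf '%s\n' "$age_field" | cut -d: -f2)

printf '%s\n' "$age_field" "$age_value"
```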

grep to search data in first column

Use awk to pull out the first column, then grep it. awk can read the file directly, so cat is not needed:

awk '{print $1}' myfile | grep query
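awk can also do the matching itself, which avoids the extra grep process. A minimal sketch, with an invented myfile and the word query as the search term:

```shell
# Hypothetical input: "query" appears in column 1 of two rows and column 2 of another.
myfile=$(mktemp)
printf '%s\n' 'query_one alpha' 'other query' 'my_query beta' > "$myfile"

# Print only the first column where that column matches the regex.
first_cols=$(awk '$1 ~ /query/ {print $1}' "$myfile")
# Print the whole line instead by omitting the action.
whole_lines=$(awk '$1 ~ /query/' "$myfile")

printf '%s\n' "$first_cols"
rm -f "$myfile"
```

The row where query appears only in the second column is ignored in both cases.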

Validating specific column in grep

You need grep's -P option to enable Perl-compatible regular expressions. Try the following, which was written and tested against the samples shown in the question:

grep -P '("\d+",){4}"[a-zA-Z]+","2020-12-\d{2}"' Input_file

Explanation (the annotated breakdown below is for reading only, not for running):

grep                 ## Start the grep command.
-P                   ## Enable PCRE regular expressions.
'("\d+",){4}         ## Match a double-quoted run of digits followed by a comma, four times.
"[a-zA-Z]+",         ## Then match a double-quoted run of letters followed by a comma.
"2020-12-\d{2}"      ## Then match a double-quoted December 2020 date such as "2020-12-07".
' Input_file         ## Close the pattern and name the input file.
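Where grep lacks -P, the same check can be written as an extended regular expression by replacing \d with [0-9]. The sketch below runs that variant on invented rows; the real Input_file is not shown in the source.

```shell
# Hypothetical rows: only the first one passes every check.
input=$(mktemp)
printf '%s\n' \
  '"1","22","333","4","abc","2020-12-07"' \
  '"1","22","333","4","abc","2020-11-07"' \
  '"1","22","x","4","abc","2020-12-09"' > "$input"

# ERE version of the PCRE pattern: [0-9] stands in for \d.
valid=$(grep -E '("[0-9]+",){4}"[a-zA-Z]+","2020-12-[0-9]{2}"' "$input")

printf '%s\n' "$valid"
rm -f "$input"
```

The second row fails on the month and the third on a non-numeric field among the first four columns.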

bash: grep exact matches based on the first column

It matches

9342432_A1 9342432 1 0 0 0

because it has 9342432 in the second column.

You need to update the command to make grep check lines starting with those words, that is, use ^word:

$ grep -E '^4324321_A3|^9342432' file
4324321_A3 4324321 1 0 0 0
9342432 9342432 2 0 0 0

To make it more precise, you can also use -w, which only accepts matches that form a whole word. This way, grep -wE '^4324321_A3|^9342432' file would not match a line like:

4324321_A3something 4324321 1 0 0 0
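A short sketch of the -w variant on a scratch file that reuses the rows quoted in this answer. The file itself is an assumption, since the original input is not shown in full. Note that the underscore counts as a word character, so a first column such as 9342432_A1 is rejected as well.

```shell
# Hypothetical file built from the rows quoted above.
file=$(mktemp)
printf '%s\n' \
  '4324321_A3 4324321 1 0 0 0' \
  '9342432 9342432 2 0 0 0' \
  '9342432_A1 9342432 1 0 0 0' \
  '4324321_A3something 4324321 1 0 0 0' > "$file"

# ^ pins the match to column 1; -w rejects matches that run on into more word characters.
whole_word=$(grep -wE '^4324321_A3|^9342432' "$file")

printf '%s\n' "$whole_word"
rm -f "$file"
```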

Use grep on a specific column from text Bash Linux

Use awk for column-based searches like this:

cuenta=1
awk -F: -v var="$cuenta" '$1 == var {print $0}' "$documento" | cut -d":" -f1,2,4

This will only match lines which have 1 in the first column.
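Since awk already splits the line into fields, it can also emit the selected columns itself, making the cut stage optional. A minimal sketch under that assumption, with an invented colon-separated file standing in for $documento:

```shell
# Hypothetical colon-separated file; column 1 is the account number.
documento=$(mktemp)
printf '%s\n' '1:ana:x:100' '2:luis:y:200' '1:eva:z:300' > "$documento"
cuenta=1

# Filter on column 1 and emit columns 1, 2 and 4, rejoined with colons via OFS.
selected=$(awk -F: -v var="$cuenta" -v OFS=: '$1 == var {print $1, $2, $4}' "$documento")

printf '%s\n' "$selected"
rm -f "$documento"
```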


