Match specific column with grep command
Try this:
$ awk '$2 == 866' test.txt
There is no need to add {print}: the default behaviour of awk is to print the line when the condition is true.
With grep:
$ grep -P '^\S+\s+866\b' *
But awk can print filenames too, and it is more robust than grep here:
$ awk '$2 == 866{print FILENAME":"$0; nextfile}' *
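As a rough illustration of the difference, here is a minimal sketch run against made-up sample data. The file contents are hypothetical, and the grep pattern is a POSIX ERE rewrite of the -P pattern above (swapped in so it does not depend on PCRE support), so treat it as one possible equivalent rather than the only one.

```shell
# Hypothetical sample data; columns are separated by whitespace.
tmp=$(mktemp)
printf '%s\n' 'a 866 x' 'b 8660 y' '866 1 z' 'c 866 w' > "$tmp"

# awk: print every line whose second field equals 866.
awk_out=$(awk '$2 == 866' "$tmp")

# grep: first field, whitespace, then 866 followed by whitespace or end of line.
grep_out=$(grep -E '^[^[:space:]]+[[:space:]]+866([[:space:]]|$)' "$tmp")

printf '%s\n' "$awk_out"
rm -f "$tmp"
```

Both commands skip the line where 866 is in the first column and the line where the second column is 8660; the awk version just states the intent more directly.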
Extract column using grep
First figure out the command to find the column number.
columnname=C
sed -n "1 s/${columnname}.*//p" datafile | sed 's/[^\t]//g' | wc -c
The first sed keeps only the part of the header line before the column name, the second deletes everything except tabs, and wc -c counts the remaining tabs plus the trailing newline, which gives the column number.
Once you know the number, use cut:
cut -f1,3 < datafile
Combine into one command:
cut -f1,$(sed -n "1 s/${columnname}.*//p" datafile |
sed 's/[^\t]//g' | wc -c) < datafile
Finished? No. You should improve the first sed command for the case where one header is a substring of another header: include the tabs in your match and put them back in the replacement string.
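One way to sidestep the substring problem is to compare whole header fields instead of matching text. The sketch below swaps the sed pipeline for an awk lookup, which is a different technique from the one described above; the sample file, its headers, and the variable name colnum are made up for illustration.

```shell
# Hypothetical tab-separated sample; the header "CC" contains "C" as a substring.
datafile=$(mktemp)
printf 'A\tCC\tC\tD\n1\t2\t3\t4\n5\t6\t7\t8\n' > "$datafile"
columnname=C

# Print the index of the first header field that equals the name exactly.
colnum=$(awk -F'\t' -v name="$columnname" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) { print i; exit } }
' "$datafile")

# Extract column 1 plus the column that was found.
result=$(cut -f"1,$colnum" "$datafile")
printf '%s\n' "$result"
rm -f "$datafile"
```

If the header is not present, colnum stays empty, so a real script would probably want to check for that before calling cut.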
grep file matching specific column
You can use grep in combination with sed to manipulate the input patterns and achieve what you're looking for:
grep -Ef <(sed -e 's/^/^(\\S+\\s+){2}/;s/$/(\\s|$)/' uniq.txt) result.txt
Each pattern ends with (\s|$) rather than \s* so that a key cannot match a longer value that merely starts with it.
To match the nth column, replace 2 in the command above with n-1.
Output:
A00260:70:HJM2YDSXX:4:1111:15519:16720 NC_000011.10 9606 169 0 28 151 1
A00260:70:HJM2YDSXX:3:1536:9805:14841 NW_021160017.1 9606 81 0 24 151 1
A00260:70:HJM2YDSXX:3:1366:27181:24330 NC_014803.1 234831 121 121 26 151 3
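To see what the generated patterns look like, here is a small sketch with made-up keys and data. It assumes the keys contain no regex metacharacters, writes the patterns to a temporary file instead of using process substitution, and uses POSIX character classes in place of \S and \s; the variable n is a hypothetical name for the column to match.

```shell
# Hypothetical inputs: one key per line, and whitespace-separated data.
keys=$(mktemp); data=$(mktemp); pats=$(mktemp)
printf '%s\n' 9606 234831 > "$keys"
printf '%s\n' 'r1 NC_1 9606 169' 'r2 9606 77 5' \
              'r3 NC_2 96061 81' 'r4 NC_3 234831 121' > "$data"

n=3  # column to match (assumption for this example)

# Turn each key K into: ^(field whitespace){n-1}K(whitespace|end of line)
sed -E "s/.*/^([^[:space:]]+[[:space:]]+){$((n - 1))}&([[:space:]]|\$)/" \
    "$keys" > "$pats"

matched=$(grep -E -f "$pats" "$data")
printf '%s\n' "$matched"
rm -f "$keys" "$data" "$pats"
```

Here r2 is skipped because 9606 sits in the second column, and r3 is skipped because 96061 is only a prefix match for the key.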
using only 'grep' command to get specific column
You may use a GNU grep with a PCRE pattern:
grep -Po '^([^,]*,){2}\K[^,]*' file
Here,
^ - start of string
([^,]*,){2} - two occurrences of zero or more chars other than , followed by a ,
\K - match reset operator, which discards all text matched so far
[^,]* - zero or more chars other than a comma.
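A quick check of that pattern on a made-up line, assuming a GNU grep built with PCRE support. The cut call is only there as a cross-check; it is not part of the grep-only approach.

```shell
# Hypothetical comma-separated line.
line='alpha,beta,gamma,delta'

# Skip two "field," groups, reset the match with \K, keep the third field.
third=$(printf '%s\n' "$line" | grep -Po '^([^,]*,){2}\K[^,]*')

# Cross-check with cut, which selects the same field.
same=$(printf '%s\n' "$line" | cut -d, -f3)

printf '%s\n' "$third"
```

To pick a different field, change the 2 in {2} to one less than the field number you want.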
Grep and returning only column of match
grep itself can do this, with no additional tools, by using the -o/--only-matching switch. You should be able to just do:
grep -o '\<age:[0-9]\+'
To explain the less common parts of the regex:
\< is a zero-width assertion that you're at the beginning of a word (that is, age is preceded by a non-word character or occurs at the beginning of the line, but it's not actually matching that non-word character); this prevents you from matching, say, image:123. It doesn't technically require whitespace, so it would match :age: or the like; if that's a problem, match \t itself and use cut or tr to remove it later.
\+ means "match 1 or more occurrences of the preceding character class" (which is [0-9], so it matches one or more digits). \+ is equivalent to repeating the class twice, with the second copy followed by *, e.g. [0-9][0-9]*, except it's shorter, and some regex engines can optimize \+ better.
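For instance, on a made-up input line (and assuming GNU grep, since \< and \+ are GNU extensions to basic regular expressions), the command behaves like this sketch. Stripping the age: label with a parameter expansion is an extra step added here for illustration.

```shell
# Hypothetical input containing both age:42 and the decoy image:123.
input='name:bob age:42 image:123'

# -o prints only the matched text; \< keeps "image:123" from matching.
match=$(printf '%s\n' "$input" | grep -o '\<age:[0-9]\+')

# Drop the "age:" prefix to keep just the number.
value=${match#age:}

printf '%s\n' "$match" "$value"
```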
grep to search data in first column
Use awk to print the first column, then grep it (awk can read the file itself, so cat is not needed):
awk '{print $1}' myfile | grep query
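If you want the whole matching line rather than just the first column, awk can do the test on its own. A minimal sketch with a made-up file, where query stands in for whatever you are searching for:

```shell
# Hypothetical sample file.
tmp=$(mktemp)
printf '%s\n' 'query1 foo' 'bar query2' 'xquery baz' > "$tmp"

# Pipeline form: prints only the first column of matching lines.
cols=$(awk '{print $1}' "$tmp" | grep query)

# awk-only form: prints the full line when the first column matches.
lines=$(awk '$1 ~ /query/' "$tmp")

printf '%s\n' "$cols" "$lines"
rm -f "$tmp"
```

In both forms the line with query2 in the second column is ignored.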
Validating specific column in grep
You need to use the -P option of grep to enable Perl-compatible regular expressions. Try the following, which was written and tested with your shown samples.
grep -P '("\d+",){4}"[a-zA-Z]+","2020-12-\d{2}"' Input_file
Explanation: the annotated breakdown below is for explanation purposes only and is not meant to be run as written.
grep ##Starting grep command from here.
-P ##Mentioning -P option for enabling PCRE regex with grep.
'("\d+",){4} ##Looking for " digits " comma this combination 4 times here.
"[a-zA-Z]+", ##Then looking for " alphabets ", with this one.
"2020-12-\d{2}" ##Then looking for " 2020-12-07 date " which OP needs.
' Input_file ##Mentioning Input_file name here.
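Here is a small sketch of the same pattern run on made-up rows, assuming GNU grep with PCRE support. The leading ^ anchor is an addition for this example so the four numeric fields must be the first four on the line; the answer's pattern is unanchored.

```shell
# Hypothetical CSV rows with quoted fields.
tmp=$(mktemp)
printf '%s\n' '"1","2","3","4","abc","2020-12-07"' \
              '"1","2","3","4","abc","2020-11-07"' \
              '"1","2","x","4","abc","2020-12-09"' > "$tmp"

# Count rows with four numeric fields, one alphabetic field, and a December 2020 date.
valid=$(grep -cP '^("\d+",){4}"[a-zA-Z]+","2020-12-\d{2}"' "$tmp")

printf '%s\n' "$valid"
rm -f "$tmp"
```

Only the first row passes: the second has the wrong month and the third has a non-numeric value in the third field.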
bash: grep exact matches based on the first column
It matches
9342432_A1 9342432 1 0 0 0
because it has 9342432 in the second column.
You need to update the command to make grep check lines starting with those words, that is, use ^word:
$ grep -E '^4324321_A3|^9342432' file
4324321_A3 4324321 1 0 0 0
9342432 9342432 2 0 0 0
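To make it more accurate, you can also use -w, which matches full words only. Without -w, ^9342432 also matches any line whose first column merely begins with 9342432, such as 9342432_A1. With it, grep -wE '^4324321_A3|^9342432' file would not match a line like
4324321_A3something 4324321 1 0 0 0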
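The effect of -w can be checked with a small sketch on made-up data modelled on the lines above. The awk command at the end is an alternative added for comparison, not something the answer relies on.

```shell
# Hypothetical file contents.
tmp=$(mktemp)
printf '%s\n' '4324321_A3 4324321 1 0 0 0' \
              '9342432_A1 9342432 1 0 0 0' \
              '9342432 9342432 2 0 0 0' \
              '4324321_A3something 4324321 1 0 0 0' > "$tmp"

# Anchored only: prefixes of the first column still count as matches.
loose=$(grep -cE '^4324321_A3|^9342432' "$tmp")

# Anchored plus -w: the match must end at a word boundary.
strict=$(grep -cwE '^4324321_A3|^9342432' "$tmp")

# Alternative: compare the first field as a whole string with awk.
exact=$(awk '$1 == "4324321_A3" || $1 == "9342432"' "$tmp" | wc -l)

printf '%s\n' "$loose" "$strict"
rm -f "$tmp"
```

Because _ counts as a word character, -w rejects both 9342432_A1 and 4324321_A3something.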
Use grep on a specific column from text Bash Linux
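Use awk for these column-based searches:
cuenta=1
awk -F: -v var="$cuenta" '$1 == var' "$documento" | cut -d":" -f1,2,4
This will only match lines which have 1 in the first column. The {print $0} action is omitted because printing the line is awk's default, and "$documento" is quoted so that a filename containing spaces still works.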
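Since awk already splits the fields, it can also print the selected columns itself and make the cut step unnecessary. A minimal sketch with a made-up colon-separated file:

```shell
# Hypothetical colon-separated records: id:name:flag:amount
documento=$(mktemp)
printf '%s\n' '1:ana:x:100' '2:bob:y:200' '11:cy:z:300' > "$documento"
cuenta=1

# Select rows whose first field equals $cuenta, then print fields 1, 2 and 4.
out=$(awk -F: -v var="$cuenta" '$1 == var { print $1 ":" $2 ":" $4 }' "$documento")

printf '%s\n' "$out"
rm -f "$documento"
```

The row starting with 11 is not selected, because the comparison is against the whole field and not a substring of it.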