Linux shell sort file according to the second column?
If this is UNIX:
sort -k 2 file.txt
You can use multiple -k flags to sort on more than one column. For example, to sort by family name, then by first name as a tie breaker:
sort -k 2,2 -k 1,1 file.txt
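As a quick illustration of the tie-breaker behaviour, here is a minimal sketch. The sample names and the temporary file are made up for this example; LC_ALL=C is set only to make the comparison order reproducible.

```shell
# Hypothetical sample data: first name, then family name.
names=$(mktemp)
printf '%s\n' 'Zoe Adams' 'Bob Smith' 'Amy Smith' 'Cal Adams' > "$names"

# Primary key: field 2 only (2,2). Tie breaker: field 1 only (1,1).
# LC_ALL=C forces plain byte-order comparison so the result is reproducible.
by_family=$(LC_ALL=C sort -k 2,2 -k 1,1 "$names")
printf '%s\n' "$by_family"

rm -f "$names"
```

With this input the two Adams lines come first (Cal before Zoe), followed by the two Smith lines (Amy before Bob).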
Relevant options from "man sort":
-k, --key=POS1[,POS2]
start a key at POS1, end it at POS2 (origin 1)
POS is F[.C][OPTS], where F is the field number and C the character position in the field. OPTS is one or more single-letter ordering options, which override global ordering options for that key. If no key is given, use the entire line as the key.
-t, --field-separator=SEP
use SEP instead of non-blank to blank transition
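The -t option matters as soon as the columns are not separated by blanks. The following sketch assumes a hypothetical comma-separated file (name, department, age) and combines -t with a per-key n ordering option, as described in the OPTS part of the man page excerpt above.

```shell
# Hypothetical comma-separated data: name,department,age.
csv=$(mktemp)
printf '%s\n' 'bob,sales,41' 'amy,ops,29' 'cal,ops,35' > "$csv"

# -t ',' makes the comma the field separator. Key 2,2 sorts on the
# department only, and key 3,3n breaks ties numerically on the age column.
by_dept=$(LC_ALL=C sort -t ',' -k 2,2 -k 3,3n "$csv")
printf '%s\n' "$by_dept"

rm -f "$csv"
```

Here the two "ops" rows are listed first, ordered by age (29, then 35), and the "sales" row comes last.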
Sort a file based on a column in another file
An awk solution: store the 2nd file in memory, then loop over the first file, emitting matching lines from the 2nd file:
awk 'FNR==NR {x2[$1] = $0; next} $1 in x2 {print x2[$1]}' second first
Implementing @Barmar's comment
join -1 2 -o "1.1 1.2 2.2 2.3" <(cat -n first | sort -k2) <(sort second) |
sort -n |
cut -d ' ' -f 2-
Note to other answerers: I tested with the following files.
$ cat first
foo x y
bar x y
baz x y
$ cat second
bar x1 y1
baz x2 y2
foo x3 y3
Explanation of
awk 'FNR==NR {x2[$1] = $0; next} $1 in x2 {print x2[$1]}' second first
This part reads the first file named on the command line (here, "second"):
FNR==NR {x2[$1] = $0; next}
The condition FNR == NR is true only while awk is reading the first named file. FNR is awk's "File Record Number" variable, which restarts at 1 for each input file; NR is the current record number across all input sources. The current line is stored in an associative array named x2 (not a great variable name), indexed by the first field of the record.
The next condition, $1 in x2, is only evaluated once the file "second" has been completely read, because the next statement in the first block skips it for every line of that file. It tests the first field of each line in the file named "first", and the action prints the corresponding line from file "second", which has been stored in the array.
Note that the order of the files in the awk command is important. Since you control the output based on the file named "first", it must be the last file processed by awk.
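To see why FNR == NR singles out the first file, it can help to print both counters. This sketch recreates the two test files shown earlier in a temporary directory (the directory is an assumption made for the demo) and then runs the original one-liner unchanged.

```shell
# Recreate the answer's test files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'foo x y' 'bar x y' 'baz x y' > "$dir/first"
printf '%s\n' 'bar x1 y1' 'baz x2 y2' 'foo x3 y3' > "$dir/second"

# Show FILENAME, NR and FNR for every record: the two counters are equal
# only while the first named file ("second") is being read.
trace=$(cd "$dir" && awk '{ print FILENAME, NR, FNR }' second first)
printf '%s\n' "$trace"

# The one-liner from the answer: lines of "second" in the order of "first".
result=$(cd "$dir" && awk 'FNR==NR {x2[$1] = $0; next} $1 in x2 {print x2[$1]}' second first)
printf '%s\n' "$result"

rm -rf "$dir"
```

The trace shows NR running from 1 to 6 while FNR goes 1, 2, 3 for "second" and then 1, 2, 3 again for "first". The final result lists the lines of "second" in the key order foo, bar, baz.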
How to sort a file according to a column in another file?
The function sorted can receive a keyword argument called key, which is a function that returns a comparable value for each element of the list.
If you have two lists, with the File_1 columns in one and the File_2 columns in the other, you could use:
indexes = sorted(range(len(File_2Column)), key=lambda i: File_1Col4[i])
sortedFile_2Col = [File_2Column[i] for i in indexes]
# you can repeat this line for all the columns you want to be sorted by that order
Sorting lines in one file given the order in another file
Use awk to put the line number from file2 as an extra column in front of each line of file1. Sort the result by that column, then remove that prefix column:
awk 'FNR == NR { lineno[$1] = NR; next}
{print lineno[$1], $0;}' file2 file1 | sort -k 1,1n | cut -d' ' -f2-
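The pipeline above is easier to follow with the intermediate "decorated" output made visible. The sketch below uses hypothetical contents for file2 (the desired key order) and file1 (the data), kept in a temporary directory, and splits the pipeline into two steps.

```shell
dir=$(mktemp -d)
# Hypothetical inputs: file2 gives the desired key order, file1 holds the data.
printf '%s\n' 'c' 'a' 'b' > "$dir/file2"
printf '%s\n' 'a apple' 'b banana' 'c cherry' > "$dir/file1"

# Step 1: prefix each line of file1 with the line number its key has in file2.
decorated=$(awk 'FNR == NR { lineno[$1] = NR; next }
                 { print lineno[$1], $0 }' "$dir/file2" "$dir/file1")
printf '%s\n' "$decorated"

# Step 2: sort numerically on that prefix, then cut the prefix off again.
ordered=$(printf '%s\n' "$decorated" | sort -k 1,1n | cut -d ' ' -f 2-)
printf '%s\n' "$ordered"

rm -rf "$dir"
```

The decorated lines are "2 a apple", "3 b banana" and "1 c cherry", and the final order is cherry, apple, banana, matching file2. One caveat: a key in file1 that does not appear in file2 gets an empty prefix, which the numeric sort treats as 0, so such lines come out first.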
Sorting data in file based on first column in another file
$ cat tst.awk
NR==FNR {
    if (NR==1) {
        print
    }
    else {
        map[$1] = $0
    }
    next
}
{ print map[$1] }
$ awk -f tst.awk dataframe1 dataframe2
N02_M N05_F N06_M N07_F N08_F N09_M N02_M N026_F N03_M
586 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364
2237895 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225
7499 0.803 0.803 0.803 0.803 0.803 0.803 0.803 0.803 0.803
35209 0.94 0.94 0.94 0.94 0.94 0.94 0.94 0.94 0.94
2255280 0.995 0.995 0.995 0.995 0.995 0.995 0.995 0.995 0.995
7294280 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478
Sorting data based on second column of a file
You can use the key option (-k) of the sort command, which takes a "field number", so if you wanted the second column:
sort -k2 -n yourfile
-n, --numeric-sort
compare according to string numerical value
For example:
$ cat ages.txt
Bob 12
Jane 48
Mark 3
Tashi 54
$ sort -k2 -n ages.txt
Mark 3
Bob 12
Jane 48
Tashi 54
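To show what -n changes, the following sketch sorts the same ages.txt data twice, once as text and once numerically. It writes the data to a temporary file and uses the stricter key form 2,2 (an assumption of this example, limiting the key to the second field only) with LC_ALL=C for a reproducible order.

```shell
# Same data as ages.txt above, written to a scratch file.
ages=$(mktemp)
printf '%s\n' 'Bob 12' 'Jane 48' 'Mark 3' 'Tashi 54' > "$ages"

# Without n the second field is compared character by character,
# so "12" sorts before "3".
lexical=$(LC_ALL=C sort -k 2,2 "$ages")
printf '%s\n' "$lexical"

# With n the second field is compared as a number.
numeric=$(LC_ALL=C sort -k 2,2n "$ages")
printf '%s\n' "$numeric"

rm -f "$ages"
```

The text comparison yields Bob 12, Mark 3, Jane 48, Tashi 54, while the numeric comparison yields Mark 3, Bob 12, Jane 48, Tashi 54.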
Sort a file by first (or second, or else) column in python
The problem you're having is that you're not turning each line into a list. When you read in the file, you're just getting the whole line as a string. You're then sorting by the first character of each line, and this is always the same character in your input, 'E'.
To just sort by the first column, you need to split the first block off and just read that section. So your key should be this:
for line in sorted(lines, key=lambda line: line.split()[0]):
    print(line, end="")  # assumed loop body; lines read from a file keep their newline
split will turn your line into a list of whitespace-separated fields, and then the first column is taken from that list with [0].