How to sort multi-line records in bash?
Probably far from optimal, but
sed -r ':r;/(^|\n)$/!{$!{N;br}};s/\n/\v/g' names | sort | sed 's/\v/\n/g'
seems to do the job (names is the file containing the records). This allows records of arbitrary length, not just two lines.
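For example, with a hypothetical names file of two-line records separated by blank lines (a trailing blank line keeps the record separators uniform):

```shell
# Hypothetical sample data; GNU sed is assumed for -r and \v handling
printf 'Smith\nJohn\n\nAdams\nZoe\n\nBrown\nAmy\n\n' > names

# Join each record into one line on \v, sort the lines, restore the newlines
sed -r ':r;/(^|\n)$/!{$!{N;br}};s/\n/\v/g' names | sort | sed 's/\v/\n/g'
# Adams
# Zoe
#
# Brown
# Amy
#
# Smith
# John
```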
Sorting multiple-line records in alphabetical order using shell
A quick way to come up with something is:
$ awk 'BEGIN{RS=""; FS="\n"; OFS="|"}{$1=$1}1' file \
| sort | awk 'BEGIN{FS="|";OFS="\n";ORS="\n\n"}{$1=$1}1'
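For example, on a small hypothetical file the two steps look like this; the intermediate one-line-per-record form assumes the records themselves contain no | characters:

```shell
# Hypothetical sample file with two blank-line-separated records
printf 'Smith\nJohn\n\nAdams\nZoe\n' > file

# Step 1: paragraph mode (RS="") reads one record at a time; reassigning
# $1 rebuilds the record with OFS, flattening each record to one line
awk 'BEGIN{RS=""; FS="\n"; OFS="|"}{$1=$1}1' file
# Smith|John
# Adams|Zoe

# Full pipeline: flatten, sort the one-line records, unflatten
awk 'BEGIN{RS=""; FS="\n"; OFS="|"}{$1=$1}1' file \
  | sort | awk 'BEGIN{FS="|";OFS="\n";ORS="\n\n"}{$1=$1}1'
# Adams
# Zoe
#
# Smith
# John
```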
Or you can do it with a single GNU awk command:
$ awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"; PROCINFO["sorted_in"]="@val_str_asc"}
{a[NR]=$0}END{for(i in a) print a[i]}' file
If you don't want the last line to be empty, you can do the following:
$ awk 'BEGIN{RS=""; FS="\n"; OFS="|"}{$1=$1}1' file \
| sort | awk 'BEGIN{FS="|";OFS="\n"}{$1=$1}1' | sed '$d'
$ awk 'BEGIN{RS=""; FS=OFS="\n"; PROCINFO["sorted_in"]="@val_str_asc"}
{a[NR]=$0}END{for(i in a) print a[i] (--NR?"\n":"")}' file
Sort array with multiple lines using another ordered array pattern in bash with awk
Could you please try the following. I have changed the solution a bit, because it was not clear whether you want to print ALL values for a given key (for example NC) from array a, so I have changed the logic: it now keeps concatenating values under each key string (NC, NV, and so on), and when such a key is found in array b, it prints all of the values collected for it (from array a).
awk -v OFS='\t' '
FNR==NR{
  split($5,a,"_")
  array[a[1]]=(array[a[1]]?array[a[1]] ORS $0:$0)
  next
}
($1 in array){
  print array[$1]
  delete array[$1]
}
END{
  for(j in array){
    if(array[j]){ print array[j] }
  }
}' <(printf '%s\n' "${a[@]}") <(printf '%s\n' "${b[@]}")
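A minimal run under bash, with made-up sample arrays (the ZZ key is there to show that unmatched lookups are simply skipped):

```shell
#!/bin/bash
# Hypothetical sample data: field 5 of each line in a[] carries a key such
# as NC_1; b[] lists keys whose collected records should be printed first
a=('r1 x x x NC_1' 'r2 x x x NV_2' 'r3 x x x NC_3')
b=('NC' 'ZZ')

awk -v OFS='\t' '
FNR==NR{                # first input: group records by the part before "_"
  split($5,a,"_")
  array[a[1]]=(array[a[1]]?array[a[1]] ORS $0:$0)
  next
}
($1 in array){          # second input: print and remove matched groups
  print array[$1]
  delete array[$1]
}
END{                    # leftovers: groups never named in b
  for(j in array) if(array[j]) print array[j]
}' <(printf '%s\n' "${a[@]}") <(printf '%s\n' "${b[@]}")
# r1 x x x NC_1
# r3 x x x NC_3
# r2 x x x NV_2
```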
How to sort data based on the value of a column for part (multiple lines) of a file?
Apply the DSU (Decorate/Sort/Undecorate) idiom using any awk+sort+cut, regardless of how many lines are in each block:
$ awk -v OFS='\t' '
NF<pNF || NR==1 { blockNr++ }
{ print blockNr, NF, NR, (NF>1 ? $1 : NR), $0; pNF=NF }
' file |
sort -n -k1,1 -k2,2 -k4,4 -k3,3 |
cut -f5-
3
0
1 0.8
2 0.5
3 0.2
3
1
1 0.4
2 0.1
3 0.8
3
2
1 0.8
2 0.4
3 0.3
To understand what that's doing, just look at the first 2 steps:
$ awk -v OFS='\t' 'NF<pNF || NR==1{ blockNr++ } { print blockNr, NF, NR, (NF>1 ? $1 : NR), $0; pNF=NF }' file
1 1 1 1 3
1 1 2 2 0
1 2 3 2 2 0.5
1 2 4 1 1 0.8
1 2 5 3 3 0.2
2 1 6 6 3
2 1 7 7 1
2 2 8 2 2 0.1
2 2 9 3 3 0.8
2 2 10 1 1 0.4
3 1 11 11 3
3 1 12 12 2
3 2 13 1 1 0.8
3 2 14 2 2 0.4
3 2 15 3 3 0.3
$ awk -v OFS='\t' 'NF<pNF || NR==1{ blockNr++ } { print blockNr, NF, NR, (NF>1 ? $1 : NR), $0; pNF=NF }' file |
sort -n -k1,1 -k2,2 -k4,4 -k3,3
1 1 1 1 3
1 1 2 2 0
1 2 4 1 1 0.8
1 2 3 2 2 0.5
1 2 5 3 3 0.2
2 1 6 6 3
2 1 7 7 1
2 2 10 1 1 0.4
2 2 8 2 2 0.1
2 2 9 3 3 0.8
3 1 11 11 3
3 1 12 12 2
3 2 13 1 1 0.8
3 2 14 2 2 0.4
3 2 15 3 3 0.3
and notice that the awk command is just creating the key values that you need for sort to sort on: block number, line number or $1, etc. So awk Decorates the input, sort Sorts it, and cut Undecorates it by removing the decoration values that the awk script added.
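For completeness, the sample input behind those runs can be read off the last column of the decorated output, so the whole pipeline can be replayed:

```shell
# Input reconstructed from the decorated output above: three blocks, each
# two single-field header lines followed by "index value" pairs
cat > file <<'EOF'
3
0
2 0.5
1 0.8
3 0.2
3
1
2 0.1
3 0.8
1 0.4
3
2
1 0.8
2 0.4
3 0.3
EOF

awk -v OFS='\t' '
  NF<pNF || NR==1 { blockNr++ }
  { print blockNr, NF, NR, (NF>1 ? $1 : NR), $0; pNF=NF }
' file |
sort -n -k1,1 -k2,2 -k4,4 -k3,3 |
cut -f5-
```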
Bash - sort range of lines in file
With head, GNU sed and tail:
(head -n 1 test.sh; sed -n '2,${/\\/p}' test.sh | sort; tail -n 1 test.sh) > test_new.sh
Output:
g++ -o test.out \
Blub.cpp \
Framework.cpp \
Main.cpp \
Sample.cpp \
-std=c++14 -lboost
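That output corresponds to an input along these lines, where the first and last lines stay in place and only the middle, backslash-continued lines get sorted:

```shell
# A plausible input test.sh (the exact original was not shown)
cat > test.sh <<'EOF'
g++ -o test.out \
Main.cpp \
Blub.cpp \
Sample.cpp \
Framework.cpp \
-std=c++14 -lboost
EOF

# sed prints only lines 2..$ that contain a backslash, which sort reorders
(head -n 1 test.sh; sed -n '2,${/\\/p}' test.sh | sort; tail -n 1 test.sh) > test_new.sh
cat test_new.sh   # prints the sorted version shown above
```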
How to sort groups of lines?
Maybe not the fastest :) [1] but it will do what you want, I believe:
for line in $(grep -n '^\[.*\]$' sections.txt |
              sort -k2 -t: |
              cut -f1 -d:); do
  tail -n +$line sections.txt | head -n 5
done
Here's a better one:
for pos in $(grep -b '^\[.*\]$' sections.txt |
             sort -k2 -t: |
             cut -f1 -d:); do
  tail -c +$((pos+1)) sections.txt | head -n 5
done
[1] The first one is something like O(N^2) in the number of lines in the file, since it has to read all the way to the section for each section. The second one, which can seek immediately to the right character position, should be closer to O(N log N).
[2] This takes you at your word that there are always exactly five lines in each section (header plus four following), hence head -n 5
. However, it would be really easy to replace that with something which read up to but not including the next line starting with a '[', in case that ever turns out to be necessary.
Preserving start and end requires a bit more work:
# Find all the sections
mapfile indices < <(grep -b '^\[.*\]$' sections.txt)
# Output the prefix
head -c+${indices[0]%%:*} sections.txt
# Output sections, as above
for pos in $(printf %s "${indices[@]}" |
             sort -k2 -t: |
             cut -f1 -d:); do
  tail -c +$((pos+1)) sections.txt | head -n 5
done
# Output the suffix
tail -c+$((1+${indices[-1]%%:*})) sections.txt | tail -n+6
You might want to make a function out of that, or a script file, changing sections.txt to $1 throughout.
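One possible shape for that function (hypothetical name sort_sections; bash is assumed for mapfile and negative array indices, and it keeps the assumptions above: five lines per section and at least one [header] line in the file):

```shell
#!/bin/bash
# Hypothetical wrapper around the snippet above; the file to sort is $1.
sort_sections() {
    local f=$1 indices pos
    # Byte offsets of the section headers, in file order
    mapfile -t indices < <(grep -b '^\[.*\]$' "$f")
    # Prefix: everything before the first header
    head -c "${indices[0]%%:*}" "$f"
    # Sections, ordered by header text
    for pos in $(printf '%s\n' "${indices[@]}" |
                 sort -k2 -t: |
                 cut -f1 -d:); do
        tail -c +$((pos + 1)) "$f" | head -n 5
    done
    # Suffix: everything after the last section's five lines
    tail -c +$((1 + ${indices[-1]%%:*})) "$f" | tail -n +6
}

sort_sections sections.txt > sections_sorted.txt
```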