Sort Entries of Lines Using Shell

Perl can do this nicely as a one-line Unix/Linux command:

perl -lne 'print join " ", sort { $a <=> $b } split " "' < input.txt > output.txt

The single quotes around the program keep bash from expanding $a and $b. In a Windows shell, invert the single and double quotes. If you put the double quotes on the outside under bash, the dollars must be escaped with backslashes. The -l switch puts the newline back on each output line; without it the lines run together, because split " " discards the trailing newline.

Note that the distinctions you are trying to draw between commands, programming languages, and programs are pretty thin. Bash is a programming language. Perl can certainly be used as a shell. Both are commands.

The reason your script runs slowly is that it spawns 3 processes per loop iteration. Process creation is pretty expensive.

How to sort file lines in BASH

sort is the tool for this:

sort your_file
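A few common variations, shown on throwaway sample data so they can be run as-is. The sample helper is made up for the demo; -u, -r and -n are standard sort options, and LC_ALL=C is only there to make the ordering predictable across locales.

```shell
# Sample data is piped in, so nothing on disk is touched.
sample() { printf '%s\n' banana 10 apple 9 banana; }

sample | LC_ALL=C sort       # byte order: 10, 9, apple, banana, banana
sample | LC_ALL=C sort -u    # same order, duplicate lines dropped
sample | LC_ALL=C sort -r    # reverse order: banana, banana, apple, 9, 10
printf '%s\n' 10 9 100 | sort -n    # numeric order: 9, 10, 100
```

Without -n, "10" sorts before "9" because the comparison is character by character.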

Sort text file in a shell script

Using the techniques from "How can I extract a predetermined range of lines from a text file on Unix?":

#!/usr/bin/env bash

input=$1
total_lines=$(wc -l <"$1")
sections=$2

lines_per_section=$(( total_lines / sections ))
if (( lines_per_section * sections != total_lines )); then
  echo "ERROR: ${total_lines} does not evenly divide into ${sections} sections" >&2
  exit 1
fi

start=0
ranges=( )
for (( i=0; i<sections; i++ )); do
  ranges+=( "$start:$(( start + lines_per_section ))" )
  (( start += lines_per_section ))
done

get_range() { sed -n "$(( $1 + 1 )),$(( $2 ))p;$(( $2 + 1 ))q" <"$input"; }
consolidate_input() {
  if (( $# )); then
    current=$1; shift
    paste <(get_range "${current%:*}" "${current#*:}") <(consolidate_input "$@")
  fi
}

consolidate_input "${ranges[@]}"

But don't do that. Just put your three sections in three separate files, so you can use paste file1 file2 file3.
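If the sections do have to come out of a single file, one way to follow that advice is to let split create the separate files and hand them to paste. This is only a sketch: sections_to_columns is a made-up name, and it assumes a non-empty input file and a section count of at least 1.

```shell
# Hypothetical helper: print FILE's lines as SECTIONS side-by-side columns.
# Usage: sections_to_columns FILE SECTIONS
sections_to_columns() (
    # The body runs in a subshell, so the variables and the cleanup trap stay local.
    input=$1
    sections=$2
    total=$(wc -l <"$input")
    if [ $(( total % sections )) -ne 0 ]; then
        echo "ERROR: $total does not evenly divide into $sections sections" >&2
        exit 1
    fi
    tmpdir=$(mktemp -d) || exit 1
    trap 'rm -rf "$tmpdir"' EXIT
    # split names the pieces part.aa, part.ab, ... so the glob keeps them in order.
    split -l "$(( total / sections ))" "$input" "$tmpdir/part."
    paste "$tmpdir"/part.*
)
```

For a six-line file containing a through f, sections_to_columns file 3 prints a, c, e on the first row and b, d, f on the second, separated by tabs.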

Bash - sort range of lines in file

With head, GNU sed and tail:

(head -n 1 test.sh; sed -n '2,${/\\/p}' test.sh | sort; tail -n 1 test.sh) > test_new.sh

Output:


g++ -o test.out \
Blub.cpp \
Framework.cpp \
Main.cpp \
Sample.cpp \
-std=c++14 -lboost
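The same head / middle / tail idea can be written for an explicit line range instead of a backslash filter. The sketch below is a variation rather than the command above: sort_line_range is a made-up name, it selects by line number, and it uses plain awk so it does not depend on GNU sed.

```shell
# Hypothetical helper: sort only lines FIRST..LAST of FILE, leaving the rest in place.
# Usage: sort_line_range FILE FIRST LAST
sort_line_range() {
    # Lines before the range, unchanged.
    awk -v first="$2" 'NR < first' "$1"
    # Lines inside the range, sorted in byte order.
    awk -v first="$2" -v last="$3" 'NR >= first && NR <= last' "$1" | LC_ALL=C sort
    # Lines after the range, unchanged.
    awk -v last="$3" 'NR > last' "$1"
}
```

Usage would look like sort_line_range test.sh 2 5 > test_new.sh. The file is read three times, which is fine for small scripts but wasteful for large inputs.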

bash - how do you sort within the lines of a text file

If you have GNU awk, it can be done in a single command using the asort function:

awk '{for(i=1; i<=NF; i++) c[i]=$i; n=asort(c); 
for (i=1; i<=n; i++) printf "%s%s", c[i], (i<n?OFS:RS); delete c}' file

Output:

t v x y z
a b c
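If GNU awk is not available, the same result can be had from any POSIX awk by sorting the fields by hand. This is a sketch of an alternative, not the asort approach: sort_words_in_lines is a made-up name, and it uses a simple insertion sort with string comparison, which is fine for lines with a modest number of words.

```shell
# Hypothetical helper: sort the whitespace-separated words of every line.
# Reads the named files, or stdin when no file is given.
sort_words_in_lines() {
    awk '{
        # Insertion sort over the fields $1..$NF, compared as strings.
        for (i = 2; i <= NF; i++) {
            v = $i
            for (j = i - 1; j >= 1 && ($j "") > (v ""); j--)
                $(j + 1) = $j
            $(j + 1) = v
        }
        print
    }' "$@"
}

printf 'z x t y v\nc b a\n' | sort_words_in_lines
# prints:
# t v x y z
# a b c
```

Everything still happens in one awk process, so there is no per-line process creation.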

Sorting multiple-line records in alphabetical order using shell

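A quick approach is to flatten each blank-line-separated record onto a single line, sort those lines, and then restore the line breaks (this assumes no record contains a | character):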

$ cat file | awk 'BEGIN{RS=""; FS="\n"; OFS="|"}{$1=$1}1' \
| sort | awk 'BEGIN{FS="|";OFS="\n";ORS="\n\n"}{$1=$1}1'

Or you can do it in a single GNU awk command:

$ awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"; PROCINFO["sorted_in"]="@val_str_asc"}
{a[NR]=$0}END{for(i in a) print a[i]}' file

If you don't want the last line to be empty, you can do the following:

$ cat file | awk 'BEGIN{RS=""; FS="\n"; OFS="|"}{$1=$1}1' \
| sort | awk 'BEGIN{FS="|";OFS="\n";ORS="\n\n"}{$1=$1}1' | sed '$d'

$ awk 'BEGIN{RS=""; FS=OFS="\n"; PROCINFO["sorted_in"]="@val_str_asc"}
{a[NR]=$0}END{for(i in a) print a[i] (--NR?"\n":"")}' file
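To make the record sorting concrete, here is a small runnable sketch of the pipeline variant on sample data. The sort_records name and the fruit records are made up; the sketch assumes no record contains a | character, since that is the temporary joiner, and it keeps ORS="\n\n" so the blank line between records survives before sed drops the final one.

```shell
# Hypothetical wrapper around the flatten / sort / restore pipeline.
sort_records() {
    awk 'BEGIN { RS = ""; FS = "\n"; OFS = "|" } { $1 = $1 } 1' "$@" |
        LC_ALL=C sort |
        awk 'BEGIN { FS = "|"; OFS = "\n"; ORS = "\n\n" } { $1 = $1 } 1' |
        sed '$d'    # drop the blank line left behind by the final ORS
}

printf 'pear\ngreen\n\napple\nred\n\nlemon\nyellow\n' | sort_records
# prints the records in the order apple, lemon, pear,
# still separated by single blank lines
```

Records are ordered by their first line, then by later lines when the first lines tie.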

Shell one-liner to add a line to a sorted file

echo "New Line" | sort -o file - file

The -o file means write result to file (and it is explicitly safe to have any of the input files as the output file). The - on its own means 'read standard input' which contains the new line of information. The file at the end means 'also read file'. This would work with any Unix sort from (at least) 7th Edition UNIX™ circa 1978 onwards, and possibly even before that. There are no temporary files or dependencies on other utilities.

Given that a single line is 'sorted' and the file is also in sorted order, you can probably speed the process up by just merging the two sorted inputs:

echo "New Line" | sort -o file -m - file

That also would have worked with even really old sort commands.
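Wrapped in a function, the first form looks like the sketch below. The add_sorted_line name is made up, and LC_ALL=C is an added assumption: it pins the collation order so the file is always sorted the same way.

```shell
# Hypothetical helper: insert LINE into the already-sorted FILE, keeping it sorted.
# Usage: add_sorted_line FILE LINE
add_sorted_line() {
    # "-" reads the new line from stdin; FILE is both an input and the -o target.
    printf '%s\n' "$2" | LC_ALL=C sort -o "$1" - "$1"
}
```

After add_sorted_line fruits.txt banana, a file that held apple and cherry holds apple, banana and cherry. You can confirm the file is still in order with LC_ALL=C sort -c fruits.txt, which exits non-zero if it is not.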

Sort alphabetically lines between 2 patterns in Bash

This is a perfect case to use asort() to sort an array in GNU awk:

gawk '/PATTERN1/ {f=1; delete a}
/PATTERN2/ {f=0; n=asort(a); for (i=1;i<=n;i++) print a[i]}
!f
f{a[$0]=$0}' file

This uses logic similar to "How to select lines between two marker patterns which may occur multiple times with awk/sed", with the addition that it:

  • Prints lines outside this range
  • Stores lines within this range
  • And when the range is over, sorts and prints them.

Detailed explanation:

  • /PATTERN1/ {f=1; delete a} when finding a line matching PATTERN1, sets a flag on, and clears the array of lines.
  • /PATTERN2/ {f=0; n=asort(a); for (i=1;i<=n;i++) print a[i]} when finding a line matching PATTERN2, sets the flag off. It also sorts the array a[] containing all the lines in the range and prints them.
  • !f if the flag is off (that is, outside the range), evaluates to true so that the line is printed.
  • f{a[$0]=$0} if the flag is on, store the line in the array a[] so that its info can be used later on.
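For systems without gawk, the same flag logic can be sketched with any POSIX awk by piping the in-range lines to an external sort. This is a variation under stated assumptions, not the command above: sort_between is a made-up name, it reads stdin, it prints the start marker immediately rather than storing it, and it keeps duplicate lines because nothing is stored in an array keyed by line content.

```shell
# Hypothetical helper: sort the lines strictly between START_RE and STOP_RE.
# Usage: sort_between START_RE STOP_RE < FILE
sort_between() {
    awk -v start_re="$1" -v stop_re="$2" -v sorter='LC_ALL=C sort' '
        # Range is over: closing the pipe makes sort emit its lines now.
        inside && $0 ~ stop_re { close(sorter); inside = 0 }
        # Inside the range: hand the line to sort instead of printing it.
        inside { print | sorter; next }
        # Outside the range, markers included: print as-is.
        { print }
        # Range starts: flush first so earlier lines come before the sort output.
        $0 ~ start_re { fflush(); inside = 1 }
    '
}

printf '%s\n' aaa PATTERN1 qux foo bar PATTERN2 ccc | sort_between PATTERN1 PATTERN2
# prints: aaa, PATTERN1, bar, foo, qux, PATTERN2, ccc (one per line)
```

The fflush() call matters when the output is a pipe or a file: awk buffers its own output, while sort writes directly, so without the flush the sorted lines could appear ahead of the lines that precede the range.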

Test

▶ gawk '/PATTERN1/ {f=1} /PATTERN2/ {f=0; n=asort(a); for (i=1;i<=n;i++) print a[i]} !f; f{a[$0]=$0}' FILE
aaa
bbb
PATTERN1
bar
baz
foo
qux
PATTERN2
ccc
ddd

