Bash Sort by Regexp

Sort output of find based on regex

A possible solution (if your paths don't contain newlines):

find . -type d -name 'somefilter' |
sed -En 's|.*/.*_([0-9]{8}_[0-9]{6})_.*|\1 &|p' |
sort -k1,1 |
sed -E 's/[^ ]* //'

The pipeline works as follows:

  1. find the matching dirpaths
  2. search for the _YYYYMMDD_HHMMSS_ pattern in each dirname and prepend it to the output; drop the dirpaths that don't match.
  3. sort considering the first column only (which is YYYYMMDD_HHMMSS)
  4. remove the first column from the output
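
As a rough illustration of the four steps, the sketch below feeds a few made-up paths through the same decorate, sort, undecorate pipeline. The paths are hypothetical and stand in for the output of find; LC_ALL=C is added only to make the ordering predictable across locales.

```shell
# Hypothetical sample paths standing in for the output of find.
paths='./runs/job_20240105_120000_b
./runs/job_20231231_235959_a
./runs/misc_notes
./old/job_20240105_080000_c'

sorted=$(printf '%s\n' "$paths" |
  # Prepend the timestamp as a sort key; drop lines without one.
  sed -En 's|.*/.*_([0-9]{8}_[0-9]{6})_.*|\1 &|p' |
  # Sort on the prepended key only.
  LC_ALL=C sort -k1,1 |
  # Strip the key again.
  sed -E 's/[^ ]* //')

printf '%s\n' "$sorted"
```

With these inputs, ./runs/misc_notes is dropped because it carries no timestamp, and the remaining three paths come out oldest first regardless of which directory they live in.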

How to sort lines by regex with sed

If you can use . as the field separator, you can sort the file like this:

sort -t. -k2,2 file

This (of course) requires every line to contain the same number of dots, at least through the first two columns.

As @mklement0 said in the edit: using -k2,2 instead of the plain -k2 is better because:

it's cleaner (more robust, more efficient) to limit sorting to the
field of interest - if you don't specify an end field, everything
through the end of the line is sorted
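
To make the difference concrete, here is a small sketch with made-up three-field lines. The input is hypothetical, and LC_ALL=C is used only so the comparison is byte-wise and reproducible.

```shell
# Hypothetical input: the second field is the one we want to sort on.
input='a.2.z
b.2.a
c.1.m'

# Key limited to field 2; ties fall back to a whole-line comparison.
limited=$(printf '%s\n' "$input" | LC_ALL=C sort -t. -k2,2)

# Key runs from field 2 through the end of the line.
open_ended=$(printf '%s\n' "$input" | LC_ALL=C sort -t. -k2)

printf 'k2,2:\n%s\n\nk2:\n%s\n' "$limited" "$open_ended"
```

With -k2,2 the two lines whose second field is 2 keep their a-before-b order, because only the whole-line fallback breaks the tie. With -k2 the third field becomes part of the key, so b.2.a moves ahead of a.2.z.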

How to sort an array using regex in Javascript?

No regex is needed; just sort on the integer value of the second character in each entry:

const arr = ["#0abc", "#2egf", "#0pol kol", "#1loa", "#2ko pol"];

// Compare the digit that follows the leading "#".
arr.sort((a, b) => parseInt(a[1], 10) - parseInt(b[1], 10));

console.log(arr);

How to sort a file based on key name instead of its position in unix?

sort doesn't have a concept of named keys, but you can perform a Schwartzian transform to temporarily add the key as a prefix to the line, sort on the first field, then discard it.

tab=$(printf '\t')
sed "s/\(.*\)\(party_id=\"[^\"]*\"\)/\2${tab}\1\2/" file |
sort -t "$tab" -k1,1 |
cut -f2-

(The separator between the prepended key and the original line, and the sort -t argument, must both be a literal tab. Keeping the tab in a variable means the command survives copy and paste.)
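
The same idea can be wrapped in a small function that takes the key name as a parameter. This is only a sketch: sort_by_named_key is a hypothetical helper, the records are made up, and the key name is assumed to contain no regex metacharacters or slashes.

```shell
# sort_by_named_key is a hypothetical helper, not a standard utility.
# It reads records on stdin and sorts them by the key="value" pair named in $1.
sort_by_named_key() {
  key=$1
  tab=$(printf '\t')
  # Decorate: copy the key="value" pair to the front, followed by a tab.
  sed "s/\(.*\)\(${key}=\"[^\"]*\"\)/\2${tab}\1\2/" |
    # Sort on the decorated first field only.
    LC_ALL=C sort -t "$tab" -k1,1 |
    # Undecorate: drop the prepended field.
    cut -f2-
}

# Hypothetical records with the key at different positions.
records='id="1" party_id="zeta" amount="10"
id="2" party_id="alpha" amount="20"
amount="5" party_id="mid" id="3"'

sorted=$(printf '%s\n' "$records" | sort_by_named_key party_id)
printf '%s\n' "$sorted"
```

The records come out ordered alpha, mid, zeta even though party_id sits in a different column on the third line. A line that lacks the key passes through undecorated and is sorted by its full text.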

Regex/bash find string with most recent date?

Perhaps the shortest command:

ls -1 | sort -t_ -k2.5nr,2 -k2.3nr,2 -k2.1nr,2 -k3r

It sorts by year, month, and day, in that order. The -t option specifies the field separator for the column numbers used in the -k option values.

Each -kX.Ynr,2 option sorts on a key that starts at character Y of column X and ends at the end of column 2 (the number after the comma), in reverse (r) numeric (n) order.

          -k2.5
              v
00.weekly_17032015_T050600
          ^^^^^^^^
          column 2

The last -k3r sorts by the third column in reverse order.

The most recent entry will be at the top of the list. You can select it by appending | head -n 1 to the end of the command.
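
A quick way to check the key definitions is to run them on a handful of names instead of a real directory listing. The names below are hypothetical and follow the name_DDMMYYYY_Thhmmss layout of the example; LC_ALL=C is an added precaution for reproducible ordering.

```shell
# Hypothetical file names in the name_DDMMYYYY_Thhmmss layout.
names='00.weekly_17032015_T050600
00.weekly_29122014_T050600
00.weekly_03042015_T050600
00.weekly_24032015_T050600'

# Year (chars 5-8), then month+year (chars 3-8), then the whole date, all descending.
ordered=$(printf '%s\n' "$names" |
  LC_ALL=C sort -t_ -k2.5nr,2 -k2.3nr,2 -k2.1nr,2 -k3r)

# The first line is the most recent entry.
newest=$(printf '%s\n' "$ordered" | head -n 1)

printf '%s\n' "$ordered"
printf 'newest: %s\n' "$newest"
```

Here the April 2015 entry sorts first, followed by 24 and 17 March 2015, with the December 2014 entry last.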

Sort grep matches by the characters' location in character classes

You can define a custom sort order with the decorate/undecorate pattern, using awk and sort. For example:

$ echo {m,M}{a,A}{r,R,y,Y} | tr ' ' '\n' | 
awk -v pat='mMaArRyY' '{for(i=1;i<=length($0);i++)
printf "%s", index(pat,substr($0,i,1));
print "\t" $0}' |
sort | cut -f2-

mar
maR
may
maY
mAr
mAR
mAy
mAY
Mar
MaR
May
MaY
MAr
MAR
MAy
MAY

UPDATE
For overlapping patterns such as [aA][Aa], here is an updated solution. To show how the order is determined, the final cut is left out.

$ echo {a,A}{A,a} | tr ' ' '\n' | 
awk -v pat='aA,Aa' 'BEGIN{n=split(pat,p,",")}
{for(i=1;i<=length($0);i++)
printf "%s",index(p[i],substr($0,i,1));
print "\t" $0}' |
sort

11 aA
12 aa
21 AA
22 Aa

Here is the full script in action

$ cat text
abcMay defmaY ghiMark jklMaY443

$ grep -oE "\S*[mM][aA][rRyY]\S*" text
abcMay
defmaY
ghiMark
jklMaY443

Extract the substring that matched the pattern:

$ ... | sed -r 's/(\S*([mM][aA][rRyY])\S*)/\2\t\1/'
May abcMay
maY defmaY
Mar ghiMark
MaY jklMaY443

$ ... | awk -v pat='mM,aA,rRyY' 'BEGIN{n=split(pat,p,",")}
{for(i=1;i<=length($1);i++)
printf "%s",index(p[i],substr($1,i,1));
print "\t" $0}' |
sort

114 maY defmaY
211 Mar ghiMark
213 May abcMay
214 MaY jklMaY443

Everything is in order; now eliminate the dummy keys:

... | cut -f3-

defmaY
ghiMark
abcMay
jklMaY443
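
For reference, the steps can be combined into one pipeline without the ... placeholders. The sketch below is a variation rather than the exact commands above: it builds the key with awk's match() instead of sed -r and \S, which are GNU extensions, and it keys on the leftmost match in each word. The input line is the sample text from above.

```shell
text='abcMay defmaY ghiMark jklMaY443'

result=$(printf '%s\n' "$text" |
  # One matching word per line.
  grep -oE '[^[:space:]]*[mM][aA][rRyY][^[:space:]]*' |
  awk -v pat='mM,aA,rRyY' '
    BEGIN { n = split(pat, p, ",") }
    # Decorate: one digit per pattern position, then a tab, then the word.
    match($0, /[mM][aA][rRyY]/) {
      key = ""
      for (i = 1; i <= n; i++)
        key = key index(p[i], substr($0, RSTART + i - 1, 1))
      print key "\t" $0
    }' |
  LC_ALL=C sort |
  # Undecorate: drop the numeric key.
  cut -f2-)

printf '%s\n' "$result"
```

Because the key and the word are the only two tab-separated fields here, cut -f2- is enough to undecorate. The order of the characters inside each comma-separated group of pat is what defines the custom sort order.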

