Sorting a Tab Delimited File

Sorting a tab delimited file

Using bash, this will do the trick:

$ sort -t$'\t' -k3 -nr file.txt

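Notice the dollar sign in front of the single-quoted string: $'\t' makes bash turn \t into a literal tab character before sort sees it. You can read about it in the ANSI-C Quoting section of the bash man page. Here -n sorts numerically and -r reverses the order. Note that -k3 starts the key at the third field and runs to the end of the line; use -k3,3 to limit the key to that field alone.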
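To make the quoting and the key selection concrete, here is a minimal sketch. The sample rows (name, city, score) are hypothetical and are written to a temporary file; LC_ALL=C is set only so the result does not depend on the locale.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: name, city, score (tab-separated).
tmp=$(mktemp)
printf 'alice\tparis\t7\nbob\trome\t42\ncarol\toslo\t19\n' > "$tmp"

# $'\t' is expanded by bash to a single tab character.
# -k3,3 restricts the key to the third field; n = numeric, r = reverse.
sorted=$(LC_ALL=C sort -t$'\t' -k3,3nr "$tmp")

printf '%s\n' "$sorted"
rm -f "$tmp"
```

With this data the rows should come out with the highest score first (bob, then carol, then alice).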

Sort a tab delimited file based on column sort command bash

To sort on the fourth column use just the -k 4,4 selector.

sort -t $'\t' -k 4,4 <filename>

You might also want -V (version sort, available in GNU sort), which orders embedded numbers more naturally: it yields 1 2 10 rather than the lexicographic order 1 10 2.

sort -t $'\t' -k 4,4 -V <filename>

If you're getting errors about the $'\t', make sure your shell is bash; a plain sh may not support this quoting. Perhaps you're missing #!/bin/bash at the top of your script?
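If switching shells is not an option, one portable alternative (a suggestion of mine, not part of the answer above) is to build the tab with printf and pass it in a variable. The sketch below also contrasts the default comparison with -V on some hypothetical four-column rows.

```shell
# Portable way to obtain a tab character: works in POSIX sh, not only bash.
tab=$(printf '\t')

# Hypothetical data: the fourth column holds the values 10, 2 and 1.
data=$(printf 'a\tb\tc\t10\nd\te\tf\t2\ng\th\ti\t1\n')

# Default (lexicographic) comparison on column 4.
lexical=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 4,4 | cut -f4 | tr '\n' ' ')

# Version sort (-V, a GNU sort option) on column 4.
natural=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 4,4 -V | cut -f4 | tr '\n' ' ')

echo "lexicographic: $lexical"
echo "version sort:  $natural"
```

The first line should list 1 10 2 and the second 1 2 10.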

c# sort a tab delimited file

You can't really expect people to do the work for you... but I was bored.

Here is a simple solution in the form of a complete console app that will likely fall apart the second you give it real world data, but hopefully will get you started.

using System;
using System.IO;
using System.Linq;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read the whole file.
            var fileContents = File.ReadAllText("file.txt");

            // Split on carriage returns and line feeds, removing empty lines.
            var lines = fileContents.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

            // Split each line on tab. Empty fields are kept on purpose:
            // dropping them would shift the remaining columns to the left.
            var splitLines = lines.Select(l => l.Split('\t'));

            // splitLines is now a sequence of arrays: each element is a line, and each
            // entry of that element is a single field. So we can sort how we want,
            // e.g. by first field then by second field (assumes at least two fields per line).
            var sortedLines = splitLines.OrderBy(sl => sl[0]).ThenBy(sl => sl[1]);

            // Put back together as TSV: put the tabs back.
            var linesWithTabsAgain = sortedLines.Select(sl => string.Join("\t", sl));

            // Put the carriage returns/line feeds back.
            var linesWithCRLF = string.Join("\r\n", linesWithTabsAgain);

            File.WriteAllText("newFile.txt", linesWithCRLF);
        }
    }
}
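For comparison, the same "first field, then second field" ordering can be sketched with sort by giving two keys. The rows are hypothetical, and with LC_ALL=C the comparison is byte-wise, which may order some strings differently from C#'s default culture-sensitive string comparison.

```shell
tab=$(printf '\t')

# Hypothetical three-column rows.
data=$(printf 'b\t2\tx\na\t9\ty\nb\t1\tz\na\t3\tw\n')

# -k1,1 is the primary key; -k2,2 is only consulted when the first fields are equal.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k1,1 -k2,2)

printf '%s\n' "$sorted"
```

Here the two "a" rows should come first (3 before 9), followed by the two "b" rows (1 before 2).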

How do I sort a tab separated file on the nth column using cygwin sort?

On my machine (Mac bash prompt, GNU sort ...) this works:

sort -t '   ' -k 2,2 in.txt > out.txt

(A "real" tab between the quotes.)

To get the tab there I type CTRL-V, TAB (CTRL-V followed by TAB).

EDIT: I've now tested it on a Windows machine from the cygwin prompt and it works the same there (as I expected, bash is bash).

Sort rows in a tab delimited file with numerical order

In the KEYDEF, specify the character position within the field where the number begins. In this case we want to sort on the numeric part of the first field, which begins at the 3rd character, thus -k1.3n:

$ sort -k1.3n file
ch1 1 209
ch1 23 890
ch3 45 21
ch4 66 12
ch10 11 53
ch10 12 90
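A slightly stricter variant, sketched below, names the tab delimiter explicitly, ends the first key at the first field, and adds the second column as a numeric tie-breaker. The rows are the ones shown above in shuffled order, assuming the columns are tab-separated.

```shell
tab=$(printf '\t')

# The rows from the output above, deliberately out of order.
data=$(printf 'ch10\t11\t53\nch3\t45\t21\nch1\t23\t890\nch10\t12\t90\nch4\t66\t12\nch1\t1\t209\n')

# Key 1: field 1 from its 3rd character to the end of the field, numeric.
# Key 2: field 2, numeric, used only when key 1 compares equal.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k1.3,1n -k2,2n)

printf '%s\n' "$sorted"
```

This should reproduce the order listed above, with ch1 first and ch10 last.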

What linux commands can I use to sort columns in a tab-separated text file?

awk solution (requires GNU awk, since PROCINFO["sorted_in"] and asort() are gawk extensions):

awk 'BEGIN{ FS=OFS="\t"; PROCINFO["sorted_in"]="@ind_str_asc" }
{ split($0,b,FS); delete b[1]; asort(b); r="";
for(i in b) r=(r!="")? r OFS b[i] : b[i]; a[$1] = r
}
END{ for(i in a) print i,a[i] }' file

The output:

fileB   Y
fileM B C M Y
fileX A M Z

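  • PROCINFO["sorted_in"]="@ind_str_asc" - sort mode: for(i in ...) loops visit array indices in ascending string order

  • split($0,b,FS); - split the line into array b by FS (field separator)

  • delete b[1]; - drop the first field (the file name) so that only the marker values remain

  • asort(b) - sort the marker values in b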
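Where gawk is not available, roughly the same result can be sketched with a plain shell loop plus sort, tr and paste. This is a swapped-in technique, not the method above: the input rows are hypothetical (the original input is not shown), every line is assumed to contain at least one tab, and repeated first-column values are kept as separate lines rather than overwritten.

```shell
tab=$(printf '\t')

# Hypothetical input: a file name followed by unsorted marker values.
data=$(printf 'fileX\tZ\tA\tM\nfileB\tY\nfileM\tM\tC\tY\tB\n')

result=$(printf '%s\n' "$data" | while IFS= read -r line; do
  key=${line%%"$tab"*}    # everything before the first tab
  rest=${line#*"$tab"}    # everything after the first tab
  # One value per line, sort, then join the values back with tabs.
  sorted_rest=$(printf '%s\n' "$rest" | tr '\t' '\n' | LC_ALL=C sort | paste -s -)
  printf '%s\t%s\n' "$key" "$sorted_rest"
done | LC_ALL=C sort -t "$tab" -k1,1)

printf '%s\n' "$result"
```

For this input the rows should come out as fileB, fileM, fileX, each with its values in alphabetical order.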


