An efficient way to transpose a file in Bash
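awk '
{
    # store every field, keyed by (input row, input column)
    for (i=1; i<=NF; i++) {
        a[NR,i] = $i
    }
}
NF>p { p = NF }    # track the widest row seen so far
END {
    # each input column j becomes one output row
    for(j=1; j<=p; j++) {
        str=a[1,j]
        for(i=2; i<=NR; i++){
            str=str" "a[i,j];
        }
        print str
    }
}' file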
output
$ more file
0 1 2
3 4 5
6 7 8
9 10 11
$ ./shell.sh
0 3 6 9
1 4 7 10
2 5 8 11
Performance against the Perl solution by Jonathan on a 10000-line file
$ head -5 file
1 0 1 2
2 3 4 5
3 6 7 8
4 9 10 11
1 0 1 2
$ wc -l < file
10000
$ time perl test.pl file >/dev/null
real 0m0.480s
user 0m0.442s
sys 0m0.026s
$ time awk -f test.awk file >/dev/null
real 0m0.382s
user 0m0.367s
sys 0m0.011s
$ time perl test.pl file >/dev/null
real 0m0.481s
user 0m0.431s
sys 0m0.022s
$ time awk -f test.awk file >/dev/null
real 0m0.390s
user 0m0.370s
sys 0m0.010s
EDIT by Ed Morton (@ghostdog74 feel free to delete if you disapprove).
Maybe this version with some more explicit variable names will help answer some of the questions below and generally clarify what the script is doing. It also uses tabs as the separator which the OP had originally asked for so it'd handle empty fields and it coincidentally pretties-up the output a bit for this particular case.
$ cat tst.awk
BEGIN { FS=OFS="\t" }
{
    # input field N becomes output row N; input line NR becomes output column NR
    for (rowNr=1;rowNr<=NF;rowNr++) {
        cell[rowNr,NR] = $rowNr
    }
    maxRows = (NF > maxRows ? NF : maxRows)
    maxCols = NR
}
END {
    for (rowNr=1;rowNr<=maxRows;rowNr++) {
        for (colNr=1;colNr<=maxCols;colNr++) {
            printf "%s%s", cell[rowNr,colNr], (colNr < maxCols ? OFS : ORS)
        }
    }
}
$ awk -f tst.awk file
X row1 row2 row3 row4
column1 0 3 6 9
column2 1 4 7 10
column3 2 5 8 11
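The point about empty fields can be checked in isolation. The sketch below uses two hypothetical helper functions (they are not part of the answer) to count the fields awk sees on a line containing an empty tab-separated cell. With the default FS the empty cell disappears; with FS set to a tab it is kept.

```shell
# Hypothetical helpers for illustration only.
# Count fields using the default FS (runs of blanks and tabs collapse).
count_fields_default() { awk '{ print NF }'; }

# Count fields using a single tab as FS (empty cells are preserved).
count_fields_tab() { awk -F'\t' '{ print NF }'; }

printf 'a\t\tc\n' | count_fields_default   # prints 2
printf 'a\t\tc\n' | count_fields_tab       # prints 3
```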
The above solutions will work in any awk (except old, broken awk of course - there YMMV).
The above solutions do read the whole file into memory though - if the input files are too large for that then you can do this:
$ cat tst.awk
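BEGIN { FS=OFS="\t" }
{ printf "%s%s", (FNR>1 ? OFS : ""), $ARGIND }    # pass number ARGIND prints field number ARGIND
ENDFILE {
    print ""
    if (ARGIND < NF) {
        # more fields remain: queue the same file for another pass
        ARGV[ARGC] = FILENAME
        ARGC++
    }
}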
$ awk -f tst.awk file
X row1 row2 row3 row4
column1 0 3 6 9
column2 1 4 7 10
column3 2 5 8 11
which uses almost no memory but reads the input file once per number of fields on a line, so it will be much slower than the version that reads the whole file into memory. It also assumes the number of fields is the same on each line, and it uses GNU awk for ENDFILE and ARGIND, but any awk can do the same with tests on FNR==1 and END.
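As a rough sketch of that last remark, the following does the same multi-pass transpose without ENDFILE or ARGIND. The wrapper name transpose_lowmem and the pass counter are my own choices, not from the answer above. It assumes a single tab-separated input file with the same number of fields on every line, and it queues all the extra passes up front on the first line instead of one at a time.

```shell
# Hypothetical wrapper name; takes a single tab-separated file as its argument.
transpose_lowmem() {
    awk '
    BEGIN { FS = OFS = "\t" }
    FNR == 1 {
        pass++
        if (pass == 1) {
            # first pass only: queue one extra read of the file per remaining field
            for (i = 2; i <= NF; i++) ARGV[ARGC++] = FILENAME
        } else {
            print ""        # close the row written by the previous pass
        }
    }
    { printf "%s%s", (FNR > 1 ? OFS : ""), $pass }    # pass N prints field N of every line
    END { if (pass) print "" }                        # close the final row
    ' "$1"
}

# Usage: transpose_lowmem file
```

Like the GNU awk version, this reads the file once per field, so it trades speed for memory.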
transpose a column in unix
With GNU awk for multi-char RS:
$ printf 'x,\ny,\nz,\n' | awk -v RS='^$' '{gsub(/\n|(,\n$)/,"")} 1'
x,y,z
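If GNU awk is not available, a similar result can be had in any awk by accumulating the lines and trimming the final comma in END. This is only a sketch: join_lines is a made-up name, and like the RS='^$' version it holds the whole input in memory.

```shell
# Hypothetical helper: join all input lines into one, then drop the trailing comma.
join_lines() {
    awk '
    { out = out $0 }                         # append each line without its newline
    END { sub(/,$/, "", out); print out }    # remove only the final comma
    '
}

printf 'x,\ny,\nz,\n' | join_lines   # prints x,y,z
```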
transpose column to row
This works; try the following:
awk 'NR==1{print} NR>1{a[$1]=a[$1]" "$2}END{for (i in a){print i " " a[i]}}' file | tac
Or, since the order in which for (i in a) visits the keys is unspecified, sort the output explicitly:
awk 'NR==1{print} NR>1{a[$1]=a[$1]" "$2}END{for (i in a){print i " " a[i]}}' file | sort -k1 -n
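If the rows should come out in the order the keys first appear in the input, rather than reversed or sorted, one option is to record that order in a second array. A minimal sketch, assuming the same layout as above (a header line followed by "key value" pairs); the function name and the sample data are made up:

```shell
# Hypothetical helper; assumes a header line followed by "key value" pairs.
group_in_input_order() {
    awk '
    NR == 1 { print; next }                # pass the header through unchanged
    {
        if ($1 in vals) {
            vals[$1] = vals[$1] " " $2     # append to an existing key
        } else {
            order[++n] = $1                # remember the key the first time it appears
            vals[$1] = $2
        }
    }
    END { for (k = 1; k <= n; k++) print order[k], vals[order[k]] }
    ' "$@"
}

printf 'id val\n2 a\n1 b\n2 c\n' | group_in_input_order
```

For that sample input it prints the header, then "2 a c", then "1 b".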
Transpose rows to columns using the first column as reference in unix shell
You can do this in Awk: build a hash map with the values in the first column as keys and the values in the rest of the row as the hash values.
awk '
{
    for(i=2;i<=NF;i++)
        unique[$1]=(unique[$1]FS$i)
    next
} END {
    for (i in unique) {
        n=split(unique[i],temp);
        for(j=1;j<=n;j++)
            print i,temp[j]
    }
}' file
This should work with the awk available on any POSIX-compliant system.
The steps:
- The loop for(i=2;i<=NF;i++) runs from column 2 to the last column of each line, and a hash map unique is built, keyed on the value of the first column ($1), with the other columns, $2 through $NF, appended as its value.
- The part under END runs after all the lines are processed. We use the split() call to separate each value stored in the array and put them as individual elements in the array temp.
- Then we loop over all the elements in temp and print the index along with each element.
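To see the steps in action, here is the same logic wrapped in a function and run on made-up input (the question's data is not shown here, so the sample is only an assumption about its shape). The output is piped through sort because the order of for (key in unique) is unspecified.

```shell
# Hypothetical wrapper around the same logic, reading from stdin or named files.
rows_to_pairs() {
    awk '
    {
        for (i = 2; i <= NF; i++)
            unique[$1] = (unique[$1] FS $i)    # collect every value under its key
    }
    END {
        for (key in unique) {
            n = split(unique[key], temp)       # split the collected values back apart
            for (j = 1; j <= n; j++)
                print key, temp[j]
        }
    }' "$@"
}

# Made-up input; sort is added only to make the output order predictable.
printf 'A 1 2\nB 3\nA 4\n' | rows_to_pairs | sort
```

This prints "A 1", "A 2", "A 4" and "B 3", one pair per line.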
linux - transpose rows with pattern into columns
This method might help you solve the issue:
echo $(cat answer.txt) | sed 's/ \([0-9]\)/\n\1/g'
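Not every sed treats \n in the replacement as a newline (GNU sed does), so here is a sketch of the same idea in awk. It assumes one token per input line and starts a new output row whenever a line begins with a digit; the function name and sample data are hypothetical.

```shell
# Hypothetical helper; assumes one token per input line.
join_until_digit() {
    awk '
    # before every line except the first: newline if it starts with a digit, else a space
    NR > 1 { printf "%s", ($0 ~ /^[0-9]/ ? "\n" : " ") }
    { printf "%s", $0 }
    END { if (NR) print "" }    # terminate the last row
    ' "$@"
}

printf '1\nfoo\nbar\n2\nbaz\n' | join_until_digit
```

For that sample it prints "1 foo bar" and "2 baz" on separate lines.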