Formatting XML as comma delimited using sed or awk
Try this:
awk -v FS="" '{gsub(/^[[:space:]]+/,"",$0);ORS=(NR%3==0?RS:FS)}1' f
How to split a delimited string into an array in awk?
Have you tried:
echo "12|23|11" | awk '{split($0,a,"|"); print a[3],a[2],a[1]}'
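Note that split returns the number of pieces it stored, so the array can be walked without hard-coding the indices. A minimal sketch with the same sample string (the names n, a, i and reversed are only illustrative):

```shell
# split() returns how many elements it placed in the array,
# so the loop works for any number of "|"-separated items.
reversed=$(echo "12|23|11" | awk '{
    n = split($0, a, "|")        # a[1]="12", a[2]="23", a[3]="11"
    for (i = n; i >= 1; i--)     # walk the array backwards
        printf "%s%s", a[i], (i > 1 ? " " : "\n")
}')
echo "$reversed"
```

This should print the same items in reverse order, separated by spaces.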
Joining two file with sed awk separated by comma
You can just use paste:
paste -d, file1 file2
example1,testing1
example2,testing2
example3,testing3
Or, you can use awk:
awk -v OFS=, 'FNR==NR{a[++i]=$0; next} {print a[FNR], $0}' file1 file2
example1,testing1
example2,testing2
example3,testing3
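To try both commands side by side, here is a self-contained sketch that creates two small sample files first. The file contents are assumed from the output shown above, and the scratch directory exists only for the demonstration.

```shell
# Create two sample files in a scratch directory
# (contents assumed from the output shown above).
tmp=$(mktemp -d)
printf '%s\n' example1 example2 example3 > "$tmp/file1"
printf '%s\n' testing1 testing2 testing3 > "$tmp/file2"

# paste joins corresponding lines of the two files with a comma.
with_paste=$(paste -d, "$tmp/file1" "$tmp/file2")

# awk: remember each line of file1 by its line number, then print
# the remembered line next to the matching line of file2.
with_awk=$(awk -v OFS=, 'FNR==NR{a[FNR]=$0; next} {print a[FNR], $0}' "$tmp/file1" "$tmp/file2")

echo "$with_paste"
echo "$with_awk"
rm -rf "$tmp"
```

Indexing the array by FNR instead of a separate counter is just a small simplification; both forms should give the same result when the files have the same number of lines.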
using sed or awk to double quote comma separate and concatenate a list
The easiest approach is something like this (in pseudocode):
- Read a line;
- Put the line in quotes;
- Keep that quoted line in a stack or string;
- At the end (or while constructing the string), join the lines together with a comma.
Depending on the language, that is fairly straightforward to do:
With awk:
$ awk 'BEGIN{OFS=","}{s=s ? s OFS "\"" $1 "\"" : "\"" $1 "\""} END{print s}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Or, to avoid the 'wall of quotes', define a quote character in a variable:
$ awk 'BEGIN{OFS=",";q="\""}{s=s ? s OFS q$1q : q$1q} END{print s}' file
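To run the awk approach end to end, here is a sketch that writes a sample file first. The addresses are taken from the output above and the scratch file is only for the demonstration. As a small departure from the commands above, it quotes $0 rather than $1, on the assumption that lines containing spaces should be kept whole.

```shell
# Sample input, one address per line (taken from the output above).
tmp=$(mktemp)
printf '%s\n' 10.1.2.200 10.1.2.201 10.1.2.202 10.1.2.203 > "$tmp"

# q holds the quote character; s accumulates the quoted lines,
# with OFS inserted before every line except the first.
quoted=$(awk -v q='"' 'BEGIN{OFS=","}
    {s = (s == "" ? "" : s OFS) q $0 q}
    END{print s}' "$tmp")

echo "$quoted"
rm -f "$tmp"
```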
With sed:
$ sed -E 's/^(.*)$/"\1"/' file | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g'
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
(With Perl and Ruby, which both have a join function, it is easiest to push the elements onto a stack and then join that.)
Perl:
$ perl -lne 'push @a, "\"$_\""; END{print join(",", @a)}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Ruby:
$ ruby -ne 'BEGIN{@arr=[]}; @arr.push "\"#{$_.chomp}\""; END{puts @arr.join(",")}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Awk ordering of delimited string
Using awk
$ awk '{print $1,$4,$3,$2}' FS="|" OFS="|" file
name|surname|email|phone
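A self-contained sketch of the same reordering, assuming the input line is name|phone|email|surname (the input is not shown above, so this is a guess that fits the output). It sets the separators with -F and -v instead of trailing FS=/OFS= assignments. Both forms work here, but trailing assignments are not yet in effect inside a BEGIN block.

```shell
# Assumed input line; the original input file is not shown.
line='name|phone|email|surname'

# -F sets the input separator, -v OFS sets the output separator;
# print with commas joins the chosen fields using OFS.
reordered=$(printf '%s\n' "$line" | awk -F'|' -v OFS='|' '{print $1, $4, $3, $2}')
echo "$reordered"
```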
Awk & Sort-Output as Comma Delimited?
To get comma-separated output, use the following:
$ awk 'BEGIN{FS="\t"; OFS=","} ($2 <= 5000000 && $3 >= 5000000) || ($2 >= 5000000 && $3 <= 6000000) || ($2 <= 6000000 && $3 >= 6000000) || ($2 <= 5000000 && $3 >= 6000000) {$1=$1;print}' file | awk 'BEGIN{FS=","; OFS=","} ($1 == "chr12") ' | sort -t$"," -k4rn
chr12,5294045,5393088,0.923076923076923
chr12,3306736,5048326,0.913561847988077
chr12,5505370,6006665,0.791318864774624
The only change above is the addition of the action:
{$1=$1;print}
awk will only reformat a line with a new field separator if one or more of the fields on the line have been changed in some way. $1=$1 is sufficient to indicate that field 1 has been changed. Consequently, the new field separators are inserted.
Also, the two calls to awk can be combined into a single call:
awk 'BEGIN{FS="\t"; OFS=","} ($2 <= 5000000 && $3 >= 5000000) || ($2 >= 5000000 && $3 <= 6000000) || ($2 <= 6000000 && $3 >= 6000000) || ($2 <= 5000000 && $3 >= 6000000) {$1=$1; if($1 == "chr12") print}' file | sort -t$"," -k4rn
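If every row has start <= end (an assumption about the data, not something stated above), the four range tests reduce to a single overlap check: a row overlaps 5000000-6000000 exactly when it starts at or before 6000000 and ends at or after 5000000. A sketch under that assumption, with made-up sample rows and illustrative variable names lo and hi:

```shell
# Made-up, tab-separated sample rows: chromosome, start, end, score.
rows=$(printf 'chr12\t5294045\t5393088\t0.92\nchr12\t100\t200\t0.50\nchr1\t5500000\t5600000\t0.99\nchr12\t3306736\t5048326\t0.91\n')

# Keep chr12 rows whose [start, end] overlaps [lo, hi]; $1=$1 forces
# awk to rebuild the line with OFS. LC_ALL=C keeps "." as the decimal
# point for the numeric sort on field 4.
filtered=$(printf '%s\n' "$rows" |
    awk -v lo=5000000 -v hi=6000000 'BEGIN{FS="\t"; OFS=","}
        $1 == "chr12" && $2 <= hi && $3 >= lo {$1=$1; print}' |
    LC_ALL=C sort -t, -k4,4rn)
echo "$filtered"
```

Only the two overlapping chr12 rows should remain, highest score first.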
Simpler Example
In the following, the input is tab-separated and the output field separator, OFS, is set to a comma. In this first example, the awk command print is used:
$ echo $'a\tb\tc' | awk -v OFS=, '{print}'
a b c
Despite the OFS=, setting, the output retains the tab separator.
Now, we add the simple statement $1=$1 and observe the output:
$ echo $'a\tb\tc' | awk -v OFS=, '{$1=$1;print}'
a,b,c
The output is now comma-separated. Again, that is because awk only reformats a line with the new OFS if it thinks that a field on the line has been changed in some way. The assignment of $1 to itself is sufficient to trigger that reformat.
Note that it is not sufficient to make a change that affects the line as a whole. For example, the following does not trigger a reformat:
$ echo $'a\tb\tc' | awk -v OFS=, '{$0=$0;print}'
a b c
It is necessary to change one or more fields of the line individually. In the following, sub operates on $0 as a whole and, consequently, no reformat is triggered:
$ echo $'a\tb\tc' | awk -v OFS=, '{sub($1,"NEW");print}'
NEW b c
In the example below, however, sub operates specifically on field $1 and hence triggers a reformat:
$ echo $'a\tb\tc' | awk -v OFS=, '{sub($1,"NEW", $1);print}'
NEW,b,c
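The same trick is often written as {$1=$1} followed by the always-true pattern 1, which prints the rebuilt line. A minimal sketch, assuming tab-separated input in which a field may itself contain a space. Setting -F to a tab keeps such a field intact, whereas the default separator would also split on the space.

```shell
# The first field "a b" contains a space; only tabs separate fields.
converted=$(printf 'a b\tc\td\n' | awk -F'\t' -v OFS=, '{$1=$1} 1')
echo "$converted"
```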