Delimited by Comma Using Awk or Sed with the Tags Below

Formatting XML as comma delimited using sed or awk

Try this:

awk -v FS=""  '{gsub(/^[[:space:]]+/,"",$0);ORS=(NR%3==0?RS:FS)}1' f
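Note that with FS set to the empty string, ORS=(NR%3==0?RS:FS) joins each group of three lines with nothing between them. If a comma is wanted between the joined values, one option is to set FS to a comma instead. A minimal sketch, assuming (hypothetically) that the input has already been reduced to one indented value per line, in groups of three:

```shell
# Hypothetical sample: three indented values per record, one per line.
joined=$(printf '  a\n  b\n  c\n  d\n  e\n  f\n' |
  awk -v FS=, '{gsub(/^[[:space:]]+/, "", $0); ORS = (NR % 3 == 0 ? RS : FS)} 1')
printf '%s\n' "$joined"
```

ORS is set before the trailing 1 prints the line, so every third line ends with a newline and the others end with the comma held in FS.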

How to split a delimited string into an array in awk?

Have you tried:

echo "12|23|11" | awk '{split($0,a,"|"); print a[3],a[2],a[1]}'
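split returns the number of elements it stored, so the array can also be walked in a loop instead of naming each index. A minimal sketch (the variable names n and i are only illustrative):

```shell
# split() returns the element count, so the loop works for any number of fields.
reversed=$(echo "12|23|11" |
  awk '{n = split($0, a, "|"); for (i = n; i >= 1; i--) printf "%s%s", a[i], (i > 1 ? " " : "\n")}')
printf '%s\n' "$reversed"
```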

Joining two file with sed awk separated by comma

You can just use paste:

paste -d, file1 file2
example1,testing1
example2,testing2
example3,testing3

Or, you can use awk:

awk -v OFS=, 'FNR==NR{a[++i]=$0; next} {print a[FNR], $0}' file1 file2
example1,testing1
example2,testing2
example3,testing3
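
To try both commands, the two input files can be recreated as follows. This is a sketch: the file contents are inferred from the output shown above, and the scratch directory is only there to keep the example self-contained.

```shell
# Recreate the sample inputs in a scratch directory.
dir=$(mktemp -d)
printf 'example1\nexample2\nexample3\n' > "$dir/file1"
printf 'testing1\ntesting2\ntesting3\n' > "$dir/file2"

# paste walks both files in parallel; awk buffers file1, then pairs it with file2 by line number.
with_paste=$(paste -d, "$dir/file1" "$dir/file2")
with_awk=$(awk -v OFS=, 'FNR==NR{a[++i]=$0; next} {print a[FNR], $0}' "$dir/file1" "$dir/file2")
printf '%s\n' "$with_awk"
rm -r "$dir"
```

The two only differ when the files have different lengths: paste pads the shorter file with empty fields, while the awk version prints exactly one line per line of file2.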

using sed or awk to double quote comma separate and concatenate a list

The easiest approach is something like this (in pseudocode):

  1. Read a line;
  2. Put the line in quotes;
  3. Keep that quoted line in a stack or string;
  4. At the end (or while constructing the string), join the lines together with a comma.

Depending on the language, that is fairly straightforward to do:

With awk:

$ awk 'BEGIN{OFS=","}{s=s ? s OFS "\"" $1 "\"" : "\"" $1 "\""} END{print s}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"

Or, to avoid the 'wall of quotes', store the quote character in a variable:

$ awk 'BEGIN{OFS=",";q="\""}{s=s ? s OFS q$1q : q$1q} END{print s}' file

With sed:

$ sed -E 's/^(.*)$/"\1"/' file | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g'
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"

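The second sed call above, which slurps the whole input in order to replace the newlines, could also be swapped for paste. This variant is not from the original answer, and the sample file contents are inferred from the output shown above:

```shell
# Recreate the sample list in a scratch file.
list=$(mktemp)
printf '10.1.2.200\n10.1.2.201\n10.1.2.202\n10.1.2.203\n' > "$list"

# sed wraps each line in quotes (& is the whole match); paste -s joins all lines with commas.
quoted=$(sed 's/.*/"&"/' "$list" | paste -sd, -)
printf '%s\n' "$quoted"
rm "$list"
```
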
(Perl and Ruby both have a join function, so it is easiest to push the quoted elements onto an array and then join it.)

Perl:

$ perl -lne 'push @a, "\"$_\""; END{print join(",", @a)}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"

Ruby:

$ ruby -ne 'BEGIN{@arr=[]}; @arr.push "\"#{$_.chomp}\""; END{puts @arr.join(",")}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"

Awk ordering of delimited string

Using awk

$ awk '{print $1,$4,$3,$2}' FS="|" OFS="|" file
name|surname|email|phone
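
For reference, a sketch with the separators set up front via -F and -v, which behaves the same here as the trailing FS= and OFS= assignments. The input line is an assumption, worked backwards from the output shown above:

```shell
# Assumed input order: name, phone, email, surname.
reordered=$(echo 'name|phone|email|surname' |
  awk -F'|' -v OFS='|' '{print $1, $4, $3, $2}')
printf '%s\n' "$reordered"
```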

Awk & Sort-Output as Comma Delimited?

To get comma-separated output, use the following:

$ awk 'BEGIN{FS="\t"; OFS=","} ($2 <= 5000000 && $3 >= 5000000) || ($2 >= 5000000 && $3 <= 6000000) || ($2 <= 6000000 && $3 >= 6000000) || ($2 <= 5000000 && $3 >= 6000000) {$1=$1;print}' file | awk 'BEGIN{FS=","; OFS=","} ($1 == "chr12") ' | sort -t, -k4rn
chr12,5294045,5393088,0.923076923076923
chr12,3306736,5048326,0.913561847988077
chr12,5505370,6006665,0.791318864774624

The only change above is the addition of the action:

{$1=$1;print}

awk will only rebuild a line with the new output field separator if one or more of the fields on the line have been changed in some way. $1=$1 is sufficient to indicate that field 1 has been changed. Consequently, the new field separators are inserted.

Also, the two calls to awk can be combined into a single call:

awk 'BEGIN{FS="\t"; OFS=","} ($2 <= 5000000 && $3 >= 5000000) || ($2 >= 5000000 && $3 <= 6000000) || ($2 <= 6000000 && $3 >= 6000000) || ($2 <= 5000000 && $3 >= 6000000) {$1=$1; if($1 == "chr12") print}' file | sort -t, -k4rn
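
The four range tests appear to select rows whose interval overlaps the 5,000,000 to 6,000,000 window. If every row has start <= end (an assumption, not stated in the original), they reduce to a single overlap test. A sketch with hypothetical rows and the illustrative variable names lo and hi:

```shell
# Hypothetical tab-separated rows: chromosome, start, end, score.
input=$(printf 'chr12\t5505370\t6006665\t0.79\nchr12\t100\t200\t0.99\nchr1\t5294045\t5393088\t0.50\nchr12\t5294045\t5393088\t0.92\n')

# A row is kept when [start,end] overlaps [lo,hi]; this assumes start <= end on every row.
result=$(printf '%s\n' "$input" |
  awk -v lo=5000000 -v hi=6000000 'BEGIN{FS="\t"; OFS=","}
    $1 == "chr12" && $2 <= hi && $3 >= lo {$1=$1; print}' |
  LC_ALL=C sort -t, -k4,4rn)
printf '%s\n' "$result"
```

LC_ALL=C keeps sort from interpreting the decimal point according to the current locale.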

Simpler Example

In the following, the input is tab-separated and the output field separator, OFS, is set to a comma. In this first example, the awk command print is used:

$ echo $'a\tb\tc' | awk -v OFS=, '{print}'
a b c

Even though OFS is set to a comma, the output keeps its tab separators.

Now, we add the simple statement $1=$1 and observe the output:

$ echo $'a\tb\tc' | awk -v OFS=, '{$1=$1;print}'
a,b,c

The output is now comma-separated. Again, that is because awk only reformats a line with the new OFS if it thinks that a field on the line has been changed in some way. The assignment of $1 to itself is sufficient to trigger that reformat.

Note that it is not sufficient to make a change that affects the line as a whole. For example, the following does not trigger a reformat:

$ echo $'a\tb\tc' | awk -v OFS=, '{$0=$0;print}'
a b c

It is necessary to change one or more fields of the line individually. In the following, sub operates on $0 as a whole and, consequently, no reformat is triggered:

$ echo $'a\tb\tc' | awk -v OFS=, '{sub($1,"NEW");print}'
NEW b c

In the example below, however, sub operates specifically on field $1 and hence triggers a reformat:

$ echo $'a\tb\tc' | awk -v OFS=, '{sub($1,"NEW", $1);print}'
NEW,b,c

