Linux Split a Column into Two Different Columns in a Same CSV File

You can do it with awk.

Create a file named script.awk, with the following contents:

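BEGIN {
    line = 0                    # index of the current output row
}
/,,/ {                          # every time we hit the delimiter
    line = 0                    # go back to the first output row
}
!/,,/ {                         # otherwise
    # add the new input line to its output row (no leading space)
    a[line] = (line in a) ? a[line] " " $0 : $0
    line++                      # move on to the next row
    if (line > rows) rows = line    # remember the longest block
}
END {
    # "for (i in a)" visits keys in an unspecified order, so loop by index
    for (i = 0; i < rows; i++)
        print a[i]              # print the output
}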

Run it like this:

awk -f script.awk < datafile

Sample input and output:

$ cat datafile
11
22
13
,,
aa
bb
cc
,,
ww
kk
ll
,,
$ awk -f script.awk < datafile
11 aa ww
22 bb kk
13 cc ll

Or if you just want a one-liner, do this:

awk 'BEGIN{line=0} /,,/{line=0; next} {a[line]=(line in a)?a[line]" "$0:$0; if(++line>n)n=line} END{for(i=0;i<n;i++)print a[i]}' datafile

EDIT:

This will put commas between the fields instead of spaces:

awk 'BEGIN{line=0} /,,/{line=0; next} {a[line]=(line in a)?a[line]","$0:$0; if(++line>n)n=line} END{for(i=0;i<n;i++)print a[i]}' datafile
# The separator inside the a[line]=... assignment is now "," instead of " "

Extract two Columns from CSV file, split them in while read loop and do different commands with each string

David!

I am a newbie here as well, but I think I have the answer.

If you create a script file with this content and make it executable, you can then run ./script.bash NameOfMyFile.csv. For each line it creates a file named after the first field and writes the rest of the line into it. You can at least use this code as a place to start. Good luck!

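#!/bin/bash
file=$1
# "|| [[ -n $newfile ]]" also handles a last line with no trailing newline
while IFS=';' read -r newfile contents || [[ -n $newfile ]]
do
    contents=${contents# }   # drop the space that follows the ';'
    echo "Create file: $newfile ==> with contents: $contents"
    printf '%s\n' "$contents" > "$newfile"
done < "$file"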

The sample file I fed it looked like this:

Name1; Put this stuff in the first place I want it And then put in more stuff
Name2; I might, have some; puncuated stuff! But it shouldn't matter.
Name3; 20394)@(#$
Name4; No newline at the end of this line

Output:

Create file: Name1 ==> with contents: Put this stuff in the first place I want it And then put in more stuff
Create file: Name2 ==> with contents: I might, have some; puncuated stuff! But it shouldn't matter.
Create file: Name3 ==> with contents: 20394)@(#$
Create file: Name4 ==> with contents: No newline at the end of this line

I hope this helps!
Kylie
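
A small sketch of why the loop above copes with extra semicolons in the text: when read is given two variable names, everything after the first ';' goes into the last one. The record and the variable names here are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample record; only the first ';' separates name from contents.
record='Name2; I might, have some; punctuated stuff!'

# IFS=';' applies to this read only. The first field goes to "name",
# the untouched remainder (later ';' included) goes to "contents".
IFS=';' read -r name contents <<EOF
$record
EOF

# Strip the single space that follows the delimiter.
contents=${contents# }

printf 'name=[%s]\ncontents=[%s]\n' "$name" "$contents"
```

The brackets in the printed line just make leading or trailing whitespace easy to spot.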

Split column into multiple based on match/delimiter using bash awk

Here's a csplit+paste solution

$ csplit --suppress-matched -zs test.file2 '/male_position/' '{*}'
$ ls
test.file2 xx00 xx01 xx02
$ paste xx*
0.00 0 0
0.00 5 1
1.05 10 2
1.05 3
1.05 5
1.05
3.1
5.11
12.74

From man csplit

csplit - split a file into sections determined by context lines

-z, --elide-empty-files
remove empty output files

-s, --quiet, --silent
do not print counts of output file sizes

--suppress-matched
suppress the lines matching PATTERN

  • /male_position/ is the regex used to split the input file
  • {*} specifies to create as many splits as possible
  • use -f and -n options to change the default output file names
  • paste xx* to paste the files column wise, TAB is default separator
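
The points above can be tried on a throwaway file. This is only a sketch: it assumes GNU coreutils (for --suppress-matched), and the sample data, the scratch directory, and the col_ prefix are invented for illustration, not taken from test.file2.

```shell
#!/bin/sh
# Hypothetical sample: three blocks of values separated by header lines.
workdir=$(mktemp -d)
printf '%s\n' 0.00 1.05 male_position 0 5 female_position 1 2 > "$workdir/sample.txt"

# Split at every line matching /male_position/ (this regex also matches
# "female_position"), drop the matching lines, skip empty pieces, and
# use -f to name the pieces col_00, col_01, ... instead of xx00, xx01, ...
csplit --suppress-matched -zs -f "$workdir/col_" "$workdir/sample.txt" \
    '/male_position/' '{*}'

# Join the pieces side by side; paste separates the columns with a TAB.
table=$(paste "$workdir"/col_*)
printf '%s\n' "$table"

rm -rf "$workdir"
```

Quoting '{*}' keeps the shell from ever treating it as a glob pattern.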

Combining CSV files and splitting the column into 2 columns using R

The problem is that the source data is delimited by:

  • one space when the second number is negative, and
  • two spaces when the second number is positive (space for the absent minus sign).

The trick is to split the string on one or more spaces:

 data <- do.call("rbind", strsplit(as.character(trimws(merged$V1))," +",fixed=FALSE))

I'm a bit obsessive about charsets, unreliable files, etc., so I tend to use a splitter such as "[[:space:]]+" instead, since it matches any whitespace character (tabs included), not just the literal space " ".

(In regex-speak, the + says "one or more". Other modifiers include ? as zero or one, and * as zero or more.)
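
For anyone following along in the shell rather than R, the same "one or more spaces" idea can be sketched with awk's field separator. This is an analogue, not the R code above, and the two sample rows are made up.

```shell
#!/bin/sh
# Hypothetical rows: one space before a negative number, two before a positive one.
rows=$(printf '%s\n' '1.5 -2.0' '1.5  2.0')

# -F ' +' sets the field separator to a regex meaning "one or more spaces",
# so both rows split into exactly two fields; OFS joins them with a comma.
split=$(printf '%s\n' "$rows" | awk -F ' +' -v OFS=',' '{ print $1, $2 }')

printf '%s\n' "$split"
```

Splitting on a single space instead would leave an empty field in the second row.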


