Splitting a file using AWK on Mac OS X
You can fix this script by using a variable:
awk '/SEPARATOR/{n++}{filename = "part" n ".txt"; print >filename }' in.txt
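A minimal, self-contained run of the same idea is sketched below. The sample input and the temporary directory are made up for illustration, and the close() call is an addition not present in the command above; it is meant to keep the number of simultaneously open files low when there are many parts.

```shell
# Work in a scratch directory so the generated part files are easy to inspect
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical sample input: two SEPARATOR lines, so three output files
printf '%s\n' 'header' 'SEPARATOR' 'one' 'two' 'SEPARATOR' 'three' > in.txt

# On each separator line, close the previous part and bump the counter;
# every line (separator included) goes to the file named by the variable
awk '/SEPARATOR/ { if (filename) close(filename); n++ }
     { filename = "part" n ".txt"; print > filename }' in.txt

ls part*.txt
```

Lines before the first separator end up in part.txt, because n is still empty at that point.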
Splitting text file and adding line count in header with awk in OSX
With GNU Awk or Mawk:
awk -v RS='\nB \\* - \\|[0-9]+\\|\n' 'NF {
  numLines = gsub("(^|\n)>", "\n")              # replace line-initial ">" and count lines in block
  fname = "part" ++n                            # determine next output filename
  printf "%s%s\n", numLines " 120", $0 > fname  # output header + block
  close(fname)                                  # close output file
}' file
Note: Unless the last line in the input file is a separator line, the last output file will have a trailing empty line (the data-line count in the header will be correct, however); the OP has confirmed that this is not a problem.
GNU Awk or Mawk is needed because only they support multi-character, regex-based RS (input-record separator) values, unlike the BSD awk that macOS comes with. It is possible to solve this problem differently, but it would be a little more cumbersome.
- Both GNU Awk and Mawk can be installed on macOS via the package manager Homebrew; with Homebrew installed, simply run brew install gawk or brew install mawk.
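One possible shape of such a more cumbersome, portable variant is sketched below. It is not the method used above: it buffers each block in an array instead of relying on a regex RS, so it should also run with the stock BSD awk. The input format is an assumption inferred from the RS regex (separator lines such as "B * - |1|", data lines starting with ">"), and write_block and buf are names invented for this sketch.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Assumed sample input: two blocks, each introduced by a separator line
printf '%s\n' 'B * - |1|' '>alpha' '>beta' 'B * - |2|' '>gamma' > file

awk '
function write_block(   i) {              # emit header + buffered lines, then reset
  if (count == 0) return                  # nothing buffered yet
  fname = "part" ++n
  print (count " 120") > fname            # header: line count, as in the command above
  for (i = 1; i <= count; i++) print buf[i] > fname
  close(fname)
  count = 0
}
/^B \* - \|[0-9]+\|$/ { write_block(); next }   # separator line ends the previous block
{ sub(/^>/, ""); buf[++count] = $0 }            # strip line-initial ">" and buffer the line
END { write_block() }                           # flush the final block
' file

cat part1 part2
```

Unlike the RS-based command, this sketch writes no trailing empty line to the last file.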
The approach breaks the input into blocks of lines by the B separator lines. Thus, each such block must fit into memory as a whole (presumably two copies at once, due to performing a string substitution). Having the whole block of lines in memory before writing them to the output file is what allows counting the lines up front and adding that information to the header.
numLines = gsub("(^|\n)>", "\n") both removes the line-initial > chars. and determines the number of lines in the block, taking advantage of the fact that gsub() returns the number of replacements made.
Using awk to split CSV file by column
I've resolved this. Following the logic of this thread, I checked my line endings with the file command and learned that the file had the old-style Mac line terminators. I opened my input CSV file with TextWrangler and saved it again with Unix-style line endings. Once I did that, the awk command listed above worked as expected. It took ~5 seconds to create 63 new CSV files broken out by date.
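The same fix can be sketched on the command line without an editor. Everything in this example is hypothetical: the sample data, the file names, and the awk command, which only stands in for the one referred to above and assumes the date is in the first column.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical sample: a CSV whose lines end in CR only (old-style Mac terminators)
printf 'date,value\r2020-01-01,1\r2020-01-02,2\r2020-01-01,3\r' > input.csv

# Turn every CR into LF so awk sees one record per line
tr '\r' '\n' < input.csv > fixed.csv

# Split by the first column, skipping the header row;
# append and close each time so only one output file is open at once
awk -F, 'NR > 1 { out = $1 ".csv"; print >> out; close(out) }' fixed.csv

ls 2020-*.csv
```

Without the tr step, awk would treat the whole file as a single line, which matches the symptom described above.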
rename output file using split function on mac osx
If you check man split you'll find that the argument --additional-suffix=SUFFIX is not supported in this version.
To achieve what I understand you want you'll need an Automator script or a shell script, e.g.:
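#!/bin/bash
# Reads the file given as $1 in chunks of up to 16 non-empty lines (uses bash-only syntax).
DONE=false
until $DONE; do
    for i in $(seq 1 16); do
        read line || DONE=true       # stop the outer loop once input is exhausted
        [ -z "$line" ] && continue   # skip empty lines
        lines+=$line$'\n'            # append the line to the current chunk
    done
    ratio=${lines::${#lines}-10}     # the chunk minus its last 10 characters
    (cat "Ratio"; echo "$ratio .txt";)
    #echo "--- DONE SPLITTING ---";
    lines=
done < "$1"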
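A simpler alternative, different from the script above, is to let split do the chunking with only its -l option and a prefix, then add the suffix in a second step. This is only a sketch: the 40-line input, the Ratio_ prefix, and the chunk size of 16 are assumptions chosen for illustration.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical 40-line input file
seq 1 40 > input.txt

# Split into 16-line pieces named Ratio_aa, Ratio_ab, ...
split -l 16 input.txt Ratio_

# Append the .txt suffix that --additional-suffix would have added
for f in Ratio_??; do
    mv "$f" "$f.txt"
done

ls Ratio_*.txt
```

The Ratio_?? pattern only matches the default two-character suffixes, so files that were already renamed are not touched again.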
How can I split a large text file into smaller files with an equal number of lines?
Have a look at the split command:
$ split --help
Usage: split [OPTION] [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is `x'. With no INPUT, or when INPUT
is -, read standard input.
Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   use suffixes of length N (default 2)
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of lines per output file
  -d, --numeric-suffixes  use numeric suffixes instead of alphabetic
  -l, --lines=NUMBER      put NUMBER lines per output file
      --verbose           print a diagnostic to standard error just
                            before each output file is opened
      --help              display this help and exit
      --version           output version information and exit
You could do something like this:
split -l 200000 filename
which will create files of 200000 lines each, named xaa, xab, xac, ...
Another option, split by size of output file (still splits on line breaks):
split -C 20m --numeric-suffixes input_filename output_prefix
creates files like output_prefix00 output_prefix01 output_prefix02 ..., each at most 20 megabytes in size.
Multisplitting in AWK
Try this:
echo "$test" | awk -F'[;&]' '{print $4}'
-F'[;&]' sets the field separator to a bracket expression, so fields are split on either ; or &.
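As a quick check, the command can be tried on a made-up value; the contents of $test below are an assumption, since the original input is not shown here.

```shell
# Hypothetical sample value mixing both separators
test='a=1;b=2&c=3;d=4'

# Fields are a=1, b=2, c=3, d=4, so the fourth field is d=4
echo "$test" | awk -F'[;&]' '{ print $4 }'
# prints: d=4
```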