View tabular file such as CSV from command line
You can also use this:
column -s, -t < somefile.csv | less -#2 -N -S
column is a standard Unix program that is very convenient -- it finds the appropriate width of each column and displays the text as a nicely formatted table.
Note: whenever you have empty fields, you need to put some kind of placeholder in them, otherwise the column gets merged with the following columns. The following example demonstrates how to use sed to insert a placeholder:
$ cat data.csv
1,2,3,4,5
1,,,,5
$ column -s, -t < data.csv
1  2  3  4  5
1  5
$ sed 's/,,/, ,/g;s/,,/, ,/g' data.csv | column -s, -t
1  2  3  4  5
1           5
Note that the substitution of ,, with , , is done twice. If you do it only once, 1,,,4 will become 1, ,,4, since the second comma has already been matched.
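The effect of applying the substitution once versus twice can be seen directly:

```shell
# A single global pass misses overlapping ,, pairs; a second pass catches them
printf '1,,,4\n' | sed 's/,,/, ,/g'              # -> 1, ,,4
printf '1,,,4\n' | sed 's/,,/, ,/g;s/,,/, ,/g'   # -> 1, , ,4
```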
Commandline CSV viewer
It's the furthest thing from elegant, and it's probably something you've already thought of and are looking for a better solution to, but I work around this annoyance by doing a series of sed replacements to put a space in empty fields. I have these as functions in my bashrc:
csvcolumn() { sed -e "s/^$2/ $2/" -e "s/$2$/$2 /" -e "s/$2$2/$2 $2/g" -e "s/$2$2/$2 $2/g" $1 | column -t -s$2 ; }
csvcomma() { sed -e 's/^,/ ,/' -e 's/,$/, /' -e 's/,,/, ,/g' -e 's/,,/, ,/g' $1 | column -t -s, ; }
The first one takes two args so you can specify the delimiter character. The second is the same thing, but it takes only one arg and assumes the delimiter is a comma, since that's what I use most often anyway.
csvcolumn input.csv ,
or
csvcomma input.csv
Working with complex CSV from Linux command line
As long as your input doesn't contain columns with escaped embedded , chars., you should be able to parse it with awk, using , as the field separator; e.g.:
awk -F, '{ n = split($5, subField, "[[:blank:]]+"); for (i=1;i<=n;++i) print subField[i] }' file.csv
The above splits the 5th field into sub-fields by whitespace, using the split() function.
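For example, given a hypothetical file.csv whose 5th field holds space-separated tokens, each token is printed on its own line:

```shell
# Hypothetical sample data: the 5th field contains space-separated tokens
printf '%s\n' 'a,b,c,d,foo bar baz' > file.csv

# split() breaks the 5th field on runs of blanks into the subField array
awk -F, '{ n = split($5, subField, "[[:blank:]]+"); for (i = 1; i <= n; ++i) print subField[i] }' file.csv
# -> foo
#    bar
#    baz
```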
How to pretty print the csv which has long columns from command line?
I believe that csvlook treats the tab characters in that column just like any other character and doesn't know about their special behaviour.
The easiest way to get the columns to line up is to minimally expand the tabs:
expand -t1 nupic_out.csv | csvlook
How to extract one column of a csv file
You could use awk for this. Change '$2' to the nth column you want.
awk -F "\"*,\"*" '{print $2}' textfile.csv
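On a hypothetical quoted CSV file, the "\"*,\"*" field-separator regex also strips the quotes between fields (though the leading quote of the first field and the trailing quote of the last field remain attached to those fields):

```shell
# Hypothetical sample: a fully quoted CSV file
printf '%s\n' '"alice","30","nyc"' '"bob","25","sf"' > textfile.csv

# Print the 2nd column; the separator regex eats the quotes between fields
awk -F "\"*,\"*" '{print $2}' textfile.csv
# -> 30
#    25
```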
creating a bash function to print csv to screen
function dl {
wget -O stats.zip "http://...?DataFilter=itemCode:$1";
unzip -c stats.zip | column -s, -t;
}
- You can use unzip's -c option to print the contents of files
- Bash (and other shells) allows you to wrap statements into functions
Additionally, I've used the column command to format the CSV output.
You can call the function with the ItemCode as parameter: dl 526
Viewing a very large CSV file?
You could try PostgreSQL 9.1+ and its file_fdw (File Foreign Data Wrapper), which lets you treat the CSV file as a table. If you replaced the CSV file with another CSV file of the same name, you would see the new info immediately in the database.
You can improve performance by using a materialized view (PG 9.3+) which essentially creates a real database table from the CSV data. You could use pgAgent to refresh the materialized view on a schedule.
Another alternative would be to use the COPY statement:
/* the columns in this table are the same as the columns in your csv: */
create table if not exists my_csv (
some_field text, ...
);
/* COPY appends, so truncate the table if loading fresh data again: */
truncate table my_csv;
/*
you need to be a postgres superuser to use COPY
use psql \copy if you can't be superuser
put the csv file in /srv/vendor-name/
*/
copy my_csv
from '/srv/vendor-name/my.csv'
with (format csv);
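For the file_fdw route mentioned above, a minimal setup might look like this (the server and table names are illustrative assumptions; creating the extension requires superuser rights):

```sql
-- one-time setup (superuser): load the wrapper and define a server
CREATE EXTENSION IF NOT EXISTS file_fdw;
CREATE SERVER csv_server FOREIGN DATA WRAPPER file_fdw;

-- a foreign table whose rows are read from the CSV file on every query
CREATE FOREIGN TABLE my_csv_live (
    some_field text
    -- , ... one column per CSV column
) SERVER csv_server
  OPTIONS (filename '/srv/vendor-name/my.csv', format 'csv');
```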
How do I convert a tab-separated values (TSV) file to a comma-separated values (CSV) file in BASH?
Update: The following solutions are not generally robust, although they do work in the OP's specific use case; see the bottom section for a robust, awk-based solution.
To summarize the options (interestingly, they all perform about the same):
tr:
devnull's solution (provided in a comment on the question) is the simplest:
tr '\t' ',' < file.tsv > file.csv
sed:
The OP's own sed solution is perfectly fine, given that the input contains no quoted strings (with potentially embedded \t chars.):
sed 's/\t/,/g' file.tsv > file.csv
The only caveat is that on some platforms (e.g., macOS) the escape sequence \t is not supported, so a literal tab char. must be spliced into the command string using ANSI quoting ($'\t'):
sed 's/'$'\t''/,/g' file.tsv > file.csv
awk:
The caveat with awk is that FS - the input field separator - must be set to \t explicitly; the default behavior would otherwise strip leading and trailing tabs and replace interior spans of multiple tabs with only a single ,:
awk 'BEGIN { FS="\t"; OFS="," } {$1=$1; print}' file.tsv > file.csv
Note that simply assigning $1 to itself causes awk to rebuild the input line using OFS - the output field separator; this effectively replaces all \t chars. with , chars.; print then simply prints the rebuilt line.
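The difference the explicit FS makes is easy to observe on a line containing an empty field (i.e., two adjacent tabs):

```shell
# Default FS: runs of blanks collapse, so the empty field is lost
printf 'a\t\tb\n' | awk 'BEGIN { OFS="," } { $1=$1; print }'
# -> a,b

# Explicit FS="\t": the empty field between the tabs is preserved
printf 'a\t\tb\n' | awk 'BEGIN { FS="\t"; OFS="," } { $1=$1; print }'
# -> a,,b
```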
Robust awk solution:
As A. Rabus points out, the above solutions do not correctly handle unquoted input fields that themselves contain , characters - you'll end up with extra CSV fields. The following awk solution fixes this by enclosing such fields in "..." on demand (see the non-robust awk solution above for a partial explanation of the approach). If such fields also have embedded " chars., these are escaped as "", in line with RFC 4180. Thanks, Wyatt Israel.
awk 'BEGIN { FS="\t"; OFS="," } {
  rebuilt = 0
  for (i = 1; i <= NF; ++i) {
    if ($i ~ /[,"]/ && $i !~ /^".*"$/) {
      gsub("\"", "\"\"", $i)
      $i = "\"" $i "\""
      rebuilt = 1
    }
  }
  if (!rebuilt) { $1 = $1 }
  print
}' file.tsv > file.csv
- $i ~ /[,"]/ && $i !~ /^".*"$/ detects any field that contains , and/or " and isn't already enclosed in double quotes
- gsub("\"", "\"\"", $i) escapes embedded " chars. by doubling them
- $i = "\"" $i "\"" updates the result by enclosing it in double quotes
As stated before, updating any field causes awk to rebuild the line from the fields with the OFS value, i.e., , in this case, which amounts to the effective TSV -> CSV conversion; the rebuilt flag is used to ensure that each input record is rebuilt at least once.
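Applied to a hypothetical TSV record containing both an embedded comma and embedded quotes, the script produces RFC 4180-style quoting:

```shell
# Hypothetical one-line TSV with an embedded comma and embedded quotes
printf 'Doe, John\t123 "A" St\n' > in.tsv

awk 'BEGIN { FS="\t"; OFS="," } {
  rebuilt = 0
  for (i = 1; i <= NF; ++i) {
    if ($i ~ /[,"]/ && $i !~ /^".*"$/) {
      gsub("\"", "\"\"", $i)   # escape embedded " as "" (RFC 4180)
      $i = "\"" $i "\""        # enclose the field in double quotes
      rebuilt = 1
    }
  }
  if (!rebuilt) { $1 = $1 }    # force awk to rejoin the fields with OFS
  print
}' in.tsv
# -> "Doe, John","123 ""A"" St"
```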