How to Delete the First Column (Which Is in Fact Row Names) from a Data File in Linux

How to delete the first column (which is in fact row names) from a data file in Linux?

@Karafka I had CSV files, so I used "," as the separator (you can replace it with your own):

cut -d"," -f2- input.csv  > output.csv
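
As a quick check of what this command does, here is a minimal sketch on a made-up three-line CSV (the temporary directory, file names, and contents are hypothetical). `-f2-` selects field 2 through the end of each line, so only the leading row-name field is dropped.

```shell
# Hypothetical sample: the first field holds the row names.
workdir=$(mktemp -d)
printf '%s\n' 'id,height,weight' 'r1,1.5,10' 'r2,2.5,20' > "$workdir/input.csv"

# Keep field 2 onward; the row-name column is dropped.
cut -d"," -f2- "$workdir/input.csv" > "$workdir/output.csv"

result=$(cat "$workdir/output.csv")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Note that cut splits on every comma, so this assumes the CSV has no quoted fields that themselves contain commas.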

Then I used a loop to go over all the files in the directory:

# files are in the directory tmp/
# the output directory new/ must exist before redirecting into it
mkdir -p new
for f in tmp/*
do
    name=$(basename "$f")
    echo "processing file: $name"
    # keep all columns except the first one of each CSV file
    cut -d"," -f2- "$f" > "new/$name"
    # files are stored under the same names in the directory new/
done

How to delete rows if nth column includes specific word?

Use awk and a regex to test the 4th column:

awk '$4 ~ "^(exonic|exonic;splicing|splicing)$"' file

Output:


chr1 26162313 26162313 exonic
chr1 26349533 26349535 exonic
chr1 26487940 26487940 exonic
chr1 26162353 26162313 splicing
chr1 26349533 26349535 exonic;splicing
chr1 26357656 26357656 exonic
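
The awk command in this answer prints the rows whose 4th column matches. To delete those rows instead, as the question title asks, the same test can be negated with `!~`. A minimal sketch, assuming whitespace-separated columns and made-up sample rows (the `intronic` and `UTR3` labels are hypothetical):

```shell
# Hypothetical sample data; column 4 holds the annotation.
sample=$(printf '%s\n' \
  'chr1 26162313 26162313 exonic' \
  'chr1 26200000 26200000 intronic' \
  'chr1 26349533 26349535 exonic;splicing' \
  'chr1 26400000 26400000 UTR3')

# !~ prints only the rows whose 4th column does NOT match the pattern.
kept=$(printf '%s\n' "$sample" | awk '$4 !~ /^(exonic|exonic;splicing|splicing)$/')
printf '%s\n' "$kept"
```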

How do I delete all lines in a concatenated text file that match the header WITHOUT deleting the header? [bash]

The following AWK script removes all lines that are exactly the same as the first one.

awk '{ if($0 != header) { print; } if(header == "") { header=$0; } }' inputfile > outputfile

It will print the first line because the initial value of header is an empty string. Then it will store the first line in header because it is empty.

After this it will print only lines that are not equal to the first one already stored in header. The second if will always be false once the header has been saved.

Note: If the file starts with empty lines these empty lines will be removed.
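
An equivalent way to express the same idea is to capture the header explicitly on the first record instead of testing for an empty variable. This is only a sketch with hypothetical sample data; unlike the script above, it treats a leading empty line as the header rather than skipping it.

```shell
# Hypothetical concatenated file: the header line repeats in the middle.
sample=$(printf '%s\n' 'id name' '1 a' '2 b' 'id name' '3 c')

# NR == 1: remember the header, print it once, and move to the next line.
# Every later line is printed only if it differs from the header.
deduped=$(printf '%s\n' "$sample" | awk 'NR == 1 { header = $0; print; next } $0 != header')
printf '%s\n' "$deduped"
```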

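To remove the leading numeric column, you can use the following (GNU sed understands \t inside the bracket expression; other sed implementations may need a literal tab character there):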

sed 's/^[0-9][0-9]*[ \t]*//' inputfile > outputfile

You can combine both commands into a pipeline:

awk '{ if($0 != header) { print; } if(header == "") { header=$0; } }' inputfile | sed 's/^[0-9][0-9]*[ \t]*//' > outputfile

How to make the entire row zero except the first column, if it has a zero in any other column, in Linux?

Assuming the columns in your input file are separated by tabs:

awk -F'\t' '{ if ($2 == 0 || $3 == 0) { $2 = 0; $3 = 0 }; printf("%d\t%.1f\t%d\n", $1, $2, $3) }' ifile.txt

Output:

1 4.5 9
2 0.0 0
3 2.4 4
4 3.1 2
5 0.0 0
6 2.4 1
7 0.0 0
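
The command above hard-codes three columns. For a file with an arbitrary number of columns, the same rule can be written with a loop. A minimal sketch, assuming tab-separated numeric input (the sample rows are hypothetical); note that it prints zeroed fields as plain `0` and leaves other values untouched instead of reformatting them with `printf`:

```shell
# Hypothetical tab-separated sample; row 2 has a zero in column 2.
sample=$(printf '1\t4.5\t9\n2\t0\t7\n3\t2.4\t4\n')

zeroed=$(printf '%s\n' "$sample" | awk -F'\t' -v OFS='\t' '{
  found = 0
  for (i = 2; i <= NF; i++) if ($i == 0) found = 1   # any zero past column 1?
  if (found) for (i = 2; i <= NF; i++) $i = 0        # zero the rest of the row
  print
}')
printf '%s\n' "$zeroed"
```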

Remove text before first comma character

Try the following. It uses a sed back-reference: the pattern matches everything up to and including the first comma, captures the rest of the line in a group, and the substitution replaces the match with that group via \1.

sed 's/[^,]*,\(.*\)/\1/' Input_file

The OP's attempt does not work because it uses .*, and .* is greedy: it matches up to the last comma on the line, so instead of stopping at the first comma it removes everything up to the last one.
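
To see the difference concretely, the sketch below runs both patterns on a made-up line. The greedy form shown is an assumption about what the OP's attempt looked like, since the original command is not quoted here.

```shell
# Hypothetical input line with several commas.
line='name,alpha,beta,gamma'

# [^,]* cannot cross a comma, so the match stops at the first one.
first=$(printf '%s\n' "$line" | sed 's/[^,]*,\(.*\)/\1/')

# .*, is greedy: it gives back characters only as far as the LAST comma.
greedy=$(printf '%s\n' "$line" | sed 's/.*,//')

printf '%s\n%s\n' "$first" "$greedy"
```

The first command leaves `alpha,beta,gamma`; the greedy one leaves only `gamma`.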

Delete row if the second column value is repeated

Using awk:

awk '!v[$2] { print; v[$2]=1; } ' input

The code checks an associative array, v, to see if the second field has been seen before. If this is the first time it sees the field (v[$2] is not defined and !v[$2] is true), it prints out the line and sets v[$2] to 1 so that next time !v[$2] evaluates to false.

Gives:

abcd eeee
wxyz njtq
abcd rtmk
ijkl mnmn
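
The same test is often written more compactly as `!seen[$2]++`, which relies on the post-increment returning the old count (0, which is false, the first time a value appears). A minimal sketch with hypothetical input; the array name `seen` is arbitrary:

```shell
# Hypothetical input: the values eeee and njtq repeat in column 2.
sample=$(printf '%s\n' 'abcd eeee' 'wxyz njtq' 'mnop eeee' 'abcd rtmk' 'qrst njtq')

# Print a line only the first time its second field is seen.
unique=$(printf '%s\n' "$sample" | awk '!seen[$2]++')
printf '%s\n' "$unique"
```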

How can I remove the first line of a text file using bash/sed script?

Try tail:

tail -n +2 "$FILE"

-n x: print just the last x lines; tail -n 5 would give you the last 5 lines of the input. The + sign inverts the argument and makes tail print everything but the first x-1 lines: tail -n +1 prints the whole file, tail -n +2 everything but the first line, and so on.
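
A small sketch of these variants side by side, using a made-up four-line input:

```shell
# Hypothetical input of four lines.
lines=$(printf '%s\n' one two three four)

last_two=$(printf '%s\n' "$lines" | tail -n 2)    # the last 2 lines
from_two=$(printf '%s\n' "$lines" | tail -n +2)   # line 2 onward (drops line 1)
from_one=$(printf '%s\n' "$lines" | tail -n +1)   # line 1 onward (whole input)

printf '%s\n' "$from_two"
```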

GNU tail is much faster than sed. tail is also available on BSD and the -n +2 flag is consistent across both tools. Check the FreeBSD or OS X man pages for more.

The BSD version can be much slower than sed, though. I wonder how they managed that; tail should just read a file line by line while sed does pretty complex operations involving interpreting a script, applying regular expressions and the like.

Note: You may be tempted to use

# THIS WILL GIVE YOU AN EMPTY FILE!
tail -n +2 "$FILE" > "$FILE"

but this will give you an empty file. The reason is that the redirection (>) happens before tail is invoked by the shell:

  1. Shell truncates file $FILE
  2. Shell creates a new process for tail
  3. Shell redirects stdout of the tail process to $FILE
  4. tail reads from the now empty $FILE

If you want to remove the first line inside the file, you should use:

tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"

The && will make sure that the file doesn't get overwritten when there is a problem.
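
Put together as a runnable sketch, using a throwaway file created with `mktemp` (the file contents are made up):

```shell
# Hypothetical file with a header line and two data rows.
FILE=$(mktemp)
printf '%s\n' 'header' 'row 1' 'row 2' > "$FILE"

# Write to a temporary file first; replace the original only if tail succeeded.
tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"

remaining=$(cat "$FILE")
printf '%s\n' "$remaining"
rm -f "$FILE"
```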


