Eliminate Unwanted Output Using Awk and Sed


Those find errors are printed on stderr, so they bypass your pipeline entirely. You'll want to redirect them with 2>/dev/null, although that will also hide any other errors from the find command.

find /opt/site/ -name '.log.txt' 2>/dev/null | xargs cat | awk '{$NF=""; print $0}' | xargs sed "/Filesystem/d" | sed '1i Owner RepoName CreatedDate' | column -t

In general, when a complicated command like this produces errors, you should break it down into stages so you can work out where the problem is coming from.

Let's split up this command to see what it's doing:

find /opt/site/ -name '.log.txt' 2>/dev/null - find all the files under /opt/site/ named .log.txt

xargs cat - get all their contents, one after the other

awk '{$NF=""; print $0}' - delete the last column

xargs sed "/Filesystem/d" - Treat each entry as a file and delete any lines containing Filesystem from the contents of those files.

sed '1i Owner RepoName CreatedDate' - Insert Owner RepoName CreatedDate on the first line

column -t - Convert the given data into a table

I'd suggest building up the command, and checking the output is correct at each stage.

Several things are surprising about your command:

  1. The find command looks for files named exactly .log.txt, rather than files with that extension.
  2. The second xargs call treats the contents of the .log.txt files as filenames to pass to sed.
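Addressing both points, a corrected sketch (assuming the intent was to match files ending in .log.txt and to filter the concatenated contents directly, without the second xargs):

```shell
find /opt/site/ -name '*.log.txt' 2>/dev/null \
  | xargs cat \
  | awk '{$NF=""; print $0}' \
  | sed '/Filesystem/d' \
  | sed '1i Owner RepoName CreatedDate' \
  | column -t
```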

Remove a specific character using awk or sed

Use sed's substitution: sed 's/"//g'

s/X/Y/ replaces X with Y.

g means all occurrences should be replaced, not just the first one.
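For example:

```shell
echo '"quoted" text with "marks"' | sed 's/"//g'
# → quoted text with marks
```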

removing unwanted text using sed or cut or awk

Like this?

sort -V -k2 -k3 source.txt* |
awk '/Lat:/ { gsub(/;/, ""); print $6, $8 }
/WiFiAdapterAndroid/'

Notice also how sort can read a list of files without help from cat.

cut is not suitable here because it can't decide which lines to modify; it's all or nothing. Conversely grep can't modify the strings it extracts. Awk, then, provides an easy and compact notation for doing both.

Briefly, Awk executes a script on one line of input at a time (or, more broadly, one record; it's easy to configure it to operate on units which are parts of lines or collections of adjacent lines, too). Each script unit consists of a condition and an action; both of them are optional, so an action without a condition matches all lines unconditionally, and a condition without an action defaults to printing the input for which the condition matched.

The first line of the script has a regex condition which selects lines which match the regular expression Lat:; the action cleans up the line with a simple substitution to remove any semicolons, then prints the sixth and eighth tokens on the line. (Each record is split into fields; again, there is a lot of flexibility here, but by default, each field is a non-whitespace token separated by whitespace from adjacent tokens.) And finally, as you might guess, the second condition is another regex, which causes matching inputs to be printed.
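To see both rules in action on made-up sample data (the field positions here are only illustrative):

```shell
printf 'Lat: a b c d 12.34; e 56.78;\nWiFiAdapterAndroid up\nirrelevant line\n' \
  | awk '/Lat:/ { gsub(/;/, ""); print $6, $8 }
         /WiFiAdapterAndroid/'
# → 12.34 56.78
# → WiFiAdapterAndroid up
```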

delete a column with awk or sed

This might work for you (GNU sed):

sed -i -r 's/\S+//3' file

If you want to delete the white space before the 3rd field:

sed -i -r 's/(\s+)?\S+//3' file
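On a sample line (shown here without -i, so no file is modified; note that \s and \S are GNU sed extensions):

```shell
echo 'one two three four' | sed -r 's/\S+//3'
# → one two  four   (the surrounding whitespace remains)
echo 'one two three four' | sed -r 's/(\s+)?\S+//3'
# → one two four
```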

awk or sed to remove text in file before character and then after character

awk -F'[\t()]' '{OFS="\t"; print $1, $2, $3, $5 $6}' file

Output:


chr4 100009839 100009851 ADH5_1
chr4 100006265 100006367 ADH5_2
chr4 100003125 100003267 ADH5_3
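The input file isn't shown in the question; a hypothetical tab-separated line like the following would produce that output — the text before the opening parenthesis lands in $4 and is dropped, while $5 and $6 are joined:

```shell
printf 'chr4\t100009839\t100009851\tsome_id(ADH5)_1\n' \
  | awk -F'[\t()]' '{OFS="\t"; print $1, $2, $3, $5 $6}'
# → chr4  100009839  100009851  ADH5_1  (tab-separated)
```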

How can I remove responses from LiveHTTPHeaders output using awk, perl or sed?

Looks like you're having trailing whitespace issues.

$ sed -e 's/^\s*$//' livehttp.txt | \
perl -e '$/ = ""; while (<>) { print if /^(GET|POST)/ }'

This works by putting Perl's readline operator into paragraph mode (via $/ = ""), which grabs records a chunk at a time, separated by two or more consecutive newlines.

It's nice when it works, but it's a bit brittle. Blank but not empty lines will gum up the works, but sed can clean those up.

Equivalent and more concise command:

$ sed -e 's/^\s*$//' livehttp.txt | perl -000 -ne 'print if /^(GET|POST)/'
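A quick illustration with a made-up request/response log:

```shell
printf 'GET /a HTTP/1.1\nHost: example\n\nHTTP/1.1 200 OK\nServer: nginx\n\nPOST /b HTTP/1.1\nHost: example\n' \
  | perl -000 -ne 'print if /^(GET|POST)/'
# keeps the GET and POST paragraphs, drops the 200 OK response
```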

Can I delete a field in awk?

I believe the simplest would be to use the sub function to replace the first occurrence of consecutive ,, (which appear after you set the 2nd field to NULL) with a single ,. But this assumes that you don't have any commas inside the field values.

awk 'BEGIN{FS=OFS=","}{$2="";sub(/,,/,",");print $0}' Input_file

2nd solution: alternatively, you could use the match function to capture the text from the first comma up to the next comma, then print what comes before and after the matched string.

awk '
match($0,/,[^,]*,/){
print substr($0,1,RSTART-1)","substr($0,RSTART+RLENGTH)
}' Input_file
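Both solutions applied to a sample line:

```shell
echo 'a,b,c,d' | awk 'BEGIN{FS=OFS=","}{$2="";sub(/,,/,",");print $0}'
# → a,c,d
echo 'a,b,c,d' | awk 'match($0,/,[^,]*,/){print substr($0,1,RSTART-1)","substr($0,RSTART+RLENGTH)}'
# → a,c,d
```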

Removing unwanted characters and empty lines with SED, TR or/and awk

Wow, I solved the problem at the time but forgot to post the answer, so here it is!

Using only tr command I could accomplish that like this:

tr -d '\377\376\015\000\277\003' < logs.csv | tr -s '\n'

tr removed all the unwanted characters and empty lines, and it was much faster than the sed and awk alternatives.
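A small reproduction with a few of those control characters:

```shell
printf 'a\003b\015\nc\n\n\nd\n' | tr -d '\377\376\015\000\277\003' | tr -s '\n'
# → ab
#   c
#   d
```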


