Using Sed to Get the Last N Lines of a Huge Text File

You don't. You use tail -n NUMLINES for that.

tail -n 100 A.txt > B.txt

How to use sed to remove the last n lines of a file

I don't know about sed, but it can be done with GNU head, where a negative count prints everything except the last n lines (here n = 2):

head -n -2 myfile.txt
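Where head rejects a negative count (BSD/macOS head, for instance), the same result can be sketched with a small rotating buffer in awk. The function name drop_last is made up for this illustration, and it assumes n is at least 1:

```shell
# drop_last N FILE: print FILE without its last N lines.
# Hypothetical helper; assumes N >= 1 (N = 0 would divide by zero in awk).
drop_last() {
    awk -v n="$1" '
        NR > n { print buf[NR % n] }   # the line read n records ago is now safe to print
               { buf[NR % n] = $0 }    # then overwrite that slot with the current line
    ' "$2"
}

# Usage: drop_last 2 myfile.txt
```

Only n lines are held in memory at a time, so this should behave reasonably on large files too.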

Retrieve the last 100 lines of a log

You can use the tail command as follows:

tail -100 <log file> > newLogfile

The last 100 lines will now be in newLogfile.

EDIT:

As twalberg mentioned, more recent versions of tail use the -n option:

tail -n 100 <log file> > newLogfile
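As a quick sanity check, the sketch below builds a throwaway 250-line log and confirms which lines end up in the output. The function name and the temporary files are placeholders for illustration, not anything from the question:

```shell
# save_last_lines N SRC DST: copy the last N lines of SRC into DST.
save_last_lines() {
    tail -n "$1" "$2" > "$3"
}

# Demo on a throwaway 250-line log.
log=$(mktemp) && out=$(mktemp)
i=1
while [ "$i" -le 250 ]; do
    echo "entry $i"
    i=$((i + 1))
done > "$log"

save_last_lines 100 "$log" "$out"
wc -l < "$out"      # 100 lines were kept
head -n 1 "$out"    # first kept line is "entry 151"
```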

How can I read first n and last n lines from a file?

Chances are you're going to want something like:

... | awk -v OFS='\n' '{a[NR]=$0} END{print a[1], a[2], a[NR-1], a[NR]}'

or if you need to specify a number and taking into account @Wintermute's astute observation that you don't need to buffer the whole file, something like this is what you really want:

... | awk -v n=2 'NR<=n{print;next} {buf[((NR-1)%n)+1]=$0}
END{for (i=1;i<=n;i++) print buf[((NR+i-1)%n)+1]}'

I think the math is correct on that - hopefully you get the idea to use a rotating buffer indexed by the NR modded by the size of the buffer and adjusted to use indices in the range 1-n instead of 0-(n-1).

To help with comprehension of the modulus operator used in the indexing above, here is an example with intermediate print statements to show the logic as it executes:

$ cat file   
1
2
3
4
5
6
7
8

$ cat tst.awk
BEGIN {
    print "Populating array by index ((NR-1)%n)+1:"
}
{
    buf[((NR-1)%n)+1] = $0

    printf "NR=%d, n=%d: ((NR-1 = %d) %%n = %d) +1 = %d -> buf[%d] = %s\n",
        NR, n, NR-1, (NR-1)%n, ((NR-1)%n)+1, ((NR-1)%n)+1, buf[((NR-1)%n)+1]

}
END {
    print "\nAccessing array by index ((NR+i-1)%n)+1:"
    for (i=1;i<=n;i++) {
        printf "NR=%d, i=%d, n=%d: (((NR+i = %d) - 1 = %d) %%n = %d) +1 = %d -> buf[%d] = %s\n",
            NR, i, n, NR+i, NR+i-1, (NR+i-1)%n, ((NR+i-1)%n)+1, ((NR+i-1)%n)+1, buf[((NR+i-1)%n)+1]
    }
}
$
$ awk -v n=3 -f tst.awk file
Populating array by index ((NR-1)%n)+1:
NR=1, n=3: ((NR-1 = 0) %n = 0) +1 = 1 -> buf[1] = 1
NR=2, n=3: ((NR-1 = 1) %n = 1) +1 = 2 -> buf[2] = 2
NR=3, n=3: ((NR-1 = 2) %n = 2) +1 = 3 -> buf[3] = 3
NR=4, n=3: ((NR-1 = 3) %n = 0) +1 = 1 -> buf[1] = 4
NR=5, n=3: ((NR-1 = 4) %n = 1) +1 = 2 -> buf[2] = 5
NR=6, n=3: ((NR-1 = 5) %n = 2) +1 = 3 -> buf[3] = 6
NR=7, n=3: ((NR-1 = 6) %n = 0) +1 = 1 -> buf[1] = 7
NR=8, n=3: ((NR-1 = 7) %n = 1) +1 = 2 -> buf[2] = 8

Accessing array by index ((NR+i-1)%n)+1:
NR=8, i=1, n=3: (((NR+i = 9) - 1 = 8) %n = 2) +1 = 3 -> buf[3] = 6
NR=8, i=2, n=3: (((NR+i = 10) - 1 = 9) %n = 0) +1 = 1 -> buf[1] = 7
NR=8, i=3, n=3: (((NR+i = 11) - 1 = 10) %n = 1) +1 = 2 -> buf[2] = 8
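
One caveat with the rotating-buffer one-liner: when the input has fewer than 2n lines, part of the buffer is never filled and the END loop prints empty lines for the unused slots. A possible variant that clamps the starting index is sketched below; first_last is a made-up name and n is assumed to be at least 1:

```shell
# first_last N FILE: print the first N and last N lines of FILE without
# repeating or padding lines when FILE has fewer than 2*N lines.
# Hypothetical helper; assumes N >= 1.
first_last() {
    awk -v n="$1" '
        NR <= n { print; next }          # first n lines go straight out
                { buf[NR % n] = $0 }     # later lines rotate through n slots
        END {
            start = NR - n + 1           # first line number of the tail
            if (start <= n) start = n + 1   # skip lines already printed
            for (i = start; i <= NR; i++) print buf[i % n]
        }
    ' "$2"
}

# Usage: first_last 3 file
```

Indexing the buffer directly by NR % n (slots 0 to n-1) avoids the +1 adjustment; the 1-based form in the answer above works just as well.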

Edit the first and last line of a huge file

I can't think of a way you can do this in-place (I'd be interested to hear one!)

Hardly a one-liner but you could give this a try:

# substitute the first line and exit
sed '1s/-flag \(.*\)/\1/;q' file > new
# add the rest of the file (probably quicker than sed)
tail -n +2 file >> new
# cut off the last line of the file
truncate -s $(( $(stat -c "%s" new) - $(tail -n 1 new | wc -c) )) new
# substitute the last line
tail -n 1 file | sed 's/-flag \(.*\)/\1/' >> new

This assumes you have a couple of tools like truncate and that you can do arithmetic in your shell (my shell is bash).

The truncate -s call removes the last line by setting the file size to the total size in bytes (stat -c "%s" new, which is GNU stat syntax) minus the length of the last line in bytes (tail -n 1 new | wc -c).

I'm not sure what you were trying to remove from the last line but I assumed that it was the same as the first (remove -flag from the start of the line).

Suggested modifications are welcome.
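
One possible simplification, if a single streaming pass over the file is acceptable: sed can address the first line (1) and the last line ($) in the same script. This is a sketch under the same assumption as above, namely that "-flag " is what should be stripped from both lines; strip_flag is a hypothetical name:

```shell
# strip_flag SRC DST: remove the first "-flag " from the first and last lines
# of SRC, writing the result to DST. All other lines pass through unchanged.
strip_flag() {
    sed -e '1s/-flag \(.*\)/\1/' -e '$s/-flag \(.*\)/\1/' "$1" > "$2"
}

# Usage: strip_flag file new
```

This still reads and rewrites every line, so on a very large file the head/tail/truncate approach may well be faster. Note also that on a one-line file both substitutions apply to the same line.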

Copying last n lines to a new file and then removing the n lines from original

Not all versions of head support negative line counts.
The default installed on macOS doesn't.

If you have GNU coreutils installed (with Homebrew: brew install coreutils), you should be able to use ghead -n -3000.
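
If installing coreutils is not an option, the whole task can be sketched with only tail, wc, and a non-negative head count. move_tail and the .tmp file name are made up for this example, and it assumes nothing else writes to the file while it runs:

```shell
# move_tail N SRC DST: copy the last N lines of SRC to DST, then drop them from SRC.
# Hypothetical helper; uses SRC.tmp as scratch space next to the original.
move_tail() {
    n=$1 src=$2 dst=$3
    tail -n "$n" "$src" > "$dst" || return 1
    total=$(wc -l < "$src")
    keep=$((total - n))
    if [ "$keep" -le 0 ]; then
        : > "$src"                       # every line was moved; leave SRC empty
    else
        head -n "$keep" "$src" > "$src.tmp" && mv "$src.tmp" "$src"
    fi
}

# Usage: move_tail 3000 original.txt last3000.txt
```

The file names in the usage line are placeholders. The function reads the file more than once, which trades some speed for portability.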

sed - how to delete everything but the last n lines?

From this list of sed one-liners:

sed -e :a -e '$q;N;11,$D;ba'

On Windows:

sed.exe -e :a -e "$q;N;11,$D;ba"

That example is for n = 10. For a different n, replace 11 with n + 1. It works by keeping a running list of lines in the pattern space. For the first 10 lines, it just appends the next line to the pattern space (N), then loops back to the beginning (ba). From line 11 onward, it also deletes the first line of the pattern space (D), so the pattern space never holds more than 10 lines. Finally, at the last line ($), it quits (q), which automatically prints the pattern space, that is, the final n lines.
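
To make the n + 1 substitution concrete, here is the same one-liner wrapped in a small function that computes the address from n. The name last_n_sed is hypothetical; the sed script itself is unchanged:

```shell
# last_n_sed N: print the last N lines of standard input using the sed one-liner.
last_n_sed() {
    # Double quotes let the shell expand $((...)); the sed "$" addresses are escaped.
    sed -e :a -e "\$q;N;$(($1 + 1)),\$D;ba"
}

printf '%s\n' 1 2 3 4 5 6 7 8 | last_n_sed 3    # prints 6, 7 and 8, one per line
```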

This is really not sed's strong suit, though. I would just install coreutils from gnuwin32.

Printing a line in a certain percentile of a large text file

In bash:

head -`echo scale=0\;$(cat file|wc -l)\*95/100 | bc -l` file | tail -n 1
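
The same calculation can be done without bc by using shell integer arithmetic, and sed can stop reading as soon as it reaches the target line. A rough sketch: percentile_line is a made-up name, it truncates the line number (total * pct / 100) just as the command above does, and unlike that command it clamps the result to line 1 for very short files:

```shell
# percentile_line PCT FILE: print the line sitting at the PCT-th percentile of FILE.
percentile_line() {
    pct=$1 file=$2
    total=$(wc -l < "$file")
    target=$((total * pct / 100))      # integer division, truncates
    if [ "$target" -lt 1 ]; then
        target=1                       # clamp so short files still yield a line
    fi
    sed -n "${target}{p;q;}" "$file"   # print that one line, then quit
}

# Usage: percentile_line 95 file
```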

