Quickly Read the Last Line of a Text File

Quickly read the last line of a text file?

Have a look at my answer to a similar question for C#. The code would be quite similar, although the encoding support is somewhat different in Java.

Basically it's not a terribly easy thing to do in general. As MSalter points out, UTF-8 does make it easy to spot \r or \n, as the UTF-8 representation of those characters is just the same as ASCII, and those bytes won't occur within a multi-byte character.

So basically, take a buffer of (say) 2K, and progressively read backwards (seek to 2K before your previous position, then read the next 2K), checking for a line terminator. Then skip to exactly the right place in the stream, create an InputStreamReader on top of it, and a BufferedReader on top of that. Then just call BufferedReader.readLine().
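The backward scan described above can be sketched in Python. This is only an illustration under stated assumptions, not the Java code from the linked answer: the function name read_last_line, the chunk_size parameter, and the choice to ignore one trailing line terminator are all made up for this example. It assumes UTF-8 or a single-byte encoding, where \r and \n never appear inside a multi-byte character.

```python
import os


def _strip_terminator(data):
    """Remove one trailing line terminator (\\r\\n, \\n, or \\r), if present."""
    if data.endswith(b"\r\n"):
        return data[:-2]
    if data.endswith((b"\n", b"\r")):
        return data[:-1]
    return data


def read_last_line(path, chunk_size=2048, encoding="utf-8"):
    """Return the last line of the file, without its line terminator."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        pos = fh.tell()
        tail = b""
        while pos > 0:
            # Step back by one chunk (or less, near the start of the file).
            step = min(chunk_size, pos)
            pos -= step
            fh.seek(pos)
            tail = fh.read(step) + tail
            # Look for the last line break before the final terminator.
            body = _strip_terminator(tail)
            cut = max(body.rfind(b"\n"), body.rfind(b"\r"))
            if cut != -1:
                return body[cut + 1:].decode(encoding)
        # No line break found: the whole file is a single line.
        return _strip_terminator(tail).decode(encoding)
```

The bytes are decoded only after the cut point is known, so a chunk boundary that falls in the middle of a multi-byte character does no harm. For very long last lines the repeated search over the accumulated tail gets slow; a production version would search only the newly read chunk.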

c++ fastest way to read only last line of text file?

Use seekg to jump to the end of the file, then read back until you find the first newline.
Below is some sample code off the top of my head using MSVC.

#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    string filename = "test.txt";
    ifstream fin;
    fin.open(filename);
    if(fin.is_open()) {
        fin.seekg(-1,ios_base::end);                // go to one spot before the EOF

        bool keepLooping = true;
        while(keepLooping) {
            char ch;
            fin.get(ch);                            // Get current byte's data

            if((int)fin.tellg() <= 1) {             // If the data was at or before the 0th byte
                fin.seekg(0);                       // The first line is the last line
                keepLooping = false;                // So stop there
            }
            else if(ch == '\n') {                   // If the data was a newline
                keepLooping = false;                // Stop at the current position.
            }
            else {                                  // If the data was neither a newline nor at the 0 byte
                fin.seekg(-2,ios_base::cur);        // Move to the front of that data, then to the front of the data before it
            }
        }

        string lastLine;
        getline(fin,lastLine);                      // Read the current line
        cout << "Result: " << lastLine << '\n';     // Display it

        fin.close();
    }

    return 0;
}

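And below is a test file. The code succeeds with empty, one-line, and multi-line data in the text file. Note that if the file ends with a trailing newline, the backward scan stops on that newline immediately and the result is an empty line.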

This is the first line.
Some stuff.
Some stuff.
Some stuff.
This is the last line.

Reading the 2 last line from a text

Since

 File.ReadAllLines("C:\\test.log");

returns an array you can take the last two items of the array:

var data = File.ReadAllLines("C:\\test.log");

// Assumes the file has at least two lines; otherwise the indexing below throws.
string last = data[data.Length - 1];
string lastButOne = data[data.Length - 2];

In the general case, with long files (where ReadAllLines is a bad choice because it loads the whole file into memory), you can implement a Tail extension method:

public static partial class EnumerableExtensions {
    public static IEnumerable<T> Tail<T>(this IEnumerable<T> source, int count) {
        if (null == source)
            throw new ArgumentNullException("source");
        else if (count < 0)
            throw new ArgumentOutOfRangeException("count");
        else if (0 == count)
            yield break;

        Queue<T> queue = new Queue<T>(count + 1);

        foreach (var item in source) {
            queue.Enqueue(item);

            if (queue.Count > count)
                queue.Dequeue();
        }

        foreach (var item in queue)
            yield return item;
    }
}

...

var lastTwolines = File
    .ReadLines("C:\\test.log") // Not all lines
    .Tail(2);
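For comparison, the same bounded-queue idea can be sketched in Python, where collections.deque with maxlen plays the role of the Queue above. The names tail and last_lines are hypothetical helpers invented for this example, not part of any library.

```python
from collections import deque


def tail(items, count):
    """Return the last `count` items of any iterable as a list."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # A deque with maxlen drops its oldest item on each append once full,
    # so at most `count` items are held in memory at any time.
    return list(deque(items, maxlen=count))


def last_lines(path, count, encoding="utf-8"):
    """Stream the file line by line and keep only the last `count` lines."""
    with open(path, encoding=encoding) as fh:
        return tail((line.rstrip("\n") for line in fh), count)
```

Like the C# version, this still reads the whole file from the start; it only avoids keeping every line in memory at once.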

What is the most efficient way to get first and last line of a text file?

See the docs for the io module.

with open(fname, 'rb') as fh:
    first = next(fh).decode()

    fh.seek(-1024, 2)  # 2 means "relative to the end of the file" (os.SEEK_END)
    last = fh.readlines()[-1].decode()

The adjustable value here is 1024, which stands in for the average line length. I chose 1024 only as an example. If you have an estimate of the average line length, you could just use twice that value. Note that seek raises an error if the file is shorter than the offset.
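A variant that tolerates short files could clamp the offset to the file size. The sketch below is one way to do that; the function name first_and_last and the window parameter are assumptions made for this example. It assumes a non-empty file and a window of at least one byte. If the last line is longer than the window, the result is only the end of that line, and the decode may fail if the cut lands inside a multi-byte character.

```python
import os


def first_and_last(fname, window=1024):
    """Return (first_line, last_line) of a non-empty file as decoded strings."""
    with open(fname, "rb") as fh:
        first = next(fh).decode()
        size = os.fstat(fh.fileno()).st_size
        # Seek to `window` bytes before the end, or to the start for short files.
        fh.seek(max(size - window, 0))
        last = fh.readlines()[-1].decode()
    return first, last
```

Both returned lines keep their trailing newline, as in the snippet above.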

Since you have no idea whatsoever about the possible upper bound for the line length, the obvious solution would be to loop over the file:

for line in fh:
    pass
last = line

For this loop you don't need to bother with the binary flag; you could just use open(fname).

ETA: Since you have many files to work on, you could take a sample of a couple of dozen files using random.sample and run this code on them to determine the length of the last line, starting with an a priori large value for the position shift (say, 1 MB). This will help you estimate the value for the full run.
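The sampling step might look roughly like the sketch below. The helper names last_line_length and estimate_shift, the default sample size of 24, and the seed parameter are all assumptions for illustration; only random.sample comes from the suggestion above.

```python
import random


def last_line_length(path, shift):
    """Length in bytes of the last line, looking only at the final `shift` bytes."""
    with open(path, "rb") as fh:
        fh.seek(0, 2)  # jump to the end to learn the file size
        size = fh.tell()
        fh.seek(max(size - shift, 0))
        lines = fh.readlines()
    return len(lines[-1]) if lines else 0


def estimate_shift(paths, sample_size=24, shift=1024 * 1024, seed=None):
    """Longest last line (in bytes) found in a random sample of `paths`."""
    paths = list(paths)
    rng = random.Random(seed)
    sample = rng.sample(paths, min(sample_size, len(paths)))
    return max((last_line_length(p, shift) for p in sample), default=0)
```

The returned length includes the line terminator, and it is only a lower bound on what the full set of files may need, so some safety margin on top of it seems prudent.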

How to efficiently read only last line of the text file

You want to read the file backwards using ReverseLineReader:

How to read a text file reversely with iterator in C#

Then run .Take(1) on it.

var lines = new ReverseLineReader(filename);
var last = lines.Take(1); // a sequence holding just the last line; use lines.First() for the string itself

You'll want to use Jon Skeet's library MiscUtil directly rather than copying/pasting the code.


