What Is the Most Efficient Way to Get First and Last Line of a Text File

What is the most efficient way to get first and last line of a text file?

See the docs for the io module.

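import os

with open(fname, 'rb') as fh:
    first = next(fh).decode()

    # Jump to (at most) 1024 bytes before the end; seeking further back
    # than the start of the file raises OSError.
    fh.seek(-min(1024, os.path.getsize(fname)), os.SEEK_END)
    last = fh.readlines()[-1].decode()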

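The adjustable value here is 1024: it stands for the expected line length. I chose 1024 only as an example. If you have an estimate of the average line length, you could use that value times 2. The window must be longer than the last line, otherwise the last element of readlines() is only a fragment of it.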

Since you have no idea whatsoever about the possible upper bound for the line length, the obvious solution would be to loop over the file:

for line in fh:
    pass
last = line

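For this loop you don't need to bother with the binary flag; you could just use open(fname). The end-relative seek above does need binary mode in Python 3, because text-mode files reject nonzero end-relative seeks.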
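If the line length is unknown but a full scan is too slow, one middle ground is to start with a small window and double it until the tail contains a complete last line. This is only a sketch and not part of the answer above; the function name first_and_last_line and the window parameter are made up for illustration, and it assumes UTF-8 text with "\n" or "\r\n" line endings.

```python
import os


def first_and_last_line(path, window=1024):
    """Return (first, last) lines as str, without their line endings.

    Hypothetical helper: `window` is the initial number of bytes read from
    the end of the file; it doubles until the tail holds a whole last line.
    """
    window = max(1, window)  # a zero window could never grow
    with open(path, "rb") as fh:
        first = fh.readline()
        size = fh.seek(0, os.SEEK_END)  # seek() returns the new position
        if size == 0:
            return "", ""
        while True:
            window = min(window, size)
            fh.seek(-window, os.SEEK_END)
            tail = fh.read()
            # Drop one trailing newline so it is not mistaken for the
            # newline that precedes the last line.
            body = tail[:-1] if tail.endswith(b"\n") else tail
            cut = body.rfind(b"\n")
            # Stop when a newline was found or the whole file was read.
            if cut != -1 or window == size:
                last = body[cut + 1:]
                break
            window *= 2
    return first.rstrip(b"\r\n").decode(), last.rstrip(b"\r\n").decode()
```

With a reasonable starting window most files need a single read, and only files with unusually long last lines pay for the extra passes.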

ETA: Since you have many files to work on, you could draw a sample of a couple of dozen files with random.sample and run this code on them to determine the length of their last lines, using an a priori large value for the position shift (say, 1 MB). This will help you estimate the value for the full run.
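The sampling idea could look roughly like this. The helper names last_line_length and estimate_shift, the default sample size of 24, and the 1 MB shift are assumptions chosen for illustration; paths is expected to be a list of file names.

```python
import os
import random


def last_line_length(path, shift=1024 * 1024):
    """Length in bytes of the last line, reading at most `shift` bytes."""
    with open(path, "rb") as fh:
        size = fh.seek(0, os.SEEK_END)
        # Never seek back further than the start of the file.
        fh.seek(-min(shift, size), os.SEEK_END)
        lines = fh.readlines()
    # The length includes the trailing newline, if there is one.
    return len(lines[-1]) if lines else 0


def estimate_shift(paths, sample_size=24, shift=1024 * 1024):
    """Longest last line (in bytes) seen in a random sample of `paths`."""
    sample = random.sample(paths, min(sample_size, len(paths)))
    return max((last_line_length(p, shift) for p in sample), default=0)
```

If the result comes back equal to the shift itself, a last line was probably longer than the window, so the estimate should be repeated with a larger shift. Adding a safety margin on top of the estimate before the full run is a sensible precaution.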

C++ fastest way to read only last line of text file?

Use seekg to jump to the end of the file, then read back until you find the first newline.
Below is some sample code off the top of my head using MSVC.

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main()
{
    string filename = "test.txt";
    ifstream fin;
    fin.open(filename);
    if(fin.is_open()) {
        fin.seekg(-1,ios_base::end);                // go to one spot before the EOF

        bool keepLooping = true;
        while(keepLooping) {
            char ch;
            fin.get(ch);                            // Get current byte's data

            if((int)fin.tellg() <= 1) {             // If the data was at or before the 0th byte
                fin.seekg(0);                       // The first line is the last line
                keepLooping = false;                // So stop there
            }
            else if(ch == '\n') {                   // If the data was a newline
                keepLooping = false;                // Stop at the current position.
            }
            else {                                  // If the data was neither a newline nor at the 0 byte
                fin.seekg(-2,ios_base::cur);        // Move to the front of that data, then to the front of the data before it
            }
        }

        string lastLine;
        getline(fin,lastLine);                      // Read the current line
        cout << "Result: " << lastLine << '\n';     // Display it

        fin.close();
    }

    return 0;
}

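And below is a test file. The code succeeds with empty, one-line, and multi-line data in the text file. Note that if the file ends with a newline, the backward search stops on that newline immediately and the result is an empty string.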

This is the first line.
Some stuff.
Some stuff.
Some stuff.
This is the last line.

How would I go about printing the last line in a large text file?

If you only need the last line, throw everything else away.

with open('foo.txt') as f:
    for line in f:
        pass

# `line` is the last line of the file.

Much faster (but far less readable) would be to start at the end of the file and move backwards by bytes until you find \n, then read.

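import os

with open('foo.txt', 'rb') as f:  # binary mode is required for relative seeks
    try:
        f.seek(-2, os.SEEK_END)      # skip a possible trailing newline
        while f.read(1) != b'\n':
            f.seek(-2, os.SEEK_CUR)  # step back over the byte just read, plus one more
    except OSError:
        f.seek(0)                    # hit the start: the file has at most one line
    line = f.readline().decode()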

# `line` is the last line of the file

This works by reading the file from the end, looking for the first newline, then reading forward from there.

How to read first word of last line from text file?

last is a string and last[0] is the first character of that string. If you want the first word then you'll need to split the string on spaces:

string[] words = last.Split(' ');

Then you can get the first word:

Console.WriteLine(words[0]);

You'll need to include more error checking, for example in case the last line of the file is empty, and perhaps cope with more whitespace characters than just a space. There might be tabs:

var words = last.Split(new char[] { ' ', '\t' });

Most efficient way to modify the last line of a large text file in Python

Update: Use ShadowRanger's answer. It's much shorter and more robust.

For posterity:

Read the last N bytes of the file and search backwards for the newline.

#!/usr/bin/env python3

with open("test.txt", "wb") as testfile:
    testfile.write(b"\n".join([b"one", b"two", b"three"]) + b"\n")

with open("test.txt", "r+b") as myfile:
    # Read the last 1 KiB of the file.
    # We could make this dynamic, but chances are there's
    # a number like 1 KiB that'll work 100% of the time for you.
    myfile.seek(0, 2)
    filesize = myfile.tell()
    blocksize = min(1024, filesize)
    myfile.seek(-blocksize, 2)
    # Search backwards for a newline (excluding the very last byte
    # in case the file ends with a newline). rindex raises ValueError
    # if the block contains no other newline.
    index = myfile.read().rindex(b"\n", 0, blocksize - 1)
    # Seek to the character just after the newline.
    myfile.seek(index + 1 - blocksize, 2)
    # Read in the last line of the file.
    lastline = myfile.read()
    # Modify the last line.
    lastline = b"Brand New Line!\n"
    # Seek back to the start of the last line.
    myfile.seek(index + 1 - blocksize, 2)
    # Write out the new version of the last line.
    myfile.write(lastline)
    myfile.truncate()

Extracting first 2 rows and last row from .txt or .csv Python

This is what you need:

def extract_lines(filename, outputname):
    lines = []
    last = None
    with open(filename, 'r') as f:
        # Iterating the file line by line is memory efficient in case the CSV is huge.
        for index, line in enumerate(f):
            if index < 2:  # first 2 lines
                lines.append(line)
            else:  # the file has at least 3 lines; remember only the latest one
                last = line
    if last is not None:
        lines.append(last)
    with open(outputname, 'w') as f:
        for line in lines:
            f.write(line)

