How to Read File N Lines at a Time

How to read file N lines at a time?

One solution would be a list comprehension and the slice operator:

with open(filename, 'r') as infile:
    lines = [line for line in infile][:N]

After this, lines is a list containing the first N lines. However, this loads the complete file into memory. If you don't want that (i.e. if the file could be really large), there is another solution using itertools.islice, which returns a lazy iterator:

from itertools import islice

with open(filename, 'r') as infile:
    lines_gen = islice(infile, N)

lines_gen is an iterator over the first N lines of the file. It reads lazily from the open file, so it must be consumed inside the with block, before the file is closed:

for line in lines_gen:
    print(line)

Both solutions give you at most N lines (fewer if the file is shorter than that).
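If you need the whole file N lines at a time (not just the first N), islice can be called repeatedly on the same file object, since each call consumes the next N lines. A sketch, using a hypothetical sample file name:

```python
from itertools import islice

def chunks(path, n):
    """Yield successive lists of up to n lines from the file at path."""
    with open(path) as infile:
        while True:
            chunk = list(islice(infile, n))  # consumes the next n lines
            if not chunk:  # empty list means end of file
                break
            yield chunk

# Hypothetical usage: write a small sample file, then read it two lines at a time
with open("sample.txt", "w") as f:
    f.write("a\nb\nc\nd\ne\n")

for chunk in chunks("sample.txt", 2):
    print(chunk)
# prints ['a\n', 'b\n'], then ['c\n', 'd\n'], then ['e\n']
```

The last chunk may be shorter than n, as above.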

How to read first N lines of a file?

Python 3:

with open("datafile") as myfile:
    head = [next(myfile) for x in range(N)]
print(head)

Note that next() raises StopIteration if the file has fewer than N lines (which surfaces as a RuntimeError inside a comprehension on Python 3.7+); the islice version below handles short files gracefully.

Python 2:

with open("datafile") as myfile:
    head = [next(myfile) for x in xrange(N)]
print head

Here's another way (both Python 2 & 3):

from itertools import islice

with open("datafile") as myfile:
    head = list(islice(myfile, N))
print(head)

Read n lines at a time using Bash

This is harder than it looks. The problem is how to keep the file handle.

The solution is to create another, new file handle which works like stdin (file handle 0) but is independent and then read from that as you need.

#!/bin/bash

# Create dummy input
for i in $(seq 1 10) ; do echo $i >> input-file.txt ; done

# Create new file handle 5
exec 5< input-file.txt

# Now you can use "<&5" to read from this file
# (-r stops read from mangling backslashes)
while read -r line1 <&5 ; do
    read -r line2 <&5
    read -r line3 <&5
    read -r line4 <&5

    echo "Four lines: $line1 $line2 $line3 $line4"
done

# Close file handle 5
exec 5<&-

Read a File 8 Lines at a Time Python

A simple implementation using itertools.islice:

from itertools import islice

with open("test.txt") as fin:
    try:
        while True:
            data = islice(fin, 0, 8)  # up to 8 lines per record

            firstname = next(data)
            lastname = next(data)
            email = next(data)
            # .....
    except StopIteration:
        pass

A better, more Pythonic implementation:

from collections import namedtuple
from itertools import islice

record = namedtuple('record',
                    ('firstname', 'lastname', 'email'  # , .....
                     ))

with open("test.txt") as fin:
    try:
        while True:
            data = islice(fin, 0, 3)

            data = record(*data)
            print(data.firstname, data.lastname, data.email)  # .......
    except (StopIteration, TypeError):
        pass
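The same record grouping can also be written with the zip(*[iter(f)] * n) idiom instead of islice: zip pulls n lines from the same iterator on each step (a trailing partial record is silently dropped). This is a sketch, not from the original answer, assuming a hypothetical people.txt with three lines per record:

```python
from collections import namedtuple

record = namedtuple('record', ('firstname', 'lastname', 'email'))

# Hypothetical sample data: three lines per record
with open("people.txt", "w") as f:
    f.write("Ada\nLovelace\nada@example.com\n"
            "Alan\nTuring\nalan@example.com\n")

with open("people.txt") as fin:
    # zip over three references to the same iterator: each step takes 3 lines
    for group in zip(fin, fin, fin):
        rec = record(*(line.strip() for line in group))
        print(rec.firstname, rec.lastname, rec.email)
```

This avoids the try/except entirely, at the cost of discarding an incomplete final record rather than raising.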

Read two lines at a time from a txt file

You can iterate over the file using enumerate to get line numbers: store even-numbered lines (0, 2, 4, ...) in a temporary variable and move on to the next line. On odd-numbered lines you then have access to both the previous line and the current one.

with open('somefile.txt', 'r') as f:
    lastline = ''
    for line_no, line in enumerate(f):
        if line_no % 2 == 0:  # even number lines (0, 2, 4 ...) go to `lastline`
            lastline = line
            continue  # jump back to the loop for the next line
        print("here's two lines for ya")
        print(lastline)
        print(line)
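For exactly two lines at a time there is also a more compact alternative, not from the original answer: zipping the file iterator with itself, so each step of the loop consumes two lines (a trailing odd line is dropped, just as the loop above never prints one):

```python
# Hypothetical sample file with an even number of lines
with open('somefile.txt', 'w') as f:
    f.write("one\ntwo\nthree\nfour\n")

with open('somefile.txt') as f:
    for lastline, line in zip(f, f):  # each step consumes two lines
        print("here's two lines for ya")
        print(lastline, end='')  # lines keep their trailing newline
        print(line, end='')
```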

Read and print from a text file N lines at a time using a generator only

Solved:

textfile = "f:\\mark\\python\\test.txt"

def read_n(file, x):
    with open(file, mode='r') as fh:
        while True:
            data = ''.join(fh.readline() for _ in range(x))

            if not data:
                break

            yield data
            print()

for nlines in read_n(textfile, 5):
    print(nlines.rstrip())

Output:

abc
123
def
456
ghi

789
jkl
abc
123
def

456
ghi
789
jkl
abc

123
def
456
ghi
789

jkl
abc
123
def
456

ghi
789
jkl

Read a large file N lines at a time in Node.JS

A line reader delivers lines one at a time, so if you want 10 at once you collect them one by one until you have 10, then process the batch.

I don't think Jarek's code quite worked right, so here's a different version that collects 10 lines into an array and then calls dbInsert():

var tenLines = [];
lr.on('line', function(line) {
    tenLines.push(line);
    if (tenLines.length === 10) {
        lr.pause();
        dbInsert(<yourSQL>, function(error, returnVal) {
            if (error) {
                // some sort of error handling here
            }
            tenLines = [];
            lr.resume();
        });
    }
});

// process last set of lines in the tenLines buffer (if any)
lr.on('end', function() {
    if (tenLines.length !== 0) {
        // process last set of lines
        dbInsert(...);
    }
});

Jarek's version seems to call dbInsert() on every line event rather than only on every 10th, and it did not process any leftover lines at the end of the file when the line count wasn't a multiple of 10.


