Determine the Number of Lines Within a Text File

Determine the number of lines within a text file

Seriously belated edit: If you're using .NET 4.0 or later

The File class has a new ReadLines method which lazily enumerates lines rather than greedily reading them all into an array like ReadAllLines. So now you can have both efficiency and conciseness (Count() is a LINQ extension method, so you need using System.Linq) with:

var lineCount = File.ReadLines(@"C:\file.txt").Count();

Original Answer

If you're not too bothered about efficiency, you can simply write:

var lineCount = File.ReadAllLines(@"C:\file.txt").Length;

For a more efficient method you could do:

var lineCount = 0;
using (var reader = File.OpenText(@"C:\file.txt"))
{
    while (reader.ReadLine() != null)
    {
        lineCount++;
    }
}

Edit: In response to questions about efficiency

The reason I said the second was more efficient was regarding memory usage, not necessarily speed. The first one loads the entire contents of the file into an array which means it must allocate at least as much memory as the size of the file. The second merely loops one line at a time so it never has to allocate more than one line's worth of memory at a time. This isn't that important for small files, but for larger files it could be an issue (if you try and find the number of lines in a 4GB file on a 32-bit system, for example, where there simply isn't enough user-mode address space to allocate an array this large).

In terms of speed I wouldn't expect there to be a lot in it. It's possible that ReadAllLines has some internal optimisations, but on the other hand it may have to allocate a massive chunk of memory. I'd guess that ReadAllLines might be faster for small files, but significantly slower for large files; though the only way to tell would be to measure it with a Stopwatch or code profiler.
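As a rough illustration of "measure it", here is a minimal timing sketch. It is written in Python rather than C#, with time.perf_counter standing in for Stopwatch. The helper name average_seconds and its repeats parameter are made up for this example.

```python
import time


def average_seconds(func, *args, repeats=10):
    """Call func(*args) `repeats` times and return the mean wall-clock time.

    Hypothetical helper for illustration; it times whatever callable
    you hand it, for example a line-counting function and a file path.
    """
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        total += time.perf_counter() - start
    return total / repeats
```

Pass each candidate counting function the same file and compare the averages. Bear in mind that the operating system's file cache can favour whichever approach runs second.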

How to get the number of lines in a text file without opening it?

You can't count the number of lines in a file without reading it. The operating systems your code runs on do not store the number of lines as some kind of metadata. They don't even generally distinguish between binary and text files! You just have to read the file and count the newlines.

However, you can probably do this faster than you are doing it now, if your files have a large number of lines.

This line of code is what I'm worried about:

nbLines = (file.split("\n")).length;

Calling split here creates a large number of memory allocations, one for each line in the file.

My hunch is that it would be faster to count the newlines directly in a for loop:

function lineCount( text ) {
    var nLines = 0;
    for( var i = 0, n = text.length; i < n; ++i ) {
        if( text[i] === '\n' ) {
            ++nLines;
        }
    }
    return nLines;
}

This counts the newline characters without any memory allocations, and most JavaScript engines should do a good job of optimizing this code.

You may also want to adjust the final count slightly depending on whether the file ends with a newline, according to how you want to interpret that case. Don't do that inside the loop, though; do it afterward.
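To make the "adjust afterward" step concrete, here is a small Python sketch. The convention is an assumption, not something the answer prescribes: a final line with no terminating newline still counts as a line, and an empty string has zero lines. The function name count_lines is made up for this example.

```python
def count_lines(text):
    """Count newline characters, then adjust once after counting.

    Assumed convention: trailing text that is not terminated by a
    newline is counted as one more line.
    """
    newlines = text.count("\n")
    if text and not text.endswith("\n"):
        newlines += 1
    return newlines
```

If you would rather ignore an unterminated last line, drop the adjustment and return the raw newline count.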

How do I find out the Number of Lines in a text file?

Why not do it directly?

using System.IO;
using System.Linq;

...

int count = File.ReadLines(@"E:\File.txt").Count();

Is there a better way to determine the number of lines in a large txt file (1-2 GB)?

I'm just thinking out loud here, but chances are performance is I/O bound and not CPU bound. In any case, I'm wondering if interpreting the file as text may be slowing things down as it will have to convert between the file's encoding and string's native encoding. If you know the encoding is ASCII or compatible with ASCII, you might be able to get away with just counting the number of times a byte with the value 10 appears (which is the character code for a linefeed).

What if you had the following:

long lineCount = 0;
byte[] buffer = new byte[1024 * 1024];
int bytesRead;

// The using block disposes the stream once counting is finished.
using (FileStream fs = new FileStream("path.txt", FileMode.Open, FileAccess.Read, FileShare.None, 1024 * 1024))
{
    do
    {
        // Read up to 1 MB of raw bytes and count the linefeeds (byte value 10).
        bytesRead = fs.Read(buffer, 0, buffer.Length);
        for (int i = 0; i < bytesRead; i++)
        {
            if (buffer[i] == '\n')
                lineCount++;
        }
    }
    while (bytesRead > 0);
}

My benchmark results for a 1.5 GB text file, timed 10 times and averaged:

  • StreamReader approach, 4.69 seconds
  • File.ReadLines().Count() approach, 4.54 seconds
  • FileStream approach, 1.46 seconds
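The byte-counting idea can be sketched in Python too. Like the FileStream version, it assumes an ASCII-compatible encoding such as ASCII or UTF-8; it would miscount UTF-16, for example. The function name count_newline_bytes and the chunk_size parameter are made up for this example, and no timing claim is made for it.

```python
def count_newline_bytes(path, chunk_size=1024 * 1024):
    """Count linefeed bytes (value 10) by reading the file in raw chunks.

    No text decoding happens, so this assumes an ASCII-compatible
    encoding in which byte 10 can only mean a linefeed.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            count += chunk.count(b"\n")
    return count
```

Because only linefeed bytes are counted, a final line without a terminating newline is not included in the result.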

How to find number of lines in text file using Python?

Try this:

# path_to_file stands in for <pathtofile>, the path to your file
with open(path_to_file) as f:
    print(len(f.readlines()))
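Note that readlines() builds a list of every line, so memory use grows with the file. If that matters, a streaming variant might look like the sketch below. The function name count_lines_in_file is made up for this example, and the file is opened in text mode with the platform's default encoding, as in the snippet above.

```python
def count_lines_in_file(path):
    """Count lines by iterating over the file object.

    The file object yields one line at a time, so only the current
    line is held in memory instead of a list of all lines.
    """
    with open(path) as f:
        return sum(1 for _ in f)
```

A final line with no terminating newline is still yielded by the iterator, so it is counted as a line here.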

Counting the number of lines in a text file (Java)

If you just want to add the data to an array, I append each new value to an array as it is read. If the amount of data you are reading isn't large and you don't need to do it often, that should be fine. I use something like this, as given in this answer: Reading a plain text file in Java

BufferedReader br = new BufferedReader(new FileReader("path/to/file.txt"));
try {
    StringBuilder sb = new StringBuilder();
    String line = br.readLine();

    // readLine() returns null once the end of the file is reached
    while (line != null) {
        sb.append(line);
        sb.append(System.lineSeparator());
        line = br.readLine();
    }
    String everything = sb.toString();
} finally {
    br.close();
}

If you are reading in numbers, the strings can be converted to numbers; for integers, say, intValue = Integer.parseInt(text).

Function to count number of lines in a text file

The only alternative I see is to read the lines one by one (EDIT: or even just skip them one by one) instead of reading the whole file at once. Unfortunately I can't test which is faster right now. I imagine skipping is quicker.

Dim objFSO, txsInput, strTemp, arrLines
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")

strTextFile = "sample.txt"
Set txsInput = objFSO.OpenTextFile(strTextFile, ForReading) ' Set is required when assigning an object

'Skip lines one by one
Do While txsInput.AtEndOfStream <> True
    txsInput.SkipLine ' or strTemp = txsInput.ReadLine
Loop

wscript.echo txsInput.Line-1 ' Returns the number of lines

'Cleanup
txsInput.Close
Set objFSO = Nothing

Incidentally, I took the liberty of removing some of your 'comments. In terms of good practice, they were superfluous and didn't really add any explanatory value, especially when they basically repeated the method names themselves, e.g.

'Create a File System Object
... CreateObject("Scripting.FileSystemObject")

