Read a File Backwards

read file backwards (last line first)

It goes like this:

  1. Seek to one byte before the end of the file using fseek. There's no guarantee that the last line will have an EOL so the last byte doesn't really matter.
  2. Read one byte using fgetc.
  3. If that byte is an EOL then the last line is a single empty line and you have it.
  4. Use fseek again to go backwards two bytes and check that byte with fgetc.
  5. Repeat the above until you find an EOL. When you have an EOL, the file pointer will be at the beginning of the next (from the end) line.
  6. ...
  7. Profit.

Basically you have to keep doing (4) and (5) while keeping track of where you were when you found the beginning of a line so that you can seek back there before starting your scan for the beginning of the next line.

As long as you open your file in text mode you shouldn't have have to worry about multibyte EOLs on Windows (thanks for the reminder Mr. Lutz).

If you happen to be given a non-seekable input (such as a pipe), then you're out of luck unless you want to dump your input to a temporary file first.

So you can do it but it is rather ugly.

You could do pretty much the same thing using mmap and a pointer if you have mmap available and the "file" you're working with is mappable. The technique would be pretty much the same: start at the end and go backwards to find the end of the previous line.


Re: "I am the one creating this file. So, can I create in a way its in the reverse order? Is that possible?"

You'll run into the same sorts of problems but they'll be worse. Files in C are inherently sequential lists of bytes that start at the beginning and go to the end; you're trying to work against this fundamental property and going against the fundamentals is never fun.

Do you really need your data in a plain text file? Maybe you need text/plain as the final output but all the way through? You could store the data in an indexed binary file (possibly even an SQLite database) and then you'd only have to worry about keeping (or windowing) the index in memory and that's unlikely to be a problem (and if it is, use a "real" database); then, when you have all your lines, just reverse the index and away you go.

How to read a file in reverse order?

for line in reversed(open("filename").readlines()):
print line.rstrip()

And in Python 3:

for line in reversed(list(open("filename"))):
print(line.rstrip())

Read a file backwards line by line using fseek

The question is asking using fseek, so can only assume that performance is an issue and file() is not the solution. Here is a simple approach using fseek:

My file.txt

#file.txt
Line 1
Line 2
Line 3
Line 4
Line 5

And the code:

<?php

$fp = fopen('file.txt', 'r');

$pos = -2; // Skip final new line character (Set to -1 if not present)

$lines = array();
$currentLine = '';

while (-1 !== fseek($fp, $pos, SEEK_END)) {
$char = fgetc($fp);
if (PHP_EOL == $char) {
$lines[] = $currentLine;
$currentLine = '';
} else {
$currentLine = $char . $currentLine;
}
$pos--;
}

$lines[] = $currentLine; // Grab final line

var_dump($lines);

Output:

array(5) {
[0]=>
string(6) "Line 5"
[1]=>
string(6) "Line 4"
[2]=>
string(6) "Line 3"
[3]=>
string(6) "Line 2"
[4]=>
string(6) "Line 1"
}

You don't have to append to the $lines array like I am, you can print the output straight away if that is the purpose of your script. Also it is easy to introduce a counter if you want to limit the number of lines.

$linesToShow = 3;
$counter = 0;
while ($counter <= $linesToShow && -1 !== fseek($fp, $pos, SEEK_END)) {
// Rest of code from example. After $lines[] = $currentLine; add:
$counter++;
}

Read a file backwards?

As per comment, a possible (quite simple) alternative would be read the lines into a vector. For example:

#include <iostream>
#include <fstream>
#include <string>
#include <vector>

int main()
{
std::ifstream in("main.cpp");

if (in.is_open())
{
std::vector<std::string> lines_in_reverse;
std::string line;
while (std::getline(in, line))
{
// Store the lines in reverse order.
lines_in_reverse.insert(lines_in_reverse.begin(), line);
}
}
}

EDIT:

As per jrok's and Loki Astari's comments, push_back() would be more efficient but the lines would be in file order, so reverse iteration (reverse_iterator) or std::reverse() would be necessary:

    std::vector<std::string> lines_in_order;
std::string line;
while (std::getline(in, line))
{
lines_in_order.push_back(line);
}

Reading a text file backwards in C

I recommend a more portable (hopefully) way of file size determination since fseek(binaryStream, offset, SEEK_END) is not guaranteed to work. See the code below.

I believe that files should be at least minimally buffered at the kernel level (e.g. buffering at least one block per file by default), so seeks should not incur significant amount of extra I/O and should only advance the file position internally. If the default buffering is not satisfactory, you may try to use setvbuf() to speed up the I/O.

#include <limits.h>
#include <string.h>
#include <stdio.h>

/* File must be open with 'b' in the mode parameter to fopen() */
long fsize(FILE* binaryStream)
{
long ofs, ofs2;
int result;

if (fseek(binaryStream, 0, SEEK_SET) != 0 ||
fgetc(binaryStream) == EOF)
return 0;

ofs = 1;

while ((result = fseek(binaryStream, ofs, SEEK_SET)) == 0 &&
(result = (fgetc(binaryStream) == EOF)) == 0 &&
ofs <= LONG_MAX / 4 + 1)
ofs *= 2;

/* If the last seek failed, back up to the last successfully seekable offset */
if (result != 0)
ofs /= 2;

for (ofs2 = ofs / 2; ofs2 != 0; ofs2 /= 2)
if (fseek(binaryStream, ofs + ofs2, SEEK_SET) == 0 &&
fgetc(binaryStream) != EOF)
ofs += ofs2;

/* Return -1 for files longer than LONG_MAX */
if (ofs == LONG_MAX)
return -1;

return ofs + 1;
}

/* File must be open with 'b' in the mode parameter to fopen() */
/* Set file position to size of file before reading last line of file */
char* fgetsr(char* buf, int n, FILE* binaryStream)
{
long fpos;
int cpos;
int first = 1;

if (n <= 1 || (fpos = ftell(binaryStream)) == -1 || fpos == 0)
return NULL;

cpos = n - 1;
buf[cpos] = '\0';

for (;;)
{
int c;

if (fseek(binaryStream, --fpos, SEEK_SET) != 0 ||
(c = fgetc(binaryStream)) == EOF)
return NULL;

if (c == '\n' && first == 0) /* accept at most one '\n' */
break;
first = 0;

if (c != '\r') /* ignore DOS/Windows '\r' */
{
unsigned char ch = c;
if (cpos == 0)
{
memmove(buf + 1, buf, n - 2);
++cpos;
}
memcpy(buf + --cpos, &ch, 1);
}

if (fpos == 0)
{
fseek(binaryStream, 0, SEEK_SET);
break;
}
}

memmove(buf, buf + cpos, n - cpos);

return buf;
}

int main(int argc, char* argv[])
{
FILE* f;
long sz;

if (argc < 2)
{
printf("filename parameter required\n");
return -1;
}

if ((f = fopen(argv[1], "rb")) == NULL)
{
printf("failed to open file \'%s\'\n", argv[1]);
return -1;
}

sz = fsize(f);
// printf("file size: %ld\n", sz);

if (sz > 0)
{
char buf[256];
fseek(f, sz, SEEK_SET);
while (fgetsr(buf, sizeof(buf), f) != NULL)
printf("%s", buf);
}

fclose(f);
return 0;
}

I've only tested this on windows with 2 different compilers.

Reading large files backwards (from end to start) in C#

Try following code. The last line could be blank. Wasn't sure best way of handling the last line being blank.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace GetFileReverse
{
class Program
{
const string FILENAME = @"c:\temp\test.txt";
static void Main(string[] args)
{
GetFileReverse getFileReverse = new GetFileReverse(FILENAME);
string line = "";
while ((line = getFileReverse.ReadLine()) != null)
{
Console.WriteLine(line);
}
}
}
public class GetFileReverse : IDisposable
{
const int BUFFER_SIZE = 1024;
private FileStream stream { get; set; }
private string data { get; set; }
public Boolean SOF { get; set; }
private long position { get; set; }
public GetFileReverse(string filename)
{
stream = File.OpenRead(filename);
if (stream != null)
{
position = stream.Seek(0, SeekOrigin.End);
SOF = false;
data = string.Empty;
}
else
{
SOF = true;
}
}
private byte[] ReadStream()
{
byte[] bytes = null;
int size = BUFFER_SIZE;
if (position != 0)
{
bytes = new byte[BUFFER_SIZE];
long oldPosition = position;
if (position >= BUFFER_SIZE)
{
position = stream.Seek(-1 * BUFFER_SIZE, SeekOrigin.Current);
}
else
{
position = stream.Seek(-1 * position, SeekOrigin.Current);
size = (int)(oldPosition - position);
bytes = new byte[size];
}
stream.Read(bytes, 0, size);
stream.Seek(-1 * size, SeekOrigin.Current);
}
return bytes;

}
public string ReadLine()
{
string line = "";
while (!SOF && (!data.Contains("\r\n")))
{
byte[] bytes = ReadStream();
if (bytes != null)
{
string temp = Encoding.UTF8.GetString(bytes);
data = data.Insert(0, temp);
}
SOF = position == 0;
}

int lastReturn = data.LastIndexOf("\r\n");
if (lastReturn == -1)
{
if (data.Length > 0)
{
line = data;
data = string.Empty;
}
else
{
line = null;
}
}
else
{
line = data.Substring(lastReturn + 2);
data = data.Remove(lastReturn);
}

return line;
}
public void Close()
{
stream.Close();
}
public void Dispose()
{
stream.Dispose();
data = string.Empty;
position = -1;
}
}
}

Is it possible to read a file backwards using Lua?

This solution is based on the idea of @PaulKulchenko.

Yes, it's cumbersome :-)

Define function io.linesbackward(filename) in the io library:

function io.linesbackward(filename)
local file = assert(io.open(filename))
local chunk_size = 4*1024
local iterator = function() return "" end
local tail = ""
local chunk_index = math.ceil(file:seek"end" / chunk_size)
return
function()
while true do
local lineEOL, line = iterator()
if lineEOL ~= "" then
return line:reverse()
end
repeat
chunk_index = chunk_index - 1
if chunk_index < 0 then
file:close()
iterator = function()
error('No more lines in file "'..filename..'"', 3)
end
return
end
file:seek("set", chunk_index * chunk_size)
local chunk = file:read(chunk_size)
local pattern = "^(.-"..(chunk_index > 0 and "\n" or "")..")(.*)"
local new_tail, lines = chunk:match(pattern)
iterator = lines and (lines..tail):reverse():gmatch"(\n?\r?([^\n]*))"
tail = new_tail or chunk..tail
until iterator
end
end
end

Usage:

local filename = "your_file.txt"
print("--- backward: ---------------------------")
for line in io.linesbackward(filename) do
print(line)
end
print("--- forward: ----------------------------")
for line in io.lines(filename) do
print(line)
end
print("-----------------------------------------")

Read file lines backwards (fgets) with php

First way:

$file = file("test.txt");
$file = array_reverse($file);
foreach($file as $f){
echo $f."<br />";
}

Second Way (a):

To completely reverse a file:


$fl = fopen("\some_file.txt", "r");
for($x_pos = 0, $output = ''; fseek($fl, $x_pos, SEEK_END) !== -1; $x_pos--) {
$output .= fgetc($fl);
}
fclose($fl);
print_r($output);

Second Way (b):
Of course, you wanted line-by-line reversal...


$fl = fopen("\some_file.txt", "r");
for($x_pos = 0, $ln = 0, $output = array(); fseek($fl, $x_pos, SEEK_END) !== -1; $x_pos--) {
$char = fgetc($fl);
if ($char === "\n") {
// analyse completed line $output[$ln] if need be
$ln++;
continue;
}
$output[$ln] = $char . ((array_key_exists($ln, $output)) ? $output[$ln] : '');
}
fclose($fl);
print_r($output);

How to read a file backwards to find substring efficiently

Well I found this kind of interesting so I knocked up a proof-of-concept for the binary-search idea.

This is poorly tested and probably a little buggy but seems to work so far and demonstrates the idea of divide-and-conquer. You check in the middle of the file and, depending if you are to high or too low, you divide the data into two and search the relevant half. You do that recursively until you get close enough.

#include <ctime>
#include <cmath>
#include <cstdlib>
#include <string>
#include <fstream>
#include <iostream>

// Don't use this, its just to show how many reads
// are being done to find the record.
int global_counter;

std::streampos find_stamp(std::istream& is, long stamp, std::streampos pos, std::streampos end)
{
++global_counter;

if(pos == 0) // can't divide zero
return 0;

std::string s;
long found_stamp;

// extract nearest timestamp after pos
is.seekg(pos);
if(!(std::getline(std::getline(is, s, ','), s, '"') >> found_stamp))
return end;

// if its too big check first half of this region
if(found_stamp > stamp)
return find_stamp(is, stamp, pos / 2, pos);

// if its not within 10 timestamp seconds check end half of this region
if(stamp - found_stamp > 10)
return find_stamp(is, stamp, (pos + end) / 2, end);

// read record by record (prolly more efficient than skipping)
pos = is.tellg();
while(std::getline(std::getline(is, s, ','), s, '"') >> found_stamp)
{
if(found_stamp > stamp)
return pos;
pos = is.tellg();
}
return end;
}

void print_after(const std::string& filename, long stamp)
{
// open at end of file (to get length)
std::ifstream ifs(filename, std::ios::ate);

std::streampos end = ifs.tellg();
auto pos = end / 2; // start checking in middle

// find position before required record
// (may be in the middle of a record)
if((pos = find_stamp(ifs, stamp, pos, end)) != end)
{
ifs.seekg(pos);

std::string line;
std::getline(ifs, line, ','); // skip to next whole record

// print out all following recors
while(std::getline(ifs, line, ','))
std::cout << line;
}
}

inline
std::string leading_zeros(int n, int zeros = 2)
{
std::string s;
for(int z = std::pow(10, zeros - 1); z; z /= 10)
s += (n < z ? "0":"");
return s + std::to_string(n);
}

int main()
{
std::srand(std::time(0));

// generate some test data
std::ofstream ofs("test.txt");

for(int i = 0; i < 1000; ++i)
{
ofs << '"' << leading_zeros(i, 10) << '"';
ofs << ":{\"AA\":" << (std::rand() % 100);
ofs << '.' << (std::rand() % 100) << "},\n";
}

ofs.close();

global_counter = 0;
print_after("test.txt", 993);

std::cout << "find checked " << global_counter << " places in the file\n";
}

Output:

"0000000994":{"AA":80.6}
"0000000995":{"AA":11.90}
"0000000996":{"AA":16.43}
"0000000997":{"AA":53.11}
"0000000998":{"AA":68.43}
"0000000999":{"AA":79.77}
find checked 6 places in the file


Related Topics



Leave a reply



Submit