C# - StreamReader and Seeking

Yes you can, see this:

var sr = new StreamReader("test.txt");
sr.BaseStream.Seek(2, SeekOrigin.Begin); // Check sr.BaseStream.CanSeek first

Update:
Be aware that sr.BaseStream.Position is not necessarily useful: StreamReader reads ahead into an internal buffer, so the position of the underlying stream does not reflect how much you have actually read. Finding the true position is hard, because you can't just count characters (different encodings have different character lengths). I think the best way is to work with the FileStream directly.
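To make the buffering effect concrete, here is a minimal sketch. The class and method names are made up for illustration, and it uses a MemoryStream instead of a file so it runs anywhere:

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferedPositionDemo
{
    // Reads only the first line, then reports where the underlying stream is.
    public static long PositionAfterFirstLine(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        using (var ms = new MemoryStream(bytes))
        using (var sr = new StreamReader(ms, Encoding.UTF8))
        {
            sr.ReadLine();
            // The reader has already pulled a whole buffer from the stream,
            // so this is where the buffer fill ended, not where ReadLine stopped.
            return sr.BaseStream.Position;
        }
    }
}
```

For the input "1234\n56789" the first line ends at byte 5, but the returned position should be past that, because the whole 10-byte input fits into a single buffer fill.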

Update:
Use the TGREER.myStreamReader class from here:
http://www.daniweb.com/software-development/csharp/threads/35078
This class adds BytesRead etc. (it works with ReadLine() but apparently not with the other read methods), and then you can do this:

File.WriteAllText("test.txt", "1234\n56789");

long position = -1;

using (var sr = new myStreamReader("test.txt"))
{
    Console.WriteLine(sr.ReadLine());

    position = sr.BytesRead;
}

Console.WriteLine("Wait");

using (var sr = new myStreamReader("test.txt"))
{
    sr.BaseStream.Seek(position, SeekOrigin.Begin);
    Console.WriteLine(sr.ReadToEnd());
}

Seek through a FileStream, then use a StreamReader to read from there

So thanks to Hans Passant, I have got the answer:

var buffer = new char[BufferSize];

var endpoints = new List<long>();

using (var fileStream = this.CreateMultipleReadAccessFileStream(fileName))
{
    var fileLength = fileStream.Length;

    var seekPositionCount = fileLength / concurrentReads;

    long currentOffset = 0;
    for (var i = 0; i < concurrentReads; i++)
    {
        var seekPosition = seekPositionCount + currentOffset;

        // seek the file forward
        // fileStream.Seek(seekPosition, SeekOrigin.Current);

        // passing true (leaveOpen) at the end of the StreamReader constructor is very important,
        // it keeps the underlying fileStream open when the reader is disposed.
        using (var streamReader = this.CreateTemporaryStreamReader(fileStream))
        {
            // this is poor on performance, hence why you split the file here and read in new threads.
            streamReader.DiscardBufferedData();
            // you have to advance the fileStream here, because of the previous line
            streamReader.BaseStream.Seek(seekPosition, SeekOrigin.Begin);
            // this also seeks the file forward the amount in the buffer...
            // note: ReadAsync returns a count of chars, so treating it as a byte count
            // is only correct for single-byte encodings.
            int bytesRead;
            var totalBytesRead = 0;
            // kept outside the read loop so a "\r\n" split across two reads is still detected
            var gotR = false;
            while ((bytesRead = await streamReader.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                totalBytesRead += bytesRead;

                var found = false;

                // only scan the part of the buffer that was filled by this read
                for (var j = 0; j < bytesRead; j++)
                {
                    if (buffer[j] == '\r')
                    {
                        gotR = true;
                        continue;
                    }

                    if (buffer[j] == '\n' && gotR)
                    {
                        // so we add the total bytes read, minus the amount read into the current buffer, then add how far into the buffer we actually read.
                        seekPosition += totalBytesRead - bytesRead + j;
                        endpoints.Add(seekPosition);
                        found = true;
                        break;
                    }

                    // any other character breaks a pending "\r\n" sequence
                    gotR = false;
                }

                if (found) break;
            }
        }

        currentOffset = seekPosition;
    }
}

return endpoints;

Note the new part, rather than doing this twice:

fileStream.Seek(seekPosition, SeekOrigin.Current);

I now use SeekOrigin.Begin and use the StreamReader to progress the underlying base stream:

// this is poor on performance, hence why you split the file here and read in new threads.
streamReader.DiscardBufferedData();
// you have to advance the fileStream here, because of the previous line
streamReader.BaseStream.Seek(seekPosition, SeekOrigin.Begin);

Calling DiscardBufferedData means that the reader always starts from the underlying stream position instead of from stale buffered data.

Why is StreamReader and sr.BaseStream.Seek() giving Junk Characters even in UTF8 Encoding

It is exactly because of UTF-8 that sr.BaseStream is giving junk characters. :)

StreamReader is a relatively "smarter" stream. It understands how strings work, whereas FileStream (i.e. sr.BaseStream) doesn't. FileStream only knows about bytes.

Since your file is encoded in UTF-8 (a variable-length encoding), letters like A, B and C are encoded with 1 byte, but the • character needs 3 bytes. You can get how many bytes a character needs by doing:

Console.WriteLine(Encoding.UTF8.GetByteCount("•"));

So when you move the stream to "the position just after •", you haven't actually moved past the •; you are just on the second byte of it.

The reason why the Lengths are different is similar: StreamReader gives you the number of characters, whereas sr.BaseStream gives you the number of bytes.
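A small sketch of the difference between character counts and byte counts, which is what you need when translating "after N characters" into a stream position. The class and method names are invented for illustration, and it assumes the text is UTF-8 without a BOM:

```csharp
using System;
using System.Text;

public static class Utf8OffsetDemo
{
    // Number of UTF-16 chars, which is what string.Length reports.
    public static int CharCount(string s)
    {
        return s.Length;
    }

    // Number of bytes the same text occupies when encoded as UTF-8.
    public static int ByteCount(string s)
    {
        return Encoding.UTF8.GetByteCount(s);
    }

    // Byte offset just past the first charCount characters of s.
    // This is the value to pass to Seek, not charCount itself.
    public static int ByteOffsetAfter(string s, int charCount)
    {
        return Encoding.UTF8.GetByteCount(s.Substring(0, charCount));
    }
}
```

For "A\u2022B" (A, •, B) the character count is 3 but the byte count is 5, and the position just after the • is byte 4, not 2.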

Tracking the position of the line of a streamreader

You can do this one of three ways:

1) Write your own StreamReader. Here's a good place to start: How to know position(linenumber) of a streamreader in a textfile?

2) The StreamReader class has two very important but private fields called charPos and charLen that are needed to locate the actual "read" position and not just the underlying position of the stream. You could use reflection to get the values:

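// Note: these are private implementation details. The field names below are the
// .NET Framework ones; newer runtimes name them _charPos and _charLen.
Int32 charpos = (Int32) s.GetType().InvokeMember("charPos",
    BindingFlags.DeclaredOnly |
    BindingFlags.Public | BindingFlags.NonPublic |
    BindingFlags.Instance | BindingFlags.GetField,
    null, s, null);

Int32 charlen = (Int32) s.GetType().InvokeMember("charLen",
    BindingFlags.DeclaredOnly |
    BindingFlags.Public | BindingFlags.NonPublic |
    BindingFlags.Instance | BindingFlags.GetField,
    null, s, null);

// Mixes a byte position with char counts, so it is only exact for single-byte encodings.
return (Int32)s.BaseStream.Position - charlen + charpos;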

3) Simply read the entire file into a string array. Something like this:

string[] fileLines;
using (TextReader tr = File.OpenText("some path to file"))
{
    // Split on "\r\n" before "\n" so Windows line endings don't produce empty entries
    fileLines = tr.ReadToEnd().Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
}

Another possibility (along the same lines as #3) is to read in the lines and store each line in an array. When you want to read the prior line, just use the array.
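If the file is too large to hold in memory, a variation on the same idea is to store only the byte offset where each line starts and seek to it on demand. This is a rough sketch, not something from the answers above: the LineIndex class and its methods are hypothetical, and it assumes UTF-8 (or ASCII) text without a BOM, where the byte 0x0A only ever appears as a line feed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineIndex
{
    // Scans the raw bytes once and records the offset at which each line begins.
    public static List<long> BuildLineOffsets(Stream stream)
    {
        var offsets = new List<long> { 0 };
        var buffer = new byte[4096];
        long position = 0;
        int read;
        stream.Position = 0;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == (byte)'\n')
                    offsets.Add(position + i + 1);
            }
            position += read;
        }
        // A trailing newline does not start a new line, so drop that last offset.
        if (offsets.Count > 1 && offsets[offsets.Count - 1] == position)
            offsets.RemoveAt(offsets.Count - 1);
        return offsets;
    }

    // Seeks to a recorded offset and reads one line from there.
    public static string ReadLineAt(Stream stream, long offset)
    {
        stream.Position = offset;
        // A fresh reader has an empty buffer; leaveOpen (last argument) keeps the stream usable.
        using (var reader = new StreamReader(stream, Encoding.UTF8, false, 1024, true))
        {
            return reader.ReadLine();
        }
    }
}
```

With the offsets in hand, going back one line is just ReadLineAt(stream, offsets[n - 1]), at the cost of one pass over the file to build the index.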

C# StreamReader, Seek backwards for one line from middle of file

You can use the File.ReadAllLines function; this is the fastest way I know to read a text file.
For example, I have a text file like the one below.

aaa123
aaa456
aaa789
bbb123
bbb456
bbb789

Suppose I search each line of the text file for "a789". When I find it, I expect to get the data of the previous line, which in this case is "aaa456".

string[] lines = File.ReadAllLines(@"C:\Users\Binh\Desktop\C#\test\test.txt");
string data = "";

for (int i = 0; i < lines.Length; i++)
{
    // i > 0 guards against a match on the first line, which has no previous line
    if (i > 0 && lines[i].Contains("a789"))
    {
        data = lines[i - 1]; // Read backward for one line
    }
}
Console.WriteLine(data); // This will print "aaa456"
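If you only ever need the one line before the match, you don't have to keep the whole file in memory. A minimal sketch of that idea follows; PreviousLineFinder is a made-up name, and unlike the loop above it stops at the first match instead of the last:

```csharp
using System;
using System.IO;

public static class PreviousLineFinder
{
    // Streams through the reader, remembering only the most recent line.
    // Returns the line before the first line containing needle, or null
    // if there is no match or the match is on the very first line.
    public static string FindPreviousLine(TextReader reader, string needle)
    {
        string previous = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Contains(needle))
                return previous;
            previous = line;
        }
        return null;
    }
}
```

It takes any TextReader, so it works the same with File.OpenText(path) or with a StringReader in a quick test.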

Seeking for a line in a text file

I have found a solution. The line I appended to the file will always be the last line in the file, so I created a method to read the last line. See below:

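public string ReadLastLine(string path)
{
    string returnValue = "";
    using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
    {
        // Walk backwards from the end one byte at a time. This re-reads the tail
        // on every step, so it is only reasonable when the last line is short.
        for (long pos = fs.Length - 1; pos >= 0; --pos)
        {
            fs.Seek(pos, SeekOrigin.Begin);
            // A new reader starts with an empty buffer, so it reads from pos.
            // It is not disposed here because that would close fs.
            StreamReader ts = new StreamReader(fs);
            // Ignore trailing line breaks so they are not mistaken for the line separator.
            returnValue = ts.ReadToEnd().TrimEnd('\r', '\n');
            int eol = returnValue.LastIndexOf('\n');
            if (eol >= 0)
            {
                return returnValue.Substring(eol + 1);
            }
        }
    }
    // No line break found: the file has a single line (or is empty).
    return returnValue;
}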
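Creating a reader for every step gets slow on long lines. As an alternative sketch, you can scan the raw bytes backwards for the line feed and decode only once. The LastLineReader name is invented here, and the code assumes UTF-8 or ASCII content, where the byte 0x0A can only be a line feed:

```csharp
using System;
using System.IO;
using System.Text;

public static class LastLineReader
{
    public static string ReadLastLine(Stream stream)
    {
        long end = stream.Length;

        // Skip any trailing "\r" / "\n" bytes.
        while (end > 0)
        {
            stream.Position = end - 1;
            int b = stream.ReadByte();
            if (b != '\n' && b != '\r') break;
            end--;
        }

        // Move back until the byte before start is a line feed (or we hit the beginning).
        long start = end;
        while (start > 0)
        {
            stream.Position = start - 1;
            if (stream.ReadByte() == '\n') break;
            start--;
        }

        // Read exactly the bytes of the last line and decode them once.
        var bytes = new byte[end - start];
        stream.Position = start;
        int total = 0;
        while (total < bytes.Length)
        {
            int n = stream.Read(bytes, total, bytes.Length - total);
            if (n == 0) break;
            total += n;
        }
        return Encoding.UTF8.GetString(bytes, 0, total);
    }
}
```

On a FileStream the single-byte reads are served from its internal buffer, but for very long last lines it would be worth reading backwards in larger blocks instead.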

Reading a line from a streamreader without consuming?

The problem is that the underlying stream may not even be seekable. If you take a look at the StreamReader implementation, it uses a buffer so that it can implement TextReader.Peek() even if the stream is not seekable.

You could write a simple adapter that reads the next line and buffers it internally, something like this:

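public class PeekableStreamReaderAdapter
{
    private StreamReader Underlying;
    private Queue<string> BufferedLines;

    public PeekableStreamReaderAdapter(StreamReader underlying)
    {
        Underlying = underlying;
        BufferedLines = new Queue<string>();
    }

    public string PeekLine()
    {
        // If a line is already buffered, return it again so that repeated
        // peeks see the same line instead of consuming further lines.
        if (BufferedLines.Count > 0)
            return BufferedLines.Peek();
        string line = Underlying.ReadLine();
        if (line == null)
            return null;
        BufferedLines.Enqueue(line);
        return line;
    }

    public string ReadLine()
    {
        // Hand out the peeked line first, then fall back to the reader.
        if (BufferedLines.Count > 0)
            return BufferedLines.Dequeue();
        return Underlying.ReadLine();
    }
}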

Return StreamReader to Beginning

You need to seek on the stream, like you did, then call DiscardBufferedData on the StreamReader.

Edit: Adding code example:

Stream s = new MemoryStream();
StreamReader sr = new StreamReader(s);
// later... after we read stuff
s.Position = 0;
sr.DiscardBufferedData(); // reader now reading from position 0
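Here is the same idea as a small self-contained sketch you can run. RewindDemo is a made-up name, and a MemoryStream stands in for whatever seekable stream you actually have:

```csharp
using System;
using System.IO;
using System.Text;

public static class RewindDemo
{
    // Reads the first line, rewinds, and reads it again.
    // Returns both results joined by '|' so they can be compared.
    public static string ReadFirstLineTwice(string text)
    {
        using (var s = new MemoryStream(Encoding.UTF8.GetBytes(text)))
        using (var sr = new StreamReader(s))
        {
            string first = sr.ReadLine();

            s.Position = 0;             // move the underlying stream back
            sr.DiscardBufferedData();   // drop what the reader had already buffered

            string again = sr.ReadLine();
            return first + "|" + again;
        }
    }
}
```

Without the DiscardBufferedData call, the second ReadLine would continue from the reader's buffer and return the second line instead of the first.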

