c# - StreamReader and seeking
Yes you can, see this:
var sr = new StreamReader("test.txt");
sr.BaseStream.Seek(2, SeekOrigin.Begin); // Check sr.BaseStream.CanSeek first
Update:
Be aware that you can't necessarily use sr.BaseStream.Position for anything useful: StreamReader buffers its input, so the position will not reflect what you have actually read. You are going to have trouble finding the true position, because you can't just count characters (different encodings mean different character lengths in bytes). I think the best way is to work with the FileStreams themselves.
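To make the buffering effect concrete, here is a minimal sketch. The temp-file name is made up for illustration, and the exact position reported depends on the reader's buffer size; on a file this small the reader typically pulls in everything on the first read.

```csharp
using System;
using System.IO;

// Hypothetical scratch file; File.WriteAllText writes UTF-8 without a BOM by default.
string path = Path.Combine(Path.GetTempPath(), "position-demo.txt");
File.WriteAllText(path, "1234\n56789");

string firstLine;
long positionAfterFirstLine;
using (var sr = new StreamReader(path))
{
    // Logically we have consumed "1234" plus the line break: 5 bytes.
    firstLine = sr.ReadLine();
    // The reader filled its buffer from the stream, so the stream is further along.
    positionAfterFirstLine = sr.BaseStream.Position;
}

Console.WriteLine($"Read \"{firstLine}\", BaseStream.Position = {positionAfterFirstLine}");
```

The reported position is the end of whatever was buffered (the whole 10-byte file here), not the 5 bytes that ReadLine actually consumed.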
Update:
Use the TGREER.myStreamReader class from here: http://www.daniweb.com/software-development/csharp/threads/35078
This class adds BytesRead etc. (it works with ReadLine(), but apparently not with the other read methods), and then you can do this:
File.WriteAllText("test.txt", "1234\n56789");
long position = -1;
using (var sr = new myStreamReader("test.txt"))
{
    Console.WriteLine(sr.ReadLine());
    position = sr.BytesRead;
}
Console.WriteLine("Wait");
using (var sr = new myStreamReader("test.txt"))
{
    sr.BaseStream.Seek(position, SeekOrigin.Begin);
    Console.WriteLine(sr.ReadToEnd());
}
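If you would rather not depend on that class, one rough alternative is to compute the byte offset yourself from the lines you have read. This is only a sketch under loud assumptions: the file is UTF-8 without a BOM, and every line ends with a single "\n". The file name is made up.

```csharp
using System;
using System.IO;
using System.Text;

// Assumption: UTF-8 without a byte order mark, lines terminated by a single '\n'.
var encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
string path = Path.Combine(Path.GetTempPath(), "offset-demo.txt");
File.WriteAllText(path, "1234\n56789", encoding);

long offset = 0;
using (var sr = new StreamReader(path, encoding))
{
    string line = sr.ReadLine();
    // Bytes used by the line's characters, plus one byte for the '\n'.
    offset += encoding.GetByteCount(line) + 1;
}

string rest;
using (var fs = File.OpenRead(path))
{
    // Seek the raw stream before any reader has buffered data from it.
    fs.Seek(offset, SeekOrigin.Begin);
    using (var sr = new StreamReader(fs, encoding))
    {
        rest = sr.ReadToEnd();
    }
}

Console.WriteLine(rest);
```

With "\r\n" line endings or a BOM the arithmetic changes, which is why a reader that counts bytes as it goes is the more robust option.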
Seek through FileStream then using StreamReader to read from there
So thanks to Hans Passant, I have got the answer:
var buffer = new char[BufferSize];
var endpoints = new List<long>();
using (var fileStream = this.CreateMultipleReadAccessFileStream(fileName))
{
    var fileLength = fileStream.Length;
    var seekPositionCount = fileLength / concurrentReads;
    long currentOffset = 0;
    for (var i = 0; i < concurrentReads; i++)
    {
        var seekPosition = seekPositionCount + currentOffset;
        // seek the file forward
        // fileStream.Seek(seekPosition, SeekOrigin.Current);
        // setting true at the end is very important, keeps the underlying fileStream open.
        using (var streamReader = this.CreateTemporaryStreamReader(fileStream))
        {
            // this is poor on performance, hence why you split the file here and read in new threads.
            streamReader.DiscardBufferedData();
            // you have to advance the fileStream here, because of the previous line
            streamReader.BaseStream.Seek(seekPosition, SeekOrigin.Begin);
            // this also seeks the file forward the amount in the buffer...
            // Note: ReadAsync on a StreamReader returns a character count, so these
            // offsets match byte offsets only for single-byte encodings.
            int bytesRead;
            var totalBytesRead = 0;
            while ((bytesRead = await streamReader.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                totalBytesRead += bytesRead;
                var found = false;
                var gotR = false;
                // only scan what this read actually filled, not stale data from an earlier read
                for (var j = 0; j < bytesRead; j++)
                {
                    if (buffer[j] == '\r')
                    {
                        gotR = true;
                        continue;
                    }
                    if (buffer[j] == '\n' && gotR)
                    {
                        // so we add the total bytes read, minus the current buffer amount read, then add how far into the buffer we actually read.
                        seekPosition += totalBytesRead - bytesRead + j;
                        endpoints.Add(seekPosition);
                        found = true;
                        break;
                    }
                    // the '\r' was not immediately followed by '\n', so forget it
                    gotR = false;
                    // if we have found new line then move the position to
                }
                if (found) break;
            }
        }
        currentOffset = seekPosition;
    }
}
return endpoints;
Note the new part, rather than doing this twice:
fileStream.Seek(seekPosition, SeekOrigin.Current);
I now use SeekOrigin.Begin and use the StreamReader to progress the underlying base stream:
// this is poor on performance, hence why you split the file here and read in new threads.
streamReader.DiscardBufferedData();
// you have to advance the fileStream here, because of the previous line
streamReader.BaseStream.Seek(seekPosition, SeekOrigin.Begin);
Calling DiscardBufferedData means that I'm always working from the underlying stream position.
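For comparison, here is a sketch of a different technique, not the code above: finding line-aligned split points by scanning raw bytes on the FileStream, with no StreamReader involved, so character counts and byte counts never get mixed. It assumes UTF-8 (or another encoding in which the byte 0x0A only ever means '\n'). FindLineAlignedEndpoints and the demo file are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Returns byte offsets, each just after a '\n', near the boundaries of `parts` equal chunks.
static List<long> FindLineAlignedEndpoints(string filePath, int parts)
{
    var endpoints = new List<long>();
    using (var fs = File.OpenRead(filePath))
    {
        long chunk = fs.Length / parts;
        for (int i = 1; i < parts; i++)
        {
            long target = chunk * i;
            // Skip a boundary that an earlier, longer line has already passed.
            if (endpoints.Count > 0 && target < endpoints[endpoints.Count - 1]) continue;

            fs.Seek(target, SeekOrigin.Begin);
            int b;
            while ((b = fs.ReadByte()) != -1 && b != '\n') { }
            if (b == -1) break;          // no further line break before end of file
            endpoints.Add(fs.Position);  // position is now just past the '\n'
        }
    }
    return endpoints;
}

string demoPath = Path.Combine(Path.GetTempPath(), "endpoints-demo.txt");
File.WriteAllText(demoPath, "aa\nbbbb\ncc\n");
List<long> demoEndpoints = FindLineAlignedEndpoints(demoPath, 2);
Console.WriteLine(string.Join(", ", demoEndpoints));
```

Each worker can then open its own stream, seek to its start offset, and wrap a fresh StreamReader around it.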
Why is StreamReader and sr.BaseStream.Seek() giving Junk Characters even in UTF8 Encoding
It is exactly because of UTF-8 that sr.BaseStream is giving junk characters. :)
StreamReader is a relatively "smarter" reader. It understands how strings work, whereas FileStream (i.e. sr.BaseStream) doesn't. FileStream only knows about bytes.
Since your file is encoded in UTF-8 (a variable-length encoding), letters like A, B and C are encoded with 1 byte each, but the • character needs 3 bytes. You can get how many bytes a character needs by doing:
Console.WriteLine(Encoding.UTF8.GetByteCount("•"));
So when you move the stream to "the position just after •", you haven't actually moved past the •; you are just on the second byte of it.
The reason why the Lengths are different is similar: StreamReader gives you the number of characters, whereas sr.BaseStream gives you the number of bytes.
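A small sketch of both effects, using an in-memory stream in place of your file (the sample text "A•B" and the ReadFrom helper are made up for illustration):

```csharp
using System;
using System.IO;
using System.Text;

// Reads everything from the given byte offset to the end as UTF-8 text.
static string ReadFrom(byte[] data, long byteOffset)
{
    using (var ms = new MemoryStream(data))
    {
        ms.Seek(byteOffset, SeekOrigin.Begin);
        using (var sr = new StreamReader(ms, Encoding.UTF8))
        {
            return sr.ReadToEnd();
        }
    }
}

string text = "A•B";
byte[] bytes = Encoding.UTF8.GetBytes(text);   // 41 E2 80 A2 42
Console.WriteLine($"chars: {text.Length}, bytes: {bytes.Length}");

// Offset 2 is a character index, but as a byte offset it lands inside "•".
string fromMiddle = ReadFrom(bytes, 2);

// The byte offset just after "•" is 1 (for "A") plus the bullet's byte count.
string fromBoundary = ReadFrom(bytes, 1 + Encoding.UTF8.GetByteCount("•"));

Console.WriteLine(fromMiddle);
Console.WriteLine(fromBoundary);
```

Seeking into the middle of the bullet yields replacement characters (U+FFFD) before the "B", while seeking to the computed byte boundary yields a clean "B".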
Tracking the position of the line of a streamreader
You can do this one of three ways:
1) Write your own StreamReader. Here's a good place to start: How to know position(linenumber) of a streamreader in a textfile?
2) The StreamReader class has two important private fields, charPos and charLen, that are needed to locate the actual "read" position rather than just the position of the underlying stream. You could use reflection to get their values. Keep in mind that these are implementation details, so the field names can differ between runtime versions:
Int32 charpos = (Int32) s.GetType().InvokeMember("charPos",
    BindingFlags.DeclaredOnly |
    BindingFlags.Public | BindingFlags.NonPublic |
    BindingFlags.Instance | BindingFlags.GetField,
    null, s, null);
Int32 charlen = (Int32) s.GetType().InvokeMember("charLen",
    BindingFlags.DeclaredOnly |
    BindingFlags.Public | BindingFlags.NonPublic |
    BindingFlags.Instance | BindingFlags.GetField,
    null, s, null);
return (Int32)s.BaseStream.Position - charlen + charpos;
3) Simply read the entire file into a string array. Something like this:
char[] CRLF = new char[2] { '\n', '\r' };
TextReader tr = File.OpenText("some path to file");
string[] fileLines = tr.ReadToEnd().Split(CRLF);
Another possibility (along the same lines as #3) is to read in the lines and store each line in an array. When you want to read the prior line, just use the array.
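A lighter variation on the same idea, sketched here as an assumption rather than something from the answers above, is to store only the byte offset at which each line starts, and seek to that offset when you need a line again. IndexLineStarts is a hypothetical helper, and the scan assumes an encoding such as UTF-8 in which the byte 0x0A only ever means '\n'.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Scans the stream once and records the byte offset where each line begins.
// If the data ends with '\n', the last entry points at the end of the stream.
static List<long> IndexLineStarts(Stream stream)
{
    var starts = new List<long> { 0 };
    stream.Seek(0, SeekOrigin.Begin);
    long position = 0;
    int b;
    while ((b = stream.ReadByte()) != -1)
    {
        position++;
        if (b == '\n') starts.Add(position);
    }
    return starts;
}

var data = new MemoryStream(Encoding.UTF8.GetBytes("one\ntwo\nthree"));
List<long> lineStarts = IndexLineStarts(data);

// Jump straight to the second line (index 1) and read it.
data.Seek(lineStarts[1], SeekOrigin.Begin);
string secondLine;
using (var reader = new StreamReader(data, Encoding.UTF8))
{
    secondLine = reader.ReadLine();
}

Console.WriteLine(secondLine);
```

Going back one line is then just a seek to lineStarts[n - 1] with a fresh reader, or with DiscardBufferedData on an existing one.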
C# StreamReader, Seek backwards for one line from middle of file
You can use the File.ReadAllLines function; it is the fastest way I know to read a text file.
For example, I have a text file like below.
aaa123
aaa456
aaa789
bbb123
bbb456
bbb789
I search each line of the text file for "a789". When I find it, I expect to get the data of the previous line, which in this case is "aaa456".
string[] lines = File.ReadAllLines(@"C:\Users\Binh\Desktop\C#\test\test.txt");
string data = "";
// Start at 1: the first line has no previous line, so lines[i - 1] would throw for i == 0.
for (int i = 1; i < lines.Length; i++)
{
    if (lines[i].Contains("a789"))
    {
        data = lines[i - 1]; // Read backward for one line
    }
}
Console.WriteLine(data); // Prints "aaa456"
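If the file is large, a variation that avoids loading every line is to stream it with File.ReadLines and remember only the previous line. This is a sketch with a made-up helper name and temp file. Unlike the loop above, it stops at the first match rather than keeping the last one.

```csharp
using System;
using System.IO;

// Returns the line that precedes the first line containing `marker`, or null if there is none.
static string FindLineBefore(string filePath, string marker)
{
    string previous = null;
    foreach (string line in File.ReadLines(filePath))   // reads lazily, one line at a time
    {
        if (line.Contains(marker))
            return previous;
        previous = line;
    }
    return null;
}

string samplePath = Path.Combine(Path.GetTempPath(), "previous-line-demo.txt");
File.WriteAllLines(samplePath, new[] { "aaa123", "aaa456", "aaa789", "bbb123", "bbb456", "bbb789" });

string before = FindLineBefore(samplePath, "a789");
Console.WriteLine(before);
```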
Seeking for a line in a text file
I have found a solution: the line I have appended to the file will always be the last line in the file, so I created a method to read the last line. See below:
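public string ReadLastLine(string path)
{
    string returnValue = "";
    using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
    {
        // Walk backwards from the end until the text read contains a line break.
        for (long pos = Math.Max(fs.Length - 2, 0); pos >= 0; --pos)
        {
            fs.Seek(pos, SeekOrigin.Begin);
            // A fresh reader per attempt starts with an empty buffer.
            // It is deliberately not disposed, because that would close fs.
            StreamReader ts = new StreamReader(fs);
            // Drop the file's own trailing line break so it is not mistaken for the separator.
            returnValue = ts.ReadToEnd().TrimEnd('\r', '\n');
            int eol = returnValue.LastIndexOf('\n');
            if (eol >= 0)
            {
                return returnValue.Substring(eol + 1);
            }
        }
    }
    // No line break found: the whole file is a single line.
    return returnValue;
}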
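Re-reading to the end of the file for every position gets slow when the last line is long. A possible alternative, sketched under assumptions, is to read one chunk from the tail and search it for the last line break. It assumes UTF-8 without a BOM and a last line that fits inside the chunk; ReadLastLineFromTail and the 4096-byte default are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

// Reads up to `tailBytes` from the end of the file and returns the text after the last line break.
static string ReadLastLineFromTail(string filePath, int tailBytes = 4096)
{
    using (var fs = File.OpenRead(filePath))
    {
        long start = Math.Max(0, fs.Length - tailBytes);
        fs.Seek(start, SeekOrigin.Begin);

        var buffer = new byte[(int)(fs.Length - start)];
        int read = 0;
        while (read < buffer.Length)
        {
            int n = fs.Read(buffer, read, buffer.Length - read);
            if (n == 0) break;
            read += n;
        }

        // A multi-byte character cut at the start of the chunk only garbles text
        // before the last line break, which is discarded anyway.
        string tail = Encoding.UTF8.GetString(buffer, 0, read).TrimEnd('\r', '\n');
        int eol = tail.LastIndexOf('\n');
        return eol >= 0 ? tail.Substring(eol + 1) : tail;
    }
}

string logPath = Path.Combine(Path.GetTempPath(), "last-line-demo.txt");
File.WriteAllText(logPath, "first\nsecond\nlast\n");

string lastLine = ReadLastLineFromTail(logPath);
Console.WriteLine(lastLine);
```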
Reading a line from a streamreader without consuming?
The problem is the underlying stream may not even be seekable. If you take a look at the stream reader implementation it uses a buffer so it can implement TextReader.Peek() even if the stream is not seekable.
You could write a simple adapter that reads the next line and buffers it internally, something like this:
public class PeekableStreamReaderAdapter
{
    private readonly StreamReader Underlying;
    private readonly Queue<string> BufferedLines;

    public PeekableStreamReaderAdapter(StreamReader underlying)
    {
        Underlying = underlying;
        BufferedLines = new Queue<string>();
    }

    // Returns the next line without consuming it; repeated calls return the same line.
    public string PeekLine()
    {
        if (BufferedLines.Count > 0)
            return BufferedLines.Peek();
        string line = Underlying.ReadLine();
        if (line == null)
            return null;
        BufferedLines.Enqueue(line);
        return line;
    }

    // Returns a previously peeked line first, then falls back to the underlying reader.
    public string ReadLine()
    {
        if (BufferedLines.Count > 0)
            return BufferedLines.Dequeue();
        return Underlying.ReadLine();
    }
}
Return StreamReader to Beginning
You need to seek on the stream, like you did, then call DiscardBufferedData on the StreamReader.
Edit: Adding code example:
Stream s = new MemoryStream();
StreamReader sr = new StreamReader(s);
// later... after we read stuff
s.Position = 0;
sr.DiscardBufferedData(); // reader now reading from position 0
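Here is a slightly fuller version of the same idea that can be run on its own, with made-up sample text, to check that the reader really starts over:

```csharp
using System;
using System.IO;
using System.Text;

var stream = new MemoryStream(Encoding.UTF8.GetBytes("hello\nworld"));
var reader = new StreamReader(stream);

string firstPass = reader.ReadLine();   // the reader has now buffered data past this line

// Rewind the stream, then throw away what the reader buffered,
// otherwise the next read would continue from the stale buffer.
stream.Position = 0;
reader.DiscardBufferedData();

string secondPass = reader.ReadLine();
Console.WriteLine($"{firstPass} / {secondPass}");

reader.Dispose();   // also disposes the underlying stream
```

Both reads return the first line, which shows the reader picked up from position 0 again.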