How to Read File Binary in C#

How to read file binary in C#?

Quick and dirty version:

byte[] fileBytes = File.ReadAllBytes(inputFilename);
StringBuilder sb = new StringBuilder();

foreach (byte b in fileBytes)
{
    // Render each byte as eight binary digits, e.g. 5 -> "00000101"
    sb.Append(Convert.ToString(b, 2).PadLeft(8, '0'));
}

File.WriteAllText(outputFilename, sb.ToString());

Binary file read in c#

Your problem is almost certainly the use of the File.ReadAllLines() method, which only works on text files, not binary files.

Try using ReadAllBytes() instead.
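
As a rough illustration of why ReadAllBytes() is the safer choice, the sketch below writes a few arbitrary bytes (including line-break bytes and a value that is not valid text) to a temporary file and checks that they come back unchanged. The class name, method name and sample bytes are made up for this example.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ReadAllBytesDemo
{
    // Writes sample bytes to a temp file and reads them back with File.ReadAllBytes.
    // Returns true when the bytes read are identical to the bytes written.
    public static bool RoundTrips()
    {
        byte[] original = { 0x00, 0x0D, 0x0A, 0xFF, 0x41 };
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, original);
            byte[] readBack = File.ReadAllBytes(path);
            return original.SequenceEqual(readBack);
        }
        finally
        {
            File.Delete(path); // clean up the temp file
        }
    }
}
```

ReadAllLines() would instead decode the bytes as text and split them at the 0x0D/0x0A pair, so the original byte sequence could not be recovered reliably.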

Reading from a file mixed with text and binary data in c#

  • Use BinaryReader instead of StreamReader.

    • You can use BinaryReader to read text just like StreamReader too - the only catch is that you'll need to bring your own ReadLine as an extension-method, but here's an example below.
  • It is technically possible to use both StreamReader and BinaryReader on the same Stream concurrently - but you need to be familiar with the internals of both and how their read-buffer and stream-reading behaviour works. So I don't recommend using this approach at all.

  • Use the BinaryReaderExtensions below to have ReadLine, and switch to binary methods when you get to the binary part of the file:

using System;
using System.IO;
using System.Text;

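public static class BinaryReaderExtensions
{
    // Reads characters up to the next line break; returns null at end of stream.
    public static String ReadLine( this BinaryReader reader )
    {
        if( reader is null ) throw new ArgumentNullException(nameof(reader));
        if( reader.IsEndOfStream() ) return null;

        StringBuilder sb = new StringBuilder();

        while( ReadChar( reader, out Char c ) )
        {
            if( c == '\r' || c == '\n' )
            {
                // Treat "\r\n" as one line break so it doesn't produce an extra empty line.
                if( c == '\r' && reader.PeekChar() == '\n' ) reader.ReadChar();
                return sb.ToString();
            }
            else
            {
                sb.Append( c );
            }
        }

        // End of stream reached without a trailing line break.
        if( sb.Length > 0 ) return sb.ToString();

        return null;
    }

    private static Boolean ReadChar( BinaryReader reader, out Char c )
    {
        c = '\0'; // an out parameter must be assigned on every return path
        if( reader.IsEndOfStream() ) return false;
        c = reader.ReadChar();
        return true;
    }

    // Note: Position and Length require a seekable stream (e.g. FileStream, MemoryStream).
    public static Boolean IsEndOfStream(this BinaryReader reader)
    {
        return reader.BaseStream.Position == reader.BaseStream.Length;
    }
}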

Example:

// Requires: using System.Collections.Generic; (for List<T>)
using( FileStream fs = new FileStream( "file.dat", FileMode.Open, FileAccess.Read ) )
using( BinaryReader rdr = new BinaryReader( fs, Encoding.UTF8 ) )
{
    // I assume the first 5 lines are text:
    List<String> linesOfText = new List<String>();
    for( Int32 i = 0; i < 5; i++ )
    {
        String line = rdr.ReadLine();
        if( line is null ) throw new InvalidOperationException( "Encountered premature EOF in text section." );
        linesOfText.Add( line );
    }

    // And after the 5th line it's a 512 byte blob of binary data (for example):
    Byte[] buffer = new Byte[ 512 ];
    Int32 bytesRead = rdr.Read( buffer, index: 0, count: 512 );
    if( bytesRead != buffer.Length ) throw new InvalidOperationException( "Encountered premature EOF (in binary section)." );
}

.net: efficient way to read a binary file into memory then access

To get the initial MemoryStream from reading the file, the following works:

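byte[] bytes;
try
{
    // File.ReadAllBytes opens a filestream and then ensures it is closed
    bytes = File.ReadAllBytes(_fi.FullName);
    // writable: false, publiclyVisible: true (the latter is required for GetBuffer() later)
    _ms = new MemoryStream(bytes, 0, bytes.Length, false, true);
}
catch (IOException)
{
    // Rethrow with "throw;" rather than "throw e;" so the original stack trace is preserved
    throw;
}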

File.ReadAllBytes() copies the file's contents into memory. Internally it opens the file in a using block, so the file is closed even if an exception is thrown, and no finally block is needed.

I can read individual values from the MemoryStream using MemoryStream.Read. These calls involve copies of those values, which is fine.
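
A minimal sketch of such a read, assuming the value of interest is a little-endian 32-bit integer at a known offset. The class and method names are hypothetical; BinaryReader is used here as a convenience on top of MemoryStream.Read.

```csharp
using System;
using System.IO;

public static class MemoryStreamReadDemo
{
    // Reads one Int32 (little-endian, as BinaryReader always does) at the given offset.
    public static int ReadInt32At(byte[] bytes, long offset)
    {
        // Same constructor shape as above: writable: false, publiclyVisible: true.
        using (var ms = new MemoryStream(bytes, 0, bytes.Length, false, true))
        using (var reader = new BinaryReader(ms))
        {
            ms.Position = offset;      // seek to the value
            return reader.ReadInt32(); // copies 4 bytes out into an int
        }
    }
}
```

Only the four bytes of the value are copied; the MemoryStream itself wraps the existing array rather than duplicating it.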

In one situation, I needed to read a table out of the file, change a value, and then calculate a checksum of the entire file with that change in place. Instead of copying the entire file so that I could edit one part, I was able to calculate the checksum in progressive steps: first on the initial, unchanged segment of the file, then continue with the middle segment that was changed, then continue with the remainder.

For this I could process the first and final segments using the MemoryStream. This involved lots of reads, with each read copying; but those copies were transient variables, so no significant working set increase.

For the middle segment, that needed to be copied since it had to be changed (but the original version needed to be kept intact). The following worked:

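// get ref (not copy!) to the byte array underlying the MemoryStream
// (GetBuffer() throws UnauthorizedAccessException unless the stream was created with publiclyVisible: true)
byte[] fileData = _ms.GetBuffer();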

// determine the required length
int length = _tableRecord.Length;

// create array to hold the copy
byte[] segmentCopy = new byte[length];

// get the copy
Array.ConstrainedCopy(fileData, _tableRecord.Offset, segmentCopy, 0, length);

After modifying values in segmentCopy, I then needed to pass this to my static method for calculating checksums, which expected a MemoryStream (for sequential reading). This worked:

// new MemoryStream will hold a ref to the segmentCopy array (no new copy!)
MemoryStream ms = new MemoryStream(segmentCopy, 0, segmentCopy.Length);

What I haven't needed to do yet, but will want to do, is to get a slice of the MemoryStream that doesn't involve copying. This works:

MemoryStream sliceFromMS = new MemoryStream(fileData, offset, length);

From above, fileData was a ref to the array underlying the original MemoryStream. Now sliceFromMS will have a ref to a segment within that same array.
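
To tie the pieces together, here is a rough sketch of the progressive checksum described earlier: head and tail are read through MemoryStream slices over the original array, and only the modified middle segment lives in its own array. The checksum itself (a running 32-bit sum of bytes) and all names are placeholders, since the real checksum algorithm isn't shown here.

```csharp
using System;
using System.IO;

public static class ProgressiveChecksum
{
    // Placeholder checksum step: adds every remaining byte of the stream to a running sum.
    public static uint Accumulate(uint running, Stream segment)
    {
        int b;
        while ((b = segment.ReadByte()) != -1)
        {
            running = unchecked(running + (uint)b);
        }
        return running;
    }

    // Checksums fileData as if the bytes starting at 'offset' had been replaced
    // by 'modified', without copying or altering fileData.
    public static uint ChecksumWithPatch(byte[] fileData, int offset, byte[] modified)
    {
        int tailStart = offset + modified.Length;
        uint sum = 0;
        // Each MemoryStream wraps an existing array; writable: false keeps it read-only.
        using (var head = new MemoryStream(fileData, 0, offset, false))
        using (var middle = new MemoryStream(modified, 0, modified.Length, false))
        using (var tail = new MemoryStream(fileData, tailStart, fileData.Length - tailStart, false))
        {
            sum = Accumulate(sum, head);   // unchanged first segment
            sum = Accumulate(sum, middle); // modified copy of the table
            sum = Accumulate(sum, tail);   // unchanged remainder
        }
        return sum;
    }
}
```

The same shape should carry over to a real checksum or hash, provided it can be fed data incrementally.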

How to import and read large binary file data in c#?

This will very much depend on what format the file is in. Each byte in the file might represent different things, or it might just represent values from a large array, or some mix of the two.

You need to know what the format looks like to be able to read it, since binary files are not self-descriptive. Reading a simple object might look like

var authorName = binReader.ReadString();
var publishDate = DateTime.FromBinary(binReader.ReadInt64());
...

If you have a list of items it is common to use a length prefix. Something like

var numItems = binReader.ReadInt32();
for (int i = 0; i < numItems; i++)
{
    var title = binReader.ReadString();
    ...
}

You would then typically create one or more objects from the data that can be used in the rest of the application. For example:

new Bibliography(authorName, publishDate, books);

If this is a format you do not control, I hope you have a detailed specification. Otherwise this is kind of a lost cause for anything but the kludgiest solutions.

If there is more data than can fit in memory, you need some kind of streaming mechanism: read one item, process it, save the result, read the next item, and so on.
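
One way to stream items, sketched under the assumption that the file is simply an Int32 count followed by that many BinaryWriter-style strings. The format and the TitleStream name are invented for illustration; a real format would follow its own specification.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class TitleStream
{
    // Writes an Int32 count followed by that many length-prefixed strings.
    public static void WriteTitles(Stream output, IReadOnlyList<string> titles)
    {
        using (var writer = new BinaryWriter(output, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(titles.Count);
            foreach (string title in titles) writer.Write(title);
        }
    }

    // Lazily yields one title at a time, so only the current item is held in memory.
    public static IEnumerable<string> ReadTitles(Stream input)
    {
        using (var reader = new BinaryReader(input, Encoding.UTF8, leaveOpen: true))
        {
            int numItems = reader.ReadInt32();
            for (int i = 0; i < numItems; i++)
            {
                yield return reader.ReadString();
            }
        }
    }
}
```

A caller can then write foreach (var title in TitleStream.ReadTitles(stream)) and process each item as it arrives, instead of building the whole list first.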

If you do control the format, I would suggest alternatives that are easier to manage. I have used protobuf-net, and I find it quite easy to use, but there are other alternatives. The common way to use these kinds of libraries is to create a class for the data and add attributes to the fields that should be stored. The library can then manage serialization and deserialization automatically, and usually handles things like inheritance and changes to the format in an easy way.

C# loading binary files

1: For very small files, File.ReadAllBytes will be fine.

2: For very big files, if you are using .NET 4.0 or later, you can make use of memory-mapped files.

3: If you are not using .NET 4.0, then reading the file in chunks would be a good choice.
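
A minimal sketch of options 2 and 3, assuming a local file that already exists and is not empty. The class and method names are hypothetical, and the chunk callback is just one possible way to hand data to the caller.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

public static class LargeFileReaders
{
    // Option 2: map the file read-only and read one byte at an offset,
    // without loading the rest of the file into a managed array.
    public static byte ReadByteMapped(string path, long offset)
    {
        using (var mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var accessor = mmf.CreateViewAccessor(offset, 1, MemoryMappedFileAccess.Read))
        {
            return accessor.ReadByte(0); // position is relative to the start of the view
        }
    }

    // Option 3: read fixed-size chunks and pass each one to a callback.
    // Returns the total number of bytes read.
    public static long ReadInChunks(string path, int chunkSize, Action<byte[], int> onChunk)
    {
        long total = 0;
        byte[] buffer = new byte[chunkSize];
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int read;
            while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                onChunk(buffer, read); // only the first 'read' bytes are valid
                total += read;
            }
        }
        return total;
    }
}
```

The buffer is reused between chunks, so a callback that needs to keep data beyond its own call should copy it out.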


