How to read file binary in C#?
Quick and dirty version:
byte[] fileBytes = File.ReadAllBytes(inputFilename);
StringBuilder sb = new StringBuilder();
foreach(byte b in fileBytes)
{
    // Convert each byte to its 8-digit binary representation, e.g. 5 -> "00000101"
    sb.Append(Convert.ToString(b, 2).PadLeft(8, '0'));
}
File.WriteAllText(outputFilename, sb.ToString());
Binary file read in c#
Your problem is almost certainly the use of the File.ReadAllLines()
method, which only works on text files, not binary files.
Try using ReadAllBytes() instead.
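To illustrate the difference, here is a minimal sketch. The class name ReadAllBytesDemo, the sample bytes and the temporary file are hypothetical and not from the original question. It writes a few non-text bytes to a file and reads them back with both methods:

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class ReadAllBytesDemo
{
    // Writes a few non-text bytes to a temporary file and reads them back both ways.
    // Returns whether each approach reproduced the original bytes.
    public static (bool bytesIntact, bool linesIntact) Compare()
    {
        byte[] original = { 0x00, 0xFF, 0x0A, 0x80, 0x0D, 0x0A, 0x7F };
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, original);

            // ReadAllBytes returns the file content unchanged.
            byte[] asBytes = File.ReadAllBytes(path);

            // ReadAllLines decodes the bytes as text and splits on line breaks,
            // so invalid sequences and the line-break bytes themselves are lost.
            string[] asLines = File.ReadAllLines(path);
            byte[] reencoded = Encoding.UTF8.GetBytes(string.Join("\n", asLines));

            return (original.SequenceEqual(asBytes), original.SequenceEqual(reencoded));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Calling ReadAllBytesDemo.Compare() should report that the byte array survives the round trip while the text-based read does not.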
Reading from a file mixed with text and binary data in c#
Use BinaryReader instead of StreamReader.
- You can use BinaryReader to read text just like StreamReader too. The only catch is that you'll need to bring your own ReadLine as an extension method, but here's an example below.
- It is technically possible to use both StreamReader and BinaryReader on the same Stream concurrently, but you need to be familiar with the internals of both and with how their read-buffer and stream-reading behaviour works, so I don't recommend this approach at all.
Use the BinaryReaderExtensions below to get ReadLine, and switch to binary methods when you get to the binary part of the file:
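using System;
using System.IO;
using System.Text;

public static class BinaryReaderExtensions
{
    // Reads characters up to the next line break ("\n", "\r" or "\r\n").
    // Returns null when the reader is already at the end of the stream.
    public static String ReadLine( this BinaryReader reader )
    {
        if( reader is null ) throw new ArgumentNullException(nameof(reader));
        if( reader.IsEndOfStream() ) return null;

        StringBuilder sb = new StringBuilder();
        while( ReadChar( reader, out Char c ) )
        {
            if( c == '\r' || c == '\n' )
            {
                // Treat "\r\n" as one line break, otherwise the next call returns an empty line.
                if( c == '\r' && !reader.IsEndOfStream() && reader.PeekChar() == '\n' ) reader.ReadChar();
                return sb.ToString();
            }
            else
            {
                sb.Append( c );
            }
        }

        if( sb.Length > 0 ) return sb.ToString();
        return null;
    }

    private static Boolean ReadChar( BinaryReader reader, out Char c )
    {
        c = default; // an out parameter must be assigned on every return path
        if( reader.IsEndOfStream() ) return false;
        c = reader.ReadChar();
        return true;
    }

    // Note: Position and Length require a seekable stream (e.g. FileStream, MemoryStream).
    public static Boolean IsEndOfStream(this BinaryReader reader)
    {
        return reader.BaseStream.Position == reader.BaseStream.Length;
    }
}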
Example:
// Requires "using System.Collections.Generic;" (for List<T>) in addition to the usings above.
using( FileStream fs = new FileStream( "file.dat", FileMode.Open, FileAccess.Read ) )
using( BinaryReader rdr = new BinaryReader( fs, Encoding.UTF8 ) )
{
    // I assume the first 5 lines are text:
    List<String> linesOfText = new List<String>();
    for( Int32 i = 0; i < 5; i++ )
    {
        String line = rdr.ReadLine();
        if( line is null ) throw new InvalidOperationException( "Encountered premature EOF in text section." );
        linesOfText.Add( line );
    }

    // And after the 5th line it's a 512 byte blob of binary data (for example):
    Byte[] buffer = new Byte[ 512 ];
    Int32 bytesRead = rdr.Read( buffer, index: 0, count: 512 );
    if( bytesRead != buffer.Length ) throw new InvalidOperationException( "Encountered premature EOF (in binary section)." );
}
.net: efficient way to read a binary file into memory then access
To get the initial MemoryStream from reading the file, the following works:
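byte[] bytes;
try
{
    // File.ReadAllBytes opens a FileStream and then ensures it is closed
    bytes = File.ReadAllBytes(_fi.FullName);
    // writable: false, publiclyVisible: true (needed to call GetBuffer() later)
    _ms = new MemoryStream(bytes, 0, bytes.Length, false, true);
}
catch (IOException)
{
    // "throw;" rethrows without resetting the stack trace (unlike "throw e;")
    throw;
}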
File.ReadAllBytes() copies the file content into memory. It uses using, which means that it ensures the file gets closed, so no finally block is needed.
I can read individual values from the MemoryStream using MemoryStream.Read. These calls involve copies of those values, which is fine.
In one situation, I needed to read a table out of the file, change a value, and then calculate a checksum of the entire file with that change in place. Instead of copying the entire file so that I could edit one part, I was able to calculate the checksum in progressive steps: first on the initial, unchanged segment of the file, then continue with the middle segment that was changed, then continue with the remainder.
For this I could process the first and final segments using the MemoryStream. This involved lots of reads, with each read copying; but those copies were transient variables, so no significant working set increase.
The middle segment did need to be copied, since it had to be changed while the original version was kept intact. The following worked:
// get ref (not copy!) to the byte array underlying the MemoryStream
byte[] fileData = _ms.GetBuffer();
// determine the required length
int length = _tableRecord.Length;
// create array to hold the copy
byte[] segmentCopy = new byte[length];
// get the copy
Array.ConstrainedCopy(fileData, _tableRecord.Offset, segmentCopy, 0, length);
After modifying values in segmentCopy, I then needed to pass this to my static method for calculating checksums, which expected a MemoryStream (for sequential reading). This worked:
// new MemoryStream will hold a ref to the segmentCopy array (no new copy!)
MemoryStream ms = new MemoryStream(segmentCopy, 0, segmentCopy.Length);
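The progressive checksum described above might be sketched as follows. The answer does not say which checksum algorithm was used, so this assumes a simple additive checksum (sum of all bytes, wrapping at 32 bits). The names SegmentedChecksum, Accumulate and ChecksumWithReplacement are hypothetical, and for brevity the sketch works on byte arrays directly rather than on a MemoryStream:

```csharp
using System;

public static class SegmentedChecksum
{
    // Hypothetical checksum: sum of all bytes, wrapping at 32 bits.
    // The running value is passed in, so segments can be processed one after another.
    public static uint Accumulate(uint running, byte[] data, int offset, int count)
    {
        for (int i = offset; i < offset + count; i++)
        {
            running = unchecked(running + data[i]);
        }
        return running;
    }

    // Checksum of `file` as if the bytes starting at `tableOffset` had been replaced
    // by `modifiedTable`, without copying or changing `file` itself.
    public static uint ChecksumWithReplacement(byte[] file, int tableOffset, byte[] modifiedTable)
    {
        int tailOffset = tableOffset + modifiedTable.Length;
        uint sum = 0;
        sum = Accumulate(sum, file, 0, tableOffset);                        // unchanged head
        sum = Accumulate(sum, modifiedTable, 0, modifiedTable.Length);      // changed middle
        sum = Accumulate(sum, file, tailOffset, file.Length - tailOffset);  // unchanged tail
        return sum;
    }
}
```

The same three-step pattern should carry over to any checksum whose state can be fed incrementally.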
What I haven't needed to do yet, but will want to do, is to get a slice of the MemoryStream that doesn't involve copying. This works:
MemoryStream sliceFromMS = new MemoryStream(fileData, offset, length);
From above, fileData was a ref to the array underlying the original MemoryStream. Now sliceFromMS will have a ref to a segment within that same array.
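A small sketch can make the sharing visible. The class name MemoryStreamSliceDemo and the sample values are hypothetical. A change made to the underlying array after the slice is created shows up when reading through the slice:

```csharp
using System;
using System.IO;

public static class MemoryStreamSliceDemo
{
    // Returns the first byte read from a slice after the shared array has been modified.
    public static int ReadThroughSlice()
    {
        byte[] fileData = { 10, 20, 30, 40, 50 };

        // publiclyVisible: true is what allows GetBuffer() on a stream built over an existing array.
        var whole = new MemoryStream(fileData, 0, fileData.Length, writable: false, publiclyVisible: true);
        byte[] underlying = whole.GetBuffer();   // same array instance as fileData, not a copy

        // The slice covers indexes 2..3 of the same array; no bytes are copied.
        var slice = new MemoryStream(underlying, 2, 2);

        underlying[2] = 99;                      // change made after the slice was created
        return slice.ReadByte();                 // reads the modified value from the shared array
    }
}
```

Because the slice and the original stream share one array, any code that writes to that array affects both.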
How to import and read large binary file data in c#?
This will very much depend on what format the file is in. Each byte in the file might represent different things, or it might just represent values from a large array, or some mix of the two.
You need to know what the format looks like to be able to read it, since binary files are not self-descriptive. Reading a simple object might look like
var authorName = binReader.ReadString();
var publishDate = DateTime.FromBinary(binReader.ReadInt64());
...
If you have a list of items it is common to use a length prefix. Something like
var numItems = binReader.ReadInt32();
for(int i = 0; i < numItems; i++){
    var title = binReader.ReadString();
    ...
}
You would then typically create one or more objects from the data that can be used in the rest of the application. For example:
new Bibliography(authorName, publishDate, books);
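Put together, a round trip for such a format might look like the sketch below. The layout (author string, publish date as Int64, Int32 count, then that many title strings) and the shape of the Bibliography class are assumptions made for illustration. A real file needs to follow its own specification:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical data class; only the constructor call appears in the text above.
public sealed class Bibliography
{
    public string AuthorName { get; }
    public DateTime PublishDate { get; }
    public IReadOnlyList<string> Books { get; }

    public Bibliography(string authorName, DateTime publishDate, IReadOnlyList<string> books)
    {
        AuthorName = authorName;
        PublishDate = publishDate;
        Books = books;
    }

    // Writes the fields in a fixed order, with a length prefix before the list.
    public void Write(Stream output)
    {
        using (var writer = new BinaryWriter(output, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(AuthorName);
            writer.Write(PublishDate.ToBinary());
            writer.Write(Books.Count);
            foreach (string title in Books) writer.Write(title);
        }
    }

    // Reads the fields back in exactly the order they were written.
    public static Bibliography Read(Stream input)
    {
        using (var binReader = new BinaryReader(input, Encoding.UTF8, leaveOpen: true))
        {
            string authorName = binReader.ReadString();
            DateTime publishDate = DateTime.FromBinary(binReader.ReadInt64());
            int numItems = binReader.ReadInt32();
            var books = new List<string>();
            for (int i = 0; i < numItems; i++) books.Add(binReader.ReadString());
            return new Bibliography(authorName, publishDate, books);
        }
    }
}
```

The reader only works because it mirrors the writer field by field, which is the practical meaning of "binary files are not self-descriptive".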
If this is a format you do not control, I hope you have a detailed specification. Otherwise this is kind of a lost cause for anything but the kludgiest solutions.
If there is more data than can fit in memory, you need some kind of streaming mechanism. That is: read one item, do some processing of the item, save the result, read the next item, and so on.
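One way to sketch such a streaming reader in C# is an iterator that yields one item at a time, so only the current item is held in memory. The class name StreamingReader and the assumed layout (an Int32 count followed by that many length-prefixed strings) are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class StreamingReader
{
    // Lazily yields one title at a time instead of loading the whole list.
    // Assumes an Int32 item count followed by that many BinaryWriter-style strings.
    public static IEnumerable<string> ReadTitles(Stream input)
    {
        using (var binReader = new BinaryReader(input, Encoding.UTF8, leaveOpen: true))
        {
            int numItems = binReader.ReadInt32();
            for (int i = 0; i < numItems; i++)
            {
                // Execution pauses here until the caller asks for the next item.
                yield return binReader.ReadString();
            }
        }
    }
}
```

A caller would typically consume this with foreach, processing and saving each item before the next one is read from the stream.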
If you do control the format, I would suggest alternatives that are easier to manage. I have used protobuf-net and find it quite easy to use, but there are other alternatives. The common way to use these kinds of libraries is to create a class for the data and add attributes to the fields that should be stored. The library can then manage serialization and deserialization automatically, and usually handles things like inheritance and changes to the format in an easy way.
C# loading binary files
1: For very small files, File.ReadAllBytes will be fine.
2: For very big files, if you are using .NET 4.0 or later, you can make use of memory-mapped files.
3: If you are not using .NET 4.0, reading the data in chunks is a good choice.
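Options 2 and 3 might be sketched as follows. The class name LargeFileAccess, the method names and the default chunk size are hypothetical. The byte sum is only a stand-in for whatever processing is needed:

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

public static class LargeFileAccess
{
    // Option 3: read the file in fixed-size chunks, so only one chunk is in memory at a time.
    public static long SumBytesChunked(string path, int chunkSize = 64 * 1024)
    {
        long sum = 0;
        byte[] buffer = new byte[chunkSize];
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int read;
            // Read returns the number of bytes actually read; 0 means end of file.
            while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++) sum += buffer[i];
            }
        }
        return sum;
    }

    // Option 2: map the file (read-only) and read one byte at an arbitrary offset
    // without loading the rest of the file. The file must not be empty.
    public static byte ReadByteAt(string path, long offset)
    {
        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var accessor = mmf.CreateViewAccessor(offset, 1, MemoryMappedFileAccess.Read))
        {
            return accessor.ReadByte(0);
        }
    }
}
```

Chunked reading suits sequential processing, while a memory-mapped view is more convenient for random access into a large file.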