C# - How to Save Byte Values to File With Smallest Size Possible

Get file's size from bytes array (without saving to disc)

Use array.Length. For a byte[] it is exactly the number of bytes the file would occupy on disk.

How do I get a human-readable file size in bytes abbreviation using .NET?

This may not be the most efficient or optimized way to do it, but it is easier to read if you are not familiar with log maths, and it should be fast enough for most scenarios.

string[] sizes = { "B", "KB", "MB", "GB", "TB" };
double len = new FileInfo(filename).Length;
int order = 0;
while (len >= 1024 && order < sizes.Length - 1) {
    order++;
    len = len / 1024;
}

// Adjust the format string to your preferences. For example "{0:0.#}{1}" would
// show a single decimal place, and no space.
string result = String.Format("{0:0.##} {1}", len, sizes[order]);
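For comparison, here is a minimal sketch of the "log maths" variant mentioned above. The class name FileSizeFormatter is made up for illustration, and the code assumes .NET Core 3.0 or later, where System.Numerics.BitOperations.Log2 is available. It uses an integer base-2 logarithm rather than Math.Log so that exact powers of 1024 are not affected by floating-point rounding.

```csharp
using System;
using System.Globalization;
using System.Numerics;

public static class FileSizeFormatter
{
    private static readonly string[] Sizes = { "B", "KB", "MB", "GB", "TB" };

    // Formats a byte count without a loop: floor(log2(bytes)) / 10 is the
    // index of the unit, because 1024 = 2^10.
    public static string Format(long bytes)
    {
        if (bytes < 0) throw new ArgumentOutOfRangeException(nameof(bytes));
        if (bytes == 0) return "0 B";

        // Clamp to the largest unit in the table (TB).
        int order = Math.Min(BitOperations.Log2((ulong)bytes) / 10, Sizes.Length - 1);
        double scaled = bytes / (double)(1L << (10 * order));

        // Invariant culture keeps "." as the decimal separator.
        return string.Format(CultureInfo.InvariantCulture, "{0:0.##} {1}", scaled, Sizes[order]);
    }
}
```

For example, FileSizeFormatter.Format(1536) gives "1.5 KB" and FileSizeFormatter.Format(1023) gives "1023 B".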

How to calculate the size of the file before saving it to the disk from the stream?

If you have a Stream object, you can use its Length property - which gives the length in bytes - but not all streams support it (for example, memory and file stream do, but network streams generally don't).

If you're starting from a byte array (byte[]), use its Length property to get the number of bytes.

If you're using a string, it depends on the encoding. For example, for UTF8, you can use

int byteCount = System.Text.Encoding.UTF8.GetByteCount(str);
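The three cases above can be sketched together as follows. The SizeProbe class and its method names are hypothetical helpers, not part of any library. The sketch relies on Stream.CanSeek, since Length throws NotSupportedException on streams that do not support seeking.

```csharp
using System.IO;
using System.Text;

public static class SizeProbe
{
    // Returns the stream length in bytes, or null when the stream
    // cannot report one (for example, a non-seekable network stream).
    public static long? TryGetLength(Stream stream)
    {
        return stream.CanSeek ? stream.Length : (long?)null;
    }

    // A byte array's size is simply its element count.
    public static long GetLength(byte[] data)
    {
        return data.Length;
    }

    // A string's size depends on the encoding used to write it.
    public static int GetLength(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }
}
```

Note that a non-ASCII character such as "é" counts as one char in the string but two bytes in UTF-8, which is why the string case needs GetByteCount rather than text.Length.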

Is it possible to reduce the size of an image that comes from a byte array?

Please understand that there is no free lunch. Decreasing the size of a JPEG image by increasing the compression will also decrease the quality of the image. However, that said, you can reduce the size of a JPEG image using the Image class. This code assumes that inputBytes contains the original image.

// Requires: System.Drawing, System.Drawing.Imaging, System.IO, System.Linq
var jpegQuality = 50L; // Encoder.Quality expects a long in the range 0-100
byte[] outputBytes;    // declared outside the using blocks so it stays in scope
using (var inputStream = new MemoryStream(inputBytes))
using (var image = Image.FromStream(inputStream)) {
    // Look up the JPEG encoder (not a decoder), since the image is being saved.
    var jpegEncoder = ImageCodecInfo.GetImageEncoders()
        .First(c => c.FormatID == ImageFormat.Jpeg.Guid);
    var encoderParameters = new EncoderParameters(1);
    encoderParameters.Param[0] = new EncoderParameter(Encoder.Quality, jpegQuality);
    using (var outputStream = new MemoryStream()) {
        image.Save(outputStream, jpegEncoder, encoderParameters);
        outputBytes = outputStream.ToArray();
    }
}

Now outputBytes contains a recompressed version of the image using a different JPEG quality.

By decreasing the jpegQuality (should be in the range 0-100) you can increase the compression at the cost of lower image quality. See the Encoder.Quality field for more information.

Here is an example where you can see how jpegQuality affects the image quality. It is the same photo compressed using 20, 50 and 80 as the value of jpegQuality. Sizes are 4.99, 8.28 and 12.9 KB.

[Image: JPEG quality sample]

Notice how the text becomes "smudged" even when the quality is high. This is why you should avoid using JPEG for images with uniformly colored areas (images/diagrams/charts created on a computer). Use PNG instead. For photos JPEG is very suitable if you do not lower the quality too much.

Minimise File Size for Decimal .csv file

I think that the question is legitimate, but the answer is that you impose logical conditions that leave no room for any solution.

If you could replace the CSV structure with a custom format, you could save something, but you need CSV, and that pretty much determines your solution. The only variable left is the text encoding, but a character in a text file cannot take fewer than 8 bits; other encodings, such as UTF-16 (16 bits per character), only make the file larger.

I won't comment on using compression as you already mentioned that you are looking for alternative answers and you are aware of that.
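To make the encoding point concrete, here is a small sketch that measures the same CSV line under two encodings. The CsvEncodingSize class and the sample line are made up for illustration.

```csharp
using System.Text;

public static class CsvEncodingSize
{
    // Number of bytes the given CSV text occupies in the given encoding
    // (without any byte-order mark).
    public static int Bytes(string csv, Encoding encoding)
    {
        return encoding.GetByteCount(csv);
    }
}
```

For the 15-character line "3.14,2.71,1.41\n", CsvEncodingSize.Bytes returns 15 with Encoding.UTF8 and 30 with Encoding.Unicode (UTF-16), so an 8-bit encoding is already the smallest choice for plain decimal text.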

Create batches in LINQ

An Enumerable.Chunk() extension method was added in .NET 6.0.

Example:

var list = new List<int> { 1, 2, 3, 4, 5, 6, 7 };

var chunks = list.Chunk(3);
// returns { { 1, 2, 3 }, { 4, 5, 6 }, { 7 } }

For those who cannot upgrade, the source is available on GitHub.
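If copying the framework source is more than you need, a simplified stand-in might look like the sketch below. The name Batch is hypothetical, chosen to avoid clashing with Chunk, and this is not the framework's implementation, only an approximation of its behavior for the common case.

```csharp
using System;
using System.Collections.Generic;

public static class BatchExtensions
{
    // Splits a sequence into arrays of at most `size` elements.
    // Arguments are validated eagerly; the batches themselves are lazy.
    public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        return BatchIterator(source, size);
    }

    private static IEnumerable<T[]> BatchIterator<T>(IEnumerable<T> source, int size)
    {
        var bucket = new List<T>(size);
        foreach (var item in source)
        {
            bucket.Add(item);
            if (bucket.Count == size)
            {
                yield return bucket.ToArray();
                bucket.Clear();
            }
        }
        // Emit the final, possibly shorter, batch.
        if (bucket.Count > 0) yield return bucket.ToArray();
    }
}
```

With the list from the example above, list.Batch(3) yields the same three groups: { 1, 2, 3 }, { 4, 5, 6 } and { 7 }.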

Why are the compressed bytes bigger than the original bytes?

Very much simplified, imagine that a ZIP file works like this:

  • It has an index where it says what filenames it contains and where we can find them
  • It compresses each file by saying how many times each byte is repeated

So, if you have a file layers.pic that contains: 0 0 0 0 0 0 0 0 50 50 50 50 50 50 50 50 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100, you could instead say: "layers.pic, right after index, 8x0, 8x50, 16x100" and it would be shorter. But imagine a file that only contains 0 17 39; then the "compression" would actually be twice as long as the file (1x0 1x17 1x39), and you'd still need to spend additional space telling the index what its original name is and where to find it. Even if we decided compression is not worth it and stored the file as-is in the archive, we'd still increase the file size because we need to put something in the index.

(ZIP archive is a bit more complicated than this; but the basic principles are quite close - including the option to not compress if the entry would end up larger.)
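The simplified "count, value" scheme described above can be sketched as a toy run-length encoder. This is only an illustration of the simplified model; real ZIP archives normally use DEFLATE, not this scheme. The ToyRle class is made up for the example.

```csharp
using System.Collections.Generic;

public static class ToyRle
{
    // Encodes the input as (count, value) byte pairs.
    // Runs longer than 255 are split so the count fits in one byte.
    public static byte[] Encode(byte[] data)
    {
        var output = new List<byte>();
        int i = 0;
        while (i < data.Length)
        {
            byte value = data[i];
            int run = 1;
            while (i + run < data.Length && data[i + run] == value && run < 255)
            {
                run++;
            }
            output.Add((byte)run);
            output.Add(value);
            i += run;
        }
        return output.ToArray();
    }
}
```

Encoding the 32-byte layers.pic content gives 6 bytes (8, 0, 8, 50, 16, 100), while encoding the 3-byte input 0 17 39 also gives 6 bytes (1, 0, 1, 17, 1, 39), which is twice the original size.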

EDIT: If you check the Wikipedia page on the ZIP format, you will find that each file entry has a local header of at least 30 bytes plus the file name length; the central directory repeats that information in a slightly expanded form (at least 46 bytes plus the file name length); then there is the end-of-central-directory record (EOCD), which is at least 22 bytes. Your file is named test.txt, which is 8 bytes, so the metadata alone occupies at least (30+8) + (46+8) + 22 = 114 bytes, before the compressed data itself (which consequently takes up at most 33 bytes).
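You can observe this overhead directly by zipping a tiny payload in memory and comparing sizes. The sketch below uses System.IO.Compression.ZipArchive; the ZipOverhead class is a hypothetical helper, and the exact output size may vary between runtime versions, so no specific number is claimed here.

```csharp
using System.IO;
using System.IO.Compression;

public static class ZipOverhead
{
    // Returns the total size in bytes of a ZIP archive that contains
    // a single entry with the given name and content.
    public static long ZippedSize(string entryName, byte[] content)
    {
        using (var buffer = new MemoryStream())
        {
            // leaveOpen keeps the buffer usable after the archive is disposed;
            // disposing the archive is what writes the central directory.
            using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
            {
                ZipArchiveEntry entry = archive.CreateEntry(entryName);
                using (Stream entryStream = entry.Open())
                {
                    entryStream.Write(content, 0, content.Length);
                }
            }
            return buffer.Length;
        }
    }
}
```

Calling ZipOverhead.ZippedSize("test.txt", new byte[] { 0, 17, 39 }) returns a value far larger than 3, because the entry headers and the end-of-archive record dominate for such a small input.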


