File I/O with Streams - Best Memory Buffer Size

Files are already buffered by the file system cache. You just need to pick a buffer size that doesn't force FileStream to make the native Windows ReadFile() API call to fill the buffer too often. Don't go below a kilobyte; more than 16 KB is a waste of memory and unfriendly to the CPU's L1 cache (typically 16 or 32 KB of data).

4 KB is a traditional choice, even though it spans a virtual memory page exactly only by accident. It is difficult to profile; you'll end up measuring how long it takes to read a cached file, which runs at RAM speeds (5 gigabytes/sec and up if the data is available in the cache). The data will be in the cache the second time you run your test, and that won't happen in a production environment too often. File I/O is completely dominated by the disk drive or the NIC and is glacially slow; copying the data is peanuts. 4 KB will work fine.
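
To make that concrete, here is a minimal sketch of reading through a FileStream with an explicit 4 KB buffer (the file name is a placeholder):

using System;
using System.IO;

class BufferedRead
{
    static void Main()
    {
        // 4096 = FileStream's internal buffer size; the ReadFile() call
        // happens only when this buffer needs refilling, not on every Read().
        using (var fs = new FileStream("input.dat", FileMode.Open,
                                       FileAccess.Read, FileShare.Read, 4096))
        {
            var chunk = new byte[4096];
            long total = 0;
            int n;
            while ((n = fs.Read(chunk, 0, chunk.Length)) > 0)
                total += n;                 // process chunk[0..n) here
            Console.WriteLine("Read {0} bytes", total);
        }
    }
}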

What is the best memory buffer size to allocate to download a file from the Internet?

Use at least 4 KB. It's the normal page size for Windows (i.e. the granularity at which Windows itself manages memory), which means that the .NET memory allocator doesn't need to break a 4 KB page down into 1 KB allocations.

Of course, using a 64 KB block will be faster, but only marginally so.
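
As a sketch of the download case (the URL and file names are placeholders), you can read the response stream in 64 KB chunks and let a buffered FileStream handle the writes:

using System.IO;
using System.Net;

class DownloadSketch
{
    static void Main()
    {
        var request = WebRequest.Create("http://example.com/file.bin");
        using (var response = request.GetResponse())
        using (var net = response.GetResponseStream())
        using (var file = new FileStream("file.bin", FileMode.Create,
                                         FileAccess.Write, FileShare.None, 4096))
        {
            var buffer = new byte[64 * 1024];   // 64 KB: marginally faster
            int n;                              // than 4 KB, as noted above
            while ((n = net.Read(buffer, 0, buffer.Length)) > 0)
                file.Write(buffer, 0, n);
        }
    }
}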

What is the best buffer size when using BinaryReader to read big files (> 1 GB)?

"Sequential File Programming Patterns and Performance with .NET" is a great article in I/O performance improvement.

Page 8 of the PDF shows that the bandwidth is constant for buffer sizes larger than eight bytes. Bear in mind that the article was written in 2004, measuring a "Maxtor 250 GB 7200 RPM SATA disk"; the results will differ with more recent I/O technology.

If you are looking for the best performance, take a look at pinvoke.net or page 9 of the PDF file; the un-buffered file performance measurements show better results:

In un-buffered I/O, the disk data moves directly between the application’s address space and the device without any intermediate copying.

Summary

  • For single disks, use the defaults of the .NET framework – they deliver excellent performance for sequential file access.
  • Pre-allocate large sequential files (using the SetLength() method) when the file is created. This typically improves speed by about 13% when compared to a fragmented file (see the sketch after this list).
  • At least for now, disk arrays require un-buffered I/O to achieve the highest performance - buffered I/O can be eight times slower than un-buffered I/O. We expect this problem will be addressed in later releases of the .NET framework.
  • If you do your own buffering, use large request sizes (64 KB is a good place to start). Using the .NET framework, a single processor can read and write a disk array at over 800 Mbytes/s using un-buffered I/O.
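
A minimal sketch of the pre-allocation point above, assuming the final size is known up front (the file name and size are made up):

using System.IO;

class Preallocate
{
    static void Main()
    {
        const long expectedSize = 1L << 30;   // assume ~1 GB known in advance
        using (var fs = new FileStream("big.dat", FileMode.Create,
                                       FileAccess.Write, FileShare.None, 64 * 1024))
        {
            // Reserve the space now so the file isn't grown (and fragmented)
            // piecemeal as the sequential writes proceed.
            fs.SetLength(expectedSize);
            // ... sequential writes go here ...
        }
    }
}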

How do you determine the ideal buffer size when using FileInputStream?

Optimum buffer size is related to a number of things: file system block size, CPU cache size and cache latency.

Most file systems are configured to use block sizes of 4096 or 8192. In theory, if you configure your buffer size so you are reading a few bytes more than the disk block, the operations with the file system can be extremely inefficient (e.g. if you configured your buffer to read 4100 bytes at a time, each read would require 2 block reads by the file system). If the blocks are already in cache, then you wind up paying the price of RAM -> L3/L2 cache latency. If you are unlucky and the blocks are not in cache yet, then you pay the price of the disk -> RAM latency as well.

This is why you see most buffers sized as a power of 2, and generally larger than (or equal to) the disk block size. This means that one of your stream reads could result in multiple disk block reads - but those reads will always use a full block - no wasted reads.

Now, this is offset quite a bit in a typical streaming scenario, because the block that is read from disk is still going to be in memory when you hit the next read (we are doing sequential reads here, after all) - so you wind up paying the RAM -> L3/L2 cache latency price on the next read, but not the disk -> RAM latency. In terms of order of magnitude, disk -> RAM latency is so slow that it pretty much swamps any other latency you might be dealing with.

So, I suspect that if you ran a test with different buffer sizes (I haven't done this myself), you would probably find a big impact of buffer size up to the size of the file system block. Above that, I suspect that things would level out pretty quickly.

There are a ton of conditions and exceptions here - the complexities of the system are actually quite staggering (just getting a handle on L3 -> L2 cache transfers is mind-bogglingly complex, and it changes with every CPU type).

This leads to the 'real world' answer: if your app is like the 99% out there, set the buffer size to 8192 and move on (even better, choose encapsulation over performance and use BufferedInputStream to hide the details). If you are in the 1% of apps that are highly dependent on disk throughput, craft your implementation so you can swap out different disk interaction strategies, and provide the knobs and dials to allow your users to test and optimize (or come up with some self-optimizing system).

Automatically selecting buffer size for File I/O

In Java, the optimum is usually around the L1 cache size, which is typically 32 KB. In any case, choosing 1024 bytes or 1 MB doesn't make much difference (<20%).

If you are reading data sequentially, usually your OS is smart enough to detect this and prefetch the data for you.

What you can do is run the following test, which appears to show a significant difference between the block sizes used.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class BlockSizeTest {
    public static void main(String... args) throws IOException {
        for (int i = 512; i <= 2 * 1024 * 1024; i *= 2)
            readWrite(i);
    }

    private static void readWrite(int blockSize) throws IOException {
        ByteBuffer bb = ByteBuffer.allocateDirect(blockSize);
        long start = System.nanoTime();

        // Write 1 GB (1024 << 20 bytes) in blockSize chunks.
        FileChannel out = new FileOutputStream("deleteme.dat").getChannel();
        for (int i = 0; i < (1024 << 20); i += blockSize) {
            bb.clear();
            while (bb.remaining() > 0)
                if (out.write(bb) < 1) throw new AssertionError();
        }
        out.close();
        long mid = System.nanoTime();

        // Read the same 1 GB back in blockSize chunks.
        FileChannel in = new FileInputStream("deleteme.dat").getChannel();
        for (int i = 0; i < (1024 << 20); i += blockSize) {
            bb.clear();
            while (bb.remaining() > 0)
                if (in.read(bb) < 1) throw new AssertionError();
        }
        in.close();
        long end = System.nanoTime();

        // 1024 MB moved each way; the nanosecond deltas give MB/s.
        System.out.printf("With %.1f KB block size write speed %.1f MB/s, read speed %.1f MB/s%n",
                blockSize / 1024.0, 1024 * 1e9 / (mid - start), 1024 * 1e9 / (end - mid));
    }
}

prints

With 0.5 KB block size write speed 96.6 MB/s, read speed 169.7 MB/s
With 1.0 KB block size write speed 154.2 MB/s, read speed 312.2 MB/s
With 2.0 KB block size write speed 201.5 MB/s, read speed 438.7 MB/s
With 4.0 KB block size write speed 288.0 MB/s, read speed 733.9 MB/s
With 8.0 KB block size write speed 318.4 MB/s, read speed 711.8 MB/s
With 16.0 KB block size write speed 540.6 MB/s, read speed 1263.7 MB/s
With 32.0 KB block size write speed 726.0 MB/s, read speed 1370.9 MB/s
With 64.0 KB block size write speed 801.8 MB/s, read speed 1536.5 MB/s
With 128.0 KB block size write speed 857.5 MB/s, read speed 1539.6 MB/s
With 256.0 KB block size write speed 794.0 MB/s, read speed 1781.0 MB/s
With 512.0 KB block size write speed 676.2 MB/s, read speed 1221.4 MB/s
With 1024.0 KB block size write speed 886.3 MB/s, read speed 1501.5 MB/s
With 2048.0 KB block size write speed 784.7 MB/s, read speed 1544.9 MB/s

What this test doesn't show is that the hard drive only supports 60 MB/s reads and 40 MB/s writes. All you are testing is the speed in and out of the OS cache. If that were your only priority, you would use a memory-mapped file.

// Write a 1 GB file in 32 KB blocks, as before.
int blockSize = 32 * 1024;
ByteBuffer bb = ByteBuffer.allocateDirect(blockSize);
FileChannel out = new FileOutputStream("deleteme.dat").getChannel();
for (int i = 0; i < (1024 << 20); i += blockSize) {
    bb.clear();
    while (bb.remaining() > 0)
        if (out.write(bb) < 1) throw new AssertionError();
}
out.close();

// "Read" the file by mapping it into the address space instead of copying.
long start = System.nanoTime();
FileChannel in = new FileInputStream("deleteme.dat").getChannel();
MappedByteBuffer map = in.map(FileChannel.MapMode.READ_ONLY, 0, in.size());
in.close();
long end = System.nanoTime();
System.out.printf("Mapped file at a rate of %.1f MB/s%n",
        1024 * 1e9 / (end - start));

prints

Mapped file at a rate of 589885.5 MB/s

This is so fast because it just maps the data in the OS disk cache directly into the memory of the application, so no copying is required.
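
For comparison with the .NET questions elsewhere on this page, the equivalent technique there is MemoryMappedFile; a minimal sketch, reusing the file written above:

using System;
using System.IO.MemoryMappedFiles;

class MappedRead
{
    static void Main()
    {
        using (var mmf = MemoryMappedFile.CreateFromFile("deleteme.dat"))
        using (var view = mmf.CreateViewAccessor())
        {
            // The OS page cache is mapped straight into our address space;
            // reads touch it in place instead of copying into a buffer.
            byte first = view.ReadByte(0);
            Console.WriteLine(first);
        }
    }
}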

Determining buffer size when working with files in C#?

4 KB is a good choice. For more information, see:

File I/O with streams - best memory buffer size

What does buffer size mean when streaming text to a file

Writing to a file requires using the WriteFile() winapi function. Note the function signature: the 2nd argument is lpBuffer, a buffer that contains the bytes that need to be written, and the 3rd argument says how many bytes are in that buffer.

You can technically write just a single byte at a time. But that's inefficient; WriteFile() is not a very cheap function. It works much better if you write a chunk of bytes instead: there will be many fewer calls to WriteFile().

So StreamWriter has a byte[] array that acts as the buffer. When you call Write() or WriteLine(), it converts the text to bytes and copies them into that buffer. Very fast.

Which works until that array is full. Then it must call WriteFile() to empty the buffer again. How often that happens depends entirely on the size of the buffer and how much text you write.

StreamWriter can write to many different kinds of streams. It doesn't have to be just a file on disk. You can also use it to write text to a network stream, for example. Or the screen. Or through a pipe to another process. Or to a device through a serial or USB port. Or to memory through a memory-mapped file. Etcetera, many possibilities.

Clearly, very different things happen under the hood when you make the Write() or WriteLine() call, and you may be able to make your program work more optimally by using a different buffer size. Above all, the Microsoft programmers simply could not predict how you are going to use StreamWriter, and could therefore not know what buffer size is "best". They did not want to paint you into a corner where you always had to live with the buffer size they chose.

So you got the option to pick another size. The default is 1024 bytes. That's a pretty modest size, based on the assumption that you'll write to another stream that's also buffered, like FileStream, the one you'll use when you write to a file. It has a buffer of 4096 bytes.
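
To make the two layers visible, here is a sketch that sets both buffer sizes explicitly (the file name and loop are arbitrary):

using System.IO;
using System.Text;

class WriterBuffers
{
    static void Main()
    {
        // Inner layer: FileStream with its default 4096-byte buffer.
        var fs = new FileStream("log.txt", FileMode.Create, FileAccess.Write,
                                FileShare.None, 4096);
        // Outer layer: StreamWriter with its default 1024-byte buffer.
        using (var writer = new StreamWriter(fs, Encoding.UTF8, 1024))
        {
            for (int i = 0; i < 100000; i++)
                writer.WriteLine("line {0}", i);   // buffered; WriteFile()
        }                                          // runs only on flushes
    }
}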

If you want to know which buffer size is best, you have to experiment. It cannot be predicted; there's entirely too much code running under the hood to allow you to guess at it. But beware that by far the most common outcome of such a test is that it just doesn't have a noticeable effect. Which is the way it should be: it is an operating system's duty to perform well under all reasonable circumstances. When an oddball driver is involved, though, you'd have a good reason to give it a try.
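
One rough way to run that experiment (the sizes and file name are chosen arbitrarily):

using System;
using System.Diagnostics;
using System.IO;
using System.Text;

class BufferExperiment
{
    static void Main()
    {
        foreach (int size in new[] { 512, 1024, 4096, 16 * 1024, 64 * 1024 })
        {
            var sw = Stopwatch.StartNew();
            using (var writer = new StreamWriter("bench.txt", false,
                                                 Encoding.UTF8, size))
            {
                for (int i = 0; i < 1000000; i++)
                    writer.WriteLine("hello world");
            }
            sw.Stop();
            Console.WriteLine("{0,6} bytes: {1} ms", size, sw.ElapsedMilliseconds);
        }
    }
}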

Difference between buffer size when streams copying

If you are using .NET 4 you can do it more simply:

srcStream.CopyTo(dstStream);

But if you want or need to implement it yourself, I would suggest a smaller buffer (256 B - 1 KB) for memory streams and a medium-sized buffer (10 KB) for file streams. You can also make it dependent on the size of the source stream, for example 10% of its length with some cap of 1 MB or so.

For files, the bigger the buffer, the faster the copy operation (to a degree), but the less safe it is. For memory streams, a small buffer is pretty much just as effective as a big one, but easier on memory (if you copy a lot).
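
For reference, .NET 4 also lets you pass the buffer size directly, as in srcStream.CopyTo(dstStream, bufferSize). A hand-rolled equivalent (purely illustrative) looks like this:

using System.IO;

static class StreamCopy
{
    // Same effect as srcStream.CopyTo(dstStream, bufferSize) in .NET 4.
    public static void Copy(Stream src, Stream dst, int bufferSize)
    {
        var buffer = new byte[bufferSize];
        int n;
        while ((n = src.Read(buffer, 0, buffer.Length)) > 0)
            dst.Write(buffer, 0, n);
    }
}

For example, StreamCopy.Copy(srcStream, dstStream, 10 * 1024) would apply the roughly 10 KB suggested above for file streams.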


