How to Create a File with Any Given Size in Linux

How to create a file with a given size in Linux?

For small files:

dd if=/dev/zero of=upload_test bs=file_size count=1

Where file_size is the size of your test file in bytes.
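For example, a 1000-byte test file:

dd if=/dev/zero of=upload_test bs=1000 count=1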

For big files:

dd if=/dev/zero of=upload_test bs=1M count=size_in_megabytes
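For example, a 100 MiB file:

dd if=/dev/zero of=upload_test bs=1M count=100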

How to create a file with ANY given size in Linux?

Sparse file

dd of=output.dat bs=1 seek=390143672 count=0

This has the added benefit of creating the file sparse if the underlying filesystem supports that. This means no space is wasted for pages (blocks) that never get written to, and the file creation is extremely quick.
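You can verify the sparseness by comparing the file's nominal size with its actual disk usage (assuming GNU coreutils):

ls -l output.dat   # nominal size: 390143672 bytes
du -h output.dat   # allocated blocks: (nearly) 0 for a sparse file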


Non-sparse (opaque) file:

Edit: since people have, rightly, pointed out that sparse files have characteristics that could be disadvantageous in some scenarios, here is the sweet spot:

You could use fallocate (present on Debian as part of util-linux) instead:

fallocate -l 390143672 output.dat

This still has the benefit of not needing to actually write the blocks, so it is pretty much as quick as creating the sparse file, but it is not sparse. The best of both worlds.
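The difference from the seek-created file shows up in du (a quick check; full.dat and sparse.dat are hypothetical names):

fallocate -l 390143672 full.dat
dd of=sparse.dat bs=1 seek=390143672 count=0
du -h full.dat sparse.dat   # full.dat: ~373M allocated, sparse.dat: (nearly) 0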

Create a large file with a given size with a pattern in Linux

while true ; do printf "DEADBEEF"; done | dd of=/tmp/bigfile bs=blocksize count=size iflag=fullblock
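For example, 10 MiB of the repeating pattern (iflag=fullblock makes dd keep reading until each 1M block is full, since the pipe delivers short reads):

while true ; do printf "DEADBEEF"; done | dd of=/tmp/bigfile bs=1M count=10 iflag=fullblock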

Is it possible to create a file with a fixed size in Linux?

The system call truncate(2) doesn't fill the file with zeros. It simply advances the file's reported size and leaves holes in it.

When you read from it, you do get zeros, but that's just a convenience of the OS.

The truncate() and ftruncate() functions cause the regular file named
by path or referenced by fd to be truncated to a size of precisely
length bytes.

If the file previously was shorter, it is extended, and the extended
part reads as null bytes ('\0').
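The same call is available from the shell via the coreutils truncate(1) utility, for example:

truncate -s 390143672 output.dat

The file is created if it does not already exist, and the extension is a hole, so no blocks are allocated.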

About holes (from TLPI):

The existence of holes means that a file’s nominal size may be larger
than the amount of disk storage it utilizes (in some cases,
considerably larger).

Filesystems and holes:

Rather than allocate blocks of null bytes for the holes in a file, the
file system can just mark (with the value 0) appropriate pointers in
the i-node and in the indirect pointer blocks to indicate that they
don't refer to actual disk blocks.

As Per Johansson notes, this is dependent on the filesystem.

Most native UNIX file systems support the concept of file holes, but
many nonnative file systems (e.g., Microsoft’s VFAT) do not. On a
file system that doesn’t support holes, explicit null bytes are
written to the file.

Create a file of a specific size with random printable strings in bash

The correct way is to use a transformation like base64 to convert the random bytes to printable characters. That will not discard any of the randomness from the source; it only converts it to another form.

For a file of (a little more than) 1 megabyte:

dd if=/dev/urandom bs=786438 count=1 | base64 > /tmp/file

The resulting file will contain characters in the range A–Za–z0–9 and +/=.

The reason the file comes out a little bigger, and a solution, are given below.

You could add a tr filter to translate from that character set to some other set (of the same size or smaller):

tr 'A-Za-z0-9+/' 'a-z0-9A-Z$%' < /tmp/file

I have left the = outside of the translation because, for a uniform random distribution, it is better to leave out the last characters, which will (almost) always be =.

Size

The size of the file is expanded from the amount read from /dev/urandom by a factor of 4/3. That is because we are transforming 256 possible byte values into 64 different characters: each character encodes 6 bits of the byte stream, so when 4 characters have been emitted (6*4=24 bits), only three bytes have been consumed (8*3=24 bits).

So, to get an exact result, we need to read a count of bytes that is a multiple of 3 (no = padding), and the target size must be a multiple of 4 because we will divide by it; a multiple of 12 satisfies both constraints.
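You can check the 4/3 arithmetic directly (a quick sketch, assuming GNU dd and base64; -w0 disables base64's line-wrapping newlines, iflag=fullblock guards against short reads):

dd if=/dev/urandom bs=786432 count=1 iflag=fullblock | base64 -w0 | wc -c

This prints 1048576, i.e. 786432 * 4/3.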

We cannot get a random file of exactly 1024 bytes (1k) or 1024*1024 = 1,048,576 bytes (1M) this way, because neither is an exact multiple of 3. But we can produce a file a little bigger and truncate it (if such precision is needed):

wanted_size=$((1024*1024))                # target size in bytes (1 MiB)
file_size=$(( ((wanted_size/12)+1)*12 ))  # round up to the next multiple of 12
read_size=$((file_size*3/4))              # bytes to read so base64 emits file_size characters

echo "wanted=$wanted_size file=$file_size read=$read_size"

dd if=/dev/urandom bs=$read_size count=1 iflag=fullblock | base64 > /tmp/file

truncate -s "$wanted_size" /tmp/file

The last step to truncate to the exact value is optional.
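With the 1 MiB target above, the echo line prints:

wanted=1048576 file=1048584 read=786438

Note that read matches the bs=786438 used in the earlier one-liner.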

Randomness generation

As you are going to extract so many random values, do not use /dev/random (use /dev/urandom), or your application will block for a long time and the rest of the computer will be starved of randomness.

I recommend installing the haveged package:

haveged uses HAVEGE (HArdware Volatile Entropy Gathering and Expansion)
to maintain a 1M pool of random bytes used to fill /dev/random
whenever the supply of random bits in /dev/random falls below the low
water mark of the device.

Do that if it is possible on your system.
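On Debian and derivatives that would be, for example:

sudo apt-get install haveged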

How to create file of x size?

Yes, you would do it after fopen; you can create what is known as a sparse file:

#include <stdio.h>

int main(void) {
    long X = 1024 * 1024 - 1;          /* offset of the last byte: size - 1 */
    FILE *fp = fopen("myfile", "w");
    if (fp == NULL)
        return 1;
    fseek(fp, X, SEEK_SET);            /* seeking past EOF leaves a hole */
    fputc('\0', fp);                   /* writing one byte sets the size to X+1 */
    fclose(fp);
    return 0;
}

That should create a file of X+1 bytes, whatever size you need; in this case it is exactly 1 MiB.
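To check the result (mkfile.c is a hypothetical name for the source file above; assuming gcc and GNU coreutils):

gcc -o mkfile mkfile.c
./mkfile
ls -l myfile   # nominal size: 1048576 bytes
du -h myfile   # only the block holding the final byte is actually allocated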


