Bus Error Opening and mmap'ing a File

C Bus Error with mmap, what is the problem

You can't write to a non-existent portion of a file via mmap().

This truncates the file:

int dfd = open(comms->destination, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);

This mmap()'s the truncated file:

void* destination = mmap(NULL, fileSize, PROT_READ | PROT_WRITE, MAP_PRIVATE,  dfd, 0);

This tries to copy to a non-existent part of the file:

memcpy(destination, content, fileSize);

Per the Linux mmap() man page ERRORS:

ERRORS

...

Use of a mapped region can result in these signals:

...

SIGBUS Attempted access to a portion of the buffer that does not
correspond to the file (for example, beyond the end of the
file, including the case where another process has truncated
the file).

Per the POSIX mmap() specification:

... References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal.

An implementation may generate SIGBUS signals when a reference would cause an error in the mapped object, such as out-of-space condition.

The fix is to call ftruncate() to set the file length on the output file:

ftruncate( dfd, fileSize );
void* destination = mmap(NULL, fileSize, PROT_READ | PROT_WRITE, MAP_SHARED, dfd, 0);

Note that you also have to replace the MAP_PRIVATE flag with MAP_SHARED; with MAP_PRIVATE, the copied data would never reach the file. Per POSIX again:

MAP_SHARED and MAP_PRIVATE describe the disposition of write references to the memory object. If MAP_SHARED is specified, write references shall change the underlying object. If MAP_PRIVATE is specified, modifications to the mapped data by the calling process shall be visible only to the calling process and shall not change the underlying object. It is unspecified whether modifications to the underlying object done after the MAP_PRIVATE mapping is established are visible through the MAP_PRIVATE mapping. Either MAP_SHARED or MAP_PRIVATE can be specified, but not both. The mapping type is retained across fork().
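
Putting the pieces of this answer together, a minimal sketch of the corrected write path might look like the following. The helper name write_via_mmap and its parameters are hypothetical stand-ins for the question's comms->destination, content, and fileSize; the explicit msync() call is an addition for illustration, not something the question's code contained.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy `size` bytes from `content` into a freshly
 * created file at `path` through a shared mapping.
 * Returns 0 on success, -1 on failure with errno set. */
int write_via_mmap(const char *path, const void *content, size_t size)
{
    if (size == 0) {
        errno = EINVAL;         /* mmap() rejects zero-length mappings */
        return -1;
    }

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC,
                  S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (fd == -1)
        return -1;

    /* Grow the truncated file so every mapped byte is backed by the file. */
    if (ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return -1;
    }

    /* MAP_SHARED so that stores change the underlying file. */
    void *dst = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (dst == MAP_FAILED) {
        close(fd);
        return -1;
    }
    close(fd);                  /* the mapping stays valid after close() */

    memcpy(dst, content, size);

    /* Ask for the data to be written out before the mapping goes away. */
    int rc = msync(dst, size, MS_SYNC);
    if (munmap(dst, size) == -1)
        rc = -1;
    return rc;
}
```

Checking the return value of ftruncate() matters here: if it fails, the file stays empty and the memcpy() would raise the same SIGBUS as before.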

Bus error (core dumped) when using strcpy to a mmap'ed file

If

fd = open("/tmp/msyncTest", (O_CREAT | O_TRUNC | O_RDWR), (S_IRWXU | S_IRWXG | S_IRWXO) );

is successful, fd will refer to a zero-length file (O_TRUNC). The call to mmap()

address = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, fd, my_offset);

establishes a memory-mapping, but the pages do not correspond to an object.

http://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html has the following to say about this situation:

The system always zero-fills any partial page at the end of an object. Further, the system never writes out any modified portions of the last page of an object that are beyond its end. References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object result in delivery of a SIGBUS signal.

Similarly, man mmap on Linux notes

Use of a mapped region can result in these signals:
[...]
SIGBUS Attempted access to a portion of the buffer that does not correspond to the file (for example, beyond the end of the file, including the case where another process has truncated the file).

Consequently, you must ftruncate() the file to a length that covers the whole range you are going to access (here, at least my_offset + 4096 bytes) before mmap()ing it (unless you are mmap()ing anonymous memory).

mmap Bus Error writing to MAP_SHARED file over 2Gb

I believe the problem is that j is an int. When j hits large values, (j + 1) * BLOCK_SIZE overflows and your ftruncate call does not do what you intend. Checking the return value from ftruncate should confirm this.

The mmap man page specifically calls out SIGBUS as meaning that the attempted access is not backed by the file.

Linux/perl mmap performance

OK, found the problem. As suspected, neither Linux nor Perl was to blame. To open and access the file I do something like this:

#!/usr/bin/perl
# Create 1 GB file if you do not have one:
# dd if=/dev/urandom of=test.bin bs=1048576 count=1000
use strict; use warnings;
use Sys::Mmap;

open (my $fh, "<test.bin")
|| die "open: $!";

my $t = time;
print STDERR "mmapping.. ";
mmap (my $mh, 0, PROT_READ, MAP_SHARED, $fh)
|| die "mmap: $!";
my $str = unpack ("A1024", substr ($mh, 0, 1024));
print STDERR " ", time-$t, " seconds\nsleeping..";

sleep (60*60);

If you test that code, there are no delays like those I found in my original code, and after creating the minimal sample (always do that, right!) the reason suddenly became obvious.

The error was that my code treated the $mh scalar as a handle, something lightweight that can be moved around easily (read: passed by value). It turns out it's actually a GB-long string, definitely not something you want to move around without creating an explicit reference (Perl lingo for a "pointer"/handle value). So if you need to store it in a hash or similar, make sure you store \$mh, and dereference it when you need to use it, like ${$hash->{mh}}, typically as the first parameter in a substr or similar.

Parsing mmaped file with strtok?

strtok() modifies the string it operates on. Assuming you don't want to change the file contents, you need to change your mmap() options.

You are opening and mapping the file read-only:

if ((fdsrc = open("filename.txt", O_RDONLY)) < 0) {
...
if ((src = mmap(0, statbuf.st_size, PROT_READ, MAP_SHARED, fdsrc, 0)) == (caddr_t) -1) {
...

Map the file with PROT_READ|PROT_WRITE and MAP_PRIVATE:

src = mmap(0, statbuf.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fdsrc, 0);
if (src == (caddr_t) -1) {

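You should not need to open the file with O_RDWR instead of O_RDONLY: POSIX requires the descriptor to be open for writing with PROT_WRITE only when the mapping is not MAP_PRIVATE.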

BEWARE THOUGH:

If the file size exactly matches a multiple of the page size used for the mapping, the file will not be a NUL-terminated string and you will likely get a SIGSEGV when strtok() attempts to read past the end of the mapping.

In that case, you can mmap() a zero-filled page immediately following the file's mapping.

When and how is mmap'ed memory swapped in and out?

As has been discussed, your file will be accessed in pages; on x86_64 (and IA32) architectures, a page is typically 4096 bytes. So, very little if any of the file will be loaded at mmap time. The first time you access some page in either file, then the kernel will generate a page fault and load some of your file. The kernel may prefetch pages, so more than one page may be loaded. Whether it does this depends on your access pattern.

In general, your performance should be good if your working set fits in memory. That is, if you're only regularly accessing 3G of data across the two files, then so long as you have 3G of RAM available to your process, things should generally be fine.

On a 64-bit system there's no reason to split the files, and you'll be fine if the parts you need tend to fit in RAM.

Note that if you mmap an existing file, swap space will not be required to read that file. When an object is backed by a file on the filesystem, the kernel can read from that file rather than swap space. However, if you specify MAP_PRIVATE in your call to mmap, swap space may be required to hold modified pages: they are never written back to the file (msync does not flush a private mapping), so they need backing store until they are unmapped.

mmap file with larger fixed length with zero padding?

This is one of the few reasonable use cases for MAP_FIXED, to remap part of an existing mapping to use a new backing file.

A simple solution here is to unconditionally mmap 64 MB of anonymous memory (or explicitly mmap /dev/zero), without MAP_FIXED and store the resulting pointer.

Next, mmap 64 MB or your actual file size (whichever is less) of your actual file, passing in the result of the anonymous/zero mmap and passing the MAP_FIXED flag. The pages corresponding to your file will no longer be anonymous/zero mapped, and instead will be backed by your file's data; the remaining pages will be backed by the anonymous/zero pages.

When you're done, a single munmap call will unmap all 64 MB at once (you don't need to separately unmap the real file pages and the zero backed pages).

Extremely simple example (no error checking, please add it yourself):

// Reserve 64 MB of contiguous addresses; anonymous mappings are always zero backed
void *mapping = mmap(NULL, 64 * 1024 * 1024, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

// Open file and check size
struct stat sb;
int fd = open(myfilename, O_RDONLY);
fstat(fd, &sb);
// Use smaller of file size or 64 MB
size_t filemapsize = sb.st_size > 64 * 1024 * 1024 ? 64 * 1024 * 1024 : sb.st_size;
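// Remap up to 64 MB of pages, replacing some or all of original anonymous pages.
// Skip an empty file: mmap rejects a zero length, and the anonymous pages already read as zeros.
if (filemapsize > 0)
    mapping = mmap(mapping, filemapsize, PROT_READ, MAP_SHARED | MAP_FIXED, fd, 0);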
close(fd);

// ... do stuff with mapping ...
munmap(mapping, 64 * 1024 * 1024);

