C Bus Error with mmap, what is the problem
You can't write to a non-existent portion of a file via mmap().
This truncates the file:
int dfd = open(comms->destination, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
This mmap()'s the truncated file:
void* destination = mmap(NULL, fileSize, PROT_READ | PROT_WRITE, MAP_PRIVATE, dfd, 0);
This tries to copy to a non-existent part of the file:
memcpy(destination, content, fileSize);
Per the Linux mmap() man page ERRORS:
ERRORS
...
Use of a mapped region can result in these signals:
...
SIGBUS
Attempted access to a portion of the buffer that does not
correspond to the file (for example, beyond the end of the
file, including the case where another process has truncated
the file).
Per the POSIX mmap() specification:
... References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal. An implementation may generate SIGBUS signals when a reference would cause an error in the mapped object, such as out-of-space condition.
The fix is to call ftruncate() to set the file length on the output file:
ftruncate( dfd, fileSize );
void* destination = mmap(NULL, fileSize, PROT_READ | PROT_WRITE, MAP_SHARED, dfd, 0);
Note that you also have to replace the MAP_PRIVATE flag with MAP_SHARED. Per POSIX again:
MAP_SHARED and MAP_PRIVATE describe the disposition of write references to the memory object. If MAP_SHARED is specified, write references shall change the underlying object. If MAP_PRIVATE is specified, modifications to the mapped data by the calling process shall be visible only to the calling process and shall not change the underlying object. It is unspecified whether modifications to the underlying object done after the MAP_PRIVATE mapping is established are visible through the MAP_PRIVATE mapping. Either MAP_SHARED or MAP_PRIVATE can be specified, but not both. The mapping type is retained across fork().
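To see the MAP_SHARED / MAP_PRIVATE difference for yourself, here is a minimal sketch. The helper name store_and_read_back is made up for illustration, and it assumes fd is open O_RDWR on a file that is already at least one byte long. It writes one byte through a mapping and then reads byte 0 back from the file with pread():

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original question): stores `value` in
 * byte 0 of `fd` through a mapping created with `map_flags`, then returns
 * byte 0 as read back from the file itself, or -1 on any failure.
 * Assumes the file is at least one byte long and fd is open O_RDWR. */
int store_and_read_back(int fd, int map_flags, unsigned char value)
{
    unsigned char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, map_flags, fd, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = value;                    /* write through the mapping */

    int rc = msync(p, 1, MS_SYNC);   /* flushes only shared mappings */
    munmap(p, 1);
    if (rc != 0)
        return -1;

    unsigned char in_file;
    if (pread(fd, &in_file, 1, 0) != 1)
        return -1;
    return in_file;
}
```

With a freshly ftruncate()'d (zero-filled) file, a MAP_PRIVATE store should leave the file's byte 0 unchanged, while a MAP_SHARED store should show up in the file.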
Bus error (core dumped) when using strcpy to a mmap'ed file
If
fd = open("/tmp/msyncTest", (O_CREAT | O_TRUNC | O_RDWR), (S_IRWXU | S_IRWXG | S_IRWXO) );
is successful, fd will refer to a zero-length file (O_TRUNC). The call to mmap()
address = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, fd, my_offset);
establishes a memory-mapping, but the pages do not correspond to an object.
http://pubs.opengroup.org/onlinepubs/7908799/xsh/mmap.html has the following to say about this situation:
The system always zero-fills any partial page at the end of an object. Further, the system never writes out any modified portions of the last page of an object that are beyond its end. References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object result in delivery of a SIGBUS signal.
Similarly, man mmap on Linux notes:
Use of a mapped region can result in these signals:
[...]
SIGBUS Attempted access to a portion of the buffer that does not correspond to the file (for example, beyond the end of the file, including the case where another process has truncated the file).
Consequently, you must ftruncate() the file to a non-zero length before mmap()ing it (unless you are mmap()ing anonymous memory).
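A minimal sketch of that order of operations follows. The helper name write_string_via_mmap is an assumption for illustration and not the asker's code; it maps from offset 0 and sizes the file to exactly the string length plus the terminating NUL:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates `path`, sizes it to hold `msg` plus its
 * terminating NUL, and writes the string through a shared mapping.
 * Returns 0 on success, -1 on failure. */
int write_string_via_mmap(const char *path, const char *msg)
{
    size_t len = strlen(msg) + 1;

    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    /* The file is empty after O_TRUNC; give it real bytes before mapping. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
        return -1;

    strcpy(addr, msg);               /* backed by the file now, so no SIGBUS */

    int rc = msync(addr, len, MS_SYNC);
    if (munmap(addr, len) != 0)
        rc = -1;
    return rc;
}
```

Calling write_string_via_mmap("/tmp/msyncTest", "hello") should leave a 6-byte file containing the string and its NUL terminator.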
mmap Bus Error writing to MAP_SHARED file over 2Gb
I believe the problem is that j is an int. When j hits large values, (j + 1) * BLOCK_SIZE overflows and your ftruncate call does not do what you intend. Checking the return value from ftruncate should confirm this.
The mmap man page specifically calls out SIGBUS as meaning that the attempted access is not backed by the file.
Linux/perl mmap performance
OK, found the problem. As suspected, neither Linux nor Perl was to blame. To open and access the file I do something like this:
#!/usr/bin/perl
# Create 1 GB file if you do not have one:
# dd if=/dev/urandom of=test.bin bs=1048576 count=1000
use strict; use warnings;
use Sys::Mmap;
open (my $fh, "<test.bin")
|| die "open: $!";
my $t = time;
print STDERR "mmapping.. ";
mmap (my $mh, 0, PROT_READ, MAP_SHARED, $fh)
|| die "mmap: $!";
my $str = unpack ("A1024", substr ($mh, 0, 1024));
print STDERR " ", time-$t, " seconds\nsleeping..";
sleep (60*60);
If you test that code, there are no delays like those I found in my original code, and after creating the minimal sample (always do that, right!) the reason suddenly became obvious.
The error was that my code treated the $mh scalar as a handle, something lightweight that can be moved around easily (read: passed by value). It turns out it's actually a GB-long string, definitely not something you want to move around without creating an explicit reference (Perl lingo for a "pointer"/handle value). So if you need to store it in a hash or similar, make sure you store \$mh, and dereference it when you need to use it, like ${$hash->{mh}}, typically as the first parameter in a substr or similar.
Parsing mmaped file with strtok?
strtok() modifies the string it operates on. Assuming you don't want to change the file contents, you need to change your mmap() options.
You are opening and mapping the file read-only:
if ((fdsrc = open("filename.txt", O_RDONLY)) < 0) {
...
if ((src = mmap(0, statbuf.st_size, PROT_READ, MAP_SHARED, fdsrc, 0)) == (caddr_t) -1) {
...
Map the file with PROT_READ|PROT_WRITE and MAP_PRIVATE:
src = mmap(0, statbuf.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fdsrc, 0);
if (src == (caddr_t) -1) {
With MAP_PRIVATE the file only has to be open for reading, so O_RDONLY is normally sufficient; O_RDWR is required only for a writable MAP_SHARED mapping.
BEWARE THOUGH:
If the file size exactly matches a multiple of the page size used for the mapping, the file will not be a NUL-terminated string and you will likely get a SIGSEGV when strtok() attempts to read past the end of the mapping.
In that case, you can mmap() a zero-filled page immediately following the file's mapping.
When and how is mmap'ed memory swapped in and out?
As has been discussed, your file will be accessed in pages; on x86_64 (and IA32) architectures, a page is typically 4096 bytes. So very little, if any, of the file will be loaded at mmap time. The first time you access a page in either file, the kernel will take a page fault and load that part of your file. The kernel may prefetch pages, so more than one page may be loaded. Whether it does this depends on your access pattern.
In general, your performance should be good if your working set fits in memory. That is, if you're only regularly accessing 3G of data across the two files, then as long as you have 3G of RAM available to your process, things should generally be fine.
On a 64-bit system there's no reason to split the files, and you'll be fine if the parts you need tend to fit in RAM.
Note that if you mmap an existing file, swap space will not be required to read that file. When an object is backed by a file on the filesystem, the kernel can read from that file rather than swap space. However, if you specify MAP_PRIVATE in your call to mmap, swap space may be required to hold the pages you modify, because private changes are never written back to the file (msync does not flush them).
mmap file with larger fixed length with zero padding?
This is one of the few reasonable use cases for MAP_FIXED, to remap part of an existing mapping to use a new backing file.
A simple solution here is to unconditionally mmap 64 MB of anonymous memory (or explicitly mmap /dev/zero), without MAP_FIXED, and store the resulting pointer.
Next, mmap 64 MB or your actual file size (whichever is less) of your actual file, passing in the result of the anonymous/zero mmap and passing the MAP_FIXED flag. The pages corresponding to your file will no longer be anonymous/zero mapped, and instead will be backed by your file's data; the remaining pages will be backed by the anonymous/zero pages.
When you're done, a single munmap call will unmap all 64 MB at once (you don't need to separately unmap the real file pages and the zero backed pages).
Extremely simple example (no error checking, please add it yourself):
// Reserve 64 MB of contiguous addresses; anonymous mappings are always zero backed
void *mapping = mmap(NULL, 64 * 1024 * 1024, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
// Open file and check size
struct stat sb;
int fd = open(myfilename, O_RDONLY);
fstat(fd, &sb);
// Use smaller of file size or 64 MB
size_t filemapsize = sb.st_size > 64 * 1024 * 1024 ? 64 * 1024 * 1024 : sb.st_size;
// Remap up to 64 MB of pages, replacing some or all of original anonymous pages
mapping = mmap(mapping, filemapsize, PROT_READ, MAP_SHARED | MAP_FIXED, fd, 0);
close(fd);
// ... do stuff with mapping ...
munmap(mapping, 64 * 1024 * 1024);