Boost Interprocess Mutexes and Checking for Abandonment


When I don't use the timeout object, and the mutex is abandoned, the ScopedLock ctor blocks indefinitely. That's expected.

The best solution for your problem would be robust mutex support in Boost. However, Boost currently does not support robust mutexes. There is only a plan to emulate them, because only Linux has native support for them. The emulation is still just planned by Ion Gaztanaga, the library author.
See this thread about a possible way of hacking robust mutexes into the Boost libraries:
http://boost.2283326.n4.nabble.com/boost-interprocess-gt-1-45-robust-mutexes-td3416151.html

Meanwhile you might try to use atomic variables in a shared segment.
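
As a rough illustration of the atomic-variable idea, the sketch below keeps the owner's id in a single atomic word, so a waiter can see who holds the lock and take it over once it has decided, by some means outside this sketch, that the owner is dead. The type owner_lock and its member names are hypothetical and not part of Boost. In real use the object would be constructed inside the shared segment, and std::atomic<std::uint32_t> is assumed to be lock-free on the target platform.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical lock word intended to live in a shared segment.
// 0 means unlocked; any other value is the id of the owner (e.g. its PID).
struct owner_lock {
    std::atomic<std::uint32_t> owner{0};

    // Succeeds only if nobody owns the lock.
    bool try_lock(std::uint32_t self) {
        std::uint32_t expected = 0;
        return owner.compare_exchange_strong(expected, self,
                                             std::memory_order_acquire);
    }

    // Succeeds only if `self` is still the recorded owner.
    bool unlock(std::uint32_t self) {
        std::uint32_t expected = self;
        return owner.compare_exchange_strong(expected, 0,
                                             std::memory_order_release);
    }

    // Takes the lock over from `dead`, an owner the caller believes has
    // crashed. Fails if ownership changed in the meantime.
    bool steal_from(std::uint32_t dead, std::uint32_t self) {
        std::uint32_t expected = dead;
        return owner.compare_exchange_strong(expected, self,
                                             std::memory_order_acq_rel);
    }
};
```

Deciding that an owner is really dead (for example by probing its PID) is the hard part and is deliberately left out here.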

Also take a look at this stackoverflow entry:
How do I take ownership of an abandoned boost::interprocess::interprocess_mutex?

When I do use the timeout, and the mutex is abandoned, the ScopedLock ctor returns immediately and tells me that it doesn't own the mutex. Ok, perhaps that's normal, but why isn't it waiting for the 10 seconds I'm telling it to?

This is very strange; you should not get this behavior. However, the timed lock is possibly implemented in terms of the try lock. Check this documentation:
http://www.boost.org/doc/libs/1_53_0/doc/html/boost/interprocess/scoped_lock.html#idp57421760-bb
This means the implementation of the timed lock might throw an exception internally and then return false.

inline bool windows_mutex::timed_lock(const boost::posix_time::ptime &abs_time)
{
    sync_handles &handles =
        windows_intermodule_singleton<sync_handles>::get();
    //This can throw
    winapi_mutex_functions mut(handles.obtain_mutex(this->id_));
    return mut.timed_lock(abs_time);
}

Possibly, the handle cannot be obtained, because the mutex is abandoned.

When the mutex isn't abandoned, and I use the timeout, the ScopedLock ctor still returns immediately, telling me that it couldn't lock, or take ownership, of the mutex and I go through the motions of removing the mutex and remaking it. This is not at all what I want.

I am not sure about this one, but I think the named mutex is implemented using shared memory. If you are using Linux, check for the file /dev/shm/MutexName. On Linux, a file descriptor remains valid until it is closed, even if the file itself has been removed, e.g. by boost::interprocess::named_recursive_mutex::remove.
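
The descriptor behaviour can be checked in isolation. The following is a small POSIX-only sketch that uses an ordinary temporary file as a stand-in for the entry under /dev/shm; the function name fd_survives_unlink and the /tmp path are made up for the illustration. It removes the name and then keeps using the descriptor, which is why a process that already has the old mutex open would carry on with the old object after another process removes and recreates it.

```cpp
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

// Returns true if a descriptor is still usable after its file name is removed.
inline bool fd_survives_unlink() {
    char path[] = "/tmp/unlink_demo_XXXXXX";  // hypothetical location
    int fd = mkstemp(path);                   // create and open a unique file
    if (fd < 0) return false;

    bool ok = unlink(path) == 0;              // the name is gone from now on
    ok = ok && write(fd, "x", 1) == 1;        // but writing through fd still works

    struct stat st{};
    ok = ok && fstat(fd, &st) == 0
            && st.st_size == 1                // the byte written above is there
            && st.st_nlink == 0;              // and no directory entry refers to it

    close(fd);                                // only now is the storage released
    return ok;
}
```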

How do I take ownership of an abandoned boost::interprocess::interprocess_mutex?

Unfortunately, this isn't supported by the boost::interprocess API as-is. There are a few ways you could implement it however:

If you are on a POSIX platform with support for pthread_mutexattr_setrobust_np, edit boost/interprocess/sync/posix/thread_helpers.hpp and boost/interprocess/sync/posix/interprocess_mutex.hpp to use robust mutexes, and to handle the EOWNERDEAD return from pthread_mutex_lock in some way.
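
What such a change boils down to at the pthread level can be sketched as follows. This is not the Boost patch itself, only a minimal illustration under a few assumptions: it uses pthread_mutexattr_setrobust and pthread_mutex_consistent, the POSIX.1-2008 names of the older _np functions, and the helpers init_robust_mutex and lock_robust are hypothetical names introduced here.

```cpp
#include <cerrno>
#include <pthread.h>

// Initialise a process-shared, robust mutex. For real interprocess use,
// `m` would have to live in shared memory.
inline int init_robust_mutex(pthread_mutex_t* m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0) rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Lock the mutex. If the previous owner died while holding it, the lock is
// still acquired; `recovered` is set so the caller can repair shared state.
inline int lock_robust(pthread_mutex_t* m, bool* recovered) {
    *recovered = false;
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        *recovered = true;
        // Mark the mutex usable again; without this call it becomes
        // permanently unusable once it is unlocked.
        rc = pthread_mutex_consistent(m);
    }
    return rc;
}
```

A caller that sees recovered set to true owns the mutex but should assume the protected data may be half-updated.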

If you are on some other platform, you could edit boost/interprocess/sync/emulation/interprocess_mutex.hpp to use a generation counter, with the locked flag in the lower bit. Then you can create a reclaim protocol that will set a flag in the lock word to indicate a pending reclaim, then do a compare-and-swap after a timeout to check that the same generation is still in the lock word, and if so replace it with a locked next-generation value.
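
One possible reading of that protocol, written with std::atomic rather than as an edit to the Boost header, is sketched below. The bit layout (bit 0 locked, bit 1 reclaim pending, generation above that) and all names are assumptions made for the illustration. In this sketch only an unlock defeats a pending reclaim, so a live owner that simply holds the lock longer than the grace period would be robbed; a real protocol would need some heartbeat from the owner.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

constexpr std::uint32_t kLocked  = 1u;  // bit 0: lock is held
constexpr std::uint32_t kReclaim = 2u;  // bit 1: someone wants to reclaim
constexpr std::uint32_t kGenStep = 4u;  // generation lives in the upper bits

struct gen_lock {
    std::atomic<std::uint32_t> word{0};

    // On success `token` receives the lock word identifying this ownership.
    bool try_lock(std::uint32_t& token) {
        std::uint32_t cur = word.load();
        if (cur & kLocked) return false;
        std::uint32_t next = ((cur & ~kReclaim) + kGenStep) | kLocked;
        if (!word.compare_exchange_strong(cur, next)) return false;
        token = next;
        return true;
    }

    // Fails if the lock was reclaimed from this owner in the meantime.
    bool unlock(std::uint32_t token) {
        std::uint32_t cur = token;
        while (!word.compare_exchange_weak(cur, token & ~kLocked)) {
            // Retry only if the word is still ours, possibly with the
            // reclaim flag set by a waiter.
            if ((cur & ~kReclaim) != token) return false;
        }
        return true;
    }

    // Announce a reclaim, wait for `grace`, then take the lock if the same
    // generation is still in the lock word.
    bool reclaim(std::chrono::milliseconds grace, std::uint32_t& token) {
        std::uint32_t seen = word.load();
        if (!(seen & kLocked)) return false;      // free: use try_lock instead
        std::uint32_t marked = seen | kReclaim;
        if (!word.compare_exchange_strong(seen, marked)) return false;
        std::this_thread::sleep_for(grace);
        std::uint32_t next = ((marked & ~kReclaim) + kGenStep) | kLocked;
        if (!word.compare_exchange_strong(marked, next)) return false;
        token = next;
        return true;
    }
};
```

Because unlock compares against the owner's token, a stale owner that wakes up after a reclaim cannot release the new owner's lock.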

If you're on Windows, another good option would be to use native mutex objects; they'll likely be more efficient than busy-waiting anyway.

You may also want to reconsider the use of a shared-memory protocol - why not use a network protocol instead?

Using boost::interprocess condition variable on an already locked mutex

You need a BasicLockable. scoped_lock (or lock_guard) is not one; unique_lock and similar types are:

The class unique_lock meets the BasicLockable requirements. If Mutex
meets the Lockable requirements, unique_lock also meets the Lockable
requirements (ex.: can be used in std::lock); if Mutex meets the
TimedLockable requirements, unique_lock also meets the TimedLockable
requirements.

Here's a small demo assuming some types for your interprocess mutex and condition:


#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/interprocess/sync/interprocess_condition.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <mutex>
#include <thread>
namespace bip = boost::interprocess;
using namespace std::literals;

using boost::posix_time::milliseconds;
auto now = boost::posix_time::microsec_clock::universal_time;

int main() {
    bip::managed_mapped_file mmf(bip::open_or_create, "mapped.dat", 32<<10);

    auto& mutex = *mmf.find_or_construct<bip::interprocess_mutex>("mutex")();
    auto& cond = *mmf.find_or_construct<bip::interprocess_condition>("cond")();
    auto& data = *mmf.find_or_construct<int>("data")(0);

    auto is_ready = [&data] { return data != 42; };

    std::unique_lock lk(mutex);

    /*void*/ cond.wait(lk);

    /*void*/ cond.wait(lk, is_ready);

    // check return values for these:
    cond.timed_wait(lk, now() + milliseconds(120));
    cond.timed_wait(lk, now() + milliseconds(120), is_ready);
}

(Of course that would just block forever because nothing ever notifies the condition).

Added a running demo with a very quick-and-dirty signaller thread: http://coliru.stacked-crooked.com/a/a1eb29653f1bbcee

Without Standard Library

You can use the equivalent Boost types: https://www.boost.org/doc/libs/1_76_0/doc/html/thread/synchronization.html#thread.synchronization.locks

Avoid locking inside of boost::interprocess::managed_shared_memory

The problem is I am at a total loss on how to change the locking behavior of the managed_shared_memory object. I looked through its constructors but I cannot find a solution to this problem.

Okay, I think the choice is hard-coded - for good reason. Like I said in my comment there is no safe way to access managed memory segments from different processes without synchronization on the segment metadata.

Now, if you're absolutely sure you want to (you're in shooting-yourself-in-the-foot-with-enough-rope-to-hang-yourself territory) you should probably manually mount a managed external buffer:

Managed External Buffer: Constructing all Boost.Interprocess objects in a user provided buffer

You could of course still place that buffer in an (unmanaged) mapped_region of a mappable object (like shared_memory_object).

You can parameterize your managed external buffer as you wish, but the default already doesn't lock:

typedef basic_managed_external_buffer <
    char,
    rbtree_best_fit<null_mutex_family, offset_ptr<void> >,
    flat_map_index
    > managed_external_buffer;

boost::interprocess::scoped_lock application crash inside lock

That seems perfectly logical to me :)

When your application crashes, the mutex, which maps to your OS's interprocess communication (IPC) mechanism, is not released. When your application restarts, it tries to acquire the mutex, without success!

I suppose your application has different subsystems (processes) that need to be synchronized.

You have to devise a global policy for managing the lock correctly when one of your subsystems crashes. For example, if one of your subsystems crashed, it should try to unlock the mutex at startup. This can be tricky, as other subsystems use that lock. Timeouts can help too. In any case, you have to devise the policy keeping in mind that any of your processes can crash while holding the mutex...

Of course, if you do not need interprocess locking, use simple scoped locks :)

my2c

Boost Interprocess mutexes and condition variables

wait releases the mutex while waiting, so the other thread can acquire the mutex and perform the notify.
Also see the description on https://www.boost.org/doc/libs/1_57_0/doc/html/interprocess/synchronization_mechanisms.html#interprocess.synchronization_mechanisms.conditions.conditions_whats_a_condition.
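
The same behaviour can be observed with the standard library types, which are used below as a stand-in for interprocess_mutex and interprocess_condition so that the sketch needs nothing but a C++11 compiler; the function name wait_for_value is made up for the illustration. The waiter holds the mutex before the notifier even starts, yet the notifier still manages to lock it, because wait gives the mutex up while it blocks.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns the value published by the notifier thread.
inline int wait_for_value() {
    std::mutex m;
    std::condition_variable cv;
    int data = 0;

    std::unique_lock<std::mutex> lk(m);      // waiter owns the mutex first

    std::thread notifier([&] {
        std::lock_guard<std::mutex> g(m);    // blocks until wait() releases m
        data = 42;
        cv.notify_one();
    });

    // wait() unlocks m while blocked and re-locks it before returning.
    cv.wait(lk, [&] { return data == 42; });
    int result = data;                       // read under the re-acquired lock

    lk.unlock();
    notifier.join();
    return result;
}
```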

boost::interprocess::named_mutex vs CreateMutex

Caveat: I've not spent much time with boost::interprocess, so this information is just from a quick inspection of the source. That said, I've used the Windows synchronisation APIs a lot, so here goes...


The main difference between the two methods of interprocess synchronisation is how the object exists within the system.

With boost::interprocess::named_mutex, as well as a system-specific mutex, it looks like a synchronisation object is created as a file on the system. The location of the file is based on Registry entries (see note 1) (at least in Boost 1.54.0)... it's most likely located under the Common Application Data folder (see note 2). When the application crashes, this file is, in your case, not removed. I'm not sure if this is by design... however in the case of an application crash, it's perhaps best not to mess with the file system, just in case.

Conversely, when you use CreateMutex, an object is created in kernel mode, which for named mutexes can be accessed by several applications. You get a handle to the mutex by specifying the name when you create it, and you lose the handle when you call CloseHandle on it. The mutex object is destroyed when there are no more handles referencing it.

The important part of this is in the documentation:

The system closes the handle automatically when the process terminates. The mutex object is destroyed when its last handle has been closed.

This basically means that Windows will clean up after your application.

Note that if you don't perform a ReleaseMutex, and your application owns the mutex when it dies, then it's possible/likely that a waiting thread or process would see that the mutex had been abandoned (WaitForSingleObject returns WAIT_ABANDONED), and would gain ownership.

I apologise for not providing a solution, but I hope it answers your question about why the two systems act differently.


  1. Just as an aside, using registry entries to get this information is horrible - it would be safer, and more future-proof, to use SHGetKnownFolderPath. But I digress.

  2. Depending on your OS version, this could be %ALLUSERSPROFILE%\Application Data\boost.interprocess or ProgramData\boost.interprocess, or somewhere else entirely.

boost managed_shared_memory find() method stuck on mutex forever

  • waits forever (not sure if this is needed, just added it to make sure linux doesn't free that shared memory as soon as the program exits)

No, that's not required. Shared memory is shared. It stays unless you explicitly remove() it.

Review

You have at least one inconsistency: the name of the object is either "Datastore" or "DataStore" - make sure you match the spelling.

Other than that, I think

  • you might not want "array-style" allocation, which you are (inadvertently?) using
  • you might be better off using find_or_construct which does remove the potential race-condition (time-of-check vs time-of-use window between finding and creating a new instance, respectively).

Other than that I don't see any immediate reason for a hang. Perhaps you can try by removing the shared object once, manually, and using the following simplified program to re-test:

#include <boost/interprocess/managed_shared_memory.hpp>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <iostream>
namespace bip = boost::interprocess;

class data_store {
public:
    data_store(std::uint32_t id, const char *name, bool safe) :
        id_(id), safe_(safe)
    {
        // The name must fit into name_ together with its terminating NUL.
        assert(name && std::strlen(name) < (sizeof(name_)-1));
        // Copy at most sizeof(name_)-1 bytes so the zero-initialised last
        // byte always terminates the string.
        std::strncpy(name_, name, sizeof(name_)-1);
    }

    std::uint32_t id() const { return id_; }
    const char *name() const { return name_; }

private:
    char name_[32] = {0};
    std::uint32_t id_;
    bool safe_;
};

int main () try {
    bip::managed_shared_memory seg(bip::open_or_create, "seg2", 2048);
    data_store ds = *seg.find_or_construct<data_store>("DataStore")(1, "ds", true);
    std::cout << "Free size " << seg.get_free_memory() << std::endl;
    std::cout << "Data store name " << ds.name() << std::endl;
} catch (std::exception const& ex) {
    // Catch by reference so the message of the derived exception is kept.
    std::cerr << ex.what() << '\n';
}

It contains a few style fixes as well as the extra assert on name length.

Live On Coliru

Note: The Coliru version uses managed_mapped_file instead, because managed_shared_memory is not available on Coliru.

Prints:

Free size 1712
Data store name ds
-rw-r--r-- 1 2001 2000 2.0K Mar 5 12:26 seg2
Free size 1712
Data store name ds

