How and When to Use /dev/shm for Efficiency

How and when to use /dev/shm for efficiency?

You don't use /dev/shm. It exists so that the POSIX C library can provide shared memory support via the POSIX API. Not so you can poke at stuff in there.

If you want an in-memory filesystem of your very own, you can mount one wherever you want it.

mount -t tmpfs tmpfs /mnt/tmp, for example.

A Linux tmpfs is a temporary filesystem that exists only in RAM. It is implemented as a file cache without any disk storage behind it. Under memory pressure it will write its contents to swap. If you don't want it to use swap, you can use a ramfs instead.
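To check which filesystem actually backs a mount point, one option is to look it up in /proc/mounts. The following is a minimal sketch, not a definitive tool: the function name filesystem_type and the SAMPLE_MOUNTS text are made up for illustration, and the parser assumes the usual whitespace-separated /proc/mounts layout.

```python
from typing import Optional


def filesystem_type(mount_point: str, mounts_text: str) -> Optional[str]:
    """Return the filesystem type for mount_point, or None if it is not listed.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            fstype = fields[2]  # a later entry shadows an earlier one
    return fstype


# Hypothetical sample; on a real system, read the text from /proc/mounts.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
tmpfs /mnt/tmp tmpfs rw,relatime 0 0
"""

if __name__ == "__main__":
    print(filesystem_type("/mnt/tmp", SAMPLE_MOUNTS))
    print(filesystem_type("/", SAMPLE_MOUNTS))
```

On a live Linux system you would pass open("/proc/mounts").read() instead of the sample text.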

I don't know where you got the idea of using /dev/shm for efficiency in reading files, because that isn't what it does at all.

Maybe you were thinking of using memory mapping, via the mmap system call?
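For comparison, here is a rough sketch of reading part of a file through a memory mapping, using Python's standard mmap module. The helper names read_with_mmap and demo, and the demo file contents, are invented for illustration; the file lives wherever your default temp directory is, not necessarily on a tmpfs.

```python
import mmap
import os
import tempfile


def read_with_mmap(path: str, start: int, length: int) -> bytes:
    """Return length bytes starting at offset start, read via a memory mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[start:start + length]


def demo() -> bytes:
    # Hypothetical demo file created in the default temp directory.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello, mapped world")
        return read_with_mmap(path, 7, 6)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Note that mapping an empty file raises an error, so a real caller would need to handle that case.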

This answer covers a lot of tmpfs information: https://superuser.com/a/1030777/4642

better Java IPC@Linux tactic: (a) java.nio File API on /dev/shm or (b) JNI to shmctl(2)?

I vote "NIO and /dev/shm".

But before making any final decisions, you should also consider other options, including CLIP:

http://ambientideas.com/blog/index.php/tag/java/page/2/
http://ltsllc.com/talks/20090407_ipc.pdf
inter jvm communication

Sockets, message queues and named pipes are other IPC methods I wouldn't necessarily dismiss out-of-hand. IMHO...

POSIX Shared memory

It is the same thing, except that shm_open(test) is POSIX standard and needs the librt library, whereas open(/dev/shm/test) is not POSIX standard and does not need librt. Performance is the same for both solutions.
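To see the POSIX-style route from a high-level language, here is a small sketch using Python's multiprocessing.shared_memory module (Python 3.8 or later). The function name roundtrip is made up for illustration, and the second handle is opened in the same process only to keep the example short; normally another process would attach by name. On Linux the block is expected to show up as an entry under /dev/shm while it exists, but treat that as an assumption about your platform.

```python
from multiprocessing import shared_memory


def roundtrip(payload: bytes) -> bytes:
    """Write payload into a new shared memory block and read it back by name."""
    # payload must be non-empty: a shared memory block cannot have size 0.
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        owner.buf[:len(payload)] = payload
        # A second handle attaches by name, as another process would.
        reader = shared_memory.SharedMemory(name=owner.name)
        try:
            return bytes(reader.buf[:len(payload)])
        finally:
            reader.close()
    finally:
        owner.close()
        owner.unlink()  # remove the block itself once we are done


if __name__ == "__main__":
    print(roundtrip(b"ping"))
```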

Writing to shared memory in Python is very slow

After more research I've found that Python actually creates folders in /tmp whose names start with pymp-, and though no files are visible within them using file viewers, it looks exactly like /tmp is being used by Python for shared memory. Performance seems to decrease when file caches are flushed.

The working solution in the end was to mount /tmp as tmpfs:

sudo mount -t tmpfs tmpfs /tmp

With a recent Docker version, the same can be done by passing the --tmpfs /tmp argument to the docker run command.

After doing this, read/write operations are done in RAM, and performance is fast and stable.

I still wonder why /tmp is used for shared memory, and not /dev/shm, which is already mounted as tmpfs and is supposed to be used for shared memory.
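One possible explanation, offered as an assumption rather than a confirmed diagnosis: CPython's multiprocessing creates its pymp- directories through the tempfile module, and tempfile picks its directory from the TMPDIR environment variable before falling back to /tmp. If that holds for your Python version, setting TMPDIR=/dev/shm before starting the program might be an alternative to remounting /tmp. The sketch below only demonstrates the tempfile part; the helper name temp_dir_with_tmpdir is made up for illustration.

```python
import os
import tempfile


def temp_dir_with_tmpdir(new_tmpdir: str) -> str:
    """Return the directory tempfile would choose if TMPDIR were new_tmpdir."""
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    os.environ["TMPDIR"] = new_tmpdir
    tempfile.tempdir = None  # drop the cached choice so TMPDIR is re-read
    try:
        return tempfile.gettempdir()
    finally:
        # Restore the previous state so the demo has no lasting effect.
        tempfile.tempdir = old_cached
        if old_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = old_env


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()  # stand-in for a RAM-backed directory
    print(temp_dir_with_tmpdir(scratch) == scratch)
    os.rmdir(scratch)
```

The directory passed in must exist and be writable, otherwise tempfile skips it and falls back to its other candidates.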

What is the most efficient way in Selenium to open and process links within a loop?

I would probably try to implement your use case without BeautifulSoup, in a structure like this:

1. create web driver

wd = webdriver.Chrome('chromedriver',options=options)

2. open the "main" web page

wd.get("url")

3. get all elements

elements = wd.find_elements_by_css_selector('ul[data-card-id="..."]')

4. get the url of each element

pages = []
for element in elements:
   pages.append(element.get_attribute('href'))

5. process each page

for page in pages:
   wd.get(page)
   # ...
