How to Limit the Cache Used by Copying So There Is Still Memory Available for Other Caches

How can I limit the cache used by copying so there is still memory available for other caches?

The nocache command is the general answer to this problem! It is also packaged in Debian and in Ubuntu 13.10 (Saucy Salamander).

Thanks, Peter, for alerting us to the "--drop-cache" option in rsync. That patch was rejected upstream (Bug 9560 – drop-cache option) in favor of a more general solution: the new "nocache" command, based on the rsync fadvise work.

You just prepend "nocache" to any command you want. It also has nice utilities for describing and modifying the cache status of files. For example, here are the effects with and without nocache:

$ ./cachestats ~/file.mp3
pages in cache: 154/1945 (7.9%) [filesize=7776.2K, pagesize=4K]
$ ./nocache cp ~/file.mp3 /tmp
$ ./cachestats ~/file.mp3
pages in cache: 154/1945 (7.9%) [filesize=7776.2K, pagesize=4K]
$ cp ~/file.mp3 /tmp
$ ./cachestats ~/file.mp3
pages in cache: 1945/1945 (100.0%) [filesize=7776.2K, pagesize=4K]

So hopefully that will work for other backup programs (rsnapshot, duplicity, rdiff-backup, amanda, s3sync, s3ql, tar, etc.) and other commands that you don't want trashing your cache.
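Under the hood, nocache works by advising the kernel to drop the pages it touched, via posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED). A minimal Python sketch of that same mechanism (the copy_without_caching helper is hypothetical, not part of nocache, which intercepts file operations of arbitrary commands via LD_PRELOAD):

```python
import os
import tempfile

def copy_without_caching(src, dst, chunk=1 << 20):
    """Copy src to dst, then advise the kernel to drop both files'
    pages from the page cache (posix_fadvise with POSIX_FADV_DONTNEED,
    the same mechanism nocache is built on)."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:
                break
            fout.write(buf)
        # Flush dirty pages first so DONTNEED can actually evict them.
        fout.flush()
        os.fsync(fout.fileno())
        os.posix_fadvise(fin.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
        os.posix_fadvise(fout.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)

# Example: copy a scratch file without polluting the cache.
tmp = tempfile.mkdtemp()
src, dst = os.path.join(tmp, "a"), os.path.join(tmp, "b")
with open(src, "wb") as f:
    f.write(os.urandom(1 << 16))
copy_without_caching(src, dst)
```

Note that POSIX_FADV_DONTNEED is only a hint; the kernel is free to keep pages that are still dirty or in use elsewhere.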

Are there any Java caches that can limit the memory usage of an in-memory cache, not just the instance count?

I agree with Paul that this is often solved by using a soft-reference cache, though it may evict entries earlier than you would prefer. A usually acceptable solution is to use a normal cache that evicts into the soft-reference cache and recovers entries from it on a miss, if they are still available. This victim-caching approach works pretty well: it gives you a guaranteed floor, with extra benefit when free memory is available.
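The victim-caching idea can be sketched in a few lines of Python, using weak references as a rough stand-in for Java's soft references (a loose analogy: Python weakrefs are cleared as soon as no strong reference remains, while soft refs additionally survive until memory pressure; the class names here are hypothetical):

```python
from collections import OrderedDict
import weakref

class Entry:
    """Wrapper for cached values; needed because many builtins
    (list, dict, int) cannot be weakly referenced directly."""
    def __init__(self, value):
        self.value = value

class VictimCache:
    """Bounded LRU cache that demotes evicted entries into a
    weakly-referenced 'victim' map and recovers them on a miss."""
    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.main = OrderedDict()                    # strong, bounded
        self.victim = weakref.WeakValueDictionary()  # weak, unbounded

    def put(self, key, entry):
        self.main[key] = entry
        self.main.move_to_end(key)
        if len(self.main) > self.maxsize:
            old_key, old_entry = self.main.popitem(last=False)
            self.victim[old_key] = old_entry         # demote, weakly held

    def get(self, key):
        if key in self.main:
            self.main.move_to_end(key)
            return self.main[key]
        entry = self.victim.pop(key, None)
        if entry is not None:                        # recover on a miss
            self.put(key, entry)
        return entry
```

The main cache gives the guaranteed floor; the victim tier costs nothing when memory is tight, because the garbage collector simply reclaims its entries.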

The memory size of an entry can be determined by enabling the Java instrumentation agent, and usage is pretty simple with the SizeOf utility (http://sourceforge.net/projects/sizeof). I've only used it for debugging purposes, so I'd recommend benchmarking the overhead before adopting it for normal usage.

In my caching library, I am planning on adding the ability to plug in an evaluator once the core algorithm is implemented. That way you could store a collection as the value but bound the cache by the sum of all the collection sizes. I have seen unbounded collections used as cache values cause OutOfMemoryErrors, so having control over this is quite handy.
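A cache bounded by a pluggable weight function, rather than by entry count, might look like this Python sketch (the class and parameter names are hypothetical; the default weigher of len covers the collection-as-value case described above):

```python
from collections import OrderedDict

class WeightedCache:
    """LRU cache bounded by total 'weight' instead of entry count.
    weigher(value) returns the cost of one entry (e.g. len() for
    collection values); eviction proceeds in LRU order until the
    total fits under max_weight."""
    def __init__(self, max_weight, weigher=len):
        self.max_weight = max_weight
        self.weigher = weigher
        self.data = OrderedDict()
        self.total = 0

    def put(self, key, value):
        if key in self.data:                       # replace: drop old weight
            self.total -= self.weigher(self.data.pop(key))
        self.data[key] = value
        self.total += self.weigher(value)
        while self.total > self.max_weight and self.data:
            _, evicted = self.data.popitem(last=False)
            self.total -= self.weigher(evicted)

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)             # mark recently used
        return self.data.get(key)
```

With this design, a single oversized collection can never pin the cache above its bound; it is evicted as soon as it would exceed max_weight on its own.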

If you really need this, and I'd advise not to, we could enhance my current implementation to support this. You can email me, ben.manes-at-gmail.com.

Prevent backup reads from getting into linux page cache


  • if you are using rsync, there is the --drop-cache flag, according to this question
  • the nocache utility, which aims to

minimize the effect an application has on the Linux file system cache

Use case: backup processes that should not interfere with the present state of the cache.

  • using dd, there is direct I/O to bypass the cache, according to this question
  • dd also has a nocache option; check the command info coreutils 'dd invocation' for details

How can I show size of the filesystem cache that has to be synced?

/proc/meminfo can tell you this information:

grep -e Dirty -e Writeback /proc/meminfo

According to kernel documentation,

Dirty
Memory which is waiting to get written back to the disk
Writeback
Memory which is actively being written back to the disk

I don't think there is a way to determine how much Dirty or Writeback memory belongs to a specific device, though.
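If you want these numbers programmatically rather than by eyeballing grep output, /proc/meminfo is easy to parse. A small Python sketch (the function names are hypothetical; the sample values are made up for illustration):

```python
import re

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: kB} dict."""
    values = {}
    for line in text.splitlines():
        m = re.match(r"(\w+):\s+(\d+)\s*kB", line)
        if m:
            values[m.group(1)] = int(m.group(2))
    return values

# Made-up sample showing the two fields of interest:
sample = """Dirty:               204 kB
Writeback:             0 kB
"""

def dirty_writeback(path="/proc/meminfo"):
    """Return (Dirty, Writeback) in kB from a live Linux system."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info.get("Dirty"), info.get("Writeback")
```

Polling dirty_writeback() in a loop (e.g. before and during a sync) shows Dirty draining into Writeback and then to zero.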


