How Would You Implement an LRU Cache in Java

How would you implement an LRU cache in Java?

If I were doing this again from scratch today, I'd use Guava's CacheBuilder.

LRU cache in Java with Generics and O(1) operations

From the question itself, we can see that O(n) operations arise when searching the linked list. We therefore need an alternative arrangement: a way to go from the HashMap straight to an item and update its last-access time without searching.

We can keep two separate data structures: a HashMap with (key, pointer) pairs, and a doubly linked list that works as the priority queue for deletion and stores the values. From the HashMap we can point directly to an element in the doubly linked list and update its retrieval time. Because we go straight from the HashMap to the item in the list, our time complexity remains O(1).

For example, our doubly linked list can look like:

least_recently_used  -> A <-> B <-> C <-> D <-> E <- most_recently_used

We need to keep pointers to the LRU and MRU items. The entries' values are stored in the list, and querying the HashMap gives us a pointer into it. On get(), we move the item to the right-most side of the list. On put(key, value), if the cache is full, we remove the item at the left-most side of the list from both the list and the HashMap.

The following is an example implementation in Java:

import java.util.HashMap;

public class LRUCache<K, V> {

    // A doubly linked list node with pointers to the previous and next items
    // and a key/value pair
    private static class Node<T, U> {
        Node<T, U> previous;
        Node<T, U> next;
        T key;
        U value;

        Node(Node<T, U> previous, Node<T, U> next, T key, U value) {
            this.previous = previous;
            this.next = next;
            this.key = key;
            this.value = value;
        }
    }

    private final HashMap<K, Node<K, V>> cache;
    private Node<K, V> leastRecentlyUsed;
    private Node<K, V> mostRecentlyUsed;
    private final int maxSize;
    private int currentSize;

    public LRUCache(int maxSize) {
        this.maxSize = maxSize;
        this.currentSize = 0;
        // Sentinel node until the first entry is added
        leastRecentlyUsed = new Node<K, V>(null, null, null, null);
        mostRecentlyUsed = leastRecentlyUsed;
        cache = new HashMap<K, Node<K, V>>();
    }

    public V get(K key) {
        Node<K, V> tempNode = cache.get(key);
        if (tempNode == null) {
            return null;
        }
        // If it is already the MRU, leave the list as it is.
        // (Compare the nodes themselves, not the keys; == on generic keys
        // is unreliable.)
        if (tempNode == mostRecentlyUsed) {
            return mostRecentlyUsed.value;
        }

        // Get the next and previous nodes
        Node<K, V> nextNode = tempNode.next;
        Node<K, V> previousNode = tempNode.previous;

        // If at the left-most end, update the LRU pointer
        if (tempNode == leastRecentlyUsed) {
            nextNode.previous = null;
            leastRecentlyUsed = nextNode;
        } else {
            // In the middle: splice the node out by relinking its neighbours
            previousNode.next = nextNode;
            nextNode.previous = previousNode;
        }

        // Finally move our item to the MRU position
        tempNode.previous = mostRecentlyUsed;
        mostRecentlyUsed.next = tempNode;
        mostRecentlyUsed = tempNode;
        mostRecentlyUsed.next = null;

        return tempNode.value;
    }

    public void put(K key, V value) {
        if (cache.containsKey(key)) {
            // Refresh the existing entry and mark it as most recently used
            cache.get(key).value = value;
            get(key);
            return;
        }

        // Put the new node at the right-most end of the linked list
        Node<K, V> myNode = new Node<K, V>(mostRecentlyUsed, null, key, value);
        mostRecentlyUsed.next = myNode;
        cache.put(key, myNode);
        mostRecentlyUsed = myNode;

        // If full, delete the left-most entry and update the LRU pointer
        if (currentSize == maxSize) {
            cache.remove(leastRecentlyUsed.key);
            leastRecentlyUsed = leastRecentlyUsed.next;
            leastRecentlyUsed.previous = null;
        } else {
            // For the first added entry, update the LRU pointer
            if (currentSize == 0) {
                leastRecentlyUsed = myNode;
            }
            currentSize++;
        }
    }
}

Segmented LRU Cache in Java

In Java, LRU caches are usually implemented with LinkedHashMap, a hash map that maintains an internal ordering among its entries.

For a segmented LRU cache, you'll need to make a cache class that includes two of these maps.

For the protected map, use the constructor with accessOrder = true, so that every time an entry is accessed it moves to the end of the internal ordering.

You should create subclasses that override the removeEldestEntry method to automatically expire entries, moving them from the protected to the probationary segment when necessary.

Moving cache hits from the probationary map to the protected map will have to be done by your cache class.
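The pieces above can be put together roughly as follows. This is a minimal sketch, assuming a two-segment design where new entries land in a probationary segment and hits promote them to a protected segment; the class name, field names, and segment sizes are all illustrative, not from any library:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SegmentedLruCache<K, V> {

    private final Map<K, V> probationary;
    private final Map<K, V> protectedSegment;

    public SegmentedLruCache(final int probationarySize, final int protectedSize) {
        // Probationary segment: insertion order; entries evicted from here
        // leave the cache entirely
        this.probationary = new LinkedHashMap<K, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > probationarySize;
            }
        };
        // Protected segment: accessOrder = true, so get() refreshes recency;
        // overflow is demoted back to probationary instead of being dropped
        this.protectedSegment = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > protectedSize) {
                    probationary.put(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    public void put(K key, V value) {
        // New entries always start in the probationary segment
        probationary.put(key, value);
    }

    public V get(K key) {
        V hit = protectedSegment.get(key); // also refreshes the LRU order
        if (hit != null) {
            return hit;
        }
        // A hit in probationary promotes the entry to the protected segment
        V promoted = probationary.remove(key);
        if (promoted != null) {
            protectedSegment.put(key, promoted);
        }
        return promoted;
    }
}
```

Note that the cache class, not the maps, performs the probationary-to-protected promotion, while demotion in the other direction happens automatically inside removeEldestEntry.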

Best way to implement LRU cache

If you want an LRU cache, the simplest option in Java is LinkedHashMap. Its default behaviour is insertion order (FIFO), but you can change it to "access order", which makes it an LRU cache.

public static <K, V> Map<K, V> lruCache(final int maxSize) {
    return new LinkedHashMap<K, V>(maxSize * 4 / 3, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxSize;
        }
    };
}

Note: I am using the three-argument constructor, which changes the ordering from insertion order to access order.

From the Javadoc

public LinkedHashMap(int initialCapacity,
                     float loadFactor,
                     boolean accessOrder)
Constructs an empty LinkedHashMap instance with the specified initial capacity, load factor and ordering mode.
Parameters:
initialCapacity - the initial capacity
loadFactor - the load factor
accessOrder - the ordering mode - true for access-order, false for insertion-order

When accessOrder is true, the LinkedHashMap re-arranges its internal order whenever you get() an entry that is not the last one.

This way the eldest entry is the least recently used one.
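A short demonstration of that eviction behaviour, repeating the lruCache helper above so the snippet compiles on its own (the demo class name is illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruDemo {

    // Same helper as above, repeated so this snippet is self-contained
    public static <K, V> Map<K, V> lruCache(final int maxSize) {
        return new LinkedHashMap<K, V>(maxSize * 4 / 3, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = lruCache(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touching "a" makes "b" the eldest entry
        cache.put("c", 3); // evicts "b", the least recently used
        System.out.println(cache); // {a=1, c=3}
    }
}
```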

The best way to implement LRU cache

I don't know if this is beneficial, but if you can replace your LinkedHashMap with a ConcurrentHashMap then you'll improve your throughput: ConcurrentHashMap uses fine-grained locking to permit multiple simultaneous readers and writers. It is also thread-safe, so you won't need to synchronize your readers and writers yourself. (Note, though, that ConcurrentHashMap by itself does not track access order, so you would still need a separate mechanism for LRU eviction.)

Barring that, replace your use of the synchronized keyword with a ReadWriteLock. This will allow multiple simultaneous readers.
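A sketch of the ReadWriteLock approach (the class name is hypothetical). One caveat worth noting: with accessOrder = true, get() mutates the map's internal ordering, so get() must also take the write lock; only operations that never reorder entries can share the read lock.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RwLockLruCache<K, V> {

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<K, V> map;

    public RwLockLruCache(final int maxSize) {
        // accessOrder = true: get() refreshes recency
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    public V get(K key) {
        lock.writeLock().lock(); // an access-order get() reorders entries
        try {
            return map.get(key);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public void put(K key, V value) {
        lock.writeLock().lock();
        try {
            map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public int size() {
        lock.readLock().lock(); // size() does not reorder, so readers can share
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```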

Easy, simple to use LRU cache in java

You can use a LinkedHashMap (Java 1.4+):

// Create cache
final int MAX_ENTRIES = 100;
Map<String, Object> cache = new LinkedHashMap<String, Object>(MAX_ENTRIES + 1, .75F, true) {
    // This method is called just after a new entry has been added
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
        return size() > MAX_ENTRIES;
    }
};

// Add to cache
String key = "key";
cache.put(key, object);

// Get object
Object o = cache.get(key);
if (o == null && !cache.containsKey(key)) {
    // Object is not in the cache. If null is not a possible value in the
    // cache, the call to cache.containsKey(key) is not needed
}

// If the cache is to be used by multiple threads,
// the cache must be wrapped to synchronize the methods
cache = Collections.synchronizedMap(cache);

Implementing an LRU cache with built in data structures

You can use Java's LinkedHashMap and override removeEldestEntry to implement an LRU cache:

import java.util.LinkedHashMap;
import java.util.Map;

public class SimpleLru<K, V> extends LinkedHashMap<K, V> {

    final int cacheSize;

    public SimpleLru(int cacheSize) {
        super(16, 0.75f, true); // accessOrder = true, so get() refreshes recency
        this.cacheSize = cacheSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return this.size() > this.cacheSize;
    }

    public static void main(String[] args) {
        SimpleLru<String, Integer> m = new SimpleLru<>(2); // max 2 entries
        m.put("k1", 1); // k1:1
        m.put("k2", 2); // k1:1, k2:2
        m.put("k3", 3); // k2:2, k3:3 ("k1" evicted)
    }
}

If you want to have a thread-safe version, you can use:

import java.util.LinkedHashMap;
import java.util.Map;

public class ConcurrentLru<K, V> {

    final Object mutex = new Object();
    final Map<K, V> cache;

    public ConcurrentLru(final int cacheSize) {
        // accessOrder = true so that get() refreshes recency (a true LRU)
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return this.size() > cacheSize;
            }
        };
    }

    public void put(K k, V v) {
        synchronized (this.mutex) { this.cache.put(k, v); }
    }

    public boolean contains(K k) {
        synchronized (this.mutex) { return this.cache.containsKey(k); }
    }

    public V remove(K k) {
        synchronized (this.mutex) { return this.cache.remove(k); }
    }

    public V get(K k) {
        synchronized (this.mutex) { return this.cache.get(k); }
    }
}

Concurrent LRU cache implementation

The best you can do to make it thread-safe is to wrap it with Collections.synchronizedMap(map), as explained in the javadoc:

Note that this implementation is not synchronized. If multiple threads
access a linked hash map concurrently, and at least one of the threads
modifies the map structurally, it must be synchronized externally.
This is typically accomplished by synchronizing on some object that
naturally encapsulates the map. If no such object exists, the map
should be "wrapped" using the Collections.synchronizedMap method. This
is best done at creation time, to prevent accidental unsynchronized
access to the map:

Map m = Collections.synchronizedMap(new LinkedHashMap(...));

However, that is not enough to make it fully thread-safe: you still need to protect any iteration over the content of the map by synchronizing on the wrapped map instance:

Map m = Collections.synchronizedMap(map);
...
Set s = m.keySet(); // Needn't be in synchronized block
...
synchronized (m) { // Synchronizing on m, not s!
Iterator i = s.iterator(); // Must be in synchronized block
while (i.hasNext())
foo(i.next());
}

This is pretty much all you can easily do with what the JDK provides out of the box. If you want something thread-safe and more efficient, you should look at Cache from Google Guava.

Here is an example of an LRU cache with a max size of 2 built with Guava:

ConcurrentMap<String, String> cache =
        CacheBuilder.newBuilder()
                .maximumSize(2L)
                .<String, String>build().asMap();
cache.put("a", "b");
cache.put("b", "c");
System.out.println(cache);
cache.put("a", "d");
System.out.println(cache);
cache.put("c", "d");
System.out.println(cache);

Output:

{b=c, a=b}
{b=c, a=d}
{c=d, a=d}

