How to implement a Least Frequently Used (LFU) cache?
Least Frequently Used (LFU) is a type of cache algorithm used to manage memory within a computer. The standard approach is for the system to keep track of the number of times each block in memory is referenced. When the cache is full and needs more room, the system purges the item with the lowest reference frequency.
What would be the best way to implement a most-recently-used cache of objects, say in Java?
I've already implemented one using LinkedHashMap (by maintaining the number of times each object is accessed), but I'm curious whether any of the new concurrent collections would be better candidates.
Consider this case: suppose the cache is full and we need to make space for a new object. Say two objects in the cache have each been accessed only once. Which one should be removed if we learn that another object (one that is not in the cache) is being accessed more than once?
Thanks!
In my opinion, the best way to implement a most-recently-used cache of objects is to add a new variable, 'latestTS', to each object. TS stands for timestamp.
// A static method that returns the current date and time as milliseconds since January 1st 1970
long latestTS = System.currentTimeMillis();
The Java concurrent collections do not include a ConcurrentLinkedHashMap. (Ref: Java Concurrent Collection API). However, you can try combining a ConcurrentHashMap with a doubly linked list.
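As a rough illustration of where ConcurrentHashMap can help, the sketch below only counts accesses per key in a thread-safe way. The class and method names (AccessCounter, recordAccess, frequency) are made up for this example, and it deliberately leaves out eviction, which would need additional coordination.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: thread-safe access counting only, no eviction.
final class AccessCounter {
    private final ConcurrentHashMap<Integer, Integer> counts = new ConcurrentHashMap<>();

    // ConcurrentHashMap.merge is atomic per key, so concurrent increments are not lost.
    int recordAccess(int key) {
        return counts.merge(key, 1, Integer::sum);
    }

    // Returns 0 for keys that have never been accessed.
    int frequency(int key) {
        return counts.getOrDefault(key, 0);
    }
}
```

Choosing and removing the least frequently used key still requires a scan or a second structure, so this covers only the counting half of the problem.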
About the case you describe: with the latestTS variable in place, you can break the tie by its value, removing the entry with the oldest latestTS and then adding the new object. (Don't forget to update the frequency and latestTS of the newly added object.)
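A minimal sketch of that tie-break, assuming the victim is the entry with the lowest frequency and, among equals, the oldest latestTS. Apart from latestTS, every name here (TimestampedLfu, Slot, touch, victim) is hypothetical, and the timestamp is passed in as a parameter so the behaviour is easy to follow; in practice it would come from System.currentTimeMillis().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: picks an eviction victim by frequency, then by latestTS.
final class TimestampedLfu {
    static final class Slot {
        final String data;
        int frequency;   // number of accesses so far
        long latestTS;   // time of the most recent insert or access
        Slot(String data, long now) { this.data = data; this.latestTS = now; }
    }

    private final Map<Integer, Slot> slots = new HashMap<>();

    void put(int key, String data, long now) {
        slots.put(key, new Slot(data, now));
    }

    // Records an access: bumps the frequency and refreshes the timestamp.
    void touch(int key, long now) {
        Slot s = slots.get(key);
        if (s != null) {
            s.frequency++;
            s.latestTS = now;
        }
    }

    // Returns the key to evict, or null if the cache is empty.
    Integer victim() {
        Integer bestKey = null;
        Slot best = null;
        for (Map.Entry<Integer, Slot> e : slots.entrySet()) {
            Slot s = e.getValue();
            boolean lessUsed = best != null && s.frequency < best.frequency;
            boolean sameButOlder = best != null
                    && s.frequency == best.frequency && s.latestTS < best.latestTS;
            if (best == null || lessUsed || sameButOlder) {
                bestKey = e.getKey();
                best = s;
            }
        }
        return bestKey;
    }
}
```

Millisecond timestamps can collide under load, so a monotonically increasing counter may be a more reliable tie-breaker than the wall clock.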
As you mentioned, you can use LinkedHashMap, since it gives element access in O(1) and also lets you traverse the entries in order. The code below implements an LFU cache. (PS: the code below answers the question in the title, i.e. "How to implement LFU cache".)
import java.util.LinkedHashMap;
import java.util.Map;

public class LFUCache {

    // One cached value together with its access count.
    private static class CacheEntry {
        private String data;
        private int frequency;

        private CacheEntry(String data) {
            this.data = data;
            this.frequency = 0;
        }
    }

    // Instance fields (not static), so that separate caches do not share state.
    private final int capacity;
    /* LinkedHashMap is used because it has features of both HashMap and LinkedList.
     * Thus, we can get an entry in O(1) and also, we can iterate over it easily.
     * */
    private final Map<Integer, CacheEntry> cacheMap = new LinkedHashMap<>();

    public LFUCache(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    public void addCacheEntry(int key, String data) {
        CacheEntry existing = cacheMap.get(key);
        if (existing != null) {
            // Key already cached: replace the data, keep its frequency, evict nothing.
            existing.data = data;
            return;
        }
        if (isFull()) {
            cacheMap.remove(getLFUKey());
        }
        cacheMap.put(key, new CacheEntry(data));
    }

    // Linear scan, O(n). On ties, the entry inserted earliest is chosen.
    public int getLFUKey() {
        int key = 0;
        int minFreq = Integer.MAX_VALUE;
        for (Map.Entry<Integer, CacheEntry> entry : cacheMap.entrySet()) {
            if (minFreq > entry.getValue().frequency) {
                key = entry.getKey();
                minFreq = entry.getValue().frequency;
            }
        }
        return key;
    }

    public String getCacheEntry(int key) {
        CacheEntry entry = cacheMap.get(key);
        if (entry == null) {
            return null; // cache miss
        }
        entry.frequency++; // cache hit
        return entry.data;
    }

    public boolean isFull() {
        return cacheMap.size() >= capacity;
    }
}
You might benefit from the LFU implementation of ActiveMQ: LFUCache
They have provided some good functionality.
I think the LFU data structure must combine a priority queue (for fast access to the LFU item) and a hash map (for fast access to any item by its key). I would suggest the following node definition for each object stored in the cache:
class Node<T> {
    // access key
    private int key;
    // counter of accesses
    private int numAccesses;
    // current position in pq
    private int currentPos;
    // item itself
    private T item;
    // getters, setters, constructors go here
}
You need key for referring to an item. You need numAccesses as the key for the priority queue. You need currentPos to be able to quickly find an item's pq position by key. Now organize a hash map (key (Integer) -> node (Node<T>)) for quick access to items, and a min-heap-based priority queue that uses the number of accesses as the priority. With these you can perform all operations quickly (access, add a new item, update the number of accesses, remove the LFU item). You need to write each operation carefully so that it keeps all the nodes consistent (their number of accesses, their position in the pq, and their existence in the hash map). Lookup by key takes constant time on average; the operations that reorder the heap (updating a count, inserting, removing the LFU item) take O(log n), which is usually acceptable for a cache.
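The description above might be sketched roughly as follows. This is one possible reading, not a definitive implementation: the class name HeapLfuCache, the hand-written heap over an ArrayList, and the helper names (place, siftUp, siftDown, evict) are assumptions made for the example, and ties between equal access counts are broken arbitrarily.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: hash map for lookup by key plus an indexed min-heap on numAccesses.
final class HeapLfuCache<T> {
    private static final class Node<T> {
        final int key;
        T item;
        int numAccesses;
        int currentPos; // index of this node inside the heap list
        Node(int key, T item) { this.key = key; this.item = item; }
    }

    private final int capacity;
    private final Map<Integer, Node<T>> byKey = new HashMap<>();
    private final List<Node<T>> heap = new ArrayList<>();

    HeapLfuCache(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    // Returns the item or null on a miss; a hit bumps the access count.
    T get(int key) {
        Node<T> n = byKey.get(key);
        if (n == null) return null;
        n.numAccesses++;
        siftDown(n.currentPos); // the count grew, so the node can only move down
        return n.item;
    }

    void put(int key, T item) {
        Node<T> n = byKey.get(key);
        if (n != null) { n.item = item; return; } // update in place, keep the count
        if (heap.size() == capacity) evict();
        n = new Node<>(key, item);
        n.currentPos = heap.size();
        heap.add(n);
        byKey.put(key, n);
        siftUp(n.currentPos);
    }

    // Removes the heap root, which has the smallest access count.
    private void evict() {
        Node<T> min = heap.get(0);
        Node<T> last = heap.remove(heap.size() - 1);
        byKey.remove(min.key);
        if (min != last) {
            place(0, last);
            siftDown(0);
        }
    }

    // Stores a node at a heap index and records that index on the node.
    private void place(int pos, Node<T> n) {
        heap.set(pos, n);
        n.currentPos = pos;
    }

    private void siftUp(int pos) {
        Node<T> n = heap.get(pos);
        while (pos > 0) {
            int parent = (pos - 1) / 2;
            Node<T> p = heap.get(parent);
            if (p.numAccesses <= n.numAccesses) break;
            place(pos, p);
            pos = parent;
        }
        place(pos, n);
    }

    private void siftDown(int pos) {
        Node<T> n = heap.get(pos);
        int size = heap.size();
        while (2 * pos + 1 < size) {
            int child = 2 * pos + 1;
            if (child + 1 < size
                    && heap.get(child + 1).numAccesses < heap.get(child).numAccesses) {
                child++;
            }
            Node<T> c = heap.get(child);
            if (c.numAccesses >= n.numAccesses) break;
            place(pos, c);
            pos = child;
        }
        place(pos, n);
    }
}
```

The heap is written by hand because java.util.PriorityQueue does not expose element positions, so it cannot use currentPos to re-order a node after its count changes without a linear-time remove.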