1. Overview
Android provides the LruCache class, which makes it convenient to implement an LRU cache. Java provides LinkedHashMap, which makes the LRU algorithm easy to implement: a Java LRU class (such as an LRULinkedHashMap) can directly extend LinkedHashMap and implement the algorithm with very few changes.
2. Java's LRU algorithm
The basis of Java's LRU algorithm is LinkedHashMap. LinkedHashMap inherits from HashMap and makes a few changes on top of it in order to support the LRU algorithm.
1、HashMap
First of all, HashMap stores each node's information in an Entry<K,V> structure. Entry<K,V> holds the node's key, value, and hash, and also a reference to the next node, so Entry<K,V> nodes form a singly linked list. HashMap's storage structure is therefore an array plus singly linked lists: each key's hashCode maps to a position in the array, and if several keys map to the same position, they share that slot. In that case HashMap puts each key's information into an Entry<K,V> and chains these Entry<K,V> objects together with the linked list.
static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;
    int hash;

    /**
     * Creates new entry.
     */
    Entry(int h, K k, V v, Entry<K,V> n) {
        value = v;
        next = n;
        key = k;
        hash = h;
    }

    public final K getKey() {
        return key;
    }

    public final V getValue() {
        return value;
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (!(o instanceof Map.Entry))
            return false;
        Map.Entry e = (Map.Entry)o;
        Object k1 = getKey();
        Object k2 = e.getKey();
        if (k1 == k2 || (k1 != null && k1.equals(k2))) {
            Object v1 = getValue();
            Object v2 = e.getValue();
            if (v1 == v2 || (v1 != null && v1.equals(v2)))
                return true;
        }
        return false;
    }

    public final int hashCode() {
        return Objects.hashCode(getKey()) ^ Objects.hashCode(getValue());
    }

    public final String toString() {
        return getKey() + "=" + getValue();
    }

    /**
     * This method is invoked whenever the value in an entry is
     * overwritten by an invocation of put(k,v) for a key k that's already
     * in the HashMap.
     */
    void recordAccess(HashMap<K,V> m) {
    }

    /**
     * This method is invoked whenever the entry is
     * removed from the table.
     */
    void recordRemoval(HashMap<K,V> m) {
    }
}
The following is the code of HashMap's put method, together with some analysis.
public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    // The cases above are special; the normal insertion logic starts here.
    // First compute the hash code of the key
    int hash = hash(key);
    // Use the hash code to locate the index in the table array
    int i = indexFor(hash, table.length);
    // Loop over the bucket's linked list to see whether an entry with exactly the
    // same key already exists. If it does, the caller wants to replace that key's
    // value, so replace it and return the old value directly.
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    // Reaching this point means no entry with an identical key exists in the
    // HashMap; call addEntry to create a new node holding the key and value
    // and add it to the HashMap.
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
Some comments have been added to the code above to give an overall picture. Below, a few points worth analyzing are explained in detail.
<1> int i = indexFor(hash, table.length);
You can check the source code:
static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length-1);
}
Why is the computed hashCode (h) combined with (length - 1) using a bitwise AND? This clears the high bits of h. If the array size is 8 (1000 in binary) and the computed h is 10 (1010 in binary), using 10 directly as an array index would go out of bounds. The bitwise AND (0111 & 1010) clears the high bit and yields 2 (0010), i.e., the element at index 2 of the array. The effect is the same as taking the remainder, but the bit operation is significantly more efficient.
However, there is a catch. If the length were 9, then length - 1 would be 8 (1000 in binary). The bitwise AND would then fail to clear the high bits and would produce wrong, badly distributed indices. So the array size must be special: looking at the source code shows that HashMap always guarantees the array length is a power of 2.
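To make the two paragraphs above concrete, here is a minimal, self-contained check; the class name IndexForDemo is mine, and the values 8, 9 and the hash 10 are just the examples used above, not anything taken from the HashMap source:

public class IndexForDemo {
    // Same bit trick as HashMap's indexFor: h & (length - 1)
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int h = 10; // binary 1010, the example hash used above

        // length = 8 (a power of two): the bit trick matches the remainder
        System.out.println(indexFor(h, 8));  // 2
        System.out.println(h % 8);           // 2

        // length = 9 (not a power of two): 9 - 1 = 8 = 1000b, so the AND keeps
        // only one bit and no longer matches the remainder
        System.out.println(indexFor(h, 9));  // 8
        System.out.println(h % 9);           // 1
    }
}

HashMap enforces the power-of-two length in two places, shown next.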
First, put calls the inflateTable method. The focus is the roundUpToPowerOf2 method: although its body is a series of bit manipulations that are not analyzed here, its name and comment make its purpose clear, and it guarantees that the array length is a power of 2.
private void inflateTable(int toSize) {
    // Find a power of 2 >= toSize
    int capacity = roundUpToPowerOf2(toSize);

    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    table = new Entry[capacity];
    initHashSeedAsNeeded(capacity);
}
Secondly, in other places such as addEntry, expressions like resize(2 * table.length) or a left shift by one (<< 1) are used, which also keep the array length a power of 2.
<2> for (Entry<K,V> e = table[i]; e != null; e = e.next)
Because HashMap uses an array plus linked lists, what you get after locating the array position via the hashCode is not a single Entry<K,V> but an Entry<K,V> linked list. You must loop over the list to find the value corresponding to the key.
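As a quick illustration of why this loop is needed (this snippet is not HashMap source code; it only relies on the fact that the strings "Aa" and "BB" happen to have the same hashCode):

import java.util.HashMap;

public class CollisionDemo {
    public static void main(String[] args) {
        // "Aa" and "BB" hash to the same value (2112), so in a HashMap they
        // land in the same bucket and end up on the same Entry linked list.
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112

        HashMap<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);

        // Both values are still retrievable: the for loop walks the bucket's
        // list and uses equals() to pick the right entry.
        System.out.println(map.get("Aa")); // 1
        System.out.println(map.get("BB")); // 2
    }
}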
<3> addEntry(hash, key, value, i);
First, it checks whether the number of elements exceeds the threshold; if so, the array is enlarged. Then a new Entry is created and added to the array.
/**
 * Adds a new entry with the specified key, value and hash code to
 * the specified bucket.  It is the responsibility of this
 * method to resize the table if appropriate.
 *
 * Subclass overrides this to alter the behavior of put method.
 */
void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }

    createEntry(hash, key, value, bucketIndex);
}

/**
 * Like addEntry except that this version is used when creating entries
 * as part of Map construction or "pseudo-construction" (cloning,
 * deserialization).  This version needn't worry about resizing the table.
 *
 * Subclass overrides this to alter the behavior of HashMap(Map),
 * clone, and readObject.
 */
void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}
2、LinkedHashMap
LinkedHashMap is built on top of HashMap with a few modifications. First, the Entry is changed from a singly linked node into a doubly linked one: each Entry gains before and after references.
private static class Entry<K,V> extends HashMap.Entry<K,V> {
    // These fields comprise the doubly linked list used for iteration.
    Entry<K,V> before, after;

    Entry(int hash, K key, V value, HashMap.Entry<K,V> next) {
        super(hash, key, value, next);
    }

    /**
     * Removes this entry from the linked list.
     */
    private void remove() {
        before.after = after;
        after.before = before;
    }

    /**
     * Inserts this entry before the specified existing entry in the list.
     */
    private void addBefore(Entry<K,V> existingEntry) {
        after  = existingEntry;
        before = existingEntry.before;
        before.after = this;
        after.before = this;
    }

    /**
     * This method is invoked by the superclass whenever the value
     * of a pre-existing entry is read by Map.get or modified by Map.set.
     * If the enclosing Map is access-ordered, it moves the entry
     * to the end of the list; otherwise, it does nothing.
     */
    void recordAccess(HashMap<K,V> m) {
        LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
        if (lm.accessOrder) {
            lm.modCount++;
            remove();
            addBefore(lm.header);
        }
    }

    void recordRemoval(HashMap<K,V> m) {
        remove();
    }
}
In addition, LinkedHashMap keeps a reference named header (private transient Entry<K,V> header). The header always marks both the head (header.after) and the tail (header.before) of all entries in the map. LinkedHashMap keeps HashMap's array-plus-linked-list storage as is, and adds on top of it a doubly linked list that starts at header (we will call it the header doubly linked list for short). LinkedHashMap implements the LRU algorithm through this header list: header.after always points to the least recently used node, which is the one removed when an eviction is needed, while header.before points to the node that was just used.
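The effect of the header doubly linked list can be observed from the outside through the iteration order of a LinkedHashMap constructed with accessOrder set to true. The following is just a small demonstration, not part of the source being analyzed:

import java.util.LinkedHashMap;

public class AccessOrderDemo {
    public static void main(String[] args) {
        // initialCapacity 16, loadFactor 0.75f, accessOrder true
        LinkedHashMap<Integer, String> map =
                new LinkedHashMap<>(16, 0.75f, true);
        map.put(1, "a");
        map.put(2, "b");
        map.put(3, "c");

        // Accessing key 1 moves it to the tail of the header list
        map.get(1);

        // Iteration runs from header.after (eldest) to header.before (newest):
        // prints {2=b, 3=c, 1=a}
        System.out.println(map);
    }
}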
LinkedHashMap does not override the put method; instead it overrides the addEntry and createEntry methods, as follows:
/**
 * This override alters behavior of superclass put method. It causes newly
 * allocated entry to get inserted at the end of the linked list and
 * removes the eldest entry if appropriate.
 */
void addEntry(int hash, K key, V value, int bucketIndex) {
    super.addEntry(hash, key, value, bucketIndex);

    // Remove eldest entry if instructed
    Entry<K,V> eldest = header.after;
    if (removeEldestEntry(eldest)) {
        removeEntryForKey(eldest.key);
    }
}

/**
 * This override differs from addEntry in that it doesn't resize the
 * table or remove the eldest entry.
 */
void createEntry(int hash, K key, V value, int bucketIndex) {
    HashMap.Entry<K,V> old = table[bucketIndex];
    Entry<K,V> e = new Entry<>(hash, key, value, old);
    table[bucketIndex] = e;
    e.addBefore(header);
    size++;
}
HashMap's put method calls addEntry, and HashMap's addEntry calls createEntry. The two overridden methods above can therefore be merged with the corresponding HashMap code for easier analysis, giving the following method:
void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }

    HashMap.Entry<K,V> old = table[bucketIndex];
    Entry<K,V> e = new Entry<>(hash, key, value, old);
    table[bucketIndex] = e;
    e.addBefore(header);
    size++;

    // Remove eldest entry if instructed
    Entry<K,V> eldest = header.after;
    if (removeEldestEntry(eldest)) {
        removeEntryForKey(eldest.key);
    }
}
As before, it first checks whether the threshold is exceeded and, if so, enlarges the array. It then creates an Entry object and adds it to HashMap's array and bucket list. Unlike HashMap, LinkedHashMap adds two operations: e.addBefore(header) and removeEntryForKey(eldest.key).
First, let's analyze e.addBefore(header), where e is the newly created Entry object. The addBefore code is as follows; it links the new entry into the header list so that the entry ends up at the tail of the header doubly linked list (header.before):
private void addBefore(Entry<K,V> existingEntry) {
    after  = existingEntry;
    before = existingEntry.before;
    before.after = this;
    after.before = this;
}
Next is the other operation; the code is as follows:
// Remove eldest entry if instructed
Entry<K,V> eldest = header.after;
if (removeEldestEntry(eldest)) {
    removeEntryForKey(eldest.key);
}
Here, removeEldestEntry decides whether the least recently used node should be deleted. LinkedHashMap's removeEldestEntry(eldest) always returns false. To implement the LRU algorithm we must override this method and decide under which conditions the least recently used node is removed. removeEntryForKey then deletes the node for that key from HashMap's array and linked-list structure. Its source code is as follows:
final Entry<K,V> removeEntryForKey(Object key) {
    if (size == 0) {
        return null;
    }
    int hash = (key == null) ? 0 : hash(key);
    int i = indexFor(hash, table.length);
    Entry<K,V> prev = table[i];
    Entry<K,V> e = prev;

    while (e != null) {
        Entry<K,V> next = e.next;
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k)))) {
            modCount++;
            size--;
            if (prev == e)
                table[i] = next;
            else
                prev.next = next;
            e.recordRemoval(this);
            return e;
        }
        prev = e;
        e = next;
    }

    return e;
}
removeEntryForKey is a HashMap method; it knows nothing about the header doubly linked list in LinkedHashMap, and LinkedHashMap does not override it. So how is the header doubly linked list kept consistent?
Looking more closely at the code, after the node has been removed from the HashMap the method e.recordRemoval(this) is called. This method is empty in HashMap, but LinkedHashMap's Entry implements it. The two lines in its remove() method are the standard way to unlink the current node from a doubly linked list and need no further explanation.
/**
 * Removes this entry from the linked list.
 */
private void remove() {
    before.after = after;
    after.before = before;
}

void recordRemoval(HashMap<K,V> m) {
    remove();
}
With this, the analysis of how LinkedHashMap adds nodes is complete, and you can see that a newly added node is placed neatly at the tail of the header doubly linked list.
However, on its own this is a first-in-first-out policy, not least-recently-used. On get, the header doubly linked list must also be updated so that the node that was just read moves to its tail. Let's look at the source code of get:
public V get(Object key) {
    Entry<K,V> e = (Entry<K,V>)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
The code is very short. The getEntry call on the first line goes to HashMap's getEntry method and needs no explanation. The code that actually maintains the header doubly linked list is e.recordAccess(this). Take a look at the code:
/**
 * Removes this entry from the linked list.
 */
private void remove() {
    before.after = after;
    after.before = before;
}

/**
 * Inserts this entry before the specified existing entry in the list.
 */
private void addBefore(Entry<K,V> existingEntry) {
    after  = existingEntry;
    before = existingEntry.before;
    before.after = this;
    after.before = this;
}

/**
 * This method is invoked by the superclass whenever the value
 * of a pre-existing entry is read by Map.get or modified by Map.set.
 * If the enclosing Map is access-ordered, it moves the entry
 * to the end of the list; otherwise, it does nothing.
 */
void recordAccess(HashMap<K,V> m) {
    LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
    if (lm.accessOrder) {
        lm.modCount++;
        remove();
        addBefore(lm.header);
    }
}
It first removes the current node from the header doubly linked list and then re-adds it at the tail. Of course, the LinkedHashMap must be constructed with accessOrder set to true; otherwise recordAccess does nothing and the result is a FIFO order.
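Putting the two pieces together (accessOrder = true plus an overridden removeEldestEntry), a minimal LRU cache on top of LinkedHashMap could look like the sketch below; the class name LruMap and the fixed initial capacity are my own choices for illustration:

import java.util.LinkedHashMap;
import java.util.Map;

public class LruMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruMap(int maxEntries) {
        // accessOrder = true so that get() reorders the header list
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called from addEntry after each put; returning true makes
        // LinkedHashMap remove header.after, the least recently used entry.
        return size() > maxEntries;
    }
}

With maxEntries = 2, putting keys 1 and 2, reading key 1 and then putting key 3 evicts key 2, because key 2 is the entry sitting right after header at that moment.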
3. Android's LRU algorithm
Android also ships its own HashMap and LinkedHashMap implementations; the overall ideas are similar, but the details differ noticeably. Moreover, although the LruCache provided by Android is built on LinkedHashMap, its approach is different: in Java you override removeEldestEntry to decide whether a node should be deleted, whereas in Android you override LruCache's sizeOf to report the size of the current node. Based on that size, Android decides whether the limit has been exceeded and calls trimToSize to evict the excess nodes.
Android's sizeOf returns 1 by default, so by default the cache simply checks whether the number of entries exceeds the configured maximum. You can override sizeOf to return the actual size of an entry. safeSizeOf wraps sizeOf, and the methods that track the limit call safeSizeOf, adding and subtracting entry sizes and comparing the total against the threshold to decide whether nodes need to be evicted.
Java's removeEldestEntry can achieve the same effect, but Java leaves the entire decision to the user, so the two designs still differ somewhat. A sketch of the Android approach follows.
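A typical Android-side sketch only overrides sizeOf and measures the cache in bytes instead of entry count; the 4 MB budget and the class name BitmapCache below are arbitrary example choices:

import android.graphics.Bitmap;
import android.util.LruCache;

public class BitmapCache extends LruCache<String, Bitmap> {

    public BitmapCache() {
        // maxSize is interpreted in whatever unit sizeOf returns; here, bytes.
        super(4 * 1024 * 1024);
    }

    @Override
    protected int sizeOf(String key, Bitmap value) {
        // Report the real memory cost of one entry instead of the default 1.
        return value.getByteCount();
    }
}

Each put then adds safeSizeOf(key, value) to the running total, and trimToSize evicts the eldest entries until the total drops back under maxSize.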
sizeOf and safeSizeOf need no further explanation, and although put and get are not implemented exactly as in Java, the idea is the same and does not need analysis. At the end of LruCache's put method, trimToSize is called; it evicts entries until the cache is back within its size limit. Its code is as follows:
public void trimToSize(int maxSize) {
    while (true) {
        K key;
        V value;
        synchronized (this) {
            if (size < 0 || (map.isEmpty() && size != 0)) {
                throw new IllegalStateException(getClass().getName()
                        + ".sizeOf() is reporting inconsistent results!");
            }

            if (size <= maxSize) {
                break;
            }

            Map.Entry<K, V> toEvict = map.entrySet().iterator().next();
            key = toEvict.getKey();
            value = toEvict.getValue();
            map.remove(key);
            size -= safeSizeOf(key, value);
            evictionCount++;
        }

        entryRemoved(true, key, value, null);
    }
}
The key line is Map.Entry<K, V> toEvict = map.entrySet().iterator().next();. The code before it decides whether the least recently used node must be removed, and the code after it performs the actual removal; this line is what fetches the least recently used node.
The first thing to note is that Android's LinkedHashMap follows the same idea as Java's: it also maintains a doubly linked list through a header. On put and get the corresponding node is moved, so the node that has gone unused the longest always sits directly after header, while the other side of header points to the node that was just used. Therefore map.entrySet().iterator().next() is guaranteed to return the least recently used node. The step-by-step analysis below shows how this works.
First, map.entrySet(). HashMap defines this method and LinkedHashMap does not override it, so HashMap's version is the one that gets called:
public Set<Entry<K, V>> entrySet() {
    Set<Entry<K, V>> es = entrySet;
    return (es != null) ? es : (entrySet = new EntrySet());
}
The code above needs no detailed explanation: it returns an instance of the EntrySet class, creating one on first use. EntrySet is likewise defined in HashMap, not in LinkedHashMap.
private final class EntrySet extends AbstractSet<Entry<K, V>> {
    public Iterator<Entry<K, V>> iterator() {
        return newEntryIterator();
    }

    public boolean contains(Object o) {
        if (!(o instanceof Entry))
            return false;
        Entry<?, ?> e = (Entry<?, ?>) o;
        return containsMapping(e.getKey(), e.getValue());
    }

    public boolean remove(Object o) {
        if (!(o instanceof Entry))
            return false;
        Entry<?, ?> e = (Entry<?, ?>) o;
        return removeMapping(e.getKey(), e.getValue());
    }

    public int size() {
        return size;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public void clear() {
        HashMap.this.clear();
    }
}

Iterator<Entry<K, V>> newEntryIterator() {
    return new EntryIterator();
}
It is clear from the code that map.entrySet().iterator().next() boils down to newEntryIterator().next(), i.e., (new EntryIterator()).next(). LinkedHashMap overrides newEntryIterator, and its EntryIterator class is defined as follows:
private final class EntryIterator
        extends LinkedHashIterator<Map.Entry<K, V>> {
    public final Map.Entry<K, V> next() {
        return nextEntry();
    }
}

private abstract class LinkedHashIterator<T> implements Iterator<T> {
    LinkedEntry<K, V> next = header.nxt;
    LinkedEntry<K, V> lastReturned = null;
    int expectedModCount = modCount;

    public final boolean hasNext() {
        return next != header;
    }

    final LinkedEntry<K, V> nextEntry() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        LinkedEntry<K, V> e = next;
        if (e == header)
            throw new NoSuchElementException();
        next = e.nxt;
        return lastReturned = e;
    }

    public final void remove() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        if (lastReturned == null)
            throw new IllegalStateException();
        LinkedHashMap.this.remove(lastReturned.key);
        lastReturned = null;
        expectedModCount = modCount;
    }
}
Now we can draw the conclusion: the node fetched by that line in trimToSize is header.nxt, the first node returned by the iterator, which is exactly the least recently used node.
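This conclusion can also be checked from the outside: with the default sizeOf of 1, LruCache behaves like a count-limited LRU map. The following is only a small usage sketch, not framework code:

import android.util.LruCache;

public class LruCacheDemo {
    public static void main(String[] args) {
        // maxSize = 2 entries, since the default sizeOf returns 1 per entry
        LruCache<Integer, String> cache = new LruCache<>(2);

        cache.put(1, "a");
        cache.put(2, "b");
        cache.get(1);        // key 1 becomes the most recently used entry
        cache.put(3, "c");   // trimToSize evicts key 2, the eldest entry

        System.out.println(cache.get(2)); // null: evicted
        System.out.println(cache.get(1)); // "a": still cached
    }
}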
This concludes the detailed walkthrough of how the LRU cache is implemented in Java and on Android.