SoFunction
Updated on 2025-03-11

Android data structure optimization tutorial

ArrayList vs. LinkedList

  • ArrayList is fast to search and slow to insert or delete in the middle. Internally it is an array: the memory is contiguous, so index-based lookup is O(1), but inserting or deleting means copying every later element in a loop to shift it. Adding or removing at the tail is unaffected.
  • LinkedList is fast to insert and delete and slow to search. Internally it is a doubly linked list of nodes: insertion and deletion only rewire the neighboring nodes' pointers, but the memory is not contiguous, so lookup has to walk the list node by node, which makes search and positional modification slow.

How to choose: if the data is mostly displayed in order with few insertions or deletions (for example, the data behind a ListView or RecyclerView), use ArrayList; if the user can frequently insert or delete items, use LinkedList.
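The trade-off can be seen in a short sketch (the class and method names here are my own, for illustration): both lists accept the same operations, but inserting at the head is O(n) per call for ArrayList (every element shifts) and O(1) for LinkedList (only pointers change).

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListChoiceDemo {
    // Build a list by repeatedly inserting at the head.
    // ArrayList: O(n) per insert (shift). LinkedList: O(1) per insert.
    static List<Integer> headInsert(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(0, i); // insert at index 0
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> a = headInsert(new ArrayList<>(), 5);
        List<Integer> l = headInsert(new LinkedList<>(), 5);
        // Same contents either way; only the cost profile differs.
        System.out.println(a);        // [4, 3, 2, 1, 0]
        System.out.println(l);        // [4, 3, 2, 1, 0]
        // Random access: O(1) on ArrayList, O(n) traversal on LinkedList.
        System.out.println(a.get(2)); // 2
    }
}
```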

HashMap: before JDK 1.8 (used on Android before API 24) it is built from an array plus linked lists. Each array slot is the head of a chain: entries that land in the same slot form a linked list hanging off it. From JDK 1.8 onward, a chain that grows too long is additionally converted into a red-black tree (a self-balancing binary search tree). LinkedHashMap further threads all entries onto a doubly linked list with bidirectional pointers to preserve ordering.

1.7:

transient Entry[] table = (Entry<K,V>[]) EMPTY_TABLE;

How does it keep each key unique? Look at the put() method:

public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
final int hash(Object k) {
    int h = hashSeed;
    if (0 != h && k instanceof String) {
        return sun.misc.Hashing.stringHash32((String) k);
    }
    h ^= k.hashCode();
    // Spread higher bits into the lower bits to reduce collisions
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
    static int indexFor(int h, int length) {
        return h & (length-1);
    }
void addEntry(int hash, K key, V value, int bucketIndex) {
    // If size >= threshold and the bucket is occupied, double the capacity
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}
void createEntry(int hash, K key, V value, int bucketIndex) {  
        Entry<K,V> e = table[bucketIndex]; 
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }  

The put process: the key k is an Object, and its hashCode() gives an int. The index is obtained by hashcode % table length, yielding a value in the range 0..length-1. Because the length is a power of 2, the modulo is equivalent to a bit operation, h & (length - 1); the source uses the bit operation because it compiles to faster bytecode on the JVM. That index is the i in the source, and table[i] is the bucket to operate on. If no existing entry matches the key, addEntry(hash, key, value, i) is called, which calls createEntry(): it first saves the current head table[bucketIndex] into e, then constructs a new Entry whose next pointer is e, and assigns the new node to table[bucketIndex]. This is head insertion: the new node becomes the head of the chain at slot i.

put: k (Object) -> hashCode() (int) -> (hashcode % length) == (h & (length - 1)) -> index in 0..length-1

Hash collision: the index is derived from the hash by a modulo (implemented as the bit-mask above), which is a many-to-one mapping. Two different hash codes, hashcode1 and hashcode2, can therefore produce the same index: a hash collision. HashMap resolves collisions by chaining: the colliding entries at one slot form a linked list, and a newly added node is inserted at the head, with its next pointer set to the old head.

CPU note: at the machine level everything is a bit operation; & and | are among the cheapest instructions, faster than arithmetic such as % or +.
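The equivalence hashcode % length == h & (length - 1) is easy to verify for non-negative hashes and a power-of-2 length. A minimal sketch (IndexForDemo is my own name; indexFor mirrors the HashMap helper shown earlier):

```java
public class IndexForDemo {
    // Same computation as HashMap's indexFor(): valid because length is a power of 2
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        for (int h : new int[]{6, 7, 17, 1_000_003}) {
            // For non-negative h, h % length and h & (length - 1) agree
            System.out.println(h + " -> " + indexFor(h, length) + " == " + (h % length));
        }
    }
}
```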

Check out the get() method

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

Symmetric to put(): compute the index first, then walk the linked list at table[index].
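A small usage sketch of the behavior just described: put() returns the previous value for the key (or null), and the single null key is routed through putForNullKey()/getForNullKey().

```java
import java.util.HashMap;

public class PutGetDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1));    // null: no previous mapping
        System.out.println(map.put("a", 2));    // 1: the old value is returned
        map.put(null, 42);                      // one null key is allowed
        System.out.println(map.get(null));      // 42
        System.out.println(map.get("missing")); // null
    }
}
```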

Expansion issues:

Load factor: static final float DEFAULT_LOAD_FACTOR = 0.75f; (the table expands once it is more than 75% full)

Threshold: 0.75f * 16 (the capacity) = 12; once the number of stored elements exceeds 12, the table expands.

The default HashMap capacity is 16, i.e. table[16] after new HashMap(); the capacity must always be a power of 2.

Why expand at all: a load factor around 0.6-0.75 is the empirically best trade-off. If every entry landed in one bucket, the map would degenerate into a single linked list: insertion and deletion would stay cheap, but lookup would drop to O(n). A larger table lowers the chance of collisions, so expansion keeps lookups efficient.

When addEntry() finds size >= threshold, it doubles the capacity by calling resize(2 * table.length):

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    // Move all entries of the original table into newTable
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    // Recompute the threshold for the new capacity
    threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}

Transferring the hash table during expansion:

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    // Walk every bucket of the old table
    for (Entry<K,V> e : table) {
        // Re-index every node in the chain
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            // Head insertion into the new bucket
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}

Expansion costs performance, so avoid it where possible: when creating the map, estimate the table size as expectedSize / 0.75 + 1. How is the capacity kept a power of 2? This is tied to put, which calls inflateTable(threshold):

private void inflateTable(int toSize) {
    // Find a power of 2 >= toSize
    int capacity = roundUpToPowerOf2(toSize);
    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    table = new Entry[capacity];
    initHashSeedAsNeeded(capacity); // initialize the hashSeed variable
}

private static int roundUpToPowerOf2(int number) {
    // assert number >= 0 : "number must be non-negative";
    return number >= MAXIMUM_CAPACITY
            ? MAXIMUM_CAPACITY
            : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
}

roundUpToPowerOf2() rounds the requested size up to the nearest power of 2. Note that the hash table is really allocated at the first put, not at construction.

Lazy initialization: allocating the table at the first put avoids wasting memory on maps that are created but never used; only when put is actually called does real initialization happen.
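The size/0.75 + 1 sizing rule mentioned above can be sketched as a small helper (capacityFor is my own name, not a JDK method):

```java
public class CapacityEstimate {
    // Suggested initial capacity for an expected element count, per the
    // expectedSize / 0.75 + 1 rule of thumb, so that no resize is needed.
    static int capacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(12));  // 17
        System.out.println(capacityFor(100)); // 134
        // new HashMap<>(capacityFor(100)) rounds up to a power of 2 internally
        // and never resizes while holding up to 100 entries
    }
}
```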

Why the capacity is a power of 2: it makes h & (length - 1) use every bit. For example, 16 - 1 = 15 is 1111 in binary and 32 - 1 = 31 is 11111, so every low bit of the hash participates. If the length were not a power of 2, say length = 10, then length - 1 = 9 is 1001 in binary, and only the highest and lowest bits of the mask take effect after the AND. With a power of 2, the mask yields more distinct values and the chance of collision is lower.

example:

            Decimal   Binary   AND with hash
h1          6         0110
length1 - 1 9         1001     h1 & 9  = 0000
length2 - 1 15        1111     h1 & 15 = 0110
h2          7         0111
length1 - 1 9         1001     h2 & 9  = 0001
length2 - 1 15        1111     h2 & 15 = 0111
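The table above can be reproduced in code: with the non-power-of-2 mask 9 (1001) the two hashes collapse onto nearly the same index, while the power-of-2 mask 15 (1111) keeps them distinct.

```java
public class MaskDemo {
    public static void main(String[] args) {
        int h1 = 6, h2 = 7;
        // Non-power-of-2 length 10: mask is 9 = 1001, the middle bits are ignored
        System.out.println(Integer.toBinaryString(h1 & 9));  // 0   (0110 & 1001)
        System.out.println(Integer.toBinaryString(h2 & 9));  // 1   (0111 & 1001)
        // Power-of-2 length 16: mask is 15 = 1111, every low bit participates
        System.out.println(Integer.toBinaryString(h1 & 15)); // 110 (0110 & 1111)
        System.out.println(Integer.toBinaryString(h2 & 15)); // 111 (0111 & 1111)
    }
}
```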

HashMap trades space for time: because of the threshold, up to 25% of the table is deliberately left empty, and a resize doubles the table even if only one more entry needs to fit.
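The doubling behavior can be tabulated with a tiny sketch (GrowthDemo and threshold() are my own names; the arithmetic matches the threshold = capacity * loadFactor formula above):

```java
public class GrowthDemo {
    // threshold = capacity * loadFactor; doubling the capacity doubles it too
    static int threshold(int capacity) {
        return (int) (capacity * 0.75f);
    }

    public static void main(String[] args) {
        for (int capacity = 16; capacity <= 128; capacity *= 2) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity));
        }
    }
}
```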

Android therefore provides SparseArray: two parallel arrays, one for the keys and one for the values.

public class SparseArray<E> implements Cloneable {
    private static final Object DELETED = new Object();
    private boolean mGarbage = false;
    private int[] mKeys;
    private Object[] mValues;
    private int mSize;
    // ...
}

The key is a primitive int and the value is an Object:

public void put(int key, E value) {
    int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
    if (i >= 0) {
        // The key is already present (its slot may hold DELETED after a
        // remove, or an old value), so just replace the value
        mValues[i] = value;
    } else {
        // Not found: invert i to recover the insertion point (lo)
        i = ~i;
        // If i is inside the array and that slot was lazily deleted,
        // overwrite it in place, completing the real deletion
        if (i < mSize && mValues[i] == DELETED) {
            mKeys[i] = key;
            mValues[i] = value;
            return;
        }
        // The array may be full and contain lazily deleted elements
        if (mGarbage && mSize >= mKeys.length) {
            // Really delete: compact all lazily deleted slots out of the arrays
            gc();
            // Search again because indices may have changed.
            i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
        }
        // No garbage, or the array still has room: shift the elements after i
        // back by one and insert (GrowingArrayUtils grows the arrays if needed)
        mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
        mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
        // The new element was added successfully
        mSize++;
    }
}

class ContainerHelpers {

    // This is Arrays.binarySearch(), but doesn't do any argument validation.
    // array holds the keys; size is the number of valid elements (not
    // necessarily array.length); value is the target key.
    static int binarySearch(int[] array, int size, int value) {
        int lo = 0;         // left boundary of the search
        int hi = size - 1;  // right boundary of the search
        while (lo <= hi) {
            // Midpoint of the current range
            final int mid = (lo + hi) >>> 1;
            final int midVal = array[mid];
            if (midVal < value) {
                // The target key is in the right half
                lo = mid + 1;
            } else if (midVal > value) {
                // The target key is in the left half
                hi = mid - 1;
            } else {
                // Found: return the key's index in the array
                return mid;  // value found
            }
        }
        // Not found: return the bitwise inverse of lo, the insertion point.
        // This is the crucial convention the callers rely on.
        return ~lo;  // value not present
    }
}

Keys are looked up with binary search; after the insertion index is found, the later elements are moved with System.arraycopy().

This saves memory without making access slow: instead of hashing into a sparse table, SparseArray binary-searches a sorted key array, which is still O(log n). When an entry is removed, its slot is only marked DELETED; the next insertion at that position reuses the slot directly without moving the array, so the space is recycled efficiently.
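java.util.Arrays.binarySearch uses the same not-found convention as ContainerHelpers: it returns -(insertionPoint) - 1, which is exactly ~insertionPoint, so the inverse recovers where the key would go.

```java
import java.util.Arrays;

public class InsertionPointDemo {
    public static void main(String[] args) {
        int[] keys = {3, 8, 15, 21};
        // Found: the index is returned directly
        System.out.println(Arrays.binarySearch(keys, 15)); // 2
        // Not found: -(insertionPoint) - 1 == ~insertionPoint
        int i = Arrays.binarySearch(keys, 10);
        System.out.println(i);  // -3
        System.out.println(~i); // 2: where 10 would be inserted
    }
}
```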

public void delete(int key) {
    // Find the key's index; binarySearch returns ~insertionPoint when absent
    int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
    if (i >= 0) {
        // Lazy deletion: replace the value with the DELETED marker
        if (mValues[i] != DELETED) {
            mValues[i] = DELETED;
            mGarbage = true;
        }
    }
}
/**
 * @hide
 * Removes the mapping from the specified key, if there was any, returning the old value.
 */
public E removeReturnOld(int key) {
    int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
 
    if (i &gt;= 0) {
        if (mValues[i] != DELETED) {
            final E old = (E) mValues[i];
            mValues[i] = DELETED;
            mGarbage = true;
            return old;
        }
    }
    return null;
}
/**
 * Alias for {@link #delete(int)}.
 */
public void remove(int key) {
    delete(key);
}

Disadvantage: the keys can only be int values.
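To make the mechanism concrete outside Android, here is a minimal, hypothetical re-implementation of the idea (MiniSparseArray is my own sketch, not the Android class): sorted int keys in one array, values in a parallel array, lazy deletion via a sentinel, and slot reuse on the next insert at that position.

```java
import java.util.Arrays;

final class MiniSparseArray<E> {
    private static final Object DELETED = new Object();
    private int[] keys = new int[4];
    private Object[] values = new Object[4];
    private int size = 0;

    public void put(int key, E value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {
            // Key already present (slot may hold DELETED or an old value)
            values[i] = value;
            return;
        }
        i = ~i; // insertion point, recovered by bitwise inversion
        if (i < size && values[i] == DELETED) {
            // Reuse a lazily deleted slot in place; order stays sorted
            keys[i] = key;
            values[i] = value;
            return;
        }
        if (size == keys.length) {
            // Grow both parallel arrays together
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        // Shift the tail back by one and insert
        System.arraycopy(keys, i, keys, i + 1, size - i);
        System.arraycopy(values, i, values, i + 1, size - i);
        keys[i] = key;
        values[i] = value;
        size++;
    }

    @SuppressWarnings("unchecked")
    public E get(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        return (i >= 0 && values[i] != DELETED) ? (E) values[i] : null;
    }

    public void delete(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {
            values[i] = DELETED; // lazy deletion: mark, don't shift
        }
    }
}
```

Deleting key 7 and then inserting key 6 reuses the marked slot directly, with no array copying, which is exactly the space-reuse trick described above.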

ArrayMap: combines the ideas of HashMap and SparseArray.

@Override
public V put(K key, V value) {
    final int osize = mSize;  // current size
    final int hash;           // hash of the key
    int index;                // index of the key's hash in mHashes
    if (key == null) {
        // A null key hashes to 0
        hash = 0;
        index = indexOfNull();
    } else {
        hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();
        index = indexOf(key, hash);
    }
    if (index >= 0) {
        // The key already exists: replace the value in place
        index = (index<<1) + 1;
        final V old = (V)mArray[index];
        mArray[index] = value;
        return old;
    }
    // Invert to get the position where the new entry should go
    index = ~index;
    if (osize >= mHashes.length) {
        // New capacity: grow by half once past 2*BASE_SIZE,
        // otherwise jump to BASE_SIZE or 2*BASE_SIZE
        final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1))
                : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);
        if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);
        final int[] ohashes = mHashes;   // original hash array
        final Object[] oarray = mArray;  // original key/value array
        allocArrays(n);                  // allocate (or reuse) larger arrays
        if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
            throw new ConcurrentModificationException();
        }
        if (mHashes.length > 0) {
            if (DEBUG) Log.d(TAG, "put: copy 0-" + osize + " to 0");
            // Copy the original arrays into the new ones
            System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
            System.arraycopy(oarray, 0, mArray, 0, oarray.length);
        }
        // Recycle and release the old arrays
        freeArrays(ohashes, oarray, osize);
    }
    if (index < osize) {
        if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (osize-index)
                + " to " + (index+1));
        // Move the data at index (inclusive) and after it back by one slot
        System.arraycopy(mHashes, index, mHashes, index + 1, osize - index);
        System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
    }
    if (CONCURRENT_MODIFICATION_EXCEPTIONS) {
        if (osize != mSize || index >= mHashes.length) {
            throw new ConcurrentModificationException();
        }
    }
    // Write the new entry at index
    mHashes[index] = hash;
    mArray[index<<1] = key;
    mArray[(index<<1)+1] = value;
    mSize++;
    return null;
}
private void allocArrays(final int size) {
    if (mHashes == EMPTY_IMMUTABLE_INTS) {
        // An immutable ArrayMap cannot be grown
        throw new UnsupportedOperationException("ArrayMap is immutable");
    }
    if (size == (BASE_SIZE*2)) {
        // Requested capacity is 8 (BASE_SIZE = 4)
        synchronized (ArrayMap.class) {
            if (mTwiceBaseCache != null) {
                // Reuse a cached int[8] hash array and Object[16] key/value
                // array instead of allocating new ones
                final Object[] array = mTwiceBaseCache;
                mArray = array;
                mTwiceBaseCache = (Object[])array[0];
                mHashes = (int[])array[1];
                array[0] = array[1] = null;
                mTwiceBaseCacheSize--;
                if (DEBUG) Log.d(TAG, "Retrieving 2x cache " + mHashes
                        + " now have " + mTwiceBaseCacheSize + " entries");
                return;
            }
        }
    } else if (size == BASE_SIZE) {
        // Requested capacity is 4 (BASE_SIZE = 4)
        synchronized (ArrayMap.class) {
            if (mBaseCache != null) {
                // Reuse a cached int[4] hash array and Object[8] key/value
                // array instead of allocating new ones
                final Object[] array = mBaseCache;
                mArray = array;
                mBaseCache = (Object[])array[0];
                mHashes = (int[])array[1];
                array[0] = array[1] = null;
                mBaseCacheSize--;
                if (DEBUG) Log.d(TAG, "Retrieving 1x cache " + mHashes
                        + " now have " + mBaseCacheSize + " entries");
                return;
            }
        }
    }
    mHashes = new int[size];
    mArray = new Object[size<<1];
}
private static void freeArrays(final int[] hashes, final Object[] array, final int size) {
    if (hashes.length == (BASE_SIZE*2)) {
        // The capacity being freed is 8 (BASE_SIZE = 4)
        synchronized (ArrayMap.class) {
            if (mTwiceBaseCacheSize < CACHE_SIZE) {
                // Cache the arrays; null out every slot from index 2 onward
                array[0] = mTwiceBaseCache;
                array[1] = hashes;
                for (int i=(size<<1)-1; i>=2; i--) {
                    array[i] = null;
                }
                mTwiceBaseCache = array;
                mTwiceBaseCacheSize++;
                if (DEBUG) Log.d(TAG, "Storing 2x cache " + array
                        + " now have " + mTwiceBaseCacheSize + " entries");
            }
        }
    } else if (hashes.length == BASE_SIZE) {
        // The capacity being freed is 4 (BASE_SIZE = 4)
        synchronized (ArrayMap.class) {
            if (mBaseCacheSize < CACHE_SIZE) {
                // Cache the arrays; null out every slot from index 2 onward
                array[0] = mBaseCache;
                array[1] = hashes;
                for (int i=(size<<1)-1; i>=2; i--) {
                    array[i] = null;
                }
                mBaseCache = array;
                mBaseCacheSize++;
                if (DEBUG) Log.d(TAG, "Storing 1x cache " + array
                        + " now have " + mBaseCacheSize + " entries");
            }
        }
    }
}
@Override
public V get(Object key) {
    // Find the index of the key's hash in mHashes
    final int index = indexOfKey(key);
    // In mArray the pair lives at index<<1: the key comes first,
    // then the value, so the value is at (index<<1)+1
    return index >= 0 ? (V)mArray[(index<<1)+1] : null;
}

public int indexOfKey(Object key) {
    return key == null ? indexOfNull()
            : indexOf(key, mIdentityHashCode ? System.identityHashCode(key) : key.hashCode());
}
int indexOf(Object key, int hash) {
    final int N = mSize;
    // Important fast case: if nothing is in here, nothing to look for.
    if (N == 0) {
        return ~0;
    }
    // Binary-search mHashes for the hash
    int index = binarySearchHashes(mHashes, N, hash);
    // If the hash code wasn't found, then we have no entry for this key.
    if (index < 0) {
        return index;
    }
    // If the key at the returned index matches, that's what we want.
    if (key.equals(mArray[index<<1])) {
        return index;
    }
    // Collision handling:
    // Search for a matching key after the index.
    int end;
    for (end = index + 1; end < N && mHashes[end] == hash; end++) {
        if (key.equals(mArray[end << 1])) return end;
    }
    // Search for a matching key before the index.
    for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
        if (key.equals(mArray[i << 1])) return i;
    }
    // Key not found -- return negative value indicating where a
    // new entry for this key should go.  We use the end of the
    // hash chain to reduce the number of array entries that will
    // need to be copied when inserting.
    return ~end;
}

On a hash collision, ArrayMap appends: entries with the same hash sit next to each other in mHashes, and indexOf() scans forward and backward from the binary-search hit until the matching key is found.
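The interleaved layout of mArray can be illustrated with plain arrays (PairLayoutDemo is my own sketch): the key of logical slot index sits at index<<1 and its value at (index<<1)+1.

```java
public class PairLayoutDemo {
    // ArrayMap stores key/value pairs interleaved in one Object[]:
    // key at index<<1, value at (index<<1)+1.
    public static void main(String[] args) {
        Object[] array = new Object[8];
        int index = 2; // logical slot 2
        array[index << 1] = "key2";
        array[(index << 1) + 1] = "value2";
        System.out.println(array[4]); // key2
        System.out.println(array[5]); // value2
    }
}
```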

Performance improvement here comes down to the classic space-versus-time trade-off.

This concludes this Android data structure optimization tutorial. For more related Android data structure content, please search my previous articles or continue browsing the related articles below. I hope everyone will continue to support me!