SoFunction
Updated on 2025-04-10

A Detailed Introduction to iOS Cache Design, with Simple Examples

iOS Cache Design

Cache design is basic computer-science theory and one of the important fundamental skills for programmers. Caches are almost everywhere: the CPU's L1 and L2 caches, the clean-page and dirty-page mechanism of the iOS system, HTTP's ETag mechanism, and so on. Behind all of these is the application of Cache design ideas.

Why Cache

The purpose of a Cache is to pursue a faster experience. Caches exist because two ways of reading the same data can differ in cost and performance.

Before starting to design a cache, you need a clear picture of the media on which data is stored. As client developers, we deal with many of them:

  • The data initially lives on the Server and must be obtained through a network request.
  • On its way from the Server, the data passes through various intermediate network nodes (such as proxies), which sometimes cache it for us.
  • After downloading the data, we cache a copy on the local disk, so that we do not have to go back to the server on every request.
  • Once on disk, the storage format affects reading speed: SQLite, whose data is stored in a B+ tree, is much faster to query than an NSArray serialized directly into a file.
  • When the App starts, the system loads the data downloaded from the Server from disk into memory. The read and write performance of memory is much faster than that of disk.
  • Within memory, different data structures have different speeds. Storing data in an NSDictionary (a hash table) gives better lookup performance than an array, at the price of greater space overhead. And although memory is far faster than disk, it can still become a bottleneck when operating on large collections.
  • Registers and the L1 and L2 caches are faster still, but iOS App development rarely calls for optimizations at that level.

Every link mentioned above differs in performance and cost. The Server's data is naturally the most timely and accurate, but for an App to turn Server data into an NSArray in memory, that data must travel a "long" road, and the Cache design idea appears at every step along the way.

The premise for understanding and practicing Cache design is a reasonably deep understanding of the differences between storage media and between data structures.

When performance optimization in an App involves a Cache, it is usually done in Memory. Data that would otherwise have to be fetched from Disk, or produced by expensive CPU computation, can be kept in Memory in a suitable data structure, which covers most of the cache requirements in App development. Even at this level, Cache design can take different forms; let's look at a simple and serviceable model first.

Simple and usable Cache

Thanks to Foundation's NSDictionary, we can use a hash table to implement a simple, serviceable cache mechanism. Let's first look at an example:

- (NSString*)getFormattedPhoneNumber:(NSNumber*)phone
{
  if (phone == nil)
  {
    return nil;
  }

  return [PhoneFormatLib formatPhoneNumber:phone]; // CPU time-consuming operation
}

This is a simple function for formatting phone numbers. formatPhoneNumber is a CPU-intensive call, and in our business scenario the formatted NSString for the same phone number is requested frequently. Repeating the computation every time clearly wastes CPU and performs poorly. We can add a simple cache to optimize it:

static NSMutableDictionary* gPhoneCache = nil;
static NSLock* gPhoneLock = nil;

- (NSString*)getFormattedPhoneNumber:(NSNumber*)phone
{
  if (phone == nil)
  {
    return nil;
  }

  static dispatch_once_t onceToken;
  dispatch_once(&onceToken, ^{
    gPhoneCache = [NSMutableDictionary dictionary];
    gPhoneLock = [[NSLock alloc] init];
  });

  NSString* phoneNumberStr = nil;

  [gPhoneLock lock];
  phoneNumberStr = [gPhoneCache objectForKey:phone];
  if (phoneNumberStr == nil) {
    phoneNumberStr = [PhoneFormatLib formatPhoneNumber:phone];
    [gPhoneCache setObject:phoneNumberStr forKey:phone];
  }
  [gPhoneLock unlock];

  return phoneNumberStr;
}

By introducing an NSMutableDictionary, we avoid calling formatPhoneNumber repeatedly. Just like that, a quick cache design is done, ready to hand to QA, with the optimization numbers tossed proudly onto the product manager's desk, all thanks to the hash table's O(1) time complexity. It does consume more memory, but for small amounts of data the impact is modest: a modern hash table does not allocate a large block of space up front, it grows gradually as data is added.

The biggest problem with this quick-and-dirty Cache design is that the code is scattered and uncontrolled. Small, scattered caches are practically buried land mines. When you write one, the amount of data may be small, but as the code is maintained and the business changes, nobody can guarantee that its memory overhead will remain negligible. Worse, this kind of memory cost is hard to detect: it hides quietly in some .m file, and when you later try to get the App's overall memory under control, you find pitfalls everywhere and no obvious place to start. You may also have noticed that the cache code above never releases anything.

All code that has side effects on the App should be centrally managed, and it must be understood and located at the architectural level. How do we define a side effect? It can be abstracted as a "write operation". Adding a record to a cache is a write operation whose side effect is extra memory overhead; the essence of caching is trading space for time, and that space is our side effect. One side effect breeds more side effects, and untangling them often requires repeatedly reviewing large amounts of code. A better approach is to manage side-effecting code centrally from the start.

Elegant and controllable Cache

The way to avoid scattering cache code everywhere is to design an elegant, controllable Cache module. An App may need caches for all kinds of data: phoneNumberCache, avatarCache, spaceshipCache, and so on. We need a single source from which to track them. The intuitive approach is to generate and hold these caches through a factory class:

@interface CacheFactory : NSObject
+ (instancetype)sharedInstance;
- (id<MyCacheProtocol>)getPhoneNumberCache;
- (void)clearPhoneNumberCache;
- (id<MyCacheProtocol>)getAvatarCache;
- (void)clearAvatarCache;
@end

This way, when we need to evaluate the impact of the various caches on the App's overall memory overhead, we only have to start from the CacheFactory code. Debugging leaves a trail to follow, and whoever inherits your code will thank you.
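The id&lt;MyCacheProtocol&gt; handed out by the factory above is not defined in the original article. As a hypothetical sketch (names are illustrative), it only needs to express read, write, and removal, plus a minimal dictionary-backed conformance for the factory to return:

```objc
#import <Foundation/Foundation.h>

// Hypothetical protocol behind the factory above; names are illustrative.
@protocol MyCacheProtocol <NSObject>
- (id)objectForKey:(id<NSCopying>)key;
- (void)setObject:(id)object forKey:(id<NSCopying>)key;
- (void)removeAllObjects;
@end

// Minimal dictionary-backed conformance, enough for the factory to hand out.
@interface SimpleMemoryCache : NSObject <MyCacheProtocol>
@end

@implementation SimpleMemoryCache
{
  NSMutableDictionary* _storage;
}
- (instancetype)init
{
  if (self = [super init]) {
    _storage = [NSMutableDictionary dictionary];
  }
  return self;
}
- (id)objectForKey:(id<NSCopying>)key { return _storage[key]; }
- (void)setObject:(id)object forKey:(id<NSCopying>)key { _storage[key] = object; }
- (void)removeAllObjects { [_storage removeAllObjects]; }
@end
```

Because callers only see the protocol, the factory can later swap the dictionary-backed cache for one with a real eviction strategy without touching any business code.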

Separating a cache's declaration from its implementation through a protocol is also a good habit. Another important piece of cache knowledge is the elimination (eviction) strategy: different strategies behave differently, for example FIFO, LRU, 2Queues, and so on. There are many mature third-party cache frameworks available, and the system provides NSCache, whose eviction strategy is not clearly documented. If you have never written a cache eviction strategy, I still recommend building one yourself, or at least reading the relevant source code. Understanding these strategies is necessary: when doing deeper optimization, you need to choose the one that fits the situation.
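To make the idea concrete, here is a minimal LRU sketch (the class name is made up for illustration): a dictionary gives O(1) lookup, and an ordered key set tracks recency from oldest to newest.

```objc
#import <Foundation/Foundation.h>

// Minimal LRU cache sketch: NSMutableDictionary for O(1) lookup,
// NSMutableOrderedSet to keep keys ordered from least to most recently used.
@interface TinyLRUCache : NSObject
- (instancetype)initWithCapacity:(NSUInteger)capacity;
- (id)objectForKey:(id<NSCopying>)key;
- (void)setObject:(id)object forKey:(id<NSCopying>)key;
@end

@implementation TinyLRUCache
{
  NSUInteger _capacity;
  NSMutableDictionary* _storage;
  NSMutableOrderedSet* _keys; // front = oldest, back = most recently used
}

- (instancetype)initWithCapacity:(NSUInteger)capacity
{
  if (self = [super init]) {
    _capacity = MAX(capacity, 1);
    _storage = [NSMutableDictionary dictionary];
    _keys = [NSMutableOrderedSet orderedSet];
  }
  return self;
}

- (id)objectForKey:(id<NSCopying>)key
{
  id value = _storage[key];
  if (value != nil) { // a hit refreshes the key's recency
    [_keys removeObject:key];
    [_keys addObject:key];
  }
  return value;
}

- (void)setObject:(id)object forKey:(id<NSCopying>)key
{
  [_keys removeObject:key];
  [_keys addObject:key];
  _storage[key] = object;
  if (_keys.count > _capacity) { // evict the least recently used entry
    id oldest = _keys.firstObject;
    [_keys removeObjectAtIndex:0];
    [_storage removeObjectForKey:oldest];
  }
}
@end
```

Note that refreshing recency through NSMutableOrderedSet is O(n) in the worst case; production LRU caches usually pair the hash table with a doubly linked list so that both lookup and refresh are O(1).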

A cache must be released as deliberately as it is created; it cannot be created and then forgotten. In fact, every operation involving data must consider the data's life cycle. In business code we mostly take the Controller as the basic unit, and in some scenarios the chance of re-entering a Controller after it exits is very low. Cleaning up its cache promptly makes the App perform better overall.

Immutable Cache

What is stored in a Cache? Data. And speaking of Data, we have to mention the "Immutability" that peak likes to talk about the most. Immutability has a great deal to do with the stability of our code, so much that it is like the "elephant in the room": very important, yet easily ignored.

When practicing Immutability, first classify your Data, then work out how each kind can be made immutable. The most important distinction is between value types and reference types. Passing a value passes a new copy of the memory, so value types are mostly safe. Passing a pointer passes the same shared memory, which is a major reason pointers are dangerous. Primitive types such as BOOL, int, and long are value types that can be passed with confidence, while object types are usually passed as pointers and need special care; we generally pass them via copy, which creates a new memory copy. This is also why many classes that were reference types in Objective-C became value types in Swift: strengthened Immutability makes our code safer.
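The difference is easy to demonstrate: mutating through a shared pointer changes what every holder sees, while a copy stays independent.

```objc
#import <Foundation/Foundation.h>

// Demonstrates why shared references are dangerous and copies are safe.
NSMutableString* shared = [NSMutableString stringWithString:@"Ada"];
NSString* alias = shared;           // shares the same memory
NSString* snapshot = [shared copy]; // independent immutable copy

[shared appendString:@" Lovelace"];
// alias now reads "Ada Lovelace"; snapshot still reads "Ada"
```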

Let's look at the read and write operations of different types of data in Cache.

Value Type-Read

Value types can be returned with confidence:

- (int)spaceshipCount
{
  //...
  return _shipCount;
}

Value Type-Write

Value types can also be written safely:

- (void)setSpaceshipCount:(int)count
{
  _shipCount = count;
}

Object Type-Read

An object type needs to return a new copy:

- (User*)luckyUser
{
  //...
  return [_luckyUser copy];
}

The copy method of an object class requires us to implement the NSCopying protocol manually. It seems a little cumbersome early in development, but it pays off later. Note that the copy here must be a deep copy: every property held by the User needs to be copied recursively.
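A hypothetical User showing what that deep copy looks like (the property names are illustrative, not from the original):

```objc
#import <Foundation/Foundation.h>

// Hypothetical model class; every object-typed property is copied recursively
// so the returned instance shares no mutable state with the original.
@interface User : NSObject <NSCopying>
@property (nonatomic, copy) NSString* name;
@property (nonatomic, strong) NSMutableArray* nicknames;
@end

@implementation User
- (id)copyWithZone:(NSZone*)zone
{
  User* clone = [[[self class] allocWithZone:zone] init];
  clone.name = [self.name copy];
  // A shallow [array copy] would still share the element objects; for
  // value-like elements such as NSString, copyItems:YES is a sufficient
  // deep copy. Nested model objects would need their own NSCopying.
  clone.nicknames = [[NSMutableArray alloc] initWithArray:self.nicknames
                                                copyItems:YES];
  return clone;
}
@end
```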

Object Type-Write

The danger in writing an object type lies in the function's parameters. If a parameter is an object type, what is passed in is a shared reference, so it must be copied before being stored:

- (void)setLuckyUser:(User*)user
{
  //...
  _luckyUser = [user copy]; 
}

Collection Type-Read

Collection classes also require copy; they are the hardest-hit area for bugs and crashes:

- (NSArray*)hotDishes
{
  //...
  return [_hotDishes copy];
}

Collection Type-Write

- (void)setHotDishes:(NSArray*)dishes
{
  //...
  _hotDishes = [dishes copy];
}

By now you may have noticed that the principle is fairly simple: as long as the business module always gets an independent copy from the cache, the hidden dangers of shared data are avoided. In this respect the Cache module resembles a pure function in functional programming: it neither depends on external state nor modifies it, focusing only on the input (parameters) and output (return value) of each call.

Multi-threaded safety

For a Cache, the focus of multi-threaded safety is the handling of collection classes, since a Cache mostly manages collections of data. Note that NSString should really be treated as a collection class too: from the perspective of reads, writes, and thread safety, NSString and NSArray behave consistently in many ways. Mature third-party cache libraries have already handled thread safety for us; if you roll your own, you must take special care that reads and writes are atomic operations. As for how to use locks, plenty of articles cover that already, so I will not elaborate here.

Summary

The key to understanding Cache is to understand the design ideas behind it, and from there to build a fuller picture of how our App behaves and where the bottleneck lies in each business flow. As the code grows and the business gets more complex, sooner or later we all run into a problem that calls for Cache design.

Thank you for reading, and I hope this helps.