Redis hot key problem: analysis and solutions

1. Symptoms of the problem

You may have run into this situation: the Redis cache does not hold much data, yet the Redis instances on some nodes of the cluster show very high CPU, memory, and network load. Sometimes one of those nodes even goes down for no obvious reason.

If you see these symptoms, then in most cases you have run into a hot key problem.

2. What is a hot key

A Redis hot key is a key that is accessed far more often than other keys. When a large number of requests concentrate on one or a few hot keys, the CPU, memory, and network bandwidth of the Redis node that holds those keys are heavily consumed, which affects the performance and stability of the whole Redis cluster.

3. The harm caused by hot keys

3.1 Redis node overload

When certain keys are accessed very frequently, the Redis node that holds them becomes overloaded, which degrades its performance and stability.

3.2 Uneven load across the Redis cluster

The node that holds the hot keys is overloaded while the other nodes stay lightly loaded, so the load across the Redis cluster becomes unbalanced.

3.3 Degraded Redis cluster performance

When the access frequency of some keys is extremely high, the CPU, memory, network, and other resources of the Redis node are exhausted, which slows down Redis and can even bring the node down.

3.4 Data inconsistency

When a key becomes a hot key and its value is large or updated frequently, data inconsistency may follow: the cache may disagree with the database, and different nodes may return different values for the same key.

3.5 Cache breakdown

If a hot key expires or is deleted while a large number of requests are reading it, all of those requests miss the cache at the same time and go directly to the backend database. This is the cache breakdown problem.

4. Causes of hot keys

Hot keys usually arise in the following scenarios:

4.1 Hot data

Some data is naturally accessed very often, such as popular products, trending news, and popular reviews.

4.2 Business peak periods

During business peaks, some data is accessed very frequently, for example during Double 11 flash sales or on-the-hour flash sales.

4.3 Code logic issues

The program's own logic causes some keys to be accessed too often, for example high-frequency polling or an infinite loop in the code.

5. How to detect hot keys

The sections above covered what hot keys are and why they appear. When the symptoms show up in a production environment, we need to analyze and resolve them. So how do we detect a hot key problem?

Here are two ways to detect hot keys: Redis monitoring tools and the slow query log.

5.1 Redis monitoring tools

Tools such as the built-in MONITOR command and the third-party redis-stat utility can be used to observe the running state of a Redis instance. With them, we can see which keys are accessed most often and how they affect Redis performance.

MONITOR: Running the `monitor` command in `redis-cli` streams every command the Redis instance executes in real time. By analyzing this output, you can find the keys with the highest access frequency. Because MONITOR streams every command, it adds noticeable overhead, so run it only briefly on a production instance.
redis-stat: redis-stat is a tool for real-time monitoring of Redis instances. It displays metrics such as the number of commands executed and memory usage. By watching these metrics, you can see the impact of hot keys on Redis performance.
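
As a rough illustration of "analyzing the output", the sketch below counts how often each key appears in captured MONITOR lines. It is a minimal sketch, not a complete tool: the sample lines are made up, `count_keys` is a hypothetical helper, and it assumes the first argument of each command is the key, which holds for commands like GET and SET but not for every Redis command.

```python
import re
from collections import Counter

# Matches each double-quoted argument in a MONITOR line, allowing escaped quotes.
QUOTED_ARG = re.compile(r'"((?:[^"\\]|\\.)*)"')


def count_keys(monitor_lines):
    """Count how often each key appears as the first argument of a command."""
    counts = Counter()
    for line in monitor_lines:
        args = QUOTED_ARG.findall(line)
        # args[0] is the command name; args[1], if present, is assumed to be the key.
        if len(args) >= 2:
            counts[args[1]] += 1
    return counts


# Hypothetical sample of MONITOR output, made up for illustration.
sample = [
    '1700000000.000001 [0 127.0.0.1:50001] "GET" "product:1001"',
    '1700000000.000002 [0 127.0.0.1:50002] "GET" "product:1001"',
    '1700000000.000003 [0 127.0.0.1:50003] "HGETALL" "user:42"',
    '1700000000.000004 [0 127.0.0.1:50001] "GET" "product:1001"',
    '1700000000.000005 [0 127.0.0.1:50004] "PING"',
]

# Print the two most frequently accessed keys with their hit counts.
for key, hits in count_keys(sample).most_common(2):
    print(key, hits)
```

In practice the lines would come from a short MONITOR capture saved to a file, and the keys at the top of the count are the hot key candidates.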

5.2 Slow query log

The Redis slow query log records commands whose execution time exceeds a configured threshold (set by `slowlog-log-slower-than`). By analyzing it, you can find operations that may involve hot keys. Use the `slowlog` command in `redis-cli` (for example, `slowlog get`) to view the log.

With these methods, you can detect hot keys and their impact on Redis performance.

6. Solving hot key problems

Once the hot keys have been found, we need a suitable strategy to deal with them.

I think the hot key problem should be approached from two angles. The first is to prevent hot keys from forming, for example by sharding data: in Redis Cluster mode, the hash slot algorithm balances data across nodes, and in non-Cluster mode, consistent hashing or another sharding algorithm can be implemented in the client or a proxy layer.

The second applies when a hot key problem has already occurred: reduce the read and write pressure on the cache server through read-write separation, and use cache warm-up so that hot data is not queried directly from the database and does not put pressure on it.

If that is still not enough, protect the system with rate limiting or circuit breaking and degradation. Of course, the most effective fix is to address the root cause and avoid the problem in the first place. When the hot key comes from a genuine business need and cannot be avoided, these protective measures are the way to keep the system as stable as possible.

6.1 Data sharding

Data sharding stores hot data across multiple Redis nodes to avoid excessive load on any single node. It is the most commonly used strategy for hot key problems.

For example, in Redis Cluster mode, keys are automatically distributed across nodes by hash slot, which balances the load. In non-Cluster mode, a sharding algorithm such as consistent hashing can be implemented in the client or a proxy layer to distribute data across multiple Redis instances.
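
To make the slot mapping concrete, here is a minimal sketch of how Redis Cluster assigns a key to one of its 16384 hash slots: CRC16 of the key modulo 16384, with hash tag handling. The function name `key_slot` and the product key names are illustrative choices, and the sketch relies on Python's `binascii.crc_hqx` with an initial value of 0 matching the CRC16 variant used by Redis Cluster.

```python
import binascii

SLOT_COUNT = 16384  # Redis Cluster divides the key space into 16384 hash slots.


def key_slot(key: str) -> int:
    """Return the Redis Cluster hash slot for a key: CRC16(key) mod 16384."""
    data = key.encode("utf-8")
    # Hash tags: if the key contains "{...}" with non-empty content,
    # only the part between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC16 (XMODEM) variant.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


# Illustrative key names; the last one uses a hash tag to share a slot with the first.
for name in ("product:1001", "product:1002", "{product:1001}:stock"):
    print(name, "->", key_slot(name))
```

Note that a single key always maps to exactly one slot, so sharding spreads different keys across nodes but does not by itself spread the requests for one very hot key.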

6.2 Read-write separation

Read-write separation routes read operations and write operations to different nodes, which reduces the load on any single node.

In master-replica replication mode, read operations can be distributed to the replica nodes, which takes pressure off the master node.

In addition, Redis Sentinel can monitor a master-replica deployment and perform automatic failover, while read-write separation itself is handled by the client or by a proxy layer in front of Redis. Twemproxy is primarily a sharding proxy, so check whether a given proxy supports read routing before relying on it.
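
The routing idea can be sketched at the client side as follows. This is only an illustration under stated assumptions: `FakeRedis` is an in-memory stand-in so the example runs without a server, `ReadWriteRouter` is a hypothetical class, and replication is simulated by copying data, whereas real Redis replication is asynchronous and replicas may briefly return stale values.

```python
import itertools


class FakeRedis:
    """In-memory stand-in for a Redis client, used only to keep this sketch runnable."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class ReadWriteRouter:
    """Send writes to the master and spread reads across replicas round-robin."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master for reads if no replicas are configured.
        self._replicas = itertools.cycle(replicas or [master])

    def set(self, key, value):
        self.master.set(key, value)

    def get(self, key):
        return next(self._replicas).get(key)


master = FakeRedis("master")
replicas = [FakeRedis("replica-1"), FakeRedis("replica-2")]
router = ReadWriteRouter(master, replicas)

router.set("product:1001", "phone")
# Real replication is asynchronous; here it is simulated by copying the data.
for replica in replicas:
    replica.data = dict(master.data)

values = [router.get("product:1001") for _ in range(4)]
print(values, [r.reads for r in replicas], master.reads)
```

The four reads are split evenly between the two replicas and none of them reaches the master, which is the effect read-write separation aims for.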

6.3 Cache warm-up

Cache warm-up means proactively loading hot data into the cache after the system starts or restarts.

That way, when users access the hot data, it is served directly from the cache and the backend database is spared the load.

Cache warm-up can be implemented with a scheduled task or by loading the hot data when the application starts.
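
A warm-up routine run at application startup might look like the sketch below. Everything here is illustrative: `warm_up`, `DictCache`, the `product:` key prefix, and the sample database are assumptions made for this example. The `set(key, value, ex=...)` call mirrors the shape of the redis-py `set` method, but an in-memory stand-in is used so the example runs without a server.

```python
class DictCache:
    """In-memory stand-in for a cache client; the TTL is recorded but not enforced."""

    def __init__(self):
        self.store = {}

    def set(self, key, value, ex=None):
        self.store[key] = (value, ex)


def warm_up(cache, load_from_db, hot_ids, ttl_seconds=600):
    """Load each hot item from the database into the cache; return how many were loaded."""
    loaded = 0
    for item_id in hot_ids:
        value = load_from_db(item_id)
        if value is not None:  # skip ids that no longer exist in the database
            cache.set(f"product:{item_id}", value, ex=ttl_seconds)
            loaded += 1
    return loaded


# Hypothetical database contents and hot product ids.
fake_db = {1001: "phone", 1002: "laptop"}
cache = DictCache()
loaded = warm_up(cache, fake_db.get, hot_ids=[1001, 1002, 9999])
print(loaded, sorted(cache.store))
```

The list of hot ids would normally come from access statistics or from the business side, for example the products scheduled for an upcoming flash sale.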

6.4 Rate limiting

Rate limiting prevents system overload by controlling the rate of requests.

Implementing rate limiting at the application layer can effectively reduce the pressure that hot keys put on Redis.

Common rate limiting algorithms include the leaky bucket algorithm and the token bucket algorithm.
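
As an illustration of the token bucket algorithm, here is a minimal single-process sketch. The class name `TokenBucket` and its parameters are choices made for this example, and the injectable `clock` exists only so the demonstration is deterministic. A production limiter shared by several application instances would need shared state, which this sketch does not cover.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# A fake clock makes the behavior reproducible.
fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=3, clock=lambda: fake_now[0])

burst = [bucket.allow() for _ in range(5)]       # bucket starts full with 3 tokens
fake_now[0] += 1.0                               # one second later, 2 tokens are refilled
after_wait = [bucket.allow() for _ in range(3)]
print(burst, after_wait)
```

Requests for which `allow()` returns False would be rejected or queued before they reach Redis, which caps the request rate on a hot key while still permitting short bursts.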

6.5 Circuit breaking and degradation

Circuit breaking and degradation is a strategy that automatically reduces system functionality when part of the system is failing. Implemented at the application layer, it can quickly cut the access pressure on Redis when a hot key problem occurs. It can be implemented with open source tools such as Hystrix.

These strategies can effectively mitigate Redis hot key problems. In practice, choose the strategy that fits the specific business scenario and requirements. The next section uses a practical case to illustrate how to handle hot keys.
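
The mechanism can be sketched in a few lines. This is a simplified illustration of the circuit breaker pattern, not how Hystrix is implemented: the class `CircuitBreaker`, its thresholds, and the simulated failing read are all assumptions made for the example.

```python
import time


class CircuitBreaker:
    """Open after `max_failures` consecutive failures; retry after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=5.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()      # open: skip the call and serve a degraded result
            self.opened_at = None      # half-open: let one trial call through
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result


fake_now = [0.0]
calls = []


def flaky_redis_read():
    """Simulated Redis read that always times out."""
    calls.append(fake_now[0])
    raise TimeoutError("simulated Redis timeout")


breaker = CircuitBreaker(max_failures=2, reset_after=5.0, clock=lambda: fake_now[0])
results = [breaker.call(flaky_redis_read, lambda: "default") for _ in range(4)]
print(results, len(calls))
```

After two consecutive failures the breaker opens, so the third and fourth requests receive the degraded value without touching Redis at all. Once `reset_after` seconds have passed, a single trial call is let through to check whether Redis has recovered.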

7. Practical case

7.1 Popular products on an e-commerce platform

On an e-commerce platform, some popular products are viewed and purchased far more often than others, so the keys of these products become hot keys.

To solve this problem, we can take the following measures:

Shard product data across multiple Redis nodes to balance the load (for example, with Redis Cluster), and try to keep the hot keys of different popular products from landing on the same Redis node.
Set a rate limiting policy for popular products so that too many requests do not put excessive pressure on Redis.
Use cache warm-up to load popular products into the cache in advance, so that requests do not query the database directly.

Summary

The above is based on personal experience. I hope it serves as a useful reference.