Traversing massive data sets in Redis: an implementation example

1. Preface

Sometimes we need to inspect how a production Redis instance is being used, especially the keys that share a given prefix. How can we check that safely? Today I will share a small but useful tip.

2. The incident

Our user token cache stores each user's token under a key of the form [user_token:userid]. The development team wanted to know how many logged-in users were currently online.

I ran a keys query for the user_token prefix directly against production, and the incident followed: Redis stopped responding and appeared to hang.

Root cause analysis

There are millions of logged-in users online, so the number of keys is large. The keys command walks the entire keyspace in a single call, so its cost is O(n) in the number of keys: the more keys there are, the longer the call takes.

Once the keyspace reaches several million keys, a keys call can stall the Redis service, because Redis executes commands on a single thread, one after another. Every other command has to wait until the running keys command finishes.

Solution

So how do we traverse a large keyspace? This question also comes up often in interviews. The answer is another Redis command: scan.

Let's take a look at the characteristics of scan:

1. A full traversal is still O(n), but it is carried out step by step through a cursor, so each call blocks the server only briefly;

2. It provides a count parameter. This is not a limit on the number of results; it is a hint for roughly how many dictionary slots Redis walks in a single call;

3. Like keys, it supports pattern matching;

4. The server does not need to keep any state for the cursor. The only state is the cursor integer returned to the client;

5. The returned results may contain duplicates, and the client needs to deduplicate them, which is very important;

6. An empty result from a single call does not mean the traversal is over. The traversal ends only when the returned cursor value is zero.
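The characteristics above translate into a simple client-side loop. Here is a minimal Python sketch of it. StubClient is a hypothetical in-memory stand-in written for this example (it is not part of Redis or of any client library), so the block runs without a server; it only imitates the (next cursor, keys) reply shape and deliberately produces an empty page and a duplicate key.

```python
import fnmatch


class StubClient:
    """Hypothetical in-memory stand-in for a Redis client (illustration only).

    scan() returns (next_cursor, keys), and a next_cursor of 0 means the
    iteration is finished. The count argument is accepted but ignored.
    """

    def __init__(self, keys, page_size=2):
        self._keys = list(keys)
        self._page_size = page_size

    def scan(self, cursor=0, match=None, count=None):
        page = self._keys[cursor:cursor + self._page_size]
        next_cursor = cursor + self._page_size
        if next_cursor >= len(self._keys):
            next_cursor = 0  # a zero cursor signals the end of the iteration
        if match is not None:
            # Filtering happens after the page is read, so a page can be empty.
            page = [k for k in page if fnmatch.fnmatchcase(k, match)]
        return next_cursor, page


def scan_unique_keys(client, pattern, count=100):
    """Collect every key matching pattern, deduplicated on the client side."""
    seen = set()
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        seen.update(keys)  # the set absorbs duplicates returned by scan
        if cursor == 0:    # only a zero cursor ends the loop, not an empty page
            break
    return seen


client = StubClient([
    "user_token:1", "user_token:2",
    "session:9", "session:10",       # this page is empty after filtering
    "user_token:1", "user_token:3",  # user_token:1 shows up a second time
])
print(sorted(scan_unique_keys(client, "user_token:*")))
```

A real client whose scan call returns the same (cursor, keys) pair should be usable in place of the stub; I believe redis-py behaves this way, but treat that as an assumption and check it against the client you actually use.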

1. Scan command format

SCAN cursor [MATCH pattern] [COUNT count]

2. Command explanation

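cursor: the position to resume from. Pass 0 to start a new traversal, then pass back the cursor returned by the previous call.

MATCH pattern: return only the elements that match the given pattern.

COUNT count: a hint for how much work Redis does per call (the default is 10). It is not the exact number of elements returned per iteration.

The SCAN command iterates incrementally and returns only a small portion of the elements per call, so it does not stall Redis the way keys does.

Each SCAN call returns a new cursor. A traversal starts with cursor 0 and is complete when the server returns cursor 0 again.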

3. Example

redis > scan 0 match user_token* count 5
 1) "6"
 2) 1) "user_token:1000"
    2) "user_token:1001"
    3) "user_token:1010"
    4) "user_token:2300"
    5) "user_token:1389"

The traversal starts from cursor 0. The reply contains the next cursor, 6, together with a batch of keys. To continue the traversal, call scan again with cursor 6:

redis > scan 6 match user_token* count 5
 1) "10"
 2) 1) "user_token:3100"
    2) "user_token:1201"
    3) "user_token:1410"
    4) "user_token:5300"
    5) "user_token:3389"
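The same steps repeat until the returned cursor is 0. To tie this back to the original question (how many logged-in users are online), here is a small Python sketch that replays the two replies shown above and counts distinct user IDs. ReplayClient and count_online_users are hypothetical names made up for this example, and the third reply (cursor 10) is invented to finish the traversal; it does not come from a real server.

```python
class ReplayClient:
    """Hypothetical stub that replays fixed scan replies keyed by cursor."""

    def __init__(self, pages):
        self._pages = pages

    def scan(self, cursor=0, match=None, count=None):
        # match and count are accepted for shape compatibility but ignored.
        return self._pages[cursor]


def count_online_users(client, prefix="user_token:", count=5):
    """Count distinct user IDs behind keys of the form <prefix><userid>."""
    user_ids = set()
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=prefix + "*", count=count)
        for key in keys:
            user_ids.add(key[len(prefix):])  # strip the prefix, keep the userid
        if cursor == 0:  # cursor 0 means the traversal is complete
            break
    return len(user_ids)


pages = {
    0: (6, ["user_token:1000", "user_token:1001", "user_token:1010",
            "user_token:2300", "user_token:1389"]),
    6: (10, ["user_token:3100", "user_token:1201", "user_token:1410",
             "user_token:5300", "user_token:3389"]),
    # Invented final reply: cursor 0 ends the scan, and the repeated key
    # shows why the client has to deduplicate.
    10: (0, ["user_token:1000"]),
}
print(count_online_users(ReplayClient(pages)))  # prints 10
```

Keeping every user ID in a set costs memory on the client, which should be acceptable for a few million short IDs; for a rough figure only, a plain counter without deduplication would be cheaper but may overcount.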

3. Summary

This topic comes up often in interviews, and it is also something we run into regularly at work. At a small company with little data, keys rarely causes trouble, but once the data set is large, using the wrong command can take Redis down, and that will cost you in your performance review, haha.
