Alternatives to the Redis KEYS command for querying large-scale data

Preface

The KEYS command is simple and direct, but it scans the entire keyspace in a single call. On large datasets this causes performance problems and can even block the Redis server. This article introduces four more efficient alternatives to KEYS for querying and managing large volumes of data: the SCAN command, sorted sets, hashes, and the RediSearch module. In my own experience, SCAN is the better choice when querying large volumes of keys in Redis.

Background: problems with the KEYS command

The KEYS command traverses the entire keyspace in one O(N) operation, which causes the following problems for Redis instances that hold a large number of keys:
High latency: Execution takes a long time, which slows the response to other commands.
Blocking Redis: Because Redis executes commands on a single thread, KEYS blocks the server until it finishes, so other operations cannot be processed in time.
Memory consumption: Returning all matching keys in a single reply may take up a lot of memory.
Therefore, the KEYS command should be avoided in production environments.

Alternatives

1. Use the SCAN command

Theory introduction

SCAN is a cursor-based incremental iterator. It traverses the keyspace step by step in batches instead of loading all keys at once. Each call returns a batch of keys and a new cursor; the caller passes that cursor to the next call, and the traversal is complete when the server returns a cursor of 0.

Advantages

Non-blocking: Each call does only a small amount of work, so it does not block the Redis server for long, which makes it suitable for online environments.
Low resource consumption: Only a small number of keys are returned per call, which reduces memory pressure.

Disadvantages

The result set is not a snapshot: A key may be returned more than once, and keys that are added or deleted during the traversal may or may not be returned, especially when the keys change frequently.
Additional parameters: COUNT is only a hint for how much work each call does (the default is 10) and needs to be set sensibly to balance traversal speed against resource consumption.

Sample code

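/**
 * scan command test
 * @author senfel
 * @date 2024/12/26 11:34
 * @return void
 */
@Test
public void scan() {
    try (Jedis jedis = new Jedis("localhost", 6379)) {
        // "0" starts a new iteration; the server returns "0" again when it is complete
        String cursor = "0";
        // MATCH filters keys by pattern; COUNT is a hint for the batch size
        ScanParams scanParams = new ScanParams().match("sys_dict:*").count(100);
        do {
            ScanResult<String> scanResult = jedis.scan(cursor, scanParams);
            for (String key : scanResult.getResult()) {
                System.out.println("Found key: " + key);
            }
            cursor = scanResult.getCursor();
        } while (!"0".equals(cursor));
    }
}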
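
Because SCAN may return the same key more than once, callers that need each key exactly once usually deduplicate on the client side. The following is a minimal sketch of that idea in plain Java with no Redis dependency. ScanPage and ScanCollector are hypothetical names introduced for illustration, and the fetchPage function stands in for a real call such as jedis.scan(cursor, scanParams).

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** One batch of a cursor-based scan: the keys returned and the cursor for the next call. */
final class ScanPage {
    final String nextCursor;
    final List<String> keys;

    ScanPage(String nextCursor, List<String> keys) {
        this.nextCursor = nextCursor;
        this.keys = keys;
    }
}

/** Hypothetical helper: drives a cursor loop and removes duplicate keys. */
final class ScanCollector {
    private ScanCollector() {}

    /**
     * Calls fetchPage repeatedly, starting at cursor "0", until the
     * returned cursor is "0" again. Keys are collected into a set, so a
     * key reported in several batches appears only once.
     */
    static Set<String> collectAll(Function<String, ScanPage> fetchPage) {
        Set<String> unique = new LinkedHashSet<>();
        String cursor = "0";
        do {
            ScanPage page = fetchPage.apply(cursor);
            unique.addAll(page.keys);
            cursor = page.nextCursor;
        } while (!"0".equals(cursor));
        return unique;
    }
}
```

With Jedis, fetchPage could wrap jedis.scan and copy getCursor() and getResult() into a ScanPage. Collecting every key in memory only makes sense for moderately sized result sets; for very large keyspaces it is usually better to process each batch as it arrives and make the processing idempotent.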

2. Use a sorted set (Sorted Set)

Theory introduction

If you need to sort the keys or query them by range, consider storing the key names as members of a sorted set and assigning each one a unique score. Commands such as ZRANGE or ZREVRANGE can then efficiently return the keys in a specified range.

Advantages

Efficient query: Supports fast range queries and sorting.
Flexibility: Score rules can be adjusted according to business needs.

Disadvantages

Additional overhead: The sorted set has to be maintained alongside the keys themselves, which increases the complexity of write operations.

Sample code

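/**
 * sortSet
 * @author senfel
 * @date 2024/12/26 11:51
 * @return void
 */
@Test
public void sortSet() {
    try (Jedis jedis = new Jedis("localhost", 6379)) {
        // Add keys to the sorted set, using the loop index as a unique score
        // (a timestamp or sequence number would also work)
        for (int i = 0; i < 100; i++) {
            jedis.zadd("sorted_keys", i, "senfel" + i);
        }
        // Get the first 10 keys (ranks 0 to 9, both inclusive)
        // Jedis 3.x returns Set<String>; Jedis 4+ returns List<String>
        Set<String> keys = jedis.zrange("sorted_keys", 0, 9);
        for (String key : keys) {
            System.out.println("Key from sorted set: " + key);
        }
    }
}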
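
The ZRANGE call above reads ranks 0 to 9. To page through a larger sorted set, the same call can be repeated with shifted indexes. The sketch below shows one way to compute them; ZRangePage is a hypothetical helper name, and the zero-based page numbering is an assumption made for this example.

```java
/** Hypothetical helper: converts a page number into inclusive ZRANGE indexes. */
final class ZRangePage {
    private ZRangePage() {}

    /**
     * Returns {start, stop} for a zero-based page. ZRANGE indexes are
     * zero-based and the stop index is inclusive, so page 0 with a page
     * size of 10 maps to ranks 0..9.
     */
    static long[] bounds(long page, long pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        long start = page * pageSize;
        long stop = start + pageSize - 1;
        return new long[] {start, stop};
    }
}
```

For example, the result of ZRangePage.bounds(2, 10) can be passed to jedis.zrange("sorted_keys", b[0], b[1]) to read the third page. ZRANGE simply returns fewer members when the stop index lies past the end of the set, so no extra bounds check is needed on the last page.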

3. Use a hash (Hash)

Theory introduction

If the keys have a similar structure or belong to the same category, they can be stored in a single hash, with each field representing one key. The entries can then be fetched in bulk with HGETALL or, for large hashes, incrementally with HSCAN.

Advantages

Centralized management: Easy to operate on and maintain in batches.
Efficient access: Reading or writing a single field is an O(1) operation.

Disadvantages

Limited scope of application: Suitable only when the keys share the same prefix or category.

Sample code

/**
 * useHash
 * @author senfel
 * @date 2024/12/26 11:55
 * @return void
 */
@Test
public void useHash() {
    try (Jedis jedis = new Jedis("localhost", 6379)) {
        for (int i = 0; i < 100; i++) {
            // Add a field to the hash
            jedis.hset("user_data", "name" + i, "senfel" + i);
        }
        // Get all field-value pairs
        Map<String, String> userData = jedis.hgetAll("user_data");
        for (Map.Entry<String, String> entry : userData.entrySet()) {
            System.out.println("User data: " + entry.getKey() + " -> " + entry.getValue());
        }
    }
}
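
Moving from flat keys to a hash means deciding which part of the old key becomes the hash name and which part becomes the field. The sketch below shows one possible mapping. HashKeyMapper is a hypothetical helper, and splitting at the last ':' is an assumed naming convention rather than anything Redis requires.

```java
import java.util.AbstractMap;
import java.util.Map;

/** Hypothetical helper: maps a flat key such as "sys_dict:42" to a hash name and field. */
final class HashKeyMapper {
    private HashKeyMapper() {}

    /**
     * Splits a flat key at its last ':' into (hash name, field).
     * Keys without a separator, or with an empty part on either side,
     * are rejected.
     */
    static Map.Entry<String, String> toHashField(String flatKey) {
        int idx = flatKey.lastIndexOf(':');
        if (idx <= 0 || idx == flatKey.length() - 1) {
            throw new IllegalArgumentException("expected <category>:<id>, got: " + flatKey);
        }
        String hashName = flatKey.substring(0, idx);
        String field = flatKey.substring(idx + 1);
        return new AbstractMap.SimpleImmutableEntry<>(hashName, field);
    }
}
```

The entry's key and value can be passed to jedis.hset as the hash name and field. Everything that used to match a pattern such as sys_dict:* then lives in one hash and can be listed with HGETALL or HSCAN instead of a keyspace scan.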

4. Use Redis modules (such as RediSearch)

Theory introduction

Redis modules extend the capabilities of Redis. One of them, RediSearch, provides full-text search and secondary indexing, which enables efficient management and querying of large amounts of data. It supports a rich query syntax and filter conditions.

Docker is the recommended way to install RediSearch:

docker run --name redisearch -p 16379:6379 -v redis-data:/data redis/redis-stack-server:latest

Advantages

Strong query capabilities: Supports advanced queries such as full-text search and fuzzy matching.
High performance: The optimized index structure ensures efficient query performance.

Disadvantages

Depends on an external module: The Redis module needs to be installed and configured.
Learning cost: The API and configuration are relatively complex and take some time to get familiar with.

Maven dependency

<dependency>
    <groupId>com.redislabs</groupId>
    <artifactId>jredisearch</artifactId>
    <version>2.0.0</version>
</dependency>

Sample code

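/**
 * useRediSearch (RediSearch was not installed, so this example has not been tested)
 * @author senfel
 * @date 2024/12/26 12:26
 * @return void
 */
@Test
public void useRediSearch() {
    // Assumes the JRediSearch 2.0.0 API (io.redisearch.*), where a client is bound to one index.
    // Use port 16379 if Redis was started with the Docker command above.
    Client client = new Client("idx", "localhost", 6379);
    // Create the index with two text fields
    Schema schema = new Schema()
            .addTextField("title", 1.0)
            .addTextField("content", 1.0);
    client.createIndex(schema, Client.IndexOptions.defaultOptions());
    // Add a document
    Map<String, Object> fields = new HashMap<>();
    fields.put("title", "Redis Search");
    fields.put("content", "Learn how to use Redis Search");
    client.addDocument("doc1", fields);
    // Query the documents
    SearchResult result = client.search(new Query("Redis"));
    for (Document doc : result.docs) {
        System.out.println("Found document: " + doc.getId());
    }
    client.close();
}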

Summary

To sum up, the options for working with large volumes of Redis data include the SCAN command, sorted sets, hashes, and the RediSearch extension module. For traversing a large number of keys, the non-blocking, low-overhead SCAN command is generally the right choice. Sorted sets suit scenarios that require sorting or range queries. Hashes suit keys that share the same prefix or category. If full-text search or complex queries are required, the RediSearch module provides high-performance, powerful query capabilities.
