Introduction to 5 BitMap application scenarios and implementations in Redis

Redis BitMap is not a separate data type but a set of bit-oriented commands that treat a String value as an array of bits. A Redis string can hold up to 512MB, so a single BitMap can address up to 2^32 bits, and reading or writing a single bit is O(1). This makes BitMaps well suited to tracking boolean state across very large populations: counting and set-style analysis can be done quickly with very little memory.

1. Redis BitMap Basics

1.1 Basic Concept

BitMap is essentially a bit array, and each element of the array can only be 0 or 1. In Redis, BitMap is implemented based on the String type. Each byte (8 bits) of a string can represent 8 different bits, thus realizing the function of a bit array.
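To make the layout concrete, here is a minimal in-memory sketch of how a bit offset maps onto the bytes of a string. It assumes Redis's documented convention that offset 0 is the most significant bit of the first byte. The class and method names (`BitLayoutSketch`, `setBit`, `getBit`) are made up for illustration and are not part of Redis or Jedis.

```java
import java.util.Arrays;

// Illustrative model only: mimics how SETBIT/GETBIT address bits inside a byte string.
class BitLayoutSketch {
    private byte[] bytes = new byte[0];

    // Set the bit at the given offset to 1, growing the array the way Redis grows the string.
    void setBit(int offset) {
        int byteIndex = offset >>> 3;                        // 8 bits per byte
        if (byteIndex >= bytes.length) {
            bytes = Arrays.copyOf(bytes, byteIndex + 1);     // new bytes are zero-filled
        }
        bytes[byteIndex] |= (byte) (0x80 >>> (offset & 7));  // offset 0 = most significant bit
    }

    // Read the bit at the given offset; bits past the end of the array read as 0.
    boolean getBit(int offset) {
        int byteIndex = offset >>> 3;
        if (byteIndex >= bytes.length) {
            return false;
        }
        return (bytes[byteIndex] & (0x80 >>> (offset & 7))) != 0;
    }

    int lengthInBytes() {
        return bytes.length;
    }

    // Unsigned value (0-255) of the byte at the given index.
    int byteAt(int index) {
        return bytes[index] & 0xFF;
    }

    public static void main(String[] args) {
        BitLayoutSketch bitmap = new BitLayoutSketch();
        bitmap.setBit(7);
        bitmap.setBit(9);
        System.out.println(bitmap.lengthInBytes() + " bytes, first byte = " + bitmap.byteAt(0));
    }
}
```

Setting offset 7 touches only the lowest bit of the first byte, while setting offset 9 forces a second byte to exist. The same growth rule is why the highest offset written, rather than the number of bits set, determines how much memory a BitMap uses.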

1.2 Core Commands

Redis provides a series of commands to operate BitMap:

SETBIT key offset value: Sets the bit at offset to 0 or 1
GETBIT key offset: Returns the bit at offset
BITCOUNT key [start end]: Counts the bits set to 1; start and end are byte indexes by default, not bit offsets
BITPOS key bit [start end]: Returns the offset of the first bit equal to the given value (0 or 1); start and end are byte indexes by default
BITOP operation destkey key [key ...]: Performs a bitwise operation (AND, OR, XOR, NOT) across one or more BitMaps and stores the result in destkey
BITFIELD key [GET type offset] [SET type offset value]: Reads or writes several multi-bit integer fields in one atomic command

2. Application scenario 1: User check-in system

2.1 Scene Description

Many applications need to record whether a user checks in each day and to answer questions such as how many days the user checked in during the month or how long the current streak is. A traditional approach stores one row per check-in in a relational database, which uses more storage and makes these queries comparatively slow.

2.2 BitMap Solution

With a BitMap, one bit represents the check-in status for one day, so a month needs only 28 to 31 bits.

2.3 Implementation example

import redis.clients.jedis.Jedis;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class SignInSystem {
    private Jedis jedis;
    private static final DateTimeFormatter MONTH_FORMATTER = DateTimeFormatter.ofPattern("yyyyMM");

    public SignInSystem(String host, int port) {
        this.jedis = new Jedis(host, port);
    }

    // User check-in
    public void signIn(long userId, LocalDate date) {
        String signKey = getSignKey(userId, date);
        int dayOfMonth = date.getDayOfMonth() - 1; // BitMap offsets are 0-based
        jedis.setbit(signKey, dayOfMonth, true);
    }

    // Check whether the user has checked in on the given date
    public boolean hasSignedIn(long userId, LocalDate date) {
        String signKey = getSignKey(userId, date);
        int dayOfMonth = date.getDayOfMonth() - 1;
        return jedis.getbit(signKey, dayOfMonth);
    }

    // Get the number of days the user checked in during the month
    public long getMonthlySignCount(long userId, LocalDate date) {
        String signKey = getSignKey(userId, date);
        return jedis.bitcount(signKey);
    }

    // Get the user's first check-in day of the month (-1 if none)
    public int getFirstSignInDay(long userId, LocalDate date) {
        String signKey = getSignKey(userId, date);
        long pos = jedis.bitpos(signKey, true);
        return pos == -1 ? -1 : (int) pos + 1; // Convert back to a 1-based day of month
    }

    // Get the number of consecutive check-in days ending on the given date
    public int getConsecutiveSignDays(long userId, LocalDate date) {
        String signKey = getSignKey(userId, date);
        int dayOfMonth = date.getDayOfMonth() - 1;
        int count = 0;

        // Walk backwards from the given day until the first day without a check-in
        for (int i = dayOfMonth; i >= 0; i--) {
            if (jedis.getbit(signKey, i)) {
                count++;
            } else {
                break;
            }
        }
        return count;
    }

    // Build the check-in key, for example user:sign:42:202405
    private String getSignKey(long userId, LocalDate date) {
        return "user:sign:" + userId + ":" + date.format(MONTH_FORMATTER);
    }
}

2.4 Performance and spatial analysis

Space usage: The bitmap for one user and one month is at most 4 bytes (31 bits), not counting the key name and Redis's per-key overhead
Time complexity: A single check-in or lookup is O(1); the streak query issues one GETBIT per day, at most 31 per call
Advantages: Extremely low storage cost and efficient counting
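The streak query above issues one GETBIT per day. As a hedged alternative, the whole month can be fetched in one round trip (for example with `BITFIELD key GET u31 0`, which returns the first 31 bits as an unsigned integer) and the counting done on the client. The sketch below shows only the client-side arithmetic. The class name `MonthBits` and its methods are hypothetical, and it assumes the value was read as a 31-bit unsigned field, so that day 1 is the highest of the 31 bits.

```java
// Illustrative helper: interprets a 31-bit month value in which day 1 is the highest bit.
class MonthBits {
    private static final int DAYS = 31;

    // True if the given day (1-based) is marked in the month value.
    static boolean isSigned(long monthBits, int dayOfMonth) {
        return ((monthBits >>> (DAYS - dayOfMonth)) & 1L) == 1L;
    }

    // Number of consecutive signed days ending at, and including, dayOfMonth.
    static int streakEndingOn(long monthBits, int dayOfMonth) {
        int streak = 0;
        for (int day = dayOfMonth; day >= 1 && isSigned(monthBits, day); day--) {
            streak++;
        }
        return streak;
    }

    // Total number of signed days in the month.
    static int totalSigned(long monthBits) {
        return Long.bitCount(monthBits);
    }

    // Build a month value from a list of signed days; used here only to create sample data.
    static long fromDays(int... days) {
        long bits = 0L;
        for (int day : days) {
            bits |= 1L << (DAYS - day);
        }
        return bits;
    }

    public static void main(String[] args) {
        long month = fromDays(1, 2, 3, 5);
        System.out.println("total=" + totalSigned(month)
                + " streak(day 3)=" + streakEndingOn(month, 3));
    }
}
```

With this layout the monthly total is a single `Long.bitCount` call and the streak needs no further network traffic, at the cost of decoding the bit order on the client.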

3. Application scenario 2: Online user statistics

3.1 Scene Description

Large systems need to count online users in real time and analyze user activity through key indicators such as daily active users (DAU) and monthly active users (MAU). Traditional solutions may use Set or Hash structures, but these consume a lot of memory when the user base is very large.

3.2 BitMap Solution

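With a BitMap, the user ID maps directly to a bit offset, so each user takes 1 bit. Ten million users with IDs in the range 0 to 9,999,999 need about 1.2MB per bitmap. The size is determined by the largest ID written, not by the number of active users, so this works best when user IDs are dense integers.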
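The 1.2MB figure follows from simple arithmetic: a bitmap must be long enough to contain its highest offset, at 8 bits per byte. A small sketch of that calculation (the class name `BitmapSizing` is made up for this example):

```java
// Illustrative arithmetic: how large a bitmap string gets for a given highest offset.
class BitmapSizing {
    // Bytes needed so that maxOffset is addressable (offsets start at 0).
    static long bytesForMaxOffset(long maxOffset) {
        return (maxOffset >>> 3) + 1;
    }

    public static void main(String[] args) {
        // Dense user IDs 0..9,999,999
        System.out.println(bytesForMaxOffset(9_999_999L));
        // Highest offset Redis allows, 2^32 - 1
        System.out.println(bytesForMaxOffset(4_294_967_295L));
    }
}
```

The first call gives 1,250,000 bytes and the second 536,870,912 bytes (512MB). The same formula shows the cost of sparse IDs: a single write at offset 4,000,000,000 needs about 500MB even if it is the only bit set.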

3.3 Implementation example

import redis.clients.jedis.Jedis;
import redis.clients.jedis.args.BitOP; // Jedis 4.x and later; in Jedis 3.x the enum is redis.clients.jedis.BitOP
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class UserActivityTracker {
    private Jedis jedis;
    private static final DateTimeFormatter DATE_FORMATTER = DateTimeFormatter.ofPattern("yyyyMMdd");

    public UserActivityTracker(String host, int port) {
        this.jedis = new Jedis(host, port);
    }

    // Record user activity: the user ID is the bit offset
    public void trackUserActivity(long userId, LocalDate date) {
        String key = getActivityKey(date);
        jedis.setbit(key, userId, true);
    }

    // Get the daily active users (DAU)
    public long getDailyActiveUsers(LocalDate date) {
        String key = getActivityKey(date);
        return jedis.bitcount(key);
    }

    // Get monthly active users (MAU)
    public long getMonthlyActiveUsers(int year, int month) {
        LocalDate startDate = LocalDate.of(year, month, 1);
        LocalDate endDate = startDate.plusMonths(1).minusDays(1);

        // Create temporary result key (zero-padded so that year and month cannot run together)
        String destKey = String.format("temp:mau:%04d%02d", year, month);

        // Merge the active users of every day in the month
        for (LocalDate date = startDate; !date.isAfter(endDate); date = date.plusDays(1)) {
            String dayKey = getActivityKey(date);
            // OR the day's bitmap into the running result; a missing key counts as all zeros
            jedis.bitop(BitOP.OR, destKey, destKey, dayKey);
        }

        // Count the users active on at least one day
        long mau = jedis.bitcount(destKey);

        // Clean up the temporary key
        jedis.del(destKey);

        return mau;
    }

    // Count the users active on both days (used for retention rate)
    public long getActiveUserOverlap(LocalDate date1, LocalDate date2) {
        String key1 = getActivityKey(date1);
        String key2 = getActivityKey(date2);
        String destKey = "temp:overlap:" + date1.format(DATE_FORMATTER) + ":" + date2.format(DATE_FORMATTER);

        // AND keeps only the users who were active on both days
        jedis.bitop(BitOP.AND, destKey, key1, key2);
        long overlap = jedis.bitcount(destKey);

        // Clean up the temporary key
        jedis.del(destKey);

        return overlap;
    }

    // Build the activity key, for example user:active:20240501
    private String getActivityKey(LocalDate date) {
        return "user:active:" + date.format(DATE_FORMATTER);
    }
}

3.4 Expansion: Calculation of next-day retention rate

// Method of UserActivityTracker
public double getRetentionRate(LocalDate date) {
    LocalDate nextDate = date.plusDays(1);

    // Number of active users on the day
    long todayActive = getDailyActiveUsers(date);
    if (todayActive == 0) return 0.0;

    // Number of that day's active users who are still active the next day
    long overlap = getActiveUserOverlap(date, nextDate);

    // Retention rate as a fraction between 0 and 1
    return (double) overlap / todayActive;
}
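The set arithmetic behind DAU, MAU, and retention can be checked without a Redis server. The sketch below models it with `java.util.BitSet`, whose `or`, `and`, and `cardinality` methods play the roles of BITOP OR, BITOP AND, and BITCOUNT. The class `ActivityModel` and its sample data are hypothetical and exist only to illustrate the calculation.

```java
import java.util.BitSet;

// In-memory model of the DAU / MAU / retention arithmetic using java.util.BitSet.
class ActivityModel {
    // Equivalent of BITOP OR followed by BITCOUNT: users active on at least one day.
    static int activeOnAnyDay(BitSet... days) {
        BitSet union = new BitSet();
        for (BitSet day : days) {
            union.or(day);
        }
        return union.cardinality();
    }

    // Equivalent of BITOP AND followed by BITCOUNT: users active on both days.
    static int activeOnBoth(BitSet first, BitSet second) {
        BitSet both = (BitSet) first.clone(); // clone so the input is not modified
        both.and(second);
        return both.cardinality();
    }

    // Fraction of the first day's users who were also active on the next day.
    static double retention(BitSet day, BitSet nextDay) {
        int base = day.cardinality();
        return base == 0 ? 0.0 : (double) activeOnBoth(day, nextDay) / base;
    }

    // Build a bitmap with one bit set per user ID; sample-data helper.
    static BitSet of(int... userIds) {
        BitSet bits = new BitSet();
        for (int id : userIds) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet monday = of(1, 2, 3);
        BitSet tuesday = of(2, 3, 4);
        System.out.println("active on either day: " + activeOnAnyDay(monday, tuesday));
        System.out.println("retention: " + retention(monday, tuesday));
    }
}
```

In this sample, users 2 and 3 appear on both days, so two of Monday's three users are retained, and the union across both days contains four distinct users.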

4. Application scenario 3: Bloom filter implementation

4.1 Scene Description

The Bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. It is widely used in big data processing, cache penetration protection, spam filtering, and similar scenarios. A Bloom filter can return false positives (it may report that an element is present when it is not) but never false negatives, and in exchange it answers queries quickly using very little memory.

4.2 BitMap Solution

Bloom filters can be easily implemented using Redis's BitMap, mapping elements to different locations in a bit array through multiple hash functions.

4.3 Implementation example

import redis.clients.jedis.Jedis;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

public class RedisBloomFilter {
    private Jedis jedis;
    private String key;
    private int hashFunctions;
    private long size;

    /**
     * Create a Bloom filter
     * @param host Redis host
     * @param port Redis port
     * @param key filter key name
     * @param size bit array size
     * @param hashFunctions number of hash functions
     */
    public RedisBloomFilter(String host, int port, String key, long size, int hashFunctions) {
        this.jedis = new Jedis(host, port);
        this.key = key;
        this.size = size;
        this.hashFunctions = hashFunctions;
    }

    /**
     * Add an element to the Bloom filter
     */
    public void add(String value) {
        for (long position : getHashPositions(value)) {
            jedis.setbit(key, position, true);
        }
    }

    /**
     * Determine whether an element may exist in the filter
     * @return true means it may exist, false means it definitely does not exist
     */
    public boolean mightContain(String value) {
        for (long position : getHashPositions(value)) {
            if (!jedis.getbit(key, position)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Calculate the bit positions of an element in the Bloom filter
     */
    private List<Long> getHashPositions(String value) {
        List<Long> positions = new ArrayList<>(hashFunctions);

        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] bytes = md.digest(value.getBytes(StandardCharsets.UTF_8));

            // Split the 16-byte digest into two 64-bit halves and combine them as
            // h1 + i * h2 (double hashing). Reading consecutive 4-byte windows of the
            // digest instead would repeat after 4 functions, because the digest has only 16 bytes.
            long h1 = ByteBuffer.wrap(bytes, 0, 8).getLong();
            long h2 = ByteBuffer.wrap(bytes, 8, 8).getLong();
            for (int i = 0; i < hashFunctions; i++) {
                // floorMod keeps the position in [0, size) even when the sum is negative
                positions.add(Math.floorMod(h1 + i * h2, size));
            }
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException("MD5 algorithm not found", e);
        }

        return positions;
    }

    /**
     * Reset the filter
     */
    public void clear() {
        jedis.del(key);
    }
}

4.4 Application example: cache penetration protection

import redis.clients.jedis.Jedis;
import java.util.ArrayList;
import java.util.List;

public class CacheService {
    private RedisBloomFilter bloomFilter;
    private Jedis jedis;

    public CacheService(String host, int port) {
        this.jedis = new Jedis(host, port);
        // Create a Bloom filter with 10 million bits and 7 hash functions
        this.bloomFilter = new RedisBloomFilter(host, port, "cache:bloom:filter", 10_000_000, 7);

        // Initialize the filter with all valid IDs
        initBloomFilter();
    }

    private void initBloomFilter() {
        // Simulate loading all valid IDs from the database and adding them to the Bloom filter
        List<String> allValidIds = getAllIdsFromDatabase();
        for (String id : allValidIds) {
            bloomFilter.add(id);
        }
    }

    public String getDataById(String id) {
        // First check whether the ID may exist
        if (!bloomFilter.mightContain(id)) {
            return null; // The ID definitely does not exist, return directly
        }

        // Try the cache
        String cacheKey = "cache:data:" + id;
        String data = jedis.get(cacheKey);

        if (data != null) {
            return data; // Cache hit
        }

        // Cache miss, read from the database
        data = getFromDatabase(id);

        if (data != null) {
            // Save to cache for one hour
            jedis.setex(cacheKey, 3600, data);
            return data;
        }

        // The ID is not in the database (a Bloom filter false positive)
        return null;
    }

    // Simulate reading data from the database
    private String getFromDatabase(String id) {
        // A real project would query the database here
        return null; // Simulates a record that does not exist
    }

    // Simulate reading all IDs from the database
    private List<String> getAllIdsFromDatabase() {
        // A real project would query the database for all valid IDs
        return new ArrayList<>();
    }
}
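The cache example above fixes the filter at 10 million bits and 7 hash functions without saying how many IDs it is expected to hold. The standard Bloom filter approximations relate these quantities: for n items and a target false-positive rate p, the bit count is m = -n ln p / (ln 2)^2, the number of hash functions is k = (m / n) ln 2, and the expected false-positive rate is roughly (1 - e^(-kn/m))^k. The sketch below evaluates them. The class name `BloomSizing` and the figure of one million IDs are assumptions made for illustration, not values taken from the article.

```java
// Illustrative sizing helper based on the standard Bloom filter approximations.
class BloomSizing {
    // Bits needed for the expected item count and target false-positive rate.
    static long optimalBits(long expectedItems, double falsePositiveRate) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-expectedItems * Math.log(falsePositiveRate) / (ln2 * ln2));
    }

    // Number of hash functions that minimizes false positives for the given sizes.
    static int optimalHashFunctions(long bits, long expectedItems) {
        return Math.max(1, (int) Math.round((double) bits / expectedItems * Math.log(2)));
    }

    // Approximate false-positive rate once `items` elements have been added.
    static double expectedFalsePositiveRate(long bits, int hashFunctions, long items) {
        double fractionOfBitsSet = 1.0 - Math.exp(-(double) hashFunctions * items / bits);
        return Math.pow(fractionOfBitsSet, hashFunctions);
    }

    public static void main(String[] args) {
        long items = 1_000_000L; // assumed number of valid IDs
        long bits = optimalBits(items, 0.01);
        System.out.println("bits for 1% error: " + bits);
        System.out.println("hash functions: " + optimalHashFunctions(bits, items));
        System.out.println("rate with 10M bits, k=7: "
                + expectedFalsePositiveRate(10_000_000L, 7, items));
    }
}
```

Under that assumption, 10 million bits and 7 hash functions correspond to a false-positive rate a little under 1%. The rate climbs quickly if far more IDs are loaded than the filter was sized for, so the bit count should be chosen from the expected data volume.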

5. Application Scenario 4: User Behavior Analysis and Recommendation System

5.1 Scene Description

In the recommendation system, it is necessary to analyze the user's behavioral preferences for different items (such as articles and products), including browsing, collection, likes, etc. This data is used to build inputs to user profiles and content recommendation algorithms. Traditional solutions may use relational databases or document databases to store these behavioral records, but they will face storage and query efficiency problems in large-scale scenarios.

5.2 BitMap Solution

BitMaps can store a user's behavior toward items efficiently. For example, keep a separate BitMap per user and behavior type, with the item ID as the bit offset, to record whether the user has browsed, collected, or purchased a given product.

5.3 Implementation example

import redis.clients.jedis.Jedis;
import redis.clients.jedis.args.BitOP; // Jedis 4.x and later; in Jedis 3.x the enum is redis.clients.jedis.BitOP
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UserBehaviorAnalyzer {
    private Jedis jedis;

    // Behavior type constants
    private static final String VIEW = "view";
    private static final String LIKE = "like";
    private static final String COLLECT = "collect";
    private static final String PURCHASE = "purchase";

    public UserBehaviorAnalyzer(String host, int port) {
        this.jedis = new Jedis(host, port);
    }

    /**
     * Record the user's behavior on an item
     * @param userId User ID
     * @param itemId Item ID
     * @param behaviorType behavior type
     */
    public void recordBehavior(long userId, long itemId, String behaviorType) {
        String key = getBehaviorKey(userId, behaviorType);
        jedis.setbit(key, itemId, true);
    }

    /**
     * Check whether the user has performed a specific behavior on the item
     */
    public boolean hasBehavior(long userId, long itemId, String behaviorType) {
        String key = getBehaviorKey(userId, behaviorType);
        return jedis.getbit(key, itemId);
    }

    /**
     * Get the number of items on which the user has performed a specific behavior
     */
    public long getBehaviorCount(long userId, String behaviorType) {
        String key = getBehaviorKey(userId, behaviorType);
        return jedis.bitcount(key);
    }

    /**
     * Count the users who have performed a specific behavior on an item
     */
    public long getUserCountWithBehavior(long itemId, String behaviorType) {
        // This implementation issues one GETBIT per user and is for illustration only;
        // a real project should consider the performance impact and a different key layout.
        int userCount = 0;

        // Assume that the user ID range is 1-10000
        for (long userId = 1; userId <= 10000; userId++) {
            if (hasBehavior(userId, itemId, behaviorType)) {
                userCount++;
            }
        }

        return userCount;
    }

    /**
     * Calculate behavioral similarity between two users (for collaborative filtering)
     * @return the number of items on which both users have performed this behavior
     */
    public long calculateUserSimilarity(long userId1, long userId2, String behaviorType) {
        String key1 = getBehaviorKey(userId1, behaviorType);
        String key2 = getBehaviorKey(userId2, behaviorType);
        String destKey = "temp:similarity:" + userId1 + ":" + userId2 + ":" + behaviorType;

        // AND keeps only the items both users have in common
        jedis.bitop(BitOP.AND, destKey, key1, key2);
        long similarity = jedis.bitcount(destKey);

        // Clean up the temporary key
        jedis.del(destKey);

        return similarity;
    }

    /**
     * Generate item recommendations based on user behavior
     * @return list of recommended item IDs
     */
    public List<Long> getRecommendations(long userId, int limit) {
        List<Long> recommendations = new ArrayList<>();
        Set<Long> alreadyViewed = new HashSet<>();

        // Collect the items that the user has viewed
        String viewKey = getBehaviorKey(userId, VIEW);
        for (long i = 0; i < 10000; i++) { // Assumed item ID range
            if (jedis.getbit(viewKey, i)) {
                alreadyViewed.add(i);
            }
        }

        // Find users with similar behavior
        List<Long> similarUsers = findSimilarUsers(userId);

        // Recommend items from the similar users' browsing history
        for (Long similarUserId : similarUsers) {
            String otherViewKey = getBehaviorKey(similarUserId, VIEW);
            for (long i = 0; i < 10000; i++) { // Assumed item ID range
                if (recommendations.size() >= limit) {
                    break;
                }

                // Recommend only items that the user has not viewed
                if (jedis.getbit(otherViewKey, i) && !alreadyViewed.contains(i)) {
                    recommendations.add(i);
                    alreadyViewed.add(i); // Avoid recommending the same item twice
                }
            }
        }

        return recommendations;
    }

    // Find similar users
    private List<Long> findSimilarUsers(long userId) {
        // For illustration only; a real application may need a more sophisticated algorithm
        List<Long> similarUsers = new ArrayList<>();

        // Assume that the user ID range is 1-10000
        for (long otherUserId = 1; otherUserId <= 10000; otherUserId++) {
            if (userId == otherUserId) continue;

            long similarityScore = calculateUserSimilarity(userId, otherUserId, VIEW);
            if (similarityScore > 5) { // Similarity threshold
                similarUsers.add(otherUserId);
            }

            if (similarUsers.size() >= 10) {
                break; // Limit the number of similar users
            }
        }

        return similarUsers;
    }

    // Build the behavior key, for example user:42:view
    private String getBehaviorKey(long userId, String behaviorType) {
        return "user:" + userId + ":" + behaviorType;
    }
}

6. Application scenario 5: IP address statistics and blacklisting system

6.1 Scene Description

In network security and traffic analysis scenarios, it is necessary to record which IP addresses have accessed the service, identify abnormal IPs, and maintain IP blacklists and whitelists. Traditional solutions may use Hash or Set to store IP addresses, but in large-scale scenarios their memory consumption is huge.

6.2 BitMap Solution

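A BitMap can map each IPv4 address to a bit offset, greatly saving memory. There are 2^32 IPv4 addresses (about 4.3 billion), so a single 512MB BitMap can represent every possible address. Because Redis allocates the string up to the highest offset written, a bitmap approaches the full 512MB as soon as one high-numbered address is recorded.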
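A minimal sketch of the address-to-offset mapping, assuming the input is a plain dotted-quad IPv4 string. It parses the text directly instead of using a name resolver, so hostnames and IPv6 addresses are rejected rather than looked up. The class name `Ipv4Offset` and its methods are hypothetical.

```java
// Illustrative conversion between dotted-quad IPv4 text and a BitMap offset.
class Ipv4Offset {
    // Convert "a.b.c.d" to a value in [0, 2^32 - 1]; rejects anything that is not four octets.
    static long toOffset(String dottedQuad) {
        String[] parts = dottedQuad.split("\\.", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + dottedQuad);
        }
        long offset = 0;
        for (String part : parts) {
            if (!part.matches("\\d{1,3}")) {
                throw new IllegalArgumentException("Not an IPv4 address: " + dottedQuad);
            }
            int octet = Integer.parseInt(part);
            if (octet > 255) {
                throw new IllegalArgumentException("Octet out of range: " + dottedQuad);
            }
            offset = (offset << 8) | octet; // first octet ends up in the highest byte
        }
        return offset;
    }

    // Convert an offset back to dotted-quad text.
    static String fromOffset(long offset) {
        return ((offset >> 24) & 0xFF) + "."
                + ((offset >> 16) & 0xFF) + "."
                + ((offset >> 8) & 0xFF) + "."
                + (offset & 0xFF);
    }

    public static void main(String[] args) {
        long offset = toOffset("192.168.1.1");
        System.out.println(offset + " -> " + fromOffset(offset));
    }
}
```

The result always fits within the 2^32-1 offset limit, because four octets can never encode a larger value.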

6.3 Implementation example

import redis.clients.jedis.Jedis;
import redis.clients.jedis.args.BitOP; // Jedis 4.x and later; in Jedis 3.x the enum is redis.clients.jedis.BitOP
import java.net.InetAddress;
import java.net.UnknownHostException;

public class IPAddressTracker {
    private Jedis jedis;

    public IPAddressTracker(String host, int port) {
        this.jedis = new Jedis(host, port);
    }

    /**
     * Add IP address to blacklist
     */
    public void addToBlacklist(String ipAddress) {
        long ipValue = ipToLong(ipAddress);
        jedis.setbit("ip:blacklist", ipValue, true);
    }

    /**
     * Check if the IP is on the blacklist
     */
    public boolean isBlacklisted(String ipAddress) {
        long ipValue = ipToLong(ipAddress);
        return jedis.getbit("ip:blacklist", ipValue);
    }

    /**
     * Record IP access
     */
    public void trackIPVisit(String ipAddress) {
        long ipValue = ipToLong(ipAddress);
        jedis.setbit("ip:visited", ipValue, true);
    }

    /**
     * Get the number of distinct IPs that have visited
     */
    public long getUniqueIPCount() {
        return jedis.bitcount("ip:visited");
    }

    /**
     * Record IP access on a specific date
     */
    public void trackIPVisitByDate(String ipAddress, String date) {
        long ipValue = ipToLong(ipAddress);
        jedis.setbit("ip:visited:" + date, ipValue, true);
    }

    /**
     * Get the number of distinct IPs that visited on a specific date
     */
    public long getUniqueIPCountByDate(String date) {
        return jedis.bitcount("ip:visited:" + date);
    }

    /**
     * Get the number of IPs that were active on every one of the given dates
     */
    public long getActiveIPsForDays(String[] dates) {
        if (dates.length == 0) return 0;

        String destKey = "temp:active:ips";

        // Copy the data from the first day (AND with a single source is a copy)
        jedis.bitop(BitOP.AND, destKey, "ip:visited:" + dates[0]);

        // AND the remaining dates into the result
        for (int i = 1; i < dates.length; i++) {
            jedis.bitop(BitOP.AND, destKey, destKey, "ip:visited:" + dates[i]);
        }

        long count = jedis.bitcount(destKey);
        jedis.del(destKey);

        return count;
    }

    /**
     * Convert an IPv4 address to a long integer
     */
    private long ipToLong(String ipAddress) {
        try {
            // Note: getByName also resolves hostnames, which may trigger a DNS lookup
            byte[] bytes = InetAddress.getByName(ipAddress).getAddress();
            if (bytes.length != 4) {
                // IPv6 addresses do not fit within the 2^32 offsets of a BitMap
                throw new IllegalArgumentException("Not an IPv4 address: " + ipAddress);
            }
            long result = 0;
            for (byte b : bytes) {
                result = result << 8 | (b & 0xFF);
            }
            return result;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Invalid IP address: " + ipAddress, e);
        }
    }

    /**
     * Convert a long integer back to an IPv4 address
     */
    private String longToIp(long ip) {
        return ((ip >> 24) & 0xFF) + "." +
               ((ip >> 16) & 0xFF) + "." +
               ((ip >> 8) & 0xFF) + "." +
               (ip & 0xFF);
    }
}

6.4 Application example: DDoS attack protection

import redis.clients.jedis.Jedis;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.time.LocalDate;

public class DDOSProtection {
    private IPAddressTracker ipTracker;
    private Jedis jedis;
    private String currentDateKey;

    public DDOSProtection(String host, int port) {
        this.jedis = new Jedis(host, port);
        this.ipTracker = new IPAddressTracker(host, port);
        updateDateKey();
    }

    // Update the date key, for example ip:access:count:2024-05-01
    private void updateDateKey() {
        String date = LocalDate.now().toString();
        this.currentDateKey = "ip:access:count:" + date;
    }

    /**
     * Record IP access and check if the threshold is exceeded
     * @return true means that the IP should be blocked
     */
    public boolean shouldBlockIP(String ipAddress, int accessLimit) {
        // Refresh the key so that counters roll over when the date changes
        updateDateKey();

        // Check if it is already on the blacklist
        if (ipTracker.isBlacklisted(ipAddress)) {
            return true;
        }

        // Parse the address; malformed input raises IllegalArgumentException
        long ipValue = ipToLong(ipAddress);
        String accessKey = currentDateKey + ":" + ipAddress;

        // Increment the visit counter for this IP
        long accessCount = jedis.incr(accessKey);

        // Set a 24-hour expiration when the counter is first created
        if (accessCount == 1) {
            jedis.expire(accessKey, 86400);
        }

        // Check whether the access limit is exceeded
        if (accessCount > accessLimit) {
            // Add to blacklist
            ipTracker.addToBlacklist(ipAddress);
            return true;
        }

        return false;
    }

    /**
     * Convert an IP address to a long integer
     */
    private long ipToLong(String ipAddress) {
        try {
            byte[] bytes = InetAddress.getByName(ipAddress).getAddress();
            long result = 0;
            for (byte b : bytes) {
                result = result << 8 | (b & 0xFF);
            }
            return result;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Invalid IP address: " + ipAddress, e);
        }
    }
}

7. Performance optimization and best practices

BitMap is efficient and powerful in Redis, but the following points need attention when using it:

7.1 Memory usage

Size calculation: Every 8 bits take up 1 byte, so 2^32 bits require 512MB
Automatic expansion: Redis grows the string to cover the highest offset written, so a single high offset allocates all the bytes below it
Sparse bitmaps: When only a few bits are set across a very large offset range, storing the members in a Set or Hash may use less memory

7.2 Operational efficiency

Single-bit operations: GETBIT and SETBIT are O(1)
Range operations: BITCOUNT and BITPOS are O(N) in the number of bytes scanned, so restrict the range on large bitmaps
Bitwise operations: BITOP takes time proportional to the length of the longest input, so avoid running it on very large BitMaps

7.3 Limitations of use

Offset limit: The maximum offset is 2^32-1
Atomicity: Each individual bit command is atomic, which suits concurrent use; a sequence of commands (for example BITOP followed by BITCOUNT and DEL) is not atomic unless it is wrapped in a transaction or script
Persistence considerations: A large number of BitMap operations will increase AOF file size and RDB snapshot time

7.4 Best Practices

Key naming: Use consistent naming rules so that keys are easy to manage
Regular cleanup: Set an expiration time on temporary BitMaps
Batch operations: Use the BITFIELD command to read or write several fields in one call
Cache results: Cache bit statistics that are computed frequently
Monitor memory: A large number of BitMaps can cause memory usage to surge, so monitor it

8. Summary

In practical applications, the biggest advantages of BitMap are its extremely low memory consumption and O(1) single-bit operations, which make it well suited to membership questions over large-scale collections. With well-designed key structures and operation logic, BitMap can handle statistics and analysis over massive data that traditional solutions struggle with.
