Introduction
Redis plays a crucial role as the core cache component in high-concurrency system architectures: it significantly improves response times and effectively reduces database load.
However, when the Redis service fails, degrades, or times out, a system without an appropriate degradation mechanism may collapse, leaving services globally unavailable.
Cache downgrade is therefore a key part of high-availability system design: it defines alternative behavior for when the cache layer fails, so that core business processes can keep running.
What is cache downgrade?
Cache downgrade refers to the alternative processing the system adopts, actively or passively, when the cache service is unavailable or responding extremely slowly, in order to preserve business continuity and system stability.
Compared with countermeasures for problems such as cache penetration, cache breakdown, and cache avalanche, cache downgrade focuses on graceful degradation: making deliberate compromises in performance and functionality while keeping the system's core functions available.
Strategy 1: Local cache fallback strategy
Principle
The local cache fallback strategy adds a local cache layer inside the application (e.g. Caffeine or Guava Cache) alongside the Redis cache layer. When Redis is unavailable, the system automatically switches to the local cache. Data consistency and freshness may suffer, but basic caching capability is preserved.
Implementation method
Here is an example of local cache fallback implemented with Spring Boot and Caffeine:
```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ProductService {

    @Autowired
    private RedisTemplate<String, Product> redisTemplate;

    // Configure the local cache (Caffeine)
    private final Cache<String, Product> localCache = Caffeine.newBuilder()
            .expireAfterWrite(5, TimeUnit.MINUTES)
            .maximumSize(1000)
            .build();

    @Autowired
    private ProductRepository productRepository;

    private final AtomicBoolean redisAvailable = new AtomicBoolean(true);

    public Product getProductById(String productId) {
        Product product = null;

        // Try Redis first
        if (redisAvailable.get()) {
            try {
                product = redisTemplate.opsForValue().get("product:" + productId);
            } catch (Exception e) {
                // Redis failed: mark it unavailable and log
                redisAvailable.set(false);
                log.error("Redis unavailable, switching to local cache", e);
                // Start a background task that detects Redis recovery
                scheduleRedisRecoveryCheck();
            }
        }

        // Redis unavailable or missed: try the local cache
        if (product == null) {
            product = localCache.getIfPresent(productId);
        }

        // Local cache also missed: load from the database
        if (product == null) {
            product = productRepository.findById(productId).orElse(null);
            if (product != null) {
                localCache.put(productId, product);
                // If Redis is available, also refresh the Redis cache
                if (redisAvailable.get()) {
                    try {
                        redisTemplate.opsForValue()
                                .set("product:" + productId, product, 30, TimeUnit.MINUTES);
                    } catch (Exception e) {
                        // A failed cache update is only logged; the result is still returned
                        log.warn("Failed to update Redis cache", e);
                    }
                }
            }
        }
        return product;
    }

    // Periodically check whether Redis has recovered
    private void scheduleRedisRecoveryCheck() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                redisTemplate.getConnectionFactory().getConnection().ping();
                redisAvailable.set(true);
                log.info("Redis service recovered");
                scheduler.shutdown();
            } catch (Exception e) {
                log.warn("Redis still unavailable");
            }
        }, 30, 30, TimeUnit.SECONDS);
    }
}
```
Pros and cons analysis
Advantages:
- Fully localized processing, no dependency on external services, fast response speed
- Relatively simple to implement without additional infrastructure
- Even if Redis is completely unavailable, the system can still provide basic caching capabilities
Disadvantages:
- Local cache capacity is limited, so large data sets cannot be cached
- With multiple instances deployed, cached data can be inconsistent across nodes
- The local cache is lost when the application restarts
- Increased memory usage may affect other parts of the application
Applicable scenarios
- Read-heavy, write-light scenarios with relaxed data consistency requirements
- Small applications or services with small data volumes
- Core services that require extremely high availability
- Single applications or microservices with a small number of instances
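Stripped of Spring and Caffeine, the lookup order above (Redis, then local cache, then database) reduces to a small pattern. The sketch below uses plain maps and a stubbed Redis client to illustrate it; all names here (`TieredCache`, `FlakyRedisStub`) are illustrative stand-ins, not part of any real API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of the Redis -> local cache -> database read order.
public class TieredCache {
    // Stand-in for a Redis client that can be switched off to simulate an outage
    static class FlakyRedisStub {
        final Map<String, String> data = new ConcurrentHashMap<>();
        final AtomicBoolean up = new AtomicBoolean(true);
        String get(String k) {
            if (!up.get()) throw new IllegalStateException("redis down");
            return data.get(k);
        }
        void set(String k, String v) {
            if (!up.get()) throw new IllegalStateException("redis down");
            data.put(k, v);
        }
    }

    final FlakyRedisStub redis = new FlakyRedisStub();
    final Map<String, String> localCache = new ConcurrentHashMap<>(); // stand-in for Caffeine
    final Map<String, String> db = new ConcurrentHashMap<>();         // stand-in for the database

    String get(String key) {
        // 1) Try Redis; on failure fall through to the local cache
        try {
            String v = redis.get(key);
            if (v != null) return v;
        } catch (Exception e) {
            // Redis unavailable: degrade to the local cache
        }
        // 2) Local cache
        String v = localCache.get(key);
        if (v != null) return v;
        // 3) Database, then backfill the local cache (and Redis, best effort)
        v = db.get(key);
        if (v != null) {
            localCache.put(key, v);
            try { redis.set(key, v); } catch (Exception ignored) { }
        }
        return v;
    }
}
```

The backfill on step 3 is what makes the local cache useful during an outage: any key read once while Redis was still reachable, or loaded from the database afterwards, stays answerable locally.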
Strategy 2: Static default value strategy
Principle
The static default value strategy is the simplest way to downgrade: when the cache is unavailable, the system directly returns predefined default data or static content, avoiding any access to the underlying data source. This strategy suits non-core data display, such as recommendation lists, ad slots, and configuration items.
Implementation method
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class RecommendationService {

    @Autowired
    private RedisTemplate<String, List<ProductRecommendation>> redisTemplate;

    @Autowired
    private RecommendationEngine recommendationEngine;

    // Preloaded static recommendations, initialized at application startup
    private static final List<ProductRecommendation> DEFAULT_RECOMMENDATIONS = new ArrayList<>();

    static {
        // Seed a few popular products as the default recommendations
        DEFAULT_RECOMMENDATIONS.add(new ProductRecommendation("1001", "Hot Product 1", 4.8));
        DEFAULT_RECOMMENDATIONS.add(new ProductRecommendation("1002", "Hot Product 2", 4.7));
        DEFAULT_RECOMMENDATIONS.add(new ProductRecommendation("1003", "Hot Product 3", 4.9));
        // More default recommendations...
    }

    public List<ProductRecommendation> getRecommendationsForUser(String userId) {
        String cacheKey = "recommendations:" + userId;
        try {
            // Try personalized recommendations from Redis
            List<ProductRecommendation> cachedRecommendations =
                    redisTemplate.opsForValue().get(cacheKey);
            if (cachedRecommendations != null) {
                return cachedRecommendations;
            }

            // Cache miss: generate fresh recommendations
            // (engine method name reconstructed from context)
            List<ProductRecommendation> freshRecommendations =
                    recommendationEngine.generateRecommendations(userId);

            if (freshRecommendations != null && !freshRecommendations.isEmpty()) {
                // Cache the result for one hour
                redisTemplate.opsForValue().set(cacheKey, freshRecommendations, 1, TimeUnit.HOURS);
                return freshRecommendations;
            } else {
                // The engine returned nothing: fall back to the defaults
                return DEFAULT_RECOMMENDATIONS;
            }
        } catch (Exception e) {
            // Redis or the recommendation engine failed: return the defaults
            log.error("Failed to get recommendations, using defaults", e);
            return DEFAULT_RECOMMENDATIONS;
        }
    }
}
```
Pros and cons analysis
Advantages
- Extremely simple to implement, with almost no additional development cost
- No access to the data source, reducing system load
- Deterministic response time: cache failures add no latency
- The impact of a cache failure is fully isolated
Disadvantages
- The returned data is static and cannot satisfy personalization needs
- The data is stale and may not match reality
- Not suitable for core business data or transactional flows
Applicable scenarios
- Non-critical business data, such as recommendations, advertising, marketing information
- Scenarios with low requirements for real-time data
- Edge features of the system that do not affect core processes
- Non-personalized display areas in high-traffic systems
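The pattern behind this strategy, try a source and fall back to a canned default on any failure or empty result, can be captured in one small generic helper. `DefaultFallback` and `getOrDefault` are illustrative names, not from any library:

```java
import java.util.function.Supplier;

// Generic "static default" fallback: run a supplier, return a predefined
// default on any failure or null result.
public class DefaultFallback {
    static <T> T getOrDefault(Supplier<T> source, T defaultValue) {
        try {
            T value = source.get();
            return value != null ? value : defaultValue;
        } catch (Exception e) {
            // Cache or engine failure: degrade to the static default
            return defaultValue;
        }
    }
}
```

Centralizing the fallback this way keeps the "degrade to defaults" decision in one place instead of scattering try/catch blocks through every service method.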
Strategy 3: Downgrade switch strategy
Principle
The downgrade switch strategy uses dynamically configurable switches to temporarily turn off specific features or simplify processing flows when the cache fails, reducing the burden on the system. It is usually implemented together with a configuration center, which gives it strong flexibility and controllability.
Implementation method
Downgrade switches can be implemented with a configuration center such as Spring Cloud Config or Apollo:
```java
import java.util.concurrent.TimeUnit;

import com.ctrip.framework.apollo.model.ConfigChangeEvent;
import com.ctrip.framework.apollo.spring.annotation.ApolloConfigChangeListener;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class UserProfileService {

    @Autowired
    private RedisTemplate<String, UserProfile> redisTemplate;

    @Autowired
    private UserRepository userRepository;

    // Downgrade switches (property and method names below are reconstructed/illustrative)
    @Value("${profile.full-mode:true}")
    private boolean fullProfileMode;

    @Value("${profile.use-cache:true}")
    private boolean useCache;

    // Apollo config-center listener: refresh the switches without a restart
    @ApolloConfigChangeListener
    private void onChange(ConfigChangeEvent changeEvent) {
        if (changeEvent.isChanged("profile.full-mode")) {
            fullProfileMode = Boolean.parseBoolean(
                    changeEvent.getChange("profile.full-mode").getNewValue());
        }
        if (changeEvent.isChanged("profile.use-cache")) {
            useCache = Boolean.parseBoolean(
                    changeEvent.getChange("profile.use-cache").getNewValue());
        }
    }

    public UserProfile getUserProfile(String userId) {
        if (!useCache) {
            // Cache downgrade switch is on: go straight to the database
            return getUserProfileFromDb(userId, fullProfileMode);
        }

        // Try the cache first
        try {
            UserProfile profile = redisTemplate.opsForValue().get("user:profile:" + userId);
            if (profile != null) {
                return profile;
            }
        } catch (Exception e) {
            // Log the cache failure and fall through to the database
            log.error("Redis cache failure when getting user profile", e);
            // The automatic downgrade switch can be triggered here
            triggerAutoDegradation("userProfileCache");
        }

        // Cache miss or cache failure: load from the database
        return getUserProfileFromDb(userId, fullProfileMode);
    }

    // Load either the complete or the simplified profile, depending on fullMode
    private UserProfile getUserProfileFromDb(String userId, boolean fullMode) {
        if (fullMode) {
            // Full profile: details, preference settings, etc.
            UserProfile fullProfile = userRepository.findFullProfileById(userId);
            try {
                // Best-effort cache refresh; failures do not affect the main flow
                if (useCache) {
                    redisTemplate.opsForValue()
                            .set("user:profile:" + userId, fullProfile, 30, TimeUnit.MINUTES);
                }
            } catch (Exception e) {
                log.warn("Failed to update user profile cache", e);
            }
            return fullProfile;
        } else {
            // Downgraded mode: basic profile only
            return userRepository.findBasicProfileById(userId);
        }
    }

    // Trigger automatic downgrade
    private void triggerAutoDegradation(String feature) {
        // Implement automatic downgrade logic here, e.g. flip the switch through
        // the configuration center API, or track failures locally and downgrade
        // once a threshold is reached
    }
}
```
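The `triggerAutoDegradation` stub above is left unimplemented. One common way to fill it is to count consecutive cache failures and flip the switch once a threshold is crossed, resetting on success. A minimal thread-safe sketch (class and method names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Threshold-based auto-degradation: after N consecutive failures the cache
// switch is turned off; a success resets the counter and restores the cache.
public class AutoDegrader {
    private final int threshold;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private volatile boolean cacheEnabled = true;

    AutoDegrader(int threshold) { this.threshold = threshold; }

    void recordFailure() {
        if (consecutiveFailures.incrementAndGet() >= threshold) {
            cacheEnabled = false; // downgrade: stop using the cache
        }
    }

    void recordSuccess() {
        consecutiveFailures.set(0);
        cacheEnabled = true;      // recover once the cache answers again
    }

    boolean isCacheEnabled() { return cacheEnabled; }
}
```

Requiring *consecutive* failures (rather than a total count) avoids tripping the switch on isolated, transient errors.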
Pros and cons analysis
Advantages
- High flexibility: different levels of downgrade can be configured for different scenarios
- Dynamically adjustable without restarting the application
- Fine-grained control: only specific features are downgraded
- Combined with a monitoring system, automatic downgrade and recovery are possible
Disadvantages
- Higher implementation complexity; requires configuration-center support
- Feature levels and downgrade plans must be designed in advance
- Testing becomes harder, since every downgrade scenario must be verified
- Managing switch state adds operational overhead
Applicable scenarios
- Large complex systems with clear functional priorities
- Scenarios where traffic fluctuations are large and system behavior needs to be dynamically adjusted
- Systems with mature monitoring that can detect problems promptly
- Systems with high availability requirements that can tolerate downgrading some features
Strategy 4: Circuit breaking and rate limiting strategy
Principle
The circuit breaking and rate limiting strategy monitors Redis responses and automatically trips a circuit breaker when failures are detected, temporarily cutting off access to Redis to avoid an avalanche effect. At the same time, rate limiting controls the volume of requests entering the system, preventing overload while the system is degraded.
Implementation method
Circuit breaking and rate limiting can be implemented with Resilience4j or Sentinel:
```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.ratelimiter.RateLimiter;
import io.github.resilience4j.ratelimiter.RateLimiterConfig;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ProductCatalogService {

    @Autowired
    private RedisTemplate<String, List<Product>> redisTemplate;

    @Autowired
    private ProductCatalogRepository repository;

    // Circuit breaker guarding the Redis calls
    private final CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("redisCatalogCache");

    // Rate limiter protecting the service as a whole
    private final RateLimiter rateLimiter = RateLimiter.of("catalogService",
            RateLimiterConfig.custom()
                    .limitRefreshPeriod(Duration.ofSeconds(1))
                    .limitForPeriod(1000) // 1000 requests per second
                    .timeoutDuration(Duration.ofMillis(25))
                    .build());

    public List<Product> getProductsByCategory(String category, int page, int size) {
        // Apply the rate limit (throws RequestNotPermitted when exceeded)
        RateLimiter.waitForPermission(rateLimiter);

        String cacheKey = "products:category:" + category + ":" + page + ":" + size;

        // Wrap the Redis call with the circuit breaker
        Supplier<List<Product>> redisCall = CircuitBreaker.decorateSupplier(
                circuitBreaker,
                () -> redisTemplate.opsForValue().get(cacheKey)
        );

        try {
            // Try to get the data from Redis
            List<Product> cached = redisCall.get();
            if (cached != null) {
                return cached;
            }
        } catch (Exception e) {
            // The breaker records the failure; only logging is needed here
            log.warn("Failed to get products from cache, fallback to database", e);
        }

        // Breaker open or cache miss: load from the database
        // (repository method name reconstructed from context)
        List<Product> products = repository.findByCategory(category, PageRequest.of(page, size));

        // Refresh the cache only while the breaker is closed
        if (circuitBreaker.getState() == CircuitBreaker.State.CLOSED) {
            try {
                redisTemplate.opsForValue().set(cacheKey, products, 1, TimeUnit.HOURS);
            } catch (Exception e) {
                log.warn("Failed to update product cache", e);
            }
        }
        return products;
    }

    // Expose breaker state through a monitoring endpoint
    public CircuitBreakerStatus getCircuitBreakerStatus() {
        CircuitBreaker.Metrics metrics = circuitBreaker.getMetrics();
        return new CircuitBreakerStatus(
                circuitBreaker.getState().toString(),
                metrics.getFailureRate(),
                metrics.getNumberOfBufferedCalls(),
                metrics.getNumberOfFailedCalls()
        );
    }

    // Value object: circuit breaker status
    @Data
    @AllArgsConstructor
    public static class CircuitBreakerStatus {
        private String state;
        private float failureRate;
        private int totalCalls;
        private int failedCalls;
    }
}
```
Pros and cons analysis
Advantages
- Automatically detects and reacts to Redis failures
- Prevents cascading failures and avoids the avalanche effect
- Self-healing: automatically switches back once Redis recovers
- Rate limiting protects the backend and avoids overload during the downgrade
Disadvantages
- Relatively complex to implement; requires additional circuit-breaker and rate-limiter libraries
- Circuit-breaker parameters are difficult to tune
- May introduce additional latency
- Needs more monitoring and management
Applicable scenarios
- High-concurrency systems that depend heavily on Redis
- Microservice architectures that need to contain failure propagation
- Systems with clear, response-time-sensitive service level agreements (SLAs)
- Systems with good observability into circuit-breaker state
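Under the hood, a circuit breaker like Resilience4j's is a small state machine over CLOSED, OPEN, and HALF_OPEN. The hand-rolled sketch below illustrates those transitions only; it is a teaching aid under simplified assumptions (consecutive-failure counting, a single trial call), not a substitute for the library:

```java
import java.util.function.Supplier;

// Simplified circuit breaker: opens after `failureThreshold` consecutive
// failures, lets one trial call through after `openMillis`, and closes again
// on a successful trial. Illustrative only.
public class SimpleBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    SimpleBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt >= openMillis) {
                state = State.HALF_OPEN; // allow one trial call
            } else {
                return fallback.get();   // short-circuit while open
            }
        }
        try {
            T result = action.get();
            failures = 0;
            state = State.CLOSED;        // success closes the breaker
            return result;
        } catch (Exception e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }

    synchronized State getState() { return state; }
}
```

The key property is the short-circuit: while OPEN, callers go straight to the fallback (the database) without touching Redis at all, which is what stops a slow or dead cache from tying up threads across the system.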
Summary
By implementing Redis cache downgrade strategies sensibly, a system can keep its basic functions running even when the cache layer fails, providing users with continuously available service. This not only improves system reliability but also strongly safeguards business continuity.
This concludes the overview of four strategies for Redis cache downgrade. For more on the topic, please search my previous articles or continue browsing the related articles below. Thank you for your support!