Introduction
Redis plays a crucial role as the core cache component in high-concurrency system architectures: it significantly improves response times and effectively reduces database load.
However, when the Redis service fails, degrades, or times out, a system without an appropriate degradation mechanism may collapse, leaving services globally unavailable.
Cache downgrade is therefore a key part of high-availability system design: it defines alternative behavior for when the cache layer fails, so that core business processes can keep running.
What is cache downgrade?
Cache downgrade refers to the alternative processing the system adopts, actively or passively, when the cache service is unavailable or responding extremely slowly, in order to preserve business continuity and system stability.
Compared with countermeasures for problems such as cache penetration, cache breakdown, and cache avalanche, cache downgrade focuses on graceful degradation: making deliberate compromises in performance and functionality while keeping the system's core functions available.
Strategy 1: Local cache fallback strategy
Principle
The local cache fallback strategy adds a local cache layer inside the application (e.g. Caffeine or Guava Cache) alongside the Redis cache layer. When Redis is unavailable, the system automatically switches to the local cache. Data consistency and freshness may suffer, but basic caching capability is preserved.
Implementation method
Here is an example of local cache fallback implemented with Spring Boot and Caffeine:
```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ProductService {

    @Autowired
    private RedisTemplate<String, Product> redisTemplate;

    // Configure the local cache (Caffeine)
    private final Cache<String, Product> localCache = Caffeine.newBuilder()
            .expireAfterWrite(5, TimeUnit.MINUTES)
            .maximumSize(1000)
            .build();

    @Autowired
    private ProductRepository productRepository;

    private final AtomicBoolean redisAvailable = new AtomicBoolean(true);

    public Product getProductById(String productId) {
        Product product = null;

        // Try Redis first
        if (redisAvailable.get()) {
            try {
                product = redisTemplate.opsForValue().get("product:" + productId);
            } catch (Exception e) {
                // Redis failed: mark it unavailable and log
                redisAvailable.set(false);
                log.error("Redis unavailable, switching to local cache", e);
                // Start a background task that detects Redis recovery
                scheduleRedisRecoveryCheck();
            }
        }

        // Redis unavailable or missed: try the local cache
        if (product == null) {
            product = localCache.getIfPresent(productId);
        }

        // Local cache also missed: load from the database
        if (product == null) {
            product = productRepository.findById(productId).orElse(null);
            if (product != null) {
                localCache.put(productId, product);
                // If Redis is available, also refresh the Redis cache
                if (redisAvailable.get()) {
                    try {
                        redisTemplate.opsForValue()
                                .set("product:" + productId, product, 30, TimeUnit.MINUTES);
                    } catch (Exception e) {
                        // A failed cache update is only logged; the result is still returned
                        log.warn("Failed to update Redis cache", e);
                    }
                }
            }
        }
        return product;
    }

    // Periodically check whether Redis has recovered
    private void scheduleRedisRecoveryCheck() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                redisTemplate.getConnectionFactory().getConnection().ping();
                redisAvailable.set(true);
                log.info("Redis service recovered");
                scheduler.shutdown();
            } catch (Exception e) {
                log.warn("Redis still unavailable");
            }
        }, 30, 30, TimeUnit.SECONDS);
    }
}
```
Pros and cons analysis
Advantages:
- Fully localized processing, no dependency on external services, fast response speed
- Relatively simple to implement without additional infrastructure
- Even if Redis is completely unavailable, the system can still provide basic caching capabilities
Disadvantages:
- Local cache capacity is limited, so large data sets cannot be cached
- With multiple instances deployed, cached data can be inconsistent across nodes
- The local cache is lost when the application restarts
- Increased memory usage may affect other parts of the application
Applicable scenarios
- Read-heavy, write-light scenarios with relaxed data consistency requirements
- Small applications or services with small data volumes
- Core services that require extremely high availability
- Single applications or microservices with a small number of instances
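Stripped of Spring and Caffeine, the lookup order above (Redis, then local cache, then database) reduces to a small pattern. The sketch below uses plain maps and a stubbed Redis client to illustrate it; all names here (`TieredCache`, `FlakyRedisStub`) are illustrative stand-ins, not part of any real API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of the Redis -> local cache -> database read order.
public class TieredCache {
    // Stand-in for a Redis client that can be switched off to simulate an outage
    static class FlakyRedisStub {
        final Map<String, String> data = new ConcurrentHashMap<>();
        final AtomicBoolean up = new AtomicBoolean(true);
        String get(String k) {
            if (!up.get()) throw new IllegalStateException("redis down");
            return data.get(k);
        }
        void set(String k, String v) {
            if (!up.get()) throw new IllegalStateException("redis down");
            data.put(k, v);
        }
    }

    final FlakyRedisStub redis = new FlakyRedisStub();
    final Map<String, String> localCache = new ConcurrentHashMap<>(); // stand-in for Caffeine
    final Map<String, String> db = new ConcurrentHashMap<>();         // stand-in for the database

    String get(String key) {
        // 1) Try Redis; on failure fall through to the local cache
        try {
            String v = redis.get(key);
            if (v != null) return v;
        } catch (Exception e) {
            // Redis unavailable: degrade to the local cache
        }
        // 2) Local cache
        String v = localCache.get(key);
        if (v != null) return v;
        // 3) Database, then backfill the local cache (and Redis, best effort)
        v = db.get(key);
        if (v != null) {
            localCache.put(key, v);
            try { redis.set(key, v); } catch (Exception ignored) { }
        }
        return v;
    }
}
```

The backfill on step 3 is what makes the local cache useful during an outage: any key read once while Redis was still reachable, or loaded from the database afterwards, stays answerable locally.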
Strategy 2: Static default value strategy
Principle
The static default value strategy is the simplest way to downgrade: when the cache is unavailable, the system directly returns predefined default data or static content, avoiding any access to the underlying data source. This strategy suits non-core data display, such as recommendation lists, ad slots, and configuration items.
Implementation method
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class RecommendationService {

    @Autowired
    private RedisTemplate<String, List<ProductRecommendation>> redisTemplate;

    @Autowired
    private RecommendationEngine recommendationEngine;

    // Preloaded static recommendations, initialized at application startup
    private static final List<ProductRecommendation> DEFAULT_RECOMMENDATIONS = new ArrayList<>();

    static {
        // Seed a few popular products as the default recommendations
        DEFAULT_RECOMMENDATIONS.add(new ProductRecommendation("1001", "Hot Product 1", 4.8));
        DEFAULT_RECOMMENDATIONS.add(new ProductRecommendation("1002", "Hot Product 2", 4.7));
        DEFAULT_RECOMMENDATIONS.add(new ProductRecommendation("1003", "Hot Product 3", 4.9));
        // More default recommendations...
    }

    public List<ProductRecommendation> getRecommendationsForUser(String userId) {
        String cacheKey = "recommendations:" + userId;
        try {
            // Try personalized recommendations from Redis
            List<ProductRecommendation> cachedRecommendations =
                    redisTemplate.opsForValue().get(cacheKey);
            if (cachedRecommendations != null) {
                return cachedRecommendations;
            }

            // Cache miss: generate fresh recommendations
            // (engine method name reconstructed from context)
            List<ProductRecommendation> freshRecommendations =
                    recommendationEngine.generateRecommendations(userId);

            if (freshRecommendations != null && !freshRecommendations.isEmpty()) {
                // Cache the result for one hour
                redisTemplate.opsForValue().set(cacheKey, freshRecommendations, 1, TimeUnit.HOURS);
                return freshRecommendations;
            } else {
                // The engine returned nothing: fall back to the defaults
                return DEFAULT_RECOMMENDATIONS;
            }
        } catch (Exception e) {
            // Redis or the recommendation engine failed: return the defaults
            log.error("Failed to get recommendations, using defaults", e);
            return DEFAULT_RECOMMENDATIONS;
        }
    }
}
```
Pros and cons analysis
Advantages
- Extremely simple to implement, with almost no additional development cost
- No access to the data source, reducing system load
- Deterministic response time: cache failures add no latency
- The impact of a cache failure is fully isolated
Disadvantages
- The returned data is static and cannot satisfy personalization needs
- The data is stale and may not match reality
- Not suitable for core business data or transactional flows
Applicable scenarios
- Non-critical business data, such as recommendations, advertising, marketing information
- Scenarios with low requirements for real-time data
- Edge features of the system that do not affect core processes
- Non-personalized display areas in high-traffic systems
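The pattern behind this strategy, try a source and fall back to a canned default on any failure or empty result, can be captured in one small generic helper. `DefaultFallback` and `getOrDefault` are illustrative names, not from any library:

```java
import java.util.function.Supplier;

// Generic "static default" fallback: run a supplier, return a predefined
// default on any failure or null result.
public class DefaultFallback {
    static <T> T getOrDefault(Supplier<T> source, T defaultValue) {
        try {
            T value = source.get();
            return value != null ? value : defaultValue;
        } catch (Exception e) {
            // Cache or engine failure: degrade to the static default
            return defaultValue;
        }
    }
}
```

Centralizing the fallback this way keeps the "degrade to defaults" decision in one place instead of scattering try/catch blocks through every service method.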
Strategy 3: Downgrade switch strategy
Principle
The downgrade switch strategy uses dynamically configurable switches to temporarily turn off specific features or simplify processing flows when the cache fails, reducing the burden on the system. It is usually implemented together with a configuration center, which gives it strong flexibility and controllability.
Implementation method
Downgrade switches can be implemented with a configuration center such as Spring Cloud Config or Apollo:
```java
import java.util.concurrent.TimeUnit;

import com.ctrip.framework.apollo.model.ConfigChangeEvent;
import com.ctrip.framework.apollo.spring.annotation.ApolloConfigChangeListener;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class UserProfileService {

    @Autowired
    private RedisTemplate<String, UserProfile> redisTemplate;

    @Autowired
    private UserRepository userRepository;

    // Downgrade switches (property and method names below are reconstructed/illustrative)
    @Value("${profile.full-mode:true}")
    private boolean fullProfileMode;

    @Value("${profile.use-cache:true}")
    private boolean useCache;

    // Apollo config-center listener: refresh the switches without a restart
    @ApolloConfigChangeListener
    private void onChange(ConfigChangeEvent changeEvent) {
        if (changeEvent.isChanged("profile.full-mode")) {
            fullProfileMode = Boolean.parseBoolean(
                    changeEvent.getChange("profile.full-mode").getNewValue());
        }
        if (changeEvent.isChanged("profile.use-cache")) {
            useCache = Boolean.parseBoolean(
                    changeEvent.getChange("profile.use-cache").getNewValue());
        }
    }

    public UserProfile getUserProfile(String userId) {
        if (!useCache) {
            // Cache downgrade switch is on: go straight to the database
            return getUserProfileFromDb(userId, fullProfileMode);
        }

        // Try the cache first
        try {
            UserProfile profile = redisTemplate.opsForValue().get("user:profile:" + userId);
            if (profile != null) {
                return profile;
            }
        } catch (Exception e) {
            // Log the cache failure and fall through to the database
            log.error("Redis cache failure when getting user profile", e);
            // The automatic downgrade switch can be triggered here
            triggerAutoDegradation("userProfileCache");
        }

        // Cache miss or cache failure: load from the database
        return getUserProfileFromDb(userId, fullProfileMode);
    }

    // Load either the complete or the simplified profile, depending on fullMode
    private UserProfile getUserProfileFromDb(String userId, boolean fullMode) {
        if (fullMode) {
            // Full profile: details, preference settings, etc.
            UserProfile fullProfile = userRepository.findFullProfileById(userId);
            try {
                // Best-effort cache refresh; failures do not affect the main flow
                if (useCache) {
                    redisTemplate.opsForValue()
                            .set("user:profile:" + userId, fullProfile, 30, TimeUnit.MINUTES);
                }
            } catch (Exception e) {
                log.warn("Failed to update user profile cache", e);
            }
            return fullProfile;
        } else {
            // Downgraded mode: basic profile only
            return userRepository.findBasicProfileById(userId);
        }
    }

    // Trigger automatic downgrade
    private void triggerAutoDegradation(String feature) {
        // Implement automatic downgrade logic here, e.g. flip the switch through
        // the configuration center API, or track failures locally and downgrade
        // once a threshold is reached
    }
}
```
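The `triggerAutoDegradation` stub above is left unimplemented. One common way to fill it is to count consecutive cache failures and flip the switch once a threshold is crossed, resetting on success. A minimal thread-safe sketch (class and method names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Threshold-based auto-degradation: after N consecutive failures the cache
// switch is turned off; a success resets the counter and restores the cache.
public class AutoDegrader {
    private final int threshold;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private volatile boolean cacheEnabled = true;

    AutoDegrader(int threshold) { this.threshold = threshold; }

    void recordFailure() {
        if (consecutiveFailures.incrementAndGet() >= threshold) {
            cacheEnabled = false; // downgrade: stop using the cache
        }
    }

    void recordSuccess() {
        consecutiveFailures.set(0);
        cacheEnabled = true;      // recover once the cache answers again
    }

    boolean isCacheEnabled() { return cacheEnabled; }
}
```

Requiring *consecutive* failures (rather than a total count) avoids tripping the switch on isolated, transient errors.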
Pros and cons analysis
Advantages
- High flexibility: different levels of downgrade can be configured for different scenarios
- Dynamically adjustable without restarting the application
- Fine-grained control: only specific features are downgraded
- Combined with a monitoring system, automatic downgrade and recovery are possible
Disadvantages
- Higher implementation complexity; requires configuration-center support
- Feature levels and downgrade plans must be designed in advance
- Testing becomes harder, since every downgrade scenario must be verified
- Managing switch state adds operational overhead
Applicable scenarios
- Large complex systems with clear functional priorities
- Scenarios where traffic fluctuations are large and system behavior needs to be dynamically adjusted
- Systems with mature monitoring that can detect problems promptly
- Systems with high availability requirements that can tolerate downgrading some features
Strategy 4: Circuit breaking and rate limiting strategy
Principle
The circuit breaking and rate limiting strategy monitors Redis responses and automatically trips a circuit breaker when failures are detected, temporarily cutting off access to Redis to avoid an avalanche effect. At the same time, rate limiting controls the volume of requests entering the system, preventing overload while the system is degraded.
Implementation method
Circuit breaking and rate limiting can be implemented with Resilience4j or Sentinel:
```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.ratelimiter.RateLimiter;
import io.github.resilience4j.ratelimiter.RateLimiterConfig;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ProductCatalogService {

    @Autowired
    private RedisTemplate<String, List<Product>> redisTemplate;

    @Autowired
    private ProductCatalogRepository repository;

    // Circuit breaker guarding the Redis calls
    private final CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("redisCatalogCache");

    // Rate limiter protecting the service as a whole
    private final RateLimiter rateLimiter = RateLimiter.of("catalogService",
            RateLimiterConfig.custom()
                    .limitRefreshPeriod(Duration.ofSeconds(1))
                    .limitForPeriod(1000) // 1000 requests per second
                    .timeoutDuration(Duration.ofMillis(25))
                    .build());

    public List<Product> getProductsByCategory(String category, int page, int size) {
        // Apply the rate limit (throws RequestNotPermitted when exceeded)
        RateLimiter.waitForPermission(rateLimiter);

        String cacheKey = "products:category:" + category + ":" + page + ":" + size;

        // Wrap the Redis call with the circuit breaker
        Supplier<List<Product>> redisCall = CircuitBreaker.decorateSupplier(
                circuitBreaker,
                () -> redisTemplate.opsForValue().get(cacheKey)
        );

        try {
            // Try to get the data from Redis
            List<Product> cached = redisCall.get();
            if (cached != null) {
                return cached;
            }
        } catch (Exception e) {
            // The breaker records the failure; only logging is needed here
            log.warn("Failed to get products from cache, fallback to database", e);
        }

        // Breaker open or cache miss: load from the database
        // (repository method name reconstructed from context)
        List<Product> products = repository.findByCategory(category, PageRequest.of(page, size));

        // Refresh the cache only while the breaker is closed
        if (circuitBreaker.getState() == CircuitBreaker.State.CLOSED) {
            try {
                redisTemplate.opsForValue().set(cacheKey, products, 1, TimeUnit.HOURS);
            } catch (Exception e) {
                log.warn("Failed to update product cache", e);
            }
        }
        return products;
    }

    // Expose breaker state through a monitoring endpoint
    public CircuitBreakerStatus getCircuitBreakerStatus() {
        CircuitBreaker.Metrics metrics = circuitBreaker.getMetrics();
        return new CircuitBreakerStatus(
                circuitBreaker.getState().toString(),
                metrics.getFailureRate(),
                metrics.getNumberOfBufferedCalls(),
                metrics.getNumberOfFailedCalls()
        );
    }

    // Value object: circuit breaker status
    @Data
    @AllArgsConstructor
    public static class CircuitBreakerStatus {
        private String state;
        private float failureRate;
        private int totalCalls;
        private int failedCalls;
    }
}
```
Pros and cons analysis
Advantages
- Automatically detects and reacts to Redis failures
- Prevents cascading failures and avoids the avalanche effect
- Self-healing: automatically switches back once Redis recovers
- Rate limiting protects the backend and avoids overload during the downgrade
Disadvantages
- Relatively complex to implement; requires additional circuit-breaker and rate-limiter libraries
- Circuit-breaker parameters are difficult to tune
- May introduce additional latency
- Needs more monitoring and management
Applicable scenarios
- High-concurrency systems that depend heavily on Redis
- Microservice architectures that need to contain failure propagation
- Systems with clear, response-time-sensitive service level agreements (SLAs)
- Systems with good observability into circuit-breaker state
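Under the hood, a circuit breaker like Resilience4j's is a small state machine over CLOSED, OPEN, and HALF_OPEN. The hand-rolled sketch below illustrates those transitions only; it is a teaching aid under simplified assumptions (consecutive-failure counting, a single trial call), not a substitute for the library:

```java
import java.util.function.Supplier;

// Simplified circuit breaker: opens after `failureThreshold` consecutive
// failures, lets one trial call through after `openMillis`, and closes again
// on a successful trial. Illustrative only.
public class SimpleBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    SimpleBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt >= openMillis) {
                state = State.HALF_OPEN; // allow one trial call
            } else {
                return fallback.get();   // short-circuit while open
            }
        }
        try {
            T result = action.get();
            failures = 0;
            state = State.CLOSED;        // success closes the breaker
            return result;
        } catch (Exception e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }

    synchronized State getState() { return state; }
}
```

The key property is the short-circuit: while OPEN, callers go straight to the fallback (the database) without touching Redis at all, which is what stops a slow or dead cache from tying up threads across the system.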
Summary
By implementing Redis cache downgrade strategies sensibly, a system can keep its basic functions running even when the cache layer fails, providing users with continuously available service. This not only improves system reliability but also strongly safeguards business continuity.
This concludes the overview of four strategies for Redis cache downgrade. For more on the topic, please search my previous articles or continue browsing the related articles below. Thank you for your support!