Caching is a technique used to store copies of data in a high-speed data storage layer to improve retrieval times. While caching can significantly improve application performance, improper implementation can lead to various issues. This document outlines common anti-patterns related to caching and provides optimization strategies.
Java Example:

```java
public class ProductService {
    private final Map<Long, Product> productCache = new HashMap<>();
    private final ProductRepository repository;

    public ProductService(ProductRepository repository) {
        this.repository = repository;
    }

    public Product getProduct(Long id) {
        // Check if product exists in cache
        if (productCache.containsKey(id)) {
            return productCache.get(id);
        }
        // If not in cache, get from database and cache it
        Product product = repository.findById(id);
        if (product != null) {
            productCache.put(id, product);
        }
        return product;
    }

    public void updateProduct(Product product) {
        // Update product in database
        repository.save(product);
        // Cache invalidation is missing here!
        // The cache still contains the old version of the product
    }
}
```
JavaScript Example:
```javascript
class UserService {
  constructor() {
    this.userCache = new Map();
    this.api = new ApiClient();
  }

  async getUser(userId) {
    // Check if user exists in cache
    if (this.userCache.has(userId)) {
      return this.userCache.get(userId);
    }
    // If not in cache, fetch from API and cache it
    const user = await this.api.fetchUser(userId);
    this.userCache.set(userId, user);
    return user;
  }

  async updateUser(userId, userData) {
    // Update user via API
    await this.api.updateUser(userId, userData);
    // Cache invalidation is missing here!
    // The cache still contains the old user data
  }
}

// Usage
const userService = new UserService();

// First call fetches and caches
const user = await userService.getUser('user123');
console.log(user); // {id: 'user123', name: 'John', email: 'john@example.com'}

// Update user
await userService.updateUser('user123', {name: 'John Updated'});

// Second call returns stale data from cache
const updatedUser = await userService.getUser('user123');
console.log(updatedUser); // Still {id: 'user123', name: 'John', email: 'john@example.com'}
```
This anti-pattern occurs when an application fails to properly invalidate or update cached data after the underlying data has changed. This leads to stale data being served to users, potentially causing inconsistencies, incorrect business decisions, or confusing user experiences.
Java Example:

```java
public class ProductService {
    // ConcurrentHashMap so concurrent reads and writes are safe (HashMap is not thread-safe)
    private final Map<Long, Product> productCache = new ConcurrentHashMap<>();
    private final ProductRepository repository;

    public ProductService(ProductRepository repository) {
        this.repository = repository;
    }

    public Product getProduct(Long id) {
        // Check if product exists in cache
        if (productCache.containsKey(id)) {
            return productCache.get(id);
        }
        // If not in cache, get from database and cache it
        Product product = repository.findById(id);
        if (product != null) {
            productCache.put(id, product);
        }
        return product;
    }

    public void updateProduct(Product product) {
        // Update product in database
        repository.save(product);
        // Properly invalidate cache
        productCache.remove(product.getId());
        // Or update the cache with the new version
        // productCache.put(product.getId(), product);
    }

    // Additional method to handle cache invalidation for bulk updates
    public void invalidateCache() {
        productCache.clear();
    }
}
```
JavaScript Example:
```javascript
class UserService {
  constructor() {
    this.userCache = new Map();
    this.api = new ApiClient();
  }

  async getUser(userId) {
    // Check if user exists in cache
    if (this.userCache.has(userId)) {
      return this.userCache.get(userId);
    }
    // If not in cache, fetch from API and cache it
    const user = await this.api.fetchUser(userId);
    this.userCache.set(userId, user);
    return user;
  }

  async updateUser(userId, userData) {
    // Update user via API
    const updatedUser = await this.api.updateUser(userId, userData);
    // Properly update cache with new data
    this.userCache.set(userId, updatedUser);
    return updatedUser;
  }

  // Method to invalidate a specific user in cache
  invalidateUser(userId) {
    this.userCache.delete(userId);
  }

  // Method to invalidate entire cache
  invalidateCache() {
    this.userCache.clear();
  }
}

// Usage
const userService = new UserService();

// First call fetches and caches
const user = await userService.getUser('user123');
console.log(user); // {id: 'user123', name: 'John', email: 'john@example.com'}

// Update user (cache is updated)
await userService.updateUser('user123', {name: 'John Updated'});

// Second call returns updated data from cache
const updatedUser = await userService.getUser('user123');
console.log(updatedUser); // {id: 'user123', name: 'John Updated', email: 'john@example.com'}
```
Implement proper cache invalidation strategies to ensure data consistency. When data is updated, either remove the corresponding cache entry or update it with the new value. For distributed systems, consider using cache invalidation events or time-based expiration policies. Always ensure that all code paths that modify data also handle cache invalidation appropriately.
Java Example:

```java
public class ExpensiveDataService {
    private final Map<String, CachedData> cache = new ConcurrentHashMap<>();
    private final ExpensiveDataRepository repository;

    public ExpensiveDataService(ExpensiveDataRepository repository) {
        this.repository = repository;
    }

    public Data getData(String key) {
        CachedData cachedData = cache.get(key);
        // Check if data is in cache and not expired
        if (cachedData != null && !cachedData.isExpired()) {
            return cachedData.getData();
        }
        // If not in cache or expired, fetch from database
        Data freshData = repository.fetchExpensiveData(key);
        cache.put(key, new CachedData(freshData, System.currentTimeMillis() + 60000)); // 1 minute TTL
        return freshData;
    }

    private static class CachedData {
        private final Data data;
        private final long expiryTime;

        public CachedData(Data data, long expiryTime) {
            this.data = data;
            this.expiryTime = expiryTime;
        }

        public Data getData() {
            return data;
        }

        public boolean isExpired() {
            return System.currentTimeMillis() > expiryTime;
        }
    }
}
```
JavaScript Example:
```javascript
class DatabaseCache {
  constructor(dbClient) {
    this.cache = new Map();
    this.dbClient = dbClient;
    this.cacheTTL = 60000; // 1 minute in milliseconds
  }

  async getRecord(recordId) {
    const cachedRecord = this.cache.get(recordId);
    // Check if record is in cache and not expired
    if (cachedRecord && Date.now() < cachedRecord.expiresAt) {
      return cachedRecord.data;
    }
    // If not in cache or expired, fetch from database
    const data = await this.dbClient.fetchRecord(recordId);
    // Store in cache with expiration time
    this.cache.set(recordId, {
      data,
      expiresAt: Date.now() + this.cacheTTL
    });
    return data;
  }
}

// Usage in a high-traffic web server
// (the cache must be shared across requests, so create it once, not per request)
const dbCache = new DatabaseCache(dbClient);

app.get('/api/products/:id', async (req, res) => {
  const productId = req.params.id;
  try {
    // If many requests come in at once for the same expired product,
    // all will miss the cache and hit the database simultaneously
    const product = await dbCache.getRecord(productId);
    res.json(product);
  } catch (error) {
    res.status(500).send('Error fetching product');
  }
});
```
Cache stampede (also known as thundering herd or cache avalanche) occurs when many concurrent requests attempt to access a cached item that is either expired or missing, causing all of them to simultaneously try to fetch the data from the underlying data source. This can overwhelm the database or service, leading to increased latency, timeouts, or even system failures.
Request Coalescing: When multiple requests for the same resource arrive simultaneously, only one request should fetch from the underlying data source while others wait for that result.
Staggered Expiration Times: Add random jitter to cache expiration times to prevent many items from expiring simultaneously.
Background Refresh: Proactively refresh cache items before they expire, ideally during low-traffic periods.
Soft Expiration: Continue serving stale data while asynchronously refreshing the cache.
Cache Lock: Use a distributed lock to ensure only one process can refresh a particular cache entry at a time.
Fallback to Stale Data: If fetching fresh data fails, temporarily continue serving stale data rather than failing completely.
Circuit Breaker Pattern: Implement circuit breakers to prevent overwhelming the backend system during outages.
Tiered Caching: Implement multiple layers of caching with different expiration policies.
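The first two strategies, request coalescing and staggered expiration, can be sketched together in a small wrapper. `CoalescingCache` and its `fetchFn` loader parameter are illustrative names, not a library API:

```javascript
class CoalescingCache {
  constructor(fetchFn, ttlMs = 60000) {
    this.cache = new Map();      // key -> { data, expiresAt }
    this.inFlight = new Map();   // key -> Promise for a fetch already underway
    this.fetchFn = fetchFn;      // async loader supplied by the caller (assumed)
    this.ttlMs = ttlMs;
  }

  async get(key) {
    const entry = this.cache.get(key);
    if (entry && Date.now() < entry.expiresAt) {
      return entry.data;
    }
    // Coalesce: if a fetch for this key is already running, share its result
    if (this.inFlight.has(key)) {
      return this.inFlight.get(key);
    }
    const promise = this.fetchFn(key)
      .then((data) => {
        // Jitter: spread expirations over +/-10% of the TTL so that entries
        // written at the same time do not all expire at the same time
        const jitter = this.ttlMs * 0.1 * (Math.random() * 2 - 1);
        this.cache.set(key, { data, expiresAt: Date.now() + this.ttlMs + jitter });
        return data;
      })
      .finally(() => this.inFlight.delete(key));
    this.inFlight.set(key, promise);
    return promise;
  }
}
```

Because the pending Promise is shared, a burst of concurrent `get('a')` calls on a cold or expired key results in exactly one call to the loader; the remaining callers simply await the same Promise.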
By implementing these strategies, you can prevent cache stampedes and ensure your application remains responsive even under high load or when cached data expires.
Java Example:

```java
public class SimpleCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final int maxSize;

    public SimpleCache(int maxSize) {
        this.maxSize = maxSize;
    }

    public V get(K key) {
        return cache.get(key);
    }

    public void put(K key, V value) {
        // Check if cache is full
        if (cache.size() >= maxSize) {
            // Inefficient: evict an arbitrary entry, ignoring how recently
            // or how frequently entries are actually used
            K arbitraryKey = cache.keySet().iterator().next();
            cache.remove(arbitraryKey);
        }
        cache.put(key, value);
    }
}
```
JavaScript Example:
```javascript
class SimpleCache {
  constructor(maxSize) {
    this.cache = new Map();
    this.maxSize = maxSize;
  }

  get(key) {
    return this.cache.get(key);
  }

  set(key, value) {
    // Check if cache is full
    if (this.cache.size >= this.maxSize) {
      // Inefficient: clear the entire cache when it's full
      this.cache.clear();
    }
    this.cache.set(key, value);
  }
}

// Usage
const cache = new SimpleCache(100);
cache.set('user:1', { name: 'John', age: 30 });
// ... more items added
// When cache reaches 100 items, the next item will cause the entire cache to be cleared
```
This anti-pattern occurs when an application uses inefficient or inappropriate cache eviction policies. Common issues include clearing the entire cache when it’s full, randomly removing entries without considering their usage patterns, or using a one-size-fits-all approach for all types of data. This leads to poor cache hit ratios, unnecessary recomputation of values, and degraded application performance.
LRU (Least Recently Used): Best for data with temporal locality.
LFU (Least Frequently Used): Best for data with frequency-based access patterns.
FIFO (First In First Out): Simple but effective for data with uniform access patterns.
Time-Based Expiration: Good for data that becomes stale after a certain period.
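As a concrete example, an LRU cache can be built on a JavaScript Map, which iterates keys in insertion order; this is a common idiom, sketched here without TTLs or size-aware eviction:

```javascript
class LruCache {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.cache = new Map(); // Map preserves insertion order; oldest entry comes first
  }

  get(key) {
    if (!this.cache.has(key)) return undefined;
    // Re-insert to mark the entry as most recently used
    const value = this.cache.get(key);
    this.cache.delete(key);
    this.cache.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.cache.has(key)) {
      this.cache.delete(key);
    } else if (this.cache.size >= this.maxSize) {
      // Evict the least recently used entry (first key in iteration order)
      this.cache.delete(this.cache.keys().next().value);
    }
    this.cache.set(key, value);
  }
}

// Usage
const cache = new LruCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // touch 'a', so 'b' becomes the least recently used entry
cache.set('c', 3); // evicts 'b'
console.log(cache.get('b')); // undefined
console.log(cache.get('a')); // 1
```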
Use Different Policies for Different Data Types:
Configuration data might benefit from time-based expiration.
User session data might benefit from LRU.
Popular content might benefit from LFU.
Implement Size-Aware Caching:
Consider both the number of items and their size.
Evict larger items first when memory pressure is high.
Monitor Cache Effectiveness:
Track hit/miss ratios to evaluate policy effectiveness.
Adjust cache sizes and policies based on metrics.
Consider Hybrid Approaches:
Combine multiple policies (e.g., W-TinyLFU, which pairs a small LRU admission window with a TinyLFU frequency filter).
Use adaptive policies that change based on workload.
Implement Segmented Caching:
Divide the cache into segments with different policies.
Allocate more space to segments with higher hit rates.
Pre-emptive Eviction:
Proactively evict items that are likely to become stale.
Use predictive models to anticipate which items will be needed.
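As a minimal illustration of the monitoring point above, a cache can track its own hit ratio; the `InstrumentedCache` name and the FIFO eviction used for brevity are placeholders for whatever policy is under evaluation:

```javascript
class InstrumentedCache {
  constructor(maxSize) {
    this.cache = new Map();
    this.maxSize = maxSize;
    this.hits = 0;
    this.misses = 0;
  }

  get(key) {
    if (this.cache.has(key)) {
      this.hits++;
      return this.cache.get(key);
    }
    this.misses++;
    return undefined;
  }

  set(key, value) {
    if (this.cache.size >= this.maxSize && !this.cache.has(key)) {
      this.cache.delete(this.cache.keys().next().value); // FIFO eviction for brevity
    }
    this.cache.set(key, value);
  }

  // The hit ratio feeds sizing and policy decisions; export it to your metrics system
  hitRatio() {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

// Usage
const cache = new InstrumentedCache(10);
cache.set('a', 1);
cache.get('a'); // hit
cache.get('b'); // miss
console.log(cache.hitRatio()); // 0.5
```

A persistently low hit ratio is the signal to revisit the cache's size, TTLs, or eviction policy.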
By implementing appropriate cache eviction policies, you can significantly improve cache hit ratios, reduce resource consumption, and enhance application performance.