You know the pattern. Your monitoring dashboard looks healthy: 99% cache hit ratio, database CPU at 15%. Suddenly, a single "hot" key—perhaps the global configuration object or the homepage personalization metadata—expires. In the 200 milliseconds it takes to fetch the data from the database and repopulate Redis, 5,000 concurrent requests miss the cache. They all rush the database simultaneously. The database CPU spikes to 100%, connections time out, and the backend enters a crash loop because the database never recovers enough to serve the first query that would repopulate the cache.

This is the Cache Stampede (or Thundering Herd). This post details the two architectural patterns that solve it: Distributed Locking and Probabilistic Early Expiration (PER).

## The Root Cause: The Latency Gap

The stampede occurs because there is a non-zero time gap ($\Delta$) between detecting a cache miss and writing the new value. If your system throughput is $R$ request...
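To make the first pattern concrete before digging into details: below is a minimal sketch of Distributed Locking in Python with redis-py. The key names, TTLs, and retry loop are illustrative assumptions, not the post's exact implementation; the essential move is the atomic `SET NX EX`, which guarantees only one process rebuilds the expired key while everyone else waits on the cache.

```python
import time
import redis

r = redis.Redis()

LOCK_TTL = 10    # seconds; bounds how long a crashed rebuilder can hold the lock
CACHE_TTL = 300  # seconds; illustrative value

def get_with_lock(key, fetch_from_db):
    """Cache-aside read where only one process may rebuild an expired key."""
    value = r.get(key)
    if value is not None:
        return value

    lock_key = f"lock:{key}"
    # SET NX EX is atomic: exactly one of the concurrent missers wins the lock.
    if r.set(lock_key, "1", nx=True, ex=LOCK_TTL):
        try:
            value = fetch_from_db()           # the single database query
            r.set(key, value, ex=CACHE_TTL)   # repopulate the cache
        finally:
            r.delete(lock_key)
        return value

    # Losers poll the cache briefly instead of hitting the database.
    for _ in range(50):
        time.sleep(0.05)
        value = r.get(key)
        if value is not None:
            return value
    return fetch_from_db()  # last-resort fallback if the lock holder died
```

A production version would also store a random token in the lock and release it via a compare-and-delete script, so a slow worker cannot delete a lock that has already expired and been re-acquired by someone else.

The second pattern, PER, avoids locks entirely: each reader occasionally volunteers to refresh the key *before* it expires, so the expiry moment never hits everyone at once. A common formulation is the XFetch scheme (recompute when $t - \Delta \beta \ln(\mathrm{rand}()) \geq \mathrm{expiry}$). The sketch below uses an in-process dict in place of Redis, and `BETA` and the metadata tuple layout are my assumptions:

```python
import math
import random
import time

BETA = 1.0  # >1 favors earlier recomputation, <1 favors later

def per_read(cache, key, ttl, fetch_from_db):
    """Probabilistic early expiration via the XFetch formula."""
    entry = cache.get(key)  # (value, delta, expiry) or None
    now = time.time()

    if entry is not None:
        value, delta, expiry = entry
        # -ln(u) with u in (0, 1] is an exponential random variable; the
        # closer we are to expiry (and the costlier the recompute, delta),
        # the more likely this particular reader refreshes the value early.
        # Using 1 - random() keeps the argument in (0, 1], avoiding log(0).
        if now - delta * BETA * math.log(1.0 - random.random()) < expiry:
            return value  # still fresh enough; serve the cached value

    start = time.time()
    value = fetch_from_db()
    delta = time.time() - start  # measured recompute cost feeds the formula
    cache[key] = (value, delta, now + ttl)
    return value
```

Because only a random, load-proportional trickle of requests triggers the refresh, the database sees roughly one rebuild per key per TTL window rather than a thundering herd at the expiry instant.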