Engineering

Scaling Redis caches for millions of requests

Feb 05, 2026
Sumit Thakur
8 min read

When you embed an image in a GitHub README, it might be viewed millions of times. If we hit the GitHub API for every single view, we'd be rate-limited instantly. Here is how we solved it.

The Caching Strategy

We use a tiered caching strategy with Redis at the edge:

1. L1 cache (memory): extremely hot widgets are held in process memory.
2. L2 cache (Redis): all valid widget data lives here with a TTL of 2 hours.
3. L3 source (GitHub API): we only fetch when L2 misses.
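The tiered lookup can be sketched as follows. This is a minimal illustration, not our production code: plain dicts stand in for process memory and Redis, and `fetch_from_github` is a stub (all names and TTL values other than the 2-hour L2 TTL are hypothetical).

```python
import time

L1_TTL = 60          # hypothetical TTL for the in-process tier
L2_TTL = 2 * 3600    # 2-hour TTL for the Redis tier, as described above

l1_cache = {}  # widget_id -> (data, expires_at); stand-in for process memory
l2_cache = {}  # widget_id -> (data, expires_at); stand-in for Redis

def fetch_from_github(widget_id):
    # Stubbed L3 source; real code would call the GitHub API here.
    return {"widget": widget_id, "stars": 123}

def get_widget(widget_id, now=None):
    now = time.time() if now is None else now
    # L1: extremely hot widgets served straight from memory
    entry = l1_cache.get(widget_id)
    if entry and entry[1] > now:
        return entry[0]
    # L2: Redis tier with a 2-hour TTL
    entry = l2_cache.get(widget_id)
    if entry and entry[1] > now:
        l1_cache[widget_id] = (entry[0], now + L1_TTL)  # promote to L1
        return entry[0]
    # L3: only hit the GitHub API when both caches miss
    data = fetch_from_github(widget_id)
    l2_cache[widget_id] = (data, now + L2_TTL)
    l1_cache[widget_id] = (data, now + L1_TTL)
    return data
```

With a real Redis client (e.g. redis-py) the L2 reads and writes would be `GET`/`SETEX` calls, and Redis would handle expiry itself instead of the `expires_at` checks here.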

Handling Spikes

We implemented stale-while-revalidate: if a cache entry has expired, we serve the stale data to the user immediately and trigger a background fetch to refresh the cache. The user sees no added latency even while an entry is being refreshed; only a cold miss ever blocks on the GitHub API.
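A minimal sketch of that pattern, again with a dict standing in for Redis and a stubbed fetch (all names are hypothetical; production code would also deduplicate concurrent revalidations):

```python
import threading
import time

CACHE_TTL = 2 * 3600
cache = {}  # key -> (data, expires_at); stand-in for Redis

def fetch_fresh(key):
    # Stubbed upstream fetch; real code would call the GitHub API.
    return {"widget": key, "fetched_at": time.time()}

def get_with_swr(key, now=None):
    now = time.time() if now is None else now
    entry = cache.get(key)
    if entry is None:
        # Cold miss: the only path that blocks on the upstream fetch
        data = fetch_fresh(key)
        cache[key] = (data, now + CACHE_TTL)
        return data
    data, expires_at = entry
    if expires_at <= now:
        # Expired: serve the stale copy immediately, refresh in the background
        def revalidate():
            fresh = fetch_fresh(key)
            cache[key] = (fresh, time.time() + CACHE_TTL)
        threading.Thread(target=revalidate, daemon=True).start()
    return data
```

The key design choice is that the expired branch returns before the refresh completes, so the request path never waits on the upstream even when the cache is stale.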


Ready to upgrade your profile?

Join thousands of developers using GitFlex to showcase their work. It's free to get started.