Engineering

Scaling Redis caches for millions of requests

Feb 05, 2026
Sumit Thakur
8 min read

When you embed an image in a GitHub README, it might be viewed millions of times. If we hit the GitHub API for every single view, we'd be rate-limited instantly. Here is how we solved it.

The Caching Strategy

We use a tiered caching strategy with Redis at the edge:

1. L1 cache (memory): extremely hot widgets are held in process memory.
2. L2 cache (Redis): all valid widget data lives here with a TTL of 2 hours.
3. L3 source (GitHub API): we only fetch when L2 misses.
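The tiered lookup can be sketched as follows. This is a minimal illustration, not our production code: plain dicts stand in for process memory and Redis, and `fetch_from_github` is a stub (all names and TTL values other than the 2-hour L2 TTL are hypothetical).

```python
import time

L1_TTL = 60          # hypothetical TTL for the in-process tier
L2_TTL = 2 * 3600    # 2-hour TTL for the Redis tier, as described above

l1_cache = {}  # widget_id -> (data, expires_at); stand-in for process memory
l2_cache = {}  # widget_id -> (data, expires_at); stand-in for Redis

def fetch_from_github(widget_id):
    # Stubbed L3 source; real code would call the GitHub API here.
    return {"widget": widget_id, "stars": 123}

def get_widget(widget_id, now=None):
    now = time.time() if now is None else now
    # L1: extremely hot widgets served straight from memory
    entry = l1_cache.get(widget_id)
    if entry and entry[1] > now:
        return entry[0]
    # L2: Redis tier with a 2-hour TTL
    entry = l2_cache.get(widget_id)
    if entry and entry[1] > now:
        l1_cache[widget_id] = (entry[0], now + L1_TTL)  # promote to L1
        return entry[0]
    # L3: only hit the GitHub API when both caches miss
    data = fetch_from_github(widget_id)
    l2_cache[widget_id] = (data, now + L2_TTL)
    l1_cache[widget_id] = (data, now + L1_TTL)
    return data
```

With a real Redis client (e.g. redis-py) the L2 reads and writes would be `GET`/`SETEX` calls, and Redis would handle expiry itself instead of the `expires_at` checks here.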

Handling Spikes

We implemented stale-while-revalidate: if a cache entry has expired, we serve the stale data to the user immediately and trigger a background fetch to refresh the cache. The user sees no added latency even while an entry is being refreshed; only a cold miss ever blocks on the GitHub API.
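A minimal sketch of that pattern, again with a dict standing in for Redis and a stubbed fetch (all names are hypothetical; production code would also deduplicate concurrent revalidations):

```python
import threading
import time

CACHE_TTL = 2 * 3600
cache = {}  # key -> (data, expires_at); stand-in for Redis

def fetch_fresh(key):
    # Stubbed upstream fetch; real code would call the GitHub API.
    return {"widget": key, "fetched_at": time.time()}

def get_with_swr(key, now=None):
    now = time.time() if now is None else now
    entry = cache.get(key)
    if entry is None:
        # Cold miss: the only path that blocks on the upstream fetch
        data = fetch_fresh(key)
        cache[key] = (data, now + CACHE_TTL)
        return data
    data, expires_at = entry
    if expires_at <= now:
        # Expired: serve the stale copy immediately, refresh in the background
        def revalidate():
            fresh = fetch_fresh(key)
            cache[key] = (fresh, time.time() + CACHE_TTL)
        threading.Thread(target=revalidate, daemon=True).start()
    return data
```

The key design choice is that the expired branch returns before the refresh completes, so the request path never waits on the upstream even when the cache is stale.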


Ready to upgrade your profile?

Join thousands of developers using GitFlex to showcase their work. It's free to get started.