Why Our Fastest Cache Was Not Redis: Designing a Two-Tier Cache for Geospatial APIs
Redis is a great cache. It is also not free.
That sounds obvious, but many systems treat Redis like the default answer to every caching question. In latency-sensitive APIs, especially gateways, the network hop to Redis can still be expensive compared with memory you already own inside the process.
That is why I like the design in src/services/cache.js: it uses an in-process lru-cache as the first read path and optional Redis as the shared second tier.
What this cache is optimizing for
The gateway has two kinds of cacheable work:
- region lookups, which are relatively small and highly reusable
- route responses, which are more expensive to compute and more valuable to reuse across instances
The service handles both with one pattern:
- check local LRU first
- if absent, check Redis
- if Redis hits, backfill local memory
- on write, update both tiers when Redis is available
That is a very practical layout for a stateless Node service. It makes hot-path reads extremely fast while still giving you a path to cross-instance reuse when the same requests land on different replicas.
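The read path above can be sketched in a few lines. This is an illustrative reconstruction, not the actual src/services/cache.js: `localCache` is a plain Map standing in for lru-cache, and `redis` is a tiny stub with the async get/set shape of a real client such as ioredis.

```javascript
// Stand-in for an lru-cache instance (no eviction or TTL in this sketch).
const localCache = new Map();

// Hypothetical Redis stub so the sketch is runnable; a real client exposes
// async get/set with the same shape. Set this to null to model "no Redis".
const redis = {
  store: new Map(),
  async get(key) { return this.store.get(key) ?? null; },
  async set(key, value) { this.store.set(key, value); },
};

async function cacheGet(key) {
  // 1. Local LRU first: no network hop for hot keys.
  if (localCache.has(key)) return localCache.get(key);

  // 2. Fall through to the shared tier, if configured.
  if (redis) {
    const hit = await redis.get(key);
    if (hit !== null) {
      // 3. Backfill local memory so the next read is in-process.
      localCache.set(key, hit);
      return hit;
    }
  }
  return null; // miss in both tiers
}

async function cacheSet(key, value) {
  // 4. Write-through: update both tiers when Redis is available.
  localCache.set(key, value);
  if (redis) await redis.set(key, value);
}
```

The backfill step is what makes the second request for a hot key free of network calls, even on a replica that has never computed the value itself.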
Why this is better than “just Redis”
A shared cache is useful for consistency across instances, but a local cache is useful for speed. If a request is truly hot, the best answer is often to avoid every network call you can, including the call to your cache.
This matters even more in a gateway because the request may already involve multiple external dependencies: Photon for region inference, Valhalla for routing, and observability sinks for metrics and traces. Saving one more round trip on a common path has real value.
The code also handles degraded mode sensibly. If Redis is unavailable, the service logs a warning and falls back to in-memory caching only. That is exactly the kind of failure behavior I want in front-door services. Partial optimization should degrade to a smaller optimization, not to an outage.
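A minimal sketch of that fallback behavior, assuming the real service wraps its Redis calls in something similar; `redisGet` and the logger shape here are hypothetical, not taken from the source.

```javascript
// Wrap a shared-tier read so Redis trouble behaves like a cache miss
// instead of an error. `redisGet` is any async function that reads Redis.
async function sharedGet(redisGet, key, log = console) {
  try {
    return await redisGet(key);
  } catch (err) {
    // Degrade, don't fail: warn and let the caller fall back to local memory.
    log.warn(`redis unavailable, serving from local cache only: ${err.message}`);
    return null;
  }
}
```

The important property is that the caller cannot tell a Redis outage apart from an ordinary miss, so the rest of the read path needs no special casing.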
The subtle cache problem hiding here
The route cache key in src/services/gatewayService.js hashes the output of JSON.stringify(body). That is simple and fast, but the key is only stable if equivalent requests serialize identically.
Two clients can send the same semantic request with fields in a different key order and miss each other's cache entries. That is not a correctness bug, but it is an efficiency leak.
This is a great example of the tradeoffs behind “simple” production code. The current approach is easy to reason about and cheap to compute. A canonical JSON serializer or normalized request model would improve hit rates, but it would also add code and complexity. Early on, the simpler choice is often correct. Later, when traffic patterns justify it, the optimization becomes worth revisiting.
TTL design matters too
The code uses different TTL behavior for route responses and region lookups. That is good. Not every cacheable thing should age the same way.
Region lookups change slowly and can tolerate longer TTLs. Route responses may depend on data freshness expectations, traffic modeling choices, or operational appetite for stale results. Even without dynamic invalidation, separating those lifetimes shows good architectural taste.
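A sketch of what separate lifetimes might look like. The actual TTL values are not shown in the source, so these numbers are purely illustrative; the comment assumes lru-cache's `ttl` option, which is expressed in milliseconds.

```javascript
// Illustrative per-type TTLs (milliseconds, matching lru-cache's `ttl` option).
const TTL_MS = {
  region: 24 * 60 * 60 * 1000, // region lookups change slowly: cache for a day
  route: 5 * 60 * 1000,        // route responses go stale faster: five minutes
};

function ttlFor(kind) {
  // Unknown kinds default to the shorter lifetime: staleness is the risk.
  return TTL_MS[kind] ?? TTL_MS.route;
}
```

Defaulting unknown entry types to the shorter TTL is a deliberate bias: a needless recompute is cheaper than serving a stale route.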
What I would add next
If I were evolving this cache, I would prioritize three additions:
- canonical request hashing for better route cache reuse
- explicit cache metrics: hit rate, miss rate, Redis fallback frequency, and serialization failures
- protections against cache stampedes for very hot keys
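Stampede protection often starts with in-flight request coalescing: concurrent misses on the same key await one shared computation instead of each recomputing it. A sketch, with illustrative names:

```javascript
// Map of key -> in-flight promise. Concurrent callers for the same key
// share one computation; the entry is cleared once it settles.
const inFlight = new Map();

async function getOrCompute(key, compute) {
  if (inFlight.has(key)) return inFlight.get(key);

  const promise = (async () => {
    try {
      return await compute();
    } finally {
      inFlight.delete(key); // allow later refreshes once this one settles
    }
  })();

  inFlight.set(key, promise);
  return promise;
}
```

In a single Node process this is enough to collapse a burst of identical misses into one upstream call; coordinating across replicas would additionally need a shared lock or a short "computing" marker in Redis.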
I would also think carefully about whether every endpoint deserves the same caching strategy. Some payloads are naturally more reusable than others, and a gateway eventually benefits from endpoint-aware cache policy instead of one blanket TTL.
The lesson
Good caching is not about picking one technology. It is about shaping the read path around the actual cost structure of your system.
In this gateway, the smartest cache is not Redis alone and not local memory alone. It is the combination: fast enough for hot traffic, shared enough for a multi-instance deployment, and graceful enough to survive dependency trouble. That is exactly the kind of design that looks boring in code review and valuable in production.