SDI: 3 ways to expire cached items

server-push update ^ TTL ^ conditional-GET # write-through is not cache expiration

Few Online articles list these solutions explicitly. Some of these are simple concepts but fundamental to DB tuning and app tuning. compares write-through ^ write-behind ^ refresh-ahead. I think refresh-ahead is similar to TTL.

B) cache-invalidation — some “events” would trigger an invalidation. Without invalidation, a cache item would live forever with a infinity TTL, like the list of China provinces.

After cache proxies get the invalidation message in a small payload (bandwidth-friendly), the proxies discard the outdated item, and can decide when to request an update. The request may be skipped completely if the item is no longer needed.

B2) cache-update by server push — IFF bandwidth is available, server can send not only a tiny invalidation message, but also the new cache content.

IFF combined with TTL, or with reliability added, then multicast can be used to deliver cache updates, as explained in my other blogposts.

T) TTL — more common. Each “cache item” embeds a time-to-live data field a.k.a expiry timestamp. Http cookie is the prime example.

In Coherence, it’s possible for the cache proxy to pre-emptively request an update on an expired item. This would reduce latency but requires a multi-threaded cache proxy.

G) conditional-GET in HTTP is a proven industrial strength solution described in my 2005 book [[computer networking]]. The cache proxy always sends a GET to the database but with a If-modified-since header. This reduces unnecessary database load and network load.

W) write-behind (asynchronous) or write-through — in some contexts, the cache proxy is not only handling Reads but also Writes. So the Read requests will read or add to cache, and Write requests will update both cache proxy and the master data store. Drawback — In distributed topology, updates from other sources are not visible to “me” the cache proxy, so I still rely one of the other 3 means.

TTL eager server-push conditional-GET
if frequent query, in-frequent updates efficient efficient frequent but tiny requests between DB and cache proxy
if latency important OK lowest latency slower lazy fetch, though efficient
if in-frequent query good waste DB/proxy/NW resources as “push” is unnecessary efficient on DB/proxy/NW
if frequent update unsuitable high load on DB/proxy/NW efficient conflation
if frequent update+query unsuitable can be wasteful perhaps most efficient


reliable multicast for replicated-cache update is a 1999 research paper. I hope by now multicast has grown more mature more proven. Not sure where this is used, perhaps within certain network boundaries such as a private network of data servers.

This paper examines reliable multicast for invalidation and delivery of popular, frequently updated objects to web cache proxies.

static horizontal sharding: drawbacks #Rahul

Eg of static horizontal sharding — by-symbol. NYSE, OPRA…

Rahul told me that outside finance sector, many companies (esp. west coast) are cautious about static sharding. Over time, One shard can become extremely hot while other shards’ resources stay underutilized.

Rahul said it’s a hassle to change static sharding config. Logistically challenging. Requires phased restarts.

As a result, many tech companies use static horizontal sharding very cautiously, only with valid reasons.

Dynamic sharding is a new concept to me. I think it’s like … based on load level, we could shift one “topic” between shards.

too many DB-writes: sharding insufficient #Indeed

Context: each company receives many many reviews. In a similar scenario, we can say that too many user comments flood in during the soccer world cup.

Aha — the updates don’t need to show up on browser in strict order

Aha — commenting user only need to see her own comment on top, and need not see other users’ latest comments. The comments below her own could be browser-cached content. Ajax…

Interviewer: OK you said horizontal sharding by company id can address highly concurrent data store updates. Now what if one company, say, Amazon, by itself gets 20% of the updates so sharding can’t help this one company.

me: I guess the update requests would block and possibly time out. We can use a task queue.

This is similar to WordPress import/export requests.

  • Each task takes a few seconds, so we don’t want user to wait.
  • If high server load, the wait could be longer.
  • More important — this task need not be immediate. It can be scheduled.
  • We should probably optimize for server efficiency rather than latency or throughput

So similarly, all the review submissions on Amazon can be queued and scheduled.

In FIX protocol, an order sender can receive 150=A meaning PendingNew. I call it “queued for exchange”.  The OMS Server sends the acknowledgement in two steps.

  1. Optional. Execution Report message on order placement in OMS Server’s Order Book. ExecType (150) = A (Pending New)
  1. Execution Report message on receiving acknowledgement from exchange. ExecType (150) = 0 (New). I guess this indicates placement in exchange order book

Note this optimization is completely different from HFT latency engineering.