ETag and conditional requests for systems analysts

Why ETag shows up in SA interviews

You are forty minutes into a systems analyst loop at Stripe or DoorDash. The interviewer sketches a mobile app that polls a profile endpoint every thirty seconds and asks how you would cut backend load by ninety percent without breaking freshness guarantees. Or they ask how two operators editing the same merchant record from different tabs should avoid clobbering each other. Both questions have the same load-bearing answer: ETag plus conditional requests, defined in RFC 7232 and since consolidated into RFC 9110.

The header itself is three lines of spec. The implications — cache invalidation, lost updates, weak versus strong validation, CDN behavior, retry semantics — fill the rest of the hour. Strong candidates name the header in the first thirty seconds and unfold it into two distinct use cases. Weak candidates conflate caching and concurrency, tangle the If-Match versus If-None-Match direction, and lose the thread.

This guide walks the topic the way a senior SA at Notion or Linear would: definition, two use cases, validator semantics, the traps that fail real candidates, and a whiteboard script. Everything below assumes HTTP/1.1; HTTP/2 and HTTP/3 inherit the same semantics with binary framing.

What an ETag actually is

An ETag is an opaque string the server attaches to a response. It identifies the current version of the resource at that URL. The client should treat it as a black box — never parse it, never compare it for ordering, only check it for equality.

GET /merchants/42 HTTP/1.1
Host: api.example.com

HTTP/1.1 200 OK
ETag: "9b2c8d-v3"
Content-Type: application/json

{"id": 42, "name": "Blue Bottle Coffee", "version": 3}

The server generates the value any way it wants — a hash of the response body, a database row version, a monotonic counter, a UUID assigned at write time. The wire format is a quoted ASCII string, optionally prefixed with W/ to mark it as weak. The opacity is the point: it lets the server change strategies without breaking clients.
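Two of the generation strategies above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed scheme: the function names, the "id-vN" layout, and the 16-character hash truncation are all assumptions made for the example.

```python
import hashlib


def etag_from_row_version(resource_id: int, version: int) -> str:
    """Build a strong ETag from a database row version (hypothetical layout)."""
    return f'"{resource_id}-v{version}"'


def etag_from_body(body: bytes) -> str:
    """Build a strong ETag from a hash of the exact response bytes."""
    digest = hashlib.sha256(body).hexdigest()[:16]  # truncation is an arbitrary choice
    return f'"{digest}"'


def mark_weak(etag: str) -> str:
    """Prefix a quoted ETag with W/ to declare it weak (a server-side decision)."""
    return etag if etag.startswith("W/") else f"W/{etag}"
```

Either function can be swapped for the other without clients noticing, which is the practical payoff of keeping the value opaque.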

Two HTTP headers consume this value on subsequent requests. If-None-Match is used for read-side caching — "give me a fresh copy only if the version changed." If-Match is used for write-side concurrency — "apply this update only if the version is still the one I saw." Mixing them up is the single most common interview mistake on this topic.

Caching with If-None-Match

The client stores the ETag alongside the cached body. On the next request to the same URL, it sends the value back as a precondition.

GET /merchants/42 HTTP/1.1
If-None-Match: "9b2c8d-v3"

If the resource has not changed, the server short-circuits with a body-less 304 Not Modified. The client reuses its cached copy. If the resource has changed, the server returns the full 200 OK with a new ETag and the new body.

HTTP/1.1 304 Not Modified
ETag: "9b2c8d-v3"
Cache-Control: max-age=60
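On the server side, the 304 decision might look like the sketch below. It is framework-free and hypothetical: handle_get and its return shape are invented for illustration, and the comma split is a simplification rather than a full header parser.

```python
from typing import Dict, Optional, Tuple


def _opaque(tag: str) -> str:
    """Drop a leading W/ so weak and strong forms compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def handle_get(
    current_etag: str, body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a conditional GET."""
    headers = {"ETag": current_etag}
    if if_none_match is not None:
        # Naive split: fine for a sketch, but a comma is legal inside an ETag.
        candidates = [t.strip() for t in if_none_match.split(",")]
        matched = "*" in candidates or any(
            _opaque(t) == _opaque(current_etag) for t in candidates
        )
        if matched:
            return 304, headers, b""  # validator still current: no body
    return 200, headers, body
```

If-None-Match is evaluated with weak comparison, which is why the W/ prefix is ignored when matching here.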

The savings compound. A merchant profile payload at Uber or Airbnb can be 8-40 KB of JSON. A 304 response is roughly 150 bytes of headers. On a mobile network with a poll every thirty seconds, each unchanged poll ships roughly 50 to 270 times fewer bytes, which is up to two orders of magnitude. The server saves work too when the ETag comes from a row version: the database read becomes a single primary-key lookup of the version column instead of a full row fetch and serialization pass.

CDNs and reverse proxies layer onto this naturally. Fastly, Cloudflare, and CloudFront all respect If-None-Match on the origin side, which means a CDN edge can revalidate against the origin with a tiny conditional request instead of pulling the full object. The browser, the CDN, and any intermediate proxy all participate in the same protocol.

Load-bearing trick: If-None-Match for reads, If-Match for writes. If you forget which is which, remember that browsers send If-None-Match on GET — the same direction as caching.

Optimistic concurrency with If-Match

The write-side use case is the one that trips most candidates. Picture two support agents editing the same merchant record from different tabs. Without concurrency control, the second save silently overwrites the first. This is the lost-update problem, and ETag plus If-Match is the textbook fix.

PUT /merchants/42 HTTP/1.1
If-Match: "9b2c8d-v3"
Content-Type: application/json

{"name": "Blue Bottle Coffee Roasters"}

If the version on the server still matches "9b2c8d-v3", the update applies and the response carries a new ETag. If another client has updated the row in the meantime, the server returns 412 Precondition Failed and the body is rejected.

HTTP/1.1 412 Precondition Failed
ETag: "9b2c8d-v4"
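The server-side check behind that 412 could be sketched as follows. The Merchant class, the "m42-vN" ETag layout, and handle_put are hypothetical names for illustration. Treating a missing If-Match as an unconditional write is also an assumption, since some APIs reject such requests instead.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Merchant:
    name: str
    version: int

    @property
    def etag(self) -> str:
        # Hypothetical layout: a strong validator derived from the row version.
        return f'"m42-v{self.version}"'


def handle_put(
    row: Merchant, new_name: str, if_match: Optional[str]
) -> Tuple[int, Dict[str, str]]:
    """Apply the update only if the caller's validator is still current."""
    # If-Match uses strong comparison, so a W/-prefixed value never matches here.
    if if_match is not None and if_match != "*" and if_match != row.etag:
        return 412, {"ETag": row.etag}
    # In a real service, the check and the write must be one atomic step,
    # for example UPDATE ... WHERE version = :expected.
    row.name = new_name
    row.version += 1
    return 200, {"ETag": row.etag}
```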

The client now has a choice: refetch, show the conflict to the user, or auto-merge if the fields are non-overlapping. This is optimistic in the literal sense — the client assumes it can write, and only the server arbitrates. It is faster than pessimistic locking because no row lock is held across the round-trip, and it scales cleanly because there is nothing to release on client crash.

| Concern | Pessimistic lock | Optimistic ETag |
|---|---|---|
| Latency under low contention | Higher (lock acquire) | Lower (one round-trip) |
| Behavior on client crash | Lock leak risk | No state held |
| Scaling to many readers | Lock manager bottleneck | Stateless, CDN-friendly |
| Cost of conflict | Block + wait | Retry the chain |
| Where it ships | Bank transfers, inventory | REST APIs, document edits |

The implicit contract: the client must repeat the read-modify-write loop on conflict. GET the resource, take the new ETag, reapply the user's intent, PUT with the new If-Match. This is one reason APIs that expose ETag often accept partial updates via PATCH as well: resending the full representation with PUT on every retry gets expensive when only a field or two changed.
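The read-modify-write loop can be sketched against an in-memory stand-in. FakeServer, save_with_retry, and the three-attempt cap are all assumptions for the example. A real client would issue HTTP requests and would usually surface the conflict to the user instead of retrying blindly.

```python
class FakeServer:
    """In-memory stand-in for the API; not a real HTTP client."""

    def __init__(self):
        self.doc = {"name": "Blue Bottle Coffee"}
        self.version = 3

    def get(self):
        """Return a copy of the document and its current ETag."""
        return dict(self.doc), f'"v{self.version}"'

    def put(self, doc, if_match):
        """Accept the write only if if_match equals the current ETag."""
        if if_match != f'"v{self.version}"':
            return 412
        self.doc, self.version = dict(doc), self.version + 1
        return 200


def save_with_retry(server, apply_intent, max_attempts=3):
    """GET, reapply the user's change, PUT with If-Match; repeat on 412."""
    for attempt in range(1, max_attempts + 1):
        doc, etag = server.get()
        status = server.put(apply_intent(doc), if_match=etag)
        if status != 412:
            return status, attempt
    return 412, max_attempts
```

The user's intent is passed as a function so it can be reapplied to whatever the latest version turns out to be, instead of replaying a stale full document.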

Strong vs weak validators

The spec defines two flavors of ETag. The distinction sounds pedantic until you hit a gzip-on-the-edge bug at 2 a.m.

Strong ETags guarantee that two responses with the same ETag carry byte-for-byte identical representation data: same body bytes, same content encoding. Unrelated headers such as Date can still differ. You can use strong ETags for range requests: If-Range: "abc123" plus Range: bytes=1000-1999 is only safe with strong validators because the byte offsets must line up.

ETag: "9b2c8d-v3"

Weak ETags guarantee semantic equivalence only. The bodies represent the same logical resource but may differ in whitespace, key ordering, gzip framing, or other trivia. The wire syntax is the W/ prefix.

ETag: W/"9b2c8d-v3"

Compression at the CDN edge is the usual reason to mark a validator weak. Two identical JSON bodies compressed by different gzip implementations produce different bytes, so a strong ETag would force unnecessary cache misses. A weak ETag papers over the difference and keeps the 304 rate high. Range requests, however, must reject weak validators — the partial byte range would not match the regenerated full body.
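The two comparison rules are small enough to write out. The sketch below follows the strong and weak comparison functions as the HTTP spec describes them; the Python function names are my own.

```python
def is_weak(etag: str) -> bool:
    """True if the validator carries the W/ prefix."""
    return etag.startswith("W/")


def opaque_tag(etag: str) -> str:
    """Return the quoted part of the validator without any W/ prefix."""
    return etag[2:] if is_weak(etag) else etag


def strong_match(a: str, b: str) -> bool:
    """Both validators must be strong and character-for-character identical."""
    return not is_weak(a) and not is_weak(b) and a == b


def weak_match(a: str, b: str) -> bool:
    """Opaque tags must be identical; the W/ flag is ignored."""
    return opaque_tag(a) == opaque_tag(b)
```

If-None-Match uses the weak rule, while If-Match and If-Range use the strong one, so W/"1" against "1" yields a 304 on a GET but can never satisfy a range or write precondition.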

Gotcha: Never invent your own W/ prefix on the client. It is a server-side declaration. The client just echoes whatever string came back.

Common pitfalls

When candidates first reach for ETag on a whiteboard, the most common slip is using If-Match for caching or If-None-Match for writes. The two headers feel symmetric, but the semantics are inverted: reads want to be skipped when the version is unchanged, writes want to be skipped when the version has changed. A clean fix is to memorize the direction once — "none match means nothing new for me, send 304" — and stop deriving it on the fly. Drawing the four-cell matrix of method × header on the whiteboard before answering buys you ten seconds and prevents the wrong-direction trap.

A second trap is generating ETags non-deterministically. If your service computes the validator as a hash of the serialized JSON, and your JSON serializer does not guarantee key ordering, two functionally identical writes can produce different ETags. The cache hit rate collapses and clients get spurious 200s. The fix is either to use a database row version as the ETag source — monotonic, deterministic, free — or to canonicalize the response body before hashing. Most teams pick the row version because it costs zero extra CPU.
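If you do hash the body, canonicalizing first might look like this. It is a sketch using Python's standard json module; canonical_etag and the 16-character truncation are assumptions, and sorted keys with fixed separators is only one possible canonical form.

```python
import hashlib
import json


def canonical_etag(payload: dict) -> str:
    """Hash a key-sorted, whitespace-free serialization of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f'"{digest}"'


a = {"id": 42, "name": "Blue Bottle Coffee"}
b = {"name": "Blue Bottle Coffee", "id": 42}  # same data, different key order

# The default serializations differ, but the canonical ETags agree.
print(json.dumps(a) == json.dumps(b))
print(canonical_etag(a) == canonical_etag(b))
```

The row-version approach avoids this work entirely, which is the cost argument made above.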

A third trap is forgetting that ETags are scoped to a single URL. The same logical entity exposed at /v1/users/42 and /v2/users/42 will have unrelated ETag values. Clients that store ETags keyed by entity ID instead of URL will get false positives and false negatives. The contract is one URL, one validator, no cross-URL inference. This is why CDN keys also include the full path, not the entity ID.
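A client-side store that respects the one-URL rule can be as simple as a dictionary keyed by the full URL. ValidatorCache and its method names below are hypothetical, and the sketch ignores eviction and Vary handling.

```python
from typing import Dict, Optional, Tuple


class ValidatorCache:
    """Stores (etag, body) per full URL, never per entity ID."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, bytes]] = {}

    def store(self, url: str, etag: str, body: bytes) -> None:
        """Remember the validator and body returned for this exact URL."""
        self._entries[url] = (etag, body)

    def conditional_headers(self, url: str) -> Dict[str, str]:
        """Headers to attach to the next GET of this exact URL."""
        entry = self._entries.get(url)
        return {"If-None-Match": entry[0]} if entry else {}

    def body_for(self, url: str) -> Optional[bytes]:
        """Cached body to reuse when the server answers 304."""
        entry = self._entries.get(url)
        return entry[1] if entry else None
```

Because the key is the URL, a validator learned from /v1/users/42 is never sent to /v2/users/42.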

A fourth trap is treating 412 Precondition Failed as a permanent error. Clients see the 4xx prefix and surface a hard failure to the user. In reality, 412 is a soft, retry-on-refresh condition — the right UX is "your version is stale, here is the new one, do you want to reapply your changes?" Eng teams that get this wrong see support tickets for "the save button doesn't work" that are actually two operators editing the same row.

A fifth trap is forgetting that a CDN may rewrite the ETag header. Cloudflare, for example, replaces strong ETags with weak ones when it transforms the body (minification, brotli on the fly, image resizing). If your client rejects weak validators, or compares the rewritten W/-prefixed value as a raw string against a previously stored strong value, the match fails every time and you get a cache stampede on every deploy.

How to answer this on a whiteboard

The interviewer wants to see structure. Walk the topic in four beats. First, define the header in one sentence and give the wire format. Second, name the two use cases — caching with If-None-Match, concurrency with If-Match — and draw the request-response pair for each. Third, distinguish strong from weak and give one concrete example (gzip at the edge). Fourth, mention at least one pitfall, ideally the lost-update retry loop, to show you have shipped this before.

If the loop is for a senior systems analyst role at a fintech or marketplace — Stripe, DoorDash, Airbnb, Uber — expect a follow-up on idempotency keys. The two patterns are siblings, not duplicates. Idempotency keys ensure that retrying the same request does not double-charge; ETags ensure that two different requests cannot silently overwrite each other. A senior candidate names both and draws the orthogonal axis. Conflating them is the most common reason a strong "principal SA" rubric drops to "senior" on the debrief.

| Beat | What to say | Time budget |
|---|---|---|
| Definition | "Opaque server-issued version string, per URL." | 30 seconds |
| Caching | Draw GET + If-None-Match + 304 cycle. | 90 seconds |
| Concurrency | Draw PUT + If-Match + 412 cycle. | 90 seconds |
| Strong vs weak | Gzip-at-edge example, range request rule. | 60 seconds |
| Pitfall | Lost-update retry chain or non-deterministic hash. | 60 seconds |

If you want to drill exactly this pattern — header semantics, conditional requests, when 304 versus 412 ships — NAILDD is launching with hundreds of systems analyst problems built around the RFC 7232 family of questions.

FAQ

Is ETag mandatory in REST APIs?

No, the spec does not require it. Many public APIs, GitHub's REST API among them, expose it on read-heavy resources because the bandwidth savings are large and the implementation cost is small. Internal microservices often skip it for high-write, low-read tables where the cache hit rate would be negligible. The decision is a cost-benefit one: a modest amount of engineering for a measurable drop in egress and database load.

Can I use Last-Modified instead?

Yes, Last-Modified and If-Modified-Since are the older sibling protocol and still work. The downside is one-second resolution — two writes inside the same second collapse to the same validator and you get a stale 304. Most modern services pick ETag for new endpoints and keep Last-Modified only on legacy paths.
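The one-second collision is easy to reproduce with the standard library. In this sketch the two timestamps are made up; format_datetime from email.utils produces the HTTP date format, which has no sub-second field.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Two hypothetical writes 800 ms apart, inside the same wall-clock second.
first_write = datetime(2024, 1, 1, 12, 0, 0, 100_000, tzinfo=timezone.utc)
second_write = datetime(2024, 1, 1, 12, 0, 0, 900_000, tzinfo=timezone.utc)


def last_modified(ts: datetime) -> str:
    """Format a UTC timestamp the way a Last-Modified header carries it."""
    return format_datetime(ts, usegmt=True)


print(last_modified(first_write))   # Mon, 01 Jan 2024 12:00:00 GMT
print(last_modified(second_write))  # Mon, 01 Jan 2024 12:00:00 GMT
```

A client that cached the first write would get a 304 after the second one, because both carry the same header value.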

What status code should I return on a write conflict?

412 Precondition Failed is canonical when If-Match fails. Some APIs return 409 Conflict, which is acceptable for general resource conflicts but loses the precondition-specific signal. Keeping 412 and 409 distinct lets clients pick the right retry strategy without inspecting the body.

How does ETag interact with PATCH?

PATCH is the natural partner. You GET to read the validator, build a JSON Patch or merge-patch document, then PATCH with If-Match: "...". The server applies the patch atomically against the validated version. This is cheaper than a full PUT because you ship only the changed fields, and it keeps the same lost-update protection.
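A merge-patch variant of the flow might look like the sketch below. merge_patch follows the JSON Merge Patch rules (null deletes a key, objects merge recursively, anything else replaces). handle_patch, its tuple return, and the "vN" ETag layout are hypothetical.

```python
def merge_patch(target, patch):
    """Apply a JSON merge patch and return the result without mutating target."""
    if not isinstance(patch, dict):
        return patch  # non-object patches replace the target wholesale
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


def handle_patch(doc, version, patch, if_match):
    """Return (status, doc, version); patch only if the validator matches."""
    if if_match != f'"v{version}"':
        return 412, doc, version
    return 200, merge_patch(doc, patch), version + 1
```

As with PUT, a real server has to make the version check and the write a single atomic operation.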

What about ETag in GraphQL or gRPC?

GraphQL queries traditionally bypass HTTP caching because every operation hits the same /graphql POST endpoint. Persisted queries plus a hash-based URL can restore ETag semantics for reads. gRPC has no native equivalent — concurrency uses a version field in the message body, like expected_version: 3, the same idea at the application layer. The pattern, not the header, is what matters.

Should I expose the ETag value in the response body too?

It is a common compromise. The header is for HTTP machinery — caches, proxies, conditional requests — but client state managers like React Query or SWR often want the value in the body for optimistic UI. Returning it in both places costs nothing if you keep them in sync: same string, generated once, written to both.
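Keeping the two copies in sync is mostly a matter of generating the string once. In this sketch, build_response, the "vN" layout, and the "etag" body field name are all assumptions for illustration.

```python
import json
from typing import Dict, Tuple


def build_response(doc: dict, version: int) -> Tuple[Dict[str, str], str]:
    """Return (headers, body) that carry the same validator string."""
    etag = f'"v{version}"'  # generated once, written to both places
    body = json.dumps({**doc, "etag": etag})
    return {"ETag": etag}, body
```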