
Designing a URL Shortener System Design Interview: Capacity, Encoding, and Analytics

9 min read · April 25, 2026

The URL shortener is the most-asked warm-up system design question and the easiest to under-deliver on. Here's how to walk the full loop — capacity math, base62 encoding, caching, and analytics — without hand-waving.


The URL shortener question is the "fizzbuzz of system design interviews." It shows up in every rotation at Amazon, Meta, Uber, Airbnb, Stripe, Lyft, and every mid-sized startup mimicking their loop. Most candidates treat it as trivial and promptly fail to show any senior signal. Staff candidates treat it as what it is: a blank check to demonstrate capacity math, ID generation, caching, sharding, and analytics — all in forty-five minutes.

This guide is the version of the URL shortener conversation I wish every candidate walked into. The goal is to hit every scoring area the interviewer is checking without getting stuck on encoding trivia.

Frame the requirements first

Before drawing anything, ask:

  • Read vs. write ratio? Typical answer: 100:1 to 1000:1. Most shortened URLs are written once and read many times.
  • Scale? Target 100M new URLs/day, 10B redirects/day. (Bitly has historically operated at roughly this scale.)
  • Latency? p99 < 100ms for redirect. This is a user-facing GET; slow is unacceptable.
  • Custom aliases? Often yes — /my-campaign. Changes ID generation strategy.
  • Expiry? Often yes — defaults to forever, optional TTL per URL.
  • Analytics? Clicks per URL, country, referrer, time series. Almost always yes.
  • Global or single-region? Usually global for latency; demands regional caches and eventually consistent global ID allocation.

Stating these numbers out loud scores points. "I'm going to design for 100M writes/day and 10B reads/day, with p99 redirect under 100ms globally" is a better opening than "so we need a database and a cache."

Capacity math, out loud

This is where most candidates fumble. Do the math on the whiteboard.

  • Writes: 100M/day = ~1,200 writes/sec average, ~3x peak = 3,600 writes/sec.
  • Reads: 10B/day = ~115,000 reads/sec average, ~3x peak = 350,000 reads/sec.
  • Storage: 100M new URLs/day × 365 × 5 years = ~180B URLs. Each record ~500 bytes (long URL, short code, user ID, timestamps, metadata) → ~90TB. Sharded, with replication, ~300TB total.
  • Cache sizing: 20% of URLs drive 80% of reads (Pareto). Cache the top 1B URLs × 500 bytes = 500GB. Achievable with a Redis cluster.
  • ID space: 180B URLs with base62 encoding needs log62(180e9) ≈ 6.3 → 7 characters. So /abc1234.

Performing this math earns more points than the architecture that follows. It demonstrates you're a senior engineer, not a diagram-drawer.
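
A few lines of Python reproduce every number above; worth rehearsing so the arithmetic comes out fluently at the board. The 3x peak factor, 500-byte record, and 3x replication are the assumptions already stated in the list, not fixed rules:

import math

SECONDS_PER_DAY = 86_400
writes_per_day = 100_000_000
reads_per_day = 10_000_000_000

avg_writes = writes_per_day / SECONDS_PER_DAY       # ~1,160/sec
avg_reads = reads_per_day / SECONDS_PER_DAY         # ~116,000/sec
peak_writes = avg_writes * 3                        # ~3,500/sec
peak_reads = avg_reads * 3                          # ~347,000/sec

total_urls = writes_per_day * 365 * 5               # ~182.5B over five years
raw_tb = total_urls * 500 / 1e12                    # ~91 TB at 500 bytes/record
replicated_tb = raw_tb * 3                          # ~274 TB with 3x replication

code_length = math.ceil(math.log(total_urls, 62))   # 7 base62 characters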

The architecture

[client] --> [CDN/edge cache] --> [load balancer]
                                      |
                    +-----------------+------------------+
                    |                 |                  |
              [write service]   [read service]   [analytics service]
                    |                 |                  |
                    |            [Redis L2]         [Kafka]
                    |                 |                  |
                    +----[sharded DB (Postgres/DynamoDB/Cassandra)]
                              |
                         [followers / replicas]

Pick a store and justify it:

  • Postgres with Citus sharding. Familiar, strong transactional guarantees, good for structured analytics joins. My preferred answer for a fintech-adjacent interview.
  • DynamoDB. Single-digit-ms reads, managed scaling, excellent for key-value access like shortener redirects. My preferred answer in an AWS-heavy interview.
  • Cassandra. Great at write-heavy workloads, eventually consistent. Good fit for analytics events.

Access pattern is 99% key lookup (GET /abc1234), which means a KV store is the best fit. Reach for DynamoDB or Cassandra unless the interviewer specifically wants relational.
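
As one concrete sketch of the KV choice, here is roughly what the core table could look like in DynamoDB via boto3. The table name and attributes are illustrative, not prescriptive:

import boto3

# Hypothetical layout: short_code is the partition key; long_url, user_id,
# created_at, expires_at, and any metadata ride along as plain attributes.
dynamodb = boto3.client("dynamodb")
dynamodb.create_table(
    TableName="short_urls",
    KeySchema=[{"AttributeName": "short_code", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "short_code", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST",
)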

ID generation — the interesting part

You need to turn each new URL into a compact, unique short code. Five common approaches:

  • Hashing. MD5/SHA-1 the long URL, take the first 7 base62 chars. Collision risk rises with scale and requires a collision-handling strategy (probe the next hash, add a salt). Determinism cuts both ways: good if the same URL should always map to the same code, bad if users expect distinct short links for the same destination.
  • Auto-increment + base62 encode. Integer IDs from the DB encoded as base62: 123456789 -> 8m0Kx (with the alphabet used below). Simple, dense, predictable — the last property is a problem (competitors can scrape sequential IDs).
  • Snowflake IDs. Twitter's 64-bit ID with timestamp, machine ID, sequence. Non-guessable-ish, monotonic-ish. Great for distributed systems.
  • Counter service (distributed). A central allocator (Flickr's ticket servers are the classic example) hands out ranges of integer IDs to each worker; workers encode their range as base62 locally. Scales to millions/sec without per-write coordination.
  • Pre-generated pool of short codes. A batch job generates random 7-char codes into an available_codes table; writers claim one at a time. Simplest; avoids collisions entirely; wastes a bit of code space.

In practice the right answer combines approaches: a distributed ID service (Snowflake-style) or a pre-generated pool, encoded as base62 for URL-safety.
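
A minimal Snowflake-style generator, shown here to make the bit layout concrete rather than as production code; the custom epoch and bit widths are arbitrary choices, and the resulting 64-bit integer is what gets base62-encoded in the next section:

import threading
import time

class SnowflakeIds:
    # 41 bits of millisecond timestamp, 10 bits of machine ID, 12 bits of sequence.
    EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch

    def __init__(self, machine_id):
        assert 0 <= machine_id < 1024
        self.machine_id = machine_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF   # 12-bit sequence
                if self.sequence == 0:            # sequence exhausted this millisecond
                    while now <= self.last_ms:    # spin until the clock moves on
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - self.EPOCH_MS) << 22) | (self.machine_id << 12) | self.sequence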

Base62 encoding

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode(n):
    # Convert a non-negative integer ID into a base62 short code.
    if n == 0:
        return ALPHABET[0]
    s = []
    while n > 0:
        s.append(ALPHABET[n % 62])
        n //= 62
    return "".join(reversed(s))

62 chars gives 62^7 ≈ 3.5 trillion possible 7-char codes — comfortable for the 180B-URL target. Note that base62 is case-sensitive, so abc1234 and ABC1234 are different codes; some systems prefer base58, which drops visually ambiguous characters like 0/O and l/1.
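
The inverse is just as short, and a quick round-trip check (using the alphabet defined above) makes a tidy closing beat at the whiteboard:

def decode(s):
    # Convert a base62 short code back into the integer ID.
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n

assert encode(123456789) == "8m0Kx"
assert decode(encode(123456789)) == 123456789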

Custom aliases live in the same table with a custom: true flag and go through a uniqueness check before insert.
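
If the table lives in DynamoDB, the uniqueness check and the insert can be one atomic conditional write instead of a read-then-write race. A sketch, assuming the hypothetical short_urls table from earlier:

from botocore.exceptions import ClientError

def claim_alias(dynamodb, alias, long_url):
    # Atomically insert the custom alias; fails if the code already exists.
    try:
        dynamodb.put_item(
            TableName="short_urls",
            Item={
                "short_code": {"S": alias},
                "long_url": {"S": long_url},
                "custom": {"BOOL": True},
            },
            ConditionExpression="attribute_not_exists(short_code)",
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # alias already taken
        raise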

Reads, caching, and the redirect path

The redirect path is the hot path — 350k/sec at peak. Budget it carefully:

  • CDN / edge cache. If short URLs are mostly HTTP 301 permanent redirects, your CDN can cache them for hours. Cloudflare Workers, AWS CloudFront, Fastly. Dramatically reduces origin load.
  • Caveat: a 301 may be cached by browsers indefinitely, which silently breaks your analytics after the first click. A 302 is not cached by default, so every click comes back to the origin (or the CDN) where it can be counted. Most production shorteners return 302 for exactly this reason.
  • In-memory L1 cache. Each read-service instance keeps an LRU of the hottest N URLs. Caffeine in Java, lru_cache in Python. Cuts Redis load by ~50%.
  • Redis L2 cache. Keyed by short code, a 500GB+ cluster with LFU eviction (Redis's allkeys-lfu policy suits skewed workloads). Hit rate target: >95%.
  • Database fallback. Cache miss falls through to the sharded KV store. DynamoDB GetItem or Postgres PK lookup — single-digit ms.

On cache miss, use request coalescing so only one thread per key fetches from the DB — otherwise a popular URL expiring triggers a stampede. singleflight in Go is the reference primitive.
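
A sketch of the coalescing idea in Python, using one lock per in-flight key; redis_get, redis_set, and db_get stand in for the real clients and are assumptions, not a specific library's API (the in-process L1 layer is omitted for brevity):

import threading

_locks = {}                  # short_code -> Lock, one per in-flight key
_locks_guard = threading.Lock()

def resolve(short_code, redis_get, redis_set, db_get):
    url = redis_get(short_code)
    if url is not None:
        return url
    with _locks_guard:
        lock = _locks.setdefault(short_code, threading.Lock())
    with lock:
        # Re-check the cache: another request may have repopulated it
        # while we were waiting for the lock.
        url = redis_get(short_code)
        if url is None:
            url = db_get(short_code)      # only one thread per key reaches the DB
            redis_set(short_code, url)
    with _locks_guard:
        _locks.pop(short_code, None)
    return url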

The write path is simpler: generate ID, write to DB, asynchronously populate cache (or let first read lazy-load it). Returning 201 immediately keeps write latency low.

Sharding

At 180B URLs you're sharding regardless of backend. Key by short code hash:

  • Consistent hashing with virtual nodes (Cassandra-style); a toy ring is sketched after this list.
  • DynamoDB: use short code as the partition key; it automatically shards.
  • Postgres + Citus: distribute table by hash(short_code).
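
A toy consistent-hash ring with virtual nodes, just to show the mechanics; 100 virtual nodes per shard and MD5 as the ring hash are arbitrary choices for the sketch:

import bisect
import hashlib

class HashRing:
    def __init__(self, shards, vnodes=100):
        # Each physical shard occupies `vnodes` positions on the ring.
        self.ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def shard_for(self, short_code):
        # Walk clockwise to the first virtual node at or after the key's hash.
        idx = bisect.bisect(self.keys, self._hash(short_code)) % len(self.keys)
        return self.ring[idx][1]

ring = HashRing(["shard-1", "shard-2", "shard-3"])
print(ring.shard_for("abc1234"))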

Secondary indexes (by user ID, by creation time) need either Global Secondary Indexes (DynamoDB), a secondary table keyed differently, or a search service (OpenSearch) fed by CDC.

Analytics queries ("top 100 URLs by clicks in the last hour") are a different access pattern from redirects and should be served from a separate store — typically ClickHouse, BigQuery, or Druid fed by Kafka events.

Analytics

Every redirect emits an event:

{"short_code": "abc1234", "ts": "...", "ip": "...",
 "user_agent": "...", "referer": "...", "country": "US"}

Publish to Kafka. Consumers:

  • Real-time counter. Increment clicks:abc1234 in Redis for dashboards that need <1s freshness. Use HLL (HyperLogLog) for unique visitor counts to avoid storing every IP (see the sketch after this list).
  • Time series. Roll up per-minute, per-hour, per-day counts into a time series DB (ClickHouse, Druid).
  • Long-term cold storage. Raw events to S3/Parquet for ad-hoc analysis via Athena/BigQuery.
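
For the real-time counter, redis-py makes both the click count and the HyperLogLog one-liners. A sketch of the consumer's hot loop; event parsing, Kafka consumer wiring, and Redis connection details are omitted:

import redis

r = redis.Redis()

def handle_click(event):
    code = event["short_code"]
    r.incr(f"clicks:{code}")                   # total clicks, exact
    r.pfadd(f"uniques:{code}", event["ip"])    # unique visitors, approximate (HLL)

def unique_visitors(code):
    return r.pfcount(f"uniques:{code}")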

The redirect path should not block on analytics. Emit the event to a local buffer and flush asynchronously; a dropped event is a bug but not worth failing the redirect over.
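
One simple way to keep analytics off the hot path is a bounded in-process queue drained by a background thread; publish_to_kafka is a placeholder for whatever producer client you actually use:

import queue
import threading

events = queue.Queue(maxsize=10_000)   # bounded: drop rather than block the redirect

def emit(event):
    try:
        events.put_nowait(event)
    except queue.Full:
        pass                           # dropped event; record it in a metric

def flusher(publish_to_kafka, batch_size=500):
    while True:
        batch = [events.get()]         # block until at least one event arrives
        while len(batch) < batch_size:
            try:
                batch.append(events.get_nowait())
            except queue.Empty:
                break
        publish_to_kafka(batch)

# print stands in for a real Kafka producer in this sketch
threading.Thread(target=flusher, args=(print,), daemon=True).start()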

GeoIP lookup (MaxMind GeoLite2 or AWS's equivalent) happens in the analytics pipeline, not on the redirect path.

Security, abuse, and rate limiting

Every URL shortener becomes a phishing and malware-distribution tool. Staff answers address this:

  • Safe-browsing check at creation. Google Safe Browsing API or similar. Reject or flag.
  • Rate limit per user/API key. Token bucket, per the rate-limiting playbook. Stops automated abuse (a minimal sketch follows this list).
  • Phishing domain blocklist. Deny *.tk, known-abuse domains, or more broadly any URL matching a curated list.
  • Abuse review. Flag URLs for human review when clicks spike suddenly or abuse reports come in; short-circuit flagged codes to an interstitial warning page (bit.ly-style).
  • CAPTCHA on the create path for anonymous users. Obvious but often forgotten.
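
The token bucket on the create path is only a few lines. A minimal per-user sketch, with state held in process here even though production would keep it in Redis so all API nodes share it:

import time

class TokenBucket:
    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec      # refill rate in tokens/second
        self.burst = burst            # bucket capacity
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# e.g. 10 URL creations per minute for anonymous users, bursts of 5
bucket = TokenBucket(rate_per_sec=10 / 60, burst=5)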

Common candidate mistakes

  • Skipping capacity math. "It'll be fast" is not an answer. Do the arithmetic.
  • Choosing SQL for pure key-value access patterns. Postgres works, but DynamoDB or Cassandra is the better default.
  • Using MD5 naively. Produces 32 hex chars; truncating invites collisions. If you hash, you need a collision strategy.
  • Ignoring the analytics pipeline. An interviewer who asks "we want click counts" and gets a hand-wave loses patience fast.
  • Treating the CDN as magic. If you say "we'll cache at the CDN," explain why 302 vs 301 matters and how analytics survives caching.
  • Missing idempotency on create. If the client retries, you don't want to allocate two codes for the same request. Idempotency key or dedup by (user, long URL) hash.
  • Forgetting custom aliases. "What if the user wants /my-event?" is asked 100% of the time.
  • Overcomplicating ID generation. Snowflake or a pre-generated pool is plenty. Don't invent a blockchain-based ID service.

Advanced follow-ups

  • "How do you handle expiry?" Answer: a TTL column; a background job deletes expired rows (Cassandra has native TTL that does this for free). Redirects check expiry and return 410 Gone for expired codes. Analytics preserve the event history.
  • "How do you deal with a celebrity URL (hot key)?" Answer: edge cache + in-process cache + possibly splitting the counter updates across N counters and summing. The read path is usually served from cache; the write-side hot spot is in analytics counters.
  • "How do you prevent two users from getting the same code?" Answer: unique constraint in the DB or atomic allocation from the pre-generated pool. For Snowflake-style IDs, the uniqueness is structural.
  • "How does this work in multi-region?" Answer: DynamoDB global tables or active-active Cassandra with write-region pinning. ID generation must avoid cross-region coordination — Snowflake-style IDs with per-region machine IDs do this cleanly.
  • "What if the long URL is 10KB?" Answer: validate input size, reject beyond a reasonable cap (2KB is sane — longer URLs are typically bugs or abuse).
  • "How do you A/B test different destination URLs for the same short code?" Answer: a smart-link feature — the code resolves to a rule-based destination via user attributes. Different table, different data model.
  • "How do you handle a sudden 100x traffic spike?" Answer: CDN absorbs the burst; autoscale read services; Redis hits its ceiling and you load-shed at the edge. Preprovision for expected campaigns.
  • "Can you support click-through redirects with interstitials for branded links?" Answer: yes, but now the redirect is a full HTML page render — different caching, different latency budget.

Real-world references

  • Bitly. The classic operator; their engineering blog has written about Snowflake-style IDs and their move to Cassandra.
  • TinyURL, is.gd. Earlier generation; simpler architectures.
  • Twitter's t.co. Interstitial redirector for safety (Safe Browsing) and analytics; essentially a URL shortener with strict abuse checks.
  • YouTube's youtu.be. Short-form redirect service.
  • Discord invite codes. Not a URL shortener per se but the same ID-generation and scaling problem.

The candidates who ace the URL shortener question are the ones who treat it as a canvas rather than a trap. They do the capacity math, name the ID generation strategy, cache at three levels, acknowledge the analytics pipeline as a real system, and plan for abuse before being asked. The question is easy to pass and hard to impress on — the difference is articulation and depth at every layer.

If you can walk a whiteboard through the capacity numbers, the ID generation choice, the redirect hot path, the analytics Kafka pipeline, and the abuse vector, you'll leave the interviewer with nothing left to probe. URL shortener is where system design candidates demonstrate they've thought end-to-end about a production system, not where they demonstrate they know what Redis is.