The Netflix System Design Interview: Streaming Scale, CDN, and Microservices
Netflix's system design loop is the FAANG loop tuned for streaming video, chaos engineering, and a microservices stack older than most of the candidates interviewing. Here's how they actually grade it.
Netflix is a peculiar interview. The company doesn't do leveling-by-ladder the way Google or Amazon do — there's effectively one senior IC level and a staff level, and the bar for both is unusually high. The company runs at a scale that is genuinely unique (15%+ of worldwide downstream internet traffic at peak, on a good night), and the architecture they've written publicly about is both sophisticated and, in places, hilariously idiosyncratic.
This guide is written for candidates targeting a senior or staff SWE role at Netflix in 2026, primarily on the streaming platform side (playback, control plane, personalization, encoding) though the rubric is similar for the studio-technology and games orgs. Sources are Blind, Netflix's own tech blog (surprisingly candid), public talks at QCon and re:Invent, and conversations with people who've done the loop recently.
The loop structure
Netflix's software engineering loop is famously efficient: a few screening calls, then four onsite rounds for senior candidates and five for staff. They move fast once you're in the pipe.
- Recruiter call. 30 minutes. They screen hard for 'senior' — Netflix's senior is other companies' staff, and they tell you this during the call.
- Hiring manager screen. 45-60 minutes. Half behavioral, half a scoped technical discussion. The HM is often the person the role reports to, and they have real weight in the final decision.
- Technical phone screen. 60 minutes. Coding in a shared editor. Medium-hard leetcode, typically one question with follow-ups. Netflix cares less about puzzle tricks than about clean code and reasoning.
- Onsite: coding round. 60 minutes. One hard problem, often API-design-flavored. Think 'design a rate limiter library with these constraints' rather than 'reverse this linked list.'
- Onsite: system design round. 60 minutes. Product-shaped question close to something Netflix actually builds. Streaming playback, watch-history service, profile switching, A/B testing platform.
- Onsite: behavioral round. 45-60 minutes. This is where the 'keeper test' culture gets probed. Covered lightly below; the sibling hiring-bar guide goes deep.
- Onsite: deep-dive on your past work. 45-60 minutes. Pick a recent project and defend every architectural choice. Netflix takes this seriously; it's often the most technically penetrating round.
- Bar raiser / cross-team round (staff-only). One more round with a senior IC from an adjacent team.
The whole loop typically wraps in 2-3 weeks from screen to offer. Netflix does not bulk-interview and they do not ghost — you get a clear yes or no.
What Netflix actually grades on
Netflix's public culture memo tells you half of what they care about. The system design rubric, as observable from debrief patterns:
- Operating at their scale without theatrics. You should name millions of QPS, terabytes of state, and a planetary footprint without flinching. If you over-dramatize the scale ("we'd need ten thousand machines!") you signal you haven't worked at it.
- Microservices fluency. Netflix famously pioneered the microservices stack. They expect you to think in services, not in monoliths. You should talk about service boundaries, API contracts, versioning, and backward compatibility by reflex.
- Chaos-engineering instinct. Did you name what fails? Netflix invented Chaos Monkey. If your design doesn't answer 'what happens when a whole AZ disappears,' you haven't designed it.
- Caching at every layer. EVCache, their internal memcached fork, is everywhere at Netflix. You should think about client-side cache, service-level cache, and edge cache as three distinct decisions.
- CDN and edge thinking. Open Connect, Netflix's CDN, serves the majority of their bytes. Any streaming-adjacent design question expects you to think about cache fill, origin shielding, regional popularity, and ISP-embedded caches.
- Observability as a design axis. Netflix runs on metrics — Atlas, Mantis, Lumen. You should say 'here's what I'd instrument' as a first-class part of the design, not an afterthought.
- Cost awareness. Netflix is one of the biggest AWS customers on earth, and they care about the bill. Naming cost tradeoffs (spot instances, storage tiers, egress) scores points.
- Async by default. Netflix leans on event-driven architectures. Kafka for event streams, Keystone for the pipeline, async APIs wherever latency allows. If you design synchronous-only systems when async would do, you lose points.
- Polyglot persistence. Cassandra, DynamoDB, EVCache, Elasticsearch, MySQL, and their own KV stores all appear in Netflix's stack. Picking the right one — and defending it — is a real axis.
What does not score: lecturing the interviewer on microservices, rattling off obscure Netflix OSS tools (they respect knowledge, not name-dropping), or designing for scale the problem doesn't need.
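The "caching at every layer" point is worth making concrete. A minimal read-through sketch (all names and TTLs here are hypothetical; EVCache and Cassandra are stood in for by plain dicts) shows why client, service, and edge caches are three distinct decisions — each layer picks its own TTL and its own staleness tolerance:

```python
import time

class TtlCache:
    """Minimal TTL cache standing in for any one caching layer."""
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expires_at)

    def get(self, key):
        hit = self.store.get(key)
        if hit and hit[1] > time.monotonic():
            return hit[0]
        return None

    def put(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

# Three layers, three separate TTL/consistency decisions (values are illustrative).
client_cache = TtlCache(ttl_seconds=5)    # on-device: short TTL, staleness acceptable
service_cache = TtlCache(ttl_seconds=60)  # EVCache-style shared tier: invalidated on write
DURABLE_STORE = {"profile:42": {"name": "Alice"}}  # Cassandra stand-in

def read_through(key):
    """Check each cache layer in order; fall back to the durable store and backfill."""
    for layer in (client_cache, service_cache):
        value = layer.get(key)
        if value is not None:
            return value
    value = DURABLE_STORE[key]  # origin read on a full miss
    service_cache.put(key, value)
    client_cache.put(key, value)
    return value
```

The interview-relevant part is not the code but the three separate TTL choices: being able to justify each one independently is what the rubric means by "three distinct decisions."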
Example questions
From Netflix loops reported on Blind and Levels.fyi in 2024-2026:
- Design the video playback service. Specifically: user hits play, what happens from the client request to the first byte of video? Cover CDN selection, ABR manifest, licensing, and playback telemetry.
- Design the watch-history service. Reads are bursty (home page load), writes are constant (every few seconds during playback), and consistency matters for 'continue watching.'
- Design the A/B test configuration and assignment service. 500M+ users, 2000+ concurrent experiments, assignment must be stable, and metrics must be derivable.
- Design the encoding pipeline. A studio drops a 4K master. What produces the final set of encodes, and how is the catalog updated?
- Design the recommendations serving path. Not the training — the serving. User opens the app, what sequence of services and caches hydrates the home page in 200ms?
- Design the thumbnail-generation and serving service. Artwork personalization, A/B testable, cached aggressively.
- Design a global rate limiter for the Netflix API gateway. Distinct per-user, per-endpoint, with failover behavior when the rate-limit service is degraded.
- Design the playback telemetry pipeline. 200B+ events per day, used for both real-time quality monitoring and offline analysis.
- Design the control plane for Open Connect appliance provisioning — getting a box to an ISP and keeping its catalog fresh.
- Design the subtitle and caption delivery system, including on-the-fly rendering in supported locales.
- Design the profile-switching experience. When I switch from my profile to my partner's, what changes under the hood, and what stays cached?
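One primitive recurs across several of these prompts, especially the A/B question: stable assignment. A common approach — sketched here under assumed variant names and weights, not Netflix's actual implementation — is to hash the profile and experiment IDs into a point on [0, 1) and walk cumulative weights, so the same profile always gets the same variant with no assignment table to store or replicate:

```python
import hashlib

def assign_variant(profile_id: str, experiment_id: str, variants, weights):
    """Deterministically map a profile to a variant.

    hash(experiment, profile) -> uniform point in [0, 1); walk cumulative
    weights. Stability falls out of determinism: same inputs, same variant.
    """
    digest = hashlib.sha256(f"{experiment_id}:{profile_id}".encode()).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    cumulative = 0.0
    for variant, weight in zip(variants, weights):
        cumulative += weight
        if point < cumulative:
            return variant
    return variants[-1]  # guard against floating-point rounding at the boundary
```

Salting the hash with the experiment ID is the detail interviewers probe for: it decorrelates assignments across concurrent experiments, which is what makes metrics derivable when 2000+ experiments overlap.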
The streaming questions are the ones candidates most underprepare for. Knowing what ABR means (Adaptive Bitrate), how HLS and DASH differ, what a manifest is, and why licensing is a bottleneck is table stakes.
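To make the ABR vocabulary concrete: the client measures throughput and picks a rung from the bitrate ladder advertised in the manifest. A toy rung-selection sketch (the ladder values and safety factor are illustrative assumptions; real players also weigh buffer occupancy, not just throughput):

```python
# Hypothetical bitrate ladder in kbps; a real manifest carries these per rendition.
LADDER_KBPS = [235, 560, 1050, 2350, 4300, 8100]

def pick_rendition(measured_kbps: float, safety: float = 0.8) -> int:
    """Pick the highest rung the measured throughput can sustain.

    `safety` discounts the throughput estimate so transient dips
    don't immediately stall playback.
    """
    budget = measured_kbps * safety
    eligible = [rung for rung in LADDER_KBPS if rung <= budget]
    return max(eligible) if eligible else LADDER_KBPS[0]  # floor at the lowest rung
```

Being able to sketch this, and to say where the decision lives (client-side, against a server-generated manifest), is roughly what "table stakes" means here.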
Strong vs passing answers
A passing answer to "design the watch-history service" picks Cassandra, adds EVCache, describes reads and writes, and handles basic eventual consistency. It scores in the middle.
A strong answer:
- Scopes aggressively first. "500M profiles, average 3 sessions per day, each session emitting a position update every 30 seconds during active playback, coalesced client-side before it hits the service. Call it order-of-magnitude 3B durable writes per day, roughly 35K writes per second average, maybe 100K at peak. Reads are page-load bursty, probably 10x writes at peak."
- Separates hot and cold. "The 'currently watching' state is tiny — maybe one row per profile per in-progress title. I'd put that in EVCache with Cassandra as the durable store. Historical watch data — everything you've ever watched — is cold, larger, and rarely read. That goes to Cassandra with a different access pattern."
- Names the consistency contract. "Eventual consistency is fine for the full history. For 'continue watching,' we want read-your-writes on the user's own profile. I'd do a quorum write with a short-TTL cache pin so the user's next read sees their own progress."
- Names failure modes. "If EVCache is down, we fall back to Cassandra. If Cassandra is degraded in a region, we serve stale from the regional cache and block writes cleanly instead of silently dropping them. I'd Chaos Monkey this path in staging."
- Names the telemetry. "I'd instrument write latency p99, read latency p99, cache hit rate, and a specific metric for 'continue watching correctness' — we replay a sample of sessions and verify the position matches what the client reported."
- Names a specific Netflix idiom. "I'd build this as a Tier-1 service, which at Netflix means it gets the full regional failover treatment and a well-defined graceful degradation path — the app should still open and play content even if history is down."
Candidates who hit all six points in 40 minutes get pulled into the staff conversation.
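The scoping arithmetic in that first bullet is worth rehearsing until it's reflexive. A few lines make the chain explicit — the writes-per-session and peak-to-average figures below are assumptions chosen to match the answer's order of magnitude, not Netflix data:

```python
SECONDS_PER_DAY = 86_400

profiles = 500e6
sessions_per_profile = 3
durable_writes_per_session = 2  # assumes heavy client-side coalescing of 30s updates

writes_per_day = profiles * sessions_per_profile * durable_writes_per_session
avg_wps = writes_per_day / SECONDS_PER_DAY
peak_wps = avg_wps * 3          # assumed peak-to-average ratio

print(f"{writes_per_day:.1e} writes/day, ~{avg_wps:,.0f} avg wps, ~{peak_wps:,.0f} peak wps")
```

That lands at ~3e9 writes/day and ~35K average wps, matching the quoted answer; the skill being graded is stating each multiplier out loud so the interviewer can challenge any one of them.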
Common failure modes
The ways candidates lose a Netflix system design round:
- Designing for Twitter-scale when the problem is Netflix-scale. Netflix scale is more vertical than horizontal. A billion users hitting one profile load endpoint is different from 200M users streaming a single live event. Know the shape.
- Ignoring the CDN. For any streaming or video-adjacent question, the CDN is the first box you draw, not the last. Candidates who put AWS at the center and the CDN on the edge as an afterthought lose.
- Over-centralizing control. "We'll have a global service that coordinates everything." Netflix runs in three AWS regions simultaneously and is designed to tolerate any one going down entirely. Single points of coordination are red flags.
- Skipping licensing and DRM in playback design. Real Netflix engineers think about content licensing windows (different per region, expires and rotates), DRM key delivery, and manifest security. If you design a playback system with none of that, the interviewer will sit there waiting for you to notice.
- Forgetting mobile. A huge chunk of Netflix's audience is on mobile, often on weak connections. If your design assumes a stable 100Mbps pipe, you've missed the biggest failure mode.
- No cost conversation. Netflix engineers name cost. Not to the penny, but to the order of magnitude. "This approach doubles our egress bill; here's the cheaper alternative with a quality tradeoff."
- Defending instead of integrating. Netflix interviewers push. They are trained to push. Candidates who argue instead of taking the probe as a hint score lower on 'collaboration.'
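Several of these failure modes come down to the same discipline: specify the degraded-mode behavior explicitly instead of letting it happen by accident. The API-gateway rate-limiter question from the list above is a clean place to show it. A token-bucket sketch (the fail-open policy and in-process dict standing in for a shared store are assumptions, not a known Netflix design):

```python
import time

class TokenBucket:
    """Per-key token bucket that fails OPEN when its backing state is unreachable."""

    def __init__(self, rate_per_sec: float, burst: float):
        self.rate = rate_per_sec
        self.burst = burst
        self.state = {}      # key -> (tokens, last_refill); a shared store in real life
        self.healthy = True  # flipped by a health check on the backing store

    def allow(self, key: str) -> bool:
        if not self.healthy:
            # Degraded mode: admit traffic rather than block all playback.
            # A real gateway would also alarm and might apply a coarse local cap.
            return True
        now = time.monotonic()
        tokens, last = self.state.get(key, (self.burst, now))
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.state[key] = (tokens - 1, now)
            return True
        self.state[key] = (tokens, now)
        return False
```

The scoring point isn't the bucket math; it's the `healthy` branch — having decided, out loud, that a degraded rate limiter fails open for a streaming product, and naming the tradeoff that choice buys.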
Prep strategy
30-50 hours over three to four weeks if you're coming from a strong distributed-systems background:
- Read the Netflix tech blog. The posts on EVCache, Atlas, Zuul, Hystrix (now Resilience4j), Keystone, Mantis, the Open Connect deep-dives, and the A/B testing platform are all directly relevant. Twenty hours of reading.
- Watch Adrian Cockcroft's and Josh Evans's talks. Old but still the definitive public explanations of Netflix's architecture philosophy.
- Drill the five canonical streaming questions. Playback, watch history, recommendations serving, encoding pipeline, and A/B testing. Practice each end-to-end.
- Know your storage primitives cold. Cassandra write path, DynamoDB consistency model, Kafka partitioning, EVCache vs Memcached. You will be asked specifics.
- Practice the deep-dive. Pick your strongest past project and prepare to defend every architectural choice for 45 minutes. Write down the decisions, the alternatives, and the reasons.
- Rehearse scale numbers. Netflix-ish numbers you should have memorized: 300M+ subscribers, 3 AWS regions, 200+ services, peak traffic during Stranger Things finales (measurable on internet-backbone graphs).
- Mock with someone who's been through it. Netflix's bar is specific. Generic FAANG mocks undershoot it, and Amazon-style behavioral mocks overshoot it in the wrong direction.
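On "know your storage primitives cold": keyed partitioning is the kind of specific you'll be asked about. A dependency-free sketch of the idea (Kafka's default partitioner actually uses murmur2; sha256 here is just to avoid assuming a third-party hash library):

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Keyed partitioning: the same key always lands on the same partition,
    which is what preserves per-profile event ordering in a telemetry stream."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

The follow-up you should be ready for: what happens to ordering guarantees when `num_partitions` changes, and why keys with no ordering requirement are better sent unkeyed.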
Next-day follow-up
Netflix's post-interview etiquette is lean and professional:
- Send a one-paragraph thank-you to the recruiter. They relay it. Don't LinkedIn-ping individual interviewers — Netflix's culture is allergic to performative networking.
- Within 24 hours, write your debrief. Every question, your answer, the probes, what you'd say differently. If you get a no, you'll want this for round two somewhere else.
- Netflix is unusually honest in 'no' feedback. Ask the recruiter for the specific dimension. They often tell you.
- If you get a yes, the offer is usually a top-of-market number, paid mostly in cash with minimal equity. Negotiate the base up; there's no RSU refresh to hide behind. Use Levels.fyi for your role family as the anchor.
The candidates who clear a Netflix system design loop are the ones who can look at a streaming problem, think in CDN-first and region-first terms, name specific failure modes, and defend cost tradeoffs without apology. If you can operate at that scale without drama, and you can narrate the chaos-engineering instinct naturally, you will be in the senior pile. If you think of Netflix as 'like Hulu but bigger,' you will not.
Sources and further reading
When evaluating any company's interview process, hiring bar, or compensation, cross-reference what you read here against multiple primary sources before making decisions.
- Levels.fyi — Crowdsourced compensation data with real recent offers across tech employers
- Glassdoor — Self-reported interviews, salaries, and employee reviews searchable by company
- Blind by Teamblind — Anonymous discussions about specific companies, often the freshest signal on layoffs, comp, culture, and team-level reputation
- LinkedIn People Search — Find current employees by company, role, and location for warm-network outreach and informational interviews
These are starting points, not the last word. Combine multiple sources, weight recent data over older, and treat anonymous reports as signal that needs corroboration.
Related guides
- The Airbnb System Design Interview in 2026 — Search, Ranking, and Trust-and-Safety Scale — Airbnb's system design loop is FAANG-flavored but has three distinctive axes: search-and-ranking, trust-and-safety, and marketplace dynamics. Here's how the loop actually grades and what a strong answer looks like.
- The Atlassian System Design Interview — Jira, Confluence, and Team-of-Teams Scale — Atlassian system design interviews reward candidates who can model collaborative enterprise software, not just recite generic distributed systems. This guide breaks down the Jira/Confluence-style prompts, the 2026 rubric, and the answers that show senior judgment.
- The Cloudflare System Design Interview — Edge Networking, Workers, and DDoS at Scale — Cloudflare system design interviews reward candidates who understand edge architecture, control-plane propagation, request isolation, and abuse-resistant systems. This guide maps the 2026 bar for networking, Workers, and DDoS-style prompts.
- Netflix Software Engineer Interview Process in 2026 — Coding, System Design, Behavioral Rounds, and Hiring Bar — A practical guide to the Netflix Software Engineer interview process in 2026, including coding screens, system design, behavioral signals, hiring bar by level, and a focused prep plan.
- The Shopify System Design Interview — Commerce Scale, Ruby, and Pair-Programming — Shopify's system design round isn't like Google's. It cares about commerce-specific correctness, multi-tenant isolation, pair-programming culture, and how to reason about a Ruby monolith at scale. Here's what they grade on.
