SQL Window Functions Interview Guide — RANK, LAG, and Running Totals Worked Examples
A practical SQL window functions interview guide with worked RANK, LAG, running total, rolling average, and cohort-style examples. Use it to answer analytics, data engineering, and product SQL questions without defaulting to slow self-joins.
This SQL window functions interview guide focuses on the patterns interviewers actually test: RANK, LAG, LEAD, running totals, rolling averages, percent-of-total, and “top N per group” queries. The core skill is not memorizing syntax; it is explaining which rows are visible to each calculation, what order they are evaluated in, and how the frame changes the answer.
Window functions show up in analytics engineering, business intelligence, product analytics, data engineering, growth, finance, and backend roles where SQL is used for reporting. Interviewers like them because a single question reveals whether you understand aggregation, ordering, duplicates, nulls, and grain. A candidate who reaches for five nested subqueries when one window function would work usually looks less production-ready.
SQL window functions interview guide: the mental model
A window function calculates a value across a set of related rows while keeping the original row-level output. GROUP BY collapses rows. A window function annotates rows.
The useful sentence is: “Partition chooses the group, order chooses the sequence, and the frame chooses which rows in that sequence are visible to the calculation.”
Basic shape:
```sql
function_name(expression) OVER (
    PARTITION BY grouping_column
    ORDER BY sort_column
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
```
| Clause | What it controls | Interview trap |
|---|---|---|
| PARTITION BY | Resets the calculation per group | Forgetting it and ranking across the whole table |
| ORDER BY | Defines sequence inside the group | Assuming deterministic ties without a tie-breaker |
| Frame | Defines which ordered rows are included | Relying on database defaults for running totals |
| Outer ORDER BY | Sorts final results | Confusing output order with window order |
A strong interview answer starts by naming the row grain. For example: “The table is one row per order. I need one output row per order, so I will use a window function rather than grouping away the order row.” That signals that you understand why the tool fits.
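The collapse-versus-annotate distinction is easy to see on a toy table. This is a minimal runnable sketch using SQLite (window functions require SQLite 3.25+); the table and values are invented for illustration, not taken from the guide's schemas:

```python
# Minimal sketch: GROUP BY collapses rows, a window function annotates them.
# Toy data; requires SQLite 3.25+ for window-function support.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INT, amount INT);
    INSERT INTO orders VALUES (1, 10), (1, 20), (2, 5);
""")

# GROUP BY: one row per customer; the individual order rows are gone.
grouped = conn.execute("""
    SELECT customer_id, SUM(amount)
    FROM orders
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()
print(grouped)  # [(1, 30), (2, 5)]

# Window function: every order row survives, annotated with its
# customer's total.
windowed = conn.execute("""
    SELECT customer_id, amount,
           SUM(amount) OVER (PARTITION BY customer_id) AS customer_total
    FROM orders
    ORDER BY customer_id, amount
""").fetchall()
print(windowed)  # [(1, 10, 30), (1, 20, 30), (2, 5, 5)]
```

Both queries read the same three rows; only the window version keeps the order-level grain while attaching the group-level number.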
Worked example 1: RANK, DENSE_RANK, and ROW_NUMBER
Prompt: “For each department, return the top three employees by sales this quarter. Include ties if two employees have the same sales.”
```sql
WITH ranked AS (
    SELECT
        department_id,
        employee_id,
        quarter_sales,
        RANK() OVER (
            PARTITION BY department_id
            ORDER BY quarter_sales DESC
        ) AS sales_rank
    FROM employee_quarter_sales
)
SELECT department_id, employee_id, quarter_sales, sales_rank
FROM ranked
WHERE sales_rank <= 3
ORDER BY department_id, sales_rank, employee_id;
```
RANK() leaves gaps after ties. If two people tie for first, the next person is third. That is right when the business asks for “top three ranks including ties.” DENSE_RANK() does not leave gaps, so two people tied for first are followed by rank two. ROW_NUMBER() forces a unique order and is right when the business needs exactly three rows per department.
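The three tie behaviors are easiest to verify side by side. Here is a hedged sketch running all three ranking functions over the same tied data in SQLite; the table and values are invented for illustration:

```python
# Compare RANK, DENSE_RANK, and ROW_NUMBER on tied data.
# Toy data; requires SQLite 3.25+.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (employee_id INT, quarter_sales INT);
    INSERT INTO sales VALUES (1, 100), (2, 100), (3, 90);
""")

rows = conn.execute("""
    SELECT employee_id,
           RANK()       OVER w AS rnk,
           DENSE_RANK() OVER w AS dense_rnk,
           ROW_NUMBER() OVER w AS row_num
    FROM sales
    WINDOW w AS (ORDER BY quarter_sales DESC)
    ORDER BY row_num
""").fetchall()

# Employees 1 and 2 tie at 100, so:
#   RANK       -> 1, 1, 3 (gap after the tie)
#   DENSE_RANK -> 1, 1, 2 (no gap)
#   ROW_NUMBER -> 1, 2, 3 (unique, but the tied pair's order is
#                 arbitrary without a tie-breaker in the window ORDER BY)
print([r[1] for r in rows])  # [1, 1, 3]
print([r[2] for r in rows])  # [1, 1, 2]
```

Note that which tied employee gets row_num 1 is undefined here; that is exactly why the deterministic tie-breaker below matters.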
Interviewers often ask the follow-up: “What if we want exactly three employees even with ties?” Add a deterministic tie-breaker:
```sql
ROW_NUMBER() OVER (
    PARTITION BY department_id
    ORDER BY quarter_sales DESC, employee_id ASC
) AS row_num
```
Then filter row_num <= 3. The tie-breaker matters because without it, repeated runs can return different rows when sales are equal. In production dashboards, nondeterministic ties create messy audits.
How to explain it: “I chose RANK because the prompt says include ties. If the product requirement changes to exactly three rows, I would switch to ROW_NUMBER and document the tie-breaker.”
Worked example 2: LAG for before-and-after comparisons
Prompt: “For each customer order, show the previous order date and the number of days since the previous order.”
```sql
SELECT
    customer_id,
    order_id,
    order_date,
    LAG(order_date) OVER (
        PARTITION BY customer_id
        ORDER BY order_date, order_id
    ) AS previous_order_date,
    DATE_DIFF(
        order_date,
        LAG(order_date) OVER (
            PARTITION BY customer_id
            ORDER BY order_date, order_id
        ),
        DAY
    ) AS days_since_previous_order
FROM orders;
```
LAG reads a value from a prior row in the same partition. LEAD reads a future row. Both require a meaningful order. If two orders happen on the same timestamp, include order_id as a stable tie-breaker.
Many candidates try to solve this with a self-join:
```sql
-- Usually harder to maintain
SELECT
    o.customer_id,
    o.order_id,
    o.order_date,
    MAX(p.order_date) AS previous_order_date
FROM orders o
LEFT JOIN orders p
    ON p.customer_id = o.customer_id
    AND p.order_date < o.order_date
GROUP BY o.customer_id, o.order_id, o.order_date;
```
That can work, but it is more expensive and more error-prone. The window version states the intent directly.
Null handling is the follow-up. The first order for a customer has no previous order, so previous_order_date is null and the difference is null. Do not automatically coalesce it to zero unless the metric definition says “first order gap equals zero.” In behavioral analytics, that zero can pollute averages.
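The null behavior is worth seeing concretely. This is a sketch in SQLite with invented data; SQLite has no DATE_DIFF, so julianday() subtraction stands in for the BigQuery-style function used above:

```python
# The first order per customer has no previous row, so LAG returns
# NULL (Python None) and the day difference stays NULL rather than
# being coerced to zero. julianday() subtraction replaces DATE_DIFF,
# which SQLite lacks. Toy data; requires SQLite 3.25+.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INT, order_id INT, order_date TEXT);
    INSERT INTO orders VALUES
        (1, 101, '2024-01-01'),
        (1, 102, '2024-01-11'),
        (2, 201, '2024-02-01');
""")

rows = conn.execute("""
    SELECT customer_id, order_id,
           LAG(order_date) OVER w AS previous_order_date,
           CAST(julianday(order_date)
                - julianday(LAG(order_date) OVER w) AS INT)
               AS days_since_previous_order
    FROM orders
    WINDOW w AS (PARTITION BY customer_id ORDER BY order_date, order_id)
    ORDER BY customer_id, order_id
""").fetchall()
print(rows)
# [(1, 101, None, None), (1, 102, '2024-01-01', 10), (2, 201, None, None)]
```

Each customer's first order keeps its NULLs, which is what you want averages and histograms to see.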
Worked example 3: running totals with explicit frames
Prompt: “Show daily revenue and cumulative revenue by month.”
```sql
WITH daily AS (
    SELECT
        DATE_TRUNC(order_date, DAY) AS order_day,
        DATE_TRUNC(order_date, MONTH) AS order_month,
        SUM(revenue) AS daily_revenue
    FROM orders
    GROUP BY 1, 2
)
SELECT
    order_day,
    daily_revenue,
    SUM(daily_revenue) OVER (
        PARTITION BY order_month
        ORDER BY order_day
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS month_to_date_revenue
FROM daily
ORDER BY order_day;
```
The explicit ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is not decorative. Some databases default to a RANGE frame when ORDER BY is present. With duplicate ordering values, RANGE can include all peers with the same date or value, which may produce surprising jumps. Aggregating to one row per day first and using ROWS makes the result easier to reason about.
A polished answer says: “I aggregate to the metric grain first, then window over that result. I do not run the window directly over raw orders because multiple orders per day would create multiple partial cumulative rows.”
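The RANGE-versus-ROWS peer behavior can be demonstrated in a few lines. This is a sketch with invented data in SQLite; note the two rows that share the same day:

```python
# With duplicate ORDER BY values, a RANGE frame includes all peers of
# the current row, while ROWS walks one physical row at a time.
# Toy data; requires SQLite 3.25+.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txns (order_day TEXT, revenue INT);
    INSERT INTO txns VALUES
        ('2024-01-01', 10),
        ('2024-01-01', 20),
        ('2024-01-02', 30);
""")

rows = conn.execute("""
    SELECT order_day, revenue,
           SUM(revenue) OVER (
               ORDER BY order_day
               RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS range_total,
           SUM(revenue) OVER (
               ORDER BY order_day
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS rows_total
    FROM txns
    ORDER BY order_day, revenue
""").fetchall()

# RANGE: both '2024-01-01' rows are peers, so each sees the full
# day-one total of 30 -- the "surprising jump" described above.
print([r[2] for r in rows])  # [30, 30, 60]
# ROWS: three distinct partial sums, but which tied row comes first
# is arbitrary without a tie-breaker, so only the shape is stable.
print(sorted(set(r[3] for r in rows)))
```

Aggregating to one row per day first makes the duplicate-peer situation impossible, which is why the worked example does so.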
Worked example 4: rolling seven-day average
Prompt: “For every day, calculate the trailing seven-day average of active users.”
```sql
SELECT
    activity_day,
    active_users,
    AVG(active_users) OVER (
        ORDER BY activity_day
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS trailing_7_day_avg
FROM daily_active_users
ORDER BY activity_day;
```
This works if the table has exactly one row per calendar day. If missing days are possible, first build a date spine and left join the activity table. Otherwise a “seven-row” average is not the same as a “seven-day” average.
Interview-safe explanation: “ROWS 6 PRECEDING means the current row plus six earlier rows. I would verify that the source has no missing days; if it does, I would create a complete calendar table before calculating the rolling average.”
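One way to sketch the date-spine fix is a recursive CTE, shown here in SQLite with an invented table, a hardcoded date range, and the assumption that a missing row means zero active users (that zero-fill is a metric definition choice, not a SQL fact):

```python
# A recursive CTE generates every calendar day, missing days are
# left-joined in as zero, and only then is the 7-row frame a true
# 7-day frame. Toy data; requires SQLite 3.25+.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_active_users (activity_day TEXT, active_users INT);
    -- '2024-01-03' is deliberately missing
    INSERT INTO daily_active_users VALUES
        ('2024-01-01', 10),
        ('2024-01-02', 20),
        ('2024-01-04', 40);
""")

rows = conn.execute("""
    WITH RECURSIVE spine(day) AS (
        SELECT '2024-01-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM spine WHERE day < '2024-01-04'
    )
    SELECT spine.day,
           COALESCE(d.active_users, 0) AS active_users,
           AVG(COALESCE(d.active_users, 0)) OVER (
               ORDER BY spine.day
               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS trailing_avg
    FROM spine
    LEFT JOIN daily_active_users d ON d.activity_day = spine.day
    ORDER BY spine.day
""").fetchall()
print(rows)
# The gap day appears with 0 users and drags the average down:
# [('2024-01-01', 10, 10.0), ('2024-01-02', 20, 15.0),
#  ('2024-01-03', 0, 10.0), ('2024-01-04', 40, 17.5)]
```

Without the spine, the window over three rows would silently average three values and call it a "seven-day" trend.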
Worked example 5: percent of total without losing row detail
Prompt: “For each product in each category, show its revenue and share of category revenue.”
```sql
SELECT
    category,
    product_id,
    revenue,
    revenue / NULLIF(
        SUM(revenue) OVER (PARTITION BY category),
        0
    ) AS category_revenue_share
FROM product_revenue;
```
This is the cleanest contrast with GROUP BY. A grouped query can calculate category totals, but then you need to join those totals back to products. The window version keeps each product row and attaches the category total. NULLIF protects against division by zero.
Follow-up: “How would you calculate share of total company revenue?” Remove PARTITION BY category. The window now sees the full result set.
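Both shares can be computed in one pass, which is a nice thing to say out loud in the interview. A sketch in SQLite with invented data: PARTITION BY category gives the category share, and an empty OVER () gives the company share:

```python
# Category share and company share side by side: the empty OVER ()
# window sees the entire result set. Toy data; requires SQLite 3.25+.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_revenue (category TEXT, product_id INT, revenue REAL);
    INSERT INTO product_revenue VALUES
        ('toys',  1, 30.0),
        ('toys',  2, 70.0),
        ('books', 3, 50.0);
""")

rows = conn.execute("""
    SELECT category, product_id, revenue,
           revenue / NULLIF(SUM(revenue) OVER (PARTITION BY category), 0)
               AS category_share,
           revenue / NULLIF(SUM(revenue) OVER (), 0)
               AS company_share
    FROM product_revenue
    ORDER BY product_id
""").fetchall()
for r in rows:
    print(r)
# ('toys', 1, 30.0, 0.3, 0.2)   -- 30 of 100 in toys, 30 of 150 overall
# product 2 and 3 follow; shares sum to 1 within each window
```

Every product row survives, annotated with both denominators, and NULLIF still guards the zero-revenue edge case.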
Common traps interviewers are listening for
- Using WHERE on a window alias in the same SELECT. SQL evaluation order means the alias is not available in WHERE. Use a CTE, a subquery, or QUALIFY if the database supports it.
- Forgetting the partition. A rank across the whole company is different from a rank per department.
- Missing deterministic ordering. If two events share the same timestamp, order by timestamp plus event id.
- Windowing at the wrong grain. Aggregate to day, customer, product, or session first when the metric is defined at that grain.
- Assuming default frames. For cumulative metrics, write the frame explicitly.
- Confusing RANK and ROW_NUMBER. Ties change row counts.
- Ignoring nulls. First rows, missing previous events, and zero denominators should be named.
Interview framework for any window-function prompt
Use this four-step answer:
- State the output grain. “One row per customer order.”
- Define the partition. “Reset the calculation for each customer.”
- Define the order. “Sequence by order date and order id.”
- Choose the function and frame. “Use LAG for previous order; use SUM with an unbounded preceding frame for cumulative revenue.”
This framework turns a vague SQL prompt into a design conversation. If you get stuck, write a small table with three or four rows and manually compute the first two outputs. Interviewers give partial credit for correct reasoning even before the syntax is perfect.
How to talk about window functions on a resume
Weak bullet: “Used SQL window functions for reporting.”
Better bullet: “Built revenue and retention dashboards using SQL window functions, including cohort ranks, LAG-based reorder intervals, and month-to-date running totals at daily grain.”
Best bullet: “Replaced self-join-heavy retention queries with LAG, ROW_NUMBER, and explicit-frame running totals, reducing query complexity and making customer lifecycle metrics auditable by product and finance teams.”
The point is to connect the function to a business metric and a data-quality outcome. Hiring teams care less that you know RANK() exists and more that you can use it to prevent duplicate top-N results, explain a retention gap, or ship a dashboard that does not change randomly because of tie behavior.
Quick prep checklist
Before a SQL window functions interview, practice these prompts until they feel automatic:
- Top N per group with and without ties.
- First event, latest event, and deduplication using ROW_NUMBER.
- Previous purchase date using LAG.
- Month-to-date revenue with an explicit frame.
- Rolling seven-day average with missing calendar dates.
- Percent of category total.
- Sessionization using gaps between events.
- “Second highest salary” using DENSE_RANK or ROW_NUMBER.
The winning pattern is calm specificity. Name the grain, partition, order, and frame. Then write the query. That is the difference between someone who has memorized window syntax and someone who can use it in real analytics work.
Bonus worked example: sessionization with LAG
A higher-level window question is sessionization: “Group events into sessions when a user is inactive for more than 30 minutes.” The pattern is LAG, then a running sum of session starts.
```sql
WITH sequenced AS (
    SELECT
        user_id,
        event_time,
        LAG(event_time) OVER (
            PARTITION BY user_id
            ORDER BY event_time
        ) AS previous_event_time
    FROM events
),

flagged AS (
    SELECT
        *,
        CASE
            WHEN previous_event_time IS NULL THEN 1
            WHEN TIMESTAMP_DIFF(event_time, previous_event_time, MINUTE) > 30 THEN 1
            ELSE 0
        END AS is_new_session
    FROM sequenced
)
SELECT
    *,
    SUM(is_new_session) OVER (
        PARTITION BY user_id
        ORDER BY event_time
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS session_number
FROM flagged;
```
This is a great interview answer because it combines two window ideas: compare a row to the previous row, then cumulatively count the boundaries. Say out loud that the 30-minute threshold is a product definition, not a SQL fact, and that duplicated timestamps may need an event-id tie-breaker. That kind of precision is what makes the answer feel production-ready.
Related guides
- SQL Interview Questions Cheatsheet: Window Functions, CTEs & Optimization — Master the SQL concepts that actually show up in technical interviews: window functions, CTEs, and query optimization with real examples.
- Product Sense Questions for the PM Interview — Frameworks and Worked Examples — A practical product sense interview guide with a repeatable framework, worked examples, metric trees, tradeoff language, and traps to avoid when answering ambiguous PM prompts.
- API Design Interview Cheatsheet in 2026 — Patterns, Examples, Practice Plan, and Common Traps — A practical API design interview cheatsheet for 2026: how to scope the problem, choose REST/GraphQL/gRPC patterns, model resources, handle auth, versioning, rate limits, and avoid the traps that cost senior candidates offers.
- AWS Interview Cheatsheet in 2026 — Patterns, Examples, Practice Plan, and Common Traps — A high-signal AWS interview cheatsheet for 2026 covering architecture patterns, IAM, networking, reliability, cost, debugging, and the answers that show real cloud judgment.
- Backend System Design Interview Cheatsheet in 2026 — Patterns, Examples, Practice Plan, and Common Traps — A backend System Design interview cheatsheet for 2026 with the core flow, architecture patterns, capacity heuristics, reliability tradeoffs, and traps that separate senior answers from vague box drawing.
