SQL window functions interview questions — what FAANG actually asks
Window functions are the single most common SQL topic in data interviews at top tech. They're also the single most likely place a senior candidate stalls.
Below: the 8 questions that come up in real loops, why they're asked, and how to actually answer them.
Why interviewers ask window-function questions
Window functions test three things at once:
- Do you understand the difference between row-level operations and group aggregates?
- Can you reason about ordering and partitioning without re-running the query in your head?
- Do you know the standard tricks (running totals, gaps, deduplication) that come up daily on the job?
Interviewers favour these questions because they discriminate well between strong and average candidates. Loops at Google, Meta, Amazon, Stripe, Airbnb, and Microsoft all include at least one.
Question 1: ROW_NUMBER vs RANK vs DENSE_RANK
Given a sales table with salesperson_id and amount, find the top 3 salespeople per region.
The trap: candidates default to RANK() and miss that ties produce 4 rows instead of 3. The correct answer uses ROW_NUMBER() if you want exactly 3 — and you should explain why you chose it.
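You can see the tie behavior concretely with a few rows in Python's sqlite3 (window functions need SQLite 3.25+, bundled with recent Pythons). The data below is made up, and for brevity the sketch ranks raw amounts rather than the per-salesperson SUM used in the full answer:

```python
import sqlite3

# Hypothetical mini dataset: region "west" has a tie for 3rd place (80/80).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (salesperson_id INT, region TEXT, amount INT)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "west", 100), (2, "west", 90), (3, "west", 80), (4, "west", 80), (5, "west", 70)],
)

def top3(rank_fn):
    # Swap in ROW_NUMBER(), RANK(), or DENSE_RANK() to compare tie handling.
    return con.execute(f"""
        SELECT salesperson_id FROM (
            SELECT salesperson_id,
                   {rank_fn} OVER (PARTITION BY region ORDER BY amount DESC) AS rn
            FROM sales
        ) ranked
        WHERE rn <= 3
    """).fetchall()

print(len(top3("RANK()")))        # 4: the tied pair both get rank 3
print(len(top3("ROW_NUMBER()")))  # 3: exactly three rows
```

DENSE_RANK() also lets four rows through here, so for a strict top-N only ROW_NUMBER() guarantees N rows.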
SELECT salesperson_id, region, total
FROM (
SELECT
salesperson_id,
region,
SUM(amount) AS total,
ROW_NUMBER() OVER (PARTITION BY region ORDER BY SUM(amount) DESC) AS rn
FROM sales
GROUP BY salesperson_id, region
) ranked
WHERE rn <= 3;

Question 2: Running totals
Show cumulative revenue per day.
SELECT
day,
daily_revenue,
SUM(daily_revenue) OVER (ORDER BY day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM daily_revenue;

The ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW part is what makes it a per-row running total. Without it, the default frame when an ORDER BY is present is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which lumps together rows with equal day values (peers) and gives them all the same total.
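When two rows share the same day value, RANGE and ROWS diverge; a quick sqlite3 sketch (made-up data, SQLite 3.25+) shows it:

```python
import sqlite3

# Two rows share day 2. With RANGE they are "peers" and both receive the
# combined total; ROWS treats them as separate steps of the running sum.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE daily_revenue (day INT, daily_revenue INT)")
con.executemany("INSERT INTO daily_revenue VALUES (?, ?)",
                [(1, 10), (2, 20), (2, 20), (3, 40)])

def running(frame):
    rows = con.execute(f"""
        SELECT SUM(daily_revenue) OVER (ORDER BY day {frame}) FROM daily_revenue
    """).fetchall()
    return sorted(r[0] for r in rows)  # sorted: tie order is engine-defined

print(running("RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW"))
# [10, 50, 50, 90]: both day-2 rows get the same lumped total
print(running("ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW"))
# [10, 30, 50, 90]: one step per row
```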
Question 3: LAG and LEAD
Find the time between consecutive orders for each customer.
SELECT
customer_id,
order_id,
ordered_at,
ordered_at - LAG(ordered_at) OVER (PARTITION BY customer_id ORDER BY ordered_at) AS gap
FROM orders;

Why this question matters: LAG is the simplest way to compare a row to its previous row, and the pattern shows up everywhere — churn analysis, session detection, time-between-events.
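A minimal sqlite3 sketch of the same pattern, with made-up data and integer timestamps (think epoch days) so the gap is plain subtraction; with real dates you'd use the engine's date arithmetic:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer_id INT, order_id INT, ordered_at INT)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 101, 100), (1, 102, 107), (1, 103, 110), (2, 201, 50)])

rows = con.execute("""
    SELECT customer_id, order_id,
           ordered_at - LAG(ordered_at) OVER (
               PARTITION BY customer_id ORDER BY ordered_at) AS gap
    FROM orders
    ORDER BY customer_id, ordered_at
""").fetchall()
print(rows)
# [(1, 101, None), (1, 102, 7), (1, 103, 3), (2, 201, None)]
# gap is NULL (None) for each customer's first order
```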
Question 4: PARTITION BY vs GROUP BY — the trap
Show each transaction along with the average transaction amount in its city.
The trap: candidates write a GROUP BY which collapses rows. The correct answer uses a window function so each transaction row is preserved:
SELECT
transaction_id,
city,
amount,
AVG(amount) OVER (PARTITION BY city) AS city_avg
FROM transactions;

When to use which:
- GROUP BY collapses rows — one row per group.
- PARTITION BY keeps every row — adds aggregate as a new column.
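The difference in row counts is easy to demonstrate with a toy table in Python's sqlite3 (hypothetical data):

```python
import sqlite3

# Same aggregate, two shapes: GROUP BY collapses to one row per city,
# while the window version keeps all three transactions and attaches the average.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE transactions (transaction_id INT, city TEXT, amount INT)")
con.executemany("INSERT INTO transactions VALUES (?, ?, ?)",
                [(1, "nyc", 10), (2, "nyc", 30), (3, "sf", 50)])

grouped = con.execute(
    "SELECT city, AVG(amount) FROM transactions GROUP BY city").fetchall()
windowed = con.execute(
    "SELECT transaction_id, city, amount, "
    "AVG(amount) OVER (PARTITION BY city) AS city_avg "
    "FROM transactions ORDER BY transaction_id").fetchall()

print(len(grouped), len(windowed))  # 2 3
print(windowed[0])  # (1, 'nyc', 10, 20.0): row kept, average attached
```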
Question 5: First and last value per group
What was the first purchase amount each customer made?
SELECT DISTINCT
customer_id,
FIRST_VALUE(amount) OVER (PARTITION BY customer_id ORDER BY purchased_at) AS first_purchase
FROM purchases;

FIRST_VALUE and LAST_VALUE are underused. Most candidates write a subquery with MIN(date) and join back — works but verbose.
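LAST_VALUE has a classic frame gotcha worth knowing: with the default frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), it returns the current row's value, not the partition's last. A quick sqlite3 sketch (made-up data, SQLite 3.25+) makes it concrete:

```python
import sqlite3

# One customer, three purchases. Under the default frame each row only
# "sees" up to itself, so LAST_VALUE just echoes the current row.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE purchases (customer_id INT, purchased_at INT, amount INT)")
con.executemany("INSERT INTO purchases VALUES (1, ?, ?)",
                [(1, 10), (2, 20), (3, 30)])

def last_vals(frame):
    rows = con.execute(f"""
        SELECT LAST_VALUE(amount) OVER (
            PARTITION BY customer_id ORDER BY purchased_at {frame})
        FROM purchases ORDER BY 1
    """).fetchall()
    return [r[0] for r in rows]

print(last_vals(""))  # [10, 20, 30]: the trap
print(last_vals("ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING"))
# [30, 30, 30]: the whole partition is visible, so you get the true last value
```

Extending the frame to UNBOUNDED FOLLOWING restores the behavior the name suggests.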
Question 6: Gap and island detection
Find consecutive day streaks for each user.
This is the "Duolingo streak" question. The trick: subtract row-number from the date. Rows that share the same offset belong to the same streak.
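Before the full query, the offset trick is easy to sanity-check in Python's sqlite3, with dates simplified to integers (made-up activity on days 1, 2, 3 and 7, 8):

```python
import sqlite3

# Within a streak, activity_date - ROW_NUMBER() is constant,
# so that offset works as a group key for GROUP BY.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE activity (user_id INT, activity_date INT)")
con.executemany("INSERT INTO activity VALUES (1, ?)", [(1,), (2,), (3,), (7,), (8,)])

streaks = con.execute("""
    SELECT COUNT(*) AS streak_length,
           MIN(activity_date) AS streak_start,
           MAX(activity_date) AS streak_end
    FROM (
        SELECT activity_date,
               activity_date - ROW_NUMBER() OVER (ORDER BY activity_date) AS grp
        FROM activity
    ) grouped
    GROUP BY grp
    ORDER BY streak_start
""").fetchall()
print(streaks)  # [(3, 1, 3), (2, 7, 8)]: a 3-day streak, then a 2-day streak
```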
SELECT
user_id,
COUNT(*) AS streak_length,
MIN(activity_date) AS streak_start,
MAX(activity_date) AS streak_end
FROM (
SELECT
user_id,
activity_date,
activity_date - INTERVAL '1 day' * ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY activity_date) AS grp
FROM activity
) grouped
GROUP BY user_id, grp;

Question 7: Deduplication
Keep only the most recent row per customer.
SELECT *
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY updated_at DESC) AS rn
FROM customer_state
) latest
WHERE rn = 1;

The pattern: ROW_NUMBER() partitioned by the dedup key, ordered by recency, then filter to rn = 1.
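A quick sqlite3 check of the pattern on a made-up table with one customer updated twice:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer_state (customer_id INT, status TEXT, updated_at INT)")
con.executemany("INSERT INTO customer_state VALUES (?, ?, ?)",
                [(1, "trial", 1), (1, "paid", 5), (2, "trial", 2)])

# ROW_NUMBER() per customer, newest first; keeping rn = 1 drops the stale row.
latest = con.execute("""
    SELECT customer_id, status FROM (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY customer_id ORDER BY updated_at DESC) AS rn
        FROM customer_state
    ) latest
    WHERE rn = 1
    ORDER BY customer_id
""").fetchall()
print(latest)  # [(1, 'paid'), (2, 'trial')]
```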
Question 8: Percentile per group
Find the median transaction per city.
SELECT DISTINCT
city,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amount) OVER (PARTITION BY city) AS median_amount
FROM transactions;

One portability note: PERCENTILE_CONT as a window function works in SQL Server, Oracle, Snowflake, and BigQuery; Postgres supports it only as an ordered-set aggregate with GROUP BY. Or with NTILE:
SELECT city, MAX(amount) AS median_amount
FROM (
SELECT city, amount, NTILE(2) OVER (PARTITION BY city ORDER BY amount) AS half
FROM transactions
) buckets
WHERE half = 1
GROUP BY city;

MAX of the first bucket gives the lower median (exact when the row count is odd, since NTILE puts the extra row in the first bucket). Beware AVG(amount) here: that returns the mean of the bottom half, not the median.

What to drill before your next interview
- Run each of these against a real dataset (Postgres, BigQuery, or Snowflake — they all support window functions).
- Modify the partition and ordering to see how output changes. Build the muscle of predicting what comes out.
- Time yourself on the gap-and-island and running-total problems. They're the most asked.
If you want a daily 10-minute drill with worked solutions across these and 500+ other SQL problems, join the waitlist for naildd.
FAQ
Are window functions different across Postgres, MySQL, BigQuery, and Snowflake?
Slightly. All major engines support ROW_NUMBER, RANK, LAG, LEAD, and basic aggregates. Edge cases: MySQL only added window functions in 8.0 (older versions don't support them), BigQuery uses slightly different syntax for NTILE ties, and Snowflake has additional functions like MEDIAN built in.
Which is faster — window function or self-join?
Window functions are usually faster because the engine processes the table once. Self-joins with subqueries can blow up to O(n²) on large tables.
Do I need to memorize the exact syntax?
For window functions specifically, yes — at least the four most common: ROW_NUMBER(), LAG, SUM() OVER, and RANK(). Interviewers expect you to write them on a whiteboard without autocomplete.