How to calculate funnel step conversion in SQL
Contents:
What step conversion is and why it matters
It is a Monday morning at a Notion-sized SaaS company and the head of growth slacks the analytics channel: paid signups doubled last quarter but paid revenue is flat. The product team blames pricing, marketing blames the ad mix, and the CEO wants an answer by Wednesday. The honest answer almost always lives in a step-by-step conversion table. If landing-to-signup is 50%, signup-to-trial is 80%, and trial-to-paid is 8%, the leak is at the paywall, not the top. Until you can show that table, every meeting is a vibes argument.
Funnel step conversion is the rate at which users move from one stage of a defined journey to the next. The full funnel has a top-of-funnel conversion (visits to paying customers, often called overall CR) and a series of step conversions between adjacent stages. Each step is independently actionable. Top-of-funnel CR hides where to invest the next engineering sprint; step CR shows you which screen to redesign first.
This post walks through a real interview-grade SQL pattern that hiring managers at Stripe, DoorDash, Linear, and Vercel use on product analyst candidates: the absolute conversion query, the relative step query, a channel-segmented breakdown, and the traps that trip up most candidates the first time they hit a funnel table at scale.
The funnel schema
Almost every product event log has the same shape: a fact table with at least three columns — who, what, when. The canonical B2B SaaS funnel is landing -> signup -> trial_started -> paid -> renewed, but the same pattern fits e-commerce (view_item -> add_to_cart -> checkout_start -> purchase), gaming (install -> tutorial_complete -> level_5 -> first_purchase), or fintech (app_open -> kyc_submit -> kyc_pass -> first_deposit).
event_name | user_id | event_timestamp
landing | 42 | 2026-05-01 10:00:00
signup | 42 | 2026-05-01 10:15:00
trial_started | 42 | 2026-05-01 10:17:00
paid | 42 | 2026-05-15 09:00:00The first decision is whether to require strict ordering (a user must hit step 1, then 2, then 3 in chronological order) or loose qualification (a user is counted at step N if they ever fired event N in the window). UX funnels want strict ordering because the flow matters. Marketing funnels often want loose qualification because the user can land via multiple paths. Decide first, document the choice, and write the SQL to match.
Absolute conversion from the top
Absolute conversion measures the rate from the top down to each step. It answers the question every executive really wants: of all users who showed up at the landing page, what fraction ever paid? This is the headline metric that goes in the board deck.
WITH user_funnel AS (
SELECT
user_id,
MAX(CASE WHEN event_name = 'landing' THEN 1 ELSE 0 END) AS landing,
MAX(CASE WHEN event_name = 'signup' THEN 1 ELSE 0 END) AS signup,
MAX(CASE WHEN event_name = 'trial_started' THEN 1 ELSE 0 END) AS trial,
MAX(CASE WHEN event_name = 'paid' THEN 1 ELSE 0 END) AS paid
FROM product_events
WHERE event_timestamp >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
)
SELECT
SUM(landing) AS landing_users,
SUM(signup) AS signup_users,
SUM(trial) AS trial_users,
SUM(paid) AS paid_users,
ROUND(100.0 * SUM(signup) / NULLIF(SUM(landing), 0), 2) AS landing_to_signup_pct,
ROUND(100.0 * SUM(trial) / NULLIF(SUM(landing), 0), 2) AS landing_to_trial_pct,
ROUND(100.0 * SUM(paid) / NULLIF(SUM(landing), 0), 2) AS landing_to_paid_pct
FROM user_funnel;The shape is straightforward. The GROUP BY user_id pivots the long event log into a wide row where each column is a Boolean for whether that user ever fired the event. The outer SUM collapses the wide table down to total user counts per stage. Dividing the deeper stage by the shallowest stage gives absolute conversion.
A subtle point worth narrating to an interviewer: MAX(CASE WHEN ...) instead of SUM(CASE WHEN ...) makes the per-user column a Boolean, so users who fired the event twice (a known issue in many event pipelines) do not get double-counted. With SUM, a duplicate emit on a redirect quietly pushes CR above 100% and someone catches it in review. The NULLIF(..., 0) guards against an empty top of funnel; on staging or early-launch products, a bare division produces a divide-by-zero error.
Relative conversion between steps
Relative conversion is the diagnostic view: the rate from one step to the immediately adjacent next step. A team can stare at "landing-to-paid is 4%" forever and never know whether to fix the landing page, the signup form, or the paywall. The relative table tells them.
WITH unique_events AS (
SELECT DISTINCT user_id, event_name
FROM product_events
WHERE event_timestamp >= CURRENT_DATE - INTERVAL '30 days'
),
step_counts AS (
SELECT
SUM(CASE WHEN event_name = 'landing' THEN 1 ELSE 0 END) AS landing_n,
SUM(CASE WHEN event_name = 'signup' THEN 1 ELSE 0 END) AS signup_n,
SUM(CASE WHEN event_name = 'trial_started' THEN 1 ELSE 0 END) AS trial_n,
SUM(CASE WHEN event_name = 'paid' THEN 1 ELSE 0 END) AS paid_n
FROM unique_events
)
SELECT
ROUND(100.0 * signup_n / NULLIF(landing_n, 0), 2) AS step_landing_to_signup_pct,
ROUND(100.0 * trial_n / NULLIF(signup_n, 0), 2) AS step_signup_to_trial_pct,
ROUND(100.0 * paid_n / NULLIF(trial_n, 0), 2) AS step_trial_to_paid_pct
FROM step_counts;Read this output like a chain of pipes. The step with the lowest percentage is the bottleneck. If step_landing_to_signup_pct is 12% while the others are above 60%, the landing page is failing. If step_trial_to_paid_pct is 6% while the others are above 50%, pricing or paywall is the problem. The relative view turns a single number into a triage diagram.
For strict-ordering funnels, swap DISTINCT user_id, event_name for a per-user pivot that captures MIN(event_timestamp) per event and gates each step on the previous step's timestamp being earlier. Product analysts at Uber and Airbnb default to the strict variant, because a row where the user paid before trial almost always signals a tracking bug or promo flow that should be filtered first.
Cohort segmentation by channel
The overall funnel is the average across every acquisition source, which makes it the answer to nobody's question. Paid social users behave nothing like referral users; branded search converts at multiples of display. A single funnel number is directionally correct but operationally useless for budget decisions.
WITH user_funnel AS (
SELECT
u.user_id,
u.utm_source AS channel,
MAX(CASE WHEN e.event_name = 'signup' THEN 1 ELSE 0 END) AS signup,
MAX(CASE WHEN e.event_name = 'trial_started' THEN 1 ELSE 0 END) AS trial,
MAX(CASE WHEN e.event_name = 'paid' THEN 1 ELSE 0 END) AS paid
FROM users u
LEFT JOIN product_events e
ON e.user_id = u.user_id
WHERE u.created_at >= CURRENT_DATE - INTERVAL '60 days'
GROUP BY u.user_id, u.utm_source
)
SELECT
channel,
COUNT(*) AS cohort_size,
ROUND(100.0 * SUM(signup) / COUNT(*), 2) AS signup_pct,
ROUND(100.0 * SUM(trial) / NULLIF(SUM(signup), 0), 2) AS trial_after_signup_pct,
ROUND(100.0 * SUM(paid) / NULLIF(SUM(trial), 0), 2) AS paid_after_trial_pct
FROM user_funnel
GROUP BY channel
HAVING COUNT(*) >= 100
ORDER BY paid_after_trial_pct DESC;The HAVING COUNT(*) >= 100 is not aesthetic. It is the minimum cohort size before percentage swings stop being noise. If a channel has 12 signups and 3 paid, the reported CR is 25% but the 95% confidence interval is wider than the y-axis. Hiring managers test for this — write the filter into the query and explain why.
A typical output for a US consumer SaaS product: referral programs land at the top with trial-to-paid above 30% because friends pre-qualify the user. Branded search lands second around 20%. Paid social comes in at 5 to 10%, display ads at 1 to 3%. Once that table is in front of a CMO, the conversation stops being "is paid working?" and starts being "what is the right blend across channels given each one's lifetime value?"
Common pitfalls
The most common pitfall is ignoring event ordering on a strict funnel. A user who fires paid without ever firing trial_started is either a tracking bug or a free-promotion flow that should be filtered out, and counting that user in the trial-to-paid denominator inflates the rate. The fix is to gate each step on the previous step's timestamp and alert on order violations so the upstream pipeline issue gets chased down.
The second pitfall is computing counts without DISTINCT user_id. Event pipelines often double-emit on retry or redirect, so a user firing signup at the form submit and again at the welcome page is counted twice. Use SELECT DISTINCT user_id, event_name in a subquery before counting, or use MAX(CASE WHEN ...) per user, which collapses duplicates by construction.
The third pitfall is mixing cohorts of different ages in one funnel. A user who signed up two days ago has not had time to convert to paid if your trial is 14 days. If you include them in the denominator alongside a user from two months ago, you understate the paid step. Anchor the funnel to a fixed cohort window (signed up at least 30 days ago) and report CR only for users who have had enough time to finish the journey.
The fourth pitfall is writing the entire funnel as one mega-query with eight nested CASE WHEN columns. It works for four steps and collapses around step seven, when reviewers cannot follow which condition gates which. The maintainable pattern is one CTE per step, each filtering on the previous step's user set, joined with LEFT JOIN against the next event.
The fifth pitfall is reporting only the percentage. A 50% step CR on 6 users is not the same metric as 50% on 60,000 users, but they look identical on a dashboard. Hiring managers at companies like Databricks and Snowflake will probe this in a take-home: show the cohort size in the same row as the rate, every time.
Optimization tips
For an event table with billions of rows, the bottleneck is the wide pivot in the first CTE. Partition product_events by event_timestamp::date so the 30-day filter becomes a partition scan, and cluster by user_id so per-user aggregation is local. On Snowflake or BigQuery this turns a full scan into a small range scan.
If the funnel runs daily, materialize an event_per_user_per_day rollup that incrementally upserts yesterday's pivoted rows into a slim table. Dashboards aggregate the rollup instead of the raw fact, cutting query time from minutes to seconds. For cohort breakdowns at scale, swap COUNT(DISTINCT user_id) for APPROX_COUNT_DISTINCT where available — accuracy is within a percent or two and runtime drops by one to two orders of magnitude.
Related reading
- SQL window functions interview questions
- How to calculate conversion window in SQL
- How to calculate customer journey in SQL
- How to calculate activation rate in SQL
If you want to drill product-analytics SQL questions like this every day, the NAILDD app is launching with 500+ SQL problems across exactly this pattern.
FAQ
Should I use strict ordering or loose qualification?
Strict ordering applies when the funnel is a UX flow and the user must physically pass each step to reach the next. Loose qualification fits marketing-style "did the user ever do X in the window" questions, where a user can arrive via multiple paths. Product onboarding and checkout funnels almost always want strict; marketing awareness-to-action funnels usually want loose. Document the choice in the metric definition so the next analyst does not flip it silently.
What time window should I pick between steps?
It depends on the product. A consumer SaaS with self-serve signup has signup-to-trial within minutes and trial-to-paid in 14 to 30 days, matching the trial length. A B2B product with sales involvement can take 30 to 90 days between trial and paid. Pick the window that covers 80 to 90% of historical conversions for that step, and label the funnel "30-day step CR" so nobody compares your number to a 90-day calculation from another team.
How do I handle branching funnels with alternate paths?
Treat each branch as its own funnel. If the flow is A -> B1 or B2 -> C, compute two independent funnels and report both. Merging branches into a single number obscures which path is the winner and forces the downstream debate back into qualitative claims. Hiring managers love this question because it tests whether the candidate insists on a clean answer instead of forcing the data into a single column.
How many funnel steps should I report?
Three to seven. Below three you lose diagnostic power. Above seven the table becomes unreadable and the percentages compound into single-digit numbers nobody can interpret. If the real flow has fifteen events, pick the five that matter for the business question and treat the rest as substeps you drill into when one of the main numbers moves.
What is the difference between step conversion and cohort retention?
Step conversion measures how a single cohort moves through a sequence of stages toward a one-time outcome like a first purchase. Cohort retention measures whether the same cohort comes back over time to repeat an action like opening the app weekly. They share the cohort-anchoring trick but answer different questions. Weak step CR is a marketing or onboarding problem; weak retention is a product-value problem.
How do I size the impact of fixing a step?
Multiply current step CR by upstream cohort size to get users who pass today. Apply an optimistic relative lift (10 to 30%) and recompute the downstream count. Multiply by average revenue per paid user for a dollar estimate. That back-of-envelope number justifies engineering work, and it is the answer interviewers want when they ask "how would you prioritize?"