Quantiles and percentiles
Contents:
What a percentile actually is
A percentile is the value below which a given share of your data sits. The 90th percentile of session length is the duration that 90% of your sessions are shorter than. The median is just the 50th percentile under a friendlier name.
Picture your PM at Stripe pinging you on a Friday: "what's the P99 checkout latency for the new flow, and is it within SLA". You cannot answer that with a mean. You need a single number that says "99% of checkouts complete in under X milliseconds", and that number is by definition the 99th percentile. Once you internalize that framing, percentiles stop being academic and start being the language of every service-level conversation.
The reason analysts lean on percentiles harder than averages is that real product data is rarely symmetric. Revenue per user has whales. Latency has tail events. Time-to-first-action has people who never act at all. Means hide this skew under a single misleading number; percentiles expose it.
Quantiles, quartiles, deciles, percentiles
These four words name the same idea sliced into different pieces. A quantile is the general concept — a cut point that divides data by share. Quartiles cut it into four parts (Q1, Q2, Q3). Deciles cut it into ten. Percentiles cut it into one hundred. The 50th percentile equals Q2 equals the fifth decile equals the median — pick the vocabulary that fits the audience.
Data: [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
Q1 (25th percentile) = 27.5
Median (50th percentile) = 55
Q3 (75th percentile) = 82.5
P90 = 91
P99 = 99.1One more term you will see in tickets: IQR, the interquartile range. It is just Q3 minus Q1, a measure of spread that ignores the tails. IQR sits at the heart of the Tukey rule for flagging outliers — anything below Q1 minus 1.5 IQR or above Q3 plus 1.5 IQR. That rule is what your box plot draws.
Q1 median Q3
| | |
----+-------+--------+----
25% 50% 75%
|<--- IQR --->|Computing percentiles in Python
NumPy is the workhorse. np.percentile accepts a percentile number from 0 to 100; np.quantile accepts the same input as a fraction from 0 to 1. Both interpolate by default, matching what Postgres calls PERCENTILE_CONT.
import numpy as np
data = [15, 20, 35, 40, 50, 60, 70, 80, 85, 95]
np.percentile(data, 50) # 55.0 — median
np.percentile(data, 25) # 36.25 — Q1
np.percentile(data, 75) # 78.75 — Q3
np.percentile(data, 90) # 86.5 — P90
# Several percentiles at once
np.percentile(data, [25, 50, 75, 90, 99])
# Quantile form (0 to 1)
np.quantile(data, 0.5) # 55.0
np.quantile(data, 0.25) # 36.25Pandas wraps the same logic. The quantile method takes one number or a list, and describe spits out Q1, median, and Q3 — the fastest way to eyeball a fresh dataset.
import pandas as pd
df = pd.DataFrame({
'user_id': range(1, 11),
'revenue': [100, 200, 500, 800, 1200, 2000, 3500, 5000, 8000, 50000]
})
df['revenue'].quantile(0.5) # median
df['revenue'].quantile(0.90) # P90
df['revenue'].quantile([0.25, 0.5, 0.75, 0.9])
df['revenue'].describe()
# count 10.0
# mean 7130.0
# std 15000+
# min 100
# 25% 575
# 50% 1600
# 75% 4625
# max 50000Group operations follow the same shape. If you want P90 revenue per acquisition channel, you group, then call quantile.
# P90 revenue per segment
df.groupby('segment')['revenue'].quantile(0.9)
# Several percentiles, one call
df.groupby('segment')['revenue'].describe(percentiles=[.25, .5, .75, .9, .95])Computing percentiles in SQL
In Postgres the canonical function is PERCENTILE_CONT(fraction) WITHIN GROUP (ORDER BY column). The same syntax works in Snowflake, Redshift, and BigQuery standard SQL.
-- Quartiles and tail latency in one pass
SELECT
PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY latency_ms) AS q1,
PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY latency_ms) AS median,
PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY latency_ms) AS q3,
PERCENTILE_CONT(0.90) WITHIN GROUP (ORDER BY latency_ms) AS p90,
PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY latency_ms) AS p99
FROM checkout_events;
-- Percentiles per segment
SELECT
segment,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY revenue) AS median_revenue
FROM users
GROUP BY segment;Postgres also ships PERCENTILE_DISC, which returns one of the actual values rather than interpolating. Use CONT for continuous metrics — revenue, latency, duration. Use DISC when interpolation makes no sense, like with discrete ratings.
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY val) FROM (VALUES (10), (20)) t(val);
-- 15.0 — interpolated
SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY val) FROM (VALUES (10), (20)) t(val);
-- 10 — nearest actual valueWindow functions cover the bucketing flavor. NTILE(N) assigns each row to one of N equal-size buckets — perfect for "give me the top decile of accounts by ARR". PERCENT_RANK() returns the percentile rank of every row when you want a continuous score rather than a bucket label.
SELECT user_id, revenue,
NTILE(4) OVER (ORDER BY revenue) AS quartile,
ROUND(PERCENT_RANK() OVER (ORDER BY revenue) * 100, 1) AS pct_rank
FROM users;ClickHouse uses a different family — quantile(0.5)(col) for an approximate median, quantileExact(0.5)(col) for an exact one, quantiles(0.25, 0.5, 0.75)(col) for several at once. MySQL before version 8 has no built-in function and forces a ROW_NUMBER() workaround.
Where percentiles show up in product analytics
The most common place is latency and SLA tracking. An average response time of 120 ms means almost nothing — one cron that hangs for ten seconds drags it up while the typical user is unaffected. A P50 of 80 ms with a P99 of 1800 ms tells the real story: most requests are snappy, but the top 1% are painfully slow. SLAs are written as percentile thresholds for exactly this reason.
response_times = [50, 80, 100, 120, 150, 200, 250, 500, 800, 2000]
p50 = np.percentile(response_times, 50)
p90 = np.percentile(response_times, 90)
p99 = np.percentile(response_times, 99)
print(f'P50: {p50} ms, P90: {p90} ms, P99: {p99} ms')The second hot spot is revenue segmentation. "Top 10% of customers by lifetime value" is a P90 cut, and you express it cleanly with a window function plus a CASE statement, or with NTILE if you also want the bottom slices.
SELECT
user_id,
total_revenue,
NTILE(10) OVER (ORDER BY total_revenue DESC) AS decile
FROM user_revenue;The first decile here is your top 10% — the accounts Stripe or Notion call out by name in QBR slides. Deciles two through five are the healthy middle; the bottom half is where churn risk lives.
The third use case is outlier detection. The Tukey rule uses Q1, Q3, and IQR to flag suspicious values, and it sits at the core of the standard box plot.
q1 = df['revenue'].quantile(0.25)
q3 = df['revenue'].quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = df[(df['revenue'] < lower) | (df['revenue'] > upper)]
print(f'Outliers flagged: {len(outliers)}')Percentile vs mean
The mean answers "what would each user get if I divided the pie equally". The median answers "what does a typical user actually see". Those are different questions, and the second one is almost always what a stakeholder really wants when they ask for a number.
| Metric | Mean | Median (P50) | P90 / P99 |
|---|---|---|---|
| Outlier-robust | No | Yes | Yes |
| Describes | The pooled total | The typical case | The tail |
| Use when | Distribution is roughly normal | Distribution is skewed | SLAs, worst-case planning |
For a real-world example, ship a pricing experiment at Airbnb. 80% of guests spend $80-$300 per booking; 5% spend over $5,000. The mean lands near $450; the median sits closer to $180. Reporting the mean makes the average guest sound four times richer than they are; the median tells the truth.
Common pitfalls
The most frequent mistake is using the mean for a skewed distribution and treating it as the typical value. Revenue, session length, time-to-purchase, latency — all have fat tails, and the mean is dominated by the upper end. Lead with the median and only report the mean alongside when the audience specifically wants the total-per-user implication.
The second trap is confusing "the 90th percentile" with "the top 10%". P90 is a value — a single number at the boundary. The top 10% is a set of rows — every observation above that boundary. When a stakeholder asks for "P90 revenue", clarify which they mean. Half the time they want the number P90 itself; half the time they want the average revenue of the top decile, which is a different metric.
The third pitfall lives in SQL. People reach for AVG when computing tail metrics because the syntax is shorter, then they pull a number that lies about latency. Whenever the metric you care about is a tail, use a percentile function — tail regressions hide perfectly inside average ms numbers and only surface when a customer escalates.
The fourth is mixing interpolation methods between tools. NumPy, Pandas, Postgres PERCENTILE_CONT, and ClickHouse quantile all interpolate slightly differently in edge cases. If you compute P90 in Python and the dashboard later shows a different number from Snowflake, that gap is almost always the interpolation rule, not a real data drift. Pick one tool as canonical for any given metric and document it.
The fifth is sampling-driven instability for high percentiles. A reliable P99 needs roughly one hundred times more data than a reliable P50, because by definition only 1% of your sample contributes. For a stable P99 you want tens of thousands of observations at minimum, or switch to a parametric model of the tail.
Interview questions
What is the median and how does it differ from the mean? The median is the 50th percentile — the value with half the data below it. It is robust to outliers; a single extreme observation barely moves it. The mean is sensitive to extremes; one whale shifts it noticeably. For skewed distributions like revenue or latency, the median better describes the typical observation while the mean better describes the pooled total.
What is IQR and how do you use it to flag outliers? IQR is Q3 minus Q1 — the spread of the middle 50% of the data. The Tukey rule labels anything below Q1 minus 1.5 times IQR or above Q3 plus 1.5 times IQR as an outlier. It draws the whiskers on a box plot, and because it relies on quartiles, it stays robust to the very outliers it is trying to detect.
Why does P99 matter for SLAs? P99 captures tail behavior that the mean hides. A service can have an average response time of 100 ms while still serving the slowest 1% of users in five seconds — and that 1% is who escalates to support. SLAs are written as percentile thresholds because they bound the experience of every user, not the average user.
How do you compute a median in SQL? In Postgres, Snowflake, BigQuery, and Redshift use PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY column). In older MySQL you use a self-join with ROW_NUMBER() to find the middle row. In ClickHouse it is quantile(0.5)(column) for an approximate answer or quantileExact(0.5)(column) for a precise one.
Related reading
- How to calculate IQR in SQL
- How to calculate IQR outliers in SQL
- Tail latency percentiles in SQL
- How to build customer segments in SQL
- SQL window functions interview questions
If you want to drill percentile questions like these every day under interview pressure, NAILDD is launching with hundreds of SQL and statistics problems built around exactly this pattern.
FAQ
Are quartiles, deciles, and percentiles the same thing?
They are all quantiles — the same idea cut into different numbers of pieces. Quartiles split the data into four parts (Q1, Q2, Q3). Deciles split it into ten. Percentiles split it into one hundred. The 25th percentile equals Q1, the 50th percentile equals the median equals Q2 equals the fifth decile, and so on.
How do I pick between PERCENTILE_CONT and PERCENTILE_DISC?
PERCENTILE_CONT interpolates between adjacent values, which is what you want for continuous metrics like revenue, latency, or duration. PERCENTILE_DISC returns one of the actual values in the data, which is what you want for discrete metrics like star ratings or integer counts where an interpolated value would not make physical sense.
How do I compute percentiles in ClickHouse?
ClickHouse uses the quantile family rather than SQL standard PERCENTILE_CONT. quantile(0.5)(revenue) returns an approximate median using a reservoir-sampling sketch. quantileExact(0.5)(revenue) returns the exact median but is slower on large tables. quantiles(0.25, 0.5, 0.75, 0.9, 0.99)(revenue) returns several percentiles in one tuple.
Why is my P99 so unstable between runs?
Because by definition only 1% of your sample feeds into the P99 estimate. With a thousand rows, ten rows determine the answer, and the next ten give a different number. Stable P99 estimates need tens of thousands of observations at minimum — or bootstrap a confidence interval around the percentile instead.
Can I use percentiles in an A/B test?
Yes, and you often should. When the metric of interest is a tail — latency, basket size in a long-tailed market, time-to-completion — comparing means hides exactly the effect you are testing for. The two cleanest ways to test a percentile are a quantile regression on the experiment data or a bootstrap of the percentile difference between groups. Both produce a confidence interval you can act on.