How to calculate ARPPU in SQL
Contents:
What ARPPU is and why it matters
ARPPU — Average Revenue Per Paying User — is the metric you reach for when you stop caring about the funnel and start caring about the people who paid. ARPU spreads revenue across every user, free riders included, and dilutes everything by the paying rate. ARPPU strips that dilution out. It is the average ticket of buyers, and it is the number a pricing manager at Stripe, a packaging team at Notion, or a monetization lead at DoorDash quotes when the question is "how much does a real customer spend with us?".
ARPU and ARPPU move for different reasons. If a Linear PM says monetization "feels slow", ARPU could be flat because paying rate dropped while the ticket grew, or because more users converted but each paid less. The decomposition only works when ARPPU sits cleanly next to ARPU.
This post walks through the SQL end to end: monthly base, country and channel breakdowns, cohort ARPPU, median and percentile views, and the net-of-refunds variant finance signs off on. Every snippet runs in Postgres with minor edits for Snowflake, BigQuery, or Redshift.
Formulas: ARPPU, ARPU, and how they connect
The base definition is one line. Total revenue in the period, divided by the count of distinct users who paid.
ARPPU = Revenue / Paying UsersA "paying user" is anyone with at least one paid transaction in the window. Five purchases by the same buyer count once — the denominator is users, not transactions. The link to ARPU is what makes both metrics useful together.
ARPU = Paying Rate × ARPPUIf ARPU drops, only two things can be true: fewer people paid, or the people who paid spent less. ARPPU isolates the second case. ARPPU stable but ARPU down means a conversion problem; ARPPU down while paying rate holds means pricing, discounting, or mix.
The data model we will use
For every example below, assume three boring tables — a users dimension, a transactions fact, and an activity stream.
users (user_id, registered_at, country, channel, platform)
transactions (user_id, amount, status, created_at, refund_amount)
activity (user_id, date)transactions.status includes at least paid and refunded. amount is the gross transaction amount in USD. refund_amount is non-null when a refund has been issued. activity.date is one row per user per active day — the standard DAU/MAU shape.
Seven SQL queries you will actually ship
1. Monthly ARPPU
Sum paid revenue, count distinct payers, divide. Trailing twelve months gives enough range for seasonality.
SELECT
DATE_TRUNC('month', created_at)::DATE AS month,
SUM(amount)::NUMERIC AS revenue,
COUNT(DISTINCT user_id) AS paying_users,
SUM(amount)::NUMERIC / NULLIF(COUNT(DISTINCT user_id), 0) AS arppu
FROM transactions
WHERE status = 'paid'
AND amount > 0
AND created_at >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY 1
ORDER BY 1;The amount > 0 filter strips zero-dollar rows that leak from trials, partner orders, or test data. The NULLIF is defensive: a month with zero payers returns null instead of crashing the dashboard.
2. ARPU vs ARPPU side by side
Paying rate, ARPU, ARPPU on one row, for one month — the decomposition I would put on a metric-review dashboard.
WITH revenue AS (
SELECT SUM(amount)::NUMERIC AS total_revenue
FROM transactions
WHERE status = 'paid'
AND created_at >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
AND created_at < DATE_TRUNC('month', CURRENT_DATE)
),
all_users AS (
SELECT COUNT(DISTINCT user_id) AS cnt
FROM activity
WHERE DATE >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
AND DATE < DATE_TRUNC('month', CURRENT_DATE)
),
paying AS (
SELECT COUNT(DISTINCT user_id) AS cnt
FROM transactions
WHERE status = 'paid'
AND created_at >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
AND created_at < DATE_TRUNC('month', CURRENT_DATE)
)
SELECT
r.total_revenue,
a.cnt AS all_users,
p.cnt AS paying_users,
r.total_revenue / NULLIF(a.cnt, 0) AS arpu,
r.total_revenue / NULLIF(p.cnt, 0) AS arppu,
p.cnt::NUMERIC * 100 / NULLIF(a.cnt, 0) AS paying_pct
FROM revenue r, all_users a, paying p;When the metric review starts and ARPU is off, glance at the three columns and you know whether the conversation should go to pricing or to conversion. The same query reveals when ARPPU and ARPU disagree on direction — a warning that a mix shift is hiding inside the aggregate.
3. ARPPU by country
The breakdown finance and pricing ask for first. Purchasing power, payment friction, and plan availability all vary by country, and ARPPU catches all three.
SELECT
u.country,
COUNT(DISTINCT t.user_id) AS paying_users,
SUM(t.amount)::NUMERIC AS revenue,
SUM(t.amount)::NUMERIC / NULLIF(COUNT(DISTINCT t.user_id), 0) AS arppu
FROM transactions t
JOIN users u ON u.user_id = t.user_id
WHERE t.status = 'paid'
AND t.created_at >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY u.country
ORDER BY arppu DESC;A consistent gap — US ARPPU three times Brazilian ARPPU at the same tier — usually signals untapped pricing room in the low-ARPPU market. A sudden change in one country, with nothing else moved, usually signals a tax, currency, or local-payment regression worth investigating before the next forecast.
4. ARPPU by acquisition channel
Marketing wants this slice when arguing about budget. Anchor it on registration month so younger users do not bias the comparison.
SELECT
u.channel,
COUNT(DISTINCT t.user_id) AS paying_users,
SUM(t.amount)::NUMERIC / NULLIF(COUNT(DISTINCT t.user_id), 0) AS arppu
FROM transactions t
JOIN users u ON u.user_id = t.user_id
WHERE t.status = 'paid'
AND t.created_at >= CURRENT_DATE - INTERVAL '90 days'
AND u.registered_at < CURRENT_DATE - INTERVAL '90 days'
GROUP BY u.channel
ORDER BY arppu DESC;Without the cutoff, the chart will say the newest paid channel has the lowest ARPPU — but you are measuring user maturity, not channel quality. With it, the channel-mix discussion stays honest.
5. Cohort ARPPU by acquisition month
Cohorts answer the lifetime version: how much does a paying user from this acquisition month end up spending with us? The query keys on users.registered_at so the cohort is acquisition, not first-paid.
SELECT
DATE_TRUNC('month', u.registered_at)::DATE AS cohort,
SUM(t.amount)::NUMERIC AS cohort_revenue,
COUNT(DISTINCT t.user_id) AS cohort_paying,
SUM(t.amount)::NUMERIC / NULLIF(COUNT(DISTINCT t.user_id), 0) AS arppu
FROM transactions t
JOIN users u ON u.user_id = t.user_id
WHERE t.status = 'paid'
AND u.registered_at >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY 1
ORDER BY 1;In a healthy subscription product, older cohorts accumulate more revenue per paying user — that is the curve a board wants to see. Flat or downward cohort ARPPU is a retention or monetization problem hiding under aggregate growth.
6. Median and percentile ARPPU
The mean lies whenever the spend distribution is heavy-tailed, and consumer products almost always are. A small share of whales drags the average up and hides the typical buyer. Median plus percentiles fixes that.
SELECT
DATE_TRUNC('month', created_at)::DATE AS month,
AVG(user_total) AS arppu_mean,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY user_total) AS arppu_median,
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY user_total) AS arppu_p90,
PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY user_total) AS arppu_p99
FROM (
SELECT
user_id,
DATE_TRUNC('month', created_at) AS created_at,
SUM(amount) AS user_total
FROM transactions
WHERE status = 'paid'
AND amount > 0
AND created_at >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY user_id, DATE_TRUNC('month', created_at)
) per_user
GROUP BY 1
ORDER BY 1;The gap between mean and median is the diagnostic. A two-times gap is normal for a product with strong power users; ten-times is a warning that headline ARPPU is driven by a handful of accounts whose behavior is not generalizable. When you see that, report median alongside the mean.
7. Net ARPPU after refunds
Gross ARPPU lies in any category with material refunds — fashion, certain electronics, anything with a generous return policy. The corrected version subtracts refund_amount before dividing.
SELECT
DATE_TRUNC('month', created_at)::DATE AS month,
SUM(amount - COALESCE(refund_amount, 0))::NUMERIC AS net_revenue,
COUNT(DISTINCT user_id) AS paying_users,
SUM(amount - COALESCE(refund_amount, 0))::NUMERIC
/ NULLIF(COUNT(DISTINCT user_id), 0) AS net_arppu
FROM transactions
WHERE status IN ('paid', 'refunded')
AND created_at >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY 1
ORDER BY 1;Reporting gross and net side by side is the practice that survives audit. Gross is the catalog story; net is the cash that arrived. Finance signs off on net.
Common pitfalls
The first trap is confusing ARPU with ARPPU and presenting one as the other. ARPU includes every user; ARPPU restricts to payers and is always larger, often by an order of magnitude. A pricing review built on ARPU when the answer was ARPPU produces the wrong recommendation, because what a buyer actually spends gets drowned in zeros. Every dashboard tile should label which one it is.
The second trap is letting refunds inflate the number. If your query sums amount and ignores refund_amount, ARPPU is structurally overstated. The same logic applies to credits, chargebacks, and post-purchase discounts. Keep gross and net side by side and default to net for external reports.
The third trap is the period mismatch. Monthly, quarterly, and lifetime ARPPU are three different numbers and cannot be compared without normalizing. A user who paid once in January contributes the full January ARPPU but is not necessarily in February. Pick a window, document it, and avoid mixing windows on the same axis.
The fourth trap is multi-product attribution. A user paying for Plan A and Plan B is one paying user, not two. Most warehouses get this right because COUNT(DISTINCT user_id) collapses across products. The bug appears when an analyst joins to a product table and double-counts — compare paying-user counts with and without the join, and if they differ, the join is fanning the rows.
The fifth trap is outliers. Power users distort the mean. One percent of customers paying fifty percent of revenue is a real pattern, and the mean ARPPU on that distribution is not a useful estimate of the typical buyer. Publish median and ninetieth-percentile ARPPU alongside the mean, and investigate any month where the gap widens.
Optimization tips
ARPPU queries are usually cheap because the transactions table is smaller than events. Past a few hundred million rows, three habits matter. First, partition transactions on created_at with monthly partitions — every ARPPU query is range-bounded by date, so pruning collapses the scan to two months of rows.
Second, materialize the per-user monthly aggregate. The percentile query is the costliest pattern because it computes a per-user sum before any percentile can run. A nightly user_month_revenue table turns that workload into a cheap lookup.
Third, watch COUNT(DISTINCT user_id) cost. On very large tables, swap to APPROX_COUNT_DISTINCT in Snowflake, HLL_COUNT.MERGE in BigQuery, or the HLL extension in Postgres. Keep exact counts for finance-grade reporting only.
A Python sanity check
When a SQL number looks weird, recompute it from a pandas dataframe and compare.
paid = transactions.loc[transactions["status"] == "paid"]
revenue = paid["amount"].sum()
arppu = revenue / max(paid["user_id"].nunique(), 1)If pandas and SQL disagree, the bug is almost always in the SQL — a missed status = 'paid' filter, a timezone-shifted date range, a join fanout, or integer division on cents.
Related reading
- How to calculate ARPU in SQL — the wider-denominator sibling, with the decomposition into paying rate and ARPPU spelled out.
- SQL window functions interview questions —
LAG, percentile, and frame patterns that show up next to ARPPU on every real dashboard. - A/B testing peeking mistake — what to watch when ARPPU is the success metric of an experiment.
If you want to drill SQL questions like this every day, NAILDD is launching with hundreds of analytics SQL problems built around this pattern — revenue, paying-user denominators, cohort views, and percentiles.
FAQ
What is the difference between ARPU and ARPPU?
ARPU divides revenue by the entire user base — free, trial, lapsed, every signup in the period. ARPPU divides by paying users only and is therefore always larger, often by ten to fifty times in freemium. ARPU tells you how monetization is trending across the funnel; ARPPU tells you the average ticket of buyers. Both belong on the same dashboard because they answer different questions.
Should I use mean or median ARPPU?
Both, on the same chart. Consumer products almost always have heavy-tailed revenue: a small share of buyers contributes a disproportionate share. The mean is sensitive to that tail; the median is not. When the mean is much larger than the median, headline ARPPU is being pulled up by whales. Median plus ninetieth percentile gives the audience enough to interpret both the typical buyer and the tail.
ARPPU is up — is that good?
Usually yes, but verify the paying rate first. If ARPPU rose because existing buyers spend more, monetization is healthy. If ARPPU rose because marginal buyers stopped converting and only the high-spending segment remained, total revenue can still be falling. Put ARPPU, paying rate, and total revenue on the same chart — if all three are up, celebrate; if ARPPU rises while the others fall, you have a survivorship problem.
ARPPU is stable but ARPU is rising — what does that mean?
The textbook signal of a paying-rate improvement: more users converting at the same average ticket. Exactly what monetization teams hope for, because it grows revenue without price increases. Confirm by computing paying rate explicitly.
Can ARPPU be used as a proxy for LTV?
For a single cohort over a long enough window, yes — cumulative cohort ARPPU is essentially historical LTV per paying user. The shortcut breaks when the window is too short for the full payment curve, or when churn is non-stationary. Use ARPPU for monthly buyer spend, LTV for lifetime signup value.