May 17, 2026·14 min read

How to calculate ARPU in SQL

Sharpen SQL for your next interview

500+ SQL problems with worked solutions — joins, window functions, CTEs.

Contents:

What ARPU is and why analysts get it wrong
Formulas: ARPU, ARPPU, and the decomposition
The data model we will use
Eight SQL queries you will actually ship
Common pitfalls
Optimization tips
A Python sanity check
Related reading
FAQ

What ARPU is and why analysts get it wrong

ARPU — Average Revenue Per User — is the metric that quietly sits at the bottom of almost every monetization deck. Unit economics start with it, LTV is built on it, and investor updates lean on it whenever someone wants to show that "users are getting more valuable over time." Stripe, Netflix, DoorDash and every consumer subscription business publish some version of it. So when a PM at Notion pings you on Slack on a Sunday night with "can you pull ARPU by channel before standup?", the answer needs to be a clean SQL query, not a debate about definitions.

The trap is the denominator. The formula looks like a one-liner — revenue divided by users — but the word "users" hides three completely different cohorts: everyone who ever signed up, everyone who was active last month, and everyone who actually paid. Pick the wrong one and ARPU swings by a factor of five. I have seen growth dashboards at fintechs and marketplaces show three different ARPUs in the same week, all technically correct, all answering different questions.

This post walks through the SQL you need to compute ARPU and ARPPU end to end: the base monthly query, the decomposition into paying rate and average ticket, breakdowns by channel and platform, cohort ARPU, month-over-month dynamics, and the net-of-refunds version that finance actually cares about. Every query is runnable in Postgres with minor edits for Snowflake, BigQuery, or Redshift.

Formulas: ARPU, ARPPU, and the decomposition

The base definitions are simple. ARPU is total revenue divided by the total user base in the same period. ARPPU — Average Revenue Per Paying User — uses only paying users in the denominator.

ARPU  = Revenue / Total Users
ARPPU = Revenue / Paying Users

The relationship between the two is the most useful fact in the room when ARPU moves and nobody knows why. Paying rate is the share of users who actually paid in the period, and ARPU equals paying rate times ARPPU.

ARPU = Paying Rate × ARPPU

If ARPU drops, only two things can be happening: fewer people paid, or the people who paid paid less. Decomposing the metric tells you which one — and the answer drives completely different actions. A drop in paying rate usually points at conversion, onboarding, or a paywall change. A drop in ARPPU usually points at pricing, discounting, or mix. You want this decomposition in your back pocket the moment a metric review starts.

The data model we will use

For every example below, assume three tables. They roughly match what you would find in a healthy product warehouse — a users dimension, an orders fact, and an events stream.

users   (user_id, registered_at, channel, platform)
orders  (user_id, amount, status, created_at, refund_amount)
events  (user_id, event, event_at)

orders.status includes at least paid and refunded. amount is the gross order amount in USD. refund_amount is non-null when a refund has been issued. Dates are timestamps. The schema is intentionally boring — the queries should translate to whatever you have in production with minimal edits.

Eight SQL queries you will actually ship

1. ARPU for a single month

The simplest version. Sum revenue from paid orders inside the month, divide by the count of users who existed before the period ended.

SELECT
    SUM(o.amount)::NUMERIC                                         AS revenue,
    COUNT(DISTINCT u.user_id)                                      AS total_users,
    SUM(o.amount)::NUMERIC / NULLIF(COUNT(DISTINCT u.user_id), 0)  AS arpu
FROM users u
LEFT JOIN orders o
       ON o.user_id = u.user_id
      AND o.status  = 'paid'
      AND o.created_at >= '2026-04-01'
      AND o.created_at <  '2026-05-01'
WHERE u.registered_at < '2026-05-01';

Two details are doing a lot of work here. The LEFT JOIN keeps users without orders inside the denominator — exactly what ARPU means. The NULLIF(..., 0) is defensive: if you ever run this query for a period with no users (a brand-new market, a feature flag rollout, an analytics gap), it returns null instead of crashing.

2. ARPPU for the same month

The denominator now restricts to paying users only. Notice how the math collapses to a single table.

SELECT
    SUM(amount)::NUMERIC                                        AS revenue,
    COUNT(DISTINCT user_id)                                     AS paying_users,
    SUM(amount)::NUMERIC / NULLIF(COUNT(DISTINCT user_id), 0)   AS arppu
FROM orders
WHERE status = 'paid'
  AND created_at >= '2026-04-01'
  AND created_at <  '2026-05-01';

ARPPU is almost always materially higher than ARPU. It measures the average ticket of buyers and is not diluted by the zeros from non-paying users. If your product has a paying rate of 5 percent and ARPPU of 40 USD, your ARPU will land around 2 USD — the gap is exactly the paying rate.

3. The decomposition in one query

This is the query I would put on a metric review dashboard. ARPU, paying rate, and ARPPU on one row, by month.

WITH monthly AS (
    SELECT
        DATE_TRUNC('month', created_at)::DATE                        AS month,
        COUNT(DISTINCT user_id)                                      AS total_users,
        COUNT(DISTINCT user_id) FILTER (WHERE status = 'paid')       AS paying_users,
        SUM(amount) FILTER (WHERE status = 'paid')                   AS revenue
    FROM orders
    GROUP BY 1
)
SELECT
    month,
    revenue,
    total_users,
    paying_users,
    revenue::NUMERIC / NULLIF(total_users, 0)            AS arpu,
    paying_users::NUMERIC / NULLIF(total_users, 0)       AS paying_rate,
    revenue::NUMERIC / NULLIF(paying_users, 0)           AS arppu
FROM monthly
ORDER BY month;

When ARPU drops, you glance at the next two columns and you already know whether the conversation should be with the conversion team or with pricing.

4. ARPU by acquisition channel

Marketing teams live and die on this breakdown. Fix the registration date so newer users do not bias the comparison.

SELECT
    u.channel,
    SUM(o.amount)::NUMERIC / NULLIF(COUNT(DISTINCT u.user_id), 0)   AS arpu
FROM users u
LEFT JOIN orders o
       ON o.user_id = u.user_id
      AND o.status  = 'paid'
      AND o.created_at >= '2026-04-01'
      AND o.created_at <  '2026-05-01'
WHERE u.registered_at < '2026-04-01'
GROUP BY u.channel
ORDER BY arpu DESC;

Two patterns show up reliably across consumer products. Organic traffic posts a higher ARPU than paid acquisition. And mature cohorts beat fresh ones — which is why pinning the registration cutoff matters. Without it, you are measuring user maturity, not channel quality.

5. ARPU by platform

Almost the same query, swapping channel for platform. iOS, Android, and web typically pay differently — partly because of audience, partly because of payment friction, partly because of platform fees.

SELECT
    u.platform,
    SUM(o.amount)::NUMERIC / NULLIF(COUNT(DISTINCT u.user_id), 0)   AS arpu
FROM users u
LEFT JOIN orders o
       ON o.user_id = u.user_id
      AND o.status  = 'paid'
      AND o.created_at >= '2026-04-01'
      AND o.created_at <  '2026-05-01'
GROUP BY u.platform
ORDER BY arpu DESC;

A subtle point worth remembering: if you book gross revenue in orders.amount but the App Store and Play Store skim 15–30 percent, your platform ARPUs are pre-fee. For honest comparisons, also produce a net-of-fees version.

6. Cohort ARPU

Cohorts answer "how much does a user from this acquisition month end up paying us, in total?". The query below uses first-paid-month as the cohort definition; an alternative is registration month, which captures the lifetime of every signup.

WITH cohort AS (
    SELECT
        user_id,
        DATE_TRUNC('month', MIN(created_at))::DATE AS cohort_month
    FROM orders
    WHERE status = 'paid'
    GROUP BY user_id
)
SELECT
    c.cohort_month,
    SUM(o.amount)::NUMERIC / NULLIF(COUNT(DISTINCT c.user_id), 0)   AS arpu_cohort
FROM cohort c
LEFT JOIN orders o
       ON o.user_id = c.user_id
      AND o.status  = 'paid'
GROUP BY c.cohort_month
ORDER BY c.cohort_month;

In a healthy product, older cohorts accumulate more revenue per user than younger ones — that is exactly the curve you want to show a board. A flat or downward curve is a retention or monetization problem.

7. Month-over-month ARPU dynamics

Window functions earn their keep here. LAG pulls the previous month's value into the same row so the percent change is one expression away.

WITH monthly_arpu AS (
    SELECT
        DATE_TRUNC('month', o.created_at)::DATE                          AS month,
        SUM(o.amount)::NUMERIC / NULLIF(COUNT(DISTINCT u.user_id), 0)    AS arpu
    FROM orders o
    JOIN users u USING (user_id)
    WHERE o.status = 'paid'
    GROUP BY 1
)
SELECT
    month,
    arpu,
    LAG(arpu) OVER (ORDER BY month)                                AS prev_arpu,
    ROUND(100.0 * (arpu - LAG(arpu) OVER (ORDER BY month))
                / NULLIF(LAG(arpu) OVER (ORDER BY month), 0), 2)   AS mom_pct
FROM monthly_arpu
ORDER BY month;

This is the query I reach for when a PM says "ARPU is off this month" and I need to confirm by how much before opening the BI tool.

8. Net ARPU after refunds

Gross ARPU lies if refunds are non-trivial. The corrected version subtracts refund_amount from gross order revenue before dividing.

SELECT
    DATE_TRUNC('month', o.created_at)::DATE                                 AS month,
    SUM(o.amount - COALESCE(o.refund_amount, 0))::NUMERIC
    / NULLIF(COUNT(DISTINCT u.user_id), 0)                                  AS net_arpu
FROM orders o
JOIN users u USING (user_id)
WHERE o.status = 'paid'
GROUP BY 1
ORDER BY 1;

Categories with high return rates — fashion, certain electronics, anything with a generous policy — show large gaps between gross and net ARPU. Finance only signs off on the net version.

Sharpen SQL for your next interview

500+ SQL problems with worked solutions — joins, window functions, CTEs.

Join the waitlist

Common pitfalls

The first trap is integer division. In Postgres, SUM(int) / COUNT(int) quietly evaluates as integer division and produces a number rounded toward zero — so an ARPU of 4.7 silently becomes 4. The fix is to cast one side of the division to NUMERIC (::NUMERIC) or multiply by 1.0. BigQuery and Snowflake are more forgiving, but the bug still hides in dashboards that span databases. Whenever a metric looks suspiciously round, this is the first thing to check.

The second trap is the denominator drift. "All registered users", "monthly actives", and "paying users" are three different populations, and the same business on the same data will produce three different ARPUs depending on which one you pick. Worse, dashboards built by different teams often pick differently without saying so. The cure is to write the definition into the metric documentation — denominator name, exact filter, and the rationale — and reference it from every dashboard. If you cannot answer "who is in the denominator?" in one sentence, the metric is undefined.

The third trap is mix effects. ARPU can drop without any single segment getting worse — the composition just shifts. If a paid acquisition push brings in a wave of low-ARPU users, the blended metric falls even though every channel held its own. The diagnostic is always the same: produce ARPU by channel, platform, country, and cohort, then check whether the segments are stable or whether one of them is dragging the average. Aggregate metrics without a breakdown are a Simpson's paradox waiting to happen.

The fourth trap is refunds and discounts. If your query sums amount across paid orders, refunds never enter the calculation and the metric is structurally inflated. The same applies to discounts and credits issued after the fact. The clean fix is to compute net revenue (amount - refund_amount - credits) and keep both gross and net ARPU on the dashboard. Gross tells you the catalog price story; net tells you what actually hit the bank.

The fifth trap is conflating ARPU with LTV. ARPU is revenue per user for a single period — usually a month. LTV is the total revenue a user produces across their entire relationship. For subscription products the relationship is roughly LTV ≈ ARPU / churn, but this is a rough approximation that breaks the moment cohort revenue is non-flat. If the question is "how much is a new signup worth?", LTV is the metric. If the question is "how is monetization trending?", ARPU is the metric.

Optimization tips

Most ARPU queries do not need exotic tuning, but the moment your orders table crosses a few hundred million rows, three things start to matter. First, partition the orders table by created_at (monthly partitions are the standard). Every ARPU query is naturally range-bounded by date, so partition pruning collapses the scan from "the whole table" to "two months of data" without changing the SQL.

Second, materialize the monthly aggregate. The decomposition query in section 3 is the perfect candidate for a materialized view refreshed nightly — it is read constantly, it is identical across most dashboards, and recomputing it from scratch on every load is wasteful. In Snowflake or BigQuery, the same logic lives in a scheduled query writing to a small reporting table.

Third, watch the COUNT(DISTINCT user_id) cost. On very large tables, exact distinct counts are expensive; for trending dashboards where two-digit precision is enough, swap to APPROX_COUNT_DISTINCT (Snowflake), HLL_COUNT.MERGE (BigQuery), or count_distinct with HLL (Postgres extension). For finance-grade reporting you keep the exact count; for product analytics, approximate is almost always fine.

A Python sanity check

Whenever a SQL number looks weird, the fastest debugging move is to recompute it from a pandas dataframe and compare. Two lines is enough for ARPU.

revenue = orders.loc[orders["status"] == "paid", "amount"].sum()
total_users = users["user_id"].nunique()
arpu = revenue / max(total_users, 1)

If pandas and SQL disagree, the bug is almost always in the SQL — usually a missed join condition, a timezone-shifted date filter, or the integer division trap from above.

SQL window functions interview questions — the LAG pattern from the MoM query, plus a dozen more.
A/B testing peeking mistake — what to watch for when ARPU is the success metric of an experiment.
Why are you leaving your job? — the behavioral counterpart to the technical screen these queries show up in.

If you want to drill SQL questions like this every day, NAILDD is launching with hundreds of analytics SQL problems built exactly around this pattern — revenue, cohorts, decomposition, window functions.

FAQ

Should I report ARPU or ARPPU to a PM?

Usually both. ARPU answers "how much revenue does the average user generate", which is the right number for unit economics, marketing efficiency, and investor decks. ARPPU answers "what is the average ticket size of buyers", which is the right number for pricing, packaging, and discount decisions. Reporting one without the other hides the decomposition — and the decomposition is exactly where most of the interesting questions live.

How do I compute ARPU for one-off purchases instead of subscriptions?

The mechanics are identical: revenue in the period divided by users in the same period. A user with five purchases still counts once in the denominator — that is what COUNT(DISTINCT user_id) enforces. The only adjustment is what counts as "the period": for subscription products you typically use a calendar month; for marketplaces with bursty buying, you may prefer a rolling 30-day window so weekend spikes do not skew the picture.

ARPU is up but order count is down — is that good?

Sometimes. If active users are paying more per head, that is healthy growth. If the user base shrank and only the high-spending tail survived, total revenue can still be down even though the average looks better. Always check ARPU and total revenue together — and ideally paying rate and ARPPU too. A metric that improves while the underlying business shrinks is the most dangerous kind of false positive.

Do I need to strip out discounts and promotions from ARPU?

It is wise to track both gross and net ARPU and to show them side by side. Gross ARPU reflects catalog pricing and is useful for pricing experiments. Net ARPU reflects the cash that actually arrived after refunds, credits, and discounts, and is the version finance, investors, and forecasting models care about. If you can only have one, default to net — overstating revenue is the failure mode you most want to avoid.

How do you compare ARPU across channels fairly?

Fix the cohort. The most common mistake is comparing channels using all-time users on each side: paid channels usually have younger users than organic, and younger users have less accumulated spend. Anchor every channel comparison to a fixed registration window (for example, "users who registered before April 1") so that channel quality, not cohort age, is what shows up in the chart.

What about ARPU for free users vs paying users in a freemium product?

This is exactly what ARPU and ARPPU together solve. ARPU measures the whole base — free and paid combined — and is the number that connects to acquisition cost. ARPPU measures the converted slice only, and is the number that connects to pricing. The ratio between them is your paying rate. A freemium product with a 3 percent paying rate and a 30 USD ARPPU has roughly 0.9 USD ARPU; if the paying rate climbs to 5 percent at the same ARPPU, ARPU jumps by two-thirds without anyone paying a dollar more. That asymmetry is why monetization teams obsess over the paying-rate lever.