Days since last login in SQL
Contents:
Why days-since-last-login matters
It's a Tuesday afternoon and your growth PM pings you on Slack: "give me the list of users about to churn by tomorrow — the lifecycle team ships the re-engagement campaign Thursday". That request hides a per-user pre-churn signal that every product dashboard at Netflix, DoorDash, Linear, Notion, Airbnb, and Stripe computes the same way: days since the user last logged in. It's one of the simplest scoring inputs in the analyst's toolbox and one of the highest-signal predictors of whether a customer will still be active 30 days from now.
It works because engagement decays geometrically. A user who logged in yesterday will overwhelmingly log in again this week. A user whose last login was 45 days ago has roughly the same return probability as a brand-new visitor — low. Crossing 14 days doubles conditional churn probability on most consumer products; crossing 30 roughly triples it. Thresholds vary, but the curve shape is universal enough that every lifecycle program segments on it.
The trick is computing it correctly. The headline definition hides four choices: what counts as a login, which time zone the day belongs to, what to do with never-logged-in users, and how to handle multi-device sessions. This post walks through five SQL recipes built around the same logins table — basic per-user query, distribution histogram, risk segments, cohort view, at-risk trigger — and the pitfalls that come up in real dashboards and interviews.
The SQL formula
The skeleton is the same as every "time since last X" metric in SQL — group by user, take MAX of the timestamp, subtract from today. Minimal schema:
logins (user_id BIGINT, login_time TIMESTAMPTZ, device TEXT, method TEXT)
users (user_id BIGINT, email TEXT, created_at TIMESTAMPTZ)Every row in logins is one authentication event. The query collapses them into one row per user, with last_login_at and days_since as the per-user fields.
WITH last_login AS (
SELECT
user_id,
MAX(login_time) AS last_login_at,
CURRENT_DATE - MAX(login_time)::DATE AS days_since
FROM logins
GROUP BY user_id
)
SELECT
user_id,
last_login_at,
days_since
FROM last_login
WHERE days_since > 30
ORDER BY days_since DESC;Persist this in a daily snapshot. Once the table crosses a few hundred million rows, scanning it from a dashboard hurts — a nightly job rebuilds user_last_login_daily(user_id, last_login_at, days_since) and every downstream report joins to that.
The symmetric "users who logged in within the last seven days" query pushes the filter into WHERE:
SELECT user_id, MAX(login_time) AS last_login_at
FROM logins
WHERE login_time >= CURRENT_DATE - INTERVAL '7 day'
GROUP BY user_id;Filtering before the aggregate lets the planner skip most of the table — cost drops an order of magnitude.
Distribution histogram
A single average tells you nothing — the distribution is bimodal, with most users either logged in today or functionally gone. The right shape is a histogram, bucketed into the breakpoints the lifecycle team thinks in.
WITH last_login AS (
SELECT
user_id,
CURRENT_DATE - MAX(login_time)::DATE AS days_since
FROM logins
GROUP BY user_id
)
SELECT
CASE
WHEN days_since = 0 THEN 'today'
WHEN days_since <= 1 THEN '1 day'
WHEN days_since <= 7 THEN '2-7 days'
WHEN days_since <= 30 THEN '8-30 days'
WHEN days_since <= 90 THEN '31-90 days'
WHEN days_since <= 180 THEN '91-180 days'
ELSE '180+ days'
END AS bucket,
COUNT(*) AS users,
ROUND(COUNT(*)::numeric * 100 / SUM(COUNT(*)) OVER (), 2) AS pct
FROM last_login
GROUP BY bucket
ORDER BY MIN(days_since);On a healthy consumer product, today and 2-7 days together hold 40 to 60 percent of the login base. If the long tail past 90 days swells over 30 percent, the product has a retention problem acquisition cannot patch over. The histogram also spots release regressions: plot the buckets weekly — if today shrinks and 8-30 days grows the week after a deploy, the new build is shedding users.
Risk segmentation
Pre-churn scoring is almost always built on top of a days-since-last-login segment. Named tiers — active, at-risk, dormant, likely-churned, churned — collapse the continuous metric into a categorical column lifecycle, ML, and CS teams all share.
WITH user_status AS (
SELECT
user_id,
CURRENT_DATE - MAX(login_time)::DATE AS days_since_login,
CASE
WHEN CURRENT_DATE - MAX(login_time)::DATE <= 7 THEN 'active'
WHEN CURRENT_DATE - MAX(login_time)::DATE <= 14 THEN 'at-risk'
WHEN CURRENT_DATE - MAX(login_time)::DATE <= 30 THEN 'dormant'
WHEN CURRENT_DATE - MAX(login_time)::DATE <= 90 THEN 'likely-churned'
ELSE 'churned'
END AS risk_segment
FROM logins
GROUP BY user_id
)
SELECT
risk_segment,
COUNT(*) AS users,
ROUND(COUNT(*)::numeric * 100 / SUM(COUNT(*)) OVER (), 2) AS pct
FROM user_status
GROUP BY risk_segment
ORDER BY MIN(days_since_login);The thresholds above are defensible defaults for daily-engagement products, not universal. Pick the boundary where conditional return probability drops below the level the product tolerates: 7 and 14 for a fitness tracker, double for a podcast player, multiply by ten for a tax tool.
Store the segment as dim_user.risk_segment, written nightly, so every chart and audience export filters on one definition. The minute two teams compute their own thresholds, the segment is meaningless.
Cohort view
Average days-since-last-login by acquisition cohort is the chart that tells you whether the product is getting better, worse, or just attracting different traffic. A monthly cohort of new sign-ups is the standard cut.
WITH user_login AS (
SELECT
u.user_id,
DATE_TRUNC('month', u.created_at)::DATE AS cohort,
CURRENT_DATE - MAX(l.login_time)::DATE AS days_since_login
FROM users u
LEFT JOIN logins l USING (user_id)
GROUP BY u.user_id, DATE_TRUNC('month', u.created_at)::DATE
)
SELECT
cohort,
COUNT(*) AS users,
ROUND(AVG(days_since_login)::numeric, 1) AS avg_days_since,
ROUND(COUNT(*) FILTER (WHERE days_since_login <= 7)::numeric * 100 / COUNT(*), 2) AS active_pct,
ROUND(COUNT(*) FILTER (WHERE days_since_login > 90)::numeric * 100 / COUNT(*), 2) AS churned_pct
FROM user_login
GROUP BY cohort
ORDER BY cohort;Older cohorts always have larger averages — that's tenure, not regression. The diagnostic move is to compare equal-age slices: active_pct at exactly 90 days after sign-up. If November 2025 sits at 32 percent and March 2026 at 18 percent, onboarding got worse — the growth team should know before the next planning cycle.
The LEFT JOIN matters. An inner join silently drops never-logged-in users — the population activation most wants to see. Keep them in and treat NULL days as a separate signal.
Re-engagement trigger
The point is to act on it. The standard lifecycle pattern emails users who hit a precise dormancy window — early enough to be timely, late enough not to annoy frequent users. 14 to 21 days is a robust default for consumer apps.
SELECT
u.user_id,
u.email,
CURRENT_DATE - MAX(l.login_time)::DATE AS days_inactive
FROM users u
JOIN logins l USING (user_id)
WHERE u.email IS NOT NULL
GROUP BY u.user_id, u.email
HAVING CURRENT_DATE - MAX(l.login_time)::DATE BETWEEN 14 AND 21
ORDER BY MAX(l.login_time);Run this nightly, write to a reengagement_queue table, and let the lifecycle platform pull from it. The narrow band means each user enters the campaign exactly once — anyone who fell out yesterday is caught today, and nobody gets the same email twice. Combine with a frequency cap on the user dimension so users who recently received another lifecycle message are excluded.
The same shape powers other moments: 90-day win-back uses a different BETWEEN window; a 7-day nudge uses = 7; a CS escalation list joins to the revenue table and keeps high-value accounts only.
Common pitfalls
The most common failure is conflating login with activity. A user who consumes content through push notifications, an API client, or a passive mobile background process is "active" by every product definition but never fires a login. Triggering a "we miss you" email on day 14 to a user who opened the app yesterday lands as embarrassing. Compute days-since-last-activity from the canonical event allow-list — app_open, purchase, view — and reserve the login table for auth-related signals.
Never-logged-in users are the second trap. After a LEFT JOIN from users to logins, those rows return NULL for MAX(login_time) and the day arithmetic returns NULL. A downstream WHERE days_since > 30 silently drops them — and that's the audience activation most wants to see. Coalesce to a large sentinel value or bucket never-logged-in users as their own explicit segment.
Time zones are the third trap. login_time is almost always stored in UTC, while CURRENT_DATE follows the warehouse session time zone. A user in the Pacific zone who logs in at 11 PM local time gets stamped at 7 AM UTC the next day — the row falls into the next calendar bucket and the daily aggregate shifts by one. Cast both sides into one business time zone, or bucket by region in each region's local clock.
Multi-device sessions are the fourth. Products let users authenticate independently on web, iOS, Android, and desktop, and each authentication writes a separate row. A user who hasn't opened web in three months but logs in on mobile every morning has days_since = 0 only if the query groups across devices. Group on user_id alone, not on (user_id, device), unless the consumer explicitly wants a per-device view.
The fifth trap is treating SSO and silent token refreshes as logins. OAuth refreshes can fire every few minutes in the background with no human action — if they all write to the same table, every user looks freshly active forever. Filter on method = 'interactive' or whatever your auth team uses to flag real re-authentications, and confirm the column is reliable before you wire a campaign to it.
Optimization tips
The biggest win is to never recompute from the raw logins table on demand. Persist a daily snapshot — user_last_login_daily(user_id, last_login_at, days_since, risk_segment) rebuilt nightly — and let every dashboard, audience export, and campaign read from it. The performance gap between snapshot and raw is three to four orders of magnitude.
Index (user_id, login_time DESC) on Postgres so the per-user MAX becomes a one-row index lookup. On Snowflake and BigQuery, cluster on (user_id, login_time). On Redshift, sort on login_time and distribute on user_id. The warehouse should never scan the whole event history to find each user's most recent row.
For rolling distribution and cohort views, partition by login_time::date where supported. Daily partitioning lets the histogram prune to the trailing year of partitions — on BigQuery, the difference between a 30-second and a 30-minute scan for any product with more than two years of history.
When COUNT(DISTINCT user_id) becomes the bottleneck, switch to APPROX_COUNT_DISTINCT on BigQuery or HLL_COUNT.DISTINCT on Snowflake and Redshift. Error band is 1 to 2 percent — fine for operational dashboards, not for a board deck.
Related reading
- How to calculate churn in SQL
- How to calculate churn risk score in SQL
- How to calculate cohort retention in SQL
- How to calculate D1 D7 D30 retention in SQL
- SQL window functions interview questions
If you want to drill SQL questions like this every day, NAILDD is launching with 500+ SQL problems built around exactly this kind of interview prompt.
FAQ
Days since last login or days since last activity — which one should I report?
Both, on different surfaces. Days since last login is right for auth-tied campaigns — password resets, security check-ins, MFA reminders. Days since last activity is right for lifecycle work because it tracks the broader behavior — content opens, push receipts, API calls. A user with a persistent session has high days-since-login but zero days-since-activity; treating them as the same number is the most common cause of embarrassing "we miss you" emails.
What threshold should I pick for the re-engagement campaign?
The 14-to-21-day window is the robust default for daily-rhythm consumer products. For weekly-rhythm apps like a podcast or recipe site, double to 28 to 42 days. For monthly-rhythm tools like tax software, multiply by four. The empirical way: plot conditional return rate as a function of days inactive, find the elbow, and trigger one to two days before it.
How do I handle users who never logged in?
Bucket them as an explicit segment rather than letting them fall through a NULL filter. Sign-ups who never logged in are the audience activation most wants to act on. Use COALESCE(MAX(login_time), users.created_at) so the arithmetic returns a real number, plus a never_logged_in boolean the lifecycle platform can filter on separately. Two distinct audiences, two distinct campaigns.
How do I deal with time zones in the daily bucket?
Pick a single business time zone, cast login_time AT TIME ZONE 'UTC' AT TIME ZONE 'America/New_York' (or whichever zone finance reports in), and bucket on the result. Mixing CURRENT_DATE in the warehouse session zone with UTC timestamps causes one-day skew in almost every daily metric. For a globally distributed user base, split by region and bucket each in its own local clock, then union for the global view.
How do I merge multi-device logins into one signal?
Group on user_id alone, not on (user_id, device), when computing the per-user MAX(login_time). user_id is the canonical identity from the auth system and ties every device session back to the same human. The exception is when the consumer explicitly wants a per-device cut — for example, the iOS team asking "how many iOS users haven't opened the iOS app in 30 days" — in which case the device filter belongs in WHERE, not GROUP BY.
How does this metric relate to churn?
Days-since-last-login is the per-user pre-churn signal that feeds the churn model and the churn risk score. Churn is a population-level rate over a period; days-since-last-login is a continuous per-user value that flips a user into "churned" once it crosses a threshold. The two live next to each other on every retention dashboard.