May 18, 2026·12 min read

How to calculate IP velocity in SQL

Sharpen SQL for your next interview

500+ SQL problems with worked solutions — joins, window functions, CTEs.

Contents:

What IP velocity actually catches
Windows and metrics worth tracking
The SQL formula
Rolling up by ASN and region
Spotting distributed botnets
Common pitfalls
Optimization tips
Related reading
FAQ

What IP velocity actually catches

A single IP firing dozens of signups in an hour is almost always abuse. It is the cheapest, oldest attack against promo codes, refer-a-friend bonuses, and free trials, and it is still the first thing fraud teams at Stripe, DoorDash, and Uber check when a coupon campaign goes sideways at 2am. IP velocity, in plain terms, is just the count of actions originating from the same network address over a sliding window. The reason it shows up in every fraud analyst interview is that the SQL is short, the trade-offs are real, and the candidate has to defend their thresholds.

The scenario you usually get is something like this: "We launched a $20 first-order credit last night and burned through the budget in four hours. Find the IPs responsible by Monday morning." The wrong answer is to dump every IP with more than one signup. The right answer is a sliding window on a few different durations, joined against ASN metadata, with a clear story about what to do for residential addresses behind carrier NAT versus a known datacenter range. If the interviewer is from a payments company, expect a follow-up about unique cards per IP rather than raw signups.

The reason this lives next to user velocity but is not the same metric is that user velocity tracks an authenticated identity doing too much, while IP velocity tracks a network address doing too much across many identities. The two complement each other. An attacker who rotates accounts but stays on one home IP shows up on velocity per IP. An attacker who rotates IPs but reuses one stolen card shows up on velocity per payment instrument. A real fraud stack runs both, and the interview answer that includes that nuance lands better than one that does not.

Windows and metrics worth tracking

The interesting denominators are signups per IP per hour, transactions per IP per minute, unique cards per IP per day, and failed login attempts per IP per hour for credential stuffing. Each one has a different attacker in mind. Signups per hour catches coupon farming. Transactions per minute catches card testing, where a bad actor checks a list of stolen card numbers against a low-friction merchant. Unique cards per IP catches the same card-testing attack from the other side. Failed logins per IP catches credential stuffing.

The reason you want more than one window is that attackers calibrate. A scripted signup tool that knows there is a per-minute cap will pace itself just under the limit, which means a one-minute window misses it while a one-hour window catches the cumulative burst. Picking three windows, usually one minute, one hour, and twenty-four hours, gives you coverage against both burst and slow-drip patterns without much extra cost.

The SQL formula

The classic shape is a single aggregate per IP with several FILTER clauses, one per window. The HAVING clause at the bottom is where you encode the alert threshold. The example assumes a signups table with ip, user_id, and created_at columns and runs on Postgres or any warehouse with FILTER syntax, including Snowflake and Databricks.

SELECT
    ip,
    COUNT(*) FILTER (WHERE created_at >= NOW() - INTERVAL '1 minute')  AS last_1_min,
    COUNT(*) FILTER (WHERE created_at >= NOW() - INTERVAL '1 hour')    AS last_1_hour,
    COUNT(*) FILTER (WHERE created_at >= NOW() - INTERVAL '24 hours')  AS last_24h,
    COUNT(DISTINCT user_id) FILTER (
        WHERE created_at >= NOW() - INTERVAL '24 hours'
    ) AS unique_users_24h
FROM signups
WHERE created_at >= NOW() - INTERVAL '24 hours'
GROUP BY ip
HAVING COUNT(*) FILTER (
    WHERE created_at >= NOW() - INTERVAL '1 hour'
) > 10
ORDER BY last_1_hour DESC;

Ten signups in an hour from one address is the red-flag threshold most consumer apps converge on, but it is a starting point, not a constant. The right number depends on what fraction of your normal traffic comes through corporate or carrier NAT and how expensive a false positive is. For a B2B product where each signup is a sales-qualified lead, anything above five per hour is worth a review. For a free-tier consumer game where signups are essentially free, twenty per hour might be a more realistic line.

For transaction velocity rather than signup velocity, swap the table and look at unique payment instruments instead of unique users. Card testing tends to leave a clear signature: many distinct card numbers, all failing 3DS or AVS, all from one IP in a five-minute window. That query is the same shape with COUNT(DISTINCT card_hash) substituted in.

Rolling up by ASN and region

A single IP threshold misses the modern shape of abuse, which is distributed across an ASN. The rollup query groups by autonomous system number and country so you can spot a datacenter range with thousands of clean-looking signups across hundreds of fresh IPs. AWS, DigitalOcean, Hetzner, OVH, and Linode are the usual suspects. Residential ASNs from Comcast, AT&T, or Verizon are not, even when the volume is high.

SELECT
    asn,
    asn_org,
    country,
    COUNT(*) AS signups,
    COUNT(DISTINCT ip) AS unique_ips,
    COUNT(*)::NUMERIC / NULLIF(COUNT(DISTINCT ip), 0) AS signups_per_ip
FROM signups
WHERE created_at >= CURRENT_DATE - INTERVAL '1 day'
GROUP BY asn, asn_org, country
HAVING COUNT(*) > 100
ORDER BY signups_per_ip DESC;

A datacenter ASN with a high signups-per-IP ratio is almost always a botnet renting cloud capacity. The same query against a residential ASN tells you almost nothing, because home users do not show up at a datacenter signups-per-IP ratio in normal traffic. Some teams keep a curated allowlist of known mobile carrier ASNs that NAT thousands of users behind a small pool of IPs and exempt them from the per-IP rule entirely, falling back to device fingerprinting for those ranges.

Sharpen SQL for your next interview

500+ SQL problems with worked solutions — joins, window functions, CTEs.

Join the waitlist

Spotting distributed botnets

The harder attack is one where each individual IP looks clean. A modern abuse network spreads ten thousand signups across ten thousand residential proxy IPs, so the per-IP query finds nothing. The signature to look for is a time-bucket spike in total signups paired with a low signups-per-IP ratio. That combination means "many addresses, each barely touching the limit, all firing in the same minute."

WITH suspicious_window AS (
    SELECT
        DATE_TRUNC('minute', created_at) AS minute,
        COUNT(DISTINCT ip) AS unique_ips,
        COUNT(*) AS signups
    FROM signups
    WHERE created_at >= NOW() - INTERVAL '24 hours'
    GROUP BY DATE_TRUNC('minute', created_at)
    HAVING COUNT(*) > 50
)
SELECT
    minute,
    unique_ips,
    signups,
    signups::NUMERIC / unique_ips AS signups_per_ip
FROM suspicious_window
ORDER BY signups DESC;

When the spike happens at exactly the same minute as a marketing send or a TikTok video going viral, you have a real growth event. When it happens at 3am on a Tuesday with no campaign attached, you have a botnet warming up. The trick on the interview is to volunteer that distinction without being asked, because it shows you have actually had to debug a false positive in a real incident.

Common pitfalls

The first trap is using one IP threshold for everything. A corporate NAT can put hundreds of legitimate users behind one address. Universities are even worse, sometimes putting an entire dorm on a /24. Blocking those ranges turns a fraud rule into a customer support fire, and the right fix is to identify trusted NAT ranges in advance, exempt them from the per-IP rule, and lean on device fingerprinting and email reputation for those users.

The second trap is treating residential and datacenter IPs the same. A datacenter IP making consumer signups is almost always suspicious because no normal user signs up for a food delivery app from an AWS EC2 instance. A residential IP making the same volume is a real household. Most modern stacks join against an IP type provider like MaxMind GeoIP2, IPHub, or ipqualityscore so the rule can apply a much lower threshold to datacenter ranges and a much higher one to residential.

The third trap is treating IPv6 the same as IPv4. An IPv6 customer prefix is almost always a /64 block assigned to a single household, and the user can rotate through trillions of addresses inside it without touching anyone else. If you key your velocity rule on the full IPv6 address you will catch nothing. The fix is to truncate to /64 for IPv6 and treat that as the unit, which matches how real network operators allocate space.

The fourth trap is automatic VPN blocking. A surprisingly large share of legitimate users on consumer platforms run a VPN by default — corporate laptops, privacy-conscious travelers, anyone on hotel Wi-Fi. Blocking VPNs outright trades a small reduction in abuse for a real cut in conversion. Flag VPN traffic, raise the friction for that subset, and reserve outright blocks for repeat offenders who have already failed other rules.

The fifth trap is trusting IP geolocation as ground truth. Commercial GeoIP databases sit at roughly five to ten percent error in any given month, and the error is concentrated in the cases you care about, such as mobile carriers and recently reallocated ranges. Building a rule that blocks any signup where the GeoIP country does not match the billing address will surface a steady stream of false positives, especially around airports and border regions. Use geo mismatch as one feature in a score, not as a unilateral block.

Optimization tips

The query that scans signups for the last twenty-four hours runs fine up to single-digit millions of rows per day on a warehouse, but past that you want a partitioned table on created_at and an index on ip so the per-IP aggregate skips full scans. Snowflake handles this transparently with clustering keys on created_at. Postgres needs explicit declarative partitioning or a BRIN index on the timestamp column.

For real-time decisions at signup, SQL is the wrong tool. Push the per-IP counter into a key-value store with a TTL — Redis with INCR and EXPIRE is the textbook setup — and read the counter inline in the request path. Flush the rollups into the warehouse hourly for analytics. The Redis layer handles the millisecond budget; the SQL layer answers questions about yesterday.

For the ASN rollup, materialize a daily summary table keyed on (date, asn, country) rather than scanning raw events every time. Most fraud dashboards refresh once an hour at most, and a materialized view on the rollup is enough to keep the interactive query under a second.

If you want to drill fraud-style SQL questions like this every day, NAILDD is launching with 500+ SQL problems across exactly this pattern.

FAQ

What thresholds should I use for IP velocity?

The defaults depend on your product. A consumer SaaS or marketplace usually starts around twenty signups per hour from one IP as a review trigger and lowers it for IP ranges that score as datacenter. A B2B product where every signup is a high-intent lead can drop that to five per hour without much false positive cost. The honest answer in an interview is to say you would tune it against the false positive rate you can tolerate from customer support, not pick a number from a blog.

Maintain an allowlist of trusted corporate and carrier NAT ranges, either built from your own traffic data or licensed from a provider like MaxMind. For addresses on the allowlist, exempt them from the per-IP velocity rule and rely on device fingerprinting and email reputation instead. The risk of NAT is not that the rule fires once — it is that it fires repeatedly on the same legitimate company and erodes trust in the fraud team.

How do I detect datacenter IPs?

Use a commercial IP intelligence provider. MaxMind GeoIP2 Anonymous IP, IPHub, and ipqualityscore all expose an API that returns whether an IP is a datacenter, a residential proxy, a known VPN, or a Tor exit node. The data is good enough to use as a feature in a rule, but not perfect, so the right pattern is to lower thresholds for flagged IPs rather than block them outright.

Should I block VPN and Tor traffic entirely?

Generally no, at least not as a first action. Plenty of legitimate users route consumer traffic through a VPN, especially on mobile and on corporate laptops. Tor is rarer in the wild and is more often associated with abuse, but even then a hard block is usually the wrong call for a paid consumer product. Flag the traffic, raise the friction with extra verification, and reserve a block for accounts that fail downstream checks.

Can I run IP velocity in real time directly from SQL?

Not at the request path. A warehouse query takes hundreds of milliseconds at best, which is too slow for a signup or checkout decision. The real-time layer is almost always Redis or a similar key-value store with a per-IP counter and a short TTL. The SQL version of the rule runs on the analytics side, looks at yesterday, and feeds the threshold tuning rather than the live block decision.