Variance vs standard deviation

Train for your next tech interview
1,500+ real interview questions across engineering, product, design, and data — with worked solutions.
Join the waitlist

The short answer

Variance and standard deviation are two views of the same idea — how spread out your data is around its mean — but they live in different unit systems and serve different jobs. Variance is the average squared deviation from the mean; standard deviation is the square root of variance, which puts the number back into the units you started with. That single difference drives almost every choice between them in practice.

If your dataset is dollars, standard deviation is in dollars and variance is in "dollars squared". You can tell a stakeholder revenue per user "varies by about plus or minus 12 dollars" and they will understand; nobody intuitively understands 144 dollars squared. Variance is the language of formulas underneath A/B testing, regression, and ANOVA. Standard deviation is the language of dashboards, error bars, and reports.

The formulas for a sample of n observations are:

variance (s^2) = sum( (x_i - mean)^2 ) / (n - 1)
sd (s)        = sqrt(variance)

Population formulas divide by N instead of n-1, but analyst work almost always uses a sample.

A worked example with five salaries

Five engineers at a Stripe team report base salaries of 150k, 170k, 190k, 210k, and 230k USD. The mean is 190k. The deviations from the mean are -40, -20, 0, 20, and 40 thousand dollars. Squaring those gives 1600, 400, 0, 400, and 1600 (in thousands of dollars squared). The sum is 4000.

mean = 190 (k USD)

deviations:   -40, -20,   0,  20,  40
squared:     1600, 400,   0, 400, 1600
sum:         4000

sample variance = 4000 / (5 - 1) = 1000  (k USD)^2
sample sd       = sqrt(1000) approx 31.6 (k USD)

A standard deviation of about 31.6 thousand dollars means an individual salary lands roughly 31.6k from the mean on average. Variance of 1000 thousand-dollars-squared is mathematically equivalent but has no clean verbal translation — that quantity is built for math, not for sentences.

When to reach for variance

Variance is the natural quantity inside almost every classical statistical formula. Confidence intervals, t-tests, z-tests, ANOVA, and ordinary least squares regression all start from sums of squared deviations. The reason is algebraic: squared deviations add cleanly across independent random variables, while absolute deviations and standard deviations do not.

The headline property is additivity. For independent X and Y, Var(X + Y) = Var(X) + Var(Y), and Var(aX) = a-squared times Var(X). Standard deviation has no such clean rule. If you are combining metrics — total revenue from price times quantity, error from two independent sources — stay in variance land until the very last step and only then take a square root.

Variance is also what your A/B testing platform stores internally. The "pooled variance" in a two-sample t-test, the "residual variance" in a regression model, and the variance reduction promised by CUPED are all variance numbers. If you are doing math on spread, you want variance. If you are talking about spread, you want standard deviation.

When to reach for standard deviation

Standard deviation is what you put on a chart and what you put in a report. Error bars on a bar chart are usually one standard error wide, which is SD divided by the square root of n. Plain-English statements like "checkout time varies by about 4 seconds" are standard deviation statements.

Standard deviation also pairs with the mean to form the coefficient of variation, which lets you compare spread across metrics with very different scales. The 68-95-99.7 rule is stated in standard deviations: 68 percent of a normal distribution lies within one SD, 95 percent within roughly 1.96 SD, 99.7 percent within three. That rule is one of the few statistical facts non-technical stakeholders remember.

Sample vs population: why n-1

The dividing line is the denominator. Population variance divides the sum of squared deviations by N. Sample variance divides by n - 1. That subtraction is Bessel's correction, and it exists because dividing by n produces a biased estimator that systematically underestimates true population variance.

population variance: sum( (x - mu)^2 ) / N
sample variance:     sum( (x - x_bar)^2 ) / (n - 1)

Intuition: when you compute the sample mean from the data, you "use up" one degree of freedom. The n deviations from x_bar are not independent — they sum to zero by construction — so only n - 1 of them carry real information about spread.

In analyst practice, you are essentially always working with a sample, so you want the n - 1 version. The libraries disagree on defaults, which is a common silent bug. NumPy's np.var(x) divides by n; np.var(x, ddof=1) divides by n - 1. Pandas pd.Series(x).var() divides by n - 1 by default. Most SQL engines — Postgres, Snowflake, BigQuery, Databricks SQL — make VARIANCE and STDDEV map to the sample versions, with _POP suffixes for population. If an interviewer does not specify, default to sample (n - 1) and say so out loud.

The 68-95-99.7 rule

For a roughly normal distribution, about 68 percent of values fall within one SD of the mean, 95 percent within two SD (more precisely 1.96), and 99.7 percent within three. The rule is approximate, but it gives a fast back-of-the-envelope check on whether a single data point is plausible or suspicious.

Average order value at a DoorDash-style marketplace is 24 USD with SD of 6 USD. Then 68 percent of orders fall in the 18-30 USD band, 95 percent in 12-36 USD, and 99.7 percent in 6-42 USD. An order of 60 USD sits six SD above the mean — on a strictly normal distribution that would happen in roughly two parts per billion. In real life that almost certainly means the distribution is not normal, and you would want to switch to percentiles instead of SD-based bands.

The rule is a sanity check, not ground truth. Real revenue, latency, and engagement metrics are heavy-tailed, so SD bands undercount weird values. When SD-based bands disagree with empirical quantiles, trust the quantiles.

Train for your next tech interview
1,500+ real interview questions across engineering, product, design, and data — with worked solutions.
Join the waitlist

Coefficient of variation: scaling SD by the mean

Standard deviation alone can mislead when comparing metrics on different scales. A 50 USD SD looks big for AOV around 100 USD and tiny for AOV around 10,000 USD. Coefficient of variation (CV) fixes this:

CV = SD / mean

CV is unitless. CV around 0.5 means plus or minus half the mean — a noisy metric. CV around 0.05 means roughly 5 percent of the mean — a tight metric.

Use CV when comparing noisiness across metrics on one dashboard, when ranking metrics for A/B test prioritization (high-CV metrics need bigger samples), or when comparing the same metric across regions with different baselines.

Where variance shows up in A/B testing

A/B testing math is variance-first. The standard error of the mean is SE = SD / sqrt(n). A 95 percent confidence interval is mean +/- 1.96 * SE. A two-sample t-test divides the observed difference by the standard error of that difference, which is built from pooled variance.

Minimum detectable effect is roughly proportional to SD and inversely proportional to the square root of n:

MDE proportional to SD / sqrt(n)

That is why heavy-tailed metrics like revenue per user, session duration, and time-to-first-action need enormous samples. Halving SD via CUPED effectively quadruples detectable resolution at the same sample size, which is why variance reduction is now table-stakes in modern experimentation platforms.

SQL recipes

Most SQL engines expose both sample and population versions. The unsuffixed names default to sample in Postgres, Snowflake, BigQuery, and Databricks SQL:

-- Postgres / Snowflake / BigQuery / Databricks SQL
SELECT
    VARIANCE(salary)    AS var_sample,   -- sample variance (n-1)
    VAR_POP(salary)     AS var_pop,      -- population variance (N)
    STDDEV(salary)      AS sd_sample,    -- sample SD
    STDDEV_POP(salary)  AS sd_pop        -- population SD
FROM employees;

For grouped analysis — comparing spread of order value by country or by acquisition channel — wrap the calls in a GROUP BY and order by SD descending to surface noisy segments fast:

SELECT
    country,
    COUNT(*)              AS n,
    AVG(order_value)      AS mean_aov,
    STDDEV(order_value)   AS sd_aov,
    STDDEV(order_value) / NULLIF(AVG(order_value), 0) AS cv_aov
FROM orders
WHERE order_ts >= DATE '2026-01-01'
GROUP BY country
ORDER BY sd_aov DESC;

Watch the NULLIF around AVG — without it you risk a divide-by-zero on empty cohorts. Also watch your engine's null handling: most SQL engines skip nulls in STDDEV, but a coalesce-to-zero before the aggregate will silently inflate spread.

Python recipes

The five-salary example again, this time with both NumPy and pandas to make the defaults explicit:

import numpy as np
import pandas as pd

x = [150, 170, 190, 210, 230]  # k USD

np.var(x)              # 800.0  -> divides by n (population)
np.var(x, ddof=1)      # 1000.0 -> divides by n-1 (sample)

np.std(x)              # ~28.28 -> population SD
np.std(x, ddof=1)      # ~31.62 -> sample SD

pd.Series(x).var()     # 1000.0 -> sample by default
pd.Series(x).std()     # ~31.62 -> sample by default

If you mix NumPy and pandas in the same notebook, you will eventually produce two "standard deviations" of the same column that disagree at the third decimal place. That is not a bug; it is the ddof default difference. Either standardize on ddof=1 everywhere or stay inside pandas.

Common pitfalls

The first pitfall is interpreting variance directly to a stakeholder. "Variance of order value is 144 dollars squared" is technically correct and practically useless — squared units do not map to any sentence a human says. Convert to standard deviation before you talk and leave variance inside the formulas that consume it.

The second pitfall is mixing sample and population versions across a single analysis. If your SQL pulls STDDEV_POP but a downstream Python notebook calls pd.Series.std(), you are silently comparing two estimators. The bias is small for large n but visible for cohorts under a thousand. Lock in one convention (sample, n - 1) and verify the libraries you use match.

The third pitfall is quoting SD without the mean. "Standard deviation is 100" tells you nothing on its own — huge for mean 50, trivial for mean 100,000. Always quote SD next to the mean, or convert to coefficient of variation for cross-metric comparisons.

The fourth pitfall is adding standard deviations as if they were variances. For independent X and Y, Var(X + Y) equals Var(X) + Var(Y), but SD(X + Y) does NOT equal SD(X) + SD(Y). To combine spreads of independent variables, go through variance: SD(X + Y) = sqrt(Var(X) + Var(Y)). This trap shows up in capacity planning and in any "total" metric built from independent components.

If you want to drill questions exactly like "variance vs SD — when do you use which" against real interview transcripts, NAILDD is launching with a deep statistics track aimed at data analyst and data scientist loops.

FAQ

Which one should I put in the executive summary slide?

Standard deviation, almost always. Variance lives in squared units that do not match any sentence a stakeholder will say out loud. Standard deviation pairs cleanly with the mean ("AOV is 24 USD with a spread of about 6 USD"), with the 68-95-99.7 rule, and with error bars on charts. Save variance for the methods appendix where you show the formula behind a confidence interval.

Sample or population — which do I default to?

Sample, with n - 1 in the denominator (Bessel's correction). You are almost never working with the entire population in analyst work — you have a slice. Libraries disagree on defaults (NumPy goes to n, pandas goes to n - 1) so spell out which you used. In an interview, say "sample variance with n - 1" out loud the first time it comes up and you will rarely get a follow-up.

What is the difference between SD and standard error?

SD measures how spread out individual observations are around their mean. Standard error (SE) measures how spread out the sample mean itself would be if you re-ran the experiment. The formula is SE = SD / sqrt(n), so SE shrinks as your sample grows while SD does not. Error bars showing "the typical user" use SD; error bars showing "the precision of the average we just estimated" use SE. Confidence intervals are built from SE.

Can I compare SDs across two different datasets?

Only if the scales match. SD of 10 on a dataset with mean 100 is loud; SD of 10 on a dataset with mean 100,000 is silent. For cross-metric or cross-cohort comparisons, divide SD by the mean to get the coefficient of variation, which is unitless and comparable.

How do I explain variance to a product manager in one sentence?

"Variance is standard deviation squared — the version statistical formulas use, while standard deviation is the version we put on charts."