How to calculate COGS in SQL

Sharpen SQL for your next interview
500+ SQL problems with worked solutions — joins, window functions, CTEs.
Join the waitlist

What is COGS and why it matters

COGS (Cost of Goods Sold) is the foundation of unit economics for any product company that moves physical inventory. If revenue is $100 and COGS is $40, gross profit is $60 and gross margin is 60 percent. Strip COGS out of the picture and you cannot reason about pricing power, vendor leverage, or the impact of a 5 percent freight increase on the bottom line.

Imagine a finance partner pings you on Slack at 9:14 AM: the CFO wants Q1 gross margin by category before the board call at 11. You have a purchase_orders table with lot-level unit costs, an orders table with paid quantities, and three SKUs that switched suppliers mid-quarter. You cannot just average everything. You need a defensible COGS query that respects when each unit was bought, how much was paid, and which units actually shipped.

This guide covers the three COGS methods you will be asked about in retail and marketplace data interviews at Amazon, DoorDash, Stripe, and Shopify-style companies. We start with the textbook formula, walk through SQL for each method, then dig into the traps that quietly inflate gross margin by 10 points and make the CFO call you back at 11:15.

The SQL formula

The accounting identity is short:

COGS = beginning_inventory + purchases - ending_inventory

That works if you trust your inventory snapshots. In practice analysts compute COGS bottom-up from transaction data, because warehouse counts drift and the snapshot tables update once a week. The per-unit version looks like this:

COGS_per_unit = cost_purchased / quantity_purchased
COGS_total    = COGS_per_unit * units_sold

The hard part is which cost to attach to which sold unit. That is what weighted average and FIFO are actually solving.

Basic single-SKU calculation

Start with one product. You bought a batch, you sold some, you want to know what those sold units cost you. Assume purchase_orders(sku, quantity, unit_cost, received_at) and orders(sku, quantity, status, created_at).

WITH purchases AS (
    SELECT
        SUM(quantity * unit_cost)    AS total_purchased_cost,
        SUM(quantity)                AS total_units_purchased
    FROM purchase_orders
    WHERE sku = 'product-x'
      AND received_at BETWEEN '2026-01-01' AND '2026-04-01'
),
sold AS (
    SELECT SUM(quantity) AS units_sold
    FROM orders
    WHERE sku = 'product-x'
      AND status = 'paid'
      AND created_at BETWEEN '2026-01-01' AND '2026-04-01'
)
SELECT
    p.total_purchased_cost,
    p.total_units_purchased,
    s.units_sold,
    (p.total_purchased_cost::NUMERIC
        / NULLIF(p.total_units_purchased, 0)) AS avg_unit_cost,
    (p.total_purchased_cost::NUMERIC
        / NULLIF(p.total_units_purchased, 0)) * s.units_sold AS cogs
FROM purchases p
CROSS JOIN sold s;

What this gives you is the simplest possible answer: total spend on lots received in the window, divided by total units received, multiplied by units sold. It is honest when costs are stable. It quietly lies when your supplier raised prices 18 percent halfway through the quarter and most of the sales happened after the price change.

Weighted average cost

When you carry many SKUs with multiple receipts at different unit costs, weighted average cost (WAC) is the default. It collapses each SKU into a single blended unit cost across a trailing window, then multiplies by sold units.

WITH wac AS (
    SELECT
        sku,
        SUM(quantity * unit_cost)::NUMERIC
            / NULLIF(SUM(quantity), 0) AS weighted_avg_cost
    FROM purchase_orders
    WHERE received_at >= CURRENT_DATE - INTERVAL '12 months'
    GROUP BY sku
)
SELECT
    o.sku,
    SUM(o.quantity)                          AS units_sold,
    SUM(o.quantity) * w.weighted_avg_cost    AS cogs,
    SUM(o.gross_revenue)                     AS revenue,
    1 - (SUM(o.quantity) * w.weighted_avg_cost)
        / NULLIF(SUM(o.gross_revenue), 0)    AS gross_margin
FROM orders o
JOIN wac w USING (sku)
WHERE o.status = 'paid'
  AND o.created_at >= CURRENT_DATE - INTERVAL '3 months'
GROUP BY o.sku, w.weighted_avg_cost
ORDER BY cogs DESC;

The trailing 12 months for WAC is a deliberate choice. Too short and one bulk shipment skews the blended cost; too long and you carry inflation noise from a year you stopped selling that SKU. For categories with short shelf life or rapid cost shifts (electronics, fashion), shorten the window to 90 days. For commodities, 12 months is usually safe.

Sharpen SQL for your next interview
500+ SQL problems with worked solutions — joins, window functions, CTEs.
Join the waitlist

FIFO in SQL

First-in, first-out attaches the oldest available lot cost to each sold unit. This matches physical inventory rotation in most warehouses and produces more accurate gross margin when costs move in one direction over the period. FIFO is genuinely hard in pure SQL because you are matching sold quantities against an ordered queue of lots. The clean approach uses cumulative sums and a join on the cumulative range:

WITH lots AS (
    SELECT
        sku,
        received_at,
        quantity,
        unit_cost,
        SUM(quantity) OVER (
            PARTITION BY sku
            ORDER BY received_at
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS cumulative_received,
        SUM(quantity) OVER (
            PARTITION BY sku
            ORDER BY received_at
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
        ) AS cumulative_received_before
    FROM purchase_orders
),
sold_totals AS (
    SELECT sku, SUM(quantity) AS units_sold
    FROM orders
    WHERE status = 'paid'
      AND created_at BETWEEN '2026-01-01' AND '2026-04-01'
    GROUP BY sku
),
matched AS (
    SELECT
        l.sku,
        l.unit_cost,
        GREATEST(
            0,
            LEAST(s.units_sold, l.cumulative_received)
              - COALESCE(l.cumulative_received_before, 0)
        ) AS units_consumed_from_lot
    FROM lots l
    JOIN sold_totals s USING (sku)
)
SELECT
    sku,
    SUM(units_consumed_from_lot * unit_cost) AS cogs_fifo
FROM matched
GROUP BY sku
ORDER BY cogs_fifo DESC;

The trick is the two cumulative sums per lot: one ending at the current row and one ending just before. Each lot contributes LEAST(units_sold, cumulative_received) - cumulative_received_before units to COGS, clipped at zero so lots received after the units_sold threshold contribute nothing. Production FIFO systems track lot-level consumption with explicit allocation tables, but for ad hoc analytics this pattern is good enough and runs in a single query.

LIFO (last-in, first-out) flips the ordering and is mostly a US tax accounting choice. International standards (IFRS) ban it. Use FIFO or WAC for analytics.

Common pitfalls

Cost variance across receipts is where most first-pass COGS queries go wrong. When your supplier raises prices mid-quarter, computing one flat average over all receipts smears the increase across units that were actually sold from the cheaper batch. The fix is either WAC over a shorter trailing window or proper FIFO matching, depending on how sensitive your gross margin is to the swing. A 6 percent supplier price change can move quarterly gross margin by 2 to 3 points, which is enough to flip a board narrative.

Currency mismatches between purchases and sales create silent reporting errors. If you buy in USD from an overseas supplier but report sales in EUR or GBP, you have to convert the unit cost using the FX rate from the purchase date, not the sale date. The accounting convention is that landed cost locks in at receipt. Pulling spot FX at query time will make your gross margin oscillate with the FX market instead of with operational reality.

Landed cost is the single most common omission. The unit cost on the supplier invoice is not the cost of getting one unit to a position where it can be sold. You also paid freight, customs, duties, insurance, and the inbound handling fee your 3PL charges per pallet. Real COGS includes all of that. Excluding inbound logistics typically inflates reported gross margin by 4 to 8 points for hardware and apparel, which is exactly the gap finance teams scream about during budget season.

Returns flow back into inventory and must reduce COGS in the period the return was processed, not the period the original sale shipped. A naive query that subtracts refunded revenue but leaves COGS intact will understate gross margin. The right move is to compute net units sold (sold minus returned, joined on the original order) and apply the cost basis to net units, then add a separate line for refurbishment cost on the items that came back damaged.

Period mismatch between purchase date and sale date is the trap that catches new analysts. You sold a unit in Q1, but the lot it came from was received in November of the previous year. The cost belongs to the unit, not to the receipt window. WAC over a trailing 12 months smooths this naturally. FIFO handles it correctly by construction. A naive query that only sums purchases in Q1 will miss the cost basis for units sold from older inventory.

Optimization tips

For large purchase_orders and orders tables, partition by month on received_at and created_at respectively, and cluster on sku. The cumulative sum in the FIFO query is a window function that benefits enormously from a covering index (sku, received_at, quantity, unit_cost) because it can stream through pre-sorted data without a sort step.

Materialize the WAC table as a daily snapshot if you query gross margin on dashboards. Recomputing a 12-month rolling weighted average per SKU on every dashboard hit is wasteful. A daily wac_snapshot(sku, as_of_date, weighted_avg_cost) table backed by an incremental dbt model keeps p95 dashboard latency under 200 ms and gives finance a clean audit trail of how unit cost moved over time.

When FIFO matching scales past a few million lots, push the cumulative join into a single CTE with ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW rather than self-joining. The self-join pattern is O(n squared) per SKU; the window function is O(n log n) with a sort and O(n) once the data is partitioned correctly. On Snowflake or BigQuery, this can be the difference between a 12-second query and a 4-minute one.

If you want to drill SQL questions like this every day, NAILDD is launching with 500+ SQL problems across exactly this pattern.

FAQ

Weighted average versus FIFO — which should I use?

Weighted average cost is simpler, faster to compute, and stable enough for most monthly reporting where unit costs do not swing dramatically. FIFO is more accurate when costs are moving in one direction across the period, which is typical during inflation or after a supplier change. If gross margin is within 1 to 2 points either way, WAC is fine. If you are explaining a 4-point gross margin variance to the board, switch to FIFO and you will usually find the answer.

What exactly counts as landed cost?

Landed cost is everything you spent to put one sellable unit in your warehouse at a ready-to-ship status. That includes the supplier invoice unit cost, inbound freight, customs duties and tariffs, marine or air insurance, brokerage fees, and the inbound handling charge your fulfillment partner bills per pallet or per unit. Real COGS includes landed cost. Excluding it consistently overstates gross margin by 4 to 8 points for hardware, apparel, and most direct-to-consumer categories.

What is COGS for a software or services business?

Pure software companies usually call this Cost of Revenue instead of COGS, and it covers hosting infrastructure, third-party APIs charged per call, customer support headcount allocated to served customers, and payment processing fees if those are not netted from revenue. Marketplaces sit in between — Stripe, DoorDash, Airbnb all carry meaningful Cost of Revenue lines that look conceptually like COGS but are computed from internal cost-allocation models, not lot-level purchase data.

COGS dropped this quarter — is that automatically good?

Not without context. If revenue dropped proportionally or more, gross margin held flat and the lower COGS just reflects fewer units shipped. The diagnostic ratio is gross margin (revenue minus COGS divided by revenue), not the absolute COGS line. A genuine win is COGS down while revenue holds steady, which means either supplier negotiation, mix shift toward higher-margin SKUs, or reduced waste. Always pair the COGS delta with a gross margin delta before celebrating.

How do inventory write-offs flow into COGS?

When inventory is written off because it expired, was damaged, or became obsolete, the cost of that inventory hits COGS in the period of the write-off. Practically this means your monthly COGS query has two components: COGS from units sold (matched via WAC or FIFO) and COGS from write-offs (the carrying cost of units that left inventory without a corresponding sale). Some teams break these out as two lines on the gross margin dashboard so operations can see how much margin is being eaten by shrink versus actual cost of sales.

How do I handle returns that come back damaged and cannot be resold?

When a return is processed, you reverse the original COGS entry by adding the unit cost back to inventory. If the item is then refurbished, the refurbishment labor and parts cost add to the unit cost basis. If the item is scrapped, it goes to write-off and hits COGS in the write-off period. The clean SQL pattern is to compute three buckets per period — sold COGS, returned COGS (negative), and write-off COGS — and sum them for the headline number, while keeping the components visible for diagnostics.