ICE prioritization framework for product managers


What ICE is and why growth teams use it

ICE is a prioritization framework introduced by Sean Ellis — the same person who coined the term growth hacking — to rank experiments inside growth teams at companies like Dropbox and LogMeIn. The whole point is speed: you score every idea on three dimensions, multiply, sort, and move on. A growth team with 20-50 ideas in the backlog can run an ICE session in under an hour, which is exactly the cadence a weekly experimentation cycle needs.

The acronym stands for Impact, Confidence, and Ease. Each gets a score from 1 to 10, and the product is your priority number. Higher score, higher priority. That is the whole framework — and that simplicity is both its greatest strength and the reason it fails in some contexts.

If you are interviewing for a growth PM role at Stripe, Notion, DoorDash, or any consumer subscription company, ICE will come up. It is the most common "what prioritization framework do you use?" answer that does not immediately mark you as a textbook reader, because growth teams genuinely live in ICE while the rest of product orgs tend toward RICE or weighted scoring.

Key point: ICE is a conversation tool, not an oracle. The score does not pick the winner; the discussion you have while scoring does.

The formula

ICE Score = Impact x Confidence x Ease

All three factors are scored 1-10. Multiplication (not addition) is deliberate: a single low score should drag the total down. An experiment scoring Impact 10, Confidence 1, Ease 10 lands at 100, which correctly signals "huge upside but we are just guessing — do a cheaper validation first."

The multiplicative shape is also why ICE penalizes ambitious-but-uncertain bets more aggressively than weighted-additive frameworks do. Worth remembering before you swap formulas.
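To make the difference concrete, here is a minimal sketch comparing the multiplicative score with an equal-weight average. The `additive_score` function is a hypothetical stand-in for "weighted-additive frameworks", not a formula from any specific one, and the two sample ideas are invented for illustration.

```python
def ice_score(impact: int, confidence: int, ease: int) -> int:
    """Multiplicative ICE score; each factor is expected on a 1-10 scale."""
    for name, value in (("impact", impact), ("confidence", confidence), ("ease", ease)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return impact * confidence * ease


def additive_score(impact: int, confidence: int, ease: int) -> float:
    """Hypothetical equal-weight average, included only for comparison."""
    return (impact + confidence + ease) / 3


risky = (10, 1, 10)   # huge upside, pure guess, trivial to build
steady = (5, 5, 5)    # middling on every dimension

print(ice_score(*risky), ice_score(*steady))            # 100 125
print(additive_score(*risky), additive_score(*steady))  # 7.0 5.0
```

Under multiplication the steady idea outranks the risky one (125 vs 100); under the average the order reverses (5.0 vs 7.0). That reversal is the penalty on uncertain bets described above.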

Scoring Impact, Confidence, Ease

Impact is how much the experiment moves your North Star or the target metric for the cycle. A rough anchor:

| Impact score | What it means | Example |
| --- | --- | --- |
| 1-3 | Minimal lift, edge case | Renaming a settings button |
| 4-6 | Medium lift on a meaningful metric | New empty-state on the dashboard |
| 7-10 | Material movement on the team KPI | Restructured onboarding flow |

What counts as "high impact" is team-dependent. A +1% conversion lift is medium for a PM running a checkout funnel at 10M monthly visitors, but low for a seed-stage startup with 1,000 visitors a week where absolute counts matter more than percentages.

Confidence is how sure you are that Impact is real. This is the dimension juniors get most wrong. The anchor:

  • 1-3 — pure hypothesis, zero supporting data
  • 4-6 — qualitative basis (user interviews, support tickets, intuition from a similar product)
  • 7-10 — quantitative basis (a previous A/B test, segmentation data, a public benchmark from a comparable company)

Inside a growth team where the whole point is testing risky ideas, the modal confidence is 5-6. If everything in your backlog scores 8+, you are not running a growth team, you are running a polish team. Worth saying out loud once a quarter.

Ease is implementation cost — engineering hours, design hours, and analytics setup combined. Higher score = easier:

  • 1-3 — multi-week project, multiple teams involved
  • 4-6 — one sprint, single squad
  • 7-10 — hours or a couple of days, ideally a single engineer

Gotcha: Ease is the inverse of "Effort" in RICE. RICE rewards low Effort with a high final score (because Effort is in the denominator); ICE rewards high Ease the same way (because Ease is a multiplier). When you migrate between frameworks, do not paste the same number across — flip the scale.
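A minimal sketch of the flip, assuming the team scored Effort on the same 1-10 scale as Ease. That is an assumption: RICE Effort is often a time estimate instead, which would need a team-specific mapping to a 1-10 score first. The `effort_to_ease` helper is hypothetical.

```python
def effort_to_ease(effort: int, scale_max: int = 10) -> int:
    """Flip a 1..scale_max Effort score into an Ease score on the same scale.

    Assumes Effort was scored on a bounded scale, not in weeks or person-months.
    """
    if not 1 <= effort <= scale_max:
        raise ValueError(f"effort must be between 1 and {scale_max}, got {effort}")
    return scale_max + 1 - effort


print(effort_to_ease(9))  # 2: a heavy build becomes a low Ease score
print(effort_to_ease(1))  # 10: a trivial build becomes a high Ease score
```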

Worked example: four experiments

A growth team at a B2C SaaS — think Notion-style productivity app — is sequencing the next two weeks. Four ideas are on the table:

| Experiment | Impact | Confidence | Ease | ICE |
| --- | --- | --- | --- | --- |
| New CTA copy on homepage hero | 5 | 6 | 9 | 270 |
| Welcome email with onboarding checklist | 7 | 7 | 6 | 294 |
| Redesigned pricing page | 8 | 4 | 5 | 160 |
| Referral program with double-sided reward | 9 | 3 | 3 | 81 |

The welcome email wins — solid impact, reasonable confidence (a similar onboarding nudge worked on a previous product), moderate engineering lift. The referral program looks sexier on paper (Impact 9), but the team has never shipped a referral loop before, so confidence is 3 and the build is heavy, dragging the score to 81.

A senior PM reading this table would also notice that the homepage CTA at 270 is close enough to the welcome email at 294 that it is a coin flip. The right move is to do both — they are independent, do not share engineering resources, and together cover top-of-funnel and activation. ICE flags that they are comparable; the PM decides they are complementary.
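The four experiments can be scored and ranked in a few lines. This is a sketch, not a prescribed tool: the `Experiment` class and the 10% `TIE_THRESHOLD` are illustrative assumptions, chosen only to show how a "close enough to discuss" rule might be encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    impact: int
    confidence: int
    ease: int

    @property
    def ice(self) -> int:
        return self.impact * self.confidence * self.ease


backlog = [
    Experiment("New CTA copy on homepage hero", 5, 6, 9),
    Experiment("Welcome email with onboarding checklist", 7, 7, 6),
    Experiment("Redesigned pricing page", 8, 4, 5),
    Experiment("Referral program with double-sided reward", 9, 3, 3),
]

ranked = sorted(backlog, key=lambda e: e.ice, reverse=True)

# Assumed cutoff: neighbours within 10% of the higher score count as tied.
TIE_THRESHOLD = 0.10


def near_ties(ranked_items, threshold=TIE_THRESHOLD):
    """Return adjacent pairs whose scores are too close to separate by arithmetic."""
    pairs = []
    for higher, lower in zip(ranked_items, ranked_items[1:]):
        if (higher.ice - lower.ice) / higher.ice <= threshold:
            pairs.append((higher.name, lower.name))
    return pairs


for e in ranked:
    print(f"{e.ice:>4}  {e.name}")
print(near_ties(ranked))
```

With these numbers the only flagged pair is the welcome email and the homepage CTA, which matches the coin-flip reading above.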


ICE vs RICE

The most common follow-up in interviews is "why ICE over RICE — or vice versa?" The honest answer is that they target different problems.

| Dimension | ICE | RICE |
| --- | --- | --- |
| Number of factors | 3 | 4 |
| Reach captured? | No | Yes |
| Formula | Impact x Confidence x Ease | (Reach x Impact x Confidence) / Effort |
| Best for | Growth experiments | Product backlog with varying audience size |
| Speed of scoring | Very fast | Slower (Reach often needs analytics) |
| Origin | Sean Ellis | Intercom |

The single biggest gap is Reach. If experiment A touches 1% of users and experiment B touches 50% of users, ICE can score them identically. RICE will not — Reach multiplies into the numerator. For a growth team running funnel-stage experiments where Reach is roughly constant (everyone hits the homepage), this does not matter. For a product backlog where one fix is for an enterprise admin and another is for every signed-in user, it matters a lot.
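A short sketch of that gap, using invented numbers. The user counts, the Impact and Confidence values, and the choice to express RICE Confidence as a fraction and Effort as a time estimate are all assumptions made for illustration.

```python
def ice(impact: int, confidence: int, ease: int) -> int:
    return impact * confidence * ease


def rice(reach: float, impact: float, confidence: float, effort: float) -> float:
    return reach * impact * confidence / effort


TOTAL_USERS = 100_000  # hypothetical user base

# Two ideas with identical Impact, Confidence, and build cost.
ice_a = ice(impact=6, confidence=5, ease=5)
ice_b = ice(impact=6, confidence=5, ease=5)

# Same ideas in RICE terms; only Reach differs (1% vs 50% of users).
rice_a = rice(reach=0.01 * TOTAL_USERS, impact=1, confidence=0.8, effort=2)
rice_b = rice(reach=0.50 * TOTAL_USERS, impact=1, confidence=0.8, effort=2)

print(ice_a == ice_b)    # True: ICE cannot tell them apart
print(rice_b > rice_a)   # True: RICE ranks the wide-reach idea higher
```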

The second gap is the denominator. RICE divides by Effort, which makes the scale interpretable: the score reads as value per unit of effort (per person-month, if that is how the team estimates Effort). ICE multiplies Ease in, which is faster but loses that interpretability.

In practice, most PMs end up using both: ICE for the weekly experimentation queue, RICE for the quarterly roadmap.

When to use ICE (and when not to)

ICE shines when you need to rank many small bets quickly and Reach is approximately constant across them. Specifically:

Growth experiments inside a funnel — landing pages, signup, onboarding, paywall, retention emails. The audience is "people who hit this step," and that audience is the same across ideas. Reach cancels out of the comparison, so dropping it from the formula costs nothing.

Idea triage in a brainstorm. You generated 40 ideas in a workshop. You do not need precision; you need to cut to the top 10 in 30 minutes. ICE is built for this exact moment.

Solo-PM contexts where you do not have an analytics team to pull Reach numbers. ICE lets you make decisions on a Tuesday afternoon without a SQL query.

ICE breaks down when audience size varies dramatically between experiments. A feature shipped to 3% of paying enterprise customers and a feature shipped to 80% of free users are not in the same comparison universe. Use RICE, weighted scoring, or split the backlog into Reach-comparable buckets and use ICE within each bucket.
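The bucket approach can be sketched as follows. The idea names, the scores, and the `segment` field are hypothetical; any label that groups ideas with comparable Reach would serve the same purpose.

```python
from collections import defaultdict

# Hypothetical backlog; "segment" marks the audience each idea reaches.
ideas = [
    {"name": "SSO audit log export", "segment": "enterprise", "impact": 7, "confidence": 6, "ease": 3},
    {"name": "Admin seat-usage report", "segment": "enterprise", "impact": 5, "confidence": 7, "ease": 6},
    {"name": "Template gallery on signup", "segment": "free", "impact": 6, "confidence": 5, "ease": 7},
    {"name": "Upgrade prompt at page limit", "segment": "free", "impact": 8, "confidence": 6, "ease": 5},
]


def ice(idea: dict) -> int:
    return idea["impact"] * idea["confidence"] * idea["ease"]


def rank_within_buckets(items: list) -> dict:
    """Group ideas by segment, then rank by ICE inside each group only."""
    buckets = defaultdict(list)
    for idea in items:
        buckets[idea["segment"]].append(idea)
    return {
        segment: sorted(group, key=ice, reverse=True)
        for segment, group in buckets.items()
    }


for segment, group in rank_within_buckets(ideas).items():
    print(segment, [(i["name"], ice(i)) for i in group])
```

Scores are never compared across buckets here; deciding how much capacity each bucket gets is left as a separate call.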

It also breaks down when one of the three dimensions is fundamentally unknowable. If your team is brand new to the product area and every Confidence score is 2, Confidence is a constant that cancels out of the ranking, and ICE just sorts by Impact x Ease. Spend a week building intuition first.

Common pitfalls

Treating the score as the answer. The single most common mistake. ICE 270 vs ICE 294 is within scoring noise: move one factor by a single point and the order can flip, so those experiments are functionally tied. Engineers and PMs new to the framework often defend a small ICE difference as if it were statistically significant. It is not. The framework's job is to surface the bottom of the list (things you definitely should not do this sprint) and the top of the list (things obviously worth trying). The middle is for discussion, not arithmetic.

Inflating Confidence by default. When everyone scores Confidence at 7 or 8 "because I am pretty sure," the dimension collapses and the framework reduces to Impact x Ease. Be honest: without data from a previous test or a strong comparable, Confidence should sit at 5 or below. A useful tactic is to require a one-sentence "here is why I am confident" annotation for any score of 7+. Most claims do not survive that prompt.
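The annotation rule is easy to enforce mechanically. A minimal sketch, assuming each score sheet row is a dictionary with an optional `evidence` string; the field names and sample rows are hypothetical.

```python
EVIDENCE_THRESHOLD = 7  # Confidence scores at or above this need a note


def missing_evidence(rows: list) -> list:
    """Return names of ideas scored 7+ on Confidence with no evidence note."""
    flagged = []
    for row in rows:
        note = (row.get("evidence") or "").strip()
        if row["confidence"] >= EVIDENCE_THRESHOLD and not note:
            flagged.append(row["name"])
    return flagged


sheet = [
    {"name": "Onboarding checklist email", "confidence": 7,
     "evidence": "Similar nudge worked on a previous product"},
    {"name": "Annual plan discount banner", "confidence": 8, "evidence": ""},
    {"name": "Referral program", "confidence": 3},
]

print(missing_evidence(sheet))  # ['Annual plan discount banner']
```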

Ignoring strategic context. ICE optimizes for "what should we do this sprint" — it is myopic by design. It says nothing about whether the area you are experimenting in matters for the company's two-year direction. A growth team can ICE-rank its way into a local maximum on the signup page while the whole product loses to a competitor on retention. Strategy is a separate input, not an ICE column. Run ICE inside a strategic frame, not as a replacement for one.

Ignoring Reach when it actually varies. This is the best-known failure mode of the framework, and teams still walk into it. If you find yourself scoring an enterprise-only feature against a free-tier feature using the same ICE sheet, stop. Either split the backlog into two sheets, or migrate to RICE for this round.

Scoring solo and pretending it is objective. ICE is at its best when 3-5 people score independently, then compare. Divergent scores are the signal — they tell you where the team disagrees about either impact or feasibility. A single PM filling out an ICE sheet alone is just writing down their gut feeling with extra steps.
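One way to surface divergent scores is to compute the spread per dimension across scorers. This is a sketch under assumptions: the scorer labels and numbers are invented, and the cutoff of 3 points is an arbitrary illustrative threshold, not part of ICE.

```python
# Hypothetical independent scores for a single idea, keyed by scorer.
scores_by_person = {
    "scorer_a": {"impact": 8, "confidence": 4, "ease": 6},
    "scorer_b": {"impact": 7, "confidence": 8, "ease": 5},
    "scorer_c": {"impact": 8, "confidence": 3, "ease": 6},
}

DIVERGENCE_THRESHOLD = 3  # assumed cutoff for "worth talking about"
DIMENSIONS = ("impact", "confidence", "ease")


def score_ranges(scores: dict) -> dict:
    """Max minus min per dimension across all scorers."""
    return {
        dim: max(s[dim] for s in scores.values()) - min(s[dim] for s in scores.values())
        for dim in DIMENSIONS
    }


def dimensions_to_discuss(scores: dict, threshold: int = DIVERGENCE_THRESHOLD) -> list:
    return [dim for dim, spread in score_ranges(scores).items() if spread >= threshold]


print(score_ranges(scores_by_person))           # {'impact': 1, 'confidence': 5, 'ease': 1}
print(dimensions_to_discuss(scores_by_person))  # ['confidence']
```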

If you want to drill PM frameworks and growth case prompts every day, NAILDD is launching with hundreds of product cases built around exactly this kind of decision.

FAQ

ICE or RICE — which one should I bring up in an interview?

Bring up both, and explain when each fits. The answer interviewers are listening for is not "I use ICE" — it is "I use ICE for the weekly experimentation backlog where Reach is roughly constant, and RICE for the quarterly roadmap where Reach varies by an order of magnitude across features." That answer signals you have actually used the frameworks in anger, not just read a blog post about them.

Can I change the 1-10 scale?

Yes — the absolute scale does not matter, consistency inside the team does. Some teams use 1-5 (lower noise, fewer arguments about 7 vs 8), others use 1-100 (more granularity, more noise). What matters is that everyone on the scoring panel is using the same scale and the same anchor descriptions. If you walk into an interview and the interviewer says "we use 1-5," accept it and move on — do not argue the scale.

How do I avoid Confidence inflation?

Two tactics. First, require a one-sentence evidence note for any Confidence score above 6 — "previous A/B test on the empty-state showed +4% activation" is acceptable, "I just have a feeling" is not. Second, calibrate retrospectively. After every cycle, look at experiments where the predicted Impact missed by more than 50% and check what the team's Confidence scores were. Teams that do this seriously see Confidence scores drift downward over time, which is usually a sign of healthier scoring.
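The retrospective check can be sketched like this. It assumes predicted and actual Impact are both recorded as a lift in percentage points on the target metric, which is one possible convention; the experiment names and results are invented.

```python
from statistics import mean

# Hypothetical results from one cycle.
results = [
    {"name": "Empty-state redesign", "confidence": 8, "predicted_lift": 4.0, "actual_lift": 1.0},
    {"name": "Signup form trim", "confidence": 5, "predicted_lift": 2.0, "actual_lift": 1.8},
    {"name": "Pricing page badge", "confidence": 7, "predicted_lift": 3.0, "actual_lift": 0.5},
    {"name": "Reminder email", "confidence": 4, "predicted_lift": 1.0, "actual_lift": 1.2},
]


def is_big_miss(result: dict, tolerance: float = 0.5) -> bool:
    """True when the actual lift is off by more than 50% of the prediction."""
    error = abs(result["actual_lift"] - result["predicted_lift"])
    return error > tolerance * abs(result["predicted_lift"])


misses = [r for r in results if is_big_miss(r)]
hits = [r for r in results if not is_big_miss(r)]

print([r["name"] for r in misses])
print(mean(r["confidence"] for r in misses))  # 7.5
print(mean(r["confidence"] for r in hits))    # 4.5
```

In this made-up cycle the big misses carried the highest Confidence scores, which is the pattern the review is meant to expose.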

Does ICE work for B2B?

It works for B2B growth experiments — landing pages, free-trial conversion, in-product activation — where the user base is large enough that Reach is comparable across ideas. It breaks down for enterprise B2B roadmap decisions, where the right framework is closer to opportunity sizing per account or weighted scoring that includes sales-team input. Do not force ICE onto a 20-account enterprise roadmap; the math stops being meaningful.

Is ICE official Sean Ellis methodology or a community invention?

Sean Ellis publicly described the ICE scoring approach in his GrowthHackers community posts and in his book Hacking Growth. The framework has since been adapted heavily by individual teams: some swap Ease for Effort, some add a Reach column (at which point it becomes RICE in everything but name), some weight the three factors unequally, and some average the three scores instead of multiplying them. The version in this article is the widely used multiplicative form.

What is a reasonable cadence for re-scoring the backlog?

Weekly for active growth experimentation, monthly for a slower product team. Scores age fast — a Confidence 6 today can become Confidence 8 next week if a related test wraps up with strong results, or Confidence 3 if a competitor ships the same thing and burns the surprise. Re-scoring is cheap; re-running an experiment because you used stale priorities is not.