cost
the bernstein router picks the cheapest model that still passes your tests on each task. this page lets you put your own bills in and see what the math suggests for your situation. heuristic, not a promise.
your bill, audited
what cheapest-passing-test routing would shift.
enter your last month's spend on three of the bills bernstein users typically pay. the calculation below uses a documented heuristic, hardcoded model prices, and shows every step so you can audit it.
estimated band
$180–$360 /mo
show the math
- total monthly llm spend: $400 + $200 + $0 = $600
- fraction of tasks routable to a cheaper model that still passes tests: 40%–80% (heuristic)
- cost ratio, cheapest-passing model vs current premium model: ~25% of original (see model-prices table below)
- saving = total × routable% × (1 − cost ratio): $180–$360 /mo
the routable% is a heuristic, not a measured number. real saving depends on whether the cheaper models pass your project's tests on each task. on a repo where tests are flaky or coverage is low, routing falls back to the premium model and the band shrinks toward zero. on a repo with tight tests and a lot of mechanical work, the band shifts higher than 80%.
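the four steps above can be reproduced in a few lines of python. the routable fraction and cost ratio below are the same heuristics from the list, not measured values, so swap in your own:

```python
# reproduce the calculator's band from the documented heuristic.
# `bills` are last month's llm invoices; `routable` and `cost_ratio`
# are the heuristic defaults from the steps above, not measurements.
def savings_band(bills, routable=(0.40, 0.80), cost_ratio=0.25):
    total = sum(bills)  # total monthly llm spend
    lo = total * routable[0] * (1 - cost_ratio)
    hi = total * routable[1] * (1 - cost_ratio)
    return total, lo, hi

total, lo, hi = savings_band([400, 200, 0])
print(f"total ${total}/mo -> estimated band ${lo:.0f}-${hi:.0f}/mo")
```

running it with the example bills gives the $180–$360 band shown above; tightening `routable` to what your own test suite supports moves the band accordingly.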
your last month's bill was $600.
sponsoring at $25/mo is about 4% of $600.
bernstein keeps routing the cheapest model that passes tests.
model prices
| family | model | input / 1m | cached input / 1m | output / 1m |
|---|---|---|---|---|
| google | gemini-2.5-flash-lite | $0.10 | $0.01 | $0.40 |
| deepseek | deepseek-v4-flash | $0.14 | $0.0028 | $0.28 |
| xai | grok-4.1-fast | $0.20 | n/a | $0.50 |
| openai | gpt-5.4-nano | $0.20 | $0.02 | $1.25 |
| google | gemini-2.5-flash | $0.30 | $0.03 | $2.50 |
| openai | gpt-5.4-mini | $0.75 | $0.075 | $4.50 |
| anthropic | claude-haiku-4.5 | $1 | $0.10 | $5 |
| google | gemini-2.5-pro | $1.25 | $0.125 | $10 |
| xai | grok-4.3 | $1.25 | n/a | $2.50 |
| openai | gpt-5.4 | $2.50 | $0.25 | $15 |
| anthropic | claude-sonnet-4.6 | $3 | $0.30 | $15 |
| anthropic | claude-opus-4.7 | $5 | $0.50 | $25 |
sources: claude.com/pricing, platform.openai.com/docs/pricing, ai.google.dev/gemini-api/docs/pricing, openrouter.ai/models.
prices update sporadically; the snapshot date above is the last manual update. since late 2025 the cadence has picked up: anthropic, openai, and google have all shipped a new tier within the last six months, and absolute numbers can shift by 20–40% between revisions even when the relative ordering survives. check the source links before quoting these in a procurement conversation. cached input prices assume a 10% multiplier on the base input rate where the provider supports prompt caching.
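the 10% multiplier is easy to sanity-check against the table itself. the pairs below are copied from the model-prices rows that list a cached rate; note that deepseek's published cached price works out to 2%, not 10%, so it is the one row the assumption does not cover:

```python
# (base input, cached input) in $/1m tokens, copied from the table above.
prices = {
    "gemini-2.5-flash-lite": (0.10, 0.01),
    "deepseek-v4-flash":     (0.14, 0.0028),
    "gpt-5.4-nano":          (0.20, 0.02),
    "gpt-5.4-mini":          (0.75, 0.075),
    "claude-haiku-4.5":      (1.00, 0.10),
    "gemini-2.5-pro":        (1.25, 0.125),
    "gpt-5.4":               (2.50, 0.25),
    "claude-opus-4.7":       (5.00, 0.50),
}
for model, (base, cached) in prices.items():
    # cached/base ratio; 0.10 everywhere except deepseek (0.02)
    print(f"{model}: cached/base = {cached / base:.0%}")
```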
how the band is computed
the calculator multiplies your total monthly llm spend by the fraction of tasks bernstein could route to a cheaper model (40%–80%, a heuristic) and by the cost gap between the premium and cheap models. at typical claude-opus-4.7 vs gemini-2.5-flash-lite ratios that gap is large: opus is $5/m input, flash-lite is $0.10/m input, so swapping one for the other on a routable task saves roughly 98% of input cost. the blended 75% figure the calculator uses is the band-weighted average across mixed task types, which pulls the per-task saving down. the result is a band, not a point. the math is shown step by step in the calculator block so you can substitute your own assumptions if the heuristic does not fit your repo.
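the 98% figure comes straight from the table's input rates. this checks only input cost; output rates narrow the gap, which is part of why the calculator blends down to 75%:

```python
# input rates in $/1m tokens, from the model-prices table above
opus_input = 5.00        # claude-opus-4.7
flash_lite_input = 0.10  # gemini-2.5-flash-lite
saving = 1 - flash_lite_input / opus_input
print(f"input-cost saving on a routed task: {saving:.0%}")  # 98%
```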
the routable fraction is the load-bearing assumption. on a codebase with flaky tests it skews toward zero — the cheaper models do not pass, the bandit falls back to the premium model, and the saving collapses. on a codebase with tight tests and a lot of mechanical work (typed refactors, test scaffolding, lint fixes) it skews higher than the upper bound. neither extreme is a promise.
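the fallback dynamic is worth seeing numerically. this is a toy model, not bernstein's actual router: the per-task prices ($0.05 cheap, $1.00 premium) are made up, and a failed cheap attempt is assumed to cost its full price before the premium rerun:

```python
# toy fallback economics: try the cheap model first; with probability
# `pass_rate` it passes the repo's tests, otherwise pay for the failed
# cheap attempt AND a premium rerun. all prices are illustrative.
def expected_cost_per_task(pass_rate, cheap=0.05, premium=1.00):
    return pass_rate * cheap + (1 - pass_rate) * (cheap + premium)

baseline = 1.00  # always-premium cost per task
for p in (0.1, 0.5, 0.9):
    saving = 1 - expected_cost_per_task(p) / baseline
    print(f"pass rate {p:.0%}: saving {saving:.0%}")
```

at a 10% pass rate the saving is a few percent (and goes negative below that, since failed attempts still cost money); at 90% it approaches the headline figures. this is the sense in which the routable fraction is load-bearing.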
if you want to verify any of this on your own repo before you sponsor, install bernstein with `pipx install bernstein` and check the cost column in the run report after one parallel run. the numbers there are real, not heuristic.