
why bernstein

this is the page you land on when you've already read the docs and you're trying to decide: is this thing actually for me, or am i better off staying with what i've got. honest answers below — including the cases where the answer is "stay where you are".

why bernstein over crewai or autogen

the one-line truth: their scheduler is an llm; bernstein's scheduler is plain python. the loop that decides "agent A gets task 17 next" is a state machine in src/bernstein/scheduler.py — zero llm tokens spent on coordination, replayable from the audit log, deterministic across re-runs of the same plan.
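to make the "plain python state machine" claim concrete, here is a minimal sketch of the shape — invented names, not bernstein's actual scheduler api. the point is that assignment is a pure function of state: same agents, same tasks, same completed set, same pick, every time.

```python
from dataclasses import dataclass

# hypothetical sketch, not bernstein's real code: a deterministic
# scheduler is just a pure function from (agents, tasks, done) to
# the next assignment -- no llm call, no randomness.

@dataclass(frozen=True)
class Task:
    id: int
    deps: tuple = ()

def next_assignment(ready_agents, tasks, done):
    """Same inputs always yield the same (agent, task) pick."""
    runnable = sorted(
        t.id for t in tasks
        if t.id not in done and all(d in done for d in t.deps)
    )
    # deterministic pairing: lowest runnable task id goes to the
    # lexicographically first idle agent
    if runnable and ready_agents:
        return (sorted(ready_agents)[0], runnable[0])
    return None

tasks = [Task(1), Task(2, deps=(1,)), Task(3)]
print(next_assignment({"agent-b", "agent-a"}, tasks, done={1}))
# -> ('agent-a', 2)
```

because the function is pure, replaying the same plan against the same completion history reproduces the same task graph — which is the replayability property the audit log relies on.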

bernstein is better for: regulated environments where every routing decision needs an auditable reason, multi-hour runs where llm-coordinator drift compounds, anyone who wants the same plan to produce the same task graph twice.

bernstein is worse for: free-form research-style swarms where "let the model figure out who does what" is the whole point. crewai and autogen are designed for that shape; bernstein is not. if you want emergent behavior, the determinism is a wall, not a feature.

crewai and autogen don't drive cli coding agents, either: they wire python tool-calls, while bernstein wraps actual terminal agents — claude code, codex, aider, gemini cli — and gives each one a real git worktree. different problem, different shape.

why bernstein over claude code alone

claude code can spawn sub-agents on its own; bernstein does the same thing across 44 different cli agents at once and verifies their output against your tests instead of trusting it. claude code is included as one of the 44, and it is the most common primary backend in real bernstein installs.

the multi-agent shape matters when: you have ≥3 independent tasks that could run in parallel (one task per cli agent in its own worktree, no merge conflicts during execution), you want a regulated review path (cross-model verifier passes a diff to a second model before the merge queue lands it), or you're paying for compute by the parallel task and want throughput rather than depth on a single thread.
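the "one task per cli agent in its own worktree" mechanic is plain git under the hood. a throwaway sketch of the idea (invented branch names, a temp repo — not bernstein's code):

```python
import os, subprocess, tempfile

# hypothetical sketch of "one git worktree per task": each task gets
# its own branch and its own working copy, so parallel agents never
# edit the same checkout and cannot conflict during execution.

repo = tempfile.mkdtemp()
subprocess.run(["git", "init", "-q", repo], check=True)
with open(os.path.join(repo, "README"), "w") as f:
    f.write("demo\n")
subprocess.run(["git", "-C", repo, "add", "README"], check=True)
subprocess.run(["git", "-C", repo, "-c", "user.email=demo@example.com",
                "-c", "user.name=demo", "commit", "-qm", "init"], check=True)

worktrees = tempfile.mkdtemp()
for task_id in (17, 18, 19):
    path = os.path.join(worktrees, f"task-{task_id}")
    subprocess.run(["git", "-C", repo, "worktree", "add", "-q",
                    "-b", f"task/{task_id}", path], check=True)

print(sorted(os.listdir(worktrees)))   # three isolated working copies
```

merging those branches back is where a serialised merge queue earns its keep: execution is conflict-free by construction, and conflicts only surface one at a time, at landing.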

the multi-agent shape is overkill when: you're a single dev on a small repo doing one task at a time. claude code alone is faster, simpler, and the orchestration overhead is wasted ceremony. bernstein adds a state machine, a task store, a merge queue — value when you're running 5 agents, dead weight when you're running 1.

short test: if your last week of claude code sessions had any moment where you wished you could split the work, bernstein helps. if every session was naturally one focused thread, you don't need it.

is it production-ready

yes, with specific caveats — not the marketing kind.

what works: the scheduler is deterministic, the merge queue lands what passes the gates, the audit log is hmac-signed and replays cleanly, there are 44 adapters in the registry, the install is one pipx install bernstein. people are running this against real codebases.
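"hmac-signed and replays cleanly" means each log row signs its event plus the previous row's signature, so editing any row breaks every signature after it. a minimal sketch of that property — illustrative only, not bernstein's on-disk format:

```python
import hmac, hashlib, json

# hypothetical sketch of an hmac-chained audit log -- demonstrates the
# tamper-evidence property, not bernstein's actual format.

KEY = b"operator-secret"

def append(log, event):
    prev = log[-1]["sig"] if log else ""
    payload = json.dumps({"event": event, "prev": prev}, sort_keys=True)
    sig = hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()
    log.append({"event": event, "prev": prev, "sig": sig})

def verify(log):
    prev = ""
    for row in log:
        payload = json.dumps({"event": row["event"], "prev": prev},
                             sort_keys=True)
        expect = hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(row["sig"], expect):
            return False
        prev = row["sig"]
    return True

log = []
append(log, {"task": 17, "agent": "claude-code"})
append(log, {"task": 18, "agent": "aider"})
print(verify(log))                 # True
log[0]["event"]["task"] = 99       # tamper with history...
print(verify(log))                 # False -- the chain breaks
```

replaying is then just walking the verified rows in order and feeding each scheduling decision back through the deterministic scheduler.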

what to know before you commit: the team is one person (alex, the operator). 0.x means breaking changes are still on the table — pin your version. the audit format is hmac-signed but the vendor binary format is not yet a published standard, so cross-tool replay is bernstein-only for now. the cloudflare workers integration is solid for edge sandboxing, but the agents you wrap (claude code, codex) still call their own apis, so api outages upstream knock those agents out individually.

production-ready in the "this is solid open-source code i'd run on a real repo" sense. not production-ready in the "saas vendor with a soc 2 report and a 24/7 oncall rotation" sense. those are different things; pick the one that matches your situation.

what's the catch

solo dev, free time funded, openrouter token spend on the operator's own card. apache 2.0 repo, $0 to install, no signup, no forced telemetry, no premium tier. but expect to pay your own model api fees if you actually run it — claude code calls anthropic, codex calls openai, aider calls whichever provider you wired up. those costs are not bernstein's; bernstein just orchestrates the agents.

the docs bot on this site is also a real cost. every answer goes through the operator's own ai gateway, which routes to free openrouter models when it can. that's why github sponsors helps: github.com/sponsors/chernistry. it stays free either way.

if you're looking for a 12-person team, vc backing, and a roadmap meeting twice a quarter, this is not that project. it's "engineer scratching their own itch and shipping the result".

who is bernstein for

specific shapes where the value lands:

  • engineering teams running ≥3 cli coding agents in parallel — each agent gets its own git worktree, the merge queue serialises landings, no race conditions
  • regulated or on-prem environments — every routing decision is in plain text, the audit log is hmac-signed and tamper-evident, no saas hop, no third-party data plane
  • platform teams that need an audit log of agent decisions — the orchestrator writes one row per scheduling decision, you can grep it
  • anyone burning more than $1k/mo on cursor/aider/claude-max who wants determinism — you can replay yesterday's plan and get yesterday's task graph
  • forward-deployed engineers dropping into a client repo — credentials stay in your env, not the client's; agents you spawn are whichever cli tool the client already trusts

if you nodded at two of those bullets, this fits.

who is bernstein NOT for

equally specific. these are the cases where you should pick something else:

  • "i want one pair-programmer to chat with about my code" — claude code or cursor alone. bernstein adds orchestration overhead you don't need
  • prototypes where merge gates are overkill — the lint/types/tests/cross-model-review pipeline is value when the cost of a bad merge is real, friction when you're throwing the repo away on friday
  • non-coding tasks (research, writing, data analysis pipelines) — bernstein wraps cli coding agents specifically, not generic llm workflows. crewai or autogen are the right shape there
  • anyone who wants a fully-managed saas with a credit card form and no infra to think about — bernstein gives you cloud-runtime options (cloudflare workers, kubernetes cluster, sandbox-execution mode) but doesn't host them for you. the runtime is yours; if you want someone else to operate it, this is the wrong project
  • teams that need a vendor with a support sla and a contract — solo open-source project. github issues are how support happens
  • research-shape "let the agents collaborate emergently" use cases — the deterministic scheduler is a hard wall there

where bernstein runs

wherever you point it. four shapes that operators actually pick:

  • laptop — pipx install bernstein && bernstein init. the deterministic scheduler, audit log, and adapters all run locally. easiest way to feel the shape.
  • on-prem behind a firewall — air-gapped repos and internal CI runners. bernstein writes state to disk you own, no signup, no forced telemetry, no third-party data plane.
  • cloudflare workers cloud runtime — bernstein-spawned agents run inside cloudflare sandbox containers, paid through your cloudflare account. zero ssh, zero VPS to manage. the orchestrator can stay local while execution lives in the cloud, or both can sit on workers.
  • multi-node cluster — kubernetes-shaped fanout when one box can't hold the parallel agent count. queue + state in shared storage, agents spin up per-task pods.

your repo is the input. your tests are the gate. bernstein adapts to the host instead of forcing one. if your code can't leave the network it lives in, pick laptop or on-prem; if you want it elastic, pick workers or cluster.

this is also why there's no signup and no forced telemetry. the orchestrator runs where you run it, talks to the model apis you configure, writes state to disk you own. opt-in observability is full and audit-grade — hmac-chained run trail, per-task tool calls, model usage, token cost, latency percentiles, exportable to your own otel collector / datadog / splunk / s3. nothing leaves your box without your config.

what does it cost to actually run

specific numbers, not round ones.

the orchestrator itself: $0. apache 2.0, pipx install bernstein, no license fee.

the cli agents bernstein wraps cost what they cost. claude code on a max plan is ~$200/mo. codex cli on chatgpt plus rolls into the chatgpt subscription. aider with sonnet on a heavy day can hit $5–10 in api spend. gemini cli is free up to a quota. running all four together is what bernstein optimizes — the cost-aware bandit routes each task to the model that's been passing tasks in that shape.

the operator's own gateway (the thing answering this docs bot) runs on a small ovh vps at roughly $80/yr including hosting, qdrant, and the openrouter free-tier middleware. a self-hosted bernstein run is cheaper than that — you're not paying the openrouter middle hop, you're calling the model apis directly with your own keys.

if you're already burning >$1k/mo on coding agents, bernstein typically pays for itself in the first parallel run by letting you saturate your existing api budget instead of bottlenecking on one agent at a time.