
A daemon that closes its own pull requests

Bernstein opens a PR. CI fails. You're not at your laptop. The autofix daemon notices, fetches the failing log, classifies the break, hands the log to a fresh agent in the same worktree, and pushes the fix commit if the agent succeeds. This post covers the loop, the rails that keep it from going sideways, and the operator surface.

the boring red

Most failing CI on Bernstein-opened PRs is boring. A renamed import the agent missed in one file. A ruff rule that bites only in CI's stricter config. A flaky integration test with a clear retry path. A type signature right in the local file and wrong in the consumer.

Boring red is the ideal autofix workload. The agent already has the diff, the log spells out the failure, the fix is mechanical. Sitting in front of the terminal to paste a log into a new agent prompt is exactly the kind of toil that should be automated away.

the loop

bernstein autofix start runs a daemon ticking every poll_interval_seconds (default 60s). Each tick:

  1. Calls a FailingPRSource for Bernstein-opened PRs whose head commit has a red CI run.
  2. For each candidate, fetches the failing log via gh run view --log-failed, head-truncated to log_byte_budget bytes (default 64KB, enough for a full pytest traceback and small enough to keep prompt budgets predictable).
  3. Runs the log through a keyword classifier that buckets the failure (lint, type, test, dependency, build, unknown) and picks a bandit arm with model and effort tuned for that bucket.
  4. Synthesises a goal that includes the run id, the classifier verdict, and the truncated log. Hands it to a DispatchHook that runs the agent against the PR's worktree.
  5. If the hook returns success and a non-empty commit SHA, records the attempt in the audit chain. If it doesn't, records the failure and lets the next tick try again, up to a hard cap.
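Steps 2 and 3 can be sketched in a few lines. This is an illustration, not Bernstein's classifier: the bucket names come from the list above, but the keyword table, the function names, and the reading of "head-truncated" as "keep the first log_byte_budget bytes" are all assumptions. Bandit arm selection is left out.

```python
# Hypothetical keyword table. Bucket names are from the post; keywords are assumptions.
# Order matters: the first bucket with a matching keyword wins.
KEYWORDS = {
    "lint": ("ruff", "flake8", "lint"),
    "type": ("mypy", "pyright", "incompatible type"),
    "dependency": ("modulenotfounderror", "no matching distribution"),
    "build": ("build failed", "compilation error"),
    "test": ("assertionerror", "pytest"),
}


def truncate_log(log: str, byte_budget: int = 64 * 1024) -> str:
    """Keep at most byte_budget UTF-8 bytes from the start of the log (assumed reading)."""
    raw = log.encode("utf-8")
    if len(raw) <= byte_budget:
        return log
    # errors="ignore" drops a multi-byte character that was cut in half at the boundary.
    return raw[:byte_budget].decode("utf-8", errors="ignore")


def classify(log: str) -> str:
    """Return the first bucket whose keyword appears in the log, else 'unknown'."""
    lowered = log.lower()
    for bucket, needles in KEYWORDS.items():
        if any(needle in lowered for needle in needles):
            return bucket
    return "unknown"
```

A first-match keyword scan is crude, which is part of why the unknown bucket exists: anything the table does not recognise falls through rather than being forced into a bucket.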

Both FailingPRSource and DispatchHook are protocols, not hardwired implementations. The daemon owns lifecycle, cost accounting, audit chain, safety rails; you supply the two ends. v1.9.0 ships the rails-and-state half; the network-backed source and default GitHub-aware dispatch hook land in a follow-up. Operators who want to drive the daemon today can install their own hooks via the Python API.
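To make the "two ends" concrete, here is a minimal sketch of what supplying them could look like. Only the names FailingPRSource and DispatchHook come from the post. The method names, the FailingPR and DispatchResult dataclasses, the tick function, and the stub implementations are hypothetical and will not match the real Python API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Protocol


@dataclass(frozen=True)
class FailingPR:  # hypothetical shape of a candidate
    repo: str
    number: int
    head_sha: str
    run_id: int


@dataclass(frozen=True)
class DispatchResult:  # hypothetical shape of a hook result
    success: bool
    commit_sha: Optional[str] = None


class FailingPRSource(Protocol):
    def failing_prs(self) -> Iterable[FailingPR]: ...  # assumed method name


class DispatchHook(Protocol):
    def dispatch(self, pr: FailingPR, goal: str) -> DispatchResult: ...  # assumed


def tick(
    source: FailingPRSource,
    hook: DispatchHook,
    fetch_log: Callable[[FailingPR], str],
    classify: Callable[[str], str],
) -> list[tuple[FailingPR, bool]]:
    """One pass of the loop: build a goal per candidate and dispatch it."""
    outcomes = []
    for pr in source.failing_prs():
        log = fetch_log(pr)
        goal = (
            f"Fix failing CI run {pr.run_id} on {pr.repo}#{pr.number}.\n"
            f"Classifier verdict: {classify(log)}\n\nLog:\n{log}"
        )
        result = hook.dispatch(pr, goal)
        # Success only counts when the hook also reports a non-empty commit SHA.
        outcomes.append((pr, result.success and bool(result.commit_sha)))
    return outcomes


class StaticSource:
    """Stub source that returns a fixed list of candidates."""

    def __init__(self, prs: Iterable[FailingPR]) -> None:
        self._prs = list(prs)

    def failing_prs(self) -> Iterable[FailingPR]:
        return self._prs


class RecordingHook:
    """Stub hook that records each goal and returns a canned result."""

    def __init__(self, result: DispatchResult) -> None:
        self.result = result
        self.goals: list[str] = []

    def dispatch(self, pr: FailingPR, goal: str) -> DispatchResult:
        self.goals.append(goal)
        return self.result
```

Because both ends are structural protocols, the stubs do not inherit from anything. Any object with the right methods can be plugged in, which keeps the loop testable without a network.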

the rails

Self-healing CI is one bad heuristic away from a 3am runaway that burns through your API budget pushing wrong commits onto a release branch. Five rails ship, none optional:

Per-PR opt-in label. A PR is invisible to the daemon unless it carries bernstein-autofix (configurable per repo). Removing the label stops further attempts immediately.

Per-repo cost cap. cost_cap_usd (default $5) is checked before every dispatch. The dispatcher refuses to invoke the bandit router for a repo over budget; the budget resets on a configurable window. A misbehaving classifier cannot drain an account.
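A check-before-dispatch budget with a resetting window might look like the sketch below. Only cost_cap_usd and its $5 default come from the post. The CostBudget class, its field names, and the 24-hour window are assumptions for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CostBudget:
    """Hypothetical per-repo budget: spend accumulates until the window rolls over."""

    cap_usd: float = 5.0
    window_seconds: float = 24 * 3600  # assumed window length
    spent_usd: float = 0.0
    window_start: float = field(default_factory=time.monotonic)

    def _roll(self, now: float) -> None:
        # Reset the spend once the configured window has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.spent_usd = 0.0

    def can_dispatch(self, now: Optional[float] = None) -> bool:
        """Checked before every dispatch; False means the repo is over budget."""
        now = time.monotonic() if now is None else now
        self._roll(now)
        return self.spent_usd < self.cap_usd

    def record(self, cost_usd: float, now: Optional[float] = None) -> None:
        """Add the cost of a finished attempt to the current window."""
        now = time.monotonic() if now is None else now
        self._roll(now)
        self.spent_usd += cost_usd
```

The optional now parameter exists only so the window logic can be exercised with a fake clock.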

Hard attempt budget per push SHA. MAX_ATTEMPTS_PER_PUSH = 3. Three failed attempts against the same commit and the dispatcher labels the PR needs-human, posts a summary comment with the attempt history, and stops touching the PR until a human pushes a new commit or removes the escalation label. No retry-forever path.
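The attempt budget is easy to picture as a counter keyed by PR and head SHA. In this sketch MAX_ATTEMPTS_PER_PUSH is from the post; the AttemptLedger class and its methods are hypothetical, and resetting the count when the escalation label is removed is an assumption about how the real daemon resumes.

```python
MAX_ATTEMPTS_PER_PUSH = 3


class AttemptLedger:
    """Hypothetical in-memory ledger of failed attempts per (PR, head SHA)."""

    def __init__(self) -> None:
        self._failed: dict[tuple[str, str], int] = {}

    def may_attempt(self, pr_key: str, head_sha: str) -> bool:
        """True while the commit still has attempts left."""
        return self._failed.get((pr_key, head_sha), 0) < MAX_ATTEMPTS_PER_PUSH

    def record_failure(self, pr_key: str, head_sha: str) -> bool:
        """Count a failed attempt; return True when the PR should escalate to needs-human."""
        count = self._failed.get((pr_key, head_sha), 0) + 1
        self._failed[(pr_key, head_sha)] = count
        return count >= MAX_ATTEMPTS_PER_PUSH

    def clear_escalation(self, pr_key: str, head_sha: str) -> None:
        """Assumed behaviour when a human removes the escalation label."""
        self._failed.pop((pr_key, head_sha), None)
```

Keying on the SHA means a new human push starts with a fresh budget without any extra bookkeeping.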

Force-push off by default. allow_force_push = false. The dispatcher pushes a fix commit on top of the PR branch — never overwrites the operator's history. Force-push is per-repo, opt-in, audited separately.

HMAC-chained audit log. Every dispatch decision — every label add, every comment posted, every attempt outcome — is appended to .sdd/audit/ in an HMAC-chained JSONL. The chain detects tampering and gives a complete reconstruction of the daemon's behaviour after the fact. If autofix did something surprising, the audit log explains why.
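For readers who have not seen an HMAC chain before, the idea fits in a short sketch: each entry's MAC covers the previous entry's MAC plus its own payload, so editing or deleting any earlier line breaks verification from that point on. The entry layout, the genesis value, and the function names here are assumptions. The real format under .sdd/audit/ may differ.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def _mac(key: bytes, prev_mac: str, payload: dict) -> str:
    """HMAC-SHA256 over the previous MAC plus a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, (prev_mac + body).encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(lines: list[str], key: bytes, payload: dict) -> None:
    """Append one JSONL line whose MAC is chained to the previous line."""
    prev = json.loads(lines[-1])["mac"] if lines else GENESIS
    entry = {"payload": payload, "mac": _mac(key, prev, payload)}
    lines.append(json.dumps(entry, sort_keys=True))


def verify_chain(lines: list[str], key: bytes) -> bool:
    """Recompute every MAC in order; any edit, reorder, or leading deletion fails."""
    prev = GENESIS
    for line in lines:
        entry = json.loads(line)
        expected = _mac(key, prev, entry["payload"])
        if not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev = entry["mac"]
    return True
```

Note that a bare chain like this does not detect truncation of the most recent entries; catching that needs the latest MAC recorded somewhere else.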

the operator surface

The daemon writes pid, status, and recent attempts into .sdd/runtime/autofix/. Three commands cover the workflow:

bernstein autofix status
# autofix daemon: running (pid=84219)
# last tick:      Sun Apr 27 14:08:01 2026
#
# Recent attempts (newest first):
#   sipyourdrink-ltd/bernstein#960  attempt=1  outcome=success  classifier=lint  cost=$0.0214
#   sipyourdrink-ltd/bernstein#958  attempt=2  outcome=needs_human  classifier=test  cost=$0.4832

status --watch tails new attempts; status --json emits output for piping into a dashboard. Both work whether the daemon runs locally or under a systemd unit installed by bernstein daemon install.

bernstein autofix attach
# {"attempt_id":"...","repo":"sipyourdrink-ltd/bernstein","pr_number":960,...}
# {"attempt_id":"...","repo":"sipyourdrink-ltd/bernstein","pr_number":958,...}

attach is the resume-token handoff: rejoin a session from any terminal — second laptop, SSH from a phone — by replaying the JSONL log and tailing it. The daemon does not need to know its current viewer, and a shell exit never loses history.
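Replay-then-tail over a JSONL file needs no cooperation from the writer, which is why the daemon can stay unaware of its viewers. A rough sketch of the pattern follows. The follow function and its parameters are hypothetical, and the log path is passed in because the post does not name the file.

```python
import json
import time
from typing import Iterator


def follow(path: str, poll_seconds: float = 0.5, stop_at_eof: bool = False) -> Iterator[dict]:
    """Yield every existing JSONL record, then keep polling for newly appended ones."""
    buffer = ""
    with open(path, encoding="utf-8") as fh:
        while True:
            chunk = fh.readline()
            if chunk:
                buffer += chunk
                # Only parse complete lines; a writer may be mid-append.
                if buffer.endswith("\n"):
                    if buffer.strip():
                        yield json.loads(buffer)
                    buffer = ""
                continue
            if stop_at_eof:
                return  # replay-only mode, useful for tests and one-shot dumps
            time.sleep(poll_seconds)
```

All the state lives in the file, so a viewer that exits and reattaches simply replays from the top again.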

what it doesn't do

Mechanical fixes only. No refactoring. No merging. No new tests, no schema changes, no files outside the failing run's blast radius. No bypassing branch protection or required reviews. Every fix commit goes through the same review pipeline as anything else. If the failure is "this entire feature is wrong," the classifier hits unknown, the bandit picks a conservative single attempt, and the daemon lands on needs-human after one try.

A patch on top of your review process. Branch protection still rules.

try it

pipx install 'bernstein>=1.9.0'
 
# write ~/.config/bernstein/autofix.toml:
#   [[repo]]
#   name = "your-org/your-repo"
#   cost_cap_usd = 5.0
 
bernstein autofix start --foreground --once
# autofix daemon completed after 1 tick(s).

Run it under bernstein daemon install to keep it up across reboots. Pair with bernstein chat serve if you want the needs-human escalations in your Telegram thread. The operator-commands post covers the chat bridge and daemon installer; this one closes the loop.
