After a Bernstein session finishes, most teams end up running the same four shell snippets by hand: open a PR with the results, copy a ticket into a task, kick off a run on a beefier remote box, fire a post-merge hook. 1.8.14 turns each into a first-class command.
the glue tax
A healthy multi-agent workflow looks simple from outside: set a goal, four agents work in worktrees, the janitor verifies, the queue lands what passes. In practice every team that runs this daily ends up with a scripts/ directory full of wrappers:
- A post-run.sh that runs git push && gh pr create --body "$(cat last-run-summary.md)" with a hand-rolled summary.
- A Python one-liner that reads a Linear webhook and curls the task server.
- An ssh-run.sh that rsyncs to a build box, opens screen, tails logs.
- A .git/hooks/pre-commit that checks the janitor's JSON against a policy.
Each is ~30 lines. Each is slightly wrong somewhere: missing auth env var, wrong PATH, no retry, no graceful cleanup. The operator pack ships the four most common as composable subcommands sharing the same state model.
bernstein pr — open a PR with the janitor's receipts
bernstein pr --session-id last --draft
Reads .sdd/sessions/<id>-wrapup.json, gate results, cost tracker. Opens a GitHub PR with a conventional-commit title and body:
## Summary
- Add JWT auth with refresh tokens
- Cover the refresh endpoint with tests
- Document the new /auth/* routes
## Changes
src/auth/middleware.py | 74 +++++
tests/test_auth.py | 112 ++++++++
docs/auth.md | 42 ++++
## Verification
✅ lint (ruff: 0 findings)
✅ types (pyright: 0 errors)
✅ tests (pytest: 48 passed)
✅ security (semgrep: clean)
## Cost
$0.38 · 182,410 tokens
manager: $0.04
engineer: $0.27
qa: $0.07
Generated from Bernstein session 7c4f1a3b9d22.
Flags: --base, --title, --draft, --dry-run, --no-push, --session-id. Conventional-commit prefix is inferred from goal + role (a docs-heavy session gets docs:, a bugfix gets fix:, default is feat:).
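The prefix inference can be pictured as a small rule table. The sketch below is one possible reading, not Bernstein's actual logic: the names infer_prefix and pr_title, the keyword list, and the assumption that a docs-heavy session shows up as role "docs" are all made up for illustration.

```python
import re

# Hypothetical keyword list; the real heuristics are not documented in this post.
BUGFIX_WORDS = re.compile(r"\b(fix|fixes|bug|regression|crash)\b", re.IGNORECASE)


def infer_prefix(goal: str, role: str) -> str:
    """Guess a conventional-commit prefix from a session goal and agent role."""
    if role == "docs":  # assumption: docs-heavy sessions carry the docs role
        return "docs"
    if BUGFIX_WORDS.search(goal):
        return "fix"
    return "feat"  # default stated in the post


def pr_title(goal: str, role: str) -> str:
    """Format a title in the '<prefix>: <goal>' shape."""
    return f"{infer_prefix(goal, role)}: {goal}"
```

An explicit --title would presumably bypass this kind of inference entirely.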
bernstein from-ticket <url>
You wrote the work description once, in the ticket. Pull it straight in:
bernstein from-ticket https://linear.app/acme/issue/ENG-412 --run
Three providers out of the box:
- Linear: GraphQL via LINEAR_API_KEY.
- GitHub Issues: local gh CLI when available, else GITHUB_TOKEN + REST.
- Jira Cloud: REST v3 via JIRA_EMAIL + JIRA_API_TOKEN.
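The post does not say how from-ticket decides which provider a URL belongs to. A minimal sketch, assuming the choice is made from the URL's host and path: detect_provider and the external-ID formats it returns are hypothetical, while the URL shapes are the usual ones for Linear, GitHub Issues, and Jira Cloud.

```python
from urllib.parse import urlparse


def detect_provider(url: str) -> tuple[str, str]:
    """Return (provider, external_id) for a ticket URL, or raise ValueError."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    parts = [p for p in parsed.path.split("/") if p]

    # https://linear.app/<workspace>/issue/<KEY-123>[/<slug>]
    if host == "linear.app" and len(parts) >= 3 and parts[1] == "issue":
        return "linear", parts[2]
    # https://github.com/<owner>/<repo>/issues/<number>
    if host == "github.com" and len(parts) >= 4 and parts[2] == "issues":
        return "github", f"{parts[0]}/{parts[1]}#{parts[3]}"
    # https://<site>.atlassian.net/browse/<KEY-123>
    if host.endswith(".atlassian.net") and len(parts) >= 2 and parts[0] == "browse":
        return "jira", parts[1]
    raise ValueError(f"unsupported ticket URL: {url}")
```

Unsupported hosts fail loudly here rather than falling back to a default provider, which keeps a mistyped URL from silently creating an empty task.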
Labels drive role and scope. bug → qa, docs → docs, epic bumps scope from medium to large. Provider, external ID, and URL get stashed on the task so downstream tooling can round-trip.
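The label rules above amount to a short mapping. This is a sketch under assumptions, not the shipped code: apply_labels is a hypothetical name, the default role is passed in because the post does not say where it comes from, and letting bug win over docs when both are present is a guess.

```python
def apply_labels(labels: list[str], default_role: str = "backend") -> dict[str, str]:
    """Derive role and scope from ticket labels using the rules in the post."""
    lowered = {label.lower() for label in labels}
    role = default_role
    scope = "medium"  # baseline scope named in the post

    if "bug" in lowered:  # bug -> qa (precedence over docs is an assumption)
        role = "qa"
    elif "docs" in lowered:  # docs -> docs
        role = "docs"
    if "epic" in lowered:  # epic bumps scope from medium to large
        scope = "large"
    return {"role": role, "scope": scope}
```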
--run dispatches immediately. --dry-run previews:
Task preview
goal: "Migrate session store to Redis"
role: backend
scope: medium
priority: medium
source: linear / ENG-412
assignee: dmitri
bernstein remote — SSH sandbox
When a run is heavier than your laptop can take (large test matrix, GPU calls, staging DB) and a VPS is faster than a cloud sandbox, remote wraps it:
bernstein remote test build-box-1
bernstein remote run build-box-1 ~/work/bernstein --user alex --port 22
Backed by an SSH SandboxBackend:
- ControlMaster reuse. First call opens ~/.ssh/bernstein-<host>-<pid>.sock; subsequent commands reuse it. Per-call overhead drops from ~500ms to ~30ms.
- ConnectTimeout=10, ServerAliveInterval=30 so a flaky network doesn't hang the run.
- Error translation. Connection refused becomes SandboxConnectionError(host=..., hint="check that sshd is running on port X"). Permission denied suggests ssh-add or an IdentityFile entry.
Artifacts stay on the remote box for the duration. bernstein remote forget <host> tears the socket down.
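The connection behaviour described above maps onto standard OpenSSH client options. The sketch below shows how such a command line and error translation could be assembled; it is not the SandboxBackend source. The helper names, the ControlPersist=yes setting, and the choice to reuse SandboxConnectionError for the permission case are assumptions.

```python
import os
from pathlib import Path
from typing import Optional


def control_path(host: str, pid: Optional[int] = None) -> Path:
    """Socket path in the ~/.ssh/bernstein-<host>-<pid>.sock shape."""
    pid = os.getpid() if pid is None else pid
    return Path.home() / ".ssh" / f"bernstein-{host}-{pid}.sock"


def ssh_argv(host: str, command: str, user: Optional[str] = None,
             port: int = 22, pid: Optional[int] = None) -> list[str]:
    """Build an ssh command line that shares one multiplexed connection."""
    target = f"{user}@{host}" if user else host
    return [
        "ssh", "-p", str(port),
        "-o", "ControlMaster=auto",  # open the master if none exists, else reuse it
        "-o", f"ControlPath={control_path(host, pid)}",
        "-o", "ControlPersist=yes",  # assumption: socket lives until torn down
        "-o", "ConnectTimeout=10",
        "-o", "ServerAliveInterval=30",
        target, command,
    ]


class SandboxConnectionError(Exception):
    def __init__(self, host: str, hint: str) -> None:
        super().__init__(f"{host}: {hint}")
        self.host = host
        self.hint = hint


def translate_ssh_error(host: str, port: int,
                        stderr: str) -> Optional[SandboxConnectionError]:
    """Map common ssh stderr text to an error carrying an actionable hint."""
    if "Connection refused" in stderr:
        return SandboxConnectionError(host, f"check that sshd is running on port {port}")
    if "Permission denied" in stderr:
        return SandboxConnectionError(host, "run ssh-add or add an IdentityFile entry")
    return None
```

With a persistent master socket, tearing it down corresponds to OpenSSH's ssh -O exit against the same ControlPath, which is presumably what remote forget does.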
bernstein hooks — pre/post lifecycle
Six events: pre_task, post_task, pre_merge, post_merge, pre_spawn, post_spawn. Hook any of them with a shell script, a Python callable, or a pluggy @hookimpl:
# bernstein.yaml
hooks:
  pre_task:
    - script: "scripts/check-branch.sh"
      timeout: 10
  post_merge:
    - script: "scripts/notify-slack.sh"
    - plugin: "bernstein_plugin_jira"
Shell hooks get a JSON payload on stdin plus BERNSTEIN_EVENT, BERNSTEIN_TASK_ID, BERNSTEIN_SESSION_ID, BERNSTEIN_WORKDIR in the env. Env is whitelisted (PATH, HOME, USER, BERNSTEIN_*) so credentials don't leak into third-party scripts. Stdout is truncated at 10 MB. A non-zero exit from a pre_* hook aborts the event, which is useful for "don't spawn an agent if the working tree is dirty."
Three subcommands round it out: hooks list, hooks run <event> (fires with empty context for debugging), and hooks check (validates every script path).
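The shell-hook contract (JSON on stdin, a whitelisted environment, capped stdout) can be sketched in a few lines. This illustrates the described behaviour and is not Bernstein's runner: build_hook_env and run_shell_hook are hypothetical names, and reading 10 MB as 10 * 1024 * 1024 bytes is an assumption.

```python
import json
import subprocess

ALLOWED_NAMES = {"PATH", "HOME", "USER"}
ALLOWED_PREFIX = "BERNSTEIN_"
MAX_STDOUT_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB


def build_hook_env(parent_env: dict[str, str], event: str, task_id: str,
                   session_id: str, workdir: str) -> dict[str, str]:
    """Keep only whitelisted variables, then add the per-event ones."""
    env = {
        name: value for name, value in parent_env.items()
        if name in ALLOWED_NAMES or name.startswith(ALLOWED_PREFIX)
    }
    env.update({
        "BERNSTEIN_EVENT": event,
        "BERNSTEIN_TASK_ID": task_id,
        "BERNSTEIN_SESSION_ID": session_id,
        "BERNSTEIN_WORKDIR": workdir,
    })
    return env


def run_shell_hook(argv: list[str], payload: dict, env: dict[str, str],
                   timeout: float = 10) -> tuple[int, bytes]:
    """Run a hook with the payload on stdin; return (exit code, capped stdout)."""
    proc = subprocess.run(
        argv,
        input=json.dumps(payload).encode(),
        env=env,
        capture_output=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout[:MAX_STDOUT_BYTES]
```

A caller handling a pre_* event would treat any non-zero exit code from run_shell_hook as a signal to abort that event.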
why they compose
All four read the same .sdd/ state. So:
- bernstein from-ticket https://linear.app/acme/issue/ENG-412 --run
- …agents run on the SSH sandbox via bernstein remote run build-box-1 .…
- bernstein pr --session-id last once the janitor signs off.
- post_merge hook fires the Slack notification and closes the Linear ticket.
No ad-hoc glue, no script drift between ~/work/*/scripts/. The session metadata that flowed from ticket → task → merge is still there if you need to replay or audit.
what's missing
- No GitLab or Bitbucket ticket providers yet; open an issue if you need one. Provider interface is one small file per source.
- SSH sandbox uses OpenSSH, not paramiko. Works everywhere OpenSSH does, won't embed in pure-Python deployments. SandboxBackend is stable; a paramiko adaptation is ~200 lines.
- PR generator targets GitHub only. GitLab is a small follow-up; gate results and cost tracker are already provider-agnostic.
Install with pipx install 'bernstein>=1.8.14', then run bernstein pr --help. Open an issue with whichever shell snippet is next on your list.