bernstein vs openai agents sdk
These are not really competitors - they live at different abstraction levels. OpenAI Agents SDK is an in-process Python framework for building agents that call OpenAI models. Bernstein is one layer up: a scheduler that spawns whole CLI coding agents in parallel git worktrees. Bernstein ships an adapter that runs the Agents SDK as one of those agents.
Last checked against the upstream README and Quickstart on 2026-05-17. Sources: https://github.com/openai/openai-agents-python, https://openai.github.io/openai-agents-python/quickstart/.
tl;dr
| Dimension | OpenAI Agents SDK | Bernstein |
|---|---|---|
| Abstraction | In-process Python framework for LLM agents | Out-of-process scheduler for CLI coding agents |
| What it spawns | OpenAI model calls (Responses API) inside your process | Subprocesses: Claude Code, Aider, Codex CLI, Cursor, ... |
| Coordination | Handoffs, guardrails, sessions (LLM-driven) | YAML task graph, deterministic Python scheduler (no LLM) |
| Provider lock-in | OpenAI-first (Responses API native; others via Chat Completions) | Provider-agnostic by design (40+ adapters across vendors) |
| Isolation | One process, one event loop | One git worktree per agent, merged on green gates |
| Install | pip install openai-agents | pipx install bernstein |
what each tool actually does
OpenAI Agents SDK
The README describes it as a lightweight, powerful framework for multi-agent workflows. It gives you five primitives in Python: Agent (an LLM plus instructions, tools, and guardrails), Tool (function/MCP/hosted), Guardrail (input/output safety check), Handoff (one agent delegating to another), and Session (automatic conversation history). A Runner executes the graph. Tracing is built in. Everything runs inside one Python process; the model calls leave the box, the orchestration does not.
Source: github.com/openai/openai-agents-python.
Bernstein
Bernstein orchestrates external CLI coding agents - the actual claude, aider, codex, cursor binaries. Each task in bernstein.yaml is mapped to an adapter, spawned in its own git worktree, monitored, and merged back only after quality gates pass. The control loop is plain Python with no LLM. Every routing decision lands in an HMAC-SHA256 hash-chained log under .sdd/. Forty-plus adapters ship in-tree, including one for the OpenAI Agents SDK itself.
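To make "hash-chained log" concrete, here is a minimal sketch of the general technique, not Bernstein's actual on-disk format (the source does not describe the layout under .sdd/). Each entry's HMAC-SHA256 tag covers the previous entry's tag plus the new event, so editing any earlier entry invalidates every tag that follows. The names `append_entry`, `verify_chain`, `GENESIS`, and the entry fields are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical starting value for the chain (assumption, not Bernstein's format).
GENESIS = "0" * 64


def _mac(key: bytes, prev_mac: str, event: dict) -> str:
    """HMAC-SHA256 over the previous tag plus a canonical JSON encoding of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, prev_mac.encode() + payload, hashlib.sha256).hexdigest()


def append_entry(log: list, key: bytes, event: dict) -> dict:
    """Append an event whose tag is chained to the last entry in the log."""
    prev_mac = log[-1]["mac"] if log else GENESIS
    entry = {"event": event, "prev": prev_mac, "mac": _mac(key, prev_mac, event)}
    log.append(entry)
    return entry


def verify_chain(log: list, key: bytes) -> bool:
    """Recompute every tag in order; any edited or reordered entry breaks the chain."""
    prev_mac = GENESIS
    for entry in log:
        expected = _mac(key, prev_mac, entry["event"])
        if entry["prev"] != prev_mac or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev_mac = entry["mac"]
    return True
```

A verifier that holds the key can replay the log from the first entry and detect tampering without trusting the process that wrote it; someone without the key cannot forge a consistent chain.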
when to pick openai agents sdk
- You are building one Python service that talks to OpenAI models. The Agents SDK is purpose-built for that shape and the API surface is small.
- You want the SDK's handoffs / guardrails / session primitives - those are real engineering, not framework theatre, and you do not get them for free from Bernstein.
- You want OpenAI's built-in tracing dashboard and Responses-API features (built-in web search, file search, code interpreter) wired up out of the box.
- Your "agents" are LLM personas with tools, not external CLI tools that edit files. The SDK is at the right altitude for that.
when to pick bernstein
- The agents you want to coordinate are themselves CLI coding tools - Claude Code, Aider, Codex CLI - not in-process LLM personas. Bernstein is built for that.
- You want to mix providers per task: for example, Claude Opus for the architect role, GPT-5.5 for backend, and DeepSeek for docs, all under one scheduler.
- You need isolation by git worktree, so two concurrent agents cannot corrupt each other's edits. The merge is the only place state combines.
- You need a tamper-evident, replayable audit log on disk - the HMAC-SHA256 chain under .sdd/ is a Bernstein primitive.
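The worktree isolation in the list above can be pictured with plain git. The sketch below only builds the `git worktree add` command for a task; the `.worktrees/` directory, the `agent/<task>` branch naming, and the helper name `worktree_add_command` are assumptions for illustration, not Bernstein's real layout.

```python
from pathlib import Path


def worktree_add_command(repo_root: str, task_id: str, base: str = "HEAD") -> list:
    """Build the argv for `git worktree add -b <branch> <path> <commit-ish>`.

    Each task gets its own checkout directory and its own branch, so two
    agents never write into the same working tree.
    """
    path = Path(repo_root) / ".worktrees" / task_id  # hypothetical location
    branch = f"agent/{task_id}"  # hypothetical branch naming scheme
    return ["git", "-C", str(repo_root), "worktree", "add", "-b", branch, str(path), base]
```

Passing the returned list to `subprocess.run(cmd, check=True)` inside a real repository would create the checkout. Because each task has a separate directory and branch, concurrent edits only meet at merge time.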
same task in both tools
"Answer a question, then have a second agent fact-check the answer."
OpenAI Agents SDK
    import asyncio

    from agents import Agent, Runner

    # A single agent: an LLM plus instructions.
    agent = Agent(
        name="History Tutor",
        instructions="You answer history questions clearly and concisely.",
    )


    async def main():
        # Run the agent on one question and print its final answer.
        result = await Runner.run(agent, "When did the Roman Empire fall?")
        print(result.final_output)


    if __name__ == "__main__":
        asyncio.run(main())

Quickstart shape, from the upstream docs (2026-05-17). The snippet covers only the answering agent; fact-checking is added with a second Agent plus a Handoff.
Bernstein
    # bernstein.yaml
    tasks:
      - id: draft
        role: analyst
        agent: claude
        model: opus
        prompt: "Draft a short answer to the user's question."
      - id: factcheck
        role: adversary
        agent: codex  # different binary, different provider
        model: gpt-5.5
        depends_on: [draft]
        prompt: "Read the draft from {{draft.output}} and flag anything unverified."

Each task runs in its own git worktree. The merge to the working branch only happens after the gate set (lint / types / tests / security) is green.
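The `depends_on` field in the config above implies an ordering that can be resolved without any LLM. As a rough sketch of that idea (using Python's standard `graphlib`, which is an assumption here and not necessarily what Bernstein's scheduler uses), the tasks can be grouped into "waves" whose members have no unmet dependencies and could run in parallel. The function name `execution_waves` is hypothetical.

```python
from graphlib import TopologicalSorter

# The two tasks from the bernstein.yaml example, reduced to the fields
# that matter for ordering.
TASKS = [
    {"id": "draft", "agent": "claude", "depends_on": []},
    {"id": "factcheck", "agent": "codex", "depends_on": ["draft"]},
]


def execution_waves(tasks: list) -> list:
    """Group task ids into waves; every task in a wave has all dependencies done."""
    graph = {t["id"]: set(t.get("depends_on", [])) for t in tasks}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable, repeatable order
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the example config, `execution_waves(TASKS)` returns `[['draft'], ['factcheck']]`: the fact-checker starts only after the draft task is finished. The same input always produces the same plan, which is the sense in which the scheduling is deterministic.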
honest gaps
The Agents SDK is the right answer when the work is one Python service and the OpenAI ecosystem is where you want to live. It has handoffs, guardrails, sessions, and tracing as first-class primitives; Bernstein does not reimplement those. The SDK is also backed by OpenAI, so support for new Responses-API features lands there first. Bernstein is younger, has a narrower contributor base, and is built for a different shape of work (coordinating real CLI tools in git, not orchestrating in-process LLM personas). Pick the layer the problem actually lives at; if you want both, wrap the SDK with the openai_agents adapter and let Bernstein schedule it alongside everything else.
faq
Are these competitors?
Not really. OpenAI Agents SDK is an in-process Python framework for building agents that call OpenAI models via the Responses API. Bernstein is one layer up: a scheduler that spawns whole CLI coding agents (Claude Code, Aider, Codex, etc.) in git worktrees. Bernstein's openai_agents.py adapter wraps the SDK as one of those agents.
Can I use the OpenAI Agents SDK inside Bernstein?
Yes - bernstein/adapters/openai_agents.py spawns an Agents SDK session as one worker among many. The SDK owns its handoffs / guardrails / sessions inside that worker; Bernstein owns the git worktree, the routing decision, and the merge gates around it.
When is the Agents SDK enough on its own?
When the work is a single Python service calling OpenAI models, you control the code end-to-end, you want the SDK's built-in tracing dashboard, and you do not need to involve CLI coding tools (Claude Code, Aider, Codex CLI) or have a tamper-evident audit log on disk.
When is Bernstein the right layer instead?
When the agents you want to coordinate are themselves CLI coding tools, when you want to mix model providers per role, when isolation between concurrent edits must be by git worktree rather than by careful coding, or when you need an HMAC-chained audit log on disk for compliance review.
Does Bernstein support handoffs and guardrails like the SDK?
Bernstein has its own task graph (depends_on / role / agent routing), deterministic in Python and outside the LLM. It does not implement the SDK's handoff primitive verbatim. If you want the SDK's handoff semantics inside one worker, run that worker through the openai_agents adapter and let it do its thing.