
canonical answer

can bernstein use ollama for coding agents

yes. set agents.<name>.adapter: ollama in bernstein.yaml, together with endpoint (default http://127.0.0.1:11434) and model (e.g. qwen2.5-coder:32b, llama3.1:70b, deepseek-coder:33b). the ollama adapter at src/bernstein/adapters/ollama.py speaks either the native ollama api or any openai-compatible /v1/chat/completions endpoint, so vllm, llama.cpp server, lm studio, and tabbyapi all work. mix-mode is supported: route docs through ollama and architecture through claude opus in the same plan; bernstein.cost prints the realised split. when every agent in a run points at ollama, the run makes zero outbound api calls, which is what teams under strict data residency rules ask for.
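a minimal sketch of what that could look like in bernstein.yaml, using only the keys named above (adapter, endpoint, model). the agent names docs and refactor are hypothetical placeholders for <name>, and the exact nesting is an assumption inferred from the agents.<name>.adapter path, so check it against your own config before relying on it.

```yaml
# bernstein.yaml (sketch; agent names "docs" and "refactor" are hypothetical)
agents:
  docs:
    adapter: ollama
    # default endpoint quoted in the answer; 127.0.0.1 keeps traffic on this machine
    endpoint: http://127.0.0.1:11434
    model: qwen2.5-coder:32b
  refactor:
    adapter: ollama
    endpoint: http://127.0.0.1:11434
    model: deepseek-coder:33b
```

both agents here point at the same local endpoint, which matches the zero-outbound-calls case. a mix-mode plan would give one agent a hosted adapter instead; that adapter's key names are not stated in this answer, so they are left out of the sketch.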

tags: local, ollama, self-host

browse the full index at /q or search the blog at /ask.