every agent run reports per-model token usage back to the task server. bernstein attributes that cost to the task, the role (backend, qa, security ...) and the model. bernstein cost prints the breakdown per run. .sdd/metrics/cost.jsonl is the raw record. set bernstein.cost.budget_limit in bernstein.yaml to cap a run; when 80 percent is consumed the orchestrator warns, at 100 percent it drains (finish in-flight tasks, start nothing new). per-agent anomaly detection flags context-growth or token-spike behaviour so a stuck agent gets reaped before it burns the budget. source: src/bernstein/core/cost/.
canonical answer