Code with Claude SF 2026: Anthropic Ships Outcomes and Multi-Agent Orchestration

Anthropic opened Code with Claude SF 2026 in San Francisco on May 19 with a deliberate choice: no new models. The five things it shipped instead (Dreaming, Outcomes, multi-agent orchestration, Claude Finance with 10 pre-built agents, and Add-ins for Microsoft 365) are all capabilities that sit above the model layer. The message, delivered across four days at a sold-out venue that the company had to extend twice due to demand, was that the competition in AI has moved from raw model performance to infrastructure for getting reliable work done at scale.

Two of the five features break new ground. Outcomes changes how developers define and enforce quality in agentic pipelines. Multi-agent orchestration changes how those pipelines are structured when a task is too large or too varied for a single agent to handle well.

Outcomes: A Second Agent Grades the Work

The practical problem Outcomes addresses is deceptively simple. When you deploy an agent on a real task, you need to know whether the output is good enough to deliver. Human review does not scale. Automated test suites catch bugs in code but say nothing about whether a research summary is accurate, a pitch deck is persuasive, or a due-diligence report covers the right ground.

Outcomes solves this by introducing a grading layer. You write a rubric describing what success looks like for a particular task. When the agent completes the task, a separate grading agent scores the output against that rubric. If the output meets the quality threshold, it is delivered. If it does not, the grading agent highlights specifically where the output fell short and returns the task to the working agent for another pass.

The rubric is plain text. You can describe success in terms of accuracy, completeness, tone, format, or any other criteria that matter for the specific workflow. The grading agent and the working agent use different model instances, so the grader is not simply validating its own reasoning; it applies your criteria as an independent check. Anthropic has not yet published data on how much Outcomes improves output quality across different task types, but early-access teams at financial services firms and law practices have been testing it since late April.
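The grade-and-retry loop described above can be sketched in a few lines of Python. This is an illustration of the control flow only, not Anthropic's API: the names run_with_rubric, Grade, Result, toy_worker, and toy_grader are hypothetical, the threshold and pass limit are assumed parameters, and the toy grader simply checks that each line of a plain-text rubric appears in the output, where a real grader would be a separate model instance.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Grade:
    score: float                                          # fraction of rubric lines satisfied
    shortfalls: List[str] = field(default_factory=list)   # criteria the output missed


@dataclass
class Result:
    output: str
    grade: Grade
    passes: int       # how many worker passes were run
    delivered: bool   # True only if the threshold was met


def run_with_rubric(
    task: str,
    rubric: str,
    worker: Callable[[str, List[str]], str],
    grader: Callable[[str, str], Grade],
    threshold: float = 1.0,   # assumed quality bar
    max_passes: int = 3,      # assumed retry limit
) -> Result:
    """Run the worker, grade the output, and retry with the grader's feedback."""
    feedback: List[str] = []
    output, grade = "", Grade(0.0)
    for attempt in range(1, max_passes + 1):
        output = worker(task, feedback)
        grade = grader(output, rubric)
        if grade.score >= threshold:
            return Result(output, grade, attempt, True)
        feedback = grade.shortfalls   # send the shortfalls back for another pass
    return Result(output, grade, max_passes, False)


# Toy stand-ins for the two agents (hypothetical, no model calls).
def toy_worker(task: str, feedback: List[str]) -> str:
    draft = f"Summary of {task}."
    for item in feedback:
        draft += f" Covers {item}."
    return draft


def toy_grader(output: str, rubric: str) -> Grade:
    criteria = [line.strip() for line in rubric.splitlines() if line.strip()]
    missing = [c for c in criteria if c not in output]
    return Grade(score=1 - len(missing) / len(criteria), shortfalls=missing)


result = run_with_rubric("Q1 filing", "revenue\nrisks", toy_worker, toy_grader)
print(result.delivered, result.passes)
```

In this toy run the first draft misses both rubric lines, the grader returns them as shortfalls, and the second pass is delivered. Keeping the worker and grader as separate callables mirrors the separation the article describes between the working agent and the grading agent.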

Key Facts: Code with Claude SF 2026

Event dates: May 19–22, 2026, San Francisco
New models shipped: None
Features shipped: 5 (Dreaming, Outcomes, multi-agent orchestration, Claude Finance, Add-ins)
Claude Code rate limit change: Doubled
Claude Opus API limit change: Raised
Outcomes grading model: Separate instance from working agent

Multi-Agent Orchestration

Multi-agent orchestration addresses a different constraint. Complex tasks often require different types of reasoning, different tools, and different context at different stages. A single agent running sequentially through all of those stages is slow and tends to lose context. Running specialized agents in parallel, each focused on a narrower problem, is faster and often produces better outputs on each sub-task.

Anthropic's implementation gives a lead agent the ability to delegate to specialist subagents that work in parallel on a shared filesystem. Each subagent has its own model selection, system prompt, and tool configuration. The lead agent defines the plan, assigns tasks, and assembles the final output. The whole flow is traceable in the Claude Console, where developers can see which subagent ran which tool calls, what each produced, and where failures occurred.

The shared filesystem is what makes parallel subagent work coherent. Each subagent can read outputs produced by other subagents, so the final assembly step has access to a full picture rather than isolated fragments. Anthropic has also confirmed that the orchestration layer integrates with the self-hosted sandboxes and MCP tunnels announced on the same day, meaning multi-agent workflows can reach internal databases and private APIs through the same private tunnel infrastructure.
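The lead-agent pattern described above can be sketched with Python's standard library. The sketch assumes a temporary directory stands in for the shared filesystem and that threads stand in for parallel subagents; SubagentSpec, run_subagent, orchestrate, and the model labels are hypothetical names for illustration and do not reflect Anthropic's orchestration API.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SubagentSpec:
    name: str
    model: str                      # placeholder label, not a real model ID
    system_prompt: str
    tools: Tuple[str, ...] = ()


def run_subagent(spec: SubagentSpec, task: str, workspace: Path) -> Path:
    """Stand-in for a model call: write this subagent's result to the shared workspace."""
    out = workspace / f"{spec.name}.txt"
    out.write_text(f"[{spec.name} via {spec.model}] {task}\n")
    return out


def orchestrate(plan: Dict[str, str], specs: List[SubagentSpec], workspace: Path) -> str:
    """Lead agent: assign each task to its subagent, run them in parallel, assemble the outputs."""
    by_name = {s.name: s for s in specs}
    with ThreadPoolExecutor(max_workers=max(1, len(plan))) as pool:
        futures = [
            pool.submit(run_subagent, by_name[name], task, workspace)
            for name, task in plan.items()
        ]
        paths = [f.result() for f in futures]   # re-raises any subagent failure
    # Assembly step: read every subagent's file from the shared workspace.
    return "".join(p.read_text() for p in sorted(paths))


specs = [
    SubagentSpec("research", "model-a", "Find sources.", ("search",)),
    SubagentSpec("drafting", "model-b", "Write the summary."),
]
plan = {"research": "collect filings", "drafting": "draft overview"}

with tempfile.TemporaryDirectory() as tmp:
    report = orchestrate(plan, specs, Path(tmp))
print(report)
```

Because every subagent writes into the same directory, the assembly step (or another subagent) can read any file produced so far, which is the property the shared filesystem provides in the design described above.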

"The five features they shipped — Dreaming, Outcomes, multi-agent orchestration, Claude Finance with 10 pre-built agents, and Add-ins — are a more honest map of where the real competition in AI has moved." Code with Claude SF 2026 recap, May 2026

What Did Not Ship

Absent from the conference was any announcement about Claude Mythos, Anthropic's advanced research model that remains behind a closed preview. The question came up from the audience during a Q&A session, and Anthropic's response was consistent with its earlier public statements: Mythos will not be released broadly until defenders have had sufficient time to act on its security findings.

There was also no update on Claude Sonnet 4.8, which has appeared in internal testing leaks but has not been formally announced. Anthropic's emphasis throughout the conference was on the agentic infrastructure layer rather than frontier model capabilities, a framing that reflects where most enterprise deployments currently are: running on Opus 4.7 or Sonnet 4.6 and working through the operational questions of reliability, observability, and scale.

Developer Infrastructure Changes

Alongside the five feature announcements, Anthropic doubled rate limits on Claude Code and raised API limits for Claude Opus, with both changes effective immediately. For teams that have been hitting rate walls on large agentic workloads, the Claude Code increase in particular eases a constraint that has been capping deployment scale for production pipelines.

Anthropic also announced general availability of cache diagnostics in the Claude Developer Platform. The feature explains prompt cache misses in Messages, reporting a cache_miss_reason that identifies where the prompt cache prefix diverged. For teams building cost-sensitive agents with heavy prompt caching, this closes a debugging gap that has been a recurring friction point.
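To make the idea of a diverging cache prefix concrete, here is a small local sketch. It does not call the Claude Developer Platform or reproduce the shape of cache_miss_reason, which this article does not describe; shared_prefix_len is a hypothetical helper that compares two requests, each modeled as an ordered list of prompt blocks, and reports how many leading blocks are identical.

```python
from typing import Sequence


def shared_prefix_len(previous: Sequence[str], current: Sequence[str]) -> int:
    """Count the leading prompt blocks that are identical in both requests."""
    n = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        n += 1
    return n


# Two hypothetical requests: the tool list changed between them.
previous = ["system: You are a reviewer.", "tools: search", "doc: contract v1"]
current = ["system: You are a reviewer.", "tools: search, fetch", "doc: contract v1"]

shared = shared_prefix_len(previous, current)
if shared < len(previous):
    print(f"prefix diverged at block {shared}: {current[shared]!r}")
else:
    print("previous prompt is a full prefix of the current one")
```

Under this model, everything after the first differing block falls outside the reusable prefix, even if later blocks are unchanged, which is why a small edit early in a prompt can be costly for caching.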

Code with Claude SF runs through May 22. The London event, which began on May 19, continues in parallel. Recordings from the San Francisco sessions are expected to be published on the Code with Claude page within the week. For developers building production agentic systems, the Outcomes and multi-agent orchestration documentation is now live in the Claude developer docs.

