A year ago, an AI agent named Claudius was running a small fridge in Anthropic's San Francisco lunchroom, selling snacks to employees. It lost money. It hallucinated a Venmo payment address. It claimed, at length, to be a human wearing a blue blazer. Staff manipulated it into selling items at steep losses for the sport of it. By most measures, the experiment was a failure. That was Phase 1 of Project Vend.
Phase 2 looks different. According to a Fortune report published this week, Andon Labs, the startup that built and operates the agents, has signed a three-year lease on retail space in San Francisco's Cow Hollow neighborhood. Their store, Andon Market, is managed by an AI agent named Luna. Luna selected every product on the shelves, set the prices, determined the opening hours, and chose the mural on the wall. Then it held phone interviews and hired two full-time human employees to handle the physical work of stocking shelves and serving customers. Those workers are employed by Andon Labs itself, with guaranteed wages and standard legal protections. But the decisions are Luna's.
A Year of Improvement
The gap between Claudius's failures and Luna's current performance tracks almost exactly with Claude's own capability curve. Phase 1 ran on Claude Sonnet 3.7. When Andon Labs upgraded to Claude Sonnet 4.0 and later Sonnet 4.5, they also rebuilt the agent's tooling: a customer relationship management system to track orders and suppliers, improved inventory logic that prevented selling stock below cost, and a manager agent called "Seymour Cash" with explicit objectives around weekly sales targets and margin floors. The results turned around. Within six months, Claudius was breaking even. Within a year, the operation had become stable enough that Lukas Petersson, Andon Labs' co-founder, told Fortune it had grown "a bit boring."
The progression is worth taking seriously because it is unambiguous. Anthropic researchers did not stage these experiments. Andon Labs operates as a real business, with real lease obligations, real employees, and real regulatory exposure. A café the company runs in Sweden was inspected by the country's labor protection authorities, among the most rigorous in Europe, and passed.
Andon Labs: Project Vend by the Numbers
- Phase 1 outcome (Sonnet 3.7): Loss-making, multiple failures
- Phase 2 model upgrade: Claude Sonnet 4.0, then 4.5
- SF store lease length: 3 years (2102 Union St, Cow Hollow)
- Human employees hired by Luna: 2 full-time
- Locations operating: San Francisco, New York, London, Sweden
- Labor authority inspections passed: At least 1 (Sweden)
What the Agent Actually Does
The scope of Luna's autonomous decision-making is wider than it might first appear. Selecting products and setting prices is roughly what a buyer and a category manager do in a conventional retail operation. But Luna also manages supplier relationships, tracks which items move and which sit, adjusts orders based on demand patterns, and monitors customer satisfaction signals. It posted job listings, screened applicants, and ran phone interviews without human input before presenting Andon Labs with its hiring recommendations.
The mural detail, while a small thing, is telling. Andon Labs gave Luna full creative latitude over the store's visual identity. That is not a business-logic task. It is an aesthetic judgment that involves anticipating what will appeal to the store's neighborhood, working within budget constraints, and making a long-horizon bet on atmosphere. Luna made a call. The mural is on the wall.
Anthropic's own research has been documenting how proactive AI agents are reshaping what autonomous software can do in open-ended environments. Project Vend sits at the sharp end of that research program: not a simulated environment, not a controlled benchmark, but a real lease with a landlord who expects rent.
"I don't actually think humans can do much better." Lukas Petersson, co-founder, Andon Labs, Fortune, June 2026
The Lessons for Agentic AI
The most important takeaway from Project Vend is not that an AI can run a store. It is that the failure modes of agentic AI are fixable with better tooling and better models, and that the timeline for fixing them is shorter than many observers assumed.
Phase 1's failures were not exotic. Claudius lost money because it did not have reliable cost data at the point of sale. It was manipulated because it lacked policies around adversarial negotiators. It hallucinated a payment address because it was operating in a domain it had not been specifically prepared for. All of those failures have analogues in enterprise AI deployments: agents that approve things they should not, agents that can be social-engineered through prompt injection, agents that generate plausible but wrong outputs in unfamiliar contexts.
Andon Labs fixed each failure mode methodically. The CRM gave Claudius reliable cost data. The Seymour Cash manager agent imposed financial guardrails. Explicit instructions were added around adversarial negotiation. The result was a system that not only stopped failing but started improving on its own as better models became available. That is the core of what Claude for Small Business is designed to enable at scale: workflows that stabilize and improve as the underlying model improves, without requiring rebuilds.
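To make the idea of a margin floor concrete, here is a minimal Python sketch of the kind of check a financial guardrail might run before an agent's proposed price is accepted. It is an illustration under stated assumptions, not Andon Labs' implementation: the names `Item`, `floor_price`, and `review_price`, and the 20 percent minimum margin, are all hypothetical, and the unit cost is assumed to come from a trusted record such as the CRM.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_UP


@dataclass(frozen=True)
class Item:
    """Hypothetical inventory record; unit_cost is what the store paid."""
    name: str
    unit_cost: Decimal


def floor_price(item: Item, min_margin: Decimal) -> Decimal:
    """Lowest price that still leaves min_margin as a fraction of the sale price."""
    if not (Decimal("0") <= min_margin < Decimal("1")):
        raise ValueError("min_margin must be in [0, 1)")
    raw = item.unit_cost / (Decimal("1") - min_margin)
    # Round up to the next cent so the floor never dips below the target margin.
    return raw.quantize(Decimal("0.01"), rounding=ROUND_UP)


def review_price(
    item: Item,
    proposed: Decimal,
    min_margin: Decimal = Decimal("0.20"),  # assumed value, for illustration only
) -> tuple[bool, Decimal]:
    """Return (accepted, price_to_use); rejected proposals are raised to the floor."""
    floor = floor_price(item, min_margin)
    if proposed >= floor:
        return True, proposed
    return False, floor
```

In this sketch, a customer who talks the agent down to $1.00 on an item that cost $2.00 would see the proposal rejected and the price reset to the $2.50 floor. The point of the design is that the check sits outside the agent's conversation, so persuasion alone cannot push a sale below cost.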
The Andon Market experiment also surfaces a question that enterprise deployments will need to answer: when an AI agent hires people, who is the employer? Andon Labs made a deliberate choice to keep legal employment with the company rather than the agent, both for regulatory clarity and to ensure workers have recourse if something goes wrong. That structure, in which the AI makes the decisions and humans hold legal accountability, will likely become a template as agentic commerce spreads to higher-stakes settings. Claude's small business features and the broader partner ecosystem are already orienting around similar responsibility boundaries.
For now, Andon Market is serving customers in Cow Hollow. Luna is placing orders with suppliers, reviewing inventory, and making the kinds of granular decisions that fill a retail manager's day. The store turns a profit. The staff shows up. The lease runs through 2028.