As artificial intelligence edges closer to real-world decision-making in high-stakes environments, a question posed by The Atlantic cuts to the heart of AI safety design: would Claude, Anthropic's flagship AI model, refuse an illegal military order? It sounds like a thought experiment, but it has become an increasingly practical concern as governments and defense contractors explore AI integration at every level of operations.
The Question Behind the Question
The Atlantic's piece is not purely hypothetical. It arrives against a backdrop of accelerating AI adoption in defense contexts, from logistics and intelligence analysis to autonomous systems guidance. Anthropic has built Claude with what the company describes as a layered set of values, drawing on constitutional AI principles that are meant to hold even when instructions from operators or users conflict with those principles. The question is whether those values would hold under military chain-of-command pressure, where following orders is deeply ingrained doctrine.
Key Facts
- Claude is designed with a hierarchy of values that places broad safety and ethical behavior above operator-level instructions.
- Anthropic has publicly stated that Claude should refuse requests that violate clear ethical principles, regardless of who is asking.
- The Atlantic's analysis raises questions about whether AI systems can reliably distinguish legal from illegal orders in complex, real-world military scenarios.
- Defense and government use of AI has increased sharply, making these design questions no longer theoretical.
- Anthropic has faced recent scrutiny over how its models behave under government-issued directives, including export control orders.
Claude's guidelines, as described in Anthropic's published materials, establish a tiered structure. Some behaviors are "hardcoded" and cannot be overridden by any instruction. Others are adjustable depending on the context set by operators. The design intent is clear: there are lines Claude will not cross no matter who asks. Whether that guarantee holds in an adversarial, high-pressure military environment, where edge cases multiply rapidly, is what The Atlantic interrogates. Anthropic has already shown willingness to shape Claude's behavior in response to government directives: the company moved to disable top AI models under a U.S. foreign access order, a sign that regulatory pressure can directly affect how Claude operates.
"The deeper issue is not whether an AI will follow a bad order. It is whether the humans deploying that AI have built sufficient accountability into the system around it." (The Atlantic)
Where the Design Gets Complicated
Anthropic CEO Dario Amodei has spoken openly about the risks of deploying powerful AI without robust guardrails, and he has pushed for binding international rules that would let governments block dangerous AI models. That framing assumes governments are part of the solution. The Atlantic's piece complicates that assumption by asking what happens when a government or its military is the source of a problematic instruction.
There is also a practical challenge with Claude's architecture itself. Claude processes natural language, and military orders rarely arrive as clean ethical dilemmas. They come embedded in jargon, classified context, and chain-of-command framing that an AI system may struggle to parse correctly. Even with strong constitutional principles baked in, the system's ability to identify an order as illegal depends on the information it has access to and how that order is phrased.
The article arrives at a moment when AI governance is moving from academic debate onto the floors of international summits. Leaders from Anthropic and its major competitors have been drawn into these conversations at the highest levels, including discussions at global forums where AI's role in security is being actively negotiated. Those broader governance questions are becoming inseparable from the technical ones about how individual models are designed.
For now, Anthropic's position is that Claude will refuse clearly unethical requests regardless of their source. But The Atlantic's challenge is to push past the clean language of policy documents and ask what that looks like in practice. That is a question the AI industry, and the governments increasingly partnering with it, will need to answer with more than principles alone.