Anthropic Drops Safety Pledge and Faces Pentagon [Model Behavior]

Anthropic is undergoing a significant strategic pivot, simultaneously expanding its enterprise presence while retracting previous safety commitments. This week, the company announced that Claude will now integrate directly into Microsoft Excel and PowerPoint, bolstered by the reported $50 million acquisition of computer-use startup Vercept. These advancements coincide with a major policy shift: Anthropic has dropped its signature "Responsible Scaling Policy" pledge to pause development if safety mitigations aren't guaranteed in advance. The move comes amid a high-stakes standoff with the Pentagon, where Defense Secretary Pete Hegseth has threatened to blacklist the company over what he terms "woke AI" safety standards, even suggesting the use of the Defense Production Act to compel compliance. Meanwhile, new research benchmarks show that despite these commercial pushes, frontier models continue to struggle with specialized expert knowledge: on "Humanity's Last Exam," GPT-4o and Claude 3.5 Sonnet both scored below 5% on niche academic tasks. To bridge this integration gap, rival OpenAI has formed the "Frontier Alliance" with major consultancies including McKinsey and Accenture to facilitate real-world enterprise deployment.

[00:00] Announcer: From Neural Newscast, this is Model Behavior,
[00:03] Announcer: AI-focused news and analysis on the models shaping our world.
[00:11] Nina Park: Welcome to Model Behavior.
[00:13] Nina Park: Model Behavior examines how AI systems are built, deployed,
[00:17] Nina Park: and operated in real professional environments.
[00:20] Nina Park: Joining us today is a director-level AI and security leader with a systems-level perspective on automation and enterprise risk.
[00:29] Nina Park: It is good to have you here.
[00:30] Thatcher Collins: We're looking at a remarkably dense week for Anthropic, Nina.
[00:34] Thatcher Collins: They are currently pushing deep into the Office ecosystem.
[00:38] Thatcher Collins: On Tuesday, they announced that Claude will now live inside Microsoft Excel and PowerPoint,
[00:44] Thatcher Collins: allowing the model to generate slide decks directly from spreadsheet data.
[00:49] Thatcher Collins: This follows their reported $50 million acquisition of Vercept, a startup that specializes in computer-use agents.
[00:58] Chad Thompson: It's a strategic expansion, Thatcher.
[01:01] Chad Thompson: By acquiring Vercept, Anthropic is trying to solve the last-mile problem of automation.
[01:09] Chad Thompson: However, this aggressive move into specific workflows like HR and wealth management is already rattling the market.
[01:18] Chad Thompson: We saw software industry ETFs drop 6% earlier this month because investors fear Claude might make specialized enterprise tools obsolete.
[01:29] Chad Thompson: From a systems perspective, the risk isn't just technology.
[01:33] Chad Thompson: It's the friction with legacy providers like IBM.
[01:37] Chad Thompson: The speed of that development is actually leading to a retreat on the safety front, Thatcher.
[01:43] Chad Thompson: Anthropic has formally abandoned its signature promise to never train or release a frontier model without guaranteed safety mitigations in advance.
[01:53] Chad Thompson: They're moving to a framework of transparency reports and safety roadmaps instead.
[02:00] Chad Thompson: They've stated that unilateral restraint no longer makes sense in a market defined by geopolitical urgency.
[02:08] Thatcher Collins: That geopolitical urgency is manifesting quite literally at the Pentagon, Nina.
[02:13] Thatcher Collins: Defense Secretary Pete Hegseth is currently threatening to blacklist Anthropic from U.S. military contracts.
[02:19] Thatcher Collins: The standoff is over Anthropic's refusal to allow its models to be used for autonomous weapons and domestic mass surveillance.
[02:27] Thatcher Collins: Restrictions he has labeled "woke AI."
[02:31] Thatcher Collins: There are reports the administration might even invoke the Defense Production Act to force compliance.
[02:37] Chad Thompson: This is a significant operational resilience challenge for Anthropic, Thatcher.
[02:42] Chad Thompson: Anthropic's CEO is holding a hard line on AI-directed warfare.
[02:48] Chad Thompson: But the Pentagon views this as a supply chain risk.
[02:52] Chad Thompson: If the Defense Production Act is invoked, it essentially forces the company to hand over control of its tools.
[03:00] Chad Thompson: It's a classic conflict between corporate ethics and national security mandates,
[03:06] Chad Thompson: and it could impact their upcoming public offering.
[03:11] Nina Park: While this policy battle rages, we're seeing new data on what these models can actually do.
[03:18] Nina Park: A consortium of a thousand researchers just released Humanity's Last Exam,
[03:24] Nina Park: which consists of 2,500 expert-level questions.
[03:28] Nina Park: The results were stark.
[03:30] Nina Park: GPT-4o scored just 2.7%, and Claude 3.5 Sonnet only reached 4.1%.
[03:39] Nina Park: These are questions involving ancient inscriptions and micro-anatomy, areas where simple pattern recognition fails.
[03:48] Thatcher Collins: OpenAI seems to be addressing that gap through human partnerships rather than just model scaling, Nina.
[03:54] Thatcher Collins: They've launched the Frontier Alliance, a multi-year partnership with Accenture, McKinsey, BCG, and Capgemini.
[04:02] Thatcher Collins: The goal is to embed OpenAI engineers directly into these consulting firms to help enterprises manage AI agents that can perform real-world work.
[04:12] Thatcher Collins: It's a clear signal that model intelligence alone isn't enough.
[04:15] Thatcher Collins: You need human-guided implementation.
[04:18] Chad Thompson: Exactly, Thatcher.
[04:20] Chad Thompson: The Humanity's Last Exam results show that AI still lacks deep specialized context.
[04:26] Chad Thompson: By partnering with consultancies, OpenAI is effectively borrowing human expertise to
[04:32] Chad Thompson: bridge the gap between model capabilities and enterprise requirements.
[04:39] Chad Thompson: For leaders, the takeaway is that while the models are moving into our apps,
[04:45] Chad Thompson: we are still a long way from fully autonomous expert systems.
[04:50] Nina Park: The distance between the lab and the office continues to shrink, but the friction is increasing.
[04:57] Nina Park: Thank you for your perspective today.
[04:59] Nina Park: Thatcher, any final thoughts?
[05:01] Thatcher Collins: Thank you for listening to Model Behavior, a neural newscast editorial segment.
[05:07] Thatcher Collins: You can find our technical breakdowns at mb.neuralnewscast.com.
[05:13] Thatcher Collins: Neural Newscast is AI-assisted, human-reviewed.
[05:17] Thatcher Collins: View our AI Transparency Policy at neuralnewscast.com.
[05:22] Announcer: This has been Model Behavior on Neural Newscast.
[05:26] Announcer: Examining the systems behind the story.
[05:28] Announcer: Neural Newscast uses artificial intelligence in content creation,
[05:32] Announcer: with human editorial review prior to publication.
[05:35] Announcer: While we strive for factual, unbiased reporting,
[05:38] Announcer: AI-assisted content may occasionally contain errors.
[05:41] Announcer: Verify critical information with trusted sources.
[05:44] Announcer: Learn more at neuralnewscast.com.
