OpenAI Launches GPT-5.4 with Native Computer Use and 1M Token Window
[00:00] Cole Mercer: From Neural Newscast, I'm Cole Mercer.
[00:04] Evelyn Hartwell: And I'm Evelyn Hartwell. It is Thursday, March 5th, 2026.
[00:10] Cole Mercer: OpenAI has released its latest foundation model, GPT-5.4, today. It is being positioned
[00:17] Cole Mercer: as a primary tool for professional work, featuring three distinct versions: a standard
[00:22] Cole Mercer: model, GPT-5.4 Thinking for complex reasoning, and GPT-5.4 Pro for high-performance tasks.
[00:30] Evelyn Hartwell: The thinking version is particularly notable, Cole. It provides an outline of its logic for
[00:36] Evelyn Hartwell: complex queries, allowing users to adjust requests mid-response. According to OpenAI, this version
[00:42] Evelyn Hartwell: is rolling out to Plus, Team, and Pro users across the web app and Android platforms first.
[00:48] Cole Mercer: Beyond the reasoning features, the most significant shift is what OpenAI calls native computer use.
[00:54] Cole Mercer: The model can now write code to operate a computer,
[00:57] Cole Mercer: issuing keyboard and mouse commands in response to screenshots.
[01:01] Cole Mercer: This is a deliberate step toward autonomous agents
[01:04] Cole Mercer: that can manage entire workflows across different applications.
[01:08] Evelyn Hartwell: Those capabilities are already being reflected in benchmark tests.
[01:13] Evelyn Hartwell: TechCrunch reports that GPT-5.4 has set record scores on OSWorld-Verified and WebArena.
[01:20] Evelyn Hartwell: It also led Mercor's APEX agents benchmark, which specifically measures professional skills in legal analysis and financial modeling.
[01:28] Cole Mercer: Factuality remains a central focus of this release.
[01:32] Cole Mercer: OpenAI says individual factual claims made by the model are 33% less likely to be false than those of GPT-5.2.
[01:41] Cole Mercer: There is also a new safety evaluation in place to monitor the model's chain of thought,
[01:46] Cole Mercer: ensuring it doesn't misrepresent its reasoning to the user.
[01:50] Evelyn Hartwell: On the technical side, the API version now supports a context window of one million tokens.
[01:56] Evelyn Hartwell: They also introduced tool search, which lets the model look up tool definitions as needed,
[02:01] Evelyn Hartwell: rather than loading them all at once.
[02:04] Evelyn Hartwell: This makes requests faster and more cost-effective for enterprise systems with large tool sets.
[02:10] Cole Mercer: These updates suggest a transition away from simple chatbots toward functional infrastructure that can execute work independently.
[02:19] Evelyn Hartwell: It is a clear shift toward an agentic future, Cole, where AI operates in the background to handle long-horizon deliverables.
[02:28] Cole Mercer: From Neural Newscast, I'm Cole Mercer.
[02:32] Evelyn Hartwell: And I'm Evelyn Hartwell.
[02:34] Evelyn Hartwell: Neural Newscast is AI-assisted, human-reviewed.
[02:38] Evelyn Hartwell: View our AI transparency policy at neuralnewscast.com.
[02:43] Cole Mercer: Neural Newscast uses artificial intelligence in content creation
[02:47] Cole Mercer: with human editorial review prior to publication.
[02:50] Cole Mercer: While we strive for factual, unbiased reporting,
[02:53] Cole Mercer: AI-assisted content may occasionally contain errors.
[02:56] Cole Mercer: Verify critical information with trusted sources.
[02:59] Cole Mercer: Learn more at neuralnewscast.com.
