Anthropic launches Claude Opus 4.5 with expanded coding, agentic, and safety capabilities
Anthropic has introduced Claude Opus 4.5, reporting improvements in software engineering performance, agentic tasks, and model safety across its latest evaluations.
Anthropic announced the release of Claude Opus 4.5 across its official channels, and Head of Developer Relations Alex Albert outlined key capabilities in a LinkedIn post.
Albert wrote in the post that “Claude Opus 4.5 is out today. It's state-of-the-art on coding, agents, and computer use, and meaningfully better at everyday tasks like slides and spreadsheets.” He added that internal testers consistently report that the model “just ‘gets it,’” and that it “handles ambiguity and reasons about tradeoffs without hand-holding.”
Albert wrote that the model was tested on a difficult engineering take-home exam and “within the 2-hour time limit, it scored higher than any human candidate ever has.” The model is accessible through the Claude API and major cloud platforms including Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry.
Coding and benchmark performance shows gains
Anthropic’s accompanying blog outlines results from SWE-bench Multilingual, where Opus 4.5 leads across seven of eight programming languages. Charts published by the company show performance increases in C, C++, Java, JavaScript/TypeScript, PHP, Ruby, and Rust.
Customer feedback included in the blog reinforces this pattern. Sean Ward, CEO and Co-founder of iGent AI, says: “Claude Opus 4.5 handles long-horizon coding tasks more efficiently than any model we’ve tested.”
Sarah Sachs, AI Lead Engineer at Notion, comments: “We’ve found that Opus 4.5 excels at interpreting what users actually want, producing shareable content on the first try.”
Anthropic states that Opus 4.5 reduces token usage during longer tasks, with reductions of up to 65 percent in some comparisons.
The blog includes examples from agentic benchmarks, including one where Opus 4.5 identified an alternative solution to modify a basic economy booking by first upgrading the cabin and then changing the flights. The benchmark scored the solution as a failure because it did not match the expected approach, but Anthropic cites it as an example of creative problem-solving within policy constraints.
The company also reports increases in safety and robustness. According to Anthropic, Opus 4.5 shows improved resistance to advanced prompt-injection attacks and lower rates of concerning behavior in internal evaluations.
Developer tools updated alongside the model
Anthropic has updated the Claude Developer Platform with effort controls, improved tool use, and expanded support for multi-agent workflows. At medium effort, the company says Opus 4.5 “matches Sonnet 4.5’s best score on SWE-bench Verified, but uses 76% fewer output tokens.” At high effort, it “exceeds Sonnet 4.5 performance by 4.3 percentage points.”
Claude Code has also been updated, with new planning features and support in the desktop app for multiple concurrent sessions. Consumer apps now support longer conversations through automatic summarization, Claude for Chrome is available to all Max users, and Claude for Excel has expanded beta availability.
Albert closed his post by writing: “It’s also our safest model to date, and significantly more efficient than its predecessors. Available now on the API and all major cloud platforms.”