Anthropic Introduces Claude Sonnet 4.5: Is a New King of AI Coding Born?
Anthropic has officially released Claude Sonnet 4.5, a new AI model that not only claims to be the world’s best in coding but also makes significant breakthroughs in reasoning, mathematics, and building complex AI agents. This article will delve into its astonishing performance, new developer tools, and how it will impact the competitive landscape of the AI field.
Just as everyone was still hotly debating the pros and cons of various AI models, Anthropic suddenly dropped a bombshell—the official launch of Claude Sonnet 4.5. This is not just a routine update, but a comprehensive leap in capabilities. Anthropic directly claims that this is currently the “world’s most powerful coding model” and the “best model for building complex agents.”
Sounds like a bold claim, right? But in this era where AI technology advances by the day, such declarations are usually backed by solid strength. From software development to daily spreadsheet operations, code is everywhere, and the ability to understand and use these tools to solve complex problems is at the core of modern work. The emergence of Sonnet 4.5 seems to be aimed at making all of this simpler.
More importantly, this release is not just a model, but an upgraded ecosystem, from the new Claude Code features, a powerful API, to the Agent SDK open to all developers. Anthropic is clearly playing a big game.
The Hard Power of Sonnet 4.5: Not Just Talk
To judge the strength of a model, data is the most direct evidence. Anthropic has generously showcased Sonnet 4.5’s amazing performance in several authoritative evaluations, directly challenging all competitors on the market.
Dominating Coding and Computer Operation Evaluations
The most eye-catching is its performance in the SWE-bench Verified evaluation. This test mainly measures the AI’s ability to solve real-world software engineering problems. Sonnet 4.5 achieved an accuracy of 82.0%, not only surpassing its own Opus 4.1 and Sonnet 4, but also significantly leading GPT-5 Codex (74.5%) and Gemini 2.5 Pro (67.2%).
What does this mean? Simply put, when developers are dealing with complex code bug fixes or feature development, Sonnet 4.5 can provide more reliable and accurate assistance.
Not only that, in the OSWorld benchmark test that evaluates AI’s ability to operate a computer to complete tasks, Sonnet 4.5’s score soared from the previous generation’s 42.2% to 61.4%. This means it can operate more smoothly in the browser, fill out forms, and complete tasks across applications, taking another big step towards a truly practical AI assistant.
Synchronous Evolution of Reasoning and Mathematical Abilities
In addition to its strength in coding, Sonnet 4.5 has also made significant progress in reasoning and mathematics.
- High School Math Competition (AIME 2025): In a test requiring Python assistance, it achieved a perfect score of 100%.
- Graduate-level Reasoning (GPQA Diamond): It achieved a high score of 83.4%, demonstrating its strong logical ability to handle complex academic problems.
These data prove that Sonnet 4.5 is no longer just a “specialist student,” but an all-around player with top-notch strength in multiple fields.
Not Just a Model, But a Complete Toolbox
Perhaps the biggest highlight of Anthropic’s release this time is the complete ecosystem built around Sonnet 4.5. They are well aware that having a powerful model is not enough; developers and users must be able to easily apply this power to their actual work.
Revolutionary Upgrade of Claude Code
For developers, Claude Code has welcomed several long-awaited features:
- Checkpoints: This is one of the most requested features. Now you can save your progress at any time during development, and if you accidentally mess things up, you can immediately “roll back” to a previous state. This is like having an infinite “Ctrl+Z” when writing code, greatly reducing the cost of trial and error.
- Native VS Code Extension: No more switching back and forth between the web and the editor. You can enjoy the powerful capabilities of Sonnet 4.5 directly in your most familiar VS Code environment.
- New Terminal Interface and Context Editing: Makes the interactive experience smoother and the operation more intuitive.
Killer App: Claude Agent SDK Opened
This may be the most exciting part of this update. Anthropic has officially opened the underlying infrastructure that has been driving Claude Code for the past six months—the Claude Agent SDK—to all developers.
This means that you can not only use Claude, but also use the tools that built Claude to create your own AI agents. Whether you need to handle complex tasks that take hours, or coordinate multiple sub-agents to complete a goal, this SDK provides a solid foundation. Anthropic is essentially laying out its “secret martial arts manual” for the entire community to create more possibilities on top of it.
A Safer, More Reliable AI Partner
While pursuing ultimate performance, Anthropic has not forgotten the “AI safety” they have always emphasized. The official claim is that Sonnet 4.5 is their “most aligned frontier model” to date.
This sounds a bit abstract, but it actually means that the model has significantly improved in behavior. It has reduced undesirable tendencies such as sycophancy, deception, or power-seeking, while also greatly enhancing its ability to resist “prompt injection attacks”—one of the most serious risks facing AI applications today.
Sonnet 4.5 is released under the AI Safety Level 3 (ASL-3) framework and is equipped with a more precise classifier to detect potentially dangerous content related to chemical, biological, radiological, and nuclear (CBRN) materials, while reducing the false positive rate by a factor of ten to ensure that normal conversations are not disturbed.
How to Get Started? Price and First Impressions
After all this, here comes the question everyone is most concerned about: how to use it? Is it expensive?
The good news is that Claude Sonnet 4.5 is now fully available. Developers can directly call claude-sonnet-4-5
through the API.
As for the price, Anthropic has adopted a rather friendly strategy. The pricing of Sonnet 4.5 is consistent with the previous generation Sonnet 4: $3 per million input tokens and $15 per million output tokens. This price is much lower than the top-tier model Claude Opus ($15/$75), and even has a certain competitiveness against GPT-5-Codex ($1.25/$10). Considering its performance leadership, this pricing strategy seems very sincere.
Developers who have had the privilege of trying it out in advance say that the experience of Sonnet 4.5 in coding is even better than the recently released GPT-5-Codex. Of course, the throne in the AI field is always rotating. It is rumored that Gemini 3 is also about to be released, so how long Sonnet 4.5 can maintain its lead is still an unknown.
A Glimpse into the Future: Real-time Software Generation with “Imagine with Claude”
Finally, Anthropic also brought an interesting Easter egg—a limited-time research preview called “Imagine with Claude”.
This is an experimental new feature where Claude can generate software in real time as you interact with it, without any preset functions or pre-written code. This feature is currently only open to Max subscribers for five days, demonstrating the amazing possibilities that can be created when a top-tier model is combined with the right infrastructure.
Conclusion
The release of Claude Sonnet 4.5 has undoubtedly injected new vitality into the AI field. It not only sets a new benchmark in coding and reasoning capabilities, but also empowers developers with unprecedented creativity by opening up the Agent SDK. Anthropic seems to have found an excellent balance between performance, price, and safety.
Next, it’s up to the market and the developer community to respond. But one thing is for sure, the arms race in the AI field is becoming more and more exciting.