November 25, 2025, might mark a significant milestone in the history of artificial intelligence development. Anthropic unexpectedly released its killer app, Claude Opus 4.5, which is pitched not merely on raw computational power but on redefining “how AI should work.” Meanwhile, Google and OpenAI were also busy, each making new moves in their respective domains. The AI competition has clearly shifted from a pure battle of raw power to a contest of mental agility and practicality.
Claude Opus 4.5: It’s No Longer Just Fast, It “Gets” You Better
The long-awaited Claude Opus 4.5 has officially launched. If previous models were like talented but occasionally reckless college graduates, Opus 4.5 is more like a seasoned professional. According to Anthropic’s internal testers, the model’s biggest feature is that it “really gets it.”
In the past, when assigning complex tasks to AI, especially programming or debugging, we often had to guide it step by step, like a nanny, telling it where to look and what to pay attention to. But when faced with ambiguous instructions or a choice between multiple options, Opus 4.5 demonstrates surprising autonomous judgment. For instance, when dealing with the kind of bugs that span multiple interacting systems and give engineers headaches, it can work out the repair path on its own without much human intervention.
Speed or Quality? Now You Can Choose
The most eye-catching feature in this update is the new “Effort Control.”
This is like when you delegate tasks to a colleague. Sometimes, you just need a “good enough” quick answer; other times, you need them to spend days considering all possible extreme scenarios. With Effort Control, developers can decide whether Claude should respond quickly with “intuition” or enter a “deep thought” mode.
At its highest effort setting, Opus 4.5’s score on the SWE-bench Verified software engineering benchmark even surpassed Sonnet 4.5, and, crucially, it consumed roughly half as many tokens. This means the model has learned a smarter thinking path instead of blind trial and error. Priced at $5 per million input tokens and $25 per million output tokens, it significantly reduces the cost pressure for enterprises adopting high-end AI.
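As a rough illustration of how a developer might set this, here is a minimal sketch using the Anthropic Python SDK. The `effort` field name and its values follow the announcement but are not verified here, so treat them as assumptions and check the current API reference:

```python
# Minimal sketch of Effort Control via the Anthropic Python SDK
# (pip install anthropic). The "effort" field and its values are
# assumptions based on the announcement -- verify against the API docs.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-5",            # alias from the launch; confirm the exact ID
    max_tokens=1024,
    extra_body={"effort": "medium"},    # assumed values: "low" | "medium" | "high"
    messages=[{"role": "user", "content": "Triage this failing test: ..."}],
)
print(response.content[0].text)
```

Lower effort trades some depth for latency and token count; the highest setting is what the SWE-bench numbers above refer to.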
Price Comparison
All prices are per million tokens:
- Input: $5 USD per million tokens
- Output: $25 USD per million tokens
That is one third the price of the previous Opus ($15 / $75 per million tokens), giving it a real competitive edge against other models.
Competitor Price Overview (Per Million Tokens)
| Model Series | Input Price (USD) | Output Price (USD) | Notes |
|---|---|---|---|
| Opus 4.5 (New) | 5 | 25 | |
| Opus (Old Version) | 15 | 75 | |
| GPT-5.1 Series | 1.25 | 10 | |
| Gemini 3 Pro | 2 | 12 | $4 / $18 for prompts over 200k tokens |
| Sonnet 4.5 | 3 | 15 | |
| Haiku 4.5 | 1 | 5 | |
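To make the table concrete, here is a quick back-of-the-envelope calculator using the list prices above (ignoring the batch and caching discounts vendors typically layer on top):

```python
# Cost check at list price (USD per million tokens), using the table above.
PRICES = {
    "opus-4.5":   (5.00, 25.00),
    "opus-old":   (15.00, 75.00),
    "sonnet-4.5": (3.00, 15.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request, before any caching or batch discounts."""
    in_rate, out_rate = PRICES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Example: a 40k-token codebase prompt with a 3k-token answer.
print(f"Opus 4.5: ${request_cost('opus-4.5', 40_000, 3_000):.3f}")  # $0.275
print(f"Old Opus: ${request_cost('opus-old', 40_000, 3_000):.3f}")  # $0.825
```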
A Blessing for Developers: Teaching AI to Use the Toolbox
As models become smarter, enabling them to proficiently use external tools (like GitHub, Jira, Slack) has become a new challenge. Anthropic has simultaneously released advanced tool features for the Claude developer platform, addressing the long-standing “context explosion” problem that plagued engineers.
Here are three key technological breakthroughs:
1. Tool Search Tool: Don’t Carry the Entire Hardware Store on Your Back
Previously, to let AI use various APIs, developers had to feed thousands of tool definitions to the model all at once. This is like a plumber going out to fix a faucet while carrying every part from the entire hardware store on their back, exhausted before the work even starts (that is, the token budget is blown before the real task begins).
The new “Tool Search Tool” allows AI to “find” suitable tools only when needed. Claude will first analyze the task, then actively search “What tools do I have available?”, and then load only the relevant ones. Test data shows that this mechanism can save up to 85% of token usage.
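In code, the pattern might look like the sketch below. This is hypothetical: the `tool_search_tool` type string and the `defer_loading` flag are assumptions drawn from the announcement rather than verified API identifiers, and the Jira tool is invented for illustration; consult Anthropic’s tool-use docs for the real names.

```python
# Hypothetical sketch of the Tool Search Tool pattern. Field names marked
# "assumed" are not verified against Anthropic's API reference.
import anthropic

client = anthropic.Anthropic()

jira_tool = {
    "name": "jira_create_issue",  # invented tool, for illustration only
    "description": "Create a Jira issue in a given project.",
    "input_schema": {
        "type": "object",
        "properties": {
            "project": {"type": "string"},
            "summary": {"type": "string"},
        },
        "required": ["project", "summary"],
    },
    "defer_loading": True,  # assumed flag: schema stays out of context until discovered
}

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=2048,
    tools=[
        {"type": "tool_search_tool", "name": "tool_search"},  # assumed type string
        jira_tool,
        # ...hundreds more deferred tools could sit here at near-zero context cost
    ],
    messages=[{"role": "user", "content": "File a Jira ticket for the login bug."}],
)
```

The key idea: only the search tool’s small definition is always in context; everything else is discovered and loaded on demand.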
2. Programmatic Tool Calling: Replacing Tedious Chitchat with Code
Traditional AI tool calling played out like a tedious walkie-talkie exchange. AI: “Help me look up data A.” System: “Okay, here’s A.” AI: “Now help me look up data B.” System: “Okay, here’s B.”
This back-and-forth conversation was not only slow but also generated a lot of intermediate junk information. The new “Programmatic Tool Calling” allows Claude to directly write a piece of Python code to coordinate these tasks. It can run loops and make decisions independently within a sandbox environment, finally returning only the “final result.” This not only significantly reduces latency but also dramatically improves accuracy due to clear logic.
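For intuition, here is an illustrative (not vendor-provided) example of the kind of script Claude might emit in this mode; `fetch_record` is a hypothetical stand-in for whatever tool the sandbox actually exposes:

```python
# Illustrative orchestration script of the sort Claude can write under
# Programmatic Tool Calling. `fetch_record` is a hypothetical stand-in
# for a real tool invocation executed by the sandbox runtime.
def fetch_record(record_id: str) -> dict:
    # Stub: in the real sandbox this would be an actual tool call.
    return {"id": record_id,
            "spend": 120 if record_id in ("B", "D") else 80,
            "budget": 100}

# The model loops and branches in code, so none of the intermediate
# payloads ever re-enter the chat context.
over_budget = []
for rid in ["A", "B", "C", "D"]:
    record = fetch_record(rid)
    if record["spend"] > record["budget"]:
        over_budget.append(rid)

# Only this final, compact result is returned to the conversation.
print({"over_budget": over_budget})  # -> {'over_budget': ['B', 'D']}
```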
3. Tool Use Examples: Examples Speak Louder Than a Thousand Words
Sometimes even the most detailed API documentation is less clear than one concrete example. Developers can now embed “correct examples” directly in tool definitions, which is particularly effective for APIs with finicky formatting requirements.
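A tool definition with an embedded example might look like the sketch below. The `input_examples` field name follows Anthropic’s announcement but should be verified against current docs, and the calendar tool itself is invented for illustration:

```python
# Sketch of a tool definition with an embedded worked example.
# "input_examples" is the field name from the announcement -- verify
# before relying on it. The tool itself is hypothetical.
create_event = {
    "name": "create_calendar_event",
    "description": "Create a calendar event.",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "start": {"type": "string", "description": "ISO 8601 timestamp"},
            "attendees": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["title", "start"],
    },
    # One concrete example pins down the finicky parts (timestamp format,
    # attendee shape) better than prose descriptions alone.
    "input_examples": [
        {"title": "Sprint review",
         "start": "2025-12-01T15:00:00+08:00",
         "attendees": ["dana@example.com", "lee@example.com"]}
    ],
}
```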
Google and OpenAI’s Counterattack
Naturally, the competitors are not sitting idle. From the Google camp comes good news: Gemini 3 access in the Gemini CLI is now fully open. All paid-plan users (including Google AI Pro) can use the latest model directly from the command line, and usage quotas for individual subscribers have been raised significantly. Google is clearly accelerating its hardware rollout, trying to win the developer market with more abundant computing power.
OpenAI, on the other hand, chose to innovate in user experience. Remember the pain of countless browser tabs open every time you wanted to buy something online? The newly launched Shopping Research feature turns ChatGPT into your personal shopping consultant. It doesn’t just list links; it conducts interactive research, helping you organize specifications and compare prices, making “impulse buying” decisions more rational (or more impulsive?).
Furthermore, for the video generation model Sora, OpenAI has also released Sora Styles functionality. Creators can now specify whether the video should be in a “retro style,” “anime style,” or “news report style,” transforming AI video generation from a simple “blind box” draw into a more controllable creative tool.
New Toys for Academia and the Open Source Community
AI’s influence continues to deepen in academia. Renowned AI scholar Andrew Ng released a paper-review tool called Agentic Reviewer. The impetus came from watching a student’s paper get rejected six times over three years, with each review cycle taking half a year. The AI agent simulates the peer-review process, and tests show its reviews correlate strongly with those of human reviewers. This might finally ease a long-standing efficiency pain point in academia.
Meanwhile, over the past two days a mysterious model named Bert-Nebulon Alpha has appeared on the model-routing platform OpenRouter. It boasts an impressive 256k context length and is currently labeled a cloaked (stealth-test) model. Although the name sounds like a sci-fi character, community sleuthing suggests the underlying architecture might come from Mistral (based on how the model answers when asked who it is, even though cloaked models rarely admit an identity), while other guesses point to GLM (despite OpenRouter not having hosted Chinese cloaked models before). It appears specifically optimized for long-text comprehension.
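Because OpenRouter exposes an OpenAI-compatible endpoint, trying the model takes a few lines. The model slug below is an assumption based on the listing name at the time of writing; cloaked slugs can change or vanish once the test ends, so check openrouter.ai/models first:

```python
# Probing the cloaked model via OpenRouter's OpenAI-compatible API
# (pip install openai). The slug is assumed from the listing name.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

resp = client.chat.completions.create(
    model="openrouter/bert-nebulon-alpha",  # assumed slug; verify on the site
    messages=[{"role": "user",
               "content": "Summarize the key plot turns in this novel: ..."}],
)
print(resp.choices[0].message.content)
```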
Frequently Asked Questions (FAQ)
Q: What is the biggest difference between Claude Opus 4.5 and previous versions?
A: The core difference lies in “intelligence” and “flexibility.” Opus 4.5 acts more like a human expert in handling complex logic, coding, and agent tasks, and is capable of self-correction. It also introduces “Effort Control,” letting you choose between “quick response” and “deep deliberation,” which is highly practical for business applications.

Q: When should I use the Tool Search Tool?
A: Definitely use it when your AI application needs to integrate dozens or even hundreds of tools (APIs). Cramming every tool definition into the prompt is not only expensive but also degrades the model’s performance. The Tool Search Tool lets the model fetch tool definitions only when needed, saving a significant number of tokens and improving accuracy.

Q: Can free users access Google Gemini 3?
A: Currently, CLI access is primarily open to paid-plan users. If you are a free user, you may have to wait a bit longer, or consider upgrading to a Google AI Pro plan for early access.

Q: What’s the difference between OpenAI’s Shopping Research and Google Search?
A: Google Search hands you a list of links that you must click through and digest yourself. Shopping Research “reads” that information for you, then organizes it into tables or recommendations. It is more like a shopping assistant that does the homework for you than a librarian.

Q: Is the mysterious Bert-Nebulon Alpha model worth trying?
A: If you have extremely long texts to analyze (an entire novel, hundreds of pages of financial reports), it is definitely worth a try. It offers a 256k context window and currently appears to be open on OpenRouter for feedback collection, making it a good testing opportunity for developers.