The Arrival of Claude 4: What Surprises Does Anthropic’s New AI Model Bring? A New Peak in Coding and Reasoning!

Anthropic has officially unveiled the next generation of its Claude models: Claude Opus 4 and Claude Sonnet 4! Discover their powerful upgrades in coding, advanced reasoning, and AI agent applications, plus how Claude Code and new API features are empowering developers.

We’ve all felt it—the pace of AI development is dizzyingly fast! And today, Anthropic has delivered another major announcement: the launch of its new Claude models—Claude Opus 4 and Claude Sonnet 4! These are far from minor updates. They’re designed to set new industry standards in coding, advanced logic, and AI agent workflows. Ready to dive in? Let’s explore what makes Claude 4 truly outstanding.

Meet the Claude 4 Duo: Opus 4 and Sonnet 4, Each with Their Own Strengths

Anthropic has released two flagship models at once. Think of them as powerful siblings, each with unique talents, but both equally impressive.

Claude Opus 4: The World’s Leading Coding Expert

First up is Claude Opus 4. Anthropic claims it’s currently the most powerful coding model in the world—and that’s not just hype. It excels at handling long, complex, and detail-intensive tasks, as well as powering AI agent workflows. This model has already received glowing reviews from early adopters:

  • Cursor calls it the most advanced coding model yet, with huge gains in understanding complex codebases.
  • Replit reports significant improvements in accuracy and capability when managing complex, multi-file changes.
  • Block says it’s the first model that improves code quality during its agent (code-named goose)’s edit-debug loop, while maintaining performance and reliability.
  • Even Rakuten validated its strength with a demanding open-source refactoring project—Opus 4 ran independently for 7 hours straight without issues.
  • Cognition noted that Opus 4 excels at solving complex challenges where previous models failed, handling critical operations they had missed.

Sounds like a dream partner for any developer!

Claude Sonnet 4: A Well-Rounded Upgrade, More Accurate and Practical

Next, we have Claude Sonnet 4—a major upgrade from Sonnet 3.7. It also delivers strong coding and reasoning performance, but with an emphasis on accuracy and practical usability. Anthropic says it strikes the ideal balance between powerful capabilities and day-to-day reliability.

While it may not match Opus 4 in the most demanding tasks, Sonnet 4 shines in real-world scenarios. Many companies are already praising it:

  • GitHub sees Sonnet 4 as a top performer in agent use cases and is using it in its new Copilot coding agents.
  • Manus highlighted improvements in following complex instructions, clear reasoning, and generating aesthetically pleasing output.
  • iGent reported excellent performance in building autonomous, multifunctional apps.
  • Sourcegraph praised its focus, deep understanding of problems, and elegant code output for software development.
  • Augment Code noted higher task success rates, more precise code edits, and improved attention to detail—making Sonnet 4 their go-to model.

Whether you’re chasing peak performance with Opus 4 or seeking a balanced, efficient model with Sonnet 4, the Claude 4 lineup has you covered.

Pricing & Platform Support: Powerful, Yet Accessible

The good news? Despite the major upgrades, Claude 4’s pricing remains the same as its predecessors. Opus 4 costs $15 per million input tokens and $75 per million output tokens. Sonnet 4 costs $3 and $15, respectively.

Both models are available via the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Even better, Sonnet 4 is available to free-tier users, making world-class AI more accessible than ever.

Not Just Model Upgrades—A Whole New Level of Capability

In addition to stronger models, Claude 4 comes with a host of exciting new features that meaningfully enhance its power—not just bells and whistles, but real improvements.

Expanded Reasoning with Tool Use (Beta)

Imagine an AI that can reason deeply and also browse the web or use a calculator like we do. Both Claude 4 models now support “Tool Use for Expanded Reasoning” (Beta).

This means Claude can use external tools like web search during its thought process. It can fluidly switch between reasoning and tool usage to deliver more comprehensive, accurate answers. Think of it as giving AI a backup brain and a universal toolbox.

Better Memory and Instruction Following

The models also saw major upgrades in task execution:

  • Parallel tool usage: Claude can use multiple tools simultaneously, boosting efficiency.
  • Improved instruction following: It does exactly what you say—unless you tell it not to.
  • Enhanced memory: Especially when given access to local files, Claude can extract and retain key info to maintain context over time and build implicit knowledge. (Stay tuned for a fun example!)

Claude Code Now Generally Available: A Developer’s Best Coding Partner

Claude Code is now fully available! After a well-received research preview, Anthropic has expanded how developers can collaborate with Claude.

Claude Code now supports GitHub Actions for background tasks and integrates natively with both VS Code and JetBrains IDEs. This means Claude’s suggestions appear directly in your files, making pair programming smoother than ever.

New API Features for Building Powerful AI Agents

To empower developers even further, the Anthropic API now includes four new tools:

  1. Code Execution Tool
  2. MCP Connector
  3. Files API
  4. Prompt Caching (up to one hour)

These unlock even more possibilities for building advanced AI agents.

Deep Dive: How Does Claude 4 Push the Limits?

So how does Claude 4 perform in the real world? Let’s take a look at some hard data.

Top Performance on Software Engineering Benchmarks

In the industry-standard SWE-bench Verified benchmark for software engineering tasks, Claude 4 models lead the pack. According to Anthropic (based on parallel testing conditions):

  • Claude Opus 4 achieved 79.4% (72.5% without parallel conditions)
  • Claude Sonnet 4 achieved 80.2% (72.7% without parallel conditions)

In Terminal-bench, Opus 4 scored 43.2% / 50.0%, proving its power in coding tasks.

Smarter, More Reliable Behavior

Beyond metrics, Claude 4 models behave more maturely:

  • Less shortcutting: Opus 4 reduces attempts to “cheat” or bypass tasks by 65% compared to Sonnet 3.7, making results more grounded and reliable.
  • Amazing memory example: With access to local files while playing Pokémon, Opus 4 created a “Navigation Guide” memory file, recording notes like a real gamer—e.g., “Try the same method no more than 5 times,” or “If stuck, try the opposite.” All self-recorded during gameplay!
  • Thought Summaries: To prevent overly long internal reasoning, Claude 4 can summarize thoughts using a smaller model. This happens only ~5% of the time, and full logs are still available if needed via a new Developer Mode (contact sales).

Claude Code: Taking Developer Tools to the Next Level

Now generally available, Claude Code is embedding Claude more deeply into everyday developer workflows—whether in terminals, IDEs, or background tasks.

Anthropic released new beta extensions for VS Code and JetBrains, enabling seamless Claude integration. Code suggestions appear inline, simplifying reviews and edits—all within your familiar editor.

Even better, there’s now an expandable Claude Code SDK, meaning you can use the same core agents to build custom AI agents and apps.

To showcase its potential, Anthropic also released Claude Code on GitHub (beta). Tag Claude Code in pull requests, and it can respond to feedback, fix CI errors, or revise your code.

Get Started Now: Safe, Reliable, Full of Potential

Anthropic sees Claude 4 as a major step toward building true virtual collaborators—models that can maintain deep context across long-term projects and have lasting impact.

Of course, with great power comes great responsibility. These models have undergone extensive testing and risk mitigation, including steps to meet higher AI safety standards like ASL-3.

Anthropic is excited to see what you’ll build—and your feedback remains critical in helping them improve.

Share on:
Previous: Claude AI Web Search Feature Goes Live for Free Users! Your AI Assistant Just Leveled Up
Next: The Open-Source Rising Star Shaking Up the AI World: BAGEL Multimodal Model Rivals GPT-4o and Gemini 2.0!
DMflow.chat

DMflow.chat

ad

DMflow.chat: Your all-in-one solution for integrated communication. Enjoy multi-platform support, persistent memory, customizable fields, effortless database and form connections, interactive web pages, and API data export—all in one seamless package.

Google Veo 3 Video Model Goes Global! Gemini App Expands Worldwide—But Are Deepfake Fears Justified?
28 May 2025

Google Veo 3 Video Model Goes Global! Gemini App Expands Worldwide—But Are Deepfake Fears Justified?

Google Veo 3 Video Model Goes Global! Gemini App Expands Worldwide—But Are Deepfake Fears Justifi...

Google DeepMind Lyria2 Makes a Stunning Debut: AI Composes Your Musical Fantasies in Real Time with Studio-Quality Fidelity!
28 May 2025

Google DeepMind Lyria2 Makes a Stunning Debut: AI Composes Your Musical Fantasies in Real Time with Studio-Quality Fidelity!

Google DeepMind Lyria2 Makes a Stunning Debut: AI Composes Your Musical Fantasies in Real Time wi...

Google Beam Bursts Onto the Scene: 2D Video Becomes 3D in a Second! Say Goodbye to Awkward Eye Contact — Real-Time Translation Lets You Chat Across the Globe!
28 May 2025

Google Beam Bursts Onto the Scene: 2D Video Becomes 3D in a Second! Say Goodbye to Awkward Eye Contact — Real-Time Translation Lets You Chat Across the Globe!

Google Beam Bursts Onto the Scene: 2D Video Becomes 3D in a Second! Say Goodbye to Awkward Eye Co...

Claude AI Web Search Feature Goes Live for Free Users! Your AI Assistant Just Leveled Up
28 May 2025

Claude AI Web Search Feature Goes Live for Free Users! Your AI Assistant Just Leveled Up

Claude AI Web Search Feature Goes Live for Free Users! Your AI Assistant Just Leveled Up Clau...

Turbulence in the AI World! Why Did Anthropic Refuse to Support Windsurf with Claude 4? A Business Drama Unfolds!
28 May 2025

Turbulence in the AI World! Why Did Anthropic Refuse to Support Windsurf with Claude 4? A Business Drama Unfolds!

Turbulence in the AI World! Why Did Anthropic Refuse to Support Windsurf with Claude 4? A Busines...

7-Day Limited Offer! Windsurf AI Launches Free Unlimited GPT-4.1 Trial — Experience Top-Tier AI Now!
16 April 2025

7-Day Limited Offer! Windsurf AI Launches Free Unlimited GPT-4.1 Trial — Experience Top-Tier AI Now!

7-Day Limited Offer! Windsurf AI Launches Free Unlimited GPT-4.1 Trial — Experience Top-Tier AI N...

Canva 2024 Droptober Surprise Event: Breakthrough AI Tools and 40+ Innovative Features Make a Grand Debut
24 October 2024

Canva 2024 Droptober Surprise Event: Breakthrough AI Tools and 40+ Innovative Features Make a Grand Debut

Canva 2024 Droptober Surprise Event: Breakthrough AI Tools and 40+ Innovative Features Make a Gra...

Canva Prices Surge by 300%! Are AI Design Features Worth the High Cost?
4 September 2024

Canva Prices Surge by 300%! Are AI Design Features Worth the High Cost?

Canva Prices Surge by 300%! Are AI Design Features Worth the High Cost? Canva, the popular desig...

Comprehensive Review of Chatbase 2024: The Best Choice for Building AI Customer Support (What is Chatbase)
9 August 2024

Comprehensive Review of Chatbase 2024: The Best Choice for Building AI Customer Support (What is Chatbase)

Comprehensive Review of Chatbase 2024: The Best Choice for Building AI Customer Support? Chatbas...