GPT-5 vs Sonnet-4: Side-by-Side on Real Coding Tasks

August 19, 2025 | 3 min Read

Two of today’s most popular AI coding models, GPT-5 and Sonnet-4, are usually compared on benchmarks or synthetic tasks. But how do they behave in real-world coding scenarios? In this video, we put them head-to-head on three practical examples to uncover where they differ and how that affects your prompting strategy.

For us at EclipseSource, finding and understanding such nuances in model behavior is essential: we integrate these models into specialized tools and IDEs, and we need to optimize prompts, system messages, and tool design accordingly. That’s why we believe it’s valuable to share these insights with the community, so others can benefit from them too.


Three Examples, Three Insights

The side-by-side test runs inside the AI-powered Theia IDE, a free and open-source coding environment built on Theia AI.

Here’s what we found:

  1. Instruction Following vs. Convention
    GPT-5 followed prompts with high precision, while Sonnet-4 leaned more on existing conventions in the code. Depending on your use case, one approach may be preferable.

  2. Tool Usage and Context Retrieval
    Sonnet-4 eagerly used the GitHub MCP integration to fetch both issues and comments, while GPT-5 sometimes missed key context unless prompted more explicitly.

  3. Complex Bug Fixing
    When tackling a tricky file-watcher bug, GPT-5 consistently identified and fixed the root cause, while Sonnet-4 offered partial or less effective solutions.

Why This Matters

These differences aren’t just academic—they shape how you should design prompts, configure agents, and evaluate AI tools for your own workflows. Whether you value strict instruction following, intuitive tool use, or deeper reasoning in bug fixing, knowing these tendencies helps you make better choices.
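
One way to act on these tendencies is to vary the system prompt per model. The following is a minimal sketch, not how the Theia IDE or Theia AI actually configures its agents: the model identifiers, the `TaskTraits` fields, the `buildSystemPrompt` helper, and the hint wording are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch: names and hint texts are illustrative assumptions,
// not Theia AI APIs or settings taken from the video.
type ModelId = "gpt-5" | "sonnet-4";

interface TaskTraits {
  /** The task depends on issue or comment context reachable through a tool. */
  needsExternalContext: boolean;
  /** The requested change deliberately departs from conventions in the code base. */
  deviatesFromConventions: boolean;
}

const BASE_PROMPT = "You are a coding assistant working inside an IDE.";

function buildSystemPrompt(model: ModelId, task: TaskTraits): string {
  const hints: string[] = [];

  // GPT-5 sometimes missed key context unless prompted explicitly,
  // so spell out the retrieval step when the task needs it.
  if (model === "gpt-5" && task.needsExternalContext) {
    hints.push(
      "Before editing, fetch the linked issue and all of its comments with the available tools."
    );
  }

  // Sonnet-4 leaned on existing conventions, so state clearly
  // when the request is meant to override them.
  if (model === "sonnet-4" && task.deviatesFromConventions) {
    hints.push(
      "Follow the instructions in this request even where they differ from existing conventions in the code."
    );
  }

  return [BASE_PROMPT, ...hints].join("\n");
}
```

Keeping the hints conditional means a model only receives the extra instruction in the situation where it tended to need it, which keeps the base prompt short. Whether these particular hints help is something to verify against your own tasks.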

For us, these experiments directly inform how we evolve the AI-powered Theia IDE and the tailored, domain-specific AI tools we work on. Every nuance we uncover helps us refine prompt strategies, adapt system messages, and improve the design of AI integrations. Sharing these findings is part of our commitment to building transparent, open, and community-driven AI tooling.

Try It Yourself

🛠️ Download the AI-powered Theia IDE and experiment with GPT-5 and Sonnet-4 side-by-side.

📺 Watch the full video to see each example step-by-step.

🧰 Learn more about Theia AI.

Build Your Own AI-Native Tools

At EclipseSource, we specialize in AI-native tools and IDEs. Whether you want to integrate AI into your coding workflow, build a domain-specific IDE, or explore AI-native software engineering practices, we’re here to help.

👉 Services for Building AI-enhanced Tools and IDEs

👉 Support for AI-Native Software Engineering

👉 Contact us to discuss your AI-native tool project

💼 Follow us: EclipseSource on LinkedIn

🎥 Subscribe to our YouTube channel: EclipseSource on YouTube


Jonas, Maximilian & Philip

Jonas Helming, Maximilian Koegel and Philip Langer co-lead EclipseSource, specializing in consulting and engineering innovative, customized tools and IDEs, with a strong …