claude-sonnet-4-5 - Sean Kim

Published by Sean Kim on September 30, 2025

Claude Sonnet 4.5 Release: 77.2% SWE-bench Score and 30-Hour Autonomous Agents — What Changed

Anthropic just dropped Claude Sonnet 4.5, and the numbers speak for themselves: 77.2% on SWE-bench Verified, 61.4% on OSWorld, and agents that can stay focused for […]

Published by Sean Kim on September 2, 2025

Claude Sonnet 4.5 Benchmark Deep Dive: 77.2% SWE-bench Crushes GPT-5 and Gemini

77.2% on SWE-bench Verified. That single number just rewrote the rules of the AI coding model market. Anthropic’s Claude Sonnet 4.5 benchmark results don’t just represent […]

Published by Sean Kim on September 1, 2025

Claude Sonnet 4.5 Release: 77.2% SWE-bench Score, 30-Hour Autonomous Coding, and Why Developers Are Switching

Anthropic just mass-deployed its most dangerous weapon in the AI coding wars — and it costs exactly the same as the model it replaces. Claude Sonnet […]