June 6, 2025

Claude 3.5 Sonnet Agentic Coding: How 49% on SWE-bench Rewrote the Rules for AI Developer Tools

A year ago, most developers treated AI coding assistants as glorified autocomplete. Then Anthropic dropped a model that scored 49% on SWE-bench Verified — solving nearly […]