February 2, 2026

Claude Opus 4.6: 7 Groundbreaking Features That Make It Anthropic’s Most Powerful AI Model

Finally — the model we’ve been waiting for. On February 5, 2026, Anthropic dropped Claude Opus 4.6, and after spending the past few days pushing it […]
January 9, 2026

OpenAI o3 One Year Later: How the 87.5% ARC-AGI Score Rewrote the Rules of AI Reasoning

On December 20, 2024, OpenAI o3 scored 87.5% on the ARC-AGI benchmark. The previous best? 55.5%. That wasn't an incremental improvement; it was a step change. Now, in […]
September 2, 2025

Claude Sonnet 4.5 Benchmark Deep Dive: 77.2% SWE-bench Crushes GPT-5 and Gemini

77.2% on SWE-bench Verified. That single number just rewrote the rules of the AI coding model market. Anthropic’s Claude Sonnet 4.5 benchmark results don’t just represent […]
May 29, 2025

MLCommons AILuminate AI Safety Benchmark: The First Industry Standard Grading AI Models Across 12 Hazard Categories

Your AI chatbot just got a safety report card — and some models barely passed. The MLCommons AILuminate AI safety benchmark v1.0 has tested major language […]