Published by Sean Kim on July 29, 2025
Categories: Programming & Development

Modal Labs GPU Cloud: Serverless AI Inference at 50% Lower Cost Than AWS

Here’s a number that should make every ML engineer reconsider their cloud bill: 100 inference requests, 2 seconds each, $0.06 total. That same workload on AWS? […]
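
To put the headline figure in per-unit terms, here is a minimal sketch of the arithmetic. It assumes the $0.06 covers only the 200 seconds of compute (100 requests at 2 seconds each), with no idle time, cold starts, or minimum charges; the excerpt does not state the billing model, so treat that as an assumption. The function name `implied_rate` is hypothetical and used only for illustration.

```python
def implied_rate(total_cost_usd: float, requests: int,
                 seconds_per_request: float) -> tuple[float, float]:
    """Return (USD per compute-second, USD per compute-hour) implied by a bill.

    Assumes the whole bill is for active compute time only (an assumption,
    not something the excerpt confirms).
    """
    billed_seconds = requests * seconds_per_request
    per_second = total_cost_usd / billed_seconds
    return per_second, per_second * 3600


# Figures quoted in the excerpt: 100 requests, 2 seconds each, $0.06 total.
per_second, per_hour = implied_rate(0.06, 100, 2.0)
print(f"{per_second:.4f} USD/s, {per_hour:.2f} USD/h")
# prints: 0.0003 USD/s, 1.08 USD/h
```

Under that assumption, the quoted workload works out to roughly $0.0003 per compute-second, or about $1.08 per hour of active compute.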