Noti.Group RSS Feed
Wednesday, March 18, 2026
Amazon’s bet that AI benchmarks don’t matter

in Technology

This is an excerpt of Sources by Alex Heath, a newsletter about AI and the tech industry, syndicated just for noti.group subscribers once a week.

Amazon’s AI chief has a message for the model benchmark obsessives: Stop looking at the leaderboards.

“I want real-world utility. None of these benchmarks are real,” Rohit Prasad, Amazon’s SVP of AGI, told me ahead of today’s announcements at AWS re:Invent in Las Vegas. “The only way to do real benchmarking is if everyone conforms to the same training data and the evals are completely held out. That’s not what’s happening. The evals are frankly getting noisy, and they’re not showing the real power of these models.”

It’s a contrarian stance when every other AI lab is quick to boast about how fast its new models climb the leaderboards. It’s also convenient for Amazon, given that the previous version of Nova, its flagship model, was sitting at spot 79 on LMArena when Prasad and I spoke last week. Still, dismissing benchmarks only works if Amazon can offer a different story about what progress looks like.

The centerpiece of today’s re:Invent announcements is Nova Forge, a service that Amazon claims lets companies train custom AI models in ways previously impossible without spending billions of dollars. The problem Forge addresses is real. Most companies trying to customize AI models face three bad options: fine-tune a closed model (but only at the edges), train on open-weight models (but without the original training data and risking capability regression, where the AI becomes an expert on new data but forgets original, broader skills), or build a model from scratch at enormous cost.

Forge offers something else: access to Amazon’s Nova model checkpoints at the pre-training, mid-training, and post-training stages. Companies can inject their proprietary data early in the process, when the model’s “learning capacity is highest,” as Prasad put it, rather than just tweaking model behavior at the end.

“What we have done is democratize AI and frontier model development for your use cases at fractions of what it would cost [before],” Prasad said. Forge was created because Amazon’s internal teams wanted a tool to inject their domain expertise into a base model without having to build from scratch.

“We built Forge because our internal teams wanted Forge,” he said. It’s a familiar Amazon pattern. AWS itself famously began as infrastructure built for Amazon’s own retail operation before becoming the company’s profit engine.

Reddit has been using Forge to build custom safety models trained on 23 years of community moderation data. “I haven’t seen anything like it yet,” Chris Slowe, Reddit’s CTO and first employee, told me. “We’ve had a distinguished engineer who’s just been like a kid in the candy shop.”

Slowe said Reddit ran a continued pre-training job last week that’s “looking really promising.” The goal: Replace multiple bespoke safety models with a single Reddit-expert model that understands the nuances of community moderation, including the notoriously subjective rule that appears across subreddits everywhere: “Don’t be a jerk.”

“Having an expert model, it’s going to understand the community,” Slowe said. “It’s gonna have a pretty good notion of what jerk means.”

Slowe explained that Forge lets Reddit control its own models, avoid surprises from API changes, retain ownership of its weights, and keep sensitive data away from third-party model providers. He said Reddit is already exploring the same approach for Reddit Answers and other products.

When I asked Slowe whether it mattered that Nova isn’t a top-tier model on benchmarks, he was blunt: “In this context, what matters is the Reddit expertness of the model.” That’s the thread Amazon wants developers to pull on: not raw IQ points, but control and specialization.

With Forge, Amazon is making a calculated bet that the model race has commoditized and that it can succeed by being the place where companies can build specialized AI for specific business problems. It’s a very AWS-shaped view of the world: infrastructure over intelligence and customization over raw capability. The strategy also lets Amazon sidestep direct comparisons with OpenAI and Anthropic, both of which it once hoped to compete with at the model layer.

Whether Forge is genuinely pioneering or just clever positioning depends, of course, on developer adoption. Amazon insists that the model race, as it’s widely understood, doesn’t matter. If that ends up being true, the scoreboard shifts to something much quieter and harder to game: whether AI models actually deliver real-world utility.

Alex Heath is the author of Sources and a Verge contributor.

[Notigroup Newsroom in collaboration with other media outlets, with information from the following sources]

Tags: AI, Column, Sources

© 2025 - noti.group - All rights reserved - noti.group runs on 100% green energy.
