Grok 4.2 just hit 60% on the ARC-AGI-2 benchmark. Pretty solid performance there. Looks like we're watching a new state-of-the-art result in AI capabilities. Progress on these standardized benchmarks keeps pushing the boundaries of what these models can handle.
LiquidationHunter
· 7h ago
60%? That's just the beginning, still have to keep pushing forward.
SnapshotLaborer
· 16h ago
60% huh, this number looks pretty good but not that outrageous... Anyway, these benchmarks don't really mean much; what's important is how it performs in actual use.
ForkInTheRoad
· 16h ago
60%? Not as explosive as I imagined... I thought it would break 70.
MEV_Whisperer
· 17h ago
NGL, the ARC benchmark record has been broken again, but does this 60% really mean anything? It feels like these rankings are still worlds apart from actual applications...
NeonCollector
· 17h ago
60%? How much of that benchmark is just fluff... True AGI is still a long way off.