2026-01-01 18:26:36

Grok 4.2 just hit 60% on the ARC-AGI-2 benchmark. That's a solid result, and it looks like we're watching a new state-of-the-art moment in AI capabilities. Progress on standardized benchmarks like this one keeps pushing the boundaries of what these models can handle.


LiquidationHunter

· 7h ago

60%? That's just the beginning; it still has to keep pushing forward.


SnapshotLaborer

· 16h ago

60% huh, this number looks pretty good but not that outrageous... Anyway, these benchmarks don't really mean much; what's important is how it performs in actual use.


ForkInTheRoad

· 16h ago

60%? That's not as explosive as I imagined... I thought it could break 70.


MEV_Whisperer

· 17h ago

NGL, the ARC benchmark has been refreshed again, but does this 60% really mean anything? It feels like these rankings are still worlds apart from actual applications...


NeonCollector