Karpathy: Perceptions of AI capability are sharply split; the free tier and cutting-edge agents are “completely different products”

ChainNewsAbmedia

Andrej Karpathy, former director of AI at Tesla and a founding member of OpenAI, published a long post on X on April 9 arguing that the public’s understanding of AI capabilities is splitting sharply. In his view, people using the free version of ChatGPT and technical professionals who use cutting-edge agent tools such as Codex and Claude Code every day are in effect discussing “completely different products,” yet each side believes it is seeing the full picture of AI.

Two worlds, two types of AI understanding

Karpathy divides today’s AI users into two groups.

The first group tried the free version of ChatGPT at some point last year and formed their overall impression of AI from it. What they see are the model’s failures: hallucinations, absurd search results, and a voice mode that fumbles even simple questions such as whether to drive or walk to a car wash. Karpathy admits these problems are real, but stresses that the free tier and outdated models do not represent the actual capabilities of the cutting-edge agent models of 2026.

The second group meets two conditions at once: they pay for the latest cutting-edge agent models (such as OpenAI Codex or Claude Code), and they use them professionally in technical fields like software development, mathematics, and research. Karpathy says this group is experiencing a high level of “AI psychosis,” because the recent progress of these models in technical areas can only be described as astonishing. Users can watch the models solve, in an hour, software architecture problems that previously would have taken days or even weeks.

Why progress is concentrated in technical fields

Karpathy explains why improvements in AI capabilities are particularly noticeable in technical fields like software development, but less so in general uses such as search, writing, and making recommendations.

He gives two reasons. First, technical fields provide a verifiable reward function (for example, whether unit tests pass), which lets reinforcement learning work effectively; writing quality, by contrast, is hard to score objectively. Second, technical fields carry greater commercial value in B2B scenarios, so AI companies devote the largest share of their team resources to them.
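
To make the first reason concrete, here is a minimal Python sketch of what a "verifiable reward" could look like. It is an illustration only, not Karpathy's code or any lab's actual training setup: the function name `verifiable_reward`, the two candidate functions, and the `ADD_TESTS` cases are all hypothetical. The point is that a program can be scored automatically by counting the tests it passes, whereas an essay has no comparable objective check.

```python
from typing import Callable, Sequence, Tuple

# One test case: (arguments to pass, expected return value).
TestCase = Tuple[tuple, object]


def verifiable_reward(candidate: Callable, tests: Sequence[TestCase]) -> float:
    """Return the fraction of test cases the candidate passes (0.0 to 1.0)."""
    if not tests:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            # A crash counts as a failed test rather than stopping the scoring.
            pass
    return passed / len(tests)


# Hypothetical model outputs for the task "add two integers".
def correct_add(a, b):
    return a + b


def buggy_add(a, b):
    return a - b  # wrong operator; only passes when b == 0


# Hypothetical unit tests standing in for a real test suite.
ADD_TESTS = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0), ((5, 5), 10)]

if __name__ == "__main__":
    print(verifiable_reward(correct_add, ADD_TESTS))  # 1.0
    print(verifiable_reward(buggy_add, ADD_TESTS))    # 0.25
```

A score like this can be computed for any number of candidate solutions without a human in the loop, which is what makes it usable as a training signal.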

The two groups can’t understand what the other is saying

Karpathy concludes that the two groups are “talking past each other.” Both things are true at once: OpenAI’s free voice mode botches everyday questions, while its top-tier paid Codex can restructure an entire codebase or find system vulnerabilities within an hour.

In a follow-up reply, he added an observation someone had shared with him: the OpenClaw incident drew so much public attention precisely because it gave a large number of non-technical people their first exposure to the latest agent models. Until then, these people had equated AI with the ChatGPT website.

This article, “Karpathy: AI capability perception has a major gap; the free tier and the cutting-edge agent are ‘completely different products,’” first appeared on Chain News ABMedia.
