Behind the Scenes of the DeepSeek V4 Launch: Silicon Valley Is "Building Walls," China Is "Laying Roads"
By Alter
On the morning of April 24th, the long-delayed DeepSeek V4 finally showed itself.
That day, DeepSeek-V4-Pro topped the Hugging Face open-source model leaderboard, with two “nuclear-level innovations” being widely discussed:
The first is a context window of several million tokens whose KV cache is only 10% the size of V3.2's, which Amazon engineers praised as a way to ease the HBM shortage (a rough back-of-envelope sketch of why this matters follows the list);
The second is adaptation to domestic chips: the model was developed in close collaboration with Huawei and was quickly brought up on domestic chips such as Ascend and Cambricon.
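To see why a 90% smaller KV cache matters at multi-million-token contexts, here is a rough back-of-envelope estimator. The layer, head, and precision numbers below are hypothetical placeholders chosen only to illustrate the scale, not DeepSeek's actual configuration:

```python
def kv_cache_gib(seq_len, n_layers, n_kv_heads, d_head, bytes_per_value=2):
    """Naive full-attention KV cache for one sequence:
    2 (K and V) x layers x heads x head_dim x tokens x bytes."""
    return 2 * n_layers * n_kv_heads * d_head * seq_len * bytes_per_value / 2**30

# Hypothetical frontier-scale shapes, purely for illustration:
full = kv_cache_gib(seq_len=1_000_000, n_layers=60, n_kv_heads=128, d_head=128)
print(f"naive cache at 1M tokens: ~{full:,.0f} GiB")       # several terabytes of HBM
print(f"at 10% of that:          ~{0.1 * full:,.0f} GiB")  # the kind of cut claimed for V4 vs V3.2
```

Even with this crude arithmetic it is clear that, at million-token contexts, the KV cache rather than the weights is what fills HBM, which is why a 10x reduction reads as a hardware story as much as a model story.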
Coincidentally, the second-ranked model on the Hugging Face open-source leaderboard is Kimi K2.6, released and open-sourced late on April 20th.
Across the Pacific, a "clash" between two trillion-parameter models would inevitably spill over into fights about valuation and business turf. Domestically, a completely different scene unfolded: no exposés, no behind-the-scenes PR battles, even as the technical top spot changed hands.
Behind the “unusual” scene lies a divergence in AI technology routes between China and the US: Silicon Valley is frantically “building high walls,” trying to protect vested interests through closed-source models; domestic large model vendors are choosing to “tear down walls,” evolving collaboratively on open-source soil.
01 Silicon Valley caught in a "game of power"
Unlike the diverse open-source routes of domestic large models, Silicon Valley AI giants like OpenAI, Anthropic, and Google Gemini are all staunch supporters of closed-source development.
Cutting-edge technological innovations are locked within their data centers. Facing heavy compute costs and market expectations, the “Silicon Valley spirit” of openness and collaboration is gradually fading, and players are inevitably caught in a zero-sum “power game.”
Over the past two years, the "secret war" has broken into open conflict. The most typical tactic is "stealing the spotlight": whenever a competitor releases a new product at a critical moment, rushing out a major update to drown them out has become routine in Silicon Valley.
As early as May 2024, OpenAI and Google released new AI products almost back to back: one claimed GPT-4o led the world, the other said the Gemini family covered the whole ecosystem and every pathway. Neither CEO could sit still, and the two openly mocked each other on social media.
The tussle is not only with Google; the competition between OpenAI and Anthropic has also heated up. On April 16th, Anthropic released its new model Claude Opus 4.7, and just over two hours later OpenAI announced a major update to Codex under the slogan "Codex for (almost) everything." It was obvious to everyone that the timing was no coincidence but a carefully planned ambush aimed at Anthropic.
Beyond the public war of words, digging up each other's dirt and "fighting with numbers" have also become commonplace in Silicon Valley.
On April 7th, Anthropic announced its annual revenue reached $30 billion, surpassing OpenAI’s $25 billion.
A week later, OpenAI's Chief Revenue Officer bluntly told all employees in an internal letter that Anthropic's claimed $30 billion in annual revenue was seriously inflated: it was booked on a gross basis, counting the cut paid out to cloud providers such as Amazon and Google, which overstated the figure by roughly $8 billion.
Such pointed discrediting is rare in the tech industry; its main purpose was to tell investors that Anthropic's growth story is padded.
Once hostility breeds, it can infiltrate every decision.
After Anthropic refused to delete certain safety clauses from its contracts and “broke” with the Pentagon, OpenAI quickly announced a cooperation with the U.S. Department of Defense just hours later.
During the 2026 "Super Bowl," Anthropic spent heavily on an ad declaring that "ads are coming to AI, but they will not come to Claude," a direct jab at OpenAI, which had just begun testing its ad feature.
Why did former “fellow apprentices” turn into sworn enemies?
The root lies in the logic of the closed-source business model: closed-source survival depends on building moats, and building moats requires blocking the diffusion of technology and monopolizing the most advanced productive capability. Add incompatible technical routes and opposing product narratives, and a Nash equilibrium naturally forms: whoever "ceases fire" first watches their brand narrative collapse, so everyone sinks deeper into the infighting.
02 Open-source camp’s “collaborative evolution”
Turning back to China, the script’s trajectory is entirely different.
More than a year ago, the arrival of DeepSeek-R1 abruptly stalled the headlong rush of large-model startups, with the "Six Little Tigers" of large models bearing the brunt. Yet unlike Silicon Valley's biggest players, DeepSeek did not behave like a shark devouring every fish in the pond; like a catfish, it stirred the whole Chinese large-model ecosystem into motion, and many players embraced open source.
A direct example is how closely the growth trajectories of DeepSeek and Moonshot AI (Yue Zhi An Mian) mirror each other: both are startups founded in 2023, both keep small teams with high talent density, and both are firm believers in the Scaling Law.
In July 2025, Moonshot AI released Kimi K2, the world's first trillion-parameter open-source model, openly stating in its technical report that it adopted DeepSeek's open-source MLA architecture. For large models, the biggest nightmare in handling ultra-long text is the memory wall, and MLA's disruptive innovation is that it cleverly compresses the KV cache by more than 93%.
With this "industry standard" contributed by DeepSeek, large-model teams like Moonshot AI can avoid reinventing the wheel and cut inference costs substantially.
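The core trick behind that compression is easy to sketch: instead of caching full per-head keys and values for every token, cache one low-rank latent per token and re-expand it when attention is computed. The dimensions below are made-up illustrative values, not DeepSeek's or Moonshot AI's real configuration, and the sketch omits details such as the decoupled RoPE path:

```python
import torch

# Toy dimensions, chosen only for illustration.
d_model, n_heads, d_head, d_latent = 4096, 32, 128, 512

# Standard attention caches 2 * n_heads * d_head = 8192 values per token per layer.
# An MLA-style cache stores only the d_latent = 512 latent values (~94% smaller here).
W_down = torch.randn(d_model, d_latent) * 0.02            # shared down-projection
W_up_k = torch.randn(d_latent, n_heads * d_head) * 0.02   # key up-projection
W_up_v = torch.randn(d_latent, n_heads * d_head) * 0.02   # value up-projection

def compress_for_cache(x):
    """What actually goes into the KV cache: one low-rank latent per token."""
    return x @ W_down                                      # (tokens, d_latent)

def expand_from_cache(latents):
    """Reconstruct full multi-head keys/values only when attention is computed."""
    k = (latents @ W_up_k).view(-1, n_heads, d_head)
    v = (latents @ W_up_v).view(-1, n_heads, d_head)
    return k, v

x = torch.randn(16, d_model)            # 16 already-processed tokens
latents = compress_for_cache(x)         # cached: (16, 512) instead of (16, 8192)
k, v = expand_from_cache(latents)
print(latents.shape, k.shape, v.shape)
```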
The story did not end there.
DeepSeek V4's technical documentation describes the model architecture in detail; a key upgrade is that the optimizer for most modules was switched from AdamW to Muon, giving faster convergence and better training stability.
Kimi K2.6’s technical documentation also mentions the Muon optimizer, which achieved a 2x efficiency boost under the same training conditions.
Both models rely on the Muon optimizer, first proposed by independent researcher Keller Jordan in a blog post at the end of 2024. The Moonshot AI team, long troubled by AdamW, made critical engineering improvements to Muon in early 2025, adding features such as weight decay and update-RMS control, and named its variant MuonClip.
Moonshot AI was the first to verify Muon's stability at scale on Kimi K2, achieving zero loss spikes throughout pretraining. When DeepSeek trained the V4 models, it likewise adopted the now battle-tested Muon optimizer.
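For readers curious what "replacing AdamW with Muon" actually means, here is a minimal sketch of the core idea from Keller Jordan's blog post: accumulate a standard momentum update for a 2-D weight matrix, approximately orthogonalize it with a Newton-Schulz iteration, then apply it. This is an illustrative sketch under those assumptions, not DeepSeek's or Moonshot AI's production implementation, and the hyperparameters are placeholders:

```python
import torch

def newton_schulz_orthogonalize(G, steps=5, eps=1e-7):
    """Push the singular values of G toward 1 while keeping its singular vectors,
    i.e. approximately orthogonalize the update matrix without an explicit SVD."""
    a, b, c = 3.4445, -4.7750, 2.0315          # quintic iteration coefficients
    X = G / (G.norm() + eps)                    # normalize so the iteration behaves
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

def muon_step(param, grad, momentum_buf, lr=0.02, beta=0.95):
    """One Muon-style update for a 2-D weight matrix."""
    momentum_buf.mul_(beta).add_(grad)                        # ordinary momentum
    update = newton_schulz_orthogonalize(momentum_buf)        # the Muon twist
    update = update * max(1.0, param.shape[0] / param.shape[1]) ** 0.5  # shape-aware scale
    param.data.add_(update, alpha=-lr)

# Tiny usage example on a random weight matrix.
w = torch.nn.Parameter(torch.randn(256, 128))
buf = torch.zeros_like(w)
muon_step(w, torch.randn_like(w), buf)
```

The weight decay and RMS control mentioned above would sit on top of this basic loop as further engineering refinements.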
It should be noted that the “collaborative evolution” of open-source large models is not leading to homogenization but toward a path of “harmony in diversity.”
For example, DeepSeek-V4 focuses on the core capabilities of foundation models, pushing up the performance ceiling of global open-source large models and providing a baseline that can stand against closed-source flagships; Kimi K2.6 leans into agent engineering and deployment, tackling the pain points of long-horizon autonomous execution and opening a key path for large models into real production scenarios.
Throughout this process there were no drawn-out commercial negotiations and no bruising patent battles. In the open-source camp, technical innovation flows like water: whoever builds the best piece, everyone uses it.
By drawing nutrients from the open-source ecosystem and complementing each other’s technical routes, China’s large model vendors demonstrate an alternative possibility outside Silicon Valley.
03 The US "building walls," China "laying roads"
While praising open-source collaborative evolution, we must face a harsh business reality.
Currently, OpenAI and Anthropic each report annualized revenues in the tens of billions of dollars, while the revenues of leading domestic large-model vendors remain a small fraction of that.
OpenAI's valuation on the secondary market is about $880 billion and Anthropic's has soared to around $1 trillion, while Kimi's and DeepSeek's latest funding rounds value them at roughly $18 billion and $20 billion respectively.
Some argue that China's large-model companies are undervalued; others counter that "turning a technical reputation into real cash is the life-and-death test for Chinese firms." As a result, debate over whether open source "pays off" is everywhere.
To see the ultimate outcome, one might consider the competition phases of large models:
The first phase is “parameter wars and benchmark battles.” By late April 2026, this phase is essentially over, with no substantial score gaps on leaderboards.
The second phase is “training efficiency, inference cost, and architectural innovation.” This is the current stage, driven by compute cost pressures.
The third phase will be “Agent systems, ecosystems, and developer engagement.” When tokens shift from free traffic to “fuel” for executing tasks, the vitality of the ecosystem will determine survival.
What is the ecological position of domestic open-source large models? We found two sets of intuitive comparative data.
One is training costs.
GPT-5, released in August 2025, reportedly cost more than $500 million to train; Kimi K2 Thinking, from the same period, cost about $4.6 million; DeepSeek has not disclosed training costs for the V4 series, but the V3 technical report put its training compute cost at roughly $5.58 million. In other words, domestic vendors trained models of comparable quality at a small fraction, on the order of one percent, of OpenAI's spend.
Another is usage volume.
Since the start of 2026, data from the multi-model aggregation platform OpenRouter shows that, driven by the agent product OpenClaw, global token consumption has been growing exponentially, and China's "open-source dream team," with its reputation for being cheap and easy to use, has outpaced the US for several consecutive weeks.
The reason is simple.
China's open-source ecosystem has already formed a positive-feedback flywheel: Company A open-sources a core technology, Company B adopts and optimizes it, and the improvements flow back into the ecosystem. Closed-source models evolve by piling on compute and grow roughly linearly; the open-source route, where ideas collide and compound, points toward exponential technical progress.
According to a Morgan Stanley research report, China's AI inference token consumption will grow at approximately a 330% compound rate from 2025 to 2030, skyrocketing from 10 trillion tokens in 2025 to 3.9 quadrillion tokens in 2030, an increase of roughly 390-fold.
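Taking the quoted endpoints at face value, the implied growth works out as follows (a quick consistency check, not a figure from the report itself):

```python
tokens_2025, tokens_2030 = 10e12, 3.9e15            # 10 trillion -> 3.9 quadrillion
multiple = tokens_2030 / tokens_2025                 # overall increase
annual_factor = multiple ** (1 / 5)                  # five years, 2025 -> 2030
print(f"{multiple:.0f}x overall, ~{annual_factor:.1f}x per year")
# -> 390x overall, ~3.3x per year (roughly tripling every year)
```

An annual factor of about 3.3x is how the quoted "330%" figure reads if it refers to the year-over-year multiple.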
In other words, 2026 is still in the early stage of AI explosion, with hundreds of times growth potential in the next five years, far from the final verdict.
It is precisely this confidence in the long-term opportunity that explains the split: while Silicon Valley's giants are busy building walls, China's large-model vendors are choosing to pave the road together, steadily firming up the path toward AGI.
04 Final words
Who will have the last laugh in this great AI wave? The answer hinges not only on models but also on independent, controllable computing power. If models are the "atomic bomb," then domestically controlled chips are the "rocket" that carries it aloft.
Fortunately, domestic models and domestic chips are growing closer: DeepSeek V4's technical documentation lists the Ascend NPU alongside NVIDIA GPUs in its hardware verification, and Moonshot AI's latest paper runs the prefill and decoding stages of large-model inference on different chips, opening a door for domestic chips to take part in large-scale inference.
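Splitting prefill and decoding across different chips is easier to picture with a toy sketch: prefill is one compute-heavy parallel pass over the whole prompt, while decode is a memory-bandwidth-bound loop that generates one token at a time from the cached state. The class and function names below are hypothetical stand-ins, not Moonshot AI's actual system:

```python
from dataclasses import dataclass, field

@dataclass
class KVCache:
    """Stand-in for the key/value state produced during prefill."""
    entries: list = field(default_factory=list)
    device: str = "prefill_chip"

def prefill(prompt_tokens, device="prefill_chip"):
    # One compute-bound pass over the full prompt on the prefill pool.
    cache = KVCache(device=device)
    cache.entries.extend(("kv", tok) for tok in prompt_tokens)
    return cache

def transfer(cache, target_device):
    # Hand-off: ship the cache to the decode pool, which could be a different (domestic) chip.
    cache.device = target_device
    return cache

def decode(cache, max_new_tokens=4):
    # Bandwidth-bound, token-by-token generation that reuses the transferred cache
    # instead of re-processing the prompt.
    out = []
    for step in range(max_new_tokens):
        token = f"token_{step}@{cache.device}"
        out.append(token)
        cache.entries.append(("kv", token))
    return out

cache = transfer(prefill(["why", "split", "inference", "?"]), "decode_chip")
print(decode(cache))
```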
In early 2025, DeepSeek R1 earned domestic large models a seat at the table; by 2026, China's open-source large-model ecosystem, working hand in hand with domestic hardware, keeps winning, through collaboration, more of the power to define the rules.