Ark Invest: The Current State and Future of AI Infrastructure

Source: Frank Downing, Ark Invest; Translated by Golden Finance Claw

AI Infrastructure Spending is Exploding

Since the release of ChatGPT three years ago, the demand for accelerated computing has exploded. Nvidia’s annual revenue has surged nearly 8-fold, rising from $27 billion in 2022 to $216 billion in 2025, with the market widely expecting a further increase of 62% in 2026, reaching $350 billion. Global data center system investment (including computing, networking, and storage hardware) has accelerated from an average annual growth rate of 5% over the past decade to 30% over the last three years, and is expected to grow by more than 30% again in 2026, reaching $653 billion.
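The growth figures quoted above can be sanity-checked with basic arithmetic. A minimal sketch using only the revenue numbers stated in the text (all values in billions of dollars):

```python
# Sanity-check the Nvidia revenue figures quoted above ($ billions).
nvidia_2022, nvidia_2025 = 27, 216

growth_multiple = nvidia_2025 / nvidia_2022          # ~8x over three years
cagr = (nvidia_2025 / nvidia_2022) ** (1 / 3) - 1    # implied annual growth rate

nvidia_2026_est = nvidia_2025 * 1.62                 # consensus +62% for 2026

print(f"2022 -> 2025 multiple: {growth_multiple:.1f}x")
print(f"Implied CAGR: {cagr:.0%}")                   # roughly a doubling each year
print(f"2026 estimate: ${nvidia_2026_est:.0f}B")     # ~$350B, matching the text
```

The ~8x multiple over three years implies revenue roughly doubled every year, which is why a further 62% gain in 2026 reads as a deceleration rather than an acceleration.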

ARK’s research shows that accelerated computing driven by GPUs and application-specific integrated circuits (ASICs) built for AI now dominates server investment, accounting for 86% of computing server sales.

Rapid Cost Declines Drive Adoption of Acceleration

Spending on the accelerated computing infrastructure needed to run AI models keeps rising for two reasons: generative AI use cases are expanding on both the consumer and enterprise sides, and labs pursuing “superintelligence” demand ever more compute to train smarter foundation models.

Rapidly declining costs are further accelerating demand growth. According to our research, the cost of AI training has been falling by 75% per year. Inference costs are declining even faster: across benchmarks tracked by Artificial Analysis, models scoring above 50% have seen median annualized cost declines of as much as 95%.
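Annual declines of that magnitude compound dramatically. A minimal sketch (the 75% and 95% rates come from the text; the three-year horizon is illustrative):

```python
def remaining_cost(annual_decline: float, years: int) -> float:
    """Fraction of the original cost left after compounding an annual decline."""
    return (1 - annual_decline) ** years

# Training costs falling 75% per year, inference costs falling 95% per year.
for label, rate in [("training", 0.75), ("inference", 0.95)]:
    frac = remaining_cost(rate, years=3)
    print(f"{label}: {frac:.4%} of the original cost remains after 3 years")
```

At a 95% annual decline, only 0.0125% of the original cost remains after three years; a workload that cost $1 million to serve in year one costs about $125 in year four, which is why falling costs keep unlocking new use cases rather than shrinking total spend.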

Two forces are driving the significant decline in costs: first, industry leaders like Nvidia are launching new products each year, bringing generational improvements in hardware performance; second, algorithmic improvements at the software level are continuously enhancing the efficiency of training and inference on the same hardware.

Strong Demand Signals from Consumers and Enterprises

Consumers are adopting AI significantly faster than they adopted the internet in its early days: AI reached roughly 20% adoption within three years, more than twice the pace of the consumer shift to the internet.

Enterprise demand is also growing at an astonishing rate. For example, according to OpenRouter data, token demand has increased 28-fold since December 2024.

Over the past two years, Anthropic, the AI lab most favored by enterprise customers, has grown revenue roughly 100-fold, from an annualized revenue run rate of $100 million at the end of 2023 to an estimated $8 billion to $10 billion by the end of 2025. That momentum has carried into 2026: in February the company announced an annualized revenue of $14 billion and a completed $30 billion funding round valuing it at $380 billion.

OpenAI, which competes on both the consumer and enterprise fronts, has also seen strong enterprise growth, reaching 1 million business customers as of November 2025. According to CFO Sarah Friar, OpenAI’s enterprise revenue is growing faster than its consumer business and is expected to account for 50% of total revenue by 2026. In a January 2026 blog post, Friar also laid out the rationale for further infrastructure investment: over the past three years, OpenAI’s revenue has grown in direct proportion to its computing capacity.

Private Market Funding for AI Buildout

To meet the strong demand signals, large-scale infrastructure investment has become necessary. According to Crunchbase data, private AI labs raised over $200 billion in funding in 2025, with about $80 billion going to foundational model developers like OpenAI, Anthropic, and xAI. In the public market, hyperscale cloud companies are tapping into cash reserves and seeking other financing methods to support their AI capital expenditure plans—this spending could reach as high as $700 billion in 2026.

Reports indicate that Meta’s $30 billion deal with Blue Owl is the largest private capital transaction in history. The deal is structured as a joint venture primarily financed by debt, and its special purpose vehicle (SPV) structure will prevent the project debt from appearing on Meta’s balance sheet, a move that has sparked considerable controversy.

AMD and Other Manufacturers Emerge as Strong Challengers to Nvidia

Beyond the data centers themselves, compute chips remain the core of AI capital expenditure. Nvidia has led the accelerated computing era, but the largest AI chip buyers now want to maximize the AI compute they get per dollar invested. Advanced Micro Devices (AMD), which has sold GPUs alongside Nvidia in the consumer market since acquiring ATI Technologies in 2006, has become an emerging competitor in the enterprise market. Since launching its EPYC processors in 2017, AMD has grown its server CPU market share from nearly zero to 40% in 2025.

For small-model inference, AMD GPUs are now competitive with Nvidia on performance relative to total cost of ownership (TCO). TCO accounts for both the chip’s upfront purchase price (capital expenditure) and its operating costs over its lifespan (operating expenditure). The performance benchmark is SemiAnalysis’s InferenceMax metric, which measures tokens processed per GPU per second in throughput-optimized configurations; the cost benchmark uses SemiAnalysis’s estimates of hourly capital and operating expenditures.
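The comparison ultimately reduces to cost per token: all-in hourly cost divided by sustained token throughput. A sketch of that calculation with purely hypothetical inputs (the hourly costs and throughputs below are illustrative placeholders, not SemiAnalysis figures):

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_sec: float) -> float:
    """USD per million tokens, given all-in hourly GPU cost (capex amortization
    plus opex) and sustained inference throughput."""
    tokens_per_hour = tokens_per_sec * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000

# Hypothetical values for illustration only.
gpu_a = cost_per_million_tokens(hourly_cost_usd=3.00, tokens_per_sec=10_000)
gpu_b = cost_per_million_tokens(hourly_cost_usd=2.20, tokens_per_sec=8_000)
print(f"GPU A: ${gpu_a:.4f}/M tokens")
print(f"GPU B: ${gpu_b:.4f}/M tokens")
```

In this toy example GPU B wins on cost per token despite lower raw throughput, which is exactly how a cheaper chip can be “competitive on TCO” without matching the leader’s peak performance.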

Although AMD has “caught up” on small-model performance, Nvidia still holds a significant lead on large models, as shown in the figure below.

Nvidia’s rack-scale GB200 NVL72 system connects 72 Blackwell GPUs so that they operate like one massive GPU with shared memory. This tight chip-to-chip interconnect boosts large-model inference: large models must spread their weights across multiple GPUs and therefore need far more communication bandwidth than small models. AMD’s rack-scale solution is scheduled to launch in the second half of 2026, aiming to narrow the gap before Nvidia’s Vera Rubin arrives. So far, AMD has won orders from clients including Microsoft, Meta, OpenAI, xAI, and Oracle.

Hyperscale Cloud Vendors Lead the Custom Chip Revolution

Beyond merchant GPU suppliers, hyperscale cloud vendors and AI labs are also turning to in-house chips to check Nvidia’s pricing power and reduce AI computing costs. Google has been designing its own AI ASICs, tensor processing units (TPUs), for more than a decade, originally to run recommendation models for its search business; the latest TPU v7 is optimized for generative AI. SemiAnalysis estimates that running internal workloads on its own TPUs cuts Google’s cost per unit of compute by 62%. Anthropic and Meta are expanding their compute capacity with Google’s TPUs, which lends credibility to that estimate.

Amazon’s Trainium chip appears poised to follow. After acquiring Annapurna Labs in 2015, Amazon was among the first to develop custom chips for its cloud business, building Arm-based Graviton CPUs and Nitro data processing units (DPUs) that now supply critical compute for Amazon Web Services (AWS). Amazon recently announced that in 2025, for the third consecutive year, Graviton provided more than half of AWS’s new CPU capacity. In addition to using TPUs, Anthropic has selected AWS and Trainium as its preferred training platform.

Microsoft entered the custom chip field only in 2023 with its Maia 100 AI accelerator, a part that was not yet focused on generative AI. A second-generation chip targeting AI inference is now rolling out.

Broadcom Dominates the Custom Chip Service Market

Google and Amazon focus on front-end chip design (architecture and functionality), while back-end design partners translate their logic into silicon, manage advanced packaging, and coordinate production with foundries such as TSMC. With Intel’s foundry business struggling, TSMC has become the preferred manufacturing partner for most major AI chip projects, while Broadcom has emerged as the leading back-end design partner, serving Google’s TPUs, Meta’s MTIA, and OpenAI’s custom chips due in 2026. Apple, which has historically kept the full design of its phone and PC chips in-house, is reportedly working with Broadcom on AI chips. Citigroup predicts that Broadcom’s AI revenue could grow fivefold over the next two years, from $20 billion in 2025 to $100 billion in 2027.

Amazon’s Trainium development path is unusual among its peers: Amazon reportedly partnered with Marvell on Trainium 2 but, after execution problems at Marvell, switched to Alchip for Trainium 3 and Trainium 4. Amazon’s ability to swap back-end partners shows that this kind of vertical integration carries real risks for companies like Broadcom. Notably, Apple and Tesla work directly with foundries, and Google may do the same with TPU v8: the product reportedly has two SKUs, one co-designed with Broadcom and the other designed and controlled by Google with support from MediaTek.

Chip Startup Activity Heats Up

Our research indicates that a long tail of companies pursuing new architectural paradigms could further challenge the market positions of incumbent chip makers. Cerebras is known for its wafer-scale engine, a giant chip cut from a single piece of silicon roughly the size of a pizza box; it offers the fastest token processing speeds on the market and is reportedly planning to go public this year. The company recently announced a collaboration with OpenAI to launch Codex Spark, a high-speed coding model, following a partnership agreement reached in January this year.

Groq has also posted outstanding token processing speeds and recently signed a non-exclusive $20 billion intellectual property licensing agreement with Nvidia, under which Nvidia takes on 90% of Groq’s employees, including CEO Jonathan Ross, a co-inventor of Google’s TPU. The structure effectively amounts to an acquisition of Groq’s team and technology, an approach increasingly popular in tech M&A as giants seek to avoid delays from regulatory scrutiny.

Elsewhere, Intel has established a partnership with SambaNova after reported acquisition negotiations fell through. Since 2014, Intel has made four acquisitions in the AI field without launching a widely recognized AI product, a disappointing track record.

Looking Ahead: A $1.4 Trillion Market by 2030

According to our research, sustained demand growth and continuing performance improvements over the next five years will propel AI software and cloud services, with AI infrastructure spending projected to nearly triple, from $500 billion in 2025 to roughly $1.4 trillion by 2030.

Our forecast is based on the historical relationship between data center system investment and software revenue. In the early 2010s, as cloud computing rose, system investment ran at about 50% of global software spending. By 2021, overinvestment and post-COVID customer optimization had pushed that ratio down to a low of just over 20%. Our $1.4 trillion forecast assumes that 2030 infrastructure investment equals 20% of global software spending in our neutral scenario ($7 trillion in 2030), a ratio we detailed in a blog post last year. We believe the 20% level adequately accounts for the risk of overinvestment before 2030, as well as the possibility that software revenue growth lags the neutral scenario; in the latter case, we expect infrastructure investment to keep growing rapidly, much as it did in the early 2010s.
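The forecast math is simple to reproduce: the stated 20% of $7 trillion works out to $1.4 trillion, and comparing that with the 2025 baseline gives the implied growth rate (all figures as quoted in the text):

```python
# Reproduce the 2030 AI infrastructure forecast ($ billions).
software_2030 = 7_000   # global software spending, neutral scenario
infra_share = 0.20      # assumed infrastructure-to-software investment ratio

infra_2030 = software_2030 * infra_share            # 1,400 -> $1.4T
infra_2025 = 500                                    # 2025 baseline
implied_cagr = (infra_2030 / infra_2025) ** (1 / 5) - 1

print(f"2030 AI infrastructure spend: ${infra_2030:,.0f}B")
print(f"Implied 2025 -> 2030 CAGR: {implied_cagr:.0%}")
```

The implied growth rate of roughly 23% per year is a marked slowdown from the ~30% data center investment growth of the last three years, so the forecast already embeds some deceleration.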

As the demand for AI-driven computing power continues to grow, we expect the share of custom chips in computing expenditure to keep increasing—because the time and financial investment required to design chips for specific workloads will manifest as increasingly significant performance advantages per dollar when scaled. We believe that by 2030, the share of custom ASICs in the computing market could exceed one-third.

Overall, our research indicates that the ongoing infrastructure buildout is not a bubble about to burst, but the foundation of a once-in-a-lifetime platform-level transformation. ARK forecasts that annual AI infrastructure spending will approach $1.4 trillion by 2030, driven by real and accelerating demand from both consumers and enterprises, while continuously declining costs validate and unlock new use cases. We believe the companies that stand out over the next five years will be those that can design the most efficient chips, build the most capable models, and deploy both at scale.

As Nvidia CEO Jensen Huang noted on the company’s fiscal Q4 2026 earnings call, truly useful AI agents have only begun to deploy at scale in recent months. They consume enormous numbers of tokens but offer capabilities far beyond what most users were accustomed to from earlier AI products. Scaling these agents to millions of enterprises will be extremely compute-intensive, and in our view the resulting productivity gains will more than justify the investment.
