In the age of AI, is cloud still a good business?

First, the conclusion: in the early stages of AI development, public cloud is a 'good model,' but it is a profitable business for only a few players. It suits cloud vendors with capital, customers, engineering and scheduling capabilities, and ecosystem access (of these, the ecosystem and top-tier model capabilities matter relatively more, as Google Cloud shows); it does not suit small and medium-sized vendors trying to force their way in by 'buying GPUs → renting out computing power → taking a cut from MaaS.'

1. Why does the public cloud seem like a good model in its early stages?

The core contradiction in the early stages of AI is: scarce computing power, fast model iteration, and companies' reluctance to build their own clusters. At this point, the public cloud has a natural advantage:

Enterprise customers want 'ready-to-use GPUs, networks, storage, security, model APIs, inference deployment, billing, and compliance'; they do not want to buy machines themselves. So during a demand explosion, the public cloud acts like 'the power grid of the AI era': whoever controls the supply of computing power, customer entry points, and development platforms can reap the early benefits.

2. But it's an extremely asset-heavy business, not a lightweight platform play.

The issue is that AI public cloud isn't the high-margin model typical of traditional SaaS; instead, it requires upfront capital expenditures, with revenue realized later (GPU leasing + MaaS).

Microsoft's Azure revenue exceeded $75 billion for the first time in fiscal year 2025, growing 34% year-over-year, but Microsoft also explicitly stated that the gross margin of Microsoft Cloud dropped to 69%, partly due to expanding AI infrastructure.

Amazon's example is even more obvious: AWS revenue in 2025 reached $128.7 billion, operating profit $45.6 billion, but free cash flow fell from $38.2 billion to $11.2 billion, mainly because purchases of property and equipment increased by $50.7 billion, primarily reflecting AI investments; Amazon also forecasted that the company’s capex for 2026 would be approximately $200 billion.

This indicates that cloud vendors are sacrificing short-term cash flow in exchange for future capacity and market share.

3. The quality of 'rented computing power' revenue is generally low, with utilization rates and lock-in periods being the key factors.

Relying solely on GPU leasing is essentially a 'high-depreciation inventory business.' There are four core risks:

First, GPU and data center investments are fixed, but customer demand may fluctuate.

Second, chips iterate quickly; generational shifts like H100, B200, and Rubin will put pressure on the prices of older cards.

Third, electricity, server rooms, networking, liquid cooling, and maintenance will all eat into gross margins.

Fourth, once large model training demand shifts towards inference optimization, the computing power structure may change, and training clusters might not remain fully loaded.

Therefore, whether this model works well depends on several metrics: GPU utilization, the proportion of prepayments or long-term contracts from clients, the share of inference workloads, the speed of unit token cost decline, depreciation cycles, and financing costs.

If GPU utilization is high, customers sign multi-year contracts, and the cloud vendor has in-house chips or pricing power, this can be a good business. Two caveats apply. First, the upstream AI server supply chain keeps raising prices rapidly while downstream customers' ability to pay remains a big question mark, so cloud vendors' gross margins need close attention in the short to medium term; vendors may later sacrifice some gross margin to grow AI revenue. Second, AI computing power is currently in short supply, so GPU leasing can sustain relatively strong gross margins, though 'strong' here only means higher than traditional IaaS. But if the business is just buying cards at high prices and renting them out by the hour, it is risky.
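To make the role of utilization, depreciation cycles, and financing costs concrete, here is a minimal Python sketch of per-GPU rental economics. Every number in it (purchase cost, rental price, operating cost, financing rate) is a hypothetical placeholder chosen for illustration, not vendor data, and the cost model is deliberately simplified: straight-line depreciation, interest charged on the full purchase cost every year, and operating costs incurred whether or not the GPU is rented.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class GpuRentalAssumptions:
    """All fields are hypothetical inputs, not real vendor figures."""
    capex_per_gpu: float          # purchase cost per GPU, incl. its server share (USD)
    depreciation_years: float     # straight-line depreciation period
    annual_financing_rate: float  # yearly cost of capital on the purchase cost
    opex_per_gpu_hour: float      # power, cooling, networking, maintenance (USD/hour)
    price_per_gpu_hour: float     # rental price charged to customers (USD/hour)


def annual_fixed_cost(a: GpuRentalAssumptions) -> float:
    """Yearly cost that must be paid regardless of how many hours are rented."""
    depreciation = a.capex_per_gpu / a.depreciation_years
    financing = a.capex_per_gpu * a.annual_financing_rate
    opex = HOURS_PER_YEAR * a.opex_per_gpu_hour
    return depreciation + financing + opex


def rental_margin(a: GpuRentalAssumptions, utilization: float) -> float:
    """Margin after depreciation, financing, and opex at a given utilization (0-1]."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    revenue = HOURS_PER_YEAR * utilization * a.price_per_gpu_hour
    return (revenue - annual_fixed_cost(a)) / revenue


def breakeven_utilization(a: GpuRentalAssumptions) -> float:
    """Fraction of hours that must be billed for revenue to equal yearly cost."""
    return annual_fixed_cost(a) / (HOURS_PER_YEAR * a.price_per_gpu_hour)


# Hypothetical scenario: a 30,000 USD GPU rented at 2.50 USD/hour.
example = GpuRentalAssumptions(
    capex_per_gpu=30_000.0,
    depreciation_years=5.0,
    annual_financing_rate=0.08,
    opex_per_gpu_hour=0.40,
    price_per_gpu_hour=2.50,
)

if __name__ == "__main__":
    print(f"break-even utilization: {breakeven_utilization(example):.1%}")
    for u in (0.5, 0.7, 0.9):
        print(f"utilization {u:.0%}: margin {rental_margin(example, u):.1%}")
```

Under these placeholder inputs the margin flips sign somewhere between 50% and 70% utilization. Shortening `depreciation_years` (faster chip generations) or lowering `price_per_gpu_hour` (pressure on older cards) raises the break-even point, which is why long-term contracts matter for this kind of business.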

4. MaaS is a lighter model than renting out computing power, but its moat is also weaker.

MaaS, or Model-as-a-Service, appears better than renting GPUs on the surface because it resembles platform commission models: model providers supply the model, while cloud vendors handle distribution, APIs, inference, billing, and enterprise sales.

The problem, however, is that MaaS commissions alone are unlikely to support massive capex.

If cloud vendors are merely channels for model APIs, pricing power may be taken by the leading model companies. Moreover, customers do not necessarily reach the leading model vendors through a cloud vendor, and cloud vendors' role as an entry point may weaken quickly in the AI era; at the current stage of rapid AI development, cloud vendors look more like computing power infrastructure, with the vast majority of their AI revenue coming from GPU rentals. If enterprise customers only call models from OpenAI, Anthropic (Claude), Qwen, or DeepSeek, what cloud vendors earn is channel fees and computing power fees, and profits may not be high, so to earn high profits a cloud vendor's own models must be competitive. Truly high-quality MaaS revenue comes from three stacked layers:

Computing power revenue + model invocation revenue + enterprise platform/data/agent/security/operations revenue.

In other words, MaaS will transform from 'selling APIs' to an 'enterprise AI operating system' only when it is integrated with PaaS, databases, data governance, RAG, agent orchestration, DevOps, and security compliance.
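The difference between a pure API channel and the three-layer stack can be sketched with simple arithmetic. In the Python sketch below, the spend amounts and margin percentages are hypothetical placeholders chosen only to show the shape of the comparison; they are not estimates for any real vendor.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One revenue layer for a single customer. All values are hypothetical."""
    name: str
    customer_spend: float  # what the customer pays for this layer (arbitrary units)
    margin: float          # share of that spend the cloud vendor keeps as gross profit


def gross_profit(layers: list[Layer]) -> float:
    """Total gross profit the cloud vendor earns across all layers."""
    return sum(layer.customer_spend * layer.margin for layer in layers)


# Customer A: the vendor only resells a third-party model API (channel fee).
channel_only = [
    Layer("model invocation (resold third-party API)", 100.0, 0.15),
]

# Customer B: the same model spend, plus compute and platform services on top.
stacked = [
    Layer("computing power", 60.0, 0.30),
    Layer("model invocation", 100.0, 0.15),
    Layer("enterprise platform/data/agent/security/operations", 40.0, 0.70),
]

if __name__ == "__main__":
    print("channel only:", gross_profit(channel_only))
    print("stacked:     ", gross_profit(stacked))
```

With these assumed inputs, the model invocation layer contributes the same profit in both cases; the gap comes entirely from the compute and platform layers attached to the same customer.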

5. In the Chinese context, public cloud is more of a strategic entry point than a short-term profit center.

Alibaba Cloud is a typical case. Alibaba announced in 2025 that it would invest at least 380 billion RMB in AI and cloud infrastructure over the next three years (management has indicated that this figure may increase significantly); in the short to medium term, profits at companies like Alibaba and Tencent are likely to be weighed down by AI infrastructure investment. This indicates that Chinese cloud vendors are following a similar path: first use AI cloud to capture customers, ecosystems, and developers, then talk about margin recovery. However, the Chinese market has an additional layer of variables: domestic chips, domestic IT substitution (Xinchuang), private and hybrid cloud deployments for government and enterprise customers, price wars, and open-source models, all of which will suppress the excess returns from pure public cloud computing power rentals.

6. Current judgment: in the early AI phase, public cloud is a good strategy but not necessarily a good financial model.

For players like AWS, Azure, Google Cloud, and Alibaba Cloud, public cloud is a solid model because they already have clients, cash flow, data centers, electricity, sales systems, platform ecosystems, and long-term contract capabilities. They can withstand short-term capex and gross margin pressures, trading infrastructure for future AI platform status.

For new entrants or smaller cloud vendors, simply doing 'GPU rental + MaaS revenue sharing' is not a good model unless there is clear differentiation, such as specific industry customers, sovereign clouds, localized compliance, low-cost electricity, self-developed chips, extremely high cluster utilization, or close ties to a strong model ecosystem.

To sum up in one sentence: Public cloud is an infrastructure dividend in the early stages of AI, but not a get-rich-quick business; the winners are not 'those who have GPUs,' but those who can package computing power, models, data, platforms, and customer workflows together to sell.
