CPU returns to the core of AI! Who are the big winners?
富途寰球私享匯

Alpha Call | NVIDIA GTC Conference Preview Summary Report

Introduction
NVIDIA GTC 2026 is about to kick off, with the AI industry standing at a critical inflection point, transitioning from a 'training arms race' to 'inference application monetization.' This article builds a comprehensive analytical framework around the conference preview, from macro logic to industry details, divided into two major sections:
Part 1 | Joe Yu: Investment Framework and Key Highlights
A review of NVIDIA's stock performance shows a pattern of excess returns before and after previous GTC events. Expectations are extremely compressed ahead of the 2026 event, similar to 2025, presenting a potential opportunity of 'low expectations → strong catalysts → post-event recovery.' We then examine the four core suspense points the market is most focused on (Rubin production timeline, Feynman architecture debut, CPO optical interconnect progress, and hardware ecosystem delivery capability), which lead to the three investment themes with the highest certainty:
① AI factories and infrastructure: a qualitative leap from single-chip to full rack system-level delivery; ② High-performance inference and physical AI: embodied intelligence moving from concept to an industry boom with shipments in the tens of millions of units; ③ Agentic AI and commercialization: a closed loop in token economics that dispels the 'AI bubble theory.'
Finally, a three-tier framework is presented (NVIDIA → storage → optical modules), with a clear two-way analysis of potential upside catalysts and downside risks.
Part 2 | Tim: In-depth Dissection of the Industry Chain
Following the framework from Part 1, we delve into the technical details and investment implications of the following four areas:
① Optical communications: The production timeline for CPO switches is set, and optical interconnects are becoming essential at the scale-up layer; the value chain is broken down into optical engines, ELS, and MPO; ② Storage: The emergence of the HBF warm-data layer elevates storage from a "backstage supporting role" to a "frontline leading role," driving demand for both DRAM and NAND; ③ PCIe and CXL: Speed transitions drive a surge in demand for retimers, and the shift towards "more optics, less copper" is not a zero-sum game, because the total number of interconnect links grows exponentially; ④ Q&A highlights: The positioning battle between Samsung and SK Hynix in HBM, and agentic AI's unexpected boost to infrastructure.
**A terminology index is provided at the end for reference as needed**
Part 1:
I. Core outlook: GTC 2026 is expected to clarify a new roadmap for AI development: a transition from model training to AI inference and physical AI. Market focus has shifted substantively from the scale of underlying computing power towards application deployment and monetization. On this basis, the investment logic is framed as follows. Short-term trading elasticity: relies on AI infrastructure build-out. Mid-term valuation expansion: driven by physical AI (such as embodied intelligence applications) and agentic AI.
II. This framework is highly consistent with foreign institutions' core short- to medium-term logic on AI capital expenditure expansion. Looking ahead to 2026 to 2028, NVIDIA's cash flow and moat will be supported by the rapid iteration of its computing architecture, with the core roadmap being GB300 → Rubin → Rubin Ultra → Feynman.
III. Specific product performance expectations: Rubin NVL144 is due to launch in the second half of 2026, with inference capability reaching 3.3 times that of the previous-generation GB300 NVL72. Rubin Ultra NVL576 is due to launch in the second half of 2027, with inference computing power set to surge to approximately 14 times that of GB300.
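Both multiples above are quoted against GB300, so the implied step from Rubin NVL144 to Rubin Ultra NVL576 can be backed out by division. This is a minimal sketch, assuming the two figures share the same GB300 NVL72 baseline (the text states the NVL72 baseline explicitly only for the 3.3x figure); the function name is illustrative.

```python
def implied_step_up(later_multiple: float, earlier_multiple: float) -> float:
    """Ratio between two performance multiples quoted against the same baseline."""
    if earlier_multiple <= 0:
        raise ValueError("earlier_multiple must be positive")
    return later_multiple / earlier_multiple


# Inference multiples quoted in the text, both relative to GB300.
rubin_nvl144 = 3.3
rubin_ultra_nvl576 = 14.0

# Assumption: both multiples use the same baseline system.
step = implied_step_up(rubin_ultra_nvl576, rubin_nvl144)
print(f"Implied Rubin Ultra NVL576 vs. Rubin NVL144: about {step:.1f}x")
```

Under that assumption, the second generational step works out to a little over four times the first system's inference capability.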
Historical Performance Review
Summary of historical trend projections:
I. Historical excess returns and current changes
> Historical norm: Over the past decade, NVIDIA has outperformed the S&P 500 around GTC, with an average excess return of 34% in the six months before the event and approximately 19% in the three months after.
> 2025 turning point: The excess return in the six months before the event shrank to 4.4%, while the three months after reversed the usual pattern and reached 13%.
> 2026 current situation: There has been almost no gain, or even a negative return, in the six months before the event. The core reason is that leading cloud vendors' AI capital expenditure has exceeded $600 billion, which has made market positioning more rational and significantly cooled earlier expectations.
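The comparison above can be sketched in a few lines. The sketch assumes excess return is the simple difference between the stock's return and the benchmark's return over the same window (the text does not spell out its definition), and the inputs to the final call are hypothetical.

```python
def excess_return(stock_return: float, benchmark_return: float) -> float:
    """Assumed definition: stock return minus benchmark return, same window."""
    return stock_return - benchmark_return


# Excess returns vs. the S&P 500 as quoted in the text, in percentage points.
# Note the windows differ in length (six months before, three months after),
# so this compares the figures exactly as quoted.
quoted = {
    "ten-year average": {"pre_6m": 34.0, "post_3m": 19.0},
    "2025": {"pre_6m": 4.4, "post_3m": 13.0},
}


def stronger_window(figures: dict) -> str:
    """Return which window shows the larger quoted excess return."""
    return "pre-event" if figures["pre_6m"] > figures["post_3m"] else "post-event"


for label, figures in quoted.items():
    print(f"{label}: stronger window is {stronger_window(figures)}")

# Hypothetical inputs, only to show the definition in use.
print(f"{excess_return(10.0, 5.6):.1f} percentage points")
```

On the quoted figures, the long-run average favors the pre-event window while 2025 favors the post-event window, which is the reversal the text describes.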
II. Review of the Logic Behind Price Movements Before and After GTC
Three major factors for pre-event strength:
> Expectation gap for new products: The release of a new-generation chip exceeds market expectations.
> Double benefit from earnings and valuation: Strong earnings reports combined with a compelling industry narrative (note: although the latest earnings report was good, expectations had already been fully priced in beforehand).
> Sentiment barometer: GTC, as the tone-setting conference for the global AI industry, directly lifts sentiment across the entire sector.
Two main reasons for the post-conference narrowing of gains:
> Profit-taking after positive news: 'Buy the rumor, sell the news,' with some capital taking profits.
> Valuation adjustment: The market digests pre-conference gains and enters a technical correction (macroeconomic and seasonal factors are set aside here).
III. Outlook for 2026 Trends
Based on historical patterns, this year may well repeat the 2025 pattern. With pre-conference expectations extremely low and price movements restrained, a strong new narrative from the conference (such as a confirmed launch schedule for the Rubin chip, or deep coordination among multiple players in the supply chain) could directly reverse the market’s pessimistic outlook and become a key turning point driving the stock price higher.
four core suspense points
The current market is not only focused on new concepts like 'OpenClaw,' but also urgently awaiting answers to four key questions at this GTC conference.
1. Rubin series production timeline: Can the mass production timelines for the Rubin GPU (expected in the second half of 2026) and Rubin Ultra (expected in the second half of 2027) be moved forward or further clarified? Advancing the progress of flagship products would be the most direct catalyst for stock prices.
2. Debut of the Feynman architecture: Will the next-generation Feynman provide clear architectural insights, especially regarding OIO and LPU performance in low-latency inference, along with the synergy path of advanced processes and packaging? This is the core cornerstone anchoring NVIDIA's long-term value.
3. Evolution pace of CPO and optical interconnects: How will the network architecture evolve? The roadmap for Scale-up (vertical scaling, intra-rack communication with low latency but limited scale) versus Scale-out (horizontal scaling, a highly anticipated direction that could exceed expectations) needs clarification. This will directly determine the benefiting logic for the optical communication and optical module sectors.
4. Hardware ecosystem delivery capability: As compute and heat density surge, can key materials and technologies such as liquid cooling, HBM4 memory, CoWoS packaging, and orthogonal backplanes be delivered on time and overcome system-level bottlenecks? This is the critical guarantee for execution across the entire industry chain.
Behind the four core questions lies a comprehensive view of eight technological directions.
We will focus on the three core directions with the highest certainty (corresponding to Directions 1, 2, and 5 in the original PPT). Technological development does not happen in isolation. For instance, the expansion of 'AI factories' is not merely a stacking of infrastructure but also a cornerstone that extends into 'physical AI' and robotics.
Based on technological maturity, the directions in the current market with 'high certainty' are as follows:
1. AI Factories and Infrastructure (highest maturity). Core nodes: the current Blackwell architecture (B300/GB200 series) and the next-generation Rubin NVL144 system. Benefiting areas: key semiconductor and hardware segments such as GPU, ASIC, CPU, and HBM.
2. High-Performance Inference and Training (extremely high activity). Core logic: computing power demand is spilling over from chips to system-level solutions. Benefiting areas: in addition to the chips above, peripheral hardware such as AI servers, integrated rack systems, and power management stands to see a further surge.
3. CUDA and the Developer Ecosystem (deepest ecosystem barriers). Core data: the CUDA platform has gathered over 4.5 million developers, with more than 3,000 accelerated applications. Core rationale: the deep moat of integrated software and hardware benefits NVIDIA and its closely knit ecosystem the most.
Summary: These three highly mature directions form the market's 'high certainty' expectations. Coupled with the grand narrative of accelerated deployment of AI agents and the eventual delivery of earnings, they collectively constitute the underlying investment logic of the current AI industry chain.
Core Direction One
Today we will mainly cover three core directions. The strongest main thread is AI factory and infrastructure expansion.
Market focus is shifting from NVIDIA's GB300 to the next-generation Rubin system. The core tension in the current market is not insufficient demand (demand far exceeds supply), but whether products can be delivered on time or even ahead of expectations.
The challenge behind this is a qualitative change in the form of delivery: from single chips to 'standardized AI factory modules.' The focus of next-generation delivery is no longer a single Rubin GPU, but system-level synergy deeply integrated with Vera CPUs, NVLink 6/7 switches, and ConnectX-9 network cards.
> Extreme specifications of the super system (Rubin Ultra): Taking the future rack-level super system as an example, it is expected to feature more than 280 Rubin Ultra GPUs, equipped with over 280TB of HBM4e and 72TB of DRAM.
> Underlying architectural changes reaching the peripheral ecosystem: The key point is no longer just the number of GPUs, but a disruptive approach to interconnection inside the rack. This increase in computing density will directly extend industry benefits to high-frequency multi-layer PCBs, high-speed connectors, orthogonal backplanes, and 100% liquid cooling systems.
Summary: NVIDIA’s biggest test has evolved from the simple mass production of Rubin chips to the far more complex capability of delivering an entire rack-level ecosystem.
Core Direction Two
The second major direction: High-Performance Inference and Physical AI
Current industry consensus: the focus of AI is shifting from pure "computational training" to complex "application inference." The next core question is: how can inference capabilities be deeply integrated with physical AI (such as embodied intelligence and robotics), and what is NVIDIA's long-term roadmap?
This is not only an inevitable step in technological evolution but also a call from a billion-dollar market. Two sets of third-party forecast data highlight the explosive potential of this field:
> Terminal shipment growth: Robot shipments are expected to soar from approximately 20,000 units in 2025 to 10 million units in 2035.
> Industrial market size surge: The market size for AI industrial robots is expected to expand from $2 billion in 2022 to over $10 billion by 2026.
Summary: Against these high expectations, whether this conference can unveil a system-level physical AI solution that bridges 'cloud GPU computing power' with 'terminal edge computing' is the second major highlight for the market.
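The growth rates implied by the two forecasts above can be backed out with the standard compound annual growth rate formula. This is a minimal sketch; the function and variable names are illustrative, and the endpoints are the figures quoted in the text (the "over $10 billion" figure is treated as exactly $10 billion, so the second result is a lower bound).

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end, and years must all be positive")
    return (end / start) ** (1 / years) - 1


# Robot shipments: about 20,000 units (2025) to 10 million units (2035).
robot_cagr = cagr(20_000, 10_000_000, 2035 - 2025)

# AI industrial robot market: $2 billion (2022) to $10 billion (2026).
industrial_cagr = cagr(2.0, 10.0, 2026 - 2022)

print(f"Robot shipments, implied CAGR: {robot_cagr:.1%}")
print(f"AI industrial robots, implied CAGR: {industrial_cagr:.1%}")
```

The shipment forecast implies growth of roughly 86% a year sustained for a decade, and the market-size forecast roughly 50% a year, which gives a sense of how aggressive these third-party projections are.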
Core Direction Three
The third major direction: Agentic AI (agent-based AI) and commercial monetization
Currently, the profit ceiling of the standalone 'chatbot' model is fully exposed, and the industry is moving comprehensively towards agentic AI based on reasoning and workflow optimization (in conjunction with the continuous upgrades of CUDA developer tools).
This marks a substantial leap from AI’s 'compute training phase' to its 'inference application phase,' with the core focus being the validation of business models. Industry data also points to this rapid growth: the AI agent market, which was $7.5 billion in 2025, is expected to surpass $10.8 billion this year (2026) and is projected to approach the $200 billion mark by 2034 (a compound annual growth rate, CAGR, of 43%).
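As a consistency check on the figures above, the quoted 43% CAGR can be compounded forward from the 2025 base. This sketch assumes the CAGR is measured from the $7.5 billion 2025 figure over the nine years to 2034, which the text does not state explicitly; the names are illustrative.

```python
def compound(base: float, rate: float, years: int) -> float:
    """Value of `base` after growing at `rate` per year for `years` years."""
    return base * (1 + rate) ** years


base_2025 = 7.5   # AI agent market in 2025, USD billions (from the text)
quoted_cagr = 0.43

# Assumption: the 43% rate applies from the 2025 base through 2034.
projected_2026 = compound(base_2025, quoted_cagr, 2026 - 2025)
projected_2034 = compound(base_2025, quoted_cagr, 2034 - 2025)

print(f"2026 projection: ${projected_2026:.1f}B")
print(f"2034 projection: ${projected_2034:.1f}B")
```

Under that assumption, the one-year projection lands close to the quoted $10.8 billion and the 2034 projection lands somewhat below $200 billion, in line with the "approach the $200 billion mark" wording.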
The profound significance of comprehensive implementation lies in:
> Debunking the bubble theory: Real-world application scenarios backed by actual spending would strongly refute market concerns about 'excessive investment in AI infrastructure.'
> Feeding back cash flow: They give cloud vendors and related AI investors a robust ability to fund themselves.
Summary: The essence of the third major direction is addressing the AI industry's monetization loop, cash flow health, and long-term sustainability.
Key clues
After reviewing these three directions, we can assess the key forward-looking indicators for each of them to understand their general status.
Forward-looking clues and investment logic framework:
Clue One: Rubin architecture mass production and revenue visibility. Market consensus suggests that Rubin has entered the mass production phase (expected to ramp up in the second half of this year). NVIDIA forecasts combined annual revenue from Blackwell and Rubin to exceed $500 billion, projecting an 'AI infrastructure annual market potential of $3-4 trillion.'
Clue Two: Rack-level GPU (NVL576) and the 'copper vs. optical' debate. NVL576 represents the evolution from individual servers to a 'rack as a super GPU.' The market’s biggest point of contention is whether the introduction of the stronger Rubin Ultra will lead to the adoption of optical modules for scale-up by 2027, or whether copper interconnects will remain dominant. If this conference can clarify the timing for switching technology paths, it will reshape the valuation logic for the optical module sector (although some expectations are already priced in).
Clue Three: Progress on commercialization of the 2028 Feynman architecture and OIO (Optical I/O). The next-generation GPU architecture, Feynman, is expected to launch in 2028. The core suspense in the market now is whether OIO technology will move beyond the 'conceptual discussion' stage and officially enter the substantive 'product definition' phase. If NVIDIA can provide a clear roadmap and commercial timeline for OIO technology at this conference, eliminating market uncertainty about the technology's implementation, it will directly boost the medium- to long-term valuation expectations of the related optical communication industry chain.
Summary: Two Core Investment Validation Frameworks
> Timing validation: Whether Rubin’s mass production and ramp-up speed exceed market expectations.
> Hardware ecosystem validation: Whether system-level solutions for the core technologies surrounding GPUs are clear. Key areas to monitor include HBM specifications, CoWoS packaging capacity, and orthogonal backplane design, among others.
Investment benefits framework one
Investment benefits framework two
Investment Portfolio Framework and Target Mapping:
Based on certainty and technical roadmaps, we divide investment targets into three tiers.
First-tier (Grade A/Primary Choice): NVIDIA. The core of the industry chain, with the highest certainty.
Second-tier (Secondary Choice): Storage sector. The core rationale lies in a well-balanced supply-demand state and extremely high earnings certainty.
Third-tier (High Elasticity): Optical modules. The key variable is the competition between the 'copper interconnect vs. optical module' technology paths. If this conference can clarify the optical module route, its technological direction and earnings visibility will become fully clear, and market tolerance for its high valuation will increase significantly. Regarding the industry details of storage (HBM) and optical modules, my colleague Tim will provide an in-depth analysis later.
Summary and valuation analysis: Based on current market expectations, regular positive catalysts are no longer sufficient to push NVIDIA's stock price and valuation center higher; a strong 'above-expectations' event must materialize.
Expectation consolidation
Beyond the standard expectations for the core products and ecosystem described above, any of the following three scenarios would count as a significant better-than-expected positive:
> Direct release of LPU and LPX: Not merely staying at concept previews, but directly showcasing the next-generation product form.
> Clear roadmap for interconnects and architecture: The Scale-up interconnect solutions of Rubin Ultra and NVL5600, as well as the orthogonal backplane architecture design path, are clearer than expected.
> Optical module adoption timeline confirmed: Not only is there a cooperation framework, but clear commercialization milestones are also provided. This would be a major industry catalyst.
Having covered these positives, we need to reverse our perspective: are there any potential risks at this GTC conference?
Potential Risks
Potential Risks at the GTC Conference
The GTC has always been a high-expectation event, and falling short of expectations could trigger a valuation pullback. Three major risks need to be watched currently:
> Lack of incremental information on the architecture: The market has already fully priced in mass production of the Rubin architecture by the second half of 2026. If neither Rubin nor the next-generation Feynman architecture brings an earlier-than-expected production timeline or surprising technical specifications, a pullback may occur.
> Uncertainty in optical module progress: In the Scale-up (upward expansion) process, if significant disagreements persist regarding the 'copper-to-optical' transition route, the optical module sector will face short-term pressure due to 'high expectations and low delivery.'
> Embodied intelligence not yet deployed: If Physical AI (embodied intelligence) and AI agents remain at the demo stage without reaching a production-grade commercial closed loop, delays in monetization milestones will undermine market confidence.
Commercial closed loop and compute demand logic: NVIDIA is expected to highlight enterprise-grade microservice products (e.g., NIM) at this conference. This aligns closely with industry discussions around LangChain and AI agents, aiming to facilitate B-end commercialization. Recent data shows that, driven by the rise of highly cost-effective domestic large models, token consumption on a certain open-source API platform surged in February (from 2.7 trillion weekly to 5.2 trillion), surpassing US-based models. This indicates that compute power remains extremely scarce. As AI workflow tools are optimized and B-end enterprise access expands, token consumption is set to grow exponentially. This trend could significantly improve the cash flow of 'AI factories,' thereby easing concerns over unsustainable CapEx (capital expenditure) among tech giants. This fundamentally benefits NVIDIA as a provider of foundational compute infrastructure.
Summary of five core highlights from the GTC Conference
> Mass production timeline: The specific mass production schedule for the Rubin architecture in the second half of the year.
> Product implementation: Whether LPX and LPU will be officially launched as standalone products.
> Interconnection route: Whether the interconnect solution for high-density racks such as NVL576 will 'continue using copper cables' or 'transition to optical modules at scale.'
> Cooling upgrade: The upgrade speed and penetration rate of liquid cooling/water cooling solutions under extremely high rack power density.
> Cutting-edge technology: Whether the Feynman architecture can provide a clear roadmap for OIO (Optical I/O) and CPO (Co-packaged Optics).
Valuation and investment strategy: NVIDIA's forward price-to-earnings ratio is currently around 22x. Over the past six months, its valuation has sat at a low, about one standard deviation below the historical average. Historically, pre-event gains have been larger than post-event gains. With both the stock price and valuation lagging, if this GTC delivers the better-than-expected signals described above, the stock price could see a significant recovery. Tim will break down the subsequent key highlights and in-depth industry analysis.
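A reading of "one standard deviation below the historical average" corresponds to a z-score of about -1. The sketch below shows that calculation; the historical P/E series is a hypothetical placeholder chosen purely for illustration (only the 22x figure comes from the text), and the function name is illustrative.

```python
from statistics import mean, pstdev


def z_score(current: float, history: list[float]) -> float:
    """Number of standard deviations `current` sits from the mean of `history`."""
    spread = pstdev(history)
    if spread == 0:
        raise ValueError("history has no variation")
    return (current - mean(history)) / spread


# HYPOTHETICAL forward P/E history, for illustration only (not real data).
hypothetical_history = [18.0, 42.0, 30.0, 26.0, 34.0]
current_forward_pe = 22.0  # figure quoted in the text

z = z_score(current_forward_pe, hypothetical_history)
print(f"z-score: {z:+.2f}")
```

With real data, the same function would be applied to the actual forward P/E history over whatever lookback window the analyst uses.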
Part 2:
Building on Joe’s mention of the rack product line, Tim will expand on the industrial impact regarding optical communication, storage, PCIe, and CXL sectors.
This conference is expected to focus on the evolution of the following three core racks:
> NVL72: As the basic unit, it continues to use single-layer copper cable interconnects.
> NVL144: It is expected to introduce an orthogonal backplane design, integrating the system into a massive rack.
> NVL576: The true watershed moment for interconnect architecture. Market expectations suggest a dual-layer network approach: the first layer retains copper cabling, while the second layer (the NVLink switch layer) transitions fully to CPO. This move will directly anchor the adoption pace and ratio of CPO and NPO within server racks, and would be a landmark milestone: optical communication officially breaking into the Scale-up core layer.
Additionally, the conference is highly likely to unveil the Vera CPU rack (rumored to have been pre-ordered by Meta). The rationale behind its launch is that CPUs have become a bottleneck constraining the development of AI agents. On one hand, CPU scheduling accounts for the majority of end-to-end latency in complex agent task flows; on the other hand, CPU cores have become an extremely scarce resource when handling high-concurrency AI inference tasks. Vera CPU racks are designed to address this challenge.
Next, we focus on updates to two core hardware components, which reveal a trend towards a finer division of labor in the AI inference pipeline.
1. Rubin CPX: Breaking the 'one-size-fits-all' myth
Rubin CPX, announced in September last year (expected to hit the market by the end of 2026), is an inference GPU specifically designed for 'million-token ultra-long contexts.' In the past, the industry believed that 'large GPUs + large HBM' could handle all inference tasks. However, the emergence of CPX shows that a single architecture can no longer efficiently serve the entire lifecycle of large model inference. Large model inference is divided into two steps: context filling (Prefill) and decoding (Decode). In the future, CPX will be dedicated to the compute-intensive Prefill phase, while regular Rubin GPUs will take over Decode. This 'specialized chip for specific tasks' split will significantly improve system efficiency.
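The Prefill/Decode split described above can be sketched as a simple two-stage router. This is a toy illustration of the division of labor, not NVIDIA's scheduler; the pool names, classes, and functions are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    PREFILL = "prefill"  # compute-intensive: process the whole input context
    DECODE = "decode"    # generate output tokens one at a time


@dataclass
class Pool:
    """A group of accelerators dedicated to one inference phase."""
    name: str
    queue: list = field(default_factory=list)


# Hypothetical pool names mirroring the split described in the text.
POOLS = {
    Phase.PREFILL: Pool("rubin-cpx"),
    Phase.DECODE: Pool("rubin-gpu"),
}


def route(request_id: str, phase: Phase) -> str:
    """Queue the request on the pool that serves `phase`; return the pool name."""
    pool = POOLS[phase]
    pool.queue.append(request_id)
    return pool.name


def serve(request_id: str) -> list[str]:
    """A request visits the prefill pool first, then the decode pool."""
    return [route(request_id, Phase.PREFILL), route(request_id, Phase.DECODE)]


print(serve("req-1"))
```

A real disaggregated serving stack also has to move the intermediate state produced during Prefill over to the Decode hardware, which this sketch leaves out.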
2. LPU Architecture: Achieving Ultra-Low Latency
Another highlight is the LPU architecture that NVIDIA obtained through an acquisition. It must be clarified that the LPU is an incremental supplement to the GPU, not a replacement. GPUs are inherently not suited to ultra-low latency inference, whereas LPUs are specifically designed for real-time interaction during the Decode phase.
The core lies in utilizing on-chip SRAM, which delivers ultra-high bandwidth and extremely low latency; the trade-off is its minimal capacity, which makes it unsuitable for general large model training. Thus, in practical deployment, it needs to be paired with high-performance CPUs and extensive DDR memory to compensate for capacity limitations. Market expectations suggest that the LPU will integrate into NVIDIA's computing ecosystem as a 'plug-and-play expansion module,' completing the final piece of the ultra-low latency puzzle.
Outline of NVIDIA’s computing architecture evolution over the next three years:
> 2026 (Rubin architecture): The current AI computing power battleground is dominated by the Rubin architecture, with a core focus on system scalability and energy efficiency improvements.
> 2027 (Rubin Ultra 576 rack): The focus of computing power will transition smoothly to the Rubin Ultra 576, representing a further scaling up of rack-level AI factories.
> 2028 (Feynman architecture): The next-generation Feynman architecture, named after physicist Richard Feynman, will officially take over. Its core breakthrough lies in native chip-level integration: Feynman will deeply integrate the LPU architecture, directly releasing an enhanced SRAM processor embedded at the foundational layer. This move will thoroughly clarify the underlying division of labor for large model inference: native SRAM handles ultra-low latency inference (Decode phase), complementing the GPUs' high-throughput context filling (Prefill phase). At this point, the LPU will complete its evolution from an 'external pluggable module' to a 'native architectural foundation.'
Next, we explore the core of the next-generation network interconnect: CPO (Co-Packaged Optics) switches.
1. Three core CPO switches and delivery timelines
The market is currently focused on three upcoming CPO switches:
> Quantum 3400: NVIDIA’s InfiniBand switch designed for high-performance computing, expected to be delivered by the end of 2026.
> Ethernet series (6800 and 6810): Expected delivery in Q1 2027, with a possible earlier release by the end of this year (2026).
> Timeline confirmation: Last month (February 2026), optical communications giant Lumentum disclosed a roughly $1 billion CPO-related order (delivery scheduled for H1 2027), firming up the mass production timeline for CPO.
2. The optical communication industry chain is entering a phase of technological realization
Yield rates for underlying hardware and supporting components are rapidly clearing bottlenecks:
> ELS (External Light Source): Module power consumption has been successfully reduced by 30%.
> FAU (Fiber Array Unit): Optical coupling efficiency has increased significantly, to 95%.
All indicators show that CPO switches have moved past the R&D challenges and entered the commercial realization phase.
3. Penetration rate expectations and leading company indicators
Although CPO market penetration remains low (around 3% this year, expected to reach 8% by 2027), NVIDIA is highly likely to adopt a bundled sales strategy of 'computing chips + CPO network,' which could trigger faster-than-expected growth. The next key catalyst lies on the demand side. Meta and Microsoft are the first top-tier clients to take the plunge, so their statements are crucial. The market is closely watching two major signals:
> Will they explicitly include CPO in their AI cluster procurement lists for the next two years (2026-2027)?
> Will the tech giants publicly endorse CPO as holding irreplaceable strategic value in 'reducing power consumption, increasing bandwidth, and controlling costs'?
Here's an overview of the evolution and core technology trends in the optical communication sector:
1. Market Dynamics: Coexistence of CPO and Pluggable Modules
The market is concerned that CPO technology might squeeze the share of traditional optical modules. In reality, however, 1.6T pluggable optical modules are experiencing explosive growth this year (2026). CPO primarily serves as a technical reserve for the Rubin Ultra platform in 2027. In the short term, the two will coexist and complement each other rather than act as outright substitutes.
2. Scale-out Side: Dominance of 1.6T with NPO as a Transition
In scale-out networks, 1.6T remains the dominant standard going forward.
Current Technical Status: Single-channel 200G is running stably today and is evolving toward mass production of 400G+. On power reduction and process technology (e.g., thin-film lithium niobate, silicon photonics solutions), pluggable modules still offer extremely high cost-effectiveness.
Evolution Path: CPO will see small-scale commercial deployment in this field. However, because it is constrained by the system's high reliability requirements, NPO (Near Package Optics) will serve as a key transitional solution. Leading optical module manufacturers are highly likely to showcase core NPO solutions at the upcoming OFC (Optical Fiber Communication) conference.
3. Scale-up (Vertical Scaling) Side: The Absolute Necessity of CPO
Within the NVLink interconnect domain, the demand for extensive multi-rack expansion and extremely high port density makes traditional pluggable modules difficult to deploy, making CPO an essential requirement.
Key focus: Can optical interconnects break through the physical limits of a single rack and move toward 'cross-rack, cross-row' or even broader low-latency consistent interconnects (i.e., the second layer of Scale-up)?
Market growth: Bandwidth demand for Scale-up is an order of magnitude higher than that for Scale-out, which will open up a much larger growth space for the optical communication industry.
4. Evolution of Hardware Architecture and Soaring Demand for Optical Engines
Currently, the SerDes (Serializer/Deserializer) rate of NVLink has reached 224G, and orthogonal backplanes and optical interconnects have become a definitive trend.
Sharp increase in ratio: Under the current Rubin Ultra solution, the ratio of GPUs to optical engines is approximately 1:4.5. If 'co-packaging of network cards and optical engines' exceeds expectations, this ratio will rise to 1:5.5, significantly boosting market demand for optical engines.
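To make the ratio shift concrete, the arithmetic can be sketched as below. The two ratios (4.5 and 5.5 optical engines per GPU) come from the text; the GPU count of 576 is a hypothetical figure chosen only for illustration, and the function name is likewise an assumption of this sketch.

```python
def optical_engines_needed(gpu_count: int, engines_per_gpu: float) -> float:
    """Return the implied number of optical engines for a given GPU count."""
    return gpu_count * engines_per_gpu


BASE_RATIO = 4.5    # optical engines per GPU, Rubin Ultra baseline from the text
UPSIDE_RATIO = 5.5  # if NIC/optical-engine co-packaging exceeds expectations

gpus = 576  # hypothetical deployment size, for illustration only

base = optical_engines_needed(gpus, BASE_RATIO)
upside = optical_engines_needed(gpus, UPSIDE_RATIO)
uplift = upside / base - 1  # relative increase in optical engine demand

print(f"base={base:.0f}, upside={upside:.0f}, uplift={uplift:.1%}")
```

Whatever GPU count is assumed, moving from 1:4.5 to 1:5.5 implies roughly 22% more optical engines per deployment, since the uplift depends only on the two ratios.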
5. Value Breakdown of the Industrial Chain and Core Targets
The scaling up of CPO will reshape the value distribution of optical components: optical engines account for about 40%, ELS (External Light Source) 20%, MPO connectors 10%, and shuffle boxes 4%.
Tianfu Communications: A leader in core components, with a comprehensive layout in optical engines and FAU.
Lumentum: Strong certainty; currently the sole core supplier of CW (Continuous Wave) light sources for NVIDIA's first batch of CPO switches, and already supplying in volume on the Ethernet platform.
Coherent: Expected to begin mass deliveries between the end of this year (2026) and early 2027.
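The value shares quoted in this section can be turned into a rough allocation, as in the sketch below. The four percentages are taken from the text; the total spend of 100 is a hypothetical unit budget, and the "other_unattributed" bucket is an assumption of this sketch covering the remaining share that the text does not break down.

```python
# Value shares of CPO optical components as quoted in the text (percentage points).
CPO_VALUE_SHARES_PCT = {
    "optical_engine": 40,
    "els": 20,
    "mpo_connector": 10,
    "shuffle_box": 4,
}


def split_value(total_spend: float) -> dict[str, float]:
    """Allocate a total CPO component spend across the quoted shares."""
    allocation = {
        name: total_spend * pct / 100
        for name, pct in CPO_VALUE_SHARES_PCT.items()
    }
    # The text does not attribute the remainder, so it is kept as one bucket.
    allocation["other_unattributed"] = total_spend - sum(allocation.values())
    return allocation


allocation = split_value(100.0)  # hypothetical budget of 100 units
for name, value in allocation.items():
    print(f"{name}: {value:.1f}")
```

The named components add up to about 74% of the value, so roughly a quarter of the spend falls outside the four categories listed.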
Next, we move into the analysis of the storage sector.
1. Core pain point: Storage bottlenecks caused by long context and multimodal demands
Over the past few years, the industry has focused heavily on GPU HBM (High Bandwidth Memory). However, as long context, AI agents, and multimodal reasoning become central, a practical issue has emerged: not all active data justifies occupying expensive HBM, yet if all of that data is relegated to traditional SSDs, latency and bandwidth become system bottlenecks. Therefore, with its Rubin platform, NVIDIA has officially moved storage from a 'background supporting role' to the 'forefront.'
2. Breakthrough solution: Introduction of the G3.5 warm data layer and HBF
At this GTC conference, NVIDIA is highly likely to showcase a new collaborative storage tiering solution:
HBF (High Bandwidth Flash): Positioned as a new storage layer between HBM and SSD.
ICMS (Context Storage Platform): Released this January (2026), it aims to offload long-context data that must be accessed repeatedly during inference from expensive, capacity-limited memory layers to lower-cost, highly scalable storage layers. NVIDIA defines it as the G3.5 layer (Warm Data Layer).
The core competition in AI infrastructure has shifted from sheer computing power scale to 'optimizing the cost of moving data across different tiers.'
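The tiering idea can be illustrated with a toy placement rule. This is only a sketch under stated assumptions, not NVIDIA's actual policy: the `ContextBlock` type, the access-frequency thresholds, and the example blocks are all hypothetical, chosen to show how data might be sorted into HBM, a warm layer such as HBF, and SSD.

```python
from dataclasses import dataclass


@dataclass
class ContextBlock:
    name: str
    reads_per_minute: float  # how often inference re-reads this block


# Hypothetical thresholds; real systems would tune these per workload.
HOT_THRESHOLD = 100.0
WARM_THRESHOLD = 1.0


def choose_tier(block: ContextBlock) -> str:
    """Place a block by access frequency: hottest data in the costliest tier."""
    if block.reads_per_minute >= HOT_THRESHOLD:
        return "HBM"
    if block.reads_per_minute >= WARM_THRESHOLD:
        return "HBF (warm layer)"
    return "SSD"


blocks = [
    ContextBlock("active KV cache", 500.0),
    ContextBlock("earlier conversation context", 10.0),
    ContextBlock("archived session", 0.1),
]
placement = {block.name: choose_tier(block) for block in blocks}
for name, tier in placement.items():
    print(f"{name} -> {tier}")
```

The point of the middle tier is visible in the second block: it is re-read too often for SSD latency to be acceptable, but not often enough to justify HBM capacity.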
3. Industry Implementation Timeline
February 2026: SK Hynix and Western Digital (SanDisk) initiated standardization work on HBF under the OCP framework.
Second half of 2026: SanDisk will provide the first batch of HBF samples.
Early 2027: The first AI inference devices equipped with HBF are expected to debut.
4. Integration of Feynman Architecture and SRAM
The market expects the GTC conference to discuss the next-generation Feynman architecture, focusing on the integration of 3D-stacked SRAM with the LPU.
Process and foundry: The on-chip SRAM and the GPU logic layer share the same process family (N7/N5/N3), currently manufactured by Taiwan Semiconductor (TSMC), with Samsung and Intel potentially being introduced in the future.
Addressing market concerns: Will the increase in SRAM reduce HBM demand? The answer is no. Whether for updating training parameters or supporting large context lengths, the capacity of SRAM alone is far from sufficient. The two are not mutually exclusive.
5. Investment Trends and Core Targets
This GTC will comprehensively drive overall demand for DRAM and NAND:
SK Hynix & Samsung: HBM demonstrates the strongest mid- to long-term certainty. Samsung has potential upside at the HBM4 node and in capturing overflow demand for advanced packaging.
SanDisk/Western Digital: Directly benefits from the NAND capacity increase brought by the G3.5 layer, as well as the mid- to long-term evolution in demand for cost-effective enterprise SSDs and HDDs.
PCIe and CXL Sector Analysis
1. Connectivity Foundation: The Evolution of PCIe and CXL
The GTC conference not only showcases computing power but also drives a comprehensive upgrade of AI cluster infrastructure. All architectural advancements rely heavily on three key underlying connectivity technologies:
Memory interconnect chips: Ensure the reliability of massive data transmission between the CPU and memory.
PCIe bus: A high-speed channel connecting the CPU, GPU, network cards, and SSDs. Per-lane speeds are currently evolving toward PCIe 7.0 (128 GT/s).
CXL protocol: An open interconnect protocol built on the PCIe physical layer. It supports memory pooling, allowing GPUs to bypass the CPU and directly access storage layers such as HBF, significantly reducing system latency.
2. The rate transition and industry dynamics of 'optics advancing while copper retreats'
Retimer demand explodes: When inter-chip connection bandwidth transitions from 224G to 448G, signal loss increases exponentially. At 448G, if the trace exceeds 10 centimeters, a PCIe Retimer must be introduced for signal compensation, driving a surge in demand.
Optical fiber integration into motherboards: Under the extreme attenuation of PCIe 7.0, the effective reach of copper cables is reduced to only a few dozen centimeters. Industry leaders (such as AMD and Intel) are beginning to embrace PCIe optical solutions, marking the formal adoption of optical communication technology at the motherboard bus level.
'Optics advancing while copper retreats' is a false crisis: The market had worried that orthogonal backplanes (90-degree vertical connections) between compute and switch boards would eliminate external copper cables and put pressure on interconnect chips. In reality, however: PCB routing still cannot do without chip compensation; in centimeter-level interconnections within computing trays, copper continues to dominate thanks to its "extremely low latency"; and as the density of chips per cabinet surges (e.g., NVL576), the total number of PCIe and CXL links will only increase exponentially.
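The retimer rule of thumb quoted above can be written as a small check. This is a minimal sketch: the 10 cm limit at 448G is taken from the text, while the function name and the choice to return `None` for other rates (for which the text states no threshold) are assumptions of this example.

```python
from typing import Optional

RETIMER_TRACE_LIMIT_CM = 10.0  # from the text: limit for 448G traces


def needs_retimer(lane_rate_gbps: int, trace_length_cm: float) -> Optional[bool]:
    """Apply the article's rule of thumb for 448G links.

    Returns None for other rates, since the text gives no threshold for them.
    """
    if lane_rate_gbps != 448:
        return None
    return trace_length_cm > RETIMER_TRACE_LIMIT_CM


print(needs_retimer(448, 12.0))  # trace longer than the limit
print(needs_retimer(448, 8.0))   # trace within the limit
print(needs_retimer(224, 30.0))  # no rule stated for this rate
```

Under this rule, any board layout with 448G traces longer than 10 cm adds at least one retimer per affected link, which is why higher link counts per cabinet translate directly into retimer demand.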
3. Core Targets
Astera Labs (US stock): Consistently holds 80% of the high-end Retimer market and is expanding into CXL memory controllers, PCIe switches, and optical connectors, transitioning from a single-chip provider to a technical module supplier.
Montage Technology (A-share): Firmly maintains its leading position in memory interconnect chips, is the world's second-largest PCIe Retimer manufacturer, and leads in CXL memory controller chips. It benefits deeply from demand from key overseas clients and from the AI infrastructure wave during China's 'Fifteenth Five-Year Plan' period, showing extremely high earnings certainty.
[Q&A Core Summary]
Q1: Under NVIDIA's major architecture upgrade, how have Samsung and SK Hynix's strategies changed?
> Short-term clash: SK Hynix fiercely defends its leading position in HBM3/3E/4; Samsung, on the other hand, is attempting to accelerate mass production at the HBM4 node to overtake competitors and capture the high-end market.
> Medium- to long-term evolution: Leading players will shed their identity as 'single chip manufacturers' and transition into 'AI storage system providers.' They will integrate HBM, NAND, and CXL technologies and even develop on-chip collaboration solutions for SRAM and HBM targeting the future Feynman architecture. High-end HBM and warm data (NAND) will advance side by side without any trade-off.
Q2: What unexpectedly significant impacts does the evolution of Agentic AI (agent-based AI) technology have on infrastructure?
> Technological Evolution: Large models are transitioning from 'single-step short text question answering' to 'long-context memory management' and 'multi-step planning/tool invocation.'
> Reversal of Computational Logic: The investment focus of data centers is shifting from 'heavy training' to 'heavy inference.' Corporate payment models have also shifted to long-term Token service subscriptions.
> Hardware Exceeding Expectations Across the Board: Agentic workflows have significantly increased sustained demand for storage across different temperature tiers (e.g., enterprise SSDs). Storage will remain tight due to surging demand and supply disruptions (such as strikes), and the incremental growth in AI data centers from cabinet expansion and liquid cooling ecosystems also holds substantial upside potential.
Risk Disclaimer: The above content only represents the author's view. It does not represent any position or investment advice of Futu. Futu makes no representation or warranty.