Zhitong Finance App has learned that China Post Securities released a research report noting that at CES 2026, Jensen Huang officially announced that NVIDIA's next-generation AI supercomputing platform, Vera Rubin, has entered full production. The Rubin platform restructures storage into three tiers: HBM, DRAM, and NAND. The Rubin GPU integrates next-generation high-bandwidth memory, HBM4, whose unit price is significantly higher than that of HBM3e and is expected to lift memory makers' gross margins. Vera Rubin deploys BlueField-4 processors in the rack specifically to manage the KV cache. On pricing, the industry expects NAND prices to rise by a double-digit percentage over 2026, driven by growing demand from cloud service providers and AI applications.
The main views of China Post Securities are as follows:
NVIDIA's Vera Rubin enters full production, with a restructured storage architecture to ease the “memory wall”
At CES 2026, Jensen Huang officially announced that NVIDIA's next-generation AI supercomputing platform, Vera Rubin, has entered full production. According to data released by NVIDIA, the Rubin GPU is equipped with a third-generation Transformer Engine. Its NVFP4 inference and training compute reach 50 and 35 PFLOPS, 5 and 3.5 times that of the previous-generation Blackwell, respectively. HBM4 bandwidth is 22 TB/s, 2.8 times that of the previous generation, and the transistor count is 336 billion, 1.6 times that of Blackwell.
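Each multiple quoted above implies a Blackwell baseline for that metric. The short Python sketch below simply divides each Rubin figure by its stated multiple. The dictionary layout and function name are illustrative, and the results are back-of-envelope values derived from the report's rounded numbers, not official NVIDIA specifications.

```python
# Rubin figures and their "times Blackwell" multiples, as quoted in the report.
rubin = {
    "nvfp4_inference_pflops": (50, 5.0),
    "nvfp4_training_pflops": (35, 3.5),
    "hbm_bandwidth_tb_s": (22, 2.8),
    "transistors_billion": (336, 1.6),
}


def implied_baseline(value, multiple):
    """Return the previous-generation figure implied by value / multiple."""
    return value / multiple


implied = {name: implied_baseline(v, m) for name, (v, m) in rubin.items()}

for name, base in implied.items():
    print(f"{name}: ~{base:.1f}")
```

On these inputs the implied Blackwell values are roughly 10 PFLOPS for both NVFP4 inference and training, about 7.9 TB/s of HBM bandwidth, and about 210 billion transistors.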
Addressing the context storage bottleneck: the Rubin platform restructures storage into HBM, DRAM, and NAND tiers
A pyramid storage architecture. In the era of agentic AI, agents need to remember long conversation histories and complex contexts, which generates very large KV caches. The traditional approach is to keep all of this data in HBM, but HBM is both limited in capacity and expensive. To address this, NVIDIA designed a new storage architecture and launched a third tier, an inference context memory storage platform driven by BlueField-4, which raises the number of tokens processed per second by up to 5 times.
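The tiering idea can be illustrated with a minimal Python sketch: KV cache blocks, ordered hottest first, are placed in the fastest tier that still has room and spill downward otherwise. This is a hypothetical greedy policy for illustration only, not NVIDIA's actual placement logic; the `Tier` class, the `place_kv_blocks` function, and all capacities and block sizes are placeholders rather than vendor figures.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    capacity_gb: float
    used_gb: float = 0.0

    def free_gb(self) -> float:
        return self.capacity_gb - self.used_gb


def place_kv_blocks(block_sizes_gb, tiers):
    """Greedy placement: each block goes to the fastest tier with room.

    block_sizes_gb is assumed to be ordered hottest first, and tiers
    fastest first. Returns one tier name per block, or None for a block
    that fits in no tier.
    """
    placement = []
    for size in block_sizes_gb:
        target = next((t for t in tiers if t.free_gb() >= size), None)
        if target is not None:
            target.used_gb += size
        placement.append(target.name if target else None)
    return placement


# Placeholder capacities in GB, fastest tier first (not vendor specifications).
tiers = [Tier("HBM", 100), Tier("DRAM", 300), Tier("NAND", 1000)]
# Hypothetical KV cache blocks in GB, hottest first.
blocks = [60, 60, 200, 150, 500]

placement = place_kv_blocks(blocks, tiers)
print(placement)
```

With these placeholder numbers only the hottest block stays in HBM, and the rest spill to DRAM and then NAND, which is the capacity pressure the added third tier is meant to absorb.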
HBM: the Rubin GPU upgrades to HBM4, which becomes a “computing core” tightly bound to the GPU
The Rubin GPU integrates next-generation high-bandwidth memory, HBM4, which doubles the interface width relative to HBM3e. Through a new memory controller, deep co-design with the memory ecosystem, and tighter compute-memory integration, the Rubin GPU delivers nearly three times the memory bandwidth of Blackwell. In terms of specifications, each Rubin GPU carries 288 GB of HBM4 with 22 TB/s of bandwidth. HBM is no longer just a “high-speed cache” next to the GPU but a hard constraint on the throughput of the entire system. In terms of unit price, HBM4 is significantly more expensive than HBM3e, which is expected to lift memory makers' gross margins.
DRAM: the Vera CPU's upgraded LPDDR5X holds warm data (KV cache)
Vera pairs its Scalable Coherency Fabric (SCF) with an LPDDR5X memory subsystem of up to 1.5 TB (Grace: 480 GB of LPDDR5X), providing up to 1.2 TB/s of bandwidth (Grace: 512 GB/s) at low power. In practice, LPDDR5X and HBM4 can be treated as a single coherent memory pool, which reduces data movement overhead and supports techniques such as KV cache offload and efficient multi-model execution. In terms of unit price, prices and profits for high-end server DRAM have risen significantly, while consumer DRAM, squeezed as capacity shifts to servers, faces cost pressure and passes price increases downstream, forming a new “AI first” structural price-increase cycle.
NAND: the new BlueField-4-driven inference context memory storage platform is expected to become a category whose demand grows linearly with the number of GPUs
Vera Rubin deploys BlueField-4 processors in the rack specifically to manage the KV cache. BlueField-4 integrates a 64-core Grace CPU, high-bandwidth LPDDR5X memory, and ConnectX-9 networking, providing ultra-low-latency Ethernet or InfiniBand connectivity at up to 800 Gb/s. In terms of capacity, on top of the original 1 TB of memory per GPU, the BlueField-4 DPU storage platform adds 16 TB per GPU, or 1,152 TB for an NVL72 rack. In terms of unit price, the industry expects NAND prices to rise by a double-digit percentage over 2026, driven by growing demand from cloud service providers and AI applications.
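The rack-level figure follows directly from the per-GPU figure, which is also why this tier scales linearly with GPU count. The Python sketch below reproduces that arithmetic using the report's numbers; the constant and function names are illustrative, and the 72-GPU count is taken from the NVL72 rack name.

```python
GPUS_PER_NVL72_RACK = 72         # GPUs in an NVL72 rack
BASE_MEMORY_TB_PER_GPU = 1       # "original" memory per GPU, per the report
CONTEXT_STORAGE_TB_PER_GPU = 16  # added per GPU by the BlueField-4 platform


def rack_context_storage_tb(gpus=GPUS_PER_NVL72_RACK,
                            per_gpu_tb=CONTEXT_STORAGE_TB_PER_GPU):
    """Added context storage per rack: linear in the number of GPUs."""
    return gpus * per_gpu_tb


added_per_rack_tb = rack_context_storage_tb()
total_per_gpu_tb = BASE_MEMORY_TB_PER_GPU + CONTEXT_STORAGE_TB_PER_GPU
expansion_factor = total_per_gpu_tb / BASE_MEMORY_TB_PER_GPU

print(f"Added context storage per rack: {added_per_rack_tb} TB")
print(f"Memory per GPU: {total_per_gpu_tb} TB ({expansion_factor:.0f}x the base)")
```

On these figures, 16 TB across 72 GPUs gives the 1,152 TB quoted for the rack, and total addressable memory per GPU grows from 1 TB to 17 TB.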
Investment advice
China Post Securities is optimistic about the upgraded investment narrative for the storage industry chain and recommends focusing on: 1) overseas leaders: SK Hynix, Samsung, Micron (MU.US), SanDisk (SNDK.US), Kioxia (KIOX.US), etc.; 2) domestic targets: Shannon Semiconductor (300475.SZ), Techwinsemi (001309.SZ), GigaDevice (603986.SH), Puya Semiconductor (688766.SH), Tongyou Technology (00302.SZ), etc.
Risk warnings
Supply and demand may develop more slowly than expected, industry competition may intensify, and technology iteration may fall short of expectations.