This video is part of the appearance, “Solidigm Presents at AI Infrastructure Field Day”. It was recorded as part of AI Infrastructure Field Day 4 at 8:00AM – 9:30AM PT on January 30, 2026.
Watch on YouTube
Watch on Vimeo
This discussion between Solidigm and Vast Data covers their efforts over the past year, from the all-flash TCO collaboration to the way the two companies’ technologies have aligned to meet AI market demands, and examines recent context-related developments shaping 2026 as the year of inference. With the evolution of DPU-enabled inference platforms, the value and capabilities of Solidigm storage and Vast Data solutions drive even greater customer success. Solidigm’s Scott Shadley opened the presentation by highlighting the immense power and storage demands of future AI infrastructure, using the “1.21 gigawatts” analogy. He projected that in 2025, one gigawatt of power could support 550,000 NVIDIA Grace Blackwell GB300 GPUs and 25 exabytes of storage. This scale requires extremely efficient, high-capacity solid-state drives (SSDs) to stay within power envelopes, making Solidigm’s 122-terabyte drives a key enabler.
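As a rough sanity check of those figures, the sketch below works out how many 122-terabyte drives 25 exabytes implies and what share of the one-gigawatt budget they might consume. This is a minimal back-of-the-envelope sketch; the ~20 W per-drive active power is an assumed illustrative value, not a figure from the presentation:

```python
# Back-of-the-envelope check: drives needed for 25 EB at 122 TB each,
# and the share of a 1 GW power budget they would consume.
# ASSUMPTION: ~20 W active power per high-capacity SSD (illustrative only).

EB = 10**18   # bytes in an exabyte (decimal)
TB = 10**12   # bytes in a terabyte (decimal)

total_storage = 25 * EB
drive_capacity = 122 * TB
drive_watts = 20          # assumed per-drive active power

drives = total_storage / drive_capacity
storage_power_mw = drives * drive_watts / 1e6

print(f"Drives needed: {drives:,.0f}")                                 # ~204,918
print(f"Estimated SSD power: {storage_power_mw:.1f} MW of ~1,000 MW")  # ~4.1 MW
```

Even under this assumed wattage, the flash tier consumes only a small slice of the gigawatt, which is consistent with the presentation’s framing that capacity per drive, not drive power, is the lever that keeps storage inside the envelope.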
Looking ahead to 2026, the presentation introduced NVIDIA’s Vera Rubin platform with BlueField-4 DPUs, which fundamentally alters AI storage architecture. The new design introduces an “inference context memory storage platform” (ICMSP) layer. Positioned between direct-attached storage and object/data lake storage, this layer is critical for rapid access to KV cache data in AI inference workloads. The new hierarchy redistributes the 25 exabytes across high-capacity network-attached storage, 6.4 exabytes of the new context memory storage, and 6.1 exabytes of direct-attached storage. While this evolution reduces the number of GPUs supportable within the 1-gigawatt limit, it requires faster NVMe storage to sustain performance and is projected to drive a 5x or greater compound annual growth rate (CAGR) in high-capacity storage demand.
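One way to picture the resulting hierarchy is as a tiered lookup for KV cache blocks: inference checks local direct-attached storage first, falls back to the ICMSP layer, and only then reaches the object/data lake tier. The sketch below is a hypothetical illustration of that flow under the capacities named above (the 12.5-exabyte lake figure is simply the assumed remainder of the 25 exabytes); it is not Solidigm’s, Vast’s, or NVIDIA’s implementation:

```python
# Hypothetical three-tier lookup for KV cache blocks, fastest tier first.
# Tier names and the 6.1/6.4 EB capacities come from the presentation;
# the 12.5 EB lake tier is an assumed remainder, and the logic is illustrative.

from typing import Optional

class Tier:
    def __init__(self, name: str, capacity_eb: float):
        self.name = name
        self.capacity_eb = capacity_eb
        self.blocks: dict[str, bytes] = {}   # key -> cached KV block

    def get(self, key: str) -> Optional[bytes]:
        return self.blocks.get(key)

das   = Tier("direct-attached storage", 6.1)
icmsp = Tier("inference context memory storage platform", 6.4)
lake  = Tier("object/data lake storage", 12.5)

def fetch_kv_block(key: str) -> Optional[bytes]:
    """Check each tier in order; promote a hit so reuse of this context is fast."""
    for tier in (das, icmsp, lake):
        block = tier.get(key)
        if block is not None:
            das.blocks[key] = block   # promote toward the GPU
            return block
    return None  # miss: the GPU must recompute the context from scratch
```

The value of the middle tier is visible in the last line: every lookup the ICMSP absorbs is context the GPUs do not have to recompute, which is where the time-to-first-token and GPU-efficiency gains described below come from.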
Phil Manez of Vast Data then detailed the company’s role in driving storage efficiency for AI. Vast’s disaggregated shared-everything (DASE) architecture separates compute from storage and uses Solidigm SSDs for dense capacity. This design enables global data reduction through a combination of compression, deduplication, and similarity-based reduction, achieving significantly higher data efficiency (often 3-4x more effective capacity) than traditional shared-nothing architectures, which is crucial amid SSD supply constraints. Critically, Vast can deploy its C-node (storage logic) directly on the powerful BlueField-4 DPUs, creating a highly optimized ICMSP. This approach accelerates time to first token, boosts GPU efficiency by offloading context computation, and dramatically reduces power consumption by eliminating intermediate compute layers, enabling AI inference workloads to run at unprecedented speed and scale with shared, globally accessible context.
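To see why layering the three reduction techniques matters, the toy sketch below runs data blocks through exact deduplication, then similarity matching via a coarse “sketch” hash with XOR deltas, then compression. This is emphatically not Vast’s algorithm (the function names and hashing scheme are invented for illustration); it only shows how the stages compound, which is how a 3-4x effective-capacity gain becomes plausible:

```python
# Toy illustration of layered data reduction: exact dedup, then similarity
# grouping via a coarse "sketch" hash, then compression of what remains.
# NOT Vast's algorithm; purely a sketch of why the stages compound.

import hashlib
import zlib

def coarse_sketch(block: bytes) -> bytes:
    """Hash a downsampled view so near-identical blocks tend to collide."""
    return hashlib.sha256(block[::16]).digest()

def reduce_blocks(blocks: list[bytes]) -> int:
    """Return bytes stored after dedup + similarity deltas + compression."""
    exact_seen: set[bytes] = set()
    sketch_ref: dict[bytes, bytes] = {}
    stored = 0
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        if digest in exact_seen:
            continue                      # exact duplicate: store nothing new
        exact_seen.add(digest)
        sketch = coarse_sketch(block)
        ref = sketch_ref.get(sketch)
        if ref is not None:
            # Similar block: store only a compressed XOR delta vs. the reference.
            delta = bytes(a ^ b for a, b in zip(block, ref))
            stored += len(zlib.compress(delta))
        else:
            sketch_ref[sketch] = block    # first of its kind becomes the reference
            stored += len(zlib.compress(block))
    return stored

base = bytes(range(256)) * 16                 # one 4 KiB block
blocks = [base, base, base[:-1] + b"\x00"]    # duplicate + near-duplicate
print(reduce_blocks(blocks), "bytes stored for", sum(map(len, blocks)), "raw bytes")
```

Because the techniques apply globally across the whole namespace in a DASE design, rather than per-node as in shared-nothing systems, duplicates and near-duplicates on different nodes can still be reduced against each other.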
Personnel: Phil Manez, Scott Shadley








