Storage Density for AI Without Compromise with Solidigm

This video is part of the appearance, "Solidigm and MinIO present at AI Infrastructure Field Day 5". It was recorded as part of AI Infrastructure Field Day 5 at 10:30am-12:00pm on June 11, 2026.

Allyn Malventano, AI/SSD Technologist at Solidigm, introduced his role in AI workload categorization and SSD optimization, then outlined the presentation: an overview of storage applications for AI, a cluster-scale exercise with MinIO, and a brief mention of MemKV and Metrim/RAG work. He pointed out that a traditional storage diagram from only a couple of years ago omitted KV caching entirely, which illustrates how quickly AI infrastructure has evolved: KV caching is now crucial for accelerating inference as AI adoption spreads. Solidigm positions its high-capacity solutions in the lower, denser tiers of the AI storage pyramid, which are becoming more important as higher-tier resources like HBM and DRAM face severe constraints and rising prices.

Solidigm ran a cluster-scale exercise with MinIO using an 8×8 setup: eight high-performance client systems initiating workloads and eight server systems fully populated with Solidigm’s 122 TB drives, for a total of 24 petabytes of storage. The clients ran the MinIO benchmarking tool to simulate GPU workloads as storage initiators. The cluster used 400 Gigabit Ethernet NICs and achieved over 250 gigabytes per second over TCP, a figure Malventano called extreme given TCP’s inherent overhead. This baseline performance scaled almost linearly up to 8 nodes and resulted from extensive tuning across the network stack, including switch and NIC buffer adjustments based on precise cable lengths, as well as MinIO’s parity layout. The tuning produced a threefold increase in performance over the initial setup.
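The capacity and throughput figures above can be sanity-checked with some quick arithmetic. The sketch below is illustrative only and rests on assumptions the summary does not state: decimal units (1 PB = 1000 TB), a single 400 GbE port per node, and the 250 GB/s figure being the aggregate for the whole cluster. The function names are hypothetical helpers, not part of any MinIO or Solidigm tool.

```python
# Back-of-the-envelope check of the cluster figures quoted in the summary.
# Assumptions (NOT stated in the source): decimal units, one 400 GbE port
# per node, and 250 GB/s being the aggregate cluster throughput.

SERVERS = 8                  # storage servers in the 8x8 setup
DRIVE_TB = 122               # capacity per drive, in TB
TOTAL_PB = 24                # quoted total capacity, in PB
NIC_GBIT_PER_S = 400         # per-port line rate, in gigabits per second
MEASURED_GBYTE_PER_S = 250   # quoted throughput, in gigabytes per second


def implied_drives_per_server(total_pb, servers, drive_tb):
    """Drives per server implied by the quoted total capacity."""
    return total_pb * 1000 / (servers * drive_tb)


def aggregate_line_rate_gbyte_per_s(nodes, nic_gbit_per_s, ports_per_node=1):
    """Raw aggregate line rate in GB/s (8 bits per byte, no protocol overhead)."""
    return nodes * ports_per_node * nic_gbit_per_s / 8


def line_rate_fraction(measured_gbyte_per_s, line_rate_gbyte_per_s):
    """Fraction of raw line rate that the measured throughput represents."""
    return measured_gbyte_per_s / line_rate_gbyte_per_s


if __name__ == "__main__":
    drives = implied_drives_per_server(TOTAL_PB, SERVERS, DRIVE_TB)
    line_rate = aggregate_line_rate_gbyte_per_s(SERVERS, NIC_GBIT_PER_S)
    fraction = line_rate_fraction(MEASURED_GBYTE_PER_S, line_rate)
    print(f"Implied drives per server: {drives:.1f}")
    print(f"Aggregate line rate: {line_rate:.0f} GB/s")
    print(f"Measured / line rate: {fraction:.1%}")
```

Under these assumptions the raw aggregate line rate is 400 GB/s, so "over 250 gigabytes per second" corresponds to at least 62.5% of line rate over TCP. Passing `ports_per_node=2` gives a rough upper bound for a dual-path configuration, again only under the same assumptions.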

Looking ahead, Solidigm aims to explore further performance enhancements. One potential next step is dual-pathing to double initiator bandwidth, which is especially relevant for future integration with compute nodes such as the NVIDIA B200 or B300. A more significant avenue is RDMA (Remote Direct Memory Access), which lets network adapters move data directly between the memory of two systems and so bypasses CPU overhead during transfers. Although the current exercise simulated GPU memory with DRAM, RDMA would still significantly reduce CPU bottlenecks and enable even higher throughput. Combining dual-pathing and RDMA is a complex but promising way to raise storage performance for demanding AI workloads, despite the considerable tuning challenges involved.

Personnel: Allyn Malventano
