Optimized Storage from Supermicro and Solidigm to Accelerate Your AI Data Pipeline
Event: AI Field Day 4
Appearance: Solidigm Presents at AI Field Day 4
Company: Solidigm, Supermicro
Video Links:
- Vimeo: Optimized Storage from Supermicro and Solidigm to Accelerate Your AI Data Pipeline
- YouTube: Optimized Storage from Supermicro and Solidigm to Accelerate Your AI Data Pipeline
Personnel: Paul McLeod, Wendell Wenjen
Wendell Wenjen and Paul McLeod of Supermicro discussed challenges and solutions for AI and machine learning data storage. Supermicro provides servers, storage, GPU-accelerated servers, and networking solutions, and a significant portion of its revenue is AI-related.
They highlighted the challenges in AI operations and machine learning operations, specifically around data management: collecting data, transforming it, and feeding it into GPU clusters for training and inference. They also emphasized the need for large storage capacity to handle the various phases of the AI data pipeline.
Supermicro has a wide range of products designed to cater to each stage of the AI data pipeline, from data ingestion, which requires a large data lake, to the training phase, which requires retaining large amounts of data for model development and validation. They also discussed the importance of efficient data storage solutions and introduced the concept of an “IO Blender effect,” where multiple data pipelines run concurrently, creating a mix of different IO profiles.
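The “IO Blender effect” can be illustrated with a small sketch. This is not code from the presentation; the stream names, offsets, and round-robin interleaving are assumptions chosen only to show how several individually sequential pipelines can look non-sequential once they share one storage system.

```python
from itertools import zip_longest


def sequential_stream(name, start, count):
    """Block offsets one pipeline stage would request when running alone."""
    return [(name, start + i) for i in range(count)]


def blend(*streams):
    """Round-robin interleave streams, a simplified stand-in for concurrent arrival."""
    merged = []
    for group in zip_longest(*streams):
        merged.extend(req for req in group if req is not None)
    return merged


def sequential_fraction(requests):
    """Fraction of requests whose offset directly follows the previous one."""
    if len(requests) < 2:
        return 1.0
    pairs = zip(requests, requests[1:])
    hits = sum(1 for (_, a), (_, b) in pairs if b == a + 1)
    return hits / (len(requests) - 1)


# Hypothetical pipeline stages, each reading or writing its own region.
ingest = sequential_stream("ingest", 0, 8)
training = sequential_stream("training", 1000, 8)
checkpoint = sequential_stream("checkpoint", 5000, 8)

alone = sequential_fraction(ingest)
mixed = sequential_fraction(blend(ingest, training, checkpoint))
print(f"ingest alone: {alone:.2f} sequential")
print(f"blended:      {mixed:.2f} sequential")
```

Each stream is fully sequential on its own, but in this toy model the blended request order has no back-to-back sequential accesses, which is the kind of mixed profile the storage layer has to absorb.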
The presenters then went deeper into the storage solutions, highlighting Supermicro’s partnership with WEKA, a software-defined storage company, and how the combined architecture is optimized for AI workloads. They explained the importance of NVMe flash storage, which can now outpace processors, and the challenges of scaling such storage solutions. They also discussed Supermicro’s extensive portfolio of storage servers, ranging from multi-node systems to petascale architectures, designed to accommodate different customer needs.
Supermicro’s approach to storage for AI includes a two-tiered solution: flash storage for high performance and disk-based storage for high capacity at a lower cost. They also touched on the role of GPUDirect Storage in reducing latency and the flexibility of their software-defined storage solutions.
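As a rough illustration of the two-tier idea, the sketch below sorts datasets into a flash tier and a disk tier. The placement rule (days since last access), the threshold, and all dataset names and sizes are hypothetical; the presentation summary does not describe how data is actually placed or moved between tiers.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    size_tb: float
    days_since_access: int


def assign_tier(ds, hot_window_days=7):
    """Hypothetical rule: recently used data goes to flash, the rest to disk."""
    return "flash" if ds.days_since_access <= hot_window_days else "disk"


def capacity_by_tier(datasets, hot_window_days=7):
    """Total capacity (TB) each tier would need under the rule above."""
    totals = {"flash": 0.0, "disk": 0.0}
    for ds in datasets:
        totals[assign_tier(ds, hot_window_days)] += ds.size_tb
    return totals


# Made-up example data, for illustration only.
datasets = [
    Dataset("active-training-set", 40.0, 0),
    Dataset("latest-checkpoints", 10.0, 2),
    Dataset("raw-ingest-archive", 400.0, 90),
    Dataset("old-model-versions", 50.0, 30),
]

totals = capacity_by_tier(datasets)
print(totals)
```

With these made-up numbers, most of the capacity lands on the disk tier while the smaller working set stays on flash, which matches the cost and performance split described above.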
The presentation concluded with an overview of Supermicro’s product offerings for different AI and machine learning workloads, from edge devices to large data center storage solutions.