This video is part of the appearance, “Solidigm Presents at AI Data Infrastructure Field Day 1”. It was recorded as part of AI Data Infrastructure Field Day 1 from 10:30 to 12:30 on October 3, 2024.
Ace Stryker from Solidigm presented on the critical role of data infrastructure in AI value creation, emphasizing the importance of quality and quantity in training data. He illustrated this with an AI-generated image of a hand with an incorrect number of fingers, highlighting the limitations of AI models that lack intrinsic understanding of the objects they depict. This example underscored the necessity for high-quality training data to improve AI model outputs. Stryker explained that AI models predict desired outputs based on training data, which often lacks comprehensive information about the objects, leading to errors. He stressed that these challenges are not unique to image generation but are prevalent across various AI applications, where data variety, low error margins, and limited training data pose significant hurdles.
Stryker outlined the AI data pipeline, breaking it down into five stages: data ingestion, data preparation, model development, inference, and archiving. He detailed the specific data and performance requirements at each stage, noting that the volume of data decreases as it moves through the pipeline, while the type of I/O operations varies. For instance, data ingestion involves large sequential writes to object storage, while model training requires random reads from high-performance storage. He also discussed the importance of checkpointing during model training to guard against data loss and ensure efficient recovery after a failure. Stryker highlighted the growing trend of distributing AI workloads across core data centers, regional data centers, and edge servers, driven by the need for faster processing, data security, and reduced data transfer costs.
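To make the checkpointing point concrete, here is a minimal Python sketch of saving and resuming training state. It is not taken from the presentation: the function names (`save_checkpoint`, `load_checkpoint`, `train`), the JSON file format, and the running-sum "model" are all hypothetical stand-ins chosen for illustration. The one design choice it shows is writing to a temporary file and renaming it into place, so an interruption mid-write leaves the previous checkpoint intact.

```python
import json
import os
import tempfile


def save_checkpoint(path, step, state):
    """Write a checkpoint atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"step": step, "state": state}, f)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to storage before renaming
        os.replace(tmp_path, path)  # atomic rename within the same filesystem
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return (step, state) from the last checkpoint, or (0, None) if none exists."""
    if not os.path.exists(path):
        return 0, None
    with open(path) as f:
        data = json.load(f)
    return data["step"], data["state"]


def train(path, total_steps, checkpoint_every):
    """Toy training loop; a running sum stands in for real model weights (assumption)."""
    step, state = load_checkpoint(path)  # resume from the last saved step, if any
    total = state["total"] if state else 0
    while step < total_steps:
        step += 1
        total += step  # placeholder for one real training step
        if step % checkpoint_every == 0 or step == total_steps:
            save_checkpoint(path, step, {"total": total})
    return total
```

Calling `train` a second time with the same path picks up from the last saved step rather than step 0, which is the recovery behavior described above. In a real training job the checkpoint would hold model and optimizer state and would typically be far larger, which is why checkpoint writes place sequential-write demands on storage.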
The presentation also addressed the challenges and opportunities of deploying AI at the edge. Stryker noted that edge environments often have lower power budgets, space constraints, and higher serviceability requirements compared to core data centers. He provided examples of edge deployments, such as medical imaging in hospitals and autonomous driving solutions, where high-density storage solutions like QLC SSDs are used to enhance data collection and processing. Stryker emphasized the need for storage vendors to adapt to these evolving requirements, ensuring that their products can meet the demands of both core and edge AI applications. The session concluded with a discussion on Solidigm’s product portfolio and how their SSDs are designed to optimize performance, energy efficiency, and cost in AI deployments.
Personnel: Ace Stryker