|
This video is part of the appearance, “Hammerspace presents at AI Infrastructure Field Day 3”. It was recorded as part of AI Infrastructure Field Day 3 from 10:30-12:30 on September 10, 2025.
Watch on YouTube
Watch on Vimeo
Tier 0 Storage is activated within existing GPU and CPU clusters by putting the underused NVMe already inside the servers to work. Instead of sitting siloed and unprotected, this stranded capacity is unified with the other storage tiers through Hammerspace, which protects the data, reduces the load on Tier 1, and eliminates vendor silos. Tier 0 also automates AI data placement across sites and clouds, transforming wasted storage into high-performance capacity that accelerates AI and lowers costs.
Floyd Christofferson from Hammerspace introduces Tier 0, focusing on how it accelerates AI workflows in GPU- and CPU-based clusters. The core problem it addresses is the stranded capacity of local NVMe storage inside servers, which, despite its speed, is often underutilized. Reaching data over the network on external storage becomes a bottleneck, especially as AI workflows demand longer context lengths and faster token access. Increasing network capacity is an option, but it is expensive and still limited. Tier 0 instead aggregates the local NVMe capacity into a single storage tier, makes it the primary storage for workflows, and enables programmatic data orchestration, effectively unlocking petabytes of previously unused storage and eliminating the need to buy additional expensive Tier 1 storage.
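To make the idea of programmatic data orchestration concrete, the sketch below shows one way a placement policy could be expressed in code. It is an illustrative Python example only; the FileState record, function names, and thresholds are hypothetical and do not represent Hammerspace’s actual API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical record describing a file as seen by a data orchestrator.
@dataclass
class FileState:
    path: str
    size_bytes: int
    last_access: datetime
    tier: str  # "tier0" (local NVMe) or "tier1" (external shared storage)

def choose_tier(f: FileState, hot_window: timedelta = timedelta(hours=6)) -> str:
    """Return the tier a file should live on.

    Recently accessed ("hot") data stays on the local NVMe tier so GPUs
    read it at local speed; colder data is demoted to shared Tier 1.
    """
    if datetime.now() - f.last_access <= hot_window:
        return "tier0"
    return "tier1"

def plan_moves(files: list[FileState]) -> list[tuple[str, str, str]]:
    """Build a list of (path, from_tier, to_tier) moves for files out of place."""
    moves = []
    for f in files:
        target = choose_tier(f)
        if target != f.tier:
            moves.append((f.path, f.tier, target))
    return moves

if __name__ == "__main__":
    now = datetime.now()
    files = [
        FileState("/data/checkpoints/step_1000.pt", 8 << 30, now - timedelta(minutes=5), "tier1"),
        FileState("/data/archive/run_42.log", 1 << 20, now - timedelta(days=3), "tier0"),
    ]
    for path, src, dst in plan_moves(files):
        print(f"move {path}: {src} -> {dst}")
```

In a real deployment the moves would be carried out by the data platform itself, transparently to applications; the point of the sketch is only that placement can be driven by declarative, programmable rules rather than manual copies.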
Hammerspace’s Tier 0 builds on standards-based environments: clients use standard NFS, SMB, and S3 protocols, so no client-side software needs to be installed. The technology uses parallel NFS v4.2 with the flex files layout, which Hammerspace contributed to the Linux kernel, to improve performance and efficiency. This approach avoids proprietary clients and special server deployments, allowing the system to work with existing infrastructure. Orchestration and unification of capacity across servers are key to the solution, turning compute nodes into storage servers without creating isolated islands, thereby reducing bottlenecks and improving data access speeds.
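As a concrete illustration of the standards-based client side, the sketch below mounts a share using nothing but the stock in-kernel Linux NFS client and then checks the negotiated version. The server name, export path, and mount point are hypothetical, and the export layout of a real Hammerspace deployment may differ; the NFS mount options themselves are standard Linux.

```python
import subprocess
from pathlib import Path

# Hypothetical values: replace with your metadata server export and mount point.
SERVER_EXPORT = "hammerspace.example.com:/ai-datasets"
MOUNT_POINT = "/mnt/ai-datasets"

def mount_share() -> None:
    """Mount the share with the standard in-kernel Linux NFS client.

    vers=4.2 requests NFSv4.2; when the server hands out pNFS flex files
    layouts, the client reads and writes directly against the data servers
    (here, the NVMe in the compute nodes) with no proprietary agent installed.
    """
    Path(MOUNT_POINT).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["mount", "-t", "nfs", "-o", "vers=4.2", SERVER_EXPORT, MOUNT_POINT],
        check=True,
    )

def verify_nfs42() -> bool:
    """Confirm the mount negotiated NFSv4.2 by inspecting /proc/mounts."""
    for line in Path("/proc/mounts").read_text().splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == MOUNT_POINT and fields[2] in ("nfs", "nfs4"):
            return "vers=4.2" in fields[3]
    return False

if __name__ == "__main__":
    mount_share()  # requires root privileges
    print("NFSv4.2 mount:", verify_nfs42())
```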
The presentation highlights the performance benefits of Tier 0, showcasing theoretical results and MLPerf benchmarks that demonstrate superior performance per rack unit. By using local NVMe storage, Hammerspace reduces reliance on expensive and slower cloud storage networks, leading to higher GPU utilization. Hammerspace also contributes enhancements to the Linux kernel, such as LOCALIO, which reduces CPU utilization and accelerates writes by letting the NFS client bypass the network stack when the data lives on the same node, underscoring its commitment to standards-based solutions and continuous improvement in data accessibility. The architecture is designed to be non-disruptive, allowing live data mobility behind the scenes and ensuring a seamless user experience.
Personnel: Floyd Christofferson