This video is part of the appearance, “VMware by Broadcom Presents at AI Field Day 6”. It was recorded as part of AI Field Day 6 at 10:00-12:00 on January 29, 2025.
Justin Murray presented a technical review of the generally available VMware Private AI Foundation with NVIDIA product, an advanced service on VMware Cloud Foundation. The presentation focused on the architecture of VMware Private AI Foundation (VPF), which is built on VMware Cloud Foundation (VCF) 5.2.1. Murray explained the layered architecture, distinguishing between the infrastructure provisioning layer (VMware’s intellectual property) and the data science layer, which includes containers running inference servers from NVIDIA and other open-source options. He emphasized significant advancements since the product’s minimum viable product launch in May 2024, including enhanced model governance capabilities for safe model testing and deployment.
The presentation delved into the rationale behind using VCF and VPF for managing AI/ML workloads. The speaker argued that the increasing complexity of model selection and infrastructure setup (including GPU selection), together with the need for RAG (Retrieval-Augmented Generation) applications, calls for a robust and manageable infrastructure. VMware Cloud Foundation, with its virtualization capabilities, provides this by enabling isolated deep learning VMs and Kubernetes clusters for different teams and projects, which avoids management sprawl and improves resource utilization. A key element is self-service automation, which lets data scientists request resources (such as GPUs) with minimal IT interaction, streamlining the process and enabling faster model deployment.
A significant portion of the presentation covered GPU management and sharing, emphasizing the role of NVIDIA drivers in enabling virtual GPU (vGPU) profiles that allow for efficient resource allocation and isolation. The speaker highlighted advancements in vMotion for GPUs, which enable rapid migration of workloads, and the integration of tools for monitoring GPU utilization within the VCF Operations console. The discussion touched on model version control, the role of Harbor as a repository for models and containers, and the availability of a service catalog for deploying various AI components. The presentation concluded with a demo of deploying a Kubernetes cluster for a RAG application, showcasing the self-service capabilities and simplified infrastructure management offered by VMware Private AI Foundation.
Personnel: Justin Murray