Alex Fanous, Tasha Drew, Justin Murray, and Roger Fortier presented for Broadcom at AI Field Day 6
This presentation was delivered on January 29, 2025, from 10:00 to 12:00.
Presenters: Alex Fanous, Justin Murray, Tasha Drew
Broadcom is making its third appearance at AI Field Day to answer the question “Why VMware for Private AI?” You can expect to hear about three major trends driving customer interest in Private AI, get an update on what’s new in VMware Private AI Foundation, and hear from customer-facing teams about how Broadcom customers are building and deploying AI apps. And, in possibly one of the most important conversations in the AI space today, Broadcom will dive deep into model governance and security.
Three Reasons Customers Choose VMware Private AI from Broadcom
Watch on YouTube
Watch on Vimeo
Overview of why customers are choosing VMware Private AI and popular enterprise use cases we are seeing in the field. Tasha Drew’s presentation at AI Field Day 6 highlighted three key reasons driving the adoption of VMware Private AI. First, she addressed the often-overlooked issue of GPU underutilization. Data from UC Berkeley’s Sky Computing Lab, corroborated by VMware’s internal findings, demonstrated that current deployment practices, such as dedicating one GPU per model, lead to significant inefficiency due to inconsistent inference workload patterns. This underutilization is exacerbated by a phenomenon Drew termed “GPU hoarding,” where teams hold on to their allocated GPUs rather than share them, for fear of losing access to the resource. VMware Private AI addresses this through intelligent workload scheduling and resource pooling, maximizing GPU utilization and enabling resource sharing across different teams and priorities.
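To make the underutilization argument concrete, here is a small illustrative sketch (not a VMware API) contrasting the one-GPU-per-model practice with first-fit packing onto a shared pool. All model names and memory figures are hypothetical.

```python
# Illustrative first-fit packing of model workloads onto a shared GPU pool,
# contrasted with the one-GPU-per-model practice described above.
# All numbers are hypothetical stand-ins for real memory footprints.

def gpus_one_per_model(models):
    """Dedicated deployment: every model gets its own GPU."""
    return len(models)

def gpus_pooled(models, gpu_mem_gb=80):
    """First-fit packing: co-locate models whose memory footprints fit."""
    gpus = []  # remaining free memory (GB) per allocated GPU
    for need in sorted(models.values(), reverse=True):
        for i, free in enumerate(gpus):
            if need <= free:
                gpus[i] = free - need  # fits on an existing GPU
                break
        else:
            gpus.append(gpu_mem_gb - need)  # open a new GPU
    return len(gpus)

models = {"chatbot": 16, "summarizer": 24, "embedder": 8, "classifier": 12}
print(gpus_one_per_model(models))  # 4 dedicated GPUs
print(gpus_pooled(models))         # all four fit on 1 pooled 80 GB GPU
```

The same packing intuition underlies the pooling and scheduling Drew described: bursty, inconsistent inference workloads rarely saturate a dedicated accelerator, so co-location recovers the idle capacity.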
The second driver for private AI adoption is cost. Drew presented data indicating a dramatic increase in cloud spending driven by AI workloads, often leading to budget reallocation and project cancellations. This high cost is attributed to various factors including platform fees, data security, infrastructure expenses, and the cost of upskilling staff to handle cloud-based AI tools. In contrast, VMware Private AI offers a more predictable and potentially lower total cost of ownership (TCO) by optimizing resource usage within the enterprise’s existing infrastructure. The presentation referenced an IDC white paper showing that a significant percentage of enterprises perceive on-premise AI solutions as equally or less expensive than cloud-based alternatives, primarily due to the decreased infrastructure and service costs.
Finally, Drew emphasized the critical role of model governance in driving the shift towards private AI. As enterprises embrace generative AI and train models on proprietary data and intellectual property (IP), concerns around data sensitivity and security become paramount. VMware Private AI tackles these concerns by providing robust control mechanisms such as role-based access control (RBAC) to regulate model access and ensure compliance with data protection regulations. While the technical complexities of managing access control within embedding and vector databases are acknowledged, Drew highlighted ongoing development efforts to integrate comprehensive security measures at both the database and application output levels. Overall, the presentation positioned VMware Private AI as a comprehensive solution addressing the challenges of cost, efficiency, and security in deploying and managing enterprise AI workloads.
Personnel: Tasha Drew
VMware Private AI Foundation Capabilities and Features Update from Broadcom
Watch on YouTube
Watch on Vimeo
A technical review of the generally available VMware Private AI Foundation with NVIDIA product, an advanced service on VMware Cloud Foundation, was presented. The presentation focused on the architecture of VMware Private AI Foundation (VPF), highlighting its reliance on VMware Cloud Foundation (VCF) 5.2.1 as its base. The speaker, Justin Murray, explained the layered architecture, distinguishing between the infrastructure provisioning layer (VMware’s intellectual property) and the data science layer, which includes containers running inference servers from NVIDIA and other open-source options. Significant advancements since the product’s minimum viable product launch in May 2024 were emphasized, including enhanced model governance capabilities for safe model testing and deployment.
The presentation delved into the rationale behind using VCF and VPF for managing AI/ML workloads. The speaker argued that the increasing complexity of model selection, infrastructure setup (including GPU selection), and the need for retrieval-augmented generation (RAG) applications necessitates a robust and manageable infrastructure. VMware Cloud Foundation, with its virtualization capabilities, provides this solution by enabling isolated deep learning VMs and Kubernetes clusters for different teams and projects, preventing management nightmares and optimizing resource utilization. A key element is the self-service automation, allowing data scientists to request resources (like GPUs) with minimal IT interaction, streamlining the process and enabling faster model deployment.
A significant portion of the presentation covered GPU management and sharing, emphasizing the role of NVIDIA drivers in enabling virtual GPU (vGPU) profiles that allow for efficient resource allocation and isolation. The speaker highlighted the advancements in vMotion for GPUs, enabling rapid migration of workloads, and the integration of tools for monitoring GPU utilization within the VCF operations console. The discussion touched on model version control, the role of Harbor as a repository for models and containers, and the availability of a service catalog for deploying various AI components. The presentation concluded with a demo showing the quick and easy deployment of a Kubernetes cluster for a RAG application, showcasing the self-service capabilities and simplified infrastructure management offered by VMware Private AI Foundation.
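The RAG pattern at the heart of the demo can be compressed into three steps: retrieve relevant context, augment the prompt with it, and generate a response. The sketch below is illustrative only; the corpus, the keyword-overlap scoring, and the `generate` stub are stand-ins, not VMware or NVIDIA APIs.

```python
# A minimal sketch of the retrieval-augmented generation (RAG) flow described
# in the session. Real deployments would use embedding-based retrieval and an
# inference server; both are stubbed here for clarity.

def retrieve(query, corpus, k=2):
    """Rank documents by naive keyword overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def augment(query, docs):
    """Build a grounded prompt from the retrieved context."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt):
    """Stand-in for a call to an inference endpoint."""
    return f"[model response to {len(prompt)}-char prompt]"

corpus = [
    "Employees accrue 20 vacation days per year.",
    "Expense reports are due within 30 days of purchase.",
    "GPU clusters are reserved through the self-service catalog.",
]
answer = generate(augment("How many vacation days do employees get?",
                          retrieve("vacation days", corpus)))
print(answer)
```

The value of the platform is that the undifferentiated pieces around this loop (the vector store, the model endpoint, the Kubernetes cluster they run on) are provisioned from the service catalog rather than hand-built.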
Personnel: Justin Murray
Real-World Customer Journey with VMware Private AI from Broadcom
Watch on YouTube
Watch on Vimeo
Broadcom is actively engaged with customers on proof of concepts and production deployments of VMware Private AI Foundation. This session details a composite example of a typical customer journey, drawing from real-world scenarios encountered during customer engagements. The presentation focuses on the infrastructure aspects often overlooked, emphasizing the importance of a robust foundation for data scientists and AI engineers to effectively utilize AI tools. It highlights the iterative process of deploying and refining a private AI solution, starting with a simple Retrieval-Augmented Generation (RAG) application built on VMware Private AI Foundation.
The customer journey begins with a high-level mandate from senior leadership to implement AI, often without specific technical details. A common starting point is a simple application, such as a chat app, using readily available data such as HR policies. This initial deployment allows for a gradual learning curve, introducing the use of vector databases for similarity searches and leveraging the VMware Private AI Foundation console for easy deployment. The presentation showcases how customers typically customize the initial templates, often adopting open-source tools like OpenWebUI for a more familiar user interface. The iterative process involves continual refinement, adjusting parameters, testing various LLMs, and ultimately scaling the infrastructure as needed using load balancers and multiple nodes.
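The vector-database similarity search behind that first chat app can be illustrated in a few lines: documents and queries are embedded as vectors, and the closest document by cosine similarity supplies the answer's context. The 3-dimensional "embeddings" below are hand-made stand-ins for real model output, not a production setup.

```python
# A toy illustration of vector similarity search, as used in the initial
# HR-policy chat app described above. Real systems use high-dimensional
# embeddings from a model and a dedicated vector database.
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical document store: text mapped to a hand-made embedding.
store = {
    "How do I request vacation?":          [0.9, 0.1, 0.0],
    "What is the GPU reservation process?": [0.1, 0.9, 0.2],
    "Where do I file expenses?":           [0.0, 0.2, 0.9],
}

def nearest(query_vec):
    """Return the stored document most similar to the query embedding."""
    return max(store, key=lambda doc: cosine(store[doc], query_vec))

print(nearest([0.8, 0.2, 0.1]))  # "How do I request vacation?"
```

Swapping keyword search for this kind of semantic lookup is typically the first conceptual step teams take in the journey, before iterating on models, prompts, and scale.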
Throughout the customer journey, the presentation stresses the importance of iterative development and feedback. The process emphasizes starting with a functional prototype, gathering feedback, and then progressively improving performance and scalability. This approach involves close collaboration between the infrastructure team, data scientists, and developers. The use of VMware’s existing infrastructure, such as vCenter and Data Services Manager, is emphasized as a key advantage, minimizing the need for retraining staff or adopting new vendor-specific tools. The session concludes by highlighting the flexibility and adaptability of the VMware Private AI Foundation platform and its ability to accommodate evolving AI architectures, future-proofing investments in AI infrastructure.
Personnel: Alex Fanous
AI Model Security and Governance – Broadcom VMware Private AI Model Gallery Demo
Watch on YouTube
Watch on Vimeo
Model governance is crucial as enterprises adopt AI, requiring secure and consistent model behavior. This presentation by Tasha Drew of Broadcom VMware focuses on the challenges of achieving model governance and how VMware Private AI’s model gallery addresses these challenges through its capabilities and workflows. The core issue highlighted is the risk associated with introducing models into enterprise environments, similar to the security concerns surrounding containers in their early adoption. This necessitates robust security protocols and consistent monitoring to prevent vulnerabilities and ensure the models operate as intended.
A key aspect of the presentation emphasizes the growing importance of “agentic workflows,” where Large Language Models (LLMs) act as interfaces, orchestrating interactions with various tools and agents to achieve more accurate and comprehensive results. The example of a sales agent leveraging multiple data sources (public internet, internal documents, CRM systems) to generate a compelling presentation illustrates this concept. This highlights the complexity of integrating AI into business processes and the need for robust governance to manage the multiple data sources and agents involved.
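The agentic pattern in the sales example can be sketched as an orchestrator that routes one task through several tools and assembles the results. The tool names, their outputs, and the assembly step below are all illustrative placeholders, not a real product API.

```python
# A compressed sketch of the agentic workflow described above: an
# orchestrating "agent" fans a task out to multiple tools (stubbed here)
# and combines what comes back into one artifact.

def search_web(topic):   # stand-in for a public internet search tool
    return f"public facts about {topic}"

def search_docs(topic):  # stand-in for internal document retrieval
    return f"internal docs on {topic}"

def query_crm(topic):    # stand-in for a CRM system lookup
    return f"CRM history for {topic}"

TOOLS = [search_web, search_docs, query_crm]

def sales_agent(topic):
    """Gather context from every tool, then hand it to a generation step."""
    context = [tool(topic) for tool in TOOLS]
    return f"Presentation draft using: {'; '.join(context)}"

print(sales_agent("Acme Corp"))
```

Each extra tool in a workflow like this widens the governance surface: every data source the agent touches is another place access control and auditing have to hold, which is the problem the next section takes up.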
The presentation then details how VMware Private AI Foundation, integrated with NVIDIA, helps achieve model governance. This includes a demo showcasing a workflow from model import (from sources like Hugging Face) through security testing (using tools like Giskard) to deployment in a secure environment (Harbor). This integrated approach allows for programmatic model evaluation, monitoring for behavioral drift, and controlled access through versioning and access control mechanisms. The ultimate goal is to enable enterprises to safely adopt AI by operationalizing security testing and providing a centralized, auditable repository for their AI models, thereby minimizing risks and maximizing the benefits of AI within their organizations.
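The gated promotion workflow from the demo can be summarized as: pull a model, run automated safety checks, and only publish it to the private registry if every check passes. Everything below is a hedged sketch under that assumption; the function bodies, check names, model name, and registry URL are placeholders, not the actual Private AI Foundation, Giskard, or Harbor APIs.

```python
# Illustrative model-governance gate: import -> scan -> promote.
# A real pipeline would call a scanner such as Giskard and push the
# approved artifact to a Harbor registry; both are stubbed here.

def import_model(name):
    """Pretend to fetch a model from an upstream source."""
    return {"name": name, "source": "huggingface"}

def run_safety_checks(model):
    """Stand-in for automated security testing; returns per-check results."""
    return {"prompt_injection": True, "harmful_output": True}

def promote(model, registry="harbor.internal.example/models"):
    """Publish the model tag only if every safety check passed."""
    results = run_safety_checks(model)
    if not all(results.values()):
        raise RuntimeError(f"{model['name']} failed checks: {results}")
    return f"{registry}/{model['name']}:v1"

print(promote(import_model("example-model")))
```

Making the gate programmatic is the point: every model version that reaches the registry carries an auditable record of which checks it passed, which is what lets governance scale beyond manual review.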
Personnel: Tasha Drew