Chris Gully, Jake Augustine, Justin Murray, and Ramesh Radhakrishnan presented for VMware at AI Field Day 5
This presentation was recorded on September 12, 2024, from 10:30 to 12:30.
Presenters: Chris Gully, Jake Augustine, Justin Murray, Ramesh Radhakrishnan
VMware Private AI Business Update
Watch on YouTube
Watch on Vimeo
This session will provide a business update on the state of VMware Private AI in the market. It focuses on advancements and announcements since VMware by Broadcom’s presentation at AI Field Day 4, including the key enterprise AI challenges and the most common business use cases that have emerged. This session is followed by technical sessions and demos detailing the generally available version of VMware Private AI Foundation with NVIDIA, best practices for operationalizing VMware Private AI, and real-world application of VMware Private AI to deliver AI applications to users.
The VMware Private AI Business Update presented by Jake Augustine at AI Field Day 5 provided a comprehensive overview of VMware’s advancements in the AI space, particularly focusing on the VMware Private AI Foundation with NVIDIA. The solution, generally available since July, is designed to simplify the deployment of AI workloads across enterprises, leveraging VMware Cloud Foundation (VCF) and NVIDIA AI Enterprise. The collaboration between VMware and NVIDIA allows enterprises to operationalize GPUs within their data centers, providing a familiar control plane for IT teams while enabling data scientists to accelerate AI initiatives. The solution supports NVIDIA-certified hardware, including GPUs like the A100, H100, and L40, and offers flexibility in storage options, with vSAN being recommended but not mandatory for all workloads.
One of the key challenges VMware aims to address is the growing complexity and sprawl of AI workloads within organizations. As AI adoption increases, particularly with the rise of generative AI and large language models, enterprises are struggling to scale these workloads efficiently. VMware’s platform-based approach provides a unified infrastructure that allows IT teams to manage AI workloads at scale, reducing the need for data scientists to focus on infrastructure management. This approach also helps stabilize the organic growth of AI projects within organizations, offering better visibility into resource utilization and cost planning. By virtualizing AI workloads, VMware enables enterprises to optimize GPU usage, reducing costs and improving operational efficiency.
The presentation also highlighted the importance of time-to-value for enterprises adopting AI. VMware’s solution has demonstrated significant improvements in deployment speed, with one financial services customer reducing the time to deploy a RAG (retrieval-augmented generation) application from weeks to just two days. Additionally, the platform’s ability to handle both inference and training workloads, while integrating with third-party models and tools, makes it a versatile solution for enterprises at different stages of AI adoption. Overall, VMware’s Private AI Foundation with NVIDIA is positioned as a scalable, secure, and cost-effective solution for enterprises looking to operationalize AI across their organizations.
Personnel: Jake Augustine
VMware Private AI Foundation with NVIDIA Technical Overview and Demo
Watch on YouTube
Watch on Vimeo
This session will provide an update on VMware Private AI Foundation with NVIDIA, showcasing its evolution from preview to general availability. Key features and improvements made since the preview phase will be highlighted, giving delegates a clear understanding of what the product looks like in its fully realized state. The session will illustrate a day in the life of a GenAI application developer, the product’s capabilities for Retrieval Augmented Generation (RAG), and then walk through a demo.
The VMware Private AI Foundation with NVIDIA has evolved from its preview phase to general availability, with key updates in its architecture and features. One of the significant changes is the introduction of the NVIDIA Inference Microservice (NIM), replacing the Triton Inference Server, and the addition of the Retriever microservice, which retrieves data from a vector database in the Retrieval Augmented Generation (RAG) design. The session emphasizes the importance of RAG in enhancing large language models (LLMs) by integrating private company data stored in vector databases, which helps mitigate issues like hallucinations and lack of citation in LLMs. The demo showcases how VMware provisions the vector database and the chosen LLM, automating the process to streamline the workflow for data scientists and developers.
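The RAG flow described here — a retriever pulls the closest private documents from a vector database, and those documents ground the LLM’s answer — can be sketched as follows. This is a toy illustration, not the actual NIM or Retriever microservice APIs: the bag-of-words "embedding" and in-memory document list stand in for a real embedding model and vector database.

```python
# Minimal RAG sketch: embed documents, retrieve the closest ones for a
# query, and build a context-grounded prompt for the LLM. The grounding
# context is what mitigates hallucination and enables citation.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: word counts stand in for a dense vector.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list, k: int = 2) -> list:
    # Rank documents by similarity to the query; a vector database
    # does this at scale with approximate nearest-neighbor search.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list) -> str:
    # The retrieved private documents become the answer's grounding.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "vSphere supports vGPU profiles for NVIDIA GPUs.",
    "The retriever microservice queries the vector database.",
    "Quarterly sales figures are stored in the finance share.",
]
print(build_prompt("How does the retriever use the vector database?", docs))
```

In a production deployment, the prompt built here would be sent to the provisioned LLM endpoint rather than printed.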
The presentation also highlights the challenges faced by data scientists, such as managing infrastructure and keeping up with the rapid pace of model and toolkit updates. VMware Cloud Foundation (VCF) addresses these challenges by providing a virtualized environment that allows for flexible GPU allocation and infrastructure management. The demo illustrates how data scientists can easily request AI workstations or Kubernetes clusters with pre-configured environments, reducing setup time from days to minutes. The automation tools provided by VMware simplify the deployment of deep learning VMs and Kubernetes clusters, allowing data scientists to focus on model development and testing rather than infrastructure concerns.
Additionally, the session touches on the importance of governance and lifecycle management in AI development. VMware offers tools to control and version models, containers, and infrastructure components, ensuring stability and compatibility across different environments. The demo also showcases how private data can be loaded into a vector database to enhance LLMs, and how Kubernetes clusters can be auto-scaled to handle varying workloads. The presentation concludes with a discussion on the frequency of updates to the stack, with VMware stabilizing on specific versions of NVIDIA components for six-month intervals, while allowing for custom upgrades if needed.
Personnel: Justin Murray
Getting Started with VMware Private AI Foundation with NVIDIA
Watch on YouTube
Watch on Vimeo
In this session, we will take a practitioner’s point of view of Private AI, walking through the value of not trying to create a do-it-yourself AI infrastructure, how to pick the right GPU for your organization, and delivering your AI use case.
In this presentation, Chris Gully from VMware by Broadcom discusses the challenges and solutions for organizations embarking on their AI journey, particularly focusing on the importance of not attempting to build AI infrastructure from scratch. He emphasizes that simply acquiring GPUs and installing them in servers is not enough to create a functional AI environment. There are numerous logistical considerations, such as power, airflow, and compatibility, that need to be addressed. Gully advocates for purchasing pre-validated and certified solutions to ensure a smoother experience, better support, and faster deployment of AI services. He also highlights the importance of selecting the right GPU for specific AI use cases, as different GPUs offer varying levels of performance and functionality.
Gully also delves into the complexities of GPU virtualization and the benefits of using technologies like vGPU (virtual GPU) and MIG (Multi-Instance GPU) to optimize resource utilization. These technologies allow organizations to slice and share GPU resources more effectively, ensuring that expensive hardware is used efficiently across multiple business units. He shares real-world examples of customers who faced challenges when their AI models did not fit the GPUs they had selected, underscoring the importance of understanding the technical requirements of AI workloads before making hardware decisions. Gully also discusses how VMware works closely with OEMs and NVIDIA to ensure that their solutions are fully certified and supported, providing customers with confidence that their AI infrastructure will work as expected.
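The sizing problem Gully describes — a model that does not fit the GPU slice chosen for it — can be checked with simple arithmetic before hardware is committed. The sketch below uses the A100-80GB MIG profile names; the footprint estimate (2 bytes per FP16 parameter plus roughly 20% runtime overhead) is a common rule of thumb, not an official VMware or NVIDIA formula.

```python
# Sketch: pick the smallest MIG slice whose memory fits a model's
# estimated FP16 footprint, or report that no single slice fits.
A100_80GB_MIG_PROFILES = {  # profile name -> slice memory in GB
    "1g.10gb": 10, "2g.20gb": 20, "3g.40gb": 40, "7g.80gb": 80,
}

def fp16_footprint_gb(params_billions: float, overhead: float = 1.2) -> float:
    # 2 bytes per FP16 parameter, padded for KV cache and runtime overhead.
    return params_billions * 2 * overhead

def smallest_fitting_profile(params_billions: float):
    need = fp16_footprint_gb(params_billions)
    for profile, mem in sorted(A100_80GB_MIG_PROFILES.items(),
                               key=lambda kv: kv[1]):
        if mem >= need:
            return profile
    return None  # does not fit a single slice; needs a full or multi-GPU setup

print(smallest_fitting_profile(7))   # 7B model needs ~16.8 GB
print(smallest_fitting_profile(70))  # 70B model exceeds one A100 slice
```

A 7B-parameter model lands on the 2g.20gb slice, while a 70B model returns None: exactly the mismatch that catches teams who buy GPUs before sizing their workloads.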
The presentation further explores VMware’s Private AI Foundation, which integrates NVIDIA’s AI technologies into VMware’s Cloud Foundation (VCF) platform. This solution provides a streamlined, automated approach to deploying AI workloads, allowing organizations to quickly roll out AI use cases without the need for extensive manual configuration. Gully explains how VMware’s automation tools, such as the VCF Quick Start, enable rapid deployment of AI environments, reducing the time it takes to get AI models up and running. He also highlights the flexibility of the platform, which allows customers to customize their AI environments and add proprietary models to their catalogs. Overall, the session emphasizes the importance of simplifying AI infrastructure deployment and management to help organizations realize the value of AI more quickly and efficiently.
Personnel: Chris Gully
The Art of the Possible with VMware Private AI
Watch on YouTube
Watch on Vimeo
This session will discuss how VMware’s Private AI architectural approach enables the flexibility to run a range of GenAI solutions for your environment. We’ll explore how customers can achieve business value by running applications on a Private AI that offers unique advantages in privacy, compliance, and control. We will demo the “VMware Expert” app, built on VMware Private AI. Join us to learn how your organization can maximize its data strategy with this powerful platform.
In this session, Ramesh Radhakrishnan from VMware by Broadcom discusses the potential of VMware’s Private AI platform in enabling organizations to run generative AI (GenAI) solutions while maintaining control over privacy, compliance, and data management. He emphasizes that once customers have deployed VMware Private AI in their environments, the next challenge is demonstrating business value. The platform provides a flexible infrastructure that allows developers, data scientists, and software engineers to leverage GPUs for various AI applications. Radhakrishnan’s team, which includes infrastructure experts, software developers, and data scientists, has been working internally to build services on top of this platform, such as Jupyter notebooks and Visual Studio IDE environments, which allow users to access GPUs and AI capabilities for tasks like code completion and large language model (LLM) development.
One of the key services highlighted is the LLM service, which functions similarly to OpenAI but is designed for regulated industries that require strict control over data. This service allows organizations to run LLMs on their private infrastructure, ensuring that sensitive information is not exposed to third-party providers. Additionally, Radhakrishnan introduces the “VMware Expert” app, an internal tool that leverages AI to improve documentation search and provide expert Q&A capabilities. The app has evolved from a basic search tool using embedding models to a more advanced system that integrates retrieval-augmented generation (RAG) techniques, allowing users to interact with large language models that are fine-tuned with VMware-specific knowledge. This tool has shown significant improvements in search accuracy, with results being five to six times better than traditional keyword searches.
Radhakrishnan also discusses the challenges of ensuring that AI-generated answers are accurate and not prone to hallucination, a common issue when the LLM is not provided with the correct documents. To address this, VMware is exploring corrective RAG techniques and post-training methods to embed domain-specific knowledge directly into the models. This approach, which involves fine-tuning large language models on VMware’s internal documentation, has shown promising results and can be replicated by other organizations using VMware Private AI. The session concludes with a demonstration of the “VMware Expert” app and a discussion on how organizations can use VMware’s platform to build their own AI-driven solutions, maximizing the value of their data and infrastructure.
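The corrective-RAG idea mentioned above — refuse to answer when retrieval finds nothing relevant, rather than letting the LLM guess — reduces to a gating check before generation. This sketch is illustrative only: the relevance scores and threshold are assumptions, where a real system would use an embedding model or a grader LLM to judge relevance.

```python
# Sketch of a corrective-RAG gate: only pass the question to the LLM
# when at least one retrieved passage clears a relevance threshold.
def corrective_answer(question, scored_passages, threshold=0.5):
    # scored_passages: list of (passage, relevance_score) pairs
    relevant = [p for p, score in scored_passages if score >= threshold]
    if not relevant:
        # Declining beats hallucinating when the corpus has no support.
        return "No supporting document found; declining to answer."
    context = " ".join(relevant)
    return f"[LLM call with context {context!r} and question {question!r}]"

good = [("vMotion migrates running VMs between hosts.", 0.9)]
bad = [("Unrelated finance memo.", 0.1)]
print(corrective_answer("What does vMotion do?", good))
print(corrective_answer("What does vMotion do?", bad))
```

The post-training approach the session describes is complementary: fine-tuning bakes domain knowledge into the model itself, while the gate above protects the retrieval path.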
Personnel: Ramesh Radhakrishnan