|
This video is part of the appearance “VMware Presents at AI Field Day 5”. It was recorded as part of AI Field Day 5 from 10:30 to 12:30 on September 12, 2024.
Watch on YouTube
Watch on Vimeo
This session will provide an update on VMware Private AI Foundation with NVIDIA, showcasing its evolution from preview to general availability. Key features and improvements made since the preview phase will be highlighted, giving delegates a clear understanding of what the product looks like in its fully realized state. The session will illustrate a day in the life of a GenAI application developer, demonstrate the product’s capabilities for Retrieval Augmented Generation (RAG), and then walk through a demo.
VMware Private AI Foundation with NVIDIA has evolved from its preview phase to general availability, with key updates to its architecture and features. One significant change is the introduction of NVIDIA Inference Microservices (NIM), replacing the Triton Inference Server, along with the addition of the Retriever microservice, which retrieves data from a vector database in the Retrieval Augmented Generation (RAG) design. The session emphasizes the importance of RAG in enhancing large language models (LLMs) by integrating private company data stored in vector databases, which helps mitigate issues like hallucinations and the lack of citations in LLM output. The demo shows how VMware provisions the vector database and the chosen LLM, automating the process to streamline the workflow for data scientists and developers.
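The RAG flow described above can be sketched in miniature: embed the user’s question, find the most similar stored chunk, and ground the prompt in it before calling the LLM. The bag-of-words “embedding” and in-memory store below are illustrative stand-ins of my own, not the product’s components; in the actual stack, an embedding model and the NVIDIA Retriever microservice perform this lookup against the provisioned vector database.

```python
import math

def embed(text: str) -> dict[str, int]:
    # Toy bag-of-words "embedding"; a real pipeline would call an
    # embedding model rather than counting words.
    vec: dict[str, int] = {}
    for word in text.lower().replace("?", "").replace(".", "").split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict[str, int], b: dict[str, int]) -> float:
    dot = sum(v * b.get(w, 0) for w, v in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, store: list[tuple[str, dict[str, int]]]) -> str:
    # The role the Retriever microservice plays: return the stored
    # chunk most similar to the question.
    q = embed(question)
    return max(store, key=lambda item: cosine(q, item[1]))[0]

# In-memory stand-in for the provisioned vector database.
docs = [
    "VCF supports flexible GPU allocation for AI workloads.",
    "The cafeteria opens at 8 am.",
]
store = [(d, embed(d)) for d in docs]

question = "How does VCF handle GPU allocation?"
context = retrieve(question, store)
# Grounding the prompt in retrieved private data is what mitigates
# hallucinations and lets answers point back to a source.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The essential point survives the simplification: the LLM never sees the whole corpus, only the chunks the retriever judges relevant to the question.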
The presentation also highlights the challenges faced by data scientists, such as managing infrastructure and keeping up with the rapid pace of model and toolkit updates. VMware Cloud Foundation (VCF) addresses these challenges by providing a virtualized environment that allows for flexible GPU allocation and infrastructure management. The demo illustrates how data scientists can easily request AI workstations or Kubernetes clusters with pre-configured environments, reducing setup time from days to minutes. The automation tools provided by VMware simplify the deployment of deep learning VMs and Kubernetes clusters, allowing data scientists to focus on model development and testing rather than infrastructure concerns.
Additionally, the session touches on the importance of governance and lifecycle management in AI development. VMware offers tools to control and version models, containers, and infrastructure components, ensuring stability and compatibility across different environments. The demo also showcases how private data can be loaded into a vector database to enhance LLMs, and how Kubernetes clusters can be auto-scaled to handle varying workloads. The presentation concludes with a discussion on the frequency of updates to the stack, with VMware stabilizing on specific versions of NVIDIA components for six-month intervals, while allowing for custom upgrades if needed.
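Loading private data into the vector database, as shown in the demo, typically starts by splitting documents into overlapping chunks so that each can be embedded and inserted individually. A minimal sketch of that chunking step follows; the sizes and overlap here are arbitrary illustrative choices, not the product’s defaults.

```python
def chunk(text: str, size: int = 120, overlap: int = 30) -> list[str]:
    # Overlapping windows keep sentences that straddle a chunk
    # boundary retrievable from at least one chunk.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

document = "Private company data " * 40  # stand-in for an internal document
chunks = chunk(document)

# Each chunk would then be embedded and written to the vector database
# before the retriever can serve it to the LLM.
```

Chunking is what makes retrieval precise: the model is handed only the passages that match a question, not entire documents.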
Personnel: Justin Murray