|
Presentation date: February 21, 2024, 8:00-10:00.
Presenters: Chris Wolf, Justin Murray, Ramesh Radhakrishnan, Shawn Kelly
VMware by Broadcom Private AI Primer – An Emerging Category
Watch on YouTube
Watch on Vimeo
Private AI is an architectural approach that aims to balance the business gains from AI with the practical privacy and compliance needs of the organization. What matters most is that privacy and control requirements are satisfied, regardless of where AI models and data are deployed. This session walks through the core tenets of Private AI and the common use cases it addresses.
Chris Wolf, Global Head of AI and Advanced Services at VMware by Broadcom, discusses the evolution of application innovation, highlighting the shift from PC applications to business productivity tools, web applications, and mobile apps, and now the rise of AI applications. He emphasizes that AI is not new, with its use in specialized models for fraud detection being a longstanding practice. Chris notes that financial services with existing AI expertise have quickly adapted to generative AI with large language models, and he cites a range of industry use cases, such as VMware’s use of SaaS-based AI services for marketing content creation.
He cites McKinsey's projection that generative AI could add roughly $4.4 trillion in economic value annually, indicating a significant opportunity for industry transformation. Chris also discusses early AI adoption in various regions, particularly Japan, where the government invests in AI to compensate for a shrinking population and maintain global competitiveness.
The conversation shifts to privacy concerns in AI, with Chris explaining the concept of Private AI, which is about maintaining business gains from AI while ensuring privacy and compliance needs. He discusses the importance of data sovereignty, control, and not wanting to inadvertently benefit competitors with shared AI services. Chris also highlights the need for access control to prevent unauthorized access to sensitive information through AI models.
He then outlines the importance of choice, cost, performance, and compliance in the AI ecosystem, asserting that organizations should not be locked into a single vertical AI stack. Chris also describes the potential for fine-tuning language models with domain-specific data and the use of technologies like retrieval augmented generation (RAG) for simplifying AI use cases.
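To make the fine-tuning idea concrete, the following is a minimal sketch using low-rank adaptation (LoRA) from the Hugging Face peft library; the base model, training file, and hyperparameters are illustrative assumptions, not details from the session.

```python
# Minimal sketch of parameter-efficient fine-tuning (LoRA) on domain-specific
# data, one common way to adapt an open model. Model name, dataset file, and
# hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base = "facebook/opt-350m"  # assumption: any small causal LM works for a sketch
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model with low-rank adapters; only adapter weights are trained.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

data = load_dataset("text", data_files={"train": "domain_docs.txt"})["train"]
data = data.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
                remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="opt-domain-lora", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```

Because only the adapter weights are updated, this kind of tuning fits on far smaller GPU footprints than full fine-tuning, which is why it pairs naturally with the shared-infrastructure model described above.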
Finally, Chris emphasizes the need for adaptability in AI solutions and mentions VMware's focus on adding value to the ecosystem through partnerships. He briefly touches on technical implementation, including leveraging virtualization support for GPU resources and partnering with offerings such as IBM watsonx for model serving and management. He concludes by providing resources for further information on VMware's AI initiatives.
Personnel: Chris Wolf
Introduction to VMware Private AI
Watch on YouTube
Watch on Vimeo
VMware Private AI brings compute capacity and AI models to where enterprise data is created, processed, and consumed, whether that is in a public cloud, enterprise data center, or at the edge. VMware Private AI consists of both product offerings (VMware Private AI Foundation with NVIDIA) and a VMware Private AI Reference Architecture for Open Source to help customers achieve their desired AI outcomes by supporting best-in-class open source software (OSS) technologies today and in the future. VMware’s interconnected and open ecosystem supports flexibility and choice in customers’ AI strategies.
Chris Wolf, the Global Head of AI and Advanced Services at VMware by Broadcom, discusses VMware's Private AI initiative, which was announced in August 2023. The goal of Private AI is to democratize generative AI and ignite business innovation across all enterprises while addressing privacy and control concerns. VMware focuses on providing AI infrastructure, optimizations, security, data privacy, and data serving, leaving higher-level AI services to AI ISVs (Independent Software Vendors). This non-competitive approach makes it easier for VMware to partner with ISVs, since unlike the public clouds, VMware does not compete with them in offering top-level AI services.
Wolf shares an example of VMware’s code generation use case with a 92% acceptance rate by software engineers using an internal solution based on an open-source model for the ESXi kernel. He discusses the importance of governance and compliance, particularly in AI-generated code, and mentions VMware’s AI council and governance practices.
He highlights use cases such as call center resolution and advanced information retrieval across various industries. VMware’s solution emphasizes flexibility, choice of hardware and software, simplifying deployment, and mitigating risks. Wolf also notes VMware’s capability to stand up an AI cluster with preloaded models in about three seconds, which is not possible in public clouds or on bare metal.
The discussion covers the advantages of VMware Private AI in managing multiple AI projects within large enterprises, including efficient resource utilization and integration with existing operational tools, leading to lower total cost of ownership.
Wolf touches on the trend of AI adoption at the edge, the importance of security features within VMware’s stack, and the curated ecosystem of partners that VMware is building. He points out that VMware’s Private AI solution can leverage existing IT investments by bringing AI models to where the data already resides, such as on VMware Cloud Foundation (VCF).
Finally, Wolf previews upcoming Tech Field Day sessions that go into detail about VMware's collaborations with NVIDIA, Intel, and IBM, showcasing solutions like Private AI Foundation with NVIDIA and on-premises deployment of the watsonx SaaS service. He encourages attendees to participate in these sessions to learn more about VMware's AI offerings.
Personnel: Chris Wolf
VMware Private AI Foundation with NVIDIA Overview
Watch on YouTube
Watch on Vimeo
VMware Private AI Foundation with NVIDIA is a fully integrated solution featuring generative AI software and accelerated computing from NVIDIA, built on VMware Cloud Foundation and optimized for AI. The solution includes integrated AI tools to empower enterprises to customize models and run generative AI applications adjacent to their data while addressing corporate data privacy, security and control concerns. The platform will feature NVIDIA NeMo, which combines customization frameworks, guardrail toolkits, data curation tools and pretrained models to offer enterprises an easy, cost-effective and fast way to adopt generative AI.
In this presentation, Justin Murray, Product Marketing Engineer at VMware by Broadcom, discusses the VMware Private AI Foundation with NVIDIA, which is a solution designed to run generative AI applications with a focus on privacy, security, and control for enterprises. The platform is built on VMware Cloud Foundation and optimized for AI, featuring NVIDIA NeMo for customization and generative AI model deployment.
Murray explains the architecture of the solution, which includes a self-service catalog for data scientists to easily access their tools, GPU monitoring in the vCenter interface, and deep learning VMs pre-packaged with data science toolkits. He emphasizes the importance of vector databases, particularly pgvector, which is central to retrieval-augmented generation (RAG). RAG combines database retrieval with large language models so that responses can draw on up-to-date, private data.
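As an illustration of the retrieval step Murray describes, here is a minimal sketch of RAG-style retrieval against a pgvector-enabled PostgreSQL database; the schema, embedding model, and connection settings are assumptions for the example, since the session names pgvector but no specific setup.

```python
# Minimal sketch of the retrieval step in RAG against a pgvector-enabled
# PostgreSQL database. Table name, embedding size, and model are assumptions.
import psycopg2
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings

conn = psycopg2.connect("dbname=rag user=postgres")
cur = conn.cursor()
cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
cur.execute("""CREATE TABLE IF NOT EXISTS docs (
                 id serial PRIMARY KEY, content text, embedding vector(384))""")

def add_doc(text: str) -> None:
    emb = model.encode(text).tolist()
    cur.execute("INSERT INTO docs (content, embedding) VALUES (%s, %s::vector)",
                (text, str(emb)))

def retrieve(question: str, k: int = 3) -> list[str]:
    emb = model.encode(question).tolist()
    # "<->" is pgvector's L2-distance operator; nearest rows come first.
    cur.execute("SELECT content FROM docs ORDER BY embedding <-> %s::vector LIMIT %s",
                (str(emb), k))
    return [row[0] for row in cur.fetchall()]

add_doc("VMware Private AI Foundation runs on VMware Cloud Foundation.")
conn.commit()
print(retrieve("What does Private AI Foundation run on?"))
# The retrieved passages are then prepended to the LLM prompt as context.
```

The key design point is that the database, not the model, holds the current and private knowledge, so answers can be refreshed by updating rows rather than retraining.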
He also touches on NVIDIA's GPU Operator and Triton Inference Server, which handle GPU driver management and scalable model inference, respectively. Murray notes that the solution is designed to be user-friendly both for data scientists and for the administrators serving them, with a focus on simplifying the deployment and management of AI applications.
Murray mentions that the solution is compatible with various vector databases and is capable of being used with private data, making it suitable for industries like banking. He also indicates that there is substantial demand for this architecture across different industries, with over 60 customers globally interested in it before the product’s general availability.
The presentation aims to provide technical details about the VMware Private AI Foundation with NVIDIA, including its components, use cases, and the benefits it offers to enterprises looking to adopt generative AI while maintaining control over their data.
Personnel: Justin Murray
VMware Private AI Foundation with NVIDIA Demo
Watch on YouTube
Watch on Vimeo
This VMware Private AI Foundation with NVIDIA demo addresses both the data scientist user and the VMware system administrator/DevOps persona. A data scientist can reproduce their LLM environment rapidly on VMware Cloud Foundation (VCF), either through a self-service portal or with assistance from a VCF system administrator. We show that a VCF administrator can serve the data scientist with a set of VMs, created in a newly automated way from deep learning VM images, with all the deep learning tooling and platforms already active in them. We show a small LLM example application running on this setup to give the data scientist a head start on their work.
In this presentation, Justin Murray, product marketing engineer from Broadcom, demonstrates VMware Private AI Foundation with NVIDIA technology. The demo is structured to show how the end user, particularly a data scientist, can benefit from the solution. Key points from the transcript include:
- Application Demonstration: Justin begins by showcasing a chatbot application powered by a large language model (LLM) which utilizes retrieval-augmented generation (RAG). The bot is demonstrated to answer questions more accurately after updating its knowledge base.
- Deep Learning VMs: The demo highlights the use of virtual machines (VMs) that come pre-loaded with deep learning toolkits, which are essential for data scientists. These VMs can be rapidly provisioned using Aria Automation and customized with specific tool bundles to match the data scientist's requirements.
- Containers and VMs: Justin explains that the solution uses a combination of containers and VMs, with NVIDIA components shipped as containers that can be run using Docker or integrated into Kubernetes clusters (see the sketch after this list).
- Private AI Foundation Availability: The Private AI Foundation with NVIDIA is mentioned to be an upcoming product that will be available for purchase in the current quarter, with some customers already having early access to the beta version.
- Automation and User Interface: The Aria Automation tool is showcased, which allows data scientists or DevOps personnel to request resources through a simple interface, choosing the amount of GPU power they require.
- GPU Visibility: The demo concludes with a look at GPU visibility, showing how vCenter can be used to monitor GPU consumption at both the host and VM level, which is important for managing resources in LLM operations.
- Customer Use and Power Consumption: Justin notes that there’s interest in both dedicated VMs for data scientists and shared infrastructure like Kubernetes. He also acknowledges the importance of power consumption as a concern for those using GPUs.
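As a concrete illustration of the container side of the demo, the sketch below schedules an NVIDIA container on Kubernetes with a GPU request via the GPU Operator's nvidia.com/gpu resource; the pod name, namespace, and image tag are assumptions, and the demo itself drove provisioning through Aria Automation rather than raw API calls.

```python
# Hedged sketch: scheduling a containerized NVIDIA component (Triton Inference
# Server as an example) on Kubernetes with a GPU request. Names and image tag
# are illustrative only.
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig for the target cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="triton-demo"),
    spec=client.V1PodSpec(
        containers=[client.V1Container(
            name="triton",
            image="nvcr.io/nvidia/tritonserver:24.01-py3",  # illustrative tag
            # The NVIDIA GPU Operator exposes GPUs as a schedulable resource.
            resources=client.V1ResourceRequirements(
                limits={"nvidia.com/gpu": "1"}),
        )],
        restart_policy="Never",
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```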
VMware Private AI Foundation with NVIDIA aims to simplify the deployment and management of AI applications and infrastructure for data scientists, offering a combination of automation, privacy, and performance monitoring tools.
Personnel: Justin Murray
Running Best-of-Breed AI Services on a Common Platform with VMware Cloud Foundation
Watch on YouTube
Watch on Vimeo
VMware Cloud Foundation streamlines AI production, providing enterprises with unmatched flexibility, control, and choice. Through Private AI, businesses can seamlessly deploy best-in-class AI services across various environments while ensuring privacy and security. Join us to explore VMware's collaborations with IBM watsonx, Intel AI, and Anyscale's Ray, delivering cutting-edge AI capabilities on top of VMware's Private Cloud platform.
Shawn Kelly, Principal Engineer at Broadcom, discusses the benefits of using VMware Cloud Foundation (VCF) to run AI services. He explains that VCF solves many of the infrastructure challenges associated with AI projects, such as agility, workload migration, avoiding idle compute resources, scaling, lifecycle management, privacy, and security.
He addresses the question of whether VCF is the only platform for AI, noting that while products like vSphere and vSAN remain in use, VCF is VMware's strategic direction, particularly for its largest customers. He clarifies that VCF includes the underlying vSphere technology and that using VCF inherently involves using vSAN.
Kelly also talks about performance, mentioning that VMware's hypervisor scheduler has been optimized over two decades to approach bare-metal speeds, with AI workload performance typically within plus or minus 2% of bare metal. He confirms that VMware supports NVIDIA's NVLink, which allows multiple GPUs to connect directly to each other.
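For readers who want to check the NVLink point from inside a guest, the following hedged sketch uses NVIDIA's NVML Python bindings (pynvml) to count active NVLink links on the first visible GPU; it illustrates the claim rather than reproducing any VMware tooling.

```python
# Hedged sketch: counting active NVLink links on the first visible GPU from
# inside a VM, using NVIDIA's NVML Python bindings. Illustration only.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
print("GPU:", pynvml.nvmlDeviceGetName(handle))

active = 0
for link in range(pynvml.NVML_NVLINK_MAX_LINKS):
    try:
        if pynvml.nvmlDeviceGetNvLinkState(handle, link):
            active += 1
    except pynvml.NVMLError:
        break  # link index not supported on this GPU
print(f"Active NVLink links: {active}")
pynvml.nvmlShutdown()
```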
The talk then moves to VMware Private AI, an architectural approach that balances the business benefits of AI with privacy and compliance needs. Kelly highlights collaborations with Anyscale's Ray, an open-source framework for scaling Python AI workloads, and IBM watsonx, which brings watsonx capabilities on-premises for customers with specific data compliance requirements.
He covers the integration of Ray with vSphere, demonstrating how it can quickly spin up worker nodes (raylets) for AI tasks. He also addresses licensing concerns, noting that while NVIDIA handles its own GPU software licensing, the Ray integration is open source with no additional licensing cost.
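For context, the sketch below shows the Ray programming model that the vSphere integration scales out; in the integration Kelly describes, the worker nodes hosting the raylets would be provisioned as VMs on vSphere. The workload and connection details are illustrative assumptions.

```python
# Minimal Ray sketch of the task-parallel programming model. With the vSphere
# integration, worker nodes would be VMs spun up on vSphere; here Ray runs
# locally. Workload and addresses are illustrative.
import ray

ray.init()  # a cluster would use ray.init(address="ray://<head-node>:10001")

@ray.remote  # add num_gpus=1 to have the scheduler reserve a GPU per task
def embed_shard(shard_id: int) -> str:
    # Placeholder for real work, e.g. batch inference over one data shard.
    return f"shard {shard_id} done"

# Ray fans tasks out across whatever worker nodes the cluster currently has.
print(ray.get([embed_shard.remote(i) for i in range(8)]))
ray.shutdown()
```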
For IBM watsonx, Kelly discusses the stack setup with VMware Cloud Foundation at the base, followed by OpenShift and watsonx on top. He emphasizes security features such as secure boot, identity and access management, and VM encryption. He also mentions the choice of proprietary, open-source, and third-party AI models available on the platform. Kelly briefly touches on use cases enabled by watsonx, such as code generation, contact center resolution, IT operations automation, and advanced information retrieval. He concludes by directing listeners to a blog post for more information on Private AI with IBM watsonx.
Personnel: Shawn Kelly
Real-World Use of Private AI at VMware by Broadcom
Watch on YouTube
Watch on Vimeo
This session offers a deep dive into VMware’s internal AI services used by VMware employees, including our services for coding-assist, document search using retrieval augmented generation (RAG), and our internal LLM API.
In this presentation, Ramesh Radhakrishnan of VMware discusses the company’s internal use of AI, particularly large language models (LLMs), for various applications. He leads the AI Platform and Solutions team and shares insights into VMware’s AI services, which were developed even before the advent of LLMs.
Large language models (LLMs) are versatile tools that can address a wide range of use cases with minimal modification. VMware has developed internal AI services for coding assistance, document search using Retrieval-Augmented Generation (RAG), and an internal LLM API. Content generation, question answering, code generation, and the use of AI agents are some of the key use cases for LLMs at VMware.
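The session does not detail the internal LLM API's interface, but many internal gateways follow the OpenAI-compatible convention; under that assumption, a call might look like the hedged sketch below, with the endpoint, credential, and model alias entirely hypothetical.

```python
# Hedged sketch of calling an internal LLM API, assuming an OpenAI-compatible
# gateway (a common convention for internal LLM services). The URL, key, and
# model alias are made up for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm-gateway.internal.example/v1",  # hypothetical endpoint
    api_key="internal-token",                             # hypothetical credential
)

resp = client.chat.completions.create(
    model="internal-code-assist",  # hypothetical model alias
    messages=[{"role": "user",
               "content": "Summarize how RAG keeps answers grounded in our docs."}],
)
print(resp.choices[0].message.content)
```

A shared gateway like this is what lets one pool of GPUs serve coding assistance, document search, and ad hoc LLM calls behind a single interface.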
VMware has implemented a Cloud Smart approach, leveraging open-source LLMs trained on the public cloud to avoid the environmental impact of running their own GPUs. The company has worked with Stanford to create a domain-adapted model for VMware documentation search, which significantly improved search performance compared to traditional keyword search.
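The sketch below illustrates the mechanism behind that improvement, contrasting naive keyword matching with embedding-based semantic search; the general-purpose embedding model stands in for the domain-adapted one, which is not public, so treat this purely as an illustration.

```python
# Sketch contrasting keyword matching with embedding-based semantic search.
# A general-purpose model stands in for the domain-adapted one; illustration only.
from sentence_transformers import SentenceTransformer, util

docs = [
    "Use vMotion to migrate a running virtual machine between hosts.",
    "Snapshots preserve a VM's state and data at a specific point in time.",
]
query = "relocate a powered-on workload to a different host"

stop = {"a", "the", "to", "of", "and"}
def keywords(text: str) -> set[str]:
    return {w.strip(".,").lower() for w in text.split()} - stop

# Keyword overlap (minus stopwords) misses the vMotion doc entirely: prints [].
print([d for d in docs if keywords(query) & keywords(d)])

# Semantic search ranks the vMotion doc first despite zero shared keywords.
model = SentenceTransformer("all-MiniLM-L6-v2")
scores = util.cos_sim(model.encode(query), model.encode(docs))[0]
print(docs[int(scores.argmax())])
```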
The VMware Automated Question Answering System (Wacqua) is an information retrieval system based on language models, which allows users to ask questions and get relevant answers without browsing through documents. The system’s implementation involves complex processes, including content gathering, preprocessing, indexing, caching, and updating documentation.
VMware has scaled up its GPU capacity to accommodate the increased demand from software developers empowered by AI tools. The AI platform at VMware provides a GPU pool resource, developer environments, coding use cases, and LLM APIs, all running on a common platform.
Data management is highlighted as a potential bottleneck for AI use cases, and standardizing on a platform is critical for offering services to end-users efficiently. Collaboration between AI teams and infrastructure teams is essential to ensure that both the models and the infrastructure can support the workload effectively.
Ramesh encourages organizations to start small with open-source models, identify key performance indicators (KPIs), and focus on solving business problems with AI. The session concludes with Ramesh emphasizing the importance of a strategic approach to implementing AI and the benefits of leveraging a shared platform for AI services.
Personnel: Ramesh Radhakrishnan