This video is part of the appearance, “VMware by Broadcom Presents at AI Field Day 4”. It was recorded as part of AI Field Day 4 at 8:00-10:00 on February 21, 2024.
Watch on YouTube
Watch on Vimeo
This session offers a deep dive into the internal AI services used by VMware employees, including services for coding assistance, document search using retrieval-augmented generation (RAG), and an internal LLM API.
In this presentation, Ramesh Radhakrishnan of VMware discusses the company’s internal use of AI, particularly large language models (LLMs), for various applications. He leads the AI Platform and Solutions team and shares insights into VMware’s AI services, which were developed even before the advent of LLMs.
LLMs are versatile tools that can address a wide range of use cases with minimal modification. VMware has built internal services around them for coding assistance, document search using RAG, and an internal LLM API, which teams consume over a simple programmatic interface. Content generation, question answering, code generation, and AI agents are among the key LLM use cases at VMware.
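The session describes the internal LLM API only at a high level. As a rough illustration of how such a shared service is typically consumed, the sketch below assumes an OpenAI-style chat-completions endpoint; the URL, model name, and environment variable are hypothetical placeholders, not VMware's actual API.

```python
import os
import requests

# Hypothetical internal endpoint and model name; the real service details
# are not described in this session summary.
API_URL = "https://llm-api.internal.example/v1/chat/completions"
API_KEY = os.environ["INTERNAL_LLM_API_KEY"]

def ask_llm(prompt: str, model: str = "open-source-llm-13b") -> str:
    """Send a single-turn chat request to the internal LLM API."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.2,
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask_llm("Summarize the steps to configure a distributed switch."))
```

Exposing one API like this is what lets coding assistance, document search, and other use cases share the same GPU-backed platform behind the scenes.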
VMware has implemented a Cloud Smart approach, leveraging open-source LLMs trained in the public cloud to avoid the environmental impact of running their own GPUs. The company has worked with Stanford to create a domain-adapted model for VMware documentation search, which significantly improved search performance compared to traditional keyword search.
The VMware Automated Question Answering System (Wacqua) is an information retrieval system based on language models, which allows users to ask questions and get relevant answers without browsing through documents. Its implementation involves a multi-stage pipeline, including content gathering, preprocessing, indexing, caching, and handling documentation updates.
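The session does not show Wacqua's implementation; the sketch below is only a minimal illustration of the retrieval-augmented pattern it describes: documentation is chunked and embedded, an index answers similarity queries, and the retrieved passages are assembled into a prompt for the language model. The sentence-transformers model name and the `ask_llm` helper from the earlier sketch are assumptions.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy corpus standing in for preprocessed documentation chunks.
doc_chunks = [
    "vMotion requires a VMkernel adapter enabled for vMotion traffic.",
    "A distributed switch is created at the datacenter level in vCenter.",
    "Storage DRS balances virtual machine disks across datastores in a cluster.",
]

# Embed the chunks once and keep the vectors as a simple in-memory index.
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
index = embedder.encode(doc_chunks, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documentation chunks most similar to the question."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = index @ q_vec  # cosine similarity on normalized vectors
    top = np.argsort(scores)[::-1][:k]
    return [doc_chunks[i] for i in top]

def answer(question: str) -> str:
    """Compose a grounded prompt from retrieved chunks and query the LLM."""
    context = "\n".join(retrieve(question))
    prompt = (
        "Answer the question using only the documentation excerpts below.\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}"
    )
    return ask_llm(prompt)  # hypothetical helper from the earlier sketch
```

Unlike keyword search, similarity in embedding space lets a question match passages that use different wording, which is the behavior the domain-adapted model developed with Stanford improves on.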
VMware has scaled up its GPU capacity to accommodate the increased demand from software developers empowered by AI tools. The AI platform at VMware provides a GPU pool resource, developer environments, coding use cases, and LLM APIs, all running on a common platform.
Data management is highlighted as a potential bottleneck for AI use cases, and standardizing on a platform is critical for offering services to end-users efficiently. Collaboration between AI teams and infrastructure teams is essential to ensure that both the models and the infrastructure can support the workload effectively.
Ramesh encourages organizations to start small with open-source models, identify key performance indicators (KPIs), and focus on solving business problems with AI. The session concludes with Ramesh emphasizing the importance of a strategic approach to implementing AI and the benefits of leveraging a shared platform for AI services.
Personnel: Ramesh Radhakrishnan