Laura Jordana, Mike Barmonde, Ashwini Vasanth, and Jesse Gonzales presented for Nutanix at AI Infrastructure Field Day 2
This presentation took place on April 24, 2025, from 15:30 to 17:00.
Presenters: Ashwini Vasanth, Jesse Gonzales, Laura Jordana, Mike Barmonde
Company Overview and AI Challenges We Address with Nutanix
Watch on YouTube
Watch on Vimeo
GenAI’s rapid advancement presents a significant challenge for enterprises seeking to leverage its potential. Nutanix helps businesses move from GenAI possibilities to production with Nutanix Enterprise AI (NAI), a full-stack AI infrastructure solution designed specifically for IT needs. NAI provides a standardized inferencing solution centered on a model repository, allowing the creation of secure endpoints with APIs for GenAI applications, spanning from the edge to public clouds.
Mike Barmonde, Sr. Product Marketing Manager for Nutanix AI products, presented an overview of Nutanix and its approach to AI challenges. The presentation focused on how Nutanix simplifies AI inferencing for IT, noting that many organizations struggle to scale their AI initiatives. Nutanix Enterprise AI provides a four-step process for deploying AI infrastructure: selecting a Kubernetes distribution, choosing hardware (with options for public cloud or air-gapped environments), deploying an LLM from a variety of sources, and creating secure endpoints, all managed from a central location.
The presentation emphasized the comprehensive nature of Nutanix’s AI infrastructure approach, extending from LLMs down to the underlying hardware. Nutanix’s goal is to streamline the entire process, enabling seamless Day 2 operations. This allows IT professionals to centralize their AI infrastructure and provide a better experience for their developers and application owners.
Personnel: Mike Barmonde
Let’s take a look at Nutanix Enterprise AI
Watch on YouTube
Watch on Vimeo
Ashwini Vasanth presented Nutanix Enterprise AI, which simplifies the complexities of adopting and deploying GenAI models and addresses common customer challenges. The product, launched in November 2024, focuses on providing a curated and validated approach to model selection, deployment, and security. The presentation highlighted the “cold start” problem, acknowledging the overwhelming number of available models and the need for a user-friendly starting point for IT or AI admins.
Nutanix Enterprise AI addresses these challenges with a curated list of validated models, offered through partnerships with Hugging Face and NVIDIA in a “small, medium, and large” selection. This approach simplifies model selection and ensures reliable operation. The platform also handles GPU selection, inference engine choices, and security complexities, incorporating dynamic endpoint creation to streamline deployment. Key to the offering is integrated security: Nutanix security experts scan models for vulnerabilities, so customers do not have to manage that work themselves.
Beyond the mechanics of model deployment, Vasanth discussed the need for on-premises deployment, the importance of choice of environment, and how centralized resource management and monitoring dashboards address the “shadow IT” problem. The presentation underscored Nutanix’s strategic move into the AI space, leveraging its existing infrastructure expertise, including its Kubernetes platform, storage solutions, and core principle of simplifying infrastructure. The offering has evolved from a solutions-based approach to a full-fledged product, driven by customer demand for a pre-integrated AI platform.
Personnel: Ashwini Vasanth
Nutanix Enterprise AI Demonstration
Watch on YouTube
Watch on Vimeo
As presented by Laura Jordana, Nutanix Enterprise AI (NAI) is designed to simplify deploying and managing AI models for IT administrators and developers. The presentation begins with a demonstration of the NAI interface; NAI itself is a Kubernetes application deployable on various platforms. The primary use case highlighted is enabling IT admins to give developers easy access to LLMs by connecting to external model repositories and creating secure endpoints, allowing developers to build and deploy AI workflows while keeping data within the organization’s control.
The demo showcases the dashboard, which offers insights into active endpoints, request metrics, and infrastructure health. This view is crucial for IT admins monitoring model usage and its impact on resources. The workflow involves importing models from hubs such as Hugging Face and creating endpoints that serve as the connection to the inference engine. Jordana emphasized the simplicity of the process, with much of the configuration pre-filled to ease the admin workload, and highlighted the platform’s OpenAI-compatible API, which allows integration with existing tools.
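Because the API is OpenAI-compatible, existing tooling can typically be pointed at an NAI endpoint by changing only the base URL and key. Here is a minimal sketch using the openai Python SDK; the endpoint URL, API key, and model name are hypothetical placeholders, not actual NAI values.

```python
# Minimal sketch: calling an OpenAI-compatible inference endpoint.
# The base_url, api_key, and model name are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://nai.example.internal/api/v1",  # your NAI endpoint URL
    api_key="YOUR_ENDPOINT_API_KEY",  # key issued when the endpoint is created
)

response = client.chat.completions.create(
    model="llama-3-8b-instruct",  # whichever model the endpoint serves
    messages=[{"role": "user", "content": "Summarize this support ticket."}],
)
print(response.choices[0].message.content)
```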
While the platform focuses on inferencing rather than model training, it provides a secure and streamlined way to deploy and manage models within the organization’s infrastructure. The key takeaway is the simplification of AI model deployment, with a focus on day 2 operations and ease of use. Because NAI is a Kubernetes application, it runs on Nutanix Kubernetes clusters, EKS, and other cloud instances, and it provides API access and monitoring capabilities for IT admins along with easy access to LLMs for AI developers.
Personnel: Laura Jordana
AI Inferencing Sizing Considerations on Nutanix Enterprise AI
Watch on YouTube
Watch on Vimeo
Jesse Gonzales, Staff Solution Architect, offers sizing guidance for AI inferencing based on real-world experience. The presentation focuses on appropriately sizing AI infrastructure for inferencing workloads. Gonzales emphasizes the need to understand model requirements, GPU device types, and the role of inference engines. He walks the audience through considerations such as CPU and memory requirements for the selected inference engine and how these directly affect the resources needed on Kubernetes worker nodes. The discussion also touches on accounting for administrative overhead and high availability when deploying LLM endpoints, offering a practical guide to managing resources within a Kubernetes cluster.
The presentation highlights the value of Nutanix Enterprise AI’s pre-validated models, which come with recommendations for the specific resources needed to run each model in a production-ready environment. Gonzales discusses the shift in customer focus from proof-of-concept deployments to centralized systems that allow large models to be shared, and underscores the importance of accounting for factors like planned maintenance and ensuring sufficient capacity for pod migration. He explains the sizing process: start with model selection, identify the GPU device, determine the GPU count, then calculate CPU and memory needs.
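As a rough illustration of that arithmetic (a back-of-the-envelope sketch, not Nutanix’s published sizing method), a common rule of thumb estimates GPU memory from parameter count and weight precision, adds overhead for the KV cache and inference engine, and rounds up to whole devices while keeping spare capacity for pod migration:

```python
# Back-of-the-envelope inference sizing. All constants here are
# illustrative assumptions, not Nutanix's sizing formula.
import math

def estimate_model_memory_gb(params_billion: float,
                             bytes_per_param: int = 2,
                             overhead: float = 1.3) -> float:
    """Weights (2 bytes/param at FP16) plus ~30% for KV cache and
    inference-engine overhead."""
    return params_billion * bytes_per_param * overhead

def gpus_needed(model_mem_gb: float, gpu_mem_gb: float = 80.0) -> int:
    """Round up to whole GPUs; 80 GB assumes an A100/H100-class device."""
    return math.ceil(model_mem_gb / gpu_mem_gb)

mem = estimate_model_memory_gb(70)   # 70B-parameter model at FP16 -> ~182 GB
print(mem, gpus_needed(mem))         # ~182 GB -> 3 GPUs

# For high availability, add headroom (e.g., N+1 worker nodes) so pods
# can migrate during planned maintenance without dropping the endpoint.
```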
Throughout the presentation, Gonzales addresses FinOps and cost management, highlighting the forthcoming integration of metrics for request counts, latency, and, eventually, token-based consumption. He also fields questions about deployment and licensing options for Nutanix Enterprise AI (NAI), outlining scenarios for on-premises, bare metal, and cloud deployments depending on the customer’s existing infrastructure. Nutanix’s approach revolves around flexibility, supporting a range of choices in infrastructure, virtualization, and Kubernetes distribution. The presentation demonstrates how the company streamlines AI deployment and management, making it easier for customers to navigate the complexities of AI infrastructure and scale as needed.
Personnel: Jesse Gonzales
Wrapping up and summarizing Nutanix Enterprise AI
Watch on YouTube
Watch on Vimeo
The Nutanix presentation at AI Infrastructure Field Day focused on enterprise AI solutions, aiming to give customers a solid technical understanding of Nutanix Enterprise AI (NAI) and its role in addressing key customer challenges. The discussion highlighted the curated model catalog, offering pre-configured and customizable models, and the ability to easily incorporate cutting-edge new models, even within air-gapped environments. This approach provides control over models and data, which is particularly relevant for customers seeking sovereign AI solutions who need to deploy AI models in their own environments.
Nutanix also emphasized the “deploy once, inference many” model, which creates a shared service in which multiple applications connect to deployed models via endpoints. The session touched on how NAI simplifies sizing and streamlines model deployment, and the speaker reiterated the benefits of NAI as an application running on Kubernetes, offering flexibility and portability. The presentation concluded with a discussion of distributed inference across multiple nodes, acknowledged as important and planned as a future development.
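In practice, “deploy once, inference many” amounts to treating each deployed model endpoint as a shared service. A minimal sketch, reusing the hypothetical OpenAI-compatible endpoint from the demo section, with each application holding its own key:

```python
# Sketch of the shared-service pattern: multiple applications reuse one
# deployed endpoint. The URL and keys are hypothetical placeholders.
from openai import OpenAI

SHARED_ENDPOINT = "https://nai.example.internal/api/v1"

chatbot = OpenAI(base_url=SHARED_ENDPOINT, api_key="CHATBOT_APP_KEY")
summarizer = OpenAI(base_url=SHARED_ENDPOINT, api_key="SUMMARIZER_APP_KEY")

# Both clients hit the same deployed model; the endpoint is sized and
# secured once, rather than each team standing up its own GPU stack.
```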
A key takeaway from the presentation was the growing demand for sovereign AI, driven by geopolitical factors and terms of service that restrict the use of certain models in certain regions. Nutanix recognizes this need and actively helps its customers address it by providing the tools and infrastructure to keep control of AI models and data within their own environments. The presentation underscored the company’s commitment to adapting and evolving its AI solutions to meet the rapid advancements in the AI landscape, keeping Nutanix a relevant player in the enterprise AI space.
Personnel: Mike Barmonde