Design First or Build First? – AI Field Day 6 Delegate Roundtable

Event: AI Field Day 6

Appearance: AI Field Day 6 Delegate Roundtable Discussion

Company: Tech Field Day

Video Links:

Personnel: Stephen Foskett

This AI Field Day 6 delegate roundtable, moderated by Stephen Foskett, grappled with a fundamental question: is the current proliferation of AI tools driven by genuine need, or simply by the “cool” factor and the profit potential? The conversation highlighted a parallel with earlier technological advancements, such as the initial excitement surrounding desktop publishing, where functionality often outpaced practical application. Delegates debated whether the current focus on rapid development and deployment of AI solutions prioritized innovation over careful design, raising concerns about unintended consequences and ethical implications. The discussion touched upon the potential for misuse and manipulation, echoing historical parallels like the printing press and genetic engineering, where the potential for both immense good and catastrophic harm existed.

A key point of contention revolved around the design process itself. Several delegates challenged the notion of a strictly linear “design first” approach, arguing that many successful products and technologies emerged from experimentation and serendipitous discoveries. The examples of TikTok and Twitter were cited to illustrate how initial intentions can drastically diverge from the final outcome, shaped more by user adoption and unforeseen applications. However, this didn’t negate the primary concern: the need for careful consideration of ethical implications and potential societal impact, particularly concerning the generation and use of data, the influence of marketing, and the risks of unchecked technological advancement.

Ultimately, the delegates concluded that the rapid pace of AI development necessitates a proactive and multi-faceted approach. They emphasized the importance of ethical considerations, a need for guardrails to mitigate potential harms, and a focus on understanding the motivations behind the development and deployment of AI tools. While acknowledging the potential for positive transformation, the discussion underscored the crucial role of technologists in shaping the narrative around AI, preventing its misuse, and ensuring that future advancements serve genuine human needs rather than merely capitalizing on novelty or hype. The delegates suggested conferences like All Things Open and events like South by Southwest as potential avenues to track both technological developments and their societal impact.


We’re Still in the Early Innings of AI – AI Field Day 6 Delegate Roundtable

Event: AI Field Day 6

Appearance: AI Field Day 6 Delegate Roundtable Discussion

Company: Tech Field Day

Video Links:

Personnel: Stephen Foskett

This AI Field Day 6 roundtable discussion centered on the surprisingly nascent stage of artificial intelligence development, despite significant advancements and investment. Participants compared the current state of AI to the early days of personal computing, noting that while impressive progress has been made, we’re far from widespread, user-friendly applications. Analogies ranged from “Wiffle Ball” to “batting practice,” highlighting that even recent breakthroughs like the DeepSeek model, which gained unexpected mainstream attention, represent only a small step in a long journey. The rapid pace of current innovation, fueled by readily available computing power and massive datasets, was emphasized as a key factor in the current perception of rapid advancement.

The discussion highlighted the crucial role of data, particularly referencing the impact of ImageNet and Fei-Fei Li’s work, as a catalyst for recent progress. However, ethical concerns, especially regarding data ownership and the lack of informed consent in utilizing various languages and cultural data, were prominent. The potential for legal challenges related to data privacy violations was anticipated, mirroring the trajectory of legal battles following the emergence of other disruptive technologies. The lack of standardized benchmarks for measuring AI performance and the ongoing evolution of model architectures further underscored the field’s immaturity.

Ultimately, the delegates agreed that while the underlying mathematical concepts have been around for decades, the application and integration of AI into everyday life are still in their infancy. The current “pre-Cambrian explosion” of AI models presents a landscape rife with experimentation, with participants expressing skepticism about the near-term prospects of Artificial General Intelligence (AGI). The focus needs to shift from the race for AGI towards addressing fundamental challenges and establishing clear definitions for key AI terminology to avoid misleading anthropomorphism. The panelists emphasized the importance of approaching AI development with an engineering mindset, concentrating on its practical applications and addressing ethical considerations proactively, rather than solely focusing on the hype surrounding AGI.


Kamiwaza Model Context Protocol for Private Models – Inferencing and Data Within the Enterprise

Event: AI Field Day 6

Appearance: Kamiwaza Presents at AI Field Day 6

Company: Kamiwaza.AI

Video Links:

Personnel: Luke Norris, Matt Wallace

In this segment, Kamiwaza demonstrated how the Model Context Protocol (MCP) enables enterprises to integrate and manage private models and agents. The presentation showcased a simple chat interface capable of handling complex tasks, such as interacting with GitHub to review and fix code. This was achieved through the open-source MCP and its accompanying tools, which are designed to simplify the integration of private AI models into existing workflows. The demo highlighted the platform’s ability to bridge the gap between AI innovation and practical enterprise solutions.

A key element of the demonstration involved an agent with a defined persona that could execute a series of tasks based on simple instructions. These tasks included cloning a repository, copying files, analyzing code, debugging, and committing changes, all within a private and controlled environment. This demonstrated the power of MCP in enabling more complex agent interactions while maintaining control and security. The presenters also emphasized the open-source nature of both the front-end application and the back-end server, fostering community contributions and broader adoption of the technology.
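To make the pattern concrete, the sketch below shows how actions like those in the demo can be exposed as tools with the open-source MCP Python SDK. The tool names and Git operations are illustrative stand-ins, not Kamiwaza’s actual tool definitions.

```python
# A minimal MCP tool server (sketch). Assumes the `mcp` Python SDK is
# installed; tool names and behavior are hypothetical examples.
import subprocess

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("repo-agent-tools")

@mcp.tool()
def clone_repository(url: str, dest: str) -> str:
    """Clone a Git repository into a controlled working directory."""
    subprocess.run(["git", "clone", url, dest], check=True)
    return f"cloned {url} into {dest}"

@mcp.tool()
def commit_changes(repo_dir: str, message: str) -> str:
    """Stage and commit all changes in the working tree."""
    subprocess.run(["git", "-C", repo_dir, "add", "-A"], check=True)
    subprocess.run(["git", "-C", repo_dir, "commit", "-m", message], check=True)
    return f"committed: {message}"

if __name__ == "__main__":
    mcp.run()  # serve the tools (stdio transport by default) to an MCP client
```

An MCP-capable chat front end can then discover these tools and chain them (clone, analyze, fix, commit) from a single natural-language instruction.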

The presentation further showcased Kamiwaza’s broader strategy, encompassing both highly customized, enterprise-specific applications and simpler, more general-purpose tools like chatbots and research assistants. The platform aims to simplify the deployment and management of these tools, integrating them with existing data and infrastructure. Kamiwaza plans to release open-source ports of several useful applications to further encourage adoption and collaboration within the community. A significant portion of the presentation also focused on the platform’s potential to replace traditional RPA systems, providing a more flexible, cost-effective, and ultimately more powerful solution for automating enterprise processes.


Move from the AI Pilot Circle of Doom to Achieving Outcomes Today with Kamiwaza Enterprise Outcome Support

Event: AI Field Day 6

Appearance: Kamiwaza Presents at AI Field Day 6

Company: Kamiwaza.AI

Video Links:

Personnel: Luke Norris, Matt Wallace

Discover how Kamiwaza’s outcome-driven approach ensures measurable success for organizations of any size. This novel approach enables not just break-fix support but true outcome-based support of AI workflows and applications. The Kamiwaza platform facilitates innovation across various sectors, including manufacturing and government, as demonstrated through real-world use cases presented at AI Field Day 6. The presentation showcased the platform’s capabilities through live demos, highlighting its user-friendly interface and powerful features for managing and deploying AI models, even large language models.

A core element of the Kamiwaza platform is its cluster management system, enabling seamless deployment and scaling of models across various environments, from local machines to cloud-based clusters. The platform leverages Ray, a distributed computing framework, for efficient load balancing and resource allocation. Demonstrations included data ingestion and processing pipelines, showcasing the platform’s ability to handle large datasets and distribute workloads effectively across multiple nodes. The presentation also emphasized the platform’s developer-centric design, providing tools and APIs for building and integrating custom AI applications.
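Because the platform leverages Ray, the generic sketch below illustrates the distribution pattern described: fan a data-processing step out across whatever nodes the cluster has. This is ordinary Ray usage, not Kamiwaza’s actual pipeline code.

```python
# Distributing an ingestion step with Ray (generic sketch).
import ray

ray.init()  # joins an existing cluster if configured, else starts locally

@ray.remote
def process_chunk(chunk: list) -> int:
    # Stand-in for real work: parsing, embedding, indexing a batch of docs.
    return len(chunk)

chunks = [["doc1", "doc2"], ["doc3"], ["doc4", "doc5", "doc6"]]
futures = [process_chunk.remote(c) for c in chunks]
print(sum(ray.get(futures)))  # Ray load-balances chunks across nodes -> 6
```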

Furthermore, the presentation explored the concept of “agents” within the Kamiwaza ecosystem, illustrating how these agents can automate complex tasks and workflows. Examples included automated data conversion, report generation, and even the creation of new applications based on user requests. The agents’ capabilities were demonstrated through live demos, emphasizing their potential to significantly accelerate AI-driven processes and improve efficiency. The presenters highlighted the importance of human oversight and collaboration, emphasizing that while agents automate tasks, human experts can provide crucial guidance, ensuring accuracy and contextual understanding, particularly when dealing with complex or sensitive data.


Kamiwaza – A Single API and SDK for GenAI Applications to Run in the Enterprise

Event: AI Field Day 6

Appearance: Kamiwaza Presents at AI Field Day 6

Company: Kamiwaza.AI

Video Links:

Personnel: Luke Norris, Matt Wallace

Experience a live demo showcasing Kamiwaza’s capabilities, including how our platform seamlessly integrates third-party applications via our SDK and API across all locations and silicon. A technical demo showed data ingestion for the RAG process and how data and security integrations, such as authentication, are handled for both internal and third-party enterprise applications. Kamiwaza positions itself as a “Docker for Generative AI,” providing a single API and SDK to manage multiple GenAI applications, eliminating the need for individual stacks and security layers for each application. This approach allows for seamless interaction between various applications, significantly boosting efficiency and delivering tangible business outcomes.
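As a concrete illustration of the “single API” idea, the sketch below assumes the platform exposes an OpenAI-compatible endpoint, a common convention for inference gateways; the URL, token, and model name are hypothetical.

```python
# Calling a unified enterprise inference gateway (sketch; endpoint assumed
# to be OpenAI-compatible, names hypothetical).
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.internal/v1",  # hypothetical gateway URL
    api_key="enterprise-issued-token",               # central auth, per the demo
)

resp = client.chat.completions.create(
    model="llama-3-70b",  # hypothetical deployed model name
    messages=[{"role": "user", "content": "Summarize our Q3 incident reports."}],
)
print(resp.choices[0].message.content)
```

The point is that every application talks to one endpoint through one security layer, regardless of which model, location, or silicon serves the request.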

The platform boasts hardware agnosticism, enabling large language models to run on diverse hardware, from single servers to large clusters. Bench testing demonstrates impressive throughput, with a 70B parameter model achieving 8,000 tokens per second on a single 8-way AMD MI300 server. Kamiwaza is offered in several tiers: a free community version, a per-GPU licensing model for smaller deployments, and a $25,000 enterprise edition with unlimited GPUs. The enterprise edition includes unique outcome-based support, where a dedicated GenAI architect helps clients achieve specific business goals, ensuring the platform delivers practical value and isn’t just shelfware.

Unlike other solutions focused solely on infrastructure, Kamiwaza also addresses the application layer through its “app garden” and provides integrations with other observability tools. While they are not currently integrating at the prompt level, they leverage existing solutions to provide a robust and scalable platform. Future development includes expanding the app garden to allow third-party developers to easily build and deploy their own applications and agents. The company’s vision is to facilitate a more modular and customizable enterprise GenAI ecosystem, challenging the traditional monolithic approach to enterprise software and enabling rapid development of tailored AI solutions.


The Power of a True Enterprise AI Orchestration with Kamiwaza

Event: AI Field Day 6

Appearance: Kamiwaza Presents at AI Field Day 6

Company: Kamiwaza.AI

Video Links:

Personnel: Luke Norris, Matt Wallace

Kamiwaza is a groundbreaking platform designed to enable enterprise AI at scale. Its Distributed Inference Mesh and Locality-Aware Data Engine deliver unmatched performance across diverse environments, including cloud, on-premises, and edge locations, while remaining independent of specific silicon technologies. A key focus is inference optimization for the operational aspect of AI, addressing the challenges of increasing inference loads associated with complex applications like retrieval-augmented generation (RAG) and autonomous agents. The platform tackles the “data gravity” problem, which significantly hinders enterprise AI adoption, by processing data locally wherever it resides to minimize data transfer and maintain data sovereignty.

The Kamiwaza platform distinguishes itself through four key differentiators. First, it provides a complete, opinionated yet loosely coupled enterprise AI stack delivered within Docker containers, allowing for rapid deployment and easy customization. Components like vector databases can be swapped with minimal code changes. Second, the global inference mesh and locality-aware data engine enable distributed inference across multiple locations, intelligently routing requests based on data location and available resources. This approach drastically reduces data movement while maintaining performance and compliance requirements. A global catalog service tracks metadata across all locations, facilitating efficient data access and processing.
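For intuition, here is a deliberately simplified, hypothetical sketch of what locality-aware routing means: send a request to a site that already holds the data it needs, falling back on free capacity. Kamiwaza’s actual routing logic was not shown at this level of detail.

```python
# Locality-aware request routing (illustrative sketch, not product code).
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    datasets: set
    free_gpus: int

SITES = [
    Site("on-prem-dc", {"crm", "hr-docs"}, 2),
    Site("cloud-east", {"public-web"}, 8),
    Site("edge-plant-7", {"sensor-logs"}, 1),
]

def route(dataset: str) -> Site:
    # Prefer sites that already hold the data: moving compute to the data
    # avoids bulk transfer and preserves data sovereignty.
    local = [s for s in SITES if dataset in s.datasets and s.free_gpus > 0]
    if local:
        return max(local, key=lambda s: s.free_gpus)
    # Otherwise fall back to the site with the most available capacity.
    return max(SITES, key=lambda s: s.free_gpus)

print(route("sensor-logs").name)  # -> edge-plant-7
```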

Third, Kamiwaza’s architecture is completely silicon-agnostic, enabling deployment on diverse hardware in various environments. Fourth, it offers a unified API across all locations, fostering seamless integration with existing enterprise security and authentication systems. This approach, combined with its ability to integrate third-party and in-house agentic applications, positions Kamiwaza as a “Docker for Enterprise AI,” simplifying the deployment and management of large-scale generative AI solutions across complex, distributed enterprise infrastructures.


How CEOs are Preparing for AI in 2025 – Futurum CEO Insights

Event: AI Field Day 6

Appearance: Futurum Presents CEO Insights on AI Strategies

Company: The Futurum Group

Video Links:

Personnel: Dion Hinchcliffe

The Futurum Group’s presentation at AI Field Day 6, focusing on CEO perspectives on AI in 2025, revealed a significant disconnect between C-suite leadership and IT teams. While 59% of surveyed CEOs (from billion-dollar revenue companies globally) believe they’re leading AI strategy, this perception is largely driven by board pressure to compete in a rapidly evolving AI landscape. CEOs see AI not merely as a technological tool, but as a strategic imperative for business transformation affecting all organizational levels, understanding that failure to adapt will lead to disruption.

The study, conducted in partnership with Kearney, highlights a notable overconfidence among CEOs regarding AI readiness. Despite widespread recognition of AI’s potential (e.g., $20 trillion injection into the global economy by 2030, potential 5x return on investment), only 25% feel prepared. This unpreparedness stems from challenges like talent acquisition and the immaturity of AI technologies to address CEOs’ long-term strategic goals, which frequently involve developing entirely new products and services. The study emphasizes that successful AI adoption is strongly correlated with a decentralized leadership approach, rigorous ROI tracking, and a culture that addresses employee concerns proactively.

Successful AI implementation, according to the study’s findings on high-performing firms, hinges on several key factors. Decentralized leadership, where the vision is set at the top but execution is delegated, proves far more effective than a micro-management approach. Rigorous tracking of ROI is critical for demonstrating value and securing further investment. Finally, fostering a fearless culture that directly addresses worker anxieties about AI’s impact on jobs is paramount. The study concludes that while many CEOs are forging ahead with ambitious AI plans, a measured, data-driven approach, coupled with effective governance, is crucial to avoid costly failures.


MemVerge Fireside Chat with Steve Yatko of Oktay

Event: AI Field Day 6

Appearance: MemVerge Presents at AI Field Day 6

Company: MemVerge, Oktay

Video Links:

Personnel: Charles Fan, Steve Yatko

Charles Fan and Steve Yatko discussed enterprise experiences with AI application and infrastructure deployments. The conversation highlighted the challenges faced by organizations adopting generative AI, particularly the unpreparedness for the rapid advancements and the need for strategic planning. Key challenges revolved around defining appropriate use cases for generative AI, maximizing business value and revenue generation, and effectively managing confidential data within AI initiatives. The discussion also touched upon simpler issues like improving developer productivity and documentation.

A central theme that emerged was the critical need for manageable AI application workloads and efficient resource utilization. Steve Yatko, drawing on his extensive experience in financial services and technology, emphasized the importance of dynamic resource management, similar to the evolution of virtualization technology. He highlighted the limitations of existing approaches and the advantages offered by MemVerge’s technology in enabling seamless resource allocation, mobilization of development and testing environments, and efficient cost optimization. This included the ability to create internal spot markets for resources, thereby maximizing utilization and sharing across departments.

Yatko specifically praised MemVerge’s technology for its ability to address the critical challenges facing enterprises in the AI space, particularly in financial services. He noted the ability to checkpoint and restore workloads across nodes, enabling greater flexibility and resilience. The solution’s support for multi-cloud, hybrid cloud, and diverse GPU configurations makes it particularly relevant for organizations needing adaptable and scalable solutions. Overall, the presentation positioned MemVerge’s platform as a crucial component for enterprises to efficiently and cost-effectively deploy and manage AI applications at scale, ultimately unlocking productivity and driving greater business value.


MemVerge Memory Machine AI Transparent Checkpointing

Event: AI Field Day 6

Appearance: MemVerge Presents at AI Field Day 6

Company: MemVerge

Video Links:

Personnel: Bernie Wu

Bernie Wu’s presentation at AI Field Day 6 detailed MemVerge’s transparent checkpointing technology for AI workloads, addressing limitations of existing checkpointing methods. This technology, implemented as an MMAI Kubernetes operator, enables efficient pausing and relocation of long-running GPU tasks without requiring application modifications or awareness. This contrasts with other schedulers that necessitate application-level changes or cold restarts, significantly improving resource management and reducing friction for users.

The core of MemVerge’s approach is its ability to perform transparent checkpointing at the platform level, distinct from the application-level checkpointing found in frameworks like PyTorch and TensorFlow. While the latter focuses on model optimization and rollback within the data scientist’s workflow, MemVerge’s solution targets site reliability engineers and platform engineers, handling tasks like graceful node maintenance, elastic workload bin-packing, and reclaiming idle resources, including spot instances. The technology, initially developed for CPUs, has been extended to GPUs through collaboration with NVIDIA, leveraging a two-stage checkpoint/restore process and techniques like incremental memory snapshots and asynchronous checkpointing to minimize overhead.
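The two-stage flow can be sketched as follows, modeled on NVIDIA’s cuda-checkpoint utility together with CRIU, the userspace checkpoint/restore tool. The flags and orchestration here are illustrative assumptions; in MMAI the Kubernetes operator drives this transparently rather than ad hoc scripts.

```python
# Two-stage transparent GPU checkpoint (hedged sketch).
import subprocess

def checkpoint_gpu_process(pid: int, image_dir: str) -> None:
    # Stage 1: toggle CUDA state (device memory, contexts) into host memory,
    # leaving an ordinary CPU process that standard tooling can capture.
    subprocess.run(["cuda-checkpoint", "--toggle", "--pid", str(pid)], check=True)
    # Stage 2: CRIU dumps the full process image to disk, from which it can
    # be restored later, potentially on a different node.
    subprocess.run(
        ["criu", "dump", "--tree", str(pid), "--images-dir", image_dir, "--shell-job"],
        check=True,
    )
```

Restore runs the stages in reverse: CRIU recreates the process from its image, and the CUDA state is toggled back onto whichever GPU is available.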

Future developments include parallelizing the checkpointing process for improved performance, extending support to AMD GPUs and multi-GPU nodes, and enabling cluster-wide checkpointing for distributed training and inferencing. MemVerge also plans to integrate their solution with other schedulers and expand its use cases to encompass hybrid cloud scheduling, heterogeneous pipelines, and HPC environments, further streamlining AI workload management and enhancing operational efficiency.


MemVerge Memory Machine AI GPU-as-a-Service

Event: AI Field Day 6

Appearance: MemVerge Presents at AI Field Day 6

Company: MemVerge

Video Links:

Personnel: Steve Scargall

Steve Scargall introduces Memory Machine AI (MMAI) software from MemVerge, a platform designed to optimize GPU usage for platform engineers, data scientists, developers, MLOps engineers, decision-makers, and project leads. The software addresses challenges in strategic resource allocation, flexible GPU sharing, real-time observability and optimization, and priority management. MMAI allows users to request specific GPU types and quantities, abstracting away the complexities of underlying infrastructure. Users interact with the platform through familiar environments like VS Code and Jupyter notebooks, simplifying the process of launching and managing AI workloads.
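Underneath, a request like “two GPUs of a given type” reduces to ordinary Kubernetes resource requests, the environment MMAI targets first. The generic sketch below uses the standard Kubernetes Python client; the namespace, node-selector label, and image are hypothetical, and MMAI layers its scheduling, telemetry, and GPU surfing on top.

```python
# Requesting GPUs via the Kubernetes API (generic sketch, hypothetical names).
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="train-job"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        node_selector={"gpu.example.com/model": "a100"},  # hypothetical label
        containers=[
            client.V1Container(
                name="trainer",
                image="registry.example.com/train:latest",  # hypothetical image
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "2"}  # ask the scheduler for 2 GPUs
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="ml", body=pod)
```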

A key feature of MMAI is its “GPU surfing” capability, which enables the dynamic movement of workloads between GPUs based on resource availability and priority. This is facilitated by MemVerge’s checkpointing technology, allowing seamless transitions without requiring users to manually manage or even be aware of the location of their computations. The platform supports both on-premises and cloud deployments, initially focusing on Kubernetes but with planned support for Slurm and other orchestration systems. This flexibility allows for integration with existing enterprise infrastructure and workflows, providing a path for organizations of various sizes and technical expertise to leverage their GPU resources more efficiently.

MMAI offers a comprehensive UI providing real-time monitoring and telemetry for both administrators and end-users. Features include departmental billing, resource sharing and bursting, and prioritized job execution. The software supports multiple GPU vendors (NVIDIA initially, with AMD and Intel planned), allowing for heterogeneous environments. The presentation highlights the potential for future AI-driven scheduling and orchestration based on the rich telemetry data collected by MMAI, demonstrating a commitment to continuous improvement and optimization of GPU resource utilization in complex, multi-departmental settings. The business model is based on the number of managed GPUs.


Supercharging AI Infra with MemVerge Memory Machine AI

Event: AI Field Day 6

Appearance: MemVerge Presents at AI Field Day 6

Company: MemVerge

Video Links:

Personnel: Charles Fan

Dr. Charles Fan’s presentation at AI Field Day 6 provided an overview of large language models (LLMs), agentic AI applications, and workflows for AI workloads, focusing on the impact of agentic AI on data center technology and AI infrastructure software. He highlighted the two primary ways enterprises are currently deploying AI: leveraging API services from providers like OpenAI and Anthropic, and deploying and fine-tuning open-source models within private environments for data privacy and cost savings. Fan emphasized the recent advancements in open-source models, particularly DeepSeek, which significantly reduces training costs, making on-premise deployment more accessible for enterprises.

The core of Fan’s presentation centered on MemVerge’s solution to address the challenges of managing and optimizing AI workloads within the evolving data center architecture. This architecture is shifting from an x86-centric model to one dominated by GPUs and high-bandwidth memory, necessitating a new layer of AI infrastructure automation software. MemVerge’s software focuses on automating resource provisioning, orchestration, and optimization, bridging the gap between enterprise needs and the complexities of the new hardware landscape. A key problem addressed is the low GPU utilization in enterprises due to inefficient resource sharing, which MemVerge aims to improve through its “GPU-as-a-service” offering.

MemVerge’s “GPU-as-a-service” solution acts as an orchestrator, improving resource allocation and utilization, addressing the lack of effective virtualization for GPUs. This includes features like transparent checkpointing to minimize data loss during workload preemption and multi-vendor support for GPUs. Their upcoming Memory Machine AI platform will also encompass inference-as-a-service and fine-tuning-as-a-service, further simplifying the deployment and management of open-source models within private enterprise environments. Fan concluded by announcing a pioneer program to engage early adopters and collaborate on refining the platform to meet specific enterprise needs.


AI Model Security and Governance – Broadcom VMware Private AI Model Gallery Demo

Event: AI Field Day 6

Appearance: VMware by Broadcom Presents at AI Field Day 6

Company: VMware by Broadcom

Video Links:

Personnel: Tasha Drew

Model governance is crucial as enterprises adopt AI, requiring secure and consistent model behavior. This presentation by Tasha Drew of VMware by Broadcom focuses on the challenges of achieving model governance and how VMware Private AI’s model gallery addresses them through its capabilities and workflows. The core issue highlighted is the risk associated with introducing models into enterprise environments, similar to the security concerns surrounding containers in their early adoption. This necessitates robust security protocols and consistent monitoring to prevent vulnerabilities and ensure the models operate as intended.

A key aspect of the presentation emphasizes the growing importance of “agentic workflows,” where Large Language Models (LLMs) act as interfaces, orchestrating interactions with various tools and agents to achieve more accurate and comprehensive results. The example of a sales agent leveraging multiple data sources (public internet, internal documents, CRM systems) to generate a compelling presentation illustrates this concept. This highlights the complexity of integrating AI into business processes and the need for robust governance to manage the multiple data sources and agents involved.

The presentation then details how VMware Private AI Foundation, integrated with NVIDIA, helps achieve model governance. This includes a demo showcasing a workflow from model import (from sources like Hugging Face) through security testing (using tools like Giskard) to deployment in a secure environment (Harbor). This integrated approach allows for programmatic model evaluation, monitoring for behavioral drift, and controlled access through versioning and access control mechanisms. The ultimate goal is to enable enterprises to safely adopt AI by operationalizing security testing and providing a centralized, auditable repository for their AI models, thereby minimizing risks and maximizing the benefits of AI within their organizations.
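The programmatic evaluation step can be sketched with Giskard’s documented scanning API; the wrapped prediction function and model metadata below are illustrative stand-ins for the model under review.

```python
# Scanning a text-generation model with Giskard (sketch; some detectors
# require an LLM judge to be configured separately).
import pandas as pd
import giskard

def predict(df: pd.DataFrame) -> list:
    # Stand-in for calls to the imported model's private inference endpoint.
    return ["According to the HR policy handbook, ..." for _ in df["question"]]

model = giskard.Model(
    model=predict,
    model_type="text_generation",
    name="hr-policy-assistant",  # hypothetical model under review
    description="Answers questions about internal HR policy documents.",
    feature_names=["question"],
)

report = giskard.scan(model)  # probes for injection, leakage, harmful output
report.to_html("scan_report.html")  # evidence attached to the gallery entry
```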


Real-World Customer Journey with VMware Private AI from Broadcom

Event: AI Field Day 6

Appearance: VMware by Broadcom Presents at AI Field Day 6

Company: VMware by Broadcom

Video Links:

Personnel: Alex Fanous

Broadcom is actively engaged with customers on proofs of concept and production deployments of VMware Private AI Foundation. This session details a composite example of a typical customer journey, drawing from real-world scenarios encountered during customer engagements. The presentation focuses on the infrastructure aspects often overlooked, emphasizing the importance of a robust foundation for data scientists and AI engineers to effectively utilize AI tools. It highlights the iterative process of deploying and refining a private AI solution, starting with a simple Retrieval-Augmented Generation (RAG) application built on VMware Private AI Foundation.

The customer journey begins with a high-level mandate from senior leadership to implement AI, often without specific technical details. A common starting point is a simple application, such as a chat app built on readily available data like HR policies. This initial deployment allows for a gradual learning curve, introducing the use of vector databases for similarity searches and leveraging the VMware Private AI Foundation console for easy deployment. The presentation showcases how customers typically customize the initial templates, often adopting open-source tools like OpenWebUI for a more familiar user interface. The iterative process involves continual refinement, adjusting parameters, testing various LLMs, and ultimately scaling the infrastructure as needed using load balancers and multiple nodes.
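The vector-database pattern behind that first chat app is compact enough to sketch end to end; the library choices here (sentence-transformers plus NumPy) are ours for illustration, not the stack shown in the session.

```python
# Minimal similarity search for RAG (generic sketch).
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "Employees accrue 1.5 vacation days per month.",
    "Remote work requires manager approval.",
    "Expense reports are due within 30 days.",
]
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

query = "How much vacation do I get?"
q_vec = encoder.encode([query], normalize_embeddings=True)

scores = doc_vecs @ q_vec.T  # cosine similarity, since vectors are normalized
context = docs[int(np.argmax(scores))]
print(context)  # retrieved passage handed to the LLM alongside the question
```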

Throughout the customer journey, the presentation stresses the importance of iterative development and feedback. The process emphasizes starting with a functional prototype, gathering feedback, and then progressively improving performance and scalability. This approach involves close collaboration between the infrastructure team, data scientists, and developers. The use of VMware’s existing infrastructure, such as vCenter and Data Services Manager, is emphasized as a key advantage, minimizing the need for retraining staff or adopting new vendor-specific tools. The session concludes by highlighting the flexibility and adaptability of the VMware Private AI Foundation platform and its ability to accommodate evolving AI architectures and future-proof investments in AI infrastructure.


VMware Private AI Foundation Capabilities and Features Update from Broadcom

Event: AI Field Day 6

Appearance: VMware by Broadcom Presents at AI Field Day 6

Company: VMware by Broadcom

Video Links:

Personnel: Justin Murray

A technical review of the generally available VMware Private AI Foundation with NVIDIA product, an advanced service on VMware Cloud Foundation, was presented. The presentation focused on the architecture of VMware Private AI Foundation (VPF), highlighting its reliance on VMware Cloud Foundation (VCF) 5.2.1 as its base. The speaker, Justin Murray, explained the layered architecture, distinguishing between the infrastructure provisioning layer (VMware’s intellectual property) and the data science layer, which includes containers running inference servers from NVIDIA and other open-source options. Significant advancements since the product’s minimum viable product launch in May 2024 were emphasized, including enhanced model governance capabilities for safe model testing and deployment.

The presentation delved into the rationale behind using VCF and VPF for managing AI/ML workloads. The speaker argued that the increasing complexity of model selection, infrastructure setup (including GPU selection), and the need for RAG (Retrieval-Augmented Generation) applications necessitate a robust and manageable infrastructure. VMware Cloud Foundation, with its virtualization capabilities, provides this solution by enabling isolated deep learning VMs and Kubernetes clusters for different teams and projects, preventing management nightmares and optimizing resource utilization. A key element is the self-service automation, allowing data scientists to request resources (like GPUs) with minimal IT interaction, streamlining the process and enabling faster model deployment.

A significant portion of the presentation covered GPU management and sharing, emphasizing the role of NVIDIA drivers in enabling virtual GPU (vGPU) profiles that allow for efficient resource allocation and isolation. The speaker highlighted the advancements in vMotion for GPUs, enabling rapid migration of workloads, and the integration of tools for monitoring GPU utilization within the VCF operations console. The discussion touched on model version control, the role of Harbor as a repository for models and containers, and the availability of a service catalog for deploying various AI components. The presentation concluded with a demo showing the quick and easy deployment of a Kubernetes cluster for a RAG application, showcasing the self-service capabilities and simplified infrastructure management offered by VMware Private AI Foundation.


Three Reasons Customers Choose VMware Private AI from Broadcom

Event: AI Field Day 6

Appearance: VMware by Broadcom Presents at AI Field Day 6

Company: VMware by Broadcom

Video Links:

Personnel: Tasha Drew

Overview of why customers are choosing VMware Private AI and popular enterprise use cases we are seeing in the field. Tasha Drew’s presentation at AI Field Day 6 highlighted three key reasons driving the adoption of VMware Private AI. First, she addressed the often-overlooked issue of GPU underutilization. Data from UC Berkeley’s Sky Computing Lab, corroborated by VMware’s internal findings, demonstrated that current deployment practices, such as dedicating one GPU per model, lead to significant inefficiency due to inconsistent inference workload patterns. This underutilization is exacerbated by a phenomenon Drew termed “GPU hoarding,” where teams cling to their allocated GPUs for fear of losing them to resource sharing. VMware Private AI addresses this through intelligent workload scheduling and resource pooling, maximizing GPU utilization and enabling resource sharing across different teams and priorities.

The second driver for private AI adoption is cost. Drew presented data indicating a dramatic increase in cloud spending driven by AI workloads, often leading to budget reallocation and project cancellations. This high cost is attributed to various factors including platform fees, data security, infrastructure expenses, and the cost of upskilling staff to handle cloud-based AI tools. In contrast, VMware Private AI offers a more predictable and potentially lower total cost of ownership (TCO) by optimizing resource usage within the enterprise’s existing infrastructure. The presentation referenced an IDC white paper showing that a significant percentage of enterprises perceive on-premise AI solutions as equally or less expensive than cloud-based alternatives, primarily due to the decreased infrastructure and service costs.

Finally, Drew emphasized the critical role of model governance in driving the shift towards private AI. As enterprises embrace generative AI and train models on proprietary data and intellectual property (IP), concerns around data sensitivity and security become paramount. VMware Private AI tackles these concerns by providing robust control mechanisms such as role-based access control (RBAC) to regulate model access and ensure compliance with data protection regulations. While the technical complexities of managing access control within embedding and vector databases are acknowledged, Drew highlighted ongoing development efforts to integrate comprehensive security measures at both the database and application output levels. Overall, the presentation positioned VMware Private AI as a comprehensive solution addressing the challenges of cost, efficiency, and security in deploying and managing enterprise AI workloads.


MLCommons MLPerf Storage

Event: AI Field Day 6

Appearance: ML Commons Presents at AI Field Day 6

Company: ML Commons

Video Links:

Personnel: Curtis Anderson, David Kanter

MLCommons’ MLPerf Storage benchmark addresses the rapidly growing need for high-performance storage in AI training. Driven by the exponential increase in data volume and the even faster growth in data access demands, the benchmark aims to provide a standardized way to compare storage systems’ capabilities for AI workloads. This benefits purchasers seeking informed decisions, researchers developing better storage technologies, and vendors optimizing their products for AI’s unique data access patterns, which are characterized by random reads and massive data volume exceeding the capacity of most on-node storage solutions.

The benchmark currently supports three training workloads (3D U-Net, ResNet-50, and CosmoFlow) using PyTorch and TensorFlow, each imposing distinct demands on storage systems. Future versions will incorporate additional workloads, including a RAG (Retrieval-Augmented Generation) pipeline with a vector database, reflecting the evolving needs of large language model training and inference. A key aspect is the focus on maintaining high accelerator utilization (aiming for 95%), making the storage system’s speed crucial for avoiding costly GPU idle time. The benchmark offers both “closed” (apples-to-apples comparisons) and “open” (allowing for vendor-specific optimizations) categories to foster innovation.
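The utilization target turns storage performance into simple arithmetic: the accelerator’s compute time per batch is fixed, so any time spent waiting on I/O comes straight out of utilization. The numbers below are illustrative, not from the benchmark.

```python
# Accelerator utilization (AU) as the benchmark frames it (worked example).
compute_time_per_batch = 0.080  # seconds the accelerator needs per batch
io_wait_per_batch = 0.006       # seconds stalled waiting on storage

au = compute_time_per_batch / (compute_time_per_batch + io_wait_per_batch)
print(f"AU = {au:.1%}")  # ~93.0%: this storage system would miss a 95% bar
```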

MLPerf Storage has seen significant adoption since its initial release, with a substantial increase in the number of submissions and participating organizations. This reflects the growing importance of AI in the market and the need for a standardized benchmark for evaluating storage solutions designed for these unique demands. The benchmark’s community-driven nature and transparency are enabling more informed purchasing decisions, moving beyond arbitrary vendor claims and providing a more objective way to assess the performance of storage systems in the critical context of modern AI applications.


MLCommons MLPerf Client Overview

Event: AI Field Day 6

Appearance: ML Commons Presents at AI Field Day 6

Company: ML Commons

Video Links:

Personnel: David Kanter

MLCommons presented MLPerf Client, a new benchmark designed to measure the performance of PC-class systems, including laptops and desktops, on large language model (LLM) tasks. Released in December 2024, it’s an installable, open-source application (available on GitHub) that allows users to easily test their systems and provides early access for feedback and improvement. The initial release focuses on a single large language model, Llama 2 7B, using the OpenOrca dataset, and includes four tests simulating different LLM usage scenarios like content generation and summarization. The benchmark prioritizes response latency as its primary metric, mirroring real-world user experience.

A key aspect of MLPerf Client is its emphasis on accuracy. While prioritizing performance, it incorporates the MMLU (Massive Multitask Language Understanding) benchmark to ensure the measured performance is achieved with acceptable accuracy. This prevents optimizations that might drastically improve speed but severely compromise the quality of the LLM’s output. The presenters emphasized that this is not intended to evaluate production-ready LLMs, but rather to provide a standardized and impartial way to compare the performance of different hardware and software configurations on common LLM tasks.

The benchmark utilizes a single-stream approach, feeding queries one at a time, and supports multiple GPU acceleration paths via ONNX Runtime and Intel OpenVINO. The presenters highlighted the flexibility of allowing hardware vendors to optimize the model (Llama 2 7B) for their specific devices, even down to 4-bit integer quantization, while maintaining sufficient accuracy as judged by the MMLU threshold. Future plans include expanding hardware support, adding more tests and models, and implementing a graphical user interface (GUI) to improve usability.
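The single-stream discipline is easy to picture: one query in flight at a time, with the clock on first-token and last-token latency. The loop below is an illustrative harness shape, not the benchmark’s actual code.

```python
# Single-stream latency measurement (illustrative sketch).
import time

def run_llm(prompt: str):
    # Stand-in generator yielding tokens from the system under test.
    for tok in ["The", " summary", " is", " ..."]:
        time.sleep(0.02)
        yield tok

for prompt in ["Summarize this article ...", "Draft a short email ..."]:
    start = time.perf_counter()
    ttft = None
    for i, _ in enumerate(run_llm(prompt)):
        if i == 0:
            ttft = time.perf_counter() - start  # time to first token
    total = time.perf_counter() - start         # time to last token
    print(f"TTFT {ttft * 1000:.0f} ms, total {total * 1000:.0f} ms")
```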


MLCommons and MLPerf – An Introduction

Event: AI Field Day 6

Appearance: ML Commons Presents at AI Field Day 6

Company: ML Commons

Video Links:

Personnel: David Kanter

MLCommons is a non-profit industry consortium dedicated to improving AI for everyone by focusing on accuracy, safety, speed, and power efficiency. The organization boasts over 125 members across six continents and leverages community participation to achieve its goals. A key project is MLPerf, an open industry standard benchmark suite for measuring the performance and efficiency of AI systems, providing a common framework for comparison and progress tracking. This transparency fosters collaboration among researchers, vendors, and customers, driving innovation and preventing inflated claims.

The presentation highlights the crucial relationship between big data, big models, and big compute in achieving AI breakthroughs. A key chart illustrates how AI model performance significantly improves with increased data, but eventually plateaus. This necessitates larger models and more powerful computing resources, leading to an insatiable demand for compute power. MLPerf benchmarks help navigate this landscape by providing a standardized method of measuring performance across various factors including hardware, algorithms, software optimization, and scale, ensuring that improvements are verifiable and reproducible.

MLPerf offers a range of benchmarks covering diverse AI applications, including training, inference (data center, edge, mobile, tiny, and automotive), storage, and client systems. The benchmarks are designed to be representative of real-world use cases and are regularly updated to reflect technological advancements and evolving industry practices. While acknowledging the limitations of any benchmark, the presenter emphasizes MLPerf’s commitment to transparency and accountability through open-source results, peer review, and audits, ensuring that reported results are not merely flukes but can be validated and replicated. This approach promotes a collaborative, data-driven approach to developing more efficient and impactful AI solutions.


Enabling AI Ready Data Products with Qlik Talend Cloud

Event:

Appearance: Qlik Tech Field Day Showcase

Company: Qlik

Video Links:

Personnel: Sharad Kumar

In this video, Sharad Kumar, Field CTO of Data at Qlik, discusses how Qlik is transforming the way organizations create, manage, and consume data products, bridging the gap between data producers and business users. Qlik’s platform enables teams to deliver modular, trusted, and easily consumable data that’s packed with business semantics, quality rules, and access policies. With Qlik, data ownership, transparency, and collaboration are simplified, empowering organizations to leverage data for advanced analytics, machine learning, and AI at scale. Unlock faster decision-making, reduced costs, and impactful insights with Qlik’s data product marketplace and powerful federated architecture.


Transforming Data Architecture – Qlik’s Approach to Open Table Lakehouses

Event:

Appearance: Qlik Tech Field Day Showcase

Company: Qlik

Video Links:

Personnel: Sharad Kumar

In this video, Sharad Kumar, Field CTO of Data at Qlik, discusses the future of data architecture with Open Table-based Lakehouses. Learn how formats like Apache Iceberg are transforming the way businesses store and manage data, offering unparalleled flexibility by decoupling compute from storage. Sharad highlights how Qlik’s integration with Iceberg enables seamless data transformations, empowering customers to optimize performance and costs using engines like Spark, Trino, and Snowflake. Discover how Qlik simplifies building modern data lakes with Iceberg, providing the scalability, control, and efficiency needed to drive business success.
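The decoupling Sharad describes is visible in how an Iceberg table is created and shared: the table format, not any one engine, owns the data in object storage. The sketch below uses PySpark and assumes the Iceberg Spark runtime is on the classpath; the catalog name, warehouse path, and table are illustrative, not Qlik’s configuration.

```python
# Creating and querying an Apache Iceberg table from Spark (generic sketch).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("iceberg-sketch")
    .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.demo.type", "hadoop")
    .config("spark.sql.catalog.demo.warehouse", "s3://example-bucket/warehouse")
    .getOrCreate()
)

spark.sql(
    "CREATE TABLE IF NOT EXISTS demo.sales.orders "
    "(id BIGINT, amount DOUBLE) USING iceberg"
)
spark.sql("INSERT INTO demo.sales.orders VALUES (1, 19.99)")
spark.sql("SELECT * FROM demo.sales.orders").show()

# Because Iceberg metadata lives with the data, Trino or Snowflake can query
# this same table; compute engines are interchangeable over one copy of data.
```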