Managing Applications in Offline Edge Scenarios with Avassa

Event: Edge Field Day 3

Appearance: Avassa Presents at Edge Field Day 3

Company: Avassa

Personnel: Carl Moberg, Fredrik Jansson

Discover how Avassa ensures robust edge deployments with site-local clustering and offline capabilities. In this demo, we dive into the critical aspects of maintaining high availability for your containerized applications at the edge, even when connectivity is disrupted.

We cover the essential services needed for smooth operations—such as integrated logging and secrets management. Learn how Avassa simplifies log collection and analysis across distributed sites, ensuring quick access to operational insights. Our demo also highlights how secrets are securely managed, with encrypted storage and seamless distribution across edge sites.

By the end of this video, you’ll understand how Avassa’s site-local clustering and offline capabilities, combined with its logging and secrets management, provide a resilient and secure environment for your edge applications.

Book a demo of the Avassa Edge Platform: https://info.avassa.io/demo


How to Migrate Legacy VMs to Containers with Avassa

Event: Edge Field Day 3

Appearance: Avassa Presents at Edge Field Day 3

Company: Avassa

Personnel: Carl Moberg, Fredrik Jansson

Looking to transition away from a single VM to a containerized environment at the edge? In this demo, we show how Avassa makes it easy to migrate to distributed container applications at the edge. We start by outlining the limitations of VMs at the edge and introduce the core concepts: container package formats, distribution mechanisms, and lifecycle operations in the edge container runtime.

See how Avassa uses these to simplify edge deployments, including packaging VMs as containers during a migration period. We deploy a sample application and demonstrate full health probe functionality and monitoring for real-time insights. Then, we seamlessly replace the VM-based application with a new Linux-native container version, showcasing a smooth upgrade process with minimal downtime. By the end, you’ll understand how Avassa empowers efficient application migration from VMs to containers at the edge.
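
The health probes shown in the demo are Avassa’s own, but as a generic illustration of what a container-level health probe polls, a minimal HTTP health endpoint in Python might look like the following; the /healthz path and port 8080 are assumptions for illustration, not Avassa’s probe format.

    # Minimal HTTP health endpoint a container can expose so that an
    # edge platform's health probes have something concrete to poll.
    # The path and port are illustrative assumptions.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/healthz":
                self.send_response(200)  # healthy: the probe passes
                self.end_headers()
                self.wfile.write(b"ok")
            else:
                self.send_response(404)
                self.end_headers()

        def log_message(self, *args):
            pass  # keep probe traffic out of the application log

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()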

Book a demo of the Avassa Edge Platform: https://info.avassa.io/demo


Transform Manufacturing with VMware Edge Compute Stack

Event: Edge Field Day 3

Appearance: VMware Presents at Edge Field Day 3

Company: VMware

Personnel: Chris Taylor

Smart Manufacturing, also known as Industry 4.0, has the potential to significantly improve operational technology (OT) environments by enhancing productivity and quality. VMware’s Edge Compute Stack is designed to optimize the deployment and management of these advanced technologies across factory settings. By leveraging real-time and deterministic processing of virtual Programmable Logic Controllers (PLCs) and other industrial applications, VMware enables manufacturers to achieve greater agility, enhanced security, and improved sustainability in their operations.

Traditionally, manufacturing environments rely on separate systems for different functions, such as human-machine interfaces (HMI), robot control, and PLCs for conveyor belts. VMware’s solution allows for the virtualization of these systems, enabling them to run concurrently on a single server. This approach not only simplifies the infrastructure but also allows for the integration of both real-time and non-real-time applications. For example, VMware has been working with Audi, which operates with thousands of industrial PCs across its factories, to virtualize these systems and optimize their performance based on latency requirements.

By virtualizing the factory floor, VMware’s Edge Compute Stack offers manufacturers the flexibility to deploy applications either on the factory line or in the server room, depending on their specific needs. This approach reduces the need for multiple physical systems, streamlines operations, and enhances the overall efficiency of the manufacturing process. The collaboration with Audi demonstrates the practical application of this technology, showcasing how edge computing can transform traditional manufacturing environments into more agile, secure, and sustainable operations.

Presented by Chris Taylor, Product Marketing, Software-Defined Edge

See how Audi is using edge compute in their factory: https://www.vmware.com/explore/video-library/video/6360760638112


VMware Edge Compute Stack Deep Dive and Demo

Event: Edge Field Day 3

Appearance: VMware Presents at Edge Field Day 3

Company: VMware

Personnel: Alan Renouf

Deploying applications at edge sites, close to where data is generated, presents unique challenges compared to traditional cloud or data center environments. Edge sites face limited connectivity or private network constraints, physical security challenges, and typically lack trained IT staff. In this session, discover how VMware simplifies fleet management of edge applications and infrastructure across thousands of resource-constrained sites using a GitOps and desired state management approach.

GitOps allows infrastructure to be defined through text files, such as YAML, which specify the desired state of the infrastructure, including virtualization layers, networking configurations, and applications. These files are stored in a Git repository, providing version control, auditing, and security benefits. The infrastructure at the edge site can reference these files to configure itself, ensuring it remains in the desired state even when connectivity is limited.
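
As a minimal sketch of that desired-state idea, the loop below pulls a declarative spec and converges local state toward it; the state file layout and the stand-in deploy step are simplified assumptions, not VMware’s actual formats.

    # Pull-based desired-state loop: read a declarative spec (as an edge
    # site would from a Git repository) and converge the running state
    # toward it. Field names here are illustrative assumptions.
    import json, time

    running = {}  # app name -> version currently deployed

    def fetch_desired_state():
        # A real GitOps agent would pull a YAML/JSON file from Git.
        with open("desired-state.json") as f:
            return json.load(f)

    def reconcile(desired):
        for app, spec in desired.get("apps", {}).items():
            if running.get(app) != spec["version"]:
                print(f"converging {app} -> {spec['version']}")
                running[app] = spec["version"]  # stand-in for a real deploy
        for app in list(running):
            if app not in desired.get("apps", {}):
                print(f"removing {app}")  # converge deletions too
                del running[app]

    while True:
        try:
            reconcile(fetch_desired_state())
        except OSError:
            pass  # connectivity is limited at the edge; keep last known state
        time.sleep(60)  # poll interval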

VMware’s solution integrates both infrastructure and application management, bridging the gap between developers and infrastructure administrators. Developers can focus on creating applications, while infrastructure admins manage the platform on which these applications run. The platform supports continuous integration and continuous deployment (CI/CD) pipelines, allowing for automated testing and deployment of edge configurations. For example, a virtual edge device can be spun up in a CI/CD pipeline to test infrastructure and application changes before they are deployed to production. This approach ensures that updates are thoroughly tested and can be rolled out incrementally, starting with a few edge sites and gradually expanding to thousands.
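
The incremental rollout described above can be sketched as a simple wave schedule; the wave sizes and the deploy and health-check callables below are hypothetical placeholders.

    # Wave-based rollout sketch: deploy to a few canary sites first,
    # verify health, then widen. Wave sizes are arbitrary examples.
    def rollout(sites, deploy, healthy, wave_sizes=(2, 10, 100)):
        remaining = list(sites)
        for size in wave_sizes:
            wave, remaining = remaining[:size], remaining[size:]
            for site in wave:
                deploy(site)
            if not all(healthy(site) for site in wave):
                raise RuntimeError("halting rollout: canary wave unhealthy")
            if not remaining:
                break

    # Usage (hypothetical callables):
    # rollout(all_sites, deploy=push_config, healthy=probe_site)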

The demo showcased how VMware’s Edge Compute Stack can be deployed on a small device, such as an Intel NUC, and how it integrates with GitOps to manage edge infrastructure. The demo included deploying a computer vision application and configuring a display screen connected to the edge device. The platform allows for easy deployment and management of applications, with the ability to monitor and update edge sites remotely. VMware’s solution is designed to provide a complete edge stack, including virtualization, Kubernetes, networking, and monitoring, making it a comprehensive solution for managing edge environments.

Presented by Alan Renouf, Technical Product Manager, Software-Defined Edge

Try VMware Edge Compute Stack: https://images.sw.broadcom.com/Web/CAInc2/{e35e3a44-c0c1-44aa-9da0-7ab729b5348d}_ECS_Trial_License_Request_091724.pdf


Enabling Mass Innovation with the VMware Edge Compute Stack

Event: Edge Field Day 3

Appearance: VMware Presents at Edge Field Day 3

Company: VMware

Personnel: Alan Renouf, Chris Taylor

Across all industries, organizations are adding intelligence to enhance their business operations at the edge for lower costs, higher quality, and increased sales. But scaling out innovation across sites is challenging. VMware’s approach removes edge complexity with zero-touch operations.

VMware’s approach to edge computing focuses on simplifying operations and enabling mass innovation across industries by addressing the unique challenges of managing distributed infrastructure at scale. The VMware Edge Compute Stack is designed to handle the complexities of edge environments, where organizations are increasingly deploying intelligent systems to enhance business operations. These edge environments, such as retail stores, manufacturing plants, and energy substations, require localized computing power to process large amounts of data in real-time, often without reliable network connectivity. VMware’s solution integrates edge computing with networking and security services, offering a full-stack approach that includes SD-WAN and SASE technologies to ensure reliable and secure operations across dispersed locations.

The VMware Edge Compute Stack is built to handle the specific constraints of edge environments, such as limited on-site personnel, ruggedized hardware, and the need for real-time processing. The platform supports both virtual machines and containerized applications, allowing organizations to run legacy systems alongside modern applications. VMware’s orchestration platform, Edge Cloud Orchestrator, enables zero-touch provisioning, making it easier to deploy and manage edge infrastructure without requiring IT staff at each location. This pull-based management model, inspired by the way smartphones update themselves, allows edge devices to autonomously check for updates and install them, reducing the need for manual intervention and minimizing downtime.

VMware’s edge computing solutions are already being used in various industries, including retail, manufacturing, and energy. For example, in retail, edge computing is used for loss prevention through computer vision, while in manufacturing, companies like Audi are using edge AI to improve the precision of welding robots and torque wrenches. In the energy sector, virtualizing electrical substations allows for faster response times and reduced operational costs. VMware’s flexible and scalable platform is designed to meet the evolving needs of edge environments, ensuring that organizations can innovate and optimize their operations while maintaining security and reliability.

Presented by Alan Renouf, Technical Product Manager, Software-Defined Edge and Chris Taylor, Product Marketing, Software-Defined Edge

See how VMware Edge Compute Stack works: https://www.youtube.com/watch?v=LiJ3YAWDASw


Less AI Chat More Action – AI Field Day 5 Delegate Roundtable Discussion

Event: AI Field Day 5

Appearance: AI Field Day 5 Delegate Roundtable Discussions

Company: Tech Field Day

Personnel: Alastair Cooke

The AI Field Day 5 delegate roundtable discussion, moderated by Alastair Cooke, centered on the prevalent use of chat-based interfaces in AI applications and the desire for more actionable AI solutions. The participants expressed frustration with the current trend of AI providing verbose responses to simple queries, arguing that AI should enhance applications rather than dominate them. They emphasized that AI should be a feature that improves the functionality of applications, rather than being the focal point. The discussion highlighted the need for AI to perform useful tasks, such as automating expense reports, rather than merely engaging in dialogue.

The delegates discussed the limitations of chat interfaces and the potential for AI to take more direct actions on behalf of users. They pointed out that while chatbots can be useful in certain scenarios, such as customer service, the ultimate goal should be for AI to perform tasks autonomously without requiring constant user input. The conversation also touched on the issue of trust in AI, noting that while users may not fully trust AI to take actions independently, they could still benefit from AI performing preliminary tasks that users can then review and approve. The participants agreed that AI should be used to handle repetitive and tedious tasks that humans are not well-suited for, thereby enhancing productivity and efficiency.

The roundtable concluded with a vision for the future of AI, where chat-based applications have their place, but are complemented by other forms of AI that can perform more complex and useful tasks. The delegates emphasized the importance of using the right AI tools for the right problems and moving beyond the current fascination with large language models and chat interfaces. They envisioned a future where AI is seamlessly integrated into applications, performing tasks that improve users’ lives without detracting from their experiences. The discussion underscored the need for AI to be a tool that assists and augments human capabilities, rather than replacing them or becoming a source of frustration.


AI Is Not Your Friend – AI Field Day 5 Delegate Roundtable Discussion

Event: AI Field Day 5

Appearance: AI Field Day 5 Delegate Roundtable Discussions

Company: Tech Field Day

Personnel: Stephen Foskett

The roundtable discussion at AI Field Day 5, moderated by Stephen Foskett, delved into the overly friendly nature of AI products and the implications of this design choice. The conversation began with the observation that many AI interfaces are designed to be exceedingly polite and user-friendly, akin to a vending machine thanking you after a frustrating interaction. While this friendliness is preferable to a rude AI, it can be misleading as it creates an illusion of companionship. The delegates shared their experiences with AI chat services, noting that while these systems are polite, they often fail to meet the user’s actual needs, leading to frustration. The discussion highlighted the need for AI to be efficient and effective rather than just friendly.

The conversation then shifted to the broader implications of AI and smart technology, particularly the pervasive data collection and surveillance. The delegates expressed concerns about the lack of user control over data collected by smart devices, such as TVs and cars, which often gather and transmit data without explicit user consent. This data is valuable to companies for targeted advertising and other purposes, raising significant privacy issues. The discussion underscored the tension between the benefits of smart technology, such as improved accessibility and convenience, and the invasive nature of data collection. The delegates argued that while AI and smart devices can enhance quality of life, especially for individuals with disabilities, the trade-off often involves sacrificing privacy and autonomy.

Finally, the roundtable touched on the regulatory landscape and the need for stronger protections against data misuse. The delegates noted that while some regions, like Europe, have more stringent privacy regulations, the enforcement and effectiveness of these laws vary. The conversation highlighted the role of regulation in ensuring that companies do not exploit user data and the importance of collective decision-making in addressing these issues. The discussion concluded with a reflection on the future of AI and smart technology, emphasizing the need for a balance between innovation and privacy, and the importance of designing AI systems that are both user-friendly and respectful of user autonomy.


Solving AI Cluster Scaling and Reliability Challenges in Training, Inference, RAG, and In-Memory Database Applications with Enfabrica

Event: AI Field Day 5

Appearance: Enfabrica Presents at AI Field Day 5

Company: Enfabrica

Personnel: Rochan Sankar

Enfabrica’s presentation at AI Field Day 5, led by founder and CEO Rochan Sankar, delved into the company’s innovative solutions for addressing AI cluster scaling and reliability challenges. Sankar highlighted the benefits of Enfabrica’s Accelerated Compute Fabric SuperNIC (ACFS), which enables wide fabrics with fewer hops, significantly reducing GPU-to-GPU hop latency. This reduction in latency is crucial for improving the performance of parallel workloads across GPUs, not just in training but also in other applications. The ACFS allows for a 32x multiplier in network ports, facilitating the connection of up to 500,000 GPUs in just two layers of switching, compared to the traditional three layers. This streamlined architecture enhances job performance and increases utilization, offering a potential 50-60% savings in total cost of ownership (TCO) on the network side.
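
Rough leaf-spine arithmetic makes the two-layer claim plausible; the radix values below are assumptions for illustration, not Enfabrica’s published topology.

    # Back-of-envelope: a two-tier leaf-spine built from radix-k switches
    # tops out around k*k/2 endpoints, so a large port multiplier moves
    # two-tier capacity from thousands into the millions. Radix values
    # are illustrative assumptions.
    def two_tier_max(radix):
        return radix * radix // 2

    print(two_tier_max(64))        # 2,048 endpoints with 64-port switches
    print(two_tier_max(64 * 32))   # ~2.1M with a 32x port multiplier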

Sankar also discussed the resiliency improvements brought by the multi-planar switch fabric, which ensures that every GPU or connected element can multipath out in case of failures. This hardware-based failover mechanism allows for immediate traffic rerouting without loss, while software optimizations ensure optimal load balancing. The presentation emphasized the importance of this resiliency, especially as AI clusters scale and the network’s reliability becomes increasingly critical. Enfabrica’s approach addresses the challenges posed by optical connections and high failure rates, ensuring that GPU operations remain unaffected by individual component failures, thus maintaining overall system performance and reliability.

In the context of AI inference and retrieval-augmented generation (RAG), Sankar explained how the ACFS can provide massive bandwidth to both accelerators and memory, creating a memory area network with microsecond access times. This architecture supports a tiered cache-driven approach, optimizing the use of expensive memory resources like HBM. By leveraging cheaper memory options and shared memory elements, Enfabrica’s solution can significantly enhance the efficiency and scalability of AI inference workloads. The presentation concluded with a summary of the ACFS’s capabilities, including high throughput, programmatic control of the fabric, and substantial power savings, positioning it as a critical component for next-generation data centers and large-scale AI deployments.


Enfabrica’s Approach to Solving IO Scaling Challenges in Accelerated Compute Clusters using Networking Silicon

Event: AI Field Day 5

Appearance: Enfabrica Presents at AI Field Day 5

Company: Enfabrica

Personnel: Rochan Sankar

Enfabrica, under the leadership of Rochan Sankar, has developed a novel solution to address the I/O scaling challenges in accelerated compute clusters by leveraging networking silicon. Their approach, termed the Accelerated Compute Fabric (ACF), refactors the traditional endpoint attachment to accelerators. Instead of using a single RDMA NIC for each accelerator, Enfabrica’s solution employs a fully connected I/O hub that integrates the functionalities of a PCI switch, an array of NICs, and a network switch into a single device. This ACF card connects to a scalable compute surface on one side and a scalable network surface on the other, facilitating high port density and efficient data movement.

The ACF architecture aims to eliminate inefficiencies in the current system where GPUs communicate through multiple layers of PCI switches and NICs to scale out. By collapsing these layers into a single, more efficient system, Enfabrica’s solution reduces the number of memory copies and improves burst bandwidth to GPUs, thereby enhancing overall compute efficiency. The ACF device supports both scale-up and scale-out interfaces, allowing it to handle memory reads and writes directly into memory spaces and communicate packets over long distances. This design is particularly beneficial for AI workloads, which require rapid and efficient data movement across large compute clusters.

Enfabrica’s ACF device is designed to be compatible with existing programming models and protocols, ensuring seamless integration into current data center architectures. The device supports standard PCIe and CXL interfaces, and its programmability allows for flexible transport and congestion control. By integrating multiple NICs and a crossbar switch within a single chip, the ACF device offers enhanced resiliency and load balancing capabilities. This innovative approach not only addresses the immediate scaling challenges faced by AI and accelerated computing workloads but also positions Enfabrica as a key player in the evolving landscape of data center architecture.


Accelerated Compute for AI from a Systems Perspective with Enfabrica

Event: AI Field Day 5

Appearance: Enfabrica Presents at AI Field Day 5

Company: Enfabrica

Personnel: Rochan Sankar

Enfabrica, led by founder and CEO Rochan Sankar, is pioneering a new category of networking silicon designed to support accelerated computing and AI at unprecedented scales. The company has developed the Accelerated Compute Fabric SuperNIC (ACFS), a product aimed at addressing the evolving needs of data centers as they increasingly incorporate GPUs and TPUs. Sankar highlights that traditional networking solutions are no longer sufficient due to the rapid increase in compute intensity, which has outpaced the growth of I/O and memory bandwidth. This imbalance creates significant challenges for building distributed, resilient, and scalable systems, necessitating a rethinking of system I/O architectures.

The ACFS, specifically designed for high-performance distributed AI and GPU server networking, represents a significant leap in networking capabilities. Enfabrica’s first chip, codenamed Millennium, achieves an unprecedented 8 terabits per second of bandwidth, compared to the current standard of 400 gigabits per second. This innovation addresses the critical issue of compute flops scaling faster than data movement capabilities, which has led to inefficiencies in model performance and hardware utilization. Sankar explains that the current system architectures, which were originally designed for traditional compute, are not optimized for the demands of modern AI workloads, leading to inefficiencies and bottlenecks.

Sankar also discusses the historical context of computing models, contrasting the tightly coupled, low-latency communication of supercomputers with the distributed, high-tolerance networking of hyperscale cloud systems. Modern AI and machine learning systems require a hybrid approach that combines the performance of supercomputers with the scalability and resilience of cloud infrastructure. However, current solutions involve disparate communication networks that do not effectively interoperate, leading to imbalanced bandwidth and inefficiencies. Enfabrica aims to address these challenges by creating a unified networking fabric that can support both tightly coupled and distributed computing models, thereby improving overall system efficiency and scalability.


AI Networking Visibility with Arista

Event: AI Field Day 5

Appearance: Arista Presents at AI Field Day 5

Company: Arista

Personnel: Tom Emmons

In the presentation at AI Field Day 5, Tom Emmons, the Software Engineering Lead for AI Networking at Arista Networks, discussed the challenges and solutions related to AI networking visibility. Traditional network monitoring strategies, which rely on interface counters and packet drops, are insufficient for AI networks due to the high-speed interactions that occur at microsecond and millisecond intervals. To address this, Arista has developed advanced telemetry tools to provide more granular insights into network performance. One such tool is the AI Analyzer, which captures traffic statistics at 100-microsecond intervals, allowing for a detailed view of network behavior that traditional second-scale counters miss. This tool helps identify issues like congestion and load balancing inefficiencies by providing a microsecond-level perspective on network traffic.
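
A toy calculation shows why the 100-microsecond view matters; the burst length and link behavior below are illustrative numbers, not AI Analyzer internals.

    # A 5 ms line-rate burst inside an otherwise idle second saturates
    # the link briefly, yet the one-second average looks nearly idle.
    # Values are illustrative.
    BIN_US = 100
    bins_per_sec = 1_000_000 // BIN_US           # 10,000 bins of 100 us
    burst_bins = 5_000 // BIN_US                 # 5 ms burst = 50 bins

    util = [1.0] * burst_bins + [0.0] * (bins_per_sec - burst_bins)

    print(f"1 s average utilization: {sum(util) / len(util):.1%}")  # 0.5%
    print(f"peak 100 us bin:         {max(util):.0%}")              # 100%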

Emmons also introduced the AI Agent, an extension of Arista’s EOS (Extensible Operating System) onto the servers, where it manages the NICs (Network Interface Cards). This feature allows for centralized management and monitoring of both the Top of Rack (TOR) switches and the NIC connections. The AI Agent facilitates auto-discovery and configuration synchronization between the switch and the NIC, ensuring consistent network settings across the entire infrastructure. This centralized approach helps prevent common issues such as mismatched configurations between network devices and servers, which can lead to suboptimal performance. The AI Agent’s ability to integrate with various NICs through specific plugins further enhances its versatility and applicability in diverse network environments.

Additionally, the AI Agent’s integration with Arista’s CloudVision software provides a unified management view that includes both network and server statistics. This comprehensive visibility enables network engineers to correlate network events with server-side issues, significantly improving the efficiency of network troubleshooting. By incorporating AI and machine learning techniques, Arista aims to identify real anomalies and correlate them with network events, thereby distinguishing between genuine issues and noise. This holistic approach to network visibility and debugging ensures that engineers can quickly and accurately diagnose and resolve performance problems, ultimately leading to more reliable and efficient AI network operations.


AI Network Challenges & Solutions with Arista

Event: AI Field Day 5

Appearance: Arista Presents at AI Field Day 5

Company: Arista

Personnel: Hugh Holbrook

Hugh Holbrook, Chief Development Officer at Arista, presented on the unique challenges and solutions associated with AI networking at AI Field Day 5. He began by highlighting the rapid growth of AI models and the increasing demands they place on network infrastructure. AI workloads, particularly those involving large-scale neural network training, require extensive computational resources and generate significant network traffic. This traffic is characterized by high bandwidth, burstiness, and synchronization, which can lead to congestion and inefficiencies if not properly managed. Holbrook emphasized that traditional data center networks are often ill-equipped to handle these demands, necessitating specialized solutions.

One of the primary challenges in AI networking is effective load balancing. Holbrook explained that AI servers typically generate fewer, but more intensive, data flows compared to traditional servers, making it difficult to evenly distribute traffic across the network. Arista has developed several solutions to address this issue, including congestion-aware placement of flows and RDMA-aware load balancing. These methods aim to ensure that traffic is evenly distributed across all available paths, thereby minimizing congestion and maximizing network utilization. Additionally, Arista has explored innovative architectures like the distributed Etherlink switch, which sprays packets across multiple paths to achieve even load distribution.
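
A quick simulation illustrates why a few large flows defeat plain hash-based ECMP; the flow tuples and eight-path fabric below are made up for illustration.

    # With only a handful of elephant flows, hash-based ECMP frequently
    # lands several flows on one path while others sit idle. The flows
    # and path count are made-up illustrations.
    import hashlib
    from collections import Counter

    PATHS = 8
    flows = [(f"10.0.0.{i}", f"10.0.1.{i % 4}", 4791) for i in range(8)]

    def ecmp_path(flow):
        digest = hashlib.sha256(repr(flow).encode()).digest()
        return digest[0] % PATHS

    load = Counter(ecmp_path(flow) for flow in flows)
    for path in range(PATHS):
        print(f"path {path}: {load.get(path, 0)} flow(s)")
    # Collisions (two or more flows on one path, others empty) are the
    # common case, hence congestion-aware placement and packet spraying.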

Holbrook also discussed the importance of visibility and congestion control in AI networks. Monitoring AI traffic is challenging due to its high speed and distributed nature, but Arista offers a suite of tools to provide deep insights into network performance. Congestion control mechanisms, such as priority flow control and ECN marking, are essential to prevent packet loss and ensure smooth operation. Holbrook highlighted the role of the Ultra Ethernet Consortium in advancing Ethernet technology to better support AI and HPC workloads. He concluded by affirming Ethernet’s suitability for AI networks and Arista’s commitment to providing robust, scalable solutions that cater to both small and large-scale deployments.


The AI Landscape and Arista’s Strategy for AI Networking

Event: AI Field Day 5

Appearance: Arista Presents at AI Field Day 5

Company: Arista

Personnel: Hardev Singh

Arista’s presentation at AI Field Day 5, led by Hardev Singh, General Manager of Cloud and AI, delved into the evolving AI landscape and Arista’s strategic approach to AI networking. Singh emphasized the critical need for high-quality network infrastructure to support AI workloads, which are becoming increasingly complex and demanding. He introduced Arista’s Etherlink AI Networking Platforms, highlighting their consistent network operating system (EOS) and management software (CloudVision), which provide seamless integration and high performance across various network environments. Singh also discussed the shift from traditional data centers to AI centers, where the network’s backend connects GPUs and the frontend integrates with traditional data center components, ensuring a cohesive and efficient AI infrastructure.

Singh highlighted the rapid advancements in network speeds and the increasing demand for high-speed ports driven by AI workloads. He noted that the transition from 25.6T to 51.2T ASICs has been the fastest in history, driven by the need to keep up with the performance of GPUs and other accelerators. Arista’s Etherlink AI portfolio includes a range of 800-gig products, from fixed and modular systems to the flagship AI spines, capable of supporting large-scale AI clusters. Singh emphasized the importance of load balancing and power efficiency in AI networks, noting that Arista’s solutions are designed to optimize these aspects, ensuring reliable and cost-effective performance.

The presentation also touched on the challenges of power consumption and the innovations in optics technology to address these issues. Singh discussed the transition to 800-gig and 1600-gig optics, highlighting the benefits of linear pluggable optics (LPO) in reducing power consumption and cost. He provided insights into the future of AI networking, including the potential for even higher-density racks and the need for advanced cooling solutions to manage the increased power and heat. Overall, Arista’s strategy focuses on providing robust, scalable, and efficient networking solutions to meet the growing demands of AI workloads, ensuring that their infrastructure can support the rapid advancements in AI technology.


Elasticsearch Vector Database RAG Demo

Event: AI Field Day 5

Appearance: Elastic Presents at AI Field Day 5

Company: Elastic

Personnel: Philipp Krenn

Philipp Krenn, Director of DevRel & Developer Community at Elastic, presented a detailed demonstration of the capabilities of Elasticsearch’s vector database, particularly focusing on retrieval-augmented generation (RAG). He explained how Elasticsearch optimizes memory usage by reducing the dimensionality of vector representations and employing techniques to enhance precision while minimizing memory consumption. This optimization is now automated in recent versions of Elasticsearch, allowing users to save significant memory without compromising on result quality. Krenn also highlighted the multi-stage approach to search, where a coarse search retrieves a larger set of documents, which are then re-ranked using more precise methods to deliver the most relevant results.
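
The coarse-then-precise pattern can be sketched in a few lines of Python; the toy vectors and crude int8 quantization are for illustration only and are not Elasticsearch’s internal implementation.

    # Two-stage search sketch: a cheap pass over compressed (here crudely
    # int8-quantized) vectors selects candidates, and a precise float
    # pass re-ranks only those. Toy data, not Elasticsearch internals.
    def quantize(v):
        return [max(-127, min(127, round(x * 127))) for x in v]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    docs = {"d1": [0.1, 0.9], "d2": [0.8, 0.2], "d3": [0.7, 0.3]}
    q = [0.75, 0.25]

    coarse = sorted(docs, key=lambda d: -dot(quantize(docs[d]), quantize(q)))
    candidates = coarse[:2]                     # wide, cheap first stage
    final = sorted(candidates, key=lambda d: -dot(docs[d], q))
    print(final[0])                             # precise second stage: d2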

Krenn emphasized the extensive ecosystem that Elasticsearch supports, including connectors for various data sources and integrations with major cloud providers. This makes it easier for developers to ingest data from different platforms and make it searchable within Elasticsearch. He also mentioned the integration with popular frameworks like LangChain and Llama Index, which are widely used in the generative AI space. These integrations facilitate the development of applications that leverage both Elasticsearch for data retrieval and large language models (LLMs) for generating responses, thereby enhancing the relevance and accuracy of the answers provided by the AI.

The presentation also included a live demo of the RAG capabilities, showcasing how users can connect an LLM, such as OpenAI’s GPT-4, to Elasticsearch and use it to answer queries based on their data. Krenn demonstrated the flexibility of the system, allowing users to customize prompts and refine their queries to get the desired results. He also discussed the cost-effectiveness of running such queries, noting that even with multiple requests, the expenses remain minimal. Additionally, Krenn touched on the importance of security features, such as anonymizing sensitive data before sending it to public LLMs, and the potential for using Elasticsearch in security and observability contexts to provide more intuitive and conversational ways of exploring data.
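
A minimal version of the RAG flow from the demo might look like the sketch below, assuming the official elasticsearch and openai Python clients; the index name, field names, and endpoint are assumptions for illustration.

    # Minimal RAG sketch: retrieve context from Elasticsearch, then ask
    # an LLM to answer from it. Index and field names are hypothetical.
    from elasticsearch import Elasticsearch
    from openai import OpenAI

    es = Elasticsearch("http://localhost:9200")
    llm = OpenAI()  # reads OPENAI_API_KEY from the environment

    question = "How do I configure snapshots?"
    hits = es.search(index="docs", size=3,
                     query={"match": {"body": question}})["hits"]["hits"]
    context = "\n\n".join(hit["_source"]["body"] for hit in hits)

    answer = llm.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context."},
            {"role": "user", "content": f"{context}\n\nQuestion: {question}"},
        ],
    )
    print(answer.choices[0].message.content)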


Elasticsearch Vector Database

Event: AI Field Day 5

Appearance: Elastic Presents at AI Field Day 5

Company: Elastic

Personnel: Philipp Krenn

Philipp Krenn from Elastic provided an in-depth presentation on the capabilities and evolution of Elasticsearch, particularly focusing on its vector database functionalities. He began by giving a brief history of Elasticsearch, which started as a distributed, open-source, RESTful search engine built on Apache Lucene. Initially designed to solve text lexical search problems, Elasticsearch has significantly evolved to include AI, generative AI, and vector search capabilities. Krenn emphasized the importance of combining various data types and formats to enhance search relevance, which traditional databases struggle to achieve. He illustrated this with an example of searching for bars, where factors like ratings, descriptions, and geolocation are combined to provide the most relevant results.

Krenn then delved into the technical aspects of vector search, explaining the Hierarchical Navigable Small World (HNSW) algorithm, which is used to approximate and speed up the search process by reducing the number of vector comparisons needed. He highlighted the importance of memory in vector search, as HNSW requires the data structure to fit into memory for optimal performance. Krenn also discussed the trade-offs between different algorithms and the importance of vector compression to reduce memory requirements. He explained how Elasticsearch supports dense vectors and has been improving its capabilities over the years, including adding HNSW for better performance and vector compression techniques.
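
To put numbers on the memory point: one rule of thumb from Elastic’s sizing guidance (worth checking against current documentation) estimates HNSW memory at roughly num_vectors * 4 * (dims + 12) bytes for float32 vectors, dropping to about a quarter of that once int8-quantized; the corpus size below is an assumption.

    # Rough HNSW sizing under the rule of thumb above. The 10M-vector
    # corpus and 384 dimensions are assumptions for illustration.
    def hnsw_bytes(n, dims, bytes_per_component=4):
        return n * bytes_per_component * (dims + 12)

    n, dims = 10_000_000, 384
    print(f"float32: {hnsw_bytes(n, dims) / 1e9:.1f} GB")     # ~15.8 GB
    print(f"int8:    {hnsw_bytes(n, dims, 1) / 1e9:.1f} GB")  # ~4.0 GB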

The presentation also covered the practical implementation of vector search in Elasticsearch. Krenn demonstrated how to create and manage vector representations using Elasticsearch’s APIs, including integrating models from Hugging Face and other sources. He explained the concept of hybrid search, which combines keyword and vector search to provide more accurate and relevant results. Krenn also touched on the importance of combining vector search with traditional filters and role-based access control to refine search results further. The session concluded with a live demo, showcasing how to set up and use vector search in Elasticsearch, highlighting its flexibility and power in handling complex search queries.
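
A sketch of such a hybrid query against Elasticsearch 8.x with the official Python client follows; the index name, field names, and 384-dimension embedding are assumptions for illustration.

    # Hybrid search sketch: a kNN clause over a dense_vector field
    # combined with a keyword match in one request. Index layout and
    # the zero query vector are illustrative assumptions.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    es.indices.create(index="bars", mappings={"properties": {
        "description": {"type": "text"},
        "embedding": {"type": "dense_vector", "dims": 384,
                      "index": True, "similarity": "cosine"},
    }})

    query_vector = [0.0] * 384  # stand-in for a real embedding

    resp = es.search(
        index="bars",
        knn={"field": "embedding", "query_vector": query_vector,
             "k": 10, "num_candidates": 100},
        query={"match": {"description": "rooftop cocktails"}},
    )
    print([hit["_id"] for hit in resp["hits"]["hits"]])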


The Precision of Search, The Intelligence of AI with Elasticsearch

Event: AI Field Day 5

Appearance: Elastic Presents at AI Field Day 5

Company: Elastic

Personnel: Kevin Murray

Kevin Murray, Group Vice President at Elastic, introduced the company’s evolution from its roots in search technology to its current role in AI-driven solutions. Elastic, founded in 2012, has grown to nearly 3,000 employees and operates in over 40 countries, serving more than half of the Fortune 500. The company’s core strength lies in its search capabilities, which have evolved from basic keyword search to more advanced, context-based search using vector databases and semantic search. Elastic’s search technology is now being applied to AI, particularly in areas like vector search and retrieval-augmented generation, which enhance the relevance and precision of search results. This combination of search precision and AI intelligence is central to Elastic’s offerings, and the company prides itself on its comprehensive approach to search excellence, including the use of embeddings and hybrid search techniques.

In addition to its core search capabilities, Elastic has expanded into the security and observability markets, offering out-of-the-box solutions that leverage its search technology. These solutions are designed to address the challenges of managing large volumes of data in these crowded and competitive fields. Elastic’s approach to security and observability is differentiated by its use of search to drive greater relevance and effectiveness. For example, in security, Elastic’s solutions focus on detection, diagnosis, and remediation, using machine learning-powered detection rules and anomaly detection to identify potential threats. The company recently introduced “attack discovery,” a feature that uses generative AI to recommend remediation steps tailored to an organization’s specific data and context, significantly reducing the time to resolve security incidents.

Elastic’s platform continues to evolve with new features like automatic data import and ES|QL, a query language designed to simplify data management and visualization. The company’s architecture is built around its core search capabilities, with shared services for visualization and automation, and solutions for security and observability layered on top. Elastic remains committed to innovation in search technology, working closely with its developer community and large enterprise customers to address scalability challenges and drive better outcomes. As the company continues to grow, it aims to maintain its leadership in both search and AI-driven solutions, helping organizations manage, visualize, and protect their data more effectively.


The Art of the Possible with VMware Private AI

Event: AI Field Day 5

Appearance: VMware Presents at AI Field Day 5

Company: VMware

Personnel: Ramesh Radhakrishnan

This session will discuss how VMware’s Private AI architectural approach enables the flexibility to run a range of GenAI solutions for your environment. We’ll explore how customers can achieve business value by running applications on a Private AI platform that offers unique advantages in privacy, compliance, and control. We will demo the “VMware Expert” app, built on VMware Private AI. Join us to learn how your organization can maximize its data strategy with this powerful platform.

In this session, Ramesh Radhakrishnan from VMware by Broadcom discusses the potential of VMware’s Private AI platform in enabling organizations to run generative AI (GenAI) solutions while maintaining control over privacy, compliance, and data management. He emphasizes that once customers have deployed VMware Private AI in their environments, the next challenge is demonstrating business value. The platform provides a flexible infrastructure that allows developers, data scientists, and software engineers to leverage GPUs for various AI applications. Radhakrishnan’s team, which includes infrastructure experts, software developers, and data scientists, has been working internally to build services on top of this platform, such as Jupyter notebooks and Visual Studio IDE environments, which allow users to access GPUs and AI capabilities for tasks like code completion and large language model (LLM) development.

One of the key services highlighted is the LLM service, which functions similarly to OpenAI but is designed for regulated industries that require strict control over data. This service allows organizations to run LLMs on their private infrastructure, ensuring that sensitive information is not exposed to third-party providers. Additionally, Radhakrishnan introduces the “VMware Expert” app, an internal tool that leverages AI to improve documentation search and provide expert Q&A capabilities. The app has evolved from a basic search tool using embedding models to a more advanced system that integrates retrieval-augmented generation (RAG) techniques, allowing users to interact with large language models that are fine-tuned with VMware-specific knowledge. This tool has shown significant improvements in search accuracy, with results being five to six times better than traditional keyword searches.

Radhakrishnan also discusses the challenges of ensuring that AI-generated answers are accurate and not prone to hallucination, a common issue when the LLM is not provided with the correct documents. To address this, VMware is exploring corrective RAG techniques and post-training methods to embed domain-specific knowledge directly into the models. This approach, which involves fine-tuning large language models on VMware’s internal documentation, has shown promising results and can be replicated by other organizations using VMware Private AI. The session concludes with a demonstration of the “VMware Expert” app and a discussion on how organizations can use VMware’s platform to build their own AI-driven solutions, maximizing the value of their data and infrastructure.


Getting Started with VMware Private AI Foundation with NVIDIA

Event: AI Field Day 5

Appearance: VMware Presents at AI Field Day 5

Company: VMware

Personnel: Chris Gully

In this session, we will take a practitioner’s point of view of Private AI, walking through the value of not trying to create do-it-yourself AI infrastructure, how to pick the right GPU for your organization, and how to deliver your AI use case.

In this presentation, Chris Gully from VMware by Broadcom discusses the challenges and solutions for organizations embarking on their AI journey, particularly focusing on the importance of not attempting to build AI infrastructure from scratch. He emphasizes that simply acquiring GPUs and installing them in servers is not enough to create a functional AI environment. There are numerous logistical considerations, such as power, airflow, and compatibility, that need to be addressed. Gully advocates for purchasing pre-validated and certified solutions to ensure a smoother experience, better support, and faster deployment of AI services. He also highlights the importance of selecting the right GPU for specific AI use cases, as different GPUs offer varying levels of performance and functionality.

Gully also delves into the complexities of GPU virtualization and the benefits of using technologies like vGPU and MIG (Multi-Instance GPU) to optimize resource utilization. These technologies allow organizations to slice and share GPU resources more effectively, ensuring that expensive hardware is used efficiently across multiple business units. He shares real-world examples of customers who faced challenges when their AI models did not fit the GPUs they had selected, underscoring the importance of understanding the technical requirements of AI workloads before making hardware decisions. Gully also discusses how VMware works closely with OEMs and NVIDIA to ensure that their solutions are fully certified and supported, providing customers with confidence that their AI infrastructure will work as expected.
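
A back-of-the-envelope fit check captures the model-versus-GPU sizing issue described here; the 2 bytes per parameter (fp16), the 1.2x overhead factor, and the sample GPU memory sizes are assumptions, not VMware or NVIDIA guidance.

    # Rule-of-thumb check that a model fits a GPU before buying hardware:
    # weights at 2 bytes/parameter (fp16) plus ~20% overhead for KV cache
    # and activations. All factors here are assumptions.
    def fit(params_billions, gpu_gb, bytes_per_param=2, overhead=1.2):
        need_gb = params_billions * bytes_per_param * overhead
        return need_gb, need_gb <= gpu_gb

    for gpu, mem in [("L4", 24), ("L40", 48), ("A100", 80), ("H100", 80)]:
        need, ok = fit(13, mem)  # a 13B-parameter model in fp16
        verdict = "fits" if ok else "does not fit"
        print(f"{gpu} ({mem} GB): needs ~{need:.0f} GB, {verdict}")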

The presentation further explores VMware’s Private AI Foundation, which integrates NVIDIA’s AI technologies into VMware’s Cloud Foundation (VCF) platform. This solution provides a streamlined, automated approach to deploying AI workloads, allowing organizations to quickly roll out AI use cases without the need for extensive manual configuration. Gully explains how VMware’s automation tools, such as the VCF Quick Start, enable rapid deployment of AI environments, reducing the time it takes to get AI models up and running. He also highlights the flexibility of the platform, which allows customers to customize their AI environments and add proprietary models to their catalogs. Overall, the session emphasizes the importance of simplifying AI infrastructure deployment and management to help organizations realize the value of AI more quickly and efficiently.


VMware Private AI Foundation with NVIDIA Technical Overview and Demo

Event: AI Field Day 5

Appearance: VMware Presents at AI Field Day 5

Company: VMware

Personnel: Justin Murray

This session will provide an update on VMware Private AI Foundation with NVIDIA, showcasing its evolution from preview to general availability. Key features and improvements made since the preview phase will be highlighted, giving delegates a clear understanding of what the product looks like in its fully realized state. The session will illustrate a day in the life of a GenAI application developer, cover the product’s capabilities for Retrieval Augmented Generation (RAG), and then walk through a demo.

The VMware Private AI Foundation with NVIDIA has evolved from its preview phase to general availability, with key updates in its architecture and features. One of the significant changes is the introduction of the NVIDIA Inference Microservice (NIM), replacing the Triton Inference Server, and the addition of the Retriever microservice, which retrieves data from a vector database in the Retrieval Augmented Generation (RAG) design. The session emphasizes the importance of RAG in enhancing large language models (LLMs) by integrating private company data stored in vector databases, which helps mitigate issues like hallucinations and lack of citation in LLMs. The demo showcases how VMware provisions the vector database and the chosen LLM, automating the process to streamline the workflow for data scientists and developers.

The presentation also highlights the challenges faced by data scientists, such as managing infrastructure and keeping up with the rapid pace of model and toolkit updates. VMware Cloud Foundation (VCF) addresses these challenges by providing a virtualized environment that allows for flexible GPU allocation and infrastructure management. The demo illustrates how data scientists can easily request AI workstations or Kubernetes clusters with pre-configured environments, reducing setup time from days to minutes. The automation tools provided by VMware simplify the deployment of deep learning VMs and Kubernetes clusters, allowing data scientists to focus on model development and testing rather than infrastructure concerns.

Additionally, the session touches on the importance of governance and lifecycle management in AI development. VMware offers tools to control and version models, containers, and infrastructure components, ensuring stability and compatibility across different environments. The demo also showcases how private data can be loaded into a vector database to enhance LLMs, and how Kubernetes clusters can be auto-scaled to handle varying workloads. The presentation concludes with a discussion on the frequency of updates to the stack, with VMware stabilizing on specific versions of NVIDIA components for six-month intervals, while allowing for custom upgrades if needed.


VMware Private AI Business Update

Event: AI Field Day 5

Appearance: VMware Presents at AI Field Day 5

Company: VMware

Personnel: Jake Augustine

This session will provide a business update on the state of VMware Private AI in the market. It focuses on advancements and announcements since VMware by Broadcom’s presentation at AI Field Day 4, including the key Enterprise AI challenges and the most common business use cases that have emerged. This session is followed by technical sessions and demos detailing the generally available version of VMware Private AI Foundation with NVIDIA, best practices for operationalizing VMware Private AI, and real-world application of VMware Private AI to deliver AI applications to users.

The VMware Private AI Business Update presented by Jake Augustine at AI Field Day 5 provided a comprehensive overview of VMware’s advancements in the AI space, particularly focusing on the VMware Private AI Foundation with NVIDIA. Since its general availability in July, the solution has been designed to simplify the deployment of AI workloads across enterprises, leveraging VMware Cloud Foundation (VCF) and NVIDIA AI Enterprise. The collaboration between VMware and NVIDIA allows enterprises to operationalize GPUs within their data centers, providing a familiar control plane for IT teams while enabling data scientists to accelerate AI initiatives. The solution supports NVIDIA-certified hardware, including GPUs like the A100, H100, and L40, and offers flexibility in storage options, with vSAN being recommended but not mandatory for all workloads.

One of the key challenges VMware aims to address is the growing complexity and sprawl of AI workloads within organizations. As AI adoption increases, particularly with the rise of generative AI and large language models, enterprises are struggling to scale these workloads efficiently. VMware’s platform-based approach provides a unified infrastructure that allows IT teams to manage AI workloads at scale, reducing the need for data scientists to focus on infrastructure management. This approach also helps stabilize the organic growth of AI projects within organizations, offering better visibility into resource utilization and cost planning. By virtualizing AI workloads, VMware enables enterprises to optimize GPU usage, reducing costs and improving operational efficiency.

The presentation also highlighted the importance of time-to-value for enterprises adopting AI. VMware’s solution has demonstrated significant improvements in deployment speed, with one financial services customer reducing the time to deploy a RAG (retrieval-augmented generation) application from weeks to just two days. Additionally, the platform’s ability to handle both inference and training workloads, while integrating with third-party models and tools, makes it a versatile solution for enterprises at different stages of AI adoption. Overall, VMware’s Private AI Foundation with NVIDIA is positioned as a scalable, secure, and cost-effective solution for enterprises looking to operationalize AI across their organizations.