Real World Deployments for AI at the Edge with Xsight Labs

Event: AI Infrastructure Field Day 4

Appearance: Xsight Labs Presents at AI Infrastructure Field Day

Company: Xsight Labs

Video Links:

Personnel: John C Carney, Ted Weatherford

This final technical section moves from theoretical architecture to practical use cases, spanning Warm Flash Storage to “Extreme Edge” networking satellites. It showcases industry-first milestones, such as the 800G DPU for virtualized hosting and SmartSwitch technology for NIC pooling. Each example demonstrates how the X and E series products solve specific bottlenecks in modern cloud compute and AI storage networks.

Xsight Labs, a nine-year-old fabless semiconductor company, focuses on real-world deployments for AI at the edge using its X series Ethernet switch and E series DPU. Their core philosophy centers on being software-defined, appealing to software engineers by offering performance comparable to fixed-function products while providing greater flexibility through an open instruction set architecture and Linux-based programming with tools such as DPDK or Open vSwitch. They target the edge market, believing it holds the highest volume, and have designed their single-die products for extreme power efficiency and high performance.

The company’s chips are deployed in diverse settings, from the “extreme edge” to terrestrial wireless infrastructure. A significant win is their integration into Starlink Gen 3 satellites, where multiple Ethernet switches per satellite are being launched at scale. This required Xsight Labs to deliver unparalleled programmability, power efficiency, and resilience against vibration, radiation, and extreme temperatures, crucial for a system that cannot be physically serviced. Similarly, their programmable Ethernet switches and DPUs are ideal for 5.5G or 6G terrestrial wireless infrastructure, addressing the complex, stateful packet-processing needs of antennas and associated processing units. These low-power, single-die solutions offer advantages in temperature range, cost, and operating expenses, including reduced carbon footprint.

Xsight Labs is also targeting the expanding AI market, particularly for inference, which is pushing computing out into half-rack, full-rack, and multi-row deployments. Their DPUs serve as front-end and scale-out back-end solutions for these systems, enabling very high-density general compute. Additionally, their Ethernet switches are used to cluster these AI systems, marking a departure from traditional Clos architectures by supporting local clustering topologies such as Dragonfly. For example, in AI training systems similar to Amazon’s ultra-servers, Xsight Labs’ products with 100G SerDes and 6.4T/12.8T switches can replicate or enhance existing topologies. The Starlink win underscores their capability to provide future-proof, high-performance, and power-efficient solutions essential for the most demanding and inaccessible environments.


The E-Series Delivering Cloud-on-a-Chip from Xsight Labs

Event: AI Infrastructure Field Day 4

Appearance: Xsight Labs Presents at AI Infrastructure Field Day

Company: Xsight Labs

Video Links:

Personnel: John C Carney, Ted Weatherford

The E-Series session explores the convergence of storage, networking, compute, and security into a single, cohesive silicon platform. This “Cloud-on-a-chip” approach is dissected through its architecture and programming model to show how it simplifies complex data center environments. The session also highlights the partnership with Hammerspace, demonstrating how E-Series silicon powers a global data environment. Xsight Labs presents its E-Series, a System-on-a-Chip (SoC) designed to deliver “Cloud-on-a-chip” capabilities by integrating essential cloud elements. This includes Ethernet connectivity, robust security features, virtualized storage, and powerful processing via 64 ARM Neoverse N2 cores. The E-Series chip, which has been generally available for about four months, is offered in various form factors, including a server, an add-in card, and a COM Express module, targeting applications ranging from embedded systems to full servers.

Xsight Labs differentiates its E-Series architecture from traditional DPUs, which typically evolve from a NIC with a constrained CPU cluster. The E-Series began with a server-class compute system featuring 64 ARM Neoverse N2 cores, specifically optimized and sized for data-plane applications. This allows all packets and PCIe transactions to be terminated and processed in software using standard programming models like Linux, DPDK, or SPDK, eliminating proprietary code. The chip integrates an E-unit for Ethernet connectivity, offering inline encryption and stateless offloads, and a P-unit for PCIe Gen 5, providing up to 40 lanes and 800G of bandwidth. This PCIe unit can software-emulate various devices (storage, networking, RDMA), offering immense flexibility. With a typical power consumption of 50-75W (up to 120W TDP) and a SPECint rating of 170, the E-Series offers significant compute power efficiently. Beyond network and memory encryption, the roadmap for the follow-on E2 product includes CXL support, targeting 1.6T bandwidth.

The E-Series supports a broad range of use cases, from front-end DPUs in public cloud and AI clusters (offloading the host, providing virtualization and isolation) to back-end DPUs in AI inference clusters for KV cache offload. It also extends to local storage, bump-in-the-wire network appliances for security and load balancing, smart switches for stateful processing, edge servers, and storage target appliances. Xsight Labs provides a comprehensive software development kit, ensuring compatibility with standard ARM server operating systems such as Ubuntu, as well as Linux and DPDK drivers. A key demonstration of the E-Series’ capability is its performance on the SONiC DASH “hero benchmark,” a highly intensive SDN workload. This test requires processing millions of routes, prefixes, and mappings, which largely depends on off-chip DRAM due to poor cache locality. The E1 exceeded the benchmark requirement of 12 million new connections per second with 120 million background connections, without packet drops, by almost 20%, while still retaining CPU capacity for control-plane operations, making it the only DPU to pass this test at 800G with a single device.
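As a back-of-the-envelope check, the reported headroom can be translated into an implied connection-setup rate. This assumes (not stated explicitly in the source) that the "almost 20%" margin applies directly to the 12-million-connections-per-second requirement:

```python
# Rough arithmetic on the DASH "hero benchmark" margin.
# Assumption (not from the source): the ~20% headroom applies directly
# to the new-connections-per-second requirement.
required_cps = 12_000_000            # benchmark requirement, new connections/sec
margin = 0.20                        # "almost 20%" headroom reported
achieved_cps = required_cps * (1 + margin)
print(f"Implied achieved rate: ~{achieved_cps / 1e6:.1f}M connections/sec")
```

With those assumptions, the E1 would be setting up roughly 14.4 million new connections per second while also servicing 120 million established flows.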


The X Series Architecting for High Performance Scale with Xsight Labs

Event: AI Infrastructure Field Day 4

Appearance: Xsight Labs Presents at AI Infrastructure Field Day

Company: Xsight Labs

Video Links:

Personnel: John C Carney, Ted Weatherford

This section provides a high-level overview of the product roadmap, specifically introducing the X-Series and E-Series lineups. It identifies the six critical chips required to build modern AI Factories and explains the concept of a “Truly Software Defined” stack that operates at full line-rate across layers L1-7. This serves as the technical foundation for the subsequent specialized deep dives.

The X-Series, an Ethernet switch, distinguishes itself through a “truly software-defined” programmable architecture, utilizing 3072 Harvard-architecture cores operating on a run-to-completion model, unlike competitors’ fixed pipelines. This provides unparalleled flexibility, enabling parallel packet operations, recursion, and extensive header processing, including 11 layers of MPLS and various encapsulations. This design is particularly well-suited for emerging AI-centric protocols such as Ultra Ethernet (UEC) and ESON, enabling customizable congestion management and efficient in-flight packet handling. The X-Series boasts significantly lower latency, achieving 450 nanoseconds compared to the typical 800 nanoseconds, and demonstrates exceptional buffer utilization, consistently above 86% even under heavy load.

The X-Series also stands out with its low power consumption, operating at under 200 watts for a 12.8T switch, which is described as disruptive. Its software-defined physical layer supports diverse SerDes speeds (10G to 200G) and modulation schemes, enabling mix-and-match configurations that facilitate connections between new and legacy interfaces. The programming model, though initially assembler-based with Python wrappers and libraries, has seen customers such as Oxide develop P4 compilers, with Xsight Labs planning to develop their own. This powerful, flexible, and low-power solution is specifically designed for edge deployments, including half-rack to two-rack configurations, satellites, and base stations, delivering significant reductions in power, rack space, and cost. The X-Series product was generally available in November 2022 and has been in mass production since the summer of 2023.
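To put the power claim in perspective, it can be normalized to watts per terabit. The X-Series figure comes from the text above; the 400W competitor number is an illustrative mid-point (the sessions elsewhere cite a 300-600W range for comparable fixed-function switches), not a sourced measurement:

```python
# Normalize switch power draw to watts per Tbps of switching capacity.
# xsight_w is the "under 200 watts" figure from the text; competitor_w is an
# illustrative mid-point of the 300-600W range cited in these sessions.
bandwidth_tbps = 12.8
xsight_w = 200
competitor_w = 400
print(f"X-Series:   {xsight_w / bandwidth_tbps:.1f} W/Tbps")
print(f"Competitor: {competitor_w / bandwidth_tbps:.1f} W/Tbps")
```

On these assumptions the X-Series lands around 15.6 W/Tbps versus roughly twice that for the illustrative competitor, which is the kind of gap that matters in power-constrained edge and satellite deployments.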


Redefining Infrastructure Philosophy – the Xsight Labs Vision

Event: AI Infrastructure Field Day 4

Appearance: Xsight Labs Presents at AI Infrastructure Field Day

Company: Xsight Labs

Video Links:

Personnel: Ted Weatherford

This introductory session establishes the company’s core identity and its unique approach to the semiconductor market. It explores a product philosophy built on the pillars of extreme scalability, open architecture, and vertical integration to reduce Total Cost of Ownership (TCO). By the end of this section, the audience will understand how the company’s commitment to agility and simplicity drives its engineering decisions. Xsight Labs, a fabless semiconductor company founded in 2017, designs and sells chips manufactured through TSMC. Led by serial entrepreneurs and backed by $440 million in top-tier VC funding, the company employs over 200 engineers globally. Their unique approach aims to democratize the semiconductor space by providing open, programmable, and vertically integrated solutions for the rapidly evolving AI and data center infrastructure markets.

Xsight Labs focuses on two critical components of the “AI factory” or “token machine”: an Ethernet switch chip (X-series) and a Data Processing Unit (DPU) or infrastructure processor (E-series). These products are developed on 5nm technology, are generally available, and the X-series is already in mass production. The company emphasizes a “software-defined infrastructure” philosophy, claiming to be the first chip company to offer wire-speed, energy-efficient, and truly programmable products without compromising performance, price, or power. This agility is crucial given the unpredictable nature of future AI applications, and their open instruction sets and collateral allow for community contributions and custom compilers, accelerating innovation.

The E-series DPU, specifically the E1 800G product, is designed from an ARM server perspective rather than a traditional network interface card, offering 64-core ARM chips with derivatives to optimize for various power and performance needs. The upcoming E1L will be a low-power version targeting control plane markets and programmable SmartNICs. The X-series Ethernet switch, with its X2 12.8 terabit monolithic die, stands out for its exceptionally low power consumption (180W compared to competitors’ 300-600W) while maintaining high performance, low latency, and full programmability from Layer 1 to Layer 4 with embedded memory switches. The future X3 will further expand bandwidth and radix through clever die combining, reinforcing Xsight Labs’ commitment to innovative, power-efficient, and highly flexible infrastructure solutions.


A Leap Forward in Storage Efficiency with the OFP Initiative and Hammerspace

Event: AI Infrastructure Field Day 4

Appearance: Hammerspace Presents at AI Infrastructure Field Day

Company: Hammerspace

Video Links:

Personnel: Kurt Kuckein, Ted Weatherford

Hammerspace is driving the Open Flash Platform (OFP) Initiative, an effort to significantly reduce the complexity and cost associated with large-scale flash storage for AI and other demanding workloads. This presentation introduced a reference design for a high-density, low-power flash storage solution that achieves unprecedented capacity and efficiency within data centers. The goal is to deliver one exabyte of storage in a single rack, enabling a new paradigm of “disappearing storage” in which compact 1U systems are distributed throughout a data center, leveraging otherwise unused rack space and minimal power consumption.
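The exabyte-per-rack goal can be sanity-checked with simple division. The assumptions here (not stated in the source) are a standard 42U rack fully populated with 1U storage systems and decimal units (1 EB = 1000 PB):

```python
# What does 1 EB per rack imply per 1U system?
# Assumptions (not from the source): a standard 42U rack, fully populated
# with 1U storage systems, decimal units (1 EB = 1000 PB).
rack_units = 42
exabyte_pb = 1000                  # 1 EB expressed in petabytes
pb_per_1u = exabyte_pb / rack_units
print(f"~{pb_per_1u:.1f} PB per 1U system")
```

That works out to roughly 24 PB in each 1U chassis, which illustrates why new high-density flash form factors (rather than conventional drive bays) are central to the OFP design.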

The development process involved several design iterations, shifting from a challenging 2U form factor to a more efficient 1U design. This shift addressed issues such as chassis deformation, power/cooling inefficiencies, and wasted space, requiring extensive thermal and pressure analyses to ensure reliable operation in a tightly packed environment. A significant breakthrough was selecting the Xsight DPU, which delivers robust compute capabilities comparable to an x86 server from a few years ago, in a highly power-efficient package that supports Linux and storage services within this compact design. Ted Weatherford highlighted the Xsight E1 chip as the world’s first 800-Gig DPU, featuring 64 Neoverse cores, a programmable NIC, and an “all fast path design” that eliminates data bottlenecks, achieving 800-Gig line rates, as independently verified by KeySite.

Looking ahead, Hammerspace and its partners are actively exploring new flash form factors to overcome current E2 limitations and achieve the one exabyte-per-rack goal. The OFP Initiative aims to standardize within the Open Compute Project (OCP) to ensure broad industry adoption and benefits. The versatility of the Xsight chip enables applications beyond shared file storage, including block storage and a homogeneous boot device for hyperscalers, streamlining qualification and management across diverse server infrastructures. The project is currently in prototyping and validation, with early-access customers receiving units this quarter and general availability targeted for the second half of the year, while continually recruiting more industry participants to drive this standard forward.


Unifying AI Enterprise Data into a Single Instantly Accessible Global Namespace with Hammerspace

Event: AI Infrastructure Field Day 4

Appearance: Hammerspace Presents at AI Infrastructure Field Day

Company: Hammerspace

Video Links:

Personnel: Kurt Kuckein, Sam Newnam

Hammerspace introduced its AI Data Platform solution to address the pervasive challenge of data fragmentation, a significant inhibitor to AI readiness. The presentation highlighted the complexity of AI tooling and the substantial capital outlay required, leading to enterprise fears of missing out (FOMO) and messing up (FOMU) on AI initiatives. Their solution aims to simplify these challenges by integrating seamlessly with NVIDIA’s reference designs to deliver a comprehensive, outcome-driven platform rather than a complex toolkit of disparate components.

Hammerspace’s AI Data Platform combines its unique global namespace and Tier Zero capabilities with NVIDIA software, including RAG Blueprints and RTX 6000 Pro, and is often deployed on standard servers such as Cisco C210s. This platform allows enterprises to connect to existing hybrid data through assimilation, whether full or read-only, making vast amounts of legacy data instantly accessible without costly and time-consuming migrations. The core mechanism involves discovering new files and automatically moving them to Tier Zero, a high-performance NVMe flash layer within the servers, for intensive processing such as extraction, embedding, and indexing. This heavy lifting is performed without burdening existing storage systems, with Hammerspace managing the entire process from data ingestion and validation to cleanup, ensuring AI-ready data is available in minutes. The software-defined nature enables flexibility across various hardware platforms and cloud environments, while leveraging protocols such as pNFS and NFS-direct to optimize GPU utilization.
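The discover-promote-process-cleanup flow described above can be sketched in a few lines. This is a minimal illustration of the pattern, not Hammerspace's actual API; every class and method name here is hypothetical:

```python
# Minimal sketch of the ingest flow described above: assimilate metadata,
# promote new files to a fast local tier, process, then clean up.
# All names are illustrative, NOT Hammerspace APIs.
from dataclasses import dataclass, field

@dataclass
class Namespace:
    files: dict = field(default_factory=dict)   # path -> tier label

    def assimilate(self, path: str):
        # Metadata-only ingest: the file becomes visible without data copy.
        self.files.setdefault(path, "legacy")

    def promote_to_tier0(self, path: str):
        self.files[path] = "tier0"              # data lands on local NVMe

    def demote(self, path: str):
        self.files[path] = "legacy"             # cleanup after processing

def prepare_for_ai(ns: Namespace, new_paths):
    """Discover, promote, 'process' (extract/embed/index), then demote."""
    for p in new_paths:
        ns.assimilate(p)
        ns.promote_to_tier0(p)
        # ... extraction / embedding / indexing would run here ...
        ns.demote(p)

ns = Namespace()
prepare_for_ai(ns, ["/data/report.pdf", "/data/logs.csv"])
print(ns.files)
```

The key design point the sketch captures is that the existing storage system only ever serves the initial read; the heavy extraction and embedding work happens against the fast local tier, and files return to their original placement afterward.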

The ultimate goal of Hammerspace’s AI Data Platform is to accelerate time-to-value by eliminating data gravity and GPU gravity. By shifting to a data-first strategy, the platform integrates data categorization and tagging, embedding security and performance characteristics directly into the data’s metadata. This enables automated, intelligent decisions about data placement and processing, replacing manual, script-driven workflows with an intuitive agentic system. This approach allows organizations to leverage their existing capital investments, transforming fragmented enterprise data into a unified, instantly accessible global namespace for AI applications within weeks, effectively creating an AI factory that starts where they are.


Taming Data Estate Chaos for AI with Hammerspace

Event: AI Infrastructure Field Day 4

Appearance: Hammerspace Presents at AI Infrastructure Field Day

Company: Hammerspace

Video Links:

Personnel: Kurt Kuckein, Sam Newnam

Hammerspace introduces itself as a “data company,” distinguishing itself from traditional storage vendors by offering a solution that addresses the complex data demands of modern infrastructure, particularly for AI workloads. The core concept behind Hammerspace is an instantly accessible, infinite virtual space that disaggregates data from its underlying infrastructure, enabling it to reside in any location, across any cloud, and on any storage backend, thereby eliminating data silos. This is achieved by assimilating metadata from existing storage systems into a single, global namespace, managed by metadata servers outside the data path. This approach not only accelerates data pipelines but also enhances existing infrastructure and enables rapid, easy integration of new technologies, providing users with visibility and access to all their data within minutes, rather than requiring lengthy, costly data migrations.

Hammerspace extends its capabilities to address critical challenges in AI infrastructure, including the current tight market and rising flash memory costs. The solution leverages underutilized flash storage within existing environments by aggregating systems and intelligently orchestrating data placement across tiers. It introduces “Tier Zero,” which consumes and aggregates local flash within compute (CPU and GPU) clusters into the global namespace, providing extremely high-performance storage by eliminating network latency. Hammerspace also treats cloud storage as a direct extension of on-premises infrastructure, not just a destination for data, thereby maximizing the use of available flash resources. The software-defined platform ensures data portability and access through a parallel file system (pNFS, NFSv4.2) and multi-protocol access (S3, NFS, SMB). Importantly, its policy-driven orchestration automates data movement and ensures data durability and availability through redundant metadata nodes and erasure coding across storage systems. It also centralizes privileged access and security policies, allowing permissions to follow data regardless of its physical location, critical for cross-border data compliance and auditability, and supports rich custom metadata beyond basic POSIX attributes.

Customer examples illustrate these benefits, such as a digital payments company that reduced storage costs by $5 million and simplified workflows for 3,000 data scientists by providing parallel file system access over object storage and enabling hybrid cloud agility. Another customer, facing a 3-4x increase in performance demand from new NVIDIA servers, leveraged Hammerspace to maintain existing NAS systems while deploying high-performance NVMe storage, avoiding significant new infrastructure investments. For inference workloads where latency is critical, Hammerspace can use policies to preload entire projects into local NVMe (Tier Zero) directly connected to GPUs, maintaining high performance and data consistency across globally distributed inference farms. Ultimately, through its integration with platforms like the NVIDIA AI Data Platform, Hammerspace goes beyond merely unifying data access; it truly unlocks the value within data by automating data preparation and orchestration, moving organizations from data chaos to a state of AI-ready data, often allowing interaction with the system via natural language for streamlined management.


Fabrix.ai Demo – Building Agentic AI at scale for Production

Event: AI Infrastructure Field Day 4

Appearance: Fabrix.ai Presents at AI Infrastructure Field Day

Company: Fabrix.ai

Video Links:

Personnel: Rached Blili

Fabrix.ai is building agentic AI at scale for production, moving beyond proofs of concept to deliver robust solutions. In the video from the Fabrix.AI channel, Rached Blili demonstrated the Fabrix.ai platform, highlighting its agent catalog, where users can access and manage a variety of agents, both developed by Fabrix.ai and custom-built. The platform offers an AI Storyboard dashboard that provides a comprehensive view of AI operations, enabling agents to be organized into projects with distinct permissions and toolsets. A significant emphasis is placed on observability, including detailed AI cost tracking at both global and project levels, and visibility into individual “conversations” or agentic sessions. Uniquely, Fabrix.ai provides performance evaluation for agents, treating them as digital workers by monitoring their performance over time, identifying top and underperforming agents, and suggesting specific fixes, such as modifying system prompts, to continuously improve their efficacy.

The demonstration showcases two types of agents: autonomous and interactive. Autonomous agents operate in the background, triggered by events, alerts, or schedules, as exemplified by a Network Root Cause Analysis agent. This agent automatically diagnoses network failures, such as router configuration errors, by analyzing logs, incident data, and router configurations. It generates comprehensive reports detailing the root cause, impact assessment, and multiple remediation plans, which a remediation agent can then use for automated implementation and verification. For interactive use, Fabrix.ai’s copilot, Fabio, enables users to converse directly with agents to manage complex tasks, such as verifying VPNs or configuring NetFlow in a lab network, significantly reducing manual intervention and saving time.

Delving into the underlying architecture, the presentation revealed that complex problems are tackled using multi-agent complexes, where an orchestrator agent calls specialized sub-agents, each handling a specific part of the problem with a sequestered context. This approach enhances individual agents’ capabilities while enabling detailed cost management, tracking token usage, time, and expenses, and capturing individual agent contributions within a hierarchical structure. A detailed example illustrated an application root-cause analysis in which the orchestrator agent systematically investigated incident details, application dependency maps, and even interpreted plain-English change requests from a ticketing system. The platform’s advanced context and tooling engines are critical to operating at scale, enabling mass operations across numerous devices in parallel and efficiently processing vast tool outputs by storing them in a context cache for later retrieval and analysis, ensuring effective, secure, and reliable agent deployment.
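The orchestrator/sub-agent structure described above can be sketched as follows. This is an illustrative pattern, not Fabrix.ai's implementation: each sub-agent keeps a sequestered context, and the orchestrator sees only summaries plus per-agent cost accounting (the token-cost model here is a toy):

```python
# Sketch of an orchestrator calling specialized sub-agents, each with its
# own sequestered context. Names and the cost model are illustrative.
from dataclasses import dataclass, field

@dataclass
class SubAgent:
    name: str
    context: list = field(default_factory=list)   # sequestered, per-agent

    def run(self, task: str) -> dict:
        self.context.append(task)                 # only this agent sees it
        tokens = len(task.split()) * 10           # toy token-cost model
        return {"agent": self.name,
                "summary": f"{self.name} done: {task}",
                "tokens": tokens}

def orchestrate(task: str, agents: list) -> dict:
    # The orchestrator fans the task out and aggregates summaries + costs;
    # raw sub-agent context never enters the orchestrator's own context.
    results = [a.run(f"{task} [{a.name}]") for a in agents]
    return {"summaries": [r["summary"] for r in results],
            "total_tokens": sum(r["tokens"] for r in results)}

report = orchestrate("diagnose app latency",
                     [SubAgent("logs"), SubAgent("deps"), SubAgent("changes")])
print(report["total_tokens"])
```

The hierarchical cost tracking the presentation describes falls out naturally from this shape: each sub-agent reports its own token usage, and the orchestrator sums them per session.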


Crossing the Production Gap to Agentic AI with Fabrix.ai

Event: AI Infrastructure Field Day 4

Appearance: Fabrix.ai Presents at AI Infrastructure Field Day

Company: Fabrix.ai

Video Links:

Personnel: Rached Blili

Fabrix.ai highlights the critical challenges in deploying agentic AI from prototype to production within large enterprises. Rached Blili noted that while agents are quick to prototype, they frequently fail in real-world environments due to dynamic variables. These failures typically stem from issues in context management, such as handling large tool responses and maintaining “context purity,” as well as from operational challenges related to observability and infrastructure, including security and user rights. To overcome these hurdles, Fabrix.ai proposes three core principles: moving as much of the problem as possible to the tooling layer, rigorously curating the context fed to the Large Language Model (LLM), and implementing comprehensive operational controls that monitor for business outcomes rather than just technical errors.

Fabrix.ai’s solution is a middleware built on a “trifabric platform” comprising data, automation, and AI fabrics. This middleware features two primary functional components: the Context Engine and the Tooling and Connectivity Engine. The Context Engine focuses on delivering pure, relevant information to the LLM through intelligent caching of large datasets (making them addressable and providing profiles such as histograms) and sophisticated conversation compaction that tailors summaries to the current user goal, preserving critical information better than traditional summarization. The Tooling and Connectivity Engine serves as an abstraction layer that integrates various enterprise tools, including existing MCP servers and non-MCP tools. It allows tools to exchange data directly, bypassing the LLM and preventing token waste. This engine uses a low-code, YAML-based approach for tool definition and dynamic data discovery to automatically generate robust, specific tools for common enterprise workflows, thereby reducing the LLM’s burden and improving reliability.
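The context-caching idea (large tool outputs made addressable, with only a compact profile such as a histogram handed to the LLM) can be sketched like this. All names are illustrative, not Fabrix.ai's API:

```python
# Sketch of a context cache: large tool outputs are stored out-of-band,
# and the model receives only an addressable handle plus a small profile
# (here, a status histogram). Names are illustrative, not a real API.
from collections import Counter

class ContextCache:
    def __init__(self):
        self._store = {}

    def put(self, rows):
        handle = f"cache:{len(self._store)}"
        self._store[handle] = rows
        profile = Counter(r["status"] for r in rows)   # compact histogram
        return handle, dict(profile)

    def get(self, handle):
        return self._store[handle]                     # full data on demand

cache = ContextCache()
rows = [{"status": "ok"}] * 980 + [{"status": "error"}] * 20
handle, profile = cache.put(rows)
print(handle, profile)   # the model sees this line's worth, not 1000 rows
```

The profile gives the model enough signal to decide its next step (e.g., drill into the 20 errors), and the handle lets a later tool call fetch or filter the full dataset without the rows ever transiting the model's context window.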

Beyond these core components, Fabrix.ai emphasizes advanced operational capabilities. Their platform incorporates qualitative analysis of agentic sessions, generating reports, identifying themes, and suggesting optimizations to improve agent performance over time, effectively placing agents on a “performance improvement plan” (PIP). This outcome-based evaluation contrasts with traditional metrics like token count or latency. Case studies demonstrated Fabrix.ai’s ability to handle queries across vast numbers of large documents, outperforming human teams in efficiency and consistency, and to correlate information across numerous heterogeneous systems without requiring a data lake, thanks to dynamic data discovery. The platform also includes essential spend management and cost controls, recognizing the risk that agents may incur high operational costs if not properly managed.


Build Reliable, Secure, and Performant Agents using Fabrix.AI AgentOps Platform

Event: AI Infrastructure Field Day 4

Appearance: Fabrix.ai Presents at AI Infrastructure Field Day

Company: Fabrix.ai

Video Links:

Personnel: Shailesh Manjrekar

Fabrix.AI addresses the evolving AI operations landscape with an AgentOps platform that builds reliable, secure, and high-performance agents. The company, formerly CloudFabrix, rebranded as Fabrix.AI in response to customer demand for agentic functionality, moving beyond traditional AIOps, which relies on manual remediation after correlation and root-cause analysis. This shift was motivated by real-world challenges, such as an 8-hour telco outage caused by inadvertent access control list changes, highlighting the need for autonomous or semi-autonomous remediation workflows powered by Large Language Models (LLMs). However, this transition introduces new complexities, including the non-deterministic nature of LLMs, context and data management at scale, and the challenge of connecting to diverse data sources, which can lead to issues such as hallucination and an “agentic value gap,” where experimental demos rarely translate to enterprise value.

Fabrix.AI’s solution centers on proprietary middleware that serves as a critical intermediary between AI agents/LLMs and various data sources. This middleware comprises two main components: the Context Engine and Universal Tooling. The Context Engine ensures “purity of context” by providing only curated, summarized data to the LLM, thereby preventing context corruption and reducing hallucination, while also maintaining state across interactions. The Universal Tooling dynamically connects to over 1,700 disparate data sources, including MCP-enabled endpoints, API-based systems, and raw or legacy data, by creating necessary wrappers and normalizing data schemas for LLM understanding, and can even dynamically generate tools by scraping public APIs. This approach allows the platform to integrate seamlessly with existing IT environments, offering a full-stack solution from data acquisition to automation.

The platform is purpose-built for real-time data environments, differentiating it from generic agentic frameworks that may not meet these requirements. It offers a “co-pilot” for conversational queries and an “Agent Studio” for building custom agents, supplementing its library of 50 out-of-the-box agents across AIOps, Observability, SecOps, and BizOps. Fabrix.AI emphasizes operationalizing agents through its AgentOps model, which incorporates trust via prompt templates and dynamic instructions, governance through FinOps models, security via a “least agency” principle, and comprehensive observability at the agentic layer with audit trails and real-time flow maps. By consolidating tools, reducing Mean Time to Resolution (MTTR) and alert noise, and enabling faster deployments, Fabrix.AI positions itself as a robust, enterprise-grade platform that complements and enhances existing observability and ITOM tools.


Resilient Wireless Networks for AI with Cisco Enterprise Networking

Event: AI Infrastructure Field Day 4

Appearance: Cisco Enterprise Networking Presents at AI Infrastructure Field Day

Company: Cisco

Video Links:

Personnel: Minse Kim

Minse Kim, Cisco’s wireless product manager, emphasized that the AI era is profoundly changing enterprise networking, extending beyond data centers to encompass “physical AI” applications in factories, medical facilities, and dynamic workspaces. He noted that surging demand for AI infrastructure components is also influencing customer buying cycles, with some customers proactively investing in Wi-Fi 7 now. A key insight is that while AI infrastructure is often perceived as data center-centric, the actual consumption and training of AI models, particularly for robotics and autonomous systems, relies heavily on high-performance, low-latency wireless connectivity, making Wi-Fi 6, 6E, and 7 crucial “last mile” technologies. Cisco’s Wi-Fi 7 access points are designed to meet these demands, offering multi-gigabit speeds and backhaul capabilities up to 20 Gbps per AP.

Addressing Wi-Fi’s traditional reliability-versus-speed trade-off, Cisco has developed Ultra-Reliable Wireless Backhaul (URWB) capabilities integrated into its Wi-Fi 7 APs. By dedicating a radio, URWB provides a stable, predictable, and low-latency “wired-like” connection, which is essential for critical applications like robotics that cannot tolerate the blips and jitters common in traditional Wi-Fi during client roaming. Beyond connectivity, Cisco Wi-Fi 7 APs also enhance spatial awareness and location services. Leveraging technologies such as 802.11mc (FTM) and Ultra-Wideband (UWB) with sensor fusion, these APs deliver sub-meter (e.g., one-foot) location accuracy and low latency, resolving long-standing problems in asset tracking and network operations, as demonstrated by real-time asset tracking in an office environment. This ability to accurately digitize the physical world is fundamental for AI analytics.

Furthermore, Cisco is integrating AI into network operations to simplify management and optimize performance. For instance, AI models leverage telemetry data from 35 million Cisco APs globally to intelligently manage firmware upgrades, learning from customer rollback decisions to improve future deployments. AI also enhances Radio Resource Management (RRM) by moving beyond simple rule-based engines to intelligently optimize RF configurations, leveraging historical interference patterns and dynamically adapting to environmental changes to maximize network efficiency and stability. Cisco is even introducing the concept of APs acting as “synthetic clients” to proactively collect network statistics and provide informed recommendations. This comprehensive AI-powered approach, delivering ultra-reliable, high-speed wireless, precise spatial awareness, and intelligent network automation, is not a future vision but a current reality, with thousands of customers already using Cisco’s AI-powered network solutions.


Smarter Switching for AI with Cisco Enterprise Networking

Event: AI Infrastructure Field Day 4

Appearance: Cisco Enterprise Networking Presents at AI Infrastructure Field Day

Company: Cisco

Video Links:

Personnel: Kenny Lei

The foundational goal of campus switching (providing connectivity to users and endpoints) remains unchanged, but the ecosystem it serves is undergoing rapid transformation driven by evolving applications and devices. Kenny Lei, a Technical Marketing Engineer at Cisco, highlighted the pervasive influence of AI tools like ChatGPT and GitHub Copilot, the surging adoption of Wi-Fi 7 for its increased bandwidth and user density, and the emerging security challenges posed by quantum computing. These trends necessitate a campus network capable of handling dramatically increased, often symmetric, data traffic, with higher performance, lower latency, and robust security.

To address these demands, Cisco has introduced its new “Smart Switch” series, featuring the Catalyst 9350 for access layers and the Catalyst 9610 for aggregation. The Catalyst 9350 offers high Power over Ethernet (90W) and 10Gbps copper ports, complemented by multiple 100Gbps uplinks, significantly reducing oversubscription and ensuring optimal performance for latency-sensitive AI applications. The modular Catalyst 9610, with up to 25 Tbps of switching capacity and support for hundreds of 100Gbps ports (with future 400Gbps capabilities), serves as a high-capacity core. Both platforms are powered by Cisco Silicon One A6 ASICs, which use a virtual output queuing (VOQ) architecture to prevent head-of-line blocking and support up to seven queues for granular traffic prioritization. This intelligent design, coupled with a hybrid buffer memory system, ensures that latency-sensitive traffic is processed swiftly while bulk data transfers avoid packet drops even under congestion.

Cisco emphasizes that security is embedded in the network fabric, featuring Trust Anchor Modules (TAMs) for hardware and software integrity, IPsec/MACsec for secure transport, and a zero-trust model powered by Security Group Tags (SGTs) and the Identity Services Engine (ISE) for continuous authentication and policy enforcement. The new switches also enhance visibility and policy management through HCAM (a combination of TCAM and SRAM), enabling efficient NetFlow and ACL processing while significantly reducing resource consumption. Furthermore, the enhanced CPU and memory on these smart switches allow for hosting AI workloads closer to the edge, fostering distributed intelligence and faster processing. Operational efficiency is boosted by innovations such as the eXpress Forwarding Software Upgrade (XFSU), which minimizes outage time during updates by separating the control and data planes and offloading critical processes. Cisco also integrates AI into network operations through an AI Assistant in the Meraki dashboard, streamlining day-zero, day-one, and day-N tasks from inventory management and troubleshooting to compliance checks, ensuring a high-performance, secure, and quantum-ready network infrastructure for the AI era.


Secure Routing for AI with Cisco Enterprise Networking

Event: AI Infrastructure Field Day 4

Appearance: Cisco Enterprise Networking Presents at AI Infrastructure Field Day

Company: Cisco

Video Links:

Personnel: Rahul Sagi

Secure Routing with Cisco Enterprise Networking tackles the increasing complexity, user experience demands, and security requirements of modern WAN networks, especially with the advent of AI branches. Rahul Sagi introduced Cisco Secure Routers, launching in 2025, designed to converge Cisco’s best-in-class networking with advanced security in a single product. This convergence is enabled by a new Secure Networking Processor (SNP) that delivers the high throughput and capacity essential for future AI applications. These routers offer comprehensive on-box security capabilities, including a full stack of hybrid mesh firewalls with IPS/IDS, URL filtering, and AMP Threat Grid, while also supporting cloud security options for direct Internet access (DIA) use cases.

The Secure Networking Processor, an ARM-based chip with Cisco IP, is central to these innovations, enabling inline cryptographic acceleration and a natively integrated Next-Generation Firewall (NGFW) stack for superior performance. Cisco highlights significant improvements, including up to three times the IPsec performance and high security efficacy, with threat protection throughput reaching up to 11 Gbps even with all security features enabled. Addressing the impending threat of quantum computing, the new portfolio integrates Post-Quantum Cryptography (PQC) algorithms, specifically ML-KEM for key exchanges in WAN transport (IPsec and MACsec) and quantum-resistant secure boot, ensuring networks are future-proofed against quantum attacks by 2030, a critical concern for sectors like public, healthcare, retail, and finance. The secure routers also boast improved power efficiency and increased WAN interface capacities, supporting up to 100 Gbps, to handle the escalating I/O demands of AI-driven environments. Furthermore, some platforms include a dedicated AI/ML engine for local inferencing to enhance network performance in future software releases, and native zero-trust principles are embedded throughout the system.
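The general idea behind hybrid post-quantum key establishment is that the session key is derived from both a classical shared secret and an ML-KEM shared secret, so an attacker must break both. The sketch below only illustrates that combiner principle: the shared secrets are placeholder bytes, the SHA-256 combiner stands in for a proper KDF such as HKDF, and none of this reflects Cisco's actual IPsec design; a real deployment would use a FIPS 203 ML-KEM implementation.

```python
# Conceptual sketch of a hybrid (classical + post-quantum) key combiner.
# Secrets are placeholder bytes and SHA-256 stands in for a real KDF;
# this is not Cisco's implementation.
import hashlib

def hybrid_session_key(ecdh_secret: bytes, mlkem_secret: bytes) -> bytes:
    # Bind both secrets (and a context label) into one session key.
    return hashlib.sha256(b"hybrid-v1" + ecdh_secret + mlkem_secret).digest()

classical = bytes(32)            # placeholder ECDH shared secret
post_quantum = bytes([1]) * 32   # placeholder ML-KEM shared secret

key = hybrid_session_key(classical, post_quantum)
assert key != hybrid_session_key(classical, bytes(32))  # the PQ secret matters
print(len(key))  # 32-byte session key
```

The point of the combiner is that compromising the classical exchange alone (e.g., with a future quantum computer) changes nothing unless the ML-KEM secret is also recovered.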

Beyond hardware, Cisco is leveraging AI to simplify WAN operations, offering “AI for networking” tools for administrators. This includes “Branch as Code” with Cisco Validated Designs and integration into CI/CD pipelines for automated, scalable deployments across hundreds of sites. The AI Assistant in management solutions such as Catalyst SD-WAN Manager and the Meraki dashboard streamlines configuration and troubleshooting. Specific AI-powered features include Predictive Path Recommendations, which analyze historical network behavior to suggest optimal transport paths for applications at specific times, and Bandwidth Forecasting, which helps predict and plan for circuit upgrades. Anomaly Detection continuously monitors network attributes such as round-trip time, jitter, and loss to proactively alert administrators to anomalous behavior, reducing troubleshooting time. These combined efforts aim to deliver AI-ready networking products, simplify WAN operations with intelligent tools, and reduce risk across all layers with robust, future-proof security controls.
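The anomaly-detection idea described above can be illustrated with a toy statistical check: flag a round-trip-time sample that deviates from the recent baseline by more than a few standard deviations. Cisco's actual models are proprietary; this only demonstrates the principle, with invented RTT values.

```python
# Toy z-score anomaly check on RTT samples, in the spirit of the Anomaly
# Detection feature described above. Thresholds and data are invented.
import statistics

def is_anomalous(history_ms, sample_ms, threshold=3.0):
    mean = statistics.mean(history_ms)
    stdev = statistics.pstdev(history_ms) or 1e-9  # guard against zero variance
    return abs(sample_ms - mean) / stdev > threshold

baseline = [20, 21, 19, 20, 22, 20, 21, 19]  # typical RTTs in ms
print(is_anomalous(baseline, 21))   # normal sample
print(is_anomalous(baseline, 80))   # sudden spike -> flagged
```

A production system would do the same kind of comparison continuously across RTT, jitter, and loss, alerting only when deviations persist.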


Cisco Enterprise Networking Platform Approach

Event: AI Infrastructure Field Day 4

Appearance: Cisco Enterprise Networking Presents at AI Infrastructure Field Day

Company: Cisco

Video Links:

Personnel: Shai Silberman

Cisco is unifying its enterprise networking platforms (Meraki and Catalyst) to deliver a single, consistent user experience with common AI and data services, consistent APIs, and shared workflows across cloud, on-prem, and hybrid deployments. This unification began with the creation of a dedicated network platform team that brought together the Meraki and Catalyst groups to foster a “build once, deploy twice” philosophy. New Cisco hardware, including switches, wireless routers, and IoT equipment, now supports both cloud and on-premises management out of the box, allowing customers to choose their preferred management method without making purchasing decisions based on deployment. This approach ensures consistent outcomes and experiences by leveraging the same underlying engines and logic across all platforms.

The convergence journey also includes a unified hardware and licensing model, a “magnetic UI framework” for a common user experience across all Cisco products, and consistent APIs. These APIs enable common tasks, infrastructure as code, and robust integrations with third-party systems such as ServiceNow and Splunk, as exemplified by the API-driven setup of the Paris Olympics infrastructure. At the core is a common AI and data layer, powered by a single Cisco cloud and shared algorithms. This enables deployment of the AI Assistant chatbot on both the Meraki Dashboard (generally available) and the Catalyst Center (open beta), using the same backend to deliver identical experiences and use cases. Additionally, Cisco Workflows, a free low-code solution, is integrated into the Meraki interface, offering templates and horizontal integration across domains and even other vendor products via APIs.
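The API-driven workflows mentioned above boil down to ordinary authenticated REST calls. As a hedged sketch, the snippet below builds (but does not send) a request to the Meraki Dashboard API's documented GET /organizations endpoint; the API key is a placeholder, and error handling and pagination are omitted.

```python
# Sketch of an infrastructure-as-code style call to the Meraki Dashboard API.
# The key is a placeholder; urlopen(req) would return the JSON list of
# organizations the key can see. Here we only inspect the prepared request.
import urllib.request

API_KEY = "REPLACE_WITH_REAL_KEY"  # placeholder credential
BASE = "https://api.meraki.com/api/v1"

req = urllib.request.Request(
    f"{BASE}/organizations",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
print(req.full_url)
```

The same pattern, pointed at webhook or action-batch endpoints, is what enables integrations with systems like ServiceNow and Splunk.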

Further advancing management capabilities, Cisco introduced “Global Overview,” a generally available cloud-based product designed for customers operating both cloud and on-premises infrastructures. Global Overview provides a single cloud experience to integrate multiple Meraki organizations and Catalyst Centers, offering consolidated network health visibility, unified inventory, and single sign-on for seamless cross-launching into specific management platforms. Complementing the AI Assistant, AI Canvas (currently in alpha) offers cross-domain collaboration and troubleshooting by integrating multiple data sources and third-party applications via natural language AI agents. Cisco’s AI is powered by a proprietary “deep networking model,” a purpose-built Large Language Model trained on Cisco’s extensive knowledge base, including TAC and CX insights, to deliver highly specific, accurate networking solutions without using customer data and to continuously learn from live telemetry. This innovative approach aims to accelerate root-cause analysis and provide automated remediation while maintaining a human-in-the-loop model to build customer trust.


Cisco Enterprise Networking Vision, Strategy, and Execution

Event: AI Infrastructure Field Day 4

Appearance: Cisco Enterprise Networking Presents at AI Infrastructure Field Day

Company: Cisco

Video Links:

Personnel: Kiran Ghodgaonkar

Cisco presents its enterprise networking vision and strategy, detailing how it is executed from a platform perspective, particularly in the context of the rapidly evolving AI era. Kiran Ghodgaonkar, who leads product marketing for Cisco’s Secure WAN portfolio, introduced the session and outlined how the company is adapting its familiar routing, switching, wireless, and management products. With over 40 years of history, Cisco has been at the forefront of innovation through previous disruptions, including the internet, mobile, and cloud eras, consistently focusing on connecting people to users and applications. The current AI era, however, necessitates a fundamental rethink of how networking products are built to adapt to evolving application and data consumption.

In this new landscape, Cisco observes three consistent themes among its customers: increasing complexity from diverse devices and disparate product stacks; significant IT hiring and budget constraints exacerbated by a skills gap in networking and security; and the challenge of deploying long-lived networking equipment in a fast-evolving AI environment. To address these concerns and build an AI-ready, secure network, Cisco’s strategy is founded on three key pillars. First, it focuses on simplifying operations through Agentic Ops to assist IT leaders. Second, the strategy emphasizes integrating security directly into the network, leveraging it as a primary line of defense against emerging threats such as deepfakes and data leakage, while also adhering to new standards such as NIST post-quantum cryptography. Finally, Cisco aims to develop scalable AI-optimized devices that can simultaneously handle networking and security functions with low latency for demanding AI workloads.

Building hardware for the AI era means a significant evolution in Cisco’s approach. This includes developing custom silicon to deliver high bandwidth, performance, post-quantum readiness, and integrated security, moving beyond the limitations of off-the-shelf solutions. Enhanced observability, including deep packet inspection, is also crucial. For its operating system, IOS XE, Cisco is focused on easier deployment and upgrades without downtime, deep observability, efficient container execution, and robust programmability to support secure API communication for telemetry and management tools. From a broader systems perspective, the company is prioritizing visibility, programmability, and the maintenance of an open, interoperable ecosystem. A critical consideration for these systems is power efficiency, acknowledging networking equipment’s energy consumption and the growing importance of sustainability and carbon footprint management globally.


Building AI Pods with Nexus Hyperfabric from Cisco

Event: AI Infrastructure Field Day 4

Appearance: Cisco Data Center Networking Presents at AI Infrastructure Field Day

Company: Cisco

Video Links:

Personnel: Alex Burger, Dan Backman

This presentation introduces Cisco Nexus Hyperfabric, a cloud-managed platform that simplifies the deployment and ongoing management of AI infrastructure. It addresses the growing need for repeatable, scalable, and operationally efficient networks specifically for enterprise AI clusters. Cisco emphasizes that while hyperscalers build immense AI factories, a significant and growing market exists for smaller, enterprise-level AI deployments, often below 256 nodes, which they term “AI Clusters for the Rest of Us.”

The shift to these smaller, on-premises AI clusters is driven by several factors: increasingly large and sensitive data sets (e.g., healthcare records, intellectual property) that make the public cloud undesirable; a broader trend of workloads returning from the cloud; and the need for project- or application-specific infrastructure rather than shared general-purpose IT. The rapidly evolving AI technology also means enterprises prefer incremental build-outs rather than massive, infrequent investments, allowing them to leverage newer generations of hardware more frequently. However, designing and deploying these dense, complex, lossless Ethernet networks is challenging and time-consuming for traditional network practitioners, often involving weeks of design, lengthy procurement, and meticulous cabling.

Cisco Nexus Hyperfabric addresses these challenges by delivering a Meraki-like SaaS experience for data center network deployment. It offers pre-designed templates for AI clusters, compliant with NVIDIA Enterprise Reference Architectures (ERA), that automate the generation of a complete bill of materials, including optics and cables. This drastically reduces design time and eliminates manual errors, accelerating the “time to first token” for AI projects. Hyperfabric also streamlines day-one operations with step-by-step cabling instructions and real-time validation via server-side agents, ensuring correct physical connectivity. Beyond deployment, it provides end-to-end network visibility, proactive monitoring of components such as optics, and integrates advanced Ethernet features, including lossless capabilities (PFC, ECN) and adaptive routing, to optimize performance for demanding AI workloads.
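The ECN behavior underpinning lossless Ethernet can be sketched simply: below a minimum queue depth nothing is marked, above a maximum everything is, and in between the marking probability ramps linearly (WRED-style), signaling senders to slow down before PFC pause frames or drops become necessary. The thresholds below are invented; real values are tuned per platform and workload.

```python
# WRED-style ECN marking curve, as used in lossless AI fabrics. Thresholds
# are illustrative, not Cisco-recommended values.
def ecn_mark_probability(queue_kb, min_kb=150, max_kb=1500, max_prob=1.0):
    if queue_kb <= min_kb:
        return 0.0          # shallow queue: no congestion signal
    if queue_kb >= max_kb:
        return max_prob     # deep queue: mark everything
    return max_prob * (queue_kb - min_kb) / (max_kb - min_kb)

for depth in (100, 825, 2000):
    print(depth, ecn_mark_probability(depth))
```

Marking early and probabilistically lets RoCE senders back off gradually, keeping queues short without ever dropping packets.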


Cisco Reference Architectures for AI Networking with the Nexus Dashboard

Event: AI Infrastructure Field Day 4

Appearance: Cisco Data Center Networking Presents at AI Infrastructure Field Day

Company: Cisco

Video Links:

Personnel: Meghan Kachhi, Richard Licon

Cisco provides comprehensive reference architectures for AI networking, scalable from small 96-GPU clusters up to massive 32,000-GPU deployments. These designs, available on Cisco.com and Nvidia.com, are vendor-agnostic, supporting Nvidia, AMD, and Intel. The core focus is to simplify operations for customers, ensuring ease of design at scale while maintaining automation and end-to-end visibility. This is achieved through the Nexus Dashboard platform, which streamlines the complex requirements of AI infrastructure.

The Nexus Dashboard significantly simplifies AI networking management. It enables customers to quickly create AI fabrics, choosing between routed or VXLAN EVPN options, with best-practice configurations for lossless fabrics, including QoS, ECN, and PFC, automatically applied. The platform also enables easy activation of advanced features, such as Dynamic Load Balancing (DLB), with minimal clicks. It facilitates the discovery and onboarding of switches into the AI fabric, organizes them into scalable units, and provides guardrails against misconfigurations. Customers can manage their AI clusters seamlessly alongside traditional data center and storage fabrics, leveraging a unified dashboard that offers clear topology views and inventory details of switches, interfaces, and connected GPUs.

Beyond setup and management, the Nexus Dashboard provides critical visibility into AI jobs and troubleshooting capabilities. Integrating with workload managers like Slurm enables users to monitor AI jobs and correlate network performance with GPU and NIC issues. The dashboard offers an “at a glance” view of AI resources, highlighting anomalies and advisories. Users can drill down into specific jobs to visualize resource utilization and pinpoint performance bottlenecks. Detailed analytics provide insights into Ethernet interface drops, CRC errors, and GPU-specific metrics, including temperature, utilization, and power. The platform generates job-specific topologies, identifies anomalies down to individual links and GPUs, and provides actionable insights for root-cause analysis and resolution. For customers seeking integration with multi-vendor environments or custom automation workflows, the Nexus Dashboard also provides a comprehensive set of APIs that complement Cisco’s broader AI Canvas for multi-domain orchestration.


Cisco AI Cluster Design, Automation, and Visibility

Event: AI Infrastructure Field Day 4

Appearance: Cisco Data Center Networking Presents at AI Infrastructure Field Day

Company: Cisco

Video Links:

Personnel: Meghan Kachhi, Richard Licon

Cisco’s presentation on AI Cluster Design, Automation, and Visibility, led by Meghan Kachhi and Richard Licon, aims to simplify AI infrastructure and address the challenges of lengthy design and troubleshooting cycles for GPU clusters. The core focus is on enhancing cluster designs, automating deployments, and providing end-to-end visibility to protect a competitive edge. The session outlines Cisco’s reference architectures, key components for building AI clusters, and upcoming updates to its Nexus Dashboard platform, which is expected to streamline design, automation, and monitoring at scale. This comprehensive approach is crucial because the battle for AI success lies at the infrastructure layer, ensuring GPUs are not underutilized by network inefficiencies.

Cisco leverages three unique pillars in its AI networking strategy. Firstly, its systems feature custom Silicon One platforms with programmable pipelines that quickly adapt to evolving AI infrastructure demands, and a partnership with NVIDIA that provides NX-OS on NVIDIA Spectrum-X silicon to ensure full-stack reference architecture compliance. Rigorously tested transceivers and a mature NX-OS software, now optimized for AI workloads, complete the system offerings. Secondly, the operating model includes the Nexus Dashboard for on-premises management and Nexus Hyperfabric for a full-stack, cloud-managed solution, complemented by an API-first approach to seamless integration with existing customer automation frameworks. Thirdly, extensive AI reference architectures serve as validated blueprints, spanning enterprise-scale deployments (under 1024 GPUs) to hyperscale cloud environments (1K-16K+ GPUs), providing detailed component lists and ensuring a consistent networking experience across vendors such as NVIDIA, AMD, and storage solutions. An AI cluster is broadly defined to encompass front-end, storage, and backend GPU-to-GPU networks, with a growing trend toward convergence enabled by high-speed Ethernet to unify operating models.

Designing an efficient AI backend network requires a non-blocking architecture that maintains a 1:1 subscription ratio, keeping every GPU within one hop of others for optimal communication. Cisco employs a “scalable unit” concept, enabling incremental expansion by repeating validated blocks while adjusting spine-layer connectivity to maintain high performance. For smaller-scale deployments, such as a 32-GPU university cluster, Cisco demonstrates how front-end, storage, and backend networks can be converged onto fewer, high-density switches, simplifying infrastructure. A critical consideration for such converged environments is Cisco’s policy-based load balancing, an innovation leveraging Silicon One ASICs. This enables preferential treatment of critical traffic, such as GPU-to-GPU training, over storage or front-end traffic, ensuring AI jobs run with minimal latency and maximum GPU utilization, even when sharing network resources.
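The non-blocking sizing math above is worth making concrete. In a two-tier leaf-spine fabric with a 1:1 subscription ratio, each leaf splits its ports evenly between GPUs below and spines above, which bounds how far one scalable unit can grow; port counts here are illustrative, not tied to a specific Cisco SKU.

```python
# Back-of-the-envelope capacity of a non-blocking (1:1) two-tier leaf-spine
# fabric: half of each leaf's ports face GPUs, half face spines, and every
# spine connects to every leaf, keeping all GPUs one spine hop apart.
def two_tier_capacity(ports_per_switch):
    gpus_per_leaf = ports_per_switch // 2   # half down, half up -> 1:1 ratio
    max_leaves = ports_per_switch           # each spine can reach this many leaves
    spines = gpus_per_leaf                  # one uplink from each leaf per spine
    return {
        "gpus_per_leaf": gpus_per_leaf,
        "max_leaves": max_leaves,
        "spines": spines,
        "max_gpus": gpus_per_leaf * max_leaves,
    }

print(two_tier_capacity(64))  # 64-port switches -> up to 2048 GPUs, non-blocking
```

Growing past that ceiling is exactly where the repeated “scalable unit” blocks and adjusted spine connectivity come in.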


Practical AI for Business Growth

Event: AI Field Day 7

Appearance: Utilizing AI Podcast

Company: The Futurum Group

Video Links:

Personnel: Nick Patience, Stephen Foskett

Stephen Foskett and Nick Patience are introducing a new podcast called Utilizing AI, where they focus on the practical applications of artificial intelligence in enterprises to drive efficiency, improve decision-making, and foster innovation. They invite listeners to participate in weekly episodes that discuss practical AI applications within various business sectors, offering insights and examples of real-world outcomes from incorporating AI technologies. The episodes will also explore AI Field Day events and analyze insights from The Futurum Group’s research and AI practice.

The podcast’s first episode was recorded live during AI Field Day in Santa Clara, bringing various groups together, including individuals from TechStrong, Tech Field Day, and Futurum’s research and analyst team. They aim to present different perspectives on how AI can be transformative within enterprise IT, presenting AI as a pervasive general-purpose technology impacting various facets of business operations. Highlights from the discussion included upcoming AI Field Day events, planned appearances at industry conferences, and an exploratory overview of significant players and their roles, such as AWS, Google, and Oracle, in the AI ecosystem. The podcast intends to focus on keeping up with the fast pace of AI developments and distinguishing meaningful trends and innovations from fleeting ones.

They shared insights into AI’s impact on industries and detailed upcoming plans and discussions related to AI usage. As AI continues to evolve and impact various industries differently, there’s an emphasis on enterprise adoption of AI technologies to automate processes, improve customer service, and optimize operations. The team plans to highlight practical examples and discuss how businesses can navigate the rapidly changing AI landscape, balancing the fear of missing out with day-to-day operational needs. The podcast is set to respond to weekly changes and announcements in the AI field, continuously aiming to inform and discuss the current state and future of AI technologies.


Considering ResOps – a Tech Field Day Roundtable at Commvault SHIFT 2025

Event: Tech Field Day Experience at Commvault SHIFT 2025

Appearance: Commvault SHIFT Roundtable Discussion

Company: Commvault

Video Links:

Personnel: Jay Cuthrell, Karen Lopez, Michael Stempf, Shala Warner, Stephen Foskett, Tom Hollingsworth

At the Commvault SHIFT 2025 Tech Field Day Roundtable in New York City, moderator Stephen Foskett convened a panel of industry experts to discuss the latest trends in data protection, resilience, and artificial intelligence. The panel included Jay Cuthrell, Karen Lopez, Shala Warner, and Tom Hollingsworth, as well as Michael Stempf from Commvault, each bringing perspectives from security, data management, DevOps, and cloud architecture. The discussion focused on Commvault’s strategic announcements around ResOps—an emerging discipline combining practices from DevOps, SecOps, and FinOps into a holistic approach to cyber resilience. Panelists noted the importance of cross-team collaboration, integrations with major cloud and security platforms, and the convergence of operational practices, all of which align with the increasing complexity of enterprise IT environments and the growing threat landscape fueled by AI-driven attacks.

A key topic was the shift from traditional disaster recovery (DR) and backup, which assumed non-malicious outages, towards a mindset anchored in active defense against adversarial threats like ransomware. Jay Cuthrell and Tom Hollingsworth highlighted innovations such as synthetic restore—a method to selectively recover clean data and minimize downtime after an attack—as well as the crucial role of identifying attack persistence in overlooked areas like Active Directory. The panel emphasized the necessity of incorporating AI for faster detection and remediation, but also pointed out the risk of AI-generated threats and the importance of comprehensive data inventories. Karen Lopez stressed that recovery, not just backup, should be the ultimate goal, asserting that organizations need robust strategies to know what data they have, where it lives, and how it is being protected.

The roundtable concluded that Commvault’s announced direction—moving beyond storage toward broader cyber and AI resilience—was credible and matched the realities of modern IT. Panelists praised new capabilities such as conversational interfaces and integrations with collaboration tools (e.g., Office 365, Google Workspace, and cloud-native databases), while also pointing to the need for organizations to invest in people and processes, not just technology. The panel agreed that cyber resiliency is now a “team sport,” requiring cooperation across IT, security, legal, and business units, facilitated by intelligent automation and education programs. The event served as both a showcase of Commvault’s evolution and a broader industry call to arms for holistic, AI-aware data protection.