Cisco AI Networking Cluster Operations Deep Dive

Event: Networking Field Day 39

Appearance: Cisco Presents at Networking Field Day 39

Company: Cisco

Video Links:

Personnel: Paresh Gupta

Paresh Gupta’s deep dive on AI cluster operations focused on the extreme and unique challenges of high-performance backend networks. He explained that these networks, which primarily use RDMA over Converged Ethernet (RoCE), are exceptionally sensitive to both packet loss and network delay. Because RoCE v2 runs over UDP, it lacks TCP’s native congestion control, meaning a single dropped packet can stall an entire collective communication operation, forcing a costly retransmission and wasting expensive GPU cycles. This problem is compounded by AI traffic patterns such as checkpointing, where all GPUs write to storage simultaneously, creating massive incast congestion. Gupta emphasized that in these environments, where every nanosecond of delay matters, traditional network designs and operational practices are no longer sufficient.
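A toy model can make the stall effect concrete: a collective operation such as an allreduce finishes only when the slowest rank finishes, so a single retransmission timeout on one rank idles every GPU in the job. This is an illustrative sketch, not Cisco's analysis; all timing numbers are invented.

```python
# Toy model (illustrative, not Cisco's): an allreduce step completes only
# when the slowest GPU's traffic lands, so one retransmission timeout on
# one rank stalls every rank in the job.

def allreduce_step_time(per_rank_us, loss_rank=None, rto_us=4000):
    """Step time = max over ranks; a lossy rank pays a full retransmission
    timeout, since RoCE has no TCP-style fast recovery."""
    times = list(per_rank_us)
    if loss_rank is not None:
        times[loss_rank] += rto_us
    return max(times)

ranks = [50.0] * 512            # 512 GPUs, 50 us each, perfectly balanced
clean = allreduce_step_time(ranks)
lossy = allreduce_step_time(ranks, loss_rank=7)
print(f"clean step: {clean} us; with one drop: {lossy} us "
      f"({lossy / clean:.0f}x slower, all 512 GPUs idle meanwhile)")
```

The point of the sketch is the asymmetry: one lost packet on one link multiplies the step time for the entire cluster, which is why Gupta stresses lossless behavior over raw bandwidth.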

Cisco’s strategy to solve these problems is built on prescriptive, end-to-end validated reference architectures, which are tested with NVIDIA, AMD, Intel Gaudi, and all major storage vendors. Gupta detailed the critical importance of a specific rail-optimized design, a non-blocking topology engineered to ensure single-hop connectivity between all GPUs within a scalable unit. This design minimizes latency by keeping traffic off the spine switches, but its performance is entirely dependent on perfect physical cabling. He explained that these architectures are built on Cisco’s smart switches, which use Silicon One ASICs and are tuned with fine-grained thresholds for congestion-management mechanisms like ECN (Explicit Congestion Notification) and PFC (Priority Flow Control).
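The rail-optimized idea can be sketched in a few lines: NIC/GPU k of every server lands on rail (leaf) switch k, so rank-k GPUs across all servers in a scalable unit reach each other in a single switch hop. This is a hypothetical illustration of the topology concept, not Cisco's actual planning tool; the naming scheme is invented.

```python
# Hypothetical sketch of a rail-optimized cabling plan: GPU k of every
# server connects to rail switch k, so same-rank GPUs across servers are
# always one hop apart. Names and scale are illustrative.

def rail_plan(num_servers, gpus_per_server):
    """Return {(server, gpu): (rail_switch, port)} for one scalable unit."""
    plan = {}
    for server in range(num_servers):
        for gpu in range(gpus_per_server):
            # Rail = GPU index; port = server index, so every link is
            # uniquely determined -- there is exactly one correct cabling.
            plan[(server, gpu)] = (f"rail-{gpu}", server)
    return plan

plan = rail_plan(num_servers=32, gpus_per_server=8)
# GPU 3 on server 0 and GPU 3 on server 31 share rail switch "rail-3".
assert plan[(0, 3)][0] == plan[(31, 3)][0] == "rail-3"
print(len(plan), "links in the plan")   # 256 cables for 32 x 8 GPUs
```

Because the mapping is fully deterministic, any deviation from it breaks the single-hop guarantee, which is why the design is so sensitive to physical cabling mistakes.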

The most critical innovations, however, are in operational simplicity, delivered via Nexus Dashboard and HyperFabric AI. These platforms automate and hide the underlying network complexity. Gupta highlighted the automated cabling check feature. The system generates a precise cabling plan for the rail-optimized design and provides a task list to on-site technicians; the management UI will only show a port as green when it is cabled to the exact correct port, solving the pervasive and performance-crippling problem of miscabling. This feature, which customers reported reduced deployment time by 90%, is combined with job scheduler integration to detect and flag performance-degrading anomalies, such as a single job being inefficiently spread across multiple scalable units.
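The cabling-check idea reduces to comparing discovered neighbors (for example, from LLDP data) against the generated plan and marking a port green only on an exact match. The following is a hedged sketch of that comparison; the data shapes and names are assumptions, not Nexus Dashboard's actual model.

```python
# Hedged sketch of the automated cabling check: compare each discovered
# link (e.g. from LLDP neighbor data) against the generated plan. A port
# is "green" only when it is cabled to the exact expected endpoint.

def check_cabling(plan, discovered):
    """plan / discovered: {(switch, port): server_nic}. Returns per-port status."""
    status = {}
    for key, expected in plan.items():
        actual = discovered.get(key)
        if actual == expected:
            status[key] = "green"
        elif actual is None:
            status[key] = "missing"
        else:
            status[key] = f"miscabled: expected {expected}, found {actual}"
    return status

plan = {("rail-0", 1): "server1-nic0", ("rail-1", 1): "server1-nic1"}
discovered = {("rail-0", 1): "server1-nic0", ("rail-1", 1): "server1-nic0"}
for port, state in check_cabling(plan, discovered).items():
    print(port, "->", state)
```

In practice the value is the task list: a technician works from the plan, and the UI flips each port green only when the physical world matches the intended design exactly.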


Introduction to Cisco AI Cluster Networking Design with Paresh Gupta

Event: Networking Field Day 39

Appearance: Cisco Presents at Networking Field Day 39

Company: Cisco

Video Links:

Personnel: Paresh Gupta

Paresh Gupta, a principal engineer at Cisco focusing on AI infrastructure, began by outlining the diverse landscape of AI adoption, which spans from hyperscalers with hundreds of thousands of GPUs to enterprises just starting with a few hundred. He categorized these environments by scale—scale-up within a server, scale-out across servers, and scale-across between data centers—and by use case, such as foundational model training versus fine-tuning or inferencing. Gupta emphasized that the solutions for these different segments must vary, as the massive R&D budgets and custom software of a hyperscaler are not available to an enterprise, which needs a simpler, more turnkey solution.

Gupta then deconstructed the modern AI cluster, starting with the immense computational power of GPU servers, which can now generate 6.4 terabits of line-rate traffic per server. He detailed the multiple, distinct networks required, highlighting a recent shift in best practices: the front-end network and the storage network are now often converged. This change is driven by cost savings and the realization that front-end traffic is typically low, making it practical to share the high-bandwidth 400-gig fabric. This converged network is distinct from the inter-GPU backend network, which is dedicated solely to GPU-to-GPU communication for distributed jobs, as well as a separate management network and potentially a backend storage network for specific high-performance storage platforms.

Finally, Gupta presented a simplified, end-to-end traffic flow to illustrate the complete operational picture. A user request does not just hit a GPU; it first traverses a standard data center fabric, interacts with applications and centralized services like identity and billing, and only then reaches the AI cluster’s front-end network. From there, the GPU node may access high-performance storage, standard storage for logs, or organization-wide data. If the job is distributed, it engages the inter-GPU backend network. This complete flow, he explained, is crucial for understanding that solving AI networking challenges requires innovations at every point of entry and exit, not just in the inter-GPU backend.


The Age of Operations: Third Party Views of DCF Innovation

Event: Networking Field Day 39

Appearance: Nokia Presents at Networking Field Day 39

Company: Nokia

Video Links:

Personnel: Scott Robohn

Scott Robohn of the consulting firm Solutional provided a third-party, operational perspective on data center innovation, based on a collaboration with The Futurum Group and Nokia. He introduced his background as a former network engineer and Tech Field Day delegate, now focused on NetOps and AI adoption. Robohn’s central thesis is that with AI driving new infrastructure builds and hardware becoming normalized, the next major performance gains in data centers will come from a relentless focus on improving operations. The joint project between Solutional, Futurum, and Nokia aimed to validate this by looking at AI’s dual role: both as a driver for new network infrastructure and as a set of tools to be used for network operations.

Robohn detailed the market trends shaping this new age of operations, starting with AI as a durable technology driving massive fabric build-outs by hyperscalers and a new class of NeoCloud providers. He defined NeoClouds as specialized providers focused on renting expensive, complex GPU-interconnect infrastructure for workloads like model training and video rendering. He then argued that as data center hardware has normalized around merchant silicon and stable architectures like Spine-Leaf, the hardware itself is no longer the key differentiator. This consistency makes automation more achievable and shifts the entire industry’s focus to operations as the critical area for innovation and reliability.

To validate this operational focus, the project rested on four pillars. First, a Futurum reliability survey, which found that network reliability is the number one purchasing criterion for data center infrastructure, far outweighing cost, and that human error remains a major issue. Second, a collaboration with Bell Labs on a reliability model, which concluded that the most significant reduction in downtime comes from fixing operations-related issues. Third, interviews with Nokia’s own 80,000-employee internal enterprise IT team, which is successfully migrating its complex, high-stakes manufacturing and office networks to Nokia’s SR Linux and EDA platform. Finally, general market engagement confirming a palpable, industry-wide appetite for a new class of automation and AIOps tools.


Root Cause Analysis with Nokia AI Operations Automation

Event: Networking Field Day 39

Appearance: Nokia Presents at Networking Field Day 39

Company: Nokia

Video Links:

Personnel: Clayton Wagar

Clayton Wagar introduced Nokia’s AI-driven approach to root cause analysis, focusing on solving difficult day-two operational challenges. The presentation highlighted the chronic pain of hidden impairments or gray failures, where traditional monitoring systems fail because physical links appear active while protocols or services are down. The goal of Nokia’s deep RCA tool is to move beyond simple port-up/port-down alarming by correlating end-to-end application connectivity (from VM to VM) with all layers of the network, including the underlay, overlay, and control plane, to dramatically compress troubleshooting time.

A live demonstration was shown on the EDA SaaS platform using a real hardware Spine-Leaf network. The team introduced a gray failure by impairing a fiber link in a way that kept the interface status “up” but caused the BFD and BGP protocols to fail. Wagar explained that the AI’s multi-agent workflow correctly diagnosed this. Instead of using one large, monolithic model, a planning model first determines which tool-calling agents to deploy. These agents gather specific, relevant data from logs, topology, and configuration, which is then filtered and passed to a reasoning model. This agentic-based curation of data is Nokia’s key to reducing costs and avoiding the AI hallucinations that would otherwise be a risk in mission-critical networks.
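The described flow, in which a planner selects tool-calling agents, the agents gather narrow slices of data, and only curated evidence reaches the reasoning step, can be sketched as a skeleton. This is an illustrative stand-in, not Nokia's code: the agents here return canned data, and the "reasoning model" is a trivial rule where the real system uses an LLM.

```python
# Illustrative skeleton of the multi-agent RCA flow described above.
# Each agent fetches one narrow slice of state (canned data here); a
# planner decides which agents to run; a reasoning step (a simple rule
# standing in for an LLM) works only on the curated evidence.

AGENTS = {
    "links": lambda: {"eth1/1": {"oper": "up", "crc_errors": 91233}},
    "bfd":   lambda: {"eth1/1": "down"},
    "bgp":   lambda: {"10.0.0.2": "Idle"},
}

def plan(symptom):
    """Planning step: pick which agents are relevant to the symptom."""
    return ["links", "bfd", "bgp"] if "peer down" in symptom else ["links"]

def investigate(symptom):
    # Agents gather only the data the planner asked for.
    evidence = {name: AGENTS[name]() for name in plan(symptom)}
    # Reasoning over curated evidence, not a raw firehose of telemetry.
    link = evidence.get("links", {}).get("eth1/1", {})
    if link.get("oper") == "up" and evidence.get("bfd", {}).get("eth1/1") == "down":
        return "gray failure: link reports up but BFD/BGP are failing (check fiber)"
    return "no fault isolated from available evidence"

print(investigate("BGP peer down on leaf1"))
```

Feeding the reasoning step only filtered, relevant evidence is the cost-control and anti-hallucination lever the presentation describes: the model never has to guess from data it was not given.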

The tool’s capabilities were further demonstrated by successfully identifying a classic, hard-to-find MTU mismatch. Another key feature highlighted was the “Time Machine,” which allows an operator to select a past timeframe, such as thirty minutes prior to an event, and run the same AI-driven root cause analysis on the historical data from that moment. The entire process concludes with the AI generating a comprehensive report that provides a human-readable summary, a confidence score, and the specific evidence gathered by the agents, effectively solving a complex logic puzzle that would have taken an engineer hours or days to manually diagnose.


Delivering Nokia Enhanced AIOps with the Right Foundations

Event: Networking Field Day 39

Appearance: Nokia Presents at Networking Field Day 39

Company: Nokia

Video Links:

Personnel: Bruce Wallis

Bruce Wallis pivoted the discussion to Nokia’s AIOps capabilities, centered on a new natural language interface called Ask EDA. This feature, which resembles a ChatGPT for the network, allows operators to interact with the EDA platform through a simple chat box. The core idea is to abstract the complexity of network operations, enabling users to ask plain-English questions, such as “list my SRL interfaces” or “is BFD enabled?”, and receive back live, structured data, tables, and even on-the-fly visualizations like pie charts and line graphs. This approach removes the need for operators to understand the complex underlying YANG models or schemas for each vendor, as the AI handles the translation from human language to machine query.

The right foundation for this capability, as Wallis explained, is not a single, monolithic, trained model, but a flexible agentic AI framework. In this model, the central LLM acts as a brain that coordinates a set of pluggable agents, or tools, each with a specific function. The most powerful aspect of this design is its real-time extensibility. Wallis demonstrated this by first showing the AI failing to understand a request to “enable the locator LED.” He then installed a new support application from EDA’s App Store; when asked again, the AI agent immediately recognized and used this new tool to successfully execute the command. This app-based approach allows Nokia to add new troubleshooting workflows and capabilities on the fly, without retraining the model or upgrading the core platform.
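The pluggable-agent pattern Wallis demonstrated can be sketched as a tool registry: the LLM "brain" dispatches requests to registered tools, and installing an app simply registers a new tool at runtime, with no model retraining. This sketch is an invented illustration of the pattern; the tool names and the trivial keyword matcher stand in for the real LLM-driven selection.

```python
# Sketch of the pluggable-agent idea (names invented): the LLM brain
# dispatches to registered tools; "installing an app" adds a tool at
# runtime, with no retraining of the model.

TOOLS = {}

def register(name):
    """Decorator that plugs a tool into the registry."""
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@register("list_interfaces")
def list_interfaces():
    return ["ethernet-1/1", "ethernet-1/2"]

def ask(request):
    """Stand-in for LLM tool selection: crude keyword match on tool names."""
    for name, fn in TOOLS.items():
        if name.split("_")[-1] in request:
            return fn()
    return "I don't have a tool for that yet."

print(ask("enable the locator led"))     # fails: no such tool installed yet

@register("set_locator_led")             # installing the app adds the tool
def set_locator_led():
    return "locator LED enabled"

print(ask("enable the locator led"))     # now succeeds via the new tool
```

The failing-then-succeeding sequence mirrors the demo: capability arrives with the app, not with a new model.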

This agentic framework is applied directly to troubleshooting and operations. Wallis showed how “Ask EDA” can be used to investigate “deviations,” or configuration drift, where the running config no longer matches the intended state. In another example, with a BGP peer alarm active, the AI was asked to investigate. It used its agents to query various resources, analyze the topology, and correctly identified that the BGP manager process had crashed and restarted, providing a direct link to the deviation. Wallis emphasized that this method of using the LLM to query factual data from live telemetry and tools is how Nokia is addressing the problem of hallucinations, ensuring the AI’s answers are grounded in reality.
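Deviation detection as described reduces to diffing the declared intent against the running state and reporting each mismatch. The sketch below illustrates that idea with invented path names; it is not EDA's actual data model.

```python
# Minimal sketch of deviation (configuration drift) detection: diff the
# intended declarative state against the running config and report every
# mismatch. Path and field names are invented for illustration.

def find_deviations(intended, running):
    drift = []
    for path, want in intended.items():
        have = running.get(path)
        if have != want:
            drift.append({"path": path, "intended": want, "running": have})
    return drift

intended = {"bgp/peer/10.0.0.2/state": "enabled", "interface/eth1/mtu": 9214}
running  = {"bgp/peer/10.0.0.2/state": "disabled", "interface/eth1/mtu": 9214}

for d in find_deviations(intended, running):
    print(f"deviation at {d['path']}: "
          f"intended={d['intended']} running={d['running']}")
```

In the demo, the AI layer sits on top of exactly this kind of factual diff, which is why its conclusion (a crashed and restarted BGP manager process) could be linked directly to a concrete deviation rather than guessed.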


Nokia Event-Driven Automation (EDA) Multi Vendor – Deliver on the Promise

Event: Networking Field Day 39

Appearance: Nokia Presents at Networking Field Day 39

Company: Nokia

Video Links:

Personnel: Bruce Wallis

Bruce Wallis, Product Manager for Nokia’s EDA (Event-Driven Automation) platform, began by revisiting the core assertions made when the product was unveiled a year prior. He reiterated that EDA was built on the same successful principles as Kubernetes: abstraction and a declarative model. The goal was to apply this logic to networking, creating a ubiquitous platform that could define a unit of work for the network, just as Kubernetes did for compute workloads. This approach aims to normalize networking primitives like interfaces and BGP peers, allowing an operator to declare the desired end state without scripting the specific sequential steps, letting the platform handle the how.

The presentation’s main focus was delivering on the multi-vendor promise made at the previous event. Wallis conducted a live demo, bootstrapping an eight-node, dual-stack fabric underlay using a single 58-line YAML file. This high-level abstract definition was automatically reconciled by EDA, which then generated and pushed the correct, vendor-specific configurations to four different operating systems running on the leaf switches: Nokia SR Linux, Nokia SROS, Cisco Nexus, and Arista EOS. This demonstrated the platform’s ability to manage a heterogeneous network through one common model.
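The declarative, multi-vendor reconciliation Wallis demonstrated can be illustrated with a toy reconciler: one abstract fabric definition is rendered into per-vendor configuration snippets. This is not EDA itself, and the config templates below are invented stand-ins for real NOS syntax.

```python
# Toy reconciler (not EDA) illustrating the declarative idea: a single
# abstract fabric intent is rendered into vendor-specific config lines.
# The two templates are invented stand-ins for real NOS syntax.

FABRIC = {
    "asn_base": 65000,
    "leaves": ["leaf1", "leaf2"],
    "vendor": {"leaf1": "srlinux", "leaf2": "eos"},
}

TEMPLATES = {
    "srlinux": "set / network-instance default protocols bgp autonomous-system {asn}",
    "eos":     "router bgp {asn}",
}

def reconcile(fabric):
    """Render the abstract intent into per-leaf, vendor-specific config."""
    configs = {}
    for i, leaf in enumerate(fabric["leaves"]):
        vendor = fabric["vendor"][leaf]
        configs[leaf] = TEMPLATES[vendor].format(asn=fabric["asn_base"] + i)
    return configs

for leaf, cfg in reconcile(FABRIC).items():
    print(f"{leaf}: {cfg}")
```

The operator declares one model (here a small dict, in the demo a 58-line YAML file) and never writes the vendor-specific "how"; the platform owns the rendering and the push.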

Finally, Wallis addressed other key platform features, including the ability to bubble operational state up into the abstract model, allowing operators to view the “health” of the entire fabric rather than just individual components. He also clarified for delegates that while he used YAML for demo speed, the platform is fully operable via a form-based UI for users unfamiliar with programmatic inputs. He concluded the demo by successfully deploying a complex EVPN overlay network across the newly built, multi-vendor underlay, again using a single, simple declarative input.


Exploring the Power and Potential of Enhanced AIOps with Nokia

Event: Networking Field Day 39

Appearance: Nokia Presents at Networking Field Day 39

Company: Nokia

Video Links:

Personnel: Clayton Wagar

Clayton Wagar, leading the AI practice for Nokia’s IP division, framed his presentation as creating a “through line” connecting the history of network operations to its AI-driven future. Recalling his own start managing a 911 data center with manual, CLI-driven processes, he traced the evolution of the industry as operators moved from mastering workflows in their heads to writing them down and eventually using automation. Wagar emphasized that as AI is introduced, it’s crucial to be prescriptive about its use and to understand its two main facets: the “plumbing” aspect of building massive networks to support AI, and the application of AI tools to network operations itself.

To illustrate the challenge of applying AI to mission-critical systems, Wagar told a story about a 1943 discussion at Bell Labs between Claude Shannon and Alan Turing. They faced a choice: build a computer like a human brain (a neural network) or like an adding machine (a deterministic system). They chose the adding machine, not only because the technology for neural networks didn’t exist, but critically because telcos and governments required predictable, deterministic outputs, not a system that might hallucinate. This historical context highlights the primary challenge Nokia addresses today: reducing AI hallucinations to make the technology safe for essential, real-world networks where reliability is paramount.

Wagar then connected this to the modern concept of autonomous networks, such as the levels defined by TM Forum. He proposed that these frameworks were largely developed before modern AI and assumed a deterministic path, whereas AI introduces a new, separate plane of capability. He pointed to Google’s public journey toward autonomous networking, which leverages custom-built AI agents to move beyond simple event-driven workflows to a truly autonomous SRE model. Wagar concluded by positioning Nokia’s strategy as learning from these leaders, blending AI and traditional automation to inform their product development and establish new best practices for the industry.


Introduction to Nokia with Andy Lapteff

Event: Networking Field Day 39

Appearance: Nokia Presents at Networking Field Day 39

Company: Nokia

Video Links:

Personnel: Andy Lapteff

Andy Lapteff, a network engineer who became a product marketing manager for Nokia Data Center, introduced his presentation by sharing his personal journey. He admitted that, like many, he previously only knew Nokia for its indestructible cell phones and was unaware of its significant presence in the data center market. It was at a previous Tech Field Day event, attending as a delegate, that he learned Nokia had been building mission-critical networking infrastructure for decades for entities like air traffic control and power grids, and was now applying those same principles of ultra-reliability to the data center. This revelation was so compelling that it motivated him to join the company.

Lapteff contrasted Nokia’s approach with his own past experiences working midnights in a network operations center (NOC), where networks were complex, fragile, and “on fire all the time,” and a “no change Fridays” mentality was common. His motivation at Nokia is driven by the memory of those challenges, aiming to build more robust and reliable networks for data center operators. He expressed genuine excitement for the value Nokia provides, which he feels is real and tangible, unlike other vendor jobs he has held.

The presentation set the stage for a deeper dive into the future of networking, particularly focusing on the often-confusing landscape of AI, AIOps, and the different levels of network autonomy. Lapteff noted that while the industry is largely at a manual or low-automation level, Nokia is pushing toward full autonomous operations. He concluded by previewing the day’s key topics, promising impressive demonstrations on “Agentic AI ops in action,” AI-driven root cause analysis, the automation of AI back-end fabrics, and a much-anticipated update on Nokia’s potential support for non-Nokia hardware.


The Resource Costs of AI

Event: Networking Field Day 39

Appearance: Networking Field Day 39 Delegate Roundtable Discussion

Company: Tech Field Day

Video Links:

Personnel: Tom Hollingsworth

The Networking Field Day 39 roundtable, led by Tom Hollingsworth, dove straight into the massive resource drain caused by the current AI boom. The delegates discussed how the industry is already behind the eight ball on power, with AI’s exponential demand making the existing, outdated power grid’s problems significantly worse. This isn’t just a data center issue. It’s causing widespread component shortages for essentials like RAM and GPUs, affecting everyone from enterprise users to home gamers. The conversation highlighted that consumers are ultimately paying for this AI race, not just through new tech, but through soaring power and water bills as AI companies with deep pockets outbid ordinary consumers for finite resources.

In response to this power crisis, the discussion shifted to solutions. The delegates noted a serious new investigation into small modular nuclear reactors and even restarting old plants like Three Mile Island, things unthinkable just a few years ago, alongside more hopeful developments in solar. On the networking side, this resource demand is forcing the creation of entirely new, expensive technologies like Ultra Ethernet and massive 800-gig switches just to keep these AI data centers fed. These come with huge R&D costs, which will inevitably be passed down, raising the question of whether networking costs are about to go through the roof for reasons outside the network engineer’s control.

Finally, the panel debated the future of AI itself, noting the buzzword is becoming meaningless as the industry pivots from massive LLMs to more efficient, domain-specific smaller models (SLMs) and agentic AI. The group observed that AI might move from giant cloud data centers to running locally on devices with new NPUs, or even on standard laptops. Ultimately, the delegates concluded that the future of AI won’t just be decided by the hyperscalers. It will be shaped by consumers and engineers through the products they choose to use and the companies they criticize for wasting resources.


Graphiant Demos with Vinay Prabhu

Event: Networking Field Day 39

Appearance: Graphiant Presents at Networking Field Day 39

Company: Graphiant

Video Links:

Personnel: Vinay Prabhu

Chief Product Officer Vinay Prabhu demonstrated the Graphiant infrastructure, first focusing on the network for AI. He framed AI as a massive publisher-subscriber problem and demoed a B2B data exchange where services, like GPU farms, can be published to a personal marketplace. This allows partners (both on and off-network) to be securely connected in minutes, automating complex routing, NAT, and security. This capability is then monitored by the Data Assurance Dashboard, which uses a real-time telemetry pipeline (correlating NetFlow, DNS, and DPI) to provide deep visibility without decrypting payloads. This dashboard identifies malicious threats, provides full auditability, and offers an “Uber-like” spatial and temporal view, allowing operators to prove an application’s exact path and confirm compliance with geofencing policies.

This visibility enables absolute control, where users can define policies for performance, path, or risk. Prabhu confirmed customers can enforce policies to drop traffic rather than failover to a non-compliant path, ensuring governance is never compromised. The presentation concluded with the AI for the network component: GINA, the Graphiant Intelligent Network Assistant. GINA acts as a virtual team member, capable of running a “60-minute stand-up in 60 seconds” by generating guided operational and compliance reports. Prabhu stressed that GINA does not train on customer data; it uses Generative AI to interpret queries and accesses information strictly through the user’s existing role-based access control (RBAC) APIs, ensuring data remains secure.


Graphiant: The AI Strategy

Event: Networking Field Day 39

Appearance: Graphiant Presents at Networking Field Day 39

Company: Graphiant

Video Links:

Personnel: Vinay Prabhu

Vinay Prabhu, Chief Product Officer at Graphiant, outlined a two-part strategy encompassing network for AI and AI for network. The network for AI pillar addresses the challenge of AI being a massive distributed publisher and subscriber problem, where data is generated in one location and inference happens in another, often across different business boundaries. To manage this, Graphiant provides a platform for secure data exchange, analogous to financial or healthcare workloads. Prabhu emphasized that simplifying exchange is insufficient without trust, using a ride-sharing app analogy: just as a passenger needs to see the driver and the path, enterprises need real-time observability, auditability, and centralized control to program governance policies directly onto the global fabric.

The second pillar, AI for the network, is embodied by GINA (Graphiant Intelligent Network Assistant). GINA is designed to act as a virtual member of the operations team, automating complex, time-consuming tasks. Prabhu gave the example of a CSO requesting a monthly compliance report, a task that might take an hour to manually collate data from various dashboards and databases. GINA can generate this report, along with threat intelligence and infrastructure insights, almost instantly. Prabhu summarized GINA’s value as running a 60-minute stand-up in 60 seconds, buying back valuable time for practitioners to focus on innovation rather than manual data gathering.


Graphiant Use Cases with Arsalan Khan

Event: Networking Field Day 39

Appearance: Graphiant Presents at Networking Field Day 39

Company: Graphiant

Video Links:

Personnel: Arsalan Mustafa Khan

Graphiant’s CSO, Arsalan Khan, detailed use cases focused on this challenge, beginning with unified connectivity. He explained that the Graphiant fabric treats all endpoints (public clouds, data centers, and emerging AI neoclouds) as part of a single any-to-any fabric. This model eliminates the need for traffic backhauling, providing lower latency and guaranteed paths for high-bandwidth AI workloads, all while ensuring data privacy with end-to-end encryption that is never decrypted in transit.

Khan then highlighted business-to-business data exchange and data assurance as key enablers for AI. The platform simplifies partner collaboration, which is critical for many AI ecosystems, by handling network complexities like IP overlapping and NAT. This capability extends to partners not on the Graphiant platform and includes the ability to dynamically revoke access if a partner is breached. The core data assurance use case provides a centralized tool for CISO and governance teams. Using role-based access control, they can enforce network-level policies, such as ensuring specific data never leaves a geographical boundary, rather than relying on individual application developers to implement compliance.

Finally, Khan addressed how this infrastructure specifically serves AI workloads. He clarified the strategy is “networking for AI,” meaning the platform is designed to offload the complex burden of security and data governance from AI applications to the network itself. This accelerates AI deployment by simplifying compliance. The system supports this with threat intelligence that, without inspecting encrypted payloads, uses public feeds and behavioral analysis. By classifying normal application flows, the network can detect and flag erratic behavior, providing an essential layer of security for moving and processing the large, sensitive data “haystacks” required by AI.
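The behavioral idea described, classifying normal flows from metadata alone and flagging sharp deviations without decrypting payloads, can be sketched with a simple statistical baseline. This is an illustrative sketch of the concept, not Graphiant's implementation; the thresholds, rates, and application names are all assumptions.

```python
# Illustrative sketch of metadata-only behavioral analysis: learn a
# baseline of normal per-application flow volume, then flag flows that
# deviate sharply. No payload inspection; only flow metadata is used.
# Thresholds and data are invented for illustration.

from statistics import mean, pstdev

def build_baseline(history_mbps):
    """history_mbps: {app: [observed Mbps samples]} -> {app: (mean, stdev)}."""
    return {app: (mean(v), pstdev(v)) for app, v in history_mbps.items()}

def flag_erratic(baseline, current_mbps, sigma=3.0):
    """Flag any flow more than `sigma` standard deviations from its norm."""
    alerts = []
    for app, rate in current_mbps.items():
        mu, sd = baseline.get(app, (0.0, 0.0))
        if sd and abs(rate - mu) > sigma * sd:
            alerts.append(f"{app}: {rate} Mbps vs baseline {mu:.0f}+/-{sd:.0f}")
    return alerts

history = {"model-sync": [800, 820, 790, 810], "billing-api": [5, 6, 5, 4]}
baseline = build_baseline(history)
print(flag_erratic(baseline, {"model-sync": 805, "billing-api": 400}))
```

A low-rate application suddenly moving bulk data stands out immediately, even though the network never looked inside a single encrypted packet.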


About Graphiant NaaS with Arsalan Khan

Event: Networking Field Day 39

Appearance: Graphiant Presents at Networking Field Day 39

Company: Graphiant

Video Links:

Personnel: Arsalan Mustafa Khan

As enterprises accelerate AI adoption, data governance and network security have become inseparable. In Graphiant’s Networking Field Day presentation, titled “Why Data Governance Demands a Unified and Secure Approach to AI Networking,” the company explores how a secure, compliant, and unified networking infrastructure is essential to enable responsible AI at scale.

Arsalan Khan framed the core problem: enterprises are investing heavily in AI, but their data is siloed across on-prem data centers, multiple clouds, and emerging neocloud providers. This creates massive infrastructure headaches, high costs, and significant security risks. The challenge is compounded by complex data governance regulations, especially for sensitive PII in finance and healthcare. Khan noted that while traditional networking struggles to “catch up” to new technologies, AI demands moving the “whole haystack,” not just finding needles, making network-level control and compliance essential from the start.

Graphiant’s solution is a Network-as-a-Service (NaaS) built on a stateless core, which functions as an overlay/underlay network operated by Graphiant using leased fiber. This provides a single, ubiquitous fabric for any-to-any connectivity with SLA guarantees. The key, Khan emphasized, is simplifying the “plumbing” so businesses can focus on their AI goals. The platform provides centralized visibility and control over metadata, allowing enterprises to see traffic paths and applications (without decrypting payloads) and enforce granular policies, such as guaranteeing that specific data never leaves a geographical boundary. This approach aims to provide the auditability, security, and cost-effectiveness required to manage modern AI data flows.


Futurum Signal – Agentic AI Platforms for Enterprise

Event: AI Field Day 7

Appearance: Futurum Signal Presentation at AI Field Day 7

Company: The Futurum Group

Video Links:

Personnel: Stephen Foskett

In his presentation at AI Field Day 7, Stephen Foskett, President of Tech Field Day at The Futurum Group, introduced the Futurum Signal, a groundbreaking vendor evaluation survey designed to challenge traditional analyst methodologies. The Signal leverages Agentic AI to provide a fresh perspective on evaluating major enterprise AI platforms. Unlike traditional, manual processes that involve extended data collection from vendors and often lead to out-of-date reports, this new method utilizes a combination of proprietary data, industry analysis, AI-driven insights, and human intelligence to generate timely, comprehensive assessments. The process is streamlined to offer enterprise decision-makers updated insights, highlighting the agility of AI-enhanced analytics in evolving technical landscapes.

Foskett shared the latest Signal Report focusing on Agentic AI platforms for enterprises, evaluating major players and identifying strategic partners best suited for enterprise buyers aiming to revolutionize business processes with AI. Through a sophisticated AI-driven system, analysts within the Futurum Research Group assess a pool of significant companies to determine their fit as partners in the AI space. This evaluation considers data integrity, collaboration among multiple agents, governance, and enterprise-oriented controls, all while illuminating promising trends for advanced AI deployment. The report places Microsoft and Salesforce as top contenders in the elite zone, recognized for their comprehensive suite of tools suitable for the largest enterprise clients. Google, IBM, SAP, and ServiceNow are also notable, while AWS and Oracle occupy the established zone, reflecting the dynamic and competitive landscape of agent-based enterprise AI solutions.

The integration of AI into the analytical process allows for real-time data processing and the generation of reports that incorporate recent and relevant updates, such as financial results or organizational changes within evaluated companies. This capability ensures that the information remains fresh and actionable for decision-makers. Futurum’s commitment to leveraging AI as a foundational element in their signal reports underscores a strategic shift toward more responsive, data-enriched analyses. Foskett emphasized the importance of timely and frequent updates, projecting that future reports, including those for Tech Field Day, will be heavily influenced by insights gathered from AI-driven data, aiming for transformative impacts in technology evaluation and enterprise strategy.


Battle of the Bots – Which AI Assistant Delivers with Calvin Hendryx-Parker

Event: AI Field Day 7

Appearance: Calvin Hendryx-Parker Presents at AI Field Day 7

Company: Ignite, Six Feet Up

Video Links:

Personnel: Calvin Hendryx-Parker

Calvin Hendryx-Parker, Co-Founder and CTO of Six Feet Up, delivered an insightful presentation at AI Field Day 7, evaluating the efficacy of various AI coding assistants in real-world developer workflows. His talk built upon an earlier session by exploring updates in agentic AI tools, which have become indispensable in modern coding practices. These tools, including Aider, Goose, Claude Code, Cursor, Juni, and OpenAI Codex, interact with a developer’s environment via APIs, leveraging protocols such as the Model Context Protocol (MCP) to enable autonomous or semi-autonomous coding assistance. Each AI tool has unique strengths, such as differential context management capabilities, sub-agent functionality, and tool-specific interfaces, which can deeply affect a developer’s productivity and workflow efficiency.

Hendryx-Parker’s discussion emphasized the transformative impact of AI assistants on developers’ operational efficiency, highlighting specific products and protocols. Aider, for instance, is noted for its Git integration and its ability to run local models such as Llama to preserve data privacy, while providing a semi-autonomous coding experience through its architect and code modes. Goose, from Block, is lauded for its broad model support, including OpenRouter, and stands out with its recipes for repeated task automation and its container isolation to mitigate risk during operations. Claude Code, developed by Anthropic, supports proprietary tools and is notably more empathetic in tone, which can be advantageous during discussions or negotiations, despite not being open source.

The presentation culminated in an analysis of the trajectory of these AI tools and which are likely to lead the field. Goose and Claude Code were seen as potential leaders due to their robust feature sets and wide-ranging usefulness for enterprise and individual users alike. Goose’s integration with GUI tools indicates a focus on a wider market, possibly covering both professional developers and office workers with coding needs. Hendryx-Parker also touched upon innovations such as the Agent Control Protocol (ACP) for enhanced tool interoperability and pointed to the necessity for developers, especially juniors, to familiarize themselves with these tools to maintain a competitive edge in a rapidly evolving technological landscape. The talk was a comprehensive overview of the AI coding assistant landscape, providing detailed insight into each tool’s unique capabilities and potential for streamlining developer productivity.


Cloud Field Day 24 Delegate Roundtable Discussion

Event: Cloud Field Day 24

Appearance: Cloud Field Day 24 Delegate Roundtable Discussion

Company: Tech Field Day

Video Links:

Personnel: Alastair Cooke

This final session of Cloud Field Day 24 features a roundtable discussion with the delegates to explore their impressions of the event and delve into topics not thoroughly covered during the presentations. The delegates, who work with these products day to day in complex environments, aim to discuss pertinent issues and the right solutions for their customers, particularly regarding the evolution of hybrid and private cloud and how it differs from public cloud.

The discussion centered on the shift toward on-premises cloud solutions, highlighted by companies like Oxide and Morpheus, which deliver cloud functionality without replicating the public cloud model. A key theme was the concept of multi-cloud, which enables workload placement and movement based on specific needs, alongside essential observability and management capabilities. Data management, particularly data sovereignty, was identified as a major driver for on-premises solutions, both because of the physics involved in data transfer and because vendors believe current cloud services often fail to meet enterprise data sovereignty requirements.

Further topics included the financial implications of diverse platforms, the rising costs of cloud-based AI agents, and the need for simplification in IT operations amid increasing complexity and staffing challenges. The discussion also addressed the security vulnerabilities in AI and the importance of incorporating security into AI infrastructures from the outset, rather than as an afterthought. Finally, the panel discussed the distinct approaches to cloud solutions, contrasting single-source providers like Oxide with vendors that enable best-of-breed integrations, acknowledging the challenges of integrating new solutions into complex, legacy environments.


Scaling Autonomous IT: The Real Enterprise Impact with Digitate ignio

Event: AI Field Day 7

Appearance: Digitate Presents at AI Field Day 7

Company: Digitate

Video Links:

Personnel: Rajiv Nayan

Discover how Digitate is transforming enterprise operations through an AI-first go-to-market strategy built for global scale and complexity. In this session, Digitate’s GM and VP, Rajiv Nayan, will dive into real-world customer success stories that showcase how Digitate is scaling innovation across industries, building multi-billion dollar business opportunities, and reshaping how businesses run in the AI era. Learn how your enterprise can scale smarter, faster, and more proactively with Digitate: https://digitate.com/ai-agents/

At AI Field Day 7, Rajiv Nayan, Vice President and General Manager of Digitate, presented “Scaling Autonomous IT: The Real Enterprise Impact with Digitate ignio.” Nayan introduced ignio as an agent-based platform designed to bring autonomy to enterprise IT operations, built from the ground up over the past decade and protected by over 110 patents. Targeting a $31 billion global market across retail, pharma, banking, and manufacturing, ignio leverages machine intelligence and agent-based automation to address repetitive, knowledge-driven IT tasks, aiming to shift enterprises from assisted or augmented operations to true autonomy. According to Nayan, the platform is used by over 250 customers worldwide and has earned a high customer satisfaction rating of 4.4 out of 5 on G2.

Nayan illustrated ignio’s capabilities through detailed customer stories. For luxury retailer Tapestry, ignio integrated with IBM Sterling order management, financial, and logistics systems to monitor and optimize the journey of orders across 37 global webfronts. The platform proactively handled issues ranging from data inconsistencies to job cycle failures, ultimately saving the company millions and managing over 100,000 orders. In another case, a large pharmaceutical company with 70 complex system interfaces used ignio to streamline their prosthetic limb supply chain, reducing critical demand planning processes from weeks to hours. A global consumer goods company also employed ignio to automate order fulfillment across SAP systems and manufacturing plants, avoiding disruptions in a high-volume direct-to-store delivery model and preventing millions in potential losses.

At scale, ignio demonstrated significant operational efficiencies for enterprises such as a major pharmaceutical distributor and a retail pharmacy chain. In the distribution environment, ignio handled over 20,000 configuration items and 220 business-critical applications, achieving 80 percent noise reduction in event management and automating 110,000 hours of annual manual work. For the retail pharmacy chain with 9,000 stores, ignio automated ticket management tied to revenue assurance for promotions, reducing mean time to resolution from nearly three days to under ten minutes and recapturing $17 million in revenue while saving $5 million in support costs. Across its deployments, ignio processed 1.2 billion events last year, achieved 87 percent noise reduction, and executed 300 million automated actions—demonstrating that agentic, autonomous IT platforms can significantly reduce business disruptions and free human talent for higher-value work.


Incident Resolution with Digitate’s ignio AI Agent

Event: AI Field Day 7

Appearance: Digitate Presents at AI Field Day 7

Company: Digitate

Video Links:

Personnel: Rahul Kelkar

At AI Field Day 7, Rahul Kelkar, Chief Product Officer at Digitate, presented the capabilities of ignio, an AI-based incident resolution agent designed to automate, augment, and improve IT operations. Ignio uses a logical reasoning model for incident resolution, leveraging enterprise blueprints to understand situations in a closed loop and applying automation where possible. When a fully automated response is not viable, ignio augments human efforts through assisted resolution, supplying prioritized incident lists based on business impact, providing situational context, and capturing both historical and episodic memory about recurring issues. The product integrates with various data sources to build a formal enterprise IT model, supporting information ingestion via templates or extraction from existing documentation, and includes adapters for common ITSM systems like ServiceNow for seamless change management.

Technically, ignio’s core incident resolution operates via automated root cause analysis, performing real-time health checks across application hierarchies—such as applications running on Oracle databases hosted on Red Hat servers—and comparing the current state to baselines to isolate anomalies. It can autonomously apply prescriptive fixes, such as restarting services, and then validate remediation by rechecking stack health. In more complex scenarios, like SAP HANA environments or intricate batch job dependencies in retail order management, ignio handles non-vertical, multi-layered issues involving middleware, business processes, and interdependent batch jobs. The solution features out-of-the-box knowledge for common technologies and allows continuous augmentation with customer-specific logic. Custom operational models and atomic actions can be enhanced using Ignio Studio, and the system learns from user feedback through reinforcement learning, improving accuracy in prioritizing incidents, suggesting fixes, and predicting service level agreement (SLA) violations before they occur.
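The baseline-comparison loop described above can be sketched generically: walk each layer of the stack, compare live metrics to a recorded baseline, and flag layers that drift beyond a tolerance. This is a minimal illustration of the pattern, assuming made-up metric names and thresholds; none of these identifiers come from Digitate's product.

```python
# Hypothetical baseline-driven health check: compare live metrics per
# stack layer to a baseline and report anything exceeding tolerance.
BASELINE = {
    "app":      {"latency_ms": 120, "error_rate": 0.01},
    "database": {"latency_ms": 15,  "error_rate": 0.00},
    "server":   {"cpu_pct": 55,     "error_rate": 0.00},
}

def find_anomalies(live, baseline, tolerance=0.5):
    """Return (layer, metric, value) triples exceeding baseline by > 50%."""
    anomalies = []
    for layer, metrics in live.items():
        for name, value in metrics.items():
            expected = baseline[layer][name]
            if value > expected * (1 + tolerance) + 1e-9:
                anomalies.append((layer, name, value))
    return anomalies

live = {
    "app":      {"latency_ms": 400, "error_rate": 0.02},
    "database": {"latency_ms": 14,  "error_rate": 0.00},
    "server":   {"cpu_pct": 52,     "error_rate": 0.00},
}
print(find_anomalies(live, BASELINE))
```

Isolating the anomalous layer this way is what lets a remediation step (say, a service restart) be targeted, after which the same check is rerun to validate that the stack returned to baseline.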

Ignio extends beyond deterministic resolution to assist engineers and SREs via conversational augmentation. For issues not resolved autonomously, ignio provides contextual insights—including previous incidents, typical resolutions, and guidance on next steps—while collaborating via a “resolution assistant” so humans can contribute domain knowledge and validate procedures. The demo showed proactive recommendation capabilities, identifying dominant recurring SLA violations and offering actionable, prioritized problem management insights. Ignio integrates with multiple agent-based platforms for orchestrated, multi-channel incident management flows, including email, Slack, and ticketing systems, using orchestration protocols and adapters. The platform employs advanced anomaly mining and sequence analysis, allowing users to identify root causes not only within vertical stacks but also through complex temporal and conditional relationships across business functions, ultimately supporting predictive, reactive, and continuous improvement use cases in large-scale enterprise IT environments.


Powering Autonomous IT with ignio AI Agents from Digitate

Event: AI Field Day 7

Appearance: Digitate Presents at AI Field Day 7

Company: Digitate

Video Links:

Personnel: Rahul Kelkar

Digitate is a global provider of an agentic AI platform for autonomous IT operations. Powered by ignio™, Digitate combines unified observability, AI-powered insights, and closed-loop automation to deliver resilient, agile, and self-healing IT and business operations. In this presentation, Digitate’s Chief Product Officer, Rahul Kelkar, will introduce Digitate’s vision for an autonomous enterprise, where organizations learn, adapt, and make decisions with minimal human intervention. Through a series of demos, Rahul will also showcase how Digitate’s purpose-built AI agents work seamlessly across observability, cloud operations, and IT for business to boost cloud ROI, predict delays, and ensure long-term stability. To learn more about Digitate and its ignio platform, visit: https://digitate.com/ai-agents/

At AI Field Day 7, Rahul Kelkar, Chief Product Officer at Digitate, presented on powering autonomous IT with ignio, Digitate’s agentic AI platform designed for IT operations. Kelkar began by outlining the industry’s evolution from manual IT operations and cognitive automation toward modern AIOps and agentic AI, framing the journey towards fully autonomous IT as a progression through stages of manual, task-automated, and augmented operations. He described how ignio leverages a unified three-pillar approach: unified (or business) observability for comprehensive monitoring of both technical and business processes, AI-driven insights using traditional and agentic AI including machine learning and generative models, and closed-loop automation that not only provides recommendations but executes prescriptive actions with high confidence. This architecture aims to proactively eliminate business disruptions due to IT, identify issues before they impact business productivity, and reduce incident resolution times.

Ignio operates on what Digitate calls the “Enterprise Blueprint,” essentially a digital twin or knowledge graph that captures both structural and behavioral aspects of enterprise IT. The platform integrates with common monitoring and IT management tools, ingesting metrics, events, logs, and traces to provide a layered view of health across infrastructure, application stacks, and business value streams. Observability data is enriched with AI-based noise filtering, anomaly detection, correlation, and dynamic thresholding, automatically triaging and suppressing redundant alerts. Kelkar highlighted ignio’s “composite AI” approach, combining logical reasoning (rule-based and machine learning models), analogical reasoning (using generative AI and large language models for contextualization where knowledge is incomplete), and assisted reasoning (bringing domain experts into the loop to validate and tune recommendations). The workflow encompasses agent-based management of event and incident handling, automated root cause analysis, and remediation actions, all while learning from human validation to continuously improve performance.
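One common way to implement the dynamic thresholding and noise filtering mentioned above is to alert only when a sample leaves a band around a rolling statistical baseline, rather than crossing a fixed limit. The sketch below shows that general AIOps technique with a sliding window and a mean ± k·σ band; it is an illustration of the concept, not Digitate's actual model.

```python
import statistics
from collections import deque

# Illustrative dynamic threshold: flag a sample only when it falls
# outside mean ± k standard deviations of a sliding window, so the
# alert band adapts as the metric's normal level shifts.
def dynamic_alerts(samples, window=5, k=3.0):
    history = deque(maxlen=window)
    alerts = []
    for i, x in enumerate(samples):
        if len(history) == window:
            mu = statistics.mean(history)
            sigma = statistics.pstdev(history) or 1e-9
            if abs(x - mu) > k * sigma:
                alerts.append((i, x))
        history.append(x)
    return alerts

metrics = [10, 11, 10, 12, 11, 10, 11, 95, 11, 10]
print(dynamic_alerts(metrics))
```

A fixed threshold would either miss the spike or fire constantly as the baseline drifted; adapting the band to recent history is what suppresses the redundant alerts the paragraph describes.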

The platform is designed to address complex, applied use cases in large-scale, modern environments, such as event reduction, proactive incident management, observability, patch management, and cost optimization across multi-cloud and containerized workloads. ignio supports integrations through out-of-the-box adapters for 30-40 major tools (with customization for specialized environments) and specific modules for SAP applications, batch scheduling, digital workspaces, and procurement processes. Its agentic capability is extended with Ignio Studio, empowering SREs and IT operations teams to continuously extend and customize workflows. As demonstrated, ignio’s AI agents interact via conversational interfaces, notifications, and dashboards, enabling a shift to smaller, cross-functional SRE teams—supported by autonomous agents handling the bulk of monitoring, triage, and remediation, with humans focusing on governance, validation, and improvement. This supports a vision of truly autonomous, resilient IT operations that adapt rapidly to changing workloads and technologies, minimizing disruptions and keeping business-critical systems running smoothly.


The Technical Foundations of Articul8’s Agentic AI Platform

Event: AI Field Day 7

Appearance: Articul8 Presents at AI Field Day 7

Company: Articul8

Video Links:

Personnel: Arun Subramaniyan, Renato Nascimento

Dr. Renato Nascimento, Head of Technology at Articul8, and Dr. Arun Subramaniyan, Founder and CEO, presented the technical architecture and capabilities of the Articul8 platform at AI Field Day 7. The platform is built to enable orchestration and management of hundreds or thousands of domain- and task-specific AI models and agents at enterprise scale, supporting both cloud and on-premises deployments on all major cloud providers. The core architecture leverages Kubernetes for elasticity, high availability, and robust isolation of components. Key elements include a horizontally scalable API service layer and a proprietary “model mesh orchestrator,” which coordinates dynamic, low-latency, real-time executions across a variety of AI models deployed for customer-specific workloads. Observability, auditability, and compliance features are integrated at the intelligence layer, allowing enterprises to track, validate, and meet regulatory requirements for SOC and other audit demands.

At the heart of the platform is the automated construction and utilization of knowledge graphs, which are generated during data ingestion without manual annotation. For example, Articul8 demonstrated the ingestion and analysis of a 200,000-page aerospace dataset, generating a knowledge graph with over 6 million entities, 160,000 topics, 800,000 images, and 130,000 tables. The system automatically identifies topics, clusters, and semantic relationships, enabling precise search, reasoning, and model flows (e.g., distinguishing charts from other images, and applying task-specific models to tables for OCR, summary statistics, and content understanding). This knowledge graph forms the substrate for supporting both the training and real-time inference of domain-specific and task-specific AI models. The Model Mesh intelligence layer breaks down incoming data, determines its type, and routes it through appropriate model pipelines for processing, ensuring that the architecture can support both large and small models as appropriate for the data and task complexity.
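The type-based routing idea behind the model mesh can be sketched in a few lines: classify each ingested item, then dispatch it to a pipeline suited to its type. The classifier rules and pipeline names below are stand-ins invented for illustration, not Articul8's actual models or APIs.

```python
# Hypothetical sketch of type-based routing: each ingested item is
# classified, then dispatched to a type-specific pipeline.
def classify(item):
    if item.get("rows"):
        return "table"
    if item.get("pixels"):
        return "image"
    return "text"

# Stand-in pipelines; in a real mesh each would invoke a model
# suited to the content type (OCR for tables, vision for images, etc.).
PIPELINES = {
    "table": lambda item: f"table:{len(item['rows'])} rows",
    "image": lambda item: f"image:{item['pixels']} px",
    "text":  lambda item: f"text:{len(item['body'].split())} words",
}

def route(item):
    return PIPELINES[classify(item)](item)

docs = [
    {"body": "rail optimized design notes"},
    {"rows": [[1, 2], [3, 4]]},
    {"pixels": 640 * 480},
]
print([route(d) for d in docs])
```

Keeping the dispatch table separate from the classifier is the design point: new content types, or larger or smaller models for an existing type, can be added without touching the routing logic.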

The platform also showcases advanced agentic functionalities such as the creation of digital twins—AI-powered proxies of individuals or departments—which can be quickly spun up from public or private data and progressively improved through feedback and additional data integration. In an illustrative demo, Articul8 built digital twins of AI Field Day participants and orchestrated live, multi-agent discussions on technical topics. The platform supports squad-mode interactions, wherein multiple digital twins can collaborate, offer opinions, revise answers, and converge or diverge in real-time analysis. All these actions are fully tracked and auditable, supporting enterprise security and access controls. The discussion outcomes are summarized and can be exported, making the platform suitable not only for typical enterprise Q&A and knowledge retrieval, but also for scenario planning, decision support, and collaborative agentic workflows in secure, controlled environments.