Follow on Twitter using the following hashtags or usernames: #NFD40
In this session, Mike Hoffman, co-founder of NetAI, discusses the critical role of deterministic root cause analysis as a prerequisite for safe autonomous network operations. He explains why current AIOps solutions still necessitate manual intervention and how NetAI’s graph neural network (GNN) technology provides a verifiable diagnostic layer that bridges the gap between observability and automated action. Hoffman draws on his extensive industry experience to highlight the evolution of troubleshooting and the current industry stall where traditional AIOps tools often provide only best-guess scenarios rather than definitive answers.
The presentation argues that the fundamental flaw in modern network management is the reliance on manual workflows and reactive tools that fail to provide actionable intelligence. Hoffman contrasts NetAI’s approach with Large Language Models, noting that while LLMs excel at processing words, networks are inherently graphs composed of complex relationships between nodes and edges. By utilizing a GNN-based engine, NetAI maintains a constant understanding of network topology and data flows, allowing the system to recognize anomalies firsthand. This architectural choice eliminates the need to rely on secondhand alerts from disparate devices, which often lead to the “swivel-chair” effect, where operators must jump between multiple point tools to verify issues.
By providing a deterministic layer between observability and automation, NetAI claims to achieve accelerated mean time to repair, often resolving correlation and root cause analysis within seconds rather than minutes or hours. Hoffman emphasizes that autonomous operations are unattainable without a foundation of trust, which can only be built through verifiable accuracy. NetAI’s goal is to replace the traditional process of elimination with a precise diagnostic that identifies the specific root cause capable of clearing thousands of downstream tickets. This high level of precision aims to give operators the confidence to move away from best-guess troubleshooting and toward a truly autonomous, self-healing network environment.
Personnel: Mike Hoffman
In this session, Dr. Deepak Kakadia, founder and CEO of NetAI, discusses the technical architecture of NetAI’s graph neural network (GNN) and how it provides deterministic root cause analysis for autonomous network operations. Kakadia leverages his experience at Sun Microsystems, Verizon Labs, and Google to explain why traditional AIOps and Large Language Models (LLMs) often fail in networking environments. He argues that while LLMs are designed to model human language and behavior, networks are human-built structures that are better represented as mathematical graphs, allowing for a more precise and deterministic approach to troubleshooting.
The presentation details how NetAI’s GNN-based engine captures the structural relationships between routers and edges, mapping protocol layers and topology to identify exact root causes rather than statistical guesses. Unlike LLMs, which require exhaustive training on every possible permutation of alarms and symptoms, GNNs utilize the inherent causal relationships of the network to provide verifiable diagnostics. This approach eliminates the “best guess” nature of probabilistic models, reducing the burden on network engineers who would otherwise have to manually verify AI-generated suggestions. The system acts as a digital twin that records the state of the network at every timestamp, enabling historical analysis of intermittent issues that are notoriously difficult to replicate.
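The idea of recording network state at every timestamp so that intermittent issues can be examined after the fact can be illustrated with a small sketch. This is not NetAI’s implementation, which the session does not describe at this level; the `StateHistory` class, its method names, and the interface key `if:LA:ge0` are all hypothetical, and a simple sorted list stands in for whatever storage a real system would use.

```python
import bisect


class StateHistory:
    """Hypothetical per-entity state log, queryable by timestamp."""

    def __init__(self):
        self._times = {}   # entity key -> sorted list of timestamps
        self._values = {}  # entity key -> values aligned with _times

    def record(self, key, timestamp, value):
        """Store a state change, keeping timestamps sorted."""
        times = self._times.setdefault(key, [])
        values = self._values.setdefault(key, [])
        idx = bisect.bisect_right(times, timestamp)
        times.insert(idx, timestamp)
        values.insert(idx, value)

    def state_at(self, key, timestamp):
        """Return the most recent state at or before timestamp, or None."""
        times = self._times.get(key, [])
        idx = bisect.bisect_right(times, timestamp)
        if idx == 0:
            return None
        return self._values[key][idx - 1]

    def changes_between(self, key, start, end):
        """Return (timestamp, value) pairs recorded within [start, end]."""
        times = self._times.get(key, [])
        values = self._values.get(key, [])
        lo = bisect.bisect_left(times, start)
        hi = bisect.bisect_right(times, end)
        return list(zip(times[lo:hi], values[lo:hi]))


# Hypothetical interface that flaps briefly between t=160 and t=165.
history = StateHistory()
history.record("if:LA:ge0", 100, "up")
history.record("if:LA:ge0", 160, "down")
history.record("if:LA:ge0", 165, "up")

state_during_flap = history.state_at("if:LA:ge0", 162)
flap_events = history.changes_between("if:LA:ge0", 150, 170)
```

With a log like this, a five-second flap that has already cleared by the time an engineer looks can still be found by querying the window in which the symptoms were reported.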
Kakadia emphasizes that NetAI is a product-focused company offering a rapid-deployment, containerized solution that can run on-premises in air-gapped environments or in the cloud. By integrating with existing observability data and automation scripts, NetAI fills the gap between identifying a problem and taking corrective action, effectively enabling self-healing network capabilities. The session concludes by highlighting the tool’s ability to lower Mean Time to Repair (MTTR) and improve productivity by allowing engineers to focus on root causes rather than downstream correlated alarms. While focused strictly on the networking stack from Layer 1 to Layer 4, the platform provides deep insights that help organizations rule out network issues during complex application outages.
Personnel: Deepak Kakadia
In this functional architecture deep dive, Irfan Lateef, Sales Engineering and Business Development lead, demonstrates the practical application of NetAI’s graph neural network (GNN) for large-scale networking. Lateef details the platform’s multi-layered ingestion process, which pulls configuration data via SSH CLI to build a comprehensive graph of the network, alongside real-time telemetry from SNMP, syslog, and gNMI. This data is processed on high-performance NVIDIA H100 GPUs to perform fault management, correlation, and anomaly detection. The system provides a multi-layer topology visualization that spans from physical links and Layer 3 routing to complex overlays like MPLS, VXLAN, and GRE tunnels, allowing operators to see exactly how issues propagate across the network fabric.
The presentation features a live demonstration where a simulated link failure between Los Angeles and New York triggers a cascade of OSPF and interface alarms. Unlike traditional tools that would flood an operator with thousands of separate tickets, the GNN engine distills these into a single deterministic root cause. Lateef showcases the “Evidence Timeline” and “Causal Chain,” which provide a human-readable explanation of how the AI arrived at its conclusion, tracing the blast radius from the initial configuration change through downstream symptoms. This transparency is designed to build the operator trust necessary for auto-remediation, where the system can automatically execute scripts, such as a “no shutdown” command, to resolve the issue in seconds, effectively achieving Level 4 or 5 autonomous operations.
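To make the idea of collapsing a cascade of alarms into one root cause concrete, here is a minimal sketch that uses a plain dependency-graph walk. It is not NetAI’s GNN engine and makes no claim about how that engine works internally; the function `find_root_causes`, the entity names, and the `depends_on` map are all hypothetical. The assumption is simply that each entity lists what it depends on, and that an alarmed entity is a symptom if anything upstream of it is also alarmed.

```python
def find_root_causes(depends_on, alarmed):
    """Return alarmed entities that have no alarmed upstream dependency."""
    alarmed = set(alarmed)
    roots = set()
    for entity in alarmed:
        # Walk upstream dependencies depth-first.
        stack = list(depends_on.get(entity, ()))
        seen = set()
        is_symptom = False
        while stack:
            dep = stack.pop()
            if dep in seen:
                continue
            seen.add(dep)
            if dep in alarmed:
                # Something upstream is also alarmed, so this is a symptom.
                is_symptom = True
                break
            stack.extend(depends_on.get(dep, ()))
        if not is_symptom:
            roots.add(entity)
    return roots


# Hypothetical slice of topology: an OSPF adjacency rides on two interfaces,
# both interfaces ride on one physical link, and a route rides on the adjacency.
depends_on = {
    "ospf:LA-NY": ["if:LA:ge0", "if:NY:ge0"],
    "if:LA:ge0": ["link:LA-NY"],
    "if:NY:ge0": ["link:LA-NY"],
    "route:10.0.0.0/8": ["ospf:LA-NY"],
}

# Five alarms arrive after the link fails.
alarmed = [
    "ospf:LA-NY",
    "if:LA:ge0",
    "if:NY:ge0",
    "link:LA-NY",
    "route:10.0.0.0/8",
]

roots = find_root_causes(depends_on, alarmed)
```

In this toy case the five alarms reduce to the single entity `link:LA-NY`. The result is deterministic because it follows only from the recorded dependencies, which is the property the session attributes to a graph-based approach; a production system would also need to handle missing alarms, timing, and multiple simultaneous faults.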
Addressing the practicalities of deployment, Lateef explains that NetAI offers flexible models including air-gapped on-premises installations for security-conscious tier-one operators and cloud-based deployments for rapid scalability. The platform is designed to replace “AIOps fatigue” with a tool that delivers immediate ROI by focusing on materially significant anomalies rather than subjective noise. By integrating with existing ITSM tools like Jira and ServiceNow, NetAI aims to be the single pane of glass that bridges the gap between different technical silos. The session concludes by emphasizing that while LLMs are limited to what they have been trained on, the GNN’s structural understanding of network protocols allows it to solve novel problems deterministically, reducing Mean Time to Repair (MTTR) by a factor of ten.
Personnel: Irfan Lateef