Selector AIOps is a game-changing full-stack observability platform that uses AI/ML to close the network and infrastructure visibility gap. Traditional observability solutions have failed to keep pace with the increasing complexity and scale of modern networks, leading to prolonged Mean Time To Resolution (MTTR) and operational inefficiencies. Selector AIOps provides actionable insights by correlating events across the entire IT stack (network, infrastructure, and application layers), enabling enterprises to proactively identify, troubleshoot, and resolve issues, ultimately improving network uptime and business continuity.
Selector, founded in 2019 with an AI/ML-first approach, recognized that existing observability tools were failing to provide adequate visibility in the networking space, leading to costly downtime for complex networks. Its founding team, with deep networking expertise from companies like Juniper and Google, set out to fill this gap by building a solution from the ground up with AI and machine learning at its core, enabling rapid identification and resolution of issues.
Selector’s core technology uses an ML-driven autocorrelation approach to correlate all telemetry across the full stack (network, application, infrastructure, and cloud), explicitly tying insights back to the network. This allows users to quickly answer critical questions like “was it the application, infrastructure, or network?” and provides actionable “who, what, when, where, how” details for resolution in seconds or minutes, not hours or days. The platform reduces alert fatigue, automates incident response, and consolidates disparate tools into a single pane of glass, cutting tool sprawl and technical debt. By democratizing access to data through AI, Selector eliminates the need for deep data science knowledge. This approach has earned significant market validation: Fortune 1000 companies now make up 80% of Selector’s customer base. For these customers, Selector addresses challenges like data silos, tooling debt, and “escalation chaos,” significantly reducing the number of engineers required to handle P1 incidents.
Selector emphasizes that simply applying generic AI tools to a data lake isn’t sufficient; the data requires context and intelligence, and success depends on unifying networking/operations expertise with AI/ML engineering. Selector addresses this by integrating both skill sets within its team, ensuring purpose-built solutions that prevent “dirty data in, dirty outcomes out.” While the platform offers extensive automation capabilities, including running runbooks and workflows through partnerships with tools like Ansible and Puppet, the extent of full automation depends on customer readiness, and customers often keep a human in the loop for critical changes. Customers use Selector to build unified Network Management Systems (NMS), perform AI-driven root cause analysis, conduct synthetic monitoring with smart alerts, and correlate textual data with LLM integration. The platform also includes an “operational twin” for “what-if” analysis without impacting production and can be integrated into CI/CD pipelines for testing changes before deployment.
Personnel: Stephen Ochs
Enterprises are faced with overwhelming challenges trying to manage their hybrid and multi-cloud networks. AIOps platforms have emerged as a potential solution to help network operators manage their complex environments. At Selector, we believe that bringing network and AI/ML expertise together is critical to bridging the gap between network teams and AI teams to deliver solutions to these challenges. This presentation will cover how Selector brings a data-centric approach to AI/ML and the Selector AI/ML stack, which powers its network-aware, closed-loop AIOps platform. Surya Nimmagadda, Chief Data Scientist at Selector, highlighted his company’s unique blend of deep networking knowledge and AI/ML proficiency. This combination addresses a significant disconnect in network operations where specialized network teams often lack AI skills, and AI teams lack crucial network context. Selector aims to fill this void by providing network-specific observability, emphasizing that direct customer collaboration and feedback are paramount, as “without the feedback, there is no AI.”
Selector’s foundational philosophy is a “data-centric approach” to AI/ML, prioritizing meticulous data curation and cleaning over constant model iteration. The company believes that once data is properly prepared and understood, a process it likens to refining “oil,” models can perform exceptionally well. The platform supports flexible deployment options, including Selector’s public cloud, on-premises behind corporate firewalls for data security, or hybrid models, all built on a Kubernetes-based architecture. This architecture delivers near real-time insights, with processing times often under five to ten minutes, which is critical for dynamic network environments. Selector maintains strict data privacy: each customer’s data is housed in a dedicated instance, which prevents cross-customer data leakage and lets models be tailored specifically to that customer’s environment rather than relying on broad anonymization.
The Selector AI/ML stack is structured in four layers: Ingest, Enrichment, Network Intelligence, and Agentic AI. The Ingest layer is designed to handle millions of diverse data points per minute from various sources (metrics, logs, events, and unstructured data) across multi-vendor environments, using push, pull, API, or message bus mechanisms. The Enrichment layer, considered Selector’s “secret sauce,” automatically cleans, normalizes, and contextualizes this raw data through a declarative ETL (Extract, Transform, Load) system, establishing crucial network relationships and metadata. The Network Intelligence layer then applies traditional, explainable machine learning models, such as statistical and regression analysis for metrics and natural language processing (NLP) for logs. This layer establishes baselines, identifies anomalies, and translates disparate log messages (e.g., “link down” across different vendors) into unified, contextualized events, often correlating hundreds of individual anomalies into a single, actionable insight. Finally, the Agentic AI layer passes these refined and correlated insights to Large Language Models (LLMs), generating actionable recommendations, automating responses, and reducing operational fatigue by turning complex data into clear, concise guidance for network operators.
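The log-unification and correlation steps described above can be pictured with a small sketch. This is not Selector's implementation: the regular expressions, the `normalize` and `correlate` function names, the event fields, and the fixed five-minute bucketing are all assumptions chosen for illustration, and the sample log lines only imitate common vendor formats.

```python
import re
from collections import defaultdict

# Hypothetical vendor-specific patterns that all mean "link down".
# Real deployments would need many more patterns (or an NLP model).
LINK_DOWN_PATTERNS = [
    re.compile(r"%LINK-3-UPDOWN: Interface (?P<ifname>\S+), changed state to down"),
    re.compile(r"SNMP_TRAP_LINK_DOWN: .*ifName (?P<ifname>\S+)"),
]


def normalize(device, timestamp, message):
    """Return a unified event dict, or None if the message is not recognized."""
    for pattern in LINK_DOWN_PATTERNS:
        match = pattern.search(message)
        if match:
            return {
                "device": device,
                "ts": timestamp,
                "type": "link_down",
                "interface": match.group("ifname"),
            }
    return None


def correlate(events, window_seconds=300):
    """Group events of the same type that fall into the same time bucket.

    Fixed buckets are an assumed simplification; they stand in for
    whatever correlation logic a real platform applies.
    """
    groups = defaultdict(list)
    for event in events:
        bucket = event["ts"] // window_seconds
        groups[(event["type"], bucket)].append(event)
    return [
        {
            "type": event_type,
            "window_start": bucket * window_seconds,
            "devices": sorted({e["device"] for e in grouped}),
            "count": len(grouped),
        }
        for (event_type, bucket), grouped in sorted(groups.items())
    ]


raw_logs = [
    ("r1", 1000, "%LINK-3-UPDOWN: Interface GigabitEthernet0/1, changed state to down"),
    ("r2", 1100, "SNMP_TRAP_LINK_DOWN: ifIndex 531, ifOperStatus down(2), ifName ge-0/0/1"),
    ("r3", 1150, "user login ok"),
]
events = [e for e in (normalize(*log) for log in raw_logs) if e is not None]
insights = correlate(events)
```

In this sketch the two differently worded link-down messages collapse into one `link_down` insight covering both devices, while the unrelated login message is ignored.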
Personnel: Surya Nimmagadda
Selector’s AI agent architecture is built to interact with enriched data and surface insights through a natural language interface, available through web portals and integrations like Slack and Teams. It changes how users chat with their network to uncover critical information. The initial approach, conceived in 2023 at the advent of public LLMs, translated natural language questions into STQL (Selector Query Language), ran a single query, and sent the results to an LLM for reasoning. This single-shot method proved insufficient for complex networking issues, which often require multiple data sources and iterative steps. Limitations included small context windows, the need for pre-seeded translation phrases, and inaccurate natural language understanding, such as named entity recognition (NER) that misread domain terms.
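A minimal sketch of that single-shot flow may help show where it breaks down. Everything here is hypothetical: the seeded phrases, the placeholder query strings (which are not real STQL syntax), and the `run_query` and `ask_llm` callables, which stand in for the query engine and the LLM.

```python
# Hypothetical phrase table: pre-seeded phrases mapped to placeholder queries.
# The query strings are illustrative only and are NOT real STQL.
SEEDED_PHRASES = {
    "interface errors": "PLACEHOLDER_QUERY interface_errors",
    "bgp flaps": "PLACEHOLDER_QUERY bgp_flaps",
}


def translate(question):
    """Map a question to a query using only the pre-seeded phrases."""
    text = question.lower()
    for phrase, query in SEEDED_PHRASES.items():
        if phrase in text:
            return query
    return None  # unseen phrasing cannot be translated


def single_shot(question, run_query, ask_llm):
    """Translate once, query once, then ask the LLM to reason over the rows."""
    query = translate(question)
    if query is None:
        return "Could not translate the question."
    rows = run_query(query)  # exactly one query, no follow-up steps
    prompt = f"Question: {question}\nData: {rows}\nAnswer using only the data."
    return ask_llm(prompt)


# Stub back ends so the sketch runs without a query engine or an LLM.
def fake_run_query(query):
    return [{"device": "r1", "errors": 42}]


def fake_ask_llm(prompt):
    return f"LLM saw {len(prompt)} characters"


answer = single_shot("Show interface errors on r1", fake_run_query, fake_ask_llm)
```

Because the flow runs exactly one query, a question that needs routing, firewall, and cloud data together has no way to gather it, and a question phrased outside the seeded table fails at the translation step.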
Selector evolved to a multi-turn reasoning AI agent architecture to address these early shortcomings. The system employs the ReAct (reason and act) pattern, allowing agents to iteratively plan, execute tools, and observe results until they reach a final answer. The architecture is three-tiered. A central orchestrator agent acts as the “general contractor,” planning and coordinating tasks while maintaining conversational context. It dispatches requests to domain-specific worker agents, such as firewall, load balancer, or cloud observability agents, which act as “specialized plumbers or electricians.” These worker agents, which are also ReAct agents, have a limited set of tools (the “tool belt”) available via the Model Context Protocol (MCP) and focus solely on answering their specific query before reporting back to the orchestrator. Selector emphasizes LLM agnosticism, offering connections to private, enterprise, or secure public LLMs (like Gemini with guarantees that customer data is not used for training), and provides extensive auditability and traceability, logging every agent action and decision through OpenTelemetry and MongoDB.
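The plan, act, observe loop and the orchestrator/worker split can be sketched in a few lines. This is an illustrative toy, not Selector's code: the `react_loop` helper, the action dictionary format, the `firewall_agent` worker, and the `lookup_denies` tool are all hypothetical, and the scripted `decide` functions stand in for LLM calls.

```python
def react_loop(goal, decide, tools, max_steps=5):
    """Generic plan/act/observe loop.

    `decide` stands in for an LLM: given the goal and the history of
    (tool, input, observation) tuples, it returns either a tool call
    or a final answer.
    """
    history = []
    for _ in range(max_steps):
        action = decide(goal, history)
        if action["type"] == "final":
            return action["answer"], history
        observation = tools[action["tool"]](action["input"])
        history.append((action["tool"], action["input"], observation))
    return "No answer within the step budget.", history


# --- Worker agent: a small "tool belt" and one narrow job -----------------
def lookup_denies(query):
    """Hypothetical tool returning a canned count of denied flows."""
    return 3


def firewall_decide(goal, history):
    if not history:
        return {"type": "tool", "tool": "lookup_denies", "input": goal}
    return {"type": "final", "answer": f"{history[-1][2]} denied flows found"}


def firewall_agent(query):
    answer, _ = react_loop(query, firewall_decide, {"lookup_denies": lookup_denies})
    return answer


# --- Orchestrator: its "tools" are the worker agents ----------------------
def orchestrator_decide(goal, history):
    if not history:
        return {"type": "tool", "tool": "firewall_agent", "input": "denies for app-1"}
    return {"type": "final", "answer": f"Likely cause: {history[-1][2]}"}


answer, trace = react_loop(
    "Why is app-1 unreachable?",
    orchestrator_decide,
    {"firewall_agent": firewall_agent},
)
```

The returned `trace` is the list of every tool call and observation, which is the kind of record an audit log would capture. The `max_steps` budget is an assumed safeguard against loops that never converge.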
The platform supports two main integration patterns: using Selector’s orchestrator with customer-provided tools, or customers integrating their own agentic ecosystems with Selector’s MCP toolset. This enables guided remediation, from creating maintenance windows and alert rules to executing Ansible playbooks and integrating with third-party workflow engines like Itential for configuration changes. Joby Rudolph demonstrated these capabilities by showcasing how an agent can diagnose unreachable applications by synthesizing data from synthetics, routing, anomalies, and cloud agents, providing both root cause analysis and recommended actions. He also showed a closed-loop remediation where an agent used an external Itential workflow to correct a typo in a device configuration, highlighting the system’s extensibility and ability to foster trust by allowing human-in-the-loop validation for critical actions. The architecture’s flexibility allows customers to define new agents and tools, leveraging Selector’s underlying enriched data for comprehensive network observability and management.
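The human-in-the-loop validation mentioned above amounts to an approval gate in front of critical actions. The sketch below shows one way to express that idea; the `Action` type, the `critical` flag, and the `run` and `approve` callables are assumptions for illustration and do not describe Selector's or Itential's actual interfaces.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """A proposed remediation step (hypothetical structure)."""
    name: str
    critical: bool


def execute(action, run, approve, audit_log):
    """Run non-critical actions directly; critical ones need approval first.

    Every decision is appended to `audit_log` so it can be reviewed later.
    """
    if action.critical and not approve(action):
        audit_log.append((action.name, "denied"))
        return f"skipped {action.name}: approval denied"
    result = run(action)
    audit_log.append((action.name, "executed"))
    return result


# Stubs: a workflow runner and an operator who rejects every critical change.
def run_workflow(action):
    return f"ran {action.name}"


def operator_says_no(action):
    return False


log = []
first = execute(Action("create_maintenance_window", critical=False), run_workflow, operator_says_no, log)
second = execute(Action("push_config_fix", critical=True), run_workflow, operator_says_no, log)
```

Here the maintenance window is created without asking anyone, while the configuration change is held back because the operator declined it.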
Personnel: Joby Rudolph