Arista CloudVision 360° Observability for AI

This video is part of the appearance, "Arista Presents at Networking Field Day 40". It was recorded as part of Networking Field Day 40 at 8:00-9:30 on April 9, 2026.

Monitoring and managing complex AI infrastructure requires moving beyond traditional networking tools that treat the environment as a black box. Praful Bhaidasna explains that the industry has long suffered from a "mean time to truth" problem, where network operators are blamed for issues they cannot properly diagnose because they lack visibility into what is connected to the network. Arista aims to change this Stone Age approach by evolving from simple monitoring to 360-degree observability. This strategy is centered on CloudVision, a NetOps platform that uses a common network data lake called NetDL to aggregate high-fidelity streaming telemetry from every Arista device across the data center, campus, and WAN.

The architecture relies on the fact that Arista’s EOS provides consistent, reliable state data, ranging from MAC address tables and routing updates to microburst signals and configuration changes. This information is stored in a time-series database, allowing operators to travel back in time to compare network states before and after an incident. To manage the resulting deluge of data, Arista employs an AI/ML engine known as AVA, or Autonomous Virtual Assist. AVA identifies patterns and anomalies, filtering out the noise to show only the relevant signals. This allows human operators to focus on making informed decisions rather than spending hours manually correlating events across different silos.
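The "travel back in time" idea can be illustrated with a minimal Python sketch. It is a toy model, not CloudVision's or NetDL's actual API: the `StateHistory` class, the `diff_states` helper, and the key names are all hypothetical, and the sketch assumes state arrives as timestamped key/value updates, with `None` standing for a deleted entry.

```python
from bisect import bisect_right


class StateHistory:
    """Toy time-series store of (timestamp, key, value) state updates (hypothetical)."""

    def __init__(self):
        self._updates = {}  # key -> list of (timestamp, value), kept sorted by timestamp

    def record(self, timestamp, key, value):
        """Store one update; a value of None marks the key as deleted."""
        series = self._updates.setdefault(key, [])
        series.append((timestamp, value))
        series.sort(key=lambda update: update[0])

    def snapshot(self, timestamp):
        """Rebuild the state as it was at the given timestamp."""
        state = {}
        for key, series in self._updates.items():
            times = [t for t, _ in series]
            idx = bisect_right(times, timestamp)  # updates at or before timestamp
            if idx and series[idx - 1][1] is not None:
                state[key] = series[idx - 1][1]
        return state


def diff_states(before, after):
    """Report keys that were added, removed, or changed between two snapshots."""
    return {
        "added": {k: after[k] for k in after.keys() - before.keys()},
        "removed": {k: before[k] for k in before.keys() - after.keys()},
        "changed": {
            k: (before[k], after[k])
            for k in before.keys() & after.keys()
            if before[k] != after[k]
        },
    }


history = StateHistory()
history.record(100, "mac/00:1c:73:aa:bb:cc", "Ethernet1")
history.record(100, "route/10.0.0.0/24", "via 192.0.2.1")
history.record(205, "mac/00:1c:73:aa:bb:cc", "Ethernet7")   # MAC moved ports
history.record(210, "route/10.0.0.0/24", None)              # route withdrawn
history.record(220, "route/10.1.0.0/24", "via 192.0.2.9")   # new route learned

# Compare the state just before an incident (t=200) with the state after it (t=300).
incident_diff = diff_states(history.snapshot(200), history.snapshot(300))
print(incident_diff)
```

Keeping every update rather than only the latest value is what makes the before-and-after comparison possible: any past state can be rebuilt on demand and diffed against any other.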

Furthermore, CloudVision has opened its ecosystem to ingest data from third-party systems, AI job orchestrators, and compute and storage metrics via Prometheus. This integration is critical for AI environments where a job stall could be caused by anything from a GPU failure to a NIC issue. Arista has introduced a dedicated AI jobs dashboard that correlates specific training jobs with the underlying flows, servers, and switches. To simplify interactions with this massive dataset, a digital virtual assistant allows users to query their infrastructure using natural language. This integrated approach ensures that expensive GPU resources do not sit idle and that the resolution of complex performance bottlenecks can happen in minutes rather than days.
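As a rough illustration of what correlating a training job with its underlying servers and switches involves, the Python sketch below gathers every event that touches a job's infrastructure during the job's run. It does not reflect how the AI jobs dashboard is implemented: the `Event` type, the `events_for_job` function, and all host and switch names are hypothetical, and the sketch assumes the job-to-server and server-to-switch mappings are already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float
    source: str   # hypothetical category, e.g. "gpu", "nic", "switch"
    entity: str   # server or switch the event was reported for
    message: str


def events_for_job(job_servers, server_to_switch, events, start, end):
    """Return events on the job's servers or their switches within [start, end]."""
    switches = {server_to_switch[s] for s in job_servers if s in server_to_switch}
    entities = set(job_servers) | switches
    relevant = [
        e for e in events
        if e.entity in entities and start <= e.timestamp <= end
    ]
    return sorted(relevant, key=lambda e: e.timestamp)


# Hypothetical inventory and event feed.
server_to_switch = {"gpu-node-1": "leaf-1", "gpu-node-2": "leaf-1", "gpu-node-9": "leaf-4"}
events = [
    Event(120.0, "switch", "leaf-4", "interface flap"),        # unrelated switch
    Event(130.0, "nic", "gpu-node-2", "link errors rising"),
    Event(125.0, "switch", "leaf-1", "microburst on Ethernet3"),
    Event(500.0, "gpu", "gpu-node-1", "ECC error"),            # after the job window
]

for event in events_for_job(["gpu-node-1", "gpu-node-2"], server_to_switch, events, 100.0, 200.0):
    print(event.timestamp, event.source, event.entity, event.message)
```

Placing compute, NIC, and switch events on one timeline scoped to a single job is the step that lets an operator see whether a stall began in the network or on a host.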

Personnel: Praful Bhaidasna
