In this presentation by Arista at Mobility Field Day 14, Suparna Dam introduces the on-premises architecture of the AGNI product, focusing on its scaling capabilities and the innovative AGNI Cluster Config Manager (ACCM). The session covers how AGNI delivers features like updates, remote monitoring through a secure HTTPS tunnel, and data protection, ensuring the cluster remains functional even during temporary tunnel disruptions. The discussion details the node structure, including principal, standby, and auxiliary nodes, and outlines deployment flexibility across different scaling tiers. Dam also addresses the real-world operational challenges of managing large network access control (NAC) clusters and demonstrates how ACCM resolves these issues by transitioning from a monolithic setup to a highly managed, localized architecture.
The presentation emphasizes that while AGNI is available as both a cloud and an on-premises solution, the on-premises deployment introduces a unique architecture that maintains a secure HTTPS tunnel to the AGNI Cloud. This tunnel allows Arista’s site reliability engineering (SRE) team to proactively monitor system health metrics such as CPU and disk usage without exposing customer data, and the cluster keeps operating if connectivity drops. On-premises clusters are built from three types of nodes: a read-write principal node acting as the administrative leader, a standby node serving as a designated survivor for high availability, and auxiliary nodes used for horizontal scaling. Dam explains that the system supports up to 500,000 sessions per cluster and requires a replication latency of under 700 milliseconds alongside Layer 3 connectivity across all nodes, so configuration management across geographically diverse environments has to be handled precisely.
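The constraints described above can be expressed as a simple pre-deployment check. The following is a minimal sketch, not part of AGNI: the `Node` class, the `validate_cluster` function, and the rule of exactly one principal per cluster are assumptions made for illustration, and only the 500,000-session and 700-millisecond figures come from the session.

```python
from dataclasses import dataclass
from typing import List

# Figures quoted in the session; everything else in this sketch is hypothetical.
MAX_SESSIONS_PER_CLUSTER = 500_000
MAX_REPLICATION_LATENCY_MS = 700  # latency must stay under this value

KNOWN_ROLES = {"principal", "standby", "auxiliary"}


@dataclass
class Node:
    name: str
    role: str                      # "principal", "standby", or "auxiliary"
    replication_latency_ms: float  # assumed to be measured toward the principal
    l3_reachable: bool             # Layer 3 connectivity to the other nodes


def validate_cluster(nodes: List[Node], expected_sessions: int) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []

    # Assumption: a cluster has exactly one read-write principal.
    principals = [n for n in nodes if n.role == "principal"]
    if len(principals) != 1:
        problems.append(f"expected exactly one principal, found {len(principals)}")

    for n in nodes:
        if n.role not in KNOWN_ROLES:
            problems.append(f"{n.name}: unknown role {n.role!r}")
        if n.replication_latency_ms >= MAX_REPLICATION_LATENCY_MS:
            problems.append(
                f"{n.name}: replication latency {n.replication_latency_ms} ms "
                f"is not under {MAX_REPLICATION_LATENCY_MS} ms"
            )
        if not n.l3_reachable:
            problems.append(f"{n.name}: no Layer 3 connectivity")

    if expected_sessions > MAX_SESSIONS_PER_CLUSTER:
        problems.append(
            f"{expected_sessions} sessions exceeds the per-cluster limit "
            f"of {MAX_SESSIONS_PER_CLUSTER}"
        )
    return problems
```

A check like this would flag, for example, an auxiliary node at a remote site whose replication latency is 850 milliseconds, while leaving the rest of the cluster definition untouched.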
To overcome the inherent risks of managing massive monolithic clusters, such as the global impact of an incorrect configuration or tedious node-by-node upgrades, Arista introduced ACCM to decouple cluster management from execution. This centralized control plane breaks large deployments into smaller, distinct clusters managed through a single interface, and it includes a dedicated staging cluster for safely testing changes. Operators can modify and validate configurations or system upgrades on the staging cluster, save the state with a tracking tag, and then selectively roll out the audited changes to specific production clusters one at a time. The accompanying live demo illustrates this workflow by creating successive configuration tags, pushing a localized differential policy update, and triggering an automated cluster upgrade that updates principal and secondary nodes in sequence without manual, node-by-node intervention.
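The stage, tag, and selective rollout workflow can be modeled in a few lines. This is a toy sketch under stated assumptions and does not reflect ACCM's actual interface: the `Cluster` and `ConfigManager` classes, their methods, and the `guest_vlan` setting are all hypothetical names chosen for illustration.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cluster:
    name: str
    config: Dict[str, object] = field(default_factory=dict)
    applied_tag: Optional[str] = None


class ConfigManager:
    """Hypothetical central manager: one staging cluster, many production clusters."""

    def __init__(self, staging: Cluster, production: List[Cluster]):
        self.staging = staging
        self.production = {c.name: c for c in production}
        self.tags: Dict[str, Dict[str, object]] = {}

    def save_tag(self, tag: str) -> None:
        # Snapshot the validated staging state under an immutable tracking tag.
        if tag in self.tags:
            raise ValueError(f"tag {tag!r} already exists")
        self.tags[tag] = copy.deepcopy(self.staging.config)

    def diff(self, tag: str, cluster_name: str) -> Dict[str, object]:
        # Report only added or changed keys; removed keys are ignored in this sketch.
        target = self.tags[tag]
        current = self.production[cluster_name].config
        return {k: v for k, v in target.items() if current.get(k) != v}

    def roll_out(self, tag: str, cluster_name: str) -> Dict[str, object]:
        # Apply a tagged snapshot to one production cluster; others are untouched.
        changes = self.diff(tag, cluster_name)
        cluster = self.production[cluster_name]
        cluster.config = copy.deepcopy(self.tags[tag])
        cluster.applied_tag = tag
        return changes


# Usage: two successive tags, rolled out to a single production cluster.
staging = Cluster("staging")
manager = ConfigManager(staging, [Cluster("site-a"), Cluster("site-b")])

staging.config["guest_vlan"] = 30
manager.save_tag("v1")
staging.config["guest_vlan"] = 40
manager.save_tag("v2")

manager.roll_out("v1", "site-a")
changes = manager.roll_out("v2", "site-a")  # differential update for site-a only
```

Keeping tags as deep-copied snapshots means later edits on staging cannot alter what was audited, and rolling out per cluster limits the impact of a bad change to the one cluster that received it.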
Personnel: Suparna Dam