Alex Saroyan, CEO and co-founder of Netris, provides insights from the company’s experience in deploying and automating large-scale GPU clusters. This second part of the presentation focuses specifically on the life cycle of AI networking, emphasizing that sustainable AI business strategies require architecting for long-term growth and newer GPU generations rather than single-cluster deployments. Saroyan highlights the high financial stakes of these deployments, noting that organizations cannot afford play time once hardware arrives; the networking infrastructure must be ready to go live immediately to start generating revenue.
To meet these aggressive timelines, Netris utilizes a sophisticated modeling and simulation phase, often referred to as a digital twin, using technologies like NVIDIA’s DSX Air or Netris’s own CloudSim. This approach allows network engineering teams to model the topology, IP addressing, and cloud constructs, such as VXLANs and VRFs, before the physical hardware is even on-site. By pre-validating configurations in a simulated environment, teams can identify missing upstream connections or storage integration issues early. Once the hardware arrives, a Zero Touch Provisioning (ZTP) process identifies switches via MAC addresses and brings them into the master topology, where the Netris controller automatically generates and applies the necessary configurations without engineers having to write manual switch code.
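The MAC-based matching step described above can be illustrated with a minimal sketch. This is not Netris code or its API: `PlannedSwitch`, `normalize_mac`, and `match_switches` are hypothetical names, and the sketch assumes the planned topology is just a list of switches with known MAC addresses and that discovery yields a list of MAC strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedSwitch:
    """One switch from the pre-validated topology (hypothetical model)."""
    name: str
    role: str
    mac: str


def normalize_mac(mac: str) -> str:
    """Lowercase and strip separators so differently formatted MACs compare equal."""
    return mac.lower().replace(":", "").replace("-", "").replace(".", "")


def match_switches(planned, discovered_macs):
    """Match discovered MACs against the planned topology.

    Returns (matched, unknown, missing):
      matched: dict of switch name -> PlannedSwitch seen on the network
      unknown: discovered MACs that are not in the plan
      missing: planned switch names that have not been seen yet
    """
    by_mac = {normalize_mac(s.mac): s for s in planned}
    matched, unknown = {}, []
    for mac in discovered_macs:
        switch = by_mac.get(normalize_mac(mac))
        if switch is None:
            unknown.append(mac)
        else:
            matched[switch.name] = switch
    missing = [s.name for s in planned if s.name not in matched]
    return matched, unknown, missing


if __name__ == "__main__":
    plan = [
        PlannedSwitch("leaf1", "leaf", "AA:BB:CC:00:00:01"),
        PlannedSwitch("spine1", "spine", "aa:bb:cc:00:00:02"),
    ]
    matched, unknown, missing = match_switches(
        plan, ["aa-bb-cc-00-00-01", "de:ad:be:ef:00:00"]
    )
    print("matched:", sorted(matched))
    print("unknown:", unknown)
    print("missing:", missing)
```

In a real deployment, a matched switch would then receive configuration generated by the controller; here the match result is only reported.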
The Netris platform further streamlines the deployment life cycle through automated validation and troubleshooting. Because the system generates configurations from declared intent rather than relying on static files, the Netris agent on each node can immediately detect miswiring or link discrepancies and provide specific instructions for remediation. Saroyan explains that while automation largely eliminates human error in configuration, hardware or control plane anomalies can still occur at scale. Consequently, Netris continues to expand its suite of smart tests to help engineers identify zombie switches or performance bottlenecks, providing a green status validation that lets GPU implementation teams begin their work with confidence in the underlying fabric.
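The idea of detecting miswiring by comparing intent with observation can be sketched as follows. This is an illustration under stated assumptions, not how the Netris agent is implemented: `check_wiring` is a hypothetical function, and both the intended topology and the observed neighbors (for example, as learned from a link discovery protocol) are assumed to be dictionaries mapping a `(switch, port)` pair to its `(peer_switch, peer_port)` pair.

```python
def check_wiring(intended, observed):
    """Compare intended links with observed neighbors and describe each mismatch.

    Both arguments map (switch, port) -> (peer_switch, peer_port).
    Returns a list of human-readable remediation hints, empty if all links match.
    """
    findings = []
    for local, want in sorted(intended.items()):
        got = observed.get(local)
        if got is None:
            # Nothing seen on this port: cable missing, unplugged, or link down.
            findings.append(
                f"{local[0]} {local[1]}: no link detected, "
                f"expected {want[0]} {want[1]}"
            )
        elif got != want:
            # A neighbor is present, but not the one the topology calls for.
            findings.append(
                f"{local[0]} {local[1]}: connected to {got[0]} {got[1]}, "
                f"move cable to {want[0]} {want[1]}"
            )
    return findings


if __name__ == "__main__":
    intended = {
        ("leaf1", "swp1"): ("spine1", "swp1"),
        ("leaf1", "swp2"): ("spine2", "swp1"),
        ("leaf1", "swp3"): ("spine3", "swp1"),
    }
    observed = {
        ("leaf1", "swp1"): ("spine1", "swp1"),
        ("leaf1", "swp2"): ("spine3", "swp1"),  # cabled to the wrong spine
    }
    for line in check_wiring(intended, observed):
        print(line)
```

An empty result corresponds loosely to the "green status" described above; the port names and message wording are placeholders.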
Personnel: Alex Saroyan