Tech Field Day

The Independent IT Influencer Event

  • Home
    • The Futurum Group
    • FAQ
    • Staff
  • Sponsors
    • Sponsor List
      • 2026 Sponsors
      • 2025 Sponsors
      • 2024 Sponsors
      • 2023 Sponsors
      • 2022 Sponsors
    • Sponsor Tech Field Day
    • Best of Tech Field Day
    • Results and Metrics
    • Preparing Your Presentation
      • Complete Presentation Guide
      • A Classic Tech Field Day Agenda
      • Field Day Room Setup
      • Presenting to Engineers
  • Delegates
    • Delegate List
      • 2025 Delegates
      • 2024 Delegates
      • 2023 Delegates
      • 2022 Delegates
      • 2021 Delegates
      • 2020 Delegates
      • 2019 Delegates
      • 2018 Delegates
    • Become a Field Day Delegate
    • What Delegates Should Know
  • Events
    • All Events
      • Upcoming
      • Past
    • Field Day
    • Field Day Extra
    • Field Day Exclusive
    • Field Day Experience
    • Field Day Live
    • Field Day Showcase
  • Topics
    • Tech Field Day
    • Cloud Field Day
    • Mobility Field Day
    • Networking Field Day
    • Security Field Day
    • Storage Field Day
  • News
    • Coverage
    • Event News
    • Podcast
  • When autocomplete results are available use up and down arrows to review and enter to go to the desired page. Touch device users, explore by touch or with swipe gestures.
You are here: Home / Videos / Cisco AI Cluster Networking Operations DLB Demo with Paresh Gupta

Cisco AI Cluster Networking Operations DLB Demo with Paresh Gupta



Networking Field Day 39


This video is part of the appearance, “Cisco Presents at Networking Field Day 39“. It was recorded as part of Networking Field Day 39 at 10:30-12:00 on November 6, 2025.


Watch on YouTube
Watch on Vimeo

Paresh Gupta concluded the deep dive by focusing on the most complex challenge in AI networking: congestion and load balancing in the backend GPU-to-GPU fabric. He explained that while operational simplicity and cabling are critical, the primary performance bottleneck, even in non-oversubscribed networks, is the failure of traditional ECMP load balancing. Because ECMP hashes flows randomly, it creates severe traffic imbalances, where one link may be congested at 105% capacity while another sits idle at 60%. This non-uniform utilization, not a lack of total capacity, creates congestion, triggers performance-killing pause frames, and can lead to out-of-order packets, which are devastating for tightly coupled collective communication jobs.

To solve this, Cisco has developed advanced load-balancing techniques, moving beyond simple ECMP. Gupta detailed a “flowlet” dynamic load balancing (DLB) approach, where the switch detects inter-packet gaps to identify a flowlet and sends the next flowlet on the current, least-congested link. More importantly, he highlighted a fully validated, joint-reference architecture codesigned with NVIDIA. This solution combines Cisco’s per-packet DLB, running on its switches, with NVIDIA’s adaptive routing and direct data placement capabilities on the SuperNIC. This handshake between the switch and the NIC is auto-negotiated, and Gupta showed video benchmarks of a 64-GPU cluster where this method improved application-level bus bandwidth by 35-40% and virtually eliminated pause frames compared to standard ECMP.

This advanced capability, Gupta explained, is made possible by the P4-programmable architecture of Cisco’s Silicon One ASIC, which allows new features to be delivered without a multi-year hardware respin. He framed this as the foundational work that is now being standardized by the Ultra Ethernet Consortium (UEC), of which Cisco is a steering member. By productizing these next-generation transport features today, Cisco is able to provide a consistent, high-performance, and validated networking experience for any AI environment, offering enterprises a turnkey solution that rivals the performance of complex, custom-built hyperscaler networks.

Personnel: Paresh Gupta

  • Bluesky
  • LinkedIn
  • Mastodon
  • RSS
  • Twitter
  • YouTube

Event Calendar

  • Nov 11-Nov 12 — Tech Field Day at KubeCon North America 2025
  • Jan 28-Jan 29 — AI Infrastructure Field Day 4
  • Mar 11-Mar 12 — Cloud Field Day 25
  • Apr 8-Apr 9 — Networking Field Day 40
  • Apr 15-Apr 16 — AI AppDev Field Day 3
  • Apr 29-Apr 30 — Security Field Day 15
  • May 6-May 8 — Mobility Field Day 14
  • May 13-May 14 — AI Field Day 8

Latest Coverage

  • Storage Vendors Are Finally Speaking Data Governance and Management and I Like It
  • FortiCNAP and the New Frontier of AI Security: Lessons from Cloud Field Day 24
  • NFD39 Was Great! An Overview
  • The Cloud Comes Home: Oxide Reimagines On-Prem Computing
  • Fortinet’s Fabric-Based Approach to Cloud Security

Tech Field Day News

  • Tech Field Day Returns to KubeCon North America Live from Atlanta!
  • Exploring How AI Transforms the Enterprise Network at Networking Field Day 39

Return to top of page

Copyright © 2025 · Genesis Framework · WordPress · Log in