Tech Field Day

The Independent IT Influencer Event


Demystifying Artificial Intelligence and Machine Learning Infrastructure for a Network Engineer with Cisco



AI Field Day 5

Paresh Gupta presented for Cisco at AIFD5


This video is part of the appearance, “Cisco Presents at AI Field Day 5”. It was recorded as part of AI Field Day 5 at 14:00-16:00 on September 11, 2024.



Cisco’s presentation at AI Field Day 5, led by Paresh Gupta and Nicholas Davidson, focused on demystifying AI/ML infrastructure for network engineers, particularly in the context of building and managing GPU clusters for AI workloads. Paresh, a technical marketing leader, began by explaining the challenges of setting up a GPU cluster, emphasizing the importance of inter-GPU networking and how Cisco’s Nexus 9000 Series switches address these challenges. He highlighted the complexity of cabling and configuring such clusters: setup can take weeks, but with Cisco’s validated solutions the process can be streamlined to just eight hours. Paresh also discussed the importance of non-blocking, non-oversubscribed network designs, such as the “Rail-Optimized” design used by Nvidia and the “Fly” design by Intel, which ensure efficient communication between GPUs during distributed AI training tasks.
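To make the "non-oversubscribed" idea concrete, here is a minimal sketch in Python that compares the GPU-facing bandwidth of a single leaf switch with its fabric-facing bandwidth. The function names and the port counts and speeds are hypothetical illustrations, not figures from the presentation or from any Cisco validated design. A ratio of 1.0 or less is treated here as a simplified necessary condition; real non-blocking behavior also depends on the switch hardware and on how traffic is balanced across the uplinks.

```python
def oversubscription_ratio(down_ports, down_gbps, up_ports, up_gbps):
    """Return GPU-facing bandwidth divided by fabric-facing bandwidth for one leaf."""
    if min(down_ports, down_gbps, up_ports, up_gbps) <= 0:
        raise ValueError("port counts and speeds must be positive")
    return (down_ports * down_gbps) / (up_ports * up_gbps)


def is_non_oversubscribed(down_ports, down_gbps, up_ports, up_gbps):
    """True when the uplinks can carry everything the downlinks can offer."""
    return oversubscription_ratio(down_ports, down_gbps, up_ports, up_gbps) <= 1.0


# Hypothetical leaf A: 32 x 400G toward GPUs, 32 x 400G toward spines.
balanced = oversubscription_ratio(32, 400, 32, 400)

# Hypothetical leaf B: 48 x 100G toward hosts, 6 x 400G toward spines.
squeezed = oversubscription_ratio(48, 100, 6, 400)

print(f"leaf A ratio: {balanced}")  # 1.0, uplinks match downlinks
print(f"leaf B ratio: {squeezed}")  # 2.0, downlinks can offer twice the uplink capacity
```

In this simplified view, leaf A could forward all GPU traffic into the fabric at line rate, while leaf B would have to queue or drop traffic whenever more than half of its hosts send across the fabric at full speed.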

The presentation also delved into the technical aspects of inter-GPU communication, particularly the need for collective communication operations such as all-reduce and reduce-scatter, which allow GPUs to synchronize their states during parallel processing. Paresh explained how Cisco’s network design options, such as dynamic load balancing and static pinning, help optimize the flow of data between GPUs, reducing congestion and improving performance. He also touched on the importance of creating a lossless network using Priority-based Flow Control (PFC) to avoid packet loss, which can significantly delay AI training jobs. Cisco’s Nexus Dashboard plays a crucial role in monitoring the network and detecting anomalies, such as packet loss or congestion, so that the network operates efficiently.
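As an illustration of what all-reduce and reduce-scatter do, the sketch below simulates a ring all-reduce in plain Python, with one list standing in for each GPU's buffer. It is a teaching model under stated assumptions: the ring algorithm is one common way to implement all-reduce and is not confirmed by the source as the method used in the presentation, and the function names are hypothetical. No real GPU or network library is involved.

```python
def reduce_scatter(data):
    """Phase 1: after n-1 ring steps, rank r holds the full sum of chunk (r+1) % n."""
    n, chunk = len(data), len(data[0]) // len(data)
    for step in range(n - 1):
        # Snapshot every rank's outgoing chunk so all sends in a step are simultaneous.
        sends = []
        for r in range(n):
            c = (r - step) % n
            sends.append((c, data[r][c * chunk:(c + 1) * chunk]))
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n  # each rank sends to its right-hand neighbour
            for k, value in enumerate(payload):
                data[dst][c * chunk + k] += value


def all_gather(data):
    """Phase 2: circulate the finished chunks until every rank has all of them."""
    n, chunk = len(data), len(data[0]) // len(data)
    for step in range(n - 1):
        sends = []
        for r in range(n):
            c = (r + 1 - step) % n
            sends.append((c, data[r][c * chunk:(c + 1) * chunk]))
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n
            data[dst][c * chunk:(c + 1) * chunk] = payload


def ring_all_reduce(buffers):
    """Return per-rank buffers in which every rank holds the element-wise sum."""
    n = len(buffers)
    length = len(buffers[0])
    if any(len(b) != length for b in buffers) or length % n != 0:
        raise ValueError("buffers must be equal length and divisible by rank count")
    data = [list(b) for b in buffers]  # copy so the inputs are left untouched
    reduce_scatter(data)
    all_gather(data)
    return data


# Three simulated GPUs, each holding six gradient values.
gradients = [[1] * 6, [2] * 6, [3] * 6]
result = ring_all_reduce(gradients)
print(result[0])  # [6, 6, 6, 6, 6, 6]
```

Every step in this model requires all ranks to exchange data with a neighbour at the same time, which is why a single congested or lossy link can stall the whole collective and, with it, the training job.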

Nicholas Davidson, a machine learning engineer at Cisco, then shared his experience of building a generative AI (GenAI) application using the on-premises GPU cluster managed by Paresh. He explained how the infrastructure allowed him to train models on Cisco’s private data, which could not be moved to the cloud due to security concerns. By leveraging the GPU cluster, Nicholas was able to reduce training times from days to hours, processing billions of tokens in a fraction of the time it would have taken using cloud-based resources. He also demonstrated how the AI model, integrated with Cisco’s Nexus Dashboard, could provide real-time insights and anomaly detection for network engineers, showcasing the practical benefits of having an on-prem AI/ML infrastructure.

Personnel: Paresh Gupta

