Tech Field Day

The Independent IT Influencer Event

Google Cloud Network Infrastructure for AI/ML



Cloud Field Day 20


This video is part of the appearance “Google Cloud Presents at Cloud Field Day 20”. It was recorded as part of Cloud Field Day 20 from 13:00 to 15:30 on June 13, 2024.



Victor Moreno, a product manager at Google Cloud, presented the network infrastructure Google Cloud has developed to support AI and machine learning (AI/ML) workloads. AI/ML models have grown exponentially, so a single TPU or host is no longer enough: thousands of nodes must move vast amounts of data across the network and communicate efficiently. Google Cloud achieves this through a software-defined network (SDN) that includes hardware acceleration. This infrastructure lets GPUs and TPUs communicate at line rate while addressing challenges such as load balancing and restructuring the data center topology to match traffic patterns.

Google Cloud’s AI/ML network infrastructure involves two main networks: one for GPU-to-GPU communication and another for connecting to external storage and data sources. The GPU network is designed for the high bandwidth and low latency needed to train large models distributed across many nodes. It combines electrical and optical switching to create flexible topologies that can be reconfigured without physical changes. The second network connects the GPU clusters to storage so that periodic snapshots of the training process are stored efficiently. This dual-network approach supports both high-performance GPU communication and storage access within the same data center region.
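To give a feel for why the storage-facing network matters for periodic snapshots, here is a minimal back-of-the-envelope sketch. It is not from the presentation: the function name `snapshot_overhead`, the assumption that training pauses while a snapshot is written, and all of the numbers are hypothetical and chosen only for illustration.

```python
def snapshot_overhead(snapshot_gb: float, storage_gbps: float, interval_s: float):
    """Estimate the cost of periodic training snapshots (hypothetical model).

    Assumes each snapshot is written synchronously, so training pauses for
    the full write, and that the storage network sustains `storage_gbps`
    gigabits per second for the whole transfer.

    Returns (write_seconds, fraction_of_wall_time_spent_writing).
    """
    if snapshot_gb < 0 or storage_gbps <= 0 or interval_s <= 0:
        raise ValueError("sizes, bandwidth, and interval must be positive")
    # Convert gigabytes to gigabits, then divide by the link rate.
    write_s = snapshot_gb * 8 / storage_gbps
    # Each cycle is `interval_s` of training followed by one write.
    return write_s, write_s / (interval_s + write_s)


# Illustrative numbers only: a 500 GB snapshot every 30 minutes
# over a 100 Gbit/s path to storage.
write_s, fraction = snapshot_overhead(500, 100, 1800)
print(f"write time: {write_s:.0f} s, overhead: {fraction:.1%}")
```

Under these assumed numbers the write takes 40 seconds per snapshot. Halving the available storage bandwidth doubles that pause, which is one reason to keep storage traffic off the GPU-to-GPU network.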

In addition to the physical network infrastructure, Google Cloud uses advanced load balancing techniques to optimize AI/ML workloads. Balancing on custom metrics such as queue depth can significantly improve response times for AI models. This optimization is facilitated by tools such as the Open Request Cost Aggregation (ORCA) framework, which allows requests to be distributed more intelligently across model instances. These capabilities are integrated into Google Cloud’s Vertex AI service, giving users scalable, efficient AI/ML infrastructure that automatically adjusts to workload demands for high performance and reliability.
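The idea of balancing on queue depth can be illustrated with a small simulation. This is a sketch under stated assumptions, not Google Cloud's implementation and not the ORCA API: the `Replica` class, the two policy functions, the fixed service time, and the starting queue depths are all hypothetical. In a real deployment the backends would report such metrics to the load balancer; here the depths are simply tracked in memory.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical model-serving instance."""
    name: str
    queue_depth: int = 0        # requests already waiting or in flight
    service_time: float = 0.5   # assumed seconds per request

    def expected_wait(self) -> float:
        # A new request waits behind the queue, then is served itself.
        return (self.queue_depth + 1) * self.service_time


def pick_round_robin(replicas, request_index):
    """Baseline policy: rotate through replicas, ignoring their load."""
    return replicas[request_index % len(replicas)]


def pick_least_queue(replicas, request_index):
    """Custom-metric policy: choose the replica with the shortest queue."""
    return min(replicas, key=lambda r: r.queue_depth)


def simulate(policy, num_requests=6):
    """Route a burst of requests and return the mean expected wait in seconds.

    Queues are not drained during the burst, which keeps the example small.
    """
    # One replica starts out backed up, e.g. with long-running requests.
    replicas = [Replica("a", queue_depth=8), Replica("b"), Replica("c")]
    total = 0.0
    for i in range(num_requests):
        target = policy(replicas, i)
        total += target.expected_wait()
        target.queue_depth += 1
    return total / num_requests


rr = simulate(pick_round_robin)
lq = simulate(pick_least_queue)
print(f"round robin: {rr:.2f} s, least queue depth: {lq:.2f} s")
```

In this toy setup, round robin keeps sending a third of the burst to the backed-up replica, while the queue-depth policy routes around it, so the mean expected wait is lower. The size of the gain depends entirely on the assumed starting conditions.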

Personnel: Victor Moreno
