Juniper Networks presented its latest Apstra functionality for AI data center network operations at AI Infrastructure Field Day. It focused on providing operators with the context and tools to manage complex AI networks efficiently. Jeremy Wallace, a Data Center/IP Fabric Architect, emphasized the importance of context in understanding the network’s expected behavior to identify and […]
Day 1: Managing your AI data center at scale with Juniper Networks
This presentation by Kyle Baxter focuses on how Juniper Networks’ Apstra solution can manage AI data centers at scale. Apstra simplifies network configuration for AI/ML workloads by providing tools to assign virtual networks across numerous ports, an essential capability in environments with potentially millions of ports. The core of the presentation highlights the ability to […]
Day 0: Designing your AI data center with Juniper Networks
Juniper Networks’ presentation at AI Infrastructure Field Day focuses on designing AI data centers using Apstra, specifically emphasizing rail-optimized designs and highlighting Apstra’s ability to create a fully functional network architecture in just minutes, incorporating native modeling for these specialized designs. Kyle Baxter, Head of Apstra Product Management at Juniper Networks, demonstrates how Apstra simplifies […]
GPYOU: Building and Operating your AI Infrastructure with Juniper Networks
AI infrastructure is a critical but complex domain, and IT organizations face pressure to deliver results quickly. Juniper Networks presents Juniper Apstra as a solution that streamlines the management of AI data centers and provides proven designs. Kyle Baxter emphasizes the necessity of a robust network foundation for AI and ML workloads and highlights the […]
Securing AI Clusters, Juniper’s Approach to Threat Protection with Juniper Networks
AI clusters are high-value targets for cyber threats, requiring a defense-in-depth strategy to safeguard data, workloads, and infrastructure. Kedar Dhuru highlighted how Juniper’s security portfolio provides end-to-end protection for AI clusters, including secure multitenant environments, without compromising performance. The presentation addressed the challenges of securing AI data centers, focusing on securing WAN and data center […]
Maximize AI Cluster Performance using Juniper Self-Optimizing Ethernet with Juniper Networks
Vikram Singh, Sr. Product Manager, AI Data Center Solutions at Juniper Networks, discussed maximizing AI cluster performance using Juniper’s self-optimizing Ethernet fabric. As AI workloads scale, high GPU utilization and minimized congestion are critical to maximizing performance and ROI. Juniper’s advanced load balancing innovations deliver a self-optimizing Ethernet fabric that dynamically adapts to congestion and […]
AI Unbound, Your Data Center Your Way with Juniper Networks
Praful Lalchandani, VP of Product, Data Center Platforms and AI Solutions at Juniper Networks, opened the presentation by highlighting the rapid growth of the AI data center space and its unique challenges. He noted that Juniper Networks, with its 25 years of experience in networking and security, is uniquely positioned to address these challenges and […]
Secure and optimize AI and ML workloads with the Cross-Cloud Network with Google Cloud
Vaibhav Katkade, a Product Manager at Google Cloud Networking, presented on infrastructure enhancements in cloud networking for secure, optimized AI/ML workloads. He focused on the lifecycle of AI/ML, encompassing training, fine-tuning, and serving/inference, and on the corresponding network imperatives for each stage. Data ingestion relies on fast, secure connectivity to on-premises environments via interconnect and cross-cloud interconnect, […]
Cloud WAN Connecting networks for the AI Era with Google Cloud
This presentation by Aniruddha Agharkar, Product Manager at Google Cloud Networking, centers on Cloud WAN, Google’s fully managed backbone solution designed for the enterprise era and powered by Google’s planet-scale network. Customers have historically relied on bespoke networks using leased lines and MPLS providers, leading to inconsistencies in security, operational challenges, a lack of visibility, […]
AI Hypercomputer and TPU (Tensor) acceleration with Google Cloud
Rose Zhu, a Product Manager at Google Cloud TPU, presented on TPUs for large-scale training and inference, emphasizing the rapid growth of AI models and the corresponding demands for compute, memory, and networking. Zhu highlighted the specialization of Google’s TPU chips and systems, purpose-built ASICs for machine learning applications, coupled with innovations in power efficiency, […]
AI hypercomputer and GPU acceleration with Google Cloud
Dennis Lu, a Product Manager at Google Cloud specializing in GPUs, presented on AI hypercomputer and GPU acceleration with Google Cloud. Lu covered Google Cloud’s AI hypercomputer, from consumption models to purpose-built hardware, with a focus on Google’s Cluster Director for managing GPU fleets. He then moved to the hardware aspect of Google Cloud’s AI […]
Analytics Storage and AI, Data Prep and Data Lakes with Google Cloud
Vivek Sarswat, Group Product Manager at Google Cloud Storage, presented on analytics storage and AI, focusing on data preparation and data lakes. He emphasized the close ties between analytics and AI workloads, highlighting key innovations built to address related challenges. The presentation demonstrates that analytics play a crucial role in the AI data pipeline, particularly […]
The latest in high-performance storage, Rapid on Colossus with Google Cloud
Michal Szymaniak, Principal Engineer at Google Cloud, presented on Rapid Storage, a new zonal storage product within the cloud storage portfolio, powered by Google’s foundational distributed file system, Colossus. The goal in designing Rapid Storage was to create a storage system that offers the low latency of block storage, the high throughput of parallel file […]
Intro to Managed Lustre with Google Cloud
Dan Eawaz, Senior Product Manager at Google Cloud, introduced Managed Lustre with Google Cloud, a fully managed parallel file system built on DDN EXAScaler. The aim is to solve the demanding requirements of data preparation, model training, and inference in AI workloads. Managed Lustre provides high throughput to keep GPUs and TPUs fully utilized and […]
Overview of Cloud Storage for AI, Lustre, GCSFuse, and Anywhere Cache with Google Cloud
Marco Abella, Product Manager at Google Cloud Storage, presented an overview of Google Cloud’s storage solutions optimized for AI/ML workloads. The presentation addressed the critical role of storage in AI pipelines, emphasizing that an inadequate storage solution can significantly bottleneck GPU utilization, causing idle GPUs and hindering data processing from initial data preparation to model […]
Google Kubernetes Engine and AI Hypercomputer with Google Cloud
Ishan Sharma, Group Product Manager in the Google Kubernetes Engine team, presented on GKE and AI Hypercomputer, focusing on industry-leading infrastructure, training quickly at mega scale, serving with lower cost and latency, economic access to GPUs and TPUs, and faster time to value. He emphasized that Google Cloud is committed to ensuring new accelerators are […]
AI Hypercomputer Cluster Toolkit with Google Cloud
Ilias Katsardis, Senior Product Manager for AI infrastructure at Google Cloud, presented on the AI Hypercomputer Cluster Toolkit, addressing the complexities of deploying AI infrastructure on Google Cloud’s Compute Engine and GKE. He highlighted the challenges customers face when trying to quickly and efficiently create supercomputers in the cloud, including performance uncertainty, troubleshooting difficulties, and […]
Storage Intelligence with Google Cloud
Manjul Sahay, Group Product Manager at Google Cloud Storage, presented on Storage Intelligence with Google Cloud, focusing on helping customers, both enterprises and startups, manage their storage effectively for AI applications. These customers often face challenges in managing storage at scale for security, cost, and operational efficiency, particularly with small and new teams. A key […]
Introduction to the AI Hypercomputer with Google Cloud
Sean Derrington, Product Manager, Storage at Google Cloud, introduced the AI Hypercomputer at AI Infrastructure Field Day, highlighting Google Cloud’s investments in making it easier for customers to consume and run their AI workloads. The focus is on infrastructure with consideration to the consumption model and optimized software. The AI Hypercomputer encompasses optimized software and […]
Demonstrating Keysight’s AI Fabric Test Methodology
This session provides an overview of the Keysight AI fabric test methodology, demonstrating key findings and improvements achieved through automated testing and the search for optimal configuration parameters. Alex Bortek, Lead Product Manager at Keysight Technologies, introduces the Keysight AI fabric test methodology using the KAI Data Center Builder product. The methodology guides users through […]