This video is part of the appearance, “Enfabrica Presents at AI Field Day 5”. It was recorded as part of AI Field Day 5 at 13:00-14:30 on September 13, 2024.
Enfabrica, led by founder and CEO Rochan Sankar, is pioneering a new category of networking silicon designed to support accelerated computing and AI at unprecedented scales. The company has developed the Accelerated Compute Fabric SuperNIC (ACFS), a product aimed at addressing the evolving needs of data centers as they increasingly incorporate GPUs and TPUs. Sankar highlights that traditional networking solutions are no longer sufficient because compute intensity has grown faster than I/O and memory bandwidth. This imbalance creates significant challenges for building distributed, resilient, and scalable systems, and it calls for a rethinking of system I/O architectures.
The ACFS, specifically designed for high-performance distributed AI and GPU server networking, represents a significant leap in networking capabilities. Enfabrica’s first chip, codenamed Millennium, achieves 8 terabits per second of bandwidth, compared to the current standard of 400 gigabits per second. This innovation addresses the critical issue of compute FLOPS scaling faster than data movement capabilities, which has led to inefficiencies in model performance and hardware utilization. Sankar explains that current system architectures, which were originally designed for traditional compute, are not optimized for the demands of modern AI workloads, leading to inefficiencies and bottlenecks.
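To put the two bandwidth figures quoted above side by side, here is a minimal Python sketch of the arithmetic. The 8 Tb/s and 400 Gb/s values come from the summary. The 100 GB payload and the helper names `speedup` and `transfer_seconds` are illustrative assumptions, not something from the presentation. The calculation also treats 8 Tb/s as a single ideal pipe and ignores protocol overhead, so it is an upper bound rather than a measured result.

```python
# Bandwidth figures quoted in the summary, in bits per second.
MILLENNIUM_BPS = 8e12   # 8 terabits per second
BASELINE_BPS = 400e9    # 400 gigabits per second

# Hypothetical payload size, chosen only for illustration (100 GB).
PAYLOAD_BYTES = 100e9


def speedup(new_bps: float, old_bps: float) -> float:
    """Ratio of two raw bandwidth figures."""
    return new_bps / old_bps


def transfer_seconds(payload_bytes: float, link_bps: float) -> float:
    """Ideal time to move a payload, ignoring overhead and congestion."""
    return payload_bytes * 8 / link_bps


ratio = speedup(MILLENNIUM_BPS, BASELINE_BPS)
t_baseline = transfer_seconds(PAYLOAD_BYTES, BASELINE_BPS)
t_millennium = transfer_seconds(PAYLOAD_BYTES, MILLENNIUM_BPS)

print(f"raw bandwidth ratio: {ratio:.0f}x")
print(f"100 GB at 400 Gb/s: {t_baseline:.2f} s")
print(f"100 GB at 8 Tb/s:   {t_millennium:.2f} s")
```

Under these assumptions the raw ratio is 20x: the same payload that would take about 2 seconds at 400 Gb/s would take about 0.1 seconds at 8 Tb/s.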
Sankar also discusses the historical context of computing models, contrasting the tightly coupled, low-latency communication of supercomputers with the distributed, high-tolerance networking of hyperscale cloud systems. Modern AI and machine learning systems require a hybrid approach that combines the performance of supercomputers with the scalability and resilience of cloud infrastructure. However, current solutions involve disparate communication networks that do not effectively interoperate, leading to imbalanced bandwidth and inefficiencies. Enfabrica aims to address these challenges by creating a unified networking fabric that can support both tightly coupled and distributed computing models, thereby improving overall system efficiency and scalability.
Personnel: Rochan Sankar