Tech Field Day

The Independent IT Influencer Event


Enfabrica Presents at AI Field Day 5



AI Field Day 5

Rochan Sankar presented for Enfabrica at AI Field Day 5

Presentation date: September 13, 2024, 13:00-14:30.

Presenters: Rochan Sankar


Accelerated Compute for AI from a Systems Perspective with Enfabrica



Enfabrica, led by founder and CEO Rochan Sankar, is pioneering a new category of networking silicon designed to support accelerated computing and AI at unprecedented scales. The company has developed the Accelerated Compute Fabric SuperNIC (ACFS), a product aimed at addressing the evolving needs of data centers as they increasingly incorporate GPUs and TPUs. Sankar highlights that traditional networking solutions are no longer sufficient due to the rapid increase in compute intensity, which has outpaced the growth of I/O and memory bandwidth. This imbalance creates significant challenges for building distributed, resilient, and scalable systems, necessitating a rethinking of system I/O architectures.

The ACFS, specifically designed for high-performance distributed AI and GPU server networking, represents a significant leap in networking capabilities. Enfabrica’s first chip, codenamed Millennium, achieves an unprecedented 8 terabits per second of bandwidth, compared to the current standard of 400 gigabits per second. This innovation addresses the critical issue of compute flops scaling faster than data movement capabilities, which has led to inefficiencies in model performance and hardware utilization. Sankar explains that the current system architectures, which were originally designed for traditional compute, are not optimized for the demands of modern AI workloads, leading to inefficiencies and bottlenecks.
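
To put the two bandwidth figures above side by side, here is a minimal Python sketch of the arithmetic. It computes only the ideal serialization time of a payload and ignores protocol overhead, latency, and congestion. The 10 GB payload is a hypothetical value chosen for illustration, not a figure from the presentation.

```python
def transfer_seconds(payload_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to serialize a payload onto a link (no overhead, no latency)."""
    return payload_bytes * 8 / link_bits_per_second


LINK_400G_BPS = 400e9     # 400 gigabits per second, as quoted in the summary
MILLENNIUM_BPS = 8e12     # 8 terabits per second, as quoted in the summary
PAYLOAD_BYTES = 10e9      # hypothetical 10 GB payload (assumption)

bandwidth_ratio = MILLENNIUM_BPS / LINK_400G_BPS
time_400g = transfer_seconds(PAYLOAD_BYTES, LINK_400G_BPS)
time_8t = transfer_seconds(PAYLOAD_BYTES, MILLENNIUM_BPS)

print(f"bandwidth ratio: {bandwidth_ratio:.0f}x")
print(f"400 Gb/s link:   {time_400g * 1000:.0f} ms")
print(f"8 Tb/s device:   {time_8t * 1000:.0f} ms")
```

Under these idealized assumptions the quoted figures differ by a factor of 20, so the same payload that occupies a single 400 Gb/s link for 200 ms would take about 10 ms at 8 Tb/s.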

Sankar also discusses the historical context of computing models, contrasting the tightly coupled, low-latency communication of supercomputers with the distributed, high-tolerance networking of hyperscale cloud systems. Modern AI and machine learning systems require a hybrid approach that combines the performance of supercomputers with the scalability and resilience of cloud infrastructure. However, current solutions involve disparate communication networks that do not effectively interoperate, leading to imbalanced bandwidth and inefficiencies. Enfabrica aims to address these challenges by creating a unified networking fabric that can support both tightly coupled and distributed computing models, thereby improving overall system efficiency and scalability.

Personnel: Rochan Sankar

Enfabrica’s Approach to Solving IO Scaling Challenges in Accelerated Compute Clusters using Networking Silicon



Enfabrica, under the leadership of Rochan Sankar, has developed a novel solution to address the I/O scaling challenges in accelerated compute clusters by leveraging networking silicon. Their approach, termed the Accelerated Compute Fabric (ACF), refactors the traditional endpoint attachment to accelerators. Instead of using a single RDMA NIC for each accelerator, Enfabrica’s solution employs a fully connected I/O hub that integrates the functionalities of a PCIe switch, an array of NICs, and a network switch into a single device. This ACF card connects to a scalable compute surface on one side and a scalable network surface on the other, facilitating high port density and efficient data movement.

The ACF architecture aims to eliminate inefficiencies in the current system where GPUs communicate through multiple layers of PCIe switches and NICs to scale out. By collapsing these layers into a single, more efficient system, Enfabrica’s solution reduces the number of memory copies and improves burst bandwidth to GPUs, thereby enhancing overall compute efficiency. The ACF device supports both scale-up and scale-out interfaces, allowing it to handle memory reads and writes directly into memory spaces and communicate packets over long distances. This design is particularly beneficial for AI workloads, which require rapid and efficient data movement across large compute clusters.

Enfabrica’s ACF device is designed to be compatible with existing programming models and protocols, ensuring seamless integration into current data center architectures. The device supports standard PCIe and CXL interfaces, and its programmability allows for flexible transport and congestion control. By integrating multiple NICs and a crossbar switch within a single chip, the ACF device offers enhanced resiliency and load balancing capabilities. This innovative approach not only addresses the immediate scaling challenges faced by AI and accelerated computing workloads but also positions Enfabrica as a key player in the evolving landscape of data center architecture.

Personnel: Rochan Sankar

Solving AI Cluster Scaling and Reliability Challenges in Training, Inference, RAG, and In-Memory Database Applications with Enfabrica



Enfabrica’s presentation at AI Field Day 5, led by founder and CEO Rochan Sankar, delved into the company’s innovative solutions for addressing AI cluster scaling and reliability challenges. Sankar highlighted the benefits of Enfabrica’s Accelerated Compute Fabric SuperNIC (ACFS), which enables wide fabrics with fewer hops, significantly reducing GPU-to-GPU hop latency. This reduction in latency is crucial for improving the performance of parallel workloads across GPUs, not just in training but also in other applications. The ACFS allows for a 32x multiplier in network ports, facilitating the connection of up to 500,000 GPUs in just two layers of switching, compared to the traditional three layers. This streamlined architecture enhances job performance and increases utilization, offering a potential 50-60% savings in total cost of ownership (TCO) on the network side.
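
The link between port count and switching layers can be illustrated with the textbook capacity formulas for non-blocking folded-Clos (fat-tree) networks: a two-tier fabric built from radix-k switches reaches k²/2 endpoints, and a three-tier fabric reaches k³/4. The Python sketch below applies those formulas. The radix values and the number of GPUs per fabric device are illustrative assumptions, not figures from the presentation, and the sketch does not model how Enfabrica derives its own numbers.

```python
def two_tier_endpoints(radix: int) -> int:
    """Endpoints in a non-blocking two-tier (leaf-spine) folded Clos: k^2 / 2."""
    return radix * radix // 2


def three_tier_endpoints(radix: int) -> int:
    """Endpoints in a non-blocking three-tier fat tree: k^3 / 4."""
    return radix ** 3 // 4


# Assumption: the same 51.2 Tb/s switch exposed as 64 x 800 Gb/s ports
# or as 512 x 100 Gb/s ports. Both radix values are hypothetical.
LOW_RADIX = 64
HIGH_RADIX = 512
GPUS_PER_DEVICE = 4   # hypothetical number of GPUs behind each fabric endpoint

print("radix 64,  two tiers:  ", two_tier_endpoints(LOW_RADIX))
print("radix 64,  three tiers:", three_tier_endpoints(LOW_RADIX))
print("radix 512, two tiers:  ", two_tier_endpoints(HIGH_RADIX))
print("GPUs at radix 512, two tiers:",
      two_tier_endpoints(HIGH_RADIX) * GPUS_PER_DEVICE)
```

With these assumed values, a radix-64 fabric needs three tiers to pass a few thousand endpoints, while a radix-512 fabric reaches 131,072 endpoints in two tiers, or 524,288 GPUs at four GPUs per endpoint. The general point is that spreading traffic over many lower-speed ports raises the effective switch radix, and endpoint count grows with the square of the radix in a two-tier design.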

Sankar also discussed the resiliency improvements brought by the multi-planar switch fabric, which ensures that every GPU or connected element can multipath out in case of failures. This hardware-based failover mechanism allows for immediate traffic rerouting without loss, while software optimizations ensure optimal load balancing. The presentation emphasized the importance of this resiliency, especially as AI clusters scale and the network’s reliability becomes increasingly critical. Enfabrica’s approach addresses the challenges posed by optical connections and high failure rates, ensuring that GPU operations remain unaffected by individual component failures, thus maintaining overall system performance and reliability.
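
The multipathing idea can be sketched in a few lines of Python. This is a simplified software model, not Enfabrica's hardware failover mechanism: flows are hashed onto whichever fabric planes are currently healthy, and when a plane fails, flows are re-mapped onto the survivors. The function name `pick_plane`, the four-plane fabric, and the flow naming are all hypothetical.

```python
import zlib
from typing import Sequence


def pick_plane(flow_id: str, healthy_planes: Sequence[int]) -> int:
    """Map a flow onto one of the currently healthy planes via a stable hash."""
    if not healthy_planes:
        raise RuntimeError("no healthy fabric planes remain")
    index = zlib.crc32(flow_id.encode("utf-8")) % len(healthy_planes)
    return healthy_planes[index]


planes = [0, 1, 2, 3]  # hypothetical four-plane fabric
flows = [f"gpu{src}->gpu{dst}"
         for src in range(4) for dst in range(4) if src != dst]

# Placement while every plane is healthy.
before = {flow: pick_plane(flow, planes) for flow in flows}

# Simulate a link or switch failure on one plane and re-map all flows.
failed_plane = 2
survivors = [p for p in planes if p != failed_plane]
after = {flow: pick_plane(flow, survivors) for flow in flows}

moved = sum(1 for flow in flows if before[flow] != after[flow])
print(f"{moved} of {len(flows)} flows changed plane after the failure")
```

Every flow still has a path after the failure, which is the property the paragraph describes. Note that plain modulo hashing also moves some flows that were not on the failed plane; a production design would typically use a scheme that disturbs fewer flows, and the load-balancing software mentioned above would refine placement further.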

In the context of AI inference and retrieval-augmented generation (RAG), Sankar explained how the ACFS can provide massive bandwidth to both accelerators and memory, creating a memory area network with microsecond access times. This architecture supports a tiered cache-driven approach, optimizing the use of expensive memory resources like HBM. By leveraging cheaper memory options and shared memory elements, Enfabrica’s solution can significantly enhance the efficiency and scalability of AI inference workloads. The presentation concluded with a summary of the ACFS’s capabilities, including high throughput, programmatic control of the fabric, and substantial power savings, positioning it as a critical component for next-generation data centers and large-scale AI deployments.
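
A tiered, cache-driven memory hierarchy of the kind described above can be sketched as a small fast tier backed by a larger, cheaper one. The Python class below is a toy model under stated assumptions: the class name `TieredCache`, the least-recently-used eviction policy, and the capacity are all hypothetical, and the two tiers only stand in for HBM and network-attached memory. It does not represent Enfabrica's implementation.

```python
from collections import OrderedDict


class TieredCache:
    """Toy two-tier store: a small LRU fast tier backed by a larger slow tier."""

    def __init__(self, fast_capacity: int):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # small, expensive tier (stand-in for HBM)
        self.slow = {}             # large, cheaper tier (stand-in for pooled memory)

    def put(self, key, value):
        """Insert into the fast tier, demoting the least recently used entry."""
        self.slow.pop(key, None)
        self.fast[key] = value
        self.fast.move_to_end(key)
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value  # demote instead of discarding

    def get(self, key):
        """Return (value, tier), promoting slow-tier hits into the fast tier."""
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key], "fast"
        if key in self.slow:
            value = self.slow[key]
            self.put(key, value)
            return value, "slow"
        return None, "miss"


cache = TieredCache(fast_capacity=2)
for name, payload in [("a", 1), ("b", 2), ("c", 3)]:
    cache.put(name, payload)

first_lookup = cache.get("a")    # "a" was demoted, so this hits the slow tier
second_lookup = cache.get("a")   # now promoted, so this hits the fast tier
missing_lookup = cache.get("zzz")
```

The design choice illustrated here is that evicted entries are demoted to the cheaper tier instead of being dropped, so a later access costs a slower lookup instead of a full recomputation or reload.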

Personnel: Rochan Sankar

