Follow on Twitter using the following hashtags or usernames: #NFD40
Upscale AI was founded in 2025 and quickly emerged from stealth to become a unicorn following $300 million in seed and Series A funding. The leadership team consists of industry veterans from major firms like Cisco, Broadcom, and NVIDIA, and focuses on a clean-sheet architecture designed to solve the specific demands of AI networking. Unlike traditional data center providers, Upscale AI is a pure-play firm targeting the back-end and lean front-end networks, where collective communication and high-density traffic require more than general-purpose switching.
The technical strategy centers on addressing both scale-up and scale-out domains through a combination of proprietary silicon and open-source software. For scale-out networking, the company is partnering with NVIDIA to build systems around Spectrum-X, while simultaneously developing its own optimized silicon and trays to enable open scale-up architectures for various processing units. A key pillar of their philosophy is the enablement of heterogeneous compute, ensuring that their fabric can support a diverse landscape of ASICs, including GPUs, TPUs, and LPUs. By utilizing a unified software stack based on SONiC, they aim to provide a turnkey, customizable operating system that simplifies deployment for neoclouds, enterprises, and hyperscalers.
To ensure long-term viability and industry alignment, Upscale AI is heavily involved in standards bodies such as OCP, UEC (Ultra Ethernet Consortium), and UA Link. Their approach emphasizes predictability and reliability by stripping away the legacy features required for enterprise or service provider markets, focusing solely on AI-specific traffic patterns. Through active partnerships with GPU vendors and hyperscalers, the company is validating its designs in real-world environments to ensure interoperability. Ultimately, Upscale AI intends to move beyond the current homogeneous nature of AI clusters, providing an open, standards-based substrate that can scale and evolve alongside the next decade of AI innovation.
Personnel: Aravind Srikumar
Upscale AI, founded in 2025, recently emerged from stealth as a unicorn following $300 million in combined seed and Series A funding. With a team of industry veterans, Upscale AI is focused on building a clean-sheet networking architecture specifically for the back-end and lean front-end of AI clusters. The speakers emphasize that traditional data center networking is a round peg in a square hole for AI, as existing infrastructures were designed for general-purpose web traffic rather than the massive, synchronized communication required by billions of parameters and trillion-token models.
The presentation details the shift from a simple client-server model to a distributed ecosystem, where the network acts as the nervous system of an intelligent manufacturing plant or token factory. In this environment, the key performance indicators (KPIs) have shifted from bits per second to tokens per second and tokens per watt. As large language models (LLMs) outgrow the memory capacity of single GPUs, parallelism, such as slicing math problems across thousands of processors, becomes mandatory. This creates a massive data movement problem where any network synchronization stall or hot spot directly results in idle compute time and lost revenue, making predictability and low latency table stakes rather than optional features.
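The economics behind that claim can be sketched with a toy calculation (all numbers below are hypothetical, not from the presentation): because GPUs draw power whether they are computing or waiting, any fraction of wall-clock time lost to network stalls comes directly out of the tokens-per-watt KPI.

```python
def tokens_per_watt(peak_tokens_per_s, power_watts, stall_fraction):
    """Effective tokens/W when a fraction of wall-clock time is lost to
    network synchronization stalls (GPUs idle, power still drawn)."""
    effective_tokens_per_s = peak_tokens_per_s * (1 - stall_fraction)
    return effective_tokens_per_s / power_watts

# A hypothetical 1 MW cluster producing 2M tokens/s at peak:
print(tokens_per_watt(2_000_000, 1_000_000, 0.00))  # no stalls: 2.0 tokens/W
print(tokens_per_watt(2_000_000, 1_000_000, 0.15))  # 15% stall time: ~1.7 tokens/W
```

The point of the sketch is that a 15% network stall rate is indistinguishable, in output-per-watt terms, from buying 15% less compute.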
To address these challenges, Upscale AI is developing a portfolio that includes both scale-up and scale-out solutions built on open standards like Ethernet, SONiC, and the Ultra Ethernet Consortium (UEC). Their scale-out systems are designed around a partnership with NVIDIA’s Spectrum-X, while their scale-up innovation involves purpose-built silicon and trays to support heterogeneous compute environments. By focusing exclusively on AI traffic and removing the bloat of legacy enterprise features, Upscale AI aims to provide a reliable, predictive substrate that can proactively identify malfunctioning hardware. This architectural approach is intended to help operators maximize their token flywheel, ensuring that massive infrastructure investments yield the highest possible intelligence output per watt of power consumed.
Personnel: Deepti Chandra
Upscale AI argues that traditional cloud and front-end networks, which are largely based on a client-server architecture, are fundamentally ill-suited for the unique demands of AI workloads. While standard web traffic is connection-oriented and tolerant of latency, AI clusters rely on collective communication where GPUs perform synchronized all-to-all data exchanges. This shift results in a move from north-south traffic patterns to intense east-west traffic, where a single request triggers massive bursts of data across the fabric. The presentation establishes that to maintain efficiency, the network must evolve from a reactive system to an architected substrate that treats the entire cluster as a single, coordinated engine.
AI networking requires a radical departure from the traditional OSI seven-layer processing model. In a standard network, packets traverse the full stack and are processed by the CPU. AI traffic, however, utilizes RDMA (Remote Direct Memory Access) to bypass the kernel and CPU entirely, performing zero-copy memory transactions directly between GPUs. This creates a different packet profile, where the payload is memory itself rather than application data. Furthermore, while cloud networks handle congestion reactively through TCP retransmits, AI clusters require a lossless environment. In these systems, a single dropped packet can stall thousands of GPUs, creating computational head-of-line blocking that halts progress across the entire token factory.
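That head-of-line effect can be shown with a minimal sketch (latency figures are hypothetical): a synchronized collective completes only when its slowest participant does, so a single retransmit timeout on one rank idles every other GPU in the group.

```python
def collective_step_us(per_rank_finish_us):
    # Synchronized collectives barrier on the slowest rank: the step
    # takes as long as the worst participant, not the average.
    return max(per_rank_finish_us)

normal   = [10.0] * 1024                # all ranks finish in ~10 us
one_drop = [10.0] * 1023 + [10_000.0]   # one rank waits out a ~10 ms recovery

print(collective_step_us(normal))    # 10.0 us
print(collective_step_us(one_drop))  # 10000.0 us -- 1023 GPUs idle for the gap
```

One recovering rank makes the whole step roughly 1000x slower, which is why the summary treats losslessness as a requirement rather than an optimization.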
To solve these challenges, Upscale AI advocates for a purpose-built network stack that optimizes every layer from silicon to software. Traditional data center switches are often burdened by bloated feature sets and complex pipelines designed for general-purpose routing, which increases power consumption and latency. By stripping away unnecessary protocols and focusing on AI-specific requirements like microsecond-level telemetry and adaptive load balancing to prevent hash collisions, the company aims to deliver a more efficient fabric. The speakers conclude that achieving a 100% success rate for collective communication is necessary to maximize tokens per watt, moving beyond the tuning of existing hardware toward a clean-sheet architecture designed for the next decade of AI scale.
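The hash-collision problem mentioned above can be illustrated with a toy placement model (flow IDs, sizes, and link counts are invented for the example): static ECMP-style hashing pins each flow to a link regardless of load, so unlucky flow IDs can pile several elephant flows onto one link, while an adaptive balancer steers each new flow to the currently least-loaded link.

```python
def static_ecmp(flows, n_links):
    # Static hashing: each flow is pinned to (flow_id mod n_links),
    # so colliding flow IDs share one link no matter how loaded it is.
    loads = [0] * n_links
    for flow_id, size in flows:
        loads[flow_id % n_links] += size
    return loads

def adaptive_lb(flows, n_links):
    # Adaptive balancing: place each flow on the least-loaded link.
    loads = [0] * n_links
    for _, size in flows:
        loads[loads.index(min(loads))] += size
    return loads

# Four equal elephant flows whose IDs all hash to link 0:
flows = [(0, 1000), (4, 1000), (8, 1000), (12, 1000)]
print(static_ecmp(flows, 4))   # [4000, 0, 0, 0] -- total collision
print(adaptive_lb(flows, 4))   # [1000, 1000, 1000, 1000]
```

Real adaptive schemes work on finer-grained units than whole flows, but the toy model captures why hash-based placement alone is fragile under a handful of large, long-lived AI flows.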
Personnel: Aravind Srikumar
Upscale AI posits that traditional data center networking is a round peg in a square hole for AI, as existing infrastructures were designed for general-purpose web traffic rather than the massive, synchronized communication required by billions of parameters and trillion-token models. By focusing exclusively on AI traffic and removing the bloat of legacy enterprise features, Upscale AI aims to provide a reliable, predictive substrate that treats the entire cluster as a single, coordinated engine.
The technical strategy addresses the evolution of AI workloads from dense training to inference-centric agentic AI and persistent states. As models outgrow the memory capacity of single GPUs, parallelism, like slicing math problems across thousands of processors, becomes mandatory, creating a massive data movement problem. Upscale AI advocates for a distributed ecosystem where the network must be technology-agnostic to support a plethora of specialized ASICs, including GPUs, TPUs, and custom hyperscaler XPUs. This architecture moves away from reactive TCP-based recovery toward a lossless, RDMA-driven environment where the network proactively manages congestion to prevent computational stalls, ensuring that every GPU cycle is utilized to maximize tokens per watt.
To future-proof these investments, Upscale AI is developing a portfolio of scale-up and scale-out systems built on open standards like Ethernet, SONiC, and UA Link. Their scale-out systems leverage a partnership with NVIDIA’s Spectrum-X, while their scale-up innovation involves purpose-built silicon and trays to support heterogeneous compute and memory pooling. By utilizing a unified software stack based on SONiC, the company provides a common substrate that simplifies operational onboarding for neoclouds and enterprises. Ultimately, Upscale AI’s mission is to move beyond the current homogeneous nature of AI clusters, providing an open, standards-based fabric that allows diverse hardware to interoperate seamlessly for the next decade of AI innovation.
Personnel: Deepti Chandra
Upscale AI distinguishes between two critical domains: scale-up networking, which creates a large compute environment within a rack where multiple GPUs see a flat, unified memory, and scale-out networking, which connects these domains through memory copy operations. The presentation highlights that the network has become the backplane of a distributed ecosystem, moving from a standard client-server model to a highly synchronized all-to-all communication pattern. Upscale AI aims to solve the challenges of this new era by providing purpose-built hardware and software that prioritizes predictable, ultra-low latency and zero-oversubscription bandwidth to prevent computational stalls.
In the scale-up domain, the architecture must support load-store operations with latencies under one microsecond. Aravind Srikumar introduces the Skyhammer architecture, a clean-sheet design built specifically for the scale-up environment that emphasizes “performance, performance, performance.” Unlike traditional networking, these systems utilize lightweight, optimized headers and offload congestion handling, such as Link Layer Retry (LLR) and Priority Flow Control (PFC), directly to the switch to minimize jitter. For scale-out needs, Upscale AI has partnered with NVIDIA to utilize the Spectrum-X substrate, building open, Ethernet-based systems around it that feature AI-optimized operating systems, hitless upgrades, and specialized circuitry for real-time power management and telemetry.
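How a link-layer retry keeps loss invisible to the hosts can be sketched as follows (a behavioral toy model, not Upscale AI's implementation): the sending side of the link keeps a replay buffer and retransmits a corrupted frame locally on a NACK, so the end hosts never observe the loss and no transport-level timeout fires.

```python
def send_with_llr(frames, corrupted, max_retries=3):
    """Deliver frames over a link with local Link Layer Retry.

    `corrupted` is a set of (seq, attempt) pairs marking which
    transmissions are damaged and must be replayed from the buffer.
    """
    delivered = []
    for seq, frame in enumerate(frames):
        for attempt in range(max_retries + 1):
            # A damaged frame triggers a NACK; replay it locally.
            if (seq, attempt) not in corrupted:
                delivered.append(frame)
                break
        else:
            raise RuntimeError(f"frame {seq} exceeded link-layer retries")
    return delivered

# Frame 1 fails its first transmission but is replayed transparently:
print(send_with_llr(["a", "b", "c"], corrupted={(1, 0)}))  # ['a', 'b', 'c']
```

The design point is latency: a replay across one link costs nanoseconds-to-microseconds, while an end-to-end retransmit timeout costs milliseconds of cluster-wide idle time.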
The company’s overarching vision is to enable a future of heterogeneous compute where customers can mix and match various processing units, such as GPUs, LPUs, and DPUs, without being locked into a single proprietary ecosystem. By utilizing open standards like SONiC, ESON, and UA Link, Upscale AI ensures that its fabric remains technology-agnostic and interoperable. This approach is designed to protect customer investments over a five-to-seven-year lifecycle, allowing the network to adapt as new AI workloads and specialized chips emerge. Ultimately, the goal is to transform the data center into an efficient token factory where every ounce of power and compute is maximized through an architected, rather than merely tuned, networking stack.
Personnel: Aravind Srikumar