Haseeb Budhani presented for Rafay at AI Infrastructure Field Day 3
Accelerating AI Infrastructure Adoption for GPU Providers and Enterprises with Rafay
Haseeb Budhani, CEO of Rafay Systems, begins by acknowledging the confusion surrounding Rafay’s classification: people variously describe it as a platform as a service (PaaS), orchestration, or middleware, and he welcomes feedback on which term fits best. He then pivots to the current market dynamics in AI infrastructure, particularly the gap between the cost of consuming GPUs through providers like Amazon and the cost of acquiring them independently. He illustrates this with DeepSeek R1: Amazon charges significantly more to consume the model via Bedrock than it costs to rent the underlying H100 GPU directly.
Budhani argues that many companies renting out GPUs are not true “clouds” and may struggle in the long term because they are not selling services on top of the GPUs. He references an Accenture report suggesting that GPU as a Service (GPaaS) will diminish as the market matures, with more value being derived from services. He emphasizes that hyperscalers like Amazon have understood this for a long time, generating most of their revenue from services rather than infrastructure as a service (IaaS). This presents an opportunity for Rafay to help GPU providers and enterprises deliver these higher-level services, enabling them to compete more effectively with hyperscalers and unlock significant cost savings. As an example, he cites a telco in Thailand that could save millions by deploying its own AI infrastructure with Rafay’s software.
Budhani concludes by emphasizing the increasing importance of sovereign clouds, especially in regions like Europe and the Middle East. Telcos, which previously lost business to public clouds, now have a renewed opportunity to provide AI infrastructure locally due to sovereignty requirements. He states that Rafay aims to give these telcos and other regional providers the software stack they need to deliver such services, addressing a problem common to many geographies. He highlights Indosat, a telco in Indonesia, as an early example of a customer using Rafay to deliver a sovereign AI cloud, underscoring the growing global demand for such solutions.
Personnel: Haseeb Budhani
Bridging the gap from GPU-as-a-Service to AI Cloud with Rafay
Rafay CEO Haseeb Budhani argues that to truly be considered a cloud provider, organizations must offer self-service consumption, applications (or tools), and multi-tenancy. He contends that many GPU clouds currently rely on manual processes, spreadsheets, and bare metal servers, which don’t qualify as true cloud solutions. Budhani emphasizes that users should be able to access a portal, create an account, and consume services on demand, without requiring backend intervention for tasks like VLAN setup or IP address management.
Budhani elaborates on his definition of multi-tenancy, outlining the technical requirements for supporting diverse customer needs. This includes secure VMs, operating system images with pre-installed tools, public IP addresses, firewall rules, and VPCs. He highlights the difference between customers needing a single GPU versus those requiring 64 GPUs and emphasizes that all necessary networking and security configurations must be automated to provide a true self-service experience.
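The automation requirement described above can be pictured as a function that turns a tenant's self-service request into an ordered provisioning plan, with no operator in the loop. The following is a minimal sketch in Python, not Rafay's implementation: the names `TenantRequest`, `FirewallRule`, and `plan_provisioning`, the image name, and the assumption of 8 GPUs per node are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class FirewallRule:
    port: int
    source_cidr: str


@dataclass
class TenantRequest:
    """A hypothetical self-service request submitted through a portal."""
    tenant: str
    gpu_count: int
    os_image: str
    needs_public_ip: bool = False
    firewall_rules: list = field(default_factory=list)


def plan_provisioning(req, gpus_per_node=8):
    """Return the ordered steps needed to satisfy a request.

    gpus_per_node=8 is an assumption for this sketch, not a platform fact.
    """
    if req.gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")

    # Every tenant gets its own network boundary first (multi-tenancy).
    steps = [f"create isolated VPC for tenant '{req.tenant}'"]

    # A 1-GPU request and a 64-GPU request follow the same code path;
    # only the number of VMs differs.
    node_count = math.ceil(req.gpu_count / gpus_per_node)
    remaining = req.gpu_count
    for i in range(node_count):
        gpus = min(remaining, gpus_per_node)
        steps.append(
            f"launch VM {i} with {gpus} GPU(s) from image '{req.os_image}'"
        )
        remaining -= gpus

    if req.needs_public_ip:
        steps.append("allocate public IP address")
    for rule in req.firewall_rules:
        steps.append(f"allow port {rule.port} from {rule.source_cidr}")
    return steps


if __name__ == "__main__":
    big = TenantRequest(
        tenant="globex",
        gpu_count=64,
        os_image="ubuntu-cuda",
        needs_public_ip=True,
        firewall_rules=[FirewallRule(22, "203.0.113.0/24")],
    )
    for step in plan_provisioning(big):
        print(step)
```

The point of the sketch is that networking and security steps are derived from the request itself, so a customer asking for one GPU and a customer asking for 64 are served by the same automated path.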
Ultimately, Budhani argues that the goal is self-service consumption of applications or tools, not just GPUs. He believes the industry is moving beyond the “GPU as a service” concept, with users now focused on consuming models and endpoints rather than managing the underlying GPU infrastructure. He suggests that his company, Rafay, addresses many of the complexities in this space, offering solutions that enable the delivery of applications and tools in a self-service, multi-tenant environment.
Personnel: Haseeb Budhani
From Infrastructure Chaos to Cloud-Like Control with Rafay
Rafay, founded seven years ago, initially focused on Kubernetes but has evolved to address the broader challenge of simplifying compute consumption across various environments. Its solution aims to provide self-service compute to companies across verticals.
Rafay typically engages with companies that already have existing infrastructure, automation, and deployments. The core problem they solve is standardization across diverse environments and users. They help companies build a platform engineering function that enables efficient management of environments, upgrades, and policies. The Rafay platform abstracts the underlying infrastructure, providing an interface for users to request and consume compute resources without needing to understand the complexities of the underlying systems.
Rafay’s platform allows organizations to deliver self-service compute across diverse environments and teams, managing identity, policies, and automation. The goal is to reduce the time developers waste on infrastructure tasks, which, according to Rafay, can be as high as 20% in large enterprises. They offer a comprehensive solution that encompasses inventory management, governance, and control, all while generating the underlying infrastructure as code for versioning and auditability. In summary, Rafay enables companies to move away from custom, in-house solutions to a standardized, automated, and cloud-like compute consumption model.
Personnel: Haseeb Budhani
Unlock AI Cloud Potential with the Rafay Platform
Haseeb Budhani, CEO of Rafay Systems, discusses how the Rafay platform can be used to address AI use cases. The platform provides a white-label ready portal that allows end users to self-service provision various compute resources and AI/ML platform services. This enables cloud providers and enterprises to offer services like Kubernetes, bare metal, GPU as a service, and NVIDIA NIM with a simple and standardized experience.
The Rafay platform leverages standardization, infrastructure-as-code (IaC) concepts, and GitOps pipelines to drive consumption for a large number of enterprises. It is built on a Git engine for configuration management and handles complex multi-tenancy requirements, with integration to various identity providers. Customers can offer different services, compute functions, and form factors to their end customers through configurable, white-labeled catalogs. The platform also features a serverless layer for deploying custom code on Kubernetes or VM environments, enabling partners and customers to deliver a wide range of applications and services, from DataRobot to Jupyter notebooks, as part of their offerings.
Rafay addresses security concerns through SOC 2 Type 2 compliance for its SaaS product, providing pentest reports and agent reports for customer assurance. For larger customers, particularly cloud providers, an air-gapped product is offered, allowing them to deploy and manage the Rafay controller within their own secure environments. Furthermore, the platform’s unique Software Defined Perimeter (SDP) architecture enables it to manage Kubernetes clusters remotely, even on edge devices with limited connectivity, by establishing an inside-out connection and a proxy service for secure communication.
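The "inside-out connection" idea can be illustrated with a toy example in which the cluster-side agent dials out to the controller and the controller then sends commands back over that same connection, so the cluster never has to accept inbound traffic. This is only a sketch of the general pattern using Python's standard library on the loopback interface; it is not Rafay's protocol, and the function names, command names, and replies are hypothetical.

```python
import socket
import threading


def controller(server_sock, commands, results):
    """Controller side: wait for the agent to connect, then send commands
    over the connection the agent opened and collect the replies."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("r") as rfile, conn.makefile("w") as wfile:
        for cmd in commands:
            wfile.write(cmd + "\n")
            wfile.flush()
            results.append(rfile.readline().strip())
    # Leaving the with-block closes the connection, which ends the agent loop.


def agent(controller_addr, handlers):
    """Cluster side: open an OUTBOUND connection and serve commands on it.
    The agent never listens on a port of its own."""
    with socket.create_connection(controller_addr) as conn, \
            conn.makefile("r") as rfile, conn.makefile("w") as wfile:
        for line in rfile:
            handler = handlers.get(line.strip())
            reply = handler() if handler else "unknown command"
            wfile.write(reply + "\n")
            wfile.flush()


def demo():
    # Only the controller listens; port 0 lets the OS pick a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    addr = server.getsockname()

    results = []
    commands = ["list-pods", "get-version", "reboot"]
    t = threading.Thread(target=controller, args=(server, commands, results))
    t.start()

    # Hypothetical commands the cluster-side agent knows how to answer.
    handlers = {
        "list-pods": lambda: "pod-a,pod-b",
        "get-version": lambda: "v1.0",
    }
    agent(addr, handlers)
    t.join()
    server.close()
    return results


if __name__ == "__main__":
    print(demo())
```

In a real deployment the connection would also be encrypted and authenticated; the sketch omits that to keep the direction of the connection as the only moving part.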
Personnel: Haseeb Budhani