Brandon Whitelaw, Dack Busch, and Douglas Gourlay presented for Qumulo at Cloud Field Day 21.
This presentation took place on October 23, 2024, from 11:00 to 12:30.
Presenters: Brandon Whitelaw, Dack Busch, Douglas Gourlay
Douglas Gourlay Introduces the Qumulo Cloud Data Platform
Watch on YouTube
Watch on Vimeo
Douglas Gourlay, CEO of Qumulo, introduced the Qumulo Cloud Data Platform by discussing the unprecedented growth in data and the challenges it presents for storage and processing. He highlighted how data growth is outpacing traditional storage media such as SSDs and HDDs, especially in environments where power and space are limited, which has led organizations to explore alternatives like cloud storage or building new data centers. Gourlay emphasized that the data being generated today is not just sitting in storage but is increasingly valuable, feeding into critical applications like medical imaging, AI processing, and research. He shared examples of customers dealing with massive amounts of data, such as research institutions generating hundreds of terabytes weekly, and the need to move this data efficiently to processing centers.
Gourlay also addressed the ongoing debate between cloud and on-premises storage, noting that the industry is moving towards a hybrid model where both options are viable depending on the specific needs of the business. He criticized the myopic views of some industry players who advocate for cloud-only or on-prem-only solutions, arguing that businesses need the freedom to choose the best option for their workloads. Qumulo’s strategy is to eliminate technological barriers, allowing customers to make decisions based on business needs rather than being constrained by the limitations of the technology. By normalizing the cost of cloud storage and making it comparable to on-prem solutions, Qumulo aims to provide flexibility and enable businesses to store and process data wherever it makes the most sense.
The Qumulo Cloud Data Platform is designed to run anywhere, whether on x86 (Intel and AMD) or ARM architectures, and across multiple cloud providers like AWS and Azure. The platform’s global namespace feature ensures that data is available everywhere it is needed, with strict consistency to prevent data loss. Gourlay explained how Qumulo’s system optimizes data transfer across wide-area networks, significantly reducing the time it takes to move large datasets between locations. The platform also integrates with AI systems, enabling customers to leverage their data in advanced AI models while protecting that data from being absorbed into the AI’s training process. Looking ahead, Qumulo aims to build a global data fabric that supports both unstructured and structured data, with features like global deduplication and automated data management to ensure data is stored in the most efficient and cost-effective way possible.
Personnel: Douglas Gourlay
What Qumulo is Hearing from Customers
Watch on YouTube
Watch on Vimeo
In this presentation, Brandon Whitelaw, VP of Cloud at Qumulo, discusses the evolving landscape of data management and the challenges customers face in adopting hybrid and multi-cloud strategies. He highlights that the traditional approach of consolidating disparate file systems into a single scale-out system is no longer sufficient, as most companies now operate across multiple clouds and geographic locations. Whitelaw points out that 94% of companies have adopted multi-cloud strategies, often using different clouds for different workloads, which adds complexity. He emphasizes that file data, once considered secondary, has now become critical, especially with the rise of AI and other next-gen applications. However, many file systems struggle to operate efficiently in the cloud, often offering only a fraction of their on-prem performance at a much higher cost.
Whitelaw explains that one of the key challenges is the inefficiency of moving data between on-prem and cloud environments, particularly when using traditional file systems that are not optimized for cloud performance. He notes that many companies end up creating multiple copies of their data across different systems, which increases costs and complexity. Qumulo aims to address this by providing a unified data fabric that allows seamless access to data across on-prem, cloud, and edge environments. This approach reduces the need for data replication and ensures that data is accessible and performant, regardless of where it resides. Qumulo’s solution also includes real-time file system analytics, which helps optimize data access and performance by preemptively caching frequently accessed data.
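As a toy illustration of the analytics-driven caching idea, the sketch below keeps a running access count per file and periodically warms a cache with the hottest paths. This is a conceptual stand-in, not Qumulo’s analytics engine; the counter, the top-N threshold, and the read_from_backend() helper are all hypothetical.

```python
from collections import Counter

access_counts: Counter = Counter()  # path -> recent access count
PREFETCH_TOP_N = 10                 # hypothetical: warm the 10 hottest files

def record_access(path: str) -> None:
    """Called on every read; this is the 'real-time analytics' input."""
    access_counts[path] += 1

def read_from_backend(path: str) -> bytes:
    """Placeholder for a fetch from object storage or a remote cluster."""
    return b"..."

def prefetch_hot_files(cache: dict) -> None:
    """Warm the cache with the most frequently read files so future
    reads are served locally instead of from the slower backend."""
    for path, _count in access_counts.most_common(PREFETCH_TOP_N):
        if path not in cache:
            cache[path] = read_from_backend(path)
```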
The presentation also delves into the technical aspects of Qumulo’s cloud-native file system, which is designed to leverage the scalability and cost-effectiveness of object storage like AWS S3 or Azure Blob, while overcoming the performance limitations typically associated with these storage types. By using advanced data layout techniques and caching mechanisms, Qumulo ensures that data stored in object storage can be accessed with the performance of a traditional file system. This approach allows customers to benefit from the elasticity and cost savings of cloud storage without having to rewrite their applications. Whitelaw concludes by emphasizing the importance of providing a consistent, high-performance data experience across all environments, enabling customers to focus on their workloads rather than managing complex data pipelines.
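To make that idea concrete, here is a minimal sketch of the read-through block caching pattern that lets object storage serve file-like reads. It is not Qumulo’s actual design: the bucket name, block size, and in-memory dict (standing in for a local NVMe cache) are all hypothetical.

```python
import boto3

s3 = boto3.client("s3")
S3_BUCKET = "example-qumulo-backing-store"  # hypothetical backing bucket
BLOCK_SIZE = 4 * 1024 * 1024                # hypothetical 4 MiB block size
block_cache = {}                            # in-memory stand-in for an NVMe cache

def read_block(object_key: str, block_index: int) -> bytes:
    """Serve a block from the local cache when possible; otherwise fetch
    just that byte range from object storage and cache it for reuse."""
    cache_key = (object_key, block_index)
    if cache_key in block_cache:
        return block_cache[cache_key]   # cache hit: local latency, no S3 round trip
    start = block_index * BLOCK_SIZE
    end = start + BLOCK_SIZE - 1
    resp = s3.get_object(
        Bucket=S3_BUCKET,
        Key=object_key,
        Range=f"bytes={start}-{end}",   # ranged GET avoids reading the whole object
    )
    data = resp["Body"].read()
    block_cache[cache_key] = data       # populate cache on the first miss
    return data
```

Ranged GETs plus a local cache are what allow cold data to sit in inexpensive object storage while hot data is served at file-system latency.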
Personnel: Brandon Whitelaw
Cloud Native Qumulo Architecture and Demo
Watch on YouTube
Watch on Vimeo
Qumulo’s cloud-native architecture, as presented by Dack Busch, emphasizes elasticity, performance, and cost-efficiency, particularly in AWS environments. The system is designed to scale dynamically, allowing users to adjust the number of nodes in a cluster based on workload demands. This flexibility is crucial for industries like media and entertainment, where workloads can spike unpredictably. Qumulo’s architecture allows users to scale up or down without service disruption, and even to change EC2 instance types in real time to optimize performance and cost. The system’s read cache is stored locally on NVMe drives, while the write cache is stored on EBS, which is more cost-effective and ensures data durability. The architecture also supports multi-AZ deployments, ensuring high availability and durability by spreading clusters across multiple availability zones.
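For a sense of what the instance-type swap involves at the EC2 layer, here is a hedged boto3 sketch that resizes a single node. In a live Qumulo cluster this would be orchestrated node by node so the file system stays online; the instance ID and target type below are hypothetical.

```python
import boto3

ec2 = boto3.client("ec2")

def resize_node(instance_id: str, instance_type: str) -> None:
    """Stop one node, change its EC2 instance type, and bring it back.
    Done one node at a time across a cluster, the file system as a
    whole never goes offline."""
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": instance_type},
    )
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

# Hypothetical node and target size
resize_node("i-0123456789abcdef0", "m6i.8xlarge")
```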
One of the key features of Qumulo’s cloud-native solution is its integration with S3 for persistent storage. The system defaults to S3 Intelligent-Tiering, but users can choose other S3 storage classes based on their needs. The architecture is designed to be highly efficient, with a focus on data consistency and cache coherency. Unlike some cloud systems that are eventually consistent, Qumulo ensures that data is always consistent, which is critical for customers who prioritize data integrity. The system also supports global namespace metadata, allowing users to access their data from anywhere as if it were local. This is particularly useful for scenarios where data needs to be accessed across different regions or environments, such as in disaster recovery or cloud bursting scenarios.
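Selecting a storage class is just a parameter on writes to the backing bucket. A minimal sketch with a hypothetical bucket and key (Qumulo manages this internally):

```python
import boto3

s3 = boto3.client("s3")

# Write a backing object with the S3 Intelligent-Tiering storage class
# (the default mentioned above); substituting another value such as
# "STANDARD" or "GLACIER_IR" selects a different class.
s3.put_object(
    Bucket="example-qumulo-backing-store",  # hypothetical bucket
    Key="fs/blocks/000001",                 # hypothetical object key
    Body=b"example block payload",
    StorageClass="INTELLIGENT_TIERING",
)
```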
Qumulo’s architecture also offers significant economic advantages. Customers only pay for the capacity they use, and there is no need to provision storage in advance. This pay-as-you-go model aligns with the principles of cloud-native design, where resources are consumed only when needed. The system also supports automated scaling through CloudWatch and Lambda functions, allowing users to add or remove nodes based on real-time performance metrics. Additionally, Qumulo’s integration with third-party tools and its ability to ingest data from existing S3 buckets make it a versatile solution for organizations looking to migrate or manage large datasets in the cloud. The demo showcased the system’s ability to scale from 3 to 20 nodes in just a few minutes, demonstrating its real-time elasticity and high-performance capabilities.
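A minimal sketch of how such a CloudWatch-plus-Lambda scaling loop could be wired up appears below. The metric namespace, metric name, threshold, and add_node() hook are hypothetical stand-ins, not Qumulo’s actual API.

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch")

SCALE_OUT_THRESHOLD = 2_000_000_000  # hypothetical: sustained 2 GB/s triggers scale-out

def add_node() -> None:
    """Placeholder: in practice this would call the cluster's management
    API or trigger an infrastructure-as-code workflow to add a node."""
    print("Scale-out requested")

def lambda_handler(event, context):
    """Periodic Lambda: read recent cluster throughput from CloudWatch
    and add a node when load stays above the threshold."""
    now = datetime.now(timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace="Qumulo/Cluster",        # hypothetical custom namespace
        MetricName="ReadThroughputBytes",  # hypothetical metric name
        StartTime=now - timedelta(minutes=10),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )
    points = stats["Datapoints"]
    if points and all(p["Average"] > SCALE_OUT_THRESHOLD for p in points):
        add_node()
```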
Personnel: Dack Busch