This video is part of the appearance “Qumulo Presents at Storage Field Day 8”. It was recorded as part of Storage Field Day 8 from 8:00 to 10:00 on October 22, 2015.
In his presentation on Qumulo Core Metadata Analysis at Storage Field Day 8, Aaron Passey focused on the importance of analytics, particularly metadata analytics, in managing large datasets. He emphasized that environments with billions of files need rapid answers to questions about their data, and that traditional methods such as running disk usage or find commands cannot provide them. As data scales up, these approaches become less effective: routine operations such as scanning data for backups take a long time, and the information they return is already out of date.
To tackle the challenges of large-scale data, Passey described Qumulo’s approach of integrating metadata analytics directly into the file system itself. The approach relies on aggregates, functions that summarize key attributes of files and directories, to make metadata queries efficient. Aggregated values such as total blocks used or the last changed time are available immediately, without the lengthy tree scans that traditional systems require. Querying metadata without an exhaustive search saves time and reduces I/O overhead on the storage system.
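The idea of per-directory aggregates can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Qumulo's implementation: the `Node` class, its field names (`sum_blocks`, `sum_files`, `max_ctime`) and the helpers `add_dir` and `add_file` are hypothetical, and the aggregates are propagated eagerly on every change, whereas the presentation describes metadata that is refreshed within about a minute.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """A file or directory in a toy file system tree (hypothetical model)."""
    name: str
    parent: Optional["Node"] = None
    is_dir: bool = False
    children: Dict[str, "Node"] = field(default_factory=dict)
    # Aggregates over the whole subtree rooted at this node.
    sum_blocks: int = 0      # total blocks used by all files below
    sum_files: int = 0       # number of files below
    max_ctime: float = 0.0   # most recent change time below


def add_dir(parent: Optional[Node], name: str) -> Node:
    """Create an empty directory; it contributes nothing to the aggregates yet."""
    node = Node(name=name, parent=parent, is_dir=True)
    if parent is not None:
        parent.children[name] = node
    return node


def add_file(directory: Node, name: str, blocks: int, ctime: float) -> Node:
    """Create a file and fold its attributes into every ancestor's aggregates."""
    node = Node(name=name, parent=directory,
                sum_blocks=blocks, sum_files=1, max_ctime=ctime)
    directory.children[name] = node
    # Walk up to the root once; the cost is the depth of the tree,
    # not the number of files in it.
    ancestor: Optional[Node] = directory
    while ancestor is not None:
        ancestor.sum_blocks += blocks
        ancestor.sum_files += 1
        ancestor.max_ctime = max(ancestor.max_ctime, ctime)
        ancestor = ancestor.parent
    return node
```

With this layout, a question such as "how many blocks does this directory use?" is answered by reading `sum_blocks` on that directory, with no tree scan. The trade-off is extra work on each change, bounded by the depth of the tree.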
Passey also explained how aggregates speed up the search for files modified within a specific timeframe, which makes incremental backups much more efficient. Unlike conventional scans, whose results can be out of date by the time they finish, Qumulo’s metadata stays relatively fresh and is generally updated within a minute or less. By avoiding the bottlenecks of extensive scans and providing quick access to detailed file metrics, Qumulo positions its system as well suited to enterprises managing vast amounts of unstructured data.
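One way to picture how an aggregate helps an incremental backup: if every directory records the newest change time anywhere beneath it, a search for recently changed files can skip whole subtrees. The sketch below is an illustration under assumptions, not Qumulo's code. `Entry`, `refresh_aggregates` and `changed_since` are hypothetical names, and the aggregate is recomputed here in one bottom-up pass, where a real file system would keep it current as files change.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Entry:
    """A file or directory in a toy tree (hypothetical model)."""
    path: str
    ctime: float = 0.0        # the entry's own change time
    is_dir: bool = False
    children: List["Entry"] = field(default_factory=list)
    max_ctime: float = 0.0    # aggregate: newest change time in the subtree


def refresh_aggregates(entry: Entry) -> float:
    """Fill in max_ctime for every entry with a single bottom-up pass."""
    entry.max_ctime = max(
        [entry.ctime] + [refresh_aggregates(c) for c in entry.children]
    )
    return entry.max_ctime


def changed_since(entry: Entry, since: float,
                  visited: List[str]) -> Iterator[str]:
    """Yield paths of files changed at or after `since`.

    `visited` records every entry that was inspected, to show how much
    of the tree the search touched.
    """
    visited.append(entry.path)
    if entry.max_ctime < since:
        return  # nothing below changed recently: prune the whole subtree
    if not entry.is_dir:
        yield entry.path
        return
    for child in entry.children:
        yield from changed_since(child, since, visited)
```

A backup job would call `changed_since(root, last_backup_time, visited)` and copy only the yielded paths. Directories whose `max_ctime` predates the last backup are never descended into, so the work scales with the amount of changed data rather than with the size of the tree.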
Personnel: Aaron Passey