Analytics Storage and AI, Data Prep and Data Lakes with Google Cloud

This video is part of the appearance, "Google Cloud Presents at AI Infrastructure Field Day 2 – Afternoon". It was recorded as part of AI Infrastructure Field Day 2 at 13:00 - 16:30 on April 22, 2025.

Vivek Saraswat, Group Product Manager at Google Cloud Storage, presented on analytics storage and AI, focusing on data preparation and data lakes. He emphasized the close ties between analytics and AI workloads and highlighted key innovations built to address the related challenges. The presentation showed that analytics plays a crucial role in the AI data pipeline, particularly in ingestion, data preparation, and cleaning.

Saraswat explained that customers increasingly build unified data lakehouses using open metadata table formats such as Apache Iceberg. This approach supports both analytics and AI workloads, including running analytics on AI data. He cited Snap as a customer example: it processes trillions of user events weekly, using Spark for data preparation and cleaning on top of Google Cloud Storage. Google Cloud Storage offers optimizations such as the Cloud Storage Connector, Anywhere Cache, and Hierarchical Namespace (HNS) to improve data preparation.

Saraswat covered the concept of a data lakehouse, which combines structured and unstructured data in a unified platform, with open table formats providing a separation layer. Examples from Snowflake, Databricks, Uber, and Google Cloud’s BigQuery tables for Apache Iceberg illustrated the diverse architectures in use. He also addressed common customer challenges such as data fragmentation, performance bottlenecks, and optimizing for resilience, security, and cost. The solutions he offered included Storage Intelligence, Anywhere Cache, and Bucket Relocate, with reference to customer case studies such as Spotify and Two Sigma.

Personnel: Vivek Saraswat
