This video is part of the appearance “Google Cloud Presents at Cloud Field Day 20”. It was recorded as part of Cloud Field Day 20 from 16:00 to 17:00 on June 13, 2024.
In this presentation, Lisa Shen, a product manager at Google Cloud, introduces Cloud Run, Google Cloud’s serverless runtime platform, and discusses its use for generative AI (GenAI) applications. Cloud Run simplifies the deployment and scaling of modern workloads by removing the overhead of infrastructure management. Because it is built on container technology, it offers flexibility and portability, and it saves costs because users pay only while their code is running. Shen highlights the two primary resources within Cloud Run: services, which expose HTTP endpoints, and jobs, which execute tasks to completion. Together they cover use cases ranging from web applications to batch data processing.
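The distinction between services and jobs can be sketched in a few lines of Python. This is a minimal illustration rather than code from the presentation: it assumes a container that uses only the standard library, and the names `HelloHandler`, `run_service`, `shard_for_task`, and `run_job` are hypothetical. It relies on two parts of the Cloud Run container contract: a service listens on the port given in the `PORT` environment variable, and each task of a job can read `CLOUD_RUN_TASK_INDEX` and `CLOUD_RUN_TASK_COUNT` to pick its share of the work.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Service-style workload: answer each incoming HTTP request."""

    def do_GET(self):
        body = b"Hello from a Cloud Run service sketch\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run_service():
    """Serve HTTP until the instance is shut down (blocks forever)."""
    # Cloud Run tells the container which port to listen on via PORT.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), HelloHandler).serve_forever()


def shard_for_task(items, task_index, task_count):
    """Return the items that one job task should process (round-robin split)."""
    return [item for i, item in enumerate(items) if i % task_count == task_index]


def run_job(items):
    """Job-style workload: process this task's share of the items, then return."""
    # Cloud Run jobs set these variables for each parallel task.
    task_index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    task_count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))
    # Upper-casing stands in for real batch work in this sketch.
    return [item.upper() for item in shard_for_task(items, task_index, task_count)]
```

A container built for a service would call `run_service()` as its entry point and stay up to answer requests, while a container built for a job would call `run_job(...)` and exit when the work is done, which is what "executing tasks to completion" amounts to in practice.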
Shen provides examples of companies like L’Oreal and Ford that have adopted Cloud Run to modernize their infrastructure and accelerate innovation. L’Oreal, for instance, used Cloud Run to implement a GenAI service for its employees, resulting in the rapid launch of L’Oreal GPT. Similarly, Ford transitioned to a Cloud Run-first approach to enhance scalability and reliability in vehicle design and manufacturing. These examples illustrate Cloud Run’s ability to improve developer velocity, reduce costs, and simplify application deployment, making it an attractive option for both cloud-native and traditional enterprises.
The presentation includes a demonstration of building a GenAI application using Cloud Run and Vertex AI. Shen explains how Cloud Run can handle several architectural components of GenAI applications, such as serving and orchestration, data ingestion, and quality evaluation. The demo walks through deploying a web-based application that answers queries about the Cloud Run release notes, highlighting how easily and efficiently Cloud Run handles such tasks. Shen emphasizes that Cloud Run is primarily for serving and orchestrating applications, while heavier workloads such as model fine-tuning are better suited to Vertex AI or Google Kubernetes Engine (GKE). The session concludes with a discussion of managing costs and scaling with Cloud Run so that users can deploy applications efficiently without unexpected expenses.
Personnel: Lisa Shen