This presentation took place on June 13, 2024, from 16:00 to 17:00.
Presenters: Bobby Allen, Brandon Royal, Lisa Shen, Neama Dadkhahnikoo
For more information, visit http://g.co/cloud/fieldday2024
Google Cloud Vertex AI Platform
Watch on YouTube
Watch on Vimeo
Google Cloud’s Vertex AI platform is built on a rich history of innovation and enterprise readiness, offering an integrated AI-optimized portfolio. The platform leverages Google’s groundbreaking technologies such as TPUs and the transformer architecture, which have been instrumental in the development of large language models (LLMs) like Gemini. Gemini stands out for its multimodal capabilities, allowing it to process and reason across text, images, audio, and video simultaneously. This multimodal approach enables advanced functionalities like identifying specific moments in a video or understanding complex prompts that combine text and images. The platform also emphasizes flexibility and choice, providing options for different model sizes and context windows to match various use cases and cost considerations.
The presentation highlighted the practical applications of Vertex AI through several demos. One notable example demonstrated the model’s ability to process a 44-minute video and accurately identify a specific scene based on a text prompt, showcasing its long-context understanding. Another demo illustrated the use of a multimodal prompt, where a simple doodle was used to locate a corresponding scene in the video. These examples underscore the potential of Vertex AI in real-world scenarios, such as customer service chatbots, sports highlight identification, and even complex tasks like code transformation and financial document analysis. The platform’s support for context caching and batch processing further enhances its efficiency and cost-effectiveness.
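The long-video lookup from the demo can be sketched with the Vertex AI Python SDK. The project ID, bucket path, model name, and helper functions below are illustrative assumptions, not the presenters’ actual code; the SDK call requires `google-cloud-aiplatform` and Google Cloud credentials.

```python
# Sketch: ask Gemini on Vertex AI to locate a scene in a long video, then
# convert its "HH:MM:SS" answer into seconds (e.g. for a player seek).
# Project, bucket, and model name below are hypothetical placeholders.

def timestamp_to_seconds(ts: str) -> int:
    """Convert a "MM:SS" or "HH:MM:SS" timestamp into total seconds."""
    seconds = 0
    for part in ts.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def find_scene(video_uri: str, description: str) -> str:
    """Return the model's timestamp answer for a described scene."""
    # Imported lazily: needs `pip install google-cloud-aiplatform`
    # and application-default credentials to actually run.
    import vertexai
    from vertexai.generative_models import GenerativeModel, Part

    vertexai.init(project="my-project", location="us-central1")  # hypothetical
    model = GenerativeModel("gemini-1.5-pro")
    response = model.generate_content([
        Part.from_uri(video_uri, mime_type="video/mp4"),
        f"At what timestamp (MM:SS) does this happen: {description}? "
        "Reply with the timestamp only.",
    ])
    return response.text.strip()

# Usage (requires credentials, so shown as a comment):
#   ts = find_scene("gs://my-bucket/keynote.mp4", "the presenter shows a doodle")
#   timestamp_to_seconds(ts)
```

The timestamp parsing is separated from the API call so the seek logic stays testable without cloud access.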
Vertex AI also focuses on enterprise readiness, ensuring data security, governance, and compliance. The platform provides tools for model evaluation, monitoring, and customization, allowing enterprises to tailor models to their specific needs while protecting their data. Features like grounding APIs help ensure the accuracy of model outputs by linking responses to verified data sources, addressing concerns about AI-generated content’s reliability. Additionally, the platform supports various levels of coding expertise, from no-code to full-code, making it accessible to a wide range of users. With its comprehensive suite of tools and emphasis on security and flexibility, Vertex AI positions itself as a robust solution for enterprises looking to leverage AI for diverse applications.
Personnel: Neama Dadkhahnikoo
Google Cloud Run and GenAI Apps
Watch on YouTube
Watch on Vimeo
In this presentation, Lisa Shen, a product manager at Google Cloud, introduces Cloud Run, Google Cloud’s serverless runtime platform, and discusses its integration with generative AI (GenAI) applications. Cloud Run simplifies the deployment and scaling of modern workloads by removing the overhead of infrastructure management. Built on container technology, it offers flexibility, portability, and cost-saving benefits, as users only pay when their code is running. Shen highlights two primary resources within Cloud Run: services for HTTP endpoints and jobs for executing tasks to completion, making it suitable for various use cases, including web applications and batch data processing.
Shen provides examples of companies like L’Oreal and Ford that have adopted Cloud Run to modernize their infrastructure and accelerate innovation. L’Oreal, for instance, used Cloud Run to implement a GenAI service for its employees, resulting in the rapid launch of L’Oreal GPT. Similarly, Ford transitioned to a Cloud Run-first approach to enhance scalability and reliability in vehicle design and manufacturing. These examples illustrate Cloud Run’s ability to improve developer velocity, reduce costs, and simplify application deployment, making it an attractive option for both cloud-native and traditional enterprises.
The presentation includes a demonstration of building a GenAI application using Cloud Run and Vertex AI. Shen explains how Cloud Run can handle various architectural components of GenAI applications, such as serving and orchestration, data ingestion, and quality evaluation. The demo showcases the process of deploying a web-based application that queries Cloud Run release notes, highlighting the ease of use and efficiency of Cloud Run in handling such tasks. Shen emphasizes that while Cloud Run is primarily for serving and orchestrating applications, more complex tasks like model fine-tuning and heavy lifting are better suited for Vertex AI or Google Kubernetes Engine (GKE). The session concludes with a discussion on managing costs and scaling with Cloud Run, ensuring that users can deploy applications efficiently without unexpected expenses.
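The serving-and-orchestration role Shen describes for Cloud Run can be sketched as a small HTTP service using only the Python standard library; the endpoint, stub data, and keyword search below are illustrative assumptions, not the demo’s actual implementation (which queried release notes via Vertex AI).

```python
# Sketch: a Cloud Run-style HTTP service answering release-notes queries.
# Cloud Run injects the PORT environment variable; the container just needs
# to listen on it. Data and matching logic are stand-ins for the real demo.
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Stub corpus standing in for the real release-notes data source.
RELEASE_NOTES = [
    "Cloud Run jobs now support task retries.",
    "Cloud Run services can keep CPU always allocated.",
]

def search_notes(query: str) -> list[str]:
    """Naive keyword match; the demo answered queries with Vertex AI."""
    words = query.lower().split()
    return [n for n in RELEASE_NOTES if any(w in n.lower() for w in words)]

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        q = parse_qs(urlparse(self.path).query).get("q", [""])[0]
        body = json.dumps({"results": search_notes(q)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# To run locally (Cloud Run sets PORT in production):
#   HTTPServer(("", int(os.environ.get("PORT", 8080))), Handler).serve_forever()
```

Because the container only pays while handling requests, a service like this scales to zero between queries, which is the cost model Shen highlights.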
Personnel: Lisa Shen
Google Kubernetes Engine – The Container Platform for AI at Scale from Google Cloud
Watch on YouTube
Watch on Vimeo
Brandon Royal, a Product Manager at Google Cloud, describes how Kubernetes can be leveraged for AI applications, particularly focusing on model training and serving. He begins by emphasizing the growing importance of generative AI across many organizations, highlighting that Google Kubernetes Engine (GKE) provides a robust platform for integrating AI into products and services. The platform is designed to handle the increasing complexity and scale of AI models, which demand high efficiency and cost-effectiveness. Royal mentions that GKE, often referred to as the operating system of Google’s AI hypercomputer, orchestrates workloads across storage, compute, and networking to deliver optimal price performance.
Royal addresses the challenges of scaling AI workloads, noting that model sizes are growing and pushing the limits of infrastructure. To tackle these challenges, GKE offers several optimizations, such as dynamic workload scheduling and container preloading, which enhance the efficiency and utilization of AI resources like CPUs, GPUs, and TPUs. He introduces the concept of “goodput,” a metric for measuring machine learning productivity, which includes scheduling goodput, runtime goodput, and program goodput. These metrics help ensure that resources are utilized effectively, minimizing idle time and maximizing forward progress in model training. Royal also highlights the importance of leveraging open-source frameworks like Ray and Kubeflow, which integrate seamlessly with GKE to provide a comprehensive AI development and deployment environment.
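The goodput idea can be made concrete with a small calculation. Modeling overall goodput as the product of its three components is an assumption for illustration here, not Royal’s exact formula, but it shows why seemingly healthy individual factors still compound into significant waste.

```python
# Sketch: ML "goodput" as the fraction of accelerator capacity that produces
# forward training progress, modeled (as an assumption) as the product of
# its three components from the talk.

def ml_goodput(scheduling: float, runtime: float, program: float) -> float:
    """Each factor is a fraction in [0, 1]:
    - scheduling: share of time the job holds the accelerators it needs
    - runtime: share of that time the job runs without failures or restarts
    - program: share of running time doing useful compute rather than stalls
    """
    for f in (scheduling, runtime, program):
        if not 0.0 <= f <= 1.0:
            raise ValueError("goodput factors must be fractions in [0, 1]")
    return scheduling * runtime * program

# Factors that each look fine in isolation still compound:
# ml_goodput(0.95, 0.90, 0.90) ≈ 0.77, i.e. nearly a quarter of the
# accelerator spend produces no forward progress.
```

This is why the talk treats scheduling, failure recovery, and stall reduction as a single optimization problem rather than three separate ones.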
The presentation includes a demo showcasing the optimization capabilities of GKE. Royal demonstrates how container preloading and persistent volume claims can significantly reduce the time required to deploy AI models. By preloading container images and sharing model weights across instances, GKE can cut down deployment times from several minutes to mere seconds. This optimization is crucial for large-scale AI deployments, where efficiency and speed are paramount. Royal concludes by encouraging the audience to explore the resources and tutorials available for building AI platforms on GKE, emphasizing that these optimizations can provide a competitive edge in the fast-evolving field of AI.
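The weight-sharing technique from the demo can be sketched as a Kubernetes manifest: model weights live on a read-only volume mounted by every serving replica, so no Pod re-downloads them at startup. The storage class name, sizes, and image below are illustrative assumptions, not the demo’s actual configuration.

```yaml
# Sketch: share pre-loaded model weights across serving Pods via a
# ReadOnlyMany PersistentVolumeClaim (names and sizes are placeholders).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-weights
spec:
  accessModes: ["ReadOnlyMany"]      # many Pods mount the same weights
  storageClassName: hyperdisk-ml     # illustrative storage class name
  resources:
    requests:
      storage: 100Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server
spec:
  replicas: 3
  selector:
    matchLabels: {app: model-server}
  template:
    metadata:
      labels: {app: model-server}
    spec:
      containers:
      - name: server
        image: us-docker.pkg.dev/example/serving:latest  # placeholder image
        volumeMounts:
        - name: weights
          mountPath: /models
          readOnly: true
      volumes:
      - name: weights
        persistentVolumeClaim:
          claimName: model-weights
          readOnly: true
```

Combined with preloaded container images, this is how startup drops from minutes of downloading to seconds of mounting.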
Personnel: Brandon Royal
Cloud Field Day at Google Cloud Wrap-Up
Watch on YouTube
Watch on Vimeo
Wrapping up the Cloud Field Day 20 presentation by Google Cloud, Bobby Allen emphasized the importance of innovation in the realm of cloud computing. He pointed out that innovation happens at the intersection of where the model lives and where the application runs, highlighting the significance of both components. Allen referenced various Google Cloud tools presented during this day-long session such as Vertex AI, GKE, and Cloud Run, and demonstrated through examples how these tools can be leveraged to build generative AI applications. He stressed the flexibility and adaptability of these tools depending on the specific needs and stages of a project, suggesting that different iterations and combinations can be used simultaneously to optimize outcomes.
Allen also discussed the broader implications of leveraging AI within different layers of the platform and the software development lifecycle (SDLC). He reinforced the idea that AI itself is not the end goal but a means to enhance and transform existing processes and applications. By showcasing tools like Cloud Assist and Code Assist along with AI-centric platforms, he illustrated how these technologies can be integrated to provide substantial improvements. The emphasis was on Google Cloud’s role as a transformation partner rather than just a technology provider, offering numerous options that avoid technical debt and allow for flexible decision-making.
Personnel: Bobby Allen