1. Introduction
2. Problems with large language models
3. Core things to consider
4. Why Kubernetes
5. G Appliances
6. Customer feedback
7. Technology learnings
8. Technology
9. Implementation
10. Scaling characteristics
11. Serving subsystem
12. LangChain
13. Security
14. Data sensitivity
15. Natural language API
16. Identity-Aware Proxy
17. GKE Quick Start
18. JupyterHub
Description:
Learn how to deploy a fully functional Retrieval-Augmented Generation (RAG) application to Google Cloud using open-source tools and models from Ray, Hugging Face, and LangChain in this 41-minute conference talk from Google Cloud Next 2024. Discover techniques for augmenting the application with custom data using Ray on Google Kubernetes Engine (GKE) and Cloud SQL's pgvector extension. Explore the process of deploying any model from Hugging Face to GKE and rapidly developing LangChain applications on Cloud Run. Gain insights into core considerations, technology learnings, implementation strategies, scaling characteristics, serving subsystems, security measures, and data sensitivity. Speakers Alex Zakonov, Brandon Royal, and Stephen Allen cover topics including problems with large language models, the benefits of Kubernetes, G Appliances, customer feedback, natural language APIs, Identity-Aware Proxy, and GKE Quick Start. By the end of the session, acquire the knowledge to deploy and customize your own RAG application to meet specific needs.

Deploying Retrieval-Augmented Generation Applications with Ray, Hugging Face, and LangChain on Google Cloud

Google Cloud Tech
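The RAG flow the talk describes — retrieve relevant documents from a vector store (pgvector in the talk), then prepend them to the prompt sent to the LLM — can be sketched in a few lines. This is a toy, self-contained illustration only: the bag-of-words `embed` function stands in for a real Hugging Face embedding model, and the in-memory `VectorStore` class stands in for Cloud SQL with pgvector; all names here are hypothetical, not from the talk.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real app would call a
    # Hugging Face sentence-embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """In-memory stand-in for pgvector: stores (embedding, doc) rows."""
    def __init__(self):
        self.rows = []

    def add(self, doc: str) -> None:
        self.rows.append((embed(doc), doc))

    def similarity_search(self, query: str, k: int = 1) -> list:
        # Rank stored docs by similarity to the query embedding.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda r: cosine(q, r[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]

def rag_prompt(store: VectorStore, question: str) -> str:
    # Retrieval-Augmented Generation: fetch context first, then
    # build the prompt that would be sent to the serving LLM.
    context = "\n".join(store.similarity_search(question))
    return f"Context:\n{context}\n\nQuestion: {question}"

store = VectorStore()
store.add("GKE runs containerized workloads on Google Cloud.")
store.add("pgvector adds vector similarity search to PostgreSQL.")
print(rag_prompt(store, "What does pgvector do?"))
```

In the production setup the talk covers, the same three steps (embed, similarity search, prompt assembly) are handled by a Hugging Face model served on GKE, Cloud SQL's pgvector extension, and a LangChain chain running on Cloud Run.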