Open Source LLMs on AWS SageMaker

Open Source RAG Pipeline

Deploying Hugging Face LLM on SageMaker

LLM Responses with Context

Why Retrieval Augmented Generation

Deploying our MiniLM Embedding Model

Creating the Context Embeddings

Downloading the SageMaker FAQs Dataset

Creating the Pinecone Vector Index

Making Queries in Pinecone

Implementing Retrieval Augmented Generation

Deleting our Running Instances

Description:

This video tutorial shows how to build Large Language Model (LLM) and Retrieval Augmented Generation (RAG) pipelines using open-source models from Hugging Face deployed on AWS SageMaker. Semantic search is implemented with the MiniLM sentence transformer and Pinecone. The instructor deploys a Hugging Face LLM on SageMaker, generates LLM responses with context, and explains why Retrieval Augmented Generation helps. The walkthrough then covers deploying the MiniLM embedding model, creating the context embeddings, and setting up a Pinecone vector index using the SageMaker FAQs dataset, followed by making queries in Pinecone and implementing RAG end to end. The final step is deleting the running instances so that no resources are left running after the tutorial.
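
The retrieve-then-prompt flow described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code from the video: the `embed` function is a toy bag-of-words stand-in for the MiniLM embedding model, the in-memory `index` list stands in for a Pinecone vector index, the three `docs` strings are made-up placeholders rather than entries from the SageMaker FAQs dataset, and no LLM endpoint is called. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real sentence transformer."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, index, top_k=2):
    """Return the top_k stored texts most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(query, contexts):
    """Place the retrieved passages ahead of the question in a single prompt."""
    context_block = "\n".join(contexts)
    return (
        "Answer using the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Placeholder passages (illustrative only, not from the FAQs dataset).
docs = [
    "Spot instances let you train models at lower cost.",
    "Endpoints serve deployed models for real-time inference.",
    "Notebooks provide a hosted environment for experiments.",
]

# "Index" the passages: store each text alongside its vector.
index = [(doc, embed(doc)) for doc in docs]

query = "How do I serve deployed models?"
contexts = retrieve(query, index, top_k=1)
prompt = build_prompt(query, contexts)
print(prompt)
```

In a real pipeline the resulting prompt would be sent to the deployed LLM endpoint, and the similarity search would be handled by the vector index rather than a sorted Python list.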

Hugging Face LLMs with SageMaker - RAG with Pinecone

James Briggs

#Computer Science #Machine Learning #Hugging Face #Artificial Intelligence #Retrieval Augmented Generation (RAG)
