Intro

Robert Caulk

Context Engineering

Text embedding inference

Microservice orchestration

Startups vs incumbents

Timestamp filtering

Database retrieval evaluation

All-in-one options

Recommendations

Description:

Learn how to build and deploy a production-scale Retrieval Augmented Generation (RAG) system for real-time news processing in this technical talk from Vector Space Talks. The talk covers the architecture that lets AskNews.app process over 1 million news articles per day by integrating four open-source technologies: Flowdapt for cluster orchestration, Qdrant as the vector database, vLLM for language model serving, and TEI (Text Embeddings Inference) for embedding generation. It explores efficient batch upserting, fast vector search, filtering, and multi-node scaling, and shows how these tools enable real-time news distillation and enriched chat experiences for thousands of simultaneous users. It also explains why startups building on these foundational tools can hold a competitive advantage over established tech companies, and offers practical strategies for deploying production-ready RAG systems at scale.

Production-Scale Retrieval Augmented Generation for Real-Time News Distillation

Qdrant - Vector Database & Search Engine

#Computer Science #Machine Learning #Retrieval Augmented Generation #Programming #Cloud Computing #Microservices #Databases #Vector Databases #Distributed Systems #Cluster Architecture #Qdrant #vLLM
