1. Intro
2. Robert Caulk
3. Context Engineering
4. Text embedding inference
5. Microservice orchestration
6. Startups vs incumbents
7. Timestamp filtering
8. Database retrieval evaluation
9. All-in-one options
10. Recommendations
Description:
Learn how to build and deploy a production-scale Retrieval Augmented Generation (RAG) system for real-time news processing in this technical talk from Vector Space Talks. Discover the architecture behind AskNews.app's ability to process over 1 million daily news articles through the integration of four key open-source technologies: Flowdapt for cluster orchestration, Qdrant for vector database management, vLLM for language model serving, and TEI for embedding generation. Explore essential features like efficient batch upserting, fast vector search capabilities, filtering mechanisms, and multi-node scaling while understanding how these tools enable real-time news distillation and enriched chat experiences for thousands of simultaneous users. Gain insights into why modern startups leveraging these foundational tools have competitive advantages over established tech companies, and learn practical implementation strategies for deploying production-ready RAG systems at scale.

Production-Scale Retrieval Augmented Generation for Real-Time News Distillation

Qdrant - Vector Database & Search Engine