1. Intro
2. Please meet Retriever
3. Retriever is a special purpose data store
4. What is Honeycomb?
5. How Honeycomb works
6. Honeycomb under the hood
7. Our requirements
8. Requirements - summary
9. Retriever at a glance
10. Retriever compared to Scuba
11. Architecture - write path
12. Architecture - read path
13. Data model - datasets
14. Data model - events
15. Row oriented storage
16. Column oriented storage
17. Storage Format - timestamp column
18. Storage Format - reading
19. Distributed queries
20. Distributed reads - calculations
21. Distributed reads - fanout
22. Detour - Kafka
23. Ingestion
24. Quota management
25. Fault tolerance
26. Failure recovery
27. Bootstrapping new nodes
Description:
Explore the architecture and implementation of Retriever, a custom-built distributed column store database, in this 43-minute Strange Loop Conference talk. Learn how Honeycomb addressed the challenges of understanding complex distributed systems in production by developing a low-latency, schemaless database inspired by Facebook's Scuba. Discover the design decisions behind Retriever, including its use of disk storage, efficient column-oriented storage model, and ability to handle multi-tenancy and cost constraints. Gain insights into the write and read paths, data model, storage format, distributed queries, and fault tolerance mechanisms. Understand how Retriever ingests events from Kafka, manages quotas, and handles failure recovery. Delve into the lessons learned from operating a hand-rolled database at production scale with paying customers, and see how it compares to other solutions for sub-second complex queries over large data volumes in real time.

Why We Built Our Own Distributed Column Store

Strange Loop Conference