1. Intro
2. Preamble
3. About Chinmay Naik
4. MongoDB to RDBMS data migration
5. Student collection (MongoDB)
6. Student table (PostgreSQL)
7. Student - address and phone relationships
8. Data migration - MongoDB to PostgreSQL
9. How MongoDB JSON data maps to SQL
10. Inserts are cool, what about updates and deletes in MongoDB?
11. How do we migrate data?
12. Mongo oplog (operation log)
13. What does an oplog record look like?
14. When are we getting to the Golang concurrency?
15. Sequential data pipeline
16. Mongo oplog / two oplogs / PostgreSQL
17. Sequential pipeline performance
18. Perf improvement - let's add a worker pool
19. Worker pool
20. Worker pools v2.0
21. Worker pool v2.0 performance
22. Can you guess the problem?
23. Worker pools v2.0 - the problem
24. Back to the drawing board?
25. Fan-out for each database
26. Concurrent data pipeline
27. Performance comparison
28. Resource utilization
29. Concurrent data pipeline - improvement
30. 16 databases and 128 collections per DB
31. Performance comparison
32. Final concurrent data pipeline
33. Key takeaways
34. Keep learning
Description:
A conference talk on using Go concurrency to build a gigabyte-scale, real-world data pipeline. The talk covers migrating data from MongoDB to a relational database (PostgreSQL), the structure of MongoDB's oplog (operation log) records, and how the pipeline design evolved: from a sequential pipeline, to worker pools and the problem they ran into, to a fan-out for each database. It compares the performance and resource utilization of these designs, presents the final concurrent pipeline handling 16 databases with 128 collections each, and closes with key takeaways for building efficient, scalable data migrations with Go's concurrency features.

Go Concurrency Powering Gigabyte-Scale Real-World Data Pipeline

Conf42