1. Intro
2. Why make your data pipelines dumb-proof?
3. How to make your data pipelines dumb-proof?
4. Fixing Hard coded Data Pipelines
5. Parameters & Input Validation
6. Externalizing Configuration
7. Configuration in JSON Format
8. Optimized Configuration in HOCON format
9. Readable and maintainable Configuration
10. Configuration Library
11. Refactor Code - Loading and Parsing Configuration
12. Boilerplate free configuration code
13. Sample Code
14. Summary
Description:
Discover techniques for building robust and maintainable data pipelines in this 22-minute Databricks talk. Learn why configurable pipelines are crucial, how to promote them across environments, and how to reconfigure them in production without recompiling. Explore the pros and cons of Databricks Notebook widgets, methods for externalizing configuration, and Scala features that, combined with the PureConfig and Typesafe Config libraries, yield boilerplate-free configuration code. Gain insights into input validation, preventing data loss and corruption, and ensuring data correctness. Walk away with practical knowledge to improve how you develop and maintain data pipelines.

Dumb-Proofing Data Pipelines: Techniques for Configurable and Maintainable ETL - Databricks

Databricks