Intro

GoDataDriven

Data Build Tool

SQL with some Jinja2 sauce

DBT as a SQL Runner

DBT as a SQL Compiler

Next to the SQL there is documentation

dbt docs generate and dbt docs serve

Testing

How does DBT communicate with Spark?

Switch to incremental ingestion

Switch to incremental Delta

In practice

DBT Macros

Observability is king

Very simple Hive UDF

Small snippet of Scala

Use the UDF in DBT

Be proactive

Feedback

Description:

Explore a comprehensive 26-minute talk on integrating Data Build Tool (DBT) with Databricks and Delta for efficient data lake management. Learn how this open-source, SQL-first technology enhances data quality and documentation throughout the data lake lifecycle. Discover the basics of DBT and its synergy with Databricks for powerful data processing. Examine how DBT supports Delta to enable SQL-based upserts. Investigate the integration of DBT and Databricks within the Azure cloud environment. Gain insights into emitting pipeline metrics to Azure Monitor for improved observability. Dive into topics such as DBT as a SQL runner and compiler, documentation generation, testing, incremental ingestion, DBT macros, and the use of Hive UDFs. Master the art of maintaining high-quality data pipelines using software engineering best practices.

DBT Using Databricks and Delta

Databricks

#Business #Business Intelligence #Data Warehousing #dbt (Data build tool) #Programming #Domain-Specific Languages (DSL) #SQL #Data Science #Big Data #Databricks #Data Lakes #Data Processing #Computer Science #DevOps #Observability #Cloud Computing #Azure Cloud