Delta Live Tables: Modern Software Engineering for ETL Pipelines
Databricks

Chapters:
1. Intro
2. Life as a data professional
3. What is a Live Table?
4. Development vs. Production
5. Declare LIVE Dependencies
6. Choosing pipeline boundaries
7. Pitfall: hard-coded sources & destinations
8. Ensure correctness with Expectations
9. Expectations using the power of SQL
10. Using Python
11. Installing libraries with pip
12. Metaprogramming in Python
13. Best Practice: Integrate using the event log
14. DLT Automates Failure Recovery
15. What is Spark™ Structured Streaming?
16. Using Spark Structured Streaming for ingestion
17. Use Delta for infinite retention
18. Partition recomputation

Description:
Explore a comprehensive talk on Delta Live Tables (DLT), a declarative ETL framework that simplifies data transformation and pipeline management. Learn how DLT incorporates modern software engineering practices to deliver reliable and trusted data pipelines at scale. Discover techniques for rapid innovation in pipeline development and maintenance, automation of administrative tasks, and improved visibility into pipeline operations. Gain insights into built-in quality controls and monitoring for accurate BI, data science, and ML. Understand how to implement simplified batch and streaming with self-optimizing and auto-scaling data pipelines. Delve into topics such as live table dependencies, pipeline boundaries, SQL expectations, Python integration, metaprogramming, event log integration, failure recovery automation, Spark Structured Streaming for ingestion, and Delta for infinite retention.
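
For readers new to DLT, here is a minimal sketch of the ideas behind the chapters on LIVE dependencies, expectations, and streaming ingestion. It uses the DLT Python API; the table names, source path, columns, and file format are illustrative assumptions rather than examples from the talk, and the code is intended to run inside a Databricks DLT pipeline notebook, where spark (the SparkSession) is predefined.

import dlt
from pyspark.sql import functions as F

# Bronze table: ingest raw files as a stream with Spark Structured Streaming
# (Auto Loader). Path and format below are placeholder assumptions.
@dlt.table(comment="Raw orders ingested as a stream.")
def orders_raw():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/data/orders")
    )

# Silver table: reading from orders_raw via dlt.read_stream declares the LIVE
# dependency, so DLT can order, monitor, and recover the pipeline for you.
@dlt.table(comment="Cleaned orders with a quality expectation enforced.")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop rows that fail
def orders_clean():
    return dlt.read_stream("orders_raw").withColumn("processed_at", F.current_timestamp())

The same expectation can also be expressed directly in DLT SQL (a CONSTRAINT ... EXPECT clause on a live table), which is what the SQL expectations chapter covers. The metaprogramming chapter builds on the Python API: because tables are declared with ordinary decorators, a loop can generate many similar definitions. A small sketch, again with made-up names and assuming the source data carries a region column:

def define_region_table(region):
    @dlt.table(name=f"orders_{region}", comment=f"Orders filtered to {region}.")
    def _table():
        # dlt.read declares a dependency on the orders_clean table above.
        return dlt.read("orders_clean").where(F.col("region") == region)

for region in ["amer", "emea", "apac"]:
    define_region_table(region)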
