Explore data reliability and performance in big data workloads through this 43-minute tutorial on building data-intensive analytic applications with Delta Lake. Learn how Delta Lake, an open-source storage layer, brings ACID transactions to Apache Spark™ and addresses key challenges faced by data engineers. Discover the requirements of modern data engineering and how Delta Lake improves data reliability at scale. Through presentations, code examples, and interactive notebooks, gain insight into applying Delta Lake to your own data architecture: the key reliability challenges it solves, how it fits within an Apache Spark™ environment, and practical ways to put it to work. Dive into data lakes, streaming, schema evolution, and merge operations through hands-on examples of Delta Lake's features.
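
As a taste of the features the tutorial covers, here is a minimal PySpark sketch of an ACID-transactional write, a MERGE (upsert), and schema evolution on a Delta table. It assumes the delta-spark Python package is installed; the table path and column names are hypothetical and not taken from the tutorial itself.

# Minimal sketch of Delta Lake features mentioned above: ACID writes,
# MERGE/upserts, and schema evolution. Assumes PySpark plus the delta-spark
# package; the path and column names below are hypothetical.
from pyspark.sql import SparkSession
from delta import configure_spark_with_delta_pip
from delta.tables import DeltaTable

builder = (
    SparkSession.builder.appName("delta-lake-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

path = "/tmp/delta/events"  # hypothetical table location

# 1. Write an initial Delta table (an ACID-transactional overwrite).
spark.createDataFrame(
    [(1, "click"), (2, "view")], ["event_id", "event_type"]
).write.format("delta").mode("overwrite").save(path)

# 2. Upsert new and changed rows with MERGE.
updates = spark.createDataFrame(
    [(2, "purchase"), (3, "click")], ["event_id", "event_type"]
)
(DeltaTable.forPath(spark, path).alias("t")
    .merge(updates.alias("s"), "t.event_id = s.event_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())

# 3. Append a batch that carries an extra column, letting the schema evolve.
with_country = spark.createDataFrame(
    [(4, "view", "DE")], ["event_id", "event_type", "country"]
)
(with_country.write.format("delta")
    .mode("append").option("mergeSchema", "true").save(path))

spark.read.format("delta").load(path).show()

The same upsert and schema-evolution steps can also be expressed in SQL or Scala; the tutorial's notebooks walk through equivalent examples in a Spark environment.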
Building Data Intensive Analytic Applications on Top of Delta Lakes