1. Introduction
2. Evaluations
3. Agenda
4. Architecture Example
5. Apache Spark
6. On Edge The Inside
7. Integration Points
8. EdgeInsight
9. Notebook
10. Notebook Extension
11. Data Visualization
12. Custom Visualizations
13. Importing Data
14. Hive Context
15. YARN Resource Manager
16. Pause Spark Cluster
17. Batch Pipeline
18. ETL Pipeline
19. Data Sources
20. Visuals Considerations
21. Hive Tess
22. Data Lineage
23. Data Pipeline Tools
24. Demo Data Factory
25. Data Pipeline Options
26. Real-Time Pipelines
Description:
Explore common patterns for building end-to-end data analytics pipelines with Apache Spark on Azure HDInsight in this conference talk from PASS Summit 2017. The talk covers architecture examples, integration points, and the components of a modern data pipeline, including edge computing, notebooks, data visualization, and custom visualizations. It shows how to import data, use the Hive context, and manage resources with YARN, then examines batch and ETL pipelines, data sources, and pipeline tools such as Azure Data Factory. It also covers real-time pipelines and data lineage considerations for building robust, scalable data solutions.

Building Modern Data Pipelines with Spark on Azure HDInsight

PASS Data Community Summit