1. Intro
2. Motivation of Fugue
3. Node Vec: Fugue Code
4. Fugue Programming Model
5. A Workflow Example
6. The Fugue Extensions
7. Fugue SQL vs Spark SQL
8. Fugue Programming Interface vs SQL
9. Fugue ML Components
10. Model & Parameter Sweeping model
11. Benchmark Test
12. An Interactive On-demand Spark Ecosystem
13. Summary
Description:
Explore Fugue, an abstraction layer that unifies big data analytics solutions such as Apache Spark, TensorFlow, Druid, Dask, and Flink. Learn how its SQL-like language represents end-to-end pipelines and can be extended with Python to build reliable, performant, and maintainable data processing workflows. Discover the benefits of a unified Spark-on-Kubernetes (K8s) environment for interactive development, batch processing, and near-real-time streaming jobs. See demonstrations of instant dependency updates, on-demand Spark K8s cluster management, and Fugue extensions for Kinesis and Kafka. Understand how Fugue abstracts machine learning pipelines, enabling distributed training, hyperparameter tuning, and inference across various ML libraries. The 22-minute talk from Databricks also covers extensive testing on Spark 3.0 and the resulting performance improvements.

Fugue: Unifying Big Data Analytics Ecosystems for ETL and Machine Learning

Databricks