1. Intro
2. What is Apache Spark?
3. A Large Community
4. Apache Spark Users
5. Original Spark Vision
6. Motivation: Unification
7. Motivation: Concise API
8. How Did the Vision Hold Up?
9. Libraries Built on Spark
10. Which Libraries Do People Use?
11. Top Applications
12. Main Challenge: Functional API
13. Which API Call Causes Most Tickets?
14. Example Problem
15. Challenge: Data Representation
16. Why Structure?
17. DataFrames and Datasets
18. Execution Steps
19. DataFrame API
20. Why DataFrames?
21. What Structured APIs Enable
22. Performance
23. Dataset API Details
24. Data Sources
25. Data Source API
26. Examples
27. Hardware Trends
28. Project Tungsten
29. Tungsten's Compact Encoding
30. Space Efficiency
31. Runtime Code Generation
32. Long-Term Vision
33. Versioning in Spark
34. Major Features in 2.0
35. Background
36. Structured Streaming: High-level streaming API built on DataFrames/Datasets
37. Structured Streaming API
38. Example: Batch Aggregation
39. Example: Continuous Aggregation
40. Incrementalized By Spark
41. Release Timeline
42. Conclusion
43. Want to Learn Apache Spark?
Description:
Explore the evolution of Apache Spark's API in this keynote presentation from Scala Days New York 2016. Dive into the upcoming features of Spark 2.0, including more declarative APIs for automatic optimizations and improved links between Scala data types and binary data formats for efficient processing. Learn about Spark's journey as a large-scale Scala project, its functional API, and its impact on distributed programming. Discover the challenges faced in API design, data representation, and performance optimization. Gain insights into DataFrames, Datasets, and Structured Streaming APIs. Understand Project Tungsten's role in improving space efficiency and runtime code generation. Get a glimpse of Spark's long-term vision and versioning strategy, and find resources to further your Apache Spark knowledge.

Spark 2.0

Scala Days Conferences