1. Intro
2. What is Spark
3. How to optimize
4. Major bottlenecks
5. Spark Casting
6. Spark Casting Demo
7. Disadvantages of Casting
8. Advantages of Broadcast
9. Architecture of Broadcast
10. Serialization
11. Serializer
12. DataFrame
13. UDF
14. Filter Data
15. Supply
16. Reducing Supply
17. Importance of File Format
18. Handling of Data
19. File Format Optimization
20. Executor Optimization
21. Out of Memory
22. Memory Tuning
23. Conclusion
Description:
Learn essential best practices for running Apache Spark in production environments and optimize system-level performance in this 42-minute tutorial. Explore major bottlenecks, understand Spark casting techniques, and discover the advantages of broadcast operations. Dive into serialization, DataFrame operations, and UDF implementation. Master data filtering, supply reduction, and file format optimization. Gain insights on executor optimization, memory tuning, and handling out-of-memory errors. Equip yourself with the knowledge to fine-tune Apache Spark for peak performance in real-world scenarios.

Apache Spark Performance Tuning and Best Practices

NashKnolX