1. Intro
2. About Data Mechanics
3. Core Concepts
4. Configuration Tips
5. Spark Performance
6. Pod Resource Usage Manager
7. Spot History Server
8. Timeseries DB
9. Security
10. Upcoming features
11. Conclusion
12. High-level checklist
Description:
Explore best practices and potential pitfalls of running Apache Spark on Kubernetes in this 25-minute conference talk from Databricks. Dive into core concepts, setup procedures, and configuration tips for optimizing performance and resource sharing. Learn about Spark-app level dynamic allocation, cluster level autoscaling, and Kubernetes-specific considerations for data I/O performance. Discover monitoring and security best practices, as well as current limitations and planned future developments. Gain valuable insights from lessons learned while building a serverless Spark platform powered by Kubernetes, covering topics such as efficient resource usage, spot instance management, and security measures. Conclude with a high-level checklist to ensure successful implementation of Spark on Kubernetes in your data analytics infrastructure.

Running Apache Spark on Kubernetes - Best Practices and Pitfalls

Databricks