Introduction

Agenda

Spark Use Cases

YARN in 2018

Scaling

Challenges

Image Management

Hybrid Approach

Hybrid Architecture

Hybrid Architecture Advantages

Spark Operator

Image Hierarchy Distribution

Recap

Improvements

Future Plans

Takeaways

Description:

A 44-minute conference talk from Databricks on Lyft's hybrid Apache Spark architecture, which uses both YARN and Kubernetes. The talk covers the challenges Lyft faced when scaling its batch ETL and ML Spark workloads on Kubernetes, and the hybrid solution developed to serve both containerized and non-containerized workloads. It describes a dynamic runtime controller that applies environment-specific configurations and switches between resource managers seamlessly. Topics include Spark use cases, scaling challenges, image management, the advantages of the hybrid approach, the Spark Operator, image hierarchy distribution, and recent improvements. The talk closes with future plans and key takeaways for running Spark at scale.

Hybrid Apache Spark Architecture: Optimizing YARN and Kubernetes for Lyft's Workloads

Databricks

#Data Science #Big Data #Apache Spark #Computer Science #Machine Learning #DevOps #Kubernetes #Software Engineering #Scalability #Containerization
