1. Intro
2. Why use Spark
3. Python Spark
4. pandas
5. Python vs Spark
6. Scenarios
7. Python UDF
8. Pandas UDF
9. Pandas Type Hints
10. Undecorated UDFs
11. Functional API
12. Spark DataFrame
13. Databricks
14. Kaggle Notebook
15. Testing in Spark
16. Exploration Goals
17. Mindset
18. Demo
19. Notebook Environment
20. Tokenization
21. Sentiment Scores
Description:
This conference talk, presented by Megan Yow from Sobeys and Han Wang from Lyft, explores techniques for simplifying the testing of Spark applications. It opens with the reasons for using Spark, compares Python Spark with pandas, and walks through scenarios involving Python UDFs, Pandas UDFs, and Pandas type hints. It then covers undecorated UDFs, the Functional API, and Spark DataFrames, along with the use of Databricks and Kaggle Notebooks for Spark development. The section on testing in Spark addresses exploration goals and mindset, and is followed by a demonstration in a notebook environment that includes tokenization and sentiment scores. The presentation runs 46 minutes and is aimed at developers who want to improve how they test Spark applications.

Simplifying Testing of Spark Applications

Linux Foundation