1. Intro
2. Title
3. Agenda
4. Nicol Giso
5. Customer plants
6. Tenova Industrial IoT platform
7. Data security
8. Data application
9. Sample file
10. Requirements
11. Architecture
12. Variables
13. Merge Notebook
14. Update Last Values Notebook
15. Read Table Notebook
16. Update DataFrame
17. Timestamp Notebook
18. Data Notebook
19. Databricks Release Pipeline
20. Azure Databricks
21. Next steps
22. Questions
Description:
Explore an ETL solution for transforming telemetry data into CSV files using Python, Spark, and Azure Databricks in this EuroPython 2021 conference talk. Learn how Tenova, an engineering company, collects and processes data from industrial equipment so that data scientists and process engineers can develop analytics solutions and retrain AI models. See how Databricks Notebooks use PySpark and Pandas to turn raw JSON Lines files into formatted CSVs that meet specific requirements: one set of files per device, one file per day, and a row of values recorded at midnight. The talk also covers the architecture, variable handling, the role of each notebook in the transformation, and the integration with Azure Data Factory for daily execution.
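
The three requirements named in the description (per-device data, one file per day, a value recorded at midnight) can be illustrated with a minimal sketch. This is not the speaker's notebook code: it uses only the Python standard library instead of PySpark or Pandas so that it runs anywhere, and the record fields `device`, `ts`, `name`, and `value` are hypothetical. It also assumes that "midnight value recording" means each daily file opens with a 00:00 row carrying the last values known from earlier days.

```python
import csv
import io
import json
from collections import defaultdict
from datetime import datetime, time


def jsonl_to_daily_csvs(jsonl_text):
    """Return {(device, 'YYYY-MM-DD'): csv_text} built from JSON Lines telemetry.

    Assumed (hypothetical) record shape:
    {"device": "A", "ts": "2021-07-26T23:50:00", "name": "temp", "value": 10}
    """
    rows = []
    for line in jsonl_text.splitlines():
        if line.strip():
            rec = json.loads(line)
            rows.append((rec["device"], datetime.fromisoformat(rec["ts"]),
                         rec["name"], rec["value"]))
    rows.sort(key=lambda r: (r[0], r[1]))  # by device, then by time

    last_values = defaultdict(dict)  # device -> {variable: last value seen}
    per_day = defaultdict(list)      # (device, date) -> [(timestamp, snapshot)]
    for device, ts, name, value in rows:
        key = (device, ts.date())
        midnight = datetime.combine(ts.date(), time.min)
        # First sample of a new day: carry yesterday's values over as a 00:00 row.
        if key not in per_day and last_values[device] and ts != midnight:
            per_day[key].append((midnight, dict(last_values[device])))
        last_values[device][name] = value
        per_day[key].append((ts, dict(last_values[device])))

    csvs = {}
    for (device, day), snapshots in per_day.items():
        columns = sorted({name for _, snap in snapshots for name in snap})
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(["timestamp", *columns])
        for ts, snap in snapshots:
            writer.writerow([ts.isoformat(), *(snap.get(c, "") for c in columns)])
        csvs[(device, day.isoformat())] = buf.getvalue()
    return csvs
```

In this sketch a day with no samples produces no file; a scheduled daily job, like the one the talk runs from Azure Data Factory, would have to emit the carried-over midnight row for such days itself.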

From Telemetry Data to CSVs with Python, Spark and Azure Databricks

EuroPython Conference