Services architecture
• Services configured as Erlang clusters with nodes
• Nodes deployed on containers
• Nodes running the service spawn Erlang processes
Scheduled job framework written in Elixir
• Handles coordination of jobs across the Erlang nodes in a service, rerunning failed jobs, and persisting status logs
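The rerun-and-log behaviour described above can be sketched in a few lines of plain Elixir. This is a minimal illustration, not the framework from the talk: the module name `JobRunner`, the `run/3` signature, and the in-memory log are all hypothetical, and cross-node coordination and durable persistence are left out.

```elixir
defmodule JobRunner do
  @moduledoc """
  Hypothetical sketch: run a job function, retry it on failure,
  and return a status log. Not the framework shown in the talk.
  """

  # Runs `job_fun` up to `max_attempts` times.
  # Returns {:ok, result, log} or {:error, reason, log}; the log is oldest first.
  def run(name, job_fun, max_attempts \\ 3) when max_attempts > 0 do
    attempt(name, job_fun, 1, max_attempts, [])
  end

  defp attempt(name, job_fun, n, max, log) do
    case safe_call(job_fun) do
      {:ok, result} ->
        {:ok, result, Enum.reverse([{name, n, :succeeded} | log])}

      # Attempts remain: record the failure and try again.
      {:error, reason} when n < max ->
        attempt(name, job_fun, n + 1, max, [{name, n, {:failed, reason}} | log])

      # Out of attempts: record the failure and give up.
      {:error, reason} ->
        {:error, reason, Enum.reverse([{name, n, {:failed, reason}} | log])}
    end
  end

  # Turns a raised exception into {:error, message} so it counts as a failed attempt.
  defp safe_call(job_fun) do
    {:ok, job_fun.()}
  rescue
    e -> {:error, Exception.message(e)}
  end
end
```

A real framework would write each log entry to durable storage instead of returning it, so that job status survives a node restart.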
Steps of a scheduled job workflow Airflow
Scheduled jobs in Application start()
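One common way to start recurring work from an application's `start/2` callback is to supervise a GenServer that re-arms a timer with `Process.send_after/3`. The sketch below assumes that pattern; `PeriodicJob`, `DemoApp`, and the option names `:fun` and `:interval_ms` are hypothetical and are not taken from the talk.

```elixir
defmodule PeriodicJob do
  @moduledoc "Hypothetical sketch: runs a zero-arity function at a fixed interval."
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: Keyword.get(opts, :name, __MODULE__))
  end

  @impl true
  def init(opts) do
    state = %{
      fun: Keyword.fetch!(opts, :fun),
      interval: Keyword.fetch!(opts, :interval_ms)
    }

    schedule(state.interval)
    {:ok, state}
  end

  @impl true
  def handle_info(:run, state) do
    state.fun.()
    # Re-arm the timer so the job keeps recurring.
    schedule(state.interval)
    {:noreply, state}
  end

  defp schedule(ms), do: Process.send_after(self(), :run, ms)
end

defmodule DemoApp do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # The job body and the one-minute interval are placeholders.
      {PeriodicJob, fun: fn -> IO.puts("job ran") end, interval_ms: 60_000}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: DemoApp.Supervisor)
  end
end
```

Because the job process sits under a supervisor, a crash inside the job function restarts the process and the schedule resumes.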
Case study: Notification view analytics
Case study: Notification analysis
Setting up KafkaEx
1. Add mix dependency to build
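As an illustration of the dependency step, a `mix.exs` along the following lines would pull in the `kafka_ex` Hex package. The project name `:my_service`, the Elixir version requirement, and the `~> 0.11` version constraint are assumptions made for this sketch; check the package page for the release that suits your project.

```elixir
defmodule MyService.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_service,        # hypothetical application name
      version: "0.1.0",
      elixir: "~> 1.12",       # assumed minimum Elixir version
      deps: deps()
    ]
  end

  def application do
    [extra_applications: [:logger]]
  end

  defp deps do
    [
      # Version constraint is an assumption, not taken from the talk.
      {:kafka_ex, "~> 0.11"}
    ]
  end
end
```

Running `mix deps.get` afterwards fetches the package.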
Supervisor module to listen on consur
GenServer consumer
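To show the general shape of a GenServer that consumes events, here is a self-contained sketch with no Kafka dependency. Batches arrive through a hypothetical `consume/2` call that stands in for whatever a Kafka client would deliver; the module name `NotificationCounter` and the `:notification_id` field are likewise assumptions, not the code from the talk.

```elixir
defmodule NotificationCounter do
  @moduledoc "Hypothetical sketch: counts events per notification id."
  use GenServer

  # Client API

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, %{}, opts)

  # Hands a batch of events to the consumer without waiting for a reply.
  def consume(server, events) when is_list(events) do
    GenServer.cast(server, {:events, events})
  end

  # Returns the current map of notification id => event count.
  def counts(server), do: GenServer.call(server, :counts)

  # Server callbacks

  @impl true
  def init(counts), do: {:ok, counts}

  @impl true
  def handle_cast({:events, events}, counts) do
    new_counts =
      Enum.reduce(events, counts, fn %{notification_id: id}, acc ->
        Map.update(acc, id, 1, &(&1 + 1))
      end)

    {:noreply, new_counts}
  end

  @impl true
  def handle_call(:counts, _from, counts), do: {:reply, counts, counts}
end
```

For example, after `{:ok, pid} = NotificationCounter.start_link()`, sending two events for `"n1"` and one for `"n2"` through `NotificationCounter.consume/2` makes `NotificationCounter.counts(pid)` return a map with counts 2 and 1. Messages from a single caller are processed in order, so the call sees the effect of the earlier cast.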
Description:
Discover how to leverage Elixir for innovative data engineering solutions in this 24-minute conference talk from Databricks. Explore the power of Erlang's lightweight distributed process coordination for running worker clusters across Docker containers and performing data ingestion. Learn about a framework that integrates Elixir functions as steps in Airflow graphs. Dive into techniques for consuming and processing Kafka events directly within Elixir microservices. Examine real system examples with step-by-step walkthroughs of key elements, covering topics such as services architecture, scheduled job frameworks, and case studies on notification analytics. Gain insights into setting up Kafka consumers and implementing GenServer modules, all without requiring prior Erlang or Elixir knowledge.
Elixir for Data Engineering - Batch and Stream Processing