Fabricator: Streamlining Declarative Feature Engineering at DoorDash
Description:
Explore a 26-minute conference talk on Fabricator, a framework developed at DoorDash to streamline declarative data pipelines for machine learning. Learn how the system orchestrates 1,400 daily batch jobs and manages 2.2 trillion feature values across all business verticals. Discover Fabricator's components: a job registry, a library for large-scale data ELT jobs, and an orchestration and execution service. Understand the advantages Fabricator offers, such as streamlining feature development with a declarative feature DSL and a centralized repository, accelerating data fabrication with a high-level SDK, mitigating latency and consistency discrepancies between offline and online feature data, and automating operational tasks. Gain insight into how Databricks Jobs and Delta Lake were used to build Fabricator and the lessons learned during its development. Presented by Hebo Yang, ML Infra Engineer, and Kunal Shah, Software Engineer, both from DoorDash, this talk is aimed at professionals interested in feature engineering and machine learning infrastructure.