1. Introduction
2. Background
3. Example
4. Problem Statement
Description:
In this 11-minute presentation, Columbia PhD student Zachary Huang explains how SQL can improve data organization for machine learning. He introduces JoinBoost, a lightweight Python library that rewrites tree training algorithms over normalized databases as pure SQL queries. The approach addresses the mismatch between the data organization that ML requires and the way traditional databases structure data, allowing a simpler, all-in-one data stack. The talk also covers JoinBoost's compatibility with various DBMSs and data stacks and its performance: according to the presentation, it outperforms specialized ML libraries such as LightGBM in both speed and scalability for random forests and gradient boosting.
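
To make the idea of "tree training as SQL queries" concrete, here is a minimal sketch of how one candidate split of a regression tree could be scored with a single aggregate query over two joined tables. It does not use JoinBoost or reproduce its API. The `stores` and `sales` tables, the column names, and the helper functions are all hypothetical, and Python's built-in `sqlite3` module stands in for whatever DBMS is actually used.

```python
import sqlite3

# Hypothetical normalized schema: a fact table (sales) and a dimension table (stores).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stores (store_id INTEGER PRIMARY KEY, size REAL);
CREATE TABLE sales  (store_id INTEGER, units REAL);
INSERT INTO stores VALUES (1, 10.0), (2, 20.0), (3, 30.0);
INSERT INTO sales  VALUES (1, 1.0), (1, 3.0), (2, 2.0), (3, 8.0), (3, 10.0);
""")

def split_stats(conn, threshold):
    """Return {go_left: (count, sum, sum_of_squares)} of the target per side."""
    rows = conn.execute(
        """
        SELECT go_left, COUNT(*), SUM(units), SUM(units * units)
        FROM (
            SELECT st.size <= ? AS go_left, sa.units AS units
            FROM sales sa JOIN stores st ON sa.store_id = st.store_id
        )
        GROUP BY go_left
        """,
        (threshold,),
    ).fetchall()
    return {bool(go_left): (n, s, q) for go_left, n, s, q in rows}

def sse(n, s, q):
    """Sum of squared errors around the mean, from count, sum, and sum of squares."""
    return q - s * s / n if n else 0.0

def variance_reduction(conn, threshold):
    """Reduction in squared error obtained by splitting on stores.size <= threshold."""
    stats = split_stats(conn, threshold)
    left = stats.get(True, (0, 0.0, 0.0))
    right = stats.get(False, (0, 0.0, 0.0))
    # The parent's statistics are the element-wise sums of the two sides.
    total = tuple(a + b for a, b in zip(left, right))
    return sse(*total) - sse(*left) - sse(*right)

# Pick the better of two candidate thresholds on the dimension attribute.
best_threshold = max([10.0, 20.0], key=lambda t: variance_reduction(conn, t))
```

The point of the sketch is that the database only returns a few aggregates per candidate split, so the joined training table never has to be exported or materialized on the client.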

SQL for Efficient Data Organization in Machine Learning

Snorkel AI