Intro

HOW MUCH TRAINING DATA DO YOU NEED?

UNDERSTANDING EXTRAPOLATION AND INTERPOLATION

MULTIVARIATE EXTRAPOLATION

EXTRAPOLATION AND INTERPOLATION IN HIGH DIMENSIONS

MODEL DESIGNED FOR EXTRAPOLATION OR INTERPOLATION

RECOGNIZING MULTIMODAL DISTRIBUTIONS

MACHINE LEARNING CURVE

MAHALANOBIS DISTANCE

EXAMPLE DATASET

DIABETES DATASET

FEATURE IMPORTANCE RANKING

DETERMINE HIGH AND LOW VALUES FOR

CREATE A BOUNDING HYPER-RECTANGLE

MOST DISTANT EDGES OF BOUNDING HYPER-RECTANGLE

Description:

Explore techniques for determining the appropriate amount of data needed to build effective machine learning models in this 26-minute video by Jeff Heaton. Learn about extrapolation and interpolation in both univariate and multivariate contexts, and understand how to measure data coverage across multiple dimensions. Discover methods for recognizing multimodal distributions, interpreting machine learning curves, and using Mahalanobis distance. Examine a practical example using a diabetes dataset, including feature importance ranking and creating bounding hyper-rectangles. Gain insights into ensuring your training data adequately represents the full range of scenarios your model may encounter in real-world applications.
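
As a rough illustration of two of the coverage ideas named above, the sketch below builds a bounding hyper-rectangle from per-feature minimums and maximums and computes a Mahalanobis distance from a query point to the training data. It is a minimal sketch under stated assumptions, not the code from the video: the function names (bounding_box, inside_box, mahalanobis) and the tiny two-feature dataset are hypothetical and chosen only to show that a point can sit inside the rectangle yet still lie far from the data.

```python
import numpy as np

# Hypothetical toy training set: two positively correlated features.
X = np.array([[0.0, 0.0],
              [1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])

def bounding_box(X):
    """Return the per-feature (low, high) edges of the bounding hyper-rectangle."""
    return X.min(axis=0), X.max(axis=0)

def inside_box(x, low, high):
    """True if every feature of x lies within the training range."""
    return bool(np.all((x >= low) & (x <= high)))

def mahalanobis(x, X):
    """Mahalanobis distance from x to the mean of X, using the sample covariance."""
    diff = x - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)      # features are columns
    cov_inv = np.linalg.pinv(cov)      # pseudo-inverse tolerates a singular covariance
    return float(np.sqrt(diff @ cov_inv @ diff))

low, high = bounding_box(X)

typical = np.array([4.0, 3.0])   # follows the correlation seen in training
unusual = np.array([4.0, 0.0])   # inside the box, but against the correlation

for name, point in [("typical", typical), ("unusual", unusual)]:
    print(name, inside_box(point, low, high), round(mahalanobis(point, X), 3))
```

Both query points pass the hyper-rectangle check, but the second one has a noticeably larger Mahalanobis distance because it breaks the correlation between the two features. That gap is one reason a range check alone may overstate how well the training data covers a region.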

How Much Data Is Enough to Build a Machine Learning Model

Jeff Heaton

#Computer Science #Machine Learning #Artificial Intelligence #Data Science #Mathematics #Interpolation
