Mathematical Sciences Research Institute

Datamodels: Predicting Predictions with Training Data

[Virtual] Hot Topics: Foundations of Stable, Generalizable and Transferable Statistical Learning March 07, 2022 - March 10, 2022

March 07, 2022 (08:00 AM PST - 08:25 AM PST)
Speaker(s): Aleksander Madry (Massachusetts Institute of Technology)
Location: MSRI: Online/Virtual
Tags/Keywords
  • machine learning
  • robustness
  • influence functions

Primary Mathematics Subject Classification
Secondary Mathematics Subject Classification: No Secondary AMS MSC

Abstract

Machine learning models tend to rely on an abundance of training data. Yet understanding the underlying structure of this data, and models' exact dependence on it, remains a challenge.

In this talk, we will present a new framework, called datamodeling, for directly modeling predictions as functions of training data. Given a dataset and a learning algorithm, this framework pinpoints, at varying levels of granularity, the relationships between train and test point pairs through the lens of the corresponding model class. Even in their most basic version, datamodels enable many applications, including discovering subpopulations, quantifying model brittleness via counterfactuals, and identifying train-test leakage.
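
To make the idea of "modeling predictions as functions of training data" concrete, here is a minimal sketch. It is not the speaker's implementation: the abstract does not say which estimator is used, so this sketch assumes a linear surrogate fit by ordinary least squares on subset-indicator vectors. The learner (a kernel-weighted label average), the synthetic data, and all function names are hypothetical stand-ins chosen so the example runs with NumPy alone.

```python
import numpy as np

def toy_learner_output(X_train, y_train, mask, x_test, bandwidth=0.5):
    """Hypothetical stand-in for 'train on a subset, then predict on x_test'.

    Returns a Gaussian-kernel-weighted average of the labels in the subset.
    """
    Xs, ys = X_train[mask], y_train[mask]
    if len(ys) == 0:
        return 0.0
    d2 = np.sum((Xs - x_test) ** 2, axis=1)
    w = np.exp(-d2 / (2 * bandwidth ** 2))
    return float(w @ ys / (w.sum() + 1e-12))

def sample_masks(n_train, n_subsets, alpha, rng):
    """Each row marks a random training subset; each point is kept with prob. alpha."""
    return rng.random((n_subsets, n_train)) < alpha

def fit_linear_datamodel(masks, outputs):
    """Least-squares fit of outputs ~ masks @ theta + bias (assumed surrogate form)."""
    A = np.hstack([masks.astype(float), np.ones((masks.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, outputs, rcond=None)
    return coef[:-1], coef[-1]

rng = np.random.default_rng(0)
n_train = 40
X_train = rng.normal(size=(n_train, 2))                      # synthetic inputs
y_train = np.sign(X_train[:, 0]) + 0.1 * rng.normal(size=n_train)
x_test = np.array([0.5, 0.0])                                # one fixed test point

# Record the learner's output on the test point for many random training subsets.
masks = sample_masks(n_train, n_subsets=500, alpha=0.5, rng=rng)
outputs = np.array(
    [toy_learner_output(X_train, y_train, m, x_test) for m in masks]
)

# theta[i] estimates how much including training point i shifts the output.
theta, bias = fit_linear_datamodel(masks, outputs)
top = np.argsort(-np.abs(theta))[:5]
print("training points with the largest estimated weight:", top)
```

In this sketch, `theta` plays the role of the train-test relationship for the pair (training point `i`, `x_test`). Repeating the fit for many test points would give one weight vector per test point, which is the kind of object the listed applications (subpopulations, counterfactuals, leakage) would inspect.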

Supplements
Datamodels: Predicting Predictions with Training Data (PDF, 1.48 MB)