Speaker: Danica Sutherland
Abstract
Learning dynamics, which describes how learning from specific training examples influences the model's predictions on other examples, gives us a powerful tool for understanding the behaviour of deep learning systems. This talk will cover how we can use a local understanding of training steps, building on empirical neural tangent kernels, to better understand phenomena that unfold over the course of training a deep network. Applications include a better understanding of knowledge distillation, fine-tuning, and particularly preference tuning for LLM post-training.
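
A minimal sketch of the local, per-step view (the notation here is assumed for illustration, not taken from the talk): taking one gradient step with learning rate \eta on a training example x_u, a first-order expansion of the model output f_\theta gives, for any other example x_o,

  f_{\theta_{t+1}}(x_o) - f_{\theta_t}(x_o) \approx -\eta \, K_{\theta_t}(x_o, x_u) \, \nabla_f \mathcal{L}\big(f_{\theta_t}(x_u), y_u\big),
  \qquad K_\theta(x_o, x_u) = \nabla_\theta f_\theta(x_o) \, \nabla_\theta f_\theta(x_u)^\top,

where K_\theta is the empirical neural tangent kernel at the current parameters: it measures how aligned the two examples' gradients are, and hence how strongly an update on x_u moves the prediction on x_o. Accumulating such per-step interactions across training is one way this local picture can speak to training-long phenomena like distillation and preference tuning.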
Danica Sutherland is an Assistant Professor of Computer Science at the University of British Columbia and a Canada CIFAR AI Chair at Amii. She did her PhD at Carnegie Mellon University, a postdoc at University College London's Gatsby Unit, and was a Research Assistant Professor at TTI-Chicago. Her research focuses on understanding, exploiting, and improving the representations learned by modern deep learning methods, particularly using kernel methods as a tool, and on using machine learning to solve statistical problems. Her work has been recognized with an Outstanding Paper Award at ICLR 2025 (for the work in this talk, with her PhD student Yi Joshua Ren) and a Best Paper Award at FAccT 2023 (done as a core organizer with Queer in AI).