
Bio: Jingfeng Wu is a postdoc at the Simons Institute at UC Berkeley, hosted by Prof. Peter Bartlett and Prof. Bin Yu. He earned his Ph.D. in Computer Science at Johns Hopkins University, advised by Prof. Vladimir Braverman. Before that, he obtained his B.S. in Mathematics and M.S. in Applied Math from Peking University.

His research interests are in the theory and algorithms of deep learning and related topics in machine learning, optimization, and statistics.

Talk Title: Theoretical Insights into Gradient Descent and Stochastic Gradient Descent in Deep Learning

Abstract: Gradient Descent (GD) and Stochastic Gradient Descent (SGD) are fundamental optimization algorithms in machine learning, but their behaviors sometimes defy intuitions from classic optimization and statistical learning theories. In deep learning, GD often exhibits local oscillations while still converging over time. Moreover, SGD-trained models generalize effectively even when overparameterized. In this talk, I will revisit the theories of GD and SGD for classic problems but in new scenarios motivated by deep learning, presenting two novel insights:

(1) For logistic regression with separable data, GD with an arbitrarily large stepsize minimizes the empirical risk, potentially in a non-monotonic fashion (see the first sketch after this list).

(2) For linear regression and ReLU regression, one-pass SGD and its variants can achieve low excess risk, even in the overparameterized regime (see the second sketch below).
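
The following is a minimal numpy sketch, not part of the talk materials, illustrating point (1): gradient descent on the logistic loss over linearly separable data with a deliberately large constant stepsize. The data, dimensions, and stepsize value are arbitrary choices for illustration; the empirical risk may oscillate in early iterations while still trending toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 5
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_true)                      # labels from a linear rule, so the data are separable

def empirical_risk(w):
    # logistic loss: mean of log(1 + exp(-y * <x, w>)), computed stably
    return np.mean(np.logaddexp(0.0, -y * (X @ w)))

def gradient(w):
    m = y * (X @ w)                          # margins
    s = -y * 0.5 * (1.0 - np.tanh(0.5 * m))  # equals -y * sigmoid(-margin), numerically stable
    return (X * s[:, None]).mean(axis=0)

w = np.zeros(d)
eta = 50.0                                   # deliberately large stepsize
for t in range(201):
    if t % 20 == 0:
        print(f"iter {t:3d}  empirical risk {empirical_risk(w):.4f}")
    w -= eta * gradient(w)
```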
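
The second sketch is a toy illustration of point (2), again not taken from the talk: one-pass SGD with tail averaging for linear regression in a setting where the dimension exceeds the number of samples. The decaying covariance spectrum, stepsize, and noise level are assumptions made for illustration; the null predictor's risk is printed for comparison.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 2000                           # fewer samples than dimensions
lam = 1.0 / np.arange(1, d + 1) ** 2       # decaying covariance spectrum (assumption)
w_star = np.ones(d)

w = np.zeros(d)
avg = np.zeros(d)
eta = 0.25
for t in range(n):                         # one pass: each sample is used exactly once
    x = np.sqrt(lam) * rng.normal(size=d)  # covariate with covariance diag(lam)
    y_t = x @ w_star + 0.1 * rng.normal()  # noisy linear response
    w -= eta * (x @ w - y_t) * x           # stochastic gradient of 0.5 * (x.w - y)^2
    if t >= n // 2:                        # tail-average the later iterates
        avg += w / (n - n // 2)

excess = np.sum(lam * (avg - w_star) ** 2)  # E[(x.(w - w*))^2] for this covariance
null = np.sum(lam * w_star ** 2)            # excess risk of always predicting zero
print(f"tail-averaged SGD excess risk: {excess:.4f}  (null predictor: {null:.4f})")
```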
