Jennifer Sun

Bio: Jennifer is a PhD candidate in Computing and Mathematical Sciences at Caltech, advised by Pietro Perona and Yisong Yue. Her research focuses on machine learning for scientific applications, with the goal of enabling efficient interactions between scientists and data analysis systems. Her current work is at the intersection of machine learning and behavior analysis, with projects on learning behavioral representations, social behavior recognition, interpretable modeling, and keypoint discovery. In 2021, she helped organize two interdisciplinary workshops, on affective computing (AUVi) and multi-agent behavior modeling (MABe). MABe, organized with the Kennedy Lab at Northwestern, aims to connect researchers across science and data science. Her work was awarded best student paper at CVPR 2021 and is supported by the Kortschak Scholars Program and a Natural Sciences and Engineering Research Council of Canada (NSERC) Postgraduate Fellowship.

Talk Title: AI for Science: Learning from Experts and Data

Talk Abstract: In many fields, the amount of recorded scientific data is growing much faster than researchers can analyze and interpret it. For example, videos of animal behavior recorded over a few days can take domain experts months to analyze. Innovations in data science, such as machine learning, offer a promising way to let scientists perform data-driven experiments at scale. However, scientific applications raise a number of challenges for existing methods: data creation is expensive, model interpretability is important, and tools are often needed to translate algorithmic improvements into practical benefits.

To address these challenges, my current work has focused on incorporating domain knowledge into machine learning to reduce the human effort required for data analysis. I will discuss methods to improve the sample efficiency and interpretability of models in the context of behavior modeling. To learn annotation-efficient representations, we developed a framework that unifies self-supervision with weak programmatic supervision from domain experts. We demonstrated that our method reduces annotation requirements by up to a factor of 10 without compromising accuracy, compared to previous approaches. Furthermore, we investigate program synthesis as a promising direction for producing interpretable descriptions of behavior, and we integrate the interpretable programs from our method with an existing tool in behavioral neuroscience. These interdisciplinary approaches, which combine machine learning with experts in the loop, are important for broadening the application of data science across scientific domains.
