Jennifer Sun
Bio: Jennifer is a PhD candidate in Computing and Mathematical Sciences at Caltech, advised by Pietro Perona and Yisong Yue. Her research applies machine learning to scientific applications, with the goal of enabling efficient interaction between scientists and data analysis systems. Her current work is at the intersection of machine learning and behavior analysis, with projects on learning behavioral representations, social behavior recognition, interpretable modeling, and keypoint discovery. In 2021 she helped organize two interdisciplinary workshops, on affective computing (AUVi) and multi-agent behavior modeling (MABe). MABe, co-organized with the Kennedy Lab at Northwestern, aims to connect researchers across science and data science. Her work received the best student paper award at CVPR 2021 and is supported by the Kortschak Scholars Program and a Natural Sciences and Engineering Research Council of Canada (NSERC) Postgraduate Fellowship.
Talk Title: AI for Science: Learning from Experts and Data
Talk Abstract: In many fields, scientific data are being recorded much faster than researchers can analyze and interpret them. For example, a few days of recorded animal behavior videos can take domain experts months to analyze. Innovations in data science, such as machine learning, offer a promising way for scientists to perform data-driven experiments at scale. However, scientific applications raise a number of challenges for existing methods: data are expensive to create, model interpretability is important, and tools are often needed to translate algorithmic improvements into practical benefits.
To address these challenges, my current work focuses on incorporating domain knowledge into machine learning to reduce the human effort required for data analysis. I will discuss methods that improve the sample efficiency and interpretability of models in the context of behavior modeling. To learn annotation-efficient representations, we developed a framework that unifies self-supervision with weak programmatic supervision from domain experts. We demonstrated that, compared to previous approaches, our method reduces annotation requirements by up to a factor of 10 without compromising accuracy. Furthermore, we investigate program synthesis as a promising direction for producing interpretable descriptions of behavior, and we integrate the interpretable programs from our method with an existing tool in behavioral neuroscience. Such interdisciplinary approaches, which keep experts in the loop, are important for broadening the application of data science across scientific domains.
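As a rough illustration of what weak programmatic supervision can look like in this setting, the sketch below defines a heuristic labeling function over animal trajectory data and applies it to unlabeled frames to produce noisy labels. This is not the talk's actual framework; the function, thresholds, and data shapes are all hypothetical stand-ins.

```python
import numpy as np

# Hypothetical data layout: each agent's trajectory is an array of
# shape (num_frames, 2) holding x/y centroid positions per video frame.

def close_proximity_program(agent_a, agent_b, dist_thresh=50.0):
    """Heuristic labeling function: flag frames where the two animals
    are within dist_thresh pixels of each other, a rough proxy for a
    social interaction such as sniffing or chasing."""
    distances = np.linalg.norm(agent_a - agent_b, axis=1)
    return (distances < dist_thresh).astype(int)  # 1 = interacting, 0 = not

# Example usage on synthetic trajectories (stand-ins for real pose data).
rng = np.random.default_rng(0)
agent_a = rng.uniform(0, 500, size=(1000, 2))
agent_b = rng.uniform(0, 500, size=(1000, 2))

weak_labels = close_proximity_program(agent_a, agent_b)
print(f"Fraction of frames weakly labeled as interacting: {weak_labels.mean():.3f}")
```

In a setup along these lines, such expert-written programs can supply supervision signals for many unlabeled frames, complementing self-supervised objectives so that far fewer manually annotated frames are needed.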