Ying Jin
Bio: I am a fifth-year PhD student in the Department of Statistics at Stanford University, advised by Professors Emmanuel Candès and Dominik Rothenhäusler. Prior to Stanford, I obtained my bachelor’s degree in Mathematics from Tsinghua University. My research aims to develop modern statistical methods for trusted inference and decision-making that are simple, robust, generalizable, and minimal in assumptions. A large proportion of my work develops model-free methods that provide solid guarantees for the use of black-box prediction models in complex scientific discovery and decision-making processes, especially with applications to drug discovery. In addition, I study how to generalize causal inference to new contexts and new decision rules; my work under this theme concerns the methodological and empirical foundations for the replicability, robustness, and transportability of causal effects, as well as algorithms and theory for policy learning with offline data from sequential experiments.
Talk Title: Model-free selective inference with conformal p-values and its application in drug discovery
Abstract: In decision-making or scientific discovery pipelines such as job hiring and drug discovery, a resource-intensive step is often preceded by an initial screening step that uses predictions from a machine learning model to shortlist a few candidates from a large pool. We introduce a framework that allows any prediction model to be used to select candidates whose unobserved outcomes exceed user-specified values, while rigorously controlling false positives. Given a set of calibration data that are exchangeable with the test sample, we leverage conformal inference ideas to construct p-values that allow us to shortlist candidates with exact false discovery rate (FDR) control. In addition, I will discuss new ideas for handling covariate shift between calibration and test samples, a scenario that occurs in almost all such problems, including drug discovery, hiring, causal inference, and healthcare. Our methods are flexible wrappers around any complex model. In practical drug discovery tasks, our methods narrow the set of promising drug candidates to a manageable size while maintaining rigorous FDR control.
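The screening procedure outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the speaker's implementation: the score V(x, y) = y - mu(x), the function names, the stand-in predictor mu, and the synthetic data are all hypothetical choices made for the example. Each test candidate j is tested against the null hypothesis that its unobserved outcome is at most the threshold c_j; a conformal p-value is computed by ranking the test score V(x_j, c_j) among the calibration scores, and the Benjamini-Hochberg procedure is then applied at level q. Ties are handled deterministically here, and covariate shift is not addressed.

```python
import numpy as np

def conformal_pvalues(calib_scores, test_scores):
    """Rank each test score among the calibration scores.

    p_j = (1 + #{i : calib_scores[i] < test_scores[j]}) / (n + 1)
    """
    calib_sorted = np.sort(np.asarray(calib_scores, dtype=float))
    # side="left" counts calibration scores strictly below each test score
    n_below = np.searchsorted(calib_sorted, test_scores, side="left")
    return (n_below + 1) / (len(calib_sorted) + 1)

def benjamini_hochberg(pvals, q):
    """Return the sorted indices rejected by BH at nominal level q."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    order = np.argsort(pvals)
    thresholds = q * np.arange(1, m + 1) / m
    passing = np.nonzero(pvals[order] <= thresholds)[0]
    if passing.size == 0:
        return np.array([], dtype=int)
    k = passing[-1] + 1  # largest k with p_(k) <= q * k / m
    return np.sort(order[:k])

# --- Synthetic demonstration (all data and the predictor are made up) ---
rng = np.random.default_rng(0)

def mu(x):
    # Hypothetical stand-in for any pre-trained black-box predictor
    return x

n_calib, n_test = 1000, 200
x_calib = rng.normal(size=n_calib)
y_calib = x_calib + 0.5 * rng.normal(size=n_calib)
x_test = rng.normal(size=n_test)
c_test = np.full(n_test, 1.0)  # user-specified thresholds c_j

# Score V(x, y) = y - mu(x) is monotone in y (an assumed choice)
calib_scores = y_calib - mu(x_calib)  # V(x_i, y_i) with observed outcomes
test_scores = c_test - mu(x_test)     # V(x_j, c_j) with the threshold plugged in

pvals = conformal_pvalues(calib_scores, test_scores)
selected = benjamini_hochberg(pvals, q=0.1)
print(f"Shortlisted {len(selected)} of {n_test} candidates")
```

A small p-value arises when the prediction mu(x_j) is large relative to c_j, so few calibration residuals fall below c_j - mu(x_j). Any fitted model can replace mu, provided it is trained on data separate from the calibration set.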