Tijana Zrnic (Stanford): Statistics and DSI Joint Colloquium
Please join us for a Statistics and DSI joint colloquium.
Wednesday, January 29
4:00pm – 5:00pm
John Crerar Library, 390
Title: AI-Assisted Approaches to Data Collection and Inference
Abstract: Recent breakthroughs in AI offer tremendous potential to reduce the costs of data collection. For example, there is growing interest in leveraging large language models (LLMs) as efficient substitutes for human judgment in tasks such as model evaluation and survey research. However, AI systems are not without flaws: generative language models often lack factual accuracy, and predictive models remain vulnerable to subtle perturbations. These issues are particularly concerning when critical decisions, such as scientific discoveries or policy choices, rely on AI-generated outputs. In this talk, I will present recent and ongoing work on AI-assisted approaches to data collection and statistical inference. Rather than treating AI as a replacement for data collection, our methods leverage AI to strategically guide data collection and improve the power of subsequent inferences, all the while retaining provable validity guarantees. I will demonstrate the benefits of this methodology through examples from computational social science, proteomics, and more.
Bio: I’m a Ram and Vijay Shriram Postdoctoral Fellow at Stanford University, affiliated with Stanford Data Science. I work with Emmanuel Candès in the Department of Statistics. I obtained my PhD in Electrical Engineering and Computer Sciences at UC Berkeley in 2023, where I was advised by Moritz Hardt and Michael Jordan. I spent the summer of 2020 interning at Apple AI Research, hosted by Vitaly Feldman. My PhD research was generously supported by an Apple PhD Fellowship in AI/ML. Before starting my PhD, I completed my BEng in Electrical and Computer Engineering at the University of Novi Sad in Serbia, where I was advised by Dragana Bajovic. During my undergraduate studies, I spent a summer at Caltech, working with Babak Hassibi.
My research establishes foundations to ensure data-driven technologies have a positive impact. Topics of interest include performative prediction, a framework that formalizes the impacts that predictive algorithms can have on society and studies algorithms for finding desirable equilibria in such settings. Much of my recent work focuses on prediction-powered inference and active inference, developing statistically valid methods for an increasingly common setting in which a small amount of “gold-standard” data is supplemented by a large amount of AI predictions. My work also contributes to drawing reliable conclusions in the presence of selection bias, such as the bias arising from cherry-picking only those effects and models that seem most promising based on the data.