Project: Causality for Credible AI. The backbone of modern ML is evaluation on randomly held-out data. However, models with strong held-out performance frequently still exhibit disturbing failures. For example, they can degrade when deployed out of domain, learn to depend on apparently irrelevant input features, or change substantively in response to tiny variations in training procedure (e.g., changing the random seed). The aim of this project is to develop methods that mitigate this kind of failure by baking causal knowledge into model design and evaluation. For example, we may know that changing certain input features shouldn't change predictions (e.g., your medical diagnosis shouldn't change if we edit your zipcode), or we may know something about common causal structure between domains (rain should affect a self-driving car the same way in Chicago and New York). The challenge is to translate such domain knowledge and desiderata into formal requirements, and then determine how to enforce and measure these using available data.
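One way to make the zipcode example concrete is an invariance check: edit the supposedly irrelevant feature and verify the prediction is unchanged. The sketch below is purely illustrative and not from the project itself; the toy `diagnose` model and all names are hypothetical stand-ins for a learned model.

```python
def diagnose(patient):
    """Toy diagnostic model that, by design, ignores `zipcode`.
    A real model would be learned from data; a simple rule stands in here."""
    return "flag" if patient["blood_pressure"] > 140 else "ok"

def invariance_violations(model, patients, feature, alternatives):
    """Count patients whose prediction changes when `feature` is edited.

    This operationalizes the requirement 'editing this feature should
    not change the prediction' as a measurable quantity on a dataset.
    """
    violations = 0
    for p in patients:
        base = model(p)
        for alt in alternatives:
            edited = {**p, feature: alt}  # counterfactual edit of one feature
            if model(edited) != base:
                violations += 1
                break
    return violations

patients = [
    {"blood_pressure": 150, "zipcode": "60637"},
    {"blood_pressure": 120, "zipcode": "10027"},
]
print(invariance_violations(diagnose, patients, "zipcode", ["02139", "94305"]))
```

A model satisfying the requirement reports zero violations; a model that has learned to depend on zipcode would not, which is exactly the kind of failure the evaluation is meant to surface.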

Bio: I am an assistant professor of Statistics and Data Science at the University of Chicago and a research scientist at Google Cambridge. My recent work revolves around the intersection of machine learning and causal inference, as well as the design and evaluation of safe and credible AI systems. Other notable areas of interest include network data and the foundations of learning and statistical inference.

I was previously a Distinguished Postdoctoral Researcher in the Department of Statistics at Columbia University, where I worked with the groups of David Blei and Peter Orbanz. I completed my Ph.D. in statistics at the University of Toronto, where I was advised by Daniel Roy. In a previous life, I worked on quantum computing at the University of Waterloo. I have won a number of awards, including the Pierre Robillard Award for the best statistics thesis in Canada.