
DSI Affiliated Scholars are University of Chicago faculty, Other Academic Appointees, or CASE affiliates who are leaders in advancing the intellectual agenda of data science and actively engage in one or more DSI activities. As Affiliates, scholars have access to DSI benefits and resources such as engagement with DSI’s strategic planning (initiatives, proposals), interaction with the DSI Industry Affiliates program, and opportunities for joint mentorship of students and postdoctoral scholars.

Affiliated Scholars

Anjali Adukia is an assistant professor at the University of Chicago Harris School of Public Policy and the College. In her work, she is interested in understanding how to reduce inequalities such that children from historically disadvantaged backgrounds have equal opportunities to fully develop their potential.  Her research is focused on understanding factors that motivate and shape behavior, preferences, attitudes, and educational decision-making, with a particular focus on early-life influences.  She examines how the provision of basic needs—such as safety, health, justice, and representation—can increase school participation and improve child outcomes in developing contexts.

Adukia completed her doctoral degree at the Harvard University Graduate School of Education, with an academic focus on the economics of education. Her work has been funded by organizations such as the William T. Grant Foundation, the National Academy of Education, and the Spencer Foundation. Her dissertation won awards from the Association for Public Policy Analysis and Management (APPAM), the Association for Education Finance and Policy (AEFP), and the Comparative and International Education Society (CIES). Adukia received recognition for her teaching from the University of Chicago Feminist Forum. She completed her master of education degrees in international education policy and higher education (administration, planning, and social policy) at Harvard University and her bachelor of science degree in molecular and integrative physiology at the University of Illinois at Urbana-Champaign. She is a faculty research fellow of the National Bureau of Economic Research and a faculty affiliate of the University of Chicago Education Lab and Crime Lab. She is on the editorial board of Education Finance and Policy. She was formerly a board member of the Young Nonprofit Professionals Network – San Francisco Bay Area. She continues to work with non-governmental organizations internationally, such as UNICEF and Manav Sadhna in Gujarat, India.

Raul Castro Fernandez is an Assistant Professor of Computer Science at the University of Chicago. In his research he builds systems for discovering, preparing, and processing data. The goal of his research is to understand and exploit the value of data. He often uses techniques from data management, statistics, and machine learning. His main effort these days is on building platforms to support markets of data. This is part of a larger research effort on understanding the Economics of Data. He’s part of ChiData, the data systems research group at The University of Chicago.


Kyle Chard is a Research Assistant Professor in the Department of Computer Science at the University of Chicago and Argonne National Laboratory. He has been Program Director of the Data & Computing Summer Lab since its first iteration under CDAC in 2019, and previously oversaw the Summer Internship Program run by the former Computation Institute.

He received his Ph.D. in Computer Science from Victoria University of Wellington in 2011. He co-leads the Globus Labs research group which focuses on a broad range of research problems in data-intensive computing and research data management. He currently leads projects related to parallel programming in Python, scientific reproducibility, and elastic and cost-aware use of cloud infrastructure.

I completed my PhD in Statistics in June 2020 at Stanford University, where I was lucky to be supervised by Professor Susan Holmes and where I had the chance to work with Prof. Jure Leskovec. Prior to Stanford, I studied Applied Mathematics and Engineering at Ecole Polytechnique (France), where I received the equivalent of an M.S. and a B.S.

My research interests lie in the analysis of patterns and the quantification of uncertainty in high-dimensional datasets, and in particular, graphs and networks, geared towards biomedical applications.

Dr. Allyson Ettinger’s research is focused on language processing in humans and in artificial intelligence systems, motivated by a combination of scientific and engineering goals. For studying humans, her research uses computational methods to model and test hypotheses about mechanisms underlying the brain’s processing of language in real time. In the engineering domain, her research uses insights and methods from cognitive science, linguistics, and neuroscience in order to analyze, evaluate, and improve natural language understanding capacities in artificial intelligence systems. In both of these threads of research, the primary focus is on the processing and representation of linguistic meaning.

My research focuses on the collective system of thinking and knowing, ranging from the distribution of attention and intuition, the origin of ideas and shared habits of reasoning to processes of agreement (and dispute), accumulation of certainty (and doubt), and the texture—novelty, ambiguity, topology—of understanding. I am especially interested in innovation—how new ideas and practices emerge—and the role that social and technical institutions (e.g., the Internet, markets, collaborations) play in collective cognition and discovery. Much of my work has focused on areas of modern science and technology, but I am also interested in other domains of knowledge—news, law, religion, gossip, hunches, machine and historical modes of thinking and knowing. I support the creation of novel observatories for human understanding and action through crowd sourcing, information extraction from text and images, and the use of distributed sensors (e.g., RFID tags, cell phones). I use machine learning, generative modeling, social and semantic network representations to explore knowledge processes, scale up interpretive and field-methods, and create alternatives to current discovery regimes.

My research has been supported by the National Science Foundation, the National Institutes of Health, the Air Force Office of Scientific Research, and many philanthropic sources, and has been published in Nature, Science, Proceedings of the National Academy of Sciences, American Journal of Sociology, American Sociological Review, Social Studies of Science, Research Policy, Critical Theory, Administrative Science Quarterly, and other outlets. My work has been featured in the Economist, Atlantic Monthly, Wired, NPR, BBC, El País, CNN, Le Monde, and many other outlets.

At Chicago, I am Director of Knowledge Lab, which has collaborative, granting and employment opportunities, as well as ongoing seminars. I also founded and now direct the Computational Social Science program at Chicago, and sponsor an associated Computational Social Science workshop. I teach courses in augmented intelligence, the history of modern science, science studies, computational content analysis, and Internet and Society. Before Chicago, I received my doctorate in sociology from Stanford University, served as a research associate in the Negotiation, Organizations, and Markets group at Harvard Business School, started a private high school focused on project-based arts education, and completed a B.A. in Anthropology at Brigham Young University.


I lead an interdisciplinary computational and theoretical research group working on materials self-assembly, biomolecular simulation, viral dynamics, and vaccine design. My doctoral training provided me with expertise in molecular simulation, statistical mechanics, and machine learning, in which I developed new nonlinear machine learning approaches to study the conformations and dynamics of proteins, polymers, and confined water. During my post-doctoral fellowship, I acquired knowledge and skills in immunology and viral dynamics, and developed new computational tools for structure-free prediction of antibody binding sites, and the computational design of HIV vaccines using statistical mechanical principles.

Since establishing my independent research program in 2012, I have combined these areas of expertise to build a dynamic research program in computational materials science and computational virology, for which I have attracted over $2.9M in federal research funding and established a strong publication record (60+ papers) in leading journals. I have been recognized with a number of national awards, including a 2018 Royal Society of Chemistry Molecular Systems Design and Engineering Emerging Investigator Award, 2017 Dean’s Award for Excellence in Research, 2016 AIChE CoMSEF Young Investigator Award, 2015 ACS Outstanding Junior Faculty Award, 2014 ACS Petroleum Research Fund Doctoral New Investigator Award, and 2013 NSF CAREER Award, and I was named the 2013 Institution of Chemical Engineers North America “Young Chemical Engineer of the Year”. I am engaged and active within my professional organization, serving on the AIChE Area 1a Programming Committee and as CoMSEF Liaison Director, and organizing multiple scientific sessions at our national meetings. In addition to independent theoretical work, my research interests lead naturally to close collaboration with experimentalists and clinicians, teaching me the power of mutually reinforcing theoretical and experimental work and the importance of effective communication, planning, budgeting, teamwork and leadership.


William Howell is the Sydney Stein Professor in American Politics at the University of Chicago Harris School of Public Policy, a professor in the Department of Political Science and the College, and the director of the Center for Effective Government. He has written widely on separation-of-powers issues and American political institutions, especially the presidency. He currently is working on research projects on Obama’s education initiatives, distributive politics, and the normative foundations of executive power.


Kate Keahey is one of the pioneers of infrastructure cloud computing. She created the Nimbus project, recognized as the first open source Infrastructure-as-a-Service implementation, and continues to work on research aligning cloud computing concepts with the needs of scientific datacenters and applications. To facilitate such research for the community at large, Kate leads the Chameleon project, providing a deeply reconfigurable, large-scale, and open experimental platform for Computer Science research. To foster the recognition of contributions to science made by software projects, Kate co-founded and serves as co-Editor-in-Chief of the SoftwareX journal, a new format designed to publish software contributions. Kate is a Scientist at Argonne National Laboratory and a Senior Fellow at the Computation Institute at the University of Chicago.

I’m an assistant professor in the Department of Statistics at the University of Chicago. I am also a member of the Committee on Computational and Applied Mathematics (CCAM). I am interested in computational problems in structural biology and quantum many-body physics. I was extremely fortunate to have Amit Singer at Princeton as my Ph.D. adviser during 2012-2016, Lexing Ying at Stanford as my post-doc mentor during 2016-2019, and Phuan Ong at Princeton supervising my master's thesis in experimental physics during 2010-2012.

Sanjay Krishnan is an Assistant Professor of Computer Science. His research group studies the theory and practice of building decision systems that are robust to corrupted, missing, or otherwise uncertain data. His research brings together ideas from statistics/machine learning and database systems. His research group is currently studying systems that can analyze large amounts of video, certifiable accuracy guarantees in partially complete databases, and theoretical lower-bounds for lossy compression in relational databases.


Nicole Marwell is Associate Professor in the University of Chicago Crown Family School of Social Work, Policy, and Practice. She is also a faculty affiliate of the Department of Sociology, a faculty fellow at the Center for Spatial Data Science, and a member of the Faculty Advisory Council of the Mansueto Institute for Urban Innovation. Her research examines urban governance, with a focus on the diverse intersections between nonprofit organizations, government bureaucracies, and politics.

I have been an Assistant Professor in the Department of Statistics at the University of Chicago since 2018, where I am a member of the Committee on Computational and Applied Mathematics. Previously, I was a postdoctoral research associate and a member of the Data Science Initiative at Brown University. In 2016 I completed my PhD in Mathematics and Statistics at the University of Warwick, UK, under the supervision of Andrew Stuart and Gareth Roberts.

My research interests are in graph-based learning, inverse problems and data assimilation. The main theme that drives my research across these three disciplines is the desire to blend complex predictive models with large data sets. My work addresses both theoretical and computational challenges motivated by data-centric applications.

My work is currently funded by the National Science Foundation, the National Geospatial-Intelligence Agency and the BBVA Foundation. I was awarded the 2020 José Luis Rubio de Francia prize for the best Spanish mathematician under 32 by the Spanish Royal Society of Mathematics.

I am the organizer of the CAM Colloquium.

My lab works on a wide variety of problems at the interface of Statistics and Genetics. We often tackle problems where novel statistical methods are required, or where they can teach us something new compared with existing approaches. Thus, much of our research involves developing new statistical methods, many of which have a non-trivial computational component. And because data sets are getting larger and larger, our work often involves modern methods for “high-dimensional statistics”. Our work often makes extensive use of Bayesian hierarchical models to borrow information across data sets or sampling units.

Recently my lab has been increasingly focussed on making its research more open, reproducible and extensible. This is because I see this as the first step towards greater cooperation of scientists to achieve common goals.

See for an example of a recent project I conducted “in the open”. And see for an R package we have developed to help students and others make research websites of their analyses. In learning to do research this way, my lab uses Git for version control, GitHub for sharing code, and knitr and RStudio to help make our R analyses clear and shareable.

Current research interests include:

  • Sparsity, shrinkage, and false discovery rates, particularly for complex inter-related datasets.
  • Factor Analysis, dimension reduction, and estimation of large covariance matrices.
  • Clustering methods and generalizations (e.g., grade of membership)
  • Applications of multi-scale and wavelet methods to genomic data
  • Reproducible research and open science

Samuel L. Volchenboum, MD, PhD, MS, is an expert in pediatric cancers and blood disorders. He has a special interest in treating children with neuroblastoma, a tumor of the sympathetic nervous system.

In addition to caring for patients, Dr. Volchenboum studies ways to harness computers to enable research and foster innovation using large data sets. He directs the development of the International Neuroblastoma Risk Group Database project, which connects international patient data with external information such as genomic data and tissue availability. The Center he runs provides computational support for the Biological Sciences Division at the University of Chicago, including high-performance computing, applications development, bioinformatics, and access to the clinical research data warehouse.