Matthew Stephens

My lab works on a wide variety of problems at the interface of Statistics and Genetics. We often tackle problems where novel statistical methods are required, or where new methods can reveal something that existing approaches miss. Thus, much of our research involves developing new statistical methods, many of which have a non-trivial computational component. Because data sets keep getting larger, our work often involves modern methods for “high-dimensional statistics”. We also make extensive use of Bayesian hierarchical models to borrow information across data sets or sampling units.

Recently my lab has been increasingly focussed on making its research more open, reproducible and extensible. I see this as the first step towards greater cooperation among scientists to achieve common goals.

See http://github.com/stephenslab/ash for an example of a recent project I conducted “in the open”, and https://jdblischak.github.io/workflowr/ for an R package we have developed to help students and others make research websites of their analyses. In learning to do research this way, my lab uses Git for version control, GitHub for sharing code, and knitr and RStudio to help make our R analyses clear and shareable.

Current research interests include:

Sparsity, shrinkage, and false discovery rates, particularly for complex inter-related datasets
Factor analysis, dimension reduction, and estimation of large covariance matrices
Clustering methods and their generalizations (e.g. grade of membership models)
Applications of multi-scale and wavelet methods to genomic data
Reproducible research and open science

