Tamara Kolda (MathSci.ai) - Tensor Moments of Gaussian Mixture Models

Gaussian mixture models (GMMs) are fundamental tools in the statistical and data sciences, useful for clustering, anomaly detection, density estimation, and more. We are interested in high-dimensional problems (e.g., many features) with a potentially massive number of data points. One way to compute the parameters of a GMM is the method of moments, which fits the parameters by comparing the sample and model moments. The first moment is the mean, and the second (centered) moment is the covariance. We are interested in third, fourth, and even higher-order moments. The d-th moment of an n-dimensional random variable is a symmetric d-way tensor (multidimensional array) of size n x n x … x n (d times), so working with moments is generally assumed to be prohibitively expensive in both storage and time for d > 2 and large n. In this talk, we show that the model parameters can be estimated without explicitly forming the model or sample moments. In fact, the cost per iteration for the method of moments is of the same order as that of expectation maximization (EM), making the method of moments competitive. Along the way, we show how to concisely describe the moments of Gaussians and GMMs using tools from algebraic geometry, enumerative combinatorics, and multilinear algebra. Numerical results validate and illustrate the efficiency of our approaches.
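To give a feel for why explicit moment tensors can be avoided, here is a minimal sketch of one standard identity. It is not the method presented in the talk. For samples x_1, …, x_p, the (uncentered) d-th sample moment is the average of the d-fold outer products of each x_i with itself, and its inner product with the rank-one tensor built from a vector a equals the average of (x_i · a)^d. The explicit route below touches all n^d entries, while the implicit route needs only p dot products. All function and variable names are hypothetical, and the data is random toy data.

```python
import itertools
import random


def explicit_moment(samples, d):
    """Form the dense d-th sample moment: one entry per index tuple (n**d entries)."""
    n = len(samples[0])
    p = len(samples)
    moment = {}
    for idx in itertools.product(range(n), repeat=d):
        total = 0.0
        for x in samples:
            term = 1.0
            for i in idx:
                term *= x[i]
            total += term
        moment[idx] = total / p
    return moment


def contract_explicit(moment, a):
    """Inner product of the dense moment with the rank-one tensor a (x) a (x) ... (x) a."""
    total = 0.0
    for idx, value in moment.items():
        weight = 1.0
        for i in idx:
            weight *= a[i]
        total += value * weight
    return total


def contract_implicit(samples, a, d):
    """Same inner product, computed as the mean of (x . a)**d without forming the tensor."""
    p = len(samples)
    return sum(sum(xi * ai for xi, ai in zip(x, a)) ** d for x in samples) / p


# Toy problem (assumed sizes): n = 4 features, p = 50 samples, moment order d = 3.
rng = random.Random(0)
n, p, d = 4, 50, 3
samples = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
a = [rng.gauss(0.0, 1.0) for _ in range(n)]

moment = explicit_moment(samples, d)
explicit_value = contract_explicit(moment, a)
implicit_value = contract_implicit(samples, a, d)
print(len(moment), explicit_value, implicit_value)
```

The two values agree up to floating-point rounding. The explicit path costs on the order of p·n^d operations and n^d storage, whereas the implicit path costs on the order of p·n operations and stores nothing beyond the data.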

Bio: Tamara Kolda is an independent mathematical consultant under the auspices of her company MathSci.ai, based in California. She is also a Distinguished Visiting Professor in the Department of Industrial Engineering & Management Science at Northwestern University in Evanston, Illinois. From 1999 to 2021, she was a researcher at Sandia National Laboratories in Livermore, California. She specializes in mathematical algorithms and computational methods for tensor decompositions, tensor eigenvalues, graph algorithms, randomized algorithms, machine learning, network science, numerical optimization, and distributed and parallel computing.

She is currently serving as the founding editor-in-chief for the SIAM Journal on Mathematics of Data Science (SIMODS), on the Senior Program Committee for the Conference on Learning Theory (COLT’22), and as the Chair of the Illustrating the Impact of the Mathematical Sciences study for the National Academies. She is also a member of the National Academies’ Board on Mathematical Sciences and Analytics (BMSA), the board of advisors for the Institute for Mathematical and Statistical Innovation (IMSI), the SIAM Ethics Committee, the SIAM Block Lecture Selection Committee, the ACM-IEEE CS George Michael Memorial HPC Fellowships Committee, and the Schmidt Postdoctoral Fellowship Selection Committee.

She is a member of the National Academy of Engineering (NAE), Fellow of the Society for Industrial and Applied Mathematics (SIAM), and Fellow of the Association for Computing Machinery (ACM). Other recognitions include two best paper prizes from the IEEE International Conference on Data Mining (ICDM), a best paper prize from the SIAM International Conference on Data Mining (SDM), an R&D100 Award from R&D Magazine, and a Presidential Early Career Award for Scientists and Engineers (PECASE).
