
Gaussian mixture models (GMMs) are fundamental tools in statistics and data science, useful for clustering, anomaly detection, density estimation, and more. We are interested in high-dimensional problems (e.g., many features) with a potentially massive number of data points. One way to compute the parameters of a GMM is via the method of moments, which compares the sample and model moments. The first moment is the mean, and the second (centered) moment is the covariance. We are interested in third, fourth, and even higher-order moments. The d-th moment of an n-dimensional random variable is a symmetric d-way tensor (multidimensional array) of size n × n × … × n (d times), so working with moments is generally assumed to be prohibitively expensive in both storage and time for d > 2 and larger values of n. In this talk, we show that the model parameters can be estimated without explicitly forming the model or sample moments. In fact, the cost per iteration for the method of moments is of the same order as that of expectation maximization (EM), making the method of moments competitive. Along the way, we show how to concisely describe the moments of Gaussians and GMMs using tools from algebraic geometry, enumerative combinatorics, and multilinear algebra. Numerical results validate our approaches and illustrate their efficiency.
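To give a feel for why a moment tensor never has to be formed, here is a minimal sketch of one standard identity, not the speaker's algorithm. For a sample moment M_d, the mean of the d-fold outer products of the data points x_i, the inner product with a rank-one tensor v⊗…⊗v equals the mean of (x_i·v)^d. That costs O(pn) operations for p points instead of O(n^d) storage. The function names, the random test data, and the sizes are all assumptions made for illustration.

```python
import numpy as np

def implicit_moment_inner(X, v, d):
    """Inner product <M_d, v^{(x)d}> computed without forming M_d.

    X is a (p, n) array of p samples; M_d is the sample d-th moment.
    """
    return np.mean((X @ v) ** d)

def explicit_moment_inner(X, v, d):
    """Same quantity, but forming the n^d tensor (feasible only for small n)."""
    p, n = X.shape
    M = np.zeros((n,) * d)
    for x in X:
        T = x
        for _ in range(d - 1):
            T = np.multiply.outer(T, x)  # build x (x) x (x) ... (x) x
        M += T
    M /= p
    V = v
    for _ in range(d - 1):
        V = np.multiply.outer(V, v)      # build v (x) v (x) ... (x) v
    return np.sum(M * V)

# Hypothetical small problem so the explicit tensor fits in memory.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))
v = rng.standard_normal(4)
a = implicit_moment_inner(X, v, 3)
b = explicit_moment_inner(X, v, 3)
print(np.isclose(a, b))
```

The explicit version is included only as a check. With n = 1000 and d = 4 it would need 10^12 tensor entries, while the implicit version needs only the p inner products x_i·v.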

Bio: Tamara Kolda is an independent mathematical consultant under the auspices of her company based in California. She is also a Distinguished Visiting Professor in the Department of Industrial Engineering & Management Science at Northwestern University in Evanston, Illinois. From 1999 to 2021, she was a researcher at Sandia National Laboratories in Livermore, California. She specializes in mathematical algorithms and computational methods for tensor decompositions, tensor eigenvalues, graph algorithms, randomized algorithms, machine learning, network science, numerical optimization, and distributed and parallel computing.

She is currently serving as the founding editor-in-chief for the SIAM Journal on Mathematics of Data Science (SIMODS), on the Senior Program Committee for the Conference on Learning Theory (COLT’22), and as the Chair of the Illustrating the Impact of the Mathematical Sciences study for the National Academies. She is also a member of the National Academies’ Board on Mathematical Sciences and Analytics (BMSA), the board of advisors for the Institute for Mathematical and Statistical Innovation (IMSI), the SIAM Ethics Committee, the SIAM Block Lecture Selection Committee, the ACM-IEEE CS George Michael Memorial HPC Fellowships Committee, and the Schmidt Postdoctoral Fellowship Selection Committee.

She is a member of the National Academy of Engineering (NAE), Fellow of the Society for Industrial and Applied Mathematics (SIAM), and Fellow of the Association for Computing Machinery (ACM). Other recognitions include two best paper prizes from the IEEE International Conference on Data Mining (ICDM), a best paper prize from the SIAM International Conference on Data Mining (SDM), an R&D100 Award from R&D Magazine, and a Presidential Early Career Award for Scientists and Engineers (PECASE).