Part of the Spring 2024 Distinguished Speaker Series.

The field of AI is advancing at unprecedented speed due to the rise of foundation models – large-scale, self-supervised pre-trained models whose capabilities grow dramatically as data, model size, and compute are scaled up. Empirical neural scaling laws aim to predict the scaling behavior of foundation models, serving as an "investment tool" for choosing the methods that scale best with increased compute – methods likely to stand the test of time and escape "the bitter lesson". Moreover, predicting AI behaviors at scale, especially "phase transitions" and emergence, is highly important from the perspective of AI Safety and Alignment with human intent. I will present our efforts towards accurate forecasting of AI behaviors using both an open-box approach, in which the model's internal learning dynamics are accessible, and a closed-box approach that infers neural scaling laws solely from external observations of AI behavior at scale.

Finally, this talk will provide an overview of the open-source foundation models our lab has built and released so far using a large-scale INCITE allocation on the Summit and Frontier supercomputers, including several continually trained 9.6B LLMs, the first Hindi model Hi-NOLIN, the multimodal vision-text model suite Robin, as well as time-series foundation models. I will highlight the continual pre-training paradigm, which allows models to be trained on potentially infinite datasets, as well as approaches to AI ethics and multimodal alignment. See our CERC-AAI project page for more details: https://www.irina-lab.ai/projects.

Bio: Irina Rish is a Full Professor at the Université de Montréal (UdeM), where she leads the Autonomous AI Lab, and a core faculty member of Mila – Quebec AI Institute. She holds a Canada Excellence Research Chair (CERC) and a CIFAR Chair. Dr. Rish completed her MSc and PhD in AI at the University of California, Irvine, and also holds an MSc in Applied Mathematics from the Moscow Gubkin Institute. Irina is the recipient of an INCITE compute grant from the US Department of Energy and currently leads an INCITE project on Scalable Foundation Models on the Summit and Frontier supercomputers at the Oak Ridge Leadership Computing Facility, focusing on developing open-source large-scale AI models (a.k.a. foundation models). She is also a co-founder and the Chief Science Officer of nolano.ai, a company focused on both the development of large-scale foundation models and a range of model services, including compression, inference acceleration, and evaluation.

Irina's extensive research career spans multiple AI domains, from automated reasoning and probabilistic inference in graphical models to machine learning, sparse modeling, and neuroscience-inspired AI. Her current research concentrates on continual learning, out-of-distribution generalization, and robustness, as well as on understanding neural scaling laws and emergent behaviors (with respect to both capabilities and alignment) in foundation models – a vital stride towards achieving maximally beneficial Artificial General Intelligence (AGI). She teaches courses on AI scaling and alignment and runs the Neural Scaling & Alignment workshop series.

Before joining UdeM in 2019, Irina was a research scientist at the IBM T.J. Watson Research Center, where she worked on various projects at the intersection of neuroscience and AI and led the Neuro-AI challenge. She received the IBM Eminence & Excellence Award and IBM Outstanding Innovation Award (2018), the IBM Outstanding Technical Achievement Award (2017), and the IBM Research Accomplishment Award (2009). She holds 64 patents and has authored over 140 research papers, several book chapters, three edited books, and a monograph on Sparse Modeling.

Agenda

Thursday, April 18, 2024

12:00 pm–12:30 pm

Lunch

Lunch will be provided on a first-come, first-served basis.

12:30 pm–1:30 pm

Talk and Q&A
