AI+Science Summer School 2024
Modern artificial intelligence and machine learning will fundamentally change scientific discovery. We are just beginning to understand the possibilities presented by an era of extraordinarily powerful computers coupled with advanced instruments capable of collecting enormous volumes of high-resolution experimental data. Off-the-shelf machine learning tools cannot fully extract the knowledge contained in these datasets, let alone generate new theories and propose future experiments.
The AI + Science Summer School will be held July 15–19, 2024, jointly hosted by the Data Science Institute (DSI) and the Institute for Mathematical and Statistical Innovation (IMSI) at the University of Chicago, and by Schmidt Sciences via the Eric and Wendy Schmidt AI in Science Fellowship program at the University of Chicago.
This year’s speakers will focus on applications of AI and machine learning in core areas of domain science: genetics and physics, materials and chemistry, energy sciences, climate sciences, and bioinformatics. The goal of the program is to introduce a new generation of diverse interdisciplinary graduate students and researchers to the emerging field of AI + Science. We also hope this program can build community and spur new research directions focused on AI-enabled scientific discovery across the physical and biological sciences. Lectures and tutorials will focus on some of these areas: generative models, representation learning, label-efficient learning, simulation-based inference and other inverse problems, and uncertainty quantification.
The organizing committee for the AI + Science Summer School includes Peter Lu, Ritesh Kumar, Yihang Wang, Anthony Badea, Mark Schulze, Simona Ahmed, and Rebecca Willett.
Check-in Information:
Time/Date: 8:30 am, Monday, July 15, 2024; 9:00 am, Tuesday–Friday, July 16–19, 2024
Location: Atrium in front of Auditorium 142
Institute for Mathematical and Statistical Innovation (IMSI)
1155 East 60th Street
Chicago, IL 60637
Contact Information: Simona Ahmed – simonaa@uchicago.edu or Mark Schulze – mschulze@uchicago.edu
Agenda
Monday, July 15, 2024
Check-in
IMSI Auditorium Atrium
Welcoming Address
Nhan Tran (Fermilab) Lecture
Nhan Tran – Lecture Part 1 Video
Abstract:
Pursuing answers to fundamental questions about our universe requires searches for the ultra-rare and the very subtle, and inspection of nature at extremely fine spatial and temporal scales. Cutting-edge experiments are often confronted with massive amounts of very rich data. To accelerate scientific discovery, enabling powerful machine learning across the data processing continuum, from sensor front-ends to large-scale computing, is becoming increasingly valuable. To deploy ML in these challenging scientific environments, we require usable and accessible tool flows for efficient training and implementation across a broad range of scientific domains. This talk will introduce the motivations and requirements for “Fast” ML applications for science and how we use modern tools and techniques for developing and deploying them into our experiments.
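The efficiency constraints the lecture alludes to can be made concrete with a small sketch (my own illustration, not a tool from the talk such as hls4ml, and with made-up numbers): quantizing a trained dense layer's weights to 8-bit fixed point, the kind of compression FPGA/ASIC deployments typically require, and measuring how little the layer's output changes.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy dense layer: y = W @ x (weights as they might come out of training).
W = rng.normal(scale=0.5, size=(16, 32))
x = rng.normal(size=32)

def quantize_fixed_point(w, total_bits=8, frac_bits=6):
    """Round to signed fixed point with `frac_bits` fractional bits."""
    scale = 2 ** frac_bits
    lo = -(2 ** (total_bits - 1))        # most negative code
    hi = 2 ** (total_bits - 1) - 1       # most positive code
    codes = np.clip(np.round(w * scale), lo, hi)
    return codes / scale

W_q = quantize_fixed_point(W)

y_full = W @ x
y_quant = W_q @ x

# Relative error introduced by 8-bit weights.
rel_err = np.linalg.norm(y_full - y_quant) / np.linalg.norm(y_full)
print(f"relative output error: {rel_err:.4f}")
```

In practice the bit widths (and whether activations are quantized too) are tuned per layer against a latency/resource budget; the point of the sketch is only that aggressive quantization often costs very little accuracy.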
Bio:
Tran’s research focus is on using accelerator-based experiments, such as CMS at the LHC, to search for new phenomena. His current activities center on Higgs boson and dark sector experiments. He is developing technology at the intersection of electronics, computing, and artificial intelligence to amplify experimental capabilities. He was a postdoctoral associate at Fermilab, and prior to that he received his PhD from Johns Hopkins University in 2011 and his bachelor’s degree from Princeton University in 2005. Tran is a recipient of the URA Tollestrup Award, the APS Henry Primakoff Award, and the DOE Early Career Award.
Break
Auditorium Lounge
Nhan Tran (Fermilab) Lecture
Nhan Tran – Lecture Part 2 Video
Abstract and bio as above.
Lunch Break
Ida Noyes – 1st floor
Zachary Ulissi (CMU/Facebook-Meta) Lecture
Zachary Ulissi – Lecture Video
Abstract:
Machine learning accelerated catalyst discovery efforts have seen much progress in the last few years. Large-scale data and machine learning modeling efforts like the Open Catalyst Project (https://opencatalystproject.org/, open source data/models at https://fair-chem.github.io/) have elevated computational catalysis to a first-class problem in the broader machine learning community and led to rapid improvements in accuracy with new state-of-the-art AI/ML models appearing every 3-4 months since 2020. Similar efforts are under way for other climate-related materials challenges like direct air capture via the OpenDAC collaboration (https://open-dac.github.io/).
In the first session I will highlight some of the recent FAIR Chemistry datasets and model efforts, and then review and discuss metrics and evaluation criteria for uncertainty quantification in these large models. I will then review some of the common and emerging methods in the literature to provide uncertainty estimates, especially with an eye on computational trade-offs (sources of uncertainty, training, inference, etc).
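The uncertainty-metrics discussion can be grounded with a minimal sketch (my own illustration with simulated numbers, not FAIR Chemistry code): a deep ensemble's disagreement supplies a per-sample uncertainty, and a Gaussian negative log-likelihood scores both accuracy and calibration against held-out targets.

```python
import numpy as np

rng = np.random.default_rng(42)

# Pretend we trained K = 5 models; each row is one model's predictions for
# N = 100 held-out adsorption energies (eV), simulated here as noisy
# copies of a common ground truth.
truth = rng.normal(size=100)
ensemble = truth + rng.normal(scale=0.1, size=(5, 100))

mean = ensemble.mean(axis=0)           # ensemble prediction
std = ensemble.std(axis=0, ddof=1)     # disagreement as an epistemic proxy

def gaussian_nll(y, mu, sigma, eps=1e-6):
    """Average negative log-likelihood under N(mu, sigma^2)."""
    sigma = np.maximum(sigma, eps)
    return np.mean(0.5 * np.log(2 * np.pi * sigma**2)
                   + 0.5 * ((y - mu) / sigma) ** 2)

nll = gaussian_nll(truth, mean, std)
rmse = np.sqrt(np.mean((truth - mean) ** 2))
print(f"RMSE = {rmse:.3f} eV, NLL = {nll:.3f}")
```

The trade-off the abstract mentions shows up directly here: the ensemble needs K forward passes at inference time, whereas single-model alternatives (e.g., a predicted-variance head) are cheaper but tend to be less reliable out of distribution.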
Bio:
Zack Ulissi is a research scientist on the FAIR Chemistry team in Meta’s Fundamental AI Research lab and an Adjunct Professor of Chemical Engineering at Carnegie Mellon University. He has led several open science projects and community efforts, the most notable of which is the Open Catalyst Project (https://opencatalystproject.org/). Prior to Meta, he was an Associate Professor of Chemical Engineering. He completed his undergraduate work at the University of Delaware, M.A.St. at Cambridge, PhD at MIT on carbon nanotube devices with Michael Strano and Richard Braatz, and post-doc in catalysis at Stanford with Jens Nørskov.
Break
IMSI Auditorium Lounge
Zachary Ulissi (CMU/Facebook-Meta) Interactive Session (Tutorial)
Abstract:
In the tutorial session, I will walk through some of the tutorials and case studies that use FAIR Chemistry models, datasets, and checkpoints. This includes an online demo service (https://open-catalyst.metademolab.com/) that provides on-the-fly inference using some of our latest models. I will also show how the models can be used to estimate properties like adsorption energies (AdsorbML) or transition state energies (CatTsunami), both of which have demos that can be run offline.
Welcome Reception
Logan Center – 9th floor
Tuesday, July 16, 2024
Check-In
IMSI Auditorium Atrium
Suganya Sivagurunathan (Merkin Institute) Lecture
Suganya Sivagurunathan – Lecture Video
Abstract:
The workshop will teach attendees how to convert microscopy images into “image-based” or “morphological profiles”, similar to proteomic or transcriptomic profiles. These morphological profiles are created by measuring cellular features from the microscopy images. The profiles generated from cells treated with chemical or genetic perturbations can be used to identify similarities and differences between the “treated” cells and control groups. Interestingly, this approach can also reveal unexpected biological relationships between the treated groups and the features driving these relationships. These profile relationships can be visualized using Morpheus, a web-based tool, or with Google Colab Notebooks.
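The profiling pipeline described above can be sketched end to end in a few lines (a simplified illustration with simulated features, not the workshop's actual CellProfiler/Morpheus workflow): aggregate per-cell features into a well-level profile, normalize against controls, and compare treatments by cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(1)
n_features = 50

def well_profile(per_cell_features):
    """Aggregate a (cells x features) matrix into one profile per well."""
    return np.median(per_cell_features, axis=0)

# Simulated wells: controls, plus two perturbations that shift
# overlapping subsets of features.
controls = [well_profile(rng.normal(size=(200, n_features))) for _ in range(8)]
shift_a = np.zeros(n_features); shift_a[:10] = 2.0
shift_b = np.zeros(n_features); shift_b[5:15] = 2.0
treat_a = well_profile(rng.normal(size=(200, n_features)) + shift_a)
treat_b = well_profile(rng.normal(size=(200, n_features)) + shift_b)

# Normalize every profile against the control distribution (z-scores).
mu = np.mean(controls, axis=0)
sd = np.std(controls, axis=0) + 1e-8
z = lambda p: (p - mu) / sd

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Perturbations affecting shared features should yield similar profiles.
sim_ab = cosine(z(treat_a), z(treat_b))
sim_a_ctrl = cosine(z(treat_a), z(controls[0]))
print(f"A vs B: {sim_ab:.2f},  A vs control: {sim_a_ctrl:.2f}")
```

The "unexpected biological relationships" the abstract mentions come from exactly this kind of similarity structure, computed at scale over thousands of perturbations rather than two.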
Bio:
I am a postdoctoral associate in Beth Cimini’s lab at the Broad Institute of MIT and Harvard. My background is in cell and molecular biology, and I gained experience with advanced microscopy during my research on intermediate filaments in Robert Goldman’s lab at Northwestern University. That expertise led me to the fascinating field of biological image analysis. Currently, in the Cimini lab, I develop image analysis pipelines using CellProfiler, ImageJ, Cellpose, ilastik, and other image analysis tools to help our collaborators analyze their images. I also develop custom-trained deep learning models when pre-trained models are not enough.
Break
IMSI Auditorium Lounge
Suganya Sivagurunathan (Merkin Institute) Interactive Session (Tutorial)
Suganya Sivagurunathan Tutorial Video
Abstract and bio as above.
Lunch Break
Ida Noyes – 1st floor
Srijit Seal (Merkin Institute) Lecture
Video (to come)
Abstract:
Using Machine Learning in Predicting Drug Properties
This workshop will provide an in-depth exploration of human pharmacokinetics (PK) parameters crucial for evaluating drug toxicity, emphasizing the integration of cheminformatics and machine learning techniques. Participants will gain a thorough understanding of fundamental PK parameters such as volume of distribution (VDss), clearance (CL), half-life (t½), fraction unbound in plasma (fu), and mean residence time (MRT). The course introduces chemical structural fingerprints, physicochemical properties, and cell morphology data, including Cell Painting data, essential for cheminformatics prediction tasks. Key learning objectives include acquiring hands-on experience with machine learning models like Random Forests to predict compound properties and validating these models using cross-validation and evaluation metrics such as balanced accuracy and AUC-ROC. Attendees will also explore the real-world application of predicted PK parameters and cell morphology data in liver toxicity evaluation. Participants will engage in a hands-on workshop using Google Colab Notebooks to predict liver toxicity, focusing on data standardization, feature selection, model implementation, validation, and result interpretation.
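The modeling-and-validation loop described above can be sketched with scikit-learn (a minimal illustration with random bit vectors standing in for real fingerprints and toxicity labels, not the workshop's Colab notebooks): a Random Forest scored by cross-validated balanced accuracy and AUC-ROC.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

# Stand-ins for 1024-bit structural fingerprints and binary toxicity
# labels; a handful of "informative" bits make the labels learnable.
X = rng.integers(0, 2, size=(300, 1024)).astype(float)
y = (X[:, :20].sum(axis=1) + rng.normal(scale=1.0, size=300) > 10).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Cross-validated metrics, as in the workshop's learning objectives.
bal_acc = cross_val_score(clf, X, y, cv=cv, scoring="balanced_accuracy")
auc = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
print(f"balanced accuracy: {bal_acc.mean():.2f} +/- {bal_acc.std():.2f}")
print(f"AUC-ROC:           {auc.mean():.2f} +/- {auc.std():.2f}")
```

With real PK or toxicity data the same skeleton applies, but the data standardization and feature selection steps the abstract lists (and scaffold-aware splits rather than random folds) matter far more than the model choice.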
Bio:
Srijit Seal is a postdoc at the Broad Institute of MIT and Harvard and previously completed his PhD at the University of Cambridge. He specializes in machine learning and computational biology. His work focuses on developing machine learning algorithms for drug discovery, particularly toxicity prediction. He actively engages in academic outreach, promoting the understanding of artificial intelligence and delivering seminars on its applications in drug discovery.
Break
Ida Noyes – 1st floor
Srijit Seal (Merkin Institute) Interactive Session (Tutorial)
Abstract and bio as above.
Poster Presentation Session 1
Ida Noyes – 1st floor
Wednesday, July 17, 2024
Check-In
IMSI Auditorium Atrium
Muratahan Aykol (Google DeepMind) Lecture
Abstract:
Traditional methods for advancing materials science are costly and rely heavily on human intuition. In the first part of this lecture, I will showcase how machine learning algorithms can be used in a closed-loop setting to accelerate materials research by recasting research problems as optimization tasks. With several examples, we will discuss how experiments or computations could be driven by such algorithms without much human input. In the second part of the lecture, I will discuss applications of such methods in greater detail for specific research topics of materials discovery and predictive synthesis.
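The closed-loop idea can be sketched in a few lines (a toy illustration of my own, with a made-up objective, not DeepMind code): a surrogate model is refit after every "experiment" and proposes the next candidate, recasting discovery as sequential optimization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical search space: 200 candidate "materials", each described by
# a 5-d descriptor; the (unknown) property is a quadratic of the descriptors.
X = rng.uniform(-1, 1, size=(200, 5))
w_true = np.array([1.0, -2.0, 0.5, 1.5, -1.0])

def run_experiment(i):
    """Costly measurement, simulated with a little noise."""
    return X[i] @ w_true - 0.5 * (X[i] @ X[i]) + rng.normal(scale=0.01)

# Closed loop: start from a few random measurements, then let a simple
# linear surrogate (least squares on descriptors) pick what to try next.
tried = list(rng.choice(200, size=5, replace=False))
ys = [run_experiment(i) for i in tried]

for _ in range(20):
    A = np.c_[X[tried], np.ones(len(tried))]     # design matrix with bias
    coef, *_ = np.linalg.lstsq(A, np.array(ys), rcond=None)
    pred = np.c_[X, np.ones(200)] @ coef
    pred[tried] = -np.inf                        # don't repeat experiments
    nxt = int(np.argmax(pred))                   # greedy acquisition
    tried.append(nxt)
    ys.append(run_experiment(nxt))

best_found = max(ys)
print(f"best property found after {len(ys)} experiments: {best_found:.3f}")
```

Real systems replace the linear surrogate with uncertainty-aware models (Gaussian processes, ensembles) and the greedy pick with acquisition functions that balance exploration and exploitation, but the loop structure is the same.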
Bio:
Dr. Aykol is a researcher at Google DeepMind working at the intersection of materials science and machine learning. Prior to his current role, he led a Battery Data Science team at Rivian and spent many years at Toyota Research Institute as a research scientist and a founding member of the materials division. Murat received his PhD in Computational Materials Science from Northwestern University under the supervision of Prof. Chris Wolverton, and was a postdoctoral fellow at Berkeley Lab on the Materials Project team led by Prof. Kristin Persson. His current research interests include AI for materials discovery, thermochemistry, and phase diagram calculations, with application areas ranging from finding better energy materials to enabling predictive synthesis of new compounds. He has authored close to 100 papers and patents in these areas.
Break
IMSI Auditorium Lounge
Muratahan Aykol (Google DeepMind) Lecture
Abstract and bio as above.
Lunch Break
Ida Noyes – 3rd floor
Daniel Schwalbe-Koda (UCLA) Lecture
Daniel Schwalbe-Koda Lecture Video
Abstract:
Artificial intelligence (AI) methods have made substantial impacts in multiple areas of science, providing high-accuracy models that accelerate predictions and gather insights from complex data. Particularly in the last decade, deep learning has emerged as a successful way to create functions between arbitrary inputs and outputs with the use of neural networks (NNs). Despite their immense practical utility, NNs typically suffer from the “Clever Hans” effect and make predictions based on spurious correlations in the data. In addition, some NN models lack robustness to perturbations and extrapolation conditions. In this lecture, I will describe recent advances in uncertainty quantification (UQ), model robustness, and explainability for scientific machine learning (SciML). Starting with traditional UQ methods, I will review trade-offs, applications, and implementations of these methods in a range of model architectures. I will also describe how NNs are susceptible to adversarial attacks and brittle under extrapolation, which is particularly problematic for scientific AI. Finally, I will describe how interpretability methods can help improve models and the discovery of new principles in SciML. Examples and demonstrations across fields will be provided, with a focus on the physical and chemical sciences.
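The adversarial fragility mentioned above is easy to demonstrate on a model simple enough to attack analytically. A minimal sketch (my own illustration with made-up weights, not the lecture's material): the fast gradient sign method (FGSM) against a logistic-regression scorer, where the loss gradient with respect to the input is proportional to the weight vector.

```python
import numpy as np

# A stand-in "trained model": logistic regression with fixed, made-up
# weights. p(class 1 | x) = sigmoid(w @ x).
w = np.linspace(-1.0, 1.0, 20)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# An input the model classifies as class 1 with high confidence.
x = 0.3 * w
p_clean = sigmoid(w @ x)

# FGSM: for this model the gradient of the loss w.r.t. x is proportional
# to w, so the strongest per-feature perturbation of size eps that lowers
# the class-1 score is -eps * sign(w).
eps = 0.5
x_adv = x - eps * np.sign(w)
p_adv = sigmoid(w @ x_adv)

print(f"clean confidence: {p_clean:.3f}, adversarial: {p_adv:.3f}")
```

For deep networks the gradient is obtained by backpropagation instead of read off the weights, but the mechanism, and the reason spurious directions in input space are dangerous for scientific models, is the same.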
Bio:
Daniel Schwalbe-Koda is an Assistant Professor of Materials Science and Engineering at UCLA. His research interests include computational materials design, digital synthesis models, high-throughput simulations, and scientific machine learning. Before joining UCLA, Daniel was a Lawrence Fellow in the Quantum Simulations Group at the Lawrence Livermore National Laboratory (LLNL). Daniel obtained a Ph.D. in Materials Science and Engineering from the Massachusetts Institute of Technology (MIT) in 2022.
Break
IMSI Auditorium Lounge
Daniel Schwalbe-Koda (UCLA) Interactive Session (Tutorial)
Daniel Schwalbe-Koda Tutorial Video
Abstract and bio as above.
AI + Career Panel Discussion
IMSI Auditorium 142
Thursday, July 18, 2024
Check-In
IMSI Auditorium Atrium
Pedram Hassanzadeh (UChicago) Lecture
Pedram Hassanzadeh – Lecture Part 1 Video
Abstract:
AI and the 2nd Revolution in Weather and Climate Prediction
In recent years, there has been substantial interest in using machine learning (ML), especially deep neural networks (NNs), to improve the modeling and prediction of complex, multiscale, nonlinear dynamical systems such as turbulent flows and Earth’s climate. Remarkable success in performing 1 to 10-day weather prediction with NN-based models has been demonstrated in the past two years, referred to as the 2nd revolution in weather forecasting. There are also efforts focused on revolutionizing long-term climate predictions, particularly for predicting climate change, using AI-based parameterizations and AI climate emulators. I will highlight some of the achievements and promising results, as well as some of the outstanding questions and current shortcomings of AI-based weather and climate models. Prominent examples include the challenges these models face in dealing with very rare events, learning multi-scale chaotic dynamics, and addressing non-stationarity (e.g., a changing climate). I will discuss ideas around integrating ML theory, climate/nonlinear physics, and computational and applied math techniques to make progress toward addressing these challenges.
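The difficulty of learning chaotic dynamics mentioned above can be illustrated with a toy emulator (my own sketch, unrelated to the operational models in the lecture): fit a one-step linear surrogate to a Lorenz-63 trajectory, then roll it out autoregressively and watch the error grow.

```python
import numpy as np

def lorenz_rhs(s, sigma=10.0, rho=28.0, beta=8.0/3.0):
    x, y, z = s
    return np.array([sigma*(y - x), x*(rho - z) - y, x*y - beta*z])

def step(s, dt=0.01):
    """One RK4 step of the Lorenz-63 'truth' model."""
    k1 = lorenz_rhs(s)
    k2 = lorenz_rhs(s + 0.5*dt*k1)
    k3 = lorenz_rhs(s + 0.5*dt*k2)
    k4 = lorenz_rhs(s + dt*k3)
    return s + dt/6.0*(k1 + 2*k2 + 2*k3 + k4)

# Generate a trajectory and fit a one-step *linear* emulator to it,
# a deliberately weak stand-in for the neural emulators in the lecture.
s = np.array([1.0, 1.0, 1.0])
traj = [s]
for _ in range(5000):
    s = step(s)
    traj.append(s)
traj = np.array(traj)

A_in = np.c_[traj[:-1], np.ones(len(traj) - 1)]
M, *_ = np.linalg.lstsq(A_in, traj[1:], rcond=None)

# Autoregressive rollout: feed the emulator its own output.
s_true = traj[2500].copy()
s_emu = traj[2500].copy()
errors = []
for _ in range(100):
    s_true = step(s_true)
    s_emu = np.append(s_emu, 1.0) @ M
    errors.append(np.linalg.norm(s_true - s_emu))

# Short-term error is small, but it compounds quickly along the rollout.
print(f"error after 1 step: {errors[0]:.3f}, after 100 steps: {errors[-1]:.3f}")
```

NN-based weather models face the same compounding-error problem at vastly higher dimension, which is one reason rollout stability and very rare events remain open challenges.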
Bio:
Pedram Hassanzadeh leads the University of Chicago’s Climate Extreme Theory and Data Group, is an Associate Professor in the Department of Geophysical Sciences and the Committee on Computational and Applied Mathematics, and is affiliate faculty at the Data Science Institute. He received his MA (in applied math) and PhD (working on geophysical turbulence) from UC Berkeley in 2013. He was a Ziff Environmental Fellow at Harvard University before joining Rice University in 2016 and moving to the University of Chicago in 2024. His research is at the intersection of climate change, scientific machine learning, computational and applied math, extreme weather, and turbulence physics. He has received an NSF CAREER Award, an ONR Young Investigator Award, and an Early Career Fellowship from the National Academies Gulf Research Program.
Break
IMSI Auditorium Lounge
Pedram Hassanzadeh (UChicago) Lecture
Pedram Hassanzadeh – Lecture Part 2 Video
Abstract and bio as above.
Lunch Break
Ida Noyes – 1st floor
Chihway Chang (UChicago) Lecture
Chihway Chang Lecture Part 1 Video
Abstract:
This lecture will focus on observational cosmology and how different AI/ML tools have been incorporated into our daily work. In the first hour, we will give some background on the big science questions that are relevant in cosmology and how the field has made progress over the past 20 years using observational data. I will introduce a variety of AI/ML applications that have been used going back to the early days of machine learning. In the second hour, I will talk about a new wave of AI/ML applications to cosmology problems that has emerged in the past couple of years thanks to technical advances, and how it has affected the field. I will then circle back to the theme of this summer school and discuss some practical things we can do to have science and AI truly advance each other.
This lecture will not have a hands-on tutorial component, but I will provide some resources for those who may be interested in looking into the tools introduced in the lecture. We will also have small-group discussions throughout the lecture to help people flesh out and exchange ideas.
Bio:
Chihway Chang is an Assistant Professor in the Department of Astronomy and Astrophysics (A&A) at the University of Chicago and a Senior Member of the Kavli Institute for Cosmological Physics. She received a B.Sc. in Physics from National Taiwan University (2007) and a PhD in Physics from Stanford University (2013). She went on to be a postdoctoral fellow at ETH Zurich (2013–2016) and then a KICP Fellow at the University of Chicago (2016–2018), becoming an assistant professor there in 2018. She received the DOE Early Career Award in 2021 and was named a Scialog Fellow in 2024 for early LSST science.
Break
IMSI Auditorium Lounge
Chihway Chang (UChicago) Lecture
Chihway Chang Lecture Part 2 Video
Abstract and bio as above.
Poster Presentation Session 2
John Crerar Library atrium
Picnic
Friday, July 19, 2024
Check-In
Tess Smidt (MIT) Lecture
Tess Smidt Partial Lecture Video
Abstract:
A Primer on 3D Euclidean Neural Networks
Representing the locations and orientations of objects in 3D space typically involves using coordinates and coordinate systems. This poses a challenge for machine learning due to the sensitivity of coordinates to 3D rotations, translations, and inversions (the symmetries of 3D Euclidean space). Euclidean symmetry-equivariant Neural Networks (E(3)NNs) are specifically designed to address this issue. They faithfully capture the symmetries of physical systems, handle 3D geometry, and operate on the scalar, vector, and tensor fields that characterize these systems.
E(3)NNs have demonstrated state-of-the-art performance on diverse atomistic benchmarks, such as small molecule properties, protein-ligand binding, and force prediction for heterogeneous catalysis. These networks combine neural network operations with insights from group representation theory. Their success stems from a rigorous foundation, making them more robust, data-efficient, and capable of generalization compared to invariant or non-equivariant neural networks.
In this talk, I will give an overview of the theory behind E(3)NNs and how they are used for diverse tasks in the physical sciences. I'll also show how the building blocks of E(3)NNs generalize to other symmetry-equivariant architectures.
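The distinction the abstract draws between equivariant and non-equivariant layers can be illustrated with a toy example (a sketch in NumPy, not the E(3)NN machinery itself): a layer that rescales a vector by a function of its rotation-invariant norm commutes with rotations, while a generic dense layer does not.

```python
import numpy as np

rng = np.random.default_rng(0)

def rotation_z(theta):
    """Rotation matrix about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

# Two toy layers acting on a single 3D vector feature v:
W = rng.normal(size=(3, 3))                       # generic dense layer (not equivariant)
dense = lambda v: W @ v

# An equivariant layer: scale v by a function of its invariant norm (a "gate").
equi = lambda v: np.tanh(np.linalg.norm(v)) * v

v = rng.normal(size=3)
R = rotation_z(0.7)

# The equivariant layer commutes with rotation: f(R v) == R f(v).
assert np.allclose(equi(R @ v), R @ equi(v))

# A generic dense layer does not: W R v != R W v in general.
assert not np.allclose(dense(R @ v), R @ dense(v))
```

E(3)NNs build all of their operations (tensor products, gated nonlinearities, convolutions) so that this commutation property holds by construction for scalars, vectors, and higher-order tensors alike.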
Bio:
Tess Smidt is an Assistant Professor of Electrical Engineering and Computer Science at MIT. Tess earned her SB in Physics from MIT in 2012 and her PhD in Physics from the University of California, Berkeley in 2018. Her research focuses on machine learning that incorporates physical and geometric constraints, with applications to materials design. Prior to joining the MIT EECS faculty, she was the 2018 Alvarez Postdoctoral Fellow in Computing Sciences at Lawrence Berkeley National Laboratory and a Software Engineering Intern on the Google Accelerated Sciences team where she developed Euclidean symmetry equivariant neural networks which naturally handle 3D geometry and geometric tensor data.
Break
IMSI Auditorium Lounge
Mit Kotak/Tuong Phung (MIT) Interactive Session (Tutorial)
Mit Kotak/Tuong Phung Tutorial Video
Abstract:
Symphony: Symmetry-Equivariant Point-Centered Spherical Harmonics for 3D Molecule Generation
This tutorial will provide an introduction to E(3)-equivariant neural networks for the task of autoregressive 3D molecule generation. First, we describe a simple approach to 3D molecule generation and show why E(3)-equivariance is a useful property for a model to have in this setting. Next, we introduce NequIP, an E(3)-equivariant message-passing network, and explain how the 3D geometry of molecular fragments can be processed efficiently.
Then we introduce spherical harmonics and show how they can be used to parametrize angular distributions efficiently. Finally, we combine all of these ingredients to create Symphony, our new autoregressive model, and demonstrate its predictive capabilities on QM9, a dataset of small organic molecules. This tutorial will also introduce participants to JAX, a differentiable programming framework in Python. We end with a discussion of interesting research directions in this space.
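The idea of parametrizing an angular distribution with spherical harmonics can be sketched in a few lines of NumPy. This is a simplified illustration, not Symphony's actual parameterization: the coefficients below are hypothetical stand-ins for values a network might predict, and only the l = 0 and l = 1 real spherical harmonics are used.

```python
import numpy as np

def sph_harm_l01(n):
    """Real spherical harmonics for l = 0 and l = 1 on unit vectors n, shape (..., 3)."""
    x, y, z = n[..., 0], n[..., 1], n[..., 2]
    y00 = 0.5 * np.sqrt(1.0 / np.pi) * np.ones_like(x)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    # Order: Y_00, Y_1,-1 (~y), Y_1,0 (~z), Y_1,1 (~x)
    return np.stack([y00, c1 * y, c1 * z, c1 * x], axis=-1)

def fibonacci_sphere(m=512):
    """Roughly uniform grid of m directions on the unit sphere (Fibonacci lattice)."""
    i = np.arange(m)
    phi = np.pi * (3.0 - np.sqrt(5.0)) * i
    z = 1.0 - 2.0 * (i + 0.5) / m
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=-1)

# Hypothetical coefficients a model might output for one focus atom;
# the large Y_1,0 weight biases the distribution toward the +z direction.
coeffs = np.array([0.2, 0.0, 1.5, 0.0])

grid = fibonacci_sphere()
logits = sph_harm_l01(grid) @ coeffs   # band-limited signal f(n) on the sphere
p = np.exp(logits)
p /= p.sum()                           # discrete angular distribution over the grid

# The most probable direction points toward +z, as the Y_1,0 weight dictates.
assert grid[np.argmax(p), 2] > 0.9
```

Because the spherical-harmonic coefficients transform predictably under rotation, a distribution parametrized this way rotates along with the molecular fragment, which is exactly the equivariance property the tutorial motivates.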
Bio:
Mit Kotak is an SM student in MIT CSE advised by Prof. Tess Smidt and Prof. Saman Amarasinghe. Mit earned his BS in Engineering Physics from University of Illinois Urbana-Champaign in 2023. His research focuses on machine learning systems for AI for Science applications.
Tuong Phung is an MEng student in Computer Science at the Massachusetts Institute of Technology (MIT) and a graduate research intern at Harvard University. His research focuses on utilizing geometric deep learning to investigate complex material phenomena and guide the design of novel materials. He was previously named an Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP) Fellow and has conducted research at both the NASA Goddard Space Flight Center and the NASA Marshall Space Flight Center. He holds a Bachelor of Science in Artificial Intelligence and Decision Making from MIT, where he was an Analog Devices Undergraduate Research and Innovation Scholar.
Ameya Daigavane is a PhD student in MIT EECS advised by Prof. Tess Smidt. Ameya earned his B.Tech in Computer Science and Engineering from IIT Guwahati in 2020. His research focuses on the theory and applications of equivariant neural networks for generative modeling.
Song Kim is an MEng student in MIT EECS advised by Prof. Tess Smidt. Song earned her SB in EECS from MIT in 2024. Her research focuses on generative models for small molecules and materials.