DSI celebrates the best of 2023-2024’s data science clinic projects and graduating data science undergraduate students
This spring, the Data Science Institute celebrated the accomplishments of undergraduate students and mentors alike. On a rainy May afternoon, five data science clinic teams presented their work before retiring to a reception celebrating graduating data science majors.
This year represents the incredible growth within the data science degree program for undergraduates. In 2022, there were only two data science majors. This year, we have forty-six graduating data science majors and are projected to have nearly eighty graduating students next year.
“This past year has truly reflected the effort and energy of our data science graduates and undergraduates,” said Data Science Clinic Director and Associate Senior Instructional Professor Nick Ross. “With 46 projects involving over 200 students, faculty, and teaching assistants, the growth in interest and dedication has been palpable.”
Five student groups were selected to present at the May celebration and earned the “2023-2024 Award for Excellence in Data Science.”
Project Title: AI Diffusion models for simulating particle physics experiments
Students:
Douglas Williams – BA Data Science
Keegan Ballantyne – BA Data Science
Carina Kane – BA Data Science
Ajay Singh – BA Data Science and Computer Science
Oz Amram’s group at Fermilab works on simulating particle colliders; however, these simulations can be incredibly computationally expensive and slow. This clinic group worked on a way to compress the complex, high-dimensional data so that simulations could run faster while still making accurate predictions. The project extended across two academic quarters as the team evaluated different compression factors.
Project Title: Poultry Packaging Consolidation
Students:
Colin McLuckie – BA Data Science
Stella Chen – BA Data Science
Aiwen Xiao – MA Computational Social Science
Yijin Bao – MA Public Policy
The RAFI team, supported by the 11th Hour Project, focused on understanding the role of livestock auction houses in setting prices for smaller farmers and large conglomerate operations. They found that as the number of auction houses has decreased, so have prices for poultry. With medium and smaller auction houses closing their doors, large poultry conglomerates have been able to keep prices lower, which benefits these larger corporations over small farm operations.
Project Title: AI to recognize stormwater detention features
Students:
Spencer Ellis – MS Computer Science
Tamami Tamura – BA Data Science
Matthew Rubenstein – BA Data Science
Stormwater infrastructure, such as permeable pavement, preserved wetlands, and storm drains, is vital for mitigating potential flooding in cities. However, there is no standard inventory for understanding how much of an area is preserved for stormwater mitigation. This team of students created and tuned a machine learning model to predict which regions of land on a map are preserved wetlands and ponds. Next steps will be to identify less obvious types of stormwater infrastructure.
Project Title: Generative AI Models for housing images
Students:
Grace Wang – BS Computer Science
DB Christenson – BA Data Science
Leon Wang – BA Data Science
Olesia Khrapunova – MS Computer Science
Machine-learning-based computer vision relies on having a large data set of training images, ideally with balanced numbers of stimuli in each category. This group of students focused on how to generate synthetic images to create more balanced categories within a data set. Using pictures of building facades, the students trained an image generation model to produce synthetic images. They then tested these new images, mixed with existing stimuli, on a classifier and found that adding synthetic data to impoverished categories improved the classifier’s accuracy.
Project Title: Operational requirement management using graph-based knowledge networks
Students:
Jason Yu – BA Data Science
Maya Ghosal – BA Computer Science and Data Science
Vincent Chirio – BA Data Science
Andrew Ellis Brander – BA Data Science
Within a large organization like Argonne National Laboratory, there can be a confusing number of policies and written guidance, not all of which is updated regularly. The students on this project attempted to use artificial intelligence (AI) to sift through documentation and find the correct policies. They created a pipeline to pull important information and trained a large language model (LLM) to provide responses based on this documentation. They hope that further work on this LLM will lead to the creation of a tool for Argonne employees to easily find the policies and regulations relevant to their queries.
In addition to recognizing students’ work, Eric and Wendy Schmidt AI in Science Postdoctoral Fellow Peter Lu was selected as Mentor of the Year.
Reflecting on the year, Ross said, “Their collaborative efforts have spanned across industry, academia, non-profits, research labs, and mission-driven organizations, producing high-quality work that has set a new standard. This surge in interest reflects a broader trend: a generation of students passionate about leveraging data science to tackle real-world challenges and making a significant impact.”
Congratulations to Peter and all of the data science students for their hard work!