Data4All, a collaboration of the Data Science Institute, the Office of Civic Engagement, the Center for Spatial Data Science, and Argonne National Laboratory, engages high school students in high-level data and coding challenges. Our goal is to create a continuous, evolving computer science pathway for students, allowing them to smoothly transition from the basics of computing through high school to college and beyond.
Activities at the workshop include:
- Learning the Python programming language (no prior knowledge of Python necessary)
- Conducting inquiry-driven research in small groups
- Training on how to investigate data problems and present findings
- Developing critical collaboration, problem-solving, and communication skills
Data4All is open to high school sophomores, juniors, and seniors who have completed Algebra 1. The program lasts 8 weeks, with students meeting on Saturdays near the University of Chicago campus.
Collaborators from the Data Science Institute, the Office of Civic Engagement, Argonne National Laboratory, and the Center for Spatial Data Science have developed a bridge workshop to ease the transition from introductory computer science classes to data science research. The workshop will introduce students to the data science research lifecycle, teach the essential computational skills needed for data analysis and visualization, and provide training on how to communicate their findings. The workshop will focus on creating a continuous learning environment that extends from students’ structured classroom studies to more experimental, inquiry-driven research work in small groups. Read more about the March 2021 inaugural workshop here.
In most Chicago Public Schools (CPS) high schools, students’ exposure to computer science is limited to one computer science survey course, with little exposure to advanced topics such as artificial intelligence, data science, or the application of computer science to societal issues. This lack of opportunity is perpetuated when students seek internships and other employment experiences but lack the confidence in their own knowledge to see computer science, data science, or artificial intelligence as a possible career pathway for themselves. To address this need, a team of researchers and educators from the Data Science Institute, the Office of Civic Engagement, the Center for Spatial Data Science, and Argonne National Laboratory is developing a data science bridge workshop that helps students from Chicago’s South Side community develop a deeper understanding of data science and grow a tangible skill set grounded in scientific projects, real-world datasets, and professional tools. Through this program, students will explore the foundational concepts of computer science and data science, working with authentic and complex datasets and leveraging principles of AI to gain insights from data and make predictions.
WORKSHOP FOCUS & APPROACH
The workshop will be taught through case studies, each combining a real-world scientific challenge (e.g., COVID-19), an authentic dataset, and associated professional tools. The case studies will draw on data generated by scientific research projects at the three institutions. Using the Python language, students will explore data structures with an emphasis on multidimensional arrays, manipulating and visualizing them with libraries commonly used in scientific computing, such as NumPy, Pandas, and Matplotlib. The datasets will expose students to many of the challenges associated with scientific data and help them build the skills to perform statistical analysis and prediction while exploring real-world problems.
The workshop will model the collaborative nature of computer science by situating students in teams with the guidance and support of staff, including undergraduate and graduate mentors from the three institutions.
The workshop is supported by a grant awarded by UChicago’s Office of Research and National Laboratories Joint Task Force Initiative as well as by a grant from the Successful Pathways from School to Work initiative of the University of Chicago, funded by the Hymen Milgrom Supporting Organization.
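As a rough illustration of the kind of multidimensional-array exercise described above, the sketch below uses NumPy on a small, made-up table of daily counts. The numbers, region layout, and variable names are all hypothetical and are not taken from the workshop's materials or datasets.

```python
import numpy as np

# Hypothetical daily counts for 3 regions over 14 days (made-up numbers,
# not real case data). Rows are regions, columns are days.
daily_counts = np.array([
    [5, 7, 6, 9, 12, 10, 8, 11, 13, 15, 14, 18, 17, 20],
    [2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 9, 8, 10, 11],
    [9, 8, 8, 7, 7, 6, 6, 5, 5, 4, 4, 3, 3, 2],
])

n_regions, n_days = daily_counts.shape

# Reshape to (regions, weeks, 7 days) and sum over the last axis
# to get one total per region per week.
weekly_totals = daily_counts.reshape(n_regions, n_days // 7, 7).sum(axis=2)

# 7-day moving average per region, computed with a uniform kernel.
window = 7
kernel = np.ones(window) / window
moving_avg = np.array(
    [np.convolve(row, kernel, mode="valid") for row in daily_counts]
)

# Index of the region whose weekly total grew the most from week 1 to week 2.
change = weekly_totals[:, 1] - weekly_totals[:, 0]
fastest_growing = int(np.argmax(change))

print("Weekly totals per region:\n", weekly_totals)
print("Moving-average shape:", moving_avg.shape)
print("Region with the largest week-over-week increase:", fastest_growing)
```

Reshaping the two-dimensional array into three dimensions is one simple way to show how an extra axis can stand for a grouping such as weeks. A real dataset would also need handling for missing days and for lengths that are not a multiple of seven.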
Evelyn Campbell, Preceptor, Data Science Institute
John Domyancich, Learning Center Lead, Education Programs and Outreach, Argonne National Laboratory
Maria V. Fernandez (she/her), Research Program Manager, Data Science Institute
Bethany Frank, BioEnergy Curriculum Developer, Education Programs and Outreach, Argonne National Laboratory
Julia Koschinsky, Executive Director, the Center for Spatial Data Science
Molly Long (they/them), Research Team Coordinator, Data Science Institute
Ann Merrell, Associate Director, Collegiate Scholars Program
Abel Ochoa, Executive Director of College Readiness, UChicago Office of Civic Engagement
Shaz Rasul, Executive Director, Education Partnerships, UChicago Office of Civic Engagement
Tyler Skluzacek, currently Research Scientist, Oak Ridge National Lab; previously PhD Candidate, University of Chicago
David Uminsky (he/him), Executive Director, Data Science Institute; Senior Research Associate, Department of Computer Science
Evelyn is a preceptor in data science focusing on data science education as a joint instructor for both the University of Chicago and City Colleges of Chicago. She obtained her PhD in Microbiology from the University of Chicago in 2022 and her BS in Biology from Rider University in 2016. She enjoys reading, writing, and talking with friends and family.
John Domyancich is Education Programs and Outreach’s Learning Center Lead, where he leads a team of educators who focus on engaging middle through high school students in scientific inquiry, creating immersive experiences that highlight the work and mission of Argonne research. John plans and orchestrates Argonne’s summer camp and high school research programs, and he is also responsible for the Learning Lab field trips. His goal is to inspire and guide the next generation of Argonne scientists through STEM pathways.
John taught high school science for 11 years before joining Argonne in 2015. During his teaching career, he created student-centered classrooms and developed STEM curricula to integrate collaboration, technology and conceptual model development into the learning experience. To prepare himself for a career teaching science, he earned an M.A. in Secondary Education from Western Illinois University as well as a B.S. in Chemistry from the University of Iowa.
John is a member of the American Modeling Teachers Association and the Illinois Association of Chemistry Teachers. He is also active in the National Math and Science Initiative and the College Board. Outside work, he likes to run and spend time with his wife and three daughters.
Maria V. Fernandez is Research Program Manager for the Data Science Institute. Working closely with DSI leadership and faculty, Maria is responsible for the development, coordination, and execution of research initiatives at the DSI, including the Data & Democracy and AI+Science initiatives. She also manages the DSI’s Postdoctoral Scholars Program as well as the DSI’s student research and engagement initiatives, Summer Lab and Data4All.
Before joining the DSI in January 2023, Maria was a Project Manager and Metadata Specialist at the University of Chicago Library. She has worked for several years in higher education and research libraries at the University of Chicago, Brown University, the University of Texas at Austin, and Dartmouth College. She received an MS in Information Studies and an MA in Latin American Studies from the University of Texas at Austin and a BA in History from Dartmouth College.
Julia Koschinsky is the Executive Director of the Center for Spatial Data Science at the University of Chicago and has been part of the GeoDa team for over 16 years. She has been conducting and managing research funded through federal awards of over $8 million to gain insights from the spatial dimensions of urban challenges in housing, health, and the built environment.
Molly Long is the Research Team Coordinator at the Data Science Institute. Before moving to Chicago, they worked as a wildfire biologist with the U.S. Geological Survey in Boise, Idaho. Molly has experience in data science, classical violin performance, biological research, and college admissions. They earned a Bachelor of Arts in Biology and Music from Lawrence University in Appleton, Wisconsin.
Abel works with faculty and staff across the University to develop and enhance ongoing youth outreach and engagement initiatives. In this role, he helps to create new and expand existing college access and success models by engaging internal and external stakeholders and organizations that share the University’s interest in ensuring equitable outcomes for Chicago’s youth, especially students who reside or attend school in the mid-South Side. Abel also provides leadership to the Collegiate Scholars Program and Office of Special Programs-College Prep, existing college readiness programs that are an integral part of UChicago Promise and reinforce the institution’s commitment to broadening access to the University’s transformative education and outreach programs.
For the past 10 years, Abel has worked with youth and coached them and their families in the college preparation process. Before joining the University of Chicago, Abel worked for the Office of Undergraduate Admission at Northwestern University. Abel is a native of Chicago’s Pilsen neighborhood and holds a BA in Political Science and Spanish from the University of Illinois at Urbana-Champaign and an MA in Public Policy and Administration from Northwestern University. Abel is an alumnus of the Latino Policy Forum’s inaugural 2016-2017 Multicultural Leadership Academy and the University of Chicago Booth School of Business 2017 Executive Program for Emerging Leaders.
Bio: Tyler is a Ph.D. candidate in Computer Science at the University of Chicago, advised by Kyle Chard and Ian Foster. His research interests lie at the intersection of data management, data science, and HPC, focusing on enabling scientists to maximize the utility of massive amounts of data. His work has culminated in the design of the open-source system Xtract that can intelligently formulate metadata extraction workflows for data stored in heterogeneous file formats across leadership-scale computing facilities. Before joining the University of Chicago, he received his B.A. in Applied Mathematics and Statistics from Macalester College.
Talk Title: Enabling Data Utility Across the Sciences
Talk Abstract: Scientific data repositories are generally chaotic—files spanning heterogeneous domains, studies, and users are stuffed into an increasingly-unsearchable data swamp without regard for organization, discoverability, or usability. Files that could contribute to scientists’ future research may be spread across storage facilities and submerged beneath petabytes of other files, rendering manual annotation and navigation virtually impossible. To remedy this lack of navigability, scientists require a rich search index of metadata, or data about data, extracted from individual files. In this talk, we will explore automated metadata extraction workflows for converting dark data swamps into navigable data collections, given no prior knowledge regarding each file’s schema or provenance. I enable such extraction from files of vastly different structures by building a robust suite of “extractors” that leverage data scientific methods (e.g., keyword analysis, entity recognition, and file type identification) in order to maximize our body of knowledge about a diversity of files.
In this talk, I outline the construction, optimization, and evaluation of Xtract—a scalable metadata extraction system—that automatically constructs extraction plans for files distributed across remote cyberinfrastructure. I illustrate the scale challenges in processing these data, and outline techniques to maximize extraction throughput, by analyzing Xtract’s performance on three real science data sets.
David Uminsky joined the University of Chicago in September 2020 as a senior research associate and Executive Director of Data Science. He was previously an associate professor of Mathematics and Executive Director of the Data Institute at the University of San Francisco (USF). His research interests are in machine learning, signal processing, pattern formation, and dynamical systems. David is an associate editor of the Harvard Data Science Review. He was selected in 2015 by the National Academy of Sciences as a Kavli Frontiers of Science Fellow. He is also the founding Director of the BS in Data Science at USF and served as Director of the MS in Data Science program from 2014 to 2019. During the summer of 2018, David served as the Director of Research for the Mathematical Sciences Research Institute Undergraduate Program on the topic of Mathematical Data Science.
Before joining USF, he was a combined NSF and UC President’s Fellow at UCLA, where he was awarded the Chancellor’s Award for outstanding postdoctoral research. He holds a Ph.D. in Mathematics from Boston University and a BS in Mathematics from Harvey Mudd College.