Educational Materials
The Data4All Curriculum
The Data4All curriculum was developed through a collaboration with the Data Science Institute, Argonne National Laboratory, and the Center for Spatial Data Science. The program uses a unique instructional approach to engage students interactively and facilitate project-based learning through public health case studies. Throughout the program, the Data Science Reasoning Framework (DSRF) is repeatedly referenced to remind students that problem solving is an iterative process that requires asking testable questions, gathering evidence with data, refining hypotheses and proposed explanations, and articulating claims with evidence-based reasoning. A depiction of the DSRF is shown below.
Instructional Approach
Spark Activities
Spark activities are short, low-risk activities designed to introduce students to the key concepts of the main lesson in an intuitive and playful way. The goals of the spark activities are to 1) help transition students from outside activities to the day’s lesson, 2) elicit prior knowledge, 3) serve as a formative assessment to help guide the instruction of the main lesson, 4) build an intuitive understanding of the technical concepts covered in class, and 5) provide a fun activity that helps build a sense of classroom community. Spark activities vary in form and in how they engage students.
Google Colab and Jupyter/Colab Notebooks
Students work through the notebooks largely independently, developing their computational thinking, coding, and data science reasoning skills concurrently. Each notebook has a specific goal for students to work towards and gives them the opportunity to work directly with the data on a practical level. Students who have questions can work with their small group and their near-peer mentor.
The notebooks will use the “Use, Modify, Create” approach to coding. This means that the notebooks will often already have functioning code. Depending on the task, students simply need to run a cell of code (Use) or change (Modify) individual elements like variables and parameters of functions and view the output. As the workshop progresses and the students become more familiar with the code, students will be asked to write (Create) more of their code from scratch. This can include pulling code from other notebooks.
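The three stages can be illustrated with a minimal sketch of what a notebook cell sequence might look like. This is not taken from the Data4All notebooks: the data values and the function names (average_cases, cases_above, week_over_week_change) are hypothetical and invented purely for illustration.

```python
from statistics import mean

# Hypothetical weekly case counts, invented for this illustration only.
weekly_cases = [12, 15, 9, 20, 18]


# USE: the code is already written; the student just runs the cell.
def average_cases(counts):
    """Return the average of a list of weekly case counts."""
    return mean(counts)


use_result = average_cases(weekly_cases)


# MODIFY: the student changes a parameter (here, the threshold)
# and observes how the output changes.
def cases_above(counts, threshold=10):
    """Return only the counts that are strictly above the threshold."""
    return [c for c in counts if c > threshold]


modify_result = cases_above(weekly_cases, threshold=15)


# CREATE: later in the workshop, the student writes a function from scratch,
# possibly borrowing patterns from earlier notebooks.
def week_over_week_change(counts):
    """Return the change in cases from each week to the next."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


create_result = week_over_week_change(weekly_cases)

print(use_result, modify_result, create_result)
```

In a sketch like this, the Modify step only asks students to edit one argument, while the Create step expects them to define new logic themselves, which mirrors the gradual shift in responsibility described above.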
Interspersed throughout the notebooks are “Journal” prompts where students are asked to check understanding, reflect, and predict. The journal prompts are an essential part of the notebook experience, and students should be held accountable for recording their thoughts in these spaces. This is valuable from a learning perspective, and it enhances the notebook’s value to students beyond the workshop because they can refer back to their thinking.
Classroom Structure
Data4All was originally structured to facilitate project-based learning through group collaboration. The program consisted of 30 students divided into 6 groups. Each group had an assigned mentor who helped lead them through notebooks and activities. While this structure is not necessary to implement the curriculum, students have been shown to benefit from the help of near-peer mentors with a bit more experience in data science. Adjustments can be made depending on your classroom size. If this structure is not possible in your classroom environment, consider prioritizing group work for activities and discussions. The notebooks are structured such that students can work on them individually, in pairs, or in small groups.
Small Group Discussion
Small group discussions are an essential part of the curriculum, as they are often used to enhance student voice, let students wrestle with challenging concepts, and solidify understanding. They are also a forum where students feel more comfortable asking about coding bugs and gaps in understanding. Undergraduate students lead small group discussions and activities to facilitate collaboration and learning.
Large Group Discussion
Large group discussions are opportunities for the whole class to tie everything together. One example is the “town hall” capstone experience in the last week, where students articulate their evidence-based arguments to “public health decision makers”.
Lectures
Lectures are presentations to the whole class that introduce and explain the key concepts of a session, such as the difference between correlation and causation or between proxy variables (such as demographic characteristics) and explanatory mechanisms.
Guest Speakers
Guest speakers give students an overview of how the workshop content is related to college and career options. For example, in 2022, two PhD biologists walked students through innovative case studies of how they used data science to solve problems that save lives and improve health. Other speakers explained their own career trajectories to make the steps for choosing college majors and careers more transparent. They also gave an overview of where and how to access college and career information.
Curriculum Materials
All materials for the program can be found on our GitHub repository.