Abstract
Books shape how children learn about society and social norms, in part through the representation of different characters. To better understand the messages children encounter in books, we introduce new machine-led methods for systematically converting images into data. We apply these image tools, along with established text analysis methods, to measure the representation of race, gender, and age in children’s books commonly found in US schools and homes over the last century. We find that books selected to highlight people of color, or females of all races, have increasingly depicted characters with darker skin tones over time; “mainstream” books, by contrast, have consistently depicted chromatically ambiguous characters, with an increase in lighter skin tones in the last three decades. Children are consistently depicted with lighter skin than adults, despite no systematic differences in skin tones by age. Comparing images and text, we find that females are more represented in images than in text. There is a persistent disproportionate representation of males, particularly White males, and of lighter-skinned people relative to darker-skinned people. Our data provide a view into the “black box” of education through children’s books in US schools and homes, highlighting what has changed and what has endured.