Project: Pediatric Cancer Data Commons

Collecting, aggregating, harmonizing, and sharing data from children with cancer is essential to making new discoveries and developing new cures. Too often, data are siloed and disconnected, which drastically reduces the usefulness of these valuable resources. The Pediatric Cancer Data Commons (PCDC) at UChicago brings together researchers from around the world with the goal of building data dictionaries for all types of pediatric cancer. Consensus data models are balloted among international experts, including clinicians, ontologists/taxonomists, statisticians, and data scientists. The resulting dictionary is used to harmonize data from completed clinical trials and subsequently serves as a framework for collecting data in new studies. The data are made available to the worldwide research community through a public-facing cohort discovery tool. Data are further connected to other sources through common identifiers, allowing novel data sets to be developed for research and discovery.

Potential areas of research include: ontology development and data dictionary creation, data harmonization, automated methods of metadata extraction and data ingestion, development and deployment of novel data visualization tools and analytics, data governance and provenance methods and tools, novel methods of combining disparate data sets, and analytic methods for new modes of risk stratification. Experience with clinical data is preferred but not required.

Mentor: Samuel L. Volchenboum, Associate Professor of Pediatrics & Associate Chief Research Informatics Officer, UChicago Medicine

Samuel L. Volchenboum, MD, PhD, MS, is an expert in pediatric cancers and blood disorders. He has a special interest in treating children with neuroblastoma, a tumor of the sympathetic nervous system.

In addition to caring for patients, Dr. Volchenboum studies ways to harness computers to enable research and foster innovation using large data sets. He directs the development of the International Neuroblastoma Risk Group Database project, which connects international patient data with external information such as genomic data and tissue availability. The Center he runs provides computational support for the Biological Sciences Division at the University of Chicago, including high-performance computing, applications development, bioinformatics, and access to the clinical research data warehouse.