
The Capstone Project, the culmination of the MS in Applied Data Science program, unites students with industry partners to solve real-world analytics problems. The Summer Showcase featured projects spanning industries, data types, and methodological approaches—with one team standing out for special recognition.

“The quality of the work was truly impressive and the teams deserve our congratulations for their hard work, professionalism, and commitment to excellence,” said Greg Green, Director of the MS in Applied Data Science program. “Their inventiveness was especially on display in the solutions they developed for their industry partners. We’re looking forward to staying connected and hearing about the graduates’ next achievements.”

Learn more about the winning project below.

UChicago Medicine | Medical Language Processing – Patient-Based Radiology Question & Answering System
Presenters: Ananth Prayaga, Evelyn Wu, Nitin Gupta, Vivian Yeh
Faculty Advisor: Utku Pamuksuz

A key challenge in healthcare today centers around optimally using the enormous quantity of unstructured medical data available to healthcare professionals. Lacking a systematic and streamlined strategy, sorting through patient information demands significant time that could otherwise be spent directly with patients.

For their Capstone Project, Ananth Prayaga, Evelyn Wu, Nitin Gupta, and Vivian Yeh focused on the labor-intensive process of reading radiology reports. Using a closed-book question-answering (Q&A) approach and the latest AI technologies, they designed an intelligent knowledge receiver, which they named ContextualAI, that can respond accurately to queries without consulting external knowledge sources.

“A central area of development in the generative AI field involves using LLMs to ingest site- or institution-specific data that can then be queried,” the team says. “Hospitals, financial services companies, and law firms don’t want to share their internal data with OpenAI or other AI systems due to security and privacy reasons. Our task involved creating a framework using an LLM that could ingest and query local data without compromising privacy or security.”

Focused on optimizing workflow efficiency, increasing productivity, and alleviating physician burnout, the team tested existing large language models (LLMs) and vector databases across a variety of approaches as they developed their framework. Two separate requirements drove their search for a solution: one, ingesting the unstructured data of the partner institution and, two, teaching the LLM a new skill to work with that data.
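The ingest-then-query pattern the team describes can be sketched in a few lines. The toy bag-of-words "embedding," the `LocalStore` class, and the sample report sentences below are all illustrative assumptions, not the team's actual pipeline; a production system would use a neural encoder and a real vector database, with the retrieved passage then handed to an LLM to compose the answer.

```python
# Minimal sketch of the two-step pattern: (1) ingest local documents into an
# in-memory store, (2) answer a query by retrieving the most similar passage.
# All data stays in-process, so nothing is shared with an external AI service.
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a neural encoder.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class LocalStore:
    """Ingest documents once, then retrieve for any later query."""

    def __init__(self):
        self.docs = []  # list of (text, embedding) pairs

    def ingest(self, text: str) -> None:
        self.docs.append((text, embed(text)))

    def retrieve(self, query: str) -> str:
        # Return the stored document most similar to the query.
        qv = embed(query)
        return max(self.docs, key=lambda d: cosine(qv, d[1]))[0]


store = LocalStore()
store.ingest("Chest X-ray shows no acute cardiopulmonary abnormality.")
store.ingest("MRI of the knee reveals a partial ACL tear.")
print(store.retrieve("Is there an abnormality on the chest x-ray?"))
```

The second requirement — teaching the model to work with the retrieved passages — is what distinguishes this from plain search: the LLM composes an answer grounded in the passage rather than returning the passage itself.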

In the end, the team attained the best performance from two of Meta’s AI tools, Llama 2 and Faiss, which generated answers with roughly 70% accuracy, a figure the team validated using patient data and human evaluations.
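Faiss's role in such a pipeline is fast similarity search over dense embedding vectors. The exhaustive ("flat") L2 search that its simplest index performs can be illustrated in pure Python; this is only a toy illustration of the behavior, not the team's code, and real usage would call Faiss itself (e.g. its `IndexFlatL2` class) for speed at scale.

```python
# Toy illustration of exact (flat) L2 nearest-neighbor search, the behavior
# a vector database like Faiss provides. Vectors are small hand-made examples.

def l2_sq(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


class FlatIndex:
    """Exhaustive nearest-neighbor search over all stored vectors."""

    def __init__(self):
        self.vectors = []

    def add(self, vecs):
        self.vectors.extend(vecs)

    def search(self, query, k=1):
        # Return the indices of the k stored vectors closest to the query.
        order = sorted(range(len(self.vectors)),
                       key=lambda i: l2_sq(query, self.vectors[i]))
        return order[:k]


index = FlatIndex()
index.add([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
print(index.search([0.9, 1.1], k=2))  # nearest vector's index first
```

In the full system, the indices returned by the search map back to report passages, which are then passed to the language model as context for answering the question.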

“But we won’t stop there,” the team says. “This is a growing field and new models are being released every few weeks, so we will continue testing as new areas for improvement arise. Our client has already started to deploy our product and has expressed strong interest in further collaboration.”

The team also highlights the transformative power of their project and its potential for applications that extend beyond healthcare into other domains. It points, they say, to a future where intelligent answers are just a question away.

“This is just the beginning of what is possible and we’re really excited to see it developed further to reach its maximum potential.”