Skip to main content

The University of Chicago’s DSI hosted the Rising Stars in Data Science workshop on November 13-14, in collaboration with the University of California San Diego Halıcıoğlu Data Science Institute (HDSI). The workshop focuses on celebrating and fast-tracking the careers of exceptional data scientists at a critical inflection point in their career: the transition to postdoctoral scholar, research scientist, industry research position, or tenure track position.

This annual workshop began three years ago, and has since hosted 135 Rising Stars from 44 institutions. This fall was the fourth workshop, with 32 scholars in attendance across the two-day event.

Panels and Keynotes shed light on becoming a university faculty member

The Rising Stars workshop consists of career and research panels, networking and mentoring opportunities, and research talks from the Rising Stars. On November 13th, participants heard from Aloni Cohen, Rana Hanocka, Nikos Ignatiadis, and Haojian Jin (UC San Diego) as part of the panel, “The Data Science Job Search – Advice from New Faculty.” Alex Cloninger (UC San Diego) moderated the session, where participants asked questions about tailoring their materials for specific departments, the pros and cons of younger versus older departments, and how to negotiate once an offer is made.

The following day included three panels addressing different focuses. David Uminsky kicked off the first panel as moderator, with panelists Mike Franklin, Becca Willett, and Virginia De Sa (UC San Diego) revealing how search committees select candidates. The panelists stressed the importance of getting great letters and delivering an excellent job talk. They noted that successful candidates must be willing to engage with tough questions about their work and should have a clear research vision.

The second panel of the day, “How to Make the Most of Your First Year,” featured Mike Franklin moderating for Alex Kale, Molly Offer-Westort, Raul Castro Fernandez, Gal Mishne (UC San Diego), and Alex Cloninger. All the panelists spoke passionately on topics such as the top three priorities when starting a new faculty position, work-life balance, and tips for recruiting graduate students.

The third and final panel was moderated by Molly Offer-Westort and consisted of Rising Star alumni Maarten Sap, Haojian Jin, and Jonatas Marques. These researchers discussed their experiences being on the job market and gave advice on how to thrive in interdisciplinary fields.

Attendees also heard inspiring keynote talks from established, cutting-edge leaders in data science. Bo Li spoke about how to assess the trustworthiness of AI systems and Virginia De Sa discussed using machine learning to investigate visual and multi-sensory perception. Both presenters focused on the interdisciplinary nature of data science research, their experiences collaborating across research areas, and how machine learning can be leveraged as a tool across many domains.

 

 

Rising Stars shine during their Lightning Talks

The Rising Stars workshop provides a platform to showcase the exciting work of PhD students and postdocs. Lighting talks are rapid, 10-minute presentations presented by all participants on their research. These talks were grouped based on topic and scattered across the two day workshop. Topics included Natural Language Processing, Data-Driven Decision Making, Fairness and Interpretability, Health and Medicine, Data and Society, AI in Science and Technology, Machine Learning, and Optimization. All lightning talk abstracts can be found on the 2023 Rising Stars cohort page. Here’s a highlight of the depth and breadth of the research at RISING stars..

Sam Zhang, University of Colorado Boulder: Labor advantages drive the greater productivity of faculty at elite universities

“Faculty at prestigious institutions dominate scientific discourse, producing a disproportionate share of all research publications. Environmental prestige can drive such epistemic disparity, but the mechanisms by which it causes increased faculty productivity remain unknown. Here, we combine employment, publication, and federal survey data for 78,802 tenure-track faculty at 262 PhD-granting institutions in the American university system to show through multiple lines of evidence that the greater availability of funded graduate and postdoctoral labor at more prestigious institutions drives the environmental effect of prestige on productivity. In particular, greater environmental prestige leads to larger faculty-led research groups, which drive higher faculty productivity, primarily in disciplines with group collaboration norms. In contrast, productivity does not increase substantially with prestige for faculty publications without group members or for group members themselves. The disproportionate scientific productivity of elite researchers can be largely explained by their substantial labor advantage rather than inherent differences in talent.”

Emily Aiken, University of California, Berkeley: Targeting humanitarian aid with machine learning and digital data

“The vast majority of humanitarian aid and social protection programs globally are targeted, providing assistance to individuals or communities identified to be poorest or most in need. In low and middle-income countries, the targeting of aid programs is often limited by low-quality, out-of-date, or missing data on poverty and vulnerability. Novel “big” digital data sources, such as those captured by satellites, mobile phones, and financial services providers — when combined with advances in machine learning — can improve the accuracy of aid program targeting. In this talk, I will cover empirical results on the accuracy of these new data-driven and algorithmic approaches to aid allocation, and will discuss emergent implications for fairness, privacy, transparency, and community dynamics.”

Ying Jin, Stanford University: Model-free selective inference with conformal p-values and its application in drug discovery

“In decision-making or scientific discovery pipelines such as job hiring and drug discovery, before any resource-intensive step, there is often an initial screening step that uses predictions from a machine learning model to shortlist a few candidates from a large pool. We introduce a framework that allows using any prediction model to select candidates whose unobserved outcomes exceed user-specified values, while rigorously controlling the false positives. Given a set of calibration data that are exchangeable with the test sample, we leverage conformal inference ideas to construct p-values that allow us to shortlist candidates with exact false discovery rate (FDR) control. In addition, I will discuss new ideas to further deal with covariate shifts between calibration and test samples, a scenario that occurs in almost all such problems including drug discovery, hiring, causal inference, and healthcare. Our methods are flexible wrappers around any complex model. In practical drug discovery tasks, our methods greatly narrow the set of promising drug candidates to manageable sizes while maintaining rigorous FDR control.”

Congratulations again to all of our 2023 Rising Stars!

arrow-left-smallarrow-right-large-greyarrow-right-large-yellowarrow-right-largearrow-right-long-yellowarrow-right-smallfacet-arrow-down-whitefacet-arrow-downCheckedCheckedlink-outmag-glass