
Part of the 2024-25 DSI Distinguished Speaker Series and the Computer Science Distinguished Lecture Series.

The main goal of interpretability is to enable communication between humans and machines, whether what is being communicated is a value, knowledge, or an objective. In this talk, I argue that a better way to enable this communication is for humans to expand what they know and learn new things. Doing so also enables us to expand what machines know, by building better-aligned machines. I share why considering the representational gap is crucial to solving the alignment problem, and I provide an example of bridging the knowledge gap.

Bio: Been Kim is a senior staff research scientist at Google DeepMind. Her research focuses on helping humans communicate with complex machine learning models: 1) building tools to aid humans' collaboration with machines (and detecting when those tools fail), 2) studying machines' general nature, and 3) leveraging machines' knowledge to benefit humans. She gave a talk at the G20 meeting in Argentina in 2019 and keynotes at ICLR 2022 and ECML 2020. Her work on TCAV received the UNESCO Netexplo award and was featured at Google I/O '19, and her work is covered in a chapter of Brian Christian's book "The Alignment Problem". She was the General Chair of ICLR 2024, a Senior Program Chair of ICLR 2023, and serves on the advisory board of TRAILS. She has been a senior area chair at NeurIPS, ICML, ICLR, AISTATS, and other venues for the past few years, and is a steering committee member of the FAccT conference and SATML. She received her Ph.D. from MIT.

Agenda

Tuesday, November 19, 2024

2:00pm–3:00pm

Talk and Q&A
