Scientific Open Source Software with Fernando Pérez (UC-Berkeley, Project Jupyter)
Event Recap
In 2015, the LIGO experiment picked up the sound of two black holes merging 1.3 billion light years away from Earth. It was the first direct observation of gravitational waves, and one of the most important physics discoveries ever, receiving the Nobel Prize just two years later. The experiment cost more than $1 billion, but when the scientists communicated their results to the public, they used software that cost $0: open source Python tools including NumPy, SciPy, matplotlib, and Jupyter notebooks.
“This was a dream come true for us,” said Project Jupyter co-founder Fernando Pérez in his October 8th talk at the Center for Data and Computing. “Fifteen years earlier, we were arguing like crazies in the desert that this Python thing is real and we can do science with it. Now, the most important result of the last decade in physics was achieved with these tools.”
Jupyter is a language-agnostic interactive notebook that allows researchers to combine code, text, mathematics, and results into shareable documents. A descendant of the IPython notebook (created by Pérez in graduate school as “my PhD procrastination/depression-control mechanism”), Jupyter is now a core tool for scientists using computation across all fields of inquiry.
But the software behind Jupyter and other scientific Python tools is only part of the open source puzzle, Pérez said. The Jupyter team has also worked through the challenges of forming standards and protocols, the economic sustainability of a project with 1,500 volunteer contributors, and the traditional incentives of science, which have been slow to adapt to the rise of open source software.
“On any dimension that you look at, the creation and construction of these open, collaborative tools and ideas is at odds with the incentive mechanisms we have throughout the system. From academic hiring and promotion and publishing and tenure, to grant writing and awards,” Pérez said. “We managed to engineer a system of incentives that makes the right things, the hardest things to do.”
Watch Pérez’s full talk, including information on his geosciences project Pangeo and the UChicago connection to the birth of matplotlib, below.
Scientific Open Source Software: Meat and Bits But Not Papers. Is it Real Work?
Open source software is now the backbone of computation across the sciences and increasingly education. Yet the creation of scientific software is not well recognized as part of the enterprise of science in terms of training, career paths, intellectual recognition, organizational support, or funding. In this talk, I’ll explore the challenges of this contradictory situation, from the perspective of someone who has spent almost 20 years building open source software and communities. I have lived (often precariously) a dual life of “real academic” and of open source developer and advocate, working on IPython, Project Jupyter and the Scientific Python ecosystem since 2001.
I will provide an overview of Project Jupyter, including its intellectual core, the open source community context that surrounds it, and some of its impact. This will help frame the second part of the talk, where I’ll try to open a conversation on the social and organizational challenges of creating and sustaining open, collaborative communities in the structure of research and education. The scientific, technical and community dynamics of projects like Jupyter present interesting challenges in the context of traditional scientific incentives (funding, publishing, hiring and promotion, etc.). I’ll briefly outline some of these but will mostly focus on some ideas that I hope can move the conversation forward in productive ways.
Agenda
Tuesday, October 8, 2019
Check-In
Welcome & Introductions
Talk
Audience Q&A
