The Coolest Things I Learned at JupyterCon

I’m freshly back from JupyterCon in NY and still feeling the bubbly optimism that comes with bringing all you’ve learned at a conference back to your office. In that spirit, I wanted to share some of the coolest and most interesting things I learned with you all.

One quick note before we dive in: I was able to attend JupyterCon because of a very generous scholarship awarded jointly by JupyterCon and Capital One. I would not have been able to attend otherwise, and I’m very grateful to both for their commitment to diversity and inclusion in the tech community. A big thank you to both groups.

In no particular order, here are some of the most interesting things I learned at JupyterCon:

Jupyter Notebooks, Generally

  • You can add a table of contents to a notebook(!) using nbextensions. (h/t Catherine Ordun)

  • You can parameterize notebooks, create notebook templates for analysis, and schedule them to run automatically with Papermill. (h/t Matthew Seal)

  • There are a few cons to teaching and learning with Jupyter notebooks that are worth knowing and acknowledging. Joel Grus’s ‘I Don’t Like Notebooks.’ was a cautionary tale on the use of Jupyter notebooks for teaching, and while I don’t agree with all of his points, I do think it’s worth the time to go through his deck.


Notebooks in Production (!)

  • Netflix is going all-in on notebooks in production by migrating over 10k workflows to notebooks and using them as a way to bridge the chasm between technical and non-technical users. (h/t Michelle Ufford)

  • “Notebooks are, in essence, managed JSON documents with a simple interface to execute code within.” Netflix is putting notebooks into production by combining the JSON properties of notebooks with the open-source library Papermill. (h/t Matthew Seal)

  • On debugging: by running a notebook on a notebook server against the same image, you can fix issues without needing to mock the execution environment or code, allowing you to debug locally. (h/t Matthew Seal)

  • On testing: templated notebooks are easy to test with Papermill. Just have a scheduler run the notebook with parameters like the ones a user would supply to hydrate it, then look for errors. (h/t Matthew Seal)
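
To make the “managed JSON documents” idea concrete, here is a minimal, standard-library-only sketch of what hydrating a templated notebook can look like. It is not Papermill’s implementation. It only imitates the general convention of placing an override cell after the cell tagged `parameters`. The helper names (`make_code_cell`, `inject_parameters`, `run_code_cells`) and the toy `region`/`days` parameters are made up for illustration, and `exec` stands in for a real kernel.

```python
import copy
import json


def make_code_cell(source, tags=None):
    """Build a minimal nbformat-4-style code cell as a plain dict."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": list(tags or [])},
        "outputs": [],
        "source": source,
    }


def inject_parameters(notebook, params):
    """Return a copy of `notebook` with an override cell placed right after
    the cell tagged "parameters" (or at the top if no cell has that tag)."""
    hydrated = copy.deepcopy(notebook)
    lines = [f"{name} = {value!r}" for name, value in params.items()]
    injected = make_code_cell("\n".join(lines), tags=["injected-parameters"])
    cells = hydrated["cells"]
    insert_at = 0
    for index, cell in enumerate(cells):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            insert_at = index + 1
            break
    cells.insert(insert_at, injected)
    return hydrated


def run_code_cells(notebook):
    """Execute code cells in order in one namespace (a toy stand-in for a kernel)."""
    namespace = {}
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            exec(cell["source"], namespace)
    return namespace


# A hypothetical template: defaults live in the cell tagged "parameters".
template = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        make_code_cell("region = 'US'\ndays = 7", tags=["parameters"]),
        make_code_cell("report = f'{region}: last {days} days'"),
    ],
}

hydrated = inject_parameters(template, {"region": "EU", "days": 30})
result = run_code_cells(hydrated)

print(json.dumps(hydrated["cells"][1], indent=2))  # the injected override cell
print(result["report"])
```

A test in this style is just a loop over realistic parameter sets: hydrate, run, and fail if any cell raises. With the real library, my understanding is that `papermill.execute_notebook` plays the role of the last two helpers.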


Data Science in Jupyter Notebooks

  • One of my favorite new-to-me ideas is to build your own Kaggle-style board to make iterating and judging performance of internal models more fun (and provide incentive to track them better!). (h/t Catherine Ordun)

  • In graph/network analysis, you can connect the same pair of nodes with multiple edges (one per characteristic) by using a multigraph. (h/t Noemi Derzsy’s great tutorial on graph/network analysis, which I learned a lot from)

  • There is a ton of research out there around visualization, including on human perception, that can and should be leveraged for creating impactful data visualizations. I highly recommend Bruno Gonçalves’s slide deck as a tour de force of what we know about perception and how to apply it to data.

  • For a very cool use of Jupyter notebook widgets that lets you “see” the impact that missing data can have on an analysis (in this case, a linear regression), check out the interactive-plot notebook from Matthew Brems’s missing data repo, which also contains reading materials and a great slide deck.

  • I finally figured out how all of the parts of a matplotlib figure go together thanks to this nifty visualization from the matplotlib documentation. (h/t Bruno Gonçalves)
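
Since the multigraph idea from the list above is easier to see than to describe, here is a tiny dependency-free sketch of one. Libraries such as NetworkX ship a `MultiGraph` class for real work. The class below is my own simplified stand-in, and the sensor/gateway names and the `kind` attribute are made-up examples, not anything from the tutorial.

```python
from collections import defaultdict


class MultiGraph:
    """Tiny undirected multigraph: the same two nodes may share several edges."""

    def __init__(self):
        # frozenset({u, v}) -> list of attribute dicts, one per parallel edge
        self._edges = defaultdict(list)

    def add_edge(self, u, v, **attrs):
        """Add one more edge between u and v, keeping any existing ones."""
        self._edges[frozenset((u, v))].append(attrs)

    def edges_between(self, u, v):
        """Return the attribute dicts of every edge joining u and v."""
        return list(self._edges.get(frozenset((u, v)), []))

    def number_of_edges(self):
        """Count all edges, including parallel ones."""
        return sum(len(parallel) for parallel in self._edges.values())


graph = MultiGraph()
# Two different characteristics connect the same pair of nodes.
graph.add_edge("sensor_a", "gateway", kind="wifi")
graph.add_edge("sensor_a", "gateway", kind="bluetooth")
graph.add_edge("sensor_b", "gateway", kind="wifi")

kinds = sorted(edge["kind"] for edge in graph.edges_between("gateway", "sensor_a"))
print(kinds)
print(graph.number_of_edges())
```

In a simple graph the second `sensor_a`/`gateway` edge would overwrite or be merged with the first. Here both survive, so each characteristic can be analyzed on its own.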

… So yeah, I learned a lot of really cool things from some very talented people at JupyterCon this year. I’m excited to build new data products, apply network/graph analysis to IoT data, play with widgets, and maybe put a notebook or two into production.

If you’re doing cool things with notebooks at your company, I’d <3 to hear about them. Feel free to leave a comment here or ping me on Twitter.

[Title image credit: Jason Williams]
