The Coolest Things I Learned at JupyterCon

I’m freshly back from JupyterCon in NY and still feeling the bubbly optimism that comes with bringing all you’ve learned at a conference back to your office. In that spirit, I wanted to share some of the coolest and most interesting things I learned with you all.

One quick note before we dive in: I was able to attend JupyterCon because of a very generous scholarship awarded jointly by JupyterCon and Capital One. I would not have been able to attend otherwise and I’m very grateful to these two groups for their commitment to diversity and inclusion in the tech community, so a big thank you to both groups.

In no particular order, here are some of the most interesting things I learned at JupyterCon:

Jupyter Notebooks, Generally

  • You can add a table of contents to a notebook(!) using nbextensions. (h/t Catherine Ordun)
  • You can parameterize notebooks, create notebook templates for analysis, and schedule them to run automatically with Papermill. (h/t Matthew Seal)
  • There are a few cons to teaching and learning with Jupyter notebooks that are worth knowing and acknowledging. Joel Grus’s ‘I Don’t Like Notebooks.’ was a cautionary tale on the use of Jupyter notebooks for teaching, and while I don’t agree with all of his points, I do think it’s worth the time to go through his deck.
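
To make the Papermill bullet above a bit more concrete, here is a minimal sketch of parameterizing one template notebook. It assumes a notebook called analysis_template.ipynb with a cell tagged "parameters". The file names, the region and run_date parameters, and the plan_runs and execute_runs helpers are all made up for illustration. The only Papermill call used is papermill.execute_notebook.

```python
from pathlib import Path


def plan_runs(template, regions, run_date, out_dir="runs"):
    """Build one (input, output, parameters) triple per region.

    The template path, region names, and parameter names are hypothetical.
    """
    plans = []
    for region in regions:
        output = Path(out_dir) / f"{Path(template).stem}_{region}_{run_date}.ipynb"
        params = {"region": region, "run_date": run_date}
        plans.append((str(template), str(output), params))
    return plans


def execute_runs(plans):
    """Execute each planned run with Papermill.

    Papermill is imported lazily so the planning step works without it.
    """
    import papermill as pm

    for input_path, output_path, params in plans:
        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
        # Papermill injects `params` in a new cell after the one tagged
        # "parameters" and saves the executed copy to output_path.
        pm.execute_notebook(input_path, output_path, parameters=params)


plans = plan_runs("analysis_template.ipynb", ["east", "west"], "2018-08-27")
for _, output_path, params in plans:
    print(output_path, params)
```

Calling execute_runs(plans) from cron or another scheduler would then cover the "run automatically" part. Each run leaves behind its own executed notebook.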

Notebooks in Production (!)

  • Netflix is going all-in on notebooks in production by migrating over 10k workflows to notebooks and using them as a way to bridge the chasm between technical and non-technical users. (h/t Michelle Ufford)
  • “Notebooks are, in essence, managed JSON documents with a simple interface to execute code within.” Netflix is putting notebooks into production by combining the JSON properties of notebooks with the open-source library Papermill. (h/t Matthew Seal)
  • On debugging: by running a notebook on a notebook server against the same image, you can fix issues without needing to mock the execution environment or code, allowing you to debug locally. (h/t Matthew Seal)
  • On testing: templated notebooks are easy to test with Papermill. Have a scheduler run them with the kinds of parameters a user would supply, so that each notebook is hydrated and executed, and then look for errors. (h/t Matthew Seal)
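
The “managed JSON documents” idea is easy to see for yourself. Below is a minimal sketch using only the standard library. The notebook is hand-written in the nbformat 4 layout, and its contents, along with the tagged_cells helper, are invented for illustration. The "parameters" tag follows the Papermill convention.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 layout.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Daily report"]},
        {
            "cell_type": "code",
            "metadata": {"tags": ["parameters"]},
            "execution_count": None,
            "outputs": [],
            "source": ["region = 'east'"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print(region)"],
        },
    ],
}


def tagged_cells(nb, tag):
    """Return the indices of cells whose metadata carries the given tag."""
    return [
        i
        for i, cell in enumerate(nb["cells"])
        if tag in cell["metadata"].get("tags", [])
    ]


# Round-trip through JSON text, as happens when an .ipynb file is saved.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

code_cells = [c for c in loaded["cells"] if c["cell_type"] == "code"]
print(len(code_cells), tagged_cells(loaded, "parameters"))  # 2 [1]
```

Because the whole notebook is plain data, a tool can find the tagged cell, add new values next to it, and store the executed result as just another JSON file.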


Data Science in Jupyter Notebooks

  • One of my favorite new-to-me ideas is to build your own Kaggle-style board to make iterating and judging performance of internal models more fun (and provide incentive to track them better!). (h/t Catherine Ordun)
  • In graph/network analysis, you can connect nodes using multiple edges (characteristics) using a multigraph. (h/t Noemi Derzsy’s great tutorial on graph/network analysis, which I learned a lot from)
  • There is a ton of research out there around visualization, including on human perception, that can and should be leveraged for creating impactful data visualizations. I highly recommend Bruno Gonçalves’s slide deck as a tour de force of what we know about perception and how to apply it to data.
  • For a very cool use of Jupyter notebook widgets that lets you “see” the impact that missing data can have on an analysis (in this case, a linear regression), check out the interactive-plot notebook from Matthew Brems’s missing data repo, which also contains reading materials and a great slide deck.
  • I finally figured out how all of the parts of a matplotlib figure go together thanks to this nifty visualization from the matplotlib documentation. (h/t Bruno Gonçalves)
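
Since the multigraph idea was new to me, here is a minimal sketch of what “multiple edges between the same two nodes” means, written with only the standard library. The class, the people, and the relationship names are all invented for illustration. In practice a graph library such as networkx, which has its own MultiGraph class, is probably the better choice.

```python
from collections import defaultdict


class MultiGraph:
    """A tiny undirected multigraph: a pair of nodes may share many edges,
    each identified by a key (here, the kind of relationship)."""

    def __init__(self):
        # _adj[u][v] is a dict mapping edge key -> attribute dict
        self._adj = defaultdict(lambda: defaultdict(dict))

    def add_edge(self, u, v, key, **attrs):
        if u == v:
            # Self-loops are left out to keep the edge count simple.
            raise ValueError("self-loops are not supported in this sketch")
        self._adj[u][v][key] = attrs
        self._adj[v][u][key] = attrs  # undirected: mirror the edge

    def edges_between(self, u, v):
        """Return {key: attrs} for every parallel edge between u and v."""
        if u not in self._adj or v not in self._adj[u]:
            return {}
        return dict(self._adj[u][v])

    def number_of_edges(self):
        # Every edge is stored twice (once per direction), so halve the total.
        total = sum(
            len(keys) for nbrs in self._adj.values() for keys in nbrs.values()
        )
        return total // 2


g = MultiGraph()
g.add_edge("alice", "bob", "coworker", since=2016)
g.add_edge("alice", "bob", "friend")
g.add_edge("bob", "carol", "friend")

print(sorted(g.edges_between("alice", "bob")))  # ['coworker', 'friend']
print(g.number_of_edges())  # 3
```

An ordinary graph would have to squash the two alice/bob relationships into one edge. The multigraph keeps them separate, so each can carry its own attributes.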
Netflix definitely won the prize for best conference swag.

… So yeah, I learned a lot of really cool things from some very talented people at JupyterCon this year. I’m excited to build new data products, apply network/graph analysis to IoT data, play with widgets, and maybe put a notebook or two into production.

If you’re doing cool things with notebooks at your company, I’d ❤ to hear about them. Feel free to leave a comment here or ping me on Twitter.

[Title image credit: Jason Williams]
