Reading List: 2017 Edition (and some thoughts on resolutions)

My not-so-secret secret? I was an English major.  I also majored in Stats and love math and data science, but I have always and forever loved reading. In an effort to read more often, each year I set a goal* of reading 25 books. So, in the spirit of Susan Fowler, and with the hope of getting good book suggestions, I want to share my 2017 reading list (with brief commentary). My top five recommended reads are designated with **.

1. The Great Gatsby, by F. Scott Fitzgerald
I don’t have much to say about The Great Gatsby that hasn’t been said already, but I can say that it was much more interesting than I remember it being in high school — and that I really, really want to go to a Gatsby party.

2. Men Explain Things to Me, by Rebecca Solnit
The word “mansplaining” was coined in reaction to Rebecca Solnit’s titular essay “Men Explain Things to Me”, which begins with a situation women might find vaguely familiar: after Solnit mentions the topic of her most recent book, a guy at a party asks if she’s heard of another *very important* book on the same topic, and it takes her friend’s repetition of “That’s her book” three or four times to sink in and leave the man speechless. The essay is short and definitely worth a read, and the book adds color to mansplaining and other gendered issues through added data and commentary, including a thoughtful, well-researched take on domestic violence that I hadn’t heard before.

3. Lean In, by Sheryl Sandberg**
I realize that I’m several years late to the game, but after finally reading Lean In, I would recommend it at the level of “required reading” for women navigating corporate America (and the tech world in particular). Sheryl Sandberg provides solid examples (and data!) on gendered differences in salary negotiations, likability, speaking up, explaining success, applying for level-up positions, getting promotions, ambition, and so much more. Reading this book inspired me to speak up (even about the little things!), and I’ve saved many of the factoids for future reference as I’m navigating my own career. Seriously, ladies, read this book if you haven’t already!

4. Hillbilly Elegy, by J.D. Vance
Hillbilly Elegy interweaves two stories: J.D. Vance’s personal story of “making it out” of a Glass-Castle-esque “hillbilly” upbringing by joining the Marines, going to college, and eventually law school at Yale, and a more general look at the problems confronting the modern white working class (in Appalachia and similar regions). The most interesting piece, for me, was a specific example of a town whose blue-collar factory jobs eventually dried up, and the impact this has on the town (focusing on home prices, lack of mobility, and personal pride, to name a few). The book is eye-opening, and I liked that it focused on facts as well as personal experience to paint a picture of the modern-day hillbilly’s plight.

5. Ender’s Game, by Orson Scott Card
Ender’s Game makes consistent appearances on reddit must-read lists, so I finally gave it a whirl and ended up liking it. This sci-fi novel focuses on a future where Earth is attacked by aliens and specially selected children are given military tactical training through a series of battle simulations (“games”) to fight aliens and protect humankind (all in zero gravity!). The book follows Ender, one of the chosen children, from normal childhood life through battle school, with a twist ending to boot.

6. The Girl with the Lower Back Tattoo, by Amy Schumer
While there’s plenty I love about Amy Schumer’s comedy, her autobiography was mostly repetition of stories I’d heard from her standup / interviews / etc. I’d skip it and watch her skits instead.

7. Brave New World, by Aldous Huxley
Another high school assignment, another worthy re-read. I went through a heavy dystopian novel phase in 2016, and this was the tail end. From gene therapy to pharmaceuticals to race relations to hookup culture, I think Brave New World is still, 86 years later, an incredibly relevant (and surprisingly current!) take on “modern” issues. (It would also make a great episode of Black Mirror.)

8. The Rational Optimist, by Matt Ridley
The central argument of this book is that things are better now than they ever have been (and they’re continuing to get better) — mostly due to trade and specialization among tribes of humans. I read this not long after Sapiens, which definitely colored my thinking (they’re actually “You Might Also Like…” pairs on Amazon). The Rational Optimist doesn’t have the breadth of Sapiens, but it covers the history of trade and specialization in much greater depth, and provides interesting historically-informed commentary on modern-day hot topics like fossil fuels, government, and war. (Full disclosure: this is my boss’s favorite book and there’s something cool about reading your boss’s favorite book and seeing where it might impact their perspective.)

9. The Argonauts, by Maggie Nelson
I discovered Maggie Nelson via Bluets, her poetic lyrical essay about a woman who falls in love with the color blue (which I *loved*). The Argonauts is a completely different “family” of story — a genre-bending take on parenting and romance that focuses on Maggie’s own queer family and relationship with fluidly gendered Harry Dodge. Maggie is open, brutally honest, and thoughtful, and I appreciate her sharing such personal experiences.

10. The Girl on the Train, by Paula Hawkins
Thriller. Girl (woman) on train sees a couple out the train window every day, and daydreams about their “perfect” life  — UNTIL one day she sees the woman kiss another man and that woman goes missing…

If this sounds interesting to you, you’ll probably like this book. It’s a quick read, has a few twists, and was perfect for filling time on a flight from Austin to Boston.

11. How to Win Friends and Influence People, by Dale Carnegie
Dale Carnegie’s advice on making friends and influencing people is timeless. This book is still getting updated and reprinted more than 80 years after its original publication in 1936, and is still relevant (though some of the original examples are a little, charmingly, dated). If you’re interested in the basic techniques, the Wikipedia article does a good job of describing Carnegie’s basic system, but the book itself is a quick read and one I’d recommend.

12. The Undoing Project, by Michael Lewis
I love Michael Lewis books. The Undoing Project is another good one, focusing on the relationship between Daniel Kahneman and Amos Tversky, who together created the field of behavioral economics. Lewis, as usual, does a great job of explaining fairly technical concepts while weaving a really interesting story around the complex relationship between two men. This book is at turns triumphant and heartbreaking, and the research itself was interesting enough that I’d recommend it.

13. It, by Stephen King
My husband and I both read It this year in preparation for the new movie (which was so much better than the original!). This was my first Stephen King novel and won’t be my last.

14. The Heart, by Maylis de Kerangal**
I read this book solely based on Bill Gates’ recommendation and I’m so glad I did. This is technically the story of a heart transplant, but it is actually much more than that: a beautiful, gripping look at the fragility of life and family and relationships. The poetic language draws a strong contrast between the family whose life is forever changed and the matter-of-factness of the medical professionals involved in the story (for whom this is a normal “day at the office”). This book is an experience (I cried more than once), but a highly recommended one.

15. My Own Words, by Ruth Bader Ginsburg
There is so much about Ruth Bader Ginsburg that I find inspiring. Her drive to make it to the Supreme Court, her work on gender equality and women’s rights, her relationship with her husband, her ability to see beyond political views and build cross-aisle friendships, and even her workout regimen (at age 84!) are all reasons to look up to RBG. Her book focuses on specific court cases and is peppered with interesting details about life on the Supreme Court. I found her book a bit repetitive (as some of the cases are cited multiple times) but overall a good in-depth look at the life of an important woman “in her own words”. I’m definitely a fan.

16. Sprint, by Jake Knapp
Sprint is like a time machine for business ideas: a process to get a team from concept to prototype with customer feedback in a single week. I read this book after hearing about the concept from the UX team where I work, who have developed several product features that are the direct result of sprints. If you work on a product team and are interested in ways to test and fast-track development ideas, I’d recommend this book.

17. Option B, by Sheryl Sandberg
Option B is a look at building resilience through loss, disappointment, and heartache. After the sudden loss of her husband, Sandberg took some time off to pick up the pieces (of her family, her job, and everything else), and this book tells that story. Like Lean In, Option B is a combination of experiences and research, and like Lean In, it’s full of interesting stories and practical advice, like how to be there for someone going through a loss (and the importance of questions like “What do you not want on a burger?”). This one resonated for me personally after the sudden loss of my father, and I spent so much time thinking about my mom that I eventually just sent her a copy. If you’ve ever wondered what to say or how to help someone who has experienced loss, check out this book.

18. Switch, by Chip and Dan Heath
The tagline of Switch is “How to change when change is hard”. This book focuses on an eight-step process for making change and introduces the idea of clearing a path for the elephant and the rider — the elephant being your emotions, big and hard to control, and the rider being your rational side, technically “in-control” but sometimes not enough to overcome your emotional side. Creating a clear path that addresses the needs and interests of both the elephant (emotions) and rider (rational thinking) is a means for making change “stick”. My boss and I both read this book and it provided a useful framework and shorthand that we’ve used while trying to make organization-level changes.

19. Mammother, by Zachary Schomburg**
Zachary Schomburg is one of my favorite poets (his book Scary, No Scary is an all-time favorite), and a few years ago, he announced that he was working on his first novel. I have been looking forward to reading Mammother ever since, and it did not disappoint. Mammother is the story of a town suffering from a mysterious plague called God’s Finger that leaves its victims dead with a giant hole in their chest. There is a large cast of characters, plenty of magical realism (à la Márquez), and dense, beautiful language to support a surprisingly emotional story (I cried on a plane at the ending). If you like poetry, magical realism, or weird, cool reads, I highly, highly recommend this book (and all of Schomburg’s poetry, for that matter).

20. Between the World and Me, by Ta-Nehisi Coates**
This is one of those books that totally changed my perspective. Between the World and Me is a letter from Coates to his son about the experience and realities of being black in the United States. In addition to being beautifully written, this book covers territory I didn’t know existed: the relationship between fear and violence, Howard University, how differently race is experienced in the US than in countries without a history of slavery, bodily harm, and so much more on “being black”. I wish this were required reading and can’t recommend it enough.

21. A Column of Fire, by Ken Follett
A Column of Fire is the third book in the widely spaced “Kingsbridge” series (which starts with The Pillars of the Earth, published in 1989). I was shocked to find that there was a third book in this series and downloaded it immediately. Each book is engaging historical fiction, and focuses on the political intersection of government and religion (and those who exploit either, or both). A Column of Fire is an apt addition: interesting storyline, lots of characters, and a cool take on historical events, particularly Mary, Queen of Scots. It’s worth noting that while this is part of a ‘series’, the books stand perfectly well on their own, though I would still start with The Pillars of the Earth if you’re interested.

22. Dear Data, by Giorgia Lupi and Stefanie Posavec
Dear Data is a project created by two women (Giorgia and Stefanie) getting to know each other by sending weekly postcards based on self-collected and visualized data, like “how many times I swore”, “every time I looked in the mirror”, and “how many times I said ‘sorry’”. The data is interesting, and the visualizations are engrossing, so much so that the collection of postcards was purchased by MoMA for display. We kicked off the R-Ladies Austin book club by reading this and creating our own postcards, which was so much fun, and it made us realize that the data collection and visualization process is not nearly as easy as it looks!

23. Milk and Honey, by Rupi Kaur
Milk and Honey is a poetry book in four chapters: “the hurting”, “the loving”, “the breaking”, and “the healing”. Kaur provides an intense look at each emotion in turn. I have to admit that I didn’t love this book. While I appreciate brutal honesty in poetry (and there is plenty in here), the translation of feelings to language read a bit like teen angst, which was a turn-off.

24. Weapons of Math Destruction, by Cathy O’Neil**
Weapons of Math Destruction is a look at the algorithms that pervade modern life and the problems embedded within them. This book does a great job of explaining the ethical implications of collecting and using data to make decisions, and outlines a framework for creating responsible algorithms. After reading this, I’m noticing new algorithms and data issues almost weekly, so it’s definitely had an impact on my thinking and approach to creating algorithms and working with data. I think this will resonate with “data people” and everyone else (the examples jump from teachers to credit cards to court systems). Also, a quick shameless plug: this is the next book for the R-Ladies Austin book club, so if you want to discuss it in person, please join us on Jan 31!

25. A Little Princess, by Frances Hodgson Burnett
I loved this movie growing up, and after my mom bought me the book, I devoured it during Christmas weekend. A Little Princess is a heartwarming story about a wealthy-but-charming girl who is sent to boarding school by her loving father (who is her only parent). Soon after, her father dies, and she lives a riches-to-rags story in which she is forced into labor to earn her keep as an orphan and ward of the school, all while imagining a better life as a princess. The book is just as magical as the movie, and I wholeheartedly recommend both.

*Resolutions vs. Goals (or, bonus thoughts on the whole “New Year’s” thing):

About four years ago, I changed my approach to New Year’s resolutions. While I love the theme of self-improvement, I don’t do well with resolutions that require doing something every day, or always, or never. A resolution like “read every day” is uninspiring, doesn’t allow room for the ebbs and flows of varied routines, and seems to be designed for failure — it only takes one slip-up to tarnish a “read every day” track record. Vague resolutions like “read more” are also tough. I appreciate a more concrete number or outcome to work toward so that I can track my progress through the year and know whether I’ve achieved each goal at the end of it.

So, I’ve replaced vague and overly stiff resolutions with quantifiable goals to be accomplished gradually over the course of the year. Now a resolution like “read every day” (or “read more”) becomes a goal: “read 25 books this year”. This allows room for vacations, sick days, and life to happen without making me feel like I’ve “failed” if ever I can’t find time to read. The number 25 was a good goal for me because it felt doable, easy to track (about a book every other week), and I knew I’d feel good about it at the end of the year, which is motivation to keep making progress.

If you were to track my progress towards this year’s goal, you’d find major spikes around vacations (I like reading on planes) and during weeks where I don’t have lots of events. All this is just to say that if you have a goal of reading more, or anything else that you haven’t been able to find time for, I’d highly recommend goals over resolutions.

Have you read anything moving lately? Have questions or a different take on any of these books? I’d love to hear about it. You can comment here or ping me on twitter.

Blue Christmas: A data-driven search for the most depressing Christmas song

Christmas music can be a lot of things — joyous, ironic, melancholy, cheerful, funny, and, in some cases, downright depressing. I personally realized this while watching an immensely sad scene in The Family Stone centered around Judy Garland singing ‘Have Yourself a Merry Little Christmas’ and haven’t yet fully recovered (or stopped noticing sad Christmas music). With this scene in mind, and without being able to think of any song that was more sad, I set out to use data to find the most depressing Christmas song. (Spoiler alert: I was wrong about Have Yourself A Merry Little Christmas.)

Data Collection

The data collection process broke down into three steps: choosing which songs to analyze, using Spotify to extract “musical” information about each song chosen, and using Genius (and Google) to collect the lyrics for each song.

Which Songs?

As it turns out, there are looooots of Christmas songs out there. (Just think of how many covers are released each year!) I was hoping for a Buzzfeed “Top 100 Christmas Songs of All Time” list, but after checking FiveThirtyEight, Billboard, Spotify, and lots of Google searches, I wasn’t able to come up with anything satisfactory that included both title and artist (which I’d need to gather ‘musical’ track attributes). I settled for Spotify’s ‘Christmas Classics’ playlist. This 60-song playlist contains many classics (“Silver Bells”, “Sleigh Ride”) as well as some modern classics (I’m looking at you, Mariah Carey). While it doesn’t include all songs, I think it does a good job of picking the most popular version of each song chosen and handily satisfies my “title and artist” requirement.

Gathering ‘Musical’ Data (from Spotify)

We can get an idea of musical sadness by using data from the Spotify API, which allows you to extract various musical attributes for a given track (like danceability, speechiness, and liveness). For this analysis, I focused mainly on two attributes: energy and valence.

Valence is defined by Spotify as: “A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).” We can use this to determine how sad a track sounds (independent of lyrics).

Energy (as defined by Spotify) also rates from 0 to 1 and represents a “perceptual measure of intensity and activity”. Heavy metal would rate high on the energy scale and slow acoustic tracks would rate low.

To gather this data, I used Charlie Thompson’s fantastic spotifyr package to interface with the Spotify API. This package can be installed and loaded via github like so:
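
A sketch of that step, assuming the devtools package is available and that spotifyr is still hosted in the charlie86/spotifyr repository on GitHub:

```r
# Assumption: spotifyr is hosted at github.com/charlie86/spotifyr
# install.packages("devtools")  # uncomment if devtools is not installed yet
devtools::install_github("charlie86/spotifyr")
library(spotifyr)
```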


Regardless of how you access it, the Spotify API requires that you set up a developer account to create a client_id and client_secret. Save these as environment variables (see below) and you’re ready to start gathering data!

Sys.setenv(SPOTIFY_CLIENT_ID = "your client id here")
Sys.setenv(SPOTIFY_CLIENT_SECRET = "your client secret here")

The spotifyr package allows you to pull track info based on a given artist, playlist, or album. Since I couldn’t pull the data from Spotify’s playlist directly, I copied all of the songs into my own playlist (“christmas_classics_spotify”) for easy access.

The steps I took were:
1) Get all of my playlist names (since I have more than one) using get_user_playlists
2) Get tracks from each playlist using get_playlist_tracks
3) Filter all tracks to just the tracks from the christmas_classics_spotify playlist
4) Use the get_track_audio_features function to get the features for the songs I care about.

Here’s what that code looks like:


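library(spotifyr)
library(dplyr)
# 1-2) List my playlists, then pull the tracks from each one
playlists <- get_user_playlists("1216605385")
tracks <- get_playlist_tracks(playlists)
# 3) Keep only the tracks from the Christmas playlist
xmas_tracks <- tracks %>%
    filter(playlist_name == "christmas_classics_spotify")

# 4) Get the audio features (valence, energy, ...) for those tracks
track_features <- get_track_audio_features(xmas_tracks)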

Gathering Lyrics

Originally, I planned to pull all of the song lyrics via the Genius API and Josiah Parry’s in-progress geniusR functions. This was a pretty good plan, but I quickly realized that not all of the songs I wanted lyrics for were actually available on Genius (some are fairly old); so, I used a combination of geniusR and good ol’ fashioned copying and pasting to get the lyrics for all of the songs in my playlist. (Note: copying and pasting is not as boring as it sounds if you’re also watching old episodes of Parks and Recreation.)

If you do want to use the Genius API to gather data, you’ll need to create an account with Genius to get an API access token. Similar to what we did with the Spotify API, you can save this as an environment variable:

Sys.setenv(genius_token = "your access token here")

Since geniusR isn’t a fully instrumented package, the best way to use its functions is to clone the geniusR repo and run each script, or copy and paste each one into your own script. Most of these functions are helper functions for the genius_lyrics function, which is the only one you’ll need. This function takes artist and song as arguments (like below):

jingle_bell_rock <- genius_lyrics(artist = "Bobby Helms", song = "Jingle Bell Rock")

You can loop through this function as needed to get any lyrics you’d like to analyze. Once you’ve collected all of your lyrics, you’re ready to move on to analysis!
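
One way such a loop might look, as a minimal sketch: the collect_lyrics helper, the two-row song list, and the fake_fetch stand-in are all hypothetical, and in real use you would pass genius_lyrics as the fetch function. Failed lookups come back as NULL, which makes it easy to see which songs still need the copy-and-paste treatment.

```r
# Songs to look up; these two rows are only illustrative.
songs <- data.frame(
  artist = c("Bobby Helms", "Elvis Presley"),
  song   = c("Jingle Bell Rock", "Blue Christmas"),
  stringsAsFactors = FALSE
)

# fetch_fun is any function taking (artist, song), e.g. genius_lyrics.
# A lookup that errors returns NULL instead of stopping the whole loop.
collect_lyrics <- function(songs, fetch_fun) {
  results <- Map(function(artist, song) {
    tryCatch(fetch_fun(artist = artist, song = song),
             error = function(e) NULL)
  }, songs$artist, songs$song)
  names(results) <- songs$song
  results
}

# Hypothetical stand-in so the sketch runs without an API token:
# it pretends that one song is missing from Genius.
fake_fetch <- function(artist, song) {
  if (song == "Blue Christmas") stop("not found")
  data.frame(line = 1, lyric = paste("lyrics for", song))
}

lyrics_list <- collect_lyrics(songs, fake_fetch)

# Songs whose lyrics still need to be gathered by hand.
missing_songs <- names(lyrics_list)[vapply(lyrics_list, is.null, logical(1))]
```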

Data Analysis: Quantifying Sadness

A song is made up of music and lyrics, and we’ll use both to create a Downer Score (a measure of a song’s sadness).

Musical Sadness

Earlier I mentioned a couple of useful features from the Spotify data we can use to quantify sadness — energy and valence. While these measures are useful individually, combining them gives us a better picture of how depressing a song might be. For example, a song that is high-valence but low-energy would definitely be happy, but might be considered more ‘peaceful’ or ‘calm’ than ‘joyous’. Likewise, a song that is low-valence but high-energy might be considered more ‘angry’ or ‘turbulent’ than ‘sad’. The most depressing songs will be both low-valence and low-energy (think Eeyore!). If we plot valence against energy, the sad songs will be the ones closest to the point lowest-valence, lowest-energy (0, 0):

[Plot: valence against energy for each song in the playlist]

To quantify the musical sadness of each song, we calculate that song’s distance (in terms of valence and energy) from the point (0, 0) — the lower this distance is, the sadder the song.
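
As a small worked example of that distance calculation, here is a sketch in base R. The three tracks and their valence and energy values are made up for illustration and are not taken from the playlist data.

```r
# Hypothetical valence/energy values for three tracks (not the real data).
toy_features <- data.frame(
  track_name = c("Song A", "Song B", "Song C"),
  valence    = c(0.90, 0.30, 0.10),
  energy     = c(0.80, 0.40, 0.15),
  stringsAsFactors = FALSE
)

# Euclidean distance from the saddest corner (valence = 0, energy = 0).
toy_features$distance <- sqrt(toy_features$valence^2 + toy_features$energy^2)

# Smallest distance first, i.e. musically saddest first.
toy_features[order(toy_features$distance), ]
```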

Based on musical sadness, the most depressing Christmas songs are:

[Image: ranking of songs by musical sadness]

O Christmas Tree was definitely not one of my guesses for “most depressing Christmas song” (although a listen-through of this version might convince me otherwise), but fear not, we still have to take a look at the emotions conveyed in the lyrics…

Lyrical Sadness

To analyze lyrical sadness, I used Julia Silge and Dave Robinson’s tidytext package to perform sentiment analysis on each song. tidytext comes complete with a tokenizer (to break down long blocks of texts into their individual words for analysis), a list of stop words (common words like “a”, “an”, and “the” which don’t carry much meaning and are therefore removed), and several sentiment catalogs we can use to analyze feeling or emotion attached to a given word.

Let’s get to it! After loading up the tidytext package, I created a list of sad words and a list of joy words from the NRC emotion lexicon.


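library(tidytext)
library(dplyr)

# Assumes a tidytext version whose `sentiments` table has a `lexicon` column;
# newer versions provide get_sentiments("nrc") instead.
sad_words <- sentiments %>%
    filter(lexicon == "nrc", sentiment == 'sadness') %>%
    select(word) %>%
    mutate(sad = T)

joy_words <- sentiments %>%
    filter(lexicon == "nrc", sentiment == 'joy') %>%
    select(word) %>%
    mutate(joy = T)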


Next I removed stop words and left-joined the sad and joy word lists into my set of lyrics to calculate the percent of sad words and the percent of joy words that appeared in each song.

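# `lyrics_words` is a placeholder name for the tokenized lyrics: one row per
# word per song, grouped by song, with the word in a `word` column.
with_sentiment <- lyrics_words %>%
    anti_join(stop_words) %>%
    left_join(sad_words) %>%
    left_join(joy_words) %>%
    summarise(pct_sad = round(sum(sad, na.rm = T) / n(), 4),
              pct_joy = round(sum(joy, na.rm = T) / n(), 4),
              sad_minus_joy = pct_sad - pct_joy)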

You might have noticed that in the last line of code above, I subtracted the percent of joy words from the percent of sad words. Originally, I only looked at the percent of sad words, but I noticed that even happy songs (like Joy to the World) do have some sad words, while other songs had zero sad words. To account for this, I use the difference (percent of sad words minus percent of joy words) rather than the percent of sad words alone. (In fact, I thought it was interesting that only two songs have a higher percent of sad words than joy words: Blue Christmas and You’re a Mean One, Mr. Grinch.)
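
To make the arithmetic concrete, here is a tiny sketch in base R. The word lists are hypothetical stand-ins for the NRC sadness and joy lexicons, and the lyric is invented.

```r
# Toy word lists standing in for the NRC sadness and joy lexicons (hypothetical).
sad_lexicon <- c("blue", "lonely", "cry")
joy_lexicon <- c("merry", "joy", "cheer")

# A made-up lyric, already lower-cased and with stop words removed.
words <- c("blue", "christmas", "lonely", "merry", "snow")

# Share of the remaining words that appear in each list.
pct_sad <- round(sum(words %in% sad_lexicon) / length(words), 4)
pct_joy <- round(sum(words %in% joy_lexicon) / length(words), 4)

# Positive means more sad words than joy words.
sad_minus_joy <- pct_sad - pct_joy
```

Here two of the five words are sad and one is joyful, so the difference is positive and the toy lyric leans sad.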

Based on lyrical sadness, the most depressing Christmas songs are:

[Image: ranking of songs by lyrical sadness]

The Downer Index

In order to crown the most depressing Christmas song, we’ll have to combine the metrics for lyrical sadness (pct sadwords) and musical sadness (distance). I’ve created a metric, the Downer Index, which does just that:


A downer index near 1 is a happier song and a downer index near 0 is a more depressing song. This index weights the musical and lyrical elements of the song equally, and both are on a (0, 1) scale such that a higher score represents a happier quantity. This metric (and blog post) is inspired by Charlie Thompson’s gloom index (and the accompanying blog article, which I highly recommend reading for a look into the sad songs of Radiohead).
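
As a rough illustration only, here is one way an index with those properties could be assembled. The rescaling choices (dividing the distance by the square root of 2, and mapping sad_minus_joy from the range -1 to 1 onto 0 to 1) and the toy values are assumptions made for this sketch, not necessarily the formula behind the results in this post.

```r
# Toy inputs (hypothetical values, not the real song data).
toy_songs <- data.frame(
  track_name    = c("Song A", "Song B"),
  distance      = c(1.20, 0.50),   # sqrt(valence^2 + energy^2), between 0 and sqrt(2)
  sad_minus_joy = c(-0.30, 0.10),  # pct_sad - pct_joy, between -1 and 1
  stringsAsFactors = FALSE
)

# Assumed rescaling: both components land on a 0-1 scale, higher = happier.
toy_songs$musical_score <- toy_songs$distance / sqrt(2)
toy_songs$lyrical_score <- (1 - toy_songs$sad_minus_joy) / 2

# Equal weighting of the musical and lyrical components.
toy_songs$downer_index <- (toy_songs$musical_score + toy_songs$lyrical_score) / 2

# Lowest index first, i.e. most depressing first.
toy_songs[order(toy_songs$downer_index), c("track_name", "downer_index")]
```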

Based on the data, the most depressing Christmas song is… Blue Christmas!

This doleful tune about unrequited love certainly delivers lyrically (Blue Christmas was the most lyrically sad song), which contributed highly to its ranking; it came in 28th overall for musical sadness with a score of .64. I also learned that Blue Christmas was not an Elvis original, though his is by far the most popular cover (thanks Wikipedia!).

And without further ado, here are the top ten most depressing Christmas songs:

[Image: top ten most depressing Christmas songs by Downer Index]

While this approach isn’t perfect, I’m pretty happy with the results (except that my horse wasn’t even in the top ten!) and think the data does a fairly good job of capturing both the musical and lyrical sadness in the songs I analyzed.

Bonus: Christmas Song Superlatives

If you’re a “glass half full” kind of person, you might also be interested in some of the happier Christmas songs, which I also dug up while performing this analysis:

This data was a lot of fun to play with and I only scratched the surface on types of analyses you could do with it. If anyone is interested, I’m happy to share it.

Merry Christmas!



Data Meta-Metrics

Sometimes I work with great data: I know how and when it’s collected, it lives in a familiar database, and it represents exactly what I expect it to represent. Other times, I’ve had to work with less-than-stellar data, the kind of data that comes with an “oral history” and lots of caveats and exceptions when it comes to using it in practice.

When stakeholders ask data questions, they don’t know which type of data — great, or less-than-stellar — is available to answer them. When the data available falls into the latter camp, there is an additional responsibility on the analyst to use the data appropriately, and to communicate honestly. I can be very confident about the methodologies I’m using to analyze data, but if there are issues with the underlying dataset, I might not be so confident in the results of an analysis, or my ability to repeat the analysis. Ideally, we should be passing this information — our confidences and our doubts — on to stakeholders alongside any results or reports we share.

So, how do we communicate confidences and doubts about data to a non-technical audience (in a way that is efficient and easily interpretable)? Lately I’ve been experimenting with embedding a “state of the data” in presentations through red, yellow, and green data meta-metrics.


Recently my team wanted to know whether a new product feature was increasing sales. We thought of multiple ways to explore whether the new feature was having impact, including whether emails mentioning the new feature had higher engagement, and using trade show data to see whether there was more interest in the product after the feature was released. Before starting the analysis, we decided that we’d like this analysis to be repeatable — that is, we’d like to be able to refresh the results as needed to see the long-term impact of the feature on product sales.

Sounds easy, right? Collect data, write some code, and build a reproducible analysis. I thought so too, until I started talking to various stakeholders in 5+ different teams about the data they had available.

I found the data we wanted in a variety of states — anywhere from “lives in a familiar database and easy to explore” to “Anna* needs to download a report with very specific filters from a proprietary system and give you the data” to “Call Matt* and see if he remembers”. Eventually I was able to get some good (and not-so-good) data together and build out the necessary analyses.

While compiling all of the data and accompanying analyses together for a presentation, I realized that I needed some way to communicate what I had found along the way: not all of the data was equally relevant to the questions we were asking of it, not all of the data was trustworthy, and not all of the analysis was neatly reproducible.

The data meta-metrics rating system below is what I’ve used to convey the quality of the data and its collection process to technical and non-technical members of my team. It’s based on three components: relevance, trustworthiness, and repeatability. The slide below outlines the criteria I used for each score (green, yellow, red) in each category.

[Image: slide outlining the green, yellow, and red criteria for relevance, trustworthiness, and repeatability]

Within the presentation, I added these scores to the bottom of every slide. In the below example, the data we had definitely answered the question we were asking of it (it was relevant), and I trusted the source and data collection mechanism, but the analysis wasn’t fully reproducible — in this case, I needed to manually run a report and export a text file before being able to use it as an input in an automated analysis. Overall, this data is pretty good and I think the rating system reflects that. The improvement that would take this data to green-green-green would be pretty simple — just writing the email data to a more easily accessible database, which becomes a roadmap item if we feel this report is valuable enough that we’ll want to repeat it.

[Image: example slide with its data meta-metric scores along the bottom]

Below is an example of a not-so-great data process. Trade shows are inherently pretty chaotic, and our data reflects that. It’s hard to tell what specifically makes a trade show attendee interested in a product, and tracking that journey in real-time is much harder without records of interactions like demos, phone calls, etc. This becomes another road map item; if we want to dig deeper into trade show data and use it to guide product decisions, we need to implement better ways of collecting and storing that data.

[Image: example slide for the trade show data with its meta-metric scores]
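
If you keep these ratings next to your analysis code, a small lookup table is enough to generate the slide footers. A minimal sketch in R, where the dataset names, the specific trade show ratings, the "yellow" repeatability rating for the email data, and the slide_footer helper are all hypothetical:

```r
# Allowed ratings, from best to worst.
rating_levels <- c("green", "yellow", "red")

# One row per dataset. The email row loosely follows the first example above
# (repeatability assumed to be "yellow"); the trade show row is hypothetical.
meta_metrics <- data.frame(
  dataset         = c("email engagement", "trade show interest"),
  relevance       = c("green", "yellow"),
  trustworthiness = c("green", "red"),
  repeatability   = c("yellow", "red"),
  stringsAsFactors = FALSE
)

# Fail early if a rating is misspelled.
rating_cols <- c("relevance", "trustworthiness", "repeatability")
stopifnot(all(unlist(meta_metrics[rating_cols]) %in% rating_levels))

# Build the one-line footer for a given dataset's slide.
slide_footer <- function(metrics, dataset_name) {
  row <- metrics[metrics$dataset == dataset_name, ]
  paste0("Relevance: ", row$relevance,
         " | Trustworthiness: ", row$trustworthiness,
         " | Repeatability: ", row$repeatability)
}

slide_footer(meta_metrics, "email engagement")
```

Keeping the ratings in one table also makes it easy to see at a glance which datasets are furthest from green-green-green.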

Overall, this exercise was helpful for diagnosing the strengths and weaknesses of our data storage and collection across multiple teams. Providing this data in an easy-to-understand format allowed us to have informative conversations about the state of our data and what we could do to improve it. Getting the rest of the team involved in the data improvement process also helps my understanding of what data we do and don’t have, what we can and can’t collect, and makes my analyses more relevant to their needs.

The meta-metrics I used here are the ones we specifically cared about for this type of analysis; I could certainly see use cases where we might swap out or add another data meta-metric. If you’ve worked on conveying “the state of the data” or data meta-metrics to your team, I’d love to hear more about your process and the meta-metrics you’ve used in the comments.

*  Names have been changed to protect the innocent.