Data Discourse #1
Welcome to the inaugural edition of Data Discourse. One of the essential ingredients of the SEO campaigns we run for clients is content that can generate attention and links. One of the very best ways to produce content with these properties is to use data effectively to add insight or solve a problem. As Matt Gillespie (Head of Data Science at Fractl) elegantly says: “Plotted points are more trustworthy than written words”.
Whether you generate your own dataset by scraping the web or conducting surveys, leverage your own or your client’s data, or use a secondary data source, the insight you create and communicate can give your content the edge over standard blog posts and opinion pieces: the type of edge required for the very best links.
There are few better ways, if any, to stir the imagination when ideating for your own company or client than to expose yourself to as much quality content as possible. That’s where content curation provides real value to those in the industry. Following in the footsteps of other content curation series such as Content Curated, @DigitalPREx, and Campaign Edit, this series will focus specifically on great content built on data.
The content covered in this series may have been produced for content marketing or digital PR purposes, but it won’t be limited to content produced for that purpose. The common thread will be the effective use of data and the ability to help tell a story with that data. In doing so, there is potentially something to learn from each piece that may help spark inspiration for your next content idea.
Generally, these posts will cover content that launched between the previous release and the current release. As this is the inaugural post I’m going to take some creative license and sprinkle in some of my all-time favourite pieces of work.
The Pudding is simply the best place for entertaining content with data as the centrepiece. You can click on just about any article on their site and find something interesting. Even if the subject matter doesn't resonate with you, the design is always on-point and often the methodology is worth digging into for a bit of inspiration.
By analysing searches on Tenor’s GIF keyboard involving certain celebrities, The Pudding team were able to catalogue which celebrities are most commonly used to convey certain emotions or expressions.
For example, I was surprised to see that Michael Jordan laughing GIFs (31%) are used slightly more often than Michael Jordan crying GIFs (29%), given how iconic the ‘Crying Jordan’ meme is. This is probably because the iconic Jordan crying meme works better as a static image and perhaps, more positively, because people search for laughing GIFs at a much higher rate than crying GIFs across the board.
If you’re inspired to dig into the world of big GIF data yourself, see what’s available in the Tenor API.
This piece on the Washington Post is a great advert for elevating content by allowing the reader to insert themselves into the story. A list of billionaires and their net worth does elicit some emotion (*cough* envy *cough*) when I read it, but I don’t enjoy it.
That said, when their wealth is put to me in the following terms, the scale is evident and wrapped in a little humour:
Personal financial impact of said billionaire buying a mansion = same level as myself when weighing up whether I should take a packed lunch or get extravagant with a large Big Mac meal!
Have you noticed how often you touch your face, now that the COVID-19 pandemic has you thinking about it so much?
This web app can help! After recording two short ‘training’ videos, the app will send you an alert every time you touch your face.
The app is built using TensorFlow, which analyses the frames of the training videos you submit. Using machine learning algorithms, it learns the properties of the image it sees when you are touching your face and when you aren’t. It then uses this information to give you a resounding alert each time you touch your face unconsciously.
The model works surprisingly well with only a short amount of training time, particularly if you make the most of the training phase at the start. In the ‘not touching your face’ training section, make sure your hands are in view if your setup means they will often enter the frame even when you are not touching your face. When you are teaching the model what your camera sees when you do touch your face, make the effort to touch your face in varying ways throughout. As with any machine learning application, the more training data you feed it, the better the result will be (well...mostly).
Leave a tab open on the webpage and you’ll get an alert every time you touch your face! This is a fun and potentially beneficial-to-public-health application built from out-of-the-box machine learning models.
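To make the idea of learning from two sets of labelled frames more concrete, here is a minimal sketch. It is not the app's actual code: the real app uses TensorFlow on full webcam images, whereas this toy version swaps in a much simpler nearest-centroid classifier, and the three-number "frames" and all function names are made up for illustration.

```python
from math import dist
from statistics import fmean


def centroid(frames):
    """Average each feature across all training frames of one class."""
    return [fmean(column) for column in zip(*frames)]


def train(touching_frames, not_touching_frames):
    """'Training' here just stores one average frame per class."""
    return {
        "touching": centroid(touching_frames),
        "not_touching": centroid(not_touching_frames),
    }


def predict(model, frame):
    """Label a new frame with whichever class average it sits closest to."""
    return min(model, key=lambda label: dist(model[label], frame))


# Hypothetical 3-number summaries of a frame; real frames are full images.
touching = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.7, 0.7, 0.9]]
not_touching = [[0.1, 0.2, 0.1], [0.2, 0.1, 0.3], [0.3, 0.2, 0.2]]

model = train(touching, not_touching)
print(predict(model, [0.85, 0.75, 0.8]))  # touching
print(predict(model, [0.15, 0.25, 0.2]))  # not_touching
```

Even in this toy version, the advice about varied training examples holds: the class averages only represent what they were shown, so a hand position that never appeared in training is more likely to be mislabelled.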
- Spammo - an app that pays your mates back in tons of micro payments.
- Google game of the year - a search trend themed quiz (for the year 2018).
- Meme buddy - generate memes via voice commands with Google assistant.
I include this not just because I was obsessed with space as a kid, but also because it’s quite simply stunning and packed full of amazing information.
The graphic was made by Nadieh Bremer, who also wrote an awesome in-depth blog post about its creation. This provides a thorough peek behind the curtain at what it takes to produce these truly bespoke graphics. It means this entry provides not only visual inspiration but methodological inspiration too, for future projects.
This is one of those all-time classics that I’m dropping in just because it’s the first in this series. If you haven’t seen this piece of work before, do yourself a favour and follow the link above. It’s a tour-de-force of storytelling using data.
The use of overlapping entries on a static radar plot is just perfect for visualising what could be a tricky point to get across effectively.
This piece just simply hits a home-run on every aspect that a data-driven piece of content can be graded upon. The story itself is compelling and has mass-appeal. The data is sound and reputable and fit for purpose. The visualisation method greatly enhances the way the story is told and understood. And finally, the overall package is presented with beautiful design.
Bravo. Encore please.
Nathan Yau, the author of Visualize This and operator of flowingdata.com, has made some pretty awesome charts over the years. A number of FlowingData charts have proved popular, such as The Stages of Relationships, Distributed and Where Bars Outnumber Grocery Stores.
But the FlowingData visualisation that I find the most powerful – as morbid as it may seem – is the Years You Have Left to Live, Probably interactive.
This is a powerful use of data. The data used is the probability distribution that describes the mortality curve (in the US anyway as this is a US-centric site). Mortality statistics will show you in % terms what the general probability of dying at a given age is. But just seeing the percentage isn’t that powerful unless it’s very high. For young people the % is going to be incredibly low and that makes it a completely forgettable statistic.
But, if you plug your details into this interactive and watch the simulation play out, there’s something about the moment the dot drops suddenly (close to your current age). Although it happens rarely, when it does, it produces a feeling that can’t possibly be produced by words alone.
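The mechanics of a simulation like this can be sketched in a few lines. This is a rough illustration of the general approach, not FlowingData's code: the mortality curve below is invented (risk simply doubles every eight years), and the function names are hypothetical. A real version would read annual death probabilities from a published life table.

```python
import random


def annual_death_probability(age):
    """Made-up curve where risk doubles every 8 years.

    NOT real actuarial data; swap in a published life table for real use.
    """
    return min(1.0, 0.0002 * 2 ** (age / 8))


def simulate_years_left(current_age, rng):
    """Play out one simulated life, year by year, until the 'dot drops'."""
    age = current_age
    while rng.random() >= annual_death_probability(age):
        age += 1
    return age - current_age


def run_simulations(current_age, runs=10_000, seed=1):
    """Repeat the simulation many times to build up a distribution."""
    rng = random.Random(seed)
    return [simulate_years_left(current_age, rng) for _ in range(runs)]


results = run_simulations(current_age=30)
share_within_5 = sum(years < 5 for years in results) / len(results)
median_years = sorted(results)[len(results) // 2]

print(f"Median years left: {median_years}")
print(f"Share of runs ending within 5 years: {share_within_5:.1%}")
```

Each run is one dot. Most runs last for decades, but a small share end almost immediately, and those are the rare early drops that make the interactive so affecting.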
Using a secondary data source of UFO sightings, this campaign calculates the odds of seeing a UFO in all 50 American states and presents the results in a beautiful, interactive map format. It’s beautiful in its simplicity and the additional information in the timeline carousel adds to the user experience. It also provides extra context for any publication interested in covering the story.
This is a great example of the powerful combination of a great secondary data source and timeliness. Given the added reliance on home broadband for both entertainment and work at the moment, being able to comment authoritatively on regions with poor internet speeds is a great recipe for obtaining coverage.
This type of content has increased odds of generating links if you gain coverage. It’s almost a given that any publication covering it would let their readers know they can see the result for their area by consulting the original source.
I do have a slight gripe with the slightly strange ‘key’ in lieu of typical table column headers...but that’s just a minor thing.
The worldwide spread of Coronavirus and the effect it’s had on our daily lives has of course been the main news story of the past couple of months. Data visualisations have played a key role in helping many people appreciate the scale of the consequences, should we not act fast with strict social distancing measures.
This piece by Harry Stevens for the Washington Post is a brilliant example of using data visualisation techniques to convey how important it is to follow the WHO guidelines for containing the pandemic.
What I particularly like about this work is that Harry Stevens makes it clear that it is not scientifically modelling the spread of Coronavirus specifically. His visualisations are merely a common-sense outline of the spread of anything that requires contact between nodes in a network to transmit information.
The accurate predictive modelling of the worldwide spread of a virus is very complicated and best left to the experts. That said, it hasn’t stopped many DIY pseudo-epidemiologists from having a go. By simplifying the transmission to a single value, the number of people that the average carrier will pass the virus on to, the communication becomes so much clearer, and often this is what is necessary to convey complicated results. This number for COVID-19 was estimated to be between 2 and 3 in the early stages of the pandemic.
This is what the GIF from thespinoff.nz is showing: a chain of people starting with one carrier, each passing it on to three others. This is a great example of using animation effectively, in my opinion. The flow of time is important to establish the passing-on of the virus before then retroactively showing what could have been avoided with the annotations.
By showing the transmission first and then taking away “infections” alongside annotations of specific “people” in the visualisation adhering to social distancing, the message is powerful. More so than it could be expressed in words alone.
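The arithmetic behind a chain like this is simple enough to sketch. The example below assumes every carrier infects exactly three others, as in the GIF, and follows the chain for three generations. The depth, the function names, and the choice of who distances are my own assumptions for illustration, not an epidemiological model.

```python
def infections_per_generation(r, generations):
    """People newly infected in each generation, starting from one carrier."""
    return [r ** g for g in range(generations + 1)]


def total_infected(r, generations):
    """Everyone infected across the whole chain, including the first carrier."""
    return sum(infections_per_generation(r, generations))


def avoided_by_one_person(r, generations, distancing_generation):
    """Infections that never happen if one person in the given generation
    passes the virus on to nobody: everyone downstream of them in the chain."""
    downstream = generations - distancing_generation
    return total_infected(r, downstream) - 1  # exclude the person themselves


r = 3            # each carrier infects three others, as in the GIF
generations = 3  # assumed depth of the chain, for illustration

print(infections_per_generation(r, generations))  # [1, 3, 9, 27]
print(total_infected(r, generations))              # 40
print(avoided_by_one_person(r, generations, 1))    # 12
```

Under these assumptions, one person early in the chain staying home removes 12 of the 40 infections, which is the effect the annotations in the animation make visible.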
London-based growth marketing agency Kaizen conducted research to help digital marketers understand exactly how much of the current coverage is COVID-centric and what else is being picked up during these times.
Keep in mind that this research is a few weeks old at the time this post is being published, so it might not reflect the current landscape, as things move quickly. Nevertheless, it’s a great example of a relatively quick experiment that produced actionable insight for the digital marketing community.
The data collected by Kaizen provided a solid evidence base for what many leading marketers and PR folk on Twitter had been saying - publications do want non-COVID-19 stories to go alongside their Coronavirus coverage, and some more than others.
The data provided actionable insights for digital marketers who have campaigns to pitch during the crisis. If your client has something legitimate to add to the Coronavirus news cycle, that offers some social good, then targeting the publications with a higher concentration of COVID-19 coverage could pay dividends. Conversely, if you’re pitching a campaign that’s unrelated, then targeting those publications with a lower % coverage towards the pandemic could be most effective.
John Burn-Murdoch is a senior data-visualisation journalist at the Financial Times. There is a good chance you may have come across his work in the past couple of months, as the coronavirus tracker page is now the most viewed page in FT history. This is thanks to John’s fantastic work aggregating as much credible information as possible in one place - and endeavouring to convey the results with the nuance that such a weighty topic demands.
Now cumulative deaths:
• Nonetheless, US death toll now highest worldwide and still rising fast
• And UK curve still matching Italy’s
• Australia still looks promising
All charts: https://t.co/JxVd2cG7KI pic.twitter.com/ylryhihL8F
— John Burn-Murdoch (@jburnmurdoch) April 15, 2020
The charts themselves contain a lot of information while still maintaining readability and an appealing design, a combination that isn’t easy to pull off.
The main reason I’ve enjoyed John’s work of late, though, is that by following his Twitter timeline you can see the entire process laid bare, as further data sources were found by crowdsourcing. He also explains decisions such as using log scales. That is exactly the type of decision people make when visualising data while assuming all readers will know how to read the result, or why it’s pertinent in that situation.
You might be well aware of the r/dataisbeautiful subreddit, a place to share inspiring data viz with over 14 million members. Its lesser-known sibling-ish community is the r/dataisugly subreddit, with a modest 68k members. Two data viz subreddits with different motives. To cap off each edition of Data Discourse I’ll round up a few of the more interesting r/dataisbeautiful entries for that extra shot of inspiration and finish up with a bit of gold from r/dataisugly.
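For anyone unsure why a log scale suits this kind of chart, a quick sketch may help. The counts below are hypothetical, chosen only because they double at every step; they are not taken from the FT tracker.

```python
from math import log10

# Hypothetical cumulative counts that double every step (not real data).
counts = [100 * 2 ** step for step in range(6)]

# On a linear axis, the gap between consecutive points keeps growing.
linear_steps = [b - a for a, b in zip(counts, counts[1:])]

# On a log axis, every doubling is the same distance apart.
log_steps = [round(log10(b) - log10(a), 3) for a, b in zip(counts, counts[1:])]

print(counts)        # [100, 200, 400, 800, 1600, 3200]
print(linear_steps)  # [100, 200, 400, 800, 1600]
print(log_steps)     # [0.301, 0.301, 0.301, 0.301, 0.301]
```

Because steady doubling becomes a straight line on a log axis, any bend in a country's curve shows that its growth rate is changing, which is the comparison these charts are built to make.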
As a fan of the NBA, despite being unable to hit a jump shot if my life depended on it, I was struck by this animated visualisation of the most common shot positions over the past two decades. Thanks to the movement towards serious analytics in sport (the moneyball effect) the NBA world realised over the past 10 years that 3 points is more than 2 points, leading to a significant increase in 3-pointers being taken. While the result is completely expected, this is a great use-case for an animation to show the complete abolishment of mid-range jumpers in the NBA as it unfolded over the past few years.
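The "3 is more than 2" logic comes down to expected points per shot. Here is a small sketch of that calculation. The make rates are hypothetical round numbers picked for illustration, not figures from the visualisation or from league statistics.

```python
def expected_points(shot_value, make_rate):
    """Average points scored per attempt for a given type of shot."""
    return shot_value * make_rate


# Hypothetical make rates, for illustration only.
mid_range = expected_points(shot_value=2, make_rate=0.40)
three_pointer = expected_points(shot_value=3, make_rate=0.36)

# The three-point make rate needed to match the mid-range jumper.
break_even_rate = mid_range / 3

print(f"Mid-range jumper: {mid_range:.2f} points per shot")     # 0.80
print(f"Three-pointer: {three_pointer:.2f} points per shot")    # 1.08
print(f"Break-even three-point rate: {break_even_rate:.1%}")    # 26.7%
```

With these assumed rates, a shooter could be noticeably worse from behind the arc than from mid-range and still score more per attempt, which is why the mid-range jumper fades away in the animation.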
Another basketball viz, this time inspired by the new Netflix documentary The Last Dance, detailing how the Chicago Bulls spread their salary cap amongst the team members during their 90s dynasty run. What makes this all the more interesting is that the NBA salary cap in 1996 was roughly $24m and in 1997 it was $27m (not much of a cap then, clearly).
This chart, which is essentially a conditional-formatted spreadsheet, tells the story of the diminishing quality of The Simpsons as it has aged. Any post that spawns a type of visualisation that inspires many copies clearly taps into something people want to see. This post inspired clones for tonnes of other TV shows including Spongebob Squarepants, Game of Thrones, and the GOAT, Breaking Bad.
That y-axis is a felony crime.
I hope that bringing this selection together in one place has been useful, whether because you found some of it interesting or because it sparked inspiration for new ideas!
This is the first post in a series that will be posted regularly here on the Evoluted blog, so if you found it useful, please check in regularly for new posts or follow me on Twitter, where I’ll announce new releases.