helena * jambor

scientist interested in RNA, genomics and science visualizations

Month: August, 2016

What’s next after your postdoc?

Part 2 of “Pie or no Pie”.

[Figure: before and after]

In my last post I discussed why pie charts are hard to read and therefore better avoided. Today, I offer a real-life example and answer the question of all scientists in training: What’s next after my postdoc? And I have the answer (at least for those of you working at the Max Planck Institute in Dresden)!

According to the numbers collected in the fifteen years since the institute was founded, most of you, as you suspected, will not become professors. But most, 74%, will remain closely connected to academic science, whether as a staff scientist, on a second postdoc, or in administration. If you came to the MPI to go into industry, bad luck! Your chances are low, as only 11% end up in pharma (maybe because the tech industry in Dresden is not very strong yet?). Many of those who work in science-related businesses become editors or consultants. You don’t fall into any category? Me neither, and we are in the category “other”, which really is a miscellaneous category of people who are on parental leave, unemployed, working at a bank, or freelancing.

So let’s think about how best to present the data. The default for showing percentages of a whole is often the pie chart. But we immediately see a problem: we would like to show the three large categories, academia, science-related businesses, and “other”, but we also want to split each category into its subgroups, and there are 12 of them! That is far too many categories for a pie chart to remain comprehensible. And, to help the reader, each subcategory must be labeled, resulting in a pie chart completely cluttered with category names and data labels, none of them nicely aligned.

[Figure: the cluttered pie chart]

Let’s try the column or bar chart. From last week you might remember that it is easier to use a horizontal bar chart if you have long category names: this gives you plenty of space for the labels, which really is a plus in this case!

A bar chart also lets us add another layer of information, for instance using color. To visually group the subcategories into the large categories, we use three distinct colors for ‘academia’, ‘science-related’, and ‘other’. Need help choosing colors? ColorBrewer is a fantastic resource! (For our plot: we have three data classes and they are all qualitatively different.) Now, at a glance we can see the large categories and all subcategories! In addition, I have added a little text to indicate the name and overall percentage of each of the three large categories.

[Figure: the color-coded horizontal bar chart]
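
If you script your plots, the color grouping can be written down in a few lines. The following is only a minimal Python sketch: the subcategory names are my shorthand for some of the groups mentioned in this post, their assignment to the three large categories is an assumption for illustration (it is not the full list of 12), and the hex values are the three-class qualitative “Dark2” scheme from ColorBrewer.

```python
# Three-class qualitative scheme "Dark2" from ColorBrewer.
CATEGORY_COLORS = {
    "academia": "#1b9e77",
    "science-related": "#d95f02",
    "other": "#7570b3",
}

# Hypothetical, incomplete grouping for illustration only:
# the names and their assignment are assumptions, not the real data set.
SUBCATEGORY_TO_CATEGORY = {
    "professor": "academia",
    "staff scientist": "academia",
    "second postdoc": "academia",
    "administration": "academia",
    "pharma": "science-related",
    "editor": "science-related",
    "consultant": "science-related",
    "parental leave": "other",
    "bank": "other",
    "freelance": "other",
}


def bar_colors(subcategories):
    """Return one color per bar, so bars of the same large category share a color."""
    return [CATEGORY_COLORS[SUBCATEGORY_TO_CATEGORY[name]] for name in subcategories]


labels = ["staff scientist", "pharma", "editor", "freelance"]
print(bar_colors(labels))
```

The resulting list has one entry per bar and can be handed to your plotting tool of choice, for example as the `color` argument of matplotlib’s `barh`.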

One reason people love pie charts is that they visually present parts of a whole (although our eyes more often than not struggle to make that out!). To allow the audience to clearly make out parts of a whole, we can use a little trick and extend the bars to 100% (or here 50%) and fill each bar with color according to its percentage. I personally think there are too many categories here and that the empty bars create a lot of line clutter on the right-hand side. Another possibility is to show a stacked bar, but a single one looks a bit lonely. I’d use this to compare, for example, the data per year.
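
The trick of extending each bar to the full scale is simple arithmetic: the colored part is the percentage, and the empty part is whatever is left up to the end of the scale. Here is a plain-text sketch in Python using the two figures quoted above (74% and 11%). The function name, the bar width of 20 characters, and the `#` and `.` symbols are my own choices for illustration.

```python
def filled_bar(percent, scale=100, width=20):
    """Draw a bar spanning the full scale; '#' marks the share, '.' the remainder."""
    if not 0 <= percent <= scale:
        raise ValueError("percent must lie between 0 and scale")
    filled = round(width * percent / scale)
    return "#" * filled + "." * (width - filled)


# The two percentages quoted in this post.
for label, percent in [("close to academia", 74), ("pharma", 11)]:
    print(f"{label:<18}|{filled_bar(percent)}| {percent}%")
```

Passing `scale=50` gives the shortened axis mentioned above, which leaves less empty bar on the right.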

Finally, here is a wonderful compilation of atrocious pie charts, and I hope you NEVER use one again.

Pie or no pie?


Favorite Pie!

“Death to the pie chart” is a battle cry of the data visualization expert Cole Nussbaumer, who works in the Bay Area, mainly with business clients. I learned a lot about visual communication of data from her blog. After all, it does not matter whether plots are about business revenue or bacterial growth!

While pie charts are not outright wrong, they are very, very hard to read accurately. In my classes I ask students to estimate the percentages of categories from either a pie chart or a bar chart. Invariably, they do better when getting the information from the bar chart!* And very often, the error when reading from a pie chart is 10% or more! Would you want to force your audience, be it your busy boss or the readers of your paper or your next grant, to make guesses about your numbers, or would you rather be sure they easily grasp them from a well-executed bar chart? The longer it takes your audience to understand your figures, the less likely they will be to want to continue reading.

I have an exercise for you! In the pie chart below, is blue or red bigger? How long did you take to figure it out?

[Figure: pie chart]

Now double-check your result using the bar chart, and notice how much faster it was to read! And there you already have the answer: with the pie chart you actually have to guess, while with the bar chart you can read off the answer pretty precisely from the provided axis!

[Figure: column chart]

And, surprise surprise, if I turn the bar chart by 90 degrees, the category names sit right next to the bars and we can read each name together with its bar length, just like we read text. This helps the overall readability tremendously and is of course even more helpful if categories have more complicated labels such as “Wildtype animal treated with DDSX5 (5mM)”.

[Figure: horizontal bar chart]

Solution

The real numbers are A = 18%; B, C, and E = 20% each; and D = 22%. Please comment on how close you got. I am collecting the data; after all, I am still a scientist!
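
For the curious, a little arithmetic shows what your eyes were up against. This small Python sketch (the variable and function names are mine) converts the solution values into what each chart actually draws: a wedge angle for the pie, and a bar length for the bar chart.

```python
# Solution values from the exercise, in percent.
shares = {"A": 18, "B": 20, "C": 20, "D": 22, "E": 20}


def wedge_angle(percent):
    """Angle in degrees that a pie chart assigns to a share of the whole."""
    return percent * 360 / 100


assert sum(shares.values()) == 100  # the parts add up to the whole

for name, percent in shares.items():
    # Bar: one '#' per percentage point, all starting from a common baseline.
    print(f"{name} {wedge_angle(percent):5.1f} deg  {'#' * percent} {percent}%")
```

The smallest and the largest wedge differ by only 14.4 degrees, and in the pie each wedge starts at a different angle, whereas the bars all start from the same baseline.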

Come back for my next post on revamping a real example and on how to nicely show percentages with bar charts!

 

PS: Cole’s blog post on pie charts can be found here

* Of course, in each class there is one person who is really good at reading pie charts, and this invariably leads to a lot of discussion. It is wonderful for that person to be good at pie charts, but it still shows that most people have difficulties reading from them. Since figures are targeted at as many people as possible, I recommend making them as readable as you can for as many people as you can! And this includes: do not use the pie chart!