Or: who gets the most ERC funding?
Of all the charts ridiculed at WTFviz, many get shamed for lacking a zero-baseline. When teaching DataViz, zero-baselines are invariably a topic of debate, even in the quietest groups. To participants, the rules for when zero is necessary to understand the data, and when it may happily be omitted, are often unclear. So let’s quickly recap.
Bar charts: always show zero
When amounts are encoded by length, as in bar charts, the zero-baseline is critical to our intuitive understanding of the data. A bar twice as long means the category has twice the count. The number of prestigious ERC starting grants awarded to German host institutes roughly doubled from 2013 to 2014, which is correctly shown by a bar twice as long in (A).
If, however, the y-axis does not start at zero, as in (B), the increase from 2013 to 2014 is hugely over-emphasized and looks roughly 4 to 5 times as large. In example (C) the baseline starts above the first data point and misleads the audience into thinking that only Germany received ERC funding in 2018.
Non-zero baselines skew the relative differences between categories and are misleading. (The same applies to axis breaks in bar charts!) Non-zero baselines are often used to save space.
In most cases, however, the chart could simply be drawn with less overall height. This option preserves the relative bar sizes faithfully. When reading bar charts, we are always interested in relative, not absolute, size differences among our categories. (And I learned that Israel is part of the ERC funding consortium!)
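The arithmetic behind the truncated axis, and behind the shorter-chart alternative, can be sketched in a few lines of Python. This is only an illustration: the counts 30 and 60, the baseline of 22, and the function names are made up for the example and are not the actual ERC figures.

```python
def bar_length(value, baseline=0.0):
    """Drawn length of a bar when the value axis starts at `baseline`."""
    return max(value - baseline, 0.0)


def apparent_ratio(smaller, larger, baseline=0.0):
    """How many times longer the larger bar looks than the smaller one."""
    short = bar_length(smaller, baseline)
    if short == 0:
        # Baseline at or above the smaller value: its bar vanishes, as in (C).
        return float("inf")
    return bar_length(larger, baseline) / short


def scaled_length(value, axis_max, chart_height):
    """Bar length on a zero-based axis squeezed into `chart_height` units."""
    return value / axis_max * chart_height


# Hypothetical counts, chosen only so that the second is double the first.
grants_2013, grants_2014 = 30, 60

print(apparent_ratio(grants_2013, grants_2014))               # zero baseline
print(apparent_ratio(grants_2013, grants_2014, baseline=22))  # truncated axis

# Halving the chart height shrinks both bars but keeps their ratio.
for height in (10.0, 5.0):
    a = scaled_length(grants_2013, axis_max=60, chart_height=height)
    b = scaled_length(grants_2014, axis_max=60, chart_height=height)
    print(height, b / a)
```

With these made-up numbers the zero-based bars have a ratio of 2.0, while the axis starting at 22 makes the second bar look 4.75 times as long. Shrinking the chart height leaves the ratio at 2.0.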
Line charts are happy without zero
The situation is entirely different for line charts. We use them to show trends, e.g. an increase or decrease in a category over time. The rate of change is encoded by the slope of the line relative to the horizontal. We usually evaluate the slope independently of its distance to zero. For example, seeing the zero is not important for assessing that ERC successes in Germany fluctuate, while the UK and France have stable funding rates. And, no matter where the zero-baseline is: why does the UK have such a curious funding peak in 2012? What happened there!?
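One way to see why the distance to zero does not matter for trends: the year-to-year changes stay the same when every value is measured from a different baseline. A minimal sketch with a hypothetical series (not ERC data) and an arbitrarily chosen baseline of 35:

```python
def yearly_changes(values):
    """Differences between consecutive values in a series."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical yearly counts, for illustration only.
series = [40, 55, 48, 60]

# The same series measured from a baseline of 35 instead of zero.
shifted = [value - 35 for value in series]

print(yearly_changes(series))
print(yearly_changes(shifted))
```

Both calls print the same list of changes, so the ups and downs we read from a line chart do not depend on where the axis starts.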
Sometimes showing zero is misleading
Importantly, showing the zero-baseline in line charts may itself be misleading. For example, plotting human body temperature on a scale from 0 to 100 °C would effectively keep us from seeing a life-threatening increase from 39 to 40 °C in a patient. Similarly, showing global temperatures on a scale from 0 to 120 °C results in an entirely flat line, and this was used by opponents of climate research to hide man-made global temperature changes. An outcry on Twitter swiftly followed.
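To put a number on the body temperature example, we can ask what share of the axis height the change from 39 to 40 occupies. The sketch below compares the 0 to 100 scale with a tighter one; the 35 to 42 range and the function name are assumptions made for this illustration.

```python
def share_of_axis(start, end, axis_min, axis_max):
    """Fraction of the axis height covered by a change from `start` to `end`."""
    return abs(end - start) / (axis_max - axis_min)


full = share_of_axis(39, 40, axis_min=0, axis_max=100)
tight = share_of_axis(39, 40, axis_min=35, axis_max=42)  # assumed clinical range

print(f"0 to 100: {full:.1%} of the axis height")
print(f"35 to 42: {tight:.1%} of the axis height")
```

On the 0 to 100 scale the one-degree rise takes up 1% of the axis height. On the assumed 35 to 42 scale it takes up about 14%.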
Distributions: it depends on the data
When showing statistical summaries, again the zero usually does not need to be visible. We are interested in the shape of the data (normal or bimodal), its median, and its outliers. How far the majority of data points are from zero is not usually of interest, as long as all data are shown. Instead, the relative distances of individual data points from each other are key.
Good practice for non-zero baselines
When using non-zero baselines, the common practice is to unlink the x- and y-axes. For educational purposes, I cut off data in the example on the right. This is dangerous territory and may in some cases mislead the audience. In this example, I effectively hide the early lead of the UK in winning ERCs!
Data: European Research Council, https://erc.europa.eu/projects-figures/statistics, starting grants from 2007 to 2018.