Non-zero baselines: the good, the bad, and the ugly

Or: who gets the most ERC funding?

Of all the charts ridiculed at WTFviz, many get shamed for lacking a zero-baseline. When teaching DataViz, zero-baselines are invariably a topic of debate, even in the quietest groups. To participants, the rules for when zero is necessary to understand the data, and when it may be happily omitted, are often unclear. So let’s quickly recap.

Bar charts: always show zero

When amounts are encoded by length, as in bar charts, the zero-baseline is critical to our intuitive understanding of the data. A bar twice as long means the category has twice the count. The number of prestigious ERC starting grants to German host institutes roughly doubled from 2013 to 2014, which is correctly shown by a bar twice as long in (A).

If, however, the y-axis does not start at zero, as in (B), the increase from 2013 to 2014 is hugely over-emphasized and looks roughly 4 to 5 times as high. In example (C) the baseline starts above the first data point and misleads the audience into thinking that only Germany received ERC funding in 2018.
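
The distortion can be worked out with a little arithmetic. The sketch below is only an illustration: the counts (30 and 60) and the baselines (22 and 35) are made-up values, not the actual ERC numbers, and the helper names `drawn_length` and `apparent_ratio` are hypothetical.

```python
def drawn_length(value, baseline=0.0):
    """Length of the bar as drawn when the axis starts at `baseline`."""
    return max(value - baseline, 0.0)


def apparent_ratio(a, b, baseline=0.0):
    """How many times longer bar b looks than bar a; None if bar a vanishes."""
    length_a = drawn_length(a, baseline)
    if length_a == 0:
        return None
    return drawn_length(b, baseline) / length_a


# Hypothetical grant counts, chosen only so that the second is double the first.
grants_2013, grants_2014 = 30, 60

# Zero-baseline, as in (A): the bars keep the true ratio.
print(apparent_ratio(grants_2013, grants_2014))               # 2.0
# Truncated axis, as in (B): the same doubling looks much larger.
print(apparent_ratio(grants_2013, grants_2014, baseline=22))  # 4.75
# Baseline above the smaller value, as in (C): that bar disappears entirely.
print(apparent_ratio(grants_2013, grants_2014, baseline=35))  # None
```

With these assumed numbers, a true doubling is drawn as a 4.75-fold difference once the axis starts at 22, which is the kind of exaggeration seen in (B).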

[Figure: bar charts (A), (B), and (C) with varying y-axis baselines]

Non-zero baselines skew the relative differences between categories and are misleading. (The same applies to axis breaks in bar charts!) Non-zero baselines are often used to save space.

In most cases, however, the chart could simply be shown with less overall height. This option faithfully maintains the relative bar sizes. When reading bar charts we are always interested in relative, not absolute, size differences among our categories. (And I learned that Israel is part of the ERC funding consortium!)

[Figure: bar charts with varying plot height]
The number of submitted ERC grants varies a lot across countries. Varying the physical height of the plot faithfully maintains the relative differences.
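
Why shrinking the plot is harmless can be checked in a few lines. This is a minimal sketch under assumptions: the three values and the pixel heights are invented for illustration, and `bar_heights_px` is a hypothetical helper that maps values onto an axis running from zero to the largest value.

```python
def bar_heights_px(values, plot_height_px):
    """Map values to pixel heights on an axis running from 0 to max(values)."""
    top = max(values)
    return [plot_height_px * v / top for v in values]


values = [30, 60, 120]  # hypothetical counts for three categories

tall = bar_heights_px(values, 400)   # a generous plot height
short = bar_heights_px(values, 100)  # the same chart squeezed to a quarter

print(tall)   # [100.0, 200.0, 400.0]
print(short)  # [25.0, 50.0, 100.0]

# The ratio between any two bars is the same in both versions.
print(tall[1] / tall[0], short[1] / short[0])  # 2.0 2.0
```

Every bar shrinks by the same factor, so the comparison between categories survives, which a truncated axis cannot guarantee.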

Line charts are happy without zero

The situation is entirely different for line charts. We use them to show trends, e.g. an increase or decrease in a category over time. The rate of change is encoded by the slope of the line relative to the horizon. We usually evaluate the slope independently of its distance to zero. For example, seeing the zero is not important for assessing that ERC successes in Germany fluctuate, while the UK and France have stable funding rates. And, no matter where the zero-baseline is: why does the UK have such a curious funding peak in 2012? What happened there!?

[Figure: line charts with varying y-axis baselines]
For understanding trends in line charts, we do not critically need to see the zero baseline.
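
A small check makes the point concrete: in data units, the year-to-year changes that a line encodes do not depend on where the axis starts. The series below is invented for illustration, and `shift` and `yearly_changes` are hypothetical helper names.

```python
def shift(series, baseline):
    """Express each value as its distance from the axis baseline."""
    return [value - baseline for value in series]


def yearly_changes(series):
    """Difference between each value and the one before it."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


success_counts = [12, 15, 11, 16]  # hypothetical yearly values

from_zero = yearly_changes(shift(success_counts, 0))
from_ten = yearly_changes(shift(success_counts, 10))

print(from_zero)  # [3, -4, 5]
print(from_ten)   # [3, -4, 5]
```

Moving the baseline changes how far the line sits from the bottom of the plot, but the ups and downs it traces stay the same. How steep they look on screen still depends on the range of the axis.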

Sometimes showing zero is misleading

Importantly, showing the zero-baseline in line charts may be misleading. For example, plotting human body temperature on a scale from 0 to 100˚C would effectively keep us from seeing a life-threatening increase from 39 to 40˚C in a patient. Similarly, showing global temperatures on a scale from 0 to 120˚C results in a line that looks entirely flat, and this was used by opponents of climate research to hide man-made global temperature changes. An outcry on Twitter swiftly followed.

[Figure: global temperature line chart with a zero-baseline, showing no visible change]
Line chart misleading BECAUSE of a zero-baseline. Tweet: @EcoSenseNow, 23rd April 2019
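
How much a change is flattened can be expressed as the share of the axis it occupies. The sketch below uses the body temperature example from the text. The narrower 35 to 42˚C range is an assumption chosen for comparison, and `share_of_axis` is a hypothetical helper.

```python
def share_of_axis(change, axis_min, axis_max):
    """Fraction of the axis height that a change of the given size occupies."""
    return abs(change) / (axis_max - axis_min)


fever_rise = 40 - 39  # the one-degree increase from the text, in ˚C

# On a 0 to 100˚C axis the rise uses 1% of the available height.
print(share_of_axis(fever_rise, 0, 100))  # 0.01

# On an assumed 35 to 42˚C axis the same rise uses one seventh of the height.
print(round(share_of_axis(fever_rise, 35, 42), 3))  # 0.143
```

The data are identical in both cases. Only the axis range decides whether the reader can see the change at all.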

Distributions: it depends on the data

When showing statistical summaries, the zero again usually does not need to be visible. We are interested in the shape of the data (normal or bimodal), its median, and outliers. How far the majority of data points are from zero is usually not of interest, as long as all data are shown. Instead, the relative distances of individual data points from each other are key.

Good practice for non-zero baselines

When using non-zero baselines, a common practice is to unlink the x- and y-axes. For educational purposes I cut data from the example on the right. This is dangerous territory and in some cases may mislead the audience. In this example, I effectively hide the early lead of the UK in winning ERCs!

[Figure: line charts with varying y-axis and size]
One possibility to alert readers to a non-zero baseline in your charts.

Data: European Research Council, https://erc.europa.eu/projects-figures/statistics, starting grants from 2007-2018.

 
