### Pick’n’mix plots

### Follow-up to: Showing distributions

After I wrote about the half-and-half plot, many of you replied with further discussion points, tips, and tutorials. I have tried to collect them here to make them available to everyone.

*More mixed boxplots*

Aaron Ellison @AMaxEll17 brought to our attention that he published a plot in 1993 in which he overlaid the box plot with the data points (see fig 1A). Along with it he published the code, long before GitHub and the like. Aaron was inspired by the just published “Grammar of Graphics” by Wilkinson. He may well have been the first person to publish this kind of plot in a paper.

Today, combined boxplot/data plots are common and easy to make in R with ggplot2. Declan O’Regan @DrDeclanORegan shows us one example in figure 1B. An “exploded” version, where the boxplot and its metrics are barely visible and the focus is on the data points, is shown in figure 1C (provided by the Cystic Fibrosis Gene Therapy group @CFGT_Edinburgh).
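The examples above were made in R, but the same overlay is easy in other environments too. Here is a minimal sketch in Python with matplotlib. The data and group names are made up for illustration, and the jitter width of 0.15 is simply a value that tends to look reasonable, not a recommendation from any of the contributors.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical data: three groups of 40 measurements each
groups = {
    "A": rng.normal(5.0, 1.0, 40),
    "B": rng.normal(6.0, 1.5, 40),
    "C": rng.normal(4.5, 0.8, 40),
}

fig, ax = plt.subplots(figsize=(5, 4))
positions = np.arange(1, len(groups) + 1)

# Boxplot in the background; outlier markers are hidden because
# every data point is drawn anyway
box = ax.boxplot(list(groups.values()), positions=positions,
                 widths=0.5, showfliers=False)

# Overlay the raw data with a little random horizontal jitter
for pos, values in zip(positions, groups.values()):
    x = pos + rng.uniform(-0.15, 0.15, size=values.size)
    ax.scatter(x, values, s=12, alpha=0.6, color="tab:blue", zorder=3)

ax.set_xticks(positions)
ax.set_xticklabels(list(groups.keys()))
ax.set_ylabel("value")
```

Calling `fig.savefig("box_with_points.png")` afterwards writes the figure to a file. For an “exploded” look, one could make the box lines thin and light and the points darker.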

*Box’n’Bee*

There is also the boxplot overlaid with a bee-swarm plot. Here, individual data points are ordered and arranged in a U-shape instead of being placed randomly. An example is shown by Darren Wisniewski @Dmwizzle, who made this in ggplot2 (fig 2A).

But beware of the bee-swarm: the ordered arrangement of the data (most commonly U- or A-shaped) may introduce visual artifacts. Personally, I draw a mental line through the U-shaped branches and straighten it to understand the data. This is error-prone and of course a waste of time when the line could equally be straight. In figure 2B I have plotted the same data as a bee-swarm plot and as a dot plot for a direct comparison. I feel it is easier to see how the data is distributed in the data/dot plot. *(Data: gene expression of RNAs that are localized at the poles in the fruit fly oocyte. RNAs that localize at the posterior for days have higher expression than RNAs at the anterior pole that are localized just for a few hours).*
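To see where the branches come from, it helps to look at how a swarm layout decides the horizontal position of each point. The sketch below is a deliberately simplified greedy placement written in plain Python. It is not the algorithm used by ggplot2 extensions or seaborn, and the function name `swarm_offsets` and the example values are made up for illustration.

```python
def swarm_offsets(values, diameter):
    """Return one horizontal offset per value so that no two markers overlap.

    Points are placed in ascending order of value. Each point takes the
    offset closest to the centre line (0, +step, -step, +2*step, ...)
    at which its marker does not collide with any marker placed so far.
    """
    step = diameter / 2
    order = sorted(range(len(values)), key=lambda i: values[i])
    placed = []                      # (x, y) of markers already positioned
    offsets = [0.0] * len(values)

    for i in order:
        y = values[i]
        k = 0
        while True:
            candidates = [0.0] if k == 0 else [k * step, -k * step]
            free = [
                x for x in candidates
                if all((x - px) ** 2 + (y - py) ** 2 >= diameter ** 2
                       for px, py in placed)
            ]
            if free:
                offsets[i] = free[0]
                placed.append((free[0], y))
                break
            k += 1
    return offsets


# Three close values and one isolated value, marker diameter 0.5
example = [1.0, 1.1, 1.2, 3.0]
print(swarm_offsets(example, diameter=0.5))  # [0.0, 0.5, -0.5, 0.0]
```

The horizontal position of a point depends only on its neighbours and on the order of placement, not on anything in the data itself. That is why the shape of the branches carries no information, whereas a dot plot or a jittered plot does not suggest any such structure.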

*Histogram & boxplot*

Robert Grant @robertstats pointed us to an interesting histogram overlaid with statistical summaries that was originally designed by @f2harrell (there is a tutorial with R), see figure 3. The horizontal histogram has particularly small bins, with the median and quartiles indicated below it (a bit too small for my taste).

*Violin and data*

Of course, there are also mixed plots with violin plots. Violin plots themselves are most often already overlaid with a boxplot. Another possibility, from Wouter de Coster @wouter_decoster, is to mix the violin plot with a bee-swarm plot, which he implemented with Python's seaborn (fig 4A). As you know, I personally would have preferred the actual data instead of the bee swarm, see above.

Joey Burant @jbburant put forward the idea of mixing data points, shown as a histogram, with half of a violin plot (see figure 4B).

Joey also nicely documented how to do this on GitHub.

When the histo-violin is flipped to horizontal, it looks like a raining cloud. Rogier Kievit @rogierK therefore named it the raincloud plot and has just deposited a preprint article about this plot type and its implementation. For matplotlib users, Sara Popham @sara_poppop posted a guide on GitHub.
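To give an idea of what goes into such a plot, here is a minimal matplotlib sketch of a half violin with the data points next to it. It is not taken from any of the guides mentioned above. The data are made up, the points are simply jittered rather than stacked into a histogram, and the half violin is produced by clipping the outline that `violinplot` returns, which is one common workaround and not the only way to do it.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=2)
values = rng.normal(10.0, 2.0, 80)   # hypothetical measurements
pos = 1.0                            # x position of the group

fig, ax = plt.subplots(figsize=(3, 4))

# Full violin first; min/max/median bars are switched off
parts = ax.violinplot([values], positions=[pos], widths=0.8,
                      showextrema=False)

# Keep only the right half by clipping the outline at the centre line
for body in parts["bodies"]:
    verts = body.get_paths()[0].vertices
    verts[:, 0] = np.clip(verts[:, 0], pos, np.inf)
    body.set_alpha(0.6)

# Raw data on the left side, with a little random spread
x_points = pos - 0.08 - rng.uniform(0.0, 0.25, size=values.size)
ax.scatter(x_points, values, s=10, color="black", alpha=0.6)

ax.set_xticks([pos])
ax.set_xticklabels(["group"])
ax.set_ylabel("value")
```

Drawing the same thing horizontally, with the density above and the points below, gives the raincloud look.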

*In Excel…*

Jorge Camoes @wisevis shows us that such plot types are also possible in Excel: he shows us a horizontal boxplot with data points above, taken from his book (fig 5). I generally like horizontal boxplots, especially when comparing lots of categories! Jon Schwabish @jschwabish re-created the half-and-half plot in Excel. Both are phenomenal; I had no idea Excel could do this much!

*… and MATLAB*

And finally, MATLAB users, rejoice: it is also possible to make mixed plots in your favorite environment. Matt Cooper @mattguycooper suggests using the ‘notboxplot’ function from the File Exchange, which creates ‘box plots’ with dot plots overlaid. This gives you plots like the one shown in figure 6.

*More: Tutorials and interactive plots*

Bogdan Micu @trizniak points us to a nice interactive violin plot: https://plot.ly/r/violin/.

A couple of tutorials: Frank Soboczenski @h21k shows us the code for making half-and-half boxplots in R: https://github.com/h21k/R/blob/master/snippets/half_box.R. James Rooney @jpkrooney pointed us to a great tutorial for making violin plots with ggplot2 by Katherine Wood @kathmwood: https://inattentionalcoffee.wordpress.com/2017/02/14/data-in-the-raw-violin-plots/. And @lisadebruine compares different plot types using the same data: https://debruine.github.io/plot_comparison.html.