helena * jambor

scientist interested in RNA, genomics and science visualizations

Teaching Figure Design and RNA in Israel

When PhD students invite you to a retreat, it is both an honor and an obligation to go. They primarily invited me to teach about my research on RNA and its cellular localization, but I convinced them that visualization of biological data, my recent passion, is just as important. I ended up teaching both!
I am now somewhere in the clouds on my way back and am left truly impressed: by the wonderful program put together by the PhD students of the SignGene program; by the excellent organization headed by Dhana Friedrich and Alon Appleboim, and their devotion to making an interesting, interactive and innovative program; and by the scientific excellence and intellectual curiosity of all SignGene students!

I left Israel mesmerized by its cultural blend. The WinterSchool (held in pleasant 25 °C sunshine) took place in a modern resort hotel in Elat. While we conferenced, we were surrounded by orthodox Israelis, American families, Russian tourists, Arabs, Poles, Germans, African families, and Japanese travel groups. After an exhausting day of seminars, we gazed from the Israeli beach into Jordan, Saudi Arabia and Egypt, the African and Arabian continental plates touching and slowly sliding along each other underneath us, and remembered all those who were here before: Moses, the Nabateans, the Romans, the silk road traders…

The trip was also personally touching for me. My beloved grandmother, Alice Jambor, had worked for the Israeli embassy in Bonn. She traveled to Israel countless times and loved it passionately. As I loved her passionately, I had long wanted to visit Israel too. While traveling, I kept her close to my heart by wearing a necklace that says “love” in Hebrew, which she gave me when I was a child.


I wish love to this beautiful region. I wish that the bonds between Germany and the state of Israel remain strong, and that those in Germany who question this remain a minority. I believe personal friendships strengthen these bonds and that scientific exchanges, such as the SignGene program, are fantastic starting points!


Information is Beautiful!

If you want inspiration on how to create beautiful and informative figures with your data, I urge you to browse the Information is Beautiful site.

The categories are Infographic, Data visualization, Interactives, Data journalism, websites and projects. What is missing is a scientific category that would highlight how cutting-edge scientific findings are told in a compelling and clear way – if such a category existed, students could submit entries, similar to the iGEM competition!

PS you can still vote TODAY on the shortlist of the best visualization!!!


how does she do it!!!!????!?!

Although it is meant as a compliment, I still despise being asked in awe how I manage to raise kids and have a (science) career. The simple answer is: I try. And fail. And try again. Just like thousands of other working mothers. I always wonder if people would dare to ask this question of male scientists, who also often combine family and career.

I never wanted to comment on working mothers – others have said it all before, and better. But a recent article in the local newspaper did prompt me to write a reply to the editor. The article portrays two women who, with support from the Technical University of Dresden, combine family and a science career.

What the article fails to notice, however: it portrays only women! And they are at the postdoc stage, meaning they are far from a successful career as independent scientists, which is tied to a professorship. The article also fails to mention the reason no female professor was interviewed: the institute in fact has 0% female professors (at full and assistant level!) – a fact that I do find noteworthy in the context of this article! As if that were not enough, I kid you not, the support from the TU Dresden is organized through the “office for chronically ill, disabled, and women” – clearly, 50% of the population are considered just another minority.


The printed shortened reply:

My entire letter here (translated from the German):

Letter to the editor, comments on „Zwischen Labor und Familie“ (“Between Lab and Family”), SZ, 22 August 2016/Campus, Jana Mundus.

The article paints a very pretty picture of two women scientists at the CRTD institute, which belongs to the Faculty of Science of the TU Dresden. What is completely missing, however, is even a hint of critical engagement with the situation of women in science. Why were more female professors not given a voice? There are none!

I am a scientist and a mother, and over the past years I have had to watch one after another of my talented, ambitious and successful female colleagues give up. The reasons are manifold, but in the end there is the famous glass ceiling, which women often find hard to get through. And all the “leaning in” (see Sheryl Sandberg’s book Lean In) is simply not enough to break through the glass ceiling. Yet to do independent research, the goal is always a professorship or an equivalent position at a research institute. Both researchers described in the article are still far from that, or will have a hard time getting there, because they are stuck in a junior position (postdoc) which, if it is to serve as a springboard to a professorship, should usually be completed before one’s fortieth birthday (after all, one also has to be a junior professor for a while before being appointed full professor!).

The institute where the two researchers work, the CRTD, does not shine with a high proportion of women. On the contrary, since the departure of Professor Elly Tanaka there is NOT ONE woman among the professors, or even among the junior professors. The directly neighboring institutes are no different: at BIOTEC just two of 14 group leaders are women. The third institute on Tatzberg, B-cube, likewise has: NOT ONE woman. All in all, the proportion of women at the Center for Molecular and Cellular Bioengineering, to which the three institutes belong, is thus just over 5%.

Will this change? Certainly not. How do I know? All junior professorships at the institutes mentioned were filled with men, including the most recent junior group leader positions. This lowers the proportion of women further and cements it at a low level in the long term: how is one supposed to fill a professorship with a woman if there are already none among the junior professors? This low proportion of women is a clear disregard of the DFG’s „Forschungsorientierte Gleichstellungsstandards“ (research-oriented standards on gender equality), agreed as long as 10 years ago, and of the research the DFG funds!

Is there no pressure from the university leadership, from politics, from the supervisory board? The supervisory board (in science: scientific advisory board) of the CRTD and BIOTEC is 100% male because, unlike in DAX-listed companies, there is no quota for women here! The pressure from politics, despite a prominent woman, Eva-Maria Stange, at the head of the science portfolio, is not enough. And the TU Dresden? As long as women’s concerns are handled by the office for “the chronically ill, the disabled, and women” (!!!), the impression will solidify that women are just another minority, and not excellent scientists at the center of the university leadership’s interest. But: we are not a minority, we are 50%, in biology degree programs often even 60–70%, and in the long run we would please like to hold 50% of the professorships too! Why? To paraphrase Justin Trudeau: Because it’s 2016.

What’s next after your postdoc?

Part 2 of “Pie or no Pie”.


In my last blog I discussed why pie charts are hard to read and therefore better avoided. Today, I offer a real-life example and answer the question of all scientists in training: What’s next after my postdoc? And I have the answer! (At least for those of you working at the Max Planck in Dresden!)

According to the numbers collected in the fifteen years since the institute was founded, most of you, as you suspected, will not become professors, but most, 74%, will remain closely connected to academic science, as a staff scientist, on a second postdoc, or in administration. If you came to MPI to go into industry, bad luck! Your chances are low, as only 11% end up in pharma (maybe because the tech industry in Dresden is not very strong yet?). Many who work in science-related businesses become editors or consultants. You don’t fall into any category? Me neither, and we are in the category “other”, which really is a miscellaneous category of people on parental leave or unemployed, working at a bank or freelancing.

So let’s think about how best to present the data. The default for showing percentages of a whole is often the pie chart. But we immediately see a problem: we would like to show the three large categories, academia, science-related businesses, and “others”, but we also want to split up each category into its subgroups, and there are 12 of them! This means the pie chart has far too many categories to comprehend. And, to help the reader, each subcategory must be labeled, resulting in a pie chart completely cluttered with category names and data labels, none of them nicely aligned.


Let’s try the column or bar chart. From last week you might remember that it is easier to use horizontal bars if you have long category names: this gives you plenty of space for the labels, which really is a plus in this case!

With a bar chart we can then add another layer of information, for instance by using color. To visually group subcategories into the large categories, we use three distinct colors for ‘academia’, ‘science-related’, and ‘other’. Need help choosing colors? Colorbrewer is a fantastic resource! (For our plot: we have three data classes and they are all qualitatively different.) Now, with one glance we can see the large categories and all subcategories! In addition, I have added a little more text to indicate the name and overall percentage of the three large categories.
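For those who script their plots, the color-grouping idea can be sketched in Python with matplotlib. Treat this as a rough sketch under assumptions: only the 74% total for academia and the 11% for pharma come from the text above, all other percentages are invented placeholders, the subcategory list is shortened, and the function names are made up for this example. The three hex colors are the 3-class qualitative “Dark2” scheme from Colorbrewer.

```python
# Placeholder data: only the 74% academia total and the 11% for pharma come
# from the post; every other number is invented for illustration.
CAREERS = [
    # (subcategory, large category, percent)
    ("Second postdoc", "academia", 24),
    ("Staff scientist", "academia", 22),
    ("Professor", "academia", 20),
    ("Administration", "academia", 8),
    ("Pharma", "science-related", 11),
    ("Editor", "science-related", 4),
    ("Consultant", "science-related", 3),
    ("Other", "other", 8),
]

# Colorbrewer "Dark2", 3-class qualitative scheme: one color per large category.
GROUP_COLORS = {
    "academia": "#1b9e77",
    "science-related": "#d95f02",
    "other": "#7570b3",
}


def group_totals(rows):
    """Sum the percentages of all subcategories within each large category."""
    totals = {}
    for _name, group, percent in rows:
        totals[group] = totals.get(group, 0) + percent
    return totals


def ordered_rows(rows):
    """Keep each large category together, largest subcategory first."""
    group_rank = {group: i for i, group in enumerate(GROUP_COLORS)}
    return sorted(rows, key=lambda row: (group_rank[row[1]], -row[2]))


def plot_careers(rows, path="careers.png"):
    """Draw a horizontal bar chart with one color per large category."""
    # Imported here so the data helpers above also work without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    rows = ordered_rows(rows)
    labels = [name for name, _group, _percent in rows]
    values = [percent for _name, _group, percent in rows]
    colors = [GROUP_COLORS[group] for _name, group, _percent in rows]

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.barh(labels, values, color=colors)
    ax.invert_yaxis()  # first row at the top, like reading a list
    ax.set_xlabel("Share of alumni (%)")
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)  # less line clutter
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Calling `plot_careers(CAREERS)` writes the chart to `careers.png`; `group_totals(CAREERS)` gives the overall percentages that can be written next to each color group.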


One reason people love pie charts is that they visually present parts of a whole (although our eyes more often than not struggle to make that out!). To allow the audience to clearly make out parts of a whole, we can use a little trick and extend the bars to 100% (or here 50%) and fill each bar with color according to its percentage. I personally think there are too many categories here and that the empty bars create a lot of line clutter on the right-hand side. Another possibility is to show stacked bars, but a single one looks a bit lonely. I’d use this to compare, for example, the data per year.

Finally, here is a wonderful compilation of atrocious pie charts, and I hope you NEVER use one again.

Pie or no pie?


Favorite Pie!

“Death to the pie chart” is a battle cry of the data visualization expert Cole Nussbaumer, who works in the Bay Area mainly with business clients. I learned a lot about visual communication of data from her blog. After all, it does not matter if plots are about business revenue or bacterial growth!

While pie charts are not outright wrong, they are very, very hard to read accurately. In my classes I ask students to estimate the percentage of categories read from either a pie chart or a bar chart – invariably, they do better with getting the information from the bar chart!* And very often, the error when reading from a pie chart is 10% or more! Would you want to force your audience, be it your busy boss or the readers of your paper or of your next grant, to make guesses about your numbers, or would you rather be sure they easily grasp them from a well-executed bar chart? The longer it takes your audience to understand your figures, the less likely they will be to want to continue reading.

I have an exercise for you! In the pie chart below, is blue or red bigger? How long did you take to figure it out?

Now double-check your result using the bar chart – and notice how much faster it was to read! And there you already have the answer: in the pie chart you actually have to guess, while in the bar chart you can read off the answer pretty precisely from the provided axis!


And, surprise surprise, if I turn the bar chart by 90°, the category names are right next to the bars and we can read the names together with the bar lengths just like we read text. This helps the overall readability tremendously and is of course even more useful if categories have more complicated labels such as “Wildtype animal treated with DDSX5 (5mM)”.



The real numbers are A = 18%, B, C, E = 20%, D = 22%. Please comment how close you got, I am collecting the data, after all I am still a scientist!
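To see how little is needed for a readable bar chart, the real numbers can even be drawn as plain text bars. Here is a minimal Python sketch (the function names are made up for this example): each category gets one `#` per percentage point, with the name at the start of the bar and the value at its end.

```python
# The real shares from the exercise above.
SHARES = {"A": 18, "B": 20, "C": 20, "D": 22, "E": 20}


def text_bars(shares):
    """Return one line per category: name, one '#' per percent, then the value."""
    return [f"{name} {'#' * percent} {percent}%" for name, percent in shares.items()]


def largest(shares):
    """Name of the category with the biggest share."""
    return max(shares, key=shares.get)


if __name__ == "__main__":
    print("\n".join(text_bars(SHARES)))
    print("Largest:", largest(SHARES))
```

Even in this crude form, the two-point difference between A and B, or between B and D, is visible as a difference in length, which is exactly what is so hard to judge from the angles of a pie.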

Come back for my next blog on revamping a real example and also on how to nicely show percentages with bar charts!


PS Cole’s blog on pie charts can be found here

* Of course, in each class there is one person who is really good at reading pie charts, and this invariably leads to a lot of discussion. It is wonderful for that person to be good at pie charts, but it still shows that most people have difficulties reading from a pie. Since figures are targeted at as many people as possible, I recommend making your figures as readable as possible for as many people as you can! And this includes: do not use the pie chart!

This week’s DataViz on twitter

This was a great week on twitter for an RNA scientist moonlighting as a data visualization expert. There are a couple of mantras that I keep on repeating, and they all came up this week.

1) In praise of drawings!

A simple drawing is better than a complicated diagram. And these days, it is super simple to draw on the iPad – even unskilled artists can go a long way! Or use a whiteboard and include a picture of it in your talk.

This is a wonderful drawing of expansion microscopy by Christophe Leterrier (@christlet):


2) Avoid 3D!

Unless you are making an interactive web-based animation, 3D is very hard to read and way too often results in confusion or even misleads the reader! Avoid it at all costs. See my recent blog-post on one possible work-around. Here is a recent example that circulated in the RNA-twitterverse by RNAseqblog about usage of RNAseq.


3) Time = line.

If you want to show changes over time, time almost always goes on the x-axis! A line graph works best if you have many time-points. When dealing with only two time-points, as in this case, the go-to chart is the slope chart!


Solution: turn bars into lines, rotate by 90°, simplify by using colors strategically and, finally, make straightforward labels! PS I hope Dan Graur does not mind me using his graph as an example 🙂
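A slope chart along these lines can be sketched in Python with matplotlib. The values below are hypothetical and are not the data from the graph above; the function names and the choice to color only the series with the largest change are assumptions made for this example.

```python
# Hypothetical values at two time-points; not the data from the tweeted graph.
VALUES = {
    "Group A": (12.0, 30.0),
    "Group B": (25.0, 27.0),
    "Group C": (18.0, 15.0),
}


def biggest_change(values):
    """Name of the series that changes most between the two time-points."""
    return max(values, key=lambda name: abs(values[name][1] - values[name][0]))


def plot_slopes(values, labels=("Before", "After"), path="slope.png"):
    """Draw one line per series between two time-points on the x-axis."""
    # Imported here so biggest_change() also works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    highlight = biggest_change(values)
    fig, ax = plt.subplots(figsize=(4, 5))
    for name, (start, end) in values.items():
        # Color only the main message, keep everything else grey.
        color = "#d95f02" if name == highlight else "grey"
        ax.plot([0, 1], [start, end], color=color, marker="o")
        # Label each line directly at its right end instead of using a legend.
        ax.text(1.05, end, name, color=color, va="center")
    ax.set_xticks([0, 1])
    ax.set_xticklabels(list(labels))
    ax.set_xlim(-0.2, 1.6)  # leave room for the labels on the right
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Calling `plot_slopes(VALUES)` writes the chart to `slope.png`. With more than a handful of series the direct labels start to overlap, and a line graph or small multiples is probably the better choice.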

4) Faceting is always a solution.

If you can’t solve a problematic graph, try faceting, aka small multiples, aka many little similar graphs. I used it for example here. Faceting is really easy for people using R, but it is, like so many things, also possible in Excel. I am not an evangelist for either; both have their value. What matters most is that you try making better graphs regardless of the program you use to create them! Have a look at how to do faceting with Excel:
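Outside of Excel and R, the same idea can be sketched in a few lines of Python with matplotlib. This is a minimal sketch with hypothetical data and made-up function names: every series gets its own small panel, and all panels share their axes so the reader only has to read them once.

```python
import math

# Hypothetical measurements over time, one list per condition.
SERIES = {
    "Condition 1": [1, 3, 2, 5, 4],
    "Condition 2": [2, 2, 3, 3, 5],
    "Condition 3": [5, 4, 4, 2, 1],
    "Condition 4": [1, 1, 2, 2, 2],
    "Condition 5": [3, 5, 3, 5, 3],
}


def facet_shape(n_panels, n_cols=3):
    """Rows and columns needed to lay out n_panels small plots."""
    if n_panels < 1:
        raise ValueError("need at least one panel")
    n_cols = min(n_cols, n_panels)
    return math.ceil(n_panels / n_cols), n_cols


def plot_facets(series, n_cols=3, path="facets.png"):
    """One small panel per series, all sharing the same x- and y-axis."""
    # Imported here so facet_shape() also works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    n_rows, n_cols = facet_shape(len(series), n_cols)
    fig, axes = plt.subplots(
        n_rows, n_cols, sharex=True, sharey=True, squeeze=False,
        figsize=(2.5 * n_cols, 2.0 * n_rows),
    )
    flat_axes = axes.ravel()
    for ax, (name, ys) in zip(flat_axes, series.items()):
        ax.plot(range(len(ys)), ys, color="black")
        ax.set_title(name, fontsize=9)
    for ax in flat_axes[len(series):]:
        ax.set_visible(False)  # hide unused panels in the last row
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

The shared axes (`sharex=True, sharey=True`) are the important design choice here: without them each panel gets its own scale and the panels can no longer be compared at a glance.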








By all means: avoid 3D!

You have so much nice data you want to show, but sadly only one flat piece of paper. Are 3-dimensional graphs a good solution? Quick answer: No, never, ever. Why, I will explain and show a recent example that I worked on.

We often have the trouble of wanting (or having) to show a lot of data at once: let’s say the body temperature of mice over time, or RNA expression during cell differentiation. If the data points diverge (and are color-coded!), this rapidly results in a highly cluttered graph. As a consequence, the audience has to really “read” the data to decide for themselves what the main message is.

We also have a problem if the data is similar and partially overlaps. Again, the resulting graph is highly unreadable.

What to do? To avoid such overlap in data points we tend to use 3-dimensional graphs: each data series can then be read individually. However, a 3-dimensional plot creates more problems than it solves:

  • A reduction that is shown further along the z-axis (green data!) is visually raised and consequently cannot be fully appreciated. Vice versa, if you wanted to show an increase, it would look much more dramatic if shown in the background. Both are misleading!
  • It is almost impossible to read the value on the y-axis correctly. What is the size of the first green peak? I’d have to use a ruler to assess where the peak would cross the y-axis (3rd tick) and then subtract the height at which the green baseline crosses the y-axis (0.5 ticks). Quite a lot of work!


Show data individually, dare to show it small, the main point will still be clear! And make use of the power of showing multiples – here the reader has to read axes only once, but can apply this knowledge to all of the individual plots at once!

Note: the resulting picture is not bigger than the original and could possibly be further reduced in size while still being fully readable!


PS. To increase clarity I mute the colors of the y-axis and gene model and show them in grey (there is no need to show each exon in a different color!). I then use color ONLY to highlight the main message: a strong reduction of RNA expression in homozygous mutants. By separating the data into three plots I circumvent the problem of having to show them in individual colors.

Evolution and the hourglass

Today, 134 years ago, Darwin died. A suitable day to share a data visualization on evolution!

In the early 1800s, Karl von Baer made a couple of observations that led to what is now commonly known as Baer’s laws of embryology. These state that while embryos of various species look strikingly different at the beginning of embryonic development and as adults, there is one time-point at which the variation is at its minimum and which typifies a phylum: the phylotypic stage. Baer’s observations were later developed further and became known as the developmental “hourglass” (Sander 1983?). *see FOOTNOTE*

The hourglass model states that there are developmental constraints that work against variation – but this lacked, like many evolutionary models, experimental validation. How should one recapitulate or test a process that in nature took billions of years? I fondly remember my teacher Ingo Wallat’s classes on evolution and was therefore delighted, when joining Pavel Tomancak’s lab, that a team around Alex Kalinka was collecting the first molecular proof for the developmental hourglass and von Baer’s 200-year-old theory!

Their paper was published in 2010 (also available here), but I must admit the nature of the evidence was initially hard to grasp for an RNA biologist like myself! I therefore decided to create an illustration of their findings to explain the science to a wider audience – and maybe also to high-school students!





* Haeckel beautifully illustrated a similar idea of his own, that embryonic development is a recapitulation of evolution. In fact, his drawings are most often used to illustrate the developmental hourglass – a great case in point for the power of a wonderful scientific illustration!

Science visualization 3: Redraw Figure 1

Part 3 on “How to accentuate the figures of a scientific paper”:

Re-drawing of Figures

I, just like most scientists, have no formal training in scientific data visualization. I rely on books that are primarily written for journalists dealing with data and for people working in business. Some aspects of data visualization we learn in our statistics and mathematics courses, but how to use color effectively, for example, is rarely part of the curriculum. Apart from reading, I train myself by analyzing the figures in scientific publications. To help you improve too, I show here, for four example figures, how I analyze figures and which changes I suggest.


1. Why is there a line here? It seems its sole purpose is to separate panels A and B. This is not necessary if enough space is left between the panels and the panel contents are clearly grouped. Solution: remove the line and integrate the labels of the schematic model (“Liquid Disordered, Ld” etc.) clearly into panel B – at the moment they float into the space of panel A and are visually cut off from panel B itself! In addition, I have integrated headers directly into the figure – by now most journals accept this!

2. Inconsistency of labels: in panel A we see structures of Cholesterol and Diplopterol but neither is mentioned in B. Solution: For consistency the relationship of cholesterol, diplopterol, sterol and hopanoids should be made clear, especially since these terms are used throughout the paper.

3. Simplify labels 1: is it necessary to explain the arrow and the strike-through of this arrow separately? Solution: explain it more simply!

4. Simplify labels 2: Redundancy between schematic and legend. Solution: Integrate part of the legend into the schematic – this would reduce cluttering and increase the readability of the schematic and also of the legend itself!

5. Color choice of the lipids: It is not clear why some head groups are yellow and green. Is it really necessary to distinguish these features of the lipids by color? Solution: remove all colors on lipids that are not the focus of this study – saturated and unsaturated lipids are easily distinguished based on their strikingly different shapes!




And now the same for Figures 2-4!



Science visualization 3: Redraw Figures 2-4

Part 3 on “How to accentuate the figures of a scientific paper”:

Re-drawing of Figures (2-4):

Figure 2



  1. Layout: The axis is too fat, it is almost more prominent than the data. Typically, I advocate muting it by showing a thin line in grey, for example. If a legend can be placed within the chart area, most often one can simply label the data lines themselves in the corresponding color! That way it takes even less time to read the entire graph.
  2. Color-scheme: For the entire figure set, I have reserved color exclusively for the data on the hopanoid diplopterol (yellow) while the control experiments are shown in shades of grey.
  3. Gridlines: are in 99/100 cases not necessary to guide the reader through the data. However, here they are used to point to the condensation plot on the right. But this takes some effort to find out! I have solved this by unlinking the axes of the monolayer data and the condensation plot.
  4. Axes: It was not immediately obvious that the condensation plot shares the y-axis with the SM monolayer plot. I have unlinked the two plots and added a new axis to the condensation plot. In addition, the error bars are very prominent and in some cases they even hide the data bar.
  5. Bar versus Boxplot: Here the median of several experiments is shown in a bar graph – this would be better shown in a boxplot. Even better, if I had had access, would have been to show the distribution of the actual data (Beyond the bargraph). Or, a more radical solution would be to just state the two numbers! Usually, a plot is not necessary when only two numbers should be compared.
  6. Rotated text: is hard to read, it is almost always worth the space to avoid it!! Here: by having two lines of text! Then one can also remove abbreviations entirely!
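For point 5, the numbers that a simple boxplot displays are easy to compute before deciding whether a plot is needed at all. Below is a minimal Python sketch using only the standard library; the measurements are hypothetical (I did not have the data from the paper), the function name is made up for this example, and the choice of the “inclusive” quartile method is an assumption, since different programs compute quartiles slightly differently.

```python
import statistics

# Hypothetical repeat measurements, not data from the paper.
MEASUREMENTS = [1, 2, 3, 4, 5]


def box_summary(values):
    """Minimum, lower quartile, median, upper quartile and maximum."""
    # "inclusive" treats the values as the full set of measurements,
    # so the quartiles never fall outside the observed range.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


if __name__ == "__main__":
    for name, value in box_summary(MEASUREMENTS).items():
        print(f"{name}: {value}")
```

If all that needs to be compared in the end are two medians, stating them in the text is probably enough; the summary mainly helps to judge whether the spread is worth showing.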




Figure 3





  1. Color scheme: Here, values from measuring membrane packaging are shown. This just shows values on a single scale – hence a single color would be sufficient! And be easier to read! And even if this actually was diverging data that critically needed two colors (above/below a threshold for example), one would and should not choose a rainbow color scale. As documented in many, many, many blog posts and opinion pieces, rainbow colors do not faithfully reveal graded distributions (Rainbow color map still considered harmful!).
  2. Label clearly: new abbreviations are used, but not introduced in the figure itself – again, it is almost always worth the extra space to increase readability. And here, we have a lot of space!
  3. Cluttering: the extra line is supposed to separate figure part A from B and C. See Figure 1: if the spacing and grouping of panel and panel parts is done clearly, there is no need for a separating line.
  4. Order of panels: Figures are “read” just like a text, from left to right. Therefore panel C will be read before panel B. While fixing this is sometimes really tricky, in this case it is easy!
  5. Intersection x/y-axis: as a rule (with few notable exceptions), the x-axis should intersect with the y-axis at zero! Also in this panel, the weight of the axes and lines as well as the color scheme does not match the other figures (but in this case I lack the original data and therefore could not implement changes).
  6. Interrupted axes: interruptions of any axes should best be avoided or at least motivated by the data. In this case, I think it is not necessary to do at all! The plot shows the mean GP index shown in panel A (and the same value for ordered and disordered areas). I have used grey bars to guide the eye to the mean values and reserved white background for the additional calculations of the mean of sub populations.



 Figure 4





  1. Labeling of the structures could be slightly improved for clarity, especially since the names are re-used in the figure and paper.
  2. Spacing of panel parts: the spacing of charts in panel B could be improved to increase readability and I have used headers to guide the reader through the individual plots. Also, I have matched Figure 4B to the previous, similar Figure 2A.
  3. Data label/legends: as before, I have again chosen color just for the molecule of interest and muted and homogenized the control data (here is an article on how not to mix attributes such as color, texture etc). The dotted line was visually more “active” than even the colored line showing the hopanoid data!!!
  4. Spacing: by spacing the parts of C better, the readability of the entire figure is enhanced.
  5. Legend: the legend is placed in between the two parts of C and in addition is not 100% identical to B although they should be!