6 Data Visualization
This week, we’ll start digging into visualizations of the data we have been wrangling, though we are really only going to scratch the surface of visualization.

Data visualizations are all around us, from what we read in the popular press to how we tackle problems within Biological Systems Engineering. In this unit, we’re going to focus on making some simple visualizations within RStudio. Before doing this, I want to begin with some tips to consider when preparing your own visualizations and when spotting data visualizations that may inadvertently (or purposely) mislead the viewer. For example, the plot above shows the incidence of thyroid cancer over time, and insinuates that glyphosate (Roundup) application is linked to the rising rates of thyroid cancer. Key: correlation is not causation. What else is wrong with the figure, specifically the secondary y-axis for glyphosate applied? Answer: you can’t have a negative value of glyphosate applied! Here, the authors adjusted the secondary y-axis scale so the red line followed thyroid cancer, which is clearly misleading.
This is just one example that Bergstrom and West use in their recent book “Calling Bullshit”. In the table below are key points they implore us to learn and consider as we evaluate visualizations and make our own:
CALLING BULLSHIT - TOP ISSUES WITH PLOTS | Why? |
1. Bar chart axes should include zero. | Truncated bars exaggerate size gaps, and bar graphs are meant to show absolute magnitude. [visual weight of each bar = value of bar, or proportional ink] |
2. Line plots need not include zero. | Line graphs emphasize the change in the dependent variable as the independent variable changes. |
3. Multiple axes on a single graph can mislead. | The two scales can be adjusted until unrelated series appear to track each other. Correlation is not causality! |
4. Axes should not change scale midstream. | Clearly can mislead! |
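To make the first rule in the table concrete, here is a minimal sketch in ggplot. The data frame `yield` and its values are made up purely for illustration; they are not from the book or from any course data set. The first plot keeps the default zero baseline, and the second zooms the y-axis so that a small difference looks dramatic.

```r
library(ggplot2)

# Hypothetical values, invented only to illustrate the zero-baseline rule
yield <- data.frame(
  treatment = c("A", "B", "C"),
  value     = c(96, 98, 100)
)

# geom_col() starts bars at zero, so bar height is proportional to the value
p_honest <- ggplot(yield, aes(x = treatment, y = value)) +
  geom_col()

# Zooming the y-axis to 95-100 cuts off the bars and exaggerates a ~4% gap
p_truncated <- p_honest +
  coord_cartesian(ylim = c(95, 100))

p_honest
p_truncated
```

Compare the two plots side by side: the data are identical, but only the first respects proportional ink.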
Learning Objectives:
- Identify and avoid misleading plots
- Become familiar with types of visualizations
- Effectively map data values into quantifiable features of the resulting graphic: these are called aesthetics.
- Practice basic plotting within R using ggplot
6.1 Reading (complete by class on Monday)
This week we will use a resource developed by Claus Wilke, who wrote the book Fundamentals of Data Visualization. We’ll also start by looking at a section of another book by Carl Bergstrom and Jevin West - Calling Bullshit.
- Calling Bullshit page devoted to Visualizations. Here, read through this page to identify the common pitfalls associated with misleading plots.
- Correlation does not imply causation (be able to describe what this means)
- Rule of proportional ink
- Why a 0-axis for bar graphs, but not when plotting 2 lines on an x-y scatterplot?
The readings below also have learnr interactive examples you can find in the Files pane (files 06-0-1, 06-0-2, 06-0-3).
Mapping data. “Whenever we visualize data, we take data values and convert them in a systematic and logical way into the visual elements that make up the final graphic. Even though there are many different types of data visualizations, and on first glance a scatter plot, a pie chart, and a heatmap don’t seem to have much in common, all these visualizations can be described with a common language that captures how data values are turned into blobs of ink on paper or colored pixels on screen.” The key insight is the following: All data visualizations map data values into quantifiable features of the resulting graphic. We refer to these features as aesthetics. Wilke’s cliffnote slides are here (optional).
Types of visualizations. Here, you’ll see examples of different visualizations that are useful in our field: amounts, distributions, proportions, x-y scatterplots, uncertainty, and geospatial data.
Visualizing distributions. We’ll follow this up with an exercise from C. Wilke.
Optional reading/resource: Visualization chapter in R for Data Science.
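The idea of mapping data values to aesthetics, from the Wilke reading above, might look like the following in ggplot. This is only a sketch, assuming the `mpg` data set that ships with ggplot2 rather than any course data: engine displacement is mapped to the x position, highway mileage to the y position, and vehicle class to color.

```r
library(ggplot2)

# mpg is bundled with ggplot2, so no extra data files are needed
# aes() declares the mapping: which variable drives which visual feature
p <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point()

p
```

Changing the mapping, for example swapping `color = class` for `shape = drv`, changes the graphic without touching the data.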
6.2 Tutorials to walk through on your own
Here, the idea is to reinforce the readings, and prepare you for success in the assignment.
There are 3 exercises within the Posit workspace for this week. These are built using the learnr package, which makes websites that have R code chunks in them so you can practice coding in a cleaner and easier environment than Posit. Dr. Scott also previously created a short video for each that may be helpful: Exercise 1, Exercise 2, and Exercise 3.
I am not going to expect any of you to memorize these functions and approaches; rather, I want you to be able to consider what type of visualization you can use, and have the background to dive into creating your own visualization using aesthetics and geoms. This will require you to do some reading in the help documentation of these different functions and refer back to the readings. From my perspective, simple, legible visualizations are best. Regardless, the exercises below will give you a great jumping off point.
Aesthetics Exercise 1 - see Wilke’s slides here. Use this to try yourself; the solution for each follows. The point is to learn how the data is “mapped” - the aesthetics.
Amounts Exercise 2 - see Wilke’s slides here. Again, this is for you to apply and practice, building on aesthetics but with bar data.
Distributions Exercise 3 - see Wilke’s slides here. Lastly, this series highlights approaches to show distributions of data.
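As a rough preview of what the amounts and distributions exercises build toward, here is a small sketch. It assumes the `mpg` data set bundled with ggplot2, not the data used in the exercises themselves, and the `binwidth` value is just one reasonable choice.

```r
library(ggplot2)

# Amounts: summarize first, then draw one bar per vehicle class
class_means <- aggregate(hwy ~ class, data = mpg, FUN = mean)
p_amounts <- ggplot(class_means, aes(x = class, y = hwy)) +
  geom_col()

# Distributions: a histogram counts observations in bins of highway mileage
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)

# A density plot shows the same distribution as a smooth curve
p_density <- ggplot(mpg, aes(x = hwy)) +
  geom_density()

p_amounts
p_hist
p_density
```

Try a few different `binwidth` values; the apparent shape of a histogram can change quite a bit with the bin size.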
6.3 More on Color Palettes
Also included in this week’s materials is a vignette from the viridis package, which has colorblind-friendly color palettes.
R Markdown enables you to weave together content and executable code into a finished document. To learn more about R Markdown see rmarkdown.rstudio.org. This template uses R Markdown to demonstrate the color palettes of R’s Viridis package in three mediums:
- as an HTML, PDF, or Word document in colors_document.Rmd
- as a slide deck in colors_presentation.Rmd
- as a web page with interactive Shiny components in colors_app.Rmd
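If you want to try the viridis palettes in your own plots, one option is sketched below. Note the assumption: this uses the viridis scales built into ggplot2 (`scale_color_viridis_c()` for continuous data and `scale_fill_viridis_d()` for discrete data) along with the bundled `mpg` data set, rather than the code in the template files.

```r
library(ggplot2)

# Continuous variable (city mileage) mapped to a viridis color gradient
p_cont <- ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +
  geom_point() +
  scale_color_viridis_c()

# Discrete variable (drive train) mapped to distinct viridis fill colors
p_disc <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar() +
  scale_fill_viridis_d()

p_cont
p_disc
```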
6.3.1 Previewing
6.3.1.1 To preview the document
- Open the file colors_document.Rmd.
- Then click the Knit button that will appear above the opened file. This will display the document as an HTML file.
- To display the document as a PDF or MS Word file, click the drop down menu to the left of the Knit icon and select one of:
  - Knit to PDF
  - Knit to Word
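The Knit menu choices above can also be requested from the R console with `rmarkdown::render()`. The sketch below is one way to do that, not part of the template itself: the `knit_formats` lookup is a hypothetical helper introduced here for illustration, and the call assumes colors_document.Rmd sits in your working directory.

```r
# Hypothetical lookup (for illustration): Knit menu label -> rmarkdown format name
knit_formats <- c(
  "Knit to HTML" = "html_document",
  "Knit to PDF"  = "pdf_document",   # PDF output needs a LaTeX installation
  "Knit to Word" = "word_document"
)

# Assumes the template file is in the current working directory
input <- "colors_document.Rmd"

# Only render if rmarkdown is installed and the file is actually there
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(input)) {
  rmarkdown::render(input, output_format = knit_formats[["Knit to Word"]])
}
```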
6.3.1.2 To preview the presentation
- Open the file colors_presentation.Rmd.
- Then click the Knit button that will appear above the opened file. This will display the document as an ioslides HTML slide deck, which can be presented with any web browser.
- To display the presentation as a Slidy (HTML), beamer (PDF), or MS PowerPoint slide deck, click the drop down menu to the left of the Knit icon and select one of:
  - Knit to HTML (Slidy)
  - Knit to PDF (Beamer)
  - Knit to PowerPoint
6.3.1.3 To preview the interactive document with Shiny components
- Open the file 06-0-4_colors_app.Rmd.
- Then click the Run Document button that will appear above the opened file. Because the file contains the YAML line runtime: shiny, R Markdown will run the file as an interactive Shiny app.