Yesterday, I was putting the finishing touches on a ggplot2 figure that visualizes some simulation results. The plot has several panels created with facet_grid() and uses color to distinguish the regression models that were fit to the simulated data. I wanted to label certain axes and panels with the Greek letters I had used as parameter notation, and I also wanted the labels in the color legend to correspond to the regression models I had fit.
The problem was, I had no clue how to do this! So, I consulted #rstats Twitter, got some really great tips, and figured that I’d share them all in a quick demo blogpost (mostly so that I can easily find this info the next time I need it! 😂).
First, let’s load the necessary packages:
library(dplyr); library(ggplot2); library(scales)
Next, let’s generate some random data for plotting (I’m including two binary variables for grouping purposes):
data = data.frame(x = rnorm(50), y = rnorm(50), c = factor(rep(c("a", "b"), each = 25)), d = factor(rep(0:1, length.out = 50)))
Here’s what the data look like:
head(data)
Next, let’s make a simple panel of scatter plots using ggplot(), coloring the points by the variable ‘c’ and creating two panels so that the points are grouped by the variable ‘d’:
ggplot(data) + geom_point(aes(x = x, y = y, col = c)) + facet_grid(~ d)
This is how the plot would look if we didn’t make any alterations to any of the labels. Using the code above as something to build upon, let’s go through some examples of how to change different types of labels on the plot to incorporate Greek symbols and math expressions.
Plot Titles, Axes and Legend Titles
One way to modify plot titles, axes and legend titles is through the labs() function in ggplot2. To add math notation to those labels, we can use the expression() function to specify the label text. For example, if we wanted to modify the plot above so that the title was “\(Y \sim X\)”, the x axis was labeled “\(\beta\)”, and the legend title read “Values of \(\mu\)”, we could run the following:
ggplot(data) +
  geom_point(aes(x = x, y = y, col = c)) +
  facet_grid(~ d) +
  labs(title = expression(Y %~% X), x = expression(beta), col = expression(paste('Values of ', mu)))
🌟 BTW: This website will come in handy when figuring out the math expression syntax!
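For quick reference, here is a small sketch of a few plotmath constructs that tend to come up in labels: a subscript, an accent followed by plain text, and a computed value spliced in with bquote(). The data frame `demo` and the value `r2` are made up for illustration and are not part of the example above.

```r
library(ggplot2)

# Hypothetical data, used only to have something to plot
set.seed(1)
demo <- data.frame(x = rnorm(20), y = rnorm(20))

# Hypothetical value to splice into the title
r2 <- round(cor(demo$x, demo$y)^2, 2)

lab_x     <- expression(beta[0])                 # [ ] gives a subscript
lab_y     <- expression(hat(mu) ~ "(estimate)")  # accent, a space (~), then plain text
lab_title <- bquote(R^2 == .(r2))                # .() inserts the current value of r2

p <- ggplot(demo) +
  geom_point(aes(x = x, y = y)) +
  labs(title = lab_title, x = lab_x, y = lab_y)

# print(p) draws the figure
```

The difference between the two helpers is that expression() keeps everything symbolic, while bquote() evaluates whatever is wrapped in .() before building the label, which is handy when a label should report a number computed from the data.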
Next, let’s play around with the text of the values shown in the legend. Suppose we want to show that the names of the groups used to color the points are not actually ‘a’ and ‘b’, but ‘\(\alpha\)’ and ‘\(\beta\)’, respectively. To reformat the color legend values, we’ll use the parse_format() function from the scales package.
🚨 Before modifying the plot, we will first recode the variable ‘c’ such that the values are character strings containing the expressions we want to show:
data = data %>% mutate(c = recode_factor(c, `a` = "alpha", `b` = "beta"))
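One thing to watch for when writing these strings: they get parsed as R expressions, so they have to be syntactically valid R. A rough way to check a candidate label is to run it through base R's parse(), used here as a stand-in for the parsing step. The helper `is_valid_label()` and the label strings below are hypothetical, written just for this check.

```r
# Hypothetical helper: TRUE if the string parses as an R expression
is_valid_label <- function(s) {
  !inherits(try(parse(text = s), silent = TRUE), "try-error")
}

is_valid_label("alpha")              # a bare symbol parses fine
is_valid_label("Model alpha")        # two symbols separated by a space do not parse
is_valid_label("Model ~ alpha")      # in plotmath, ~ draws a space between terms
is_valid_label("'Model: ' * alpha")  # quoted text juxtaposed (*) with a symbol
```

So if a label needs ordinary words next to a Greek letter, joining the pieces with ~ or * (and quoting the plain text) keeps the string parseable.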
Now, let’s modify the color labels:
ggplot(data) +
  geom_point(aes(x = x, y = y, col = c)) +
  facet_grid(~ d) +
  labs(title = expression(Y %~% X), x = expression(beta), col = expression(paste('Values of ', mu))) +
  scale_colour_discrete(labels = parse_format())
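If you'd rather leave the data untouched, another option is to hand the scale the expressions directly instead of parsing recoded strings. The sketch below assumes a fresh data frame (named `raw` here, purely for illustration) whose grouping variable still has the levels ‘a’ and ‘b’; the labels are matched to the levels in order.

```r
library(ggplot2)

# Hypothetical data with the original, un-recoded grouping levels
set.seed(1)
raw <- data.frame(
  x = rnorm(50),
  y = rnorm(50),
  c = factor(rep(c("a", "b"), each = 25))
)

# One expression per factor level, in level order: a -> alpha, b -> beta
greek_labels <- expression(alpha, beta)

p <- ggplot(raw) +
  geom_point(aes(x = x, y = y, col = c)) +
  scale_colour_discrete(labels = greek_labels)

# print(p) draws the figure
```

The trade-off is that the mapping from level to label now lives in the plotting code and depends on level order, whereas the recoding approach keeps it in the data. As far as I know, newer versions of scales also offer label_parse() as the current name for parse_format(), so that may be worth checking if parse_format() produces a deprecation note.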
Lastly, let’s change the labels of the plot panels to read ‘\(\gamma = 1\)’ and ‘\(\gamma = 2\)’. To do so, we will set the labeller argument in the facet_grid() plotting step to label_parsed (the abbreviated label = "label_parsed" also works, but only through R’s partial argument matching).
🚨 Again, before we do this, we’ll need to recode the variable that is used to create the facet grid:
data = data %>% mutate(d = recode_factor(d, `0` = "gamma == 1", `1` = "gamma == 2"))
Now let’s modify the panel names!
ggplot(data) +
  geom_point(aes(x = x, y = y, col = c)) +
  facet_grid(~ d, labeller = label_parsed) +
  labs(title = expression(Y %~% X), x = expression(beta), col = expression(paste('Values of ', mu))) +
  scale_colour_discrete(labels = parse_format())
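When the panel variable already holds the numbers you want to show, ggplot2's label_bquote() labeller can build the panel titles without any recoding. A minimal sketch, assuming a hypothetical data frame `sim` with a numeric column `g` that takes the values 1 and 2:

```r
library(ggplot2)

# Hypothetical data: g holds the value of gamma for each row
set.seed(1)
sim <- data.frame(
  x = rnorm(50),
  y = rnorm(50),
  g = rep(1:2, length.out = 50)
)

# .(g) is replaced by each panel's value of g, so the strips read
# "gamma = 1" and "gamma = 2" with gamma drawn as the Greek letter
p <- ggplot(sim) +
  geom_point(aes(x = x, y = y)) +
  facet_grid(~ g, labeller = label_bquote(cols = gamma == .(g)))

# print(p) draws the figure
```

This works best when the facet values are meaningful numbers; for arbitrary labels, recoding the factor and using label_parsed as above is probably the simpler route.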
There you have it! Hopefully these examples will come in handy the next time you need to include math expressions in a plot. Thank you to Ben Williams and Jeremy Yoder for coming to my rescue on Twitter! 🙌 🎉