# R tips 3: ggpmisc adds new stats to ‘ggplot2’

## Aim of the package

1. Make it possible to use the grammar of graphics in some cases for which until now ad-hoc solutions were needed.
2. Show how easy it can be to write new statistics for ‘ggplot2 (>= 2.0.0)’.

## Case 1 (Stackoverflow question)

“Adding a 3rd order polynomial and its equation to a ggplot in r”

### Restated problem

“Adding a label with the R^2 or adjusted R^2 value from any `lm()` fit for each group and panel of a ggplot”

“Adding a polynomial of any degree and its equation for each group and panel of a ggplot”

A colleague asked me why the answer she had found at Stackoverflow did not work for a third-degree polynomial, so I wrote `stat_poly_eq()` as a general solution to the problem. For simplicity I took a different approach from the one in the Stackoverflow answer and used package ‘polynom’. (Extending this stat to handle BIC, AIC, or in fact any value that can be extracted or computed from the fitted model object should require only a simple edit.)
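To make the remark about the fitted model object concrete, here is a minimal base-R sketch of how such values can be pulled out of an `lm()` fit. It is not the code used inside `stat_poly_eq()`; the variable names and the label format are my own assumptions for illustration.

```r
# Fit a third-degree polynomial with raw (non-orthogonal) terms, so that
# each coefficient corresponds directly to one power of x.
fit <- lm(dist ~ poly(speed, 3, raw = TRUE), data = cars)

# Quantities available from the fitted model object
fit_summary   <- summary(fit)
r_squared     <- fit_summary$r.squared
adj_r_squared <- fit_summary$adj.r.squared
aic_value     <- AIC(fit)
bic_value     <- BIC(fit)

# Coefficients in increasing order of power: intercept, x, x^2, x^3
coefs <- unname(coef(fit))

# A plotmath-style string (an assumed format) that a plot could parse as a label
rr_label <- sprintf("italic(R)^2 == %.2f", r_squared)
```

Any of these values could be formatted as text in the same way, which is why adding AIC or BIC labels to a statistic of this kind is mostly a matter of formatting.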

[code lang="r"]
library(ggplot2)
library(ggpmisc)

# Use the same model formula for the fitted line and for the equation label
formula <- y ~ x
ggplot(cars, aes(speed, dist)) +
  geom_point() +
  geom_smooth(formula = formula, method = "lm") +
  stat_poly_eq(formula = formula, aes(label = ..eq.label..), parse = TRUE)
[/code]

## Case 2 (Stackoverflow question)

“Plotting a simple time series in ggplot.”

### Restated problem

“Plotting time series in ggplot: converting a time series into a data frame suitable for plotting with ggplot.”

Function `try_data_frame()` does this conversion. At its core it calls `xts::try.xts()` to convert its argument into an `xts` object, if possible, and then converts the `xts` object into a data frame.

[code lang="r"]
library(ggplot2)
library(ggpmisc)

ggplot(try_data_frame(lynx), aes(time, V.lynx)) + geom_line()
[/code]
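For a regular, univariate `ts` object the idea behind the conversion can be sketched in base R. The helper below is hypothetical and is not part of ‘ggpmisc’: it does not go through ‘xts’, it returns time as a plain number rather than a date-time, and its column names are my own choice.

```r
# Hypothetical helper (not in 'ggpmisc'): convert a univariate ts object
# into a two-column data frame that ggplot() can use.
ts_to_data_frame <- function(x) {
  stopifnot(is.ts(x), !is.matrix(x))
  data.frame(
    time  = as.numeric(time(x)),  # time index of each observation
    value = as.numeric(x)         # observed values
  )
}

lynx_df <- ts_to_data_frame(lynx)
head(lynx_df)
```

The advantage of `try_data_frame()` over a sketch like this one is that it accepts any object that ‘xts’ can convert, not only `ts` objects.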

## Case 3 (Stackoverflow question)

“add a curve that fits the peaks from a plot in R?”

### Restated problem

“Finding peaks and valleys and labelling them in each group and panel of a ggplot.”

These versions of `stat_peaks()` and `stat_valleys()` were inspired by a discussion in the issues area of ggrepel at Github. Both ‘ggplot2’ statistics are built on top of `splus2R::peaks()`. (In the next version I will also make it possible to use `ppc::ppc.peaks()`.)
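The general idea of searching for peaks within a moving window of width `span` can be sketched in base R as follows. This is a simplified illustration written for this post, not the algorithm in `splus2R::peaks()`, whose handling of ties and of the ends of the series may differ; the function name is hypothetical.

```r
# Hypothetical helper: flag x[i] as a peak when it is the single largest
# value within a centred window of `span` observations.
find_local_maxima <- function(x, span = 3) {
  stopifnot(span %% 2 == 1, span >= 3, !anyNA(x))
  half <- (span - 1) / 2
  n <- length(x)
  is_peak <- logical(n)
  for (i in seq_len(n)) {
    # Positions too close to either end have an incomplete window: skip them
    if (i <= half || i > n - half) next
    window <- x[(i - half):(i + half)]
    is_peak[i] <- x[i] == max(window) && sum(window == x[i]) == 1
  }
  is_peak
}

# Years flagged as peaks in the lynx series with a five-year window
peak_years <- time(lynx)[find_local_maxima(as.numeric(lynx), span = 5)]
```

Valleys can be found with the same function by negating the series, for example `find_local_maxima(-as.numeric(lynx), span = 5)`.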

[code lang="r"]
library(ggplot2)
library(ggpmisc)

ggplot(try_data_frame(lynx), aes(time, V.lynx)) +
  geom_line() +
  # Dashed orange line joining the detected peaks
  stat_peaks(geom = "line", linetype = "dashed", color = "orange", size = rel(1)) +
  # Dashed red line joining the peaks that remain with ignore_threshold = 0.7
  stat_peaks(geom = "line", linetype = "dashed", color = "red", size = rel(1),
             ignore_threshold = 0.7)
[/code]

[code lang="r"]
library(ggplot2)
library(ggpmisc)

ggplot(beaver1, aes(time, temp)) +
  geom_point(aes(color = factor(activ))) +
  geom_line() +
  # Label peaks in red above the line and valleys in blue below it
  stat_peaks(geom = "text", color = "red",
             x.label.fmt = "%04d", span = 5,
             angle = 90, hjust = -0.1) +
  stat_valleys(geom = "text", color = "blue",
               x.label.fmt = "%04d", span = 5,
               angle = 90, hjust = 1.1) +
  ylim(36.25, 37.75) +
  facet_grid(~day, scales = "free_x", space = "free_x",
             labeller = "label_both")
[/code]

## Case 4

“Peaks in a time series.”

[code lang="r"]
library(ggplot2)
library(ggpmisc)

ggplot(try_data_frame(lynx), aes(time, V.lynx)) +
  geom_line() +
  stat_peaks(geom = "rug", color = "red") +
  stat_peaks(geom = "point", color = "red") +
  stat_valleys(geom = "rug", color = "blue") +
  stat_valleys(geom = "point", color = "blue")
[/code]

## Case 5

“Custom label formatting for peaks and valleys”

[code lang="r"]
library(ggplot2)
library(ggpmisc)
library(xts)

ggplot(try_data_frame(AirPassengers), aes(time, V.AirPassengers)) +
  geom_line() +
  # x.label.fmt = "%b" labels each peak or valley with the abbreviated month
  stat_peaks(x.label.fmt = "%b", geom = "text", angle = 90,
             hjust = -0.1, color = "red", span = 3) +
  stat_valleys(x.label.fmt = "%b", geom = "text", angle = 90,
               hjust = 1.1, color = "blue", span = 3) +
  scale_x_datetime(date_labels = "%b %y", date_breaks = "1 year") +
  ylim(0, 700)
[/code]

## Case 6

“Edited labels”

[code lang="r"]
library(ggplot2)
library(ggpmisc)

ggplot(try_data_frame(ldeaths), aes(time, V.ldeaths)) +
  # Dashed vertical line at each peak
  stat_peaks(geom = "vline", color = "red", span = 11,
             linetype = "dashed") +
  # Label built from the formatted y and x values of each peak
  stat_peaks(x.label.fmt = "%b %Y", y.label.fmt = "%4.0f",
             geom = "label",
             color = "red", span = 11, vjust = -0.2,
             aes(label = paste(..y.label.., "deaths\n in",
                               ..x.label..))) +
  geom_line() +
  scale_x_datetime(date_labels = "%b %y", date_breaks = "1 year") +
  ylim(0, 4300)
[/code]
