In this page code chunks are “folded” so as to decrease the clutter when searching for examples. Above each plot you will find one or more “folded” code chuncks signalled by a small triangle followed by “Code”. Clicking on the triangle “unfolds” the code chunk making visible the R code used to produce the plot.

The code in the chunks can be copied by clicking on the top right corner, where an icon appears when the mouse cursor hovers over the code listing.

The `</> Code`

drop down menu to the right of the page title makes it possible to unfold all code chunks and to view the Quarto source of the whole web page.

Names of functions and other R objects are linked to the corresponding on-line help pages. The names of R extension packages are linked to their documentation web sites when available.

# 1 Introduction

Here you will find examples of ggplots with labels for fitted model equations, various parameter estimates. The labels have been added to the plots with geometries defined in package ggpp and statistics from package ggpmisc both available at CRAN.

One needs to always check that annotations do not occlude anything significant, such as observations in the base plot. This needs special care when using annotations together with batch plotting. Either ensure that the scale limits of the base plot are expanded to avoid overlap or that the layer with the equations is the lowest one, i.e., added to the plot first.

# 2 Data labels and plot annotations

Data labels add textual information directly related to individual data points (shown as glyphs). Text position in this case is dependent on the scales used to represent data points. Text is usually displaced so that it does not occlude the glyph representing the data point and when the link to the data point is unclear, this link is signalled with a line segment or arrow. Data labels are distinct from annotations in that they contribute directly to the representation of data on a plot or map.

Annotations differ from data labels, in that their position is decoupled from their meaning. Insets can be thought as larger, but still self-contained annotations. In most cases the reading of inset tables and plots depends only weakly on the plot or map in which they are included.

In the case of annotations and insets the designer of a data visualization has the freedom to locate them anywhere, as long as they do not occlude features used to describe data. I will use the term annotation irrespective if the “labels” are textual or graphical.

The equations showing parameter estimates from models fit to data are normally displayed as annotations. However, it is also possible to consider them data labels “connected” to individual curves representing the corresponding model fits.

# 3 Annotations showing parameters estimates

Fitted model equations and other related estimates can be useful in plots as they ensure that graphical representation as a curve and numerical values for parameters that cannot be read from the curve itself are displayed as text on the same plot.

When adding annotations one should be aware that they add clutter to a plot, and clutter can make it difficult to see the patterns of interest in the data represented as points or curves.

So, as usual, less is more: *include those annotations that are relevant to the message conveyed by a plot and nothing more.*

Consider:

**Add model fit equations only if the values of the parameters are interpretable.**For example, in a linear regression slope and intercept are interpretable as is the reaction constant in the Michaelis-Menten equation.**If you add an estimate of \(R^2\) and/or \(P\)-value make sure that they correspond to the plotted curve.****Display observations in addition to fitted-model curves or smoothers whenever possible.**A curve not supported by observations is usually frowned at. However, if the number of observations is large, they may have to be described by their empirical distribution.**Confidence bands around curves can be used to describe the confidence we can have on the estimated line.**However, be aware that these bands are a property of the fitted model, not the data.**The number of observations, is usually given in the figure legend, and less frequently as an annotation in the figure.**A plot annotation can be automated and is, thus, less likely to not match the number of observations plotted.**Inset tables are not very common, but when computed when the plot is rendered, also ensure that the reported statistics or summaries match the observations shown in the plot.**

Fully automating the construction of fitted model equations is far from trivial for the general case, so I have implemented automation for specific types of models: polynomials fitted by OLS as linear models, polynomials fitted by quantile regression and major-axis regression fits. They are currently available as pairs of statistics with consistent interfaces, with each pair consisting in a curve-plotting statistics and an annotations statistics.

As shown below other model formulas can be rather easily assembled and added as annotations if one is familar with R’s expressions as used for `plotmath`

.

The examples shown in this page map to the *label* aesthetic `character`

strings ready to be parsed into R’s plotmath expressions. The statistics from ‘ggpmisc’ generate labels of this type by default.

Labels formatted as \(\LaTeX\) math mode equations, Markdown and plain text can be also created by passing different arguments to `output.type`

. Markdown is supported by package ‘ggtext’ and \(\LaTeX\) by the `tikz’ graphic device.

# 4 Preliminaries

The code used is shown on-demand above each plot and can be copied. We first load the packages we will use.

When package ggpmisc is loaded and attached, packages ggpp and ggplot2 are also attached. The only function from ggplot2 that is redefined by ggpp is `annotate()`

, which remains backwards compatible with ggplot2.

# 5 Linear correlation

## Code

```
# generate artifical data for a linear relationship
# two groups with different standard deviation around a slope = 1
set.seed(94321)
x <- (1:100) / 10
yA <- x + rnorm(length(x), sd = 2)
yB <- x + rnorm(length(x), sd = 8)
df1 <- data.frame(x = rep(x, 2),
y = c(yA, yB),
group = rep(c("A", "B"), rep(length(x), 2L)))
```

The examples in this section make use of `stat_correlation()`

from package ggpmisc.

Pearson correlation.

## Code

```
ggplot(subset(df1, group == "A"),
aes(x, y)) +
geom_point() +
stat_correlation()
```

## Code

```
ggplot(subset(df1, group == "A"),
aes(x, y)) +
geom_point() +
stat_correlation(small.r = TRUE)
```

Spearman correlation. (The default is to not compute the confidence interval.)

## Code

```
ggplot(subset(df1, group == "A"),
aes(x, y)) +
geom_point() +
stat_correlation(method = "spearman")
```

Kendall correlation. (The default is to not compute the confidence interval.)

## Code

```
ggplot(subset(df1, group == "A"),
aes(x, y)) +
geom_point() +
stat_correlation(method = "kendall")
```

Select which labels to show in plot, number of digits to display.

## Code

```
ggplot(subset(df1, group == "A"), aes(x, y)) +
geom_point() +
stat_correlation(mapping = use_label(c("R", "P", "n")),
r.digits = 3, p.digits = Inf)
```

Select which labels to show in plot.

## Code

```
ggplot(subset(df1, group == "A"), aes(x, y)) +
geom_point() +
stat_correlation(mapping = use_label(c("R", "t", "P", "n")))
```

Confidence interval for R (Pearson’s correlation).

## Code

```
ggplot(subset(df1, group == "A"), aes(x, y)) +
geom_point() +
stat_correlation(mapping = use_label(c("R", "R.confint")))
```

Confidence interval for Spearman’s correlation.

## Code

```
ggplot(subset(df1, group == "A"),
aes(x, y)) +
geom_point() +
stat_correlation(mapping = use_label(c("R", "R.confint")),
r.conf.level = 0.95,
method = "spearman")
```

Grouping supported.

## Code

```
ggplot(df1, aes(x, y, color = group)) +
geom_point() +
stat_correlation()
```

Facets supported.

## Code

```
ggplot(df1, aes(x, y)) +
geom_point() +
stat_correlation(label.x = "right", label.y = "bottom") +
facet_wrap(~group)
```

Highlighting based on estimates, here estimated *R* but it is possible to use other estimates like *P*-value.

## Code

```
ggplot(df1, aes(x, y)) +
geom_point() +
stat_correlation(mapping =
aes(color = ifelse(after_stat(cor) > 0.5,
"red", "black"))) +
scale_color_identity() +
facet_wrap(~group)
```

## 5.1 Alternatives

Package ‘ggpubr’ provides alternative approaches to correlation testing and reporting.

# 6 OLS regression with polynomials

Polynomials can be fitted as linear models by the method of ordinary least squares (OLS), but many other linear models exist. Robust and resistant variations also exist, based on a very similar user interface.

The most common use case of a polynomial is linear regression. Polynomials are very frequently used, and in the case of linear regression the role of parameters can be easily recognized.

The examples in this section make use of `stat_poly_line()`

and `stat_poly_eq()`

from package ggpmisc. The model `formula`

accepted follows the same format as accepted by `lm()`

. However, automatic equation labels are generated only for subset of these.

As is the case in general for model fitting with R, function `poly()`

can be used within model formulae, to describe polynomials. The default behaviour of `poly()`

is to generate orthogonal polynomials by centring the data. This helps with computation, but produces coefficient estimates that are not directly interpretable. Thus, `eq.label`

is set to `NA`

unless `raw = TRUE`

is passed in the call to `poly()`

to disable the generation of orthogonal polynomials.

As `predict()`

methods take into account whether the polynomial is orthogonal or raw, the plotted lines differ only in case of numerical accuracy problems. On the other hand the coefficient estimates are very different.

In statistics from ‘ggplot2’ and extension packages, model formulas are always defined using the name of aesthetics, *x* and *y*, not the names of the mapped variables. By default, equations added with `stat_poly_eq()`

make use of *x* and *y*, but this can be altered by passing arguments in the call.

The model corresponding to linear regression, `y ~ x`

, is a first degree polynomial, and it is the default of both statistics. The default `method = "lm"`

fits it as a Linear Model (LM) by OLS.

## Code

```
ggplot(df1, aes(x, y)) +
geom_point() +
stat_poly_line() +
stat_poly_eq(use_label("eq"),
label.x = "right", label.y = "bottom") +
facet_wrap(~group)
```

Most likely the variables mapped onto aesthetics `x`

and `y`

have different names than the aesthetics. If we use a symbol to represent these variables, we will want to use these same symbols in the equation.

## Code

```
ggplot(df1, aes(x, y)) +
geom_point() +
stat_poly_line() +
stat_poly_eq(use_label("eq"),
eq.with.lhs = "italic(t)~`=`~",
eq.x.rhs = "~italic(E)",
label.x = "right", label.y = "bottom") +
labs(x = expression("Irradiance, "*italic(E)~~(W~m^{-2})),
y = expression("Temperature, "*italic(t)~~(degree*"C"))) +
facet_wrap(~group)
```

\(R^2\) is the default label, as its estimate is always available, even when the model formula passed as argument is not that of a full polynomial.

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(formula = formula)
```

Fitted model equation, available for polynomials with no missing terms. Here we create the mapping with a call to `use_label()`

, a convenience function that can be called instead of `aes()`

for conciseness.

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(mapping = use_label("eq"), formula = formula)
```

\(R_\mathrm{adj}^2\) and *P*-value added separately. This is necessary if separate locations or colours are desired for different labels derived from the same model fit.

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(mapping = use_label("adj.R2"), formula = formula) +
stat_poly_eq(mapping = use_label("P"), label.x = "right", label.y = "bottom", size = 3,
formula = formula)
```

Fitted model equation and \(R_\mathrm{adj}^2\). As seen above, function `use_label()`

pastes together multiple labels. Here we pass an argument to `sep`

.

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(mapping = use_label(c("eq", "adj.R2"), sep = "~~italic(\"with\")~~"),
formula = formula)
```

Equation with no left-hand-side (lhs).

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(mapping = use_label("eq"),
eq.with.lhs = FALSE,
formula = formula)
```

Variable names in the equation set to be informative, and match the axis labels.

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(mapping = use_label(c("eq", "R2")),
eq.with.lhs = "italic(h)~`=`~",
eq.x.rhs = "~italic(z)",
formula = formula) +
labs(x = expression(italic(z)), y = expression(italic(h)))
```

Even Greek charactes and maths can be incorporated when needed.

## Code

```
formula <- y ~ poly(x, 2, raw = TRUE)
ggplot(df2, aes(x, log10(y + 1e6))) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(mapping = use_label("eq"),
eq.with.lhs = "plain(log)[10](italic(delta)+10^6)~`=`~",
eq.x.rhs = "~Omega",
formula = formula) +
labs(y = expression(plain(log)[10](italic(delta)+10^6)),
x = expression(Omega))
```

`stat_poly_eq()`

by default assumes that the fitted polynomial has the variable mapped to the `x`

or `y`

aesthetic as explanatory variable depending on the model formula and argument passed to parameter `orientation`

. It does not decode any transformations applied on the fly to `x`

or `y`

. In addition it does not generate `eq.label`

for linear models that are not polynomials with no missing terms and terms entered in increasing order of their powers.

If a transformation is applied or any other deviation from a regular polynomial is passed to `formula`

, it is still possible to use the labels for other fitted parameters. It is also possible to set `output.type = "numeric"`

and manually build the equation label, using `sprintf()`

as shown below, or a combination of `paste()`

and `format()`

or any other R function that can assemble a suitable character string from the numeric values returned.

*When using this non-automated approach, any valid linear model can be fitted.*

## Code

```
formula <- y ~ I(x^2) # we apply a transformation to x within the formula
ggplot(df2, aes(x, y)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(mapping = aes(label = sprintf_dm("y~`=`~%.3g~x^2",
after_stat(b_0))),
parse = TRUE,
output.type = "numeric",
formula = formula)
```

In some cases, such as when applying a power as transformation and fitting higher degree polynomials, one needs to manually build the R expression representing the equation to simplify it.

The examples shown above also work with groups. *As in ‘ggplot2’, models are fitted separately to the data in each group.*

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2, colour = group)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(aes(label = after_stat(eq.label)),
formula = formula,
vstep = 0.08) +
theme_bw()
```

Using colours to indicate the groups to which equations correspond is not always best. Here we use textual labels pasted to the left of each equation.

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2, linetype = group, grp.label = group)) +
geom_point() +
stat_poly_line(formula = formula, color = "black") +
stat_poly_eq(aes(label = after_stat(paste("bold(", grp.label, "*':')~~~",
eq.label, sep = ""))),
formula = formula)
```

Facets are also supported.

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(aes(label = after_stat(eq.label)),
formula = formula) +
facet_wrap(facets = vars(group), scales = "free_y", ncol = 1)
```

In the stats defined in ggpmsic the fit method function can modify the model passed as argument because the one used to build the equation is retrieved, when possible, from the fitted model object. In this example, a linear regression is fitted if the slope differs significantly from zero, but if not, the mean is reported, both graphically and numerically.

## Code

```
# user defined fit method removes the slope if
# the slope is not significant
poly_or_mean <- function(formula, data, ...) {
fm <- lm(formula = formula, data = data, ...)
if (anova(fm)[["Pr(>F)"]][1] > 0.1) {
lm(formula = y ~ 1, data = data, ...)
} else {
fm
}
}
# we create a plot as usual, but with our method
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
stat_poly_line(method = "poly_or_mean") +
stat_poly_eq(method = poly_or_mean,
aes(label = after_stat(eq.label)),
label.x = "right") +
theme(legend.position = "bottom") +
facet_wrap(~class, ncol = 2)
```

Faceting works also with free *x* and/or *y* scales in panels.

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2, fill = block)) +
geom_point(shape = 21, size = 3) +
stat_poly_line(formula = formula) +
stat_poly_eq(aes(label = after_stat(rr.label)), size = 3,
geom = "label_npc", alpha = 0.33,
formula = formula) +
facet_wrap(~group, scales = "free_y") +
theme(legend.position = "top")
```

Faceting works also with free *x* and/or *y* scales in panels when the position of labels is set by a string.

## Code

```
formula <- y ~ poly(x, 3, raw = TRUE)
ggplot(df2, aes(x, y2, fill = block)) +
geom_point(shape = 21, size = 3) +
stat_poly_line(formula = formula) +
stat_poly_eq(use_label("R2"), size = 3,
geom = "label_npc", alpha = 0.33,
formula = formula,
label.x = "right", label.y = "bottom",
rr.digits = 3) +
stat_poly_eq(use_label(c("F", "P")),
size = 3,
geom = "label_npc", alpha = 0.33,
formula = formula) +
facet_wrap(~group, scales = "free_y") +
theme(legend.position = "top")
```

Some artificial data with rather high random variation on both *x* and *y*.

Linear regression through the origin.

## Code

```
# model with intercept = 0
formula <- y ~ x + 0
ggplot(df3, aes(x, y)) +
geom_point() +
stat_poly_line(formula = formula) +
stat_poly_eq(formula = formula,
mapping = use_label("eq"))
```

These two examples show how to fit only one of the two parameters of a linear regression using the statistics from ‘ggpmisc’. These are unusual use cases but serve as examples of how simple transformations included in the model formula using the identity function `I()`

can be “undone” at the `after_stat()`

stage. Both examples use function `stage()`

, available only in ‘ggplot2’ (>= 3.4.0).

Linear regression with a fixed (instead of fitted) *y*-intercept (*y* = 5 at *x* = 0) and a fitted slope.

## Code

```
ggplot(df3) +
stat_poly_line(mapping = aes(x = x,
y = stage(start = y5, after_stat = y + 5),
ymax = after_stat(ymax + 5),
ymin = after_stat(ymin + 5)),
formula = I(y - 5) ~ x + 0) +
stat_poly_eq(mapping = aes(x, y5,
label = after_stat(paste(eq.label, "~+~5*\", \"*", rr.label))),
orientation = "x",
formula = I(y - 5) ~ x + 0) +
geom_point(mapping = aes(x, y5), size = 2.5) +
ylab("y")
```

Linear regression with a fixed slope of 1:1 (instead of fitted) and a fitted *y*-intercept.

## Code

```
ggplot(df3) +
stat_poly_line(mapping = aes(x = x,
y = stage(start = y5, after_stat = y + x - mean(x)),
ymax = after_stat(ymax + x - mean(x)),
ymin = after_stat(ymin + x - mean(x))),
formula = y ~ 1) +
stat_poly_eq(mapping = aes(x, y5 - mean(x),
label = sprintf("y~`=`~%.3g~+~italic(x)",
after_stat(b_0))),
parse = TRUE,
output.type = "numeric",
formula = y ~ 1) +
geom_point(mapping = aes(x, y5), size = 2.5) +
ylab("y")
```

Regressions of *y* on *x* and *x* on *y* in the same plot. The lines are drawn for the range of the explanatory variable, which differs in this case between the two fits.

## Code

```
# the default for formula is y ~ x
ggplot(df3, aes(x, y)) +
geom_point() +
stat_poly_line(color = "blue") +
stat_poly_eq(mapping = use_label(c("R2", "eq")),
color = "blue") +
stat_poly_line(color = "red",
orientation = "y") +
stat_poly_eq(mapping = use_label(c("R2", "eq")),
color = "red",
orientation = "y",
label.y = 0.9)
```

## 6.1 Alternatives

Package ‘ggpubr’ provides a renamed copy of `stat_poly_eq()`

taken from an old version of ggpmisc. The version in ‘ggpubr’ is much more limited in its functionality and even contains bugs. The line drawing code in `stat_smooth()`

from ggplot2 was used as a basis for `stat_poly_line()`

, the main difference is that by default it uses `lm()`

as method irrespective of the data size while `stat_smooth()`

switches to using splines for large numbers of observations.

# 7 Major axis regression

If *x* and *y* are both subject to random errors, and none is clearly the cause of the other, we should use major axis regression instead of linear regression.

The examples in this section make use of `stat_ma_line()`

and `stat_ma_eq()`

from package ggpmisc.

## Code

```
ggplot(df3, aes(x, y)) +
geom_point() +
stat_ma_line() +
stat_ma_eq(mapping = use_label(c("R2", "eq")))
```

No alternatives I know of.

# 8 Quantile regression

Quantile regression is frequently used is some disciplines like Economics but less frequently in other fields. It is robust and can be rather easily interpreted based on its similarity to the well known box plots.

The examples in this section make use of `stat_quant_line()`

, `stat_quant_band()`

and `stat_quant_eq()`

from package ggpmisc.

Quantile regression fits are done by numerical approximation, and frequently generate warnings for non-unique solutions or similar problems. Their importance needs to be assessed by users. In the examples below they seem mostly unimportant given the large number of observations and have been silenced in the output.

A median regression with a band limited by the upper and lower quartile regressions.

## Code

```
# the default for formula is y ~ x
ggplot(df3, aes(x, y)) +
geom_point() +
stat_quant_band() +
stat_quant_eq(quantiles = 0.5)
```

Median regression through the origin.

## Code

```
# model with intercept = 0
formula <- y ~ x + 0
ggplot(df3, aes(x, y)) +
geom_point() +
stat_quant_band(formula = formula) +
stat_quant_eq(formula = formula, quantiles = 0.5)
```

Quantile regressions of *y* on *x* and *x* on *y*.

## Code

```
# the default for formula is y ~ x
ggplot(df3, aes(x, y)) +
geom_point() +
stat_quant_band(color = "blue") +
stat_quant_eq(quantiles = 0.5, color = "blue") +
stat_quant_band(color = "red",
orientation = "y") +
stat_quant_eq(quantiles = 0.5, color = "red",
orientation = "y", label.y = 0.9)
```

Fitting a polynomial by quantile regression.

## Code

```
ggplot(df2, aes(x, y)) +
geom_point() +
stat_quant_band(formula = y ~ poly(x, 2, raw = TRUE)) +
stat_quant_eq(formula = y ~ poly(x, 2, raw = TRUE), quantiles = 0.5)
```

Two quantiles are by default plotted as lines,

## Code

```
ggplot(df2, aes(x, y)) +
geom_point() +
stat_quant_line(formula = y ~ poly(x, 2, raw = TRUE), quantiles = c(0.05, 0.95)) +
stat_quant_eq(formula = y ~ poly(x, 2, raw = TRUE), quantiles = c(0.05, 0.95))
```

A single quantile, here the median, is plotted as a line plus a confidence band.

## Code

```
ggplot(df2, aes(x, y)) +
geom_point() +
stat_quant_line(formula = y ~ poly(x, 2, raw = TRUE), quantiles = 0.5) +
stat_quant_eq(formula = y ~ poly(x, 2, raw = TRUE), quantiles = 0.5)
```

Equations labelled by quantile. (This code works only with ‘ggpmisc’ > 0.5.6, when `qtl.label`

was added. Use `grp.label`

with earlier versions.)

## Code

```
ggplot(df2, aes(x, y)) +
geom_point() +
stat_quant_band(formula = y ~ poly(x, 2, raw = TRUE),
color = "black", fill = "grey60") +
stat_quant_eq(aes(label = paste(after_stat(qtl.label), "*\": \"*",
after_stat(eq.label), sep = "")),
formula = y ~ poly(x, 2, raw = TRUE)) +
theme_classic()
```

Equations labelled by group and quantile. (This code works only with ‘ggpmisc’ > 0.5.6, when `qtl.label`

was added. Use only `grp.label`

with earlier versions.)

## Code

```
ggplot(df2, aes(x, y, group = group, linetype = group,
shape = group, grp.label = group)) +
geom_point() +
stat_quant_line(formula = y ~ poly(x, 2, raw = TRUE),
quantiles = c(0.1, 0.9),
color = "black") +
stat_quant_eq(aes(label = paste(after_stat(grp.label), "~~",
after_stat(qtl.label), "*\": \"*",
after_stat(eq.label), sep = "")),
formula = y ~ poly(x, 2, raw = TRUE),
quantiles = c(0.1, 0.9)) +
theme_classic()
```

## 8.1 Alternatives

Package ggplot2 defines `stat_quantile()`

which can be used to plot as a line some types of quantile regression fits, but is less flexible than `stat_quant_line()`

and `stat_quant_band()`

.

# 9 Non-linear models

The examples in previous sections were based on ready assembled labels. Here we show examples of how to generate labels for equations and other parameter estimates with an approach applicable to many different types of models.

The examples in this section make use of `stat_fit_tidy()`

from package ggpmisc.

We use as example a fit of the Michaelis-Menthen equation of reaction kinetics, a function that is non-linear in its parameters.

## Code

```
micmen.formula <- y ~ SSmicmen(x, Vm, K)
ggplot(Puromycin, aes(conc, rate, colour = state)) +
geom_point() +
geom_smooth(method = "nls",
formula = micmen.formula,
se = FALSE) +
stat_fit_tidy(method = "nls",
method.args = list(formula = micmen.formula),
label.x = "right",
label.y = "bottom",
aes(label = paste("V[m]~`=`~", signif(after_stat(Vm_estimate), digits = 3),
"%+-%", signif(after_stat(Vm_se), digits = 2),
"~~~~K~`=`~", signif(after_stat(K_estimate), digits = 3),
"%+-%", signif(after_stat(K_se), digits = 2),
sep = "")),
parse = TRUE) +
theme_bw()
```

## Code

```
micmen.formula <- y ~ SSmicmen(x, Vm, K)
ggplot(Puromycin, aes(conc, rate, colour = state)) +
geom_point() +
geom_smooth(method = "nls",
formula = micmen.formula,
se = FALSE) +
stat_fit_tidy(method = "nls",
method.args = list(formula = micmen.formula),
label.x = "center",
label.y = "bottom",
vstep = 0, hstep = -0.3,
aes(label = paste("V~`=`~frac(", signif(after_stat(Vm_estimate), digits = 2), "~C,",
signif(after_stat(K_estimate), digits = 2), "+C)",
sep = "")),
parse = TRUE) +
labs(x = "C", y = "V") +
theme_bw()
```

## 9.1 Alternatives

In all cases it is possible to do the model fitting before creating the plot, assembling labels in one’s own R code and passing them to `ggplot()`

as data. The possibilities are nearly unlimited but implementing them, depending on one’s own familiarity with R and ‘ggplot2’ can become very time consuming. I have in part written ggpp and ggpmisc to avoid as a user having to remind myself of how to write such code repeatedly. My hope is that others will also save time and effort by using the packages I have developed.