Draft

# Plotting circular data with ‘ggplot2’

R
plotting
Author

Pedro J. Aphalo

Published

2024-03-06

Modified

2024-03-06

Abstract

True circular plots are not just linear plots bent into a circle. In a true circular plot the circle is unbroken, and computations such as density estimates are continuous around the full circle. Some examples are given of how the lack of true circular plots shows up in ggplot2. I plan to add examples of true circular plots in the future.

Keywords

ggplot2 pkg, data visualization, dataviz

Note

To see the source of this document click on “</> CODE” to the right of the page title. The page is written using Quarto which is an enhanced version of R Markdown. The diagrams are created with Mermaid, a language inspired by the simplicity of Markdown.

Warning

Package ‘ggplot2’ has gained new features over its long life, and although few changes have been ‘code breaking’, you should be aware that the examples on this page have been tested with version 3.5.0.

```r
library(ggplot2)
library(lubridate)
```

```
Attaching package: 'lubridate'

The following objects are masked from 'package:base':

    date, intersect, setdiff, union
```

```r
library(learnrbook) # for the wind data
```

# 1 Introduction

Circular plots can be disk-shaped or doughnut-shaped. They can be used to plot both linear and circular data. These plots use two coordinate axes, the angle around the circle and the radius, to represent pairs of values. Pie charts have values along a single positional axis, the angle around the circle, and will not be considered further here.

All variations of round-shaped plots are, of course, most suitable for plotting circular data, such as angles or positions along a closed loop. Linear data can be plotted on a circle by bending one of the linear axes of the Cartesian coordinates into a circle or arc and projecting the other linear axis onto the radius. This is artificial, and can be quite confusing for the reader.

Examples of circular data are wind direction, various measurements relative to the time of day, and gene positions in the circular chromosomes of some bacteria and of cell organelles. When we plot individual observations as points, adding `coord_polar()` or `coord_radial()` to a ggplot creates a circular plot.
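As a minimal sketch of this step, the code below builds a Cartesian scatter plot and then bends its x axis into a circle. It assumes simulated data in place of real observations: the data frame `wind` and its columns `direction` and `speed` are made up for illustration.

```r
library(ggplot2)

# Simulated stand-in for wind observations (hypothetical values)
set.seed(42)
wind <- data.frame(
  direction = runif(200, min = 0, max = 360), # degrees
  speed = rexp(200, rate = 0.5)               # arbitrary speed units
)

# Cartesian version: direction on x, speed on y
p_linear <-
  ggplot(wind, aes(direction, speed)) +
  geom_point(alpha = 0.3) +
  scale_x_continuous(expand = expansion(0, 0),
                     limits = c(0, 360),
                     breaks = 0:3 * 90)

# The same plot with the x axis bent into a circle
p_circular <- p_linear + coord_polar()
p_circular
```

Replacing `coord_polar()` with `coord_radial()` should give a similar plot in 'ggplot2' 3.5.0 and later.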

When we plot individual observations (Figure 1), or apply binning such that one boundary between bins falls at the closing point of the circle, 0/360 degrees (Figure 2), plots of circular data work well even though the coordinate system is a bent line rather than a closed circle.
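The role of the bin boundary can be checked with a little arithmetic. The sketch below uses base R only. The shift-and-wrap step at the end is a hypothetical workaround written for illustration, not a feature of 'ggplot2', and the values in `direction` are made up.

```r
binwidth <- 22.5

# Breaks with a boundary anchored at 0, as with stat_bin(boundary = 0):
# 16 bins tile the range 0 to 360 exactly
breaks_boundary <- seq(from = 0, to = 360, by = binwidth)

# Breaks with a bin centred on 0 instead: the first and last bins stick out
# of the range, so the sector around 0/360 degrees is split between two bins
breaks_centred <- seq(from = -binwidth / 2, to = 360 + binwidth / 2, by = binwidth)

# Hypothetical workaround: shift the angles by half a bin and wrap them with
# the modulo operator, so that sector 0 spans 348.75 to 11.25 degrees
direction <- c(5, 355, 10, 350, 90) # made-up observations, in degrees
sector <- floor(((direction + binwidth / 2) %% 360) / binwidth)
table(sector)
```

With this wrapping, the four observations close to 0/360 degrees are counted in the same sector even though they lie on both sides of the closing point.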

```r
# wind speed against wind direction, one point per observation
p <-
  ggplot(viikki_d29.dat, aes(WindDir_D1_WVT, WindSpd_S_WVT)) +
  geom_point(alpha = 0.15) +
  scale_x_continuous(expand = expansion(0, 0),
                     limits = c(0, 360),
                     breaks = 0:3 * 90) +
  scale_y_continuous(expand = expansion(c(0, 0.05)))
```

```r
# histogram of wind direction, with a bin boundary at 0 degrees
p <-
  ggplot(viikki_d29.dat, aes(WindDir_D1_WVT)) +
  stat_bin(binwidth = 22.5, boundary = 0) +
  scale_x_continuous(expand = expansion(0, 0),
                     limits = c(0, 360),
                     breaks = 0:3 * 90) +
  scale_y_continuous(expand = expansion(c(0, 0.05)))
```

With the histogram above (Figure 2) we were constrained in our choice of `boundary` between bins. When fitting a probability density to circular data, if the estimate ignores circularity, the plot contains a spurious break or discontinuity where the two ends meet at 0 degrees (Figure 3).
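The effect at the closing point can also be seen numerically, without plotting. The sketch below is an illustration under stated assumptions: the directions are simulated, the bandwidth of 10 degrees is arbitrary, and replicating the data one turn below and one turn above is a simple approximation, not a method taken from 'ggplot2'.

```r
set.seed(1)
# Simulated directions clustered near 20 degrees and wrapped into 0 to 360
direction <- rnorm(500, mean = 20, sd = 30) %% 360
bw <- 10 # arbitrary bandwidth, in degrees

# Linear kernel density estimate on 0 to 360: ignores circularity
d_linear <- density(direction, bw = bw, from = 0, to = 360, n = 361)

# Wrapped estimate: add copies of the data shifted by -360 and +360 degrees,
# estimate on the tripled data, then rescale so that the area over
# 0 to 360 is close to 1
d_wrapped <- density(c(direction - 360, direction, direction + 360),
                     bw = bw, from = 0, to = 360, n = 361)
d_wrapped$y <- d_wrapped$y * 3

# Estimates at the two ends of the axis (0 and 360 degrees)
ends <- c(1, 361)
rbind(linear = d_linear$y[ends], wrapped = d_wrapped$y[ends])
```

The linear estimate misses, at each end, the observations lying just across the closing point, so it is lower there than the wrapped estimate. The wrapped estimate takes nearly the same value at 0 and at 360 degrees.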

```r
# the lower y scale expansion needs to be set to 0 to avoid a hole in the center
# the x scale expansion needs to be set to 0, so that 0 degrees and 360 degrees meet

p <-
  ggplot(viikki_d29.dat, aes(WindDir_D1_WVT)) +
  stat_density(geom = "area", fill = "grey50") +
  scale_x_continuous(expand = expansion(0, 0),
                     limits = c(0, 360),
                     breaks = 0:3 * 90) +
  scale_y_continuous(expand = expansion(c(0, 0.05)))
```

```r
# 2D density of wind speed against wind direction, shown as filled contours
p <-
  ggplot(viikki_d29.dat, aes(WindDir_D1_WVT, WindSpd_S_WVT)) +
  stat_density_2d_filled(alpha = 0.66) +
  scale_x_continuous(expand = expansion(0, 0),
                     limits = c(0, 360),
                     breaks = 0:3 * 90) +
  scale_y_continuous(expand = expansion(c(0, 0.05))) +
  theme(legend.position = "none")
```

Solving these problems requires new statistics (`stats`) that take circularity into account; the existing geometries are sufficient.
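One way to picture this division of labour is to compute a circular density estimate outside 'ggplot2' and pass the result to an existing geometry. The sketch below makes several assumptions: `circular_density()` is a hypothetical helper written for this example, the von Mises kernel and the concentration `kappa = 30` are arbitrary choices, and the data are simulated.

```r
library(ggplot2)

# Hypothetical helper (not part of 'ggplot2'): kernel density estimate on the
# circle with a von Mises kernel; larger 'kappa' gives a narrower kernel
circular_density <- function(degrees, kappa = 30, n = 360) {
  theta <- degrees * pi / 180             # observations in radians
  grid <- seq(0, 360, length.out = n + 1) # both 0 and 360 are included
  grid_rad <- grid * pi / 180
  # exponentially scaled Bessel function keeps the computation stable
  norm_const <- 2 * pi * besselI(kappa, nu = 0, expon.scaled = TRUE)
  dens <- vapply(grid_rad,
                 function(g) mean(exp(kappa * (cos(g - theta) - 1))),
                 numeric(1)) / norm_const
  data.frame(direction = grid,
             density = dens * pi / 180)   # per degree instead of per radian
}

set.seed(7)
wind <- data.frame(direction = rnorm(500, mean = 20, sd = 30) %% 360) # simulated
dens <- circular_density(wind$direction)

# An existing geometry is enough to draw the precomputed estimate
p_density <-
  ggplot(dens, aes(direction, density)) +
  geom_area(stat = "identity", fill = "grey50") +
  scale_x_continuous(expand = expansion(0, 0),
                     limits = c(0, 360),
                     breaks = 0:3 * 90) +
  scale_y_continuous(expand = expansion(c(0, 0.05))) +
  coord_polar()
p_density
```

Because the kernel depends only on the cosine of the angular difference, the estimate is the same at 0 and at 360 degrees, so the two ends of the area meet without a break.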

# 2 What next?

```r
# histogram with narrower bins and no `boundary` set
p <-
  ggplot(viikki_d29.dat, aes(WindDir_D1_WVT)) +
  stat_bin(fill = "grey50", binwidth = 11.25) +
  scale_x_continuous(expand = expansion(0, 0),
                     limits = c(0, 360),
                     breaks = 0:3 * 90) +
  scale_y_continuous(expand = expansion(c(0, 0.05)))
```