ggpmisc 0.3.6

Package ‘ggpmisc’ focuses mainly on plot annotations. The new features added in this version required quite minor code changes but add features that I hope will be found useful.

Package documentation web site at: https://docs.r4photobiology.info/ggpmisc/

Changes from version 0.3.5 the most recent CRAN release, are:

New

Annotations using NPC coordinates

NPC or native plot coordinates, are very useful for annotations. On the other hand, they are not of any use for plotting actual data. The geometries using this kind of position coordinates defined in ‘ggpmisc’ make it possible to have different annotations in the different panels when using facets. Traditional annotations in ‘ggplot2’ do not use the data, but instead take as arguments constant values for the aesthetics, and consequently add identical annotations to all panels. Annotations in ‘ggplot2’ use data coordinates. In many cases, the desired position of annotations is unrelated to the data, but instead related to the native coordinates of the plotting viewport. Using NPC coordinates for annotations allows consistent positioning, which is very important from the graphical design perspective.

Starting from version 0.3.6  ‘ggpmisc’ exports a modified definition of annotate() from ‘ggplot2’. The modification adds support of the position aesthetics npcx and npcy retaining all other functionality unaltered. As a consequence geometries "text_npc""label_npc""table_npc", "plot_npc", and "grob_npc" can now be used as the first argument to annotate(). In addition single ggplots, data frames and grobs as well as lists of such objects are accepted arguments to label.

library(ggpmisc) # plot to be inset p <- ggplot(mtcars, aes(factor(cyl), mpg, colour = factor(cyl))) + stat_boxplot() + labs(y = NULL) + theme_bw(9) + theme(legend.position = "none") # main plot with p as an inset ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) + geom_point() + annotate("plot_npc", npcx = "left", npcy = "bottom", label = p) + expand_limits(y = 0, x = 0)

Tagging equations with group labels (experimental)

Starting from version 0.3.6  stat_poly_eq() supports use of grouping with equations and identifying them by using labels.  Previously use of  the color aesthetic was the only way of “linking” equations to plotted curves, which is frequently distracting or unavailable for printing. After some past failed attempts at implementing this, I recently realized that using a pseudo-aesthetic made implementation very easy and its use flexible and straightforward. The catch is that this relies on undocumented behavior of ‘ggplot2’ and will not necessarily work with future versions of ‘ggplot2’. Statistic stat_poly_eq() now copies grp.label from its input into its returned value. One can map any variable to the pseudo-aesthetic grp.label to achieve this. Values are passed to the output only if all values within the group are the same, otherwise grp.label is filled with NA. The signature of stat_poly_eq() remains unchanged.

library(ggpmisc) my.formula <- y ~ x ggplot(mtcars, aes(wt, mpg, linetype = factor(cyl), shape = factor(cyl), grp.label = factor(cyl))) + geom_point() + stat_smooth(formula = my.formula, method = "lm", colour = "black") + stat_poly_eq(aes(label = stat(paste("bold(\"cyl\"~~", grp.label, "*':')~~~", eq.label, sep = ""))), formula = my.formula, label.x = "right", parse = TRUE) + theme_classic()

Marking and or labeling group centroids

The new stat_centroid() and stat_summary_xy(). stat_centroid() applies the same function to x and y and this function defaults to mean(). In the case of stat_summary_xy() the functions applied to x and y are passed as separate arguments, and they both default to simply copying their input.

Markdown

Starting from version 0.3.6, statistic stat_poly_eq() can optionally generate character strings encoded in markdown suitable for geom_richtext() from package ‘ggtext’.

Acknowlegement: This update was encouraged by recent questions at stackoverflow. The tag [ggpmisc] is in use at stackoverflow for questions related to this package.

Documentation web site at http://docs.r4photobiology.info/ggpmisc/ includes all help pages, with output from all examples, and vignettes in HTML format.

NOTE: The new version of the package is on its way to CRAN.

Please raise issues concerning bugs or enhancements to this package through Bitbucket https://bitbucket.org/aphalo/ggpmisc/issues.

Word cloud figure from LaTeX index entries

I created the word cloud on the cover of “Learn R as a Language” using an R script that takes as input the file for the book index, as generated when creating the PDF from the LaTeX source files. This input file contained quite a lot of additional information, like font changes and page numbers that needed to be stripped into a clean list of words. Only later I realized that it would have been easier to produce a cleaner word list to start with. So, I first present the code revised to work with a simpler word list. This is actually tested with the book files to work. If you want to do something similar for your own book, follow the revised code in first section below. If you want to see the “hacked-up” code I really used for the cover as included in the book, it is in the second section below.

Continue reading“Word cloud figure from LaTeX index entries”

Performance of package ‘photobiology’

In recent updates I have been trying to remove performance bottlenecks in the package. For plotting spectra with ‘ggspectra’ an obvious performance bottleneck has been the computation of color definitions from wavelengths. The solution to this problem was to use pre-computed color definitions in the most common case, that of human vision. Many functions and operators as well as assignments were repeatedly checking the validity of spectral data. Depending on the logic of the code, several of these checks were redundant. It is now possible to enable and disable checks both internally and in users’ code. This has been used within the package to avoid redundant or unnecessary checks when the logic of the computations ensures that results are valid.

In addition changes in some of the ‘tidyverse’ packages like ‘tibble’, ‘dplyr’, ‘vctrs’ and ‘rlang’ seem to have also improved performance of ‘photobiology’ very significantly. If we consider the time taken to run the test cases as an indication of performance, the gain has been massive, with runtime decreasing to nearly 1/3 of what it was a few months ago. This happened in spite of an increase in the number of test cases from about 3900 to 4270. Currently the 4270 test cases run on my laptop in 23.4 s. Updates ‘rlang’ (0.4.7) and/or ‘tibble’ (3.0.3) appearing this week in CRAN seem to have reduced runtime by about 30% compared to the previous versions.

The take home message is that even though there is a small risk of package updates breaking existing scripts, there is usually an advantage in keeping your installed R packages and R itself up to date. If some results change after an update it is important to investigate which one is correct, as it is both possible that earlier bugs have been fixed or new ones introduced. When needed it is possible, although slightly more cumbersome, to install superseded versions from the source-package archive at CRAN, which keeps every single version of the packages earlier available through CRAN. With respect to R itself, multiple versions can coexist on the same computer so it is not necessary to uninstall the version currently in use to test another one, either older or newer.

R 4.0.1 and R 4.0.2

Two minor updates to R were released recently. R 4.0.1 contained a major bug under MS-Windows (affecting R-commander), so it was very soon followed by R 4.0.2 that fixes the problem. As always it is recommended to keep R up-to-date. R for photobiology packages and their dependencies are not affected and pass checks on R >= 3.6.0.