An R marathon updating packages

Recent and approaching code-breaking changes in the tidyverse packages ‘tidyselect’, ‘rlang’, ‘tidyr’, ‘dplyr’, ‘readr’ and ‘ggplot2’ meant that keeping my packages fully functional required changes to several of them. None of the changes in these packages made my packages fail CRAN checks, but they either made some functions unusable or triggered innumerable warnings in some use cases. In some cases the behaviour and warnings were weird and rather unpredictable, especially those due to the changes in ‘tidyselect’ version 1.2.0, which were also visible in ‘dplyr’ and ‘tidyr’. So below is a summary of my R-intensive week.


ggpp/ggpmisc and ggbreak

The recently released ‘ggbreak’ package (version 0.0.3), by Guangchuang Yu and Shuangbin Xu, makes it possible to add scale breaks to plots, an important feature previously unavailable for ggplots. Unfortunately, several geoms from ‘ggpp’ (<= 0.4.0) do not currently work well together with ‘ggbreak’, which also affects ‘ggpmisc’ (<= 0.4.0). In other words, insets and layers created with geoms based on npc pseudo-aesthetics are added to all split subpanels. I will see if it is possible to get these packages to cooperate in the future.

Code breaking changes in ‘tidyr’ 1.0.0

Three of my packages needed updates due to code-breaking changes introduced in ‘tidyr’ 1.0.0, an update which will soon be submitted to CRAN by its authors. I received warning e-mails only about ‘photobiologyInOut’, which I updated and submitted to CRAN some time ago. More recently I noticed that ‘ggspectra’ also had to be updated, and this in turn required an update to ‘photobiology’. ‘ggspectra’ 0.3.4 is ready to be submitted to CRAN as soon as ‘photobiology’ 0.9.29 is built on CRAN servers.

If you have been using ‘tidyr’ in your scripts, you may need to update them. The required edits are rather small but can come as a surprise, which is why I am writing this post.
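This post does not list the specific edits, so the following is only an illustration. One documented change in ‘tidyr’ 1.0.0 is the syntax of nest() and unnest(), which now expect the columns to be named explicitly. A minimal sketch, using a made-up data frame and assuming ‘tidyr’ >= 1.0.0 is installed:

```r
library(tidyr)

# Made-up example data, not taken from any of the packages mentioned above
df <- data.frame(g = c("a", "a", "b"),
                 x = 1:3,
                 y = c(2.5, 3.5, 4.5))

# Before 1.0.0 one would write nest(df, x, y); since 1.0.0 the new
# list-column is named and the nested columns are given with c()
nested <- nest(df, data = c(x, y))

# Before 1.0.0 a bare unnest(nested) was accepted without complaint;
# since 1.0.0 the columns to unnest should be stated with 'cols'
flat <- unnest(nested, cols = c(data))
```

Scripts using the old forms may still run, but with warnings, so the edits are small once you know where to look.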

Encryption and https:

The site has been updated to use encryption. For the web site, old addresses using http: are silently redirected to https:. If you have earlier received warnings from browsers because of the lack of encryption, this upgrade to the server settings should remove them.

The repository at the subdomain https://r.photobiology.info also supports encrypted connections. In addition, because security has been tightened in other ways, if you have previously been using the address http://www.photobiology.info/R to access the repository, you will need to use https://r.photobiology.info instead from now on. In other words, if you are setting this repository address in an .Rprofile file you should edit it to use the new address.
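As a rough sketch of what such an .Rprofile entry could look like: the code below adds the new address to the list of repositories R already knows about. The label "r4photobiology" is only an assumed name for the entry; any label can be used.

```r
# Possible fragment for an .Rprofile file (a sketch, adapt as needed)
local({
  repos <- getOption("repos")
  # "r4photobiology" is an arbitrary label chosen for this example
  repos["r4photobiology"] <- "https://r.photobiology.info"
  options(repos = repos)
})
```

Wrapping the code in local() avoids leaving a stray variable named repos in the workspace at start-up.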

The documentation should be accessed through https://docs.photobiology.info as settings for this subdomain have been changed in the same way.

All these changes should ensure enhanced security for the site and downloads.

 

R 3.5.0 has been released

R 3.5.0 was released on 23 April. It includes several performance improvements. All packages in the ‘r4photobiology’ suite pass checks on the new version. CRAN binaries for R 3.5.0 have been built for all packages. In the case of development versions and packages not in CRAN, the repository at http://r.r4photobiology.info/ will soon be updated to include binaries suitable for R 3.5.0. Meanwhile, packages will be installed from sources if the required tools are available (e.g. RTools 3.5.0 or 3.4.0 installed on Windows computers).

The update from R 3.4.x to R 3.5.x requires that all packages are reinstalled. For the time being avoid using package ‘installr’ to do the update, at least on Windows, as copying installed packages from an earlier installation is not useful. Please see the page Upgrading R for instructions.
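The instructions on the Upgrading R page are not reproduced here. As a hedged sketch using only base R functions, the following lists the installed packages that were built under an R version earlier than 3.5.0 and are therefore candidates for reinstallation:

```r
# Information about all packages in the current libraries
pkgs <- installed.packages()

# The "Built" field records the R version each package was built under
built <- package_version(pkgs[, "Built"])

# Names of packages built under R versions older than 3.5.0
old <- rownames(pkgs)[built < "3.5.0"]
old

# One way to reinstall them (not run here, it needs network access):
# update.packages(checkBuilt = TRUE, ask = FALSE)
```

With checkBuilt = TRUE, update.packages() treats packages built under an earlier major or minor version of R as outdated, even if no newer version is available in the repositories.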

RANDOM.ORG – True Random Number Service

Source: RANDOM.ORG – True Random Number Service

In most situations pseudo-random numbers produced by computer software (“random” number generators) are good enough as long as we are careful when choosing the seed for the generator. Sometimes, it can even be an advantage to be able to reproduce sequences of pseudo-random numbers by setting the seed value. Frequently, the seed is obtained from the clock of the computer, e.g. using the seconds or milliseconds digits from the current time. This is still not truly random, as random numbers cannot be generated by any deterministic process. True random numbers can only be generated by a random physical process.
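The reproducibility obtained by setting the seed can be seen with a few lines of base R. The seed value 1234 is arbitrary and chosen only for this example:

```r
# Setting the same seed twice yields the same pseudo-random sequence
set.seed(1234)
first <- runif(5)

set.seed(1234)
second <- runif(5)

identical(first, second)  # TRUE
```

Without the calls to set.seed() the two sequences would, in practice, differ.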

The site random.org is a service which provides true random numbers for free (at least if below a quota). R package random provides an interface to this service.

On Statistical Progress in Ecology | Ecological Rants

There is a general belief that science progresses over time and given that the number of scientists is increasing, this is a reasonable first approximation. The use of statistics in ecology has bee…

Source: On Statistical Progress in Ecology | Ecological Rants

A blog post by Charles Krebs complements my previous posts on P-values and reminds us that statistical analysis and models are tools in our search for understanding. In the end what matters are the “new insights” as Krebs writes, or more boldly as I sometimes say “ideas are what matters, data just let us imagine and assess those new ideas.”

More about P-values: what are the alternatives?

I earlier mentioned that a high-ranking journal in Psychology called “Basic and Applied Social Psychology” has banned the use of P-values. Today, I came across some additional material on this question. First of all, the controversial editorial where the decision was announced.

Second, a paper published in this journal gives guidelines on the best way of presenting results without the use of P-values. The paper by Geoff Cumming, titled “The New Statistics: Why and How”, makes a good argument for using confidence intervals and other descriptive statistics in place of P-values.

He also has a series of videos on YouTube, of which the three linked to below are related to the use (and misuse) of P-values. For my liking he does not make a clear enough distinction between the problem inherent to P-values (that they discard a lot of information to reach a true/false decision) and those problems due to the misuse and misinterpretation of tests of significance. He does mention the difference, but you need to keep your eyes and ears open to get this out of his presentations.

In addition, a blog and a podcast of a round table complete the discussion of this issue, giving a somewhat wider account of the controversy surrounding the use of P-values.
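To make the idea of reporting confidence intervals concrete, here is a small sketch in base R. The data are simulated only for this illustration and the group names are made up; the point is that the same fitted object provides the estimates and an interval for the difference, not just a P-value.

```r
# Simulated data, for illustration only
set.seed(42)
treatment <- rnorm(20, mean = 11, sd = 2)
control   <- rnorm(20, mean = 10, sd = 2)

# Welch two-sample t-test (the default in t.test())
fit <- t.test(treatment, control)

fit$estimate   # the two group means
fit$conf.int   # 95% confidence interval for the difference in means
```

The interval conveys both the size of the difference and its uncertainty, information that is lost when only a significant/not significant outcome is reported.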

 

How to be a modern scientist by Jeffrey Leek [Leanpub]

A book on how to be a scientist the modern way.

Source: How to be a modern… by Jeffrey Leek [Leanpub PDF/iPad/Kindle]


This book looks very useful for PhD students and also, to some extent, for more experienced researchers willing to get up to speed with the use of modern communication tools and on-line media and forums.

It covers a lot of subjects concisely and is very up-to-date. It is an easy read but full of useful information and ideas.

The e-book has a suggested price, but you can choose to get it for free or pay less if you are on a tight budget. Payment is fully voluntary, so you can also pay more than the suggested price if you want to support the author.