Benchmarking function sun_angles() [updated]

As far as I know there are in CRAN four R packages implementing the computations for the position of the sun and times of sunrise and sunset: ‘photobiology’, ‘fishmethods’, ‘solartime’ and ‘suncalc’.

The functions sun_angles() and day_night() from package ‘photobiology’ use Meeus equations as used by NOAA Solar Calculator https://www.esrl.noaa.gov/gmd/grad/solcalc/ which could be more precise than those in NOAA’s Excel worksheet which implement a simplified version of the Meeus equations especially for far into the past or far into the future calculations. The approximations based on Meuus equations are very good for years between 1800 and 2100 and results should still be sufficiently accurate for the range from -1000 to 3000 as long as the computation of Julian dates is correct. The Excel implementation is only valid for dates between 1901 and 2099 because of how Julian dates are computed in Excel.

Function astrocalc4r() from package ‘fishmethods’ also implements Meeus equations (the authors work at NOAA). Function computeSunPosition() from package ‘solartime’ uses unspecified equations and function getSunlightPosition() is an R interface to the ‘suncalc.js’ library, part of the ‘SunCalc.net’ project <http://suncalc.net>.

Function computeSunPosition() from package ‘solartime’ uses unspecified equations and function getSunlightPosition() is an R interface to the ‘suncalc.js’ library, part of the ‘SunCalc.net’ project <http://suncalc.net>.

UPDATED on 2019-04-24

I have noticed significant differences in the values returned by equivalent functions from different packages. Up to now the tests on the functions of my own package ‘photobiology’ have revealed only very small mismatches to the NOAA Solar Calculator. These small errors, noticeable for dates far from the present, were due to the use of base R’s julian() function, which is not designed to be precise enough for astronomical calculations. The code now in the repository at Bitbucket has been revised to use Meuus’ algorithm for the calculation of Julian days removing this source of  small discrepancies.

In contrast, while testing ‘photobiology’ against other packages, I seem to have found a bug in function astrocalc4r() from R package ‘fishmethods’.

A minimal example follows:

By only changing the hour passed as argument different times for sunrise, sunset and daylength are returned even though the day is the same. The differences are larger at high than at low latitudes. The maximum difference for the example above is 1/4 h for daylength. Comparison against the NOAA Solar Calculator shows even larger differences.

(The bug has been reported to the maintainer of package ‘fishmethods’.)

Performance

It is important to realize that this comparison is done from a users’ viewpoint, as the different functions compared differ in how many additional computed values they return. We start by comparing functions from packages ‘photobiology’ and ‘solartime’.

I have not yet benchmarked function astrocalc4r() from package ‘fishmethods’ as time arguments are passed as several numerical values. I need to first study which parameters accept vectors. In contrast preliminary tests of getSunlightPosition() showed it to be so much slower than the other two functions as to not be possible to include it in the same benchmark sessions.

The two implementations for computation of solar angles differ in that function sun_angles()from package ‘photobiology’ is implemented purely in R but optimized for performance through vectorization to avoid explicit loops and by factoring out of invariant computations and function computeSunPosition() from package ‘solartime’ is written in C++ using package RCpp.

The mathematical algorithms used are different. Function sun_angles() uses one of the most precise algorithms currently available (Meeus equations as used in NOAA’s on-line web calculator), an algorithm which is valid over thousands of years before and after present time. In contrast,
computeSunPosition() uses a simpler algorithm that returns the same angle values irrespective of the year. This may be good enough in many cases.

Function sun_angles() has its performance optimised for the case when the sun position needs to be calculated for the same geographic coordinates and numerous time points. In this case and only this case it outperforms
computeSunPosition() when more than a few hundred time points are passed as argument. The figure below is a log-log plot so for large numbers of time points the difference becomes significant. In practice, for this case, sun_angles()outperforms computeSunPosition() in all cases when the total computation time exceeds about 3 ms.

Expressed per time point, it is clear that once the expensive equation of time computation is reused for many time points sun_angles() becomes more than 4 times faster.

When we compute the sun angles at many different latitudes,
computeSunPosition() seemed at first a lot faster than sun_angles() but it was triggering a warning suggesting that vectorization was not working correctly. Using a for loop to walk over the latitudes may be an unfair test but it is what I would use in practice before knowing more about the origin of the warning. I will replace this test when I understand better the problem. In this case, based on what seems to me like the type of situation when one would use the functions, I computed at each location the sun position at the hour during a whole day. In this test computeSunPosition() takes approximately 1/3 of the time that sun_angles() takes.

The performance comparisons above show in my opinion that the choice of algorithm and an efficient implementation in R can be as important as the use of a compiled language like C++ in place of R. It remains to be determined how much of the difference in performance is due to the use of a simpler algorithm in computations related to geographic location and how much due to their implementation in C++. In fact there seems to be still room for further optimizations of the R code for the case of a single time point and multiple locations.

Accuracy


I have earlier tested sun_angles() against the NOAA interactive web page, and agreement has been very good. In addition, the values returned by sun_angles() and astrocalc4r() are unsurprisingly within a few hundreds of a degree given that they implement the same algorithm. In the output from getSunlightPosition() azimuth angles are expressed not only in radians, but clockwise from the South instead of from the North as is the case for the other functions. For conversion to degrees from North we need to compute(pi + azimuth) * 180 / pi %% 360. Preliminary tests showed very good agreement for solar elevation and some differences for azimuth. I may do a more thorough comparison in the future.

The values returned by sun_angles() are different to those returned by computeSunPosition() even after expressing angles in the same units. However, for all the 64 locations each at 432 time points I tested, in most cases the differences were within a few degrees for elevation but occasionally a lot more for azimuth. The smaller differences look systematic, and they may stem from ignoring several features of the movement of the Earth and sun by using simpler equations.

Anyway, the differences between values calculated with the two functions, especially in the cases when the differences are large, need further investigation. In the case of computeSunPosition() there are cases when the function fails to return a any value, such as at the equator, or for the azimuth as in some years and locations. These are most likely due to numerical computation errors or because of using the implemented algorithm outside its expected range of validity (In the package documentation the algorithm is not described and no reference to literature has been provided)

The disagreement in the calculated solar elevation can be expected to be irrelevant in many use cases.

In the case of azimuth, differences are in some isolated cases huge, and in some cases NaN is returned by computeSunPosition().

To be able to observe the more frequent but smaller deviations we zoom into the few degrees of spread around no difference. This figure raises some concern as the discrepancies are large even for year 2020 and follow an unusual pattern.

The last concern is the calculation of solar time. Here differences are large, especially at midnight and in some years. Differences of more than 15 minutes in the error between different times of the day can be observed which seems unusual. I did compare the values returned by sun_angles()to those returned by the NOAA’s on-line calculator for year 0 and the disagreements are at most a few seconds.

I started these tests mostly to verify that computations done in my package ‘photobiology’ are reliable. Differences can stem from the algorithms used, and how they are implemented in code. Some further differences can be caused by different implementations of system functions, in this case the conversion to Julian dates and times are crucial. In package ‘photobiology’ I had used base::julian() which ignores some leap years and introduced rather small errors. As of version 0.9.27 I have implemented Meuus algorithm for Julian days to ensure good accuracy for dates in the range 1000 BC to 3000 AD.

Share with

Leave a Reply