Recently, the R Consortium accepted a new project called histoRicalg. The main goal of the project is to document and transfer knowledge of some older algorithms used by R and by other computational systems. A lot of R is written in Fortran, much of it in the old F77 format, and in C, with original implementations that may themselves have been in older languages. Tracing the provenance of these routines is worthwhile in its own right: first, to understand how the code evolved; second, to transfer much of the compiled wisdom and experience of those who wrote these algorithms to a new generation of statisticians, programmers, and analysts. Reviewing the evolution of these algorithms becomes even more valuable when it unearths potential or actual bugs, as was recently seen in nlm() and in optim::L-BFGS-B.
To that end, Dr. John C. Nash, well known for his contributions to R and to the general statistical knowledge base, recently started this project. Project members do not have to be academics or expert programmers (I certainly am neither!). Rather, anyone who is interested in helping document how these fundamental algorithms came to be, and what may be done to make them better, is enthusiastically invited. Younger R users, statisticians, programmers, or anyone willing to learn about the methods that underlie the computations in R are especially welcome. This relates directly to one of the project’s first goals, which is to create a “Working Group on Algorithms Used in R” focused on identifying and prioritizing algorithmic issues and on developing procedures for linking older and younger workers. The project is hosted on GitLab, where there is a wiki and already nearly a dozen vignettes on various R functions. For more information, or to sign up, contact Dr. Nash directly or via the project’s mailing list. Looking forward to seeing you there soon!