This page is an ongoing collection of tips and suggestions I have found useful (or discovered through much trial and error) when using R. As a living document, it starts as a haphazard collection; should it grow, I may reorder it.
Use a consistent coding style
I mainly follow Hadley Wickham’s style guide, although I have not yet settled on a consistent naming scheme for variables and functions. Another good resource is Paul Johnson’s brief exposition (PDF).
Benchmark your code
There are multiple ways to time code. Personally, I use the microbenchmark package. There is also the rbenchmark package, and the tried-and-true workhorse system.time(foo). Regardless of which you use, it can be illuminating to compare slightly different implementations. Which brings us to the next suggestion…
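As a rough illustration of comparing implementations, here is a minimal sketch. The two functions, sum_sq_loop and sum_sq_vec, are hypothetical examples made up for this page; the microbenchmark call only runs if that package happens to be installed.

```r
# Two hypothetical implementations of the same computation (sum of squares).
sum_sq_loop <- function(x) {
  total <- 0
  for (xi in x) total <- total + xi^2
  total
}
sum_sq_vec <- function(x) sum(x^2)

x <- runif(1e5)

# Confirm both versions agree before timing them.
stopifnot(isTRUE(all.equal(sum_sq_loop(x), sum_sq_vec(x))))

# Base R: system.time() reports user, system, and elapsed seconds.
t_loop <- system.time(for (i in 1:20) sum_sq_loop(x))
t_vec  <- system.time(for (i in 1:20) sum_sq_vec(x))
elapsed <- c(loop = t_loop[["elapsed"]], vectorized = t_vec[["elapsed"]])
print(elapsed)

# If microbenchmark is installed, it repeats each expression many times
# and summarizes the distribution of the timings.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    loop = sum_sq_loop(x),
    vectorized = sum_sq_vec(x),
    times = 50L
  ))
}
```

Exact timings will vary by machine and R version, so treat the numbers as relative rather than absolute.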
Profile slow code
Use R’s code profiling mechanisms, specifically Rprof, when dealing with slow code. Identifying the bottleneck and recoding it, or moving it into C++, can provide speed gains measured not in multiples but orders of magnitude!
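A minimal sketch of an Rprof session might look like the following. The function slow_fn is a hypothetical, deliberately inefficient example, and the sketch assumes R was built with profiling support (the usual case).

```r
# Hypothetical slow function: grows a vector one element at a time,
# which forces a copy of the whole vector on every iteration.
slow_fn <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) out <- c(out, sqrt(i))
  out
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)  # start sampling the call stack
res <- slow_fn(3e4)
Rprof(NULL)                        # stop profiling

# Summarize where the time was spent; by.self ranks functions by
# the time spent in the function itself.
prof <- summaryRprof(prof_file)
print(head(prof$by.self))

unlink(prof_file)
```

In a run like this, the summary should point at the repeated c() calls, which is the hint to preallocate the vector or vectorize the computation.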
Use R’s built-in optimized code as much as possible
This was not immediately obvious to me, but it makes sense. As an example, compare 1 - pnorm(4) with pnorm(4, lower.tail=FALSE). There is a small but measurable speed increase in the latter, probably because the subtraction happens in the C (or is it FORTRAN?) routine and not at the R level. If you need to make several billion calls, these savings can become meaningful. I’ve tested it with many of the basic distributions (normal, lognormal, gamma, etc.) and, as a rule, it seems to hold. I’ll be keeping an eye out for this one in the future.
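The comparison can be sketched as below. The loop count n is an arbitrary choice for illustration, and the timings are printed rather than asserted, since the size of the gap depends on the machine. The last two lines show a related side effect: far in the tail, the built-in argument is also more accurate, because pnorm(10) rounds to exactly 1 in double precision.

```r
# Both forms compute the upper-tail probability P(Z > 4).
p_sub  <- 1 - pnorm(4)
p_tail <- pnorm(4, lower.tail = FALSE)
stopifnot(isTRUE(all.equal(p_sub, p_tail)))

# Time many repeated scalar calls of each form (n is arbitrary).
n <- 1e6
t_sub  <- system.time(for (i in seq_len(n)) 1 - pnorm(4))
t_tail <- system.time(for (i in seq_len(n)) pnorm(4, lower.tail = FALSE))
print(c(subtract = t_sub[["elapsed"]], lower_tail = t_tail[["elapsed"]]))

# Far in the tail, the subtraction loses all precision.
far_sub  <- 1 - pnorm(10)                  # 0
far_tail <- pnorm(10, lower.tail = FALSE)  # about 7.6e-24
```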
Compare methods when inverting matrices
In much of my code, I have to invert a matrix (the Hessian of the negative log-likelihood function at the point of convergence, to obtain the variance-covariance matrix of the fitted parameters, if you really want to know). For various reasons, I used to use the QR factorization. Using a Cholesky decomposition followed by chol2inv was markedly faster in my cases.
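Here is a minimal sketch of that comparison. The matrix H is simulated as a stand-in for a real Hessian, and the dimension p and repetition count are arbitrary. Note that chol() requires a symmetric positive definite matrix, which a Hessian of the negative log-likelihood at a proper minimum should be.

```r
set.seed(1)

# Simulated stand-in for a Hessian: crossprod(X) is symmetric and,
# with more rows than columns of random data, positive definite.
p <- 200
X <- matrix(rnorm(5 * p * p), ncol = p)
H <- crossprod(X)

# Inverse via QR factorization (qr.solve inverts when b is omitted).
inv_qr <- qr.solve(H)

# Inverse via Cholesky factor: chol() returns the upper triangular R
# with H = t(R) %*% R, and chol2inv() computes the inverse from R.
inv_chol <- chol2inv(chol(H))

# The two approaches should agree to numerical tolerance.
stopifnot(isTRUE(all.equal(inv_qr, inv_chol)))

# Time repeated inversions with each method.
reps <- 20
t_qr   <- system.time(for (i in seq_len(reps)) qr.solve(H))
t_chol <- system.time(for (i in seq_len(reps)) chol2inv(chol(H)))
print(c(qr = t_qr[["elapsed"]], cholesky = t_chol[["elapsed"]]))
```

Whether the difference matters will depend on the matrix size and the BLAS in use, so it is worth timing on your own problem.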
Use a fast BLAS if possible
See this post and this update for more detail.
Test files used in 3.1.0 speed tests:
Hi,
thanks for tips.
P.s. Link for Paul Johnson’s brief exposition is broken.
My pleasure, and thanks for the heads up. It’s fixed now.
Most of your tips focus on code optimisation so my tip would be:
Focus on writing clear, readable code. Don’t worry about optimisation unless you have a performance problem.
I ditto that. But I’m concerned with research more than commercial applications.