We need to talk about coding: getting started with R

By Jenni Hislop

We keep it a secret from undergraduates and Master’s degree students. But sooner or later the truth must be told: point-and-click software will only get you so far, so if you really want to be a good health economist, eventually you’re going to have to learn how to code. Yes, yes, I know that if you’d wanted to do programming you’d have just studied computing science and not economics. But life is hard, and the further up the health economics research ladder you venture, the harder it becomes to build and run the kind of models you need to simulate real-world healthcare practices without knowing a little bit of code.

Having decided that I just want a single environment where I can accomplish all my statistical needs, I face the next question: which to choose? Many people (including several within our own Health Economics Group) certainly appreciate Stata. However, University managers certainly appreciate things that are free; enter R. Both a so-called environment for statistical computing and the programming language of that environment, R is a GNU project, which basically means it’s part of some global conspiracy to try and make the world a better place, or something. That helps explain the whole thing about it being free and, consequently, why it’s becoming more and more widely used.

But the learning curve is still steep enough to deter the statistically minded but as yet uninitiated in R. Luckily, the University here at Newcastle runs a series of R courses to help folk like me get over some of that initial trepidation, so last month I enrolled for a week’s worth of training in all things R.

These courses are open to anyone, not just University employees. Although courses 1, 2, 3, 5 and 6 are taught over the course of a week, each is a stand-alone, single-day course, so attendees with some prior understanding of coding and/or R can opt in or out depending on what is being taught that day and tailor the training to their own needs. The teaching format is a good mix of lectures and practical sessions, which lets attendees check that they really understand what’s being said before they have to apply it when sitting down in front of RStudio and getting on with it. Dr Gillespie and his team are also incredibly approachable (even when your error messages are down to you putting a comma in the wrong place for the nth time that day). And because not everything can be covered in a week, and because if there’s one thing scarier than starting an R course it’s finishing it and realising you’re on your own from now on, some time during Course Three is devoted to identifying useful sources of online help.

This course will not suddenly make you an R genius or stop you from ever putting another comma in the wrong place and getting an error message. But in the space of a week it does provide a very good overview of R and what it can do for you as a user. It also helps demystify R to such an extent that getting started with it as your default statistical software is no longer a daunting prospect, and that in itself is something of an achievement.