Tuesday, June 18, 2013

Part 1: R and Google Analytics.

At the start-up where I work, we rely heavily on data from Google Analytics, and I've found that looking only at the dashboard limits how deeply I can dig into patterns and behaviors. R is a programming language designed for statistical analysis and graphical data visualization. It's a command-line language, similar in some ways to Python, but it can also be used from graphical applications. This is the main reason why I love working with it on a day-to-day basis.
So if you're an engineer or analyst who wants to do a bit more with the data from Google Analytics, luckily there's an R package for Google Analytics that makes working with, and visualising, API data a little less challenging.
I've recently been playing around with R and Google Analytics for internal reports and data deep dives and thought I'd share some pointers on how to get started with some simple projects.
I'm running Ubuntu 12.04+ and I'm assuming you already have R running on your machine; otherwise you might want to look at my previous tutorial on installing R on Ubuntu Linux machines. Besides R itself, the R for Google Analytics code uses a couple of extra packages for fetching the data from the web service and for parsing the resulting XML. To avoid any dependency errors, please go ahead and install the packages below.
thedatafugee:~$ sudo apt-get install r-cran-xml libcurl4-gnutls-dev libxml2-dev


1. Kickstart R and install RCurl and XML

The next step is to install the required RCurl package. To do that, start R by opening a terminal and typing R, then enter a command to download the package from one of the CRAN mirrors.
I always use Imperial College's CRAN mirror, which is at http://cran.ma.imperial.ac.uk.
You can download RCurl from your chosen mirror by entering the following command in the R command line interface:
install.packages("RCurl", repos = "http://cran.ma.imperial.ac.uk")
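If that runs OK, issue the command below to install the XML package in R (XML is also published on CRAN, so the mirror above should work too if the Omegahat repository is unreachable):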
install.packages("XML", repos = "http://www.omegahat.org/R")
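
Before moving on, it can help to confirm that both packages actually installed. The snippet below is a minimal sketch and not part of the original setup: `missing_packages` is a hypothetical helper name introduced here for illustration, and it relies only on base R's `installed.packages()` to report which of the requested packages are still missing.

```r
# Hypothetical helper (not part of RCurl, XML or RGoogleAnalytics):
# returns the names in `pkgs` that are not installed in any library path.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Check the two packages installed in this step.
still_missing <- missing_packages(c("RCurl", "XML"))
if (length(still_missing) == 0) {
  message("RCurl and XML are installed.")
} else {
  message("Still missing: ", paste(still_missing, collapse = ", "))
}
```

If anything is reported as missing, re-run the matching install.packages() command and read the error output; on Ubuntu it is usually a missing system library from the apt-get step.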

2. Install R Google Analytics

I'm a little ashamed to say it, but I only recently found out that there's a pre-built package for using R with Google Analytics.
To install RGoogleAnalytics, start R (just type R in a terminal) and then enter this command (with the location and file name of your RGoogleAnalytics tarball download):

install.packages("/home/ngamita/Downloads/RGoogleAnalytics_1.3.tar.gz",repos=NULL,type="source")
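
A wrong path is the easiest mistake to make here, so a small guard can save some confusion. The following is only a sketch under assumptions: `install_local_tarball` is a hypothetical wrapper name made up for this example, and the path is the one from my machine, so swap in your own download location.

```r
# Hypothetical wrapper: installs a source tarball only if the file exists.
install_local_tarball <- function(path) {
  if (!file.exists(path)) {
    message("Tarball not found: ", path)
    return(FALSE)
  }
  install.packages(path, repos = NULL, type = "source")
  TRUE
}

# Path from the example above; replace it with your own download location.
tarball <- "/home/ngamita/Downloads/RGoogleAnalytics_1.3.tar.gz"
if (install_local_tarball(tarball)) {
  # requireNamespace() returns FALSE rather than stopping if the install failed.
  message("RGoogleAnalytics available: ",
          requireNamespace("RGoogleAnalytics", quietly = TRUE))
}
```

The wrapper returns FALSE without calling install.packages() when the file is not there, so you get a clear message instead of a long install error.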

In part 2 of this series, I'll take you step by step through accessing and querying your internal Analytics data and running some analyses and visualizations with the R programming language. It's amazing what you can do offline with this data. Watch out for part 2.
