Friday, April 25, 2014

Making maps with R.

Talk: 25.04.2014
Location: Mountbatten Offices, Kampala - Uganda.
Topic: Making simple Maps With R (PLE data example)
Displaying spatial data is a must-have skill for most journalists. Many tools can make maps, but only a few, such as R, cope well with large, robust datasets. A map is one of the best ways to visualize spatial data, and the colours have to be chosen to suit the context. Finally, a map is not complete without a legend, title, scale bar and north arrow. In this simple step-by-step tutorial I will walk you through merging shapefiles with Primary Leaving Exams (PLE) data in R, to find out which districts and regions of Uganda are performing better than the others.


# Visualization of how students performed across
# different districts in Uganda Primary Leaving 
# Examinations 2013/2014. 
# Plotting the Primary Leaving Exams data on a color-coded map,
# in less than 100 lines of R code. 

# Author: Richard Ngamita

# Disclaimer: These methods here may not be the best solutions,
# but seemed the easiest for getting started with spatial data 
# in R. For any feedback:

So let’s get started by loading the library we need to read our shapefile data.
# Load the rgdal library.
# If you don't have it,
# run this command: install.packages('rgdal')
library(rgdal)

# Set working directory
# Check if there, else create one.

# Set wd to data
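
The working-directory step described in the comments above might look like the following minimal sketch. The folder name `data` is taken from the comment; the variable name `data_dir` and the rest of the code are assumptions, not the original script.

```r
# Hypothetical folder name, taken from the "Set wd to data" comment.
data_dir <- file.path(getwd(), "data")

# Create the folder only if it is not there yet.
if (!dir.exists(data_dir)) {
  dir.create(data_dir, recursive = TRUE)
}

# Make it the working directory so that downloads land in it.
setwd(data_dir)
```

Creating the folder before calling `setwd()` avoids the error that `setwd()` raises when the target directory does not exist.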

# Download the Uganda district shapefiles from the web.
# A simple Google search for district shapefiles will pull this up.
download.file("", "")

# Unzip the shapefile.

# Read in the shapefiles.
# Run ?readOGR to find out more.
districts <- readOGR(".", "districts_2013_112_web_wgs84")

# Check the quality of the loaded data.
# Plot districts to check.
# Note: the @ sign accesses the slots of the "SpatialPolygonsDataFrame", e.g. districts@data.

## Pull in the Primary Leaving Exams results
# as a CSV data file. A quick web search for PLE
# data files will turn up the link.

# Download the UNEB PLE data from the website.
# Use method='wget' on *nix machines; Mac machines should
# use method='curl'. On Windows, I don't think you need to set a method.

# download the file to local directory
download.file('', destfile='ple.csv', method='wget')

# Read the UNEB CSV data, comma-separated, and include the header/column names.
# as.is = TRUE (assumed here) keeps text columns as character, so the data types are kept.
ple <- read.csv('ple.csv', sep=',', header=TRUE, as.is = TRUE)

# Check if loaded well
#str(ple) # Make sure data types are right.

# Drop the last column, which is of no use to us.
ple <- ple[,1:4]

# Convert Division1 to numeric.
# suppressWarnings() is used because NA or missing values are present.
ple$Division1 <- suppressWarnings(as.numeric(ple$Division1))
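
To see what this conversion does to entries that are not numbers, here is a small sketch on made-up values (the vector `raw` is hypothetical and not taken from the PLE file):

```r
# Made-up text values, as they might arrive from a CSV column.
raw <- c("120", "85", "X", "")

# Anything that cannot be parsed as a number becomes NA;
# suppressWarnings() only hides the coercion warning.
counts <- suppressWarnings(as.numeric(raw))

# Count how many entries were lost in the conversion.
n_missing <- sum(is.na(counts))
```

Checking `n_missing` after the conversion is a cheap way to confirm that only genuinely missing entries turned into `NA`.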

# RUN: install.packages('plyr')
# in case you don't have it installed.

# We want to slice and aggregate the counts.
# Get the total number of Division 1s per district.
ple_division1 <- aggregate(Division1 ~ District, data = ple, sum)
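
As an illustration of what the formula interface of `aggregate()` returns, here is a sketch on a toy data frame. The counts are made up; only the column names follow the script above.

```r
# Toy data: several schools per district, made-up Division 1 counts.
toy <- data.frame(
  District  = c("GULU", "GULU", "ARUA", "ARUA", "MOYO"),
  Division1 = c(10, 5, 7, NA, 3)
)

# One row per district with the summed counts.
# With the formula interface, rows with NA are dropped before summing.
totals <- aggregate(Division1 ~ District, data = toy, FUN = sum)
```

Because the formula method omits `NA` rows by default, a district keeps a total as long as at least one of its rows has a value.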

# Creates a dataframe with division1 totals per district, check if loaded well. 
# head(ple_division1)

# Use the match() function to append these two different dataframes into the one SpatialPolygonsDataFrame.
districts@data <- data.frame(
  districts@data,
  ple_division1[match(districts@data[, "DNAME_2011"],
                      ple_division1[, "District"]), ]
)
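
The `match()` step is easier to follow on plain data frames. The sketch below uses made-up values and two hypothetical objects, `shape_data` and `totals`, standing in for `districts@data` and `ple_division1`.

```r
# Stand-in for the attribute table of the shapefile.
shape_data <- data.frame(DNAME_2011 = c("ARUA", "GULU", "KAMPALA"))

# Stand-in for the aggregated exam results (made-up counts).
totals <- data.frame(District = c("GULU", "ARUA"), Division1 = c(15, 7))

# For each shapefile row, find the position of the same name in totals.
# Names without a match give NA.
idx <- match(shape_data$DNAME_2011, totals$District)

# Reorder totals to the shapefile's row order and bind the columns.
merged <- data.frame(shape_data, totals[idx, ])
```

The row order of the shapefile is preserved, which matters because each row must stay aligned with its polygon. Note that `match()` compares names exactly, so differences in case or spelling between the two sources leave `NA` in the merged columns.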

# Remove the repeated columns, specifically "District". 
districts@data$District <- NULL

# Rename the first column to something more meaningful.
colnames(districts@data)[1] <- 'Districts_2013'

# The shapefile, or SpatialPolygonsDataFrame, now contains our added field 'Division1',
# which holds the count of first grades per district.
# We can use this to create a choropleth map:

# First remove the districts without results (NA values).
# Subset the whole object, not just @data, so polygons and rows stay in sync.
districts <- districts[!is.na(districts$Division1), ]

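# Using the basic plot() function.
# Load the mapping and colour packages.
# RUN: install.packages('package name')
# if you don't have them installed.
library(RColorBrewer)  # brewer.pal() colour palettes
library(classInt)      # classIntervals() for class breaks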


# select a colour palette and 
# the number of colours you wish to display
# Could be 4, 5 or many more.
colours <- brewer.pal(4, "Blues")

# We need to set breaks.
# We can use the classIntervals() function
# in the classInt package we just loaded.
brks<-classIntervals(districts$Division1, n=4, style="quantile")
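
For a feel of what quantile breaks look like, base R's `quantile()` should give comparable values without any extra package. This is a sketch on made-up counts, not the output of `classIntervals()` itself.

```r
# Made-up Division 1 counts, including one missing value.
x <- c(2, 4, 6, 8, 10, 12, 14, 16, 18, NA)

# Four classes need five break points: the 0%, 25%, 50%, 75% and 100% quantiles.
brks_base <- quantile(x, probs = seq(0, 1, length.out = 5), na.rm = TRUE)
```

With quantile breaks each class holds roughly the same number of districts, which keeps a few very large counts from pushing every other district into the lightest colour.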

# With the plot function, let's plot the distribution
# of the data and view the colours assigned to each class.
plot(brks, pal=colours)

# Extract the break values from the brks object above.
brks <- brks$brks

# Finally, we have a map to show.
plot(districts,
     col = colours[findInterval(districts$Division1, brks, all.inside = TRUE)],
     axes = FALSE)
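
The colour lookup inside `plot()` can be tried on its own. The sketch below uses made-up values and breaks and four hard-coded shades of blue (an assumption, so that it runs without extra packages).

```r
# Made-up district counts and break points (5 breaks define 4 classes).
values  <- c(3, 12, 25, 40, 40)
breaks4 <- c(3, 10, 20, 30, 40)

# Four shades of blue, light to dark, hard-coded for this sketch.
shades4 <- c("#EFF3FF", "#BDD7E7", "#6BAED6", "#2171B5")

# Class number for each value; all.inside = TRUE keeps the maximum
# in the last class instead of opening a fifth one.
class_id <- findInterval(values, breaks4, all.inside = TRUE)

# One colour per district, in the same order as the values.
district_cols <- shades4[class_id]
```

Without `all.inside = TRUE`, a value equal to the top break would get class 5 and index past the end of the palette, leaving that district uncoloured.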

# Go ahead and add a title, legend, scale bar etc.

# Save file locally. 

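# Part 2:
# Load GISTools, which provides choropleth(), auto.shading(),
# choro.legend() and north.arrow().
library(GISTools)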


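# Clear the missing values issue.
# Subset the whole object rather than only @data; dropping rows from @data alone
# leaves more polygons than attribute rows and breaks the object.
districts <- districts[complete.cases(districts@data), ]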

# Use the choropleth() function to show how the districts performed.
choropleth(districts, districts$Division1)

# The map looks fine, but let's make it better with a few extra commands.

# Set colour and number of classes
shades <- auto.shading(districts$Division1, n = 9, cols = brewer.pal(9, "Blues"))

# Draw the map
choropleth(districts, districts$Division1, shades)

# Add a legend
choro.legend(26.64793, 1.674763, shades, fmt = "%g", title = "Count of Division 1", cex = 1.0)

# Add a title to the map
title("Count of Division 1's in PLE, 2013")

# Add a north arrow
north.arrow(27.92452, 3.30194, 10)

# Further reading. 
# Working with GoogleMaps and OpenStreetMap.
library(ggmap)

# Further reading: check out these solutions by Rodriguez

# Goal is to have a visual map below of Uganda districts and an overlay of data from Division1s.


