Talk: 25.04.2014
Location: Mountbatten Offices, Kampala - Uganda.
Topic: Making simple Maps With R (PLE data example)
twitter: #oddkampala #ddj @datadotug
github: https://github.com/ngamita/spatialR/blob/master/uneb.R
Displaying spatial data is must have activity/skill that most journalists should have. There are so many tools out there that can make maps but working with robust large datasets will only be supported by a few tools like R etc One of the best ways of visualizing spatial data is through a map and one has to play around with the right colours to match the right context. Finally a map is not complete without legends, title, scale bar and north arrow. In this simple step by step developing maps tutorial in R i will walk you through these steps to achieve the merging shapefiles with Primary Leaving Exams data and find out what districts / regions of Uganda are performing much better than the others.
# Visualization of how students performed across
# different districts in Uganda Primary Leaving
# Examinations 2013/2014.
# Plotting the Primary leaving Exams data on a color-coded map,
# in less than 100 lines of R code.
# Author: Richard Ngamita 'ngamita@gmail.com'
# Disclaimer: These methods here may not be the best solutions,
# but seemed the easiest for getting started with spatial data
# in R. For any feedback: ngamita@gmail.com
So let’s get started with loading the libraries to read our shapefiles data.
# Load the rgdal library.
# If you don't have it
# run this command: install.packages(''rgdal)
library(rgdal)
# Set working directory
# Check if there, else create one.
if(!file.exists('data')){
dir.create('data')
}
# Set wd to data
setwd('./data')
# Download Uganda district shape files into districts.zip from the web
# Simple google search of data.ug district shapefiles will pull this up.
download.file("http://maps.data.ug/geoserver/wfs?format_options=charset%3AUTF-8&typename=geonode%3Adistricts_2013_112_web_wgs84&outputFormat=SHAPE-ZIP&version=1.0.0&service=WFS&request=GetFeature", "districts.zip")
# Unzip file the shape file.
unzip("districts.zip")
# Read in shape files.
# ?readOGR() to find out more.
districts <- readOGR(".", "districts_2013_112_web_wgs84")
# Check for loaded data quality.
# Plot districts to ch
#plot(districts)
#slotNames(districts)
#class(districts)
#head(districts@data)
# Note the @ sign includes the "SpatialPolygonsDataFrame" slot details.
## Pull in Primary Leaving Exams results
# CSV data file. A little of data.ug PLE
# data files will pop up the link.
# download the uneb PLE data from data.ug website.
# Use the wget method of **nix or Mac machines should
# use the curl. Windows, don't think need any methods.
# download the file to local directory
download.file('http://catalog.data.ug/dataset/a4a1ef8b-afa4-4b8f-b9f3-d4ef9b783eee/resource/1ba956e0-e8b5-42f8-9655-c464715ec065/download/ple.csv', destfile='ple.csv', method='wget')
# Read CSV uneb data, sep as csv and include the header/column names.
# as.is to keep the data file types.
ple <- read.csv('ple.csv', sep=',', header=TRUE, as.is = TRUE)
# Check if loaded well
#head(ple)
#str(ple) # Make sure data types are right.
# Clean the last column, useless to us.
ple <- ple[,1:4]
# Convert Division1 to numeric .
# SuppressWarrnings as NA or missings values are present.
ple$Division1 <- suppressWarnings(as.numeric(ple$Division1))
# RUN: install.packages('plyr')
# incase you don't have it installed.
library(plyr)
# We want, to dice and aggregate counts.
# Get all sum or totals of division1s per district
ple_division1 <- aggregate(Division1 ~ District, data = ple, sum)
# Creates a dataframe with division1 totals per district, check if loaded well.
# head(ple_division1)
# Use the match() function to append these two different dataframes into the one SpatialPolygonsDataFrame.
districts@data <- data.frame(districts@data,
ple_division1[match(districts@data[, "DNAME_2011"],
ple_division1[, "District"]), ])
# Remove the repeated columns, specifically "District".
districts@data$District <- NULL
# Re-name the colname to make sense.
colnames(districts@data)[1] <- 'Districts_2013'
# Now the shape file or SpatialPolygonsDataFrame contains our added field called ‘Division1’
# which contains the count of the number of first grades
# We can use this to create a choropleth map with:
# First remove incomplete rows/NA values, disctricts without results.
districts@data <- na.omit(districts)
# Using Basic plot() function
# Load the mapping and color packages.
# RUN: install.packages('package name')
# If you dont have it installed.
library(maptools)
library(RColorBrewer)
library(classInt)
# select a colour palette and
# the number of colours you wish to display
# Could be 4, 5 or many more.
colours <- brewer.pal(4, "Blues")
# we need to set breaks
# can use the classIntervals function
# in the classInt package e just loaded.
brks<-classIntervals(districts$Division1, n=4, style="quantile")
# With plot function, lets plot the distribution
# of the data and view the colours assigned respectively.
plot(brks, pal=colours)
# extract brks values from the brks object above.
brks<- brks$brks
# Finally, i got a map to show you.
plot(districts, col=colours[findInterval(districts$Division1, brks,
all.inside=TRUE)], axes=F)
# Go ahead and add title, legent, scale etc.
# Save file locally.
png(filename="your/file/location/name.png")
plot(out_put)
dev.off()
#Part 2:
library(GISTools)
library(RColorBrewer)
# Clear the missing values issue.
districts@data <- districts@data[complete.cases(districts@data), ] # getting a bug.
# Use choropleth function show performing districts.
choropleth(districts, districts$Division1)
# map looks fine, but lets make it better with a few extra commands.
# Set colour and number of classes
shades <- auto.shading(districts$Division1, n = 9, cols = brewer.pal(9, "Blues"))
# Draw the map
choropleth(districts, districts$Division1, shades)
# Add a legend
choro.legend(26.64793, 1.674763, shades, fmt = "%g", title = "Count of Division 1", cex = 1.0)
# Add a title to the map
title("Count of Division 1's in PLE, 2013")
# add Notth arrow
north.arrow(27.92452, 3.30194, 10)
# Further reading.
# Working with GoogleMaps and OpenStreetMap.
library (ggmap)
# Further reading: check out these solutions by Rodriguez
# https://sites.google.com/site/rodriguezsanchezf/news/usingrasagis
# Goal is to have a visual map below of Uganda districts and an overlay of data from Division1s.