We often need to plot geospatial data in analytics. Information such as sales per region or income distribution makes more sense when it is plotted on a map. We can do this quite easily in
R. Let us see it in action.
First of all, we need data about the map. There are many libraries from which we can download this data
for our personal use. Here we will use data from http://gadm.org/, which offers data at different levels of detail for most countries. Let us use the data for India. To load the downloaded data into R, first open R in R-Commander and change your working directory to where you saved the file,
and then read the data into a variable with
ind1 <- readRDS("IND_adm1.rds")
Let us check what kind of data has been loaded with
names(ind1)
It will show
"OBJECTID" "ID_0" "ISO" "NAME_0" "ID_1" "NAME_1" "HASC_1" "CCN_1" "CCA_1" "TYPE_1" "ENGTYPE_1" "NL_NAME_1" "VARNAME_1"
We can also check the properties of the data by using:
This will show various properties of the loaded data. Right now we are interested in knowing the ID of each state so that we can use it to color the map with our data. We can use print(ind1) to view the complete data, but in this case the output will be huge. In this example we will use the state ID “HASC_1” to plot our data. We can see the values with:
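As a rough sketch of these inspection steps, the snippet below uses a small stand-in data frame in place of ind1, since the real object needs the downloaded GADM file. The three rows are illustrative only. summary() and str() are standard R functions for inspecting an object, and the $ operator pulls out a single column; the same $ access works on ind1 once the sp package is loaded.

```r
# Stand-in for the attribute table of ind1 (three illustrative rows only)
states <- data.frame(
  NAME_1 = c("Karnataka", "Kerala", "Tamil Nadu"),
  HASC_1 = c("IN.KA", "IN.KL", "IN.TN"),
  stringsAsFactors = FALSE
)

str(states)            # column names and types
summary(states)        # per-column summary

ids <- states$HASC_1   # the state IDs used later to attach our data
print(ids)
```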
Right now we do not have any data, so we create an Excel sheet, fill it with the state IDs and some sales data, and assign a color value based on the sales amount. In reality we would probably get this data from a database. We save the sheet in CSV format and read it into R by:
pdata = read.csv("filename")
We then confirm that the data has been read correctly.
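To make the spreadsheet step concrete, here is a minimal sketch that builds the same kind of table directly in R, derives a color from the sales amount, saves it as CSV, and reads it back. The column names (HASC_1, sales, color), the sales figures, the break points and the three colors are all made up for illustration, and a temporary file stands in for the real CSV.

```r
# Hypothetical sales data keyed by state ID (all values are made up)
sales_df <- data.frame(
  HASC_1 = c("IN.KA", "IN.KL", "IN.TN"),
  sales  = c(120, 45, 300),
  stringsAsFactors = FALSE
)

# Bin the sales amounts into three bands and give each band a color
bands <- cut(sales_df$sales, breaks = c(0, 100, 200, Inf),
             labels = c("low", "medium", "high"))
band_colors <- c(low = "lightyellow", medium = "orange", high = "red")
sales_df$color <- unname(band_colors[as.character(bands)])

# Save as CSV and read it back, as described in the text
csv_file <- tempfile(fileext = ".csv")
write.csv(sales_df, csv_file, row.names = FALSE)
pdata <- read.csv(csv_file, stringsAsFactors = FALSE)

# Confirm that the data has been read correctly
head(pdata)
str(pdata)
```

Here head() shows the first rows and str() shows the column types, which is usually enough to spot a wrong separator or a column that was read as the wrong type.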
We add a new column, color.data, to ind1, filled with the color values taken from the CSV file.
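A minimal sketch of this step, again on stand-in data frames rather than the real ind1 and CSV: match() looks up each state's ID in the CSV table, so the colors land on the right rows even when the two tables are in a different order. The column name color in pdata is an assumption about how the CSV was laid out.

```r
# Stand-in for ind1's attribute table (illustrative values)
ind1_tbl <- data.frame(
  NAME_1 = c("Karnataka", "Kerala", "Tamil Nadu"),
  HASC_1 = c("IN.KA", "IN.KL", "IN.TN"),
  stringsAsFactors = FALSE
)

# Stand-in for the CSV data, deliberately in a different row order
pdata <- data.frame(
  HASC_1 = c("IN.TN", "IN.KA", "IN.KL"),
  color  = c("red", "orange", "lightyellow"),
  stringsAsFactors = FALSE
)

# For each state in ind1_tbl, find its row in pdata and take that color
idx <- match(ind1_tbl$HASC_1, pdata$HASC_1)
ind1_tbl$color.data <- pdata$color[idx]

print(ind1_tbl)
```

With the real objects, the same assignment would read ind1$color.data <- pdata$color[match(ind1$HASC_1, pdata$HASC_1)]. Any state missing from the CSV gets NA as its color.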
Now we plot the map with these colors. The spplot function comes from the sp package, so we load that first. The commands are
library(sp)
spplot(ind1, "NAME_1", col.regions = ind1$color.data, colorkey = TRUE, main = "Indian States")
We have the result here
The same concept can be extended to the district level for more granular analysis. As you can see, the map here differs from the official version published by the Govt. of India. The data we use comes from a map data library, and each such library follows its own logic in depicting a map, depending on where it is hosted.
In our next blog, we shall see how to add or delete portions of area in the map to make a map of our choice.