Google Maps Overlays with TileMill QGIS and R

Check it out: http://maps.bhoung.com/popden/index.html

I had been meaning to replicate the cool maps that Ben Boyer made when he was at the Chicago Tribune (population density map of Chicago: http://media.apps.chicagotribune.com/census-2010/census-demo/index.html) .

I'm glad to be able to say that I successfully completed his tutorial a few weeks ago, with Australian data. I finally found a few minutes to upload the modified code onto Amazon's S3 Buckets service. The main obstacle to following the tutorial for anyone interested was trying to generate the images using Mac OSX 10.8.2. As most of the pieces of software are constantly being updated, it was easiest to fire up Ubuntu and run invar, an open source python script to generate the images, rather than trying to install it on the Mac....

No surprises as to what population density looks like in Australia, but it's pretty nonetheless. Extensions would be look at other kinds of data, perhaps at the postcode level. Also, implementing the layering with dots as per some of the Chicago Tribune's other maps would be great too.

Update: So the time has come for me to replicate these maps with new data. I'll document as I go this time round.

1. Merging spatial data with shapefiles can be done more easily than using postgres with either R (http://gis.stackexchange.com/questions/6839/best-way-to-add-attribute-data-to-a-shape-file) or the mmqgis plugin in QGIS.

2.Carto to conver tilemill mml  file to an xml file

3. install pip and then invar

https://github.com/mapnik/mapnik/wiki/UbuntuInstallatio

  • sudo apt-get install python-mapnik mapnik-utils libmapnik0.7
  • sudo apt-get install python-pip
  • sudo apt-get install python-greenlet
  • sudo pip install invar
  • ivtile -h
  • ivtile project.xml . -9.3 96.9 -43 159 10 16 -p 2

ggmap

ggmap is a fantastic package on R. I've previously blogged about rgooglemaps but ggmap supercedes (and appears to even build on it - looks to be the case when you install it). It's the equivalent of ggplot for maps and it is very easy to use. In this example I overlay some data points on a map. You may have seen David Kahle's earlier work using ggplot with Houston's crime data here.

 

Choropleth Map Time Lapse with R statistics, ggplot and ffmpegx

Melbourne Unemployment 2008 to 2011 - Revisited I have previously created a choropleth map showing unemployment rates for Victoria. After seeing Drew Conway's animated clock via R-bloggers, I thought I'd give it a go with my map on unemployment rates. Not surprisingly, unemployment rates remain low in the inner-east of Melbourne, while the north and west, and outer south east tend to have higher rates.

rgooglemaps

I recommend you go have a look at my post on ggmap. It's the best mapping package on R for showing points on a map.

Simon Jackman's maps on Australia's 2010 federal election was one of the reasons I wanted to learn R. His paper "The Spatial Concentration of the Green Vote" has a nice explanation of how he created them. Percentage voting Labour on a 2 party preferred basis, is shown for metropolitan Sydney below. You can generate different variations on his website.

 

Unfortunately since I'm still a novice, I've had to start at the beginning so I've been going through Markus Loecher's rgooglemaps tutorial. Here's where I'm at, plotting 10 schools in Melbourne:

Early days, but at least it feels like a start since I lost a few hours as I had longtitude and latitude around the wrong way. The other gotcha was that the variable names in mymarkers had to be "lat" and "lon" (I was using "lng").

setwd("C:/schools/")
 
require(foreign)
require(RgoogleMaps)
 
schools <- read.dta("C:/schools/schools.dta")
mymarkers <- schools[1:20, c("lat","lng")]
names(mymarkers) <- c("lat","lon")
 
mymarkers$size <- "tiny"
mymarkers$col <- "red"
mymarkers$char <- ""      
 
 
#get the bounding box:
bb <- qbbox(lat = mymarkers[,"lat"], lon = mymarkers[,"lon"])
 
#download the map:
MyMap <- GetMap.bbox(bb$lonR, bb$latR, destfile = "schools.png", GRAYSCALE =T, markers = mymarkers);
 
#determine the max zoom, so that all points fit on the plot
zoom <- min(MaxZoom(latrange=bb$latR,lonrange=bb$lonR));
 
#plot:
png("OverlayTest.png",640,640);
plot.new()
tmp <- PlotOnStaticMap(MyMap,lat = mymarkers[,"lat"], lon = mymarkers[,"lon"], cex=1.5,pch=20,col=c('blue', 'green', 'red'), add=F);
dev.off()

While looking for ways to overlay points on a map, I discovered that David Kahle has created some really cool maps with ggplot and google maps [1]. It looks like he is working on an R package to generalise those maps, which is awesome [2].

 
[1] https://github.com/hadley/ggplot2/wiki/Crime-in-Downtown-Houston,-Texas-:-Combining-ggplot2-and-Google-Maps
[2] https://github.com/dkahle

What do you know? Grockit/Kaggle

I entered a contest on Kaggle which asks us to predict whether a student will get a certain question correct. I'm pretty sure this was a blind alley but I looked at how time taken to answer a question was related to getting a question correct. The data comes from Grockit and are from exam preparation tests. These graphs are produced from the full training set of ~500mb data. Students who answer very quickly, less than 10 seconds, are more likely to get the question wrong. Students that take longer are also more likely to get questions wrong.

 Histograms of time taken to answer a question by Track


Density of time taken by whether question answered correctly


Densities of time taken by track and by answered correctly.  



Yes, I recoded tracks 2  to track 0 and track 3 to track 1 since their numbers were so low.

svn on subdomain and on https

After setting up svn on apache I go about putting svn on a subdomain and also on https (port 443). This is handy because sometimes checking on http (port 80) doesn't work due to firewalls or proxy servers...?


ServerName svn.domain.com


# Uncomment this to enable the repository
DAV svn

# Set this to the path to your repository
SVNPath /home/brendan/svn

AuthType Basic
AuthName "Subversion Repository"
AuthUserFile /etc/apache2/dav_svn.passwd

Require valid-user

NameVirtualHost *:443

ServerName https://svn.domain.com

DAV svn
SVNPath /home/brendan/svn
AuthType Basic
AuthName "Subversion Repos"
AuthUserFile /etc/apache2/dav_svn.passwd
Require valid-user
SSLRequireSSL

SSLEngine on
SSLCertificateFile /etc/apache2/ssl/apache.pem
SSLProtocol all
SSLCipherSuite HIGH:MEDIUM


References
http://www.howtogeek.com/howto/ubuntu/install-subversion-with-web-access-on-ubuntu/

http://www.tc.umn.edu/~brams006/selfsign_ubuntu.html

http://www.danielbyrne.net/2010/04/setting-up-svn-on-a-subdomain/

svn post-commit svn notify & svn update to another server

So I've set up SVN to send an email to me every time someone commits to the repository, and in addition, put instructions for the post-commit hook to update the development server (and/or production server).

file: post-commit

notes:


  • firstname.lastname@gmail.com is the recipient. For multiple recipients, the format is "first" "second" "third" email addresses.

  • I am using a gmail account to send the notification messages after each commit; PASSWORD needs to be replaced by actual password. Replace my gmail account websmashhits@gmail.com with your gmail account. Clearly, you could use any email account, not just gmail.

  • project is the name of the checked out repository


#!/bin/sh

# svn update the production server
/usr/bin/svn update /home/brendan/project

# send emails to everyone about the last commit

REPOS="$1"
REV="$2"

echo $REPOS
echo $REV

for email in "firstname.lastname@gmail.com"
do
/usr/local/bin/svnnotify
--repos-path "$REPOS"
--revision "$REV"
--subject-cx
--with-diff
--handler HTML::ColorDiff
--smtp smtp.gmail.com
--smtp-user websmashhits@gmail.com
--smtp-pass PASSWORD
--to $email
--from websmashhits@gmail.com
--subject-prefix [ATLANTAS]
done


#update the development server
/home/brendan/svn/hooks/post-commit-svn

file post-commit-svn

notes:


  • Once again PASSWORD needs to be replaced by actual password. This it is the ssh password for the development server.

  • SERVER needs to be replaced by an ip or domain name (address of server with ssh)


#!/usr/bin/expect -f

# AUTHOR: Meitar Moscovitz
# DATE : Thu Jun 21 16:32:42 EDT 2007
#

set PASS PASSWORD

# the path to the svn executable on the remote web server
set SVNBIN /usr/bin/svn

# our network is slow, set a long timeout
set timeout 30

##### DO NOT EDIT PAST THIS LINE! #####

# POST-COMMIT HOOK
#
# The post-commit hook is invoked after a commit. Subversion runs
# this hook by invoking a program (script, executable, binary, etc.)
# named 'post-commit' (this file) with the
# following ordered arguments:
#
# [1] REPOS-PATH (the path to this repository)
# [2] REV (the number of the revision just committed)
#
# Note that Subversion does not provide this program with an environment
# of any kind. That means this program lacks a current working directory,
# a home directory, a $PATH, and so on.

set REPOS [lindex $argv 0]
set REV [lindex $argv 1]

# Define error codes
set E_NO_SSH 1 ;# can't find a usable SSH on our system
set E_NO_CONNECT 2 ;# failure to connect to remote server (timed out)
set E_WRONG_PASS 3 ;# password provided does not work
set E_UNKNOWN 25 ;# unexpected failure

# find the SSH binary on our system
set SSHBIN /usr/bin/ssh

spawn $SSHBIN brendan@SERVER -t "$SVNBIN update /home/brendan/project --password $PASS"

expect {
"continue connecting (yes/no)? " { send "yesr"; exp_continue; }
-nocase "password:" { send "$PASSr"; }
timeout {
send_error "nWe have timed out after $timeout seconds while trying to connect to $HOST!n";
exit $E_NO_CONNECT;
}
}

expect {
-nocase "password:" { ;# if we are asked for the password again, then we have provided the wrong password
send_error "nCan not log in to $HOST because the password provided for user $USER has been rejected.n";
exit $E_WRONG_PASS;
}
-re "revision ([0-9]+)." {
if {$REV == $expect_out(1,string)} {
send_user "nSuccessfully updated $WC on $HOST to revision $REV.n"
} else {
send_user "nUpdated repository to revision $expect_out(1,string), but svn reports that we are at revision number $REV.n"
send_error "CAUTION: Repository updated to revision $expect_out(1,string), but committed revision $REV.n"
}
}
default {
send_error "An unexpected error has occured. The process at spawn ID $spawn_id has produced the following output:n"
send_error $expect_out(buffer)
exit $E_UNKNOWN
}
}


References
http://josephscott.org/archives/2010/01/setting-up-svnsync-and-svn-notify/

http://www.svnforum.org/threads/36093-SVN-Hook-to-update-REMOTE-staging-server

Install Python Django on Apache Ubuntu 10.04

I can tell you that moving from LAMP to Python Django has been a terrific headache. Here is the guide that I wished would have existed before I started out.

1.install relevant packages

sudo apt-get install apache2 subversion
sudo apt-get install mysql-server python-mysqldb
sudo apt-get install libapache2-mod-wsgi

2. download django

sudo apt-get install subversion

svn co http://code.djangoproject.com/svn/django/trunk/ ~/django-trunk/


You can really checkout the code to any location, not just "~/django-trunk/"

3. tell python where django is.

python -c "from distutils.sysconfig import get_python_lib; print get_python_lib()"
ln -s /home/brendan/django-trunk/django /DIR-FROM-PREV-LINE/django
ln -s /home/brendan/django-trunk/django/bin/django-admin.py /usr/local/bin

sudo python setup.py install


4. for a production site: edit /etc/apache2/sites-available/mysite.com (don't forget to run a2ensite mysite.com).
for localhost: edit /etc/apache2/sites-available/default.
'mysite' is directory for the contents of your new website, to include the code below.

Change the ip address if appropriate. We configure apache to serve static content. This is not recommended by Django but it will do for now.

<VirtualHost 127.0.0.1:80>

ServerAdmin websmashhits@gmail.com

# uncomment next 3 lines for production
# ServerName mysite.com
# ServerAlias www.mysite.com

# WSGIProcessGroup mysite.com
WSGIDaemonProcess mysite.com user=mysite group=mysite threads=10 python-path=/usr/lib/python2.6/site-packages
WSGIScriptAlias / /home/brendan/mysite/mysite.wsgi

Alias /static/ /home/brendan/mysite/static/

/* this is to allow apache to serve static content */  
<Directory "/home/brendan/mysite/static/">
Order deny,allow
Allow from all
</Directory>

<Directory /home/brendan/mysite/>
Order deny,allow
Allow from all
</Directory>

ErrorLog /var/log/apache2/error.log

LogLevel warn
CustomLog /var/log/apache2/access.log combined
</VirtualHost>


5.
chown -R mysite /home/brendan/mysite

6. create mysite.wsgi referred to in the previous line with contents (once again, mysite is the name of your new website):

#!/usr/local/bin/python
import os, sys
# using os we can determine the location of this mysite.wsgi file in ROOT
ROOT = os.path.dirname(os.path.realpath(__file__))
sys.path.append(ROOT)
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings' # this is your settings.py file
os.environ['PYTHON_EGG_CACHE'] = '/tmp'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()

7. Edit the details in settings.py within /home/brendan/mysite/, and you're done.

ADMINS = (    # ('Brendan', 'websmashhits@gmail.com'),)
DATABASE_ENGINE =   'django.db.backends.mysql'
DATABASE_NAME =     'mysite_db'
DATABASE_USER =     'user'
DATABASE_PASSWORD = 'password'
TIME_ZONE = 'Australia/Melbourne'
MEDIA_ROOT = '/home/brendan/mysite/static/'
MEDIA_URL = 'http://mysite.com/static/'
ADMIN_MEDIA_PREFIX = '/static/admin/'
TEMPLATE_LOADERS = (    'django.template.loaders.filesystem.load_template_source',
'django.template.loaders.app_directories.load_template_source',
#'django.template.loaders.eggs.load_template_source',)
MIDDLEWARE_CLASSES = (    'django.middleware.common.CommonMiddleware',
'django.contrib.sessions.middleware.SessionMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware',)
ROOT_URLCONF = 'mysite.urls'
TEMPLATE_DIRS = (    "/home/brendan/mysite/templates/")
INSTALLED_APPS = (    'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.sites',
'polls',
'django.contrib.admin',)

References

Django tutorial (part 1)

http://docs.djangoproject.com/en/dev/topics/install/?from=olddocs

http://docs.djangoproject.com/en/dev/howto/deployment/modwsgi/

http://www.djangofoo.com/17/django-mod_wsgi-deploy-exampl

https://bitbucket.org/ubernostrum/django-registration/src

http://www.stonemind.net/blog/2007/04/13/django-registration-for-newbies/

http://devdoodles.wordpress.com/2009/02/16/user-authentication-with-django-registration/

Convert Excel Tables to Word

Last year I wrote a macro in Excel which automates the process of copying tables from excel format (xml) to word documents.   The macro checks each worksheet and copies each table across to a word document with page breaks. It uses the first cell "A1" in each worksheet to name the table in the word document.