Getting going quickly with Python, MongoDB, and Spatial data on OpenShift: Part II



This post originally appeared on the OpenShift blog

As a follow-up to my last post about getting spatial going in MongoDB on OpenShift, today we are going to put a web service in front of it using Python. There are several goals for this article:

  • Learn a little bit about Flask - a Python web framework
  • Learn about how to connect to MongoDB from Python
  • Create a REST Style web service to use in our SoLoMo application

I hope by the end you can see how using a Platform as a Service can get you going with Python, MongoDB, and Spatial faster than you can say…“Awesome Sauce”. We have a lot of ground to cover so let’s dig right in.

Creating the Python application

Here is the OpenShift command line to create the Python app:

rhc app create -t python-2.6 -a pythonws 

Using the flask quickstart from GitHub

We have already put together a Flask quickstart in the OpenShift GitHub space. To get the framework into your application all you have to do is (from the

cd pythonws
git remote add upstream -m master git://
git pull -s recursive -X theirs upstream master

There, we now have a Flask app whose source code we can modify.

If you just want to check out the source code I used in the app, you can see it on GitHub and follow the instructions to clone it into your OpenShift account.

Adding MongoDB and importing data

Time to add MongoDB to our application:

 rhc app cartridge add -a pythonws -t mongodb-2.0 

The previous post in this series covers how to import the data from a JSON file of the national parks into your MongoDB database and prepare it for spatial queries. Please follow those instructions to import the data into the pythonws DB into a collection called parkpoints.

Quick digression to explain Flask

Before we get into our specific application I am going to take a moment to explain the Python framework for this demo. Flask basically allows you to map URL patterns to functions (it also does a lot more, like templating, but this is the only part we are using today). For example, in the file that is now in your project you can find the lines:

@app.route('/')
def index():
    return 'Hello World!'

This says that when a request comes in for the base URL, the function named index gets executed. In this case the function just returns the string "Hello World!", and returning has the effect of sending the string to the requestor.

@app.route('/name/')
@app.route('/name/<name>')
def nameindex(name='Stranger'):
    return 'Hello, %s!' % name

We can also grab pieces of the requested URL and pass them into the function. Enclosing a part of the URL pattern in < > indicates that we want to access it within our function. Here, if the path of the requested URL is /name/steve, then the response will be:

Hello, steve!

If the path is just /name/, the default value is used and the response will be:

Hello, Stranger!

We are going to define URL mappings for some basic REST-like functionality to interact with our spatial MongoDB data store.

Modify the source code

The first function we are going to write simply returns all the records in the database. In a more full-featured app you would probably want to add pagination and other features to this query, but we won't be doing that today.

# imports used by the functions in this article
import os
import re
import json
import pymongo
from bson import json_util, objectid
from flask import request

@app.route("/ws/parks")
def parks():
    # set up the connection
    conn = pymongo.Connection(os.environ['OPENSHIFT_NOSQL_DB_URL'])
    db = conn.parks

    # query the DB for all the parkpoints
    result = db.parkpoints.find()

    # now turn the results into valid JSON
    return str(json.dumps({'results': list(result)}, default=json_util.default))

I chose to put the web services under the URL /ws/parks so that we could use other parts of the URL namespace for other functionality. You can now go to your application URL followed by /ws/parks and you should be able to see all the documents in the DB.
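The pagination mentioned earlier is not part of this service. As a rough sketch, assuming hypothetical page and per_page query parameters (neither appears in the code above), a small helper could turn them into the values that a PyMongo cursor accepts through its skip() and limit() methods:

```python
def page_to_skip_limit(page, per_page, max_per_page=100):
    """Convert 1-based paging parameters into (skip, limit) values.

    The parameter names and the max_per_page cap are assumptions made
    for this sketch; they are not part of the original service.
    """
    page = max(int(page), 1)                             # pages start at 1
    per_page = min(max(int(per_page), 1), max_per_page)  # clamp the page size
    return (page - 1) * per_page, per_page


# Inside the view this might be used roughly as:
#   skip, limit = page_to_skip_limit(request.args.get('page', 1),
#                                    request.args.get('per_page', 20))
#   result = db.parkpoints.find().skip(skip).limit(limit)
print(page_to_skip_limit(3, 20))
```

Capping the page size keeps a single request from pulling the whole collection back in one response.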

Using MongoDB in Python

In the code above we simply make a connection to the MongoDB instance for this application and then execute a query. The pymongo package provides all the functionality to interact with the MongoDB instance from our Python code. The pymongo commands are very similar to the MongoDB command line interaction, except that two-word commands like db.collection.findOne are split with a _, such as db.collection.find_one. Please go to the pymongo site to read the documentation.

Notice we use the environment variables to specify the connection URL. While not hard coding database connection parameters is good practice in non-cloud apps, in our case you MUST use the environment variables. Since your app can be idled and then spun up or it could be autoscaled, the IP and ports are not always guaranteed. By using the environment variables we make our code portable.
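One small convenience when relying on the environment variable is to fail with a readable message when it is missing, for example when running the code on a laptop instead of on OpenShift. The helper below is only a sketch; the function name mongo_url and the sample URL in the usage line are made up for illustration.

```python
import os


def mongo_url(environ=os.environ, name='OPENSHIFT_NOSQL_DB_URL'):
    """Return the MongoDB connection URL from the environment.

    Raises a RuntimeError with a readable message instead of a bare
    KeyError when the variable is not set.
    """
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError('%s is not set; is this running on OpenShift?' % name)


# Usage with a made-up value standing in for the real environment:
fake_env = {'OPENSHIFT_NOSQL_DB_URL': 'mongodb://user:secret@127.0.0.1:27017/'}
print(mongo_url(fake_env))
```

The returned string is what would be handed to pymongo.Connection in the functions above.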

We turn the result set (a cursor, which list() converts into a list of Python dictionaries) into JSON with json.dumps so we can return it straight to the client. Since the documents contain BSON types such as ObjectId that the json module cannot serialize by itself, we need to pass json_util.default from the bson library as the default argument to json.dumps.
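To see what the default argument of json.dumps does, here is a sketch that runs without PyMongo installed. FakeObjectId and to_jsonable are stand-ins invented for this example, and the sample document is made up; in the real service json_util.default plays the role of to_jsonable and handles the actual BSON types.

```python
import json
from datetime import datetime


class FakeObjectId(object):
    """Stand-in for bson's ObjectId so the sketch runs without PyMongo."""
    def __init__(self, hex_string):
        self.hex_string = hex_string


def to_jsonable(value):
    """Called by json.dumps for any value it cannot serialize itself."""
    if isinstance(value, FakeObjectId):
        return {'$oid': value.hex_string}
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError('%r is not JSON serializable' % (value,))


# A made-up document shaped like one from the parkpoints collection.
doc = {'_id': FakeObjectId('4f5e3d2c1b0a998877665544'), 'Name': 'Example Park'}
as_json = json.dumps({'results': [doc]}, default=to_jsonable, sort_keys=True)
print(as_json)
```

Without the default hook, json.dumps raises a TypeError as soon as it reaches the _id value.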

This is probably the easiest experience I have ever had writing a web service. I love Flask, PyMongo, and Python for the simplicity of “Just Getting Stuff Done”.

Grab just one park

Next we will implement the code to get back a park given the park's unique ID. For the ID we will just use the one generated by MongoDB on document insertion (_id). The ID looks like a long random sequence, and that is what we will pass in the URL.

# return a specific park given its mongo _id
@app.route("/ws/parks/park/<parkId>")
def onePark(parkId):
    # set up the connection
    conn = pymongo.Connection(os.environ['OPENSHIFT_NOSQL_DB_URL'])
    db = conn.parks

    # query based on the ObjectId
    result = db.parkpoints.find({'_id': objectid.ObjectId(parkId)})

    # turn the results into valid JSON
    return str(json.dumps({'results': list(result)}, default=json_util.default))

Here you have to use another class from the bson library: ObjectId. The _id stored in MongoDB is an object rather than a plain string, so we have to take the ID passed in on the URL and create an ObjectId from it. The ObjectId class allows us to create one of these objects to pass into the query. Other than that the code is the same as above.
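If the ID in the URL is not a valid ObjectId string, the ObjectId constructor raises an exception (bson.errors.InvalidId). One possible guard, sketched here with a hypothetical helper name, is to check the string form first: an ObjectId written as text is 24 hexadecimal characters.

```python
import re

# An ObjectId in string form is exactly 24 hexadecimal characters.
OBJECT_ID_RE = re.compile(r'\A[0-9a-fA-F]{24}\Z')


def looks_like_object_id(text):
    """Return True if text has the shape of an ObjectId string."""
    return bool(OBJECT_ID_RE.match(text))


print(looks_like_object_id('4f5e3d2c1b0a998877665544'))
print(looks_like_object_id('not-an-id'))
```

In the view, a failed check could return a 404 or 400 response instead of letting the exception turn into a server error.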

This little snippet also shows an example of grabbing part of the URL and passing it to a function. I explained this concept above but here we can see it in practice.

Time for the spatial query

Here we do a query to find national parks near a latitude/longitude pair:

# find parks near a lat and long passed in as query parameters (near?lat=45.5&lon=-82)
@app.route("/ws/parks/near")
def near():
    # set up the connection
    conn = pymongo.Connection(os.environ['OPENSHIFT_NOSQL_DB_URL'])
    db = conn.parks

    # get the request parameters
    lat = float(request.args.get('lat'))
    lon = float(request.args.get('lon'))

    # use the request parameters in the query
    result = db.parkpoints.find({"pos": {"$near": [lon, lat]}})

    # turn the results into valid JSON
    return str(json.dumps({'results': list(result)}, default=json_util.default))

This piece of code shows how to get request parameters from the URL. We capture the lat and lon from the request url and then cast them to floats to use in our query. Remember, everything in a URL comes across as a string so it needs to be converted before being used in the query. In a production app you would need to make sure that you were actually passed strings that could be parsed as floating point numbers. But since this app is just for demo purposes I am not going to show that here.
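For reference, the kind of parameter checking a production app would need might look something like the sketch below. The helper names and the range limits are my assumptions; args can be request.args or any mapping with a get method.

```python
def parse_coordinate(raw, low, high):
    """Return raw as a float within [low, high], or None if it is invalid."""
    try:
        value = float(raw)
    except (TypeError, ValueError):   # missing (None) or not a number
        return None
    if not low <= value <= high:      # out of range; this also rejects NaN
        return None
    return value


def parse_lat_lon(args):
    """Return (lat, lon) from a mapping of query parameters, or None."""
    lat = parse_coordinate(args.get('lat'), -90.0, 90.0)
    lon = parse_coordinate(args.get('lon'), -180.0, 180.0)
    if lat is None or lon is None:
        return None
    return lat, lon


print(parse_lat_lon({'lat': '45.5', 'lon': '-82'}))
print(parse_lat_lon({'lat': 'abc', 'lon': '-82'}))
```

When the helper returns None, the view could answer with a 400 status, for example return 'lat and lon must be numbers', 400.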

Once we have the coordinates, we pass them into the query just like we did from the command line MongoDB client. The results come back in distance order from the point passed into the query. Remember, the ordering of the coordinates passed into the query needs to match the ordering of the coordinates in your MongoDB collection.
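To make "distance order" and the coordinate ordering concrete, here is an in-memory illustration in plain Python. It is not how MongoDB implements $near (the index does the work inside the database); it only mimics the flat-plane ordering on a few made-up documents that store pos as [lon, lat].

```python
import math


def sort_by_distance(docs, lon, lat):
    """Order documents by flat (planar) distance from the point [lon, lat]."""
    def distance(doc):
        doc_lon, doc_lat = doc['pos']   # same [lon, lat] order as the query point
        return math.hypot(doc_lon - lon, doc_lat - lat)
    return sorted(docs, key=distance)


# Made-up sample documents, not real park data.
parks = [
    {'Name': 'A', 'pos': [-100.0, 40.0]},
    {'Name': 'B', 'pos': [-83.0, 45.0]},
    {'Name': 'C', 'pos': [-70.0, 44.0]},
]
ordered = [p['Name'] for p in sort_by_distance(parks, -82.0, 45.5)]
print(ordered)
```

If the query point were passed as [lat, lon] while the documents store [lon, lat], the distances would be computed between mismatched axes, which is why the two orderings have to agree.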

Finish it off with a Regex query with spatial goodness

The final piece of code we are going to write allows for a query based both on the name and the location of interest.

# find parks with a certain name (using regex) near a lat long pair such as above
@app.route("/ws/parks/name/near/<name>")
def nameNear(name):
    # set up the connection
    conn = pymongo.Connection(os.environ['OPENSHIFT_NOSQL_DB_URL'])
    db = conn.parks

    # get the request parameters
    lat = float(request.args.get('lat'))
    lon = float(request.args.get('lon'))

    # compile the regex we want to search for and make it case insensitive
    myregex = re.compile(name, re.I)

    # use the request parameters in the query along with the regex
    result = db.parkpoints.find({"Name": myregex, "pos": {"$near": [lon, lat]}})

    # turn the results into valid JSON
    return str(json.dumps({'results': list(result)}, default=json_util.default))

Just like the example above, we parse out the lat and lon from the URL query parameters. Looking at my architecture, I do think it might have been better to add the name as a query parameter as well, but this will still work for this article. We grab the name from the end of the URL path and then compile it into a standard Python regular expression (regex). I added the re.I flag to make the regex case-insensitive. I then use the regex to search against the Name field in the document collection and do a geo search against the pos field. Again, the results will come back in distance order from the point passed into the query.
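One thing to be aware of: compiling the name as-is means that characters such as . or ( in the URL are treated as regex syntax, and an unbalanced ( makes re.compile raise an error. If you would rather match the name as literal text, a variation (not what the code above does) is to escape it first. The helper name below is hypothetical.

```python
import re


def name_regex(name):
    """Case-insensitive pattern that matches name as literal text."""
    return re.compile(re.escape(name), re.I)


pattern = name_regex('yellow')
print(bool(pattern.search('Yellowstone National Park')))

# A lone parenthesis is now harmless instead of a regex syntax error.
print(bool(name_regex('(').search('Some Park (North Unit)')))
```

The trade-off is that callers can no longer pass real regex patterns in the URL, so which version fits depends on how the service is meant to be used.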


And with that we have wrapped up our little web service code - simple and easy using Python and MongoDB. Again, there are some further changes required for going to production, such as request parameter checking, maybe better URL patterns, exception catching, and perhaps a checkin URL - but overall this should put you well on your way. There are examples of:

  • Using Flask to write some nice REST style services in Python
  • Various methods to get URL information so you can use it in your code
  • How to interact with your MongoDB in Python using PyMongo and BSON libraries
  • Getting spatial data out of your application

Give it all a try on OpenShift and drop me a line to show me what you built. I can’t wait to see all the interesting spatial apps built by shifters.