django-stdimage Custom Fields and South Migrations

I’m putting the finishing touches on my latest Django site (a registration site for a dog rescue charity walk) and part of the functionality is the ability for users to upload photos. Since most people will simply snap a photo on their cell phones and upload it, and since cell phones these days take some pretty large pictures, I want to resize the images to a standard size of 800×600 when they’re uploaded and also create a thumbnail.

As with most things in the Python/Django world there’s a library that does exactly what I need, specifically django-stdimage, which provides a custom model field to handle resizing photos and creating thumbnails.
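For context, the relevant part of my model ends up looking roughly like this (a minimal sketch: the model and field names are taken from the error message below, while the upload path and the size/thumbnail keyword arguments are assumptions that may differ depending on your django-stdimage version):

from django.db import models
from stdimage.fields import StdImageField

class Registrant(models.Model):
    name = models.CharField(max_length=100)
    # Resize uploads to 800x600 and generate a thumbnail on save;
    # check the django-stdimage docs for the exact keyword arguments in your version
    photo = StdImageField(upload_to='photos', size=(800, 600), thumbnail_size=(100, 100))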

After installing django-stdimage via pip, updating my model class to use the new model field, and then doing a South migration, I ran into this error:

 ! Cannot freeze field 'supporter.registrant.photo'
 ! (this field has class stdimage.fields.StdImageField)

 ! South cannot introspect some fields; this is probably because they are custom
 ! fields. If they worked in 0.6 or below, this is because we have removed the
 ! models parser (it often broke things).
 ! To fix this, read http://south.aeracode.org/wiki/MyFieldsDontWork

To give a little background, South used to handle automatic introspection of custom fields, but according to the explanation of why it no longer does that in newer versions of South, “… when it broke, it broke spectacularly.” Fair enough! So the expectation on newer versions of South is that you’ll provide the necessary information to allow South to handle the custom fields.
Giving South what it needs to do migrations involving custom fields is simple enough but since this is the first time I’ve had to deal with it I thought I’d help my future self and potentially others by sharing how to handle it.

There are a couple of different scenarios with South and custom fields that you can read more about in the South docs, but in the case of django-stdimage you simply have to add the location of the custom field to South’s introspection rules.

In the models.py that contains the model in which you’re using django-stdimage, add the following somewhere at the top of the file:

from south.modelsinspector import add_introspection_rules

add_introspection_rules([], ['^stdimage.fields.StdImageField'])

Now when you run the South schema migration South will know where to find StdImageField and will be able to handle the migration.
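For reference, the commands involved are the usual South ones (the app label 'supporter' comes from the error message above; substitute your own app):

python manage.py schemamigration supporter --auto
python manage.py migrate supporter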

If you have more complex custom fields be sure and read the South docs: if your custom field adds new attributes or doesn’t extend another class for which introspection rules already exist, you’ll have to provide the introspection rules yourself as opposed to simply providing the location of the custom field class.

I’ll share the link to the app in which I’m using django-stdimage and put the code for the entire app up on GitHub soon.

Generating and Sorting on a Transient Property in a Django Model Class

I ran into an interesting little issue in a Django application today that led to what I thought was some pretty powerful stuff in Python and Django that I hadn’t had to use before, so I thought I’d share.

I’m working on an application that, to keep things generic, I’ll simply say deals with requests from users, and these requests require various levels of approval depending on the type and severity of the request.

Here’s a basic Django model class that contains the fields relevant to this example:
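(A minimal sketch using the field names referenced below; the choices tuple and field options are assumptions.)

from django.db import models

class UserRequest(models.Model):
    IMPACT_CHOICES = (
        ('minor', 'Minor'),
        ('major', 'Major'),
        ('severe', 'Severe'),
    )

    request_impact = models.CharField(max_length=10, choices=IMPACT_CHOICES)
    datetime_supervisor_approved = models.DateTimeField(null=True, blank=True)
    datetime_admin_approved = models.DateTimeField(null=True, blank=True)
    datetime_president_approved = models.DateTimeField(null=True, blank=True)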

Basic stuff so far.

Where the wrinkle comes in is there is a page in the application that lists the requests ordered by date, but the date by which each request will be displayed and sorted depends upon the impact of the request as follows:

  • minor: use datetime_supervisor_approved
  • major: use datetime_admin_approved
  • severe: use datetime_president_approved

To state the issues succinctly:

  • The date on which I want to sort varies for each request
  • I don’t want to store an additional, redundant field in the database simply to have a consistent name to use for sorting
  • Because the sort date isn’t stored in the database I can’t use order_by on a QuerySet

That last bullet was the killer. In Python it’s simple enough to add attributes to a class on the fly, so I could loop over the QuerySet and use conditional logic around the change impact to add a new calendar_date field to each instance of the UserRequest class, but that’s kind of ugly because the business logic winds up in a view function as opposed to being in the model, and still doesn’t solve the inability to use order_by in the QuerySet.

Because I want to keep the business logic in the model (fat models FTW!), I looked into using a Python property to add a transient attribute to my UserRequest class that is the result of a function call.

Basically what this means in concrete terms is I’m adding a calendar_date attribute to my model class and the value of calendar_date is set by calling a function in the class itself that contains the aforementioned conditional logic around the request impact.

Here’s the modified model class:
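(Again a sketch along the same lines; only the additions are shown, and the exact conditional logic is an assumption based on the impact-to-date mapping described above.)

class UserRequest(models.Model):
    # ... fields as above ...

    def _get_calendar_date(self):
        # Return the approval date appropriate to this request's impact
        if self.request_impact == 'minor':
            return self.datetime_supervisor_approved
        elif self.request_impact == 'major':
            return self.datetime_admin_approved
        else:
            return self.datetime_president_approved

    calendar_date = property(_get_calendar_date)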

The big addition here is the _get_calendar_date function that returns a calendar date based on the impact of the request. This is added as an attribute to the class, but it’s not saved to the database which is exactly what I wanted in this instance.

That solves one piece of the puzzle, but the other piece is sorting the requests based on this new transient property, which again can’t be done by using order_by on the QuerySet since the field on which we want to order isn’t in the database.

This is where Python’s sorted() function comes in. sorted() can take any iterable (which the Django QuerySet is) and sort it based on a provided key function, and this is where I can leverage the transient calendar_date property to sort the requests.

Putting all the pieces together in the view function, here’s how it looks:
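(One more sketch: the view name, template name, and context variable are illustrative; the sorted() call and its lambda key are the important parts.)

from django.shortcuts import render_to_response
from django.template import RequestContext

from myapp.models import UserRequest  # app name is illustrative

def request_list(request, template_name='request_list.html'):
    user_requests = UserRequest.objects.all()
    # Sort on the transient calendar_date property, then on request_impact
    sorted_requests = sorted(user_requests,
                             key=lambda r: (r.calendar_date, r.request_impact))
    return render_to_response(template_name,
                              {'user_requests': sorted_requests},
                              context_instance=RequestContext(request))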

Python’s sorted() function is pretty powerful and straight-forward — throw it an iterable and what to compare on, and it does the heavy lifting for you.

The only potentially tricky part of this is setting the key since it uses a Python lambda expression (great explanation here), which is Python’s way of kinda sorta dipping a toe into the functional programming waters by letting you define an anonymous inline function.

In this case how the lambda plays out is to sort the QuerySet based on two attributes: first the transient calendar_date property that isn’t stored in the database, and subsequently the request_impact attribute that is stored in the database. Perfect!

I’m sure none of this is rocket science to seasoned Python and Django veterans but since I hadn’t run into the need to do this before I thought I’d document what I came up with both for my own future reference and hopefully for the benefit of others who may need to do something similar.

Setting Up Django On a Raspberry Pi

This past weekend I finally got a chance to set up one of my two Raspberry Pis to use as a Django server so I thought I’d share the steps I went through both to save someone else attempting to do this some time as well as get any feedback in case there are different/better ways to do any of this.

I’m running this from my house (URL forthcoming once I get the real Django app finalized and put on the Raspberry Pi) using dyndns.org. I don’t cover that aspect of things in this post but I’m happy to write that up as well if people are interested.

General Comments and Assumptions



Reconfigure the Keyboard Mapping

If you already did this at some other point with your Raspberry Pi or your keyboard is working properly you can skip this step. After the base install of Raspbian (“wheezy” in my case) I noticed that in vim I couldn’t type characters like #, |, and @, so I had to do the following:

  1. sudo dpkg-reconfigure keyboard-configuration
  2. Follow the prompts to set things appropriate for your keyboard (US English in my case)
  3. Restart the Raspberry Pi (sudo reboot)


Install Required Server-Wide Tools and Libraries


  1. sudo apt-get install vim python-dev python-setuptools nginx supervisor
  2. sudo easy_install pip
  3. sudo pip install virtualenv virtualenvwrapper
  4. Configure virtualenvwrapper
    1. sudo vim /etc/bash.bashrc
    2. Add this line at the bottom of the file:
      source /usr/local/bin/virtualenvwrapper.sh
    3. Save the file and exit vim
    4. Log out and log back in


Create and Configure a virtualenv for Your Application

tl;dr version of the following paragraph: Run the following commands from your home directory unless you have a reason not to and know what you’re doing. For the purposes of this tutorial I’m using the default ‘pi’ user on the Raspberry Pi.

You can create a virtualenv for your application in your home directory or wherever you prefer your files to be located, just be aware that depending on where you put your files and whether or not you have to execute virtualenv-specific tasks (pip install, etc.) you may have to explicitly specify which python or pip command to use for it to work properly.
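For example, with the virtualenv not activated you can still target its tools explicitly (paths are illustrative and assume the ‘foo’ virtualenv created below lives under ~/.virtualenvs):

  ~/.virtualenvs/foo/bin/pip install django
  ~/.virtualenvs/foo/bin/python manage.py runserver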

For the purposes of this example we’ll create a ‘foo’ application. The steps below where you’re running django-admin.py startproject and firing up the development server are simply to test to make sure everything’s working properly. If you’re doing this all for the first time, I’d suggest doing this both to get familiar with it as well as to double-check that everything with the environment is working as expected.

  1. mkvirtualenv foo
    1. Note that this will activate the virtualenv. You MUST have the virtualenv activated when performing the steps that follow for everything to work properly.
    2. If you’re used to using virtualenv directly as opposed to virtualenvwrapper, note that when you create a virtualenv using  mkvirtualenv, it will put the virtualenv files (again, if you’re doing this from your home directory) in ~/.virtualenvs/foo
  2. pip install django docutils south gunicorn
  3. django-admin.py startproject foo
  4. cd foo
  5. python manage.py runserver
    1. The first time you run this you may see a ton of blank lines output on the Raspberry Pi. It’ll eventually finish, or if you hit Ctrl-C when this happens and try again a couple of times it’ll eventually calm down and the next time you run things this won’t happen. I’m not sure what’s going on here since the Raspberry Pi is the only platform on which I’ve seen this happen.
    2. Once the development server starts up you should see “0 errors found [other stuff here] Development server is running at http://127.0.0.1:8000”. If you don’t see that, double-check everything and try again.
    3. You can make a call to the development server by sshing into the Raspberry Pi from another machine and doing curl localhost:8000


Enable Gunicorn and South and Enable a Database in Your Application


  1. cd ~/foo/foo
  2. vim settings.py
  3. In the DATABASES section make the following changes:
    1. 'ENGINE': 'django.db.backends.sqlite3'
    2. 'NAME': '/home/pi/foo/foo.db'
  4. In the INSTALLED_APPS section, enable the admin and admindocs (optional):
    1. Uncomment # 'django.contrib.admin'
    2. Uncomment # 'django.contrib.admindocs'
  5. Also in the INSTALLED_APPS section, add 'south' and 'gunicorn' to the list of installed apps.
  6. Save the file and exit vim
  7. Test Gunicorn:
    1. python manage.py run_gunicorn (you can hit Ctrl-C to kill it once you see it’s working)


A note on databases: I’m not addressing the broader topic of which database server you should use simply because it’s a separate discussion that’s potentially fraught with zealous opinions and general peril, and is a bit tangential to the present discussion.

If you’re doing all this just to learn, or for your own personal low-use “screwing around” type applications, SQLite (http://www.sqlite.org/), which is what comes with Django, will work just fine, so that’s what I outlined above.

I’m writing this up as part of setting up a site for alt.* Radio (http://codebassradio.net/shows/alt/), which is a weekly show I do on the super-awesome CodeBass Radio (http://codebassradio.net), so for that I’ll be running PostgreSQL on a separate machine. For low-traffic stuff you might even be able to get away with running MySQL or Postgres directly on the Raspberry Pi but I have a couple of unused machines laying around so I figured I might as well keep the Raspberry Pi focused on the Django side of things.

Note that if you DO want to use PostgreSQL, you’ll have to do this on the Raspberry Pi:

  1. sudo apt-get install libpq-dev
  2. pip install psycopg2 (with your virtualenv active)


Configure Supervisor

Supervisor (http://supervisord.org/), which we installed earlier, is a tool that lets you easily create startup/restart scripts for services that aren’t installed and managed through apt-get. In our case we’ll need this for Gunicorn so it’ll fire up when the machine boots and automatically restart if it crashes.

  1. cd /etc/supervisor/conf.d
  2. sudo vim gunicorn.conf
  3. Put the following in the new file:
    [program:gunicorn]
    command = /home/pi/.virtualenvs/foo/bin/python /home/pi/foo/manage.py run_gunicorn -w 4
    directory = /home/pi/foo
    user = pi
    autostart = true
    autorestart = true
    stdout_logfile = /var/log/supervisor/gunicorn.log
    stderr_logfile = /var/log/supervisor/gunicorn_err.log
  4. Save the file and exit vim
  5. sudo service supervisor restart
  6. sudo supervisorctl start gunicorn
    1. Note that you may get an “already started” error here. If you get a “no such process” error, that means supervisor didn’t load the new configuration file. If that happens:
      1. sudo ps -wef | grep supervisor
      2. sudo kill -9 SUPERVISOR_PROCESS_ID (use the real process ID)
      3. sudo service supervisor start


Configure Nginx


  1. sudo rm -f /etc/nginx/sites-enabled/default
  2. sudo vim /etc/nginx/sites-available/foo
  3. Put the following in the foo file:

# upstream server for gunicorn
upstream gunicorn {
  server localhost:8000;
}

# nginx server for the host
server {
  listen 80;

  server_name foo.com www.foo.com;

  root /home/pi/foo;

  access_log /var/log/nginx/foo_access.log;
  error_log /var/log/nginx/foo_error.log;

  # try to serve a static file and if it doesn’t exist, pass to gunicorn
  try_files $uri @gunicorn;

  # rules for gunicorn
  location @gunicorn {
    proxy_pass http://gunicorn;
    proxy_redirect off;
    proxy_read_timeout 5m;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }
}

  4. Save the file and exit vim
  5. sudo ln -s /etc/nginx/sites-available/foo /etc/nginx/sites-enabled/foo
  6. sudo service nginx restart


Verify Everything’s Working

If you’re not using a real hostname/URL, make sure and add it to /etc/hosts on the Raspberry Pi so you can easily test. In this case we used foo.com as the hostname, so do the following:

  1. sudo vim /etc/hosts
  2. Add this line:
    127.0.0.1 foo.com www.foo.com
  3. Save the file and exit vim


If you want to hit the Raspberry Pi from a browser on another machine you’ll have to also add the IP and host names to /etc/hosts on any machines from which you want to hit that host, but use the IP of the Raspberry Pi instead of 127.0.0.1.
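For example, if the Raspberry Pi’s address on your network were 192.168.1.50 (purely an illustrative IP), the line in /etc/hosts on the other machine would be:

  192.168.1.50 foo.com www.foo.com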

With that modification in /etc/hosts locally on the Raspberry Pi, you can do a curl foo.com to see if you get the default Django page (or your application if you took things a bit further than are outlined in this tutorial). If you do, everything’s working! If not, start over and don’t screw up this time!

Seriously though if you need any help just comment here or email/Google+/Twitter me and I’ll be happy to assist.

How To Create a PyCharm Launcher on Ubuntu 12.10

I’m absolutely loving using PyCharm for my Python and Django development, but one of the lingering things I’ve been meaning to nail down once and for all is creating a launcher for PyCharm in Ubuntu 12.10. Despite the automated way you can attempt to do this from within PyCharm itself and all the other recommendations I’ve read I was unable to get it working.

In the meantime I also bought a copy of IntelliJ IDEA when they had some crazy back to school sale a couple of months ago (I still dabble in Groovy and Grails a bit). I was having the same issues with creating a launcher for IDEA and the typical tricks I use for Eclipse weren’t working, but luckily I came across this post that explains how to do it. The only change I made is pointing to idea64.vmoptions instead of just idea.vmoptions — other than that it works great.
That got me thinking — since PyCharm and IDEA are both made by JetBrains, and both run on Java, chances are how they work is pretty darn similar. So I decided to copy the IDEA launcher script and modify it for PyCharm, and lo and behold it worked!
Here’s my modified version of the IDEA launcher that works for PyCharm.
#!/bin/bash

export JAVA_HOME=/opt/java/jdk1.7.0_09
export JDK_HOME=/opt/java/jdk1.7.0_09

export PYCHARM_HOME=/home/mwoodward/pycharm-2.6.2

export PYCHARM_VM_OPTIONS="$PYCHARM_HOME/bin/pycharm64.vmoptions"
export PYCHARM_PROPERTIES="$PYCHARM_HOME/bin/idea.properties"

cd "$PYCHARM_HOME/bin"
export LIBXCB_ALLOW_SLOPPY_LOCK=1

./pycharm.sh
Obviously adjust all the paths as necessary for your machine. Make sure you chmod +x on the file, and with that in place you can open up Main Menu (sudo apt-get install alacarte if you don’t already have it installed) and add a launcher. For reference the icon lives in PyCharm’s bin directory.
Hope that helps someone else who has run into this issue.

Using Python to Compare Document IDs in Two CouchDB Databases

I’m doing a bit of research into what may or may not be an issue with a specific database in our BigCouch cluster, but regardless of the outcome of that side of things I thought I’d share how I used Python and couchdb-python to dig into the problem.

In our six-server BigCouch cluster we noticed that on the database for one of our most heavily trafficked applications the document counts displayed in Futon for each of the cluster members don’t match. As I said above this may or may not be a problem (I’m waiting on further information on that particular point), but I was curious which documents were missing from the cluster member that has the lowest document count. (The interesting thing is the missing documents aren’t truly inaccessible from the server with the lower document count, but we’ll get to that in a moment.)

BigCouch is based on Apache CouchDB but adds true clustering as well as some other very cool features, but for those of you not familiar with CouchDB, you communicate with CouchDB through a RESTful HTTP interface and all the data coming and going is JSON. The point here is it’s very simple to interact with CouchDB with any tool that talks HTTP.

Dealing with raw HTTP and JSON may not be difficult but isn’t terribly Pythonic either, which is where couchdb-python comes in. couchdb-python lets you interact with CouchDB via simple Python objects and handles the marshaling of data between JSON and native Python datatypes for you. It’s very slick, very fast, and makes using CouchDB from Python a joy.

In order to get to the bottom of my problem, I wanted to connect to two different BigCouch cluster members, get a list of all the document IDs in a specific database on each server, and then generate a list of the document IDs that don’t exist on the server with the lower total document count.

Here’s what I came up with:

>>> import couchdb
>>> couch1 = couchdb.Server('http://couch1:5984/')
>>> couch2 = couchdb.Server('http://couch2:5984/')
>>> db1 = couch1['dbname']
>>> db2 = couch2['dbname']
>>> ids1 = []
>>> ids2 = []
>>> for id in db1:
...     ids1.append(id)
...
>>> for id in db2:
...     ids2.append(id)
...
>>> missing_ids = list(set(ids1) - set(ids2))

What that gives me, thanks to the awesomeness of Python and its ability to subtract one set from another (note that you can also use the difference() method on the set object to achieve the same result), is a list of the document IDs that are in the first list that aren’t in the second list.

The interesting part came when I took one of the supposedly missing IDs and tried to pull up that document from the database in which it supposedly doesn’t exist:

>>> doc = db2['supposedly_missing_id_here']

I was surprised to see that it returned the document just fine, meaning it must be getting it from another member of the cluster, but I’m still digging into what the expected behavior is on all of this. (It’s entirely possible I’m obsessing over consistent document counts when I don’t need to be.)

So what did I learn through all of this?

  • The more I use Python the more I love it. Between little tasks like this and the fantastic experience I’m having working on our first full-blown Django project, I’m in geek heaven.
  • couchdb-python is awesome, and I’m looking forward to using it on a real project soon.
  • Even though we’ve been using CouchDB and BigCouch with great success for a couple of years now, I’m still learning what’s going on under the hood, which for me is a big part of the fun.

Three Approaches to Handling Static Files in Django

I had a really great (and lengthy) pair programming session today with a coworker during which we spent a bit of time going over a couple of different approaches for dealing with static files in Django, so I thought I’d document and share this information while it’s fresh in my mind.

First, a little background. If you’re not familiar with Django it was originally created for a newspaper web site, specifically the Lawrence Journal-World, so the approach to handling what in the Django world are called “static files” — meaning things like images, JavaScript, CSS, etc. — is based on the notion that you might be using a CDN so you should have maximum flexibility as to where these files are located.

While the flexibility is indeed nice, if you’re used to a more self-contained approach it takes a little getting used to, and there are a few different ways to configure your Django app to handle static files. I’m going to outline three approaches, but between different combinations of these and other solutions of which I may be unaware, there are certainly more ways to handle static files than what I’ll outline here. (And as I’m still relatively new to Django, if I’m misunderstanding any of this I’d love to hear where I could improve any of what I’m sharing here!)

One other caveat — I’m focusing here on STATIC_URL and ignoring MEDIA_URL but the approach would be the same.

Commonalities Across All Approaches

First, even though it may not strictly be required depending on which approach you take for handling static files, since you wind up needing to use this for other reasons, we’ll use django.template.RequestContext in render_to_response as opposed to the raw request object. This is required if you want access to settings like MEDIA_URL and STATIC_URL in your Django templates. For more details about RequestContext, the TEMPLATE_CONTEXT_PROCESSORS that are involved behind the scenes, and the variables this approach puts into your context, check the Django docs.

I’m also operating under the assumption that the static files will live in a directory called static that’s under the main application directory inside your project directory (i.e. the directory that has your main settings.py file in it). Depending on the approach you use you may be able to put the static directory elsewhere, but unless stated otherwise, that’s where the directory is assumed to be. (Note that if you store static files on another server entirely, such as using a CDN, STATIC_URL can be a full URL as opposed to a root-relative URL like /static/)
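For example, the CDN case might look something like this in settings.py (the domain is purely illustrative):

STATIC_URL = 'http://cdn.example.com/static/'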

Also in all examples it’s assumed that the STATIC_URL setting in the main settings.py file is set to ‘/static/’

Approach One (Basic): Use STATIC_URL Directly in Django Templates

This is the simplest approach and may be all you need. With STATIC_URL set to ‘/static/’ in the main settings.py file, all you really have to worry about is using RequestContext in your view functions and then referencing {{ STATIC_URL }} in your Django templates.

Here’s a sample views.py file:

from django.shortcuts import render_to_response
from django.template import RequestContext

def index(request, template_name='index.html'):
    return render_to_response(template_name, context_instance=RequestContext(request))

By using RequestContext the STATIC_URL variable will then be available to use in your Django templates like so:

<html>
<head>
    <script src="{{ STATIC_URL }}scripts/jquery/jquery-1.8.1.min.js"></script>
</head>
<body>
    <img src="{{ STATIC_URL }}images/header.jpg" />
</body>
</html>

That’s all there is to it. Again, since /static/ will be relative to the root of the main application directory in your project it’s assumed that the static directory is underneath your main application directory for this example, and obviously in the case of the example above that means that underneath the static directory you’d have a scripts and images directory.

Approach Two: Use a URL Pattern, django.views.static.serve, and STATICFILES_DIRS

In this approach you leverage Django’s excellent and hugely flexible URL routing to set a URL pattern that will be matched for your static files, have that URL pattern call the django.views.static.serve view function, and set the document_root that will be passed to the view function to a STATICFILES_DIRS setting from settings.py. This is a little bit more involved than the first approach but gives you a bit more flexibility since you can place your static directory anywhere you want.

The approach I took with this method was to set a CURRENT_PATH variable in settings.py (created by using os.path.abspath since we need a physical path for the document root) and leverage that to create the STATICFILES_DIRS setting. Here’s the relevant chunks from settings.py:

import os
CURRENT_PATH = os.path.abspath(os.path.dirname(__file__).decode('utf-8')).replace('\\', '/')

STATICFILES_DIRS = (
    os.path.join(CURRENT_PATH, 'static'),
)

Note that the replace('\\', '/') bit at the end of the CURRENT_PATH setting is to make sure things work on Windows as well as Linux.

Next, set a URL pattern in your main urls.py file:

from django.conf import settings

urlpatterns = patterns('',
    url(r'^static/(?P<path>.*)$', 'django.views.static.serve', {'document_root': settings.STATICFILES_DIRS[0]}),
)

And then in your Django templates you simply prefix all your static assets with /static/ as opposed to using {{ STATIC_URL }} as a template variable. Even though you’re specifying /static/ explicitly in your templates, you still have flexibility to put these files wherever you want since the URL pattern acts as an alias to the actual location of the static files.
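In other words, the template references from the earlier example would simply become:

<script src="/static/scripts/jquery/jquery-1.8.1.min.js"></script>
<img src="/static/images/header.jpg" />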

Approach Three: Use staticfiles_urlpatterns and {% get_static_prefix %} Template Tag

django.contrib.staticfiles was first introduced in Django 1.3 and was designed to clean up, simplify, and create a bit more power for static file management. This approach gives you the most flexibility and employs a template tag instead of a simple template variable when rendering templates.

First, in settings.py we’ll do the same thing we did in the previous approach, namely setting a CURRENT_PATH variable and then use that to set the STATICFILES_DIRS variable:

import os
CURRENT_PATH = os.path.abspath(os.path.dirname(__file__).decode('utf-8')).replace('\\', '/')

STATICFILES_DIRS = (
    os.path.join(CURRENT_PATH, 'static'),
)

Next, in urls.py we’ll import staticfiles_urlpatterns from django.contrib.staticfiles.urls and call that function to add the static file URL patterns to the application’s URL patterns:

from django.contrib.staticfiles.urls import staticfiles_urlpatterns

urlpatterns = patterns('',
    # your app's url patterns here
)

urlpatterns += staticfiles_urlpatterns()

The final line there is what adds the static file URL patterns into the mix. If you output staticfiles_urlpatterns() you’ll see it’s something like so:

[<RegexURLPattern None ^static/(?P<path>.*)$>]

And finally, at the very top of your templates you load the static template tags and then simply use the {% get_static_prefix %} tag to render the static URL:

{% load static %}
<html>
<head>
    <script src="{% get_static_prefix %}scripts/jquery/jquery-1.8.1.min.js"></script>
</head>
<body>
    <img src="{% get_static_prefix %}images/header.jpg" />
</body>
</html>

Conclusion

So there you have it, three approaches that more or less accomplish the same thing, but depending on the specific needs of your application or environment one approach may work better for you than another.

For our purposes on our current application we’re using the first approach outlined above since it’s simple and meets our needs, but it’s great to know there’s so much flexibility around static file handling in Django when you need it. As always read the docs for more information and yet more options for managing static files in your Django apps.

Installing python-ldap in virtualenv on Ubuntu

We’re authenticating against Active Directory in our current Python/Django project and though we’ve had excellent luck with python-ldap in general, I ran into issues when trying to install python-ldap in a virtualenv this afternoon. As always a lot of DuckDuckGoing and a minimal amount of headbanging led to a solution.

The error I was getting after activating the virtualenv and running pip install python-ldap was related to gcc missing, which was misleading since that wasn’t actually the issue:

error: Setup script exited with error: command 'gcc' failed with exit status 1

To add to the weirdness, when I installed python-ldap outside the context of a virtualenv, everything worked fine.
I’ll save you the blow-by-blow and just tell you that on my machine at least, other than the required OpenLDAP installation and some other libraries, I also had to install libsasl2-dev:

sudo apt-get install libsasl2-dev

Once I had that installed, I could activate my virtualenv, run pip install python-ldap, and it installed without errors.
If you still run into issues make sure (in addition to OpenLDAP) to have these packages installed:
  • python-dev
  • libldap2-dev
  • libssl-dev
Hope that saves someone else some time!

Installing MySQL Python Module on Ubuntu

After tearing through several other Django books and tutorials using sqlite3 as the database, I’m starting to go through the book Beginning Django E-Commerce and it uses MySQL as the database. I use MySQL quite a lot so that side of things isn’t an issue, but I did run into a couple of wrinkles getting MySQL and Django playing nice so I thought I’d share.

Basically if after configuring your database in settings.py and running python manage.py dbshell you get a bunch of errors, you have a minor amount of installation work to do to get things rolling.
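For reference, the MySQL flavor of that configuration looks something like this (the database name, user, and password are placeholders):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': 'mypassword',
        'HOST': 'localhost',
        'PORT': '',
    }
}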

First thing I did was install pip, which is a better version of/replacement for easy_install:
sudo easy_install pip
Next I ran pip upgrade for good measure (probably not necessary but can’t hurt, and worth running if you already had pip installed):
sudo pip install pip --upgrade
On my machine (Ubuntu 12.04 64-bit) I also had to build the dependencies for the python-mysqldb libraries:
sudo apt-get build-dep python-mysqldb
And finally with that in place you can use pip to install the Python MySQL libraries:
sudo pip install MySQL-python
If everything worked you should now be able to run python manage.py dbshell from your Django project and have it load up the MySQL console.

Manually Installing the Django Plugin for Eric

If you install Eric (specifically Eric 4) from the Ubuntu software repos, the Eric plugin repository points to a location that’s unavailable:
http://die-offenbachs.homelinux.org/eric/plugins/repository.xml

I’m sure there’s a way to change it but I don’t see how to do it in the app itself (haven’t started poking around to see if there are config files somewhere yet), but luckily Eric plugins are just zip files so you can download them from a repository URL that works, and then add them to Eric.

The working plugin repository is here:
http://eric-ide.python-projects.org/plugins4/repository.xml

From there just do a ctrl-F to find the plugin you’re looking for, then copy/paste the URL for the plugin’s zip file into your browser (or use wget or whatever floats your boat) to download the plugin.

With the zip file downloaded, in Eric go to Plugins -> Install Plugins, click Add, and then point to the zip file you downloaded.

If someone knows how to change the plugin repository URL in Eric I’d love to learn.

Connecting to SQL Server with pyodbc

At long last after my previous posts on this topic we get to the easy part: using pyodbc to run queries against SQL Server.

If you need to get caught up, the posts you’ll want to go through before proceeding with this one are:

With a datasource created in unixODBC this last part is rather anti-climactic, but this is of course also where you can get some real work done in Python.
First, in a Python console (‘python’ from a terminal or whatever your favorite Python console tool is), import pyodbc:
>>> import pyodbc
Next, create a connection to the datasource using the datasource name, which in this case we’ll assume is foo:
>>> conn = pyodbc.connect('DSN=foo;UID=username;PWD=password')
If you don’t get any errors, the connection was successful. Next, create a cursor on that connection:
>>> cursor = conn.cursor()
Now we can execute a query against SQL Server using that cursor:
>>> cursor.execute("SELECT * FROM bar")
If the query ran successfully, you’ll see that a pyodbc.Cursor object is returned:
<pyodbc.Cursor object at 0x15a5a50>
Next, and this is not the most efficient way to do things (see below) but is good for demonstration purposes, let’s call fetchall() on the cursor and set that to a variable called rows:
>>> rows = cursor.fetchall()
This returns a list of pyodbc Row objects, which are basically tuples.
Finally, we can iterate over the rows and output the results of the query:
>>> for row in rows:
...     print row

This will output each row so you can see the results of your handiwork.

That’s all there is to it!

Of course there are a lot of ways to get at the individual column values returned from the query, so be sure and check out the pyodbc Getting Started wiki for details.

Note that there are numerous considerations depending on the volume and nature of data with which you’re dealing. For example if you have a large amount of data, using fetchall() isn’t advisable since that will load all the results into memory, so you’ll probably much more commonly want to use fetchone() in a loop instead since that’s much more memory efficient.
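As a rough sketch of that pattern (same cursor and query as above, written as a plain script rather than console input):

cursor.execute("SELECT * FROM bar")
row = cursor.fetchone()
while row:
    print row  # or work with individual columns, e.g. row[0]
    row = cursor.fetchone()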

You can learn all about the objects, methods, etc. involved with pyodbc in the pyodbc wiki.

Next up is pymssql!