Weapons of mass creation
Django, Python, mod_wsgi with cPanel WHM
We recently migrated all of our sites to a new machine and decided to run everything on cPanel, for various reasons. One problem with this setup, at the time of this writing, is that out-of-the-box cPanel does not support Python 2.5 or mod_wsgi. But it does support Apache 2, so we decided to use mod_wsgi to run our sites built with Django. It’s been working pretty smoothly and the setup was not too bad. In this tutorial I assume you already have a working Django installation (tested with runserver).

We were using mod_python in the past but the advantages to using mod_wsgi are:

  1. Smaller Python memory footprint in Apache.
  2. Can run as separate daemon process that is managed by Apache and supports threads.
  3. We can reload our application code without restarting Apache.
  4. Allows multiple WSGI applications on the same virtual host.
  5. Allows us to specify which user our application runs as, plus a host of other nice configuration options.
  6. Graham Dumpleton rocks.
The ingredients:
  • Python 2.5.x
  • mod_wsgi 2.x
  • Apache 2 (comes installed with cPanel)
  • Django
  • cPanel/CentOS
Not sure if this matters much to this tutorial but I will note some system information as well:
$ uname -a
Linux bubba.example.com 2.6.18-53.1.21.el5 #1 SMP Tue May 20 09:35:07 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
$ rpm -q glibc
glibc-2.5-18.el5_1.1
glibc-2.5-18.el5_1.1
What is WSGI? According to one site it is:
WSGI is a specification, laid out in PEP 333, for a standardized interface between Web servers and Python Web frameworks/applications. The goal is to provide a relatively simple yet comprehensive interface capable of supporting all (or most) interactions between a Web server and a Web framework. (Think “CGI” but programmatic rather than I/O based.)
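To make the “programmatic rather than I/O based” part concrete, here is a minimal sketch that plays the server's role by hand: it builds an environ dict, supplies a start_response callable, and calls the application directly. The helper name call_app is made up for illustration, and the code is written for a current Python (the b"" literals would not parse on Python 2.5, where the body would be a plain str).

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A trivial WSGI application: one callable, two arguments."""
    body = b"Hello World"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: do by hand what a WSGI server does per request."""
    environ = {}
    setup_testing_defaults(environ)   # fills in the keys the spec requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # The server would send these to the client; here we just keep them.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Calling call_app(application) returns the status line, the header list and the body. That is all a server like mod_wsgi needs from the application side.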

Build Python

We decided to build and maintain our own python installation rather than using the one that comes with the cPanel install. We wanted to avoid any grief with mailman and cPanel dependencies being broken. We also preferred using the latest and greatest python for our web applications.

To get sqlite support in Python 2.5 you need the sqlite headers to compile against. Since I already had sqlite 3.3.6-2 installed, I grabbed an RPM of the matching devel package for CentOS 5.1 to build against.

Then I downloaded Python 2.5 and built it like so:
$ ./configure --prefix=/opt/local --with-ncurses --with-threads --enable-shared
$ make && make install
If you try to run /opt/local/bin/python and get something like “error while loading shared libraries: libpython2.5.so.1.0: cannot open shared object file: No such file or directory”, then you need to configure ld to find your shared libs. On the Red Hat-based system that cPanel runs on, you can create a .conf file in /etc/ld.so.conf.d/. The .conf file just has a path in it. See man ldconfig for more information.
$ cat >> /etc/ld.so.conf.d/opt-local.conf
/opt/local/lib
^d (press ctrl-d)
$ ldconfig
Now you have Python 2.5 on your system. You can configure a /usr/bin/python2.5 symlink if you like, or put it anywhere on your system. I chose to just add /opt/local/bin to my $PATH or specify the path explicitly. Don’t replace the system default Python 2.4, of course, since mailman currently depends on it! Hopefully future releases of cPanel will ship with Python 2.5.
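A quick way to confirm that the new interpreter really picked up sqlite at build time is to import the module and run a query. This is only a sketch; the function name check_sqlite is my own, and it should be run with the freshly built interpreter (/opt/local/bin/python in this setup).

```python
import sqlite3
import sys


def check_sqlite():
    """Return (interpreter prefix, SQLite library version, query result)."""
    conn = sqlite3.connect(":memory:")   # in-memory database, nothing touches disk
    try:
        row = conn.execute("SELECT 1 + 1").fetchone()
    finally:
        conn.close()
    return sys.prefix, sqlite3.sqlite_version, row[0]
```

If the import itself fails with an ImportError mentioning _sqlite3, the headers were most likely not found when Python was compiled. The first element of the result should be the --prefix you configured with.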

Build mod_wsgi

When building mod_wsgi I used a shared Python library, as recommended, to avoid Apache process bloat. If you use a shared library then mod_wsgi.so is only around 250K according to the docs. But you have to symlink the Python shared library into the right directory for mod_wsgi to find it. That was clearly stated in the installation issues for mod_wsgi:
In that case you need to create a symlink in the ‘config’ directory to where the shared library is actually installed.
So we built mod_wsgi by doing:
$ cd /opt/local/lib/python2.5/config
$ ln -s ../../libpython2.5.so .
$ cd /opt/local/src/mod_wsgi-2.1/
$ ./configure --with-python=/opt/local/bin/python
$ make && make install
After running the install you will see a new apache module installed at /usr/local/apache/modules/mod_wsgi.so, mine was 171K.
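If the module comes out much larger than that, one thing worth checking is whether the interpreter you pointed configure at was really built with --enable-shared. The sketch below asks Python for its own build settings. It uses the top-level sysconfig module from current Python; on Python 2.5 that module does not exist and `from distutils import sysconfig` offers the same get_config_var call. The function name shared_lib_info is made up.

```python
import sysconfig


def shared_lib_info():
    """Report whether this interpreter was built with --enable-shared."""
    return {
        # 1 when built with --enable-shared, 0 or None otherwise
        "enable_shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
        # directory the Python library was installed into
        "libdir": sysconfig.get_config_var("LIBDIR"),
        # file name of the library the interpreter links against
        "ldlibrary": sysconfig.get_config_var("LDLIBRARY"),
    }
```

For the build described here I would expect enable_shared to be True and libdir to be /opt/local/lib, but treat that as an expectation rather than captured output.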

Configure Apache and mod_wsgi

First tell Apache about the new module and what files it should handle. This can go in the pre-vhost conf which can be edited in WHM from Apache Setup | Include Editor.
LoadModule wsgi_module /usr/local/apache/modules/mod_wsgi.so
AddHandler wsgi-script .wsgi 
Then configure a virtual host/site to actually start using it. This is a good point to start reading some of the mod_wsgi documentation. One decision is whether to use embedded mode or daemon mode. I used daemon mode, which is the most commonly used setup, and my configuration sits inside a VirtualHost include file. Here’s the configuration for our Django site:
<IfModule mod_alias.c>
Alias /robots.txt /home/bob/sites/example.com/lib/myproject/media/robots.txt
Alias /media /home/bob/sites/example.com/lib/myproject/media/
Alias /admin_media /home/bob/sites/example.com/lib/django/contrib/admin/media/
</IfModule>

<IfModule mod_wsgi.c>
WSGIScriptAlias / /home/bob/public_html/myapp.wsgi
WSGIDaemonProcess myapp processes=5 threads=1 display-name=%{GROUP}
WSGIProcessGroup myapp
WSGIApplicationGroup %{GLOBAL} 
</IfModule>

According to the above config, all requests for this site will be managed by the WSGI application, except for the URLs defined with the Alias directive.

On our cPanel system this file is:
/usr/local/apache/conf/userdata/std/2/<username>/<your.site.domain>/vhost_mods.conf
Also remember when adding a configuration to your vhost config on a cPanel system you should run the following commands:
/scripts/ensure_vhost_includes --user=<username>
/scripts/verify_vhost_includes 

Hello World

Here is a minimal script you can use to test. Once you have mod_wsgi.so compiled and loaded into Apache, put this in your myapp.wsgi script to check if the pipes are clean.
def application(environ, start_response):
    """Simplest possible application object"""
    output = "Hello World"
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)
    return [output]
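Once the hello world works, a slightly chattier variant can confirm that requests are really being handled by the daemon process group from the Apache config. This is a sketch: it reads the mod_wsgi.process_group and mod_wsgi.application_group keys that mod_wsgi puts in the environ, and it encodes the body so the same script also runs on a current Python.

```python
def application(environ, start_response):
    """Report which mod_wsgi process/application group served the request."""
    # An empty process group means embedded mode; outside mod_wsgi the
    # key is missing and .get() yields None.
    lines = [
        "process_group=%r" % environ.get("mod_wsgi.process_group"),
        "application_group=%r" % environ.get("mod_wsgi.application_group"),
        "script_name=%r" % environ.get("SCRIPT_NAME"),
        "path_info=%r" % environ.get("PATH_INFO"),
    ]
    output = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(output)))])
    return [output]
```

With the WSGIProcessGroup myapp directive shown earlier, the first line should name myapp. If it comes back empty, the request was served in embedded mode and the daemon configuration is not being applied.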
In searching to get the beginner scoop I came across a mailing list thread which explains it quite well. Most of the mod_wsgi documentation is technical and there is so much it is daunting. I have a feeling Graham is prepping for a book. One thing I did to learn was subscribe to the mailing list and start listening. Helps the pieces come together. Asking questions on the mailing list I was greeted with prompt and helpful answers. It is rare to see a thread go unanswered by Graham, the author, even in some cases during vacation. This level of commitment to a crucial piece of grease and cog makes me proud to be an open-sourceror.

Configure Your Application

Create a mod_wsgi python script (myapp.wsgi) to load our application.
  1. Tell python where to find your libraries
  2. Tell python where to store python eggs (this directory needs to be writable by the mod_wsgi process).
  3. Define your DJANGO_SETTINGS_MODULE so Django knows what project you want to run.
  4. Define a WSGI application, in mod_wsgi script terms that means instantiate an object named ‘application’.

Each WSGI application typically uses its own script. Typically this file lives in the docroot and sets up your application. You can see how I’m referencing the WSGI script from the WSGIScriptAlias Apache directive above. My convention is to store the Django project and related Django applications for a site in /home/<username>/sites/<sitename>/lib, so I insert that path as the first item in my Python path.

import os, sys
sys.path.insert(0,'/home/bob/sites/example.com/lib')
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'
os.environ['PYTHON_EGG_CACHE'] = '/home/bob/sites/example.com/.python-eggs'

import django.core.handlers.wsgi

application = django.core.handlers.wsgi.WSGIHandler()
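Since the egg cache directory has to be writable by whatever user the mod_wsgi process runs as, it can save some head scratching to test that directly. Below is a small sketch; egg_cache_writable is a hypothetical helper, not part of mod_wsgi or Django, and it needs to be run as the same user as the daemon process to mean anything.

```python
import os
import tempfile


def egg_cache_writable(path):
    """Return True if the current user can create files inside `path`."""
    if not os.path.isdir(path):
        return False
    try:
        # Actually create a file; permission bits alone can be misleading.
        handle, name = tempfile.mkstemp(dir=path)
    except OSError:
        return False
    os.close(handle)
    os.remove(name)
    return True
```

For the script above you would call it with the same path that is assigned to PYTHON_EGG_CACHE.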
One thing to check with your application is that you are not using the print statement to write to standard output anywhere in it. I had a few places where I was debugging and left some print statements hanging around. They ran quietly under mod_python, but mod_wsgi complains about them. This is actually a good thing, since your web application should be logging this information to a log file and not printing debug information to the client. So I replaced these print statements with:
print >> sys.stderr, 'entering function'
[or]
sys.stderr.write('entering function\n')
sys.stderr.flush()  # if timing is a real issue
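If you would rather not sprinkle stderr writes around, the standard logging module can do the same job with levels and timestamps. This is a minimal sketch, not what we run in production: the logger name myproject is a placeholder, and under mod_wsgi anything written to stderr ends up in the Apache error log.

```python
import logging
import sys


def make_logger(name="myproject"):
    """Build a logger that writes to stderr (the name is a placeholder)."""
    logger = logging.getLogger(name)
    if not logger.handlers:               # avoid stacking handlers on reload
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return logger


log = make_logger()
log.debug("entering function")
```

The handler check matters because getLogger returns the same object for the same name, so calling make_logger twice would otherwise print every message twice.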

Success

If all goes well, when you restart Apache and look at the process list on your system you will see 5 ‘(wsgi: myapp)’ processes that were spawned and are kept alive by Apache. In this case the processes will be owned by nobody, or whoever Apache is running as. Also, if the modification timestamp on your myapp.wsgi file changes, the new code is loaded the next time you refresh your site.
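Changing the modification timestamp does not require editing the file. From Python, for example in a deploy script, it can be done with os.utime, which has the same effect as running touch on the file from the shell. The function name touch below is my own.

```python
import os


def touch(path):
    """Set the file's access and modification times to now."""
    os.utime(path, None)          # None means "use the current time"
    return os.path.getmtime(path)
```

Calling touch('/home/bob/public_html/myapp.wsgi') after deploying new code should be enough to get the daemon processes to pick it up on the next request.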

I chose not to run threaded processes, so in my config I set threads=1. That’s because I have some code that may not be thread safe. I should look further into that, because ideally I would like to run threaded processes so they scale better. I’m still not entirely sure what is thread safe, so I figured I wouldn’t trudge too far into unknown territory. I’m amazed at the load we are handling with 5 of these suckers and a bit of caching. Of course we are only serving the Python requests; the static file requests skip mod_wsgi. They are defined with the Alias directive and handled by Apache.
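For what it’s worth, the usual thing that makes code not thread safe is module-level state that several request threads modify at once. Here is a generic sketch of the problem and the standard fix with a lock. It is not taken from our code base; HitCounter and hammer are invented names, and it is written for a current Python (on 2.5 the with statement needs a __future__ import).

```python
import threading


class HitCounter(object):
    """State shared by every request thread in a process."""

    def __init__(self):
        self._lock = threading.Lock()
        self.count = 0

    def increment(self):
        # Read-modify-write is not atomic, so guard it with a lock.
        with self._lock:
            self.count += 1
            return self.count


def hammer(counter, threads=8, per_thread=1000):
    """Call increment() from several threads and return the final count."""
    def work():
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.count
```

With the lock in place, the final count always equals threads times per_thread. Without it, updates from different threads can overwrite each other, which is the kind of bug that only shows up under load.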

Related

I have to admit, even after installing a WSGI application I still have a hard time explaining what it is. The simplest explanation is that WSGI is just a spec, but I think of it as CGI for Python applications. I guess CGI didn’t cut it. There is plenty of information out there that is better than what I can provide.
Posted by mandric