We used mod_python in the past, but mod_wsgi offers several advantages:
- Smaller Python memory footprint in Apache.
- Can run as a separate daemon process that is managed by Apache and supports threads.
- We can reload our application code without restarting Apache.
- Allows multiple WSGI applications on the same virtual host.
- Allows us to specify which user our application runs as, along with a host of other nice configuration options.
- Graham Dumpleton rocks.
Here is what we are working with:
- Python 2.5.x
- mod_wsgi 2.x
- Apache 2 (comes installed with cPanel)
- Django
- cPanel/CentOS
$ uname -a
Linux bubba.example.com 2.6.18-53.1.21.el5 #1 SMP Tue May 20 09:35:07 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
$ rpm -q glibc
glibc-2.5-18.el5_1.1
glibc-2.5-18.el5_1.1
What is WSGI? According to one site it is:
WSGI is a specification, laid out in PEP 333, for a standardized interface between Web servers and Python Web frameworks/applications. The goal is to provide a relatively simple yet comprehensive interface capable of supporting all (or most) interactions between a Web server and a Web framework. (Think “CGI” but programmatic rather than I/O based.)
Build Python
We decided to build and maintain our own Python installation rather than using the one that comes with the cPanel install. We wanted to avoid any grief with mailman and cPanel dependencies being broken. We also preferred using the latest and greatest Python for our web applications.
To get sqlite support in Python 2.5 you need the sqlite headers to compile against. Since I already had sqlite 3.3.6-2 installed, I grabbed an RPM of the matching devel package for CentOS 5.1 to build against.
Then I downloaded Python 2.5 and built it like this:
$ ./configure --prefix=/opt/local --with-ncurses --with-threads --enable-shared
$ make && make install
If you try to run /opt/local/bin/python and get something like “error while loading shared libraries: libpython2.5.so.1.0: cannot open shared object file: No such file or directory” then you can configure ld to find your shared libs. On the Red Hat style system cPanel uses, you can create a .conf file in /etc/ld.so.conf.d/. The .conf file just has a path in it. See man ldconfig for more information.
$ cat >> /etc/ld.so.conf.d/opt-local.conf
/opt/local/lib
^d (press ctrl-d)
$ ldconfig
Now that you have Python 2.5 on your system, you can create a /usr/bin/python2.5 symlink if you like or put it anywhere on your system. I chose to just add /opt/local/bin to my $PATH or specify the path explicitly. Don’t replace the system default Python 2.4, of course, since mailman currently depends on it! Hopefully future releases of cPanel will ship with Python 2.5.
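A quick way to confirm the new interpreter came out as intended is to ask it about itself. The sketch below is not part of the original setup: `describe_build` is a hypothetical helper name, and it uses the `sysconfig` module from current Python releases (on Python 2.5 the equivalent call lives in `distutils.sysconfig`). Run it with the interpreter you just built, for example /opt/local/bin/python.

```python
import sys
import sysconfig


def describe_build():
    """Collect a few facts about the interpreter running this script."""
    try:
        import sqlite3
        sqlite_version = sqlite3.sqlite_version  # version of the linked SQLite library
    except ImportError:
        sqlite_version = None  # the sqlite module was not built
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "version": "%d.%d.%d" % sys.version_info[:3],
        # Typically 1 when configured with --enable-shared, 0 or None otherwise.
        "enable_shared": sysconfig.get_config_var("Py_ENABLE_SHARED"),
        "sqlite_version": sqlite_version,
    }


if __name__ == "__main__":
    for key, value in sorted(describe_build().items()):
        print("%s: %s" % (key, value))
```

If `sqlite_version` comes back empty, the sqlite headers were most likely missing at build time.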
Build mod_wsgi
When building mod_wsgi I used a shared python library as recommended to avoid apache process bloat. If you use a shared library then mod_wsgi.so is only around 250K according to the docs. But you have to symlink the python shared library from the right directory for mod_wsgi to find it. That was clearly stated in the installation issues for mod_wsgi:
In that case you need to create a symlink in the ‘config’ directory to where the shared library is actually installed.
So we built mod_wsgi by doing:
$ cd /opt/local/lib/python2.5/config
$ ln -s ../../libpython2.5.so .
$ cd /opt/local/src/mod_wsgi-2.1/
$ ./configure --with-python=/opt/local/bin/python
$ make && make install
After running the install you will see a new apache module installed at /usr/local/apache/modules/mod_wsgi.so; mine was 171K.
Configure Apache and mod_wsgi
First tell Apache about the new module and what files it should handle. This can go in the pre-vhost conf, which can be edited in WHM from Apache Setup | Include Editor.
LoadModule wsgi_module /usr/local/apache/modules/mod_wsgi.so
AddHandler wsgi-script .wsgi
Then configure a virtual host/site to actually start using it. You should probably start reading some of the mod_wsgi documentation. One concern is whether to use embedded mode or daemon mode. Daemon mode is the most commonly used setup and the one I chose. My configuration sits inside a VirtualHost include file. Here’s the configuration for our Django site:
<IfModule mod_alias.c>
Alias /robots.txt /home/bob/sites/example.com/lib/myproject/media/robots.txt
Alias /media /home/bob/sites/example.com/lib/myproject/media/
Alias /admin_media /home/bob/sites/example.com/lib/django/contrib/admin/media/
</IfModule>
<IfModule mod_wsgi.c>
WSGIScriptAlias / /home/bob/public_html/myapp.wsgi
WSGIDaemonProcess myapp processes=5 threads=1 display-name=%{GROUP}
WSGIProcessGroup myapp
WSGIApplicationGroup %{GLOBAL}
</IfModule>
According to the above config, all requests for this site will be managed by the WSGI application, except for the URLs defined with the Alias directive.
On our cPanel system this file is:
/usr/local/apache/conf/userdata/std/2/<username>/<your.site.domain>/vhost_mods.conf
Also remember when adding a configuration to your vhost config on a cPanel system you should run the following commands:
/scripts/ensure_vhost_includes --user=<username>
/scripts/verify_vhost_includes
Hello World
Here is a minimal script you can use to test. Once you have mod_wsgi.so compiled and loaded into Apache, put this in your myapp.wsgi script to check if the pipes are clean.
def application(environ, start_response):
    """Simplest possible application object"""
    output = "Hello World"
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)
    return [output]
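Before involving Apache, the same kind of application can be exercised directly from Python. This is a sketch under assumptions: `call_app` is a hypothetical helper, it relies on the standard library's `wsgiref.util.setup_testing_defaults` to fake a request, and it is written for a current Python 3 interpreter, where the response body must be bytes (the Python 2.5 script above returns a plain string).

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Same shape as the Hello World script, with a bytes body for Python 3."""
    output = b"Hello World"
    status = "200 OK"
    response_headers = [("Content-type", "text/plain"),
                        ("Content-Length", str(len(output)))]
    start_response(status, response_headers)
    return [output]


def call_app(app, path="/"):
    """Play the web server's role: build environ, call app, collect the reply."""
    environ = {"SCRIPT_NAME": "", "PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in REQUEST_METHOD, wsgi.input, etc.
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application hands back to the "server".
        captured["status"] = status
        captured["headers"] = dict(headers)

    result = app(environ, start_response)
    try:
        body = b"".join(result)
    finally:
        # Per the WSGI spec, call close() on the iterable if it has one.
        if hasattr(result, "close"):
            result.close()
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call_app(application))
```

If this works but the site does not, the problem is more likely in the Apache configuration than in the script.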
In searching to get the beginner scoop I came across a mailing list thread which explains it quite well. Most of the mod_wsgi documentation is technical and there is so much it is daunting. I have a feeling Graham is prepping for a book.
One thing I did to learn was subscribe to the mailing list and start listening. Helps the pieces come together. Asking questions on the mailing list I was greeted with prompt and helpful answers. It is rare to see a thread go unanswered by Graham, the author, even in some cases during vacation. This level of commitment to a crucial piece of grease and cog makes me proud to be an open-sourceror.
Configure Your Application
Create a mod_wsgi python script (myapp.wsgi) to load our application.
- Tell python where to find your libraries.
- Tell python where to store python eggs (this directory needs to be writable by the mod_wsgi process).
- Define your DJANGO_SETTINGS_MODULE so Django knows what project you want to run.
- Define a WSGI application; in mod_wsgi script terms that means instantiate an object named ‘application’.
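The writable-directory requirement for the egg cache is easy to get wrong, so it can be worth checking up front. The following is only a sketch: `egg_cache_ready` is a hypothetical helper, and the result is meaningful only when it runs as the same user as the mod_wsgi process.

```python
import os


def egg_cache_ready(path):
    """Return True if path is an existing directory the current user can write into."""
    # W_OK allows creating files in the directory, X_OK allows entering it.
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


# Example call (commented out), using the egg cache path from this article:
# egg_cache_ready('/home/bob/sites/example.com/.python-eggs')
```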
Each WSGI application typically uses its own script. Typically this file lives in the docroot and sets up your application. You can see how I’m referencing the WSGI script from the WSGIScriptAlias Apache directive above. My convention is to store my Django project and related Django applications for the site in /home/<username>/sites/<sitename>/lib, so I insert that path as the first item in my python path.
import os, sys
sys.path.insert(0,'/home/bob/sites/example.com/lib')
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'
os.environ['PYTHON_EGG_CACHE'] = '/home/bob/sites/example.com/.python-eggs'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
One thing to check with your application is that you are not using the print statement within it. I had a few places where I was debugging and left some print statements hanging around. They ran quietly under mod_python, but mod_wsgi complains about them. This is actually a good thing since your web application should be logging this information in a log file and not printing debug information to the client. So I replaced these print statements with:
print >> sys.stderr, 'entering function'
[or]
sys.stderr.write('entering function\n')
sys.stderr.flush() # if timing is a real issue
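If the debug output is meant to stay, the standard `logging` module is another route to the same place. This is a sketch, not what the site above runs: the logger name `myproject` and the function `do_work` are placeholders, and it assumes that whatever the application writes to `sys.stderr` ends up in the Apache error log.

```python
import logging
import sys

# Send log records to stderr instead of printing to stdout.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))

logger = logging.getLogger("myproject")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)


def do_work():
    """Placeholder for a view or helper that used to call print."""
    logger.debug("entering function")
    return "done"
```

Raising the level to `logging.WARNING` later silences the debug lines without touching the call sites.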
Success
If all goes well, when you restart Apache and look at the process list on your system you will see 5 ‘(wsgi: myapp)’ processes that were spawned and are kept alive by Apache. In this case the processes will be owned by nobody or whoever Apache is running as. Also, if the modification timestamp on your myapp.wsgi file changes and you refresh your site, the new code is loaded.
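Since the reload is triggered by the script's modification time, a deploy step only has to update that timestamp. A minimal sketch, assuming the script path from the Apache configuration earlier; `touch` is a hypothetical helper that mirrors the shell command of the same name for an existing file.

```python
import os


def touch(path):
    """Set the file's access and modification times to the current time."""
    os.utime(path, None)  # None means "use the current time"


# Example call (commented out), using the script path from this article:
# touch('/home/bob/public_html/myapp.wsgi')
```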
I chose not to run threaded processes, so in my config I set threads=1. That’s because I have some code that may not be thread safe. I should look further into that, because ideally I would like to run threaded processes so they scale better.
I’m still not entirely sure what is thread safe, so I figured I wouldn’t trudge too far into unknown territory. I’m amazed at the load we are handling with 5 of these suckers and a bit of caching. Of course we are only serving the python requests; the static file requests skip mod_wsgi. They are defined with the Alias directive and handled by Apache.
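On the thread-safety question, the usual culprit is state shared between requests, such as a module-level variable that several threads modify at once. The sketch below is a generic illustration, not code from this site: `hit_count`, `record_hit`, and `simulate` are made up for the example, and it is written for a current Python interpreter. It shows the common fix of guarding the shared value with a `threading.Lock`.

```python
import threading

hit_count = 0                # module-level state shared by every thread
hit_lock = threading.Lock()  # guards hit_count


def record_hit():
    """Increment the shared counter while holding the lock."""
    global hit_count
    with hit_lock:
        hit_count += 1


def simulate(threads=4, hits_per_thread=1000):
    """Call record_hit from several threads and return the final count."""
    global hit_count
    hit_count = 0

    def worker():
        for _ in range(hits_per_thread):
            record_hit()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return hit_count


if __name__ == "__main__":
    print(simulate())
```

Code that only reads shared data, or keeps everything in local variables, generally does not need this kind of guard.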