A nasty bug

April 27th, 2009

About a month ago a pretty nasty bug appeared on my image hosting service, pici.se. Some people started noticing that their thumbnails had changed to pictures which didn’t belong to them. This was a rather serious issue, so I had a look at the code to figure out what was going on. I realized that I had recently changed my deployment from Django under Apache to Django under lighttpd, and this was surely related to that… but how?

It turned out that it was a combination of bugs that together fucked stuff up for me pretty badly. When I started allowing pictures over 1MB a while ago, I wanted to limit hotlinking to them, so I put them into a separate directory from the other pictures being served. That meant that the large pictures were given filenames such as “large/asSDavXZ.jpg” instead of “asSDavXZ.jpg”, where the filename is randomly generated. I had, however, forgotten to update my code to check for the presence of these images when fetching a new random filename. That is, my code looked something like this (pseudo-code):

filename = get_random_stuff()
while file_exists(filename):
  filename = get_random_stuff()

when it should have been changed to this:

filename = get_random_stuff()
while file_exists(filename) or file_exists("large/" + filename):
  filename = get_random_stuff()
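The pseudo-code above can be turned into a small runnable sketch. The helper names (random_name, unique_filename), the 8-letter alphabet and the ".jpg" extension are assumptions made for illustration, not the actual pici.se code. The point is that the existence check has to cover every directory whose images share one thumbnail folder.

```python
import os
import random
import string

# Assumption: names are 8 ASCII letters, like "asSDavXZ" in the post.
ALPHABET = string.ascii_letters


def random_name(length=8, rng=random):
    """Return a random base name without extension."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def unique_filename(root, subdirs=("", "large"), ext=".jpg", rng=random):
    """Pick a name that is unused in every directory sharing thumbnails."""
    while True:
        name = random_name(rng=rng) + ext
        # "" stands for the root itself, "large" for the big-picture folder.
        taken = any(
            os.path.exists(os.path.join(root, sub, name)) for sub in subdirs
        )
        if not taken:
            return name
```

Listing the directories in one tuple means that adding another storage folder later only requires extending subdirs, rather than remembering to add another "or" to the loop condition.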

When I saw this, the first thing that came to mind was that this shouldn’t be such a big deal. Generating two filenames which were the same should pretty much never happen since there are so many possible combinations of characters.

Well… it did. And quite often. Running some scripts on the server showed that there were quite a lot of these collisions, which seemed REALLY weird, and then I found a bug in Django. It seems like FastCGI deployment using method=prefork gives the same random seed to each process. In combination with my fuckup, this made these collisions happen quite a lot, and people got their thumbnails overwritten since all thumbnails were stored in the same folder without the “large/” prefix for each image.

That is, for a picture’s thumbnail to be overwritten the following had to happen:

1. Someone uploads a picture smaller than 1MB; the request is handed to FastCGI process 1, which gives the picture a “random” string for its filename.
2. Someone else uploads a picture larger than 1MB; the request is handed to FastCGI process 2, which gives the picture the same “random” string as a filename, because the bug in Django leaves both processes with the same random seed.
3. My code has a nasty bug in it and doesn’t detect this collision.
4. Thumbnails are generated for image2 at the same target as the thumbs for image1.
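The four steps above can be imitated in a few lines. This is only a simulation under stated assumptions: the two workers are modelled as random.Random objects that start from a copy of one parent state, standing in for the identically seeded FastCGI processes, and spawn_workers and random_name are hypothetical helpers, not Django or pici.se code.

```python
import random
import string


def spawn_workers(parent_rng, count):
    """Give each simulated worker a copy of the parent's generator state."""
    workers = []
    for _ in range(count):
        child = random.Random()
        child.setstate(parent_rng.getstate())
        workers.append(child)
    return workers


def random_name(rng, length=8):
    """Return a random base name drawn from the given generator."""
    return "".join(rng.choice(string.ascii_letters) for _ in range(length))


parent = random.Random()
worker1, worker2 = spawn_workers(parent, 2)

small_picture = random_name(worker1) + ".jpg"             # step 1
large_picture = "large/" + random_name(worker2) + ".jpg"  # step 2

# Step 3: the old check only looked at the un-prefixed path, so the
# "large/" copy was never seen as a collision.
# Step 4: thumbnails drop the "large/" prefix and land on the same path.
thumb1 = "thumbs/" + small_picture
thumb2 = "thumbs/" + large_picture.split("/", 1)[1]
```

Because both workers start from the same state, thumb1 and thumb2 are the same path, which is the overwrite described in the post.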

Fixing the bug in my code was rather trivial, but I also patched my Django installation to avoid any other weird issues due to non-random seeds. The patch currently available for the bug should however not be used, since it uses time.ctime() as a random seed for each request, and the string returned by ctime() only changes once a second. That means subsequent requests handed to the same FastCGI process during the same second will be given the same seed. time.time() would be better, so I patched my installation with pretty much the same thing but using the following instead.

random.seed("%d%f" % (getpid(), time())) 

This seems to generate random values for each request as far as my testing goes.
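The difference between the two seeds can be checked with fixed timestamps. The sketch below is an illustration rather than the Django patch itself: ctime_seed and pid_time_seed are hypothetical names, and the timestamp is arbitrary. It compares two requests that arrive half a second apart within the same wall-clock second.

```python
import os
import random
import time


def ctime_seed(now):
    """Seed in the style of the patch advised against: one value per second."""
    return time.ctime(now)


def pid_time_seed(now, pid):
    """Seed in the style used in the post: PID plus fractional timestamp."""
    return "%d%f" % (pid, now)


def first_value(seed):
    """First number produced by a generator seeded with the given value."""
    return random.Random(seed).random()


pid = os.getpid()
t0 = 1240790400.25  # arbitrary timestamp
t1 = t0 + 0.5       # half a second later, still the same second

same_with_ctime = first_value(ctime_seed(t0)) == first_value(ctime_seed(t1))
same_with_time = (
    first_value(pid_time_seed(t0, pid)) == first_value(pid_time_seed(t1, pid))
)
```

Here same_with_ctime is True, since both requests get the same ctime() string, while the PID-plus-time seeds differ in their fractional part and so give different sequences.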

buffi Programming & scripting, Python, Web development

Migrating Django from Apache to lighttpd using FastCGI

February 3rd, 2009

I run a medium-traffic image hosting site built on Django which serves about 8,000-10,000 pageviews per day. I have been using the recommended deployment method (Apache + mod_python) for just under two years, and it has mostly worked well. However, about two months ago I started to notice increasing delays from the server, and Apache would occasionally fail in spectacular ways which brought the CPU load to 100% for long periods of time.

I have always used lighttpd to serve static content, and since I don’t really enjoy the Apache configuration syntax I decided to give the lighttpd + FastCGI deployment method a try. I expected this migration to be complicated, but it took me less than an hour to figure it out. I have briefly documented my changes below in case you are interested in the basics of how to handle an Apache -> lighttpd Django migration.

Apache configuration

My old configuration, using Apache, looked like this (some stuff omitted).

<VirtualHost *>
        ServerName pici.se
        DocumentRoot /var/www
        ErrorLog /var/log/apache2/pici_error.log
        ServerAlias pici
        <Location "/">
            PythonPath "['/home/buffi/site'] + sys.path"
            SetHandler python-program
            PythonHandler django.core.handlers.modpython
            SetEnv DJANGO_SETTINGS_MODULE pici.settings
            PythonDebug On
        </Location>
        <Location "/css/"> SetHandler None </Location>
        <Location "/js/"> SetHandler None </Location>
        <Location "/im/"> SetHandler None </Location>
        <Location "/picisendfiles/"> SetHandler None </Location>
        <Location "/pictures/"> SetHandler None </Location>
        <Location "/thumbs/"> SetHandler None </Location>

        # Static content served with lighttpd.
        RewriteEngine on
        RewriteRule  ^/pictures(.*) http://static.pici.se:8080/pictures$1
        RewriteRule  ^/thumbs(.*) http://static.pici.se:8080/thumbs$1
</VirtualHost>

lighttpd configuration

The corresponding lighttpd configuration became:

$HTTP["host"] =~ "pici\.se" {
    server.document-root = "/home/buffi/site/pici/"
    fastcgi.server = (
        "/pici.fcgi" => (
            "main" => (
                "socket" => "/home/buffi/site/pici/pici.sock",
                "check-local" => "disable",
            )
        ),
    )

    alias.url = (
        "/css/" => "/home/buffi/site/pici/picipage/css/",
        "/js/" => "/home/buffi/site/pici/picipage/js/",
        "/im/" => "/home/buffi/site/pici/picipage/im/",
        "/thumbs/" => "/var/www/static/thumbs/",
        "/pictures/" => "/var/www/static/pictures/",
        "/picisendfiles/" => "/var/www/picisendfiles/",
    )

    url.rewrite-once = (
        "^(/css.*)$" => "$1",
        "^(/im.*)$" => "$1",
        "^(/js.*)$" => "$1",
        "^(/picisendfiles.*)$" => "$1",
        "^(/thumbs.*)$" => "$1",
        "^(/pictures.*)$" => "$1",
        "^(/.*)$" => "/pici.fcgi$1",
    )
}

You might notice the path of a FastCGI socket.

"socket" => "/home/buffi/site/pici/pici.sock",

To create this socket, simply use the FastCGI script available through manage.py. I create my socket using the following command.

./manage.py runfcgi method=prefork socket=/home/buffi/site/pici/pici.sock pidfile=pici.pid

All Django requests will then be forwarded from lighttpd to the FastCGI daemon.
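Before pointing lighttpd at the socket, it can be worth confirming that the FastCGI daemon actually created it. The following is a minimal sketch using only the standard library. The is_unix_socket helper is a hypothetical name, and the check only tells you that a socket file exists at the path, not that a healthy process is listening on it.

```python
import os
import stat


def is_unix_socket(path):
    """Return True if path exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file, unreadable directory, and so on.
        return False
    return stat.S_ISSOCK(mode)
```

With the setup from this post the call would be is_unix_socket("/home/buffi/site/pici/pici.sock"), run after starting runfcgi.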

Conclusion

By migrating from Apache to lighttpd I noticed a nice performance boost and got a more enjoyable syntax for my configuration. I haven’t really bothered to measure the decrease in CPU load, but it is easily noticeable and my server doesn’t become as sluggish as before during heavy load. I’ve used lighttpd + FastCGI for about a month or so now, and everything seems stable. I’d recommend any Django developer using Apache to give it a try.

buffi Programming & scripting, Python, Web development