How to get random rows from mysql using django without hurting your server
I did not know this, and tried getting random rows in django by using order_by(”?”), which uses ORDER BY RAND() to get a set of 32 random rows from one of my tables containing about 200 000 rows. This turned out to be a really bad idea, and it pretty much violated my server, having mysql consuming all of my CPU and most of my memory. Going through the slow queries log showed this:
# Query_time: 70 Lock_time: 0 Rows_sent: 0 Rows_examined: 0 SELECT `picipage_picture`.`id`,`picipage_picture`.`name`,`picipage_picture`. `header`,`picipage_picture`.`description`,`picipage_picture`.`uploadednick_id`, `picipage_picture`.`uploadedip`,`picipage_picture`.`views`,`picipage_picture`. `timestamp`,`picipage_picture`.`gallery_id`,`picipage_picture`.`private`, `picipage_picture`.`privid`,`picipage_picture`.`camera` FROM `picipage_picture` WHERE (`picipage_picture`.`private` = 0) ORDER BY RAND() LIMIT 32;
A query time of 70 is pretty much insane (the limit for a query to be concidered slow is by default 2).
The blog post mentioned at the top mentions how to avoid this, and I’m gonna go ahead and post the djangoed version of the solution.
Getting a single row is simple.
random_pic = Picture.objects.order_by("?")[0] # Slow!
becomes
from random import randint num_pics = Picture.objects.count() random_pic = Picture.objects.all()[randint(0, num_pics-1)] # Fast!
Getting a set of random objects is harder.
random_pics = Picture.objects.order_by("?")[:32] # SUPER slow!
becomes
from random import sample PICS_TO_GET = 32 num_pics = Picture.objects.count() # Get a bunch of extra numbers, to avoid missing ID's # Assumes enough rows rand_nums = sample(xrange(1,num_pics), PICS_TO_GET*10) # Match ID's of pictures to the sampled list random_pics = random_pics.filter(id__in = rand_nums)[:PICS_TO_GET] # Fast!
This is messy, but really fast and works beautifully on tables with a bunch of rows (but there are probably nicer ways to do it).
Thanks goes out to mattmcc in #django@freenode for pointing me towards the blog entry.
Recent Comments