premcache: caching and precaching with memcached
Courtenay : October 14th, 2006
Let's imagine that you have a list view of some data which takes a long time to generate: one second or so to suck the data from the database. This is unreasonable for a site of any size, and you certainly don't want to re-run the query every time someone hits the page.
The easy solution here is to cache the data, but today I'm going to show you a trick to _precache_ the data so that each page load is blazing fast without sacrificing user experience.
I'm going to assume you know how to use memcached. Note that you can of course use all types of caching, but for the purposes of this exercise we're assuming they don't exist.
h2. step one: some handy memcached code
This is the land of ruby, and we have blocks to do our abstractification. This helper is similar to what other memcached libraries out there provide:
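module CabooseMemcached
  # Fetch key from memcached, or run the block and cache its result for
  # timeout seconds. Called without a block, it deletes the cached item.
  def memcache_me(key, timeout = 600)
    unless defined?(CACHE) # no memcache
      return block_given? ? yield : nil
    end
    # no block given, just delete the cache
    unless block_given?
      CACHE.delete(key)
      return
    end
    # cache hit, block is not evaluated
    # (assign first: "return results if results = ..." raises a NameError)
    results = CACHE.get(key)
    return results if results
    # otherwise, set the cached item
    results = yield
    CACHE.set(key, results, timeout)
    results
  end
end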
So, now we can do things like:
@blah = memcache_me('foo') { Blah.find(23) }
memcache_me('foo') # delete the cache item
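To try the helper without a running memcached server, here is a minimal sketch that swaps in a hash-backed fake for `CACHE`. `FakeCache` and `SlowQuery` are hypothetical names invented for this example, and the helper is restated in a trimmed form (with the hit check written as two statements) so the block runs on its own.

```ruby
# Hash-backed stand-in for a memcached client (hypothetical; it ignores expiry).
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value, _timeout = 0)
    @store[key] = value
  end

  def delete(key)
    @store.delete(key)
  end
end

CACHE = FakeCache.new

# Same contract as memcache_me: fetch, or compute and store, or delete.
def memcache_me(key, timeout = 600)
  unless block_given?
    CACHE.delete(key)
    return
  end
  results = CACHE.get(key)
  return results if results
  results = yield
  CACHE.set(key, results, timeout)
  results
end

# Hypothetical slow query that counts how often it really runs.
module SlowQuery
  @calls = 0
  class << self
    attr_reader :calls

    def run
      @calls += 1
      "monkey #23"
    end
  end
end

memcache_me('foo') { SlowQuery.run } # miss: the block runs
memcache_me('foo') { SlowQuery.run } # hit: the block is skipped
memcache_me('foo')                   # delete the cached item
memcache_me('foo') { SlowQuery.run } # miss again: the block runs
```

One thing the sketch makes visible: a block that returns nil or false is never seen as a hit, so it is re-run on every call.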
Back to the controller, where we're running some expensive queries.
class MonkeysController
  def list
    @monkey_pages, @monkeys = paginate(:monkeys, :per_page => 25, :select => '...')
  end
end
Now we can memcache the result set so that the next time they hit that page, it's not going to hit the db at all.
class MonkeysController
  def list
    @monkey_pages, @monkeys = memcache_me("monkeys_page#{params[:page].to_i}") {
      paginate(:monkeys, :per_page => 25, :select => '...')
    }
  end
end
Victory! However, each initial page load still takes that whole second to generate, and your app listener process will be blocked and unresponsive until it's completed. If you have 5 app listeners and 5 users visiting that page, your entire site will be 'down' until the query is finished.
* note: if your data is user-context sensitive, you should make that key something like "monkeys_page#{params[:page].to_i}_#{current_user.login}"
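One way to keep such keys consistent is to build them in a single place. The helper below is only a sketch: `monkeys_cache_key` is a hypothetical name, and it assumes a missing or zero page should share a key with page 1. It also replaces whitespace and control characters in the login, since memcached does not accept those in keys.

```ruby
# Build the cache key for one page of monkeys, optionally per user.
def monkeys_cache_key(page, login = nil)
  page_number = [page.to_i, 1].max # nil, "" and "0" all mean the first page
  key = "monkeys_page#{page_number}"
  # memcached keys may not contain whitespace or control characters
  key += "_#{login.to_s.gsub(/[\s[:cntrl:]]/, '_')}" if login
  key
end

monkeys_cache_key(nil)                  # key for the first page
monkeys_cache_key("3")                  # key for page 3
monkeys_cache_key(2, "curious george")  # per-user key, space replaced
```

Without the normalisation, a request with no page parameter and a request for page 1 would end up under two different keys and be cached twice.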
h2. precache the next page's results
Since memcached is essentially a global shared object storage, we can preload the next page's data so that when the user hits that page, the data will already be in the cache.
Your controller action now looks something like this:
class MonkeysController
  def list
    params[:page] ||= 1
    @monkey_pages, @monkeys = memcache_me("monkeys_page#{params[:page].to_i}") {
      paginate(:monkeys, :per_page => 25, :select => '...')
    }
    # the precache must use the next page's key, or it is always a cache hit
    next_page = params[:page].to_i + 1
    memcache_me("monkeys_page#{next_page}") {
      params[:page] = next_page # hack to get the next page number
      paginate(:monkeys, :per_page => 25, :select => '...')
    }
  end
end
But wait! This takes _twice_ as long for the first page, and just as long as before for each successive page, because every request now loads the following page inline. What's the use in that?
This is where we take advantage of memcached's "global" memory storage.
h2. fire and forget
Some of this code is taken from the excellent "daemonize":http://grub.ath.cx/daemonize/ library, and "_why":http://redhanded.hobix.com/inspect/iThoughtProcessDetachWasMyFriend.html
def fire_and_forget(&block)
  pid = fork do
    begin
      yield
    ensure
      Process.exit!
    end
  end
  Process.detach pid
end
This fires up another ruby (rails!) process, an evil clone of your current process, does your bidding, then dies.
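To see the parent carry on while the clone does the work, here is a small sketch that runs outside Rails. It assumes a Unix-like system where `fork` is available. The file in a temporary directory is a hypothetical stand-in for memcached (something both processes can see), and `slow_precache` and `demo_fire_and_forget` are names made up for the demo.

```ruby
require 'tmpdir'

# Run the block in a forked child and return immediately in the parent.
# Returns the thread from Process.detach so that a caller may wait on it.
def fire_and_forget
  pid = fork do
    begin
      yield
    ensure
      Process.exit! # leave without running the parent's at_exit handlers
    end
  end
  Process.detach(pid)
end

# Hypothetical slow job: pretend to run the query, then publish the result.
def slow_precache(path)
  sleep 0.5
  File.write(path, "page 2 data")
end

def demo_fire_and_forget
  Dir.mktmpdir do |dir|
    path = File.join(dir, "next_page")
    reaper = fire_and_forget { slow_precache(path) }
    written_early = File.exist?(path) # checked while the child is still sleeping
    reaper.join                       # only the demo waits; a controller would not
    { written_early: written_early, data: File.read(path) }
  end
end

result = demo_fire_and_forget
puts "written before the child finished? #{result[:written_early]}"
puts "data left behind by the child: #{result[:data]}"
```

`Process.detach` is what keeps the finished child from lingering as a zombie. The demo joins its thread only so that it can read the result.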
h2. the final code
class MonkeysController
  def list
    params[:page] ||= 1
    @monkey_pages, @monkeys = memcache_me("monkeys_page#{params[:page].to_i}") {
      paginate(:monkeys, :per_page => 25, :select => '...')
    }
    # the precache must use the next page's key, or it is always a cache hit
    next_page = params[:page].to_i + 1
    fire_and_forget do
      memcache_me("monkeys_page#{next_page}") do
        params[:page] = next_page # hack to get the next page number
        paginate(:monkeys, :per_page => 25, :select => '...')
      end
    end
  end
end
Now every page after the first loads immediately. Each request fires off a background process that loads the next page's data into memcached, so when the user hits that next page, the data is already sitting in memcached waiting for consumption.
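The whole flow can be simulated without Rails or memcached. In the sketch below everything is a hypothetical stand-in: `FileCache` plays the part of memcached (one file per key, visible to both processes), `load_page` is the slow query, and `list` is the controller action. As a small variation on the code above, it skips the fork when the next page is already cached. It assumes a Unix-like system with `fork`.

```ruby
require 'tmpdir'

# Hypothetical file-backed store standing in for memcached.
class FileCache
  def initialize(dir)
    @dir = dir
  end

  def get(key)
    path = File.join(@dir, key)
    File.exist?(path) ? Marshal.load(File.binread(path)) : nil
  end

  def set(key, value)
    # write then rename, so a reader never sees a half-written file
    tmp = File.join(@dir, "#{key}.#{Process.pid}.tmp")
    File.binwrite(tmp, Marshal.dump(value))
    File.rename(tmp, File.join(@dir, key))
    value
  end
end

# Return [value, :hit] from the cache, or run the block and return [value, :miss].
def fetch(cache, key)
  cached = cache.get(key)
  return [cached, :hit] if cached
  [cache.set(key, yield), :miss]
end

def load_page(page) # hypothetical slow query
  sleep 0.1
  "monkeys #{(page - 1) * 25 + 1}..#{page * 25}"
end

# Serve one page, then precache the next page in a forked child.
def list(cache, page)
  page = [page.to_i, 1].max
  _data, status = fetch(cache, "monkeys_page#{page}") { load_page(page) }
  next_page = page + 1
  reaper = nil
  unless cache.get("monkeys_page#{next_page}") # skip the fork when already warm
    pid = fork do
      begin
        fetch(cache, "monkeys_page#{next_page}") { load_page(next_page) }
      ensure
        Process.exit!
      end
    end
    reaper = Process.detach(pid)
  end
  [status, reaper]
end

def simulate_requests
  Dir.mktmpdir do |dir|
    cache = FileCache.new(dir)
    first, reaper = list(cache, 1)  # page 1 is computed inline
    reaper.join if reaper           # only the demo waits for the child
    second, reaper = list(cache, 2) # page 2 was loaded by the child
    reaper.join if reaper
    [first, second]
  end
end

p simulate_requests
```

The first request pays for its own query, and the second finds its data already in place.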
h2. finally
A variation of this code is in production and gets hojillions of hits every day. Some of it may have gotten munged in the telling. I kinda made up the pagination stuff. Comments? Bugs?
update(1) : you may need to reconnect to the database. the app in question actually uses an xmlrpc backend which is beyond the scope of this article, so we never had to deal with connections. completely untested.
update(2) : if you want logging, you need to reopen the logger class, because file handles don't stay open in the fork
update(3) : this'd be an awesome way to reduce the startup lag when, say, spawning mongrel clusters.. PDI!
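For updates (1) and (2), one possible shape is to give `fire_and_forget` a hook that runs in the child before the work starts, so the child can open its own handles. This is a sketch only: the `after_fork` parameter and `demo_after_fork` are hypothetical names, a plain log file stands in for the Rails logger, and the database reconnect appears only as a comment because it has not been tried here.

```ruby
require 'tmpdir'

# fire_and_forget with a hypothetical after_fork hook. The hook runs in the
# child before the work, so the child can set up its own resources.
def fire_and_forget(after_fork: nil)
  pid = fork do
    begin
      after_fork.call if after_fork
      yield
    ensure
      Process.exit!
    end
  end
  Process.detach(pid)
end

def demo_after_fork
  Dir.mktmpdir do |dir|
    log_path = File.join(dir, "precache.log")
    log = nil
    reopen = lambda do
      # A Rails app might also reconnect to the database here, for example
      # with ActiveRecord::Base.connection.reconnect! (untested in this sketch).
      log = File.open(log_path, "a")
      log.sync = true # Process.exit! does not flush buffered output
    end
    reaper = fire_and_forget(after_fork: reopen) do
      log.puts "precache ran in pid #{Process.pid}"
    end
    reaper.join # only the demo waits for the child
    { parent_pid: Process.pid, line: File.read(log_path).chomp }
  end
end

result = demo_after_fork
puts "parent pid: #{result[:parent_pid]}"
puts result[:line]
```

The log handle is opened in the child only, so the parent's `log` variable stays nil and nothing is shared between the two processes.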
6 Responses to “premcache: caching and precaching with memcached”
October 19th, 2006 at 08:47 AM
Great Writeup! I think the same can be done for the root pages as part of the application startup. So, I’d prolly do something like an ‘initialcache’ which would read a list of pages/queries to load when we’re starting the application. Might be better to decouple it from the mongrel cluster setup but that is six of one half dozen of the other.
Sometimes you just want to restart your cluster, but not really mess with backend cached data, sometimes.. oh well.. I’m rambling now :)
October 10th, 2007 at 09:05 AM
this code is amazing. exactly what i was looking for.
October 11th, 2007 at 06:58 AM
It'll be the default way of doing Rails clusters once Ruby can do copy on write properly, because forking will save a ton of memory.
October 13th, 2007 at 06:38 AM
Hi, I'm trying to do caching with this code but it's not working. The first time it works, but in the log file I can't see anything like a cache set, and after the first try it gives me an error like "undefined class/module Regobject". Can anyone help me please? Thanks in advance.
November 13th, 2008 at 03:35 AM
This is a very interesting approach. I tried it in a rails 2.1 application (using Rails.cache), and got pretty nice speedups initially. However, after the first few requests, this causes all kinds of weird errors, which ensured that this code didn't make it to our production servers.
If I do reopen the DB connection (using ActiveRecord::Base.connection.reconnect!), I get "DB server unexpectedly closed the connection" messages after the fourth or so pre-cache run... definitely not what I'd expect (-:
If I don't reopen the DB connection, it actually makes it to about 8 pre-cache runs, until I get ruby compilation errors in my application's files when running in development mode; when I turn on class and template caching, it gets even further, and then dies with unexpected memcache messages: I suppose the memcache client connection will have to be reset, as well.
Right now, I don't know if the horrible entrail-groping inside Rails that this approach requires is worth the speed gains. )-: