Best Way to Combine Fragment and Object Caching for Memcached and Rails

Evan Weaver's Interlock Plugin solves this problem.

You can also implement something like this yourself fairly easily if you need different behavior, such as finer-grained control. The basic idea is to wrap your controller code in a block that is only executed if the view actually needs that data:

# in FooController#show
@foo_finder = lambda { Foo.find_slow_stuff }

# in foo/show.html.erb
<% cache 'foo_slow_stuff' do %>
  <% @foo_finder.call.each do |foo| %>
    ...
  <% end %>
<% end %>

If you're familiar with the basics of Ruby metaprogramming, it's easy enough to wrap this up in a cleaner API to your taste, as in the sketch below.
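
As a rough illustration, here is one way such a wrapper might look. This is a minimal sketch, not an established API; the LazyViewData module and view_data method names are hypothetical:

# A hypothetical concern that turns controller blocks into lazily-evaluated,
# memoized helper methods, so the query only runs on a cache miss.
module LazyViewData
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Defines an instance method `name` whose block runs at most once,
    # and only when the view actually calls it.
    def view_data(name, &block)
      define_method(name) do
        @_lazy_view_data ||= {}
        @_lazy_view_data[name] ||= instance_eval(&block)
      end
      helper_method name
    end
  end
end

class FooController < ApplicationController
  include LazyViewData

  view_data(:slow_foos) { Foo.find_slow_stuff }

  def show
  end
end

In the view, slow_foos then replaces @foo_finder.call inside the cache block.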

This is superior to putting the finder code directly in the view:

  • keeps the finder code where developers expect it by convention
  • keeps the view ignorant of the model name/method, allowing more view reuse

I think cache_fu might have similar functionality in one of its versions/forks, but I can't recall specifically.

The advantage you get from memcached is directly related to your cache hit rate. Take care not to waste your cache capacity and cause unnecessary misses by caching the same content multiple times. For example, don't cache a set of record objects as well as their HTML fragment at the same time. Generally fragment caching will offer the best performance, but it really depends on the specifics of your application.

When doing fragment caching, should we cache ActiveRecord queries as well?

Normally one should not put business logic or queries in views, but in this case it's possible to make an exception. Just define a dedicated method for your query, for instance Video.your_method, and use it in the view. This seems to be the cleanest way to do it:

<% cache 'videos_and_photos', :expires_in => 24.hours do %>
  <div id="videos">
    <% Video.your_method.each do |video| %>
      ...
    <% end %>
  </div>
<% end %>

Otherwise you are caching data which belongs together in two different places, which may lead to unpredictable results.
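
For completeness, your_method could be a simple class method (or named scope) on the model. The query body here is just a hypothetical placeholder:

class Video < ActiveRecord::Base
  # Hypothetical query behind the cached fragment above.
  def self.your_method
    where(:published => true).order("created_at DESC").limit(10)
  end
end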

Pagination and memcached in Rails

When it comes to caching there is no easy solution. You might cache every variant of the result, and that's fine if you implement auto-expiration of entries. You can't just use an all_posts key, because then you would have to expire dozens of keys whenever a post changes.

Every AR model instance has a .cache_key method based on updated_at, which is the preferred building block, so use that rather than rolling your own. Don't base your key on the last record alone, though, because if a post in the middle gets deleted, your key won't change. You can use logic like this instead:

class ActiveRecord::Base
  def self.newest
    order("updated_at DESC").first
  end

  def self.cache_key
    newest.nil? ? "0:0" : "#{newest.cache_key}:#{count}"
  end
end

Now you can use Post.cache_key, which will change whenever any post is created, updated, or deleted.
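
As a sketch of how this key might combine with pagination (the key layout and page handling here are assumptions, not an established convention):

<% cache "posts/#{Post.cache_key}/page-#{params[:page] || 1}" do %>
  ...
<% end %>

Because the key changes when any post changes, stale page variants never need manual expiry; they simply age out of memcached via LRU eviction.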

In general I would just cache Post.all and then paginate over that object; see the sketch below. You really need to do some profiling to find the bottlenecks in your application.
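
A minimal sketch of that approach, assuming plain array slicing for pagination (the key layout, expiry, and page size are arbitrary choices here):

# e.g. in PostsController#index -- rebuild the cached list only when
# Post.cache_key changes; old entries age out of memcached on their own.
posts = Rails.cache.fetch("all_posts/#{Post.cache_key}", :expires_in => 1.hour) do
  Post.all.to_a
end

page     = (params[:page] || 1).to_i
per_page = 20
@posts   = posts[(page - 1) * per_page, per_page] || []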

Besides, if you want to cache every variant, then do fragment/page caching instead.

It's up to you how and where to cache; there is no single right way here.

As for the second part of the question, there are far too few hints for me to figure out an answer. Check whether the browser is making a request at all (LiveHTTPHeaders, tcpdump, etc.).

Why is :memory_store cache faster than :dalli_store for memcached? Should I just stick with :memory_store?

From the rails caching guide regarding memory_store:

This cache store keeps entries in memory in the same Ruby process.

You'd expect that to be pretty damn quick for some benchmarks. It doesn't have to put any effort into writing to or reading from any other kind of storage: the cached data is just there when it's needed.

However, you can't store much information that way before it becomes unwieldy. Your server processes will grow, and the cache will evict the oldest entries fairly quickly (the default size of this cache is 32 MB).
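
For reference, that size is configurable; a sketch (the 64 MB figure is an arbitrary choice):

# config/environments/production.rb
config.cache_store = :memory_store, { :size => 64.megabytes }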

Also, with multiple processes running, as you'd have for a production server, you would be duplicating cached information across each process. For example, if your home page is cached you'd need to cache it on each of the processes that run on the server.

You're also going to have problems manually expiring entries from a cache that lives inside the server processes: you'd need either to communicate with every running process to expire cached information, or to have each process check whether its information is stale before using it.

By using something like memcached or redis you can have all of your server processes accessing the same cache and you can have a much bigger cache meaning that cached information becomes stale a lot less frequently. You can write once to the cache and all server processes benefit and you can clear once and all processes know it is cleared.
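
For example, pointing Rails at a shared memcached instance via Dalli (the server address is a placeholder):

# config/environments/production.rb
config.cache_store = :dalli_store, 'localhost:11211', { :expires_in => 1.day }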

As you've identified, the trade-off is the raw speed of writing to and reading from the cache, but on a system of any real size that speed advantage isn't worth it compared to a shared, more capable cache store.


