System architecture best practices - memcached + webfarms

Timo Ewalds timo at tzc.com
Tue Jul 3 17:54:57 UTC 2007


One technique that should work, and that would make this easier (as you 
wouldn't have to pass the userid every time), would be to use a 
consistent hash on the key. (If you don't know what that is, see 
http://lists.danga.com/pipermail/memcached/2006-October/002919.html for 
how it can be applied to adding/removing servers nicely.) In the example 
below, "user17" is the same for the two keys, so if you can put them 
close on the number line, they're likely to go to the same server. I 
figured the way to do that would be to have the top 24 bits be a hash of 
the prefix (i.e. "user17"), and the low 8 bits a hash of the suffix 
(i.e. "name" or "location"). Assuming all the requests in a get_multi 
share the same prefix (which I've found to be true in my application), 
you'll only hit a couple of servers. The problem with this method, of 
course, is that it has the potential to be very unbalanced (i.e. some 
servers will be hit hard while others sit nearly empty), since the keys 
will be grouped much more strongly. I think the balance vs. grouping 
trade-off can be settled with a bit of experimentation with the hash 
function and with how many bits go to the prefix vs. the suffix.
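
A rough sketch of that split hash (my own illustration, not code from
any particular client library), using String::CRC32 for the hashing:

    use String::CRC32 qw(crc32);

    # Top 24 bits come from the prefix, low 8 bits from the suffix, so
    # "user17:name" and "user17:location" differ only in the low byte
    # and sit next to each other on the consistent-hash ring.
    sub split_hash {
        my ($key) = @_;
        my ($prefix, $suffix) = split /:/, $key, 2;
        my $hi = crc32($prefix) & 0xFFFFFF;                    # 24 bits of prefix hash
        my $lo = crc32(defined $suffix ? $suffix : '') & 0xFF; # 8 bits of suffix hash
        return ($hi << 8) | $lo;
    }

Giving more bits to the suffix hash spreads keys out (better balance);
giving more to the prefix hash groups them harder.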

Timo

Brad Fitzpatrick wrote:
> In the Perl client, at least, your key can contain an explicit
> numeric hashing value, which has two benefits: 1) it's quicker to do $int
> % $num_servers than do a crc32, and 2) locality... you tend to hit one
> server if most of the memcache objects you're requesting are, say, from the
> same user.
>
> So instead of:  (contrived example:)
>
>    $memc->get_multi("user17:name", "user17:location")
>
> which would do two crc32s, and likely hit two servers, we do:
>
>    $memc->get_multi([17, "user17:name"], [17, "user17:location"])
>
> No crc32s, and they'll both go to the same node.
>
> That's one trick.
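
As a sketch of what that buys you (illustrative only, not the real
Cache::Memcached internals; server addresses are made up): with the
[17, "user17:name"] form, the client can skip crc32 entirely and pick
the bucket straight from the supplied integer:

    use String::CRC32 qw(crc32);

    my @servers = ('10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211');

    # Accept either a plain "key" or [ $hash_value, "key" ];
    # only fall back to crc32 when no explicit hash was given.
    sub bucket_for {
        my ($key) = @_;
        my ($hash, $real_key) = ref $key eq 'ARRAY' ? @$key : (crc32($key), $key);
        return ($servers[$hash % @servers], $real_key);
    }

    # Both map through 17 % 3, so they pick the same server:
    bucket_for([17, "user17:name"]);
    bucket_for([17, "user17:location"]);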
>
> But even without that, your web nodes should already have all the TCP
> connections open to all the memcaches (or most of them), so then your
> get_multi implementation should just do a non-blocking write to all of
> them "at once" (actually in serial, but you're not waiting for a reply, so
> network latency, so it's pretty much "immediate"), then you select/epoll
> on them all at once, reading from them in order.  If your implementation
> does a serial get_multi to each one, then the network latency over 100
> requests will kill you, and you should fix the client API you're using.
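
A stripped-down illustration of the "write to all, then select on all"
pattern Brad describes (my sketch of the idea, not the actual client
code; it assumes %sock_for already holds open connections and skips the
timeout and error handling a real client needs):

    use strict;
    use warnings;
    use IO::Select;

    our %sock_for;   # "host:port" => already-connected socket, set up elsewhere

    sub parallel_get_multi {
        my ($keys_by_server) = @_;              # { "host:port" => [ keys... ] }
        my $sel = IO::Select->new;
        my (%server_of, %response);

        # Phase 1: fire off every request; don't wait for any replies yet.
        for my $server (keys %$keys_by_server) {
            my $sock = $sock_for{$server};
            syswrite($sock, "get @{ $keys_by_server->{$server} }\r\n");
            $sel->add($sock);
            $server_of{$sock}  = $server;
            $response{$server} = '';
        }

        # Phase 2: read from whichever sockets become readable, in any order.
        while ($sel->count) {
            for my $sock ($sel->can_read(1)) {
                my $server = $server_of{$sock};
                my $n = sysread($sock, my $chunk, 65536);
                if (!$n) { $sel->remove($sock); next; }   # EOF or error
                $response{$server} .= $chunk;
                # "END\r\n" terminates a get reply in the memcached text protocol.
                $sel->remove($sock) if $response{$server} =~ /END\r\n\z/;
            }
        }
        return \%response;   # raw per-server responses, still to be parsed
    }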
>
> So basically you can avoid hitting a lot of servers usually, but even if
> you have to, it shouldn't be _that_ bad.
>
> - Brad
>
>
> On Mon, 2 Jul 2007, Richard Jones wrote:
>
>   
>> I've been thinking about what changes we may have to make to our memcached
>> installation in future as we continue to grow - our webfarm is approaching
>> 100 servers, each with 4GB ram, ~2GB of which is dedicated to memcached on
>> each machine.
>>
>> As we add more nodes, the usefulness of get_multi decreases - it's possible
>> for a single page to hit almost all of the memcached instances. I read
>> somewhere that facebook partition their memcached cluster to improve
>> get_multi performance (eg, all user data on a subset of mc nodes). Can anyone
>> comment on the effectiveness of this?
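
(For a sense of scale, under a plain uniform hash, k keys spread over n
servers touch about n*(1 - (1 - 1/n)^k) distinct servers on average; a
hypothetical 300-get page against 100 nodes would be expected to hit
roughly 95 of them:)

    # Back-of-the-envelope check (illustrative numbers only):
    my ($n, $k) = (100, 300);                    # servers, gets per page
    printf "expect ~%.0f of %d servers per page\n",
        $n * (1 - (1 - 1/$n) ** $k), $n;
    # prints: expect ~95 of 100 servers per page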
>>
>> Are we fighting a losing battle here - perhaps we should try and cram as much
>> ram as possible into a small number of machines (what do you do?). get_multi
>> would be more useful, but it costs more and doesn't seem as elegant :(
>>
>> Can anyone comment on how many memcache lookups they make per page?
>> Traditionally our developers treated memcache lookups as "free", but when
>> you're doing a few hundred memcache gets per page, it soon adds up...
>>
>> Thanks,
>> RJ
>>
>>
>>
>> --
>> Richard Jones
>> Last.fm Ltd. | http://www.last.fm/
>> Office: +44 (0) 207 780 7080
>>
>>
>>     
>
>   

