Optimising use and using multiple memcache servers in a pool

Jason Edgecombe jedgecombe at carolina.rr.com
Sun Jan 21 00:14:39 UTC 2007


You're welcome.

I've only used memcache in toy applications, not in production. I've 
just been reading the mailing list for a while. ;)

I honestly don't know about the overhead of one vs multiple servers. It 
depends on the client API implementation.

According to
http://usphp.com/manual/en/function.memcache-connect.php

The connection to the extra server isn't made until it's used. A pconnect 
is supposed to reuse an existing persistent connection rather than rebuild it.

I would suggest benchmarking with 1, 2, and 3 memcache servers to see 
what the overhead times look like.

Theoretically, the client should only have to open a connection when you do 
a get. Given the server list and weights, it computes which server to 
connect to for each key.
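To make that concrete, here is a minimal sketch of the idea in PHP. This is a hypothetical stand-in, not the actual pecl-memcache hashing (which hashes differently and honours the per-server weights); the point is only that the key alone determines the server, so every client computes the same location on its own, with no central lookup.

```php
<?php
// Hypothetical sketch, not the real pecl-memcache algorithm (which
// hashes differently and honours per-server weights). The key alone
// determines the server, so every client independently agrees on
// where a given object lives.
function pick_server(array $servers, string $key): string
{
    $index = abs(crc32($key)) % count($servers);
    return $servers[$index];
}

$pool = array('server1:11211', 'server2:11211', 'server3:11211');

// A get of 'article:42' only needs a connection to this one server;
// connections to the other pool members can stay unopened.
$target = pick_server($pool, 'article:42');
```

Since only the chosen server needs a live connection, a single get never has to touch the rest of the pool.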

A hidden gotcha that comes up on the list occasionally is what happens 
when a server goes down. Do you leave the dead server in the pool and 
simply miss on its keys, or do you fail over, or remove it from the pool 
and have all of your keys rehash? If you fail over or rehash, then you may 
end up with two copies of the data, one of which may be stale.
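To see why rehashing bites, here is a toy illustration using a naive modulo placement over object ids (a deliberate simplification, not the real hashing): dropping one server from a three-server pool silently remaps most of the ids.

```php
<?php
// Toy illustration (not the real hashing): place object ids with a
// plain modulo over the pool size, then remove one dead server and
// count how many ids now map somewhere else.
function place(array $servers, int $object_id): string
{
    return $servers[$object_id % count($servers)];
}

$full    = array('server1', 'server2', 'server3');
$reduced = array('server1', 'server2');   // server3 removed from the pool

$moved = 0;
foreach (range(1, 6) as $id) {
    if (place($full, $id) !== place($reduced, $id)) {
        $moved++;   // this id now hashes to a different server
    }
}
// $moved is 4 of 6 here: when server3 comes back, the copies written
// at the new locations coexist with the old ones, and either can be stale.
```

The real client libraries shuffle fewer keys than a bare modulo does, but the underlying problem is the same: any remapping can leave two copies of an object alive in the pool at once.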

Jason


Alan Jay wrote:
>> Subject: Re: Optimising use and using multiple memcache servers in a pool
>>     
>
> Thanks Jason,
>
> That has cleared up one point (that I didn't know I needed to understand) :)
> and made the whole thing more understandable.
>
> I now have a better understanding of the issues with the network. Fortunately
> we have a private gigabit network between all our servers, and in any case all
> the database transactions already go over the network (so there is no change
> there).
>
> The advantage for us is one of RAM-based speed and of offloading the READs
> from the MySQL database into memcache.
>
> My one query is: 
>
> Is there an *overhead* in opening up a pool of memcached servers for each
> transaction that is significant compared to opening a single server?
>
> If (like our MySQL cluster) there were a single point of entry to the "cache",
> then the overhead of managing the pool would be handled at the "server".
>
> From what I can see from the limited documentation, creating a pool is
> something every client does for every transaction (is that correct?).
>
> I was slightly worried that the whole pool creation and management element
> added a level of overhead that (in my application) was not entirely necessary.
>
>
> At the moment I have taken the simple (basic) approach which is:
>
>         // check memcache to see if the object is available
>         $article_row = false;
>         if ($ds_server != "") {
>             $memcache_obj = memcache_connect($ds_server, 11211);
>             if ($memcache_obj !== false) {
>                 $article_row = memcache_get($memcache_obj, $id_str);
>             }
>         }
>         if ($article_row === false) {
>             // server down, or cache miss: get the data from MySQL
>         }
>
> As I understand it, I need to do this for every client and for every
> transaction that needs cache access.
>  
> Now if I was using a pool of servers I would have to:
>
>         $memcache_obj = memcache_connect($ds_server1, 11211);
>         if ($memcache_obj === false) {
>             // first server down: fall back to MySQL, or connect to
>             // another server and build the pool from there
>         } else {
>             // add_server() does not open a connection until a key maps
>             // to that server, so a down server is only noticed (and can
>             // be handled) when a get/set actually touches it
>             memcache_add_server($memcache_obj, $ds_server2, 11211, 1, 20);
>             memcache_add_server($memcache_obj, $ds_server3, 11211, 1, 80);
>             memcache_add_server($memcache_obj, $ds_server4, 11211, 1, 80);
>             memcache_add_server($memcache_obj, $ds_server5, 11211, 1, 80);
>             memcache_add_server($memcache_obj, $ds_server6, 11211, 1, 10);
>             memcache_add_server($memcache_obj, $ds_server7, 11211, 1, 10);
>
>             $article_row = memcache_get($memcache_obj, $id_str);
>         }
>
> But at first sight it looks like you need quite a lot of code to create the
> pool and to manage any servers that are down. Is this overhead a potential
> issue?
>
> (do something to deal with the server being down?) is probably something
> using memcache_set_server_params (possibly).
>
> I assume people have been through this issue but I haven't seen any good code
> examples floating around for the current pecl-memcache code base.
>
> Am I seeing too many potential problems in this :) or should it just work?
>
>   
>> ------------------------------
>> Hi Alan,
>>
>> After reading your email, I still sense some confusion. I just wanted to
>> throw out some examples to clarify things. I apologize if this is
>> unnecessary.
>>     
>
> Not unnecessary - very happy to get as much guidance as possible as there is
> limited user documentation. 
>  
>   
>> Memcache is a distributed cache.  Given three memcache servers and
>> objects number 1..9, the objects may be distributed as follows:
>> server1: 1, 4, 7
>> server2: 2, 5, 8
>> server3: 3, 6, 9
>>
>> (I don't know how the hashing algorithm works, this is a simplification)
>>
>> When a client asks for object 1, it fetches it from server1, even if the
>> client is running on server2 or server3.
>>     
>
> OK that makes sense and explains the advantages.
>  
>   
>> The main reason to run multiple memcache servers is to increase the size
>> of your cache to store more data or to ensure that only a portion of the
>> cache is lost if a server goes down.
>>     
>
> OK again that makes a great deal of sense.
>  
>   
>> The main overhead with memcache vs APC would be the network latency.
>>
>> If you have 3 memcache servers, then at worst, each client has a tcp
>> connection to each server (assuming one thread on each client).
>>     
>
> Sure, but is there an overhead to creating those three connections each time
> a client requests a page from the web server?
>  
>   
>> non-memcache workflow is as follows:
>> client read: select from db
>> client write: update db
>>
>> the typical memcache workflow is as follows:
>>
>> client read: fetch object from cache if available, else select from db.
>> put object in cache if it didn't exist.
>>   other clients will fetch the cached copy after it's cached
>>
>> client write: fetch object from db/cache, update db, update cache.
>>
>> The client write procedure can be a little tricky. Fetching directly
>> from the db avoids cache inconsistencies. memcache doesn't have
>> transactions, so I'm not sure if some type of "I'm updating the db"
>> token might be useful to store, or just have the updater overwrite
>> what's in the cache.
>>     
>
> For us I don't think this is an issue, as the client does not update the
> content. The editorial tool (which runs on a separate machine) has code that
> notes a change and resets the expiry time to a few seconds, so that the next
> client that asks for the file will get a new one, as the cached copy will have
> expired (at least that is the theory, and the tests seem to have worked).
>
> Once again thanks everyone for your comments and thoughts.
>
> Alan Jay
> www.digitalspy.co.uk 
>
>   



More information about the memcached mailing list