Which way is better for running memcached?
Steven Grimm
sgrimm at facebook.com
Sat Feb 17 00:56:03 UTC 2007
I'm not sure I understand the basis of the questions. There's no need to
"deal with" data spread across multiple machines -- in fact, the beauty
of memcached is that it *encourages* data to spread across multiple
machines, which serves as a pretty good load balancing mechanism.
If you call the "get keys A, B, and C" function/method in any of the
client libraries I'm familiar with, and keys A and C map to server 1
while key B maps to server 2, they'll all send "get A C" to server 1 and
"get B" to server 2, then wait for both servers to respond and return
the combined result to the application.
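
A minimal sketch of that routing, for illustration only. The names FakeServer, server_for, and multi_get are hypothetical and not taken from any particular client library. The hash-modulo mapping is an assumption, and the servers are queried one after another here, where a real client would typically send all the requests and then wait for the responses.

```python
import zlib
from collections import defaultdict


class FakeServer:
    """In-memory stand-in for one memcached instance (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.requests = []  # log of the key batches this server received

    def get_multi(self, keys):
        self.requests.append(list(keys))
        return {k: self.data[k] for k in keys if k in self.data}


def server_for(key, servers):
    # Assumption: plain hash-modulo mapping; real clients may hash differently.
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


def multi_get(keys, servers):
    # Group the requested keys by the server each one maps to.
    by_server = defaultdict(list)
    for key in keys:
        by_server[server_for(key, servers)].append(key)

    # One batched get per server, merged into a single result.
    result = {}
    for server, batch in by_server.items():
        result.update(server.get_multi(batch))
    return result
```

The application only ever sees the merged dictionary; which server held which key stays inside multi_get.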
Our application then looks up any non-cached data in the database, as
you surmise, and stores it in the cache. But that logic is independent
of figuring out which memcached servers to talk to -- it would work
exactly the same with one huge instance or with a thousand little ones.
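
As a rough sketch of that separation (not our actual code): the function below only needs something with get_multi and set methods, so it neither knows nor cares how many servers sit behind it. DictCache, fetch, and load_from_db are hypothetical names introduced for the example.

```python
class DictCache:
    """Hypothetical stand-in for a memcached client, backed by a dict."""

    def __init__(self):
        self.store = {}

    def get_multi(self, keys):
        return {k: self.store[k] for k in keys if k in self.store}

    def set(self, key, value):
        self.store[key] = value


def fetch(keys, cache, load_from_db):
    """Return values for keys, filling cache misses from the database.

    load_from_db is an assumed callable that takes a list of keys and
    returns a dict of the rows it found.
    """
    found = cache.get_multi(keys)
    missing = [k for k in keys if k not in found]
    if missing:
        loaded = load_from_db(missing)
        for key, value in loaded.items():
            cache.set(key, value)  # store it so the next request hits the cache
        found.update(loaded)
    return found
```

Swapping DictCache for a client that talks to one instance or to many would not change fetch at all.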
If you can explain what problem you're running into that leads to those
questions, maybe I'll be able to give you a more meaningful answer -- I
don't see the context in which those things would be of concern. I'm
especially unclear on the second question about data slices; what do you
mean by that?
-Steve
Don MacAskill wrote:
>
> Is your data perfectly divided so that every multi-get never touches
> more than one instance?
>
> If so, you just make sure somehow that your data slice never crosses
> 32GB or whatever your typical memcached instance has?
>
> If not, how do you deal with data spread across multiple memcached
> instances?
>
> Do you do a multi-get on one instance, see what's missing and issue
> single gets for the remaining data, falling back to some other
> disk-based store on those failures?
>
> Or issue multi-gets to each memcached instance and combine, then go to
> disk for non-cached requests?
>
> Or something else entirely?
>
> Thanks,
>
> Don
>
>
> Steven Grimm wrote:
>> We use it a lot. We divide the data for a given page into "stuff we
>> need immediately for the business logic that will change what other
>> data we need to fetch," "stuff we need for the business logic that we
>> can evaluate in isolation," and "stuff we're going to display." The
>> first gets fetched as needed during the execution of the page. The
>> second and third, we queue up internally and request all in one big
>> "get" just before rendering the page at the end of the request; for
>> the second class of data, we have a callback mechanism wrapped around
>> the memcached client so that we can run our business logic using some
>> of the returned data. There are some additional wrinkles but that's
>> the rough idea.
>>
>> By the way, it's not really any easier or harder in PHP than in any
>> other language; it's about application structure, not language. If we
>> were writing our site in Java or Python or C/C++ we'd probably do
>> exactly the same thing.
>>
>> -Steve
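
The "queue up and request all in one big get" idea from the quoted message could be sketched roughly as follows. This is an illustration under assumptions, not the wrapper described above: DeferredGets, queue, flush, and DictCache are hypothetical names, and the callback signature (key, value) is invented for the example.

```python
class DictCache:
    """Hypothetical stand-in for a memcached client, backed by a dict."""

    def __init__(self, store=None):
        self.store = dict(store or {})

    def get_multi(self, keys):
        return {k: self.store[k] for k in keys if k in self.store}


class DeferredGets:
    """Collect keys during a request and fetch them in one batched get."""

    def __init__(self, cache):
        self.cache = cache
        self.pending = []  # list of (key, callback-or-None)

    def queue(self, key, callback=None):
        # Nothing is fetched yet; the key is only remembered.
        self.pending.append((key, callback))

    def flush(self):
        # De-duplicate while keeping the order in which keys were queued.
        keys = list(dict.fromkeys(key for key, _ in self.pending))
        values = self.cache.get_multi(keys)
        # Run business-logic callbacks for the keys that were found.
        for key, callback in self.pending:
            if callback is not None and key in values:
                callback(key, values[key])
        self.pending = []
        return values
```

A page would call queue() wherever it discovers it needs a value and call flush() once, just before rendering.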