Which way is better for running memcached?
don at smugmug.com
Sat Feb 17 01:36:36 UTC 2007
You know, right after I sent this, I realized I was smoking crack. :)
We don't currently use multi-get, and one of the reasons is that the
first library we used (3 years ago?) didn't support multi-get across
multiple servers.
Only after I hit send did I realize that that ancient library didn't
even support multiple servers, period, which was the whole reason we
re-wrote the library ourselves. :)
It makes perfect sense that you could shove a bunch of keys at your
memcache library and it would use the hashed values of each key to create
multiple multi-get requests, one for each instance. Duh.
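That per-server batching can be sketched in a few lines (Python here, since the thread notes the idea is language-independent; the server list and hash function below are made up for illustration, not any particular client library's API):

```python
# Sketch: split one logical multi-get into one batch per server.
# The server addresses and crc32-mod placement are illustrative only.
import zlib
from collections import defaultdict

SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

def server_for(key):
    # Hash the key and take it mod the server count to pick an instance.
    return SERVERS[zlib.crc32(key.encode()) % len(SERVERS)]

def batch_by_server(keys):
    # Group keys so each server receives exactly one "get k1 k2 ..." request.
    batches = defaultdict(list)
    for key in keys:
        batches[server_for(key)].append(key)
    return dict(batches)

batches = batch_by_server(["A", "B", "C", "D"])
# Every key lands in exactly one batch, keyed by the server that owns it.
```

The client then sends each batch in parallel and merges the responses before returning to the application.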
We're not having any performance problems, so I guess I just haven't
revisited it yet. Sounds like a relatively easy reason to gain some
speed, though, so I probably should.
Sorry for wasting the list's time. :)
Steven Grimm wrote:
> I'm not sure I understand the basis of the questions. There's no need to
> "deal with" data spread across multiple machines -- in fact, the beauty
> of memcached is that it *encourages* data to spread across multiple
> machines, which serves as a pretty good load balancing mechanism.
> If you call the "get keys A, B, and C" function/method in any of the
> client libraries I'm familiar with, and keys A and C map to server 1
> while key B maps to server 2, they'll all send "get A C" to server 1 and
> "get B" to server 2, then wait for both servers to respond and return
> the combined result to the application.
> Our application then looks up any non-cached data in the database, as
> you surmise, and stores it in the cache. But that logic is independent
> of figuring out which memcached servers to talk to -- it would work
> exactly the same with one huge instance or with a thousand little ones.
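A minimal sketch of that read-through flow, with plain dicts and an invented `cache_multi_get` helper standing in for a real memcached client and database (names are illustrative, not any library's API):

```python
# Sketch of the read-through pattern: one batched get against the cache,
# then a database lookup only for the misses, writing those back.

def cache_multi_get(cache, keys):
    # Stand-in for the client's multi-get: returns only the keys it holds.
    return {k: cache[k] for k in keys if k in cache}

def read_through(cache, db, keys):
    found = cache_multi_get(cache, keys)
    missing = [k for k in keys if k not in found]
    for k in missing:
        value = db[k]      # fall back to the disk-based store
        cache[k] = value   # populate the cache for next time
        found[k] = value
    return found

cache = {"A": 1}
db = {"A": 1, "B": 2, "C": 3}
result = read_through(cache, db, ["A", "B", "C"])
# result holds all three keys; the cache now also holds B and C.
```

As the paragraph above says, this logic is independent of how many memcached instances sit behind the multi-get.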
> If you can explain what problem you're running into that leads to those
> questions, maybe I'll be able to give you a more meaningful answer -- I
> don't see the context in which those things would be of concern. I'm
> especially unclear on the second question about data slices; what do you
> mean by that?
> Don MacAskill wrote:
>> Is your data perfectly divided so that every multi-get never touches
>> more than one instance?
>> If so, you just make sure somehow that your data slice never crosses
>> 32GB or whatever your typical memcached instance has?
>> If not, how do you deal with data spread across multiple memcached
>> instances?
>> Do you do a multi-get on one instance, see what's missing, and issue
>> single gets for the remaining data, falling back to some other
>> disk-based store on those failures?
>> Or issue multi-gets to each memcached instance and combine, then go to
>> disk for non-cached requests?
>> Or something else entirely?
>> Steven Grimm wrote:
>>> We use it a lot. We divide the data for a given page into "stuff we
>>> need immediately for the business logic that will change what other
>>> data we need to fetch," "stuff we need for the business logic that we
>>> can evaluate in isolation," and "stuff we're going to display." The
>>> first gets fetched as needed during the execution of the page. The
>>> second and third, we queue up internally and request all in one big
>>> "get" just before rendering the page at the end of the request; for
>>> the second class of data, we have a callback mechanism wrapped around
>>> the memcached client so that we can run our business logic using some
>>> of the returned data. There are some additional wrinkles but that's
>>> the rough idea.
>>> By the way, it's not really any easier or harder in PHP than in any
>>> other language; it's about application structure, not language. If we
>>> were writing our site in Java or Python or C/C++ we'd probably do
>>> exactly the same thing.
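The queue-up-and-flush approach described above might look roughly like this; the class and names are invented for illustration, not the actual wrapper being described:

```python
# Sketch: accumulate keys during page execution, issue one combined get
# just before rendering, and run any registered callbacks on the results.

class DeferredGets:
    def __init__(self, fetch_many):
        self.fetch_many = fetch_many   # e.g. the client's multi-get
        self.pending = []              # (key, optional callback) pairs

    def queue(self, key, callback=None):
        self.pending.append((key, callback))

    def flush(self):
        keys = [k for k, _ in self.pending]
        values = self.fetch_many(keys)  # one big "get" for everything
        for key, callback in self.pending:
            if callback and key in values:
                callback(values[key])   # business logic on returned data
        self.pending = []
        return values

store = {"user:1": "don", "theme:1": "dark"}
d = DeferredGets(lambda keys: {k: store[k] for k in keys if k in store})
seen = []
d.queue("user:1", seen.append)   # "stuff we need for business logic"
d.queue("theme:1")               # "stuff we're going to display"
values = d.flush()
# seen collects the callback result; values holds both keys.
```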