C API
Brad Fitzpatrick
brad@danga.com
Fri, 16 Jan 2004 10:55:21 -0800 (PST)
First off, there's a lot of confusion here about hosts, groups, domains, etc.
Let's get some terminology down. Here's what I'd say:
host: a memcached instance, running on an IP+port.
group: a weighted group of hosts, probably weighted by the amount of
memory on each host
domain: a namespace id for keys on a host/group. For instance,
Slashcode had to add a key prefix so that multiple Slashcode
installs don't collide when hitting the same memcached servers.
memcached doesn't support this yet, so you have to do a
Slashcode-style hack for it to work.
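
A minimal sketch of the key-prefix workaround described above, done on the client side. The function name mc_domain_key and the ':' separator are assumptions made for illustration, not part of memcached or Slashcode.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<domain>:<key>" into out.
   Returns 0 on success, -1 if the result does not fit in outsz bytes. */
int mc_domain_key(char *out, size_t outsz, const char *domain, const char *key)
{
    assert(out != NULL && domain != NULL && key != NULL);
    int n = snprintf(out, outsz, "%s:%s", domain, key);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Two installs using different domain strings then produce different keys on the same servers, which is all the hack needs.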
You could make a client work on a host-level or group-level, but I'd argue
the group-level is what needs to be primarily exposed. Certainly if we
make an XS version of the Perl module (which we plan to), the underlying C
module must support groups to get the performance we want.
Now, your original API was very host-focused. This might be okay for
your application, but it's not for a general library. But the thing
is, it won't take much work to make a library support either way: just
make the handle object either represent a single machine or a group of
machines.
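
One way to picture a handle that covers both cases is sketched below. Only the name memcached_client appears in this thread; every field, the mc_host type, and mc_total_weight are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host entry: one memcached instance on an IP+port. */
typedef struct {
    const char *ip;
    unsigned short port;
    unsigned weight;        /* relative share of keys, e.g. memory size */
} mc_host;

/* Hypothetical handle layout: a single machine is a group of size one. */
typedef struct {
    const mc_host *hosts;
    size_t nhosts;
    const char *domain;     /* optional key prefix, may be NULL */
    long timeout_ms;        /* upper bound for waiting on a server */
} memcached_client;

/* Sum of the weights of all hosts in the handle. */
unsigned mc_total_weight(const memcached_client *mc)
{
    unsigned total = 0;
    assert(mc != NULL);
    for (size_t i = 0; i < mc->nhosts; i++)
        total += mc->hosts[i].weight;
    return total;
}
```

With this shape, a host-focused caller passes nhosts = 1 and everything else in the library stays the same.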
> > /* gets a handle for a group of servers. could set the servers here,
> > the domain, the time-out restrictions, etc. like the Perl API.
> > also, perhaps a function pointer to the memory allocator, if not
> > malloc. */
>
> This sounds fine, except that if you lose one server, you end up losing
> them all I would suspect (this is why I would really like to say that an
> object is in a particular pool and just purge only that pool on a
> failure).
Absolutely not.
You need to go read the Perl module source first. memcached is a hash of
hashes. The first layer is the client (Cache::Memcached), which maps a key
to a specific host in a group. The second layer is the memcached instance
(a host), which is a big hashtable and nothing else. Lose one host and you
only lose the keys that mapped to it, not the whole group.
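
The first layer of that hash can be sketched roughly as follows. This uses FNV-1a as a stand-in hash and a simple weighted-slot scheme; neither is claimed to be what Cache::Memcached actually does, and the names mc_hash and mc_pick_host are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 32-bit: a stand-in hash chosen for brevity. */
uint32_t mc_hash(const char *key)
{
    uint32_t h = 2166136261u;
    assert(key != NULL);
    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= 16777619u;
    }
    return h;
}

/* Returns the index of the host that owns key, or -1 if there are no
   usable hosts. A host with weight w owns w of the total slots. */
int mc_pick_host(const char *key, const unsigned *weights, size_t nhosts)
{
    unsigned total = 0;
    for (size_t i = 0; i < nhosts; i++)
        total += weights[i];
    if (total == 0)
        return -1;

    unsigned slot = mc_hash(key) % total;
    for (size_t i = 0; i < nhosts; i++) {
        if (slot < weights[i])
            return (int)i;
        slot -= weights[i];
    }
    return -1; /* not reached when total > 0 */
}
```

Since each key maps to exactly one host, a dead host only affects the keys in its own slots.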
> > /* gets a single item, using mc's allocator, or nothing. */
> > char* memcached_get(memcached_client* mc, const char *key);
>
> See this wouldn't work for me. I have buffers to fill in MySQL and don't
> want the library to allocate anything at all (no mallocs). Possibly have
> the return value be a ptr to the position in the buffer that
> memcached_client. Could pass in a size_t that could be used to say how
> much data is being returned. Still this solution wouldn't tell the user
> how many times they would have to call memcached_get() to complete a
> fetch.
This is a bizarre enough fetch model that you should make it a separate
function, so the common case has a simple API.
Something like:
memcached_get_partial_begin: you give it starting address and max size_t,
    it tells you the total size_t, and a handle to get more with, if
    there's more.
memcached_get_partial_continue: you give it that handle, and another
    buffer and size_t, it gives you another handle.
memcached_get_partial_end: releases stuff, given a handle.
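
To make the begin/continue/end shape concrete, here is a small in-memory sketch. The three function names come from the list above, but the signatures, the memcached_partial type, and memcached_get_partial_more are assumptions. The value argument stands in for bytes a real client would read from the socket, and the caller-owned cursor is updated in place instead of a new handle being returned each time, so the library does no mallocs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical caller-owned cursor; the library never allocates. */
typedef struct {
    const char *src;   /* stand-in for the pending server reply */
    size_t total;      /* full length of the value */
    size_t offset;     /* bytes already handed to the caller */
} memcached_partial;

/* Copies the next chunk (at most max bytes) and advances the cursor. */
static size_t partial_copy(memcached_partial *p, char *buf, size_t max)
{
    size_t left = p->total - p->offset;
    size_t n = left < max ? left : max;
    memcpy(buf, p->src + p->offset, n);
    p->offset += n;
    return n;
}

/* Starts a fetch: fills buf, reports the total size, returns bytes written. */
size_t memcached_get_partial_begin(memcached_partial *p,
                                   const char *value, size_t value_len,
                                   char *buf, size_t max, size_t *total)
{
    assert(p != NULL && value != NULL && buf != NULL && total != NULL);
    p->src = value;
    p->total = value_len;
    p->offset = 0;
    *total = value_len;
    return partial_copy(p, buf, max);
}

/* Nonzero while there is still data to fetch. */
int memcached_get_partial_more(const memcached_partial *p)
{
    assert(p != NULL);
    return p->offset < p->total;
}

/* Fetches the next chunk into another buffer; returns bytes written. */
size_t memcached_get_partial_continue(memcached_partial *p, char *buf, size_t max)
{
    assert(p != NULL && buf != NULL);
    return partial_copy(p, buf, max);
}

/* Releases the cursor; here that only means clearing it. */
void memcached_get_partial_end(memcached_partial *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}
```

Because begin reports the total size up front, the caller can work out how many continue calls remain for a given buffer size.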
> Separate fetch sounds better since you can keep calling it until it
> returns no data (and you can keep your buffers small).
>
> Is there any advantage to making a get call with multiple keys?
Hell yeah.... Latency!
If you're fetching 500 keys from a server, you can send all keys at once
and get all the replies in one round-trip, or you can do them all one at a
time and wait *at least* a half second (unacceptable).
The Perl module supports get_multi to a group of hosts, and splits the
keys up and does parallel get_multis to each host.
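
The half-second figure works out if each round trip costs about 1 ms: 500 sequential gets is roughly 500 ms, while one multi-key get is roughly one round trip. A sketch of building that single request for one host follows. The "get k1 k2 ...\r\n" line is the memcached text protocol's multi-key form; the function name mc_build_get_multi is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds "get k1 k2 ... kn\r\n" into out.
   Returns the request length, or -1 if it does not fit in outsz bytes. */
int mc_build_get_multi(char *out, size_t outsz,
                       const char *const *keys, size_t nkeys)
{
    assert(out != NULL && keys != NULL);

    int n = snprintf(out, outsz, "get");
    if (n < 0 || (size_t)n >= outsz)
        return -1;
    size_t used = (size_t)n;

    for (size_t i = 0; i < nkeys; i++) {
        n = snprintf(out + used, outsz - used, " %s", keys[i]);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, outsz - used, "\r\n");
    if (n < 0 || (size_t)n >= outsz - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

A group-level client would first bucket the keys by host, then build and send one such request per host in parallel.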
> > The "connect" stuff is an internal detail, not part of the public API.
> > You don't make clients deal with that. They just want to get/set and not
> > know from where.
>
> They need to know hosts and ports.
Yes, the hosts/ports/weights for servers are in the memcached_client
struct which each function takes. See the Perl module.
> > Also, you'll want internal functions which return the non-blocking
> > socket fds to wait on in your select loop. (memcached should never assume
> > a server is up or functioning quickly. your select timeout is specified
> > in *memcached_client.... something like a half second or a second at
> > most.)
>
> So a non-blocking cursor for a fetch method?
>
>
> Is a boolean really good enough for error? If I go do an add I may want
> to know why the failure occurred (aka was the object already there...
> did I get a chunk error?).
So if it returns false, call one of:
int get_error_code(memcached_client *mc);
char* get_error_string(memcached_client *mc);
char* get_error_string(int error_code);
(Like $dbh->err, $dbh->errstr, etc)
Don't make the state global to the library... put it in the mc handle
struct (which could represent a single host or a group).
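
A rough sketch of per-handle error state is below. The error codes, the last_error field, and mc_set_error are assumptions. C has no overloading, so the lookup by code gets its own hypothetical name, error_code_string, instead of sharing get_error_string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes. */
enum { MC_OK = 0, MC_ERR_EXISTS, MC_ERR_TIMEOUT, MC_ERR_SERVER };

/* Only the error-related field is shown; hosts, weights, etc. are omitted. */
typedef struct {
    int last_error;   /* lives in the handle, not in a library global */
} memcached_client;

/* Records the outcome of the most recent operation on this handle. */
void mc_set_error(memcached_client *mc, int code)
{
    assert(mc != NULL);
    mc->last_error = code;
}

int get_error_code(const memcached_client *mc)
{
    assert(mc != NULL);
    return mc->last_error;
}

/* Maps a code to a static message; no allocation involved. */
const char *error_code_string(int code)
{
    switch (code) {
    case MC_OK:          return "no error";
    case MC_ERR_EXISTS:  return "key already exists";
    case MC_ERR_TIMEOUT: return "server timed out";
    case MC_ERR_SERVER:  return "server error";
    default:             return "unknown error";
    }
}

const char *get_error_string(const memcached_client *mc)
{
    return error_code_string(get_error_code(mc));
}
```

Keeping the state in the handle means two handles (or two threads with their own handles) never overwrite each other's last error.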
- Brad