Hackathon notes (non-binary protocol thread)

Sat Jul 14 14:12:24 UTC 2007

Other stuff was done besides the binary protocol last Monday at FBHQ.

We had presentations from Facebook about how they use Memcached and
the status of libmemcache and the PHP client.  Also there was some
interesting discussion of how to use MySQL replication to propagate
cache invalidation to multiple data centers.

I transcribed the FB slides for their server presentation.  [FB folks--
let me know if it's okay to splash that info across the mailing
list..]

Here's some more notes from our initial brainstorming discussions:

* Pluggable Storage Backends 

We didn't discuss or even consider replacing the in-memory model with
any other form of storage (durable or not...)

* Dump/Restore functionality

Some discussion, mostly about pinning down the conditions where this
might be useful (for data that never expires and is long-term durable).

Basic idea was to generate protocol stream and feed that into another
memcached instance.
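
A rough sketch of what "generate protocol stream" could look like,
using the text protocol's set command.  How the server would walk its
own items was not discussed; the (key, flags, exptime, value) tuples
and the function name here are hypothetical stand-ins.  The exptime is
0 because the use case above is data that never expires.

```python
def dump_as_protocol_stream(items):
    """Render items as memcached text-protocol 'set' commands.

    items: iterable of (key, flags, exptime, value) tuples, where
    value is bytes.  This tuple layout is an assumption made for
    the sketch, not something memcached exposes.
    """
    chunks = []
    for key, flags, exptime, value in items:
        # Command line: set <key> <flags> <exptime> <bytes>\r\n
        header = f"set {key} {flags} {exptime} {len(value)}\r\n"
        # Data block follows, terminated by \r\n
        chunks.append(header.encode("ascii") + value + b"\r\n")
    return b"".join(chunks)


stream = dump_as_protocol_stream([
    ("user:1", 0, 0, b"alice"),
    ("user:2", 0, 0, b"bob"),
])
```

The resulting bytes could be written to a socket connected to the
second instance, which would answer each command with STORED.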

* Abstract Data Types

Some pushback on doing things like queues and such in the server.
Many people mentioned that these could be constructed using the
existing memcached get/set/add/incr/decr operations.

I suggested we might want to come up with a standard way for clients
to store abstract data types so we have some kind of cross platform
way of sharing data.  For example a queue can be constructed with 10
data buckets and a counter, all in memcache...

No action taken AFAIK.
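
To make the queue example concrete, here is a minimal sketch of 10
buckets plus a counter built only from get/set/add/incr.  FakeCache is
a hypothetical in-memory stand-in for a real client, and the key
layout and the reader-side cursor are my assumptions; nothing like
this was standardized.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def add(self, key, value):
        # Like memcached 'add': store only if the key is absent.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def incr(self, key, delta=1):
        if key not in self._data:
            return None
        self._data[key] = int(self._data[key]) + delta
        return self._data[key]


BUCKETS = 10  # ring size, from the "10 data buckets" in the notes


def enqueue(cache, name, value):
    """Claim the next sequence number and write into its bucket."""
    cache.add(f"{name}:tail", 0)          # create the counter once
    seq = cache.incr(f"{name}:tail")      # atomic on a real server
    cache.set(f"{name}:{seq % BUCKETS}", (seq, value))
    return seq


def dequeue(cache, name, cursor):
    """Return (value, new_cursor), or (None, cursor) if nothing new.

    cursor is the last sequence number this reader consumed
    (0 for a fresh reader); each reader tracks its own.
    """
    seq = cursor + 1
    slot = cache.get(f"{name}:{seq % BUCKETS}")
    if slot is None or slot[0] < seq:
        return None, cursor               # not written yet
    if slot[0] > seq:
        # The bucket was reused: this reader is over 10 items behind.
        raise LookupError("reader fell behind; items were overwritten")
    return slot[1], seq
```

This is a lossy ring: a slow reader loses data once the writer laps
it, and any bucket can be evicted by the LRU at any time, so it only
suits data you can afford to drop.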

* Replace slab allocator

There was some discussion on abstracting out slabs.c/slabs.h.  As a
proof of concept it was suggested that someone write a simplistic
free/malloc implementation to show how it could be done.

Another interesting possibility was using a 'bulldozer thread' to 
optimize memory storage.

No action taken AFAIK.

* Make I/O buffers count as mem usage.

We put that on the list of things to try.  I don't think anyone ran
with it.

* Multidimensional keys

There was some discussion on how multidimensional keys could be used
in practice.  Most people seemed to latch onto using it as a way to
expire large quantities of data when the underlying format changes due
to new releases etc.

It was mentioned that there was an FAQ entry on how to do this (fetch
a 'generation' prefix key and use that as part of your individual data
keys).  There was also talk of adding a generation identifier to the
protocol(?).  The main benefit was that this would allow you to expire
known stale data instead of having valid data expired out using the
LRU.

I don't know if anything was done on this other than discussion.
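
For reference, the 'generation' prefix trick can be sketched like
this.  FakeCache is a hypothetical in-memory stand-in for a real
client, and the "gen:<namespace>" key naming is my own choice, not
taken from the FAQ.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def add(self, key, value):
        # Like memcached 'add': store only if the key is absent.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def incr(self, key, delta=1):
        if key not in self._data:
            return None
        self._data[key] = int(self._data[key]) + delta
        return self._data[key]


def versioned_key(cache, namespace, key):
    """Build the real cache key from the current generation number."""
    cache.add(f"gen:{namespace}", 1)      # first use starts at 1
    gen = cache.get(f"gen:{namespace}")
    return f"{namespace}:{gen}:{key}"


def invalidate_namespace(cache, namespace):
    """Bump the generation so every old key becomes unreachable."""
    cache.add(f"gen:{namespace}", 1)
    return cache.incr(f"gen:{namespace}")


cache = FakeCache()
cache.set(versioned_key(cache, "profile", "42"), "old layout")
before = cache.get(versioned_key(cache, "profile", "42"))
invalidate_namespace(cache, "profile")    # e.g. after a new release
after = cache.get(versioned_key(cache, "profile", "42"))
```

Nothing is actually deleted here: the old entries just stop being
looked up and sit in memory until the LRU pushes them out, which is
the drawback that a generation identifier in the protocol was meant
to address.  Each lookup also costs an extra get for the generation
key unless the client caches it.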

* Documentation improvements

Some people said they were going to work on docs.  Not sure what
happened with that.

* Client Library improvements
* Wildcard (regex deletes)
* Cache replication

The pizza came before we could discuss these, and then we dived into
coding.

* Append functionality 

Steven and a few others worked through how to modify the memcached
support code.  His work is currently in the append branch.  I promised
to write some unit tests for this, but haven't gotten around to it
yet.

I spent the rest of the night cleaning up a few more things in trunk
and slogging through the Fedora RPM submission/build process.  I also
put in some hooks for eventual publication of internal docs with
Doxygen.

So... those are my notes, maybe others can contribute their experiences?

-- 
Paul Lindner        ||||| | | | |  |  |  |   |   |
lindner at inuus.com