Implementing Memcached on EC2
Erik Osterman
e at osterman.com
Fri Jan 5 23:51:55 UTC 2007
> I brought this question up to Brad (actually, asking more about the
> auth factor), and he brought up some good points.
>
> > But: how are nodes currently discovering the available
> > memcached servers? And don't you have rehashing issues if they're
> > coming and going? Or are you using consistent hashing on the
> > client side? Or just local single node caching? In which case,
> > what's wrong with 127.0.0.1?
> > So yes, auth solves a bit, but I'm curious how you plan to make
> > this work reliably when nodes are coming and going.
I think a lot of what we require with our implementation of Memcache on
EC2 goes beyond the scope of what Memcache was designed for and what
other people need. So, rather than proposing a way to modify Memcache,
I'd like to bounce some ideas off of you guys on how we can design a
scalable, fault-tolerant solution that meets our needs. We don't
really want to go about reimplementing Memcache, so we'd like to work
with the strengths of Memcache while at the same time addressing the
weaknesses.
Problems:
1. No way to ensure that all Memcache clients have the same server list.
2. No way to ensure that all Memcache clients have the same network
connectivity to Memcache instances.
3. Without (1) and (2), there is no way to guarantee(*) that all
clients have an identical view of the cached data.
(*) or as close to a guarantee (within milliseconds) as we can
realistically get.
Result: Data inconsistency. The Memcache architecture is susceptible to
inconsistent cache hits if not all Memcache clients share identical
server lists and identical Memcache instance connectivity.
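
As a rough illustration of problem (1), here is a small sketch. It
assumes clients pick a server with a simple hash-modulo scheme; the
pick_server helper, the addresses, and the key names are all made up
for the example and are not taken from any particular client library.

```python
import zlib

def pick_server(key, servers):
    """Hash-modulo selection (an assumed, simplified client scheme)."""
    return servers[zlib.crc32(key.encode()) % len(servers)]

# Hypothetical server lists: client B has lost sight of 10.0.0.2.
client_a = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
client_b = ["10.0.0.1:11211", "10.0.0.3:11211"]

keys = ["user:%d" % i for i in range(1000)]

# Keys for which the two clients would talk to different instances.
disagree = [k for k in keys
            if pick_server(k, client_a) != pick_server(k, client_b)]

print(len(disagree), "of", len(keys), "keys map to different servers")
```

Every key in that disagreeing set can be written by one client and
missed (or read stale) by the other.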
A Memcache instance can be unreachable on the network for any subset
of the clients accessing it, but come back online at some later point
in time. If this happens, the data stored under the keys on this
Memcache instance is potentially inconsistent with the data on the
Memcache instance(s) that took over.
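
The sequence of events can be simulated with plain dictionaries. This
is only a sketch: FakeInstance and first_reachable are hypothetical
stand-ins for real Memcache instances and for a client's failover
logic, which I am assuming simply falls through to the next server.

```python
class FakeInstance:
    """In-memory stand-in for one Memcache instance (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reachable = True

def first_reachable(instances):
    """Assumed failover rule: use the first instance we can reach."""
    for inst in instances:
        if inst.reachable:
            return inst
    raise RuntimeError("no reachable instance")

primary = FakeInstance("A")
fallback = FakeInstance("B")
ring = [primary, fallback]

first_reachable(ring).data["profile:42"] = "v1"  # stored on A
primary.reachable = False                        # A drops off the network
first_reachable(ring).data["profile:42"] = "v2"  # update lands on B
primary.reachable = True                         # A comes back, unflushed

# Reads go to A again and see the old value; B holds the newer one.
stale = first_reachable(ring).data.get("profile:42")
print(stale, "on", primary.name, "vs", fallback.data["profile:42"],
      "on", fallback.name)
```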
Solution: Implement a layer on top of Memcache that functions as a
distributed proxy network which maintains persistent connections to all
Memcache instances and all proxy instances. Each proxy coordinates with
all other Memcache proxies to always maintain identical server lists. If
any proxy gets disconnected from any Memcache instance, it broadcasts
this to all other proxies. Any time a Memcache instance joins a proxy's
list of Memcache servers, all keys are flushed on that Memcache
instance. Memcache clients only connect to Memcache proxies.
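
To make the idea more concrete, here is a minimal in-process sketch of
the proxy behaviour. The Proxy and FakeMemcache classes and their
method names are invented for illustration, the "broadcast" is just a
loop over peer objects rather than real network messages, and failure
of the broadcast itself is not handled.

```python
class FakeMemcache:
    """In-memory stand-in for a Memcache instance (hypothetical)."""
    def __init__(self, addr):
        self.addr = addr
        self.data = {}

    def flush_all(self):
        self.data.clear()

class Proxy:
    """Sketch of one proxy; peers are other Proxy objects."""
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.servers = {}  # addr -> FakeMemcache

    def instance_joined(self, instance):
        # Flush first so no stale keys survive a rejoin, then tell
        # every proxy (including ourselves) about the new instance.
        instance.flush_all()
        for proxy in [self] + self.peers:
            proxy.servers[instance.addr] = instance

    def instance_lost(self, addr):
        # One proxy losing the instance removes it for everyone.
        for proxy in [self] + self.peers:
            proxy.servers.pop(addr, None)

    def server_list(self):
        return sorted(self.servers)

p1, p2 = Proxy("p1"), Proxy("p2")
p1.peers, p2.peers = [p2], [p1]

m1 = FakeMemcache("10.0.0.1:11211")
m2 = FakeMemcache("10.0.0.2:11211")
m2.data["old"] = "stale"          # left over from before it joined

p1.instance_joined(m1)
p2.instance_joined(m2)            # flushes m2 before it is listed
p1.instance_lost(m1.addr)         # p1 loses m1 and broadcasts it

print(p1.server_list(), p2.server_list(), m2.data)
```

The point of the sketch is only that both proxies end up with the same
server list after every join or loss, and that a joining instance
never brings old keys with it.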
Assumptions:
1. All proxies can connect to all other proxies. In other words,
connectivity is not an issue for proxies.
2. If a proxy cannot connect to another proxy, it assumes the proxy is
dead for everyone (clients and proxies).
3. A method exists for proxies to discover new Memcache instances.
I'd appreciate hearing any constructive criticism of this approach,
along with possible resolutions to its weaknesses, which I'm sure
exist. Alternatively, any other suggestions for overcoming these
problems are welcome.
Best Regards,
Erik Osterman