jlrobins at socialserve.com
Sun Oct 9 17:32:59 PDT 2005
> That's fine. But what about other databases ?
> I don't suppose that anybody who uses memcached has PostgreSQL :)
Well, we use postgresql and memcache, but implemented an invalidation
technique which doesn't need any direct linkage between the database
and memcache as in Sean's crafty work -- at the expense of not-quite-
realtime cache invalidation.
We hooked on-update triggers to the rows which could contribute to
memcache-stashed entities. The caching will be done according to the
primary key of one particular table -- the other tables are joined in
for the queries yielding rows which are cached. So, whenever any of
the subordinate tables is updated in a way that should invalidate
cached entries keyed on the primary table rows, or if the primary
table itself is updated, then the corresponding primary table ids are
dumped into an 'invalidate_these_primary_ids' type table. That's all
that is done in trigger-space -- determining the primary keys of the
main-table rows those updates affect, then dumping those ids into
another table.
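In trigger terms, the plumbing might look roughly like the following
sketch. Every table and column name here (listing, listing_photo,
listing_id) is a hypothetical stand-in -- the post doesn't name the real
schema -- and the SQL is carried as strings the way a migration script
might:

```python
# Hypothetical sketch of the trigger-side plumbing. Table and column
# names (listing, listing_photo, listing_id) are invented stand-ins.

QUEUE_TABLE_DDL = """
CREATE TABLE invalidate_these_primary_ids (
    listing_id integer NOT NULL,
    queued_at  timestamptz NOT NULL DEFAULT now()
);
"""

# The trigger only resolves which primary-table ids an update affects
# and queues them; it never talks to memcache itself.
TRIGGER_DDL = """
CREATE FUNCTION queue_listing_invalidation() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO invalidate_these_primary_ids (listing_id)
        VALUES (OLD.listing_id);
    ELSE
        INSERT INTO invalidate_these_primary_ids (listing_id)
        VALUES (NEW.listing_id);
    END IF;
    RETURN NULL;  -- AFTER trigger, so the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER listing_photo_invalidate
    AFTER INSERT OR UPDATE OR DELETE ON listing_photo
    FOR EACH ROW EXECUTE PROCEDURE queue_listing_invalidation();
"""
```

A subordinate table whose rows join to the primary table through a
foreign key would instead queue that foreign-key value, but the shape
is the same: resolve the primary id, insert it, return.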
We run an external process to periodically drain the invalidate-these-
ids table, performing the actual memcache invalidation calls.
Currently it does so by polling postgres every 5 seconds. When it
actually finds ids to invalidate, it computes the memcache key names
which webspace could have used to stash, then iterates over the
memcache servers, invalidating those keys (we chose to have
webservers interact only with local memcaches -- when load gets
bigger we can always reconsider). When it has successfully
invalidated all the keys derived from the listed ids, it deletes the
rows it observed in the table. Using a serializable transaction makes
that part easy.
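A minimal sketch of one drain cycle, with the database reads/deletes
and the memcache deletes passed in as callables so the control flow
stands alone. The key patterns, server addresses, and function names
are invented for illustration; the real daemon would use a postgres
connection (inside the serializable transaction) and a memcache client
in their place:

```python
# Hypothetical sketch of one poll/drain cycle. Server addresses and
# key patterns are made up; callables stand in for postgres/memcache.

SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211"]  # illustrative only

def keys_for(listing_id):
    """Every memcache key webspace could have used to stash this id."""
    return ["listing:%d" % listing_id, "listing_detail:%d" % listing_id]

def drain_once(fetch_queued_ids, delete_key, delete_queued_ids):
    """Read the queued ids, invalidate their keys on every memcache,
    then delete exactly the rows observed -- never rows queued later."""
    ids = fetch_queued_ids()
    for listing_id in ids:
        for key in keys_for(listing_id):
            for server in SERVERS:
                delete_key(server, key)
    if ids:
        delete_queued_ids(ids)
    return ids
```

Wrapping the fetch and the final delete in one serializable
transaction, as described above, is what guarantees the daemon only
deletes rows whose keys it actually expunged.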
Now the downside to this is running an additional service, and
ensuring that it keeps working properly. If you have more than
one memcache instance, you have to consider things like partial
failures at invalidation time -- what happens if your memcache client
gets unhappy talking to one of your caches but not the other. We're
taking a simplistic route currently by just throwing away references
to all the memcache clients and letting them be recreated after the
next sleep / poll cycle -- along with not deleting the keys from the
table so that when we poll next we'll try to expunge again. A
slightly more complex table structure marking which memcaches have
invalidated those keys successfully would make that a little less
brain-dead, but for now we just make sure that our memcacheds are happy.
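That simplistic route could be sketched like this; the client objects
and every name here are hypothetical, and the point is only the
control flow -- on any failure, bail out without touching the queue
table, drop the clients, and let the next cycle rebuild them and retry:

```python
# Sketch of the simplistic failure handling described above. All names
# are hypothetical; real code would catch the memcache client's own
# error types rather than bare Exception.

def run_cycle(make_clients, fetch_queued_ids, delete_queued_ids, keys_for):
    clients = make_clients()          # rebuilt fresh every cycle
    ids = fetch_queued_ids()
    try:
        for listing_id in ids:
            for key in keys_for(listing_id):
                for client in clients:
                    client.delete(key)
    except Exception:
        return False                  # rows kept; retried after next sleep
    if ids:
        delete_queued_ids(ids)        # safe: every key was expunged
    return True
```

The less brain-dead version would track, per queue row, which
memcache instances have already confirmed the delete, and only remove
a row once all of them have.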
This system was definitely influenced by knowledge of slony-1, the
primary open-source replication engine for postgresql. It uses
triggers to capture update / insert / delete information, which then
gets transcribed into a replication log table. Then an external
process reaps that data and performs the replication by replaying
equivalent SQL commands against remote postgres instances. Once replayed,
those rows in the replication log table are marked completed. Once
all the consumers of those replication log rows have handled them,
then the rows can be deleted. So, if the replicator processes die or
are suspended, or, in our case, if our invalidator gets stopped for
some time, then when it comes back up no messages have been lost.
Slony can run in either wait/notify mode or in polling mode. The wait/
notify mode uses postgres's LISTEN / NOTIFY feature to have one client
wait until a signal is raised, which, in slony's case, will be done
whenever one of the triggers deposits rows into the replication log
table. Under low load, this works well because the external
replication process will sleep until there is definitely work to do.
But under high write load to the replicated tables, this yields a
high context-switch rate: the external process is woken to do a small
chunk of work, performs it, goes back to sleep, only to be woken up
again immediately. Polling ensures grabbing more
reasonably sized work units (many more rows to replicate), at the
expense of wasted cycles when there was no work to do, as well as a
possibly longer delay in replication (or invalidation) when there was
little to do.
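Back-of-envelope numbers (my own illustration, not measurements from
slony) make the trade-off concrete:

```python
# Rough arithmetic on the wakeup trade-off between wait/notify and
# polling. The formulas are illustrative, not measured.

def wakeups_per_minute(writes_per_minute, poll_seconds=None):
    """wait/notify wakes once per write; polling wakes at a fixed rate."""
    if poll_seconds is None:
        return writes_per_minute
    return 60 // poll_seconds

def rows_per_wakeup(writes_per_minute, poll_seconds):
    """Average batch size a polling drain grabs each cycle."""
    return writes_per_minute * poll_seconds / 60.0
```

At 600 queue inserts a minute, wait/notify means roughly 600 wakeups
each handling one row, while a 5-second poll means 12 wakeups of about
50 rows apiece -- and an idle system still pays those 12 wakeups.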
But enough -- you get the point. We decided it was fine for our data
to be stale for 5 seconds and none would be the wiser.