update strategy

Sun Oct 9 17:32:59 PDT 2005

>
>> http://people.freebsd.org/~seanc/pgmemcache/
>>
>>
> Hi,
>
> That's fine. But what about other databases ?
> I don't suppose that anybody who uses memcached has PostgreSQL :)

Well, we use postgresql and memcache, but implemented an invalidation  
technique which doesn't need any direct linkage between the database  
and memcache as in Sean's crafty work -- at the expense of not-quite- 
realtime cache invalidation.

We hooked on-update triggers to the rows which could contribute to  
memcache-stashed entities. The caching will be done according to the  
primary key of one particular table -- the other tables are joined in  
for the queries yielding rows which are cached. So, whenever any of  
the subordinate tables are updated in such a way which should  
invalidate cached entries according to the primary table rows, or if  
the primary table itself is updated, then the corresponding primary  
table ids are dumped into an 'invalidate_these_primary_ids' type  
table. That's all that is done in trigger-space -- determining the  
primary keys of the main tables those updates affect, then dumping  
those ids into another table.
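
A minimal sketch of the trigger side, for illustration only. It uses
SQLite (via Python's sqlite3 module) as a stand-in for postgres, and
the listing / photo tables and the primary_id column are hypothetical
names, not our real schema. The point is just that the triggers do
nothing except record which primary-table ids were touched.

```python
import sqlite3

# SQLite stands in for postgres here; every table and column name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE listing (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE photo (id INTEGER PRIMARY KEY, listing_id INTEGER, url TEXT);
CREATE TABLE invalidate_these_primary_ids (primary_id INTEGER NOT NULL);

CREATE TRIGGER listing_upd AFTER UPDATE ON listing
BEGIN
    INSERT INTO invalidate_these_primary_ids VALUES (NEW.id);
END;

CREATE TRIGGER photo_upd AFTER UPDATE ON photo
BEGIN
    INSERT INTO invalidate_these_primary_ids VALUES (OLD.listing_id);
    INSERT INTO invalidate_these_primary_ids VALUES (NEW.listing_id);
END;
""")

# listing is the primary table; photo is a subordinate table joined in by the cached query.
conn.execute("INSERT INTO listing VALUES (1, 'old title')")
conn.execute("INSERT INTO listing VALUES (2, 'other')")
conn.execute("INSERT INTO photo VALUES (10, 2, 'a.jpg')")

# An update to either table queues the affected primary id.
conn.execute("UPDATE listing SET title = 'new title' WHERE id = 1")
conn.execute("UPDATE photo SET url = 'b.jpg' WHERE id = 10")

pending = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT primary_id FROM invalidate_these_primary_ids"
    )
)
print(pending)
```

The photo trigger records both the old and the new listing_id so that
moving a photo between listings invalidates both cached entities.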

We run an external process to periodically drain the invalidate-these- 
ids table, performing the actual memcache invalidation calls.  
Currently it does so by polling postgres every 5 seconds. When it  
actually finds ids to invalidate, it computes the memcache key names  
which webspace could have used to stash, then iterates over the  
memcache servers, invalidating those keys (we chose to have  
webservers interact only with local memcaches -- when load gets  
bigger we can always reconsider). When it has successfully  
invalidated all the keys listed in the table, then it deletes all the  
keys it observed in the table. Using a serializable transaction makes  
that part easy.
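
Roughly, one pass of the drain step could look like the sketch below.
It is an assumption-laden illustration: FakeCache is an in-memory
stand-in for a memcached client, keys_for() is a made-up key scheme,
and SQLite replaces postgres. Instead of a serializable transaction,
this sketch deletes only the rowids it actually read, which gives the
same "delete what you observed" effect.

```python
import sqlite3


class FakeCache:
    """In-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self, healthy=True):
        self.data = {}
        self.healthy = healthy

    def delete(self, key):
        if not self.healthy:
            raise ConnectionError("cache unreachable")
        self.data.pop(key, None)


def keys_for(primary_id):
    # Hypothetical key scheme; the real one is whatever webspace used to stash.
    return [f"listing:{primary_id}"]


def drain_once(conn, caches):
    """Invalidate every queued id on every cache.

    Returns True if the observed rows were removed from the queue,
    False if a cache failed and the rows were kept for the next poll.
    """
    rows = conn.execute(
        "SELECT rowid, primary_id FROM invalidate_these_primary_ids"
    ).fetchall()
    if not rows:
        return True
    try:
        for pid in {pid for _, pid in rows}:
            for key in keys_for(pid):
                for cache in caches:
                    cache.delete(key)
    except ConnectionError:
        return False  # keep the rows; the next poll tries again
    # Delete only the rows observed above, not ones queued in the meantime.
    conn.executemany(
        "DELETE FROM invalidate_these_primary_ids WHERE rowid = ?",
        [(rid,) for rid, _ in rows],
    )
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invalidate_these_primary_ids (primary_id INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO invalidate_these_primary_ids VALUES (?)", [(1,), (2,), (1,)]
)

a, b = FakeCache(), FakeCache(healthy=False)
a.data = {"listing:1": "x", "listing:2": "y", "listing:3": "z"}
b.data = {"listing:1": "x"}

first = drain_once(conn, [a, b])   # b is down, so the queue is left alone
b.healthy = True
second = drain_once(conn, [a, b])  # retry succeeds and clears the queue
remaining = conn.execute(
    "SELECT COUNT(*) FROM invalidate_these_primary_ids"
).fetchone()[0]

# A real invalidator would wrap this in a loop, e.g.:
#     while True:
#         drain_once(conn, caches)
#         time.sleep(5)
```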

Now the downside to this is running an additional service, and  
ensuring that it is working well and properly. If you have more than  
one memcache instance, you have to consider things like partial  
failures at invalidation time -- what happens if your memcache client  
gets unhappy talking to one of your caches but not the other. We're  
taking a simplistic route currently by just throwing away references  
to all the memcache clients and letting them be recreated after the  
next sleep / poll cycle -- along with not deleting the keys from the  
table so that when we poll next we'll try to expunge again. A  
slightly more complex table structure marking which memcaches have  
invalidated those keys successfully would make that a little less  
brain-dead, but for now we just make sure that our memcacheds are happy.
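
For what it's worth, the "slightly more complex" structure could be
sketched as one pending entry per (id, memcache server) pair, cleared
only when that particular server has acknowledged the delete. This is
not something we run; the set below stands in for a table with two
columns, and the server names and delete function are hypothetical.

```python
def enqueue(pending, primary_id, servers):
    """Record that every server still holds a possibly stale entry for this id."""
    for name in servers:
        pending.add((primary_id, name))


def drain(pending, delete_fn):
    """Try each pending (id, server) pair; drop only the pairs that succeed."""
    for primary_id, name in sorted(pending):  # sorted() copies, so discarding is safe
        try:
            delete_fn(name, primary_id)
        except ConnectionError:
            continue  # leave this pair for the next poll cycle
        pending.discard((primary_id, name))


servers = ["web1", "web2"]  # hypothetical server names
down = {"web2"}             # servers currently refusing connections
calls = []                  # record of successful invalidations


def fake_delete(name, primary_id):
    # Stand-in for a real memcached delete against one server.
    if name in down:
        raise ConnectionError(name)
    calls.append((name, primary_id))


pending = set()
enqueue(pending, 7, servers)
enqueue(pending, 9, servers)

drain(pending, fake_delete)        # web1 succeeds, web2 fails
after_first = set(pending)

down.clear()
drain(pending, fake_delete)        # only the web2 pairs are retried
```

After the first pass only the web2 pairs remain, so the healthy server
is not hit again on the retry.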

This system was definitely influenced by knowledge of slony-1, the  
primary opensource replication engine for postgresql. It uses  
triggers to capture update / insert / delete information which then  
gets transcribed into a replication log table. Then an external  
process reaps that data and performs the replication via playing  
equivalent SQL commands to remote postgres instances. Once replayed,  
those rows in the replication log table are marked completed. Once  
all the consumers of those replication log rows have handled them,  
then the rows can be deleted. So, if the replicator processes die or  
are suspended, or, in our case, if our invalidator gets stopped for  
some time, then when it comes back up no messages have been lost --  
just delayed.

Slony can run in either wait/notify mode or in polling mode. The  
wait/notify mode uses a feature of postgres (LISTEN / NOTIFY) to have  
one client wait until  
a signal is raised on a table, which, in slony's case, will be done  
whenever one of the triggers deposits rows into the replication log  
table. Under low load, this works well because the external  
replication process will sleep until there is definitely work to do.  
But under high write load to the replicated tables, this yields a  
high context-switch rate, with the external process being woken to do  
only a small chunk of work, then going back to sleep, only to be  
woken up again immediately. Polling ensures grabbing more  
reasonably sized work units (many more rows to replicate), at the  
expense of wasted cycles when there was no work to do, as well as a  
possibly longer delay in replication (or invalidation) when there was  
little to do.
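
To put rough numbers on that tradeoff, here is a toy calculation, not
a measurement. It assumes the worst case for wait/notify (one wake-up
per write) and a poller that fires every interval_ms; the workloads
are invented for illustration.

```python
def notify_wakeups(event_ms):
    # Worst case for wait/notify: the process is woken once per write.
    return len(event_ms)


def polling_stats(event_ms, interval_ms, duration_ms):
    """Return (number of polls, rows handled per poll, worst delay in ms).

    Polls fire at interval_ms, 2 * interval_ms, ... up to duration_ms.
    Every event time must be smaller than the time of the last poll.
    """
    polls = duration_ms // interval_ms
    batch_sizes = [0] * polls
    worst_delay = 0
    for t in event_ms:
        idx = t // interval_ms                 # first poll strictly after t
        batch_sizes[idx] += 1
        worst_delay = max(worst_delay, (idx + 1) * interval_ms - t)
    return polls, batch_sizes, worst_delay


# Heavy load: one write every 10 ms for 10 seconds.
heavy = list(range(0, 10_000, 10))
heavy_notify = notify_wakeups(heavy)
heavy_poll = polling_stats(heavy, 5_000, 10_000)

# Light load: a single write 100 ms into the same 10 seconds.
light = [100]
light_notify = notify_wakeups(light)
light_poll = polling_stats(light, 5_000, 10_000)

print(heavy_notify, heavy_poll)
print(light_notify, light_poll)
```

Under the heavy workload polling does the same work in 2 wake-ups
instead of up to 1000; under the light one it wastes a poll and makes
the single row wait almost the full 5 seconds.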

But enough -- you get the point. We decided it was fine for our data  
to be stale for 5 seconds and none would be the wiser.
----
James Robinson
Socialserve.com