Few queries on atomicity of requests

Sun Jun 22 07:22:07 UTC 2008

Dustin

Thanks for the well thought out e-mail!

> On Jun 21, 2008, at 11:35, Daniel wrote:
> 
> >> - Overcomplicated. See dustin's rails examples for easy
> >> abstractions on getting to the 90% mark.
> >
> > You are absolutely correct...  But wouldn't it be worth making a
> > memcached/database combo that's overcomplicated if we could get a
> > performance boost from every app that uses it just like they use a
> > database?
> 
> 	It just feels like the wrong level.  If you're just caching database  
> results, you're not going to be getting the best usage out of your  
> app.  I mean, you're not likely to be doing all that much better than  
> what any other query cache does today.
> 
I agree, I agree, I agree. I don't want to do anything to take away from
what we already have with memcached, especially for storing blocks, full
pages, random info that's not needed in the database, etc.  For most
people here, even if this system existed, they'd still want and need to
do direct memcaching for the extra performance.

That being said, I think there is a need for a faster, larger caching
system that can use a standard database API, handle automatic cache
updates when the underlying data is committed, and take a lot of the
read load off of the core database.
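
To make that concrete, here is a rough sketch in Python of the
"update the cache when data is committed" behavior. Everything in it is
an assumption for illustration: CommitAwareCache, read() and commit()
are hypothetical names, and plain dicts stand in for the core database
and the memcached layer.

```python
# Rough sketch only: CommitAwareCache and its methods are hypothetical,
# and plain dicts stand in for the core database and for memcached.

class CommitAwareCache:
    def __init__(self, database):
        self.database = database   # stand-in for the core database
        self.cache = {}            # stand-in for the memcached layer
        self.db_reads = 0          # counts reads that reach the database

    def read(self, key):
        """Serve from the cache; fall through to the database on a miss."""
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value

    def commit(self, changes):
        """Apply committed changes, then refresh the affected cache entries."""
        for key, value in changes.items():
            self.database[key] = value
            self.cache[key] = value   # automatic cache update on commit


db = {"user:1": "alice"}
cdd = CommitAwareCache(db)
first = cdd.read("user:1")        # miss: goes to the database
second = cdd.read("user:1")       # hit: served from the cache
cdd.commit({"user:1": "alicia"})
third = cdd.read("user:1")        # hit: already refreshed by the commit
```

Only the first read touches the database; the read after the commit
sees the new value without another database round trip.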

> > Even with a full fledged database cluster doing the work, the cluster
> > ends up running more slowly because it has to handle all of the extra
> > database needs, replication, journaling, failover, version control,
> > blocking, etc etc.
> 
> 	I thought you were advocating MVCC in memcached earlier?  I don't  
> think you can reach the levels of guarantee you're asking for over DB  
> transactions without something like MVCC and 2pc.  If you start adding  
> these types of things in, you're just making another database, and  
> it'll be as slow as any of them.

In thinking through the design, I see two ways it could work.

1.) Stay with the key/value system, and the CDD would only work with the
most recent data for requests that are not in an active transaction. 

2.) Complicate things a little, and use a key/version/value system.
Each request could ask for the most recent data (a key/value request)
or, if the request was coming from an active transaction, it could make
a key/version/value request to get data appropriate for the transaction.

It may be that method 2 is not possible while keeping the CDD fast and
pure. The first idea is probably enough to add performance. 
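
A minimal sketch of how the two request types in method 2 might differ,
in Python. The names (VersionedCache, get_latest, get_as_of) and the
use of integer versions are my assumptions for illustration; nothing
like this exists in memcached today.

```python
# Hypothetical key/version/value store; not an existing memcached feature.

class VersionedCache:
    def __init__(self):
        # key -> list of (version, value) pairs, kept sorted by version
        self.entries = {}

    def put(self, key, version, value):
        versions = self.entries.setdefault(key, [])
        versions.append((version, value))
        versions.sort(key=lambda pair: pair[0])

    def get_latest(self, key):
        """Plain key/value request: the most recent data."""
        versions = self.entries.get(key)
        return versions[-1][1] if versions else None

    def get_as_of(self, key, snapshot_version):
        """Key/version/value request: the newest value whose version is
        not later than the transaction's snapshot version."""
        result = None
        for version, value in self.entries.get(key, []):
            if version > snapshot_version:
                break
            result = value
        return result


cache = VersionedCache()
cache.put("row:7", 1, "draft")
cache.put("row:7", 3, "final")

latest = cache.get_latest("row:7")           # request outside a transaction
in_txn = cache.get_as_of("row:7", 2)         # transaction that began at version 2
too_early = cache.get_as_of("row:7", 0)      # nothing visible yet
```

Method 1 is just get_latest() on its own; the extra bookkeeping in
put() and get_as_of() is the cost of method 2.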

Anyway, to finalize the point: no, the CDDs would not implement MVCC
processing, but they might support version caching if that can be done
in a fast, non-blocking, memcached way.

> > Yes, that is a better memcached specific application design. What I'm
> > suggesting will be significantly slower than an integrated
> > designed-for-memcached app. That's why I believe it seems worthwhile
> > to include the current memcached functionality in the CDD.
> 
> 	I think the best way to approach this is by building an architecture  
> that fits it well.  Turns out, that's pretty hard, and then nobody  
> wants to use it.
> 
> 	I get what you're saying.  You want to make something everyone can  
> use, but doesn't do all that much.  It kind of sounds like mysql's  
> query cache.  I don't know how it'd be better than that.  I've not  
> heard anything particularly wonderful about it.
> 
> 	I've written some activerecord extensions that can do automatic  
> caching and invalidation of objects by relationship.  This lives  
> within the ORM because it can act on real live objects and has a deep  
> understanding of when things change and knows what to do about it.   
> Although it's in its infancy stage, it basically works.  However,  
> getting it to do all of the stuff it should/could would require a  
> *lot* of work and could very well make things slower.  Sometimes it's  
> just better to say what you mean.
> 

Yes, it is so easy to go happily down the path, only later realizing you
ended up worse off than when you started.

I'm not sure what you meant by "just say what you mean", though.

> > Memcached is great as it is, but it sometimes returns old data.
> 
> 	``There are only two hard things in Computer Science: cache  
> invalidation and naming things.'' -- Phil Karlton
> 

Love the quote!

> > When the application wants the results, its CDD simply sends a
> > request to all of the other CDDs and sorts/limits/distinct the
> > results.  What a cool performance boost!
> 
> 
> 	That assumes it can read the data I've cached (or, in this case, it's  
> caching for its own benefit).  That application seems really  
> specialized.  With as little code as I generally write to get caching  
> working in apps, I don't see how something like that would be less work.
> 

I think the database could be set up to automatically detect these long,
repeating queries on slowly changing tables. The benefit would come from
the automatic speedup.  This Live Query idea is actually not directly
related to the idea for a CDD and is very specialized; the more I think
about it, the more I think it is better implemented in the application
when necessary.
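
For what it's worth, the application-side version of the Live Query
merge step (ask every node, then sort/limit/distinct the results) is
small. This Python sketch assumes each node has already returned its
rows in sorted order; live_query and the sample data are made up for
illustration.

```python
import heapq
from itertools import islice


def live_query(node_results, limit):
    """Merge sorted per-node result lists, drop duplicates, apply a limit."""
    # heapq.merge requires every input to be sorted already.
    merged = heapq.merge(*node_results)

    def distinct(rows):
        # Duplicates are adjacent in a sorted stream, so comparing each
        # row with the previous one is enough.
        previous = object()
        for row in rows:
            if row != previous:
                yield row
                previous = row

    return list(islice(distinct(merged), limit))


# Three hypothetical nodes, each returning its own sorted rows.
rows = live_query([[1, 4, 9], [2, 4, 7], [3, 9]], limit=5)
```

The merge is lazy, so with a small limit only the first few rows from
each node are actually consumed.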

Thanks again.

Daniel