Few queries on atomicity of requests

Daniel memcached at puptv.com
Sat Jun 21 18:35:01 UTC 2008


Hi

Thanks Dormando...

> Sorry, I'll weigh in a little here. I've only been skimming so forgive
> if this is a repeat:

Thank you...  I appreciate everyone's considered opinions.

> - Overcomplicated. See dustin's rails examples for easy abstractions on
> getting to the 90% mark.

You are absolutely correct...  But wouldn't it be worth building an
overcomplicated memcached/database combo if every app could get a
performance boost just by using it the way it already uses a
database?

> - Crazy in-memory databases probably aren't that much faster. If you rip
> out most of MySQL's parser and optimize your schema so InnoDB clustered
> indexes and adaptive hash indexes morph into life, that's probably close
> to as far as you're going to go.

First, in-memory databases are, or at least can be, far faster than
databases that require spinning-disk accesses. Flash disk accesses and
remote in-memory accesses fall somewhere in between.

Even with a full-fledged database cluster doing the work, the cluster
ends up running more slowly because it has to handle all of the extra
database duties: replication, journaling, failover, version control,
blocking, and so on.


The system I'm describing runs the app and the Caching Database Daemon
(CDD) on the same machine, which avoids extra remote accesses. More
important, however...

If we can find a way to do more database processing and data reads
"outside of the core", more performance is available than when the core
database is handling reads and writes.

This idea helps to separate reads, which happen in the CDD, from writes.

FYI, Oracle's TimesTen product seems to have some performance metrics
that I believe apply...  Page 44 of the document linked below reports
one financial trading system seeing an order-of-magnitude performance
increase:

http://www.oracle.com/technology/products/timesten/pdf/oow2007/oow07_s291347_timesten_caching_use_cases.pdf 

I just thought of another way of describing it. The CDDs are like
reader databases, while the core database handles all writing.


> - Limits your ability to cache the results of processing on multiple
> queries... If you issue three queries, parse the results a little, then
> use that elsewhere, you want to cache the cumulation if it's possible.
> - Shouldn't limit your memcaching to the database :\

> If you're displaying a blog, the amount of processing time you blow on
> the template will likely outweigh any DB activity. Cache the results of
> the entirety or chunks of template renderings. I believe you can KISS
> all sections of this without wasting too much time.

Yes, that is a better memcached-specific application design. What I'm
suggesting will be significantly slower than an app designed from the
start for memcached. That's why I believe it seems worthwhile to include
the current memcached functionality in the CDD.

----

I don't get it...  Here in "memcached land" we're dealing with
situations where going live WITHOUT warming up the cache can make sites
blow up, and meanwhile people are saying/thinking that a generic
memcached/database combination isn't worth the trouble.

I think that combining the two will provide a huge performance boost, as
long as all the design decisions made support non-blocked processing for
the core database.

Memcached is great as it is, but it sometimes returns old data. The
database knows when this data changes. Why can't we develop a system
in which the database alerts/updates the cache when data changes?
Additionally, for example, the CDDs could have a special protocol that
lets them access data directly, which should be faster than even a
pre-compiled SQL query.
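
To make the "database alerts/updates the cache" idea concrete, here is a
rough Python sketch. Everything in it is hypothetical: CoreDatabase and
PushedCache are names I made up for illustration, the "database" is just
a dict, and nothing here is an existing memcached or database feature.

```python
class CoreDatabase:
    """Toy stand-in for the core database (hypothetical, in-memory only)."""

    def __init__(self):
        self._rows = {}
        self._subscribers = []

    def subscribe(self, callback):
        # Register a cache that wants to hear about every change.
        self._subscribers.append(callback)

    def write(self, key, value):
        self._rows[key] = value
        # The database knows the data changed, so it tells each cache.
        for notify in self._subscribers:
            notify(key, value)

    def read(self, key):
        return self._rows.get(key)


class PushedCache:
    """Cache whose entries are refreshed by the database, never by a TTL."""

    def __init__(self, database):
        self._db = database
        self._data = {}
        database.subscribe(self._on_change)

    def _on_change(self, key, value):
        # Only refresh keys this cache already holds; others load on demand.
        if key in self._data:
            self._data[key] = value

    def get(self, key):
        if key not in self._data:
            self._data[key] = self._db.read(key)
        return self._data[key]
```

The point of the sketch is that a reader never sees a value older than
the last write, because the write path itself refreshes the cache.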

As another way of thinking about it, first implement the things that
memcached can do easily, and let the more complex tasks fall through to
the database. Surely that can be done without slowing the database down.
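
A minimal sketch of that fall-through, assuming the "easy" case is a
lookup by primary key and everything else is a "complex" query. The
class names, the get_by_key()/run_query() methods, and the stub database
are all assumptions for illustration, and cache invalidation is left out
entirely.

```python
class StubDatabase:
    """Stand-in for the core database; counts how often it is hit."""

    def __init__(self, rows):
        self._rows = rows  # {(table, key): row}
        self.calls = 0

    def get_by_key(self, table, key):
        self.calls += 1
        return self._rows.get((table, key))

    def run_query(self, sql):
        self.calls += 1
        return f"result of: {sql}"


class FallThroughCDD:
    """Serves simple key lookups from memory, passes the rest through."""

    def __init__(self, database):
        self._db = database
        self._cache = {}

    def get_by_key(self, table, key):
        cache_key = (table, key)
        if cache_key not in self._cache:
            # First request for this row: one trip to the core database.
            self._cache[cache_key] = self._db.get_by_key(table, key)
        return self._cache[cache_key]

    def run_query(self, sql):
        # Anything more complex falls through untouched.
        return self._db.run_query(sql)
```

With this split, repeated key lookups cost the core database one read
each, while complex queries behave exactly as they do today.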

For the fun of it, let me revisit one of the more complex tasks, and see
if you can't see how this could result in an incredible performance
boost.

Live Queries

We're all familiar with those gnarly queries that need to be run
repeatedly. I believe it's possible to create a system where the query
is kept up to date by the Caching Database Daemons (CDDs), adding only
a very small overhead to the core database, provided there are enough
CDDs to keep the underlying data in cache memory.

Each CDD would be aware of the live query. When the core database
updates the CDD with data involving tables in the query, the CDD would
then rerun the query on the new data. In doing so, it will have to make
many requests to other CDDs for info on any joined records, and alert
other CDDs whose joined records changed, but who cares as long as no
additional requests go back to the core database.

When the application wants the results, its CDD simply sends a request
to all of the other CDDs, then sorts, limits, and de-duplicates the
combined results.  What a cool performance boost!
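
Here is a simplified Python sketch of that gather step. It assumes each
CDD already holds its own up-to-date slice of the live query's result,
that rows are dicts with an "id" field, and that joins are ignored; the
names LiveQueryShard and gather are invented for this example.

```python
import heapq


class LiveQueryShard:
    """One CDD's slice of a live query's result set (hypothetical)."""

    def __init__(self, sort_key):
        self._sort_key = sort_key
        self._rows = {}  # primary key -> row

    def apply_update(self, pk, row):
        # Called when the core database pushes a changed row.
        # row=None means the row was deleted or no longer matches.
        if row is None:
            self._rows.pop(pk, None)
        else:
            self._rows[pk] = row

    def partial_result(self):
        return sorted(self._rows.values(), key=self._sort_key)


def gather(shards, sort_key, limit):
    """Merge every shard's sorted slice, drop duplicates, apply the limit."""
    if limit <= 0:
        return []
    merged = heapq.merge(*(s.partial_result() for s in shards), key=sort_key)
    seen = set()
    result = []
    for row in merged:
        if row["id"] in seen:
            continue  # the same row may be held by more than one CDD
        seen.add(row["id"])
        result.append(row)
        if len(result) == limit:
            break
    return result
```

Note that gather() never touches the core database; all the work is a
merge of lists the CDDs already keep in memory.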

Thanks

Daniel




On Fri, 2008-06-20 at 23:18 -0700, dormando wrote:
> Sorry, I'll weigh in a little here. I've only been skimming so forgive
> if this is a repeat:
> 
> - Overcomplicated. See dustin's rails examples for easy abstractions on
> getting to the 90% mark.
> - Crazy in-memory databases probably aren't that much faster. If you rip
> out most of MySQL's parser and optimize your schema so InnoDB clustered
> indexes and adaptive hash indexes morph into life, that's probably close
> to as far as you're going to go.
> - Limits your ability to cache the results of processing on multiple
> queries... If you issue three queries, parse the results a little, then
> use that elsewhere, you want to cache the cumulation if it's possible.
> - Shouldn't limit your memcaching to the database :\
> 
> If you're displaying a blog, the amount of processing time you blow on
> the template will likely outweigh any DB activity. Cache the results of
> the entirety or chunks of template renderings. I believe you can KISS
> all sections of this without wasting too much time.
> 
> -Dormando
> 
> > So, in conclusion, the end goal of this is to provide memcached type
> > caching to the database in such a way that the data it returns is always
> > accurate. I'm not saying this would be easy, but it does seem to be well
> > worth the effort.
> > 
> > Thanks
> > 
> > Daniel
> > 
> > 
> > 
> 


