Few queries on atomicity of requests

Thu Jun 19 22:49:47 UTC 2008

Hi

Thanks for the well-thought-out response...

I think we have slightly different views, so let me say a few things
and see if it all makes sense...

We both agree that caching data when transactions are involved is best
left to the database for the current discussion.

However, I believe that the best of all worlds would be if the database
API integrated the caching functionality, creating a database system
that consists of both client daemons (a cache server on each client) and
a core database system that only communicates with the client caching
daemons.

This would have a number of benefits...

1.) All applications would get good caching support without any changes
in the application code.

2.) The client daemons could pre-process the queries and, like memcached,
provide results without bothering the core database.

3.) The client daemons could communicate directly with the core database
server, so cached data is invalidated only when it actually changes.

Simply relieving the core database of SQL parsing and of simple reads
that are already in the cache would greatly improve the performance of
almost every database application, all without requiring the application
to change.
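
A minimal sketch of the read path described above, assuming an in-memory
SQLite database stands in for the core database. The CachingClient class
and its backend_reads counter are hypothetical names used only for
illustration; they are not part of memcached or of any database API.

```python
import sqlite3


class CachingClient:
    """Hypothetical client-side cache daemon in front of the core database."""

    def __init__(self, conn):
        self.conn = conn          # connection to the "core" database
        self.cache = {}           # (sql, params) -> list of rows
        self.backend_reads = 0    # reads that actually reached the database

    def query(self, sql, params=()):
        key = (sql, params)
        if key in self.cache:
            # Served locally: the core database never sees this request.
            return self.cache[key]
        self.backend_reads += 1
        rows = self.conn.execute(sql, params).fetchall()
        self.cache[key] = rows
        return rows


# Stand-in for the core database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'daniel')")

client = CachingClient(conn)
first = client.query("SELECT name FROM users WHERE id = ?", (1,))
second = client.query("SELECT name FROM users WHERE id = ?", (1,))
```

The application only ever calls query(); whether the answer came from the
cache or from the database is invisible to it. This sketch never
invalidates anything, so on its own it would serve stale data after a
write.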

The cache expiration might work in two ways. The database can mark
cached data as stale when it is updated or involved in a transaction,
and an LRU algorithm would work in parallel to remove cached items that
are rarely accessed.
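
The two expiration paths can be sketched together. This assumes the
database notifies the cache with the name of the table that was written;
that notification, and the InvalidatingLRUCache class itself, are
hypothetical and only illustrate the idea.

```python
from collections import OrderedDict


class InvalidatingLRUCache:
    """Cache with LRU eviction plus explicit invalidation by table name."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # key -> (value, tables the value depends on)

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key][0]

    def put(self, key, value, tables):
        self.items[key] = (value, frozenset(tables))
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry

    def invalidate_table(self, table):
        # Called when the database reports a write to `table`.
        stale = [k for k, (_, tables) in self.items.items() if table in tables]
        for k in stale:
            del self.items[k]
        return len(stale)


cache = InvalidatingLRUCache(capacity=2)
cache.put("q_users", ["daniel"], tables={"users"})
cache.put("q_orders", [42], tables={"orders"})
cache.get("q_users")                             # q_users becomes most recent
cache.put("q_items", ["pen"], tables={"items"})  # over capacity: q_orders is evicted
dropped = cache.invalidate_table("users")        # database signals a write to users
```

Invalidating per table is deliberately coarse. A finer scheme would track
rows or key ranges, at the cost of more bookkeeping in the database.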

Eventually, an additional performance boost could be gotten by including
in every sql/database request a new parameter that specifies the
allowable staleness of data.
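
A rough sketch of such a staleness parameter, assuming it is expressed
as a maximum age in seconds. The max_staleness argument and the
StalenessAwareCache class are made-up names, and a fake clock and fake
fetch function are used so the behaviour is easy to follow.

```python
class StalenessAwareCache:
    """Serves a cached result only if it is young enough for the caller."""

    def __init__(self, fetch, clock):
        self.fetch = fetch      # runs the query against the core database
        self.clock = clock      # returns the current time in seconds
        self.entries = {}       # sql -> (rows, time the rows were fetched)

    def query(self, sql, max_staleness):
        now = self.clock()
        entry = self.entries.get(sql)
        if entry is not None and now - entry[1] <= max_staleness:
            return entry[0]     # cached copy is fresh enough for this caller
        rows = self.fetch(sql)
        self.entries[sql] = (rows, now)
        return rows


now = [0.0]     # fake clock, advanced by hand
calls = []      # records every query that reached the "database"


def fake_fetch(sql):
    calls.append(sql)
    return [len(calls)]   # result changes on every real fetch


cache = StalenessAwareCache(fake_fetch, clock=lambda: now[0])
a = cache.query("SELECT count(*) FROM hits", max_staleness=60)  # miss, fetches
now[0] = 30.0
b = cache.query("SELECT count(*) FROM hits", max_staleness=60)  # 30s old, accepted
c = cache.query("SELECT count(*) FROM hits", max_staleness=10)  # too old, refetches
```

Each caller decides how much staleness it can tolerate, so a reporting
page and a checkout page can share the same cached entry with different
limits.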

Once a basic system like this is built, it opens the door to additional
features with even more performance benefits. Although I am getting
ahead of myself, imagine if the database/caching system kept certain SQL
queries "live." Say you have a huge sorted join that takes 20 seconds to
compute. Rather than setting up a system that recomputes the query
repeatedly, it would be possible for the system to incrementally update
the query result with each update of the underlying tables.
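
To make the "live" query idea concrete, here is a small sketch that
covers only the sorted part. It assumes rows are tuples whose first
element is the sort key and that the system is told about each insert
and delete. Incrementally maintaining a real join is considerably harder
and is not attempted here; LiveSortedQuery is a hypothetical name.

```python
import bisect


class LiveSortedQuery:
    """Keeps the result of a hypothetical 'SELECT ... ORDER BY key' current."""

    def __init__(self, rows):
        self.rows = sorted(rows)   # the expensive full computation, done once

    def on_insert(self, row):
        # Binary search for the position, then a single list insertion.
        bisect.insort(self.rows, row)

    def on_delete(self, row):
        i = bisect.bisect_left(self.rows, row)
        if i < len(self.rows) and self.rows[i] == row:
            del self.rows[i]


live = LiveSortedQuery([(30, "carol"), (10, "alice")])
live.on_insert((20, "bob"))      # result stays sorted without a full re-sort
live.on_delete((30, "carol"))
```

Readers of live.rows always see a sorted result, and each table update
costs one small adjustment instead of a full recomputation.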

Finally, this could be the beginning of a memory only database...

Thanks for reading.

Daniel

On Thu, 2008-06-19 at 12:48 -0400, Josef Finsel wrote:
>         Is there any reason why Memcached couldn't be baked into a
>         database
>         interface module?
> 
> This is the best place for a memcached layer. It is not, however, the
> best place to handle anything transactional. 
> 
> The key to understanding memcached is that it is a cache. Caching
> isn't used for transactions. Anytime you attempt to use cached data
> for a transaction you open up a potential can of worms that can lead
> to nasty, difficult to track down, bugs. After all, the database will
> always have the information stored in it, memcached may not because
> the memory in use by an item was required for another chunk of data.
> 
> If you need to cache data in the middle of multiple, on-going
> transactions for read only purposes by items not involved in the
> transaction, you can do that. It effectively allows clients to read
> data that may or may not be valid depending on whether the transaction
> finishes.
> 
> You can even write your data layer so that anytime it updates the
> database it can update memcached with the current data, though you may
> end up with lots of writes that aren't read. This adds complexity to
> your data layer but it will require you to really abstract your
> objects to make effective use of it. Depending on your application,
> this may reach a point where it's not really effective and it's easier
> to include a memcached library that sits between each object's data
> layer and the database.
> 
> Does that make sense?
> 
> Josef
> 
> "If you see a whole thing - it seems that it's always beautiful.
> Planets, lives... But up close a world's all dirt and rocks. And day
> to day, life's a hard job, you get tired, you lose the pattern."
> Ursula K. Le Guin 
>