<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=us-ascii">

<META content="MSHTML 6.00.2900.3059" name=GENERATOR></HEAD>

<BODY>

<DIV><SPAN class=756562212-22052007></SPAN><FONT face=Arial color=#0000ff 

size=2><SPAN class=756562212-22052007>Nathan Nobbe,</SPAN></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2>&lt;<SPAN 

class=756562212-22052007>&nbsp;&lt;<FONT face="Times New Roman" color=#000000 

size=3>what i dont understand is when the data profile of a particular table 

changes, and the variable in the memcache that represents that 

</FONT></SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007><FONT face="Times New Roman" color=#000000 

size=3>&lt;&lt; table becomes stale, how can the variable be updated 

efficiently?&nbsp;</FONT></SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT face="Times New Roman" 

color=#000000 size=3><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT><FONT><FONT face=Arial color=#0000ff size=2><SPAN 

class=756562212-22052007>This post is rather lengthy, I will answer your 

question directly first, then provide details below, so you don't have to read 

the whole thing unless you want to.&nbsp; </SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT><FONT><FONT face=Arial color=#0000ff size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT><FONT><FONT face=Arial color=#0000ff size=2><SPAN 

class=756562212-22052007>First, let me warn you not to use triggers for this, 

they will hammer your performance.&nbsp;&nbsp; Some database systems, like 

Oracle, write all changes to log files that can be read outside of running 

queries against&nbsp;the database.&nbsp;You can have an object in a separate 

thread in a background process which reads the log files.&nbsp;&nbsp;&nbsp;Or, 

you can put operating system level events on the log files and implement event 

handlers so that your cache managers are notified when the log file changes, 

this is how I do it for Oracle.&nbsp; The other option is to poll the database 

on some period using queries.&nbsp; Polling will only work if you take steps to 

cut down what has to be looked at, and if the polling query only gets the data 

which changed and nothing else.&nbsp; For this to be efficient you have to have 

a way to first get&nbsp;the primary keys of what changed, then&nbsp;run a second 

query to get the changes.&nbsp; The&nbsp;query which gets the primary keys has 

to be simple.&nbsp; </SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT><FONT><FONT face=Arial color=#0000ff size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT><FONT><FONT face=Arial color=#0000ff size=2><SPAN 

class=756562212-22052007>I describe what I think is required to write such an 

intelligent polling system below.&nbsp; I call this intelligent caching and 

cache forwarding.&nbsp; Please note, that even when using cache forwarding, 

there still must exist at the point where the cache touches the database a 

process which is examining the database (polling) looking for changes.&nbsp; The 

key is to make this polling mechanism very efficient.&nbsp; The cache managers 

have to run in separate threads in a background 

process.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007><FONT face=Arial color=#0000ff 

size=2></FONT></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>There are a couple of concepts that help with managing 

caches.&nbsp; The way I do it is I use the cache as a buffering mechanism so 

that application code never goes directly to the RDBMS, it always goes to the 

cache.&nbsp; I have cache managers which run in a separate thread in a 

background process that synch the caches wih the RDBMS and with other 

caches.&nbsp; The cache managers add an overhead to the entire system but the 

end user does not feel this overhead because the end user is never waiting on 

the RDBMS, the end user always goes directly to the 

cache.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Building a caching system which ensures that the data 

is only stale within acceptable limits is not a trivial task.&nbsp;&nbsp; It 

requires modeling the characteristics of the data at a level that no modeling 

tool that I know of supports.&nbsp; You then have to build processors that can 

move the data around depending on the characteristics, liveness being the most 

important characteristic.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>The characteristics which are important are:&nbsp; 

liveness, concurrency, distribution, notification, and mutability.&nbsp; 

Describing these characteristics for each data element makes it possible to 

build very efficient&nbsp;polling mechanism because the poller will not have to 

look at everytihng in the database and will not have to move everything 

everywhere.&nbsp; </SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Liveness describes how stale the data can be and still 

be acceptable.&nbsp; Liveness is expressed in milliseconds, A value of 0 means 

the data has to be realtime.&nbsp; By stale I mean that changes to the cache 

have not yet been committed&nbsp;to the underlying database, or changes to the 

underlying database have not yet been reflected in this cache.&nbsp;&nbsp; I 

call the process of reflecting changes in the database with the cache 

synchronization.&nbsp; Synchronization also describes updating&nbsp;one cache 

with the&nbsp;changes in another cache.&nbsp;&nbsp;I have seen very few data 

elements which had a liveness of 0 (meaning the data had to be 

realtime).</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT><FONT face=Arial><FONT 

color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Concurrency describes the likelihood that two or more 

users&nbsp;are updating this specific data at the same time.&nbsp; Concurrency 

applies to both the table level, the row level, and the column level.&nbsp; I 

added the column level after seeing many cases where for a given row only 

certain columns were every updated.&nbsp; Concurrency can be further described 

by listing the user roles which are likely to be concurrent with each other or 

for which users in that role are likely to be concurrant with each other.&nbsp; 

Let's say that you have a bank account and you are updating your home address 

online. The likelihood that another user is updating your home address is zero 

(well, it better be zero or the bank has a big problem). The likelihood that a 

bank employee is updating your home address is small, but not zero.&nbsp; And, I 

don't have to worry about all bank employees because the bank employees 

authorized to make&nbsp;this change will be playing certain roles.&nbsp; 

&nbsp;The likelihood that another person who is authorized to use your account 

is updating your home address is also small, but not zero.&nbsp; Knowing not 

just that concurrency is likely, but also which users are likely to be 

performing actions that are concurrent with each other opens up some interesting 

possibilities.&nbsp; If I have marked a table as being concurrent and then I 

define the roles which are concurrent, I can show the users which have these 

roles what all the users in their role are doing in the cases where there are 

potential concurrency conflicts (deadlocks).&nbsp; I can also design my 

applications so that changes made by one user can be broadcast to another user 

without needing to broadcast these changes to everyone.&nbsp; Back to the bank 

example, I only have to worry about concurrency if certain bank employees are 

online or if a user authorized to use your account is 

online.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Distribution describes the necessity of pushing the 

contents of one cache to caches in other tiers in the N-tier architecture.&nbsp; 

Distribution also describes the tiers, what caches or persistent stores live on 

each tier, and how the caches and persistent stores relate to each 

other.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Notification answers the question, "if something does 

change, when will a concurrent user care"?&nbsp; The answer is either while 

committing updates to &nbsp;data to the underlying database, or while viewing 

data.&nbsp; I have to use an example to explain this.&nbsp; Say you are on a web 

page where results of a query are being displayed.&nbsp; This web page is not 

realtime because at no point can you claim that what is on the web page is a 

100% reflection of what is in the database.&nbsp; Now, let's say that you update 

one row of data on the web page.&nbsp; You will care if that same row was 

updated previously by another user&nbsp;when commiting your changes.&nbsp; Given 

two users A and B, if A runs a query,&nbsp; then B comes along and updates one 

of the rows A is looking at and commits the changes, then A updates that same 

row, A will care about B's update at the time that A&nbsp;commits the A's 

changes.&nbsp; In this case what I do is display a page which says "...saving 

this data will overwrite changes made by B", I then show the original data, the 

changes made by B, and the changes made by A and ask A what to do, the options 

are - revert to original, keep Bs changes and discard As changes, or keep As 

changes and discard Bs changes.&nbsp;&nbsp; In this case A cared about Bs 

updates when A was committing updates to A's changes.&nbsp; A database level 

transaction does not help here because no programmer in their right mind would 

begin a transaction on a query because the user MIGHT do an update at some 

undetermined time in the future.&nbsp;&nbsp; You would be surprised how many 

people will say that starting a transaction when A executes the first query is 

the answer to this problem.

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>The second case, that a user cares about changes to the 

underlying data while viewing&nbsp;the data&nbsp;is far more problematic.&nbsp; 

Let's say that your job is to ship products that sales people are selling on the 

phone.&nbsp;&nbsp;A salesperson&nbsp;creates an order, you ship the order.&nbsp; 

In front of you is a screen showing all the orders that need to be 

shipped.&nbsp; Let's say further that this product is very time sensitive, not 

shipping a placed order is a big deal, let's say it will cost you your 

job.&nbsp; In this case you care if the Order table is updated at the time you 

are viewing the Order data.&nbsp; So, the screen you are looking at needs to be 

refreshed as updates to the underlying Order table occur.&nbsp; In this case the 

programmer has no choice but to use a push model.&nbsp; Ouch.&nbsp; I have seen 

this case in logistics management systems and order fulfillment systems.&nbsp; 

But, I think this is the exception rather than the 

rule.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>In all my years of building business applications I 

have encountered data which truly had to be live very rarely.&nbsp;&nbsp; 

Usually a lag time is acceptable.&nbsp; The length of the lag time depends on 

the business processes that the data is used 

in.</SPAN></FONT></FONT></FONT></DIV></SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Mutability describes the lifecycle of the data.&nbsp; 

Mutability applies to both tables and rows.&nbsp; Some tables once populated 

never change.&nbsp; Some tables are only changed by inserts, once inserted the 

row never changes.&nbsp; Some tables have both inserts and row updates.&nbsp; 

Mutability describes not just what kind of changes are made to tables and rows, 

but how often the changes occur and how likely the changes are to occur.&nbsp; 

Knowing this information is crucial in building an intelligent caching 

system.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Based on having the characteristics of the underlying 

database's tables fully defined it is quite easy to build an intelligent cache 

system based on an efficient polling 

mechanism.&nbsp;</SPAN></FONT></FONT></FONT></DIV>

<DIV></SPAN></FONT></FONT></FONT><FONT face=Arial><FONT color=#0000ff><FONT 

size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>I use a cache forwarding mechanism.&nbsp; I always have 

a cache at each tier in the N-tier architecture.&nbsp; There is one cache on the 

same machine where the RDBMS lives, this cache is always a copy of what is in 

the database.&nbsp; A cache manager runs queries in a separate thread in a 

background process looking for changes.&nbsp; The queries are always ran against 

a special timestamp table I created to find what changed, then if there are any 

changes, a second query is run to actually get the changes.&nbsp; This cache 

manager also updates the RDBMS with changes pushed to this cache.&nbsp;&nbsp; 

</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>There are also caches at the app server tier and the 

web server tier.&nbsp; For each cache,&nbsp; a cache manager runs in the 

background which ensures that all changes to the cache&nbsp;are pushed to all 

other caches.&nbsp;&nbsp; </SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>I also have a cache in the web browser (not memcached, 

just javascript arrays) which is populated with data that is related to the 

user.&nbsp; Once a user logs in, and I have the user ID I prepopulate the cache 

with all rows related to the user ID.&nbsp; For example, on a banking 

application I worked on, once the customer logged in I used the cuystomer's 

social security number to select all rows which were related either directly or 

indirectly to that social security number.&nbsp; This query returned all the 

customer info plus all the account info for that customer.&nbsp; The cache then 

fulfills queries without needing to go to the 

database.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Triggers are not a good idea because they will 

significantly degrade the performance of your database.&nbsp; You can help 

yourself a lot by the design of your database.&nbsp; Two rules that I always 

follow in designing a database -&nbsp;1) all tables have a sequential integer as 

a primary key, a number that doesn't mean anything to anyone but me.&nbsp; I 

usually call this column tablename_id; and, 2) I always have an intersection 

table named timestamp_table_row_user which has five columns - table name, 

primary key, username, timestamp, and timestamp_action.&nbsp; Timestamp_action 

can be one of update, insert, or delete.&nbsp;&nbsp; I also create an archive 

table for this table and periodically sweep everything over a certain age into 

the archive table.&nbsp; </SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Some people prefer to add a timestamp column to 

existing tables, I don't like this idea because you need multiple timestamp 

columns, one for insert, update, delete and you cannot re-create a change 

history because the update timestamp column gets overwritten everytime.&nbsp; 

Also, I don't like having to modify every table schema with what is housekeeping 

meta-data.&nbsp; Using an intersection table and custom drivers makes it 

possible to add this approach to existing databases and existing applications 

without changing them.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Using the timestamp table makes it very simple to 

determine what has changed in the database and more importantly how it 

changed.&nbsp; Instead of using triggers to modify the timestamp table I wrote 

my own jdbc drivers which take any SQL statement being executed and create a 

second SQL statement which makes an entry into the timestamp table to describe 

what the original SQL statement did.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Using sequential integers for primary keys helps with 

identifying new rows added to a table with a simpler mechanism than a 

timestamp.&nbsp; All&nbsp; you have to do is obtain max(primarykey), store it, 

then compare that to the next max(primarykey).&nbsp; Getting the new rows is 

simple, just select * where primarykey &gt; the first max(primarykey), or you 

can build a more specific query by using the IN clause, as in SELECT * FROM 

tablename WHERE primarykey IN (key1, key2, key3...).&nbsp; Or you can use 

BETWEEN and give the stored max(primarykey) + 1&nbsp;and the latest 

max(primarykey).</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Obtaining updated rows requires a 

timestamp.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>I hope this helps, feel free to ask any 

questions.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Steve Donelow</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>President</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007>Object Works, Inc.</SPAN></FONT></FONT></FONT></DIV>

<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN 

class=756562212-22052007></SPAN></FONT></FONT></FONT>&nbsp;</DIV><FONT 

face=Arial><FONT color=#0000ff><FONT size=2><SPAN class=756562212-22052007>

<DIV><BR>&nbsp;</SPAN></FONT></FONT></FONT><BR></DIV><FONT SIZE=3><BR>

<BR>

*************************************************************************<BR>

This communication, including attachments, is<BR>

for the exclusive use of addressee and may contain proprietary,<BR>

confidential and/or privileged information.  If you are not the intended<BR>

recipient, any use, copying, disclosure, dissemination or distribution is<BR>

strictly prohibited.  If you are not the intended recipient, please notify<BR>

the sender immediately by return e-mail, delete this communication and<BR>

destroy all copies.<BR>

*************************************************************************<BR>

</FONT>

</BODY></HTML>