Namespaces

Thu Jun 23 14:14:59 PDT 2005

Just my opinion, but I think you are overestimating the need to cache 
your tables in memcached.

2,000,000 is a lot of entries, but from the way you describe your data, 
the keys are simple and each row doesn't hold much info. If that's the 
case, 2 million rows of data that isn't written often would easily be 
cached twice, by the database buffer and by the OS buffer cache. How 
large, in MB, are the tables themselves? I estimate they are 
relatively small and would easily fit into the cache buffers.
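
A rough back-of-envelope version of that estimate, as a sketch only: the 
row counts come from the thread (100 tables of 20,000 rows), but 
bytes_per_row is an assumed figure, not something measured on the real 
tables.

```python
# Back-of-envelope table size estimate.
tables = 100            # from the thread
rows_per_table = 20_000  # from the thread
bytes_per_row = 64       # ASSUMPTION: short zipcode key, small payload, index overhead

total_rows = tables * rows_per_table
total_mb = total_rows * bytes_per_row / (1024 * 1024)

print(f"{total_rows} rows, roughly {total_mb:.0f} MB at {bytes_per_row} bytes/row")
```

If the real per-row cost is several times higher, the total is still in 
the hundreds of MB, so measuring the actual table sizes is the useful 
next step.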

How many writes per second are the tables sustaining?

I would actually recommend rethinking the SQL schema so that you no 
longer need to check 100 tables individually to find out which ones 
contain a zipcode. That per-table lookup is what makes it slow, and it 
is what makes you think memcached is the right tool.

1) Merge the tables, add a version field. PK becomes zip + version. This 
is the fastest and cleanest method.

or

2) Reverse map. Introduce a zip_to_table table, either with a 
one-to-many relationship or with the list of tables stored as a string 
in one field. Again, if the zip tables are not added to or written 
often, and by often I mean many times per second, then this is 
extremely fast and you only have to change the mapping once, during 
inserts/writes.
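
To make the two options concrete, here is a minimal sketch of option 1 
using Python's sqlite3 module with an in-memory database. It assumes the 
"version" field identifies which of the original 100 tables a row came 
from; the table name zip_merged, the helper tables_for_zip, and the 
sample rows are all made up for illustration. Option 2 with a 
one-to-many zip_to_table would have the same shape, with a table-name 
column in place of version.

```python
import sqlite3

def tables_for_zip(conn, zipcode):
    """Return the versions (original tables) that contain zipcode, in one query."""
    rows = conn.execute(
        "SELECT version FROM zip_merged WHERE zip = ? ORDER BY version",
        (zipcode,),
    )
    return [version for (version,) in rows]

conn = sqlite3.connect(":memory:")

# Merged table: PK is zip + version, so one indexed lookup replaces 100 queries.
conn.execute("""
    CREATE TABLE zip_merged (
        zip     TEXT    NOT NULL,
        version INTEGER NOT NULL,  -- ASSUMPTION: identifies the original table
        PRIMARY KEY (zip, version)
    )
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO zip_merged (zip, version) VALUES (?, ?)",
    [("94110", 1), ("94110", 7), ("60601", 2)],
)

print(tables_for_zip(conn, "94110"))
print(tables_for_zip(conn, "00000"))
```

Because zip is the leading column of the primary key, the lookup is a 
single index range scan no matter how many original tables were merged.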

Eamon Daly wrote:
> I didn't really provide information about my particular
> application, so let me outline that and perhaps someone will
> have a workaround: I have 100 tables, each containing 20,000
> rows, with zipcode as the PK. When a zipcode comes in, our
> application checks each table and reports which tables
> contain that zipcode.
> 
> With namespaces, the table name is the namespace name, each
> zipcode requested is a key, and the response is the value.
> Cache maintenance is simple: I delete the namespace when the
> underlying table is changed.
> 
> Your suggestion is a good one, but unless I store a complex
> object in the value:
> 
> 94110 -> { 'table_1' => 'Y', 'table_2' => 'N' ... }
> 
> it breaks down in my particular case. Also, I'd still have
> to ask the database for the table version on every request,
> or maintain a separate cache of table updates and query
> against that. That's certainly cheaper than going to the
> database, but still a lot of activity.
> 
> I appreciate the response! Any other ideas?
> 
> ____________________________________________________________
> Eamon Daly
> 
> 
> 
> ----- Original Message ----- From: "Brad Fitzpatrick" <brad at danga.com>
> To: "Eamon Daly" <edaly at nextwavemedia.com>
> Cc: <memcached at lists.danga.com>
> Sent: Thursday, June 23, 2005 10:50 AM
> Subject: Re: Namespaces
> 
> 
>> You can just cache a table version number in each row's data.  So
>>
>> 94110 -> "1/San Francisco, CA"
>>
>> And then you fetch 94110, split the version/city apart, see if the
>> version matches your table version.
>>
>>
>> On Thu, 23 Jun 2005, Eamon Daly wrote:
>>
>>> Any feel for when namespaces and clearing by namespace might
>>> be implemented? We want to cache some really large tables of
>>> zipcodes, but it doesn't make sense to cache entire tables
>>> when only 5-10% of the rows are ever hit. We'd like to cache
>>> rows as they're requested, but then there's no way to delete
>>> all of them when the underlying table is updated.
>>>
>>> ____________________________________________________________
>>> Eamon Daly