Caching location based data?
Jan Miczaika
jan at hitflip.de
Thu Aug 2 11:59:23 UTC 2007
Probably not, right now. There is a discussion going on here regarding
two options which would help you: deletes using a regex (unlikely to
help you because of the md5 hashes) and the option of tagging keys. I
favor the last, but the implementation will probably be after the binary
protocol is implemented.
From: John Kramlich <me at johnkramlich.com>
Subject: Caching location based data?
To: memcached at lists.danga.com
Message-ID: <46B19DE3.6040702 at johnkramlich.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Hello List,
I am a developer and founder of http://www.gigoit.org. We are a
nonprofit organization that helps people give and get free stuff in
their area.
Our site is custom written PHP and MySQL. We use memcache extensively to
cache many of our SQL queries. I’m wondering if anyone can think of a
more elegant solution then the one we have devised.
Currently, our database stores a postal code and country code for all of
our posts. We have a database table of the latitude and longitude
coordinates for these postal codes. A stored procedure on our server is
called when a user wants to see posts within a given radius of their
location. The URL below would show you everything within 50 miles of the
US postal code 23602 and would call the following stored procedure to
get its result set.
http://www.gigoit.org/get.php?c=US&p=23602&r=50
<http://www.gigoit.org/get.php?c=US&p=23602&r=50>
would call:
CALL get_offers(theCountry,thePostalCode,theRadius)
We allow our users to select from a few different radius options; 1, 5,
10, 25, and 50. Any radius that is not in that list is not cached. Any
radius that is in that list is cached indefinitely. I cache an MD5 hash
of the SQL query as the key and the serialized resulting array as the
value. When the same SQL query is called we first check memcache to see
if that key exists and if so pull the data from the cache instead of the
database. (I process the data further otherwise I would just stored the
resulting HTML)
When a new offer is posted or its status changes I run a function to
remove the potentially stale data. I first look at the postal code and
country that the offer is in. Then, I query the database to return a
list of all postal codes within a 50mile radius. This can sometimes be
hundreds of postal codes if you are located in a very dense area. Once I
have that list I loop through it and delete the keys of every possible
cached stored procedure call like this:
delete(MD5("CALL get_offers(theCountry,thePostalCode,50)"));
delete(MD5("CALL get_offers(theCountry,thePostalCode,25)"));
delete(MD5("CALL get_offers(theCountry,thePostalCode,10)"));
delete(MD5("CALL get_offers(theCountry,thePostalCode,5)"));
delete(MD5("CALL get_offers(theCountry,thePostalCode,1)"));
This means that if I have 100 postal codes I have to perform 500 deletes
for keys that may not even exist. It just seems like there’s got to be a
better way.
Any ideas?
More information about the memcached
mailing list