Caching location based data?

John Kramlich me at johnkramlich.com
Thu Aug 2 09:03:31 UTC 2007


Hello List,

I am a developer and founder of http://www.gigoit.org. We are a 
nonprofit organization that helps people give and get free stuff in 
their area.

Our site is custom written PHP and MySQL. We use memcache extensively to 
cache many of our SQL queries. I’m wondering if anyone can think of a 
more elegant solution then the one we have devised.

Currently, our database stores a postal code and country code for all of 
our posts. We have a database table of the latitude and longitude 
coordinates for these postal codes. A stored procedure on our server is 
called when a user wants to see posts within a given radius of their 
location. The URL below would show you everything within 50 miles of the 
US postal code 23602 and would call the following stored procedure to 
get its result set.

http://www.gigoit.org/get.php?c=US&p=23602&r=50 
<http://www.gigoit.org/get.php?c=US&p=23602&r=50>

would call:

CALL get_offers(theCountry,thePostalCode,theRadius)

We allow our users to select from a few different radius options; 1, 5, 
10, 25, and 50. Any radius that is not in that list is not cached. Any 
radius that is in that list is cached indefinitely. I cache an MD5 hash 
of the SQL query as the key and the serialized resulting array as the 
value. When the same SQL query is called we first check memcache to see 
if that key exists and if so pull the data from the cache instead of the 
database. (I process the data further otherwise I would just stored the 
resulting HTML)

When a new offer is posted or its status changes I run a function to 
remove the potentially stale data. I first look at the postal code and 
country that the offer is in. Then, I query the database to return a 
list of all postal codes within a 50mile radius. This can sometimes be 
hundreds of postal codes if you are located in a very dense area. Once I 
have that list I loop through it and delete the keys of every possible 
cached stored procedure call like this:

delete(MD5("CALL get_offers(theCountry,thePostalCode,50)"));
delete(MD5("CALL get_offers(theCountry,thePostalCode,25)"));
delete(MD5("CALL get_offers(theCountry,thePostalCode,10)"));
delete(MD5("CALL get_offers(theCountry,thePostalCode,5)"));
delete(MD5("CALL get_offers(theCountry,thePostalCode,1)"));

This means that if I have 100 postal codes I have to perform 500 deletes 
for keys that may not even exist. It just seems like there’s got to be a 
better way.

Any ideas?



More information about the memcached mailing list