Multiget/intelligent generic PHP wrapper function... thoughts/advice wanted

Dustin Sallings dustin at spy.net
Fri Nov 2 07:34:46 UTC 2007


On Oct 31, 2007, at 5:40 PM, mike wrote:

> Yes, but with key prefixes not matching the database keys, I have to
> change the key indexes in PHP back and forth to suit the needs. Is
> there a better way? Not to mention I have to assign the array key
> to be the whole key name, or the array_diff_key/related functions
> can't do a diff properly.
>
> This seems horribly redundant:
> $cache_keys[$prefix.$key] = $prefix.$key;
>
> the memcache_get wants the parameters in the array VALUES, where
> array_diff_key() wants array KEY and memcache_get returns the key name
> in the KEY as well. Aligning them requires reiteration - a total of 3
> iterations in my current code. I am quite sure I can cut it down to 2
> somehow. Possibly also do more optimal looping...


	I'll admit I don't know a lot of PHP, but I'd imagine a function that
looks something like this (I typed this Python in my mail client, so
I don't know that it actually works):

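def get_cached(keys, cache_miss_func, timeout=300):
	# "memcache" is assumed to be a client object with get_multi() and set().
	# One multi-GET returns a dict holding only the keys that were cached.
	found=memcache.get_multi(keys)
	missing=[k for k in keys if k not in found]
	if missing:
		# One call fetches every miss, then each result is written back.
		found_in_db=cache_miss_func(missing)
		for k,v in found_in_db.items():
			memcache.set(k, v, timeout)
		found.update(found_in_db)
	return found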
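
	To try the idea without a memcached server, here is a minimal sketch
that swaps in a dict-backed stand-in for the client. DictCache and its
get_multi()/set() methods are my own hypothetical names, not a real
library, and the cache is passed in as an argument rather than being a
global:

```python
class DictCache:
    """Hypothetical in-memory stand-in for a memcached client."""

    def __init__(self):
        self.data = {}

    def get_multi(self, keys):
        # Return only the keys that are present, like a multi-GET would.
        return {k: self.data[k] for k in keys if k in self.data}

    def set(self, key, value, timeout=0):
        # The timeout is accepted but ignored in this stand-in.
        self.data[key] = value


def get_cached(cache, keys, cache_miss_func, timeout=300):
    found = cache.get_multi(keys)
    missing = [k for k in keys if k not in found]
    if missing:
        # Fetch all misses in one call, then write each one back.
        found_in_db = cache_miss_func(missing)
        for k, v in found_in_db.items():
            cache.set(k, v, timeout)
        found.update(found_in_db)
    return found


cache = DictCache()
cache.set("a", 1)
# The lambda plays the role of the database lookup for missing keys.
result = get_cached(cache, ["a", "b"], lambda keys: {k: k.upper() for k in keys})
print(result)
```

	After the call, "b" has been written back into the stand-in cache, so
a second call with the same keys would not invoke the miss function.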

	Using something like the above, you can just pass in the function  
that gets the data from the DB itself.  For example (even more pseudo  
pseudo-code):

def get_from_db(keys):
	# One "?" placeholder per key; the outer parentheses allow the line break.
	query=("select * from something where id in (" +
		', '.join(['?' for x in keys]) + ")")
	# Assuming a DB cursor is coming from somewhere.
	cursor.execute(query, keys)
	cached_objects=[make_object(row) for row in cursor.fetchall()]
	return dict([(o.id, o) for o in cached_objects])
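
	For a version of the DB lookup that can actually be run, here is a
sketch using Python's built-in sqlite3 module with an in-memory
database. The "name" column and the sample rows are assumptions made up
for illustration, and plain dicts stand in for make_object():

```python
import sqlite3

# In-memory database with a few sample rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("create table something (id integer primary key, name text)")
conn.executemany(
    "insert into something (id, name) values (?, ?)",
    [(1, "one"), (2, "two"), (4, "four")],
)


def get_from_db(keys):
    # Nothing to look up, so skip the query entirely.
    if not keys:
        return {}
    # One "?" placeholder per key keeps the query parameterized.
    placeholders = ", ".join("?" for _ in keys)
    query = "select id, name from something where id in (" + placeholders + ")"
    rows = conn.execute(query, list(keys)).fetchall()
    # Key the result by id so it can be merged with the cache hits.
    return {row_id: {"id": row_id, "name": name} for row_id, name in rows}


rows = get_from_db([2, 4, 99])
print(rows)
```

	Ids that are not in the table (99 here) are simply absent from the
returned dict, so they are never written to the cache.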


	With something like that, you could have an efficient caching  
interface that you can use like this:

	objs=get_cached([1, 2, 3, 4, 5], get_from_db)

	On a given call, say 1, 3, and 5 are cached.  Those will be returned
from the cache, and then get_from_db([2, 4]) will be called, and the
results of that will be populated into the cache and the result of all
five will be returned.  In the worst case of this scenario (both of
the missing records exist in the database), that'd be:

	1) One multi-GET call.
	2) One SQL query for the misses.
	3) Two memcached sets for the missing records.
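
	The three steps above can be checked with a small sketch that counts
operations. It also takes a shot at the prefix problem from the quoted
message: the prefixed cache keys are built once in a dict that maps
them back to the database ids, so the caller only ever deals with ids.
CountingCache, the "obj:" prefix, and the fake get_from_db are all
hypothetical names of mine, not part of any real client:

```python
class CountingCache:
    """Hypothetical stand-in for a memcached client that counts calls."""

    def __init__(self, data=None):
        self.data = dict(data or {})
        self.multi_gets = 0
        self.sets = 0

    def get_multi(self, keys):
        self.multi_gets += 1
        return {k: self.data[k] for k in keys if k in self.data}

    def set(self, key, value, timeout=0):
        self.sets += 1
        self.data[key] = value


def get_cached(cache, ids, cache_miss_func, prefix="", timeout=300):
    # Build the prefixed cache keys once, remembering which id each maps to.
    key_to_id = {prefix + str(i): i for i in ids}
    hits = cache.get_multi(list(key_to_id))
    # Re-key the hits by database id so callers never see the prefix.
    found = {key_to_id[k]: v for k, v in hits.items()}
    missing = [i for i in ids if i not in found]
    if missing:
        found_in_db = cache_miss_func(missing)
        for i, v in found_in_db.items():
            cache.set(prefix + str(i), v, timeout)
        found.update(found_in_db)
    return found


db_calls = []


def get_from_db(ids):
    # Fake database lookup that records the ids it was asked for.
    db_calls.append(list(ids))
    return {i: "row %d" % i for i in ids}


cache = CountingCache({"obj:1": "row 1", "obj:3": "row 3", "obj:5": "row 5"})
objs = get_cached(cache, [1, 2, 3, 4, 5], get_from_db, prefix="obj:")
print(cache.multi_gets, db_calls, cache.sets)
```

	With 1, 3, and 5 preloaded, the counters come out to one multi-GET,
one database call for [2, 4], and two sets, matching the list above.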

-- 
Dustin Sallings




