Altering queries?

Clint Webb webb.clint at gmail.com
Fri Sep 21 06:29:45 UTC 2007


There is one thing computers do really well, and that is iterating through
lists.
I would suggest that getting the list from memcache and then processing it
is not going to be any slower than getting a list of results from a heavy
database query... and... iterating through that.  Especially if you use
get_multi instead of thousands of individual gets (that takes a little
extra programming to detect an element that wasn't in the cache, but you
might get better overall response times).
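
As a rough sketch of that "little extra programming": the idea is to ask
for all keys in one get_multi call, see which ones did not come back, and
load only those from the database.  DictCache below is a hypothetical
in-memory stand-in so the example runs without a server; fetch_many and
load_from_db are names made up for illustration.  A real client's
get_multi/set_multi may take extra arguments.

```python
class DictCache:
    """Hypothetical in-memory stand-in for a memcached client."""

    def __init__(self):
        self._data = {}

    def get_multi(self, keys):
        # Return only the keys that are present; misses are simply absent.
        return {k: self._data[k] for k in keys if k in self._data}

    def set_multi(self, mapping):
        self._data.update(mapping)


def fetch_many(cache, keys, load_from_db):
    """Fetch keys in one round trip, loading only the misses from the DB."""
    found = cache.get_multi(keys)
    # Anything we asked for that did not come back is a cache miss.
    missing = [k for k in keys if k not in found]
    if missing:
        loaded = load_from_db(missing)   # assumed to return {key: value}
        cache.set_multi(loaded)          # repopulate for the next request
        found.update(loaded)
    return found
```

The caller hands in whatever function does the (batched) database lookup,
so the cache logic stays separate from the query itself.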

Of course, it's always useful to then store your PROCESSED data in
memcache.  So after you've iterated through your list and formatted or
ordered the data in a way that is useful for you, cache it.  Then next
time, use the cached version and you don't have to iterate it again.  If
your data changes frequently, figure out how often it changes and use that
interval as your expiry time.
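
A minimal sketch of caching the processed result with an expiry, under
some assumptions: ExpiringCache is a hypothetical in-memory stand-in for a
memcached client (a real client handles expiry on the server), and
get_processed, build and ttl are illustrative names, not part of any
library.

```python
import time


class ExpiringCache:
    """Hypothetical in-memory stand-in for a cache with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock  # injectable so expiry can be exercised in tests

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # entry is stale, treat it as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


def get_processed(cache, key, build, ttl):
    """Return the cached processed value, rebuilding it only on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = build()             # the expensive iterate/format/sort step
    cache.set(key, result, ttl)  # ttl: roughly how often the data changes
    return result
```

Note that this treats a cached None as a miss, so build should return
something other than None.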

On 9/21/07, K J <sanbat at gmail.com> wrote:
>
> > I just did some hand-wavy math on that:
> >
> >
> > >>> import random
> > >>> r=random.Random()
> > >>> import sets
> > >>> s=sets.Set([r.randint(1, 10000000) for x in range(30000)])
> > >>> len(s)
> > 29950
> > >>> len(','.join([str(i) for i in s]))
> > 236301
> > >>> import zlib
> > >>> compressed=zlib.compress(','.join([str(i) for i in s]), 9)
> > >>> len(compressed)
> > 109568
> >
> >
> > So, a user with about 30k friends with numeric IDs fairly evenly
> > distributed from 1 to 10,000,000 (my average ID was 4,979,961, min was 236,
> > max was 9,999,931) would take a little over 100k of cache when comma
> > separated and compressed.
> >
> >
> > I would expect the normal case to be much smaller.
> >
>
> Great analysis!
>
> Thinking further on this issue though, gives me more questions...
>
> Storing a list of a user's friends in a Memcache array, then getting them
> back in and searching through them... while it avoids the database
> altogether, wouldn't it be sorta slow?  For instance, every time a user
> visits user B's profile page, the system would need to iterate through this
> array and see if they are connected.
>
> In this case would it just be better to have a MySQL HEAP table and
> query that instead?
>
>



-- 
"Be excellent to each other"