Altering queries?

Clint Webb webb.clint at gmail.com
Fri Sep 21 06:30:41 UTC 2007


I wish gmail would reply to the list instead of the person :(

On 9/21/07, Clint Webb <webb.clint at gmail.com> wrote:
>
> There is one thing computers do really well, and that is iterating through
> lists.  I would suggest that getting the list from memcache and then
> processing it is not going to be any slower than getting a list of results
> from a heavy database query and then iterating through that.  Especially if
> you use get_multi instead of thousands of individual gets (that takes a
> little extra programming to detect the elements that weren't in the cache,
> but you should get better over-all response times).
>
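> A rough sketch of the get_multi pattern I mean, using the python-memcached
> client (friend_ids, the key scheme, and load_from_db are made-up
> placeholders, not anything specific):
>
>     import memcache
>     mc = memcache.Client(['127.0.0.1:11211'])
>
>     keys = ['friend:%d' % fid for fid in friend_ids]  # friend_ids: placeholder
>     cached = mc.get_multi(keys)        # one round trip; returns only the hits
>     rows = []
>     for key in keys:
>         if key in cached:
>             rows.append(cached[key])   # served straight from the cache
>         else:
>             row = load_from_db(key)    # a miss: fall back to the database
>             mc.set(key, row)           # repopulate the cache for next time
>             rows.append(row)
>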
> Of course, it's always useful to then store your PROCESSED data in
> memcache.  So after you've iterated through your list and formatted or
> ordered the data in whatever form is useful to you, cache that too.  Then
> next time, use the cached version and you don't have to iterate it again.
> If your data changes frequently, figure out how often it changes and use
> that interval as your expiry time.
>
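> As a sketch of that second point, using the same mc client as above
> (build_profile_page and the ten-minute expiry are assumptions to
> illustrate the idea):
>
>     key = 'profile_page:%d' % user_id
>     page = mc.get(key)
>     if page is None:
>         page = build_profile_page(user_id)  # the expensive iterate/format step
>         mc.set(key, page, time=600)         # expire at roughly the rate the data changes
>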
> On 9/21/07, K J <sanbat at gmail.com> wrote:
> >
> >   I just did some hand-wavy math on that:
> > >
> > >
> > >  >>> import random
> > > >>> r=random.Random()
> > > >>> import sets
> > > >>> s=sets.Set([r.randint(1, 10000000) for x in range(30000)])
> > > >>> len(s)
> > > 29950
> > > >>> len(','.join([str(i) for i in s]))
> > > 236301
> > > >>> import zlib
> > >  >>> compressed=zlib.compress(','.join([str(i) for i in s]), 9)
> > > >>> len(compressed)
> > > 109568
> > >
> > >
> > > So, a user with about 30k friends with numeric IDs fairly evenly
> > > distributed from 1 to 10,000,000 (my average ID was 4,979,961, min was
> > > 236, max was 9,999,931) would take a little over 100k of cache when
> > > comma separated and compressed.
> > >
> > >
> > > I would expect the normal case to be much smaller.
> > >
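> > > Reading it back is just the reverse; a sketch continuing the session
> > > above:
> > >
> > > >>> s2 = sets.Set([int(i) for i in zlib.decompress(compressed).split(',')])
> > > >>> s2 == s
> > > True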
> >
> > Great analysis!
> >
> > Thinking further on this issue, though, gives me more questions...
> >
> > Storing the list of a user's friends in memcache, then fetching it back
> > and searching through it... while it avoids the database altogether,
> > wouldn't it be sort of slow?  For instance, every time a user visits user
> > B's profile page, the system would need to iterate through this list to
> > see if they are connected (roughly the lookup sketched below).
> >
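> > Roughly what I'm picturing (a made-up sketch; the key name, the mc
> > client object, and the user IDs are just placeholders):
> >
> >     import memcache
> >     mc = memcache.Client(['127.0.0.1:11211'])
> >
> >     friends = mc.get('friends:%d' % user_a_id)   # cached list of friend IDs
> >     connected = friends is not None and user_b_id in friends   # linear scan
> >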
> > In this case, would it be better to just have a HEAP (in-memory) MySQL
> > table and query that instead?
> >




-- 
"Be excellent to each other"