Altering queries?

Clint Webb webb.clint at gmail.com
Fri Sep 21 06:59:52 UTC 2007


Only you can answer that definitively, but I would guess that it's better
to get the lot.  It depends on how often your data changes.

On my site, people see the first 15 entries, but I put the first 100 in one
cache key, and the first 500 in a second cache key if needed.  I get the
first 15 out of the hundred, and if they want more, I iterate through it
until I need more than 100.  On the rare occasion that anyone gets past the
500 mark, I just go straight to the database and then add the results back
to the cache.

I've split it up into 100 and 500 because most people only ever look at
fewer than the first 100 entries.  If they do manage to look past the
first 100, then I have the first 500 cached in another key.  Keep in mind,
this is not the first 100 plus the next 500 for a total of 600 entries;
the first 100 are duplicated in the 500 list.  The 500-entry list is
generated only the first time it is needed, and the exact same routine
also creates the 1000-entry key if that is ever needed, and so on.  There
is no built-in limit; it could end up creating a key for a 20000-entry
list for all I know.
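
If it helps, here's that logic sketched in Python with the
python-memcached client.  The key names, the tier growth, the expiry,
and fetch_entries() are all made up, so treat it as a rough outline
rather than my actual code:

    import memcache

    mc = memcache.Client(['127.0.0.1:11211'])
    PAGE_SIZE = 15

    def fetch_entries(limit):
        # stand-in for the real database query,
        # e.g. SELECT ... ORDER BY created DESC LIMIT <limit>
        return [{'id': i} for i in range(limit)]

    def get_page(page):
        # serve one page out of the smallest cached tier that covers it
        start = page * PAGE_SIZE
        end = start + PAGE_SIZE
        tier = 100
        while tier < end:
            tier = tier * 5 if tier == 100 else tier * 2  # 100, 500, 1000, ...
        key = 'entries:first%d' % tier
        entries = mc.get(key)
        if entries is None:
            # first time this tier is needed: go to the database, build
            # the tier, and cache it.  The same routine builds the
            # 1000-entry key, and so on, with no built-in limit.
            entries = fetch_entries(tier)
            mc.set(key, entries, time=300)
        return entries[start:end]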

Every situation is different.  I suggest you build some test cases, test
them under various conditions, and see what works for you.  There are some
parts of my site that don't use memcache at all and simply go to the
database directly every time, but I did it that way because for those
particular problems a cached solution would be clunky, and memcache just
didn't fit well.  Apart from those special cases, though, I cache almost
everything.  I cache the little bits of data (such as a key for each IP
address that hits the site; I increment a counter on each hit and give it
an expiry), all the small elements of data, all the bigger elements made
up of the smaller elements, all the rendered XML, and some of the rendered
HTML.  My database is mostly idle :)
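
For what it's worth, that per-IP counter is just the usual add-then-incr
idiom: add() only creates the key (and sets the expiry) if it doesn't
already exist, and incr() bumps it atomically.  A quick sketch, with a
made-up key format and window:

    import memcache

    mc = memcache.Client(['127.0.0.1:11211'])

    def count_hit(ip, window=3600):
        key = 'hits:%s' % ip
        mc.add(key, 0, time=window)  # no-op if the counter already exists
        return mc.incr(key)          # returns the new count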



On 9/21/07, K J <sanbat at gmail.com> wrote:
>
> > There is one thing computers do really well, and that is iterating
> > through lists.
> > I would suggest that getting the list from memcache and then processing
> > it is not going to be any slower than getting a list of results from a
> > heavy database query and iterating through that, especially if you use
> > get_multi instead of thousands of individual gets (that takes a little
> > extra programming to detect an element that wasn't in the cache, but you
> > might get better over-all response times).
> >
> > Of course, it's always useful to then store your PROCESSED data in
> > memcache.  So after you've iterated through your list and formatted or
> > ordered the data in some way that is useful to you, cache it.  Then next
> > time, use the cached version and you don't have to iterate it again.  If
> > your data changes frequently, figure out the delta and use it as your
> > expiry time.
>
>
> I suppose the main problem is this... If I wanted to store the entire
> list, I would have to fetch the entire dataset from the DB, whereas if I
> were doing it via SQL queries, I would use paging.
>
> Does this mean that, the first time a user logs in and interacts with
> this list, I would fetch the entire set instead of, say, just page 1, and
> then use the entire set for paging and other organizing?
>
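
To expand on the get_multi point I made above: get_multi returns a dict
containing only the keys it actually found, so anything missing from the
dict is a cache miss that you backfill from the database.  Roughly, in
Python (fetch_items() is a made-up stand-in for the real query):

    import memcache

    mc = memcache.Client(['127.0.0.1:11211'])

    def fetch_items(ids):
        # stand-in for the real query, e.g. SELECT ... WHERE id IN (...)
        return dict((i, {'id': i}) for i in ids)

    def get_items(ids):
        keys = dict(('item:%d' % i, i) for i in ids)
        cached = mc.get_multi(list(keys))  # one round trip, hits only
        missing = [i for k, i in keys.items() if k not in cached]
        if missing:
            # backfill the misses and re-cache them for next time
            for i, row in fetch_items(missing).items():
                cached['item:%d' % i] = row
                mc.set('item:%d' % i, row, time=300)
        return [cached['item:%d' % i] for i in ids]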



-- 
"Be excellent to each other"