Altering queries?
Clint Webb
webb.clint at gmail.com
Fri Sep 21 08:52:35 UTC 2007
Oh, I'm so bad. I forgot to mention that I cheat a bit in the example I
gave. It doesn't demonstrate the multi_get, which you should use in a
situation like this. I don't use it this time because my GetArticle()
function is able to cheat a bit: the piece of code that runs before this
had already pulled down all the article information it needed, so a lot of
it is already in a local hash. So it doesn't need to ask memcache for
maybe 90% of the articles that will end up in the XML in this example.
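
For what it's worth, here is a rough sketch of what the multi_get version
could look like. It assumes the Perl Cache::Memcached client (whose method
is get_multi) and the same "Article:aid=NNN" key naming as in the example
below; GetArticleFromDB() is just a hypothetical stand-in for the database
fall-through:

    my @keys  = map { "Article:aid=$_" } @top;    # @top holds the article IDs
    my $found = $cache->get_multi(@keys);         # one round trip instead of one get() per article
    foreach my $id (@top) {
        my $article = $found->{"Article:aid=$id"}
                      || GetArticleFromDB($id);   # cache miss, so hit the database (stub name)
        # ... build the XML for $article exactly as in the example below ...
    }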
Sorry for any confusion there... especially since I started off talking
about my solution for caching paged data, and then I gave you some example
code for something completely different. Oh well. You get what you pay
for, eh?
On 9/21/07, Clint Webb <webb.clint at gmail.com> wrote:
>
> Well, not really arrays in memory; lists in memcache. It works well
> because 99% of requests will always be within the first 100. That is why
> it exists, as opposed to just having a 500, 1000, 1500, etc.
>
> But let's say a user has browsed up to the 800th entry. When getting the
> key from memcache, it only gets the one (this one is called
> "Articles:top:limit=1000"), which contains the IDs of all the current
> articles in the order they would be displayed. I skip the first 800
> entries in that list and display the next 15. The actual content of each
> article is in another cache entry ("Article:aid=800", for example).
> Depending on how I was accessing this information, I would produce either
> XML or HTML and would generate a key called "xml:Article-list:800-814" or
> "html:Article-list:800-814".
>
> This might not be the most optimal solution, but it has worked well
> enough so far that I haven't gone back to look at doing any optimisations.
>
> One optimisation I did think of doing: the list of IDs in
> "Articles:top:limit=1000" is just a comma-delimited string. I was
> planning on either storing the IDs with a fixed-width spacing, so that I
> could determine where in the string the 800th entry will be and just skip
> to that point, or storing a serialized array instead. But this piece of
> code has been quick enough so far that I haven't bothered.
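>
> (If I ever did the fixed-width version, it would presumably come down to
> something like this sketch, untested, and assuming 10-character padded
> IDs:)
>
>     my $width = 10;                                        # assumed fixed field width per ID
>     my $list  = $cache->get("Articles:top:limit=1000");
>     my $chunk = substr($list, 800 * $width, 15 * $width);  # jump straight to entries 800..814
>     my @ids   = map { 0 + $_ } unpack("(A$width)*", $chunk);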
>
> As a sample for you: in Perl, if you have a comma-delimited string, it's
> fairly easy to drop it into an array. In fact, here's a piece of code that
> accesses part of it. The GetTopArticles(), GetArticle(),
> GetUsernameByUserID() and GetArticleScore() functions try to get the data
> from the cache and, failing that, get it from the database. This example
> doesn't illustrate the paging because this is for the public API, which
> doesn't do paging.
>
> &ConnectCache;
> my ($xml) = $cache->get("xml:Top:$limit");
> unless ($xml) {
>     SiteLog("api:Articles - XML Top:$limit not found in cache. Adding.");
>
>     $xml = "<root>\n";
>     $xml .= "\t<result>1</result>\n";
>
>     my ($list) = GetTopArticles($limit);
>     my (@top) = split(/,/, $list);
>     foreach $id (@top) {
>         my ($article) = GetArticle($id);
>         unless ($article) {
>             SiteLog("api.cgi: Wasn't able to get article details for aid=$id");
>         }
>         else {
>             my ($d_userID)   = 0 + $article->{'userID'};
>             my ($d_username) = GetUsernameByUserID($d_userID);
>             my ($v_score)    = GetArticleScore($id);
>
>             $xml .= "\t<article>\n";
>             $xml .= "\t\t<id>$id</id>\n";
>             $xml .= "\t\t<uid>$d_userID</uid>\n";
>             $xml .= "\t\t<username>" . xml_quote($d_username) . "</username>\n";
>             $xml .= "\t\t<title>" . xml_quote($article->{'title'}) . "</title>\n";
>             $xml .= "\t\t<content>" . xml_quote($article->{'content'}) . "</content>\n";
>             $xml .= "\t\t<url>" . xml_quote($article->{'url'}) . "</url>\n";
>             $xml .= "\t\t<score>$v_score</score>\n";
>             $xml .= "\t</article>\n";
>         }
>     }
>
>     $xml .= "</root>\n";
>     $cache->add("xml:Top:$limit", "$xml", 600);
> }
> &cgiXML;
> print "$xml";
>
>
>
>
> On 9/21/07, K J <sanbat at gmail.com> wrote:
> >
> > > Only you could answer that definitively, but I would guess that it
> > > would be better to get the lot. It depends how often your data changes.
> > >
> > > On my site, people see the first 15 entries, but I put the first 100
> > > in one cache key, and the first 500 in a second cache key if needed. I
> > > get the first 15 out of the hundred, and if they want more, I iterate
> > > through it until I need more than 100. On the rare occasion that
> > > anyone gets past the 500 mark, I just go straight to the database and
> > > then add back to the cache.
> > >
> > > I've split it up into 100 and 500 because most people only ever look
> > > at less than the first 100 entries. If they do manage to look past the
> > > first 100, then I have the first 500 cached in another key. Keep in
> > > mind, this is not the first 100 plus another 500 making a total of 600
> > > articles; the first 100 are also duplicated in the 500 list. The
> > > 500-entry list is generated only the first time it is needed, and the
> > > exact same routine also creates the 1000-entry key if that is ever
> > > needed, and so on. There is no built-in limit; it could end up being a
> > > key for a 20000-entry list for all I know.
> > >
> > > Every situation is different. I suggest you build some test cases,
> > > test them under various situations, and see what works for you. There
> > > are some parts of my site that don't use memcache at all and simply go
> > > to the database directly every time, but I did it that way because for
> > > that particular problem a cached solution would be clunky and memcache
> > > just didn't fit well. But apart from those special cases, I cache
> > > almost everything. I cache the little bits of data (such as a key for
> > > each IP address that hits the site; I increment a counter each time
> > > they hit and give it an expiry), all the small elements of data, all
> > > the bigger elements made up of the smaller elements, all the rendered
> > > XML and some of the rendered HTML. My database is mostly idle :)
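> > >
> > > (For what it's worth, that per-IP counter is presumably just the
> > > usual add-then-incr pattern, sketched below with Cache::Memcached;
> > > incr only works on a key that already exists, hence the add first.
> > > The key name and expiry here are made up:)
> > >
> > >     my $key = "hits:ip:$ip";       # hypothetical key name
> > >     $cache->add($key, 0, 3600);    # create it with a one-hour expiry if it isn't there yet
> > >     $cache->incr($key, 1);         # bump the counter for this hit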
> >
> >
> > I'm wondering about the 100, then the 500. Are you creating a new array
> > at certain intervals? For instance, suppose a user keeps paging through
> > the results and ends up at result 800. Would you then have 3 arrays like
> > this?
> > - 100 array
> > - 500 array
> > - 1000 array
> >
> >
> >
>
>
>
> --
> "Be excellent to each other"
>
--
"Be excellent to each other"