I wish Gmail would reply to the list instead of the person :(<br><br><div><span class="gmail_quote">On 9/21/07, <b class="gmail_sendername">Clint Webb</b> <<a href="mailto:webb.clint@gmail.com">webb.clint@gmail.com</a>> wrote:
</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">There is one thing computers do really well, and that is iterating through lists.<br>I would suggest that getting the list from memcache and then processing it is not going to be any slower than getting a list of results from a heavy database query and iterating through that, especially if you use get_multi instead of thousands of individual gets. (That takes a little extra programming to detect an element that wasn't in the cache, but you might get better overall response times.)
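<br><br>As a rough sketch of the get_multi idea, assuming a client whose get_multi returns a dict containing only the keys it found: the FakeCache class, the load_from_db helper and the "user:" key prefix below are all made up for illustration, so swap in your real client and query.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (hypothetical).

    Assumption: the real client's get_multi returns a dict that
    contains only the keys that were actually in the cache.
    """

    def __init__(self):
        self._data = {}

    def get_multi(self, keys):
        # Missing keys are simply absent from the result.
        return {k: self._data[k] for k in keys if k in self._data}

    def set(self, key, value):
        self._data[key] = value


def load_from_db(user_id):
    # Hypothetical database lookup; replace with a real query.
    return {"id": user_id, "name": "user%d" % user_id}


def fetch_users(cache, user_ids):
    keys = ["user:%d" % uid for uid in user_ids]
    found = cache.get_multi(keys)  # one request for all keys
    users = {}
    for uid, key in zip(user_ids, keys):
        if key in found:
            users[uid] = found[key]
        else:
            # Cache miss: fall back to the database and repopulate.
            users[uid] = load_from_db(uid)
            cache.set(key, users[uid])
    return users
```

The extra programming is just the miss branch: anything absent from the returned dict gets loaded the slow way and written back.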
<br><br>Of course, it's always useful to then store your PROCESSED data in memcache. So after you've iterated through your list and formatted or ordered the data in a way that is useful to you, cache it. Then next time, use the cached version and you don't have to iterate over it again. If your data changes frequently, figure out the delta between changes and use it as your expiry time.
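<br><br>Something like the following is what I mean by caching the processed result with an expiry. It is only a sketch: TTLCache is a made-up in-memory stand-in for a real client, and the key name, the 300-second default and the "sort and de-duplicate" processing step are placeholders.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache client with expiry (hypothetical)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl):
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= time.time():
            return None  # missing or expired
        return entry[0]


def build_friend_list(raw_ids):
    # Placeholder for the expensive processing step.
    return sorted(set(raw_ids))


def get_friend_list(cache, user_id, raw_ids, ttl=300):
    key = "friends_sorted:%d" % user_id
    processed = cache.get(key)
    if processed is None:
        # Not cached (or expired): process once and store the result.
        processed = build_friend_list(raw_ids)
        cache.set(key, processed, ttl)
    return processed
```

Pick ttl to match how often the underlying data really changes; too long and you serve stale lists, too short and you redo the work.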
<div><span class="e" id="q_11526c1f015a581f_1"><br><br><div><span class="gmail_quote">On 9/21/07, <b class="gmail_sendername">K J</b> <<a href="mailto:sanbat@gmail.com" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
sanbat@gmail.com</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><span>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0px 0px 0px 0.8ex; padding-left: 1ex;">
<div>
<div>I just did some hand-wavy math on that:</div>
<div><br> </div>
<div>
<div>>>> import random</div>
<div>>>> r=random.Random()</div>
<div>>>> s=set(r.randint(1, 10000000) for x in range(30000))</div>
<div>>>> len(s)</div>
<div>29950</div>
<div>>>> len(','.join([str(i) for i in s]))</div>
<div>236301</div>
<div>>>> import zlib</div>
<div>
<div>>>> compressed=zlib.compress(','.join([str(i) for i in s]).encode('ascii'), 9)</div>
<div>>>> len(compressed)</div>
<div>109568</div><br> </div></div>
<div><span style="white-space: pre;"></span>So, a user with about 30k friends with numeric IDs fairly evenly distributed from 1 to 10,000,000 (my average ID was 4,979,961, min was 236, max was 9,999,931) would take a little over 100 KB of cache when comma-separated and compressed.
</div>
<div><br> </div>
<div><span style="white-space: pre;"></span>I would expect the normal case to be much smaller.</div></div></blockquote>
<div> </div></span>
<div>Great analysis!</div>
<div> </div>
<div>Thinking further on this issue, though, gives me more questions...</div>
<div> </div>
<div>Storing a list of a user's friends in a Memcache array, then getting them back and searching through them... while it avoids the database altogether, wouldn't it be sorta slow? For instance, every time user A visits user B's profile page, the system would need to iterate through this array to see if they are connected.
</div>
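<div>To make the lookup I'm describing concrete, here is a minimal sketch. It assumes the friend list is stored as the comma-separated, zlib-compressed string from the analysis above; the function names are made up for illustration.</div>

```python
import zlib


def pack_friends(friend_ids):
    # Comma-separate the IDs and compress, as in the analysis above.
    text = ",".join(str(i) for i in friend_ids)
    return zlib.compress(text.encode("ascii"), 9)


def unpack_friends(blob):
    # Decompress and rebuild a set so membership tests are cheap.
    text = zlib.decompress(blob).decode("ascii")
    return {int(part) for part in text.split(",")} if text else set()


def are_connected(blob, other_id):
    # blob is the cached value for user A; other_id is user B's ID.
    return other_id in unpack_friends(blob)
```

<div>Every profile view pays for one decompress and one pass over the list to rebuild the set; the membership test itself is then a single hash lookup.</div>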
<div>&nbsp;</div>In this case, would it be better to just use a MySQL HEAP (in-memory) table and query that instead?<br>&nbsp;</div>
</blockquote></div><br><br clear="all"><br></span></div><span class="sg">-- <br>"Be excellent to each other"
</span></blockquote></div>