<HTML><BODY style="word-wrap: break-word; -khtml-nbsp-mode: space; -khtml-line-break: after-white-space; "><BR><DIV><DIV>On Apr 12, 2007, at 9:59, Cal Heldenbrand wrote:</DIV><BR class="Apple-interchange-newline"><BLOCKQUOTE type="cite"><SPAN class="Apple-style-span" style="border-collapse: separate; border-spacing: 0px 0px; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; text-align: auto; -khtml-text-decorations-in-effect: none; text-indent: 0px; -apple-text-size-adjust: auto; text-transform: none; orphans: 2; white-space: normal; widows: 2; word-spacing: 0px; ">1)  is it better to have a large number of variables with small values, or a smaller amount of variables with larger values?  I ran a test of 300,000 variables each 26 bytes in length.  A set() loop for the whole test took around 20 seconds.  8 variables at around 1MB a piece took 0.287 seconds.  I realize that there might be some overhead during each iteration, but this is quite a time difference.  (  strlen() is called 2x for each iteration)   The performance consideration here was to create one large value with comma separated ID strings, insert them to memcache, then pull them back and run a big split on the string.  This would still require some client side processing time, but it would be nice from a programming perspective to be able to add 300,000 variables in a quick amount of time.<SPAN class="Apple-converted-space"> </SPAN><BR><BR>Is there some efficiency tweaking that I'm missing on the memcached server?  (Side tangent question -- is it possible to increase the max value length of 1MB?)<BR></SPAN></BLOCKQUOTE><DIV><BR class="khtml-block-placeholder"></DIV><DIV><SPAN class="Apple-tab-span" style="white-space:pre">        </SPAN>I believe most of that time goes to processing the result of each set on the client side.  
I'd consider this terminal velocity:</DIV><DIV><BR class="khtml-block-placeholder"></DIV><DIV>dustintmb:/tmp 503% nc -v -w 1 localhost 11211 &lt; mcsets &gt; /dev/null</DIV><DIV>localhost [127.0.0.1] 11211 (?) open</DIV><DIV>0.014u 0.085s 0:06.46 1.3%      0+0k 0+0io 0pf+0w</DIV><DIV><BR class="khtml-block-placeholder"></DIV><DIV><SPAN class="Apple-tab-span" style="white-space:pre">        </SPAN>For that, I generated a list of 300,000 keys of the form 'k' + i.  The -w 1 adds about a second to the end of the transaction, so I'd say I loaded them in about five seconds.</DIV><DIV><BR class="khtml-block-placeholder"></DIV><DIV><SPAN class="Apple-tab-span" style="white-space:pre">        </SPAN>Doing the same with my Java API took me about 7 seconds to queue the sets, but another 26 seconds before the last set actually made it into the server, since I read and validate the result of each one individually.  Note that the netcat case pipelines the writes and completely ignores the store status of each one (though I can check it with stats).</DIV><BR><BLOCKQUOTE type="cite"><SPAN class="Apple-style-span" style="border-collapse: separate; border-spacing: 0px 0px; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; text-align: auto; -khtml-text-decorations-in-effect: none; text-indent: 0px; -apple-text-size-adjust: auto; text-transform: none; orphans: 2; white-space: normal; widows: 2; word-spacing: 0px; ">2)   I'm still trying to get into the mindset that memcache is to be used as a volatile cache, not a long term session storage space.  Still, it's an attractive idea -- has anyone created a mirrored cache system?  I was thinking, if I have 30 web machines with 2GB of spare memory a piece, I could run two memcached procs @ 1GB each, then create an API wrapper to write/read to the two separate clusters.  
The only consideration is the probability that the hashing algorithm might choose the two mirrored variables to store on one machine, killing the redundancy.  This might be easier to implement in the daemon...  or am I completely thinking down the wrong path on this one?   Does the availability of cache data (hit/miss ratios) have a large effect on overall performance?<SPAN class="Apple-converted-space"> </SPAN><BR></SPAN></BLOCKQUOTE><DIV><BR class="khtml-block-placeholder"></DIV><DIV><SPAN class="Apple-tab-span" style="white-space:pre">        </SPAN>There are other tools out there that are more appropriate for long-term storage.  It sounds like you may want something more like a persistent DHT.  In the interim, try treating memcached as a volatile cache in front of a centralized store and see how often you really end up needing to hit the central point.</DIV><BR><BLOCKQUOTE type="cite"><SPAN class="Apple-style-span" style="border-collapse: separate; border-spacing: 0px 0px; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; text-align: auto; -khtml-text-decorations-in-effect: none; text-indent: 0px; -apple-text-size-adjust: auto; text-transform: none; orphans: 2; white-space: normal; widows: 2; word-spacing: 0px; ">3)  I don't know if this is the right place to ask this -- I'm using libmemcache.  The mc_set() prototype has a 'flags' parameter, but everywhere I see it set to 0 with no documentation.  
Anyone know what these are for, and any documentation on this?<SPAN class="Apple-converted-space"> </SPAN><BR></SPAN></BLOCKQUOTE><DIV><BR class="khtml-block-placeholder"></DIV><DIV><SPAN class="Apple-tab-span" style="white-space:pre">        </SPAN>This is the best guide for that:</DIV><DIV><BR class="khtml-block-placeholder"></DIV><DIV><SPAN class="Apple-tab-span" style="white-space:pre">        </SPAN><A href="http://code.sixapart.com/svn/memcached/trunk/server/doc/protocol.txt">http://code.sixapart.com/svn/memcached/trunk/server/doc/protocol.txt</A></DIV><DIV><BR class="khtml-block-placeholder"></DIV><DIV><SPAN class="Apple-tab-span" style="white-space:pre">        </SPAN>Basically, flags mean whatever you want: the server just stores them with the value and hands them back on retrieval.  I have a transcoder in my Java API that uses the flags to remember what the value stored for a given key actually means.  For example, I use half of the flags to tell me what type of object I stored (integer, string, byte array, serialized Java object, etc.) and the other half for common flags, such as whether I gzipped the data in the transcoder (so it'll know to decompress it).</DIV><BR><BLOCKQUOTE type="cite"><SPAN class="Apple-style-span" style="border-collapse: separate; border-spacing: 0px 0px; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; text-align: auto; -khtml-text-decorations-in-effect: none; text-indent: 0px; -apple-text-size-adjust: auto; text-transform: none; orphans: 2; white-space: normal; widows: 2; word-spacing: 0px; ">4)  I've been testing the event of the memcached servers being full.  Initially I was thinking along the functionality of the -M parameter, to tell the client it's full and have some sort of contingency based on that... however I'm thinking this is in the mentality of #2, trying to save data that shouldn't be saved.  
I did notice that given a short expiration time on variables, the -M option didn't seem to actually delete them, it kept giving out of memory errors on successive set operations.  Is this a bug or normal behavior?  In any event, I decided it's probably best to leave the daemon at default behavior to clean up after itself, so this is just more of a curiosity.<SPAN class="Apple-converted-space"> </SPAN><BR></SPAN></BLOCKQUOTE></DIV><DIV><BR class="khtml-block-placeholder"></DIV><DIV><SPAN class="Apple-tab-span" style="white-space:pre">        </SPAN>The value of -M has never been clear to me.  I have some data with which I need to do something.  It may already be processed and sitting in my memcached cluster, or I may have to do some preprocessing on it directly from the source (and store the result in memcached).  Having memcached stop storing new data just because it's full seems like it would only cause me problems.</DIV><BR><DIV> <SPAN class="Apple-style-span" style="border-collapse: separate; border-spacing: 0px 0px; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; text-align: auto; -khtml-text-decorations-in-effect: none; text-indent: 0px; -apple-text-size-adjust: auto; text-transform: none; orphans: 2; white-space: normal; widows: 2; word-spacing: 0px; "><DIV>-- </DIV><DIV>Dustin Sallings</DIV><BR class="Apple-interchange-newline"></SPAN> </DIV><BR></BODY></HTML>