<span class="gmail_quote"></span>Yup -- you are right on, Randy, my mistake -- we have used this method for a couple of different things:<br><div><div style="margin-left: 40px;">- logging errors<br>- some processing of very large reports (breaking the report down into smaller queries, then recombining the data)
<br>- storing data larger than 1MB -- splitting the data across sequential keys<br></div><br>In
this approach, we used separate memcache instances for
each of those, since we did not want to expire truly cached data. The
reports were run live no matter what, and we cached the segments for 15
minutes. Logging was post-processed: a cron job pulled down
the data and did the bulk inserts. The large cache keys were sessions
(some legacy stuff that made the sessions really large).
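<br><br>A minimal Python sketch of the "split across sequential keys" idea, for illustration only. The <code>FakeMemcache</code> class is a hypothetical in-memory stand-in for a real client, and the <code>key:0</code>, <code>key:1</code>, ... plus <code>key:count</code> naming scheme is an assumption, not what we actually ran:

```python
# Chunk size kept a little under memcached's default 1MB item limit.
CHUNK_SIZE = 1000 * 1000


class FakeMemcache:
    """Hypothetical in-memory stand-in for a memcached client."""

    def __init__(self):
        self.store = {}

    def set(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)


def set_large(mc, key, data, chunk_size=CHUNK_SIZE):
    """Store bytes under sequential keys, then record how many chunks exist."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for n, chunk in enumerate(chunks):
        mc.set(f"{key}:{n}", chunk)
    # Write the count last so a reader never sees a count without its chunks.
    mc.set(f"{key}:count", len(chunks))


def get_large(mc, key):
    """Reassemble the value; return None if the count or any chunk is missing."""
    count = mc.get(f"{key}:count")
    if count is None:
        return None
    parts = []
    for n in range(count):
        part = mc.get(f"{key}:{n}")
        if part is None:
            return None  # a chunk was evicted, so treat the whole value as a miss
        parts.append(part)
    return b"".join(parts)
```

Treating a single missing chunk as a full miss matters here, since memcached can evict any one of the keys independently.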
<br><br>This totally slipped my mind somehow -- good call. I must be getting old. Again, my apologies.<br><span class="sg"><br>-- Jason (<a href="mailto:jason@pirkplace.com" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
jason@pirkplace.com</a>)</span><br><span class="sg"></span><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><span class="e" id="q_10f5116427f7d09a_2">
<div><span class="gmail_quote">On 12/5/06,
<b class="gmail_sendername">Randy Wigginton</b> <<a href="mailto:krw@nobugz.com" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">krw@nobugz.com</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div>Fine, do something like:<div><br></div><div>myKeyNum = memcache.incr("mysequence");</div><div>myFullKey = "WellKnownName"+myKeyNum;</div><div>memcache.set(myFullKey, theIPAddressHittingMe);
</div><div><br></div><div><br></div><div>Then, when you are ready to harvest:</div><div><br></div><div>// read the counter first: incr stored the last entry under the current value</div><div>myVal = memcache.get("mysequence");</div><div>// stop at 0: memcached's decr never goes below zero</div><div>while (myVal>0) {</div><div><span style="white-space: pre;">
        </span>myFullKey = "WellKnownName"+myVal;</div><div><span style="white-space: pre;">        </span>badIP = memcache.get(myFullKey);</div><div><span style="white-space: pre;">        </span>// send a nasty email to owner of that IP
</div><div><span style="white-space: pre;">        </span>myVal = memcache.decr("mysequence");</div><div>}</div><div><br></div><div>There are numerous variants on this that avoid the read-read-write-write problem, yet still avoid using a heavy-weight DB.
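<br><br>One such variant, as a rough Python sketch rather than a definitive implementation. <code>FakeMemcache</code> is a hypothetical in-memory stand-in that mimics two memcached behaviours: incr/decr fail on a missing key, and decr never goes below zero. The <code>record</code> and <code>harvest</code> helper names are made up for illustration:

```python
class FakeMemcache:
    """Hypothetical in-memory stand-in for a memcached client."""

    def __init__(self):
        self.store = {}

    def add(self, key, value):
        # Like memcached's add: only stores if the key is absent.
        if key in self.store:
            return False
        self.store[key] = value
        return True

    def set(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)

    def incr(self, key, delta=1):
        if key not in self.store:
            return None  # memcached reports NOT_FOUND for a missing counter
        self.store[key] = int(self.store[key]) + delta
        return self.store[key]

    def decr(self, key, delta=1):
        if key not in self.store:
            return None
        self.store[key] = max(0, int(self.store[key]) - delta)
        return self.store[key]


SEQ_KEY = "mysequence"
PREFIX = "WellKnownName"


def record(mc, ip):
    """Claim the next sequence number atomically and store the IP under it."""
    mc.add(SEQ_KEY, 0)        # no-op when the counter already exists
    n = mc.incr(SEQ_KEY)      # each caller gets its own number
    mc.set(f"{PREFIX}{n}", ip)
    return n


def harvest(mc):
    """Walk the sequence from the newest entry down to 1, collecting IPs."""
    found = []
    n = mc.get(SEQ_KEY) or 0
    while n > 0:
        ip = mc.get(f"{PREFIX}{n}")
        if ip is not None:    # the entry may have been evicted
            found.append(ip)
        n = mc.decr(SEQ_KEY)
    return found
```

Because every writer gets its own key from incr, two concurrent writers never touch the same value. Writers that run while a harvest is in progress can still interleave with it, so this sketch assumes harvesting happens at a quiet moment or against a swapped-out counter.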
</div><div><span><div><br></div><div><div><div>On Dec 4, 2006, at 4:40 PM, Jason Pirkey wrote:</div><br><blockquote type="cite">The only problem with this is that on very high-traffic sites you can overwrite data (the read, read, write, write issue). That is what Jed was trying to prevent. That is what is nice about the increment command in memcache -- it is atomic.
<br><br><div><span class="gmail_quote">On 12/4/06, <b class="gmail_sendername">Randy Wigginton</b> <<a href="mailto:krw@nobugz.com" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">krw@nobugz.com
</a>
> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> <div>Or, if you didn't want to hit your slow DB, create a well-known key that contains all IPs over a certain threshold. Thus when a specific IP reaches 100 hits, put it on the list for later analysis. Once an hour or so, harvest the data.
<div><br></div><div>This doesn't help much with AOL. They put all their users through specific gateway addresses. (at least they did about 18 months ago)<div><span><div><br><div><div>On Dec 4, 2006, at 6:51 PM, Jason Pirkey wrote:
</div><br><blockquote type="cite">Yes -- every X number of requests over the initial threshold -- a simple if and mod.<br><br><div><span class="gmail_quote">On 12/4/06, <b class="gmail_sendername">Jed Reynolds</b> <<a href="mailto:lists@benrey.is-a-geek.net" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
lists@benrey.is-a-geek.net</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Jason Pirkey wrote:<br>> Jed:<br>><br>
> If you are analyzing for attacks, it would be easier to do a real<br>> time analysis with memcached, because at that point you will have the<br>> IP address you are looking for -- do a hit to memcache to get its
<br>> counter and act accordingly (saving it to the database for later<br>> analysis if it hits a certain threshold, for instance). This way you<br>> will not have to do scanning of memcache and post processing.<br>
<br>Good idea, Jason, thanks! So if I'm tracking a high volume IP the way to<br>track them is to record their status to database every 1,000 requests<br>(e.g.) and not every request over the threshold.<br><br>Jed<br></blockquote>
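<br>The "simple if and mod" could look something like the following Python sketch. The threshold of 100 and the interval of 1,000 are just the example numbers from this thread, and the plain dict and list are hypothetical stand-ins for the memcache counter and the database:

```python
THRESHOLD = 100   # hits before an IP becomes interesting (example value)
EVERY = 1000      # after that, record once per this many hits (example value)


def should_record(hits, threshold=THRESHOLD, every=EVERY):
    """True when hits first reaches the threshold, then once per `every` hits."""
    return hits >= threshold and (hits - threshold) % every == 0


def track_hit(counts, ip, saved):
    """Count one request from ip and save its status when the rule fires."""
    counts[ip] = counts.get(ip, 0) + 1    # stands in for memcache incr
    if should_record(counts[ip]):
        saved.append((ip, counts[ip]))    # stands in for the database write
    return counts[ip]
```

With these numbers, an IP that makes 2,100 requests triggers three writes (at 100, 1,100, and 2,100 hits) instead of 2,001.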
</div><br></blockquote></div><br></div></span></div></div></div> </blockquote></div><br></blockquote></div><br></div></span></div></div>
</blockquote></div><br>
</span></div></blockquote></div><br>