Subject: Re: How to get all the keys from servers?

Jason Pirkey jason at pirkplace.com
Tue Dec 5 05:32:46 UTC 2006


Yup -- you are right on, Randy, my mistake -- we have used this method in a
couple of different things:
- logging errors
- some processing of very large reports (breaking the report down into
smaller queries, then recombining the data)
- storing data larger than 1MB -- splitting the data into sequential keys

In this approach, we used separate memcache instances for each of those, since
we did not want to expire truly cached data.  The reports were run live no
matter what, and we cached the segments for 15 minutes.  Logging was
post-processed: a cron job pulled down the data and did the bulk inserts.
The large cache keys were sessions (some legacy stuff that made the sessions
really large); the chunking looked roughly like the sketch below.
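
For the >1MB case, the splitting was along these lines.  (A sketch in Python
with the python-memcached client; the key scheme and chunk size here are
illustrative, not our exact code.)

import memcache

CHUNK = 1000 * 1000   # stay safely under memcached's 1MB item limit
mc = memcache.Client(['127.0.0.1:11211'])

def set_big(key, data):
    # split the value across sequential keys: key.0, key.1, ...
    pieces = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    for n, piece in enumerate(pieces):
        mc.set('%s.%d' % (key, n), piece)
    mc.set('%s.count' % key, len(pieces))   # remember how many pieces there are

def get_big(key):
    count = mc.get('%s.count' % key)
    if count is None:
        return None
    pieces = [mc.get('%s.%d' % (key, n)) for n in range(count)]
    if None in pieces:   # any evicted chunk invalidates the whole value
        return None
    return ''.join(pieces)

Losing a single chunk loses the whole value, which is part of why these lived
on their own instance.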

This totally slipped my mind somehow -- good call.  I must be getting old.
Again, my apologies.

-- Jason (jason at pirkplace.com)

On 12/5/06, Randy Wigginton <krw at nobugz.com> wrote:
> >
> > Fine, do something like:
> > // "mysequence" must exist first (e.g. set it to 0); incr on a missing key fails
> > myKeyNum = memcache.incr("mysequence");   // atomic, returns the new value
> > myFullKey = "WellKnownName" + myKeyNum;
> > memcache.set(myFullKey, theIPAddressHittingMe);
> >
> >
> > Then, when you are ready to harvest:
> >
> > myVal = memcache.decr("mysequence");   // decr is atomic too, returns the new value
> > while (myVal >= 0) {
> >     myFullKey = "WellKnownName" + (myVal + 1);   // the old value names the slot just claimed
> >     badIP = memcache.get(myFullKey);
> >     if (badIP == null) break;   // the server clamps decr at 0, so a missing
> >                                 // key is the signal that the list is drained
> >     memcache.delete(myFullKey);
> >     // send a nasty email to owner of that IP
> >     myVal = memcache.decr("mysequence");
> > }
> >
> > There are numerous variants on this that avoid the read-read-write-write
> > problem, yet still avoid using a heavy-weight DB.
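> >
> > A minimal runnable version of the above in Python with the python-memcached
> > client might look like this (same key names as the pseudocode; error
> > handling omitted):
> >
> > import memcache
> >
> > mc = memcache.Client(['127.0.0.1:11211'])
> > mc.add('mysequence', 0)            # create the counter only if it is missing
> >
> > def record(ip):
> >     n = mc.incr('mysequence')      # atomic, so no read-read-write-write race
> >     mc.set('WellKnownName%d' % n, ip)
> >
> > def harvest():
> >     n = mc.decr('mysequence')      # atomically claim the highest slot
> >     while n is not None and n >= 0:
> >         key = 'WellKnownName%d' % (n + 1)
> >         ip = mc.get(key)
> >         if ip is None:             # decr stops at 0, so a missing key means done
> >             break
> >         mc.delete(key)
> >         yield ip                   # e.g. send the nasty email here
> >         n = mc.decr('mysequence')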
> >
> > On Dec 4, 2006, at 4:40 PM, Jason Pirkey wrote:
> >
> > The only problem with this is that with very-high-hit sites, you have the
> > possibility of overwriting data (the read-read-write-write issue).  That is
> > what Jed was trying to prevent, and it is what is nice about the increment
> > command in memcache -- it is atomic.
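> >
> > Concretely, with a hypothetical 'hits' counter and the mc client from the
> > sketch above:
> >
> > n = mc.get('hits')      # clients A and B can both read 5 here...
> > mc.set('hits', n + 1)   # ...and both write back 6, losing one hit
> >
> > mc.incr('hits')         # serialized on the server: A sees 6, B sees 7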
> >
> > On 12/4/06, Randy Wigginton <krw at nobugz.com> wrote:
> > >
> > > Or, if you didn't want to hit your slow DB, create a well-known key
> > > that contains all IPs over a certain threshold.  Thus when a specific IP
> > > reaches 100 hits, put it on the list for later analysis.  Once an hour or
> > > so, harvest the data.
> > > This doesn't help much with AOL, though.  They put all their users
> > > through specific gateway addresses (at least they did about 18 months ago).
> > > On Dec 4, 2006, at 6:51 PM, Jason Pirkey wrote:
> > >
> > > Yes -- every X requests over the initial threshold -- a simple if and a
> > > mod.
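> > >
> > > Something like this, using the mc client from the earlier sketch (the
> > > threshold and interval are made-up numbers, and save_to_db() is a
> > > stand-in for whatever the DB write is):
> > >
> > > count = mc.incr('ipcount:%s' % ip)
> > > if count is not None and count >= 100 and (count - 100) % 1000 == 0:
> > >     save_to_db(ip, count)   # fires at 100, 1100, 2100, ...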
> > >
> > > On 12/4/06, Jed Reynolds <lists at benrey.is-a-geek.net> wrote:
> > > >
> > > > Jason Pirkey wrote:
> > > > > Jed:
> > > > >
> > > > > If you are analyzing for attacks, it would be easier to do a
> > > > > real-time analysis with memcached, because at that point you will
> > > > > have the IP address you are looking for -- do a hit to memcache
> > > > > to get its counter and act accordingly (saving it to the database
> > > > > for later analysis if it hits a certain threshold, for instance).
> > > > > This way you will not have to scan memcache and do post-processing.
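> > > > >
> > > > > In code terms, roughly (the key scheme, window, threshold, and
> > > > > record_offender() are all illustrative, with mc as before):
> > > > >
> > > > > key = 'hits:%s' % ip_address
> > > > > if mc.add(key, 1, time=3600):    # first hit in this one-hour window
> > > > >     count = 1
> > > > > else:
> > > > >     count = mc.incr(key)         # atomic per-IP counter
> > > > > if count == 100:                 # fires once per window, at the threshold
> > > > >     record_offender(ip_address)  # e.g. save to the DB for later analysis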
> > > >
> > > > Good idea, Jason, thanks!  So if I'm tracking a high-volume IP, the
> > > > way to track it is to record its status to the database every 1,000
> > > > requests (say), and not on every request over the threshold.
> > > >
> > > > Jed
> > > >