Patch to add trim option to list_keys op

Nathan Schmidt nschmidt at gmail.com
Thu Jan 4 11:57:25 UTC 2007


Dear MogileFS,

We've got a use case where it'd be preferable to list the shared
prefixes of a set of keys -- what amounts to returning
uniq(dirname($key)) for all keys which match the usual
prefix/limit/after arguments to list_keys.
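
A minimal sketch of that uniq(dirname($key)) behaviour in plain Perl,
for illustration only. The helper name shared_prefixes and the sample
keys are made up here and are not part of MogileFS. Mapping a key with
no delimiter to the empty string is also an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: cut each key at its last delimiter and return the
# distinct results in first-seen order.
sub shared_prefixes {
    my ($delim, @keys) = @_;
    my (%seen, @out);
    for my $key (@keys) {
        my $cut = rindex($key, $delim);
        # Assumption: a key with no delimiter collapses to ''.
        my $dir = $cut < 0 ? '' : substr($key, 0, $cut);
        push @out, $dir unless $seen{$dir}++;
    }
    return @out;
}

# Made-up sample keys: three revisions spread over two pages.
my @prefixes = shared_prefixes('/',
    'bob/thing/rev_1.txt', 'bob/thing/rev_2.txt', 'bob/other/rev_1.txt');
print join("\n", @prefixes), "\n";
```

Three keys go in and two prefixes come out, which is the reduction the
trim option is after.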

The patch implements this behavior in a minimally-invasive way, and  
might be useful for other folks.

Why? When my caches are cold I need to enumerate pages by brute
force. For some big wikis we're approaching 100 revs for each of 1,000
pages -- I'd rather get back 1,000 superkeys than 100,000 regular keys.
This patch causes a bit more work on the DB in exchange for a cleaner
application layer.

If there's a smarter (or 'more MogileFS') way to do this I'm all ears,
of course.

Regards,
-Nathan Schmidt / PBwiki


Index: trunk/server/lib/MogileFS/Worker/Query.pm
===================================================================
--- trunk/server/lib/MogileFS/Worker/Query.pm   (revision 617)
+++ trunk/server/lib/MogileFS/Worker/Query.pm   (working copy)
@@ -493,9 +493,16 @@
      my $dbh = Mgd::get_dbh() or
          return $self->err_line("nodb");
-    # now select out our keys
-    my $keys = $dbh->selectcol_arrayref
-        ('SELECT dkey FROM file WHERE dmid = ? AND dkey LIKE ? AND dkey > ? ' .
+    my $want = 'dkey';
+
+    if(my $trim = $args->{trim}) {
+      # trim leaf names, thus 'bob/thing/rev_1231.txt' -> 'bob/thing' -- simulates a directory structure.
+      # returned values are hardly assured to be valid keys themselves, but fine for using as $prefix
+      $want = "distinct SUBSTRING_INDEX(dkey,'$trim',length(dkey)-length(replace(dkey,'$trim','')))";
+    }
+
+    # now select out our keys
+    my $keys = $dbh->selectcol_arrayref
+      ('SELECT ' . $want . ' FROM file WHERE dmid = ? AND dkey LIKE ? AND dkey > ? ' .
           "ORDER BY dkey LIMIT $limit", undef, $dmid, $prefix, $after);
      # if we got nothing, say so

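To see what the SUBSTRING_INDEX expression in the patch computes, here
is a rough Perl emulation. The helpers substring_index and trimmed are
hypothetical names for this sketch. It models SUBSTRING_INDEX only for
non-negative counts and a non-empty delimiter, following MySQL's
documented behaviour as I understand it, and it uses character lengths
where MySQL's length() counts bytes, so it matches only for
single-byte text.

```perl
use strict;
use warnings;

# Hypothetical stand-in for SUBSTRING_INDEX(str, delim, count), count >= 0:
# everything left of the count-th delimiter, or the whole string when
# there are fewer than count delimiters.
sub substring_index {
    my ($str, $delim, $count) = @_;
    return '' if $count <= 0;      # negative counts are not modelled
    my ($from, $pos) = (0, -1);
    for (1 .. $count) {
        $pos = index($str, $delim, $from);
        return $str if $pos < 0;   # ran out of delimiters
        $from = $pos + length($delim);
    }
    return substr($str, 0, $pos);
}

# Mirrors the patch's count: length(dkey) - length(replace(dkey, delim, '')).
sub trimmed {
    my ($key, $delim) = @_;
    (my $stripped = $key) =~ s/\Q$delim\E//g;
    my $count = length($key) - length($stripped);
    return substring_index($key, $delim, $count);
}

print trimmed('bob/thing/rev_1231.txt', '/'), "\n";  # one-char delimiter
print trimmed('a::b::c', '::'), "\n";                # two-char delimiter
```

With a one-character delimiter the count equals the number of
delimiters, so the leaf name is cut off. With a longer delimiter the
count overshoots and, in this emulation, the key comes back untrimmed,
so the option looks safest with a single-character trim value. Note
also that $trim is interpolated straight into the SQL string; passing
it through DBI's $dbh->quote first would be one way to guard against a
quote character in the argument.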