Patch to add trim option to list_keys op
Nathan Schmidt
nschmidt at gmail.com
Thu Jan 4 11:57:25 UTC 2007
Dear MogileFS,
We've got a use case where it'd be preferable to list the shared
prefixes of a set of keys -- what amounts to returning
uniq(dirname($key)) for all keys which match the usual
prefix/limit/after arguments to list_keys.
The patch below adds an optional trim argument that implements this
behavior in a minimally invasive way, and might be useful for other folks.
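To make the intended result concrete, here is a minimal sketch of the uniq(dirname($key)) idea done in the application layer over an in-memory list of keys. It is written in Python purely for illustration; the function name list_key_prefixes and its defaults are hypothetical and are not part of MogileFS or of the patch. Keys that contain no delimiter are returned unchanged here, which is a choice made for this sketch only.

```python
def list_key_prefixes(keys, prefix="", after="", limit=1000, trim="/"):
    """Emulate list_keys with a trim option over an in-memory key list.

    Hypothetical helper for illustration only; not a MogileFS API.
    """
    seen = set()
    result = []
    for key in sorted(keys):
        # Same filters list_keys applies: prefix match and "after" cursor.
        if not key.startswith(prefix) or not key > after:
            continue
        # dirname-style trim: drop everything from the last delimiter on.
        head, sep, _leaf = key.rpartition(trim)
        superkey = head if sep else key  # assumption: no delimiter, keep key
        if superkey not in seen:
            seen.add(superkey)
            result.append(superkey)
            if len(result) == limit:
                break
    return result


keys = [
    "bob/thing/rev_1.txt",
    "bob/thing/rev_2.txt",
    "bob/other/rev_1.txt",
    "alice/page/rev_1.txt",
]
print(list_key_prefixes(keys, prefix="bob/"))  # ['bob/other', 'bob/thing']
```

Doing this client-side still means fetching every key first, which is exactly the cost the patch tries to avoid by pushing the trim into the query.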
Why? When my caches are cold I need to enumerate pages by brute
force. For some big wikis we're approaching 100 revisions for each of
1,000 pages -- I'd rather get back 1,000 superkeys than 100,000 regular
keys. This patch causes a bit more work on the DB in exchange for a
cleaner application layer.
If there's a smarter (or 'more MogileFS') way to do this I'm all ears,
of course.
Regards,
-Nathan Schmidt / PBwiki
Index: trunk/server/lib/MogileFS/Worker/Query.pm
===================================================================
--- trunk/server/lib/MogileFS/Worker/Query.pm (revision 617)
+++ trunk/server/lib/MogileFS/Worker/Query.pm (working copy)
@@ -493,9 +493,16 @@
     my $dbh = Mgd::get_dbh() or
         return $self->err_line("nodb");
-    # now select out our keys
-    my $keys = $dbh->selectcol_arrayref
-        ('SELECT dkey FROM file WHERE dmid = ? AND dkey LIKE ? AND dkey > ? ' .
+    my $want = 'dkey';
+
+    if(my $trim = $args->{trim}) {
+        # trim leaf names, thus 'bob/thing/rev_1231.txt' -> 'bob/thing' -- simulates a directory structure.
+        # returned values are hardly assured to be valid keys themselves, but fine for using as $prefix
+        $want = "distinct SUBSTRING_INDEX(dkey,'$trim',length(dkey)-length(replace(dkey,'$trim','')))";
+    }
+
+    # now select out our keys
+    my $keys = $dbh->selectcol_arrayref
+        ('SELECT ' . $want . ' FROM file WHERE dmid = ? AND dkey LIKE ? AND dkey > ? ' .
          "ORDER BY dkey LIMIT $limit", undef, $dmid, $prefix, $after);
     # if we got nothing, say so
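
The SQL expression in the patch passes the number of delimiter characters in dkey as the count argument to SUBSTRING_INDEX, so everything before the last delimiter is kept. The sketch below mimics that arithmetic in Python so the edge cases are easier to see. The helper names substring_index and trimmed are hypothetical. The stand-in only covers non-negative counts, and it assumes ASCII keys, so that Python's character counts match MySQL's byte-based length().

```python
def substring_index(s, delim, count):
    """Rough stand-in for MySQL SUBSTRING_INDEX, non-negative counts only."""
    if count <= 0 or not delim:
        return ""
    parts = s.split(delim)
    # With fewer than `count` delimiters, the whole string comes back.
    return delim.join(parts[:count])


def trimmed(dkey, trim):
    # Mirrors length(dkey) - length(replace(dkey, trim, '')) from the patch.
    count = len(dkey) - len(dkey.replace(trim, ""))
    return substring_index(dkey, trim, count)


print(trimmed("bob/thing/rev_1231.txt", "/"))     # bob/thing
print(trimmed("bob::thing::rev_1231.txt", "::"))  # bob::thing::rev_1231.txt
```

Under these assumptions the length difference equals the number of delimiters only when trim is a single character. A two-character delimiter doubles the count, so nothing is trimmed, and a key with no delimiter at all yields a count of 0 and comes back as an empty string. Note also that the patch interpolates $trim directly into the SQL string, so a caller-supplied value would probably want to be quoted or bound as a placeholder.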