benchmarking Perl client hash functions

Fri, 16 Jul 2004 18:57:53 -0400 (Eastern Daylight Time)

Hi all -

Troy Hakala posted recently about some hashing benchmarks he had run on
the PHP client:

   http://lists.danga.com/pipermail/memcached/2004-June/000664.html

He found that "(crc32($key)>>16) & 0x7fff" was not only faster than a PHP
implementation of the Perl client's hash algorithm but it also gave a
better distribution across buckets.

I did some benchmarking of Perl hashing algorithms and found the same thing
was true when I used the String::CRC32 module from CPAN.  Details are
below, but in summary the crc32 scheme was over 400% faster, and gave
almost perfect distribution across buckets while the current Perl client
algorithm had uneven distribution.  So I'm wondering a couple things:

1) For test keys I used all the words in my /usr/share/dict/words, plus
each of the same words with the characters reversed (about 90,000 keys
total, average length 8 characters).  Is there any reason to think this
might skew the results, either because of the length or because the test
keys only contain alpha characters?  If so, any ideas for generating more
realistic test keys?

2) I haven't done any profiling of the Perl client code so I don't know
how much of a bottleneck (if any) the hashing part might be.  I can
probably rig up a patch that uses String::CRC32 if it exists, or falls
back to the current algorithm if not - should I go ahead and give that a
try?

Larry

==========================================================================

Details:

The current Perl client hash code looks like:

   sub original {
       my $hash = 0;
       foreach (split //, shift) {
           $hash = $hash*33 + ord($_);
       }
       return $hash;
   }

I also tried a slight variation where I did a hash lookup of the ord()
values instead of the function call, but surprisingly (at least to me)
that ran about 7% slower than the original:

   my %ord = map { chr($_) => $_ } (0..127);
   sub ord_lookup {
       my $hash = 0;0
       foreach (split //, shift) {
           $hash = $hash*33 + $ord{$_};
       }
       return $hash;
   }

The String::CRC32 version looks like:

   sub CRC32 {
      return (crc32(shift) >> 16) & 0x7fff;
   }

It runs about 400% faster than the original algorithm.  The Digest::CRC
module also has a crc32 subroutine, which from my testing looks like it
produces the exact same output as the String::CRC32 version, but the
Digest::CRC hash code actually runs slower than the even the original
code:

                  Rate   Digest_CRC     original   CRC32
   Digest_CRC  37697/s           --         -32%    -87%
   original    55736/s          48%           --    -80%
   CRC32      283906/s         653%         409%      --

(Note:  Digest::CRC also includes a *really* slow pure-Perl implementation
that is used as a fallback if for whatever reason the XS portion does not
load (I stumbled across that accidentally).  In that case for my tests it
dropped from about 38K/sec to 400/sec - ouch.)

I also tried the sprintf scheme that Troy tested:

   sub CRC32_sprintf {
      return sprintf("%u", crc32(shift));
   }

And I also found that it ran slower than the one that just shifts (but
still far faster than the original):

                     Rate CRC32_sprintf         CRC32
   CRC32_sprintf 171415/s            --          -42%
   CRC32         293065/s           71%            --

As far as distribution across buckets, here is what I got for some
different numbers of buckets (percentage hashed into each bucket rounded
to nearest integer):

1
orig    100
crc     100

2
orig    58,42
crc     50,50

3
orig    24,47,29
crc     33,34,33

4
orig    37,21,21,21
crc     25,25,25,25

5
orig    20,20,20,20,20
crc     20,20,20,20,20

6
orig    15,21,17,10,26,12
crc     16,17,17,17,17,17

7
orig    14,15,14,14,14,14,14
crc     14,14,14,14,14,14,14

8
orig    26,11,10,10,11,11,11,11
crc     12,12,12,13,13,13,13,13

9
orig    12,15,11,7,17,9,6,14,10
crc     11,11,11,11,11,11,11,11,11

10
orig    12,8,12,8,12,9,12,8,11,9
crc     10,10,10,10,10,10,10,10,10,10

==========================================================================