Using md5 sum of file contents as key
Ian Sherratt
shez at starfangled.net
Tue Apr 15 11:30:17 UTC 2008
Heya,
We've got a number of applications that require scalable storage. They have
different front-end and business requirements but often end up containing the
same files (largely images).
We're considering using MogileFS as a storage solution, using an MD5 sum (or
SHA-xx) of the file's contents as the key. Each application would store this
key in its own database, along with all the application-dependent
metainformation.
This would provide a guarantee* that we were never 'wasting' storage by
storing the same file multiple times, without making major changes to our
applications.
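
To make the idea concrete, here is a minimal sketch in Python. A plain dict
stands in for the storage backend, and `ContentStore`, `put`, and `get` are
hypothetical names for illustration only, not MogileFS client calls.

```python
import hashlib


class ContentStore:
    """Toy stand-in for a key/value file store; NOT the MogileFS API."""

    def __init__(self):
        self._blobs = {}   # key -> file contents
        self.writes = 0    # how many times data was actually stored

    def put(self, data: bytes) -> str:
        # The key is derived purely from the file contents.
        key = hashlib.md5(data).hexdigest()
        if key not in self._blobs:
            # Only the first copy of a given file is written.
            self._blobs[key] = data
            self.writes += 1
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]
```

Two applications calling `put` with identical bytes would get the same key
back, and the data would be written once. Each application would then keep
that key alongside its own metadata.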
Has anybody used a function of the file contents as the key before? Good
idea/bad idea?
Cheers!
Shez
* OK, hash collisions are always possible, so filelength:SHA-256 would be a
better key.
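
A rough sketch of how such a filelength:SHA-256 key could be computed, in
Python with only the standard library. The function name `content_key` and
the exact "length:hexdigest" layout are assumptions for illustration.

```python
import hashlib


def content_key(path: str, chunk_size: int = 1 << 20) -> str:
    """Return '<length in bytes>:<SHA-256 hex digest>' for the file at path."""
    digest = hashlib.sha256()
    length = 0
    with open(path, "rb") as f:
        # Read in chunks so large files need not fit in memory.
        while chunk := f.read(chunk_size):
            digest.update(chunk)
            length += len(chunk)
    return f"{length}:{digest.hexdigest()}"
```

With this layout, two files can only share a key if they have both the same
length and the same SHA-256 digest.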