Dropping mode optimization from Metafile?

robb robb at canfield.com
Tue Aug 28 00:40:37 UTC 2007

During a test of a medium scale backup (simulating RAM and directory
overhead for 77,000 files averaging 1 MB) I am finding some higher than
desired RAM usage.

Brackup buffers all of the file stats in RAM (something that is a bit
tricky to change). This is a constant for the run, but Brackup also
buffers all chunk data until the backup is done and the metafile
written. This chunk data can add up to a considerable amount of usage.
The main reason for buffering chunk data seems to be to optimize the
size of the metafile by remembering the most common file and directory
modes. If a file or directory has the default mode, its mode line need
not be written.
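
To make the dependency concrete, here is a minimal sketch of that kind
of optimization. It is in Python rather than Brackup's Perl, and the
field names (DefaultFileMode, Path, Mode) and the function name are
illustrative assumptions, not Brackup's actual code or metafile format.
The point is only that the default mode cannot be known until every
entry has been seen, so nothing can be written early:

```python
from collections import Counter

def write_buffered(entries):
    """Build metafile text from a fully buffered list of (path, mode) tuples.

    Hypothetical format: the most common mode goes in a header line and
    per-file Mode lines are written only when they differ from it.
    """
    if not entries:
        return ""
    # Requires ALL entries up front: this is what forces the buffering.
    default_mode = Counter(mode for _, mode in entries).most_common(1)[0][0]
    lines = [f"DefaultFileMode: {default_mode}", ""]
    for path, mode in entries:
        lines.append(f"Path: {path}")
        if mode != default_mode:
            lines.append(f"Mode: {mode}")
        lines.append("")  # blank line ends the record
    return "\n".join(lines)
```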

While optimizing the size of a file is always nice, the downsides for
large backups may overshadow this:

- A failure during backup leaves the brackup metafile out of date (or
non-existent). Even if MANY files are properly saved, the index to them
is not available.

- RAM and CPU are consumed holding this data in memory. This becomes an
issue on virtualized systems with low RAM limits OR for any system with
LARGE data sets. Brackup already stores the entire file list in RAM, but
during backup the chunk references are added to it, and it is those
chunk references that start to add up for large files or large backup
sets. A small chunk size makes it even worse.

- The optimization is relatively small (actually very small) size wise
given the other data in the file.
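
As a rough upper bound on what the optimization saves, assume a
per-file mode line that looks like "Mode: 0644" plus a newline (an
assumed format, used only for the arithmetic). Even if every single
file had the default mode, the saving for my 77,000 file test is well
under 1 MB:

```python
def mode_line_overhead(num_files, mode_line="Mode: 0644\n"):
    """Bytes added if every file carries an explicit mode line.

    The mode_line text is an assumed example, not Brackup's exact format.
    """
    return num_files * len(mode_line.encode("ascii"))

# 77,000 files x 11 bytes per line
print(mode_line_overhead(77_000))  # 847000
```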

I vote to drop the file size optimization in favor of lower RAM usage
(and execution speed) by streaming the metafile as each file is
finished. Any thoughts?
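
Something along these lines is what I have in mind. This is a sketch in
Python, not a patch against Brackup, and the record layout and the name
stream_entry are hypothetical. Each file's record, with its mode always
written explicitly, goes out as soon as that file is done, so the chunk
references never need to be held for the whole run:

```python
import io

def stream_entry(out, path, mode, chunks):
    """Write one file's record immediately and flush it.

    Assumed record layout: Path, Mode, one Chunk line per chunk, then a
    blank line. The mode is always written, so no default is needed.
    """
    out.write(f"Path: {path}\n")
    out.write(f"Mode: {mode}\n")
    for chunk in chunks:
        out.write(f"Chunk: {chunk}\n")
    out.write("\n")
    out.flush()  # a crash later still leaves this record on disk

# Usage: call once per file as it finishes, then drop its chunk list.
buf = io.StringIO()
stream_entry(buf, "a.txt", "0644", ["c1", "c2"])
```

A side effect is that a failed backup still leaves a usable (partial)
index of everything saved up to the point of failure.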