Dropping mode optimization from Metafile?

Robb Canfield robb at canfield.com
Wed Aug 29 02:01:24 UTC 2007


Yes, permissions are critical. Just in case I wasn't clear in my prior
explanation: the permission optimizations are simply a way to make the
metafile smaller. If a file/dir has the "default" mode then the mode is
not saved in the metafile. If the file/dir differs from the default
recorded in the metafile then a mode line is added for that specific
file/dir in its metafile entry. My default is 0644 for files and 0755
for dirs.
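
A minimal sketch of that rule, written in Python for illustration rather
than in Brackup's own Perl. The field names (Path, Type, Size, Mode) and
the helpers format_entry and write_entry are hypothetical and are not
taken from the Brackup source; only the 0644/0755 defaults come from the
text above. Because the defaults are fixed up front, each entry can be
written as soon as its file is finished, with nothing buffered until the
end of the run.

```python
import io
import stat

# Assumed hard-coded defaults, using the values quoted above.
DEFAULT_FILE_MODE = 0o644
DEFAULT_DIR_MODE = 0o755


def format_entry(path, mode, is_dir, size=None):
    """Return one metafile entry as text (hypothetical layout).

    The Mode line is emitted only when the permission bits differ
    from the default for that kind of entry.
    """
    lines = [f"Path: {path}"]
    if is_dir:
        lines.append("Type: d")
    elif size is not None:
        lines.append(f"Size: {size}")

    default = DEFAULT_DIR_MODE if is_dir else DEFAULT_FILE_MODE
    perms = stat.S_IMODE(mode)  # strip the file-type bits, keep permissions
    if perms != default:
        lines.append(f"Mode: {perms:04o}")

    # A blank line terminates the entry.
    return "\n".join(lines) + "\n\n"


def write_entry(out, path, mode, is_dir, size=None):
    """Write a single entry to an open stream and flush it immediately."""
    out.write(format_entry(path, mode, is_dir, size))
    out.flush()


if __name__ == "__main__":
    buf = io.StringIO()
    write_entry(buf, "notes.txt", 0o100644, False, size=1200)  # default mode
    write_entry(buf, "secret.key", 0o100600, False, size=64)   # gets a Mode line
    write_entry(buf, "private", 0o040700, True)                # gets a Mode line
    print(buf.getvalue(), end="")
```

The trade-off in this sketch is that the defaults are a guess made before
the backup starts, not the most common mode actually seen, so a tree with
unusual permissions will carry more Mode lines than strictly necessary.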

I also store group and owner, which were NOT stored previously. That is
another rather important component when doing a system or shared backup.
I need permissions and owner/group to restore my local development
environment as well as my servers.

I am using:
   OpenSuse 10.2 x 2 - development - (am not all that happy with it)
   Centos 4 x 3 - Servers
   Centos 5 - Server
   Mandriva 2006 x 2 - firewall, misc test boxes
   Mandrake - very old something or other - pending rebuild


Richard Edward Horner wrote:
> Well, permissions are important for backing up if you're running a
> hosting company or some such.
> 
> What distros are you using? It's Debian Etch, Gentoo (well, Sabayon
> 4.3e) and SLES10 over here.
> 
> Thanks, Rich(ard)
> 
> On 8/29/07, robb <robb at canfield.com> wrote:
>> I currently hacked in the "standard" permissions for files and dirs
>> where standard is what my 3 distributions use! Other solutions are
>> possible but I can't help but wonder if it's 90% work for 1% gain, so
>> I am not pursuing them right now.
>>
>> Richard Edward Horner wrote:
>>> I'd say bring this up again when Brad gets back from camping or
>>> whatever because he probably has reasons for this architecture but I
>>> do agree that the issues you point out are unfortunate.
>>>
>>> Thanks, Rich(ard)
>>>
>>> On 8/28/07, robb <robb at canfield.com> wrote:
>>>> During a test of a medium scale backup (simulating RAM and directory
>>>> overhead for 77,000 files averaging 1 MB) I am finding some higher than
>>>> desired RAM usage.
>>>>
>>>> Brackup buffers all of the file stats in RAM (something that is a bit
>>>> tricky to change). This is a constant for the run, but Brackup also
>>>> buffers all chunk data until the backup is done and the metafile
>>>> written. This chunk data can add up to a considerable amount of usage.
>>>> The main reason for buffering chunk data seems to be to optimize the
>>>> size of the file by remembering the most common file and directory
>>>> modes. If the file/dir is the default mode then the mode line need not
>>>> be added.
>>>>
>>>> While optimizing the size of a file is always nice, the downsides for
>>>> large backups may overshadow this:
>>>>
>>>> - A failure during backup leaves the brackup metafile out of date (or
>>>> non-existent). Even if MANY files are properly saved, the index to them
>>>> is not available.
>>>>
>>>> - RAM is consumed and CPU resources used for saving this data in RAM.
>>>> This becomes an issue on virtualized systems with low RAM thresholds OR
>>>> for any system with LARGE data sets. Brackup currently stores the entire
>>>> file list in RAM, but during backup the chunk references are added to
>>>> this, and it is those chunk references that start to add up for large
>>>> files or large backup sets. A small chunk size makes it even worse.
>>>>
>>>> - The optimization is relatively small (actually very small) size-wise
>>>> given the other data in the file.
>>>>
>>>> I vote to drop the file size optimization and favor RAM (and execution
>>>> speed) by streaming the metafile as each file is finished. Any thoughts?
>>>>
>>>>
>>>
>>
> 
> 