Status and GPG processing questions
robb at canfield.com
Wed Aug 29 01:44:53 UTC 2007
Agreed and done: --dry-run now effectively disables GPG.
I am not sure how much memory was leaking with GPG on multi-gig runs; I
may try to test that. Otherwise I found the leaking to be in the
20-30 MB range, and that's not enough to explain David's issue. But I
have not yet torn into S3 processing.
One thing that could be a problem is that Brackup retains the chunk in
RAM! For the default of 64MB that adds up to a whole lot of RAM usage
VERY quickly. From what I can tell the total would never exceed 2x (so
128MB) since encrypted chunks are not read from disk until needed. But
still, 128MB is a LOT of RAM. The easiest way to handle this is setting
the chunk size to 5MB or so. Changing the way chunks are handled is
tricky but I will probably look at it when I try to add alternate
encryption/compression filters later this week (as time allows).
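
To make the chunk-handling idea concrete, here is a rough sketch of
reading a chunk in small blocks instead of holding it all in RAM. It is
Python rather than Brackup's Perl, and stream_chunk, block_size and the
in-memory source/sink are made-up names for illustration, not anything
in Brackup. SHA-1 is only an example digest. Peak memory is about one
block, whatever the chunk size.

```python
import hashlib
import io


def stream_chunk(src, offset, length, sink, block_size=1024 * 1024):
    """Copy `length` bytes starting at `offset` from src to sink in blocks.

    Returns the SHA-1 hex digest of the bytes copied. Only one block is
    held in memory at a time. (Hypothetical helper, not Brackup code.)
    """
    digest = hashlib.sha1()
    src.seek(offset)
    remaining = length
    while remaining > 0:
        block = src.read(min(block_size, remaining))
        if not block:  # source ended before the chunk was complete
            break
        digest.update(block)
        sink.write(block)
        remaining -= len(block)
    return digest.hexdigest()


# Demo with in-memory streams standing in for a file and a target.
data = bytes(range(256)) * 1024  # 256 KiB of sample data
src = io.BytesIO(data)
sink = io.BytesIO()
chunk_digest = stream_chunk(src, offset=1000, length=100_000,
                            sink=sink, block_size=4096)
```

An encryption or compression filter would sit between the read and the
write in the loop, which is why this ties in with the filter work.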
Another RAM issue is that the file list is built THEN processed. While
it's nice to know a completion estimate it does chew up a lot of RAM to
pre-build the file list. Perhaps an option to choose between the up-front
estimate and a combined scan/backup pass is in order. That would save
another 20-40 MB
for large backup sets. In addition the pre-scan has some issues with
files missing, permissions/ownership changes between scan and backup,
etc. None of these are huge for small sets but for 100 GB and 30,000+
files it becomes a bit of a problem.
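
A combined scan/backup pass might look something like the sketch below:
paths are yielded one at a time and each file is stat'ed right before
it is handled, so a file that vanished since the directory was listed
is skipped rather than trusted from a stale list. Again this is Python
for illustration only; iter_files, backup_tree and the backup_one
callback are hypothetical names, not Brackup's.

```python
import os
import tempfile


def iter_files(root):
    """Yield file paths under root lazily instead of building a full list."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def backup_tree(root, backup_one):
    """Scan and back up in one pass. Returns (done, skipped) counts."""
    done, skipped = 0, 0
    for path in iter_files(root):
        try:
            info = os.stat(path)  # fresh metadata, taken just before use
        except OSError:
            skipped += 1  # vanished or unreadable since it was listed
            continue
        backup_one(path, info.st_size)
        done += 1
    return done, skipped


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("hello")
    seen = []
    counts = backup_tree(
        root,
        lambda path, size: seen.append((os.path.relpath(path, root), size)),
    )
```

The trade-off is the one mentioned above: without the pre-scan there is
no total to base a completion estimate on.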
My main issue with RAM is that Brackup is destined for a number of VPS
systems I have (local and remote). These systems have optimized RAM
usage, and I need something that is as small as reasonable (as my time
allows), measurable, and predictable.
But with all the changes I have made I will need to remeasure RAM
performance from scratch some day. I know I am using less of it than
the prior version of my code, but I have not done an end-to-end
comparison yet.
Richard Edward Horner wrote:
> Awesome work.
> I had actually suspected there might have been an issue with this when
> David posted his problem, hence my asking if he was using GPG.
> On the --dry-run issue, I think ppl expect it to tell you what would
> be done but not do things. If you read the man page for many ppl's
> favorite package manager, apt-get, which I think would be a good point
> for establishing expected behavior, it says:
> No action; perform a simulation of events that would occur but do not
> actually change the system.
> "No action" is pretty clear but "perform a simulation" isn't exactly
> the same as "no action". Sorry, I come from a family of lawyers.
> Usually when you do a dry run on something, you just want to quickly
> see what it would do, so not invoking GPG would be beneficial cuz it
> would be faster and also if you're invoking --dry-run cuz you're
> trying to back up a failing disk and you're not sure how many more
> read/writes you're gonna get, having it do anything is not good. I
> would be inclined to say have it really do nothing more than print
> messages to the console. If you want further action, there can be
> another flag.
> Thanks, Rich(ard)
> On 8/29/07, robb <robb at canfield.com> wrote:
>> I had to tear into the GPG processing to locate some temp file
>> anomalies. I found that temporary files are not always cleaned when GPG
>> is active, and can accumulate at an alarming rate on large backups.
>> That's fixed along with some memory leaking (minor) problems.
>> Added improved recovery code for backups so that an error backing up a
>> file/chunk no longer aborts the process. Set via the option --onerror
>> (default is to halt code).
>> An error log is now maintained for the run so that if
>> '--onerror=continue' is given there is still a place to examine for
>> errors. The name is based on the brackup metafile name with '.err'
>> suffixed. Logs are maintained for dry-runs as well and they are suffixed
>> with '-dry.err'.
>> I also found that --dry-run does NOT disable GPG. So the GPG process
>> manager will happily burn CPU and disk creating files that are then
>> deleted (via the new clean up code). Should GPG be disabled during dry
>> runs? I suppose testing GPG on every file to see if it works or not
>> might be useful, but that seems excessive.
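
For reference, the --onerror behavior described in the quoted message
could be sketched roughly as follows. This is Python for illustration,
not the actual Brackup code. The "halt" value, the function names and
the sample metafile name are assumptions; only '--onerror=continue' and
the '.err' / '-dry.err' suffixes come from the message, and appending
the suffix to the full metafile name is one reading of it.

```python
def error_log_name(metafile, dry_run=False):
    """Build the error log name from the metafile name (assumed scheme)."""
    return metafile + ("-dry.err" if dry_run else ".err")


def run_backup(items, backup_one, metafile, onerror="halt", dry_run=False):
    """Back up each item; on failure either stop or record it and go on.

    Returns (log_name, errors, processed). Errors are always collected so
    there is something to examine even when the run continues.
    """
    errors, processed = [], []
    for item in items:
        try:
            backup_one(item)
            processed.append(item)
        except Exception as exc:
            errors.append(f"{item}: {exc}")
            if onerror != "continue":
                break  # default: halt at the first error
    return error_log_name(metafile, dry_run), errors, processed


def flaky(item):
    """Stand-in backup step that fails for one item."""
    if item == "b":
        raise IOError("read failed")


log_name, errors, processed = run_backup(
    ["a", "b", "c"], flaky, "home-20070828.brackup", onerror="continue"
)
halted = run_backup(["a", "b", "c"], flaky, "home-20070828.brackup")
```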