Status and GPG processing questions
Richard Edward Horner
rich at richhorner.com
Wed Aug 29 01:59:58 UTC 2007
Well, yeah, a bigger problem though is in VPS implementations. Most
seem to be designed to be sold, not to work well. They all seem to
dynamically scale CPU but not RAM. You're stuck with whatever amount
of RAM you were allocated. I know that this is in part because of the
kernel design but I recall some patches being submitted recently that
allow for dynamic scaling of RAM in the kernel.
On 8/29/07, robb <robb at canfield.com> wrote:
> Agreed and done: dry-run effectively disables GPG.
> I am not sure how much memory leaking was occurring with GPG for
> multi-gig runs. I may try to test that. But otherwise I found leaking to
> be in the 20-30 MB range and that's not enough to explain David's
> issue. But I have not yet torn into S3 processing.
> One thing that could be a problem is that Brackup retains the chunk in
> RAM! For the default of 64MB that adds up to a whole lot of RAM usage
> VERY quickly. From what I can tell the total would never exceed 2x (so
> 128MB) since encrypted chunks are not read from disk until needed. But
> still, 128MB is a LOT of RAM. The easiest way to handle this is setting
> the chunk size to 5MB or so. Changing the way chunks are handled is
> tricky but I will probably look at it when I try to add alternate
> encryption/compression filters later this week (as time allows).
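
A minimal sketch of the smaller-chunk idea robb describes, in Python rather than Brackup's Perl. The names iter_chunks and digest_chunks are hypothetical and are not part of Brackup; the 5 MB default simply mirrors the figure suggested above. The point is only that reading and handing off one chunk at a time keeps memory bounded by a small multiple of the chunk size, not by the file size.

```python
import hashlib
import io


def iter_chunks(stream, chunk_size=5 * 1024 * 1024):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means end of stream.
            break
        yield chunk


def digest_chunks(stream, chunk_size=5 * 1024 * 1024):
    """Return a list of (length, sha1 hex digest) pairs, one per chunk.

    Only the small summaries are kept; each chunk's bytes can be freed
    once the next chunk has been read.
    """
    return [
        (len(chunk), hashlib.sha1(chunk).hexdigest())
        for chunk in iter_chunks(stream, chunk_size)
    ]
```

With a 25-byte stream and chunk_size=10 this yields three chunks of 10, 10 and 5 bytes. A real backup would encrypt and upload each chunk where this sketch only hashes it.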
> Another RAM issue is that the file list is built THEN processed. While
> it's nice to know a completion estimate it does chew up a lot of RAM to
> pre-build the file list. Perhaps an option to estimate versus a
> scan/backup combination is in order. That would save another 20-40 MB
> for large backup sets. In addition the pre-scan has some issues with
> files missing, permissions/ownership changes between scan and backup,
> etc. None of these are huge for small sets but for 100 GB and 30,000+
> files it becomes a bit of a problem.
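
The scan/backup combination mentioned above could look roughly like the sketch below: a generator that walks the tree lazily, so the full file list never sits in RAM and each file is examined right before it would be backed up. This is Python, not Brackup's Perl, and iter_files is a hypothetical name. Skipping files that vanish or become unreadable is an assumption about the desired behavior, not something the thread specifies.

```python
import os
import stat


def iter_files(root):
    """Lazily yield (path, size) for regular files under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks are not followed.
                st = os.lstat(path)
            except OSError:
                # File vanished or cannot be examined: skip it (assumed policy).
                continue
            if stat.S_ISREG(st.st_mode):
                yield path, st.st_size
```

The trade-off is the one robb notes: without a pre-scan there is no total to base a completion estimate on, so an estimate would have to be a separate, optional pass.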
> My main issue with RAM is that Brackup is destined for a number of VPS
> systems I have (local and remote). These systems have optimized RAM
> usage and I need something that is as small as reasonable (as my time
> allows), measurable, and predictable.
> But with all the changes I have done I will need to remeasure RAM
> performance from scratch some day. I know I am using less of it than
> prior versions of my code but have not done an end-to-end comparison yet.
> Richard Edward Horner wrote:
> > Robb,
> > Awesome work.
> > I had actually suspected there might have been an issue with this when
> > David posted his problem, hence my asking if he was using GPG.
> > On the --dry-run issue, I think ppl expect it to tell you what would
> > be done but not do things. If you read the man page for many ppl's
> > favorite package manager, apt-get, which I think would be a good point
> > for establishing expected behavior, it says:
> > --dry-run
> > No action; perform a simulation of events that would occur but do not
> > actually change the system.
> > "No action" is pretty clear but "perform a simulation" isn't exactly
> > the same as "no action". Sorry, I come from a family of lawyers.
> > Usually when you do a dry run on something, you just want to quickly
> > see what it would do, so not invoking GPG would be beneficial cuz it
> > would be faster and also if you're invoking --dry-run cuz you're
> > trying to back up a failing disk and you're not sure how many more
> > read/writes you're gonna get, having it do anything is not good. I
> > would be inclined to say have it really do nothing more than print
> > messages to the console. If you want further action, there can be
> > another flag.
> > Thanks, Rich(ard)
> > On 8/29/07, robb <robb at canfield.com> wrote:
> >> I had to tear into the GPG processing to locate some temp file
> >> anomalies. I found that temporary files are not always cleaned when GPG
> >> is active, and can accumulate at an alarming rate on large backups.
> >> That's fixed along with some memory leaking (minor) problems.
> >> Added improved recovery code for backups so that an error backing up a
> >> file/chunk no longer aborts the process. Set via the option --onerror
> >> (default is to halt code).
> >> An error log is now maintained for the run so that if
> >> '--onerror=continue' is given there is still a place to examine for
> >> errors. The name is based on the brackup metafile name with '.err'
> >> suffixed. Logs are maintained for dry-runs as well and they are suffixed
> >> with '-dry.err'.
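
For illustration, the halt-versus-continue behavior and the error log naming described above might be sketched as follows. This is Python, not the actual Brackup patch; run_backup, errlog_name and the store callback are hypothetical names, and appending the suffix directly to the metafile name is an assumption about what "based on the brackup metafile name" means.

```python
def errlog_name(metafile, dry_run=False):
    """Build the error log name from the metafile name (assumed: plain suffixing)."""
    return metafile + ("-dry.err" if dry_run else ".err")


def run_backup(items, store, onerror="halt", errlog=None):
    """Call store(item) for each item.

    onerror="halt" re-raises the first failure (the default);
    onerror="continue" records the failure and moves on.
    Returns a list of (item, message) pairs for the failures seen.
    """
    if onerror not in ("halt", "continue"):
        raise ValueError("onerror must be 'halt' or 'continue'")
    errors = []
    for item in items:
        try:
            store(item)
        except Exception as exc:
            if onerror == "halt":
                raise
            errors.append((item, str(exc)))
            if errlog is not None:
                # One line per failed item so the run can be reviewed afterwards.
                errlog.write(f"{item}: {exc}\n")
    return errors
```

With onerror="continue" a failing chunk is logged and the remaining items are still processed, which is why a separate place to examine errors matters.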
> >> I also found that --dry-run does NOT disable GPG. So the GPG process
> >> manager will happily burn CPU and disk creating files that are then
> >> deleted (via the new clean up code). Should GPG be disabled during dry
> >> runs? I suppose testing GPG on every file to see if it works or not
> >> might be useful, but that seems excessive.
Richard Edward Horner
Engineer / Composer / Electric Guitar Virtuoso
rich at richhorner.com
http://richhorner.com - updated June 28th