Kostas Chatzikokolakis' lowmem patches merged

Kostas Chatzikokolakis kostas at chatzi.org
Mon Jan 18 18:24:32 UTC 2010

> Turns out there's a problem with using tempfile_obj here. If we're doing 
> lots of small files and encryption is fast compared to storage, we end up
> building up a big backlog of encrypted tempfiles we're waiting to consume.
> We have open filehandles on all of these, and can easily hit the limit on
> number of open files, and fall over. I saw this in the weekend on a large
> backup, quadrupled the open files limit from 1024 to 4096, and it still 
> died pretty quickly.

You're right, that's a problem.

> I see two options:
> 1. Revert to what we had earlier, which is using the tempfile function, 
>    and closing the initial filehandle, reopening the file only when we're
>    ready to consume the data (at which time we unlink the file). This is
>    reasonable, although File::Temp points out there's a race condition
>    there that we avoid if we hang on to the open file handle.

I'd vote for this. I don't like taking up resources that are not really
used, even in a controlled way. I'm not sure about the race condition,
but tempfile() opens the file in a temp dir that shouldn't be used by
any other process.

Btw, unlinking an open file might not work on all platforms (does it on
windows?). We could unlink in forget_chunkref().

> 2. We're really encrypting too fast in this situation - having a thousand
>    encrypted tempfiles on disk ready to go really isn't that useful. At the
>    moment the GPGProcManager only slows down when it's got 128MB of data on 
>    disk waiting to be read. I'm thinking we should we also throttle on the
>    actual number of outstanding chunks? e.g. attached patch
> Thoughts? 
> Not sure what the uncollected_chunk limit should be either - setting it to 
> 20 is 4 * the number of GPG processes we run by default, and adds that many
> to the number of open files we have (but limits it at that). Too big/small?

This is also a reasonable thing to do in general (not just to fix this
problem). Maybe it's better to just make the 128MB limit configurable.
In principle one should care about space, not number of files, and the
number of files will lower anyway if you lower the size limit.


More information about the brackup mailing list