gc and inventory db
Kostas Chatzikokolakis
kostas at chatzi.org
Fri Jan 22 01:07:47 UTC 2010
> The only tricky part is that the @orphaned_chunks are the stored chunk
> hashes, which are used as (part of) the inventory_db *values*, not the
> keys. (And we can't derive the keys because those mappings were in
> metafiles we've already pruned.)
>
> So I think we have to add something like a delete_values_beginning_with()
> method to the Dictionary classes under the InventoryDatabase, which seems
> a bit ugly.
A simple solution is to put all the digests to delete in a hash (even with
thousands of chunks it shouldn't take too much memory). Then you iterate
over the whole inventory db once, retrieve the digest from each value,
and delete the entry if its digest is in the hash.
It's not optimal, but it's simple, and it should be quite fast compared
to the time it takes to run gc anyway.
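
Roughly what I have in mind, as a sketch in Python rather than Brackup's
Perl. Everything here is an assumption for illustration: the inventory is
modelled as a plain dict, each value is assumed to look like
"<digest> <size>", and prune_inventory is a made-up name, not an existing
Brackup method.

```python
def prune_inventory(inventory, orphaned_digests):
    """Delete every entry whose stored digest is orphaned; return the count."""
    # Put the digests in a set so each membership test is O(1).
    orphaned = set(orphaned_digests)

    # Single pass over the whole inventory. Collect the keys first so the
    # mapping is never modified while it is being iterated.
    doomed = [
        key
        for key, value in inventory.items()
        # Assumed value layout: "<digest> <size>"; the digest is the first field.
        if value.split(" ", 1)[0] in orphaned
    ]

    for key in doomed:
        del inventory[key]
    return len(doomed)


# Hypothetical data: two keys map to the same orphaned chunk digest.
inventory = {
    "chunk-key-a": "sha1:aaa 1024",
    "chunk-key-b": "sha1:bbb 2048",
    "chunk-key-c": "sha1:aaa 1024",
}
removed = prune_inventory(inventory, ["sha1:aaa"])
```

The cost is one scan of the inventory plus one hash lookup per entry, and
no reverse index from digest to key is needed.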
Kostas