Troubleshooting multiprocess communication

Fri Mar 21 23:45:13 UTC 2008

Radu Greab wrote:
> dormando wrote:
>> Except the delete job is never getting that message, and the procmanager 
>> code prevents the job monitor's subsequent broadcasts from being sent to 
>> the deleter, since the status hasn't changed.
>> The symptom of this is any deletes destined for those hosts get cycled 
>> through file_to_delete_later and back again every 600 seconds.
> I've seen this and sent a patch which was applied as change #1120. Is
> that change included in the code you use?

Actually, that's probably why I can't find the bug in the code :P I 
missed that patchset a while ago. I've been analyzing the code off of 
trunk, but the bug's affecting production (which is running a previous 
release).
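
For anyone following along, here's a rough sketch of the failure mode 
quoted above. It's a toy model in Python, not MogileFS code: the class 
and method names are made up, and it assumes the deleter missed the first 
message simply because it registered after it was sent.

```python
class ProcManager:
    """Toy parent process that relays host status to workers (hypothetical)."""

    def __init__(self):
        self.last_sent = {}  # host -> last status that was broadcast
        self.workers = []

    def add_worker(self, worker):
        self.workers.append(worker)

    def broadcast(self, host, status):
        # Suppress repeats: only relay when the status actually changed.
        if self.last_sent.get(host) == status:
            return False
        self.last_sent[host] = status
        for worker in self.workers:
            worker.receive(host, status)
        return True


class Worker:
    """Toy worker that remembers the last status it was told per host."""

    def __init__(self, name):
        self.name = name
        self.known = {}

    def receive(self, host, status):
        self.known[host] = status


pm = ProcManager()
replicator = Worker("replicate")
pm.add_worker(replicator)
first = pm.broadcast("host1", "alive")   # relayed: status is new

deleter = Worker("delete")               # assumed to register late
pm.add_worker(deleter)
second = pm.broadcast("host1", "alive")  # suppressed: status unchanged

print(replicator.known, deleter.known)
```

The deleter stays in the dark until the status actually changes, which 
matches the symptom of deletes being deferred over and over.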

We need to cut another release of MogileFS. There're enough changes to 
trunk that I can't even remember what's been fixed anymore :\

Brad; ping? :)


