Troubleshooting multiprocess communication

dormando dormando at rydia.net
Fri Mar 21 23:45:13 UTC 2008


Radu Greab wrote:
> dormando wrote:
>> Except the delete job is never getting that message, and the procmanager 
>> code prevents the job monitor's subsequent broadcasts from being sent to 
>> the deleter, since the status hasn't changed.
>>
>> The symptom of this is any deletes destined for those hosts get cycled 
>> through file_to_delete_later and back again every 600 seconds.
> 
> I've seen this and sent a patch which was applied as change #1120. Is
> that change included in the code you use?

Actually, that's probably why I can't find the bug in the code :P I 
missed that patchset a while ago. I've been analyzing the code off of 
trunk, but the bug is affecting production, which runs a previous release.
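
The failure mode described in the quoted text can be sketched as a toy 
model. This is Python, not the actual MogileFS Perl code: the class names, 
the `dedupe` flag and the `listening` flag are all made up for 
illustration, the reason the deleter misses the first message is assumed, 
and nothing here claims to reflect what change #1120 actually does.

```python
class Worker:
    """Hypothetical stand-in for a child job such as the deleter."""

    def __init__(self, name, listening=True):
        self.name = name
        self.listening = listening  # False models a message that gets lost
        self.known = {}             # host -> last status this worker saw

    def receive(self, host, status):
        if not self.listening:
            return False            # message never arrives
        self.known[host] = status
        return True


class ProcManager:
    """Hypothetical parent that relays host status to its workers."""

    def __init__(self, dedupe=True):
        self.dedupe = dedupe
        self.last_sent = {}         # host -> last status broadcast
        self.workers = []

    def broadcast(self, host, status):
        # With dedupe on, a repeat of the same status is dropped here,
        # even if some worker never received the first copy.
        if self.dedupe and self.last_sent.get(host) == status:
            return 0
        self.last_sent[host] = status
        return sum(1 for w in self.workers if w.receive(host, status))


def simulate(dedupe):
    pm = ProcManager(dedupe=dedupe)
    deleter = Worker("delete", listening=False)
    pm.workers.append(deleter)

    pm.broadcast("host1", "alive")      # first broadcast is lost
    deleter.listening = True
    for _ in range(3):                  # job monitor keeps repeating itself
        pm.broadcast("host1", "alive")

    return deleter.known.get("host1")


if __name__ == "__main__":
    print("with dedupe:   ", simulate(True))
    print("without dedupe:", simulate(False))
```

With suppression on, the worker that missed the first message never learns 
the host's status, because every later broadcast looks unchanged to the 
parent. With suppression off, the first repeat repairs the worker's view.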

We need to cut another release of MogileFS. There're enough changes to 
trunk that I can't even remember what's been fixed anymore :\

Brad; ping? :)

-Dormando

