Recovery procedure

Brad Fitzpatrick brad at
Tue Nov 8 20:31:54 PST 2005


If you obey step 1, you can't mess anything up.  (But if you already did,
you can still recover...)

Step 2:  modify the device table, marking the old device as dead.

Step 3:  mark the host alive.

Step 4:  make a new device (with a new device id) on the alive host.

After that, MogileFS will just re-replicate the lost files for you.
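
To make steps 2 through 4 concrete, here is a rough sketch of the same
changes run against a stand-in database. Everything in it is an assumption
for illustration: the tables are a cut-down imitation (host and device with
only id and status columns), the column names hostid, devid and status are
guesses rather than a checked copy of the real schema, and recover() is a
hypothetical helper, not something that ships with MogileFS. It uses an
in-memory SQLite database so it can be tried without touching a live
tracker database.

```python
import sqlite3


def recover(conn, host_id, dead_dev_id, new_dev_id):
    """Apply steps 2-4 in a single transaction (hypothetical helper)."""
    with conn:  # commits on success, rolls back if any statement fails
        # Step 2: mark the old device as dead.
        conn.execute(
            "UPDATE device SET status = 'dead' WHERE devid = ? AND hostid = ?",
            (dead_dev_id, host_id),
        )
        # Step 3: mark the host alive.
        conn.execute(
            "UPDATE host SET status = 'alive' WHERE hostid = ?",
            (host_id,),
        )
        # Step 4: add a new device, with a new device id, on the alive host.
        conn.execute(
            "INSERT INTO device (devid, hostid, status) VALUES (?, ?, 'alive')",
            (new_dev_id, host_id),
        )


# Assumed, simplified stand-in tables: one host marked down, two devices.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE host (hostid INTEGER PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE device (
        devid  INTEGER PRIMARY KEY,
        hostid INTEGER NOT NULL,
        status TEXT NOT NULL
    );
    INSERT INTO host VALUES (1, 'down');
    INSERT INTO device VALUES (1, 1, 'alive');  -- the drive that failed
    INSERT INTO device VALUES (2, 1, 'alive');  -- the drive that is still good
    """
)

recover(conn, host_id=1, dead_dev_id=1, new_dev_id=3)

for row in conn.execute("SELECT devid, hostid, status FROM device ORDER BY devid"):
    print(row)
```

Because devid is the primary key in this sketch, reusing the dead device's
id in step 4 fails and the whole transaction rolls back, which matches the
"new device id" requirement above. Device 2 is left untouched.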

On Tue, 8 Nov 2005, Brandon Ooi wrote:

> Hi,
> We suffered our first drive failure on mogile and I'm trying to figure
> out the aftermath.
> 1) the primary drive on one of our storage nodes failed
> 2) this caused both our mogilefsd daemons to hang, requiring a restart...
> 3) In SQL, I marked the host as 'down' so that requests would not go to
> that machine.
> 4) I replaced the primary drive and brought the machine back up.
> There is a second drive on the machine that is still good and I'd like
> to use that. How do I mark the device as 'dead' and the host as 'alive'
> and have mogile re-replicate everything that was lost? Is there a
> standard way of doing this?
> brandon

