Defining our own Mgd::find_deviceid...

Saunders, Newton nsaunders at corp.untd.com
Fri Nov 3 22:46:24 UTC 2006


Hi,

My company is investigating MogileFS, and we are trying to come up with a
good method for rebalancing files when new storage nodes are added.
Defining our own replication policy solves part of the problem. Ideally,
however, we'd like to specify where the first copy of a new file is
stored in the same way that we can specify where subsequent copies are
stored (in the replication policy). To do that, we'd need to be able to
define our own Mgd::find_deviceid.

To show why we'd need to define our own Mgd::find_deviceid, here is how
we'd implement rebalancing:

- Before the existing storage nodes get too full (say, at ~60% capacity),
add some new, empty storage nodes.

- Mgd::find_deviceid would place the initial copy of each new file on
existing storage nodes only. (Reason: we assume that new files are
accessed more frequently than older ones, and we do not want the
majority of new files to end up on the new storage nodes.)

- We would run a separate process that simply "rewrites" old files.
When the initial copy of one of these old files is re-saved,
Mgd::find_deviceid would place it on the new storage nodes.

- For replication, our policy would use the same Mgd::find_deviceid
defined above, so that new files are replicated to existing storage
nodes and old files to new storage nodes.

- Once all storage nodes are balanced, we would stop "rewriting" old
files, and new files would again be saved as they are now: to the
storage node with the most available space.

- A few notes:

  - We are able to differentiate between new files and rewritten files
because we store metadata about each file outside of MogileFS (such as
the original creation date). If the creation date is before the date
the storage node was added, we assume the file is being rewritten;
otherwise, we assume it is new.

  - A storage node is considered "new" if its usage is substantially
lower than the average usage across all storage nodes.
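
To make the selection logic above concrete, here is a rough sketch. It
is written in Python purely for illustration; MogileFS itself is Perl,
and none of these names come from its code base. The Device class, the
helper functions, the 0.5 cutoff for "substantially lower than average",
and the choice to compare a file's creation date against the earliest
add date among the "new" nodes are all assumptions on our part:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Device:
    """Hypothetical stand-in for a storage device (not a MogileFS type)."""
    devid: int
    total_mb: int
    used_mb: int
    added_at: datetime  # when the storage node joined the cluster

    @property
    def usage(self) -> float:
        return self.used_mb / self.total_mb

    @property
    def free_mb(self) -> int:
        return self.total_mb - self.used_mb


# Assumed cutoff: a node is "new" if its usage is below half the average.
NEW_NODE_RATIO = 0.5


def split_nodes(devices: List[Device]):
    """Partition devices into (new, existing) by usage relative to average."""
    average = sum(d.usage for d in devices) / len(devices)
    new = [d for d in devices if d.usage < NEW_NODE_RATIO * average]
    existing = [d for d in devices if d.usage >= NEW_NODE_RATIO * average]
    return new, existing


def find_deviceid(file_created_at: datetime, devices: List[Device]) -> int:
    """Pick the device for the initial copy of a file.

    file_created_at comes from our own metadata, kept outside MogileFS.
    """
    new, existing = split_nodes(devices)
    if not new:
        # Cluster is balanced: fall back to "most available space".
        return max(devices, key=lambda d: d.free_mb).devid

    # A file created before the new nodes were added is being rewritten.
    cutoff = min(d.added_at for d in new)
    candidates = new if file_created_at < cutoff else existing
    return max(candidates, key=lambda d: d.free_mb).devid
```

With two existing nodes at roughly 60% usage and one nearly empty node
added later, a file created after the new node was added would go to an
existing node, while an older file being rewritten would go to the new
node.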

If we submitted a patch that allowed users to define their own version
of Mgd::find_deviceid, would it be accepted for inclusion?

Thanks,

Newton Saunders

 


