Analysis of the over-replication issue

info at bouncetek.com
Tue Jul 24 12:38:56 UTC 2007


I'm also experiencing excessive replication on a large scale. I have a MogileFS setup running on 5
hosts with 49 devices. There are 4 file classes: 3 with a mindevcount of 3 and 1 with a
mindevcount of 2. However, a huge number of files get replicated to many more devices than that,
in some cases to almost every disk. The table below shows how many files have each devcount, per class:

+------+---------+----------+-----------------+
| dmid | classid | devcount | COUNT(devcount) |
+------+---------+----------+-----------------+
|    1 |       1 |        1 |          274585 | 
|    1 |       1 |        3 |         7454296 | 
|    1 |       1 |        4 |           45505 | 
|    1 |       1 |        5 |           18600 | 
|    1 |       1 |        6 |           10868 | 
|    1 |       1 |        7 |            8381 | 
|    1 |       1 |        8 |            7414 | 
|    1 |       1 |        9 |            6364 | 
|    1 |       1 |       10 |            4693 | 
|    1 |       1 |       11 |            4066 | 
|    1 |       1 |       12 |            3674 | 
|    1 |       1 |       13 |            2855 | 
|    1 |       1 |       14 |            2975 | 
|    1 |       1 |       15 |            3078 | 
|    1 |       1 |       16 |            4484 | 
|    1 |       1 |       17 |            4453 | 
|    1 |       1 |       18 |           11250 | 
|    1 |       1 |       19 |             642 | 
|    1 |       1 |       20 |             449 | 
|    1 |       1 |       21 |             382 | 
|    1 |       1 |       22 |             591 | 
|    1 |       1 |       23 |             452 | 
|    1 |       1 |       24 |             327 | 
|    1 |       1 |       25 |             354 | 
|    1 |       1 |       26 |             363 | 
|    1 |       1 |       27 |             423 | 
|    1 |       1 |       28 |             555 | 
|    1 |       1 |       29 |             507 | 
|    1 |       1 |       30 |            1293 | 
|    1 |       1 |       31 |            1003 | 
|    1 |       1 |       32 |              85 | 
|    1 |       1 |       33 |              85 | 
|    1 |       1 |       34 |              67 | 
|    1 |       1 |       35 |              91 | 
|    1 |       1 |       36 |              70 | 
|    1 |       1 |       37 |              98 | 
|    1 |       1 |       38 |             145 | 
|    1 |       1 |       39 |             175 | 
|    1 |       1 |       40 |             360 | 
|    1 |       1 |       41 |             657 | 
|    1 |       1 |       42 |            1365 | 
|    1 |       1 |       43 |            3818 | 
|    1 |       1 |       44 |           17986 | 
|    1 |       2 |        0 |               1 | 
|    1 |       2 |        3 |         2182785 | 
|    1 |       2 |        4 |           12143 | 
|    1 |       2 |        5 |            9306 | 
|    1 |       2 |        6 |            9157 | 
|    1 |       2 |        7 |            8194 | 
|    1 |       2 |        8 |            6869 | 
|    1 |       2 |        9 |            7855 | 
|    1 |       2 |       10 |            6130 | 
|    1 |       2 |       11 |            6198 | 
|    1 |       2 |       12 |            5320 | 
|    1 |       2 |       13 |            5994 | 
|    1 |       2 |       14 |           16707 | 
|    1 |       2 |       15 |           15771 | 
|    1 |       2 |       16 |          169202 | 
|    1 |       2 |       17 |            1820 | 
|    1 |       2 |       18 |            2086 | 
|    1 |       2 |       19 |            2115 | 
|    1 |       2 |       20 |            2343 | 
|    1 |       2 |       21 |            2193 | 
|    1 |       2 |       22 |            1172 | 
|    1 |       2 |       23 |            1312 | 
|    1 |       2 |       24 |            1982 | 
|    1 |       2 |       25 |            1468 | 
|    1 |       2 |       26 |            1811 | 
|    1 |       2 |       27 |            1898 | 
|    1 |       2 |       28 |            2119 | 
|    1 |       2 |       29 |           30309 | 
|    1 |       2 |       30 |              31 | 
|    1 |       2 |       31 |               1 | 
|    1 |       2 |       42 |               1 | 
|    1 |       2 |       43 |               1 | 
|    1 |       2 |       44 |             276 | 
|    1 |       3 |        3 |            4440 | 
|    1 |       3 |        4 |              15 | 
|    1 |       3 |        5 |              11 | 
|    1 |       3 |        6 |              22 | 
|    1 |       3 |        7 |              16 | 
|    1 |       3 |        8 |               5 | 
|    1 |       3 |        9 |              13 | 
|    1 |       3 |       10 |               8 | 
|    1 |       3 |       11 |              28 | 
|    1 |       3 |       12 |              12 | 
|    1 |       3 |       13 |              10 | 
|    1 |       3 |       14 |              33 | 
|    1 |       3 |       15 |              37 | 
|    1 |       3 |       16 |             334 | 
|    1 |       3 |       17 |               2 | 
|    1 |       3 |       19 |               1 | 
|    1 |       3 |       22 |               1 | 
|    1 |       3 |       23 |               2 | 
|    1 |       3 |       24 |               2 | 
|    1 |       3 |       25 |               2 | 
|    1 |       3 |       27 |               4 | 
|    1 |       3 |       28 |              31 | 
|    1 |       3 |       29 |            1268 | 
|    1 |       4 |        2 |          161681 | 
|    1 |       4 |        3 |             596 | 
|    1 |       4 |        4 |             499 | 
|    1 |       4 |        5 |             421 | 
|    1 |       4 |        6 |             342 | 
|    1 |       4 |        7 |             351 | 
|    1 |       4 |        8 |             284 | 
|    1 |       4 |        9 |             262 | 
|    1 |       4 |       10 |             299 | 
|    1 |       4 |       11 |             337 | 
|    1 |       4 |       12 |             280 | 
|    1 |       4 |       13 |             367 | 
|    1 |       4 |       14 |             646 | 
|    1 |       4 |       15 |             712 | 
|    1 |       4 |       16 |            8431 | 
|    1 |       4 |       17 |              94 | 
|    1 |       4 |       18 |              88 | 
|    1 |       4 |       19 |              94 | 
|    1 |       4 |       20 |              76 | 
|    1 |       4 |       21 |              83 | 
|    1 |       4 |       22 |              96 | 
|    1 |       4 |       23 |              93 | 
|    1 |       4 |       24 |              86 | 
|    1 |       4 |       25 |              83 | 
|    1 |       4 |       26 |             104 | 
|    1 |       4 |       27 |             112 | 
|    1 |       4 |       28 |             661 | 
|    1 |       4 |       29 |           14789 | 
+------+---------+----------+-----------------+
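For anyone who wants to produce the same kind of histogram on their own setup, here is a minimal sketch of a query that gives output in this shape. It assumes the tracker database has a file table with dmid, classid and devcount columns; the in-memory SQLite table and the sample rows are made up for illustration and are not my real data.

```python
import sqlite3

# Assumed minimal schema: only the columns the histogram needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file ("
    "fid INTEGER PRIMARY KEY, dmid INTEGER, classid INTEGER, devcount INTEGER)"
)

# Made-up sample rows (fid, dmid, classid, devcount), not real data.
sample_rows = [
    (1, 1, 1, 3),
    (2, 1, 1, 3),
    (3, 1, 1, 44),
    (4, 1, 4, 2),
    (5, 1, 4, 16),
]
conn.executemany("INSERT INTO file VALUES (?, ?, ?, ?)", sample_rows)

# Count how many files have each devcount, per domain and class.
HISTOGRAM_SQL = """
    SELECT dmid, classid, devcount, COUNT(devcount)
    FROM file
    GROUP BY dmid, classid, devcount
    ORDER BY dmid, classid, devcount
"""


def devcount_histogram(connection):
    """Return (dmid, classid, devcount, number_of_files) rows."""
    return connection.execute(HISTOGRAM_SQL).fetchall()


for row in devcount_histogram(conn):
    print(row)
```

Any row whose devcount is above the mindevcount of its class counts files that are over-replicated.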

I've begun digging into the MogileFS code and made several observations that might
help us solve this problem.

- Most of the over-replicated files still have entries in the file_to_replicate table.
- Running !watch shows a lot of 'ran out of suggestions' errors. When I manually check
the fids involved, they all appear to have more than mindevcount copies.
- For a long period my disks were slow to respond and had many timeouts
 (which I solved by using lighttpd with multiple workers). This may have caused the
initial replication to fail and somehow end up looping.
- A plausible fix seems to be a simple check at the start of replicate() in Replicate.pm
that counts the 'alive' devices the file exists on and compares that with the mindevcount
for its class. If the count meets or exceeds it, replicate() can return "2 (success, but someone else replicated it)"
so that replicate_using_torepl_table() can safely call delete_fid_from_file_to_replicate().
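
To make the last point concrete, here is a rough sketch of the check I have in mind. It is written in Python purely as an illustration and is not MogileFS code: enough_live_copies(), replicate_sketch() and the do_copy callback are hypothetical names, and the only value taken from the real code is the return code 2 quoted above.

```python
# Return code quoted above: "success, but someone else replicated it".
ALREADY_REPLICATED = 2


def enough_live_copies(fid_devids, alive_devids, mindevcount):
    """True if the file is already on at least mindevcount alive devices."""
    live_copies = set(fid_devids) & set(alive_devids)
    return len(live_copies) >= mindevcount


def replicate_sketch(fid_devids, alive_devids, mindevcount, do_copy):
    """Hypothetical wrapper showing where the early check would sit.

    do_copy stands in for the existing replication logic and is only
    called when the file really is short of copies.
    """
    if enough_live_copies(fid_devids, alive_devids, mindevcount):
        # Nothing to do: the caller can drop the fid from its queue.
        return ALREADY_REPLICATED
    return do_copy()


# A file on 4 alive devices with mindevcount 3 is skipped.
print(replicate_sketch([1, 2, 3, 4], {1, 2, 3, 4, 5}, 3, lambda: 1))
```

Copies on devices that are not alive are deliberately ignored, so a file with 3 copies of which one sits on a dead device would still go through the normal replication path.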

For reference, the current state of the cluster:

Checking trackers...
  192.168.0.100:6001 ... OK

Checking hosts...
  [ 1] storage1 ... OK
  [ 2] storage2 ... OK
  [ 3] storage3 ... OK
  [ 4] storage4 ... OK
  [ 5] storage5 ... OK

Checking devices...
  host device         size(G)    used(G)    free(G)   use%   ob state   I/O%
  ---- ------------ ---------- ---------- ---------- ------ ---------- -----
  [ 1] dev1           136.611     81.343     55.268  59.54%  writeable  90.8
  [ 1] dev2           136.611    105.411     31.200  77.16%  writeable 100.4
  [ 1] dev3           136.611    105.901     30.710  77.52%  writeable 100.4
  [ 1] dev4           136.611    105.905     30.706  77.52%  writeable 100.4
  [ 1] dev5           136.611    105.970     30.641  77.57%  writeable  89.6
  [ 1] dev6           136.611    105.478     31.133  77.21%  writeable  88.4
  [ 1] dev7           136.611     81.648     54.964  59.77%  writeable 100.4
  [ 1] dev8           136.611    105.486     31.126  77.22%  writeable  92.0
  [ 1] dev9           136.611    118.235     18.376  86.55%  writeable 100.4
  [ 1] dev10          136.611    118.128     18.483  86.47%  writeable  83.2
  [ 1] dev11          136.611    118.091     18.520  86.44%  writeable 100.4
  [ 1] dev12          136.611    118.468     18.143  86.72%  writeable 100.4
  [ 1] dev13          136.611    117.953     18.658  86.34%  writeable 100.4
  [ 1] dev14          136.611    118.154     18.457  86.49%  writeable 100.4
  [ 1] dev15          136.611    122.528     14.083  89.69%  writeable 100.4
  [ 1] dev16          136.611    121.794     14.817  89.15%  writeable 100.4
  [ 1] dev17          136.611    122.161     14.450  89.42%  writeable 100.4
  [ 1] dev18          136.611     98.298     38.313  71.95%  writeable 100.0
  [ 1] dev19          136.611    117.070     19.541  85.70%  writeable 100.4
  [ 1] dev20          136.611    121.825     14.786  89.18%  writeable 100.4
  [ 1] dev21          136.611    122.386     14.225  89.59%  writeable 100.4
  [ 1] dev22          136.611    121.994     14.617  89.30%  writeable 100.4
  [ 1] dev23          136.611    121.900     14.711  89.23%  writeable 100.4
  [ 1] dev24          136.611    121.897     14.714  89.23%  writeable 100.4
  [ 1] dev25          136.611    121.664     14.947  89.06%  writeable 100.4
  [ 1] dev26          136.611    122.278     14.333  89.51%  writeable 100.4
  [ 1] dev27          136.611     98.187     38.424  71.87%  writeable 100.4
  [ 1] dev28          136.611    121.952     14.659  89.27%  writeable 100.4
  [ 2] dev29          698.101     68.175    629.926   9.77%  writeable  26.7
  [ 2] dev30          698.101     68.105    629.996   9.76%  writeable   4.0
  [ 2] dev31          698.101     67.369    630.731   9.65%  writeable  15.8
  [ 2] dev32          698.101     68.895    629.206   9.87%  writeable  11.9
  [ 2] dev33          698.101     69.219    628.882   9.92%  writeable   5.0
  [ 2] dev34          698.101     69.215    628.886   9.91%  writeable  11.9
  [ 2] dev35          698.101     67.500    630.601   9.67%  writeable   6.9
  [ 2] dev36          698.101     68.141    629.959   9.76%  writeable   0.0
  [ 2] dev37          698.101     69.401    628.700   9.94%  writeable  10.9
  [ 2] dev38          698.101     69.019    629.082   9.89%  writeable  16.8
  [ 2] dev39          698.101     68.304    629.797   9.78%  writeable  19.8
  [ 2] dev40          698.101     67.813    630.287   9.71%  writeable   2.0
  [ 2] dev41          698.101     68.758    629.343   9.85%  writeable   6.9
  [ 2] dev42          698.101     67.699    630.401   9.70%  writeable   1.0
  [ 2] dev43          698.101     69.285    628.816   9.92%  writeable  18.8
  [ 3] dev44          229.176    193.427     35.749  84.40%  writeable   3.2
  [ 3] dev45          222.451    192.966     29.485  86.75%  writeable   5.2
  [ 4] dev46          229.176    200.376     28.800  87.43%  writeable   0.0
  [ 4] dev47          222.451    196.413     26.037  88.30%  writeable   0.0
  [ 5] dev48          182.044    162.701     19.342  89.37%  writeable  21.6
  [ 5] dev49          232.823    194.586     38.236  83.58%  writeable  18.8
  ---- ------------ ---------- ---------- ---------- ------
             total: 15614.739   5329.472  10285.267  34.13%


Regards,

Arjan



