
mdadm fails to start array after replacing 2 drives.


ziggo0

Long story short: two drives failed in a RAID6 array. I replaced both drives, used --add to put them back into the array, and mdadm spent a day rebuilding. All was well until I rebooted: "Failed to mount /mnt/nas blahblah /dev/md5 not ready". Here is cat /proc/mdstat:

Code:
root@rommie:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md5 : inactive sdh1[7](S) sdf1[5](S) sdg1[6](S) sda1[1](S) sdd1[3](S) sde1[4](S) sdb1[2](S) sdc[8](S)
      15628107008 blocks

unused devices: <none>
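
For anyone decoding that output: "inactive" plus the (S) after every member means the kernel collected the devices but held them all as spares and never started the array, which usually happens when auto-assembly sees conflicting superblocks. A quick, non-destructive way to compare what each candidate device claims (a diagnostic sketch; note it checks the bare /dev/sdc as well as the partitions):

Code:
# the stale record will show an older Update Time and a lower Events count
for d in /dev/sd[a-h]1 /dev/sdc; do
    echo "== $d =="
    mdadm --examine "$d" 2>/dev/null | grep -E 'Update Time|Events'
done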

The two drives I replaced were /dev/sdc1 and /dev/sdd1. You can see /dev/sdd1 in there, but for some reason it's listing /dev/sdc, the bare disk, instead of partition 1, which is what should be there (my array mixes 4K Advanced Format and 512-byte-sector drives, so every member is a partition). Here is mdadm --examine /dev/sdc1:

Code:
root@rommie:~# mdadm --examine /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : c0370395:f845630f:26c640ad:ebf0ffa6
  Creation Time : Sat Mar 12 16:43:53 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
     Array Size : 11721071616 (11178.09 GiB 12002.38 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 5

    Update Time : Tue Jun  5 05:33:22 2012
          State : clean
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 86970e0f - correct
         Events : 3867051

         Layout : left-symmetric
     Chunk Size : 512K

      Number   Major   Minor   RaidDevice State
this     0       8       33        0      active sync   /dev/sdc1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       17        2      active sync   /dev/sdb1
   3     3       8       49        3      active sync   /dev/sdd1
   4     4       8       65        4      active sync   /dev/sde1
   5     5       8       81        5      active sync   /dev/sdf1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8      113        7      active sync   /dev/sdh1

What's weird is what mdadm reports when I examine /dev/sdc, the whole disk, rather than partition 1:

Code:
root@rommie:~# mdadm --examine /dev/sdc
/dev/sdc:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : c0370395:f845630f:26c640ad:ebf0ffa6
  Creation Time : Sat Mar 12 16:43:53 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
     Array Size : 11721071616 (11178.09 GiB 12002.38 GB)
   Raid Devices : 8
  Total Devices : 7
Preferred Minor : 5

    Update Time : Sat Jun  2 14:57:05 2012
          State : clean
 Active Devices : 6
Working Devices : 7
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 86939768 - correct
         Events : 3866263

         Layout : left-symmetric
     Chunk Size : 512K

      Number   Major   Minor   RaidDevice State
this     8       8       32        8      spare   /dev/sdc

   0     0       0        0        0      removed
   1     1       8        1        1      active sync   /dev/sda1
   2     2       8       17        2      active sync   /dev/sdb1
   3     3       0        0        3      faulty removed
   4     4       8       65        4      active sync   /dev/sde1
   5     5       8       81        5      active sync   /dev/sdf1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8      113        7      active sync   /dev/sdh1
   8     8       8       32        8      spare   /dev/sdc

There it is: the old, failed state of the array, preserved in a stale superblock on the bare disk...
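
The "this 8 8 32 8 spare" line is the giveaway: major 8, minor 32 is the whole disk /dev/sdc, while minor 33 is /dev/sdc1, so the bare disk still carries a stale 0.90 superblock from when it was added as a spare. Because 0.90 metadata sits near the end of a device, the whole-disk copy and the partition copy can coexist. One commonly suggested cleanup, sketched here rather than verified (it erases the stale record, so it's worth re-checking both --examine outputs first), is to stop the array and zero only the whole-disk superblock:

Code:
mdadm --stop /dev/md5                 # the array must be stopped first
mdadm --examine /dev/sdc /dev/sdc1    # confirm the two records really differ
mdadm --zero-superblock /dev/sdc      # erase only the stale whole-disk copy
mdadm --examine /dev/sdc1             # verify the partition superblock survived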

Stopping and force-assembling:

Code:
root@rommie:~# mdadm -S /dev/md5
mdadm: stopped /dev/md5
root@rommie:~# mdadm --verbose --assemble --force /dev/md5 /dev/sd[abcdefgh]1
mdadm: looking for devices for /dev/md5
mdadm: /dev/sda1 is identified as a member of /dev/md5, slot 1.
mdadm: /dev/sdb1 is identified as a member of /dev/md5, slot 2.
mdadm: /dev/sdc1 is identified as a member of /dev/md5, slot 0.
mdadm: /dev/sdd1 is identified as a member of /dev/md5, slot 3.
mdadm: /dev/sde1 is identified as a member of /dev/md5, slot 4.
mdadm: /dev/sdf1 is identified as a member of /dev/md5, slot 5.
mdadm: /dev/sdg1 is identified as a member of /dev/md5, slot 6.
mdadm: /dev/sdh1 is identified as a member of /dev/md5, slot 7.
mdadm: added /dev/sda1 to /dev/md5 as 1
mdadm: added /dev/sdb1 to /dev/md5 as 2
mdadm: added /dev/sdd1 to /dev/md5 as 3
mdadm: added /dev/sde1 to /dev/md5 as 4
mdadm: added /dev/sdf1 to /dev/md5 as 5
mdadm: added /dev/sdg1 to /dev/md5 as 6
mdadm: added /dev/sdh1 to /dev/md5 as 7
mdadm: added /dev/sdc1 to /dev/md5 as 0
mdadm: /dev/md5 has been started with 8 drives.

mdadm -D /dev/md5:

Code:
root@rommie:~# mdadm -D /dev/md5
/dev/md5:
        Version : 0.90
  Creation Time : Sat Mar 12 16:43:53 2011
     Raid Level : raid6
     Array Size : 11721071616 (11178.09 GiB 12002.38 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 5
    Persistence : Superblock is persistent

    Update Time : Tue Jun  5 05:33:22 2012
          State : clean
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           UUID : c0370395:f845630f:26c640ad:ebf0ffa6
         Events : 0.3867051

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8        1        1      active sync   /dev/sda1
       2       8       17        2      active sync   /dev/sdb1
       3       8       49        3      active sync   /dev/sdd1
       4       8       65        4      active sync   /dev/sde1
       5       8       81        5      active sync   /dev/sdf1
       6       8       97        6      active sync   /dev/sdg1
       7       8      113        7      active sync   /dev/sdh1

cat /proc/mdstat:

Code:
root@rommie:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md5 : active raid6 sdc1[0] sdh1[7] sdg1[6] sdf1[5] sde1[4] sdd1[3] sdb1[2] sda1[1]
      11721071616 blocks level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]

unused devices: <none>


Any hints, tips, or suggestions on how to correct this? Once force-assembled, everything is 110% on track and a-ok.
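
If it helps anyone searching later: boot-time assembly goes by what the initramfs copy of mdadm.conf says, so one commonly suggested way to make the forced assembly stick across reboots (a sketch, assuming a Debian/Ubuntu-style layout; paths differ on other distros) is to pin the array by UUID and rebuild the initramfs:

Code:
mdadm --detail --scan                            # prints an ARRAY line with the UUID
mdadm --detail --scan >> /etc/mdadm/mdadm.conf   # record it in the config
update-initramfs -u                              # rebuild so boot uses the new config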
 
I've never had 2 drives fail at the same time in my RAID6 array. I usually have a hot spare ready to go when a drive lets loose. I've only had to do a force assemble one time, and it's been ok ever since.
 
Oddly enough, I had one drive fail and added a spare that wasn't already part of the array. The spare failed during the rebuild... so I sat on it for a week before I decided to go buy replacement drives... then suddenly another drive failed. Makes no sense... the controller is fine, the cables are fine, and power delivery was fine. According to the SMART data, the two drives that were part of the array failed mechanically.

Anyway - I replaced both drives, and for some reason mdadm was stuck assembling the array with the bare disk instead of the partition... I failed/removed both replaced drives and have it rebuilding again, as sketched below... so far so good. I think I'm going to update the OS to get a newer version of mdadm...
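
For anyone following along, that fail/remove/re-add sequence would look roughly like this (a sketch only; it assumes the array is up and degraded while you do it, and that the stale whole-disk superblock on /dev/sdc was the culprit):

Code:
mdadm /dev/md5 --fail /dev/sdc1 --remove /dev/sdc1   # drop the suspect member
mdadm --zero-superblock /dev/sdc1 /dev/sdc           # wipe both old records
mdadm /dev/md5 --add /dev/sdc1                       # re-add the partition, not the disk
watch cat /proc/mdstat                               # keep an eye on the resync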
 