
MDADM Superblock Recovery


Teque5 (Member, joined Apr 5, 2006, Los Angeles)
After a power cycle I found my RAID 5 array no longer working. I have tried various methods to reassemble the array, but nothing has worked so far. I believe I need to recreate the superblocks and UUIDs somehow, but I am reluctant to barrel into anything so as not to lose a bunch of data. Thanks for reading.

Code:
cat /etc/mdadm/mdadm.conf

    DEVICE partitions
    ARRAY /dev/md0 level=raid5 num-devices=4 metadata=0.90 UUID=fd522a0f:2de72d76:f2afdfe9:5e3c9df1
    MAILADDR root

That looks normal. The array should have 4 x 2000 GB drives (sda, sdc, sde, sdd).

Code:
cat /proc/mdstat

    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md0 : inactive sdd[1](S)
      1953514496 blocks
       
    unused devices: <none>

This is a problem. It shows only one drive in the array, and that drive is marked inactive. The array should also include sda, sdc, and sde. When I run `mdadm --examine /dev/sdd`, everything looks fine, but on the other drives examine reports **no RAID superblock on /dev/sdX**.

Code:
mdadm --examine --scan

    ARRAY /dev/md0 level=raid5 num-devices=4 metadata=0.90 UUID=fd522a0f:2de72d76:f2afdfe9:5e3c9df1

No help there.
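For reference, a minimal read-only sketch for confirming which of the supposed members still carries an md superblock (the device list assumes the four drives named above; `mdadm --examine` does not write anything):

Code:
for d in /dev/sda /dev/sdc /dev/sdd /dev/sde; do
    echo "== $d =="
    sudo mdadm --examine "$d"
done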

Code:
mdadm --assemble --scan -v

    mdadm: looking for devices for /dev/md0
    mdadm: no RAID superblock on /dev/sde
    mdadm: /dev/sde has wrong uuid.
    mdadm: cannot open device /dev/sdd: Device or resource busy
    mdadm: /dev/sdd has wrong uuid.
    mdadm: no RAID superblock on /dev/sdc
    mdadm: /dev/sdc has wrong uuid.
    mdadm: cannot open device /dev/sdb5: Device or resource busy
    mdadm: /dev/sdb5 has wrong uuid.
    mdadm: no RAID superblock on /dev/sdb2
    mdadm: /dev/sdb2 has wrong uuid.
    mdadm: cannot open device /dev/sdb1: Device or resource busy
    mdadm: /dev/sdb1 has wrong uuid.
    mdadm: cannot open device /dev/sdb: Device or resource busy
    mdadm: /dev/sdb has wrong uuid.
    mdadm: no RAID superblock on /dev/sda
    mdadm: /dev/sda has wrong uuid.

From this it looks like sda, sdc, and sde have no superblocks and therefore no UUIDs.
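As another read-only cross-check (not something tried in the thread), blkid reports whatever filesystem or RAID signatures it can still detect on the raw devices:

Code:
# prints TYPE/UUID for any signature blkid recognizes; writes nothing
sudo blkid /dev/sda /dev/sdc /dev/sdd /dev/sde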

Code:
sudo fdisk -l

    Disk /dev/sda: 2000.4 GB, 2000397852160 bytes
    255 heads, 63 sectors/track, 243201 cylinders, total 3907027055 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
    
    Disk /dev/sda doesn't contain a valid partition table
                
    Disk /dev/sdb: 250.1 GB, 250058268160 bytes
    255 heads, 63 sectors/track, 30401 cylinders, total 488395055 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x353cf669
    
    Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1              63   476327249   238163593+  83  Linux
    /dev/sdb2       476327250   488392064     6032407+   5  Extended
    /dev/sdb5       476327313   488392064     6032376   82  Linux swap / Solaris
    
    Disk /dev/sdc: 2000.4 GB, 2000397852160 bytes
    255 heads, 63 sectors/track, 243201 cylinders, total 3907027055 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
    
    Disk /dev/sdc doesn't contain a valid partition table
    
    Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
    255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
    
    Disk /dev/sdd doesn't contain a valid partition table
    
    Disk /dev/sde: 2000.4 GB, 2000397852160 bytes
    255 heads, 63 sectors/track, 243201 cylinders, total 3907027055 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
    
    Disk /dev/sde doesn't contain a valid partition table

So from this it looks like none of my RAID disks have a partition table or UUID. The closest thing I found to my problem was this thread, which suggested running `mdadm --create /dev/md0 -v -l 5 -n 4 /dev/sda /dev/sdc /dev/sde /dev/sdd` and then checking for a valid filesystem with `fsck -fn /dev/md0`. However, the first command spat out `mdadm: no raid-devices specified.` I retried the command using sda1, sdc1, etc., but then I got this:

Code:
mdadm: layout defaults to left-symmetric
    mdadm: chunk size defaults to 512K
    mdadm: layout defaults to left-symmetric
    mdadm: layout defaults to left-symmetric
    mdadm: super1.x cannot open /dev/sda1: No such file or directory
    mdadm: ddf: Cannot open /dev/sda1: No such file or directory
    mdadm: Cannot open /dev/sda1: No such file or directory
    mdadm: device /dev/sda1 not suitable for any style of array

If I run the create with "missing" in place of sda1, it just says the same thing for sdc1.
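For reference, here is the same create spelled out with long options; this is only the command the other thread suggested, rewritten as a sketch. It rewrites the md superblocks, and with --create the device order matters, so it should not be run until the order is known to match the original array:

Code:
sudo mdadm --create /dev/md0 --verbose --level=5 --raid-devices=4 \
    /dev/sda /dev/sdc /dev/sde /dev/sdd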

I am sure that I am making this more complicated than it needs to be. Can someone with experience please help me? Thanks in advance for your time. This is on Ubuntu Desktop 11.10.

**edit**

In this thread someone suggested using the dumpe2fs tool to recover the superblock. You can still see a filesystem on the drives:

Code:
dumpe2fs /dev/sda
dumpe2fs 1.41.14 (22-Dec-2010)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          bbe6fb91-d37c-414a-8c2b-c76a30b9b5c5
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags:         signed_directory_hash 
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              366288896
Block count:              1465135872
Reserved block count:     73256793
Free blocks:              568552005
Free inodes:              366066972
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      674
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Filesystem created:       Wed Oct 28 12:23:09 2009
Last mount time:          Tue Oct 18 13:59:36 2011
Last write time:          Tue Oct 18 13:59:36 2011
Mount count:              17
Maximum mount count:      26
Last checked:             Fri Oct 14 17:04:16 2011
Check interval:           15552000 (6 months)
Next check after:         Wed Apr 11 17:04:16 2012
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:	          256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      17e784d8-012e-4a29-9bbd-c312de282588
Journal backup:           inode blocks
Journal superblock magic number invalid!

But I don't know how to restore the superblocks and UUIDs. One place described a method using fsck, but I don't want to corrupt the RAID filesystem.
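A read-only way to see which of the raw disks still show an ext superblock header is a quick dumpe2fs pass (a sketch; dumpe2fs only reads, and -h prints just the superblock summary):

Code:
for d in /dev/sda /dev/sdc /dev/sdd /dev/sde; do
    echo "== $d =="
    sudo dumpe2fs -h "$d" 2>&1 | head -n 5
done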
 
Sorry man, I'm not sure about RAID stuff on Linux - someone else around here may be, though.

I take it you don't have a backup of the data? I'd write it off, rebuild the array from scratch, and do a restore - that means having a real backup though.
 
I take it you don't have a backup of the data? I'd write it off, rebuild the array from scratch, and do a restore - that means having a real backup though.

It is a 6 TB RAID full of data. I can't write it off every time a power cycle knocks it out of order. I don't think this is insurmountable.
 
What is the output of "mdadm --detail --scan"? I see messages stating that the drive is busy, which leads me to believe that mdadm has incorrectly identified it as part of another RAID array. If this outputs anything but a blank line, do NOT do the next portion.

What happens if you add the "-f" flag to the command you tried?
mdadm --create /dev/md0 -v -f -l 5 -n 4 /dev/sda /dev/sdc /dev/sde /dev/sdd

Also, we don't accept "bounties" here. We are community driven, not money driven. I kindly ask you to remove it from this thread.
 
What is the output of "mdadm --detail --scan"?
What happens if you add the "-f" flag to the command you tried?
mdadm --create /dev/md0 -v -f -l 5 -n 4 /dev/sda /dev/sdc /dev/sde /dev/sdd

Code:
$ mdadm --detail --scan
mdadm: md device /dev/md0 does not appear to be active.

I didn't know if using the -f flag would damage the array. Should I still do this?

note: I can't remove the bounty from the thread title.
 
If you use the force flag, then as long as you don't write to the array (make sure your disk check is set NOT to write changes to the disk), nothing is modified on the drives and it should not damage them.
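Concretely, the no-write check being referred to is the -n flag, which opens the filesystem read-only and answers "no" to every repair prompt (a sketch, assuming the array comes up as /dev/md0):

Code:
sudo fsck -n /dev/md0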
 
If you use the force flag, then as long as you don't write to the array (make sure your disk check is set NOT to write changes to the disk), nothing is modified on the drives and it should not damage them.

OK, let's try this.

Code:
$ sudo mdadm --create /dev/md0 -v -f -l 5 -n 4 /dev/sda /dev/sdc /dev/sde /dev/sdd
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: /dev/sda appears to contain an ext2fs file system
    size=1565576192K  mtime=Tue Oct 18 13:59:36 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdc appears to contain an ext2fs file system
    size=1297140736K  mtime=Tue Oct 18 13:59:36 2011
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: super1.x cannot open /dev/sdd: Device or resource busy
mdadm: /dev/sdd is not suitable for this array.
mdadm: create aborted

Well, that is getting close. I think sdd is still attached to md0, though. Should I run `mdadm --stop /dev/md0`?

PS: Thanks a lot for your help so far.
 
If any array is running, stop it first, then run the create with "-f". Otherwise, use "mount" to see if it somehow got mounted. Most likely, though, it got put into its own RAID array. It may not even be md0.
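A sketch of that sequence, using the device names from earlier in the thread (stop whatever md device is holding the disks, then retry the forced create):

Code:
cat /proc/mdstat                       # see which md device, if any, holds the disks
sudo mdadm --stop /dev/md0             # stop it, using the name mdstat actually shows
sudo mdadm --create /dev/md0 -v -f -l 5 -n 4 /dev/sda /dev/sdc /dev/sde /dev/sdd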
 
Um. I think it worked.

Code:
$ sudo mdadm --create /dev/md0 -v -f -l 5 -n 4 /dev/sda /dev/sdc /dev/sde /dev/sdd
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: /dev/sda appears to contain an ext2fs file system
    size=1565576192K  mtime=Tue Oct 18 13:59:36 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdc appears to contain an ext2fs file system
    size=1297140736K  mtime=Tue Oct 18 13:59:36 2011
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdd appears to be part of a raid array:
    level=raid5 devices=4 ctime=Wed Oct 28 02:11:27 2009
mdadm: size set to 1953511936K
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: ADD_NEW_DISK for /dev/sda failed: Device or resource busy

I thought that was bad, but then I ran:

Code:
$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid5 sdd[3] sde[2] sda[0] sdc[1]
      5860535808 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  resync =  0.5% (10025524/1953511936) finish=367.0min speed=88251K/sec
      
unused devices: <none>

And it is resyncing!!! This is gonna take a while, but I will report back when it is done.

*edit* Actually I am a little worried that it says a size of 1953511936 when it should be 4x that.

*edit2* A little less worried now, since the man page says 'size' is per individual disk.
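The arithmetic backs that up: RAID 5 gives the capacity of n-1 disks, so four members of 1953511936K each yield three times the per-disk size, which matches the 5860535808 blocks shown in /proc/mdstat above.

Code:
$ echo $((1953511936 * 3))    # RAID 5 usable size = (n-1) * per-disk size
5860535808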
 
Since it is rebuilding, you should be able to mount the array and see if your information is valid. I wouldn't do a disk check or write files to it just yet (it won't really hurt anything), though. You should verify that your mdadm.conf file is set up correctly. It should include the array and the drives.
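One thing worth noting: because --create wrote fresh version 1.2 superblocks, the array now has a new UUID, so the old ARRAY line in mdadm.conf (metadata 0.90, old UUID) is stale. A sketch of a common way to refresh it on Ubuntu:

Code:
sudo mdadm --detail --scan             # prints the new ARRAY line with the new UUID
# replace the old ARRAY line in /etc/mdadm/mdadm.conf with that output, then
sudo update-initramfs -u               # so the initramfs copy of the config matches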
 
Since it is rebuilding, you should be able to mount the array and see if your information is valid.

I might just be spazzing out here, but I hit a wall. I am 90% confident this is an ext3 filesystem:

Code:
sudo mount -r -t ext3 /dev/md0 /var/media
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

I am inclined to let the rebuild finish, then restart and let fstab mount it as usual. I am at 31% with 4 hours remaining.

Perhaps I should run `fsck -n /dev/md0`?
 
Code:
$ sudo fsck -n /dev/md0
fsck from util-linux 2.19.1
e2fsck 1.41.14 (22-Dec-2010)
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/md0

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

Well, now my plan is to wait for the resync and then restore the superblock from a backup using `e2fsck -b 8193 /dev/md0`.
 
Yeah, it finished, but I am still fussing with it. Trying `e2fsck -b 32768 /dev/md0` (8193 is the backup location for 1 KiB-block filesystems; with the 4 KiB block size dumpe2fs reported, the first backup lives at 32768) yielded the same message:
Code:
e2fsck 1.41.14 (22-Dec-2010)
e2fsck: Bad magic number in super-block while trying to open /dev/md0

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

So I restarted, and now it says the array is active but read-only and is called md127. I read somewhere that this is fixed by stopping and then reassembling the array. I just did that and have /dev/md0 back like it was before.
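For reference, the md127 name usually appears when the array's UUID is not in the mdadm.conf that the initramfs knows about, so it gets auto-assembled under a fallback name. A sketch of the stop-and-reassemble described above, spelled out with the device names used earlier:

Code:
sudo mdadm --stop /dev/md127
sudo mdadm --assemble /dev/md0 /dev/sda /dev/sdc /dev/sdd /dev/sde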

Hmm... What to do now...
 
I might try fsck.ext3 -b 32768 /dev/sdb based on this forum post.

**edit** Dang, still the same output. Looks like fsck.ext3 is just a link to e2fsck:
Code:
e2fsck 1.41.14 (22-Dec-2010)
fsck.ext3: Bad magic number in super-block while trying to open /dev/md0

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

I am looking into your testdisk option right now.
 
Well, I found where the superblocks should be, but it still throws the same error:
Code:
$ sudo mke2fs -n /dev/md0
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=128 blocks, Stripe width=384 blocks
366288896 inodes, 1465133952 blocks
73256697 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
44713 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 
	102400000, 214990848, 512000000, 550731776, 644972544

tq5@Servbot:/sbin$ sudo fsck.ext3 -b 819200 /dev/md0
e2fsck 1.41.14 (22-Dec-2010)
fsck.ext3: Bad magic number in super-block while trying to open /dev/md0

I am currently using testdisk to search for superblocks...

Code:
TestDisk 6.11, Data Recovery Utility, April 2009
Christophe GRENIER <[email protected]>
http://www.cgsecurity.org

Disk /dev/md0 - 6001 GB / 5589 GiB - CHS 1465133952 2 4
     Partition                  Start        End    Size in sectors
   P ext3                     0   0  1 1465133951   1  4 11721071616


Search ext2/ext3/ext4 superblock  468842865/3131137024 4%
 
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848, 512000000, 550731776, 644972544

tq5@Servbot:/sbin$ sudo fsck.ext3 -b 819200 /dev/md0
Have you tried the other superblocks? I'm not sure what you can do if all of them are bad or damaged. That would indicate much more data corruption than a single disk going out. The whole point of having multiple superblocks is in case one gets damaged; the chances of all of them getting damaged, especially in a RAID array, are pretty slim.
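A sketch of trying each listed backup in turn, read-only so nothing is written while probing (the block numbers are the ones mke2fs -n printed above, and -B 4096 matches the 4096-byte block size it reported):

Code:
for b in 32768 98304 163840 229376 294912 819200 884736 1605632 2654208; do
    echo "== trying backup superblock $b =="
    sudo e2fsck -n -B 4096 -b "$b" /dev/md0 && break   # stop at the first one that opens cleanly
done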
 
Sigh. The quick scan using testdisk didn't find anything, and I tried all of the other backup superblocks, also to no avail.

Any other ideas? :-/
 