{"id":603,"date":"2018-06-13T11:43:24","date_gmt":"2018-06-13T16:43:24","guid":{"rendered":"http:\/\/bitc.bme.emory.edu\/~lzhou\/blogs\/?p=603"},"modified":"2018-06-13T11:43:24","modified_gmt":"2018-06-13T16:43:24","slug":"how-to-recover-a-very-large-raid6","status":"publish","type":"post","link":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/?p=603","title":{"rendered":"How to recover a very large RAID6"},"content":{"rendered":"<p>RAID6 carries two sets of redundancy, so it can survive the simultaneous failure of two drives, and it can start rebuilding automatically if you have hot spares.<\/p>\n<p>However, the RAID controller and the backplane still have the final say over a RAID group. A malfunctioning controller, a faulty backplane, or an unstable hard drive that spits erroneous signals onto the bus can fool the controller into believing that several drives have gone bad at once.<\/p>\n<p>If this happens, most of the time you will be lucky: after you reboot the system, the RAID may show up as carrying a &#8220;foreign&#8221; configuration. In that case, try importing the &#8220;foreign&#8221; configuration first. If the import succeeds, all your data should be back.<\/p>\n<p>If it fails, the RAID controller vendor might tell you that you have to seek a professional data recovery service, or try some very high-risk tricks that may actually cause you to lose all your data.<\/p>\n<p>DO NOT LISTEN TO THEM.<\/p>\n<p>At this point, none of your data is missing.<\/p>\n<p>This situation can arise when a real disk failure occurs: the RAID card drops the first drive and starts rebuilding onto a hot spare, and then more drives fail and leave the RAID configuration inconsistent.<\/p>\n<p>In the worst case, it happens while you are expanding your RAID group, which leaves some of the drives in one configuration and the rest in another.<\/p>\n<p>This does not mean that your data are gone. They are still there. 
Even if you have no knowledge at all of the drive order, stripe size, etc. of your RAID, that is still fine.<\/p>\n<p>Here is what you should do:<\/p>\n<p>Step 1<br \/>\nWrite down the disk numbers that form your original RAID and your new RAID, in case you forget.<br \/>\nStep 2<br \/>\nMake the RAID controller clear the &#8220;foreign&#8221; configuration. Do not worry. The RAID configuration is stored only in a small reserved area at the beginning or end of each drive. Erasing it won&#8217;t alter your data.<br \/>\nStep 3<br \/>\nTurn every drive from the melted RAID into a single-drive RAID0 or a standalone virtual drive. REMEMBER to create them as READONLY, in case you accidentally write over them. Export them to your computer.<br \/>\nStep 4<br \/>\nMount the drives READONLY on your operating system, preferably Linux, keeping track of their disk numbers.<br \/>\nStep 5<br \/>\nUse a tool that can read the raw bytes directly, for example hexedit on Linux, to browse the drives and find out how far the reconstruction had progressed. New drives are usually wiped to all zeros. So if you jump to a point that your data should certainly have reached but the reconstruction has not, you will see page after page of non-zero data on the disks that form the original RAID, and many pages of zeros on the new drives. Move the viewing point up or down by half each time, as in a binary search, and you will quickly separate the original disks from the disks being expanded onto, and pin down precisely where the reconstruction stopped. Suppose you have a 7x 4TB RAID6 being expanded to a 12x 4TB RAID6, and you find that the new drives were written up to 700GB. A 12-drive RAID6 holds 10 drives&#8217; worth of data per stripe, so about 7000GB of data had been reconstructed. A 7-drive RAID6 holds only 5 drives&#8217; worth, so the same location will be at about 1400GB on the old drives. Do the math and find that location on the old drives; if you see pages with the same content, your calculation is correct. 
Take notes of the precise location and the disk numbers.<br \/>\nStep 6<br \/>\nUsing a block size of 256kB, use dd to copy a few tens of GB from each drive into image files, named in order: one set from a region that has certainly been rebuilt and one set from a region that certainly has not.<br \/>\nStep 7<br \/>\nInstall VirtualBox on your Linux system. Install Windows 7 in VirtualBox.<br \/>\nStep 8<br \/>\nDownload &#8220;ReclaiMe Free RAID recovery&#8221; and install it in your Windows 7 virtual machine.<br \/>\nStep 9<br \/>\nExport the volume that contains your image files to the virtual machine.<br \/>\nStep 10<br \/>\nRun ReclaiMe, open the network location that contains the exported volume, and load all the disk images from the pre-rebuild part. Select Other RAIDS-&gt;Start RAID6.<br \/>\nIt will run for a few minutes to a few hours and produce a map like this:<\/p>\n<p>Block size is 256.0 KB , same as 512 sectors.<br \/>\nThe data starts at sector (LBA) 0 (this is often called &#8220;offset&#8221; or &#8220;start offset&#8221;).<\/p>\n<p>Block map is as follows:<\/p>\n<p>1 2 3 4 5 6 7 8 9 10 11 12 P ?<br \/>\n14 15 16 17 18 19 20 21 22 23 24 P ? 13<br \/>\n27 28 29 30 31 32 33 34 35 36 P ? 25 26<br \/>\n40 41 42 43 44 45 46 47 48 P ? 37 38 39<br \/>\n53 54 55 56 57 58 59 60 P ? 49 50 51 52<br \/>\n66 67 68 69 70 71 72 P ? 61 62 63 64 65<br \/>\n79 80 81 82 83 84 P ? 73 74 75 76 77 78<br \/>\n92 93 94 95 96 P ? 85 86 87 88 89 90 91<br \/>\n105 106 107 108 P ? 97 98 99 100 101 102 103 104<br \/>\n118 119 120 P ? 109 110 111 112 113 114 115 116 117<br \/>\n131 132 P ? 121 122 123 124 125 126 127 128 129 130<br \/>\n144 P ? 133 134 135 136 137 138 139 140 141 142 143<br \/>\nP ? 145 146 147 148 149 150 151 152 153 154 155 156<br \/>\n? 
157 158 159 160 161 162 163 164 165 166 167 168 P<\/p>\n<p>This tells you the RAID block size, the disk order, and the parity layout.<br \/>\nSave this into a file.<br \/>\nStep 11<br \/>\nDo the same with the after-rebuild part of the disk images, and save the calculated RAID configuration to a file.<br \/>\nStep 12<br \/>\nNow prepare a hard drive or RAID volume big enough to hold TWICE the amount of data in the &#8220;after-rebuild&#8221; part. Mount it on your Linux system.<br \/>\nStep 13<br \/>\nUse dd to dump the ENTIRE after-rebuild part of each drive into image files.<br \/>\nStep 14<br \/>\nUse software RAID to assemble the disk images dumped above into a read-only RAID6, according to the after-rebuild configuration obtained in Step 11.<br \/>\nStep 15<br \/>\nThe RAID assembled in the previous step should contain the partition information, and the filesystem information for at least the first partition. Use dd to dump the partition table and save it.<br \/>\nStep 16<br \/>\nDo not try to activate the filesystem unless your calculation shows that there is at least one complete filesystem. Now dd the entire RAID into a giant dump file.<br \/>\nStep 17<br \/>\nUse hexedit to browse to the very end of the giant dump file, and trim the end to remove any part known not to have been rebuilt.<br \/>\nStep 18<br \/>\nSelect the drives that form the pre-rebuild RAID, umount them, delete the RAID0 volumes, and construct a RAID6 from them in READONLY mode. 
Be sure to use the drive order obtained in Step 10.\u00a0 Do not attempt to mount the filesystem, since the large leading chunk of data has already been overwritten by the rebuilding process.<br \/>\nStep 19<br \/>\nUse hexedit to locate, at the calculated position on the READONLY RAID6, the last few sectors of the big dump file obtained in Step 17.\u00a0 Verify the match by eye.<br \/>\nStep 20<br \/>\nUse dd to read the data on the READONLY RAID6 from the overlapping spot onward into another giant dump file.<br \/>\nStep 21<br \/>\nUse hexedit to check that the two dump files are contiguous and that their total size matches the size of the original RAID6.\u00a0 Then join the two giant dump files into one.<br \/>\nStep 22<br \/>\nNow you need a third set of hard drives, of the same number and specs, to construct a new RAID6 in RW mode.<br \/>\nStep 23<br \/>\ndd the giant joined dump file obtained in Step 21 onto the new RAID6.\u00a0 It is OK if the RAID is attempting automatic initialization.<br \/>\nStep 24<br \/>\nNow you can use gparted to check whether you have got the proper partition table back. 
You should.\u00a0 If not, you must have done something wrong in the previous steps.<br \/>\nStep 25<br \/>\nOnce you are sure that the partition information is correct, you can test-mount the partitions.\u00a0 The mount will attempt to fix\/flush the metadata.\u00a0 If you are lucky, the filesystem will mount and you will be able to see your data.\u00a0 Then umount it.\u00a0 If it does not mount, your filesystem must have been corrupted.<br \/>\nStep 26<br \/>\nWhether or not you were able to mount the volume in Step 25, you should now umount it and attempt a repair of the filesystem.\u00a0 If you are fortunate, you will get back most of your data.\u00a0 If not, you may end up with nothing, or with much of your data in \/lost+found.<br \/>\nStep 27<br \/>\nOnce you are done and satisfied with what you have got back, you can remove the drive arrays you have been using to hold the giant dump files.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>RAID6 is supposed to have two sets of redundancy that could survive a simultaneous failure of two drives, and it can go directly on automatic rebuilding if you have hot swaps. However, you must have known that the RAID controller and the back plane are still playing dictatorship for a RAID group. 
It could happen [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21,3],"tags":[],"class_list":["post-603","post","type-post","status-publish","format-standard","hentry","category-computer-tips","category-mri-technical-support","post-blog"],"_links":{"self":[{"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/603","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=603"}],"version-history":[{"count":2,"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/603\/revisions"}],"predecessor-version":[{"id":625,"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=\/wp\/v2\/posts\/603\/revisions\/625"}],"wp:attachment":[{"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=603"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=603"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/csic.som.emory.edu\/~lzhou\/blogs\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=603"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}