Unfortunately no one can be told what fun_plug is - you have to see it for yourself.
Hi everyone,
I dropped one of my HDDs 3 metres onto a concrete floor last week (what a noise it made). I'd also had a touch too much to drink before fooling around with the 323, and it was late at night. You know ... "it obviously wasn't my fault".
When the replacement came, I decided that instead of just inserting it and letting the firmware take over (running 1.08), I would do the whole thing myself via the command line.
I did this for a few reasons. The first was that I wanted a bit more control over the situation. I've used Linux since 1995 but had no experience with RAID, so I wanted to learn about it.
The other reason was that I'd heard some horror stories about the firmware making things far, far worse during the reformat process - the wrong drive getting formatted, and so on.
So anyway, once the new drive was in, I partitioned it exactly the same as the other and used mke2fs -j to create an ext3 filesystem (if that's the right way of doing it; it seemed to work).
/mnt/HD_b4/.systemfile # fdisk -l
Disk /dev/sda: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 66 530113+ 82 Linux swap
/dev/sda2 131 182236 1462766445 83 Linux
/dev/sda4 67 130 514080 83 Linux
Partition table entries are not in disk order
Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 66 530113+ 82 Linux swap
/dev/sdb2 131 182236 1462766445 83 Linux
/dev/sdb4 67 130 514080 83 Linux
Partition table entries are not in disk order
So I did all that: creating the swap partition and the funny little 500 MB partition sdb4.
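For anyone wanting to do the same, the partition-and-format steps can be sketched roughly like this. This is a sketch only - the device names are assumptions, so double-check with fdisk -l that sda is the surviving disk and sdb is the new one before running anything (and if sfdisk isn't on your box's busybox, mirror the layout by hand in fdisk instead):

```shell
# Copy the partition layout from the good disk (sda) to the new one (sdb).
# DANGER: verify the device names first; this overwrites sdb's partition table.
sfdisk -d /dev/sda | sfdisk /dev/sdb

# Initialise the swap partition and the small system partition on the new disk.
mkswap /dev/sdb1
mke2fs -j /dev/sdb4        # the funny little ~500 MB partition

# sdb2 (the big data partition) gets overwritten by the RAID resync anyway,
# so formatting it is not strictly necessary.
```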
I also did mdadm /dev/md0 -a /dev/sdb2 (adding the data partition rather than the whole disk).
(I found that http://nst.sourceforge.net/nst/docs/user/ch14.html had lots of good information on mdadm)
The raid1 mirror rebuilt in about 7 or 8 hours, and it worked as far as mdadm was concerned.
cat /proc/mdstat and mdadm --detail /dev/md0 showed that everything was healthy as far as the raid1 mirror was concerned.
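For those following along, here's a quick sketch of pulling the resync progress out of /proc/mdstat with awk. The sample text below is made up for illustration; on the box you'd point awk at the real /proc/mdstat instead:

```shell
# Illustrative /proc/mdstat snapshot during a raid1 resync (not real output):
sample='md0 : active raid1 sdb2[1] sda2[0]
      1462766272 blocks [2/1] [_U]
      [=>...................]  recovery =  7.5% (109850560/1462766272) finish=412.3min speed=54688K/sec'

# Find the "recovery" line and print the percentage field:
echo "$sample" | awk '/recovery/ { for (i=1;i<=NF;i++) if ($i ~ /%/) print "rebuilt:", $i }'
# → rebuilt: 7.5%

# On the NAS itself you would run:
#   awk '/recovery/ { for (i=1;i<=NF;i++) if ($i ~ /%/) print "rebuilt:", $i }' /proc/mdstat
```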
The only problem was that every time I went into the web interface I would be bugged to "Reformat the newly added disk". Which I always skipped, but I did want to remove that message.
Anyway, from searching around on the internet I found that the command hd_verify was needed to re-create the hd_magic_num file in the /mnt/HD_b4/.systemfile directory.
mount command shows
/dev/sda4 on /mnt/HD_a4 type ext3 (rw)
/dev/sdb4 on /mnt/HD_b4 type ext3 (rw)
It also seemed that raidtab and raidtab2web were re-created after I rebooted the box. Possibly restarting webs (the web server) would have done the same thing.
At this stage, I don't know exactly what went on, but that removed the message.
Maybe someone else could add to this.
Offline
Just wanted to state that I followed this procedure today and all is running great! I'm still conducting my post-mortem to discover exactly what went wrong the other week and sort out other details, but here's what I have so far:
+ spotted a line in my logs showing that the NAS had lost the bid for Master Browser to one of my laptops, which puzzled me because I thought I had set up the Samba config file so that the NAS would always "win."
+ updated the Samba config file line such that "os level = 255", copied to correct locations and restarted NAS.
( Yes, I know I did not run the quick verification on the config file first. Perhaps that was my first mistake? )
+ NAS rebooted, but was no longer on the net - LED light was off. Powered off by using the front-panel button. Then restarted again. No luck. Still dead to the world as far as access - the other LED's were on and I could hear HDD active, but could not access the NAS at all. Checked my LAN router and confirmed no connection present for the port where the NAS resides.
+ Exclaimed many, many, rude words - followed by pressing the reset button on the back of the unit.
+ Somewhere along the way I ended up pulling both HDDs out of the NAS and bringing them over to my other Linux box to check them out (and run fsck.) Both had errors (logical, since they were in a RAID1 setup in the NAS unit.) Corrected them and installed only one back into the NAS unit after altering the Samba config file back to "os level = 99".
+ Needed to re-install ffp and re-enter my users/groups.
+ Limped along on one drive for a few days to make sure all was well. Performed a rsync to another machine to ensure I had a good copy of all my important files (didn't want to lose those!) Took the other HDD that used to be in my NAS, and connected it via the USB port so I could wipe it clean (via fdisk.) I thought that might be safer than installing back into the unit in whatever state it was in and hoping the NAS would correctly re-sync the RAID1 volume.
+ Followed the excellent information provided here by grazoulious to manually ensure all was synced.
** Edit **
+ The automatic rebuild of the RAID did not start - I kept receiving an error when it went to 'format' the new drive - and thus I was glad I was able to follow the steps outlined here.
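Regarding the step above of wiping the old drive over USB before trusting it again: a sketch of what that can look like. The device name is an assumption (a USB-attached drive often appears as the next free letter, e.g. /dev/sdc) - triple-check it with fdisk -l first, because these commands are destructive:

```shell
# ASSUMPTION: the old drive shows up as /dev/sdc when attached over USB.
# Remove the stale RAID superblock so nothing auto-assembles it later:
mdadm --zero-superblock /dev/sdc2

# Then zero the MBR/partition table (first sector) so the disk reads as blank:
dd if=/dev/zero of=/dev/sdc bs=512 count=1
```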
Last edited by BobE (2011-02-09 00:52:16)
Offline
I'll add some useful info to this thread from my own experience of manually syncing my raid array. In my case, I had played around with the drives; the D-Link GUI wanted to format both drives but reported that the left-hand drive was still good and the raid array was degraded. I was also able to access my data as normal.
Listing the partitions with "fdisk -l" returned exactly what's in the first post here. Both drives were already partitioned correctly.
Then to get the current state of the raid array, you do:
#mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.01
Creation Time : Mon May 2 20:35:37 2005
Raid Level : raid1
Array Size : 1003904 (980.38 MiB 1027.100 MB)
Device Size : 1003904 (980.38 MiB 1027.100 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Mon May 2 20:35:39 2005
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : f06414c0:39e569bb:a4e94613:1aa6b923
Events : 0.1
Number Major Minor RaidDevice State
0 22 6 0 active sync /dev/sdb2
1 0 0 - removed
This tells me /dev/sdb is the "good" drive that's left in the raid array. So I need to add /dev/sda.
I did a "mount" command, which told me something was using /dev/sda4, so I unmounted it with "umount".
Then I added the /dev/sda2 to the raid array with:
#mdadm /dev/md0 -a /dev/sda2
mdadm: hot added /dev/sda2
confirm it's rebuilding with:
#mdadm --detail /dev/md0
find out how long it's gonna take with:
#cat /proc/mdstat
I noticed that exiting the shell seemed to make it go faster. You can always log back in and do a quick time check. It's still syncing, so we'll see if I did it right.
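On the rebuild-speed observation above: a more predictable lever is the kernel's resync throttle. The md driver limits resync bandwidth to the values under /proc/sys/dev/raid/, and raising the minimum can speed up a rebuild even while the box is busy. A sketch (the 50000 figure is just an illustration; pick something your drives can sustain):

```shell
# Current resync limits, in KB/s per device:
cat /proc/sys/dev/raid/speed_limit_min   # typically 1000
cat /proc/sys/dev/raid/speed_limit_max   # typically 200000

# Allow the resync more bandwidth (must be root):
echo 50000 > /proc/sys/dev/raid/speed_limit_min
```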
Offline
Successfully synced in 2.5 hrs. I even got an email from the unit when it rebuilt, like it does when I sync using the GUI. However, the GUI still prompts to re-format even though it reports the raid array as "complete" with both drives active.
sh-3.2# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Fri Apr 30 20:53:31 2010
Raid Level : raid1
Array Size : 974133312 (929.01 GiB 997.51 GB)
Used Dev Size : 974133312 (929.01 GiB 997.51 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sat Aug 18 16:12:34 2012
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : 371392a1:987933a7:1bb5af35:af6f4afc
Events : 0.2706561
Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
sh-3.2# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1]
md0 : active raid1 sda2[0] sdb2[1]
974133312 blocks [2/2] [UU]
unused devices: <none>
Last edited by kevincw01 (2012-08-19 00:16:31)
Offline
hd_verify doesn't look so good. After running it I also got an email that the right-hand drive had failed. The drive is fine per the SMART utility. Gonna reboot.
sh-3.2# hd_verify
hd verify v1.23.10072009
******* hd_verify start *********
Mount Hidden Partition
grep: /etc/codepage: No such file or directory
Find raid table from hard disk
Mount normal
grep: /etc/codepage: No such file or directory
grep: /tmp/onedisk: No such file or directory
grep: /tmp/onedisk: No such file or directory
grep: /tmp/onedisk: No such file or directory
RAID1 mode
RAID mount normal
mount device /dev/sda2 fail
hd_mgaic_num1 = 0
mount device /dev/sdb2 fail
hd_mgaic_num2 = 0
error disk = 0 0
Refresh Shared Name Table version v1.04
mdadm: fail to stop array /dev/md0: Device or resource busy
mdadm: stopped /dev/md1
umount: /mnt/HD_a*: not found
Refresh Shared Name Table version v1.04
[previous line repeated another 33 times]
*************** hd_verify end ******************
Last edited by kevincw01 (2012-08-19 00:22:06)
Offline
Came back up and the raid is healthy from the shell side. Same problem with the GUI. Contents of the raid config files, in case anyone wants to help. The one that looks weird is raidtab: it has two raiddev entries, and the second one looks suspect and has an extra line. Should I delete it?
sh-3.2# more /mnt/HD_b4/.systemfile/hd_magic_num
106667
sh-3.2# more /mnt/HD_b4/.systemfile/dsk_mapping
md0
<bunch of newlines follow>
sh-3.2# more /mnt/HD_b4/.systemfile/mdadm.conf
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=371392a1:987933a7:1bb5af35:af6f4afc
sh-3.2# more /mnt/HD_b4/.systemfile/raidtab
raiddev /dev/md0
raid-level raid1
nr-raid-disks 2
chunk-size 64
persistent-superblock 1
device /dev/sda2
raid-disk 0
device /dev/sdb2
raid-disk 1
raiddev null
raid-level null
nr-raid-disks 0
chunk-size 64
persistent-superblock 1
device null
raid-disk null
device null
raid-disk null
Version 1.3
sh-3.2# more /mnt/HD_b4/.systemfile/raidtab2web
raiddev /dev/md0
raid-level raid1
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/sda2
raid-disk 0
device /dev/sdb2
raid-disk 1
Status Start
raid-masterdisk /dev/sda2
FirstStart 0
FormatwebFlag 1
SatamountFlag 1
SatamountFlag1 0
raidsize 929
filesystem ext3
parti3 1
extra_status 1
Version 1.5
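In case it helps with debugging the config files above: the ARRAY line in mdadm.conf can be regenerated from the live array rather than edited by hand. A sketch (the exact output format varies by mdadm version, and whether the firmware tolerates a rewritten file is an open question - back it up first):

```shell
# Print an ARRAY line describing each running array:
mdadm --detail --scan
# e.g. ARRAY /dev/md0 level=raid1 num-devices=2 UUID=371392a1:...

# To rewrite the file the firmware reads (keep a backup):
cp /mnt/HD_b4/.systemfile/mdadm.conf /mnt/HD_b4/.systemfile/mdadm.conf.bak
mdadm --detail --scan > /mnt/HD_b4/.systemfile/mdadm.conf
```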
Offline