Hi all,
I have a RAID 1 configuration.
I had previously posted that I encountered a problem with one of the hard disks not synchronizing with the other after I upgraded my firmware from 1.03 to 1.04.
The original failing hard disk was the left one.
I decided to reformat the RAID and things improved. However, now the right hard disk is getting degraded and has the amber light.
Am I doing anything wrong?
I've posted a screenshot attachment below.
I am experiencing this same problem: one drive showing as "degraded" after upgrading to the 1.04 firmware. Can anyone shed some light on this? These drives are brand new (the unit has been on less than 6 hours total).
Thanks.
Update. I have now gone back to the 1.03 firmware and the drives are acting normally (no degradation). Both drive lights blue. Seems to be an issue with the firmware.
GhostRiderGrey wrote:
Update. I have now gone back to the 1.03 firmware and the drives are acting normally (no degradation). Both drive lights blue. Seems to be an issue with the firmware.
I'd like to believe that if it were the firmware, we'd be seeing a lot more reports; if nothing else, I'm running 1.04 and have had no problems.
Could it be a combination of firmware and HDDs? Note that Ardjan has the white LED and a degraded disk after upgrading to 1.04. Also, I recall he had 400 GB WD disks.
I wouldn't rule that out - but - let me run one by you, and please note, this is purely hypothetical.
Let's assume, just for the moment, that 1.04 supports S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology), which 1.03 didn't, and let's also assume that one of those drives is reporting a S.M.A.R.T. failure. With 1.03 that failure would have gone undetected, but with 1.04 it would be reported - if S.M.A.R.T. was supported.
Would/should that be considered an "issue" with 1.04? In my opinion, no - it would be a failing drive, plain and simple. What S.M.A.R.T. does is send an alert any time one or more monitored statistics crosses a preset threshold.
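If anyone wants to test that hypothesis directly, smartmontools can ask the drive for exactly that verdict; a minimal sketch, assuming the disk can be attached to an ordinary Linux PC (smartctl is not, as far as I know, part of the stock DNS-323 firmware, and the smart_verdict helper name is mine):

```shell
# Sketch: query a drive's overall SMART verdict on a Linux PC.
# 'smartctl -H' asks the drive firmware whether any monitored attribute
# has crossed its preset failure threshold -- the same check the
# hypothesis assumes 1.04 might be performing.  Run as root, e.g.:
#   smartctl -H /dev/sda
#   smartctl -A /dev/sda   # per-attribute values and thresholds

# Tiny helper to pull the verdict out of 'smartctl -H' output on stdin:
smart_verdict() {
  grep -o 'PASSED\|FAILED'
}
```

Usage would then be along the lines of `smartctl -H /dev/sda | smart_verdict`.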
One thing this hypothesis cannot explain is why the apparently failing drive shifted from one bay to the other in your case.
Hello,
I support a few DNS-323s with different makes of drives, all of which operate in RAID 1. I believe there are issues with the firmware (including 1.04) that need attention in the upper layers, meaning the web pages and compiled-in scripts.
I have observed that sometimes they do not report the correct status. I suspect the amber light is one such event, created by the upper-layer web pages and scripts establishing the wrong status. I would check it out using a stable mdadm version compatible with the kernel. The one shipped in 1.04 is mdadm 2.5.6, which has issues to be resolved in relation to the kernel (like using deprecated ioctls).
The other thing to watch out for is that 1.03 formats disks into two partitions, while 1.04 creates an extra partition. So what you observe after the upgrade will depend on whether or not you reformatted the disks. At all the sites I support, I have chosen not to reformat after upgrading to the 1.04 firmware.
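That partition difference is easy to check from the box itself once telnet access is available; a sketch, assuming /proc/partitions has the usual Linux layout (the partcount helper name is mine):

```shell
# Count how many partitions a given disk shows in a /proc/partitions
# table fed on stdin.  A 1.03-formatted disk should report two
# (swap + data); a 1.04 format adds more.  $1 is the disk name.
partcount() {
  grep -c "$1[0-9]"
}

# On the DNS-323 itself:
#   partcount sda < /proc/partitions
#   partcount sdb < /proc/partitions
```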
I don't have time to look into the details, but the web pages and (compiled) scripts need quite a bit of attention to address these issues, and I am hoping there will be updates soon from D-Link based on what people report about problems like this.
Jaya
Edit: Earlier version said RAID 0 which is incorrect.
Last edited by jayas (2008-02-23 05:27:34)
Like you say, the hypothesis does not explain why my hard disks shifted from one bay to the other. Also, GhostRiderGrey managed to get both hard disks into blue light mode after downgrading to 1.03.
Therefore, until someone says that 1.04 definitely supports S.M.A.R.T., I must reject the hypothesis.
I think you missed one point - the fact that GhostRiderGrey managed to get both disks in "blue light mode" after downgrading to 1.03 is an important aspect of the hypothesis. By downgrading he would have removed the S.M.A.R.T. support, and thereby appeared to make the error go away, when in fact he had simply removed the reporting mechanism.
Nevertheless - as you say, until someone can show that 1.04 supports S.M.A.R.T. (and I have no evidence that it does), it remains nothing more than a hypothesis.
The only reason I present this theory is that whilst GhostRiderGrey's <upgrade, get error, downgrade, get rid of error> process may appear to prove that the firmware is flawed, firmware flaws are usually reproducible, and in my opinion there aren't enough reports of users with the problem to substantiate the claim.
fordem wrote:
Nevertheless - as you say, until someone can show that 1.04 supports S.M.A.R.T. (and I have no evidence that it does), it remains nothing more than a hypothesis.
Hi Fordem,
I did a grep for SMART and S.M.A.R.T. in the mdadm-2.5.6 source (which is used in F/W 1.04) and came up with nothing. Does this sort of rule out your hypothesis?
Regards,
Jaya
Yup - that would be sufficient.
But - all I wanted to do was present a plausible reason for 1.04 to cause a disk failure indication that 1.03 didn't - so mission accomplished.
As I said earlier, I feel that if it were a firmware flaw, as against the firmware responding to something the disks are doing, then a lot more of us would be reporting white/amber LEDs. Firmware flaws have a habit of being reproducible - take the >2GB file copy one, or the print queue causing a spin-up one. Too many people, tech support included, blame the firmware when they can't think of anything else.
I'd like to believe that D-Link has done something, perhaps in the form of disk failure or error detection - it doesn't necessarily have to be S.M.A.R.T. - that causes the indicator.
From a personal standpoint, I never got anything but a blue LED prior to 1.04, but I have managed to get what I believe is the white one (or blue & amber together) with 1.04 when fooling around with JBOD - but since I was removing & reinserting the disks at the time, trying to force an error, it was not entirely unexpected.
I have also noticed that when I first power on the DNS box, both hard disks show a blue light. The blue light activity on the left hard disk continues, while on the right hard disk the activity is noticeably slower, and after 3 minutes the right one turns amber.
Does this indicate that it is a failing hard disk?
This happens every time I turn on the DNS.
Cazaril
A failing hard drive would be my guess - if you have the data backed up, try swapping the disks around; if the amber LED follows the disk, then the unit is definitely detecting something it's not happy with - either the disk or the data contained therein.
Cazaril wrote:
I have also noticed that when I first power on the DNS box, both hard disks show a blue light. The blue light activity on the left hard disk continues, while on the right hard disk the activity is noticeably slower, and after 3 minutes the right one turns amber.
Does this indicate that it is a failing hard disk?
This happens every time I turn on the DNS.
Hi Cazaril,
More activity on one disk compared to the other is not an indication of a problem, because of how the swap partitions are used by the system.
Changing the disks around as per fordem's advice will tell you whether your problem is drive related. Are you able to telnet in and run 'dmesg' to see what it says when the amber light comes on?
Here is what I get on mine when I eject one drive while it is live:
***************************************
* HD1 stand by now! *
***************************************
***************************************
* HD0 stand by now! *
***************************************
Synchronizing SCSI cache for disk sdb: FAILED
  status = 0, message = 00, host = 1, driver = 00
<5>  Vendor: SAMSUNG   Model: HD501LJ   Rev: CR10
  Type:   Direct-Access   ANSI SCSI revision: 03
When I then plug the drive back in live, I get the so-called white light for a while, which is actually the blue light plus the amber light. Then it resolves to its normal state. The dmesg log after this is as follows:
sdc:
#######################################
# HD1 awake now ! #
#######################################
 sdc1 sdc2
Attached scsi disk sdc at scsi1, channel 0, id 0, lun 0
Attached scsi generic sg1 at scsi1, channel 0, id 0, lun 0, type 0
#######################################
# HD0 awake now ! #
#######################################
Hope this helps.
Jaya
I haven't managed to get my fun_plug working yet. Will do so sometime next week. In the meantime, I am taking the hard disk back to the manufacturer's agent to determine if anything is wrong with it.
Thanks for all the comments everyone!
I did try to swap the physical disks. The DNS did not recognise it and asked for the hard disks to be swapped back.
Last edited by Cazaril (2008-02-24 14:05:34)
I had the same issue with two new Samsung 500GB drives and firmware 1.04. I had trouble formatting to RAID 1, and kept getting the orange LED and also some errors while formatting. Eventually, after formatting and getting an error, I just reset the unit and it came back all good - formatted in RAID 1 with no orange LEDs - and it has been working fine for weeks now.
Hello,
Is it DLNA compliant? (DNS-313, DNS-323)
Cazaril wrote:
I haven't managed to get my fun_plug working yet. Will do so sometime next week. In the meantime, I am taking the hard disk back to the manufacturer's agent to determine if anything is wrong with it.
Thanks for all the comments everyone!
I did try to swap the physical disks. The DNS did not recognise it and asked for the hard disks to be swapped back.
As a further note to this saga, I managed to exchange the failing disk for a new one. After one day of testing, the left hard disk now fails. I must be the unluckiest guy in the world.
Just to add to the statistics:
Upgraded my DNS-323 to 1.04 last week, and this morning I noticed the right LED was amber. It might have happened earlier, but I don't look at them every day.
Well, here's another one to add...
Upgraded to 1.04
(2x500GB drives)
Amber/Orange light on right drive
-A
The same on my side: after updating to 1.04, the RAID 1 gets degraded and one of the disks is no longer in the array.
I have to manually add the disk back to the array using either
"mdadm /dev/md0 -a /dev/sda2" or
"mdadm /dev/md0 -a /dev/sdb2"
depending on which disk got removed from the array.
I never had this problem with firmware 1.03.
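For anyone wanting to script that workaround, /proc/mdstat shows which mirror is missing; a rough sketch, assuming the stock device names above and a busybox-style sh (the degraded helper name is mine, and this is untested on the unit itself):

```shell
# Return success if an mdstat excerpt shows a two-disk RAID 1 running
# on a single mirror ("[U_]" or "[_U]" in the status line).
degraded() {
  case "$1" in
    *'[U_]'*|*'[_U]'*) return 0 ;;
    *) return 1 ;;
  esac
}

# On the box (as root) the manual re-add could then become, e.g.:
#   if degraded "$(cat /proc/mdstat)"; then
#     if grep -q sdb2 /proc/mdstat; then
#       mdadm /dev/md0 -a /dev/sda2   # sdb2 still present, so sda2 is missing
#     else
#       mdadm /dev/md0 -a /dev/sdb2
#     fi
#   fi
```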
ripe_md
Cazaril wrote:
Could it be a combination of firmware and HDDs? Note that Ardjan has the white LED and a degraded disk after upgrading to 1.04. Also, I recall he had 400 GB WD disks.
I have the white/amber LED on the left bay, noticed a week or so after upgrading to 1.04b84. Both drives are Western Digital WD4000YR (400 GB 'RAID' models).
I checked the drive externally (hotplugging it onto my mainboard - fortunately the mainboard and Vista took it well!) and let a test program write 400 GB of pattern data onto it and read the data back: everything OK. SMART also showed everything OK.
The funny part: while the drive was in the DNS-323 and showing the white/amber LED, it behaved perfectly normally. Both the RAID 1 volume (as it should :-)) and the JBOD volume worked as expected.
So the thought of an error in 1.04 really comes to mind, IMHO...
Ardjan wrote:
The funny part: while the drive was in the DNS-323 and showing the white/amber LED, it behaved perfectly normally. Both the RAID 1 volume (as it should :-)) and the JBOD volume worked as expected.
So the thought of an error in 1.04 really comes to mind, IMHO...
Hi Ardjan,
I agree. Blue signals that the drive is running normally and amber signals that the drive has failed. Having both on all the time does suggest the DNS-323 is having trouble working out the drive's actual status.
[It is possible that rogue code turns on the amber light and signals an error state that has little to do with drive failure, and which is ignored by the rest of the system - hence the blue light staying on.]
Some questions:
1/ What does the status page say when this happens?
2/ Enable email alerts and see if you get an "A drive has failed" alert when this happens.
3/ If you are game enough, install fun_plug and do a dmesg to see if that holds any clues.
Hope this helps.
Jaya
On my side:
1)
Total Drive(s): 2
Volume Name: Volume_1
Volume Type: RAID 1
Sync Time Remaining: Degraded
Total Hard Drive Capacity: 490402 MB
Used Space: 252470 MB
Unused Space: 237931 MB
2)
Nope, no e-mails are sent. Not even after removing / reinserting the drive. The light stays amber/white.
3) have no telnet access (yet)
edit: update, now I got an e-mail that the right drive has failed.
Last edited by Speijk (2008-03-03 09:06:21)
jayas wrote:
1/ What does the status page say when this happens?
The RAID 1 says 'degraded'; the JBOD is OK. Both are working, as far as I can see. In fact, the only error I see is the white LED, nothing else yet...
Today I finally managed to test the drive thoroughly with the WD-provided test program. No SMART issues displayed, no bad sectors, nothing that points to an error.
jayas wrote:
2/ Enable email alerts and see if you get A drive has failed alert when this happens.
Not done yet.
jayas wrote:
3/ If you are game enough, install fun_plug and do a dmesg to see if that holds any clues.
Excerpt of dmesg.out:
Linux version 2.6.12.6-arm1 (jack@SWTEST2) (gcc version 3.3.3) #29 Thu Dec 27 09:59:48 CST 2007
<snip>
scsi0 : Marvell SCSI to SATA adapter
scsi1 : Marvell SCSI to SATA adapter
scsi2 : Marvell SCSI to SATA adapter
scsi3 : Marvell SCSI to SATA adapter
  Vendor: WDC   Model: WD4000YR-01PLB0   Rev: 01.0
  Type:   Direct-Access   ANSI SCSI revision: 03
  Vendor: WDC   Model: WD4000YR-01PLB0   Rev: 01.0
  Type:   Direct-Access   ANSI SCSI revision: 03
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0, type 0
Attached scsi generic sg1 at scsi1, channel 0, id 0, lun 0, type 0
<snip>
RAMDISK: Compressed image found at block 0
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
VFS: Mounted root (ext2 filesystem).
Freeing init memory: 112K
SCSI device sda: 781422768 512-byte hdwr sectors (400088 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 781422768 512-byte hdwr sectors (400088 MB)
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sdb: 781422768 512-byte hdwr sectors (400088 MB)
SCSI device sdb: drive cache: write back
SCSI device sdb: 781422768 512-byte hdwr sectors (400088 MB)
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2 sdb3 sdb4
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
<snip>
Adding 530104k swap on /dev/sda1.  Priority:-1 extents:1
Adding 530104k swap on /dev/sdb1.  Priority:-2 extents:1
ext3: No journal on filesystem on sda2
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
ext3: No journal on filesystem on sda3
ext3: No journal on filesystem on sdb2
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
ext3: No journal on filesystem on sdb3
ext3: No journal on filesystem on sda4
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
ext3: No journal on filesystem on sdb4
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
md: md0 stopped.
md: bind<sdb2>
md: bind<sda2>
md: kicking non-fresh sdb2 from array!
md: unbind<sdb2>
md: export_rdev(sdb2)
raid1: raid set md0 active with 1 out of 2 mirrors
md: md1 stopped.
md: bind<sdb3>
md: bind<sda3>
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
ext3: No journal on filesystem on sda4
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
ext3: No journal on filesystem on sdb4
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
Link Layer Topology Discovery Protocol, version 1.05.1223.2005
dev is <NULL>
Hmm, what do those ext3 references in the last few lines mean? I thought ext3 was no longer in the post-1.02 firmware?
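Incidentally, the "md: kicking non-fresh sdb2 from array!" line earlier in that log is the actual cause of the degraded state: sdb2's md metadata carries a lower event count than sda2's, so the kernel refuses to assemble it into the mirror. Comparing the counts shows which member is stale; a sketch (the events helper just parses 'mdadm -E' output, and its name is my own):

```shell
# Extract the event count from 'mdadm -E <device>' output on stdin.
# The member with the lower count is the one md considers "non-fresh".
events() {
  sed -n 's/.*Events *: *\([0-9.]*\).*/\1/p'
}

# On the box (as root), e.g.:
#   mdadm -E /dev/sda2 | events
#   mdadm -E /dev/sdb2 | events
```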