RAID5, TX200 S2, (Gamevents S-12 & S-36).

leonsmit

Postby leonsmit » Thu Jun 22, 2006 12:14

Hi,
Running RAID5 (3 x SCSI-3 70 GB disks + 1 spare) on a TX200 S2 (NetWare 6.0), I'm experiencing multiple disk failures.
The GAM events are S-12: A physical disk has failed
and S-36: A physical disk failed because device is missing.


Cooling the environment has dramatically reduced the failure frequency, but I still got one disk failure this morning. Looking at the log, I see that the disk failed under heavy load (a total backup of the system).

The disks are easily made ready again, and they test error-free after the failure.

Is anybody out there experiencing something similar?
Is the stability of the TX200 S2 (or the SCSI-3 disks?) heavily temperature dependent?

Regards

Data
Posts: 131
Joined: Fri Apr 06, 2001 0:01

Postby Data » Thu Jun 22, 2006 14:58

Hi leonsmit.

Basically every piece of hardware is temperature dependent.
I've just looked up the data sheet, and according to it the ambient temperature should not exceed 35 degrees Celsius.

Some tips:
Have you installed ServerView? Does it say anything about the temperature of your server?

Can you tell something more about the environment of the server?
What about the temperature of the environment?

Is it always the same disk that fails?
Maybe it's just a HW defect in one of the disks?



Regards.
Data.

leonsmit

Postby leonsmit » Fri Jun 23, 2006 15:03

Hi Data

Thanks for your kind answer! Over the weekend we will monitor and record the temperature in the server room, since there are some problems ventilating the room properly.

I do not think that the temperature is more than 25 degrees in the room, but on Monday I'll know for sure.

The ServerView application is not installed. I'm not sure if it's possible to install it on NW 6.0, but if you know where the software and installation manuals can be found, I would be happy to hear from you again.

The funny thing is that it seems completely arbitrary which disk fails. Twice I've seen one disk fail and then the spare fail right after it.

So the situation is wearing on my nerves, although I haven't experienced a disk crash in the last 36 hours.

But anyway thanks for your comforting words, and have a nice weekend!

Brgds

Data
Posts: 131
Joined: Fri Apr 06, 2001 0:01

Postby Data » Mon Jun 26, 2006 10:13

Hi Leon.

I think you can use the ServerView agents in a NetWare 6.0 environment.
The ServerView application can then be installed on a Windows management station.

The ServerView application and agents can be found at:
http://www.fujitsu-siemens.com/serversupport
In the info section you'll find some extra information on system requirements, necessary Novell patches, etc.

The manuals for your server and for ServerView can be found at:
http://www.fujitsu-siemens.com/serverbooks

Hope this helps.
Data

leonsmit

Postby leonsmit » Wed Jun 28, 2006 12:05

Yesterday I had multiple disk errors again! The logical drive was intact but in a critical state. When I tried to make the array fault tolerant again, the rebuild started, but after a while the rebuilding disk went dead again.

Only after I switched off the server and disconnected the power supply was it possible to rebuild the disk.

What does this indicate? Something serious is wrong....

Just to do something constructive, I went to the drivers/download page and downloaded the newest drivers for the MegaRAID 320-0X.

Comparing the downloaded driver/firmware versions with those installed on the server, I found a mismatch between the firmware version on the RAID controller and the Mega4_xx.ham driver installed on the NetWare system. So I upgraded the HAM driver to match the firmware.

Today I have had no trouble, but it's too early to say if this was the solution.

I'll update this thread if I have further trouble, or maybe success. Fingers crossed.



Brgds

leonsmit

Postby leonsmit » Tue Jul 04, 2006 14:41

Still running.... :)

Temperature logging showed a maximum of 28 degrees Celsius. Effective cooling has been implemented, and the temperature is now below 20.
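
For anyone who wants to compare their own temperature log with the data-sheet limit mentioned earlier in the thread, here is a minimal sketch. The function name `summarize` and the sample readings are hypothetical; only the 35 degree limit and the 28 degree peak come from this thread.

```python
# Ambient limit quoted from the data sheet earlier in this thread.
AMBIENT_LIMIT_C = 35.0


def summarize(readings, limit=AMBIENT_LIMIT_C):
    """Return (peak reading, headroom below the limit, readings above the limit)."""
    peak = max(readings)
    over = [r for r in readings if r > limit]
    return peak, limit - peak, over


# Hypothetical sample values; only the 28.0 peak is taken from the thread.
sample = [24.5, 26.0, 28.0, 27.5]
peak, headroom, over = summarize(sample)
print(f"peak {peak:.1f} C, headroom {headroom:.1f} C, readings over limit: {len(over)}")
```

Note that this only checks room (ambient) readings; temperatures inside the chassis or at the drives can be higher and would have to be read from the server itself.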

leonsmit

Postby leonsmit » Tue Jul 18, 2006 16:02

Still running, 21 days since the last incident. I'm confident that the driver/firmware mismatch was the cause of the errors.

Initially the disk error was due to too-high temperatures in the server room, and somehow the interaction between driver and firmware was not able to clean up properly after the first disk error.

Have a nice summer holiday, all of you.

:wink:

leonsmit

more troubles!

Postby leonsmit » Tue Aug 29, 2006 16:29

:(

Well, the system went down again in the middle of my summer holiday with multiple disk errors. I managed to get the system up again, but only after rebuilding the SYS partition. This time I have not added a hot spare, and the system has been stable for a month. I have had enough and will not meddle with the system any further.

So are there any solutions out there?

A package of drivers and utilities (firmware for BIOS and RAID BIOS 0320-0x, NetWare drivers, gamserver and gamclient) that match each other on the TX200 S2 server would be a nice service from Fujitsu-Siemens at this moment......

Ne0

Postby Ne0 » Tue Sep 05, 2006 7:32

I looked at the compatibility table from FSC and didn't find NetWare 6.0 listed, only NetWare 6.5 SP5. Try updating the system first.

