FiberCAT SX60 - Vdisk Expand - Lost partitions

Post by **elnone** » Sun Mar 08, 2009 12:28

Hello!

The SX60 was first delivered with 8 x 500 GB HDDs. A 3 TB RAID 50 vdisk was created on them with three partitions: 1 TB, 1 TB and 250 GB; the remaining space was left unused. The host OS is RHEL 5.3, and the FC partitions were formatted as ext3. Later 4 more HDDs were bought and plugged into the FC. The vdisk expansion was started on 22 Feb, and as of today (08.03.2009) it is only about 70% done. On 23 Feb the FC hung and could not be restarted via remote access, so it had to be power-cycled by hand. The same thing happened again a couple of days later. At around 50% of the expansion, the second partition on the FC, /dev/mapper/mpath2p1, got corrupted and then disappeared completely: the kernel no longer saw it. Then the first partition was lost too :evil:

fsck ran for 2 days; the result was more than 2 million files in lost+found and corrupted superblocks.
Thank God I have a weekly mirror of the data on these partitions. So both partitions were recreated and mkfs.ext3'ed. At the end of the second day the same problem came back.
fsck, running for days and hours, produces:
Inode 30909308 has illegal block(s). Clear? yes

Illegal block #0 (2549152832) in inode 30909308. CLEARED.
Illegal block #1 (2845807483) in inode 30909308. CLEARED.
Illegal block #2 (1211752540) in inode 30909308. CLEARED.
Illegal block #3 (4090817801) in inode 30909308. CLEARED.
Illegal block #4 (3809866318) in inode 30909308. CLEARED.
Illegal block #5 (1753329583) in inode 30909308. CLEARED.
Illegal block #6 (3411164057) in inode 30909308. CLEARED.
Illegal block #7 (778521635) in inode 30909308. CLEARED.
Illegal block #8 (2684163921) in inode 30909308. CLEARED.
Illegal block #9 (1428417761) in inode 30909308. CLEARED.
Illegal block #10 (1941275152) in inode 30909308. CLEARED.
Too many illegal blocks in inode 30909308.
Clear inode? yes

Inode 30909394 has a bad extended attribute block 211087634. Clear? yes

Inode 30909394 has illegal block(s). Clear? yes

dmesg contains:
attempt to access beyond end of device
dm-5: rw=0, want=21973915880, limit=1953118377
attempt to access beyond end of device
dm-5: rw=0, want=7127762568, limit=1953118377
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 51265537 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 51265537 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 52609025 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 51232769 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 52609025 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 51232769 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 51265537 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 52609025 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 51232769 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 51265537 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 52609025 in dir #2
EXT3-fs error (device dm-5): ext3_lookup: unlinked inode 51232769 in dir #2
ext3_abort called.
EXT3-fs error (device dm-5): ext3_put_super: Couldn't clean up the journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev dm-5, type ext3), uses xattr
EXT3-fs error (device dm-5): htree_dirblock_to_tree: bad entry in directory #51265537: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Aborting journal on device dm-5.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-4, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev dm-4, type ext3), uses xattr
ext3_abort called.
EXT3-fs error (device dm-5): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
ext3_abort called.
EXT3-fs error (device dm-5): ext3_put_super: Couldn't clean up the journal
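
For scale, the two "attempt to access beyond end of device" lines in the dmesg output above can be put into numbers. This is only a minimal sketch, assuming that `want` and `limit` are counted in 512-byte sectors (the usual unit for this kernel message); the function name `parse_beyond_end` is made up for illustration.

```python
import re

# Assumption: the kernel reports want/limit in 512-byte sectors.
SECTOR_BYTES = 512

# Matches lines such as "dm-5: rw=0, want=21973915880, limit=1953118377".
LINE_RE = re.compile(
    r"(?P<dev>\S+): rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)


def parse_beyond_end(line):
    """Return device, want, limit, device size and overshoot factor, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    want = int(match.group("want"))
    limit = int(match.group("limit"))
    return {
        "device": match.group("dev"),
        "want": want,
        "limit": limit,
        "device_bytes": limit * SECTOR_BYTES,
        "overshoot": want / limit,  # how many device sizes away the request is
    }


if __name__ == "__main__":
    samples = [
        "dm-5: rw=0, want=21973915880, limit=1953118377",
        "dm-5: rw=0, want=7127762568, limit=1953118377",
    ]
    for sample in samples:
        info = parse_beyond_end(sample)
        print(
            f"{info['device']}: device is {info['device_bytes'] / 1e12:.2f} TB, "
            f"request lies at {info['overshoot']:.1f}x the device size"
        )
```

Under that assumption dm-5 is roughly 1 TB, which fits the partition size, while the requested sectors lie several device sizes beyond its end. That pattern would be consistent with garbage block pointers in the filesystem metadata rather than a small off-by-a-few error, although the log alone does not prove it.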

I feel that the partition is lost again!

sudo /bin/mount -w -o async,noatime,nodev,noexec,nosuid,acl /dev/mapper/mpath2p1 /mnt/mpath2p1/
mount: wrong fs type, bad option, bad superblock on /dev/mapper/mpath2p1,
missing codepage or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

So I do not know what is better right now: wait until the vdisk expansion finishes and see whether the expansion actually takes effect, or drop everything, vdisk included, and create a new RAID with new volumes?

Anyway, I do not know; maybe expansion works only in lab conditions and not in real life.
Anyway the support of FSC would not help me, but I would like to share my experience and warn others about taking this step.

BTW Volume expansion works fine on ext3.
BTW2 Snapshot technology is not working on Linux: the partitions are not recognized.

BR

Post by **Schnietz Kathrin** » Wed Mar 11, 2009 15:38

Hello!

You wrote

Anyway the support of FSC would not help me

May I ask how you have contacted FSC and what happened?

Kind regards,
Your Fujitsu Siemens Computers Storage Team

Post by **elnone** » Fri Mar 13, 2009 7:38

Anyway the support of FSC would not help me

I tried to make contact through the equipment supplier; the specialist was away, so I left my contact details. I know the supplier also contacted them and provided the serial numbers. No feedback. Frankly speaking, or rather asking: what can support do about lost data? I'd expect the counter-question: do you have a backup, hopefully? Yes, I have a backup. I don't know why, but my gut feeling told me not to rely on the FC when it was set up, so the rsync script kept distributing the data to the other file servers, as it did before. Today is 13.03.2009 and
# show vdisks
Name  Size      Free     Own  RAID    Dsk  Spr  Chk  Stat  Jobs
  Serial#
-----------------------------------------------------------------------------
data  3000.5GB  200.5GB  A    RAID50  12   0    192  FTOL  EXPD 95%
  00c0ffd52119004839db954800000000
-----------------------------------------------------------------------------

I hope the utility finishes tomorrow and there will be results. Until the day before yesterday fsck kept finding errors... but fixed them successfully.

Post by **elnone** » Fri Mar 13, 2009 7:47

And the main point of this post is to warn other users of possible data loss, because there is no mention of it in the official documentation.

Post by **elnone** » Sat Mar 14, 2009 6:52

14.03.2009 the second partition gave the result :-)

But it is 99% done?!?!?

Superblock has an invalid ext3 journal (inode ...).
Clear? yes

*** ext3 journal has been deleted - filesystem is now ext2 only ***

Corruption found in superblock. (blocks_count = 0).

The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 32768 <device>
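
The block number 32768 in the e2fsck hint above is the first backup superblock on a filesystem with 4 KiB blocks. If that copy is damaged as well, further copies can be tried. Below is a minimal sketch of where they usually sit, assuming the default sparse_super layout (copies in block groups 0, 1 and in groups that are powers of 3, 5 and 7) and 8 * block_size blocks per group. The function names are hypothetical, and the total block count in the example is an illustrative guess derived from the dm-5 limit in dmesg, not a value read from this filesystem.

```python
def backup_superblock_groups(group_count):
    """Block groups that carry a superblock copy under sparse_super."""
    groups = {0, 1}
    for base in (3, 5, 7):
        power = base
        while power < group_count:
            groups.add(power)
            power *= base
    return sorted(g for g in groups if g < group_count)


def backup_superblock_blocks(total_blocks, block_size=4096):
    """Candidate block numbers for 'e2fsck -b', excluding the primary copy."""
    blocks_per_group = 8 * block_size  # one bitmap block covers this many blocks
    # With 1 KiB blocks the first data block is 1, otherwise 0.
    first_data_block = 1 if block_size == 1024 else 0
    usable = total_blocks - first_data_block
    group_count = -(-usable // blocks_per_group)  # ceiling division
    return [
        group * blocks_per_group + first_data_block
        for group in backup_superblock_groups(group_count)
        if group != 0
    ]


if __name__ == "__main__":
    # Assumption: about 1953118377 sectors of 512 bytes, i.e. 244139797
    # blocks of 4 KiB; the real filesystem may be somewhat smaller.
    for block in backup_superblock_blocks(244139797)[:8]:
        print(f"e2fsck -b {block} <device>")
```

Each printed line is only a candidate to try by hand; whether any backup copy is still intact after the corruption described here is not something the sketch can tell.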
