NetBSD Problem Report #32701

From mouse@Sparkle.Rodents.Montreal.QC.CA  Thu Feb  2 17:23:20 2006
Return-Path: <mouse@Sparkle.Rodents.Montreal.QC.CA>
Received: from Sparkle.Rodents.Montreal.QC.CA (Sparkle.Rodents.Montreal.QC.CA [])
	by (Postfix) with ESMTP id 1E97E63B876
	for <>; Thu,  2 Feb 2006 17:23:20 +0000 (UTC)
Message-Id: <200602021723.MAA01608@Sparkle.Rodents.Montreal.QC.CA>
Date: Thu, 2 Feb 2006 12:18:40 -0500 (EST)
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
Reply-To: mouse@Rodents.Montreal.QC.CA
Subject: [dM] Indirect blocks break on big filesystems
X-Send-Pr-Version: 3.95

>Number:         32701
>Category:       kern
>Synopsis:       [dM] Indirect blocks break on big filesystems
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Feb 02 17:25:00 +0000 2006
>Last-Modified:  Fri Feb 03 08:45:01 +0000 2006
>Originator:     der Mouse
>Release:        NetBSD 2.0
	The hardware corresponds to
System: NetBSD 3.0 NetBSD 3.0 (GENERIC) #0: Mon Dec 19 01:04:02 UTC 2005 i386
Architecture: i386
Machine: i386
	but this was observed under a 2.0 kernel, not the 3.0 kernel
	corresponding to the above uname output.  (Tests are underway
	under 3.0; I'll append to this ticket as I get results.)
	The machine has a 3ware RAID card in it with 12 disks attached,
	set up as a RAID5 and a RAID0:

	twe0 at pci3 dev 1 function 0: 3ware Escalade
	twe0: interrupting at irq 10
	twe0: 12 ports, Firmware FE7S, BIOS BE7X
	twe0: Monitor ME7X, PCB Rev5    , Achip 3.20    , Pchip 1.30-66 
	twe0: port 0: ST3300831AS                              286168 MB
	twe0: port 1: ST3300831AS                              286168 MB
	twe0: port 2: ST3300831AS                              286168 MB
	twe0: port 3: ST3300831AS                              286168 MB
	twe0: port 4: ST3300831AS                              286168 MB
	twe0: port 5: ST3300831AS                              286168 MB
	twe0: port 6: ST3300831AS                              286168 MB
	twe0: port 7: ST3300831AS                              286168 MB
	twe0: port 8: ST3300831AS                              286168 MB
	twe0: port 9: ST3300831AS                              286168 MB
	twe0: port 10: ST3300831AS                              286168 MB
	twe0: port 11: ST3300831AS                              286168 MB
	ld0 at twe0 unit 0: 64K stripe RAID5, status: Normal
	ld0: 1956 GB, 255368 cyl, 255 head, 63 sec, 512 bytes/sect x 4102491904 sectors
	ld1 at twe0 unit 8: 1024K stripe RAID0, status: Normal
	ld1: 838 GB, 109443 cyl, 255 head, 63 sec, 512 bytes/sect x 1758210048 sectors

	I labeled ld0 as

	4 partitions:
	#        size    offset     fstype [fsize bsize cpg/sgs]
	 a: 4102491904         0     4.2BSD   1024  8192 56528  # (Cyl.      0 - 255368*)
	 d: 4102491904         0     unused      0     0        # (Cyl.      0 - 255368*)

	I created an FFSv1 filesystem in ld0a with fsize=1024 bsize=8192
	(as indicated by the values in the label).  Then I created 418
	files of exactly 4G each, split into five directories:
	00/0001-00/0099, 01/0100-01/0199, ..., 04/0400-04/0418.  (418
	is not special; I just had it keep creating until df reported
	at least 90% full, and that happened to be after 418 files.)
	Each file has distinctive content; given a disk block belonging
	to any of them, I could tell which file it belonged to and
	where in that file it belonged.
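
	One way to produce such self-identifying blocks is sketched
	below.  This is only an illustration: the report does not give
	the actual format used, so the tag layout and the names
	make_block and identify are invented here.

```python
BLOCK_SIZE = 8192   # bsize from the disklabel above

def make_block(file_id, block_index):
    """Build one block whose contents name the file and the block's
    position in it. The tag format is invented for this sketch."""
    tag = b"FILE=%04d BLOCK=%010d\n" % (file_id, block_index)
    reps, rest = divmod(BLOCK_SIZE, len(tag))
    # Repeat the tag to fill the block, so any fragment of it carries a tag.
    return tag * reps + tag[:rest]

def identify(block):
    """Recover (file_id, block_index) from a block made by make_block."""
    fields = block[:block.index(b"\n")].split()
    return int(fields[0][5:]), int(fields[1][6:])
```

	With content like this, a raw block read off the disk can be
	mapped back to its file and offset without consulting any
	filesystem metadata.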

	Then I unmounted the filesystem and ran fsck.  fsck found many
	problems, mostly "INCORRECT BLOCK COUNT" and a lot of BAD or
	DUP BLKS.  Looking at what's actually on the disk with other
	tools, it appears that all the fsck-reported problems
	(certainly all the ones I spot-checked) are due to indirect
	blocks getting trashed.  (An iblock full of 0s gives INCORRECT
	BLOCK COUNT; the BAD and DUP blocks are from iblocks full of
	nonzero trash.)

	I haven't checked thoroughly yet, but preliminary indications
	are that corruption strikes whenever an indirect block falls
	above the 1T point on the disk, leading me to suspect a
	signed-32-bit bug somewhere in an indirect-block code path.
	Reinforcing this theory, ld1a (838 GB, so entirely below 1T)
	does not suffer from this issue as far as I've been able to
	tell, despite having an 8k/64k filesystem.  (ld0a had one too,
	earlier, but I'd heard conjectures that the blocksize was the
	problem.)
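
	The arithmetic behind the 1T suspicion can be checked with a
	short sketch.  It is an illustration in Python, not kernel code;
	as_int32 is a hypothetical helper, and the sector counts are
	the ones from the ld0 and ld1 probe lines above.

```python
SECTOR_SIZE = 512                 # bytes/sect, from the ld0/ld1 probe lines
INT32_MAX = 2**31 - 1             # largest value a signed 32-bit int can hold

def as_int32(n):
    """Reinterpret the low 32 bits of n as a signed 32-bit integer."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n

# First sector number that no longer fits in a signed 32-bit int,
# and the byte offset on the disk where that sector starts.
first_bad_sector = INT32_MAX + 1
boundary_bytes = first_bad_sector * SECTOR_SIZE

ld0_sectors = 4102491904          # RAID5 unit
ld1_sectors = 1758210048          # RAID0 unit

for name, sectors in (("ld0", ld0_sectors), ("ld1", ld1_sectors)):
    last = sectors - 1
    print(name, "last sector:", last,
          "as int32:", as_int32(last),
          "crosses boundary:", last > INT32_MAX)
```

	The boundary works out to 2^40 bytes, i.e. exactly 1T.  ld0
	extends past it, so its highest sector numbers go negative when
	squeezed into a signed 32-bit variable; ld1 ends well below it.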

>How-To-Repeat:
	See description above.
>Fix:
	Unknown.  "Don't go above 1T" seems to work, but is hardly a
	real fix.

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

From: der Mouse <mouse@Rodents.Montreal.QC.CA>
Subject: Re: kern/32701: [dM] Indirect blocks break on big filesystems
Date: Fri, 3 Feb 2006 03:35:55 -0500 (EST)

 Update on my "large filesystems vs indirect blocks" issue....

 I did a search on my test filesystem, looking to see if maybe indirect
 blocks just got written to the wrong place.

 First, I made up a list of the blocks the on-disk filesystem pointed to
 for the last-written test file.  The single indirect block was full of
 0s, causing one indirect block's worth of data to be missing from that
 view of the file.

 Then, I searched the filesystem for a distinctive string that was
 present in every block of the file (I wrote the files with a program
 designed to produce such distinctive content).  This gave me a list of
 every block containing file contents.  All the missing data blocks were
 actually present on the disk.

 Then, knowing their numbers and (from their contents) where they
 belonged in the file, I worked out a small (16-byte) snippet of what
 the lost indirect block's contents would be.

 Then, I searched the entire partition for that 16-byte string.  That
 search just completed, and it found nothing at all.  So either the
 indirect block never got written, or it got overwritten by something
 else later.  I consider the former a great deal more likely: all the
 data I've looked at (admittedly just a few spot checks) is consistent
 with the theory that indirect blocks are lost when their disk sector
 numbers, treated as 32-bit signed numbers, go negative.
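
 The snippet search described above could be sketched roughly as
 follows.  This is not the tool actually used; it assumes FFSv1
 indirect blocks hold 32-bit block pointers in little-endian order (the
 machine is i386), and iblock_snippet and find_all are hypothetical
 names.

```python
import struct

def iblock_snippet(block_numbers):
    """Pack four consecutive block pointers as they would sit in an
    FFSv1 indirect block (assumed: signed 32-bit, little-endian)."""
    assert len(block_numbers) == 4
    return struct.pack("<4i", *block_numbers)   # 4 * 4 bytes = 16 bytes

def find_all(stream, needle, chunk_size=1 << 20):
    """Yield the byte offset of every occurrence of needle in stream.

    Reads in chunks and carries len(needle) - 1 bytes over to the next
    read, so a match straddling a chunk boundary is not missed."""
    overlap = len(needle) - 1
    tail = b""
    base = 0            # stream offset of the first byte of tail + chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        pos = buf.find(needle)
        while pos != -1:
            yield base + pos
            pos = buf.find(needle, pos + 1)
        tail = buf[-overlap:] if overlap else b""
        base += len(buf) - len(tail)
```

 Opening the raw partition in binary mode and passing it to find_all
 with the packed snippet would list every place the pattern occurs; an
 empty result corresponds to the "found nothing at all" outcome.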

 Now to try a newfs plus write tests under a 3.0 kernel....

