NetBSD Problem Report #54935

From martin@aprisoft.de  Tue Feb  4 18:49:39 2020
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 2CDA51A9213
	for <gnats-bugs@gnats.NetBSD.org>; Tue,  4 Feb 2020 18:49:39 +0000 (UTC)
Message-Id: <20200204184929.4D6035CC8D0@emmas.aprisoft.de>
Date: Tue,  4 Feb 2020 19:49:29 +0100 (CET)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: dir corruption not fixed by fsck
X-Send-Pr-Version: 3.95

>Number:         54935
>Category:       port-mac68k
>Synopsis:       dir corruption not fixed by fsck
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    martin
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Feb 04 18:50:00 +0000 2020
>Last-Modified:  Tue Jul 27 03:35:02 +0000 2021
>Originator:     Martin Husemann
>Release:        NetBSD 9.99.45
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD seven-days-to-the-wolves.aprisoft.de 9.99.45 NetBSD 9.99.45 (GENERIC) #348: Mon Feb 3 08:14:54 CET 2020 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

I have a machine (running -current as of today) which apparently has a slightly
corrupted filesystem.

I booted single user and ran fsck -fy /

That finished and claimed to have fixed the fs.

On reboot I got:
[ 3489.8572502] /: bad dir ino 1054908 at offset 8120: Bad dir (not rounded), reclen=0x803, namlen=116, dirsiz=128 <= reclen=2051 <= maxsize=72, flags=0x5001, entryoffsetinblock=8120, dirblksiz=512

and on next boot:

[  17.5679221] /: bad dir ino 2107008 at offset 8128: NUL in name [raic] i=4, namlen=6
[  27.9025333] panic: /: bad dir ino 1077399 at offset 8112: NUL in name [libdc++.a] i=9, namlen=11

Backup and restore time - but somehow I guess fsck_ffs should have dealt with
it properly.
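
The numbers in the messages above can be cross-checked by hand. What follows is a minimal sketch, not the kernel's actual code: it assumes the classic FFS directory entry layout (an 8-byte header followed by a NUL-terminated name padded to a 4-byte boundary). The function names and the "reclen too small" / "reclen too big" labels are made up for illustration; only "not rounded" and "NUL in name" come from the log. Unlike the kernel, which reports a single reason, the sketch lists every check that fails.

```python
# Sketch of the sanity checks behind the "bad dir" messages above.
# Assumption: classic FFS layout, 8-byte entry header, name NUL-terminated
# and padded to a 4-byte boundary. All function names are hypothetical.

DIRENT_HEADER = 8  # d_ino (4) + d_reclen (2) + d_type (1) + d_namlen (1)


def dirsiz(namlen):
    """Minimum record size for a name of namlen bytes plus its NUL."""
    return DIRENT_HEADER + ((namlen + 1 + 3) & ~3)


def maxsize(entryoffsetinblock, dirblksiz):
    """Bytes left between the entry and the end of its directory block."""
    return dirblksiz - (entryoffsetinblock & (dirblksiz - 1))


def check_entry(reclen, namlen, entryoffsetinblock, dirblksiz, name=None):
    """Return the list of failed checks; an empty list means the entry looks sane."""
    problems = []
    if reclen & 3:
        problems.append("not rounded")
    if reclen < dirsiz(namlen):
        problems.append("reclen too small")
    if reclen > maxsize(entryoffsetinblock, dirblksiz):
        problems.append("reclen too big")
    if name is not None:
        nul = name[:namlen].find(b"\0")
        if nul != -1:
            problems.append("NUL in name at i=%d" % nul)
    return problems


# Values taken from the first message in this report.
print(dirsiz(116))                          # 128, matches dirsiz=128
print(maxsize(8120, 512))                   # 72, matches maxsize=72
print(check_entry(0x803, 116, 8120, 512))   # ['not rounded', 'reclen too big']
```

Under these assumptions the logged dirsiz=128 and maxsize=72 both follow from namlen=116, entryoffsetinblock=8120 and dirblksiz=512, so the kernel's arithmetic is consistent and the implausible value is reclen=0x803 itself.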

>How-To-Repeat:
n/a

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/54935: dir corruption not fixed by fsck
Date: Tue, 4 Feb 2020 17:28:22 -0500

 clri/fsdb is your friend. Do you have a copy
 of it?

 christos

 > On Feb 4, 2020, at 1:50 PM, "martin@netbsd.org" <martin@netbsd.org> wrote:
 >
 >>
 >> Number:         54935
 >> Category:       bin
 >> Synopsis:       dir corruption not fixed by fsck
 >> Confidential:   no
 >> Severity:       critical
 >> Priority:       high
 >> Responsible:    bin-bug-people
 >> State:          open
 >> Class:          sw-bug
 >> Submitter-Id:   net
 >> Arrival-Date:   Tue Feb 04 18:50:00 +0000 2020
 >> Originator:     Martin Husemann
 >> Release:        NetBSD 9.99.45
 >> Organization:
 > The NetBSD Foundation, Inc.
 >> Environment:
 > System: NetBSD seven-days-to-the-wolves.aprisoft.de 9.99.45 NetBSD 9.99.45 (GENERIC) #348: Mon Feb 3 08:14:54 CET 2020 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/amd64/compile/GENERIC amd64
 > Architecture: x86_64
 > Machine: amd64
 >> Description:
 >
 > I have a machine (running -current as of today) which apparently has a slightly
 > corrupted filesystem.
 >
 > I booted single user and ran fsck -fy /
 >
 > That finished and claimed to have fixed the fs.
 >
 > On reboot I got:
 > [ 3489.8572502] /: bad dir ino 1054908 at offset 8120: Bad dir (not rounded), reclen=0x803, namlen=116, dirsiz=128 <= reclen=2051 <= maxsize=72, flags=0x5001, entryoffsetinblock=8120, dirblksiz=512
 >
 > and on next boot:
 >
 > [  17.5679221] /: bad dir ino 2107008 at offset 8128: NUL in name [raic] i=4, namlen=6
 > [  27.9025333] panic: /: bad dir ino 1077399 at offset 8112: NUL in name [libdc++.a] i=9, namlen=11
 >
 > Backup and restore time - but somehow I guess fsck_ffs should have dealt with
 > it properly.
 >
 >> How-To-Repeat:
 > n/a
 >
 >> Fix:
 > n/a

Responsible-Changed-From-To: bin-bug-people->martin
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Wed, 05 Feb 2020 09:55:28 +0000
Responsible-Changed-Why:
I'll have to look into this


From: Martin Husemann <martin@duskware.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: bin/54935: dir corruption not fixed by fsck
Date: Wed, 5 Feb 2020 10:54:08 +0100

 This seems to be a kernel issue - the file system is fine, new kernel
 has issues accessing it.

 Martin

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/54935: dir corruption not fixed by fsck
Date: Sun, 25 Jul 2021 02:21:14 +0000

 On Tue, Feb 04, 2020 at 06:50:00PM +0000, martin@NetBSD.org wrote:
  > [ 3489.8572502] /: bad dir ino 1054908 at offset 8120: Bad dir (not
  >   rounded), reclen=0x803, namlen=116, dirsiz=128 <= reclen=2051 <=
  >   maxsize=72, flags=0x5001, entryoffsetinblock=8120, dirblksiz=512
  > 
  > and on next boot:
  > 
  > [  17.5679221] /: bad dir ino 2107008 at offset 8128: NUL in name
  >   [raic] i=4, namlen=6
  > [  27.9025333] panic: /: bad dir ino 1077399 at offset 8112: NUL in
  >   name [libdc++.a] i=9, namlen=11

  >  This seems to be a kernel issue - the file system is fine, new kernel
  >  has issues accessing it.

 Did this get figured out?

 -- 
 David A. Holland
 dholland@netbsd.org

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/54935: dir corruption not fixed by fsck
Date: Sun, 25 Jul 2021 10:37:34 +0200

 Not yet - but may have been disk(driver) misbehaving - can't test before
 a few other things are resolved, please just leave open and assigned to
 me as a reminder.

 Martin

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/54935: dir corruption not fixed by fsck
Date: Tue, 27 Jul 2021 03:34:53 +0000

 On Sun, Jul 25, 2021 at 08:40:02AM +0000, Martin Husemann wrote:
  >  Not yet - but may have been disk(driver) misbehaving - can't test before
  >  a few other things are resolved, please just leave open and assigned to
  >  me as a reminder.

 Sure.

 Let me know if I can help though :-)

 -- 
 David A. Holland
 dholland@netbsd.org

>Unformatted:

Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.