NetBSD Problem Report #57493
From www@netbsd.org Fri Jun 30 11:28:40 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id BF13A1A923D
for <gnats-bugs@gnats.NetBSD.org>; Fri, 30 Jun 2023 11:28:39 +0000 (UTC)
Message-Id: <20230630112838.35F361A923E@mollari.NetBSD.org>
Date: Fri, 30 Jun 2023 11:28:38 +0000 (UTC)
From: fekete.zoltan@minux.hu
Reply-To: fekete.zoltan@minux.hu
To: gnats-bugs@NetBSD.org
Subject: UFS_DIRHASH panic during installation
X-Send-Pr-Version: www-1.0
>Number: 57493
>Category: kern
>Synopsis: UFS_DIRHASH panic during installation
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Jun 30 11:30:01 +0000 2023
>Last-Modified: Fri Jun 30 15:40:02 +0000 2023
>Originator: Zoltan Fekete
>Release: -CURRENT
>Organization:
private
>Environment:
NetBSD XXXXX 10.99.4 NetBSD 10.99.4 (TPL380) #8: Fri Jun 30 07:41:32 CEST 2023 root@XXXXX:/usr/src/sys/arch/amd64/compile/TPL380 amd64
>Description:
I have just built my own kernel in -current.
This
It seems that there is a panic on certain directory operations. I
experienced it during the install phase of a fairly large package with pkgin.
The last call on the stack is ufsdirhash_findslot().
The panic also causes filesystem corruption, though fsck_ffs can fix it automatically.
See my stack trace image here, shot with a phone; sorry for the dusty
screen, it's a workhorse...
https://drive.google.com/file/d/1aR0YcDyKTtCrCdq85byY6RLcGfKUqYqG/view?usp=sharing
>How-To-Repeat:
1. rm -rf /usr/pkg/*
2. rm -rf /var/db/pkgin
3. pkg_add http://cdn.netbsd.org/pub/pkgsrc/packages/NetBSD/x86_64/10.0/All/pkgin
4. pkgin up
5. pkgin in git
Then, when the fetch has finished and the system starts to install packages, the panic happens. I could reproduce this more than once.
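The steps above can be collected into a small script. This is only a sketch: REPRO_RUN is a hypothetical guard variable (not part of pkg_add or pkgin), and without it the script just prints the commands. Running it for real wipes /usr/pkg and the pkgin database, exactly as the listed steps do, and needs root.

```shell
#!/bin/sh
# Sketch of the How-To-Repeat steps from this report.
# DESTRUCTIVE when REPRO_RUN=1: removes /usr/pkg/* and /var/db/pkgin.
# REPRO_RUN and repro_steps are hypothetical names used for illustration.

PKGIN_URL="http://cdn.netbsd.org/pub/pkgsrc/packages/NetBSD/x86_64/10.0/All/pkgin"

# Print the commands from the report, one per line, in order.
repro_steps() {
    cat <<EOF
rm -rf /usr/pkg/*
rm -rf /var/db/pkgin
pkg_add $PKGIN_URL
pkgin up
pkgin in git
EOF
}

if [ "${REPRO_RUN:-0}" = "1" ]; then
    # -e stops at the first failing step, -x echoes each command.
    # Passing the steps with -c keeps stdin on the terminal, so pkgin
    # can still ask for confirmation.
    sh -e -x -c "$(repro_steps)"
else
    # Default: only show what would be run.
    repro_steps
fi
```

According to the report, the panic is expected during the last step, once the fetch has finished and package installation starts.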
>Fix:
Switched over to GENERIC kernel:
NetBSD U755408N2 10.0_BETA NetBSD 10.0_BETA (GENERIC) #0: Tue May 30 07:21:22 UTC 2023 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
This works fine as expected.
>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57493: UFS_DIRHASH panic during installation
Date: Fri, 30 Jun 2023 13:45:10 +0200
What is different between your kernel config and GENERIC?
Why did you think a switch to GENERIC would fix the issue?
You also switched between -current and -10, what part of that do you
think would cause a difference here?
Martin
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57493: UFS_DIRHASH panic during installation
Date: Fri, 30 Jun 2023 13:52:26 +0200
Can you please also provide file system details, like:
dumpfs / | head -27
(please adjust / if /usr/pkg is on some other file system than / )
Thanks,
Martin
From: Fekete Zoltán <fekete.zoltan@minux.hu>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57493: UFS_DIRHASH panic during installation
Date: Fri, 30 Jun 2023 14:09:23 +0200
Please find my config file below.
I do not know which particular part causes the difference; this is
just my experience. I switched over to GENERIC 10_BETA, and the issue does
not appear.
Should I do an experiment with GENERIC config in 10.99.4? Just to
exclude a possible problem with my own config?
https://drive.google.com/file/d/1JdfbxI1qnOQmKA440xEAVCKBgTam109p/view?usp=sharing
On 2023-06-30 13:50, Martin Husemann wrote:
> The following reply was made to PR kern/57493; it has been noted by
> GNATS.
>
> From: Martin Husemann <martin@duskware.de>
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: kern/57493: UFS_DIRHASH panic during installation
> Date: Fri, 30 Jun 2023 13:45:10 +0200
>
> What is different between your kernel config and GENERIC?
> Why did you think a switch to GENERIC would fix the issue?
> You also switched between -current and -10, what part of that do you
> think would cause a difference here?
>
> Martin
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57493: UFS_DIRHASH panic during installation
Date: Fri, 30 Jun 2023 15:07:43 +0200
So your kernel does *NOT* have options UFS_DIRHASH at all, while GENERIC
does have it.
I think this gives us enough info to review the code.
Martin
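A quick way to check a kernel config file for an enabled option is sketched below. config_has_option is a hypothetical helper, and the config path in the usage comment is an assumption based on the usual source layout. The check only looks at the given file: it does not follow include directives or honor "no options" lines, so it is a first hint, not a definitive answer.

```shell
#!/bin/sh
# config_has_option FILE OPTION
# Succeeds if FILE contains an uncommented "options OPTION" line.
# Hypothetical helper; does not follow "include" or "no options".
config_has_option() {
    grep -Eq "^[[:space:]]*options[[:space:]]+$2([[:space:]]|\$)" "$1"
}

# Usage (path is an assumption, adjust to where the config lives):
#   if config_has_option /usr/src/sys/arch/amd64/conf/TPL380 UFS_DIRHASH; then
#       echo "UFS_DIRHASH is enabled"
#   else
#       echo "UFS_DIRHASH is not enabled in this file"
#   fi
```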
From: Fekete Zoltán <fekete.zoltan@minux.hu>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57493: UFS_DIRHASH panic during installation
Date: Fri, 30 Jun 2023 14:12:09 +0200
Here you are:
# dumpfs / | head -27
file system: /dev/rdk2
format FFSv2
endian little-endian
location 65536 (-b 128)
magic 19540119 time Fri Jun 30 14:10:00 2023
superblock location 65536 id [ 647b4551 77d572f1 ]
nbfree 2226630 ndir 75204 nifree 7399948 nffree 33165
ncg 355 size 33554432 blocks 32529149
bsize 16384 shift 14 mask 0xffffc000
fsize 2048 shift 11 mask 0xfffff800
frag 8 shift 3 fsbtodb 2
bpg 11815 fpg 94520 ipg 22976
minfree 5% optim time maxcontig 4 maxbpg 2048
symlinklen 120 contigsumsize 4
maxfilesize 0x000080100202ffff
nindir 2048 inopb 64
avgfilesize 16384 avgfpdir 64
sblkno 40 cblkno 48 iblkno 56 dblkno 2928
sbsize 2048 cgsize 16384
csaddr 2928 cssize 6144
cgrotor 0 fmod 0 ronly 0 clean 0x02
wapbl version 0x1 location 2 flags 0x0
wapbl loc0 66999040 loc1 127040 loc2 512 loc3 7
usrquota 0 grpquota 0
flags wapbl
fsmnt /
volname swuid 0
/usr/pkg is within this filesystem.
Regards,
FeZ
On 2023-06-30 13:55, Martin Husemann wrote:
> The following reply was made to PR kern/57493; it has been noted by
> GNATS.
>
> From: Martin Husemann <martin@duskware.de>
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: kern/57493: UFS_DIRHASH panic during installation
> Date: Fri, 30 Jun 2023 13:52:26 +0200
>
> Can you please also provide file system details, like:
>
> dumpfs / | head -27
>
> (please adjust / if /usr/pkg is on some other file system than / )
>
> Thanks,
>
> Martin
From: Fekete Zoltán <fekete.zoltan@minux.hu>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57493: UFS_DIRHASH panic during installation
Date: Fri, 30 Jun 2023 16:05:37 +0200
Oh, I'm sorry, Martin, my mistake.
The config file I attached was my last experiment, when I switched off
UFS_DIRHASH and UFS_EXTATTR. In this case the system did not even boot.
I've just forgotten about this and attached the file blindly.
So, the panic this PR is about happened when both options were ON, just
like in the GENERIC config. Please switch both UFS_DIRHASH and
UFS_EXTATTR ON to reproduce.
Sorry for the confusion again, and thank you for taking care.
FeZ
On 2023-06-30 15:10, Martin Husemann wrote:
> The following reply was made to PR kern/57493; it has been noted by
> GNATS.
>
> From: Martin Husemann <martin@duskware.de>
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: kern/57493: UFS_DIRHASH panic during installation
> Date: Fri, 30 Jun 2023 15:07:43 +0200
>
> So your kernel does *NOT* have options UFS_DIRHASH at all, while GENERIC
> does have it.
>
> I think this gives us enough info to review the code.
>
> Martin
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57493: UFS_DIRHASH panic during installation
Date: Fri, 30 Jun 2023 16:10:51 +0200
I locally built a kernel from your TPL380 config file and (as expected)
it does not have any ufs_dirhash* symbol in it:
[TPL380] martin@seven-days-to-the-wolves > nm netbsd | fgrep ufsdirhash
[TPL380] martin@seven-days-to-the-wolves >
So... something is wrong with your local kernel build, can you rebuild
it from scratch?
Martin
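The symbol check shown in the message above can be wrapped so that it reports a count instead of an empty listing. This is a minimal sketch; count_dirhash_symbols is a hypothetical helper name, and it reads the symbol listing on stdin so it works with any nm output.

```shell
#!/bin/sh
# Reads a symbol listing (for example the output of `nm netbsd`) on
# stdin and prints how many lines mention ufsdirhash.
# count_dirhash_symbols is a hypothetical helper name.
count_dirhash_symbols() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # mask that status so callers running under `set -e` do not abort.
    grep -c ufsdirhash || true
}

# Usage, from the kernel compile directory:
#   nm netbsd | count_dirhash_symbols
```

A count of 0 corresponds to the empty output in the message above, i.e. a kernel built without the dirhash code.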
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57493: UFS_DIRHASH panic during installation
Date: Fri, 30 Jun 2023 16:13:07 +0200
On Fri, Jun 30, 2023 at 02:10:02PM +0000, Fekete Zoltán wrote:
> The config file I attached was my last experiment, when I switched off
> UFS_DIRHASH and UFS_EXTATTR. In this case the system did not even boot.
> I've just forgotten about this and attached the file blindly.
OK, so now we are back at not knowing whether it is about GENERIC vs. TPL380
or -current vs. -10.
Can you please test again with a GENERIC -current kernel?
Martin
From: Fekete Zoltán <fekete.zoltan@minux.hu>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57493: UFS_DIRHASH panic during installation
Date: Fri, 30 Jun 2023 17:34:51 +0200
I can confirm that the panic also occurs when using the GENERIC -current
kernel.
I can also confirm that after the event (after reboot) the filesystem is
corrupted.
Some files are missing/unavailable or undeletable.
Trace messages:
https://drive.google.com/file/d/1h2-TbtX0JYf_H7gTC5fZG_KVZM-8sNrA/view?usp=sharing
https://drive.google.com/file/d/1rbSJeSbcpe_8lnoE5MmPqIvhUWery787/view?usp=sharing
https://drive.google.com/file/d/15WYYEApe8lBLoUGuEAQhTdlhsLI_WxZ7/view?usp=sharing
FeZ
On 2023-06-30 16:15, Martin Husemann wrote:
> The following reply was made to PR kern/57493; it has been noted by
> GNATS.
>
> From: Martin Husemann <martin@duskware.de>
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: kern/57493: UFS_DIRHASH panic during installation
> Date: Fri, 30 Jun 2023 16:13:07 +0200
>
> On Fri, Jun 30, 2023 at 02:10:02PM +0000, Fekete Zoltán wrote:
> > The config file I attached was my last experiment, when I switched off
> > UFS_DIRHASH and UFS_EXTATTR. In this case the system did not even boot.
> > I've just forgotten about this and attached the file blindly.
>
> OK, so now we are back at not knowing whether it is about GENERIC vs. TPL380
> or -current vs. -10.
>
> Can you please test again with a GENERIC -current kernel?
>
> Martin
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.