NetBSD Problem Report #56421
From Manuel.Bouyer@lip6.fr Wed Sep 29 12:33:43 2021
Return-Path: <Manuel.Bouyer@lip6.fr>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 97FC61A921F
for <gnats-bugs@gnats.NetBSD.org>; Wed, 29 Sep 2021 12:33:43 +0000 (UTC)
Message-Id: <20210929123327.E19616B02@armandeche.soc.lip6.fr>
Date: Wed, 29 Sep 2021 14:33:27 +0200 (MEST)
From: Manuel.Bouyer@lip6.fr
Reply-To: Manuel.Bouyer@lip6.fr
To: gnats-bugs@NetBSD.org
Subject: panic: ffs_blkfree: bad size, fsck doesn't fix it
X-Send-Pr-Version: 3.95
>Number: 56421
>Category: kern
>Synopsis: panic: ffs_blkfree: bad size, fsck doesn't fix it
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Sep 29 12:35:00 +0000 2021
>Last-Modified: Fri Oct 01 05:36:51 +0000 2021
>Originator: Manuel Bouyer
>Release: NetBSD 9.2_STABLE
>Organization:
>Environment:
System: NetBSD armandeche.soc.lip6.fr 9.2_STABLE NetBSD 9.2_STABLE (GENERIC) #0: Thu Sep 23 10:13:28 UTC 2021 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
I was running pbulk on a WAPBL-enabled filesystem. I ended up with a
stuck, unkillable rm command: 100% CPU, doesn't react to kill or
kill -9, ktrace -p shows no activity. I guess it was stuck in a loop
in the kernel. This was with an older 9.0_STABLE kernel, so maybe this
specific issue has been fixed since then.
After a power cycle, restarting pbulk would panic the kernel with:
[ 233.080282] panic: ffs_blkfree: bad size: dev = 0x14, bno = 54274495 bsize = 32768, size = 28672, fs = /local/armandeche2
[ 233.080282] cpu0: Begin traceback...
[ 233.080282] vpanic() at netbsd:vpanic+0x160
[ 233.080282] snprintf() at netbsd:snprintf
[ 233.080282] ffs_mapsearch() at netbsd:ffs_mapsearch
[ 233.080282] ffs_blkfree() at netbsd:ffs_blkfree+0x82
[ 233.080282] ffs_truncate() at netbsd:ffs_truncate+0xb7e
[ 233.080282] ufs_rmdir() at netbsd:ufs_rmdir+0x276
[ 233.080282] VOP_RMDIR() at netbsd:VOP_RMDIR+0x50
[ 233.080282] do_sys_unlinkat.isra.6() at netbsd:do_sys_unlinkat.isra.6+0x1a9
[ 233.090286] syscall() at netbsd:syscall+0x157
[ 233.090286] --- syscall (number 137) ---
(This one was with WAPBL disabled; the stack trace was slightly different
with WAPBL.) This is 100% reproducible. The issue here is that
fsck doesn't find any problem with the filesystem:
fsck -fy /local/armandeche2
** /dev/rwd1e
** File system is already clean
** Last Mounted on /local/armandeche2
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
7464502 files, 77168922 used, 43969433 free (216449 frags, 5469123 blocks, 0.2% fragmentation)
So the filesystem remains in an inconsistent state that fsck neither
detects nor repairs.
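
The numbers in the panic message and in the fsck summary can be
cross-checked with a little arithmetic. The sketch below is only an
illustration, not kernel code. It assumes that the "free" figure printed
by fsck is counted in fragments, and that the kernel rejects a free
request whose fragments would run past the end of the block containing
bno; both are assumptions on my side, and the helper names
frags_per_block and bad_size are made up for this example. Under those
assumptions the filesystem has 8 fragments per block (fsize 4096), and
the request to free 7 fragments starting at fragment 7 of its block
cannot fit in one block.

```python
# Values copied from the panic message and the fsck summary in this report.
BSIZE = 32768          # filesystem block size ("bsize") from the panic
SIZE = 28672           # size passed to ffs_blkfree, from the panic
BNO = 54274495         # fragment address ("bno") from the panic

FREE_TOTAL = 43969433  # "free" from fsck, assumed to be counted in fragments
FREE_FRAGS = 216449    # loose free fragments from fsck
FREE_BLOCKS = 5469123  # whole free blocks from fsck


def frags_per_block(free_total, free_frags, free_blocks):
    """Infer the fragments-per-block ratio from the fsck summary line."""
    frag, rem = divmod(free_total - free_frags, free_blocks)
    if rem != 0:
        raise ValueError("fsck counters are not consistent with each other")
    return frag


def bad_size(bno, size, bsize, frag):
    """Assumed paraphrase of the sanity check behind the panic (hypothetical)."""
    fsize = bsize // frag
    return (size > bsize                          # more than one block
            or size % fsize != 0                  # not a whole number of frags
            or bno % frag + size // fsize > frag) # runs past the block end


frag = frags_per_block(FREE_TOTAL, FREE_FRAGS, FREE_BLOCKS)
start = BNO % frag                # position of bno inside its block
nfrags = SIZE // (BSIZE // frag)  # number of fragments being freed

print(f"frag={frag} fsize={BSIZE // frag} start={start} nfrags={nfrags}")
print("bad size:", bad_size(BNO, SIZE, BSIZE, frag))
```

If the assumptions hold, this suggests the directory being removed
records a block pointer and a size that do not agree with each other,
which might be a useful thing for fsck to check.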
>How-To-Repeat:
Not sure, it seems to be a "bad luck" issue
>Fix:
unknown
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: gnats-admin->kern-bug-people
Responsible-Changed-By: spz@NetBSD.org
Responsible-Changed-When: Fri, 01 Oct 2021 05:36:51 +0000
Responsible-Changed-Why:
kern issue
>Unformatted: