NetBSD Problem Report #56705
From hauke@Espresso.Rhein-Neckar.DE Sat Feb 12 17:08:33 2022
Return-Path: <hauke@Espresso.Rhein-Neckar.DE>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id D81491A923C
for <gnats-bugs@gnats.NetBSD.org>; Sat, 12 Feb 2022 17:08:33 +0000 (UTC)
Message-Id: <202202121707.21CH7wMv006526@pizza.causeuse.org>
Date: Sat, 12 Feb 2022 18:07:58 +0100 (CET)
From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Reply-To: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@NetBSD.org
Cc: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Subject: wapbl lockdebug panic during tcpdump run
X-Send-Pr-Version: 3.95
>Number: 56705
>Category: kern
>Synopsis: wapbl lockdebug panic during tcpdump run
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Feb 12 17:10:00 +0000 2022
>Last-Modified: Tue Feb 15 21:55:01 +0000 2022
>Originator: Hauke Fath
>Release: NetBSD 9.99.93
>Organization:
Falling Raindrops
>Environment:
System: NetBSD pizza 9.99.93 NetBSD 9.99.93 (BLACKBOX-$Revision: 1.85 $) #5: Fri Feb 11 21:11:10 CET 2022 hauke@pizza:/var/obj/netbsd-build-objects/developer/amd64/sys/arch/amd64/compile/BLACKBOX amd64
Architecture: x86_64
Machine: amd64
>Description:
I am trying to debug a client machine's hang during TCP
transfers (here: an FTP session) by running tcpdump on the
target (FTP server) machine. Unfortunately, tcpdump
frequently aborts with a 'no permission' error, and every few
attempts the machine panics with a lockdebug error in a wapbl
write:
<ftp://ftp.causeuse.org/pub/NetBSD/tcpdump-panic.gif>
The USB console keyboard is dead at that point, and the
machine swaps to a RAIDframe mirror, which cannot be dumped to.
Strangely enough, the directory that tcpdump writes the pcap
file to is on ZFS, and so is the directory being FTP'ed to.
>How-To-Repeat:
tcpdump an incoming ftp transfer on the ftp server side. Watch
the machine panic on (roughly) every fifth attempt.
>Fix:
Yes, please.
>Audit-Trail:
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
Date: Sat, 12 Feb 2022 17:16:44 +0000
[resending to cc gnats-bugs]
I bet wapbl is a red herring.
Can you do `show panic'? If it says `kernel lock spinout', that means
that something else was hogging the kernel lock and the attempt to
acquire it in bdev_strategy via wapbl happens to be the one that got
bored of waiting and panicked.
Can you show `ps' output, and then `bt/a ffff...' for all of the lines
with `>' on them?
Can you get a crash dump?
From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
Date: Sat, 12 Feb 2022 18:28:14 +0100
On Sat, 12 Feb 2022 17:20:01 +0000 (UTC), Taylor R Campbell wrote:
> Can you show [...]
As mentioned, the USB console keyboard is dead at that point. I have
found and attached a PS/2 keyboard (fortunately, the board is that
traditional), and am working on reproducing the panic.
> Can you get a crash dump?
swap on raid0b, so no.
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
Date: Sat, 12 Feb 2022 19:11:34 +0100
On Sat, Feb 12, 2022 at 05:30:02PM +0000, Hauke Fath wrote:
> > Can you get a crash dump?
>
> swap on raid0b, so no.
Why "so no"? Have you tried?
Martin
From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org
Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
Date: Sat, 12 Feb 2022 22:21:43 +0100
On Sat, 12 Feb 2022 18:15:01 +0000 (UTC), Martin Husemann wrote:
> On Sat, Feb 12, 2022 at 05:30:02PM +0000, Hauke Fath wrote:
> > > Can you get a crash dump?
> >
> > swap on raid0b, so no.
>
> Why "so no"? Have you tried?
Well, the machine tried (and failed)... and a quick search comes up
with <https://marc.info/?l=netbsd-port-i386&m=109042024503576>.
From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: Greg Oster <oster@netbsd.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org
Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
Date: Sat, 12 Feb 2022 22:55:47 +0100
On Sat, 12 Feb 2022 15:35:16 -0600, Greg Oster wrote:
> kernel core dumps to swap on RAID 1 sets should be working. If they
> aren't, that's a bug. (initial implementation was in 2007, with some
> fixes in 2016... but perhaps crash dumps to RAID 1 swap isn't as
> widely advertised as it might otherwise be...)
Good to know, thanks.
The panic appears to leave the machine in bad shape, though; one time I
got recurring panics during network setup (an re(4) with several vlans
on it), until I power-cycled the machine.
From: Joerg Sonnenberger <joerg@bec.de>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
Date: Sat, 12 Feb 2022 23:36:08 +0100
On Sat, Feb 12, 2022 at 05:30:02PM +0000, Hauke Fath wrote:
> The following reply was made to PR kern/56705; it has been noted by GNATS.
>
> From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
> To: gnats-bugs@netbsd.org
> Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
> Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
> Date: Sat, 12 Feb 2022 18:28:14 +0100
>
> On Sat, 12 Feb 2022 17:20:01 +0000 (UTC), Taylor R Campbell wrote:
> > Can you show [...]
>
> As mentioned, the USB console keyboard is dead at that point. I have
> found and attached a PS/2 keyboard (fortunately, the board is that
> traditional), and am working on reproducing the panic.
ddb.commandonenter can be used if you can reproduce it.
Joerg
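A minimal sketch of how ddb.commandonenter might be set ahead of time,
so that the output requested earlier in this thread appears on the
console even when the keyboard is dead. The command list is an
assumption chosen for illustration; `bt/a' with specific addresses
cannot be preset, because the addresses are only known after `ps' has
run. The uname check only keeps the snippet harmless on other systems.

```sh
#!/bin/sh
# Commands for ddb to run each time it is entered, separated by
# semicolons. This particular list is an assumption for illustration.
ddb_cmds="show panic;ps;bt"

if [ "$(uname -s)" = "NetBSD" ]; then
    # Needs root; takes effect the next time ddb is entered.
    sysctl -w ddb.commandonenter="$ddb_cmds"
else
    echo "not NetBSD; would set ddb.commandonenter=$ddb_cmds"
fi
```

The setting does not survive a reboot unless it is also placed in
/etc/sysctl.conf.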
From: Greg Oster <oster@netbsd.org>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Cc:
Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
Date: Sat, 12 Feb 2022 15:35:16 -0600
On 2022-02-12 15:25, Hauke Fath wrote:
> The following reply was made to PR kern/56705; it has been noted by GNATS.
>
> From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
> To: gnats-bugs@NetBSD.org
> Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org
> Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
> Date: Sat, 12 Feb 2022 22:21:43 +0100
>
> On Sat, 12 Feb 2022 18:15:01 +0000 (UTC), Martin Husemann wrote:
> > On Sat, Feb 12, 2022 at 05:30:02PM +0000, Hauke Fath wrote:
> > > > Can you get a crash dump?
> > >
> > > swap on raid0b, so no.
> >
> > Why "so no"? Have you tried?
>
> Well, the machine tried (and failed)... and a quick search comes up
> with <https://marc.info/?l=netbsd-port-i386&m=109042024503576>.
>
kernel core dumps to swap on RAID 1 sets should be working. If they
aren't, that's a bug. (initial implementation was in 2007, with some
fixes in 2016... but perhaps crash dumps to RAID 1 swap isn't as widely
advertised as it might otherwise be...)
Later...
Greg Oster
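A hedged sketch of how the dump device could be inspected and set with
swapctl(8) before the next attempt. The device name /dev/raid0b is
taken from this report; whether a missing dump device is what prevents
the dump here is an assumption, not something the thread confirms.

```sh
#!/bin/sh
# Swap partition named in this report; adjust for other setups.
dump_dev="/dev/raid0b"

if [ "$(uname -s)" = "NetBSD" ]; then
    swapctl -z                # show the currently configured dump device
    swapctl -D "$dump_dev"    # needs root: direct crash dumps to this device
else
    echo "not NetBSD; would run: swapctl -D $dump_dev"
fi
```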
From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org
Subject: Re: kern/56705: wapbl lockdebug panic during tcpdump run
Date: Tue, 15 Feb 2022 22:49:03 +0100
Another panic, finally... a different one.
At 17:20 +0000 on 12.02.2022, Taylor R Campbell wrote:
> Can you do `show panic'? If it says `kernel lock spinout', [...]
It did.
> Can you show `ps' output, and then `bt/a ffff...' for all of the lines
> with `>' on them?
Screenshots at <ftp://ftp.causeuse.org/pub/NetBSD/kern-56705/>. I should
really set up the machine for serial console.
> Can you get a crash dump?
A 'reboot 0x100' resulted in 'bad dumpdev', so apparently not.
Cheerio,
Hauke
--
"It's never straight up and down" (DEVO)
>Unformatted: