NetBSD Problem Report #53265
From gson@gson.org Sun May 6 12:03:53 2018
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 58E0B7A272
for <gnats-bugs@gnats.NetBSD.org>; Sun, 6 May 2018 12:03:53 +0000 (UTC)
Message-Id: <20180506120347.44B6898B44B@guava.gson.org>
Date: Sun, 6 May 2018 15:03:47 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: panic in bnx_detach() on shutdown
X-Send-Pr-Version: 3.95
>Number: 53265
>Category: kern
>Synopsis: panic in bnx_detach() on shutdown
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: msaitoh
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun May 06 12:05:00 +0000 2018
>Closed-Date: Tue May 08 04:15:34 +0000 2018
>Last-Modified: Wed May 09 14:55:01 +0000 2018
>Originator: Andreas Gustafsson
>Release: NetBSD-current, source date 2018.05.04.14.15.41
>Organization:
>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:
Seen on the serial console shutting down an 8-core amd64 machine
running a fresh -current:
[ 63426.6135600] syncing disks... done
[ 63428.0241184] cd0: detached
[ 63428.0608678] brgphy3: detached
[ 63428.0983665] brgphy2: detached
[ 63428.1358664] brgphy1: detached
[ 63428.1733649] brgphy0: detached
[ 63428.2108644] atapibus0: detached
[ 63428.2442055] uhub5: detached
[ 63428.2842215] uhub4: detached
[ 63428.3142328] uhub2: detached
[ 63428.3542487] uhub1: detached
[ 63428.3842605] uhub0: detached
[ 63428.4242764] com1: detached
[ 63428.4642922] bnx3: detached
[ 63428.5043087] bnx2: detached
[ 63428.5443239] Skipping crash dump on recursive panic
[ 63428.5943436] panic: kernel diagnostic assertion "c->c_cpu->cc_lwp == curlwp || c->c_cpu->cc_active != c" failed: file "/tmp/bracket/build/2018.05.04.14.15\
.41-amd64-debug-baremetal/src/sys/kern/kern_timeout.c", line 318
[ 63428.8344384] cpu7: Begin traceback...
[ 63428.8744542] vpanic() at netbsd:vpanic+0x16f
[ 63428.9344780] ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
[ 63429.0045057] callout_destroy() at netbsd:callout_destroy+0x75
[ 63429.0745334] bnx_detach() at netbsd:bnx_detach+0xbb
[ 63429.1345572] config_detach() at netbsd:config_detach+0x121
[ 63429.2045849] config_detach_all() at netbsd:config_detach_all+0x97
[ 63429.2746126] cpu_reboot() at netbsd:cpu_reboot+0x19a
[ 63429.3346364] sys_reboot() at netbsd:sys_reboot+0x85
[ 63429.3946602] syscall() at netbsd:syscall+0x208
[ 63429.4446800] --- syscall (number 208) ---
[ 63429.4946998] 74ed2443ebda:
[ 63429.5347157] cpu7: End traceback...
[ 63429.5847356] rebooting...
No harm done, but it's a bug nonetheless...
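The failed assertion can be read as a predicate over three fields named in its text (c_cpu, cc_lwp, cc_active). The following is a minimal user-space sketch of that predicate, not the kernel code: the model_* types and model_destroy_is_safe() are hypothetical names introduced here, and the field meanings given in the comments are an interpretation of the assertion rather than something stated in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's structures. */
struct model_lwp {
	int id;
};

struct model_callout;

struct model_callout_cpu {
	struct model_lwp *cc_lwp;        /* assumed: LWP running callout handlers */
	struct model_callout *cc_active; /* assumed: handler executing now, or NULL */
};

struct model_callout {
	struct model_callout_cpu *c_cpu;
};

/*
 * Same shape as the KASSERT in the panic message: destroying is
 * acceptable if the caller is the handler LWP itself, or if this
 * callout is not the one currently executing.
 */
static bool
model_destroy_is_safe(const struct model_callout *c,
    const struct model_lwp *curlwp)
{
	return c->c_cpu->cc_lwp == curlwp || c->c_cpu->cc_active != c;
}
```

Under this reading, the predicate is false only when some other LWP (here, the one performing the shutdown) destroys a callout while its handler is still executing, which would be consistent with the panic being rare.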
>How-To-Repeat:
Only happened once so far.
>Fix:
>Release-Note:
>Audit-Trail:
From: Masanobu SAITOH <msaitoh@execsw.org>
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: msaitoh@execsw.org
Subject: Re: kern/53265: panic in bnx_detach() on shutdown
Date: Mon, 7 May 2018 19:09:01 +0900
On 2018/05/06 21:05, Andreas Gustafsson wrote:
>> Number: 53265
>> Category: kern
>> Synopsis: panic in bnx_detach() on shutdown
>> Confidential: no
>> Severity: non-critical
>> Priority: low
>> Responsible: kern-bug-people
>> State: open
>> Class: sw-bug
>> Submitter-Id: net
>> Arrival-Date: Sun May 06 12:05:00 +0000 2018
>> Originator: Andreas Gustafsson
>> Release: NetBSD-current, source date 2018.05.04.14.15.41
>> Organization:
>
>> Environment:
> System: NetBSD
> Architecture: x86_64
> Machine: amd64
>> Description:
>
> Seen on the serial console shutting down an 8-core amd64 machine
> running a fresh -current:
>
> [ 63426.6135600] syncing disks... done
> [ 63428.0241184] cd0: detached
> [ 63428.0608678] brgphy3: detached
> [ 63428.0983665] brgphy2: detached
> [ 63428.1358664] brgphy1: detached
> [ 63428.1733649] brgphy0: detached
> [ 63428.2108644] atapibus0: detached
> [ 63428.2442055] uhub5: detached
> [ 63428.2842215] uhub4: detached
> [ 63428.3142328] uhub2: detached
> [ 63428.3542487] uhub1: detached
> [ 63428.3842605] uhub0: detached
> [ 63428.4242764] com1: detached
> [ 63428.4642922] bnx3: detached
> [ 63428.5043087] bnx2: detached
> [ 63428.5443239] Skipping crash dump on recursive panic
> [ 63428.5943436] panic: kernel diagnostic assertion "c->c_cpu->cc_lwp == curlwp || c->c_cpu->cc_active != c" failed: file "/tmp/bracket/build/2018.05.04.14.15\
> .41-amd64-debug-baremetal/src/sys/kern/kern_timeout.c", line 318
> [ 63428.8344384] cpu7: Begin traceback...
> [ 63428.8744542] vpanic() at netbsd:vpanic+0x16f
> [ 63428.9344780] ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
> [ 63429.0045057] callout_destroy() at netbsd:callout_destroy+0x75
> [ 63429.0745334] bnx_detach() at netbsd:bnx_detach+0xbb
> [ 63429.1345572] config_detach() at netbsd:config_detach+0x121
> [ 63429.2045849] config_detach_all() at netbsd:config_detach_all+0x97
> [ 63429.2746126] cpu_reboot() at netbsd:cpu_reboot+0x19a
> [ 63429.3346364] sys_reboot() at netbsd:sys_reboot+0x85
> [ 63429.3946602] syscall() at netbsd:syscall+0x208
> [ 63429.4446800] --- syscall (number 208) ---
> [ 63429.4946998] 74ed2443ebda:
> [ 63429.5347157] cpu7: End traceback...
> [ 63429.5847356] rebooting...
>
> No harm done, but it's a bug nonetheless..
Even if you do "shutdown -h", the machine does not halt; it reboots instead.
>> How-To-Repeat:
>
> Only happened once so far.
>
>> Fix:
>
How often does it panic on shutdown? Could you test the following patch
to verify the problem is fixed?
---------------------------
- Fix a bug that bnx(4) panics on shutdown. Reported by Andreas Gustafsson in
PR#53265.
- Make sure not to re-arm the callout when we are about to detach. Same as
if_bge.c rev. 1.292.
- Use pci_intr_establish_xname().
---------------------------
Index: if_bnxvar.h
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/if_bnxvar.h,v
retrieving revision 1.6
diff -u -p -r1.6 if_bnxvar.h
--- if_bnxvar.h 1 Jul 2014 17:11:35 -0000 1.6
+++ if_bnxvar.h 7 May 2018 10:03:56 -0000
@@ -210,6 +210,7 @@ struct bnx_softc
uint32_t tx_prod_bseq; /* Counts the bytes used. */
struct callout bnx_timeout;
+ int bnx_detaching;
/* Frame size and mbuf allocation size for RX frames. */
uint32_t max_frame_size;
Index: if_bnx.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/if_bnx.c,v
retrieving revision 1.63
diff -u -p -r1.63 if_bnx.c
--- if_bnx.c 8 Feb 2018 09:05:19 -0000 1.63
+++ if_bnx.c 7 May 2018 10:03:59 -0000
@@ -792,7 +792,8 @@ bnx_attach(device_t parent, device_t sel
IFCAP_CSUM_UDPv4_Tx | IFCAP_CSUM_UDPv4_Rx;
/* Hookup IRQ last. */
- sc->bnx_intrhand = pci_intr_establish(pc, ih, IPL_NET, bnx_intr, sc);
+ sc->bnx_intrhand = pci_intr_establish_xname(pc, ih, IPL_NET, bnx_intr,
+ sc, device_xname(self));
if (sc->bnx_intrhand == NULL) {
aprint_error_dev(self, "couldn't establish interrupt");
if (intrstr != NULL)
@@ -890,17 +891,7 @@ bnx_detach(device_t dev, int flags)
/* Stop and reset the controller. */
s = splnet();
- if (ifp->if_flags & IFF_RUNNING)
- bnx_stop(ifp, 1);
- else {
- /* Disable the transmit/receive blocks. */
- REG_WR(sc, BNX_MISC_ENABLE_CLR_BITS, 0x5ffffff);
- REG_RD(sc, BNX_MISC_ENABLE_CLR_BITS);
- DELAY(20);
- bnx_disable_intr(sc);
- bnx_reset(sc, BNX_DRV_MSG_CODE_RESET);
- }
-
+ bnx_stop(ifp, 1);
splx(s);
pmf_device_deregister(dev);
@@ -3371,10 +3362,11 @@ bnx_stop(struct ifnet *ifp, int disable)
DBPRINT(sc, BNX_VERBOSE_RESET, "Entering %s()\n", __func__);
- if ((ifp->if_flags & IFF_RUNNING) == 0)
- return;
-
- callout_stop(&sc->bnx_timeout);
+ if (disable) {
+ sc->bnx_detaching = 1;
+ callout_halt(&sc->bnx_timeout, NULL);
+ } else
+ callout_stop(&sc->bnx_timeout);
mii_down(&sc->bnx_mii);
@@ -5694,9 +5686,6 @@ bnx_tick(void *xsc)
/* Update the statistics from the hardware statistics block. */
bnx_stats_update(sc);
- /* Schedule the next tick. */
- callout_reset(&sc->bnx_timeout, hz, bnx_tick, sc);
-
mii = &sc->bnx_mii;
mii_tick(mii);
@@ -5707,6 +5696,11 @@ bnx_tick(void *xsc)
bnx_get_buf(sc, &prod, &chain_prod, &prod_bseq);
sc->rx_prod = prod;
sc->rx_prod_bseq = prod_bseq;
+
+ /* Schedule the next tick. */
+ if (!sc->bnx_detaching)
+ callout_reset(&sc->bnx_timeout, hz, bnx_tick, sc);
+
splx(s);
return;
}
The same diff is at:
http://www.netbsd.org/~msaitoh/bnx-20180507-0.dif
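The ordering that the patch aims for can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: struct model_softc and the model_* functions are hypothetical, a boolean stands in for a scheduled callout, and the waiting behaviour of callout_halt() is not modelled at all. It shows only why a tick that runs late must not re-arm the timer once detaching is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the driver state; not the real bnx_softc. */
struct model_softc {
	bool timeout_pending; /* stands in for a scheduled callout */
	bool detaching;       /* plays the role of bnx_detaching */
	int ticks_run;
};

/* Periodic handler: does its work, then re-arms unless detaching. */
static void
model_tick(struct model_softc *sc)
{
	sc->timeout_pending = false; /* the callout has fired */
	sc->ticks_run++;
	if (!sc->detaching)
		sc->timeout_pending = true; /* schedule the next tick */
}

/* Stop path: mark detaching first when disabling, then cancel. */
static void
model_stop(struct model_softc *sc, bool disable)
{
	if (disable)
		sc->detaching = true;
	sc->timeout_pending = false;
}

/* Destroying is only acceptable when nothing is scheduled. */
static bool
model_can_destroy(const struct model_softc *sc)
{
	return !sc->timeout_pending;
}
```

In this model, calling model_stop(sc, true) and then letting one more model_tick() run still leaves model_can_destroy() true, whereas without the detaching check the late tick would schedule itself again.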
--
-----------------------------------------------
SAITOH Masanobu (msaitoh@execsw.org
msaitoh@netbsd.org)
Responsible-Changed-From-To: kern-bug-people->msaitoh
Responsible-Changed-By: msaitoh@NetBSD.org
Responsible-Changed-When: Mon, 07 May 2018 10:11:38 +0000
Responsible-Changed-Why:
mine.
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53265: panic in bnx_detach() on shutdown
Date: Mon, 7 May 2018 17:42:32 +0300
Masanobu SAITOH wrote:
> How often does it panic on shutdown?
It has only happened once.
> Could you test the following patch
> to verify the problem is fixed?
I tested the patch and was able to shut down the system without a
panic, and did not notice any other problems, either. Since I have
also been able to shut it down without a panic many times without the
patch, this isn't conclusive proof that the problem is fixed, but at
least the patch appears not to break anything.
--
Andreas Gustafsson, gson@gson.org
State-Changed-From-To: open->closed
State-Changed-By: msaitoh@NetBSD.org
State-Changed-When: Tue, 08 May 2018 04:15:34 +0000
State-Changed-Why:
Fixed. Thanks.
From: Masanobu SAITOH <msaitoh@execsw.org>
To: gnats-bugs@NetBSD.org, msaitoh@NetBSD.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, Andreas Gustafsson <gson@gson.org>
Cc: msaitoh@execsw.org
Subject: Re: kern/53265: panic in bnx_detach() on shutdown
Date: Tue, 8 May 2018 13:13:55 +0900
On 2018/05/07 23:45, Andreas Gustafsson wrote:
> The following reply was made to PR kern/53265; it has been noted by GNATS.
>
> From: Andreas Gustafsson <gson@gson.org>
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: kern/53265: panic in bnx_detach() on shutdown
> Date: Mon, 7 May 2018 17:42:32 +0300
>
> Masanobu SAITOH wrote:
> > How often does it panic on shutdown?
>
> It has only happened once.
>
> > Could you test the following patch
> > to verify the problem is fixed?
>
> I tested the patch and was able to shut down the system without a
> panic, and did not notice any other problems, either. Since I have
> also been able to shut it down without a panic many times without the
> patch, this isn't conclusive proof that the problem is fixed
Destroying a callout without stopping it is a bug, and the stack trace
says so. I committed the change and it won't happen anymore.
Thank you for the report.
> , but at
> least the patch appears not to break anything.
> --
> Andreas Gustafsson, gson@gson.org
>
>
--
-----------------------------------------------
SAITOH Masanobu (msaitoh@execsw.org
msaitoh@netbsd.org)
From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/53265 CVS commit: src/sys/dev/pci
Date: Tue, 8 May 2018 04:11:10 +0000
Module Name: src
Committed By: msaitoh
Date: Tue May 8 04:11:10 UTC 2018
Modified Files:
src/sys/dev/pci: if_bnx.c if_bnxvar.h
Log Message:
- Fix a bug that bnx(4) panics on shutdown. Stop callout before destroy.
Reported by Andreas Gustafsson in PR#53265.
- Make sure not to re-arm the callout when we are about to detach. Same as
if_bge.c rev. 1.292.
- Use pci_intr_establish_xname().
To generate a diff of this commit:
cvs rdiff -u -r1.63 -r1.64 src/sys/dev/pci/if_bnx.c
cvs rdiff -u -r1.6 -r1.7 src/sys/dev/pci/if_bnxvar.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/53265 CVS commit: [netbsd-8] src/sys/dev/pci
Date: Wed, 9 May 2018 14:52:40 +0000
Module Name: src
Committed By: martin
Date: Wed May 9 14:52:40 UTC 2018
Modified Files:
src/sys/dev/pci [netbsd-8]: if_bnx.c if_bnxvar.h
Log Message:
Pull up following revision(s) (requested by msaitoh in ticket #814):
sys/dev/pci/if_bnxvar.h: revision 1.7
sys/dev/pci/if_bnx.c: revision 1.64
- Fix a bug that bnx(4) panics on shutdown. Stop callout before destroy.
Reported by Andreas Gustafsson in PR#53265.
- Make sure not to re-arm the callout when we are about to detach. Same as
if_bge.c rev. 1.292.
- Use pci_intr_establish_xname().
To generate a diff of this commit:
cvs rdiff -u -r1.61.8.1 -r1.61.8.2 src/sys/dev/pci/if_bnx.c
cvs rdiff -u -r1.6 -r1.6.22.1 src/sys/dev/pci/if_bnxvar.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted:
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.