NetBSD Problem Report #55745
From hf@spg.tu-darmstadt.de Thu Oct 22 09:02:13 2020
Return-Path: <hf@spg.tu-darmstadt.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 183581A923E
for <gnats-bugs@gnats.NetBSD.org>; Thu, 22 Oct 2020 09:02:13 +0000 (UTC)
Message-Id: <202010220902.09M925Vx001331@Gstoder.nt.e-technik.tu-darmstadt.de>
Date: Thu, 22 Oct 2020 11:02:05 +0200 (CEST)
From: Hauke Fath <hf@spg.tu-darmstadt.de>
Reply-To: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
Subject: spdmem_i2c_match() panics
X-Send-Pr-Version: 3.95
>Number: 55745
>Category: kern
>Synopsis: spdmem_i2c_match() panics
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Oct 22 09:05:00 +0000 2020
>Last-Modified: Wed Nov 18 09:15:01 +0000 2020
>Originator: Hauke Fath
>Release: NetBSD 9.0_STABLE
>Organization:
Technische Universitaet Darmstadt
>Environment:
System: NetBSD Gstoder 9.0_STABLE NetBSD 9.0_STABLE (GA-MA770-UD3-$Revision$) #0: Mon Jul 13 16:36:43 CEST 2020 hf@Hochstuhl:/var/obj/netbsd-builds/9/amd64/sys/arch/amd64/compile/GA-MA770-UD3 amd64
Architecture: x86_64
Machine: amd64
>Description:
For a few months, netbsd-9 kernels on my amd64 desktop
<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3.kconfig>
have been panicking during spdmem auto-configuration:
<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3-panic.jpg>
The above kernel is the last one to boot, its dmesg is here:
<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3_2020_07_13.dmesg>
>How-To-Repeat:
Configure a netbsd-9 kernel to probe spdmem.
>Fix:
Yes, please.
>Audit-Trail:
From: SAITOH Masanobu <msaitoh@execsw.org>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: msaitoh@execsw.org
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Thu, 22 Oct 2020 18:27:46 +0900
On 2020/10/22 18:05, Hauke Fath wrote:
>> Number: 55745
>> Category: kern
>> Synopsis: spdmem_i2c_match() panics
>> Confidential: no
>> Severity: critical
>> Priority: high
>> Responsible: kern-bug-people
>> State: open
>> Class: sw-bug
>> Submitter-Id: net
>> Arrival-Date: Thu Oct 22 09:05:00 +0000 2020
>> Originator: Hauke Fath
>> Release: NetBSD 9.0_STABLE
>> Organization:
> Technische Universitaet Darmstadt
>> Environment:
>
>
> System: NetBSD Gstoder 9.0_STABLE NetBSD 9.0_STABLE (GA-MA770-UD3-$Revision$) #0: Mon Jul 13 16:36:43 CEST 2020 hf@Hochstuhl:/var/obj/netbsd-builds/9/amd64/sys/arch/amd64/compile/GA-MA770-UD3 amd64
> Architecture: x86_64
> Machine: amd64
>> Description:
>
> For a few months, netbsd-9 kernels on my amd64 desktop
> <https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3.kconfig>
>
> have been panicking during spdmem auto-configuration:
> <https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3-panic.jpg>
>
> The above kernel is the last one to boot, its dmesg is here:
> <https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3_2020_07_13.dmesg>
>
>
>
>> How-To-Repeat:
>
> Configure a netbsd-9 kernel to probe spdmem.
>
>
>> Fix:
> Yes, please.
>
>
>> Unformatted:
>
>
>
Your machine's kernel was compiled at 2020/07/13, right?
If so, please test the latest netbsd-9. Some modifications
for netbsd-9 were pulled up at 2020/07/16.
--
-----------------------------------------------
SAITOH Masanobu (msaitoh@execsw.org
msaitoh@netbsd.org)
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Thu, 22 Oct 2020 12:33:28 +0200
On 2020-10-22 11:30, SAITOH Masanobu wrote:
> Your machine's kernel was compiled at 2020/07/13, right?
That one is the latest usable kernel that will run send-pr; I included
its dmesg for machine information.
> If so, please test the latest netbsd-9. Some modifications
> for netbsd-9 were pulled up at 2020/07/16.
Every kernel newer than the above has panicked, the last I tried was
built from 2020-10-20 sources. In fact, I suspect that those pulled-up
modifications you mentioned might be the source of the problem...
Cheerio,
Hauke
--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-21344
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@netbsd.org
Cc: SAITOH Masanobu <msaitoh@execsw.org>, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Thu, 22 Oct 2020 13:05:29 +0200
On 2020-10-22 11:27, SAITOH Masanobu wrote:
> Your machine's kernel was compiled at 2020/07/13, right?
> If so, please test the latest netbsd-9. Some modifications
> for netbsd-9 were pulled up at 2020/07/16.
FTR, a -current kernel of today boots successfully, so it looks like -9
has a regression.
Cheerio,
Hauke
--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-21344
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
=?iso-8859-1?Q?Edgar_Fu=DF?= <ef@math.uni-bonn.de>
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Fri, 13 Nov 2020 20:12:54 +0100
The registers:
<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/kern-55745-registers-1.png>
<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/kern-55745-registers-2.png>
The disassembly of the crashing code:
<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/kern-55745-disassembly.png>
From: Edgar =?iso-8859-1?B?RnXf?= <ef@math.uni-bonn.de>
To: gnats-bugs@netbsd.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Mon, 16 Nov 2020 15:55:19 +0100
--jq0ap7NbKX2Kqbes
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
The problem (at least on -8, didn't check -9) is that sys/dev/pci/piixpm.c 1.52.6.2 installs NULL handlers for ic_{acquire,release}_bus in some cases, but the rest of the I2C code isn't prepared for that. So I guess one needs to install dummy handlers that only acquire/release the i2c mutex.
With that change, I can boot again.
--jq0ap7NbKX2Kqbes
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="piixpm.c.diff"
Index: sys/dev/pci/piixpm.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/piixpm.c,v
retrieving revision 1.52.6.2
diff -u -r1.52.6.2 piixpm.c
--- sys/dev/pci/piixpm.c 5 Aug 2020 16:11:56 -0000 1.52.6.2
+++ sys/dev/pci/piixpm.c 16 Nov 2020 14:35:58 -0000
@@ -125,6 +125,8 @@
static void piixpm_csb5_reset(void *);
static int piixpm_i2c_sb800_acquire_bus(void *, int);
static void piixpm_i2c_sb800_release_bus(void *, int);
+static int piixpm_i2c_dummy_acquire_bus(void *, int);
+static void piixpm_i2c_dummy_release_bus(void *, int);
static int piixpm_i2c_exec(void *, i2c_op_t, i2c_addr_t, const void *,
size_t, void *, size_t, int);
@@ -338,8 +340,8 @@
tag->ic_acquire_bus = piixpm_i2c_sb800_acquire_bus;
tag->ic_release_bus = piixpm_i2c_sb800_release_bus;
} else {
- tag->ic_acquire_bus = NULL;
- tag->ic_release_bus = NULL;
+ tag->ic_acquire_bus = piixpm_i2c_dummy_acquire_bus;
+ tag->ic_release_bus = piixpm_i2c_dummy_release_bus;
}
tag->ic_exec = piixpm_i2c_exec;
memset(&iba, 0, sizeof(iba));
@@ -549,6 +551,18 @@
return 0;
}
+static int
+piixpm_i2c_dummy_acquire_bus(void *cookie, int flags)
+{
+ struct piixpm_smbus *smbus = cookie;
+ struct piixpm_softc *sc = smbus->softc;
+
+ if (!cold)
+ mutex_enter(&sc->sc_i2c_mutex);
+
+ return 0;
+}
+
static void
piixpm_i2c_sb800_release_bus(void *cookie, int flags)
{
@@ -590,6 +604,16 @@
mutex_exit(&sc->sc_i2c_mutex);
}
+static void
+piixpm_i2c_dummy_release_bus(void *cookie, int flags)
+{
+ struct piixpm_smbus *smbus = cookie;
+ struct piixpm_softc *sc = smbus->softc;
+
+ if (!cold)
+ mutex_exit(&sc->sc_i2c_mutex);
+}
+
static int
piixpm_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t addr,
const void *cmdbuf, size_t cmdlen, void *buf, size_t len, int flags)
--jq0ap7NbKX2Kqbes--
From: SAITOH Masanobu <msaitoh@execsw.org>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
Hauke Fath <hf@spg.tu-darmstadt.de>
Cc: msaitoh@execsw.org
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Tue, 17 Nov 2020 19:01:05 +0900
On 2020/11/17 0:00, Edgar Fuß wrote:
> The following reply was made to PR kern/55745; it has been noted by GNATS.
>
> From: Edgar =?iso-8859-1?B?RnXf?= <ef@math.uni-bonn.de>
> To: gnats-bugs@netbsd.org
> Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
> Subject: Re: kern/55745: spdmem_i2c_match() panics
> Date: Mon, 16 Nov 2020 15:55:19 +0100
>
> --jq0ap7NbKX2Kqbes
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
>
> The problem (at least on -8, didn't check -9) is that sys/dev/pci/piixpm.c 1.52.6.2 installs NULL handlers for ic_{acquire,release}_bus in some cases, but the rest of the I2C code isn't prepared for that. So I guess one needs to install dummy handlers that only acquire/release the i2c mutex.
>
LGTM.
Those functions are not "dummy" so I think it would be good to name
them as piixpm_i2c_{acquire,release}_bus() or any other better name.
> With that change, I can boot again.
>
> --jq0ap7NbKX2Kqbes
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: attachment; filename="piixpm.c.diff"
>
> Index: sys/dev/pci/piixpm.c
> ===================================================================
> RCS file: /cvsroot/src/sys/dev/pci/piixpm.c,v
> retrieving revision 1.52.6.2
> diff -u -r1.52.6.2 piixpm.c
> --- sys/dev/pci/piixpm.c 5 Aug 2020 16:11:56 -0000 1.52.6.2
> +++ sys/dev/pci/piixpm.c 16 Nov 2020 14:35:58 -0000
> @@ -125,6 +125,8 @@
> static void piixpm_csb5_reset(void *);
> static int piixpm_i2c_sb800_acquire_bus(void *, int);
> static void piixpm_i2c_sb800_release_bus(void *, int);
> +static int piixpm_i2c_dummy_acquire_bus(void *, int);
> +static void piixpm_i2c_dummy_release_bus(void *, int);
> static int piixpm_i2c_exec(void *, i2c_op_t, i2c_addr_t, const void *,
> size_t, void *, size_t, int);
>
> @@ -338,8 +340,8 @@
> tag->ic_acquire_bus = piixpm_i2c_sb800_acquire_bus;
> tag->ic_release_bus = piixpm_i2c_sb800_release_bus;
> } else {
> - tag->ic_acquire_bus = NULL;
> - tag->ic_release_bus = NULL;
> + tag->ic_acquire_bus = piixpm_i2c_dummy_acquire_bus;
> + tag->ic_release_bus = piixpm_i2c_dummy_release_bus;
> }
> tag->ic_exec = piixpm_i2c_exec;
> memset(&iba, 0, sizeof(iba));
> @@ -549,6 +551,18 @@
> return 0;
> }
>
> +static int
> +piixpm_i2c_dummy_acquire_bus(void *cookie, int flags)
> +{
> + struct piixpm_smbus *smbus = cookie;
> + struct piixpm_softc *sc = smbus->softc;
> +
> + if (!cold)
> + mutex_enter(&sc->sc_i2c_mutex);
> +
> + return 0;
> +}
> +
> static void
> piixpm_i2c_sb800_release_bus(void *cookie, int flags)
> {
> @@ -590,6 +604,16 @@
> mutex_exit(&sc->sc_i2c_mutex);
> }
>
> +static void
> +piixpm_i2c_dummy_release_bus(void *cookie, int flags)
> +{
> + struct piixpm_smbus *smbus = cookie;
> + struct piixpm_softc *sc = smbus->softc;
> +
> + if (!cold)
> + mutex_exit(&sc->sc_i2c_mutex);
> +}
> +
> static int
> piixpm_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t addr,
> const void *cmdbuf, size_t cmdlen, void *buf, size_t len, int flags)
>
> --jq0ap7NbKX2Kqbes--
>
>
--
-----------------------------------------------
SAITOH Masanobu (msaitoh@execsw.org
msaitoh@netbsd.org)
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Tue, 17 Nov 2020 21:18:22 +0700
I would not use
if (!cold)
in piixpm_i2c_dummy_release_bus() or whatever that function ends
up being called, there must be some chance that the system will
transition from cold to !cold while the code that happens between
the piixpm_i2c_dummy_acquire_bus() and piixpm_i2c_dummy_release_bus()
is running, in which case the latter will attempt to exit a mutex
that has never been entered.
Instead, have the acquire() function remember in a state bit whether
it attempted to enter the mutex or notm (ie: entered it, there is no
tried and failed), and then exit it only when it was actually entered.
Even if it is (apparently) impossible for the cold->!cold transition
to happen during this period, writing it this way makes it safer
against later modifications in how other parts of the system work.
kre
From: Edgar =?iso-8859-1?B?RnXf?= <ef@math.uni-bonn.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Wed, 18 Nov 2020 10:12:30 +0100
It was the "if (!cold)" way before the pull-up (in piixpm_i2c_release_bus()) and it is the "if (!cold)" way after the pull-up in piixpm_i2c_sb800_release_bus().
I haven't checked how it's done in after i2c_{acquire,release}_bus() became real functions.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.