NetBSD Problem Report #55745

From hf@spg.tu-darmstadt.de  Thu Oct 22 09:02:13 2020
Return-Path: <hf@spg.tu-darmstadt.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 183581A923E
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 22 Oct 2020 09:02:13 +0000 (UTC)
Message-Id: <202010220902.09M925Vx001331@Gstoder.nt.e-technik.tu-darmstadt.de>
Date: Thu, 22 Oct 2020 11:02:05 +0200 (CEST)
From: Hauke Fath <hf@spg.tu-darmstadt.de>
Reply-To: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
Subject: spdmem_i2c_match() panics
X-Send-Pr-Version: 3.95

>Number:         55745
>Category:       kern
>Synopsis:       spdmem_i2c_match() panics
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Oct 22 09:05:00 +0000 2020
>Last-Modified:  Wed Nov 18 09:15:01 +0000 2020
>Originator:     Hauke Fath
>Release:        NetBSD 9.0_STABLE
>Organization:
Technische Universitaet Darmstadt
>Environment:


System: NetBSD Gstoder 9.0_STABLE NetBSD 9.0_STABLE (GA-MA770-UD3-$Revision$) #0: Mon Jul 13 16:36:43 CEST 2020 hf@Hochstuhl:/var/obj/netbsd-builds/9/amd64/sys/arch/amd64/compile/GA-MA770-UD3 amd64
Architecture: x86_64
Machine: amd64
>Description:

	For a few months, netbsd-9 kernels on my amd64 desktop 
	<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3.kconfig>

	have been panicking during spdmem auto-configuration:
	<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3-panic.jpg>

	The above kernel is the last one to boot, its dmesg is here:
	<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3_2020_07_13.dmesg>



>How-To-Repeat:

	Configure a netbsd-9 kernel to probe spdmem.


>Fix:
	Yes, please.


>Audit-Trail:
From: SAITOH Masanobu <msaitoh@execsw.org>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: msaitoh@execsw.org
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Thu, 22 Oct 2020 18:27:46 +0900

 On 2020/10/22 18:05, Hauke Fath wrote:
 >> Number:         55745
 >> Category:       kern
 >> Synopsis:       spdmem_i2c_match() panics
 >> Confidential:   no
 >> Severity:       critical
 >> Priority:       high
 >> Responsible:    kern-bug-people
 >> State:          open
 >> Class:          sw-bug
 >> Submitter-Id:   net
 >> Arrival-Date:   Thu Oct 22 09:05:00 +0000 2020
 >> Originator:     Hauke Fath
 >> Release:        NetBSD 9.0_STABLE
 >> Organization:
 > Technische Universitaet Darmstadt
 >> Environment:
 > 	
 > 	
 > System: NetBSD Gstoder 9.0_STABLE NetBSD 9.0_STABLE (GA-MA770-UD3-$Revision$) #0: Mon Jul 13 16:36:43 CEST 2020 hf@Hochstuhl:/var/obj/netbsd-builds/9/amd64/sys/arch/amd64/compile/GA-MA770-UD3 amd64
 > Architecture: x86_64
 > Machine: amd64
 >> Description:
 > 
 > 	For a few months, netbsd-9 kernels on my amd64 desktop
 > 	<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3.kconfig>
 > 	
 > 	have been panicking during spdmem auto-configuration:
 > 	<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3-panic.jpg>
 > 
 > 	The above kernel is the last one to boot, its dmesg is here:
 > 	<https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/GA-MA770-UD3_2020_07_13.dmesg>
 > 
 > 
 > 	
 >> How-To-Repeat:
 > 
 > 	Configure a netbsd-9 kernel to probe spdmem.
 > 
 > 	
 >> Fix:
 > 	Yes, please.
 > 	
 > 
 >> Unformatted:
 >   	
 >   	
 > 

 Your machine's kernel was compiled at 2020/07/13, right?
 If so, please test the latest netbsd-9. Some modifications
 for netbsd-9 were pulled up at 2020/07/16.

 -- 
 -----------------------------------------------
                  SAITOH Masanobu (msaitoh@execsw.org
                                   msaitoh@netbsd.org)

From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Thu, 22 Oct 2020 12:33:28 +0200

 On 2020-10-22 11:30, SAITOH Masanobu wrote:
 >   Your machine's kernel was compiled at 2020/07/13, right?

 That one is the latest usable kernel that will run send-pr; I included 
 its dmesg for machine information.

 >   If so, please test the latest netbsd-9. Some modifications
 >   for netbsd-9 were pulled up at 2020/07/16.

 Every kernel newer than the above has panicked, the last I tried was 
 built from 2020-10-20 sources. In fact, I suspect that those pulled-up 
 modifications you mentioned might be the source of the problem...

 Cheerio,
 Hauke

 -- 
       The ASCII Ribbon Campaign                    Hauke Fath
 ()     No HTML/RTF in email	        Institut für Nachrichtentechnik
 /\     No Word docs in email                     TU Darmstadt
       Respect for open standards              Ruf +49-6151-16-21344

From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@netbsd.org
Cc: SAITOH Masanobu <msaitoh@execsw.org>, kern-bug-people@netbsd.org,
        gnats-admin@netbsd.org
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Thu, 22 Oct 2020 13:05:29 +0200

 On 2020-10-22 11:27, SAITOH Masanobu wrote:
 > Your machine's kernel was compiled at 2020/07/13, right?
 > If so, please test the latest netbsd-9. Some modifications
 > for netbsd-9 were pulled up at 2020/07/16.

 FTR, a -current kernel of today boots successfully, so it looks like -9 
 has a regression.

 Cheerio,
 Hauke

 -- 
       The ASCII Ribbon Campaign                    Hauke Fath
 ()     No HTML/RTF in email	        Institut für Nachrichtentechnik
 /\     No Word docs in email                     TU Darmstadt
       Respect for open standards              Ruf +49-6151-16-21344

From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
        =?iso-8859-1?Q?Edgar_Fu=DF?= <ef@math.uni-bonn.de>
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Fri, 13 Nov 2020 20:12:54 +0100

 The registers:

 <https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/kern-55745-registers-1.png>
 <https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/kern-55745-registers-2.png>

 The disassembly of the crashing code:

 <https://www2.nt.tu-darmstadt.de/~hf/netbsd/GA-MA770-UD3/kern-55745-disassembly.png>

From: Edgar =?iso-8859-1?B?RnXf?= <ef@math.uni-bonn.de>
To: gnats-bugs@netbsd.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Mon, 16 Nov 2020 15:55:19 +0100

 --jq0ap7NbKX2Kqbes
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 The problem (at least on -8, didn't check -9) is that sys/dev/pci/piixpm.c 1.52.6.2 installs NULL handlers for ic_{acquire,release}_bus in some cases, but the rest of the I2C code isn't prepared for that. So I guess one needs to install dummy handlers that only acquire/release the i2c mutex.

 With that change, I can boot again.

 --jq0ap7NbKX2Kqbes
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="piixpm.c.diff"

 Index: sys/dev/pci/piixpm.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/piixpm.c,v
 retrieving revision 1.52.6.2
 diff -u -r1.52.6.2 piixpm.c
 --- sys/dev/pci/piixpm.c	5 Aug 2020 16:11:56 -0000	1.52.6.2
 +++ sys/dev/pci/piixpm.c	16 Nov 2020 14:35:58 -0000
 @@ -125,6 +125,8 @@
  static void	piixpm_csb5_reset(void *);
  static int	piixpm_i2c_sb800_acquire_bus(void *, int);
  static void	piixpm_i2c_sb800_release_bus(void *, int);
 +static int	piixpm_i2c_dummy_acquire_bus(void *, int);
 +static void	piixpm_i2c_dummy_release_bus(void *, int);
  static int	piixpm_i2c_exec(void *, i2c_op_t, i2c_addr_t, const void *,
      size_t, void *, size_t, int);

 @@ -338,8 +340,8 @@
  			tag->ic_acquire_bus = piixpm_i2c_sb800_acquire_bus;
  			tag->ic_release_bus = piixpm_i2c_sb800_release_bus;
  		} else {
 -			tag->ic_acquire_bus = NULL;
 -			tag->ic_release_bus = NULL;
 +			tag->ic_acquire_bus = piixpm_i2c_dummy_acquire_bus;
 +			tag->ic_release_bus = piixpm_i2c_dummy_release_bus;
  		}
  		tag->ic_exec = piixpm_i2c_exec;
  		memset(&iba, 0, sizeof(iba));
 @@ -549,6 +551,18 @@
  	return 0;
  }

 +static int
 +piixpm_i2c_dummy_acquire_bus(void *cookie, int flags)
 +{
 +	struct piixpm_smbus *smbus = cookie;
 +	struct piixpm_softc *sc = smbus->softc;
 +
 +	if (!cold)
 +		mutex_enter(&sc->sc_i2c_mutex);
 +
 +	return 0;
 +}
 +
  static void
  piixpm_i2c_sb800_release_bus(void *cookie, int flags)
  {
 @@ -590,6 +604,16 @@
  		mutex_exit(&sc->sc_i2c_mutex);
  }

 +static void
 +piixpm_i2c_dummy_release_bus(void *cookie, int flags)
 +{
 +	struct piixpm_smbus *smbus = cookie;
 +	struct piixpm_softc *sc = smbus->softc;
 +
 +	if (!cold)
 +		mutex_exit(&sc->sc_i2c_mutex);
 +}
 +
  static int
  piixpm_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t addr,
      const void *cmdbuf, size_t cmdlen, void *buf, size_t len, int flags)

 --jq0ap7NbKX2Kqbes--

From: SAITOH Masanobu <msaitoh@execsw.org>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
 Hauke Fath <hf@spg.tu-darmstadt.de>
Cc: msaitoh@execsw.org
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Tue, 17 Nov 2020 19:01:05 +0900

 On 2020/11/17 0:00, Edgar Fuß wrote:
 > The following reply was made to PR kern/55745; it has been noted by GNATS.
 > 
 > From: Edgar =?iso-8859-1?B?RnXf?= <ef@math.uni-bonn.de>
 > To: gnats-bugs@netbsd.org
 > Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
 > Subject: Re: kern/55745: spdmem_i2c_match() panics
 > Date: Mon, 16 Nov 2020 15:55:19 +0100
 > 
 >   --jq0ap7NbKX2Kqbes
 >   Content-Type: text/plain; charset=us-ascii
 >   Content-Disposition: inline
 >   
 >   The problem (at least on -8, didn't check -9) is that sys/dev/pci/piixpm.c 1.52.6.2 installs NULL handlers for ic_{acquire,release}_bus in some cases, but the rest of the I2C code isn't prepared for that. So I guess one needs to install dummy handlers that only acquire/release the i2c mutex.
 >   

 LGTM.

 Those functions are not "dummy" so I think it would be good to name
 them as piixpm_i2c_{acquire,release}_bus() or any other better name.


 >   With that change, I can boot again.
 >   
 >   --jq0ap7NbKX2Kqbes
 >   Content-Type: text/plain; charset=us-ascii
 >   Content-Disposition: attachment; filename="piixpm.c.diff"
 >   
 >   Index: sys/dev/pci/piixpm.c
 >   ===================================================================
 >   RCS file: /cvsroot/src/sys/dev/pci/piixpm.c,v
 >   retrieving revision 1.52.6.2
 >   diff -u -r1.52.6.2 piixpm.c
 >   --- sys/dev/pci/piixpm.c	5 Aug 2020 16:11:56 -0000	1.52.6.2
 >   +++ sys/dev/pci/piixpm.c	16 Nov 2020 14:35:58 -0000
 >   @@ -125,6 +125,8 @@
 >    static void	piixpm_csb5_reset(void *);
 >    static int	piixpm_i2c_sb800_acquire_bus(void *, int);
 >    static void	piixpm_i2c_sb800_release_bus(void *, int);
 >   +static int	piixpm_i2c_dummy_acquire_bus(void *, int);
 >   +static void	piixpm_i2c_dummy_release_bus(void *, int);
 >    static int	piixpm_i2c_exec(void *, i2c_op_t, i2c_addr_t, const void *,
 >        size_t, void *, size_t, int);
 >    
 >   @@ -338,8 +340,8 @@
 >    			tag->ic_acquire_bus = piixpm_i2c_sb800_acquire_bus;
 >    			tag->ic_release_bus = piixpm_i2c_sb800_release_bus;
 >    		} else {
 >   -			tag->ic_acquire_bus = NULL;
 >   -			tag->ic_release_bus = NULL;
 >   +			tag->ic_acquire_bus = piixpm_i2c_dummy_acquire_bus;
 >   +			tag->ic_release_bus = piixpm_i2c_dummy_release_bus;
 >    		}
 >    		tag->ic_exec = piixpm_i2c_exec;
 >    		memset(&iba, 0, sizeof(iba));
 >   @@ -549,6 +551,18 @@
 >    	return 0;
 >    }
 >    
 >   +static int
 >   +piixpm_i2c_dummy_acquire_bus(void *cookie, int flags)
 >   +{
 >   +	struct piixpm_smbus *smbus = cookie;
 >   +	struct piixpm_softc *sc = smbus->softc;
 >   +
 >   +	if (!cold)
 >   +		mutex_enter(&sc->sc_i2c_mutex);
 >   +
 >   +	return 0;
 >   +}
 >   +
 >    static void
 >    piixpm_i2c_sb800_release_bus(void *cookie, int flags)
 >    {
 >   @@ -590,6 +604,16 @@
 >    		mutex_exit(&sc->sc_i2c_mutex);
 >    }
 >    
 >   +static void
 >   +piixpm_i2c_dummy_release_bus(void *cookie, int flags)
 >   +{
 >   +	struct piixpm_smbus *smbus = cookie;
 >   +	struct piixpm_softc *sc = smbus->softc;
 >   +
 >   +	if (!cold)
 >   +		mutex_exit(&sc->sc_i2c_mutex);
 >   +}
 >   +
 >    static int
 >    piixpm_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t addr,
 >        const void *cmdbuf, size_t cmdlen, void *buf, size_t len, int flags)
 >   
 >   --jq0ap7NbKX2Kqbes--
 >   
 > 


 -- 
 -----------------------------------------------
                  SAITOH Masanobu (msaitoh@execsw.org
                                   msaitoh@netbsd.org)

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Tue, 17 Nov 2020 21:18:22 +0700

 I would not use

 	if (!cold)

 in piixpm_i2c_dummy_release_bus() or whatever that function ends
 up being called, there must be some chance that the system will
 transition from cold to !cold while the code that happens between
 the piixpm_i2c_dummy_acquire_bus() and piixpm_i2c_dummy_release_bus()
 is running, in which case the latter will attempt to exit a mutex
 that has never been entered.

 Instead, have the acquire() function remember in a state bit whether
 it attempted to enter the mutex or notm (ie: entered it, there is no
 tried and failed), and then exit it only when it was actually entered.

 Even if it is (apparently) impossible for the cold->!cold transition
 to happen during this period, writing it this way makes it safer
 against later modifications in how other parts of the system work.

 kre

From: Edgar =?iso-8859-1?B?RnXf?= <ef@math.uni-bonn.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55745: spdmem_i2c_match() panics
Date: Wed, 18 Nov 2020 10:12:30 +0100

 It was the "if (!cold)" way before the pull-up (in piixpm_i2c_release_bus()) and it is the "if (!cold)" way after the pull-up in piixpm_i2c_sb800_release_bus().
 I haven't checked how it's done in after i2c_{acquire,release}_bus() became real functions.

>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.