NetBSD Problem Report #52147

From mlelstv@hoppa.1st.de  Sun Apr  9 10:23:10 2017
Return-Path: <mlelstv@hoppa.1st.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 794777A1C0
	for <gnats-bugs@gnats.NetBSD.org>; Sun,  9 Apr 2017 10:23:10 +0000 (UTC)
Message-Id: <20170409102247.70CE09A@hoppa.1st.de>
Date: Sun,  9 Apr 2017 12:22:47 +0200 (CEST)
From: mlelstv@serpens.de
Reply-To: mlelstv@serpens.de
To: gnats-bugs@NetBSD.org
Subject: deadlock when booting from USB disk
X-Send-Pr-Version: 3.95

>Number:         52147
>Category:       kern
>Synopsis:       deadlock when booting from USB disk
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    jdolecek
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Apr 09 10:25:00 +0000 2017
>Closed-Date:    Sat May 13 20:45:04 +0000 2017
>Last-Modified:  Sat May 13 20:45:04 +0000 2017
>Originator:     Michael van Elst
>Release:        NetBSD 7.99.67
>Organization:
-- 
                                Michael van Elst
Internet: mlelstv@serpens.de
                                "A potential Snark may lurk in every tree."
>Environment:


System: NetBSD hoppa 7.99.67 NetBSD 7.99.67 (HOPPA) #6: Sun Apr 9 02:46:58 CEST 2017 mlelstv@gossam:/home/netbsd-current/obj.evbarm/home/netbsd-current/src/sys/arch/evbarm/compile/HOPPA evbarm
Architecture: earmv6hf
Machine: evbarm
>Description:

The latest changes to sd(4) to support FUA/DPO have a funny side effect
when booting from USB disk (on a modular kernel).

The requested settings trigger a SCSI error, since USB disks rarely support
these commands.

The scsipi layer tries to write error messages to the console using the
scsiverbose module.

On the first message, the scsiverbose module needs to be loaded from
the same disk that is currently in error processing.

-> instant deadlock.

>How-To-Repeat:
Boot a system, in this case an RPI, with root on a USB disk.

>Fix:
There are several errors.

The FUA/DPO support needs some refinement to not cause error messages
on devices that do not support these settings.

There needs to be some notion of "module load path is unavailable" so
that autoloading a module doesn't happen when this would cause a
deadlock. I think scsipi is the only such place for now. A quick
solution is to just load scsiverbose unconditionally instead of
on-demand.
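
The "module load path is unavailable" idea could be sketched roughly as
below. This is a userland model only, not kernel code: the flag
module_path_unavailable, the loader fake_module_load() and the wrapper
guarded_autoload() are all hypothetical names invented for illustration.
The point is that the on-demand path returns an error instead of sleeping
on I/O that cannot complete.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical flag: set while the device that backs the module path is
 * in error processing and cannot serve reads.
 */
static bool module_path_unavailable;

/* Counts loader calls so a caller can observe whether a load was tried. */
static int load_attempts;

/* Hypothetical stand-in for the real loader; always succeeds. */
static int
fake_module_load(const char *name)
{
	(void)name;
	load_attempts++;
	return 0;
}

/* Refuse on-demand loading rather than block on an unavailable path. */
static int
guarded_autoload(const char *name)
{
	if (name == NULL)
		return EINVAL;
	if (module_path_unavailable)
		return EBUSY;
	return fake_module_load(name);
}
```

A caller on the error path would treat EBUSY as "print without the verbose
module" and carry on.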

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Sun, 09 Apr 2017 14:34:02 +0000
Responsible-Changed-Why:
I'll look at this. It's a bug in the autoload - the recent changes only
trigger MODE SENSE for page 8 (Caching page) via DIOCGCACHE call when WAPBL
filesystem is mounted, nothing really special.


From: Michael van Elst <mlelstv@serpens.de>
To: gnats-bugs@NetBSD.org
Cc: jdolecek@NetBSD.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
        gnats-admin@netbsd.org
Subject: Re: kern/52147 (deadlock when booting from USB disk)
Date: Sun, 9 Apr 2017 18:12:19 +0200

 On Sun, Apr 09, 2017 at 02:34:03PM +0000, jdolecek@NetBSD.org wrote:
 > Synopsis: deadlock when booting from USB disk
 > 
 > I'll look at this. It's a bug in the autoload - the recent changes only
 > trigger MODE SENSE for page 8 (Caching page) via DIOCGCACHE call when WAPBL
 > filesystem is mounted, nothing really special.

 That MODE SENSE fails with Invalid request.


 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52147 CVS commit: src/sys/dev/scsipi
Date: Mon, 10 Apr 2017 18:20:43 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Mon Apr 10 18:20:43 UTC 2017

 Modified Files:
 	src/sys/dev/scsipi: sd.c

 Log Message:
 execute the cache page MODE SENSE with XS_CTL_SILENT; it's pretty normal
 for e.g. USB sticks thus showing error is not really useful, and the pretty
 printing triggers autoload of scsiverbose module and immediate deadlock when
 the DIOCGCACHE call is made by WAPBL during root mount

 addresses PR kern/52147 by Michael van Elst


 To generate a diff of this commit:
 cvs rdiff -u -r1.323 -r1.324 src/sys/dev/scsipi/sd.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
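
The effect of a "silent" request flag can be illustrated with a small
userland sketch. It does not reproduce sd.c or the scsipi code: the flag
DEMO_CTL_SILENT, its value, and the functions demo_print_sense() and
demo_complete() are assumptions made up for this example. It only shows
the idea that a silenced command still fails, but never reaches the
reporting path that triggered the autoload.

```c
#include <stdio.h>

/* Hypothetical flag bit for this sketch; not the real XS_CTL_SILENT value. */
#define DEMO_CTL_SILENT 0x01

/* Counts calls into the reporting path, the step that led to the autoload. */
static int verbose_calls;

/* Hypothetical reporting routine. */
static void
demo_print_sense(int sense_key)
{
	verbose_calls++;
	printf("sense key 0x%x\n", sense_key);
}

/* Returns 0 on success, -1 on error; reports only when not silenced. */
static int
demo_complete(int flags, int sense_key)
{
	if (sense_key == 0)
		return 0;
	if ((flags & DEMO_CTL_SILENT) == 0)
		demo_print_sense(sense_key);
	return -1;
}
```

The caller still sees the failure and can fall back to default cache
settings; only the message is suppressed.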

State-Changed-From-To: open->feedback
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Mon, 10 Apr 2017 21:29:51 +0000
State-Changed-Why:
I've committed a fix. Can you confirm it resolves your problem? I want
to also deal with the scsiverbose autoload, but I'll do it separately.


From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52147 CVS commit: src/sys/dev/scsipi
Date: Mon, 10 Apr 2017 21:53:38 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Mon Apr 10 21:53:37 UTC 2017

 Modified Files:
 	src/sys/dev/scsipi: scsipiconf.c

 Log Message:
 just do not autoload scsiverbose module, it causes deadlock if it happens
 while root fs is being mounted

 addresses second part of PR kern/52147 by Michael van Elst, thank you


 To generate a diff of this commit:
 cvs rdiff -u -r1.42 -r1.43 src/sys/dev/scsipi/scsipiconf.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
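
One way to picture "do not autoload" is a print hook that defaults to a
plain stub and is only replaced when a module is loaded explicitly. The
sketch below is a userland model under that assumption, not the code in
scsipiconf.c; the names demo_sense_hook, demo_sense_stub(),
demo_sense_verbose() and demo_set_sense_hook() are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical hook type: formats sense data into buf. */
typedef int (*demo_sense_hook_t)(char *buf, size_t len, int key);

/* Fallback when no verbose module is loaded: raw numeric output. */
static int
demo_sense_stub(char *buf, size_t len, int key)
{
	return snprintf(buf, len, "sense key 0x%x", key);
}

/* What an explicitly loaded verbose module might install. */
static int
demo_sense_verbose(char *buf, size_t len, int key)
{
	const char *name = (key == 0x5) ? "ILLEGAL REQUEST" : "UNKNOWN";

	return snprintf(buf, len, "sense key 0x%x (%s)", key, name);
}

/* The error path calls through this pointer and never loads anything. */
static demo_sense_hook_t demo_sense_hook = demo_sense_stub;

/* Called from module init/fini only; NULL restores the stub. */
static void
demo_set_sense_hook(demo_sense_hook_t hook)
{
	demo_sense_hook = (hook != NULL) ? hook : demo_sense_stub;
}
```

With this shape the error path does no file system access at all, so it
cannot wait on the disk that is reporting the error.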

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/52147: deadlock when booting from USB disk
Date: Wed, 12 Apr 2017 05:25:52 +0000 (UTC)

 christos@astron.com (Christos Zoulas) writes:

 >>There needs to be some notion of "module load path is unavailable" so
 >>that autoloading a module doesn't happen when this would cause a
 >>deadlock. I think scsipi is the only such place for now. A quick
 >>solution is to just load scsiverbose unconditionally instead of
 >>on-demand.

 >Can you print the deadlock path? Or instructions how to reproduce it?

 This here happened on RPI with root on a USB drive and filesystems
 using WAPBL. The deadlock occurs shortly after starting userland
 when a journal is played back (e.g. when root is remounted).

 This could happen on all archs with SCSI disks.

 The journal play back triggered a SCSI error (bad MODE SENSE) which triggers
 a scsiverbose message which triggers the autoload but which cannot access
 the sd device because REQUEST SENSE processing has the periph frozen.
 Also, the message is printed synchronously in the completion thread, so
 even when you offload the module loading, no other error on the same scsi
 bus could be processed.

 A workaround was to put scsiverbose into /etc/modules.conf which is
 done while root is still read-only.

 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, jdolecek@NetBSD.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org, mlelstv@serpens.de
Cc: 
Subject: Re: kern/52147: deadlock when booting from USB disk
Date: Wed, 12 Apr 2017 08:25:18 -0400

 On Apr 12,  5:30am, mlelstv@serpens.de (Michael van Elst) wrote:
 -- Subject: Re: kern/52147: deadlock when booting from USB disk

 |  >Can you print the deadlock path? Or instructions how to reproduce it?
 |  
 |  This here happened on RPI with root on a USB drive and filesystems
 |  using WAPBL. The deadlock occurs shortly after starting userland
 |  when a journal is played back (e.g. when root is remounted).
 |  
 |  This could happen on all archs with SCSI disks.

 Yes, I understand.

 |  The journal play back triggered a SCSI error (bad MODE SENSE) which triggers
 |  a scsiverbose message which triggers the autoload but which cannot access
 |  the sd device because REQUEST SENSE processing has the periph frozen.
 |  Also, the message is printed synchronously in the completion thread, so
 |  even when you offload the module loading, no other error on the same scsi
 |  bus could be processed.
 |  
 |  A workaround was to put scsiverbose into /etc/modules.conf which is
 |  done while root is still read-only.

 But hasn't the bad MODE SENSE been fixed now? I.e. the code was changed
 back not to do MODE SENSE? Or do we need the notion of "root is currently
 being mounted"?

 christos

From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@netbsd.org, Jaromir Dolecek <jdolecek@netbsd.org>, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org, Michael van Elst <mlelstv@serpens.de>
Subject: Re: kern/52147: deadlock when booting from USB disk
Date: Wed, 12 Apr 2017 20:08:43 +0200

 Yes, the offending MODE SENSE was silenced, it doesn't try to print
 anything out. But it could easily be reintroduced if we add there
 something else, which would cause the same problem again. Having
 something inherently deadlocking is quite dangerous.

 Given that scsiverbose is only cosmetic and completely optional (code
 prints some relevant info out even without it), I think it makes sense
 to leave it up to explicit user action to have it loaded.


 2017-04-12 14:25 GMT+02:00 Christos Zoulas <christos@zoulas.com>:
 > On Apr 12,  5:30am, mlelstv@serpens.de (Michael van Elst) wrote:
 > -- Subject: Re: kern/52147: deadlock when booting from USB disk
 >
 > |  >Can you print the deadlock path? Or instructions how to reproduce it?
 > |
 > |  This here happened on RPI with root on a USB drive and filesystems
 > |  using WAPBL. The deadlock occurs shortly after starting userland
 > |  when a journal is played back (e.g. when root is remounted).
 > |
 > |  This could happen on all archs with SCSI disks.
 >
 > Yes, I understand.
 >
 > |  The journal play back triggered a SCSI error (bad MODE SENSE) which triggers
 > |  a scsiverbose message which triggers the autoload but which cannot access
 > |  the sd device because REQUEST SENSE processing has the periph frozen.
 > |  Also, the message is printed synchronously in the completion thread, so
 > |  even when you offload the module loading, no other error on the same scsi
 > |  bus could be processed.
 > |
 > |  A workaround was to put scsiverbose into /etc/modules.conf which is
 > |  done while root is still read-only.
 >
 > But hasn't the bad MODE SENSE been fixed now? I.e. the code was changed
 > back not to do MODE SENSE? Or do we need the notion of "root is currently
 > being mounted"?
 >
 > christos

From: christos@zoulas.com (Christos Zoulas)
To: Jaromír Doleček <jaromir.dolecek@gmail.com>
Cc: gnats-bugs@netbsd.org, Jaromir Dolecek <jdolecek@netbsd.org>, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, 
	Michael van Elst <mlelstv@serpens.de>
Subject: Re: kern/52147: deadlock when booting from USB disk
Date: Wed, 12 Apr 2017 14:25:13 -0400

 On Apr 12,  8:08pm, jaromir.dolecek@gmail.com (Jaromír Doleček) wrote:
 -- Subject: Re: kern/52147: deadlock when booting from USB disk

 | Yes, the offending MODE SENSE was silenced, it doesn't try to print
 | anything out. But it could easily be reintroduced if we add there
 | something else, which would cause the same problem again. Having
 | something inherently deadlocking is quite dangerous.
 | 
 | Given that scsiverbose is only cosmetic and completely optional (code
 | prints some relevant info out even without it), I think it makes sense
 | to leave it up to explicit user action to have it loaded.

 I think it is better to just fix it not to deadlock, this is why I was
 asking for a stack trace... This has never been a problem so far; it
 got introduced by adding the mode-sense code. Yes, I agree it could
 be re-introduced again, but manually loading modules goes against the
 principle of having module loading/unloading be seamless and not noticed
 by the user.

 christos

From: Michael van Elst <mlelstv@serpens.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org, jdolecek@NetBSD.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: kern/52147: deadlock when booting from USB disk
Date: Thu, 13 Apr 2017 00:00:19 +0200

 On Wed, Apr 12, 2017 at 08:25:18AM -0400, Christos Zoulas wrote:

 > But hasn't the bad MODE SENSE been fixed now? I.e. the code was changed
 > back not to do MODE SENSE? Or do we need the notion of "root is currently
 > being mounted"?

 AFAIK the code has been changed to be quiet in case of an error. That
 solves the issue as far as it is created by the MODE SENSE command.

 But I think any other SCSI error would trigger the same problem.


 Greetings,
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: Michael van Elst <mlelstv@serpens.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: Jaromír Doleček <jaromir.dolecek@gmail.com>,
        gnats-bugs@netbsd.org, Jaromir Dolecek <jdolecek@netbsd.org>,
        gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52147: deadlock when booting from USB disk
Date: Thu, 13 Apr 2017 01:15:55 +0200

 On Wed, Apr 12, 2017 at 02:25:13PM -0400, Christos Zoulas wrote:

 > I think it is better to just fix it not to deadlock, this is why I was
 > asking for a stack trace...

 Typed manually from the screenshots....

 mi_switch
 sleepq_block
 cv_wait
 biowait
 breadn
 ufs_blkatoff
 ufs_lookup
 VOP_LOOKUP
 lookup_once
 namei_tryemulroot
 namei
 vn_open
 kobj_load_vfs
 module_load_vfs
 module_do_load
 module_autoload
 scsipi_print_sense_stub
 scsipi_interpret_sense
 scsipi_complete
 scsipi_execute_xs
 scsipi_command
 scsipi_mode_sense_big
 sd_mode_sense
 sdioctl
 spec_ioctl
 VOP_IOCTL
 spec_ioctl
 VOP_IOCTL
 wapbl_start
 ffs_wapbl_start
 ffs_mount
 VFS_MOUNT
 do_sys_mount
 sys___mount50
 syscall



 Greetings,
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, jdolecek@NetBSD.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org, mlelstv@serpens.de
Cc: 
Subject: Re: kern/52147: deadlock when booting from USB disk
Date: Wed, 12 Apr 2017 20:04:50 -0400

 On Apr 12, 11:20pm, mlelstv@serpens.de (Michael van Elst) wrote:
 -- Subject: Re: kern/52147: deadlock when booting from USB disk

 Looks like this should do it?

 christos

 Index: kern_module.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/kern_module.c,v
 retrieving revision 1.123
 diff -u -u -r1.123 kern_module.c
 --- kern_module.c	11 Apr 2017 21:15:57 -0000	1.123
 +++ kern_module.c	13 Apr 2017 00:04:02 -0000
 @@ -609,7 +609,7 @@
  {
  	int error;

 -	if (rootvp == NULL) {
 +	if (rootvp == NULL || rootvp->v_mount == NULL) {
  #ifdef DIAGNOSTIC
  		printf("%s: trying to load `%s' before root is mounted\n",
  		    __func__, filename);
 @@ -617,6 +617,14 @@
  		return EPERM;
  	}

 +	if (fstrans_getstate(rootvp->v_mount) != FSTRANS_NORMAL) {
 +#ifdef DIAGNOSTIC
 +		printf("%s: trying to load `%s' while root is suspended\n",
 +		    __func__, filename);
 +#endif
 +		return EPERM;
 +	}
 +
  	kernconfig_lock();

  	/* Nothing if the user has disabled it. */

State-Changed-From-To: feedback->closed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sat, 13 May 2017 20:45:04 +0000
State-Changed-Why:
Fixed, thanks for report.


>Unformatted:
