NetBSD Problem Report #55461

From gson@gson.org  Sun Jul  5 19:14:15 2020
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 543B11A9213
	for <gnats-bugs@gnats.NetBSD.org>; Sun,  5 Jul 2020 19:14:15 +0000 (UTC)
Message-Id: <20200705191409.4D355253EC8@guava.gson.org>
Date: Sun,  5 Jul 2020 22:14:09 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: ciss(4) no longer works
X-Send-Pr-Version: 3.95

>Number:         55461
>Category:       kern
>Synopsis:       ciss(4) no longer works
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    jdolecek
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Jul 05 19:15:00 +0000 2020
>Closed-Date:    Wed Jul 08 08:08:27 +0000 2020
>Last-Modified:  Wed Jul 08 08:08:27 +0000 2020
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current, source date >= 2020.07.04.14.49.24
>Organization:
>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:

My amd64 bare-metal testbed has been hanging at boot since this commit:

  2020.07.04.14.49.24 jdolecek src/sys/dev/pci/ciss_pci.c 1.16

The last console messages printed before the hang are from the ciss
driver:

  [   1.1012041] pci0 at mainbus0 bus 0: configuration mode 1
  [   1.1012041] pchb0 at pci0 dev 0 function 0: Intel 5520 ESI Port (rev. 0x13)
  [   1.1012041] ppb0 at pci0 dev 1 function 0: Intel 5520/5500/X58 PCIe Root Port 1 (rev. 0x13)
  [   1.1012041] ppb0: PCI Express capability version 2 <Root Port of PCI-E Root Complex> x4 @ 5.0GT/s
  [   1.1012041] pci1 at ppb0 bus 5
  [   1.1012041] ciss0 at pci1 dev 0 function 0: HP Smart Array 11
  [   1.1012041] ciss0: interrupting at msix0 vec 0
  [   1.1012041] ciss0: 2 LDs, HW rev 2, FW 3.66/3.66, 64bit fifo rro
  [...]
  [  61.8400703] ciss0: workqueue busy: updates stopped
  [  92.3400561] ciss0: unqueued ccb 0xffff8b025e3d6c00 ready, state=0x10
  [ 152.4100278] ciss0: unqueued ccb 0xffff8b025e3d6a00 ready, state=0x10

This is running on an HP DL360 G7.  Full console log:

  http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2020/2020.07.04.21.02.16/install.log

>How-To-Repeat:

>Fix:

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: gson@NetBSD.org
Responsible-Changed-When: Sun, 05 Jul 2020 19:16:50 +0000
Responsible-Changed-Why:
Over to committer.


From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55461 CVS commit: src/sys/dev/pci
Date: Sun, 5 Jul 2020 19:28:37 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Sun Jul  5 19:28:37 UTC 2020

 Modified Files:
 	src/sys/dev/pci: ciss_pci.c

 Log Message:
 there is more to MSI/MSI-X support in ciss(4) than just allocating the
 right interrupt, it needs some explicit support; disable for now
 until the full support is there

 PR kern/55461


 To generate a diff of this commit:
 cvs rdiff -u -r1.16 -r1.17 src/sys/dev/pci/ciss_pci.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->analyzed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sun, 05 Jul 2020 19:32:38 +0000
State-Changed-Why:
MSI interrupts seem to be supported only in performant mode, according
to the FreeBSD driver, so performant mode needs to be implemented first.
For now I've reverted the change to ciss_pci.c, it now uses only INTx.


From: Andreas Gustafsson <gson@gson.org>
To: jdolecek@netbsd.org
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/55461 (ciss(4) no longer works)
Date: Tue, 7 Jul 2020 12:02:14 +0300

 jdolecek@NetBSD.org wrote:
 > For now I've reverted the change to ciss_pci.c, it now uses only INTx.

 Despite the reversion, I'm still getting boot messages saying "ciss0:
 interrupting at msix0 vec 0", and the boot still fails with the same
 errors:

   http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2020/2020.07.06.07.36.14/install.log

 This also happens when I try to boot a version of -current from before
 the commit of ciss_pci.c 1.16:

   http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2020/2020.07.04.11.55.18/install.log

 It appears that some kind of persistent setting has changed and is not
 being reset by power cycling the machine.  I also tried unplugging the
 power cord for 15 minutes, but that didn't help, either.

 Any clues?
 -- 
 Andreas Gustafsson, gson@gson.org

From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: kern/55461 (ciss(4) no longer works)
Date: Tue, 7 Jul 2020 12:19:36 +0200

 That is quite odd, since x86 pci_intr_map() has no provision
 whatsoever to return MSI/MSI-X interrupts. This configuration is
 entirely driven by the kernel, the device can't force the kernel to
 use MSI-X instead of INTx. As such I have no idea how this is
 possible.

 I'll try expediting the support for performant mode anyway.
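
 The point made in this message, that the kernel driver rather than the
 device decides which interrupt type is used, combined with the earlier
 finding that MSI/MSI-X only works in performant mode, can be sketched
 roughly as below.  This is a simplified model and not the ciss(4) code:
 the types and the function choose_intr_type() are hypothetical names
 made up for illustration, and no NetBSD kernel API is used.

```c
#include <stdbool.h>

/* Hypothetical model of interrupt type selection; not the NetBSD API. */
enum intr_type { INTR_INTX, INTR_MSI, INTR_MSIX };

/* What the device advertises (assumed fields, for illustration only). */
struct dev_caps {
	bool has_msi;
	bool has_msix;
};

/* What the driver has implemented (assumed field, for illustration only). */
struct drv_caps {
	bool performant_mode;
};

/*
 * The driver makes the choice.  A device that advertises MSI-X still
 * gets INTx unless the driver has the performant mode support that
 * MSI/MSI-X appears to require.
 */
static enum intr_type
choose_intr_type(const struct dev_caps *dev, const struct drv_caps *drv)
{
	if (!drv->performant_mode)
		return INTR_INTX;
	if (dev->has_msix)
		return INTR_MSIX;
	if (dev->has_msi)
		return INTR_MSI;
	return INTR_INTX;
}
```

 In this model, a kernel without performant mode support never selects
 MSI-X whatever the device advertises, which is why an "interrupting at
 msix0" message from a reverted kernel was unexpected.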

From: Andreas Gustafsson <gson@gson.org>
To: jdolecek@netbsd.org
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/55461 (ciss(4) no longer works)
Date: Tue, 7 Jul 2020 12:12:30 +0300

 I wrote:
 > It appears that some kind of persistent setting has changed and is not
 > being reset by power cycling the machine.  I also tried unplugging the
 > power cord for 15 minutes, but that didn't help, either.

 Never mind - examining the logs more closely shows that the kernel
 version being booted does not match the release the testbed is trying
 to test, so this must be a bug in the testbed.  Sorry about the
 confusion.
 -- 
 Andreas Gustafsson, gson@gson.org
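
 The mismatch described above can be cross-checked against the source
 dates embedded in the log URLs.  The sketch below assumes that a build
 is expected to show the hang only if its source date falls between the
 commit of ciss_pci.c 1.16 (inclusive) and the commit of 1.17
 (exclusive).  The 1.17 timestamp is an assumption derived from the date
 of the commit message in this PR, and source_date_affected() is a
 hypothetical helper, not part of any testbed.

```c
#include <stdbool.h>
#include <string.h>

/* Source date of ciss_pci.c 1.16, as given in this PR. */
#define BREAKING_COMMIT	"2020.07.04.14.49.24"
/* Assumed source date of ciss_pci.c 1.17, taken from the commit mail. */
#define FIX_COMMIT	"2020.07.05.19.28.37"

/*
 * Source dates use a fixed-width, zero-padded YYYY.MM.DD.hh.mm.ss
 * layout, so plain string comparison orders them chronologically.
 */
static bool
source_date_affected(const char *source_date)
{
	return strcmp(source_date, BREAKING_COMMIT) >= 0 &&
	    strcmp(source_date, FIX_COMMIT) < 0;
}
```

 Under these assumptions, of the three builds linked in this PR only
 2020.07.04.21.02.16 falls inside the window.  The builds from
 2020.07.06.07.36.14 and 2020.07.04.11.55.18 fall outside it, which is
 consistent with the testbed having booted a kernel other than the one
 under test.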

State-Changed-From-To: analyzed->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 08 Jul 2020 08:08:27 +0000
State-Changed-Why:
ciss is working again.


>Unformatted:

Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.