NetBSD Problem Report #55461
From gson@gson.org Sun Jul 5 19:14:15 2020
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 543B11A9213
for <gnats-bugs@gnats.NetBSD.org>; Sun, 5 Jul 2020 19:14:15 +0000 (UTC)
Message-Id: <20200705191409.4D355253EC8@guava.gson.org>
Date: Sun, 5 Jul 2020 22:14:09 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: ciss(4) no longer works
X-Send-Pr-Version: 3.95
>Number: 55461
>Category: kern
>Synopsis: ciss(4) no longer works
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: jdolecek
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Jul 05 19:15:00 +0000 2020
>Closed-Date: Wed Jul 08 08:08:27 +0000 2020
>Last-Modified: Wed Jul 08 08:08:27 +0000 2020
>Originator: Andreas Gustafsson
>Release: NetBSD-current, source date >= 2020.07.04.14.49.24
>Organization:
>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:
My amd64 bare metal testbed has been hanging at boot since this commit:
2020.07.04.14.49.24 jdolecek src/sys/dev/pci/ciss_pci.c 1.16
The last console messages printed before the hang are from the ciss
driver:
[ 1.1012041] pci0 at mainbus0 bus 0: configuration mode 1
[ 1.1012041] pchb0 at pci0 dev 0 function 0: Intel 5520 ESI Port (rev. 0x13)
[ 1.1012041] ppb0 at pci0 dev 1 function 0: Intel 5520/5500/X58 PCIe Root Port 1 (rev. 0x13)
[ 1.1012041] ppb0: PCI Express capability version 2 <Root Port of PCI-E Root Complex> x4 @ 5.0GT/s
[ 1.1012041] pci1 at ppb0 bus 5
[ 1.1012041] ciss0 at pci1 dev 0 function 0: HP Smart Array 11
[ 1.1012041] ciss0: interrupting at msix0 vec 0
[ 1.1012041] ciss0: 2 LDs, HW rev 2, FW 3.66/3.66, 64bit fifo rro
[...]
[ 61.8400703] ciss0: workqueue busy: updates stopped
[ 92.3400561] ciss0: unqueued ccb 0xffff8b025e3d6c00 ready, state=0x10
[ 152.4100278] ciss0: unqueued ccb 0xffff8b025e3d6a00 ready, state=0x10
This is running on an HP DL360 G7. Full console log:
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2020/2020.07.04.21.02.16/install.log
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: gson@NetBSD.org
Responsible-Changed-When: Sun, 05 Jul 2020 19:16:50 +0000
Responsible-Changed-Why:
Over to committer.
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/55461 CVS commit: src/sys/dev/pci
Date: Sun, 5 Jul 2020 19:28:37 +0000
Module Name: src
Committed By: jdolecek
Date: Sun Jul 5 19:28:37 UTC 2020
Modified Files:
src/sys/dev/pci: ciss_pci.c
Log Message:
there is more to MSI/MSI-X support in ciss(4) than just allocating the
right interrupt, it needs some explicit support; disable for now
until the full support is there
PR kern/55461
To generate a diff of this commit:
cvs rdiff -u -r1.16 -r1.17 src/sys/dev/pci/ciss_pci.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->analyzed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sun, 05 Jul 2020 19:32:38 +0000
State-Changed-Why:
MSI interrupts seem to be supported only in performant mode, according
to the FreeBSD driver, so performant mode must be implemented first. For now I've
reverted the change to ciss_pci.c, it now uses only INTx.
From: Andreas Gustafsson <gson@gson.org>
To: jdolecek@netbsd.org
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/55461 (ciss(4) no longer works)
Date: Tue, 7 Jul 2020 12:02:14 +0300
jdolecek@NetBSD.org wrote:
> For now I've reverted the change to ciss_pci.c, it now uses only INTx.
Despite the reversion, I'm still getting boot messages saying "ciss0:
interrupting at msix0 vec 0", and the boot still fails with the same
errors:
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2020/2020.07.06.07.36.14/install.log
This also happens when I try to boot a version of -current from before
the commit of ciss_pci.c 1.16:
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2020/2020.07.04.11.55.18/install.log
It appears that some kind of persistent setting has changed and is not
being reset by power cycling the machine. I also tried unplugging the
power cord for 15 minutes, but that didn't help, either.
Any clues?
--
Andreas Gustafsson, gson@gson.org
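The symptom described above, a boot log that still reports MSI-X after the reversion, can be checked mechanically. The following is a minimal sketch in C, assuming console lines in the format quoted in this report. The helper name `line_reports_ciss_msix()` is hypothetical and is not part of the testbed or of NetBSD.

```c
#include <string.h>

/*
 * Return 1 if a console line reports that ciss0 attached its interrupt
 * via MSI-X (e.g. "ciss0: interrupting at msix0 vec 0"), 0 otherwise.
 * Hypothetical helper for illustration; the marker string is taken from
 * the console output quoted in this report.
 */
static int
line_reports_ciss_msix(const char *line)
{
	static const char marker[] = "ciss0: interrupting at ";
	const char *p = strstr(line, marker);

	if (p == NULL)
		return 0;

	/* Skip past the marker and look at the interrupt source name. */
	p += sizeof(marker) - 1;
	return strncmp(p, "msix", 4) == 0;
}
```

Running each line of an install log through this function would show whether the booted kernel still attached ciss0 via MSI-X. It says nothing about why, for example whether the kernel being booted is the one that was meant to be tested.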
From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc:
Subject: Re: kern/55461 (ciss(4) no longer works)
Date: Tue, 7 Jul 2020 12:19:36 +0200
That is quite odd, since x86 pci_intr_map() has no provision
whatsoever to return MSI/MSI-X interrupts. This configuration is
entirely driven by the kernel, the device can't force the kernel to
use MSI-X instead of INTx. As such I have no idea how this is
possible.
I'll try to expedite support for performant mode anyway.
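To illustrate the point that the kernel side, not the device, decides which interrupt type is used, here is a minimal standalone sketch in C. It is a toy model, not NetBSD code: the enum, the capability bits, and `pick_intr_type()` are hypothetical names invented for this example, and the real driver goes through the kernel's PCI interrupt interfaces instead.

```c
/* Hypothetical interrupt types, ordered from least to most preferred. */
enum intr_type { INTR_INTX = 0, INTR_MSI = 1, INTR_MSIX = 2 };

/* Hypothetical capability bits a device might advertise. */
#define CAP_INTX (1u << INTR_INTX)
#define CAP_MSI  (1u << INTR_MSI)
#define CAP_MSIX (1u << INTR_MSIX)

/*
 * Pick the most preferred interrupt type that the device advertises
 * and that does not exceed the cap chosen by the driver.
 * Returns the chosen type, or -1 if no type is usable.
 */
static int
pick_intr_type(unsigned dev_caps, enum intr_type driver_max)
{
	for (int t = (int)driver_max; t >= (int)INTR_INTX; t--) {
		if (dev_caps & (1u << t))
			return t;
	}
	return -1;
}
```

In this model, a driver that caps the type at `INTR_INTX` gets INTx even when the device advertises MSI-X, which is the behaviour expected after the reversion.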
From: Andreas Gustafsson <gson@gson.org>
To: jdolecek@netbsd.org
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/55461 (ciss(4) no longer works)
Date: Tue, 7 Jul 2020 12:12:30 +0300
I wrote:
> It appears that some kind of persistent setting has changed and is not
> being reset by power cycling the machine. I also tried unplugging the
> power cord for 15 minutes, but that didn't help, either.
Never mind - examining the logs more closely shows that the kernel
version being booted does not match the release the testbed is trying
to test, so this must be a bug in the testbed. Sorry about the
confusion.
--
Andreas Gustafsson, gson@gson.org
State-Changed-From-To: analyzed->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 08 Jul 2020 08:08:27 +0000
State-Changed-Why:
ciss is working again.
>Unformatted: