NetBSD Problem Report #57404

From www@netbsd.org  Sat May 13 06:29:15 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 9869C1A923C
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 13 May 2023 06:29:15 +0000 (UTC)
Message-Id: <20230513062914.35BA71A923D@mollari.NetBSD.org>
Date: Sat, 13 May 2023 06:29:14 +0000 (UTC)
From: maxim@synrc.com
Reply-To: maxim@synrc.com
To: gnats-bugs@NetBSD.org
Subject: Can't see NVMe drives on ASUS Rampage VI mb in DIMM slot
X-Send-Pr-Version: www-1.0

>Number:         57404
>Category:       kern
>Synopsis:       Can't see NVMe drives on ASUS Rampage VI mb in DIMM slot
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat May 13 06:30:01 +0000 2023
>Last-Modified:  Tue May 23 20:10:02 +0000 2023
>Originator:     Namdak Tonpa
>Release:        9.3
>Organization:
Synrc Research Center
>Environment:
NetBSD localhost 9.3 NetBSD 9.3 (GENERIC) #0: Thu Aug  4 15:30:37 UTC 2022  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
[     1.042790] nvme3 at pci15 dev 0 function 0: vendor 15b7 product 5011 (rev. 0x01)
[     1.042790] nvme3: NVMe 1.4
[     1.042790] nvme3: for admin queue interrupting at msix11 vec 0
[     1.042790] nvme3: WDS100T1X0E-00AFY0, firmware 614600WD, serial 2136HR449906
[     1.042790] nvme3: autoconfiguration error: unable to establish nvme3 ioq1 interrupt
[     1.042790] nvme3: autoconfiguration error: unable to create io queue

[     1.042790] nvme4 at pci17 dev 0 function 0: vendor 144d product a808 (rev. 0x00)
[     1.042790] nvme4: NVMe 1.3
[     1.042790] nvme4: for admin queue interrupting at msix11 vec 0
[     1.042790] nvme4: Samsung SSD 970 EVO Plus 1TB, firmware 2B2QEXM7, serial S4EWNX0R946108Y
[     1.042790] nvme4: autoconfiguration error: unable to establish nvme4 ioq1 interrupt
[     1.042790] nvme4: autoconfiguration error: unable to create io queue

[     1.042790] nvme5 at pci18 dev 0 function 0: vendor 144d product a808 (rev. 0x00)
[     1.042790] nvme5: NVMe 1.3
[     1.042790] nvme5: for admin queue interrupting at msix11 vec 0
[     1.042790] nvme5: Samsung SSD 970 EVO Plus 1TB, firmware 2B2QEXM7, serial S4EWNX0R946133P
[     1.042790] nvme5: autoconfiguration error: unable to establish nvme5 ioq1 interrupt
[     1.042790] nvme5: autoconfiguration error: unable to create io queue
>How-To-Repeat:
You need the ASUS Rampage VI motherboard I can provide access to.
>Fix:
Not known.

>Audit-Trail:
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@netbsd.org, maxim@synrc.com
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: re: kern/57404: Can't see NVMe drives on ASUS Rampage VI mb in DIMM slot
Date: Sun, 14 May 2023 16:55:27 +1000

 > NetBSD localhost 9.3 NetBSD 9.3 (GENERIC) #0: Thu Aug  4 15:30:37 UTC 2022  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
 > >Description:
 > [     1.042790] nvme3 at pci15 dev 0 function 0: vendor 15b7 product 5011 (rev. 0x01)
 > [     1.042790] nvme3: NVMe 1.4
 > [     1.042790] nvme3: for admin queue interrupting at msix11 vec 0
 > [     1.042790] nvme3: WDS100T1X0E-00AFY0, firmware 614600WD, serial 2136HR449906
 > [     1.042790] nvme3: autoconfiguration error: unable to establish nvme3 ioq1 interrupt
 > [     1.042790] nvme3: autoconfiguration error: unable to create io queue
 >
 > [     1.042790] nvme4 at pci17 dev 0 function 0: vendor 144d product a808 (rev. 0x00)
 > [     1.042790] nvme4: NVMe 1.3
 > [     1.042790] nvme4: for admin queue interrupting at msix11 vec 0
 > [     1.042790] nvme4: Samsung SSD 970 EVO Plus 1TB, firmware 2B2QEXM7, serial S4EWNX0R946108Y
 > [     1.042790] nvme4: autoconfiguration error: unable to establish nvme4 ioq1 interrupt
 > [     1.042790] nvme4: autoconfiguration error: unable to create io queue
 >
 > [     1.042790] nvme5 at pci18 dev 0 function 0: vendor 144d product a808 (rev. 0x00)
 > [     1.042790] nvme5: NVMe 1.3
 > [     1.042790] nvme5: for admin queue interrupting at msix11 vec 0
 > [     1.042790] nvme5: Samsung SSD 970 EVO Plus 1TB, firmware 2B2QEXM7, serial S4EWNX0R946133P
 > [     1.042790] nvme5: autoconfiguration error: unable to establish nvme5 ioq1 interrupt
 > [     1.042790] nvme5: autoconfiguration error: unable to create io queue
 > >How-To-Repeat:
 > You need the ASUS Rampage VI motherboard I can provide access to.

 can you show the full dmesg?  or at least, the cpus, and all the
 nvme lines?

 there's a problem with many cpus and several nvme devices in netbsd-9
 that is partly solved in netbsd-10, but i'm not sure that 6 devices
 will work, nor that it's exactly the same problem, but it certainly
 fails to attach all the per-cpu interrupts due to running out.  one
 method to work around this would be to either turn on "force_intx" or
 turn off "mq" settings in the kernel (unfortunately, requires a
 kernel build or early ddb to modify these variables):

 sys/dev/pci/nvme_pci.c:67:int nvme_pci_force_intx = 0;
 sys/dev/pci/nvme_pci.c:69:int nvme_pci_mq = 1;          /* INTx: ioq=1, MSI/MSI-X: ioq=ncpu */


 .mrg.
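
As a rough illustration of why the workaround above could help, here is a minimal sketch of the queue arithmetic. It is a hypothetical model, not code from sys/dev/pci/nvme_pci.c: the function names are invented for illustration, and the only fact taken from the thread is the source comment "INTx: ioq=1, MSI/MSI-X: ioq=ncpu". The per-controller interrupt count (one for the admin queue plus one per I/O queue, or a single shared line with INTx) is an assumption, and the CPU count of the affected machine is not stated in this PR.

```c
#include <assert.h>

/*
 * Hypothetical model of the settings discussed above; these helpers do
 * not exist in the NetBSD kernel.
 */

/* Number of I/O queues one controller asks for under the given settings. */
static int
model_ioq_count(int force_intx, int mq, int ncpu)
{
	assert(ncpu >= 1);
	if (force_intx || !mq)
		return 1;	/* single I/O queue */
	return ncpu;		/* one I/O queue per CPU */
}

/*
 * Interrupts requested by ndev controllers.
 * Assumption: with MSI/MSI-X each controller wants one vector for the
 * admin queue plus one per I/O queue; with INTx it uses one shared line.
 */
static int
model_interrupts_wanted(int ndev, int force_intx, int mq, int ncpu)
{
	assert(ndev >= 0);
	if (force_intx)
		return ndev;
	return ndev * (1 + model_ioq_count(force_intx, mq, ncpu));
}
```

For example, with 6 controllers and an assumed 16 CPUs, model_interrupts_wanted(6, 0, 1, 16) gives 6 * (1 + 16) = 102 with the default settings, 12 with nvme_pci_mq set to 0, and 6 with nvme_pci_force_intx set to 1. Whether those numbers fit within what netbsd-9 can actually allocate on this board is not something the thread establishes.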

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/57404: Can't see NVMe drives on ASUS Rampage VI mb in DIMM slot
Date: Sun, 14 May 2023 09:13:11 -0000 (UTC)

 mrg@eterna.com.au (matthew green) writes:

 >> [     1.042790] nvme3: for admin queue interrupting at msix11 vec 0
 >> [     1.042790] nvme4: for admin queue interrupting at msix11 vec 0
 >> [     1.042790] nvme5: for admin queue interrupting at msix11 vec 0

 >there's a problem with many cpus and several nvme devices in netbsd-9
 >that is partly solved in netbsd-10, but i'm not sure that 6 devices
 >will work, nor that it's exactly the same problem, but it certainly
 >fails to attach all the per-cpu interrupts due to running out.  one
 >method to work around this would be to either turn on "force_intx" or
 >turn off "mq" settings in the kernel (unfortunately, requires a
 >kernel build or early ddb to modify these variables):

 That the devices all share the same msix (msix11 vec 0) is also confusing.

From: Namdak Tonpa <maxim@synrc.com>
To: matthew green <mrg@eterna.com.au>, "gnats-bugs@netbsd.org"
	<gnats-bugs@netbsd.org>
Cc: "kern-bug-people@netbsd.org" <kern-bug-people@netbsd.org>,
	"gnats-admin@netbsd.org" <gnats-admin@netbsd.org>, "netbsd-bugs@netbsd.org"
	<netbsd-bugs@netbsd.org>
Subject: RE: kern/57404: Can't see NVMe drives on ASUS Rampage VI mb in DIMM slot
Date: Tue, 23 May 2023 20:07:20 +0000

 Sure, here is my dmesg: https://gist.github.com/5HT/d00e14b9fbf73f3fcb332d101a64feb9


(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.