NetBSD Problem Report #52111

From kardel@gateway.kardel.name  Sat Mar 25 22:10:48 2017
Return-Path: <kardel@gateway.kardel.name>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 283D57A25E
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 25 Mar 2017 22:10:48 +0000 (UTC)
Message-Id: <20170325221043.DE012343CDE@gateway.kardel.name>
Date: Sat, 25 Mar 2017 22:10:43 +0000 (UTC)
From: kardel@netbsd.org
Reply-To: kardel@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: wmX i82574L inoperative in monoprocessor mode (i386)
X-Send-Pr-Version: 3.95

>Number:         52111
>Category:       kern
>Synopsis:       wmX i82574L inoperative in monoprocessor mode (i386)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Mar 25 22:15:00 +0000 2017
>Closed-Date:    Mon May 01 05:05:37 +0000 2017
>Last-Modified:  Mon May 01 05:10:00 +0000 2017
>Originator:     kardel@netbsd.org
>Release:        NetBSD 7.99.66-20170324
>Organization:

>Environment:


System: NetBSD Gateway 7.99.66 NetBSD 7.99.66 (GATEWAY) #1: Fri Mar 24 22:53:34 CET 2017 kardel@pip.kardel.name:/fs/raid2a/src/NetBSD/act/src/obj.i386/sys/arch/i386/compile/GATEWAY i386
Architecture: i386
Machine: i386
>Description:
When booting a Soekris 6501 in mono processor mode (boot -1), the wm interfaces are inoperative:
ARP resolution never completes.
On an amd64 system with other wm interfaces (i82572EI, i82583V) mono processor mode does not impair
interface operation.

Soekris interfaces:
wm0 at pci5 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: for TX and RX interrupting at msix0 vec 0 affinity to 0
wm0: for TX and RX interrupting at msix0 vec 1 affinity to 1
wm0: for LINK interrupting at msix0 vec 2
wm0: PCI-Express bus
wm0: 2048 words (8 address bits) SPI EEPROM, version 2.1.2, Image Unique ID 0000ffff
wm0: Ethernet address 00:00:aa:bb:cc:ec
wm1 at pci6 dev 0 function 0: Intel i82574L (rev. 0x00)
wm1: for TX and RX interrupting at msix1 vec 0 affinity to 0
wm1: for TX and RX interrupting at msix1 vec 1 affinity to 1
wm1: for LINK interrupting at msix1 vec 2
wm1: PCI-Express bus
wm1: 64 words (8 address bits) SPI EEPROM, version 2.1.2, Image Unique ID ffffed5e
wm1: Ethernet address 00:00:aa:bb:cc:ed
wm2 at pci10 dev 0 function 0: Intel i82574L (rev. 0x00)
wm2: for TX and RX interrupting at msix2 vec 0 affinity to 0
wm2: for TX and RX interrupting at msix2 vec 1 affinity to 1
wm2: for LINK interrupting at msix2 vec 2
wm2: PCI-Express bus
wm2: 64 words (8 address bits) SPI EEPROM, version 2.1.2, Image Unique ID ffffee5e
wm2: Ethernet address 00:00:aa:bb:cc:ee
wm3 at pci11 dev 0 function 0: Intel i82574L (rev. 0x00)
wm3: for TX and RX interrupting at msix3 vec 0 affinity to 0
wm3: for TX and RX interrupting at msix3 vec 1 affinity to 1
wm3: for LINK interrupting at msix3 vec 2
wm3: PCI-Express bus
wm3: 64 words (8 address bits) SPI EEPROM, version 2.1.2, Image Unique ID ffffef5e
wm3: Ethernet address 00:00:aa:bb:cc:ef

>How-To-Repeat:
	Boot a current (20170324) i386 version on a Soekris 6501 in mono processor
	mode (boot -1). Observe that no ARP resolution is done.
>Fix:
	Stay in multiprocessor mode :-( - makes debugging other defects much harder.

>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Sun, 26 Mar 2017 09:52:37 +0200

 On Sat, Mar 25, 2017 at 10:15:00PM +0000, kardel@netbsd.org wrote:
 > wm0: for LINK interrupting at msix0 vec 2

 On the other systems that do not show this issue, does the wm use msi?

 Martin

From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, kardel@netbsd.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Sun, 26 Mar 2017 10:27:08 +0200

 The other system
 machdep.dmi.bios-vendor = American Megatrends Inc.
 machdep.dmi.board-vendor = ASRock
 machdep.dmi.board-product = 990FX Extreme9
 uses ioapic for interrupts:

 wm0 at pci6 dev 0 function 0: Intel i82572EI 1000baseT Ethernet (rev. 0x06)
 wm0: interrupting at ioapic1 pin 23
 wm0: PCI-Express bus
 wm0: 2048 words (16 address bits) SPI EEPROM, version 5.11.8, Image Unique ID 0000ffff
 wm0: Ethernet address 00:1b:21:aa:9b:7c
 igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
 igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto

 wm1 at pci13 dev 0 function 0: Intel i82583V (rev. 0x00)
 wm1: interrupting at ioapic0 pin 18
 wm1: PCI-Express bus
 wm1: 2048 words FLASH, version 1.10.0, Image Unique ID 0000ffff
 wm1: Ethernet address bc:5f:f4:98:32:84
 makphy0 at wm1 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1
 makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto

 So we have differences in interrupt handling and chips, and one setup is
 sensitive to mono processor mode.

 Best regards,
    Frank


From: Martin Husemann <martin@duskware.de>
To: Frank Kardel <kardel@netbsd.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Sun, 26 Mar 2017 11:06:35 +0200

 On Sun, Mar 26, 2017 at 10:27:08AM +0200, Frank Kardel wrote:
 > So we have differences in interrupt handling and chips, and one setup is
 > sensitive to mono processor mode.

 This is a MD (x86) issue, but basically yes.

 Martin

From: Kengo NAKAHARA <k-nakahara@iij.ad.jp>
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Mon, 27 Mar 2017 11:27:17 +0900

 Hi,

 On 2017/03/26 7:15, kardel@netbsd.org wrote:
 > When booting a Soekris 6501 in mono processor mode (boot -1) wm interfaces are inoperative
 > as arp resolution does not complete at all.
 > On an amd64 system with other wm interfaces (i82572EI, i82583V) mono processor mode does not impair
 > interface operation.
 > 
 > Soekris interfaces:
 > wm0 at pci5 dev 0 function 0: Intel i82574L (rev. 0x00)
 > wm0: for TX and RX interrupting at msix0 vec 0 affinity to 0
 > wm0: for TX and RX interrupting at msix0 vec 1 affinity to 1
 > wm0: for LINK interrupting at msix0 vec 2
 > wm0: PCI-Express bus
 > wm0: 2048 words (8 address bits) SPI EEPROM, version 2.1.2, Image Unique ID 0000ffff
 > wm0: Ethernet address 00:00:aa:bb:cc:ec

 Hmm, it seems strange to me that wm(4) uses two TX and RX interrupts on a
 uniprocessor system. The number of TX and RX interrupts is limited by ncpu.
 In my reproduction environment, NetBSD/i386 (-current, updated today)
 on VMware ESXi with e1000e (82574), wm(4) uses only one TX and RX
 interrupt.

 Could you show the full dmesg?


 Thanks,

 -- 
 //////////////////////////////////////////////////////////////////////
 Internet Initiative Japan Inc.

 Device Engineering Section,
 IoT Platform Development Department,
 Network Division,
 Technology Unit

 Kengo NAKAHARA <k-nakahara@iij.ad.jp>

From: Martin Husemann <martin@duskware.de>
To: Kengo NAKAHARA <k-nakahara@iij.ad.jp>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Mon, 27 Mar 2017 10:42:33 +0200

 On Mon, Mar 27, 2017 at 11:27:17AM +0900, Kengo NAKAHARA wrote:
 > Hmm...., It is strange to me that wm(4) use two TX and RX interrupts on
 > uniprocessor system.

 He means using "boot -1" on a multiprocessor system.

 I don't know whether that makes ncpu be 1 on x86, or whether wm should
 use ncpuonline instead. The pool code seems to use

 	if (ncpu < 2 || !mp_online)

 instead; maybe we should provide a simple inline function and make it
 all the same?

 Martin
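
A minimal sketch of the kind of inline helper suggested above, not code from the NetBSD tree. The function name cpu_effectively_uniprocessor() is hypothetical, and ncpu and mp_online are declared here as plain stand-in variables so that the fragment compiles outside the kernel; in the kernel they would be provided by system headers.

```c
#include <stdbool.h>

/*
 * Stand-ins for the kernel globals discussed above (assumption: in the
 * kernel these come from system headers and are not defined locally).
 */
static int ncpu = 1;		/* number of CPUs attached */
static bool mp_online = false;	/* set once secondary CPUs may run */

/*
 * Hypothetical helper: true if the system should be treated as
 * uniprocessor. Uses the same condition as the pool code quoted above.
 */
static inline bool
cpu_effectively_uniprocessor(void)
{
	return ncpu < 2 || !mp_online;
}
```

Callers could then test this one predicate instead of each open-coding their own combination of ncpu, ncpuonline and mp_online.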

From: =?UTF-8?B?SmFyb23DrXIgRG9sZcSNZWs=?= <jaromir.dolecek@gmail.com>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
	Frank Kardel <kardel@netbsd.org>
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Mon, 27 Mar 2017 11:12:10 +0200

 Right, a function would be good. E.g. nvme(4) also uses ncpu, so it would
 fail the same way when the system is booted with 'boot -1'. But maybe the bug
 is that ncpu is set wrongly in this case? I mean, is there a way to
 activate the additional CPUs after the system is booted with -1?

 Jaromir


From: Kengo NAKAHARA <k-nakahara@iij.ad.jp>
To: martin@duskware.de
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Mon, 27 Mar 2017 18:20:20 +0900

 Hi,

 On 2017/03/27 17:42, Martin Husemann wrote:
 > On Mon, Mar 27, 2017 at 11:27:17AM +0900, Kengo NAKAHARA wrote:
 >> Hmm...., It is strange to me that wm(4) use two TX and RX interrupts on
 >> uniprocessor system.
 > 
 > He means using "boot -1" on a multiprocessor system.

 Yes. I also use "boot -1" on a multiprocessor system as my reproduction
 environment; however, my wm(4) uses one TX and RX interrupt.
 That is why I wondered. Sorry for not explaining enough.

 > I don't know if on x86 that makes ncpu be 1 or if wm should better use 
 > ncpuonline instead. The pool code seems to use
 > 
 > 	if (ncpu < 2 || !mp_online)
 > 
 > instead, maybe we should provide a simple inline function and make it
 > all the same?

 Thank you for your comments. Hmm, I think wm(4) should use ncpuonline
 rather than the "(ncpu < 2 || !mp_online)" check, to keep the modification
 small, as wm(4) already uses code such as "ncpu < nqueues". :)
 I created the patch below.

 ====================
 diff --git a/sys/dev/pci/if_wm.c b/sys/dev/pci/if_wm.c
 index f8f10dc3..89bf451 100644
 --- a/sys/dev/pci/if_wm.c
 +++ b/sys/dev/pci/if_wm.c
 @@ -332,7 +332,7 @@ struct wm_txqueue {
  	int txq_fifo_stall;		/* Tx FIFO is stalled */

  	/*
 -	 * When ncpu > number of Tx queues, a Tx queue is shared by multiple
 +	 * When ncpuonline > number of Tx queues, a Tx queue is shared by multiple
  	 * CPUs. This queue intermediate them without block.
  	 */
  	pcq_t *txq_interq;
 @@ -4520,7 +4520,7 @@ wm_init_rss(struct wm_softc *sc)
   * The numbers are affected by below parameters.
   *     - The nubmer of hardware queues
   *     - The number of MSI-X vectors (= "nvectors" argument)
 - *     - ncpu
 + *     - ncpuonline
   */
  static void
  wm_adjust_qnum(struct wm_softc *sc, int nvectors)
 @@ -4596,8 +4596,8 @@ wm_adjust_qnum(struct wm_softc *sc, int nvectors)
  	 * As queues more then cpus cannot improve scaling, we limit
  	 * the number of queues used actually.
  	 */
 -	if (ncpu < sc->sc_nqueues)
 -		sc->sc_nqueues = ncpu;
 +	if (ncpuonline < sc->sc_nqueues)
 +		sc->sc_nqueues = ncpuonline;
  }

  static inline bool
 @@ -4684,7 +4684,7 @@ wm_setup_msix(struct wm_softc *sc)
  	char intrbuf[PCI_INTRSTR_LEN];
  	char intr_xname[INTRDEVNAMEBUF];

 -	if (sc->sc_nqueues < ncpu) {
 +	if (sc->sc_nqueues < ncpuonline) {
  		/*
  		 * To avoid other devices' interrupts, the affinity of Tx/Rx
  		 * interrupts start from CPU#1.
 @@ -4714,7 +4714,7 @@ wm_setup_msix(struct wm_softc *sc)
  	txrx_established = 0;
  	for (qidx = 0; qidx < sc->sc_nqueues; qidx++) {
  		struct wm_queue *wmq = &sc->sc_queue[qidx];
 -		int affinity_to = (sc->sc_affinity_offset + intr_idx) % ncpu;
 +		int affinity_to = (sc->sc_affinity_offset + intr_idx) % ncpuonline;

  		intrstr = pci_intr_string(pc, sc->sc_intrs[intr_idx], intrbuf,
  		    sizeof(intrbuf));
 @@ -6639,7 +6639,7 @@ wm_select_txqueue(struct ifnet *ifp, struct mbuf *m)
  	 * TODO:
  	 * distribute by flowid(RSS has value).
  	 */
 -        return (cpuid + ncpu - sc->sc_affinity_offset) % sc->sc_nqueues;                                                          
 +        return (cpuid + ncpuonline - sc->sc_affinity_offset) % sc->sc_nqueues;
  }

  /*
 ====================

 However, I am not sure this patch will fix kardel@n.o's problem...


 Thanks,

 -- 
 Kengo NAKAHARA <k-nakahara@iij.ad.jp>
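
The arithmetic touched by the patch above can be sketched in isolation as follows. This is an illustration only: the three function names are hypothetical, and plain int parameters stand in for sc->sc_nqueues, sc->sc_affinity_offset and ncpuonline.

```c
/*
 * Clamp the number of queues to the number of online CPUs, following
 * the patched wm_adjust_qnum() hunk.
 */
static int
clamp_nqueues(int nqueues, int ncpuonline)
{
	return (ncpuonline < nqueues) ? ncpuonline : nqueues;
}

/*
 * CPU index a Tx/Rx vector is bound to, following the patched
 * wm_setup_msix() hunk. Assumes ncpuonline > 0.
 */
static int
vector_affinity(int affinity_offset, int intr_idx, int ncpuonline)
{
	return (affinity_offset + intr_idx) % ncpuonline;
}

/*
 * Tx queue chosen for the sending CPU, following the patched
 * wm_select_txqueue() hunk. Assumes nqueues > 0.
 */
static int
select_txqueue(int cpuid, int affinity_offset, int ncpuonline, int nqueues)
{
	return (cpuid + ncpuonline - affinity_offset) % nqueues;
}
```

With one online CPU, two hardware queues collapse to one and its vector is bound to CPU 0, which is consistent with a single "TX and RX ... affinity to 0" line in dmesg.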

From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Tue, 28 Mar 2017 10:34:59 +0200

 On 03/28/17 06:46, Kengo NAKAHARA wrote:
 > Hi,
 >
 > On 2017/03/27 18:25, Kengo NAKAHARA wrote:
 >>   On 2017/03/27 17:42, Martin Husemann wrote:
 >>   > On Mon, Mar 27, 2017 at 11:27:17AM +0900, Kengo NAKAHARA wrote:
 >>   >> Hmm...., It is strange to me that wm(4) use two TX and RX interrupts on
 >>   >> uniprocessor system.
 > kardel@n.o send dmesg to me with individually mail. The dmesg has two bootup
 > records, that is,
 >      - If "boot -1", wm(4) use one TX and RX interrupt
 >      - If boot normal, wm(4) use two TX and RX interrupt
 >
 > These are what I expected. I think it would be copy&paste miss that
 > two TX and RX interrupts when "boot -1".
 >
 >
 > And then, I try to reproduce this problem, but I cannot yet. I use
 > NetBSD-current(at 2017/03/28 GENERIC) i386 with 82574, ping(8) with
 > "ip address" works well in the environment. I think kardel@n.o use
 > custom kernel, so I try to reproduce with kernel enabled GATEWAY option.
 > however this problem is not reproduced either.
 >
 > kardel@n.o, could you tell your kernel configuration(if there is
 > modification from GENERIC other than GATEWAY), your machine
 > setting(e.g. additional sysctl operation), and detail reproduction
 > operation?
 >
 I just rebooted with GENERIC "boot netbsd.GENERIC-act -1".
 Same sad story, so we can avoid going through the differences.
 Looking with gdb into the live kernel I see
 (gdb) print hardclock_ticks
 $13 = 68486
 (gdb) print hardclock_ticks
 $14 = 68766
 The tick counter advances, so this is the live kernel (not the image).

 (gdb) print ncpus
 $19 = 0
 uniprocessor going by source comments,
 (gdb) print mp_online
 $18 = 1
 multiprocessor - unconditionally set in init_main.c:configure2 before 
 cpu_boot_secondary_processors()

 Building a true non-multiprocessor kernel via configuration didn't succeed.
 There are already MP assumptions in the x86 code (__HAVE_PREEMPTION).
 If you disable __HAVE_PREEMPTION you trip over
 /fs/raid2a/src/NetBSD/act/src/sys/arch/x86/x86/cpu.c:940:38: error: 'mp_trampoline_paddr' undeclared (first use in this function)
    pmap_kenter_pa(mp_trampoline_vaddr, mp_trampoline_paddr,
 Then I stopped trying to build a true non-MP kernel.

 The wm configuration for the GENERIC kernel is:
 wm0 at pci5 dev 0 function 0: Intel i82574L (rev. 0x00)
 wm0: for TX and RX interrupting at msix0 vec 0 affinity to 0
 wm0: for LINK interrupting at msix0 vec 1
 wm0: PCI-Express bus
 wm0: 2048 words (8 address bits) SPI EEPROM, version 2.1.2, Image Unique ID 0000ffff
 wm0: Ethernet address 00:00:xx:y:zz:ec

 wm interrupts don't work in GENERIC either. ARP is not resolved.

 Looking at tcpdump I only see transmitted packets, no received packets
 (not even broadcasts), which supports the assumption that the receive
 path is broken.

 sysctl variables look unsuspicious:
 #kern.timecounter.hardware=TSC
 kern.timecounter.timestepwarnings=1
 net.inet.ip.forwarding=1
 net.inet.ip.random_id=1
 net.inet6.tcp6.log_refused=1
 net.inet.tcp.log_refused=1
 #kern.mbuf.nmbclusters=128000
 kern.maxfiles=20000
 kern.maxproc=2048
 net.inet.tcp.sendspace=128000
 security.models.bsd44.curtain=1
 hw.cnmagic=ESC^B
 kern.logsigexit=1

 what else should I look at?

From: Kengo NAKAHARA <k-nakahara@iij.ad.jp>
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org, kardel@netbsd.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Tue, 28 Mar 2017 18:57:16 +0900

 Hi,

 Thank you very much for the detailed report!

 On 2017/03/28 17:40, Frank Kardel wrote:
 > From: Frank Kardel <kardel@netbsd.org>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
 > Date: Tue, 28 Mar 2017 10:34:59 +0200
 >  I just rebooted with GENERIC "boot netbsd.GENERIC-act -1".
 >  Same sad story. So we can avoid going throught the differences.

 Hmm, I see.

 >  looking with gdb into the live kernel I see
 >  (gdb) print hardclock_ticks
 >  $13 = 68486
 >  (gdb) print hardclock_ticks
 >  $14 = 68766
 >  it is the right one (not the image)
 >  
 >  (gdb) print ncpus
 >  $19 = 0

 Is that a typo for "ncpu"?

 >  wm interrupts don't work in GENERIC either. ARP is not resolved.

 Oh, that's an important fact. If wm(4) cannot raise interrupts, the
 problem may not be wm(4) specific. With "boot -1", can the other devices
 raise interrupts?
 # The "intrctl list" command can help to get that information.

 If so, the patch below may help to use the 82574 with "boot -1".
 ====================
 diff --git a/sys/dev/pci/if_wm.c b/sys/dev/pci/if_wm.c
 index f8f10dc3..310ca7f 100644
 --- a/sys/dev/pci/if_wm.c
 +++ b/sys/dev/pci/if_wm.c
 @@ -1828,6 +1828,12 @@ wm_attach(device_t parent, device_t self, void *aux)
  	counts[PCI_INTR_TYPE_MSI] = 1;
  	counts[PCI_INTR_TYPE_INTX] = 1;

 +	/* XXX 82574 and "boot -1" workaround */
 +	if (sc->sc_type == WM_T_82574 && ncpu == 1) {
 +		max_type = PCI_INTR_TYPE_INTX;
 +		counts[PCI_INTR_TYPE_INTX] = 1;
 +	}
 +
  alloc_retry:
  	if (pci_intr_alloc(pa, &sc->sc_intrs, counts, max_type) != 0) {
  		aprint_error_dev(sc->sc_dev, "failed to allocate interrupt\n");
 ====================
 # This patch forces the 82574 to use INTx, like the other devices, instead of MSI/MSI-X.


 Thanks,

 -- 
 Kengo NAKAHARA <k-nakahara@iij.ad.jp>
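
The decision made by the workaround above can be sketched as a small pure function. The enum and the function name are hypothetical stand-ins (the kernel uses the PCI_INTR_TYPE_* constants and sc->sc_type), so this only illustrates the selection logic, not the driver code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's PCI_INTR_TYPE_* constants. */
enum intr_type { INTR_INTX, INTR_MSI, INTR_MSIX };

/*
 * Highest interrupt type the driver should request. Mirrors the
 * workaround: an 82574 on a single-CPU boot is restricted to INTx;
 * every other combination keeps the preferred type.
 */
static enum intr_type
max_intr_type(bool is_82574, int ncpu, enum intr_type preferred)
{
	if (is_82574 && ncpu == 1)
		return INTR_INTX;
	return preferred;
}
```

For example, max_intr_type(true, 1, INTR_MSIX) yields INTR_INTX, while the same chip with two CPUs keeps INTR_MSIX.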

From: Frank Kardel <kardel@kardel.name>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Tue, 28 Mar 2017 13:16:06 +0200

 Yes, ncpus is a typo, but the symbol exists anyway :-).

 Further results:
 intrctl list, MP case (wm interfaces work):
 Gateway# intrctl list
 interrupt id     CPU0  CPU1  device name(s)
 ioapic0 pin 19   4602*    0  unknown, unknown, unknown, ehci0, unknown, unknown, unknown, unknown
 ioapic0 pin 18      0*    0  unknown, unknown
 ioapic0 pin 17 108394*    0  ahcisata0
 ioapic0 pin 16      0*    0  unknown, unknown, unknown, ehci1
 msix0 vec 0     12822*    0  wm0TXRX0
 msix0 vec 1         0  9517* wm0TXRX1
 msix0 vec 2         0*    0  wm0LINK
 msix1 vec 0       827*    0  wm1TXRX0
 msix1 vec 1         0   568* wm1TXRX1
 msix1 vec 2         1*    0  wm1LINK
 msix2 vec 0       559*    0  wm2TXRX0
 msix2 vec 1         0   103* wm2TXRX1
 msix2 vec 2         1*    0  wm2LINK
 msix3 vec 0      4045*    0  wm3TXRX0
 msix3 vec 1         0  2178* wm3TXRX1
 msix3 vec 2         2*    0  wm3LINK
 ioapic0 pin 4    1459*    0  com0

 intrctl list, boot -1 case (wm interfaces don't work):
 interrupt id  CPU0  device name(s)
 pic0 pin 9    2337* unknown, unknown, unknown, ehci0, unknown, unknown, unknown, unknown
 pic0 pin 5       0* unknown, unknown
 pic0 pin 11  11421* ahcisata0
 pic0 pin 10      0* unknown, unknown, unknown, ehci1
 msix0 vec 0      0* wm0TXRX0
 msix0 vec 1      0* wm0LINK
 msix1 vec 0      0* wm1TXRX0
 msix1 vec 1      0* wm1LINK
 msix2 vec 0      0* wm2TXRX0
 msix2 vec 1      0* wm2LINK
 msix3 vec 0      0* wm3TXRX0
 msix3 vec 1      0* wm3LINK
 pic0 pin 4    3112* com0
 pic0 pin 0   22874* unknown

 boot -1 with your patch - interfaces work, intrctl list causes panic: trap:
 #0  0xc011423e in maybe_dump (howto=260) at 
 /fs/raid2a/src/NetBSD/act/src/sys/arch/i386/i386/machdep.c:708
 #1  cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at 
 /fs/raid2a/src/NetBSD/act/src/sys/arch/i386/i386/machdep.c:729
 #2  0xc09353c0 in vpanic (fmt=fmt@entry=0xc0cd7a92 "trap", 
 ap=ap@entry=0xdc3fbc28) at 
 /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_prf.c:342
 #3  0xc093544a in panic (fmt=fmt@entry=0xc0cd7a92 "trap") at 
 /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_prf.c:258
 #4  0xc0116a92 in trap (frame=0xdc3fbca0) at 
 /fs/raid2a/src/NetBSD/act/src/sys/arch/i386/i386/trap.c:325
 #5  0xc010d4af in alltraps ()
 #6  0xdc3fbca0 in ?? ()
 #7  0xc0145f71 in interrupt_construct_intrids (cpuset=0xc3421ec8) at 
 /fs/raid2a/src/NetBSD/act/src/sys/arch/x86/x86/intr.c:2199
 #8  0xc0926c1f in interrupt_intrio_list_size () at 
 /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_interrupt.c:195
 #9  interrupt_intrio_list (il=il@entry=0x0, length=length@entry=0) at 
 /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_interrupt.c:220
 #10 0xc09274b2 in interrupt_intrio_list_sysctl (name=0xdc3fbf0c, 
 namelen=0, oldp=0x0, oldlenp=0xdc3fbefc, newp=0x0, newlen=0, 
 oname=0xdc3fbf00, l=0xc3cd27e0, rnode=0xc343480c)
      at /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_interrupt.c:312
 #11 0xc090fbcc in sysctl_dispatch (name=name@entry=0xdc3fbf00, 
 namelen=3, oldp=0x0, oldlenp=oldlenp@entry=0xdc3fbefc, newp=0x0, 
 newlen=0, oname=oname@entry=0xdc3fbf00, l=l@entry=0xc3cd27e0, 
 rnode=0xc343480c, rnode@entry=0x0)
      at /fs/raid2a/src/NetBSD/act/src/sys/kern/kern_sysctl.c:454
 #12 0xc090fe19 in sys___sysctl (l=0xc3cd27e0, uap=0xdc3fbf68, 
 retval=0xdc3fbf60) at 
 /fs/raid2a/src/NetBSD/act/src/sys/kern/kern_sysctl.c:310
 #13 0xc015059b in sy_call (rval=0xdc3fbf60, uap=0xdc3fbf68, 
 l=0xc3cd27e0, sy=0xc0f5f348 <sysent+4040>) at 
 /fs/raid2a/src/NetBSD/act/src/sys/sys/syscallvar.h:65
 #14 sy_invoke (code=202, rval=0xdc3fbf60, uap=0xdc3fbf68, l=0xc3cd27e0, 
 sy=0xc0f5f348 <sysent+4040>) at 
 /fs/raid2a/src/NetBSD/act/src/sys/sys/syscallvar.h:94
 #15 syscall (frame=0xdc3fbfa8) at 
 /fs/raid2a/src/NetBSD/act/src/sys/arch/x86/x86/syscall.c:156
 #16 0xc01006b6 in Xsyscall ()
 #17 0xdc3fbfa8 in ?? ()

 So the good news is: boot -1 gives working wm interfaces with the
 workaround. intrctl list is a panic trigger, though.


From: Kengo NAKAHARA <k-nakahara@iij.ad.jp>
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org, kardel@netbsd.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Wed, 29 Mar 2017 12:58:18 +0900

 Hi,

 On 2017/03/28 20:20, Frank Kardel wrote:
 > From: Frank Kardel <kardel@kardel.name>
 >  Further results
 >  intrctl list MP case (wm interface work):
 >  Gateway# intrctl list
 >  interrupt id     CPU0  CPU1  device name(s)
 >  ioapic0 pin 19   4602*    0  unknown, unknown, unknown, ehci0, unknown, 
 >  unknown, unknown, unknown
 >  ioapic0 pin 18      0*    0  unknown, unknown
 >  ioapic0 pin 17 108394*    0  ahcisata0
 >  ioapic0 pin 16      0*    0  unknown, unknown, unknown, ehci1
 >  msix0 vec 0     12822*    0  wm0TXRX0
 >  msix0 vec 1         0  9517* wm0TXRX1
 >  msix0 vec 2         0*    0  wm0LINK
 >  msix1 vec 0       827*    0  wm1TXRX0
 >  msix1 vec 1         0   568* wm1TXRX1
 >  msix1 vec 2         1*    0  wm1LINK
 >  msix2 vec 0       559*    0  wm2TXRX0
 >  msix2 vec 1         0   103* wm2TXRX1
 >  msix2 vec 2         1*    0  wm2LINK
 >  msix3 vec 0      4045*    0  wm3TXRX0
 >  msix3 vec 1         0  2178* wm3TXRX1
 >  msix3 vec 2         2*    0  wm3LINK
 >  ioapic0 pin 4    1459*    0  com0
 >  
 >  intrctl list in boot -1 case (wm interface don't work)
 >  interrupt id  CPU0  device name(s)
 >  pic0 pin 9    2337* unknown, unknown, unknown, ehci0, unknown, unknown, 
 >  unknown, unknown
 >  pic0 pin 5       0* unknown, unknown
 >  pic0 pin 11  11421* ahcisata0
 >  pic0 pin 10      0* unknown, unknown, unknown, ehci1
 >  msix0 vec 0      0* wm0TXRX0
 >  msix0 vec 1      0* wm0LINK
 >  msix1 vec 0      0* wm1TXRX0
 >  msix1 vec 1      0* wm1LINK
 >  msix2 vec 0      0* wm2TXRX0
 >  msix2 vec 1      0* wm2LINK
 >  msix3 vec 0      0* wm3TXRX0
 >  msix3 vec 1      0* wm3LINK
 >  pic0 pin 4    3112* com0
 >  pic0 pin 0   22874* unknown
 >  
 >  boot -1 with your patch - interfaces work, intrctl list causes panic: trap:

 Hmm, I think the reason for this panic may be that wm(4) now uses a
 shared IRQ on the legacy PIC. When booted normally, "ahcisata0"'s
 interrupt id is "ioapic0 pin 17". In contrast, with boot -1,
 "ahcisata0"'s interrupt id is "pic0 pin 11", which means the legacy PIC.
 For comparison, my reproduction environment does not use the legacy
 PIC even with boot -1. This difference seems to be the reason why the
 problem reproduces on your machine but not on mine.

 >  So the good news is - boot -1 gives working wm interfaces with the 
 >  workaround. intrctl list is a panic trigger though,

 Oh, I am glad to hear the news. :)
 I'm sorry to say this, but please don't use intrctl list with boot -1 for now.
 I will investigate the intrctl list panic in more detail.


 Thanks,

 -- 
 Kengo NAKAHARA <k-nakahara@iij.ad.jp>

From: Frank Kardel <kardel@netbsd.org>
To: Kengo NAKAHARA <k-nakahara@iij.ad.jp>, gnats-bugs@NetBSD.org, 
 kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
 netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Wed, 29 Mar 2017 08:02:05 +0200

 As for the boot -1 case, the panic of "intrctl list" occurred in:

 sys/arch/x86/x86/intr.c:

 static bool
 intr_is_affinity_intrsource(struct intrsource *isp, const kcpuset_t *cpuset)
 {
          struct cpu_info *ci;

          KASSERT(mutex_owned(&cpu_lock));

  >>>       ci = isp->is_handlers->ih_cpu;
          KASSERT(ci != NULL);

          return kcpuset_isset(cpuset, cpu_index(ci));
 }

 due to is_handlers == NULL. The whole struct intrsource seems to be
 zeroed except for is_intrid, but it is correctly linked.

 Value of the intrsource is:
 (gdb) print *isp
 $2 = {is_maxlevel = 0, is_pin = 0, is_handlers = 0x0, is_pic = 0x0, 
 is_recurse = 0x0, is_resume = 0x0, is_lwp = 0x0, is_evcnt = {ev_count = 
 0, ev_list = {tqe_next = 0x0, tqe_prev = 0x0}, ev_type = 0 '\000', 
 ev_grouplen = 0 '\000',
      ev_namelen = 0 '\000', ev_pad1 = 0 '\000', ev_parent = 0x0, 
 ev_group = 0x0, ev_name = 0x0}, is_flags = 0, is_type = 0, is_idtvec = 
 0, is_minlevel = 0, is_evname = '\000' <repeats 31 times>,
    is_intrid = "irq 9", '\000' <repeats 58 times>, is_xname = '\000' 
 <repeats 255 times>, is_active_cpu = 0, is_saved_evcnt = 0xc3423ba8, 
 is_list = {sqe_next = 0xc3a40e08}}
 (gdb)

 Frank
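
One possible defensive variant of the function shown above, as a sketch only. It is not claimed to be the change that was eventually made in the tree. The structures are reduced stand-ins holding only the fields used here, the ci_index field and the function name intrsource_on_cpu() are hypothetical, and an unsigned long bitmask stands in for kcpuset_t.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel structures (assumption). */
struct cpu_info { unsigned int ci_index; };
struct intrhand { struct cpu_info *ih_cpu; };
struct intrsource { struct intrhand *is_handlers; };

/*
 * A source with no handler established (is_handlers == NULL, as in the
 * gdb dump above) is reported as not bound to any queried CPU instead
 * of being dereferenced. Assumes ci_index is smaller than the number
 * of bits in unsigned long.
 */
static bool
intrsource_on_cpu(const struct intrsource *isp, unsigned long cpu_mask)
{
	if (isp->is_handlers == NULL || isp->is_handlers->ih_cpu == NULL)
		return false;
	return ((cpu_mask >> isp->is_handlers->ih_cpu->ci_index) & 1UL) != 0;
}
```

Such a guard would only avoid the NULL dereference; it would not explain why a zeroed intrsource named "irq 9" is on the list in the first place.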

 On 03/29/17 05:58, Kengo NAKAHARA wrote:
 > Hi,
 >
 > On 2017/03/28 20:20, Frank Kardel wrote:
 >> From: Frank Kardel <kardel@kardel.name>
 >>   Further results
 >>   intrctl list MP case (wm interface work):
 >>   Gateway# intrctl list
 >>   interrupt id     CPU0  CPU1  device name(s)
 >>   ioapic0 pin 19   4602*    0  unknown, unknown, unknown, ehci0, unknown,
 >>   unknown, unknown, unknown
 >>   ioapic0 pin 18      0*    0  unknown, unknown
 >>   ioapic0 pin 17 108394*    0  ahcisata0
 >>   ioapic0 pin 16      0*    0  unknown, unknown, unknown, ehci1
 >>   msix0 vec 0     12822*    0  wm0TXRX0
 >>   msix0 vec 1         0  9517* wm0TXRX1
 >>   msix0 vec 2         0*    0  wm0LINK
 >>   msix1 vec 0       827*    0  wm1TXRX0
 >>   msix1 vec 1         0   568* wm1TXRX1
 >>   msix1 vec 2         1*    0  wm1LINK
 >>   msix2 vec 0       559*    0  wm2TXRX0
 >>   msix2 vec 1         0   103* wm2TXRX1
 >>   msix2 vec 2         1*    0  wm2LINK
 >>   msix3 vec 0      4045*    0  wm3TXRX0
 >>   msix3 vec 1         0  2178* wm3TXRX1
 >>   msix3 vec 2         2*    0  wm3LINK
 >>   ioapic0 pin 4    1459*    0  com0
 >>   
 >>   intrctl list in boot -1 case (wm interface don't work)
 >>   interrupt id  CPU0  device name(s)
 >>   pic0 pin 9    2337* unknown, unknown, unknown, ehci0, unknown, unknown,
 >>   unknown, unknown
 >>   pic0 pin 5       0* unknown, unknown
 >>   pic0 pin 11  11421* ahcisata0
 >>   pic0 pin 10      0* unknown, unknown, unknown, ehci1
 >>   msix0 vec 0      0* wm0TXRX0
 >>   msix0 vec 1      0* wm0LINK
 >>   msix1 vec 0      0* wm1TXRX0
 >>   msix1 vec 1      0* wm1LINK
 >>   msix2 vec 0      0* wm2TXRX0
 >>   msix2 vec 1      0* wm2LINK
 >>   msix3 vec 0      0* wm3TXRX0
 >>   msix3 vec 1      0* wm3LINK
 >>   pic0 pin 4    3112* com0
 >>   pic0 pin 0   22874* unknown
 >>   
 >>   boot -1 with your patch - interfaces work, intrctl list causes panic: trap:
 >>   #0  0xc011423e in maybe_dump (howto=260) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/arch/i386/i386/machdep.c:708
 >>   #1  cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/arch/i386/i386/machdep.c:729
 >>   #2  0xc09353c0 in vpanic (fmt=fmt@entry=0xc0cd7a92 "trap",
 >>   ap=ap@entry=0xdc3fbc28 <non-printable bytes>) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_prf.c:342
 >>   #3  0xc093544a in panic (fmt=fmt@entry=0xc0cd7a92 "trap") at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_prf.c:258
 >>   #4  0xc0116a92 in trap (frame=0xdc3fbca0) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/arch/i386/i386/trap.c:325
 >>   #5  0xc010d4af in alltraps ()
 >>   #6  0xdc3fbca0 in ?? ()
 >>   #7  0xc0145f71 in interrupt_construct_intrids (cpuset=0xc3421ec8) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/arch/x86/x86/intr.c:2199
 >>   #8  0xc0926c1f in interrupt_intrio_list_size () at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_interrupt.c:195
 >>   #9  interrupt_intrio_list (il=il@entry=0x0, length=length@entry=0) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_interrupt.c:220
 >>   #10 0xc09274b2 in interrupt_intrio_list_sysctl (name=0xdc3fbf0c,
 >>   namelen=0, oldp=0x0, oldlenp=0xdc3fbefc, newp=0x0, newlen=0,
 >>   oname=0xdc3fbf00, l=0xc3cd27e0, rnode=0xc343480c)
 >>        at /fs/raid2a/src/NetBSD/act/src/sys/kern/subr_interrupt.c:312
 >>   #11 0xc090fbcc in sysctl_dispatch (name=name@entry=0xdc3fbf00,
 >>   namelen=3, oldp=0x0, oldlenp=oldlenp@entry=0xdc3fbefc, newp=0x0,
 >>   newlen=0, oname=oname@entry=0xdc3fbf00, l=l@entry=0xc3cd27e0,
 >>   rnode=0xc343480c, rnode@entry=0x0)
 >>        at /fs/raid2a/src/NetBSD/act/src/sys/kern/kern_sysctl.c:454
 >>   #12 0xc090fe19 in sys___sysctl (l=0xc3cd27e0, uap=0xdc3fbf68,
 >>   retval=0xdc3fbf60) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/kern/kern_sysctl.c:310
 >>   #13 0xc015059b in sy_call (rval=0xdc3fbf60, uap=0xdc3fbf68,
 >>   l=0xc3cd27e0, sy=0xc0f5f348 <sysent+4040>) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/sys/syscallvar.h:65
 >>   #14 sy_invoke (code=202, rval=0xdc3fbf60, uap=0xdc3fbf68, l=0xc3cd27e0,
 >>   sy=0xc0f5f348 <sysent+4040>) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/sys/syscallvar.h:94
 >>   #15 syscall (frame=0xdc3fbfa8) at
 >>   /fs/raid2a/src/NetBSD/act/src/sys/arch/x86/x86/syscall.c:156
 >>   #16 0xc01006b6 in Xsyscall ()
 >>   #17 0xdc3fbfa8 in ?? ()
 > Hmm..., I think the reason for this panic may be that wm(4) uses a
 > shared IRQ of the legacy PIC. When booting normally, "ahcisata0"'s
 > interrupt id is "ioapic0 pin 17". In contrast, with boot -1,
 > "ahcisata0"'s interrupt id is "pic0 pin 11", which means the legacy PIC.
 > For comparison, my reproduction environment does not use the legacy
 > PIC even with boot -1. This difference seems to be why the problem
 > reproduces in your environment but not in mine.
 >
 >>   So the good news is - boot -1 gives working wm interfaces with the
 >>   workaround. intrctl list is a panic trigger though,
 > Oh, I am glad to hear the news. :)
 > I'm sorry, but please don't use intrctl list after boot -1 for now.
 > I will investigate the intrctl list panic in more detail.
 >
 >
 > Thanks,
 >

From: Kengo NAKAHARA <k-nakahara@iij.ad.jp>
To: kardel@netbsd.org, gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
        gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Wed, 29 Mar 2017 16:31:30 +0900

 Hi,

 On 2017/03/29 15:02, Frank Kardel wrote:
 > As for the panic in the boot -1 case the
 > panic of "intrctl list" occurred in:
 > 
 > sys/arch/x86/x86/intr.c:
 > 
 > static bool
 > intr_is_affinity_intrsource(struct intrsource *isp, const kcpuset_t 
 > *cpuset)
 > {
 >          struct cpu_info *ci;
 > 
 >          KASSERT(mutex_owned(&cpu_lock));
 > 
 >  >>>       ci = isp->is_handlers->ih_cpu;
 >          KASSERT(ci != NULL);
 > 
 >          return kcpuset_isset(cpuset, cpu_index(ci));
 > }
 > 
 > due to is_handlers == NULL. The whole struct intrsource seems to be 
 > zeroed except for is_intrid, but
 > correctly linked.
 > 
 > Value of the intrsource is:
 > (gdb) print *isp
 > $2 = {is_maxlevel = 0, is_pin = 0, is_handlers = 0x0, is_pic = 0x0, 
 > is_recurse = 0x0, is_resume = 0x0, is_lwp = 0x0, is_evcnt = {ev_count = 
 > 0, ev_list = {tqe_next = 0x0, tqe_prev = 0x0}, ev_type = 0 '\000', 
 > ev_grouplen = 0 '\000',
 >      ev_namelen = 0 '\000', ev_pad1 = 0 '\000', ev_parent = 0x0, 
 > ev_group = 0x0, ev_name = 0x0}, is_flags = 0, is_type = 0, is_idtvec = 
 > 0, is_minlevel = 0, is_evname = '\000' <repeats 31 times>,
 >    is_intrid = "irq 9", '\000' <repeats 58 times>, is_xname = '\000' 
 > <repeats 255 times>, is_active_cpu = 0, is_saved_evcnt = 0xc3423ba8, 
 > is_list = {sqe_next = 0xc3a40e08}}
 > (gdb)

 Thank you for your debugging!

 struct intrsource must be initialized in intr_establish(), but this
 *isp was not. The *isp whose is_intrid is "irq 9" appears to belong to
 wm(4), because there is no such device in the boot -1 "intrctl list"
 output in your last mail. So I think my workaround patch does not
 fully fit your "boot -1" environment.
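
 The failure mode described above (a linked but otherwise zeroed
 struct intrsource whose is_handlers is NULL) can be illustrated with a
 minimal user-space sketch. The types below are simplified stand-ins
 that keep only the fields involved, the CPU set is reduced to a plain
 bit mask, and is_affinity_intrsource_guarded() is a hypothetical name.
 This is not the kernel code and not the fix that was later committed;
 it only shows how a NULL check would avoid the trap seen in
 intr_is_affinity_intrsource().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the fields used here. */
struct cpu_info {
	unsigned int ci_index;		/* stand-in for cpu_index(ci) */
};

struct intrhand {
	struct cpu_info *ih_cpu;	/* CPU the handler is bound to */
};

struct intrsource {
	struct intrhand *is_handlers;	/* NULL if nothing was established */
	char is_intrid[64];		/* e.g. "irq 9" */
};

/*
 * Hypothetical guarded variant: a source without any handler is treated
 * as bound to no CPU instead of being dereferenced.
 * cpu_mask is an assumption standing in for kcpuset_t (bit n = CPU n).
 */
static bool
is_affinity_intrsource_guarded(const struct intrsource *isp,
    unsigned int cpu_mask)
{
	assert(isp != NULL);

	if (isp->is_handlers == NULL || isp->is_handlers->ih_cpu == NULL)
		return false;

	return (cpu_mask & (1u << isp->is_handlers->ih_cpu->ci_index)) != 0;
}
```

 With a zeroed source such as the "irq 9" entry shown in the gdb output,
 the guarded variant returns false for every CPU mask rather than
 faulting.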

 I will implement a proper fix, but it will take a while.

 Could you keep using the workaround code and avoid "intrctl list"?
 Please use "vmstat -e" instead of "intrctl list" if you want to see
 interrupt counts, sorry.


 Thanks,

 -- 
 //////////////////////////////////////////////////////////////////////
 Internet Initiative Japan Inc.

 Device Engineering Section,
 IoT Platform Development Department,
 Network Division,
 Technology Unit

 Kengo NAKAHARA <k-nakahara@iij.ad.jp>

From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/52111: wmX i82574L inoperative in monoprocessor mode (i386)
Date: Wed, 29 Mar 2017 09:46:10 +0200

 Fine - I'll be careful :-)


From: "Kengo NAKAHARA" <knakahara@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52111 CVS commit: src/sys/arch/x86/pci
Date: Fri, 14 Apr 2017 09:34:46 +0000

 Module Name:	src
 Committed By:	knakahara
 Date:		Fri Apr 14 09:34:46 UTC 2017

 Modified Files:
 	src/sys/arch/x86/pci: pci_msi_machdep.c

 Log Message:
 disable msi/msix when the system doesn't detect ioapic. This would fix PR kern/52111.

 Some systems do not detect an ioapic when booted with "boot -1", with
 ACPI disabled, and so on. In such cases, msi/msix doesn't work, so
 disable them.

 This patch was implemented by nonaka@n.o; I am just committing it by proxy, thanks.
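
 The policy in the log message above can be sketched as follows. This
 is a minimal user-space illustration, not the contents of
 pci_msi_machdep.c r1.11: struct platform, msi_usable() and
 choose_intr_type() are hypothetical names, and the only assumption
 taken from the log is that MSI/MSI-X should not be used when no ioapic
 was detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Interrupt delivery methods a driver could end up with. */
enum intr_type {
	INTR_TYPE_INTX,		/* legacy pin-based interrupt */
	INTR_TYPE_MSI,
	INTR_TYPE_MSIX
};

/* Hypothetical platform state: number of I/O APICs found at boot. */
struct platform {
	int n_ioapics;
};

/* MSI/MSI-X is considered usable only if an ioapic was detected. */
static bool
msi_usable(const struct platform *p)
{
	return p->n_ioapics > 0;
}

/*
 * Pick the best method the device supports, falling back to INTx when
 * the platform cannot deliver MSI/MSI-X.
 */
static enum intr_type
choose_intr_type(const struct platform *p, bool dev_has_msix,
    bool dev_has_msi)
{
	assert(p != NULL);

	if (msi_usable(p)) {
		if (dev_has_msix)
			return INTR_TYPE_MSIX;
		if (dev_has_msi)
			return INTR_TYPE_MSI;
	}
	return INTR_TYPE_INTX;
}
```

 Under this sketch, an MSI-X capable device on a platform with
 n_ioapics == 0 (the "boot -1" situation in this PR) gets
 INTR_TYPE_INTX, while the same device on a platform with an ioapic
 gets INTR_TYPE_MSIX.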


 To generate a diff of this commit:
 cvs rdiff -u -r1.10 -r1.11 src/sys/arch/x86/pci/pci_msi_machdep.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: PR/52111 CVS commit: src/sys/arch/x86/pci
Date: Sat, 29 Apr 2017 15:54:47 +0200

 I just tested the change and 'boot -1' in combination with wm interfaces now
 works as expected without additional tweaks to if_wm.c.

 Thanks!

 PR/52111 can now be closed.

 Best regards,
    Frank


State-Changed-From-To: open->closed
State-Changed-By: knakahara@NetBSD.org
State-Changed-When: Mon, 01 May 2017 05:05:37 +0000
State-Changed-Why:


From: Kengo NAKAHARA <k-nakahara@iij.ad.jp>
To: kardel@netbsd.org, gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
        gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: PR/52111 CVS commit: src/sys/arch/x86/pci
Date: Mon, 1 May 2017 14:06:44 +0900

 Hi,

 On 2017/04/29 22:54, Frank Kardel wrote:
 > I just tested the change and 'boot -1' in combination with wm interfaces now
 > works as expected without additional tweaks to if_wm.c.
 > 
 > Thanks!
 > 
 > PR/52111 can now be closed.

 Thank you for your testing! I closed PR/52111.


 Thanks,

 -- 
 //////////////////////////////////////////////////////////////////////
 Internet Initiative Japan Inc.

 Device Engineering Section,
 IoT Platform Development Department,
 Network Division,
 Technology Unit

 Kengo NAKAHARA <k-nakahara@iij.ad.jp>

>Unformatted:
