NetBSD Problem Report #54331
From www@netbsd.org Sat Jun 29 23:10:06 2019
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 7682B7A17C
for <gnats-bugs@gnats.NetBSD.org>; Sat, 29 Jun 2019 23:10:06 +0000 (UTC)
Message-Id: <20190629231004.4279E7A1B3@mollari.NetBSD.org>
Date: Sat, 29 Jun 2019 23:10:03 +0000 (UTC)
From: david@gutteridge.ca
Reply-To: david@gutteridge.ca
To: gnats-bugs@NetBSD.org
Subject: macppc MP kernels fail to boot successfully in -current (8.99.49)
X-Send-Pr-Version: www-1.0
>Number: 54331
>Category: port-macppc
>Synopsis: macppc MP kernels fail to boot successfully in -current (8.99.49)
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-macppc-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jun 29 23:15:00 +0000 2019
>Closed-Date: Sun May 31 19:31:17 +0000 2020
>Last-Modified: Sun May 31 19:31:17 +0000 2020
>Originator: David H. Gutteridge
>Release: 8.99.49/macppc
>Organization:
>Environment:
(See following dmesg details.)
>Description:
I'm unable to boot a -current MP macppc kernel. It initially hits
"autoconfiguration error: 2:CPU 1 didn't start 2", which generates a
brief backtrace at the db prompt. Choosing "continue" after this gets
the boot much farther, but then it starts outputting garbled text at
random points on the console, and reboots.
Note this is not an issue with 8.1. Its GENERIC.MP kernel works quite
well on the same hardware. This is also not an issue with the UP
GENERIC kernel on the same hardware, 8.99.49 works fine (so far as I've
tested it). So there's a regression somewhere in the MP code.
GENERIC.MP kernel with DIAGNOSTIC and DEBUG enabled:
[ 1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
[ 1.0000000] 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
[ 1.0000000] 2018, 2019 The NetBSD Foundation, Inc. All rights reserved.
[ 1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[ 1.0000000] The Regents of the University of California. All rights reserved.
[ 1.0000000] NetBSD 8.99.49 (DEBUG) #2: Fri Jun 28 23:54:10 EDT 2019
[ 1.0000000] disciple@arcusxvi.nonus-porta.net:/home/disciple/netbsd-current/src/sys/arch/macppc/compile/obj/DEBUG
[ 1.0000000] total memory = 512 MB
[ 1.0000000] avail memory = 480 MB
[ 1.0000000] timecounter: Timecounters tick every 10.000 msec
[ 1.0000000] Kernelized RAIDframe activated
[ 1.0000000] found openpic PIC at 80040000
[ 1.0000000] OpenPIC Version 1.2: Supports 4 CPUs and 64 interrupt sources.
[ 1.0000000] bootpath: /pci@f4000000/ata-6@d/disk@0:0/netbsd-current-mp-debug
[ 1.0000000] mainbus0 (root)
[ 1.0000000] cpu0 at mainbus0: 7455 (Revision 2.1), ID 0 (primary)
[ 1.0000000] cpu0: HID0 0x84d0c1bc<EMCP,TBEN,HIGH_BAT_EN,NAP,DPM,ICE,DCE,XBSEN,SGE,BTIC,LRSTK,FOLD,BHT>, powersave: 1
[ 1.0000000] cpu0: 866.67 MHz, 256KB L2 cache no parity, 1MB no-parity L3 cache (DDR SRAM) at 4:1 ratio
[ 1.0000000] cpu1 at mainbus0: 7455 (Revision 2.1), ID 1
[ 1.0000000] cpu1: HID0 0x84d0c1bc<EMCP,TBEN,HIGH_BAT_EN,NAP,DPM,ICE,DCE,XBSEN,SGE,BTIC,LRSTK,FOLD,BHT>, powersave: 1
[ 1.0000000] cpu1: 866.67 MHz, 256KB L2 cache no parity, 1MB no-parity L3 cache (DDR SRAM) at 4:1 ratio
[ 1.0000000] autoconfiguration error: 2:CPU 1 didn't start 2
Stopped in pid 0.1 (system) at netbsd:cpu_spinup+0x2b8 lwz r29, 0x1
14(r1)
db{0}> bt
0x00dd8d30: at config_attach_loc+0x1d4
0x00dd8d60: at config_found_sm_loc+0x5c
0x00dd8d90: at mainbus_attach+0x74
0x00dd8e60: at config_attach_loc+0x1d4
0x00dd8e90: at cpu_configure+0x20
0x00dd8ea0: at main+0x314
0x00dd8f10: at 0x1000b4
db{0}> continue
[ 1.0000000] memory0 at mainbus0: len=512
[ 1.0000000] spdmem0 at memory0
[ 1.0000000] spdmem0: DDR SDRAM, no parity or ECC, 256MB, 267MHz (PC-2100)
[ 1.0000000] spdmem0: 13 rows, 10 cols, 1 ranks, 4 banks/chip, 7.5ns cycle time
[ 1.0000000] spdmem0: tAA-tRCD-tRP-tRAS: 1-3-3-6
[ 1.0000000] spdmem0: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.0000000] spdmem1 at memory0
[ 1.0000000] spdmem1: DDR SDRAM, no parity or ECC, 256MB, 267MHz (PC-2100)
[ 1.0000000] spdmem1: 13 rows, 10 cols, 1 ranks, 4 banks/chip, 7.5ns cycle time
[ 1.0000000] spdmem1: tAA-tRCD-tRP-tRAS: 1-3-3-6
[ 1.0000000] spdmem1: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.0000000] uni_n0 at mainbus0 address 0xf8000000
[ 1.0000000] ki2c0 at uni_n0 address 0xf8001000
[ 1.0000000] iic0 at ki2c0: I2C bus
[ 1.0000000] dbcool0 at iic0 addr 0x2c
[ 1.0000000] dbcool0: ADM1030 dBCool(tm) Controller (rev 0x0083)
[ 1.0000000] i2c-hwclock at iic0 addr 0x65 not configured
[ 1.0000000] admtemp0 at iic0 addr 0x49: LM84 temperature sensor: id. 0x4b, rev. 0x00
[ 1.0000000] uninorth0 at mainbus0
[ 1.0000000] pci0 at uninorth0 bus 0
[ 1.0000000] pci0: i/o space, memory space enabled
[ 1.0000000] pchb0 at pci0 dev 11 function 0
[ 1.0000000] pchb0: Apple Computer UniNorth AGP Bridge (rev. 0x00)
[ 1.0000000] genfb0 at pci0 dev 16 function 0: NVIDIA GeForce4 MX 420 (rev. 0xa3)
[ 1.0000000] genfb0: framebuffer at 0x98004000, size 1360x768, depth 8, stride 1536
[ 1.0000000] wsdisplay0 at genfb0 kbdmux 1: console (default, vt100 emulation)
[ 1.0000000] wsmux1: connecting to wsdisplay0
[ 1.0000000] drm at genfb0 not configured
[ 1.0000000] uninorth1 at mainbus0
[ 1.0000000] pci1 at uninorth1 bus 0
[ 1.0000000] pci1: i/o space, memory space enabled
[ 1.0000000] pchb1 at pci1 dev 11 function 0
[ 1.0000000] pchb1: Apple Computer UniNorth Host-PCI Bridge (rev. 0x00)
[ 1.0000000] obio0 at pci1 dev 23 function 0: addr 0x80000000
[ 1.0000000] obio0: enabling KeyLargo internal modem
[ 1.0000000] zsc0 at obio0 irq 22,23
[ 1.0000000] zstty0 at zsc0 channel 0
[ 1.0000000] zstty1 at zsc0 channel 1
[ 1.0000000] snapper0 at obio0 offset 0x10000: irq 30,1,2
[ 1.0000000] pmu0 at obio0 offset 0x16000 irq 47: rev. 12
[ 1.0000000] pmu0: initializing IIC bus
[ 1.0000000] iic1 at pmu0: I2C bus
[ 1.0000000] i2c-hwclock at iic1 addr 0x6a not configured
[ 1.0000000] i2c-hwclock at iic1 addr 0x64 not configured
[ 1.0000000] pmu0: initializing RTC
[ 1.0000000] pmu0: power-mgt not configured
[ 1.0000000] ki2c1 at obio0
[ 1.0000000] iic2 at ki2c1: I2C bus
[ 1.0000000] deq0 at iic2 addr 0x35 Audio Codec (tas3004)
[ 1.0000000] wdc0 at obio0 offset 0x1f000 irq 19, level triggered: DMA transfer
[ 1.0000000] atabus0 at wdc0 channel 0
[ 1.0000000] wdc1 at obio0 offset 0x20000 irq 20, level triggered: DMA transfer
[ 1.0000000] atabus1 at wdc1 channel 0
[ 1.0000000] snapper0: codec: TAS3004
[ 1.0000000] audio0 at snapper0: playback, capture, full duplex
[ 1.0000000] audio0: slinear_be:16 -> slinear_be:16 2ch 48000Hz, blk 21ms for playback
[ 1.0000000] audio0: slinear_be:16 <- slinear_be:16 2ch 48000Hz, blk 21ms for recording
[ 1.0000000] spkr0 at audio0: PC Speaker (synthesized)
[ 1.0000000] wsbell at spkr0 not configured
[ 1.0000000] ohci0 at pci1 dev 24 function 0: Apple Computer KeyLargo USB Controller (rev. 0x00)
[ 1.0000000] csr: 02000006
[ 1.0000000] genppc_pci_intr_map: pin: 1, line: 27
[ 1.0000000] ohci0: interrupting at irq 27
[ 1.0000000] ohci0: OHCI version 1.0
[ 1.0000000] usb0 at ohci0: USB revision 1.0
[ 1.0000000] ohci1 at pci1 dev 25 function 0: Apple Computer KeyLargo USB Controller (rev. 0x00)
[ 1.0000000] csr: 02000006
[ 1.0000000] genppc_pci_intr_map: pin: 1, line: 28
[ 1.0000000] ohci1: interrupting at irq 28
[ 1.0000000] ohci1: OHCI version 1.0
[ 1.0000000] usb1 at ohci1: USB revision 1.0
[ 1.0000000] uninorth2 at mainbus0
[ 1.0000000] pci2 at uninorth2 bus 0
[ 1.0000000] pci2: i/o space, memory space enabled
[ 1.0000000] pchb2 at pci2 dev 11 function 0
[ 1.0000000] pchb2: Apple Computer UniNorth Host-PCI Bridge (rev. 0x00)
[ 1.0000000] wdc2 at pci2 dev 13 function 0genppc_pci_intr_map: pin: 1, line: 39
[ 1.0000000] : interrupting at irq 39
[ 1.0000000] atabus2 at wdc2 channel 0
[ 1.0000000] fwohci0 at pci2 dev 14 function 0: Apple Computer UniNorth Firewire (rev. 0x01)
[ 1.0000000] genppc_pci_intr_map: pin: 1, line: 40
[ 1.0000000] fwohci0: interrupting at irq 40
[ 1.0000000] fwohci0: OHCI version 1.10 (ROM=0)
[ 1.0000000] fwohci0: No. of Isochronous channels is 8.
[ 1.0000000] fwohci0: EUI64 00:03:93:ff:fe:dc:b7:e2
[ 1.0000000] fwohci0: Phy 1394a available S400, 3 ports.
[ 1.0000000] fwohci0: Link S400, max_rec 2048 bytes.
[ 1.0000000] ieee1394if0 at fwohci0: IEEE1394 bus
[ 1.0000000] fwip0 at ieee1394if0: IP over IEEE1394
[ 1.0000000] fwohci0: Initiate bus reset
[ 1.0000000] fwohci0: Clear PME# now
[ 1.0000000] gem0 at pci2 dev 15 function 0: Apple Computer GMAC Ethernet (rev. 0x00)
[ 1.0000000] genppc_pci_intr_map: pin: 1, line: 41
[ 1.0000000] gem0: interrupting at irq 41
[ 1.0000000] brgphy0 at gem0 phy 0: BCM5421 1000BASE-T media interface, rev. 1
[ 1.0000000] brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
[ 1.0000000] gem0: Ethernet address 00:03:93:dc:b7:e2, 10KB RX fifo, 4KB TX fifo
[ 1.0000000] vmmask 677c0000 schedmask 7ffc0000 highmask 7ffc0000
[ 1.0000000] timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
[ 1.0000030] timecounter: Timecounter "mftb" frequency 33304156 Hz quality 100
[ 1.0000203] fwohci0: BUS reset
[ 1.0000203] fwohci0: node_id=0xc800ffc0, gen=2, CYCLEMASTER mode
[ 1.0000203] ieee1394if0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
[ 1.0000203] ieee1394if0: bus manager 0
[ 1.0000203] cpu1 started
[ 1.0800202] panic: kernel diagnostic assertion "virq != 0" failed: file "/home/disciple/netbsd-current/src/sys/arch/powerpc/pic/intr.c", line 561
[ 1.1099928] cpu1: Begin traceback...
[ 1.1199932] 0x00dd9d40: at vpanic+0x144
[ 1.1299929] uhub0 at usb0: NetBSD (00000x)0 0OdHdC9Id 7r0o:o t hub (0000), class 9/0, rev 1.00/1.00, adadtr k1e
[ 1.1299929] rn_assert+0x60
[ 1.1299929] uhub0: 2 ports with 2 removable, self powered
[ 1.1299929] uhub1 at usb1: NetBSD (0000) OHCI root hub0 x(0000d0d09)d,b 0c:l ass 9/0, rev 1.00/1.00, addr 1
[ 1.1299929] at pic_handle_intr+0x200
[ 1.1299929] uhub1: 2 ports with 2 removable, self powered
[ 1.1299929] uhub0: autoconfiguration error: port 1 reset 0fxa0i0ldedd9
[ 1.1299929] e00: at trapstart+0x6b0
[ 1.1299929] uhub1: autoconfiguration error: port 1 reset 0fxa0i0ldedd
[ 1.1299929] 9ed0: at kprintf+0x7f8
[ 1.3700513] at0axp0i0bduds90f 0a0t :a tabus1: 2 targets
[ 1.3700513] at trapstart+0x800
[ 1.3700513] [c d 0 1a.t3 7a0t0a51p3i]b u0sx00 0ddrdi9vfed0 0: : <PHILIPS CDD5101, H9246047JNVGA,a tA 4c.pu2>_ chadrtcomh +r0xe2mo1vca
[ 1.3700513] ble
[ 1.3700513] 0x00dd9ff0: at cpucd0: drive supports PIO mode _4spinup_tr, DMA mode 2
[[ 11..55110000334466]] mcpoonlifn[0e+] 0=x 03cx
[ 1.5100346] 11823, cyc = 4 (120 ns), act = 3 (75 ns), inact = 1
[ 1.5100346] cpu1: End traceback...
[ 1.5100346] cd0(wdc1:0:0): using PIO mode 4, DMA mode 2 (using DMA)
[ 1.5100346] wd0 at atabus2 drive 0
[ 1.5100346] Failed to pause: cpu0
[ 1.5100346] wd0: <WDC WD800JB-00JJC0>
[ 1.5100346] wd0: drive supports 16-sector PIO transfers, LBA addressing
[ 1.5100346] wd0: 76319 MB, 155061 cyl, 16 head, 63 sec, 512 bytes/sect x 156301488 sectors
[ 1.8000446] wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
[ 1.8000446] wd1 at atabus2 drive 1
[ 1.8000446] wd1: <IBM-IC35L120AVVA07-0>
[ 1.8000446] wd1: drive supports 16-sector PIO transfers, LBA addressing
[ 1.8000446] wd1: 115 GB, 239340 cyl, 16 head, 63 sec, 512 bytes/sect x 241254720 sectors
[ 1.9499934] wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
[ 1.9499934] wd0(wdc2:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA)
[ 1.9499934] wd1(wdc2:0:1): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA)
[ 1.9699929] dumpsys: TBD
[ 1.9999936] rebooting
GENERIC UP kernel output:
[ 1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
[ 1.0000000] 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
[ 1.0000000] 2018, 2019 The NetBSD Foundation, Inc. All rights reserved.
[ 1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[ 1.0000000] The Regents of the University of California. All rights reserved.
[ 1.0000000] NetBSD 8.99.49 (GENERIC) #1: Sat Jun 29 15:55:20 EDT 2019
[ 1.0000000] disciple@arcusxvi.nonus-porta.net:/home/disciple/netbsd-current/src/sys/arch/macppc/compile/obj/GENERIC
[ 1.0000000] total memory = 512 MB
[ 1.0000000] avail memory = 482 MB
[ 1.0000000] timecounter: Timecounters tick every 10.000 msec
[ 1.0000000] Kernelized RAIDframe activated
[ 1.0000000] found openpic PIC at 80040000
[ 1.0000000] OpenPIC Version 1.2: Supports 4 CPUs and 64 interrupt sources.
[ 1.0000000] bootpath: /pci@f4000000/ata-6@d/disk@0:0/netbsd-current-up
[ 1.0000000] mainbus0 (root)
[ 1.0000000] cpu0 at mainbus0: 7455 (Revision 2.1), ID 0 (primary)
[ 1.0000000] cpu0: HID0 0x84d0c1bc<EMCP,TBEN,HIGH_BAT_EN,NAP,DPM,ICE,DCE,XBSEN,SGE,BTIC,LRSTK,FOLD,BHT>, powersave: 1
[ 1.0000000] cpu0: 866.67 MHz, 256KB L2 cache no parity, 1MB no-parity L3 cache (DDR SRAM) at 4:1 ratio
[ 1.0000000] cpu1 at mainbus0: ID 1
[ 1.0000000] cpu1: processor off-line; multiprocessor support not present in kernel
[ 1.0000000] memory0 at mainbus0: len=512
[ 1.0000000] spdmem0 at memory0
[ 1.0000000] spdmem0: DDR SDRAM, no parity or ECC, 256MB, 267MHz (PC-2100)
[ 1.0000000] spdmem0: 13 rows, 10 cols, 1 ranks, 4 banks/chip, 7.5ns cycle time
[ 1.0000000] spdmem0: tAA-tRCD-tRP-tRAS: 1-3-3-6
[ 1.0000000] spdmem0: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.0000000] spdmem1 at memory0
[ 1.0000000] spdmem1: DDR SDRAM, no parity or ECC, 256MB, 267MHz (PC-2100)
[ 1.0000000] spdmem1: 13 rows, 10 cols, 1 ranks, 4 banks/chip, 7.5ns cycle time
[ 1.0000000] spdmem1: tAA-tRCD-tRP-tRAS: 1-3-3-6
[ 1.0000000] spdmem1: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.0000000] uni_n0 at mainbus0 address 0xf8000000
[ 1.0000000] ki2c0 at uni_n0 address 0xf8001000
[ 1.0000000] iic0 at ki2c0: I2C bus
[ 1.0000000] dbcool0 at iic0 addr 0x2c
[ 1.0000000] dbcool0: ADM1030 dBCool(tm) Controller (rev 0x0083)
[ 1.0000000] i2c-hwclock at iic0 addr 0x65 not configured
[ 1.0000000] admtemp0 at iic0 addr 0x49: LM84 temperature sensor: id. 0x4b, rev. 0x00
[ 1.0000000] uninorth0 at mainbus0
[ 1.0000000] pci0 at uninorth0 bus 0
[ 1.0000000] pci0: i/o space, memory space enabled
[ 1.0000000] pchb0 at pci0 dev 11 function 0
[ 1.0000000] pchb0: Apple Computer UniNorth AGP Bridge (rev. 0x00)
[ 1.0000000] genfb0 at pci0 dev 16 function 0: NVIDIA GeForce4 MX 420 (rev. 0xa3)
[ 1.0000000] genfb0: framebuffer at 0x98004000, size 1360x768, depth 8, stride 1536
[ 1.0000000] wsdisplay0 at genfb0 kbdmux 1: console (default, vt100 emulation)
[ 1.0000000] wsmux1: connecting to wsdisplay0
[ 1.0000000] drm at genfb0 not configured
[ 1.0000000] uninorth1 at mainbus0
[ 1.0000000] pci1 at uninorth1 bus 0
[ 1.0000000] pci1: i/o space, memory space enabled
[ 1.0000000] pchb1 at pci1 dev 11 function 0
[ 1.0000000] pchb1: Apple Computer UniNorth Host-PCI Bridge (rev. 0x00)
[ 1.0000000] obio0 at pci1 dev 23 function 0: addr 0x80000000
[ 1.0000000] obio0: enabling KeyLargo internal modem
[ 1.0000000] zsc0 at obio0 irq 22,23
[ 1.0000000] zstty0 at zsc0 channel 0
[ 1.0000000] zstty1 at zsc0 channel 1
[ 1.0000000] snapper0 at obio0 offset 0x10000: irq 30,1,2
[ 1.0000000] pmu0 at obio0 offset 0x16000 irq 47: rev. 12
[ 1.0000000] pmu0: initializing IIC bus
[ 1.0000000] iic1 at pmu0: I2C bus
[ 1.0000000] i2c-hwclock at iic1 addr 0x6a not configured
[ 1.0000000] i2c-hwclock at iic1 addr 0x64 not configured
[ 1.0000000] pmu0: initializing RTC
[ 1.0000000] pmu0: power-mgt not configured
[ 1.0000000] ki2c1 at obio0
[ 1.0000000] iic2 at ki2c1: I2C bus
[ 1.0000000] deq0 at iic2 addr 0x35 Audio Codec (tas3004)
[ 1.0000000] wdc0 at obio0 offset 0x1f000 irq 19, level triggered: DMA transfer
[ 1.0000000] atabus0 at wdc0 channel 0
[ 1.0000000] wdc1 at obio0 offset 0x20000 irq 20, level triggered: DMA transfer
[ 1.0000000] atabus1 at wdc1 channel 0
[ 1.0000000] snapper0: codec: TAS3004
[ 1.0000000] audio0 at snapper0: playback, capture, full duplex
[ 1.0000000] audio0: slinear_be:16 -> slinear_be:16 2ch 48000Hz, blk 21ms for playback
[ 1.0000000] audio0: slinear_be:16 <- slinear_be:16 2ch 48000Hz, blk 21ms for recording
[ 1.0000000] spkr0 at audio0: PC Speaker (synthesized)
[ 1.0000000] wsbell at spkr0 not configured
[ 1.0000000] ohci0 at pci1 dev 24 function 0: Apple Computer KeyLargo USB Controller (rev. 0x00)
[ 1.0000000] ohci0: interrupting at irq 27
[ 1.0000000] ohci0: OHCI version 1.0
[ 1.0000000] usb0 at ohci0: USB revision 1.0
[ 1.0000000] ohci1 at pci1 dev 25 function 0: Apple Computer KeyLargo USB Controller (rev. 0x00)
[ 1.0000000] ohci1: interrupting at irq 28
[ 1.0000000] ohci1: OHCI version 1.0
[ 1.0000000] usb1 at ohci1: USB revision 1.0
[ 1.0000000] uninorth2 at mainbus0
[ 1.0000000] pci2 at uninorth2 bus 0
[ 1.0000000] pci2: i/o space, memory space enabled
[ 1.0000000] pchb2 at pci2 dev 11 function 0
[ 1.0000000] pchb2: Apple Computer UniNorth Host-PCI Bridge (rev. 0x00)
[ 1.0000000] wdc2 at pci2 dev 13 function 0: interrupting at irq 39
[ 1.0000000] atabus2 at wdc2 channel 0
[ 1.0000000] fwohci0 at pci2 dev 14 function 0: Apple Computer UniNorth Firewire (rev. 0x01)
[ 1.0000000] fwohci0: interrupting at irq 40
[ 1.0000000] fwohci0: OHCI version 1.10 (ROM=0)
[ 1.0000000] fwohci0: No. of Isochronous channels is 8.
[ 1.0000000] fwohci0: EUI64 00:03:93:ff:fe:dc:b7:e2
[ 1.0000000] fwohci0: Phy 1394a available S400, 3 ports.
[ 1.0000000] fwohci0: Link S400, max_rec 2048 bytes.
[ 1.0000000] ieee1394if0 at fwohci0: IEEE1394 bus
[ 1.0000000] fwip0 at ieee1394if0: IP over IEEE1394
[ 1.0000000] fwohci0: Initiate bus reset
[ 1.0000000] fwohci0: Clear PME# now
[ 1.0000000] gem0 at pci2 dev 15 function 0: Apple Computer GMAC Ethernet (rev. 0x00)
[ 1.0000000] gem0: interrupting at irq 41
[ 1.0000000] brgphy0 at gem0 phy 0: BCM5421 1000BASE-T media interface, rev. 1
[ 1.0000000] brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
[ 1.0000000] gem0: Ethernet address 00:03:93:dc:b7:e2, 10KB RX fifo, 4KB TX fifo
[ 1.0000000] vmmask 677c0000 schedmask 7ffc0000 highmask 7ffc0000
[ 1.0000000] timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
[ 1.0000030] timecounter: Timecounter "mftb" frequency 33304213 Hz quality 100
[ 1.0000188] fwohci0: BUS reset
[ 1.0000188] fwohci0: node_id=0xc800ffc0, gen=2, CYCLEMASTER mode
[ 1.0000188] ieee1394if0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
[ 1.0000188] ieee1394if0: bus manager 0
[ 1.1099931] uhub0 at usb0: NetBSD (0000) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
[ 1.1099931] uhub0: 2 ports with 2 removable, self powered
[ 1.1099931] uhub1 at usb1: NetBSD (0000) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
[ 1.1099931] uhub1: 2 ports with 2 removable, self powered
[ 1.1099931] IPsec: Initialized Security Association Processing.
[ 1.1799927] atapibus0 at atabus1: 2 targets
[ 1.2100003] cd0 at atapibus0 drive 0: <PHILIPS CDD5101, H9246047JNVGA, A4.2> cdrom removable
[ 1.2100003] cd0: drive supports PIO mode 4, DMA mode 2
[ 1.2100003] cd0(wdc1:0:0): using PIO mode 4, DMA mode 2 (using DMA)
[ 1.2100003] wd0 at atabus2 drive 0
[ 1.2100003] wd0: <WDC WD800JB-00JJC0>
[ 1.2100003] wd0: drive supports 16-sector PIO transfers, LBA addressing
[ 1.2100003] wd0: 76319 MB, 155061 cyl, 16 head, 63 sec, 512 bytes/sect x 156301488 sectors
[ 1.3099933] wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
[ 1.3099933] wd1 at atabus2 drive 1
[ 1.3099933] wd1: <IBM-IC35L120AVVA07-0>
[ 1.3099933] wd1: drive supports 16-sector PIO transfers, LBA addressing
[ 1.3099933] wd1: 115 GB, 239340 cyl, 16 head, 63 sec, 512 bytes/sect x 241254720 sectors
[ 1.3800037] wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
[ 1.3800037] wd0(wdc2:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA)
[ 1.3800037] wd1(wdc2:0:1): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA)
[ 3.4399993] uhidev0 at uhub1 port 1 configuration 1 interface 0
[ 3.4399993] uhidev0: Logitech (0x46d) USB-PS/2 Optical Mouse (0xc00e), rev 2.00/11.10, addr 2, iclass 3/1
[ 3.4399993] uhub2 at uhub0 port 1: Apple, Inc. (0x5ac) Keyboard Hub (0x1006), class 9/0, rev 2.00/96.15, addr 2
[ 3.5100094] ums0 at uhidev0: 3 buttons and Z dir
[ 3.5100094] wsmouse0 at ums0 mux 0
[ 3.5100094] uhub2: 3 ports with 2 removable, bus powered
[ 4.8800033] uhidev1 at uhub2 port 2 configuration 1 interface 0
[ 4.8800033] uhidev1: Apple, Inc (0x5ac) Apple Keyboard (0x220), rev 2.00/0.69, addr 3, iclass 3/1
[ 4.8800033] ukbd0 at uhidev1
[ 5.2900034] wskbd0 at ukbd0: console keyboard, using wsdisplay0
[ 5.2900034] uhidev2 at uhub2 port 2 configuration 1 interface 1
[ 5.2900034] uhidev2: Apple, Inc (0x5ac) Apple Keyboard (0x220), rev 2.00/0.69, addr 3, iclass 3/0
[ 5.3500123] uhid0 at uhidev2: input=1, output=0, feature=0
[ 5.3500123] swwdog0: software watchdog initialized
[ 5.4400041] boot device: wd0
[ 5.4400041] root on wd0a dumps on wd0b
[ 5.4800100] Supported file systems: mfs lfs ffs ext2fs nfs umap procfs overlay null kernfs fdesc union tmpfs smbfs puffs ptyfs ntfs msdos efs cd9660 coda
[ 5.4800100] no file system for wd0 (dev 0xa00)
[ 5.4800100] cannot mount root, error = 79
[ 5.4800100] root device (default wd0a): wd0b
[ 5.5700058] dump device (default wd0b): wd0f
[ 6.5000709] file system (default generic):
[ 7.4400821] dumps on wd0f
[ 7.7601033] root file system type: ffs
[ 7.7601033] kern.module.path=/stand/macppc/8.99.49/modules
[ 7.7601033] WARNING: clock gained 19 days
[ 7.8201039] init path (default /sbin/init):
[ 7.8601049] init: trying /sbin/init
[ 20.7801459] wsdisplay0: screen 1 added (default, vt100 emulation)
[ 20.7801459] wsdisplay0: screen 2 added (default, vt100 emulation)
[ 20.7801459] wsdisplay0: screen 3 added (default, vt100 emulation)
[ 20.7901492] wsdisplay0: screen 4 added (default, vt100 emulation)
>How-To-Repeat:
Boot an MP kernel in -current on a G4 Mac with two CPUs.
>Fix:
>Release-Note:
>Audit-Trail:
From: Michael <macallan@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Mon, 1 Jul 2019 02:18:41 -0400
Hmm, this might be fallout from me adding G5 SMP support a while ago,
and unfortunately I don't have a working SMP G4.
IIRC I had to slightly change the way we find the GPIOs that control
each CPU's reset pin ( since of course there are a few differences in
the OF variants used on G5 ) - please send me the ofctl -p output of
your machine, maybe there is something obvious.
have fun
Michael
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Sun, 07 Jul 2019 17:43:54 -0400
This is different than I thought, there are actually two separate
issues involved. The failure to successfully boot an MP kernel can be
duplicated with 8.1_STABLE: when DEBUG and DIAGNOSTIC are enabled, I
get a very similar initial backtrace, and the exact same behaviour
(and, from what I can see, output) before the machine reboots. I hadn't
previously tried a debugging kernel with 8.x, since it otherwise works
fine for me. So this isn't a regression with -current, it's pre-
existing.
With -current, the distinction is a non-debug MP kernel fails to
complete booting. It hangs before running /sbin/init. It continues to
process hardware-related events (e.g., I can unplug and plug in USB
devices, and it continues to handle these correctly), but otherwise
just stalls, without any output to indicate why.
So, perhaps in order to answer why the -current non-debug MP kernel
isn't working, we need to figure out why MP debug kernels aren't
working first.
Dave
From: Michael <macallan@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Mon, 8 Jul 2019 02:46:18 -0400
> So, perhaps in order to answer why the -current non-debug MP kernel
> isn't working, we need to figure out why MP debug kernels aren't
> working first.
Hmm, that's a good data point, which narrows things down quite a bit.
Hopefully I'll have time for a closer look on my next weekend ( which
starts Wednesday ).
Yes, this is basically a SIGKEEPALIVE from me.
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Wed, 10 Jul 2019 00:55:58 -0400
I tried a couple more things, which didn't really yield that much more.
First, I built an MP debug kernel from the netbsd-7 branch. It also
panics; however, the panic happens later. (It involves a locking issue
with USB devices. If this is unrelated to everything else, I don't
think anyone necessarily wants to chase new issues with 7.x.)
To get around one of the predictable panics with -current, I commented
out
KASSERT(virq != 0);
at line 561 of sys/arch/powerpc/pic/intr.c. This gives me a kernel that
behaves the same as an MP non-debug -current kernel. That is, it
freezes while devices are attaching, after the hard drives have been
recognized. It doesn't seem to output anything significant that's
different from the non-debug kernel.
Dave
From: Michael <macallan@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Thu, 25 Jul 2019 13:12:31 -0400
I finally had a look at the code and the logs - looks like a race
between cpu_setup() on the newly hatched CPU and delay(200000) in
cpu_spinup(), where we just wait a fixed time and then bail out if the
other CPU isn't ready. This would explain why the 2nd CPU still wakes
up later on but then things go south because we didn't install the IPI
handler.
The following patch adds a period of polling h->hatch_running for a
while instead of just blindly increasing the delay().
Index: cpu_subr.c
===================================================================
RCS file: /cvsroot/src/sys/arch/powerpc/oea/cpu_subr.c,v
retrieving revision 1.99
diff -u -w -r1.99 cpu_subr.c
--- cpu_subr.c 6 Feb 2019 07:32:50 -0000 1.99
+++ cpu_subr.c 25 Jul 2019 17:03:14 -0000
@@ -1381,6 +1385,17 @@
__asm volatile ("dcbst 0,%0"::"r"(&h->hatch_running):"memory");
__asm volatile ("sync; isync");
#endif
+ int hatch_bail = 0;
+ while ((h->hatch_running < 1) && (hatch_bail < 100000)) {
+ delay(1);
+ hatch_bail++;
+#ifdef CACHE_PROTO_MEI
+ __asm volatile ("dcbi 0,%0"::"r"(&h->hatch_running):"memory");
+ __asm volatile ("sync; isync");
+ __asm volatile ("dcbst 0,%0"::"r"(&h->hatch_running):"memory");
+ __asm volatile ("sync; isync");
+#endif
+ }
if (h->hatch_running < 1) {
#ifdef CACHE_PROTO_MEI
__asm volatile ("dcbi
0,%0"::"r"(&cpu_spinstart_ack):"memory");
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Mon, 29 Jul 2019 19:38:01 -0400
On Thu, 2019-07-25 at 17:15 +0000, Michael wrote:
> The following reply was made to PR port-macppc/54331; it has been noted by GNATS.
>
> From: Michael <macallan@netbsd.org>
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
> in -current (8.99.49)
> Date: Thu, 25 Jul 2019 13:12:31 -0400
>
> I finally had a look at the code and the logs - looks like a race
> between cpu_setup() on the newly hatched CPU and delay(200000) in
> cpu_spinup(), where we just wait a fixed time and then bail out if the
> other CPU isn't ready. This would explain why the 2nd CPU still wakes
> up later on but then things go south because we didn't install the IPI
> handler.
>
> The following patch adds a period of polling h->hatch_running for a
> while instead of just blindly increasing the delay().
>
> Index: cpu_subr.c
> ===================================================================
> RCS file: /cvsroot/src/sys/arch/powerpc/oea/cpu_subr.c,v
> retrieving revision 1.99
> diff -u -w -r1.99 cpu_subr.c
> --- cpu_subr.c 6 Feb 2019 07:32:50 -0000 1.99
> +++ cpu_subr.c 25 Jul 2019 17:03:14 -0000
> @@ -1381,6 +1385,17 @@
> __asm volatile ("dcbst 0,%0"::"r"(&h->hatch_running):"memory");
> __asm volatile ("sync; isync");
> #endif
> + int hatch_bail = 0;
> + while ((h->hatch_running < 1) && (hatch_bail < 100000)) {
> + delay(1);
> + hatch_bail++;
> +#ifdef CACHE_PROTO_MEI
> + __asm volatile ("dcbi 0,%0"::"r"(&h->hatch_running):"memory");
> + __asm volatile ("sync; isync");
> + __asm volatile ("dcbst 0,%0"::"r"(&h->hatch_running):"memory");
> + __asm volatile ("sync; isync");
> +#endif
> + }
> if (h->hatch_running < 1) {
> #ifdef CACHE_PROTO_MEI
> __asm volatile ("dcbi
> 0,%0"::"r"(&cpu_spinstart_ack):"memory");
Hi Michael,
Your patch fixes the initial panic; it now continues booting without
dropping into the debugger, and it doesn't "spontaneously" reboot at
any point. Now the debug and non-debug MP kernels behave identically,
which is that they hang right after USB devices are recognized,
without any output to indicate why. (My debug kernel includes the
DEBUG, DIAGNOSTIC, and LOCKDEBUG options.)
Separately, I've now hit an occasion where 8.1_STABLE without debugging
options also hangs, which I hadn't seen before. It gets farther than
-current: it mounts file systems and tries /sbin/init, and hangs there.
I'm going to try some more tests with it and see what happens.
Thanks,
Dave
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Wed, 31 Jul 2019 02:27:28 -0400
On Mon, 2019-07-29 at 19:38 -0400, David H. Gutteridge wrote:
>
> Separately, I've now hit an occasion where 8.1_STABLE without debugging
> options also hangs, which I hadn't seen before. It gets farther than
> -current: it mounts file systems and tries /sbin/init, and hangs there.
> I'm going to try some more tests with it and see what happens.
Hi Michael,
I can't reproduce that last hanging issue with a non-debug 8.1_STABLE
MP kernel. Every reboot worked without trouble. (Perhaps there's some
rare timing condition involved, or perhaps the hardware is getting
flaky.)
I also tried applying your patch to an 8.1 MP kernel (with DEBUG,
DIAGNOSTIC, and LOCKDEBUG options enabled), and that gets me past the
related original panic there, too. As the 8.1 kernel continues booting,
it gets to the point where it needs to request user input (I have non-
standard partitioning in this test setup), at which point it panics
with the following backtrace:
panic: kernel diagnostic assertion "mutex_owned(&sc->sc_lock)" failed: file "/home/disciple/netbsd-8/src/sys/dev/usb/ohci.c", line 590
Stopped in pid 0.1 (system) at netbsd:vpanic+0x140: addi r4, r0, 0x0
0x00c56a90: at kern_assert+0x68
0x00c56ad0: at ohci_reset_std_chain+0x518
0x00c56b30: at ohci_device_intr_start+0xd8
0x00c56b70: at usbd_start_next+0x6c
0x00c56b80: at ohci_softintr+0x6a0
0x00c56be0: at ohci_intr1+0x1f0
0x00c56c00: at ohci_poll+0x70
0x00c56c10: at ukbd_cngetc+0x80
0x00c56c30: at wskbd_cngetc+0x114
0x00c56c70: at cngetc+0x58
0x00c56c90: at cngetsn+0x6c
0x00c56cc0: at setroot+0x174
0x00c56ea0: at main+0x62c
0x00c56f10: at 0x1000b4
(Since -current hangs around USB controller attachment, I wondered if
that and this new 8.1 panic are somehow related.)
Regards,
Dave
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@netbsd.org
Cc: port-macppc-maintainer@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, david@gutteridge.ca
Subject: re: port-macppc/54331: macppc MP kernels fail to boot successfully in -current (8.99.49)
Date: Thu, 01 Aug 2019 17:26:53 +1000
> panic: kernel diagnostic assertion "mutex_owned(&sc->sc_lock)" failed: =
file "/home/disciple/netbsd-8/src/sys/dev/usb/ohci.c", line 590
i think i see the problem here.
> Stopped in pid 0.1 (system) at netbsd:vpanic+0x140: addi r4, =
r0, 0x0
> 0x00c56a90: at kern_assert+0x68
> 0x00c56ad0: at ohci_reset_std_chain+0x518
> 0x00c56b30: at ohci_device_intr_start+0xd8
ohci_device_intr_start() only takes the lock when not polling, so
this assert needs to be adjusted for that.
can you try this patch? it probably won't fix the original hang,
but it should fix this issue. it's against netbsd-8 though i
am fairly sure that the code is the same in -current.
.mrg.
Index: ohci.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /cvsroot/src/sys/dev/usb/ohci.c,v
retrieving revision 1.273.6.4
diff -p -u -r1.273.6.4 ohci.c
--- ohci.c 27 Sep 2018 14:52:26 -0000 1.273.6.4
+++ ohci.c 1 Aug 2019 07:25:46 -0000
@@ -587,7 +587,7 @@ ohci_reset_std_chain(ohci_softc_t *sc, s
OHCIHIST_FUNC(); OHCIHIST_CALLED();
DPRINTF("start len=3D%jd", alen, 0, 0, 0);
=
- KASSERT(mutex_owned(&sc->sc_lock));
+ KASSERT(sc->sc_bus.ub_usepolling || mutex_owned(&sc->sc_lock));
=
DPRINTFN(8, "addr=3D%jd endpt=3D%jd len=3D%jd speed=3D%jd",
xfer->ux_pipe->up_dev->ud_addr,
@@ -1675,7 +1675,7 @@ ohci_device_bulk_done(struct usbd_xfer *
int isread =3D
(UE_GET_DIR(xfer->ux_pipe->up_endpoint->ue_edesc->bEndpointAddress) =
=3D=3D UE_DIR_IN);
=
- KASSERT(mutex_owned(&sc->sc_lock));
+ KASSERT(sc->sc_bus.ub_usepolling || mutex_owned(&sc->sc_lock));
=
OHCIHIST_FUNC(); OHCIHIST_CALLED();
DPRINTFN(10, "xfer=3D%#jx, actlen=3D%jd", (uintptr_t)xfer, xfer->ux_actl=
en,
From: "Michael Lorenz" <macallan@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54331 CVS commit: src/sys/arch/powerpc/oea
Date: Fri, 2 Aug 2019 05:08:08 +0000
Module Name: src
Committed By: macallan
Date: Fri Aug 2 05:08:07 UTC 2019
Modified Files:
src/sys/arch/powerpc/oea: cpu_subr.c
Log Message:
first step to address PR54331:
poll h->hatch_running for a bit instead of blindly relying on a fixed timeout
for secondary CPUs to wake up and get ready
needs more testing, possibly pullup
To generate a diff of this commit:
cvs rdiff -u -r1.99 -r1.100 src/sys/arch/powerpc/oea/cpu_subr.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org, matthew green <mrg@eterna.com.au>
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Sun, 11 Aug 2019 18:24:16 -0400
On Thu, 2019-08-01 at 17:26 +1000, matthew green wrote:
> > panic: kernel diagnostic assertion "mutex_owned(&sc->sc_lock)" failed: file "/home/disciple/netbsd-8/src/sys/dev/usb/ohci.c", line 590
>
> i think i see the problem here.
>
> > Stopped in pid 0.1 (system) at netbsd:vpanic+0x140: addi r4, r0, 0x0
> > 0x00c56a90: at kern_assert+0x68
> > 0x00c56ad0: at ohci_reset_std_chain+0x518
> > 0x00c56b30: at ohci_device_intr_start+0xd8
>
> ohci_device_intr_start() only takes the lock when not polling, so
> this assert needs to be adjusted for that.
>
> can you try this patch? it probably won't fix the original hang,
> but it should fix this issue. it's against netbsd-8 though i
> am fairly sure that the code is the same in -current.
>
>
> .mrg.
>
>
> Index: ohci.c
> ===================================================================
> RCS file: /cvsroot/src/sys/dev/usb/ohci.c,v
> retrieving revision 1.273.6.4
> diff -p -u -r1.273.6.4 ohci.c
> --- ohci.c 27 Sep 2018 14:52:26 -0000 1.273.6.4
> +++ ohci.c 1 Aug 2019 07:25:46 -0000
> @@ -587,7 +587,7 @@ ohci_reset_std_chain(ohci_softc_t *sc, s
> OHCIHIST_FUNC(); OHCIHIST_CALLED();
> DPRINTF("start len=%jd", alen, 0, 0, 0);
>
> - KASSERT(mutex_owned(&sc->sc_lock));
> + KASSERT(sc->sc_bus.ub_usepolling || mutex_owned(&sc->sc_lock));
>
> DPRINTFN(8, "addr=%jd endpt=%jd len=%jd speed=%jd",
> xfer->ux_pipe->up_dev->ud_addr,
> @@ -1675,7 +1675,7 @@ ohci_device_bulk_done(struct usbd_xfer *
> int isread =
> (UE_GET_DIR(xfer->ux_pipe->up_endpoint->ue_edesc->bEndpointAddress) == UE_DIR_IN);
>
> - KASSERT(mutex_owned(&sc->sc_lock));
> + KASSERT(sc->sc_bus.ub_usepolling || mutex_owned(&sc->sc_lock));
>
> OHCIHIST_FUNC(); OHCIHIST_CALLED();
> DPRINTFN(10, "xfer=%#jx, actlen=%jd", (uintptr_t)xfer, xfer->ux_actlen,
Hi,
Your patch fixed the remaining issue booting an MP+DEBUG+DIAGNOSTIC+
LOCKDEBUG kernel built from the netbsd-8 branch. Thanks!
With this plus macallan@'s earlier patch applied, I don't see any
remaining issues for netbsd-8, so we've narrowed this PR to what's in
HEAD/netbsd-9 now.
Dave
From: "matthew green" <mrg@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54331 CVS commit: src/sys/dev/usb
Date: Sun, 11 Aug 2019 22:55:03 +0000
Module Name: src
Committed By: mrg
Date: Sun Aug 11 22:55:03 UTC 2019
Modified Files:
src/sys/dev/usb: ohci.c
Log Message:
relax an assert to be mutex_owned || polling.
half of the patch i sent for netbsd-8 in PR#54331.
the other half is already applied.
To generate a diff of this commit:
cvs rdiff -u -r1.289 -r1.290 src/sys/dev/usb/ohci.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: matthew green <mrg@eterna.com.au>
To: "David H. Gutteridge" <david@gutteridge.ca>
Cc: gnats-bugs@netbsd.org
Subject: re: port-macppc/54331: macppc MP kernels fail to boot successfully in -current (8.99.49)
Date: Mon, 12 Aug 2019 09:07:32 +1000
> Your patch fixed the remaining issue booting an MP+DEBUG+DIAGNOSTIC+
> LOCKDEBUG kernel built from the netbsd-8 branch. Thanks!
great news. i sent a pullup request. i had actually fixed one
of the two chunks back in march, but both are commited now.
> With this plus macallan@'s earlier patch applied, I don't see any
> remaining issues for netbsd-8, so we've narrowed this PR to what's in
> HEAD/netbsd-9 now.
i will get the above additional fix into netbsd-9.
can you update us with the status of -current/-9? the system
hangs after probing USB still?
if so and if you can drop to DDB, can you inspect the
"config_pending" variable? eg, "x/x config_pending". if it
is zero, that means that some device configuration has called
config_incr() without a matching config_decr(), which makes the
code that allows mountroot and then init to run to wait. this
is normally for eg, boot device probe, but it tends to indicate
that there is a lack of interupts, allowing a device probe to
complete, but really, anything that stops the driver from
getting to its config_decr() moment will do this.
.mrg.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54331 CVS commit: [netbsd-8] src/sys/dev/usb
Date: Mon, 12 Aug 2019 17:19:02 +0000
Module Name: src
Committed By: martin
Date: Mon Aug 12 17:19:02 UTC 2019
Modified Files:
src/sys/dev/usb [netbsd-8]: ohci.c
Log Message:
Pull up following revision(s) (requested by mrg in ticket #1334):
sys/dev/usb/ohci.c: revision 1.289
sys/dev/usb/ohci.c: revision 1.290
adjust KASSERT() for sc_lock to handle polling case.
should fix usb kdb in ddb. reported by martin.
-
relax an assert to be mutex_owned || polling.
half of the patch i sent for netbsd-8 in PR#54331.
the other half is already applied.
To generate a diff of this commit:
cvs rdiff -u -r1.273.6.4 -r1.273.6.5 src/sys/dev/usb/ohci.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org, mrg@eterna.com.au
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Mon, 12 Aug 2019 22:24:53 -0400
On Sun, 2019-08-11 at 23:10 +0000, matthew green wrote:
> The following reply was made to PR port-macppc/54331; it has been noted by GNATS.
>
> From: matthew green <mrg@eterna.com.au>
> To: "David H. Gutteridge" <david@gutteridge.ca>
> Cc: gnats-bugs@netbsd.org
> Subject: re: port-macppc/54331: macppc MP kernels fail to boot successfully in -current (8.99.49)
> Date: Mon, 12 Aug 2019 09:07:32 +1000
>
> > Your patch fixed the remaining issue booting an MP+DEBUG+DIAGNOSTIC+
> > LOCKDEBUG kernel built from the netbsd-8 branch. Thanks!
>
> great news. i sent a pullup request. i had actually fixed one
> of the two chunks back in march, but both are commited now.
>
> > With this plus macallan@'s earlier patch applied, I don't see any
> > remaining issues for netbsd-8, so we've narrowed this PR to what's in
> > HEAD/netbsd-9 now.
>
> i will get the above additional fix into netbsd-9.
>
> can you update us with the status of -current/-9? the system
> hangs after probing USB still?
>
> if so and if you can drop to DDB, can you inspect the
> "config_pending" variable? eg, "x/x config_pending". if it
> is zero, that means that some device configuration has called
> config_incr() without a matching config_decr(), which makes the
> code that allows mountroot and then init to run to wait. this
> is normally for eg, boot device probe, but it tends to indicate
> that there is a lack of interupts, allowing a device probe to
> complete, but really, anything that stops the driver from
> getting to its config_decr() moment will do this.
I built an MP+DEBUG+DIAGNOSTIC+LOCKDEBUG 9.99.7 kernel with the patches
included, and it's hanging after USB devices are attached. I can unplug
and reattach USB devices and it continues to handle those events fine,
but it never actually mounts (or prompts to mount) the hard drive. I
don't see anything useful in the output to indicate why. I'm not able
to get into ddb as far as I know. That is, typing something on the USB
keyboard that's attached doesn't result in anything happening.
Dave
From: matthew green <mrg@eterna.com.au>
To: "David H. Gutteridge" <david@gutteridge.ca>
Cc: gnats-bugs@netbsd.org
Subject: re: port-macppc/54331: macppc MP kernels fail to boot successfully in -current (8.99.49)
Date: Tue, 13 Aug 2019 12:46:06 +1000
> I built an MP+DEBUG+DIAGNOSTIC+LOCKDEBUG 9.99.7 kernel with the patches
> included, and it's hanging after USB devices are attached. I can unplug
> and reattach USB devices and it continues to handle those events fine,
> but it never actually mounts (or prompts to mount) the hard drive. I
> don't see anything useful in the output to indicate why. I'm not able
> to get into ddb as far as I know. That is, typing something on the USB
> keyboard that's attached doesn't result in anything happening.
does it still have the cpu failure attach messages in early boot?
here's a simple way to see if i'm right about config_pending not
being reduced at this point:
in config_finalize() there is this loop:
while (config_pending != 0)
cv_wait(&config_misc_cv, &config_misc_lock);
if you convert it to a cv_timedwait() that sleeps for hz (or
even hz*2 or hz*5) and then prints config_pending, eg:
while (config_pending != 0) {
cv_timedwait(&config_misc_cv, &config_misc_lock, hz);
printf("%s: config_pending still %d\n", __func__,
config_pending);
}
(untested/compiled). this would at least give us any idea on
where to look next.
.mrg.
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org, matthew green <mrg@eterna.com.au>
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Mon, 12 Aug 2019 23:17:10 -0400
On Tue, 2019-08-13 at 12:46 +1000, matthew green wrote:
> > I built an MP+DEBUG+DIAGNOSTIC+LOCKDEBUG 9.99.7 kernel with the
> > patches
> > included, and it's hanging after USB devices are attached. I can
> > unplug
> > and reattach USB devices and it continues to handle those events
> > fine,
> > but it never actually mounts (or prompts to mount) the hard drive. I
> > don't see anything useful in the output to indicate why. I'm not
> > able
> > to get into ddb as far as I know. That is, typing something on the
> > USB
> > keyboard that's attached doesn't result in anything happening.
>
> does it still have the cpu failure attach messages in early boot?
The early boot issue with CPU attachment is fixed.
> here's a simple way to see if i'm right about config_pending not
> being reduced at this point:
>
> in config_finalize() there is this loop:
>
> while (config_pending != 0)
> cv_wait(&config_misc_cv, &config_misc_lock);
>
> if you convert it to a cv_timedwait() that sleeps for hz (or
> even hz*2 or hz*5) and then prints config_pending, eg:
>
> while (config_pending != 0) {
> cv_timedwait(&config_misc_cv, &config_misc_lock, hz);
> printf("%s: config_pending still %d\n", __func__,
> config_pending);
> }
>
> (untested/compiled). this would at least give us any idea on
> where to look next.
Okay, I'll try that out, thanks!
Dave
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current
Date: Fri, 16 Aug 2019 11:09:29 +0200
I seem to have the same problem (with 9.99.9 on a silver tower dual G4 mac).
I have serial console though and can get into ddb.
Your suggested patch shows:
[ 1.0000227] cpu1 started
[ 1.0599986] uhub0 at usb1: NetBSD (0000) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
[ 1.0799956] uhub1 at usb0: NetBSD (0000) OHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
[ 2.1799924] config_finalize: config_pending still 3
[ 3.1799924] config_finalize: config_pending still 3
[ 4.1699925] wd0 at atabus0 drive 0
[ 4.1828592] config_finalize: config_pending still 3
[ 4.1828592] wd0: <WDC WD2000JB-00FUA0>
[ 4.1828592] wd0: 186 GB, 387621 cyl, 16 head, 63 sec, 512 bytes/sect x 390721968 sectors
[ 4.2499928] atapibus0 at atabus1: 2 targets
[ 4.2799925] cd0 at atapibus0 drive 0: <HL-DT-STDVD-ROM GDR8161B, , 0100> cdrom removable
[ 4.2999962] config_finalize: config_pending still 0
and then everything seems idle but cpu1 stuck in softclk:
0 22 3 0 200 3f9760a0 iic0 iicintr
0 21 3 1 200 3f980c00 xcall/1 xcall
0 20 1 1 200 3f980920 softser/1
0 > 19 7 1 200 3f980640 softclk/1
0 18 1 1 200 3f980360 softbio/1
0 17 1 1 200 3f980080 softnet/1
0 > 16 7 1 201 3f994be0 idle/1
0 15 3 0 200 3f994900 sysmon smtaskq
0 14 3 1 200 3f994620 pmfsuspend pmfsuspend
0 13 3 0 200 3f994340 pmfevent pmfevent
0 12 3 0 200 3f994060 sopendfree sopendfr
0 11 3 1 200 3fb27bc0 nfssilly nfssilly
0 10 3 0 200 3fb278e0 cachegc cachegc
0 9 3 0 200 3fb27600 vdrain vdrain
0 8 3 0 200 3fb27320 modunload mod_unld
0 7 3 0 200 3fb27040 xcall/0 xcall
0 6 1 0 200 3fb31ba0 softser/0
0 5 1 0 200 3fb318c0 softclk/0
0 4 1 0 200 3fb315e0 softbio/0
0 3 1 0 200 3fb31300 softnet/0
0 > 2 7 0 201 3fb31020 idle/0
0 1 3 1 200 b85b80 swapper tstile
db{0}> bt/a 3f980640
trace: pid 0 lid 19 at 0x10047eb0
0x10047ee0: at softint_dispatch+0xa8
0x10047f20: at softint_fast_dispatch+0xdc
saved LR(0xff3ffb79) is invalid.
db{0}> x/i softint_dispatch+0xa0
netbsd:softint_dispatch+0xa0: lwz r3, 0x8(r31)
db{0}>
netbsd:softint_dispatch+0xa4: mtspr ctr, r9,
db{0}>
netbsd:softint_dispatch+0xa8: bctrl
and looking at netbsd.gdb:
(gdb) list *(softint_dispatch+0xa8)
0x68f388 is in softint_dispatch (../../../../kern/kern_softint.c:592).
587 }
588 } else if (!havelock) {
589 KERNEL_LOCK(1, l);
590 havelock = true;
591 }
592 (*sh->sh_func)(sh->sh_arg);
593
Martin
From: "David H. Gutteridge" <david@gutteridge.ca>
To: gnats-bugs@netbsd.org, matthew green <mrg@eterna.com.au>
Cc:
Subject: Re: port-macppc/54331: macppc MP kernels fail to boot successfully
in -current (8.99.49)
Date: Wed, 20 Nov 2019 23:55:44 -0500
I tried a build of 9.99.17 from earlier this week, and the situation
has improved. It no longer hits any issues during device attachment,
and begins running rc scripts. At that point, however, as it's bringing
up the gem0 device, it panics there, like so:
[ 20.6900578] panic: pr_phinpage_check: [mclpl] item 0x1fb2f040 not part of pool
[ 20.6900578] cpu0: Begin traceback...
[ 20.6900578] 0x10007da0: at vpanic+0x144
[ 20.7300428] 0x10007dd0: at panic+0x50
[ 20.7300428] 0x10007e20: at pool_cache_put_paddr+0x25c
[ 20.7600388] 0x10007e50: at m_ext_free+0x130
[ 20.7600388] 0x10007e60: at m_free+0x9c
[ 20.7600388] 0x10007e70: at m_freem.part.8+0xc
[ 20.7600388] 0x10007e80: at ether_input+0x67c
[ 20.7600388] 0x10007eb0: at if_percpuq_softint+0xb4
[ 20.7600388] 0x10007ed0: at softint_dispatch+0x1d0
[ 20.7600388] 0x10007f20: at softint_fast_dispatch+0xdc
[ 20.8700561] 0x10007fe8: at 0xfffffffc
[ 20.8700561] cpu0: End traceback...
Stopped in pid 0.3 (system) at netbsd:vpanic+0x148: or r3, r29, r29
gem0: receive error: RX overflow sc->rxptr 0, complete 4
Regards,
Dave
State-Changed-From-To: open->closed
State-Changed-By: gutteridge@NetBSD.org
State-Changed-When: Sun, 31 May 2020 19:31:17 +0000
State-Changed-Why:
Original issue seems to be addressed. gem0 issue now tracked in PR 55326.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.