NetBSD Problem Report #55933

From martin@duskware.de  Sat Jan 16 14:50:45 2021
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C177A1A924A
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 16 Jan 2021 14:50:45 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: diskless machine hangs at boot
X-Send-Pr-Version: 3.95

>Number:         55933
>Category:       kern
>Synopsis:       diskless machine hangs at boot
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jan 16 14:55:00 +0000 2021
>Last-Modified:  Sat Jan 30 17:32:38 +0000 2021
>Originator:     Martin Husemann
>Release:        NetBSD 9.99.77
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD space-truckin.duskware.de 9.99.77 NetBSD 9.99.77 (GENERIC) #115: Tue Jan 12 13:24:17 CET 2021 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/evbarm/compile/GENERIC evbarm
Architecture: earmv7hfeb
Machine: evbarm
>Description:

I updated a diskless test machine yesterday (from a state about two weeks old)
and now it does not complete rc.d:

[   1.0000000] ksyms: checking .text
[   1.0000000] ksyms: checking .rodata
[   1.0000000] ksyms: checking .eh_frame
[   1.0000000] ksyms: checking link_set_evcnts
[   1.0000000] ksyms: checking link_set_arm_platforms
[   1.0000000] ksyms: checking link_set_arm_cpu_methods
[   1.0000000] ksyms: checking link_set_fdt_consoles
[   1.0000000] ksyms: checking link_set_sysctl_funcs
[   1.0000000] ksyms: checking link_set_fdt_opps
[   1.0000000] ksyms: checking link_set_modules
[   1.0000000] ksyms: checking link_set_sdt_argtypes_set
[   1.0000000] ksyms: checking link_set_sdt_probes_set
[   1.0000000] ksyms: checking link_set_sdt_providers_set
[   1.0000000] ksyms: checking link_set_ieee80211_funcs
[   1.0000000] ksyms: checking link_set_domains
[   1.0000000] ksyms: checking link_set_linux_module_param_desc
[   1.0000000] ksyms: checking link_set_linux_module_param_info
[   1.0000000] ksyms: checking link_set_dkwedge_methods
[   1.0000000] ksyms: checking link_set_prop_linkpools
[   1.0000000] ksyms: checking .data
[   1.0000000] ksyms: checking .bss
[   1.0000000] ksyms: checking _init_memory
[   1.0000000] ksyms: checking .ARM.attributes
[   1.0000000] ksyms: checking .ident
[   1.0000000] ksyms: checking .comment
[   1.0000000] ksyms: checking .note.netbsd.ident
[   1.0000000] ksyms: checking .symtab
[   1.0000000] ksyms: checking .strtab
[   1.0000000] ksyms: checking .shstrtab
[   1.0000000] Loaded initial symtab at 0x8089efec, strtab at 0x8091a93c, # entries 31637
[   1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
[   1.0000000]     2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
[   1.0000000]     2018, 2019, 2020, 2021 The NetBSD Foundation, Inc.  All rights reserved.
[   1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[   1.0000000]     The Regents of the University of California.  All rights reserved.

[   1.0000000] NetBSD 9.99.77 (GENERIC) #15: Sat Jan 16 14:50:33 CET 2021
[   1.0000000] 	martin@martins.aprisoft.de:/usr/src/sys/arch/evbarm/compile/GENERIC
[   1.0000000] total memory = 256 MB
[   1.0000000] avail memory = 235 MB
[   1.0000000] entropy: no seed from bootloader
[   1.0000000] armfdt0 (root)
[   1.0000000] simplebus0 at armfdt0: Xunlong Orange Pi Zero
[   1.0000000] simplebus1 at simplebus0
[   1.0000000] simplebus2 at simplebus0
[   1.0000000] cpus0 at simplebus0
[   1.0000000] simplebus3 at simplebus0
[   1.0000000] cpu0 at cpus0: Cortex-A7 r0p5 (Cortex V7A core)
[   1.0000000] cpu0: DC enabled IC enabled WB enabled LABT branch prediction enabled
[   1.0000000] cpu0: 32KB/32B 2-way L1 VIPT Instruction cache
[   1.0000000] cpu0: 32KB/64B 4-way write-back-locking-C L1 PIPT Data cache
[   1.0000000] cpu0: 512KB/64B 8-way write-through L2 PIPT Unified cache
[   1.0000000] vfp0 at cpu0: NEON MPE (VFP 3.0+), rounding, NaN propagation, denormals
[   1.0000000] cpufreqdt0 at cpu0
[   1.0000000] cpu1 at cpus0
[   1.0000000] cpufreqdt1 at cpu1
[   1.0000000] cpu2 at cpus0
[   1.0000000] cpufreqdt2 at cpu2
[   1.0000000] cpu3 at cpus0
[   1.0000000] cpufreqdt3 at cpu3
[   1.0000000] gic0 at simplebus1: GIC
[   1.0000000] armgic0 at gic0: Generic Interrupt Controller, 160 sources (150 valid)
[   1.0000000] armgic0: 16 Priorities, 128 SPIs, 7 PPIs, 15 SGIs
[   1.0000000] fclock0 at simplebus2: 24000000 Hz fixed clock (osc24M)
[   1.0000000] fclock1 at simplebus2: 32768 Hz fixed clock (ext_osc32k)
[   1.0000000] gtmr0 at simplebus0: Generic Timer
[   1.0000000] gtmr0: interrupting on GIC irq 27
[   1.0000000] armgtmr0 at gtmr0: Generic Timer (24000 kHz, virtual)
[   1.0000030] sun8ih3ccu0 at simplebus1: H3 CCU
[   1.0000030] sun8ih3rccu0 at simplebus1: H3 PRCM CCU
[   1.0000030] sunxigpio0 at simplebus1: PIO
[   1.0000030] gpio0 at sunxigpio0: 94 pins
[   1.0000030] sunxigpio0: interrupting on GIC irq 43
[   1.0000030] sunxigpio1 at simplebus1: PIO
[   1.0000030] gpio1 at sunxigpio1: 12 pins
[   1.0000030] sunxigpio1: interrupting on GIC irq 77
[   1.0000030] sunxisramc0 at simplebus1: SRAM Controller
[   1.0000030] sun6idma0 at simplebus1: DMA controller (12 channels)
[   1.0000030] sun6idma0: interrupting on GIC irq 82
[   1.0000030] gregulator0 at simplebus0: vdd-cpux
[   1.0000030] sunxisid0 at simplebus1: Security ID EFUSE
[   1.0000030] com0 at simplebus1: DesignWare APB UART, no fifo
[   1.0000030] com0: console
[   1.0000030] com0: interrupting on GIC irq 32
[   1.0000030] fregulator0 at simplebus0: vcc3v3
[   1.0000030] fregulator1 at simplebus0: vcc-wifi
[   1.0000030] fregulator2 at simplebus0: vcc3v0
[   1.0000030] fregulator3 at simplebus0: vcc5v0
[   1.0000030] sunxiusbphy0 at simplebus1: USB PHY
[   1.0000030] /soc/clock@1000000 at simplebus1 not configured
[   1.0000030] /soc/mixer@1100000 at simplebus1 not configured
[   1.0000030] /soc/lcd-controller@1c0c000 at simplebus1 not configured
[   1.0000030] mmcpwrseq0 at simplebus0: Simple MMC power sequence provider
[   1.0000030] sunxirtc0 at simplebus1: RTC
[   1.0000030] emac0 at simplebus1: EMAC
[   1.0000030] emac0: Ethernet address 02:42:95:64:50:6e
[   1.0000030] emac0: interrupting on GIC irq 114
[   1.0000030] ukphy0 at emac0 phy 1: OUI 0x0088a0, model 0x0000, rev. 0
[   1.0000030] ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
[   1.0000030] h3codec0 at simplebus1: H3 Audio Codec (analog part)
[   1.0000030] /soc/hdmi-phy@1ef0000 at simplebus1 not configured
[   1.0000030] /soc/dram-controller@1c62000 at simplebus1 not configured
[   1.0000030] /opp_table0 at simplebus0 not configured
[   1.0000030] sunximmc0 at simplebus1: SD/MMC controller
[   1.0000030] sunximmc0: interrupting on GIC irq 92
[   1.0000030] sunximmc1 at simplebus1: SD/MMC controller
[   1.0000030] sunximmc1: interrupting on GIC irq 93
[   1.0000030] motg0 at simplebus1: 'peripheral' mode not supported
[   1.0000030] ehci0 at simplebus1: EHCI
[   1.0000030] ehci0: interrupting on GIC irq 104
[   1.0000030] ehci0: 1 companion controller, 1 port
[   1.0000030] usb0 at ehci0: USB revision 2.0
[   1.0000030] ohci0 at simplebus1: OHCI
[   1.0000030] ohci0: interrupting on GIC irq 105
[   1.0000030] ohci0: OHCI version 1.0
[   1.0000030] usb1 at ohci0: USB revision 1.0
[   1.0000030] ehci1 at simplebus1: EHCI
[   1.0000030] ehci1: interrupting on GIC irq 106
[   1.0000030] ehci1: 1 companion controller, 1 port
[   1.0000030] usb2 at ehci1: USB revision 2.0
[   1.0000030] ohci1 at simplebus1: OHCI
[   1.0000030] ohci1: interrupting on GIC irq 107
[   1.0000030] ohci1: OHCI version 1.0
[   1.0000030] usb3 at ohci1: USB revision 1.0
[   1.0000030] sunxiwdt0 at simplebus1: Watchdog
[   1.0000030] sunxiwdt0: default watchdog period is 16 seconds
[   1.0000030] /soc/deinterlace@1400000 at simplebus1 not configured
[   1.0000030] /soc/crypto@1c15000 at simplebus1 not configured
[   1.0000030] /soc/gpu@1c40000 at simplebus1 not configured
[   1.0000030] psci0 at simplebus0: PSCI 0.1
[   1.0000030] gpioleds0 at simplebus0: orangepi:green:pwr orangepi:red:status
[   1.0000030] /soc/timer@1c20c00 at simplebus1 not configured
[   1.0000030] /soc/video-codec@1c0e000 at simplebus1 not configured
[   1.0000030] entropy: WARNING: extracting entropy too early
[   1.0000030] cpu0 has 1 core siblings: cpu0
[   1.0000030] cpu0 has 4 pkg siblings: cpu1 cpu2 cpu3 cpu0
[   1.0000030] cpu0 has 1 1st siblings: cpu0
[   1.0000030] cpu0 first in package: cpu0
[   1.0000030] cpu1 has 1 core siblings: cpu1
[   1.0000030] cpu1 has 4 pkg siblings: cpu2 cpu3 cpu0 cpu1
[   1.0000030] cpu1 has 1 1st siblings: cpu0
[   1.0000030] cpu1 first in package: cpu0
[   1.0000030] cpu2 has 1 core siblings: cpu2
[   1.0000030] cpu2 has 4 pkg siblings: cpu3 cpu0 cpu1 cpu2
[   1.0000030] cpu2 has 1 1st siblings: cpu0
[   1.0000030] cpu2 first in package: cpu0
[   1.0000030] cpu3 has 1 core siblings: cpu3
[   1.0000030] cpu3 has 4 pkg siblings: cpu0 cpu1 cpu2 cpu3
[   1.0000030] cpu3 has 1 1st siblings: cpu0
[   1.0000030] cpu3 first in package: cpu0
[   1.0000030] cpu_hatch(cpu3): cpu_hatch(cpu1): cpu_hatch(cpu2):  vectors vectors vbar=0x8009c240 vectors stacks vbar=0x8009c240 tlb vbar=0x8009c240 stackscpu1 stacks tlb: Cortex-A7 r0p5 (Cortex V7A core)
[   1.0354407]  tlbcpu1: DC enabled IC enabled WB enabled LABT branch prediction enabled
[   1.0454226] cpu1: 32KB/32B 2-way L1 VIPT Instruction cache
[   1.0554225] cpu1: 32KB/64B 4-way write-back-locking-C L1 PIPT Data cache
[   1.0554225] cpu1: 512KB/64B 8-way write-through L2 PIPT Unified cache
[   1.0654229]  vfpvfp1 at cpu1: NEON MPE (VFP 3.0+), rounding, NaN propagation, denormals
[   1.0754230]  md(0x8007f1f4)cpu2 interrupts: Cortex-A7 r0p5 (Cortex V7A core)
[   1.0754230]  done!
[   1.0854230] cpu2: DC enabled IC enabled WB enabled LABT branch prediction enabled
[   1.0854230] cpu2: 32KB/32B 2-way L1 VIPT Instruction cache
[   1.0954275] cpu2: 32KB/64B 4-way write-back-locking-C L1 PIPT Data cache
[   1.1054236] cpu2: 512KB/64B 8-way write-through L2 PIPT Unified cache
[   1.1054236]  vfpvfp2 at cpu2: NEON MPE (VFP 3.0+), rounding, NaN propagation, denormals
[   1.1154240]  md(0x8007f1f4)cpu3 interrupts: Cortex-A7 r0p5 (Cortex V7A core)
[   1.1254240]  done!
[   1.1254240] cpu3: DC enabled IC enabled WB enabled LABT branch prediction enabled
[   1.1354288] cpu3: 32KB/32B 2-way L1 VIPT Instruction cache
[   1.1354288] cpu3: 32KB/64B 4-way write-back-locking-C L1 PIPT Data cache
[   1.1454246] cpu3: 512KB/64B 8-way write-through L2 PIPT Unified cache
[   1.1554287]  vfpvfp3 at cpu3: NEON MPE (VFP 3.0+), rounding, NaN propagation, denormals
[   1.1554287]  md(0x8007f1f4) interrupts done!
[   1.1654252] cpufreqdt0: rate: 1008.000 MHz, 1200000 uV
[   1.1754305] sdmmc0 at sunximmc0
[   1.1954217] uhub0 at usb2: NetBSD (0x0000) EHCI root hub (0x0000), class 9/0, rev 2.00/1.00, addr 1
[   1.2054203] uhub1 at usb1: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev 1.00/1.00, addr 1
[   1.2154230] uhub2 at usb3: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev 1.00/1.00, addr 1
[   1.2306155] uhub3 at usb0: NetBSD (0x0000) EHCI root hub (0x0000), class 9/0, rev 2.00/1.00, addr 1
[   1.3054255] sdmmc0: SD card status: 4-bit, C10
[   1.3162048] ld0 at sdmmc0: <0x9f:0x5449:00000:0x00:0x00000010:0x127>
[   1.3162048] ld0: 7624 MB, 3872 cyl, 64 head, 63 sec, 512 bytes/sect x 15613952 sectors
[   1.3354299] ld0: 4-bit width, High-Speed/SDR25, 50.000 MHz
[   1.3754250] sdmmc1 at sunximmc1
[   1.5854309] sdmmc1: 4-bit width, 50.000 MHz
[   1.5983208] (manufacturer 0x20, product 0x2281) at sdmmc1 function 1 not configured
[   3.4655210] uhub4 at uhub0 port 1: vendor 1a40 (0x1a40) USB 2.0 Hub (0x0101), class 9/0, rev 2.00/1.11, addr 2
[   3.4755195] uhub4: single transaction translator
[   4.7955545] run0 at uhub4 port 2
[   4.7955545] run0: Ralink (0x148f) 802.11 n WLAN (0x5370), rev 2.00/1.01, addr 3
[   4.8855570] run0: link state DOWN (was UNKNOWN)
[   5.8555833] urtwn0 at uhub4 port 3
[   5.8555833] urtwn0: Vealtek (0x7392) product 7811 (0x7811), rev 2.00/2.00, addr 4
[   5.9955870] urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 08:be:ac:0b:13:1d
[   6.0055869] urtwn0: 1 rx pipe, 2 tx pipes
[   6.0055869] urtwn0: link state DOWN (was UNKNOWN)
[   6.0255912] boot device: <unknown>
[   6.0355941] root on emac0
[   6.0355941] mountroot: trying ffs...
[   6.0355941] mountroot: trying ext2fs...
[   6.0455885] mountroot: trying nfs...
[   6.0492273] nfs_boot: trying DHCP/BOOTP
[   6.0492273] emac0: link state DOWN (was UNKNOWN)
[   9.0456641] emac0: link state UP (was DOWN)
[   9.0456641] nfs_boot: DHCP next-server: 192.168.111.42
[   9.0556662] nfs_boot: my_domain=aprisoft.de
[   9.0556662] nfs_boot: my_addr=192.168.111.63
[   9.0656650] nfs_boot: my_mask=255.255.255.0
[   9.0656650] nfs_boot: gateway=192.168.111.42
[  21.0759940] root on 192.168.111.42:/zero2
[  21.0759940] root time: 0x6002fb99
[  21.0759940] root file system type: nfs
[  21.0859934] kern.module.path=/stand/evbarm/9.99.77/modules
[  21.0859934] WARNING: clock lost 7686 days
[  21.0959941] WARNING: using filesystem time
[  21.1017747] WARNING: CHECK AND RESET THE DATE!
[  21.1059931] init: copying out path `/sbin/init' 11
Created tmpfs /dev (1359872 byte, 2624 inodes)
Sat Jan 16 14:43:40 UTC 2021
Not checking /: fs_passno = 0 in /etc/fstab
mount_nfs: emmas:/zero2 on /: Operation not supported
Starting file system checks:
random_seed: /var/db/entropy-file: Not present; creating
[  29.5566240] entropy: WARNING: consolidating less than full entropy
Saved entropy to /var/db/entropy-file.
Setting tty flags.
Setting sysctl variables:
ddb.onpanic: 1 -> 0

^C


~Stopped in pid 0.2 (system) at  netbsd:cpu_Debugger+0x4:        bx      r14
db{0}> ps
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
620  > 620 7   2     40000           90f679c0                 sh
425    425 3   1       180           91514240                 sh pipe_rd
477  > 477 7   1     40000           91514500                 sh
478    478 3   0       180           910a3780              sleep nanoslp
367    367 3   2       180           910a3200                 sh wait
171    171 3   1       180           910a3d00                 sh pipe_rd
170    170 3   3       180           9102dcc0                 sh wait
166    166 3   2       180           9102da00                 sh wait
1        1 3   3       180           90f55980               init wait
0      187 5   1       240           90f67c80           (zombie)
0      215 3   3       200           9102d740            ioflush syncer
0      214 3   1       200           9102d480          pooldrain pooldrain
0      213 3   0       200           9102d1c0           pgdaemon pgdaemon
0      210 3   1       200           90f67700              nfsio nfsiod
0      209 3   0       200           90f67440              nfsio nfsiod
0      208 3   1       200           90ecb500              nfsio nfsiod
0      207 3   3       200           907efd00              nfsio nfsiod
0      206 3   0       200           90ecb7c0               usb0 usbevt
0      205 3   1       200           90ecba80               usb3 usbevt
0      204 3   0       200           90ecb240               usb1 usbevt
0      203 3   1       200           907efa40               usb2 usbevt
0      202 3   0       200           90f67180             sdmmc1 mmctaskq
0      200 3   3       200           90f55c40             npfgc0 npfgcw
0      199 3   0       200           90f556c0            rt_free rt_free
0      198 3   0       200           90f55400              unpgc unpgc
0      197 3   3       200           90f55140    key_timehandler key_timehandler

0      196 3   3       200           90f4cc00    icmp6_wqinput/3 icmp6_wqinput
0      195 3   2       200           90f4c940    icmp6_wqinput/2 icmp6_wqinput
0      194 3   1       200           90f4c680    icmp6_wqinput/1 icmp6_wqinput
0      193 3   0       200           90f4c3c0    icmp6_wqinput/0 icmp6_wqinput
0      192 3   0       200           90f4c100          nd6_timer nd6_timer
0       31 3   3       200           90f43bc0    carp6_wqinput/3 carp6_wqinput
0       63 3   2       200           90f43900    carp6_wqinput/2 carp6_wqinput
0      126 3   1       200           90f43640    carp6_wqinput/1 carp6_wqinput
0      125 3   0       200           90f43380    carp6_wqinput/0 carp6_wqinput
0      124 3   3       200           90f430c0     carp_wqinput/3 carp_wqinput
0      123 3   2       200           90f39b80     carp_wqinput/2 carp_wqinput
0      122 3   1       200           90f398c0     carp_wqinput/1 carp_wqinput
0      121 3   0       200           90f39600     carp_wqinput/0 carp_wqinput
0      120 3   3       200           90f39340     icmp_wqinput/3 icmp_wqinput
0      119 3   2       200           90f39080     icmp_wqinput/2 icmp_wqinput
0      118 3   1       200           90ecdb40     icmp_wqinput/1 icmp_wqinput
0      117 3   0       200           90ecd880     icmp_wqinput/0 icmp_wqinput
0      116 3   2       200           90ecd5c0           rt_timer rt_timer
0      115 3   2       200           90ecbd40        vmem_rehash vmem_rehash
0      114 3   0       200           90ecd040          entbutler entropy
0      113 3   0       200           90ecd300             sdmmc0 mmctaskq
0      104 3   0       200           907ef780         usbtask-dr usbtsk
0      103 3   0       200           907ef4c0         usbtask-hc usbtsk
0      102 3   3       200           907ef200            xcall/3 xcall
0      101 1   3       200           9079ccc0          softser/3
0      100 1   3       200           9079ca00          softclk/3
0       99 1   3       200           9079c740          softbio/3
0       98 1   3       200           9079c480          softnet/3
0    >  97 1   3       201           9079c1c0             idle/3
0       96 3   2       200           90798c80            xcall/2 xcall
0       30 1   2       200           907989c0          softser/2
0       29 1   2       200           90798700          softclk/2
0       28 1   2       200           90798440          softbio/2
0       27 1   2       200           90798180          softnet/2
0       26 1   2       201           90793c40             idle/2
0       25 3   1       200           90793980            xcall/1 xcall
0       24 1   1       200           907936c0          softser/1
0       23 1   1       200           90793400          softclk/1
0       22 1   1       200           90793140          softbio/1
0       21 1   1       200           9077dc00          softnet/1
0       20 1   1       201           9077d940             idle/1
0       19 3   0       200           9077d680           lnxpwrwq lnxpwrwq
0       18 3   3       200           9077d3c0           lnxlngwq lnxlngwq
0       17 3   0       200           9077d100           lnxsyswq lnxsyswq
0       16 3   0       200           90778bc0           lnxrcugc lnxrcugc
0       15 3   0       200           90778900             sysmon smtaskq
0       14 3   0       200           90778640         pmfsuspend pmfsuspend
0       13 3   0       200           90778380           pmfevent pmfevent
0       12 3   0       200           907780c0         sopendfree sopendfr
0       11 3   3       200           90765b80            iflnkst iflnkst
0       10 3   0       200           907658c0           nfssilly nfssilly
0        9 3   0       200           90765600             vdrain vdrain
0        8 3   0       200           90765340          modunload mod_unld
0        7 3   0       200           90765080            xcall/0 xcall
0        6 1   0       200           90763b40          softser/0
0        5 1   0       200           90763880          softclk/0
0        4 1   0       200           907635c0          softbio/0
0        3 1   0       200           90763300          softnet/0
0    >   2 1   0       201           90763040             idle/0
0        0 3   2       200           80891100            swapper uvm
db{0}> c
Mutex error: mi_userret,116: sleep lock held

[  94.4913070] lock address : 0x00000000915a1068 type     :     sleep/adaptive
[  94.5013040] initialized  : 0x00000000800a3098
[  94.5013040] shared holds :                  0 exclusive:                  1
[  94.5113015] shares wanted:                  0 exclusive:                  0
[  94.5113015] relevant cpu :                  1 last held:                  1
[  94.5213018] relevant lwp : 0x0000000091514500 last held: 0x0000000091514500
[  94.5313018] last locked* : 0x00000000800a4e74 unlocked : 0x00000000800a5090
[  94.5413024] owner field  : 0x0000000091514500 wait/spin:                0/0
[  94.5413024] Turnstile: no active turnstile for this lock.

[  94.5513024] panic: LOCKDEBUG: Mutex error: mi_userret,116: sleep lock held
[  94.5513024] cpu1: Begin traceback...
[  94.5613030] 0x9d3e7e14: netbsd:db_panic+0x14
[  94.5613030] 0x9d3e7e2c: netbsd:vpanic+0xe4
[  94.5713055] 0x9d3e7e44: netbsd:snprintf
[  94.5713055] 0x9d3e7e84: netbsd:lockdebug_more
[  94.5813036] 0x9d3e7ebc: netbsd:lockdebug_barrier+0x14c
[  94.5813036] 0x9d3e7ee4: netbsd:userret+0x110
[  94.5913037] 0x9d3e7fac: netbsd:prefetch_abort_handler+0x308
[  94.5913037] cpu1: End traceback...
[  94.6013039] rebooting...


>How-To-Repeat:
s/a

>Fix:
n/a

>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/55933: diskless machine hangs at boot
Date: Sat, 16 Jan 2021 17:38:08 +0000

 This (strangely) seems to be a userland issue.

 The version I was running before was from sometime Dec 19-ish, so
 I checked out the state of 20201220 12:00UTC and replaced the kernel -
 this did not help.

 Then I build all sets for that source and extracted only base.tgz, which
 fixed the issue.

 So some change in userland after that broke it. Will narrow down further.

 Martin

Responsible-Changed-From-To: kern-bug-people->joerg
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Sun, 17 Jan 2021 08:27:59 +0000
Responsible-Changed-Why:
over to commiter of the change that broke it


From: Martin Husemann <martin@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: joerg@NetBSD.org
Subject: Re: bin/55933: diskless machine hangs at boot
Date: Sun, 17 Jan 2021 08:31:21 +0000

 I have no idea why, but this commit makes earmv7hfeb machines hang at
 boot while running /etc/rc.d/*.

 Martin

 ----- Forwarded message from Joerg Sonnenberger <joerg@netbsd.org> -----

 Date: Thu, 7 Jan 2021 14:41:50 +0000
 From: Joerg Sonnenberger <joerg@netbsd.org>
 To: source-changes@NetBSD.org
 Subject: CVS commit: src/lib/libc/cdb
 X-Mailer: log_accum

 Module Name:	src
 Committed By:	joerg
 Date:		Thu Jan  7 14:41:50 UTC 2021

 Modified Files:
 	src/lib/libc/cdb: cdbw.c

 Log Message:
 Optimize CPU and memory use of cdbw(3)

 Reduce memory footprint and processing time by dropping the vertex parts
 of the edges kept during the peeling. Hook up the
 division-by-multiplication logic to help older platforms.


 To generate a diff of this commit:
 cvs rdiff -u -r1.6 -r1.7 src/lib/libc/cdb/cdbw.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.


 ----- End forwarded message -----

Responsible-Changed-From-To: joerg->kern-bug-people
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Mon, 18 Jan 2021 12:56:34 +0000
Responsible-Changed-Why:
Seems to be a kernel issue


From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/55933: diskless machine hangs at boot
Date: Mon, 18 Jan 2021 14:02:31 +0100

 The effect is very volatile, adding simple debug code to cdbw_open()
 makes the issue go away, but also confirms that *no* use of the cdbw
 code happens before the hang.

 Maybe some libc code layout issue and the cdbw change pushing things
 to a new (not yet used before) page? Plus some emac(4) driver bug breaking
 NFS traffic?

 In case of the hang the shell executing /etc/rc gets into rc.d/network,
 close to the end and hangs in 

 	load_rc_config_var dhcpcd dhcpcd

 right after testing the function was called with two args (so maybe
 the following eval failed to exec a new sh?)

 With "set -X" added to the start of rc.d/network:

 ---8<---
 Setting sysctl variables:
 ddb.onpanic: 1 -> 0
 + name=network
 + start_cmd=network_start
 + stop_cmd=network_stop
 + nl='
 '
 + load_rc_config network
 + _command=network
 + '[' -z network ']'
 + true
 + ':'
 + '[' -f /etc/rc.conf.d/network ']'
 + load_rc_config_var dhcpcd dhcpcd
 + '[' 2 -ne 2 ']'
 --->8---

 Any ideas how to trace this further?

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/55933: diskless machine hangs at boot
Date: Mon, 18 Jan 2021 14:44:30 +0100

 I have a tcpdump capture from the NFS server while the client "hangs",
 if anyone feels like having a look at that, please contact me off list.

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55933: diskless machine hangs at boot
Date: Thu, 21 Jan 2021 20:37:39 +0100

 It seems to be looping in pmap of the shell process executing rc.d/network.
 On every third (or so) entry to ddb the "continue" kills the kernel due
 to reaching userret() with the pmap lock (of that shell process) still
 held.

 Below is the tail of a uvmhist log before the lock held panic.

 Martin

 1611256850.687880 prefetch_abort_handler#4094920@3: called!
 1611256850.687880 prefetch_abort_handler#4094920@3:  (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
 1611256850.687881 pmap_fault_fixup#4108685@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
 1611256850.687881 pmap_fault_fixup#4108685@3:  ti=0x8089c670 pai=0x915c849c asid=0x60
 1611256850.687887 prefetch_abort_handler#4094920@3:  <- emulated
 1611256850.687888 prefetch_abort_handler#4094921@3: called!
 1611256850.687888 prefetch_abort_handler#4094921@3:  (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
 1611256850.687889 pmap_fault_fixup#4108686@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
 1611256850.687889 pmap_fault_fixup#4108686@3:  ti=0x8089c670 pai=0x915c849c asid=0x60
 1611256850.687895 prefetch_abort_handler#4094921@3:  <- emulated
 1611256850.687896 prefetch_abort_handler#4094922@3: called!
 1611256850.687897 prefetch_abort_handler#4094922@3:  (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
 1611256850.687897 pmap_fault_fixup#4108687@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
 1611256850.687898 pmap_fault_fixup#4108687@3:  ti=0x8089c670 pai=0x915c849c asid=0x60
 1611256850.687903 prefetch_abort_handler#4094922@3:  <- emulated
 1611256850.687905 prefetch_abort_handler#4094923@3: called!
 1611256850.687905 prefetch_abort_handler#4094923@3:  (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
 1611256850.687905 pmap_fault_fixup#4108688@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
 1611256850.687906 pmap_fault_fixup#4108688@3:  ti=0x8089c670 pai=0x915c849c asid=0x60
 1611256850.687912 prefetch_abort_handler#4094923@3:  <- emulated
 1611256850.687913 prefetch_abort_handler#4094924@3: called!
 1611256850.687913 prefetch_abort_handler#4094924@3:  (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
 1611256850.687914 pmap_fault_fixup#4108689@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
 1611256850.687914 pmap_fault_fixup#4108689@3:  ti=0x8089c670 pai=0x915c849c asid=0x60
 1611256850.687920 prefetch_abort_handler#4094924@3:  <- emulated
 1611256850.687921 prefetch_abort_handler#4094925@3: called!
 1611256850.687922 prefetch_abort_handler#4094925@3:  (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
 1611256850.687923 pmap_fault_fixup#4108690@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
 1611256850.687923 pmap_fault_fixup#4108690@3:  ti=0x8089c670 pai=0x915c849c asid=0x60
 1611256850.687929 prefetch_abort_handler#4094925@3:  <- emulated
 1611256850.687930 prefetch_abort_handler#4094926@3: called!
 1611256850.687931 prefetch_abort_handler#4094926@3:  (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
 1611256850.687931 pmap_fault_fixup#4108691@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
 1611256850.687932 pmap_fault_fixup#4108691@3:  ti=0x8089c670 pai=0x915c849c asid=0x60
 1611256850.687936 prefetch_abort_handler#4094926@3:  <- emulated
 db{0}> c
 Mutex error: mi_userret,116: sleep lock held

 [  66.9145944] lock address : 0x00000000915c8070 type     :     sleep/adaptive
 [  66.9245970] initialized  : 0x00000000800a5e8c
 [  66.9245970] shared holds :                  0 exclusive:                  1
 [  66.9345900] shares wanted:                  0 exclusive:                  0
 [  66.9345900] relevant cpu :                  3 last held:                  3
 [  66.9445905] relevant lwp : 0x00000000914fc880 last held: 0x00000000914fc880
 [  66.9545904] last locked* : 0x00000000800a88b8 unlocked : 0x00000000800a8cac
 [  66.9645909] owner field  : 0x00000000914fc880 wait/spin:                0/0
 [  66.9645909] Turnstile: no active turnstile for this lock.

 [  66.9745909] panic: LOCKDEBUG: Mutex error: mi_userret,116: sleep lock held
 [  66.9745909] cpu3: Begin traceback...
 [  66.9845916] 0x9d431e04: netbsd:db_panic+0x14
 [  66.9845916] 0x9d431e1c: netbsd:vpanic+0xe4
 [  66.9945916] 0x9d431e34: netbsd:snprintf
 [  66.9945916] 0x9d431e74: netbsd:lockdebug_more
 [  67.0045918] 0x9d431eac: netbsd:lockdebug_barrier+0x14c
 [  67.0045918] 0x9d431ed4: netbsd:userret+0x110
 [  67.0145919] 0x9d431fac: netbsd:prefetch_abort_handler+0x444
 [  67.0145919] cpu3: End traceback...

>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.