NetBSD Problem Report #55933
From martin@duskware.de Sat Jan 16 14:50:45 2021
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id C177A1A924A
for <gnats-bugs@gnats.NetBSD.org>; Sat, 16 Jan 2021 14:50:45 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: diskless machine hangs at boot
X-Send-Pr-Version: 3.95
>Number: 55933
>Category: kern
>Synopsis: diskless machine hangs at boot
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jan 16 14:55:00 +0000 2021
>Last-Modified: Sat Jan 30 17:32:38 +0000 2021
>Originator: Martin Husemann
>Release: NetBSD 9.99.77
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD space-truckin.duskware.de 9.99.77 NetBSD 9.99.77 (GENERIC) #115: Tue Jan 12 13:24:17 CET 2021 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/evbarm/compile/GENERIC evbarm
Architecture: earmv7hfeb
Machine: evbarm
>Description:
I updated a diskless test machine yesterday (from a state about two weeks old)
and now it does not complete rc.d:
[ 1.0000000] ksyms: checking .text
[ 1.0000000] ksyms: checking .rodata
[ 1.0000000] ksyms: checking .eh_frame
[ 1.0000000] ksyms: checking link_set_evcnts
[ 1.0000000] ksyms: checking link_set_arm_platforms
[ 1.0000000] ksyms: checking link_set_arm_cpu_methods
[ 1.0000000] ksyms: checking link_set_fdt_consoles
[ 1.0000000] ksyms: checking link_set_sysctl_funcs
[ 1.0000000] ksyms: checking link_set_fdt_opps
[ 1.0000000] ksyms: checking link_set_modules
[ 1.0000000] ksyms: checking link_set_sdt_argtypes_set
[ 1.0000000] ksyms: checking link_set_sdt_probes_set
[ 1.0000000] ksyms: checking link_set_sdt_providers_set
[ 1.0000000] ksyms: checking link_set_ieee80211_funcs
[ 1.0000000] ksyms: checking link_set_domains
[ 1.0000000] ksyms: checking link_set_linux_module_param_desc
[ 1.0000000] ksyms: checking link_set_linux_module_param_info
[ 1.0000000] ksyms: checking link_set_dkwedge_methods
[ 1.0000000] ksyms: checking link_set_prop_linkpools
[ 1.0000000] ksyms: checking .data
[ 1.0000000] ksyms: checking .bss
[ 1.0000000] ksyms: checking _init_memory
[ 1.0000000] ksyms: checking .ARM.attributes
[ 1.0000000] ksyms: checking .ident
[ 1.0000000] ksyms: checking .comment
[ 1.0000000] ksyms: checking .note.netbsd.ident
[ 1.0000000] ksyms: checking .symtab
[ 1.0000000] ksyms: checking .strtab
[ 1.0000000] ksyms: checking .shstrtab
[ 1.0000000] Loaded initial symtab at 0x8089efec, strtab at 0x8091a93c, # entries 31637
[ 1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
[ 1.0000000] 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
[ 1.0000000] 2018, 2019, 2020, 2021 The NetBSD Foundation, Inc. All rights reserved.
[ 1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[ 1.0000000] The Regents of the University of California. All rights reserved.
[ 1.0000000] NetBSD 9.99.77 (GENERIC) #15: Sat Jan 16 14:50:33 CET 2021
[ 1.0000000] martin@martins.aprisoft.de:/usr/src/sys/arch/evbarm/compile/GENERIC
[ 1.0000000] total memory = 256 MB
[ 1.0000000] avail memory = 235 MB
[ 1.0000000] entropy: no seed from bootloader
[ 1.0000000] armfdt0 (root)
[ 1.0000000] simplebus0 at armfdt0: Xunlong Orange Pi Zero
[ 1.0000000] simplebus1 at simplebus0
[ 1.0000000] simplebus2 at simplebus0
[ 1.0000000] cpus0 at simplebus0
[ 1.0000000] simplebus3 at simplebus0
[ 1.0000000] cpu0 at cpus0: Cortex-A7 r0p5 (Cortex V7A core)
[ 1.0000000] cpu0: DC enabled IC enabled WB enabled LABT branch prediction enabled
[ 1.0000000] cpu0: 32KB/32B 2-way L1 VIPT Instruction cache
[ 1.0000000] cpu0: 32KB/64B 4-way write-back-locking-C L1 PIPT Data cache
[ 1.0000000] cpu0: 512KB/64B 8-way write-through L2 PIPT Unified cache
[ 1.0000000] vfp0 at cpu0: NEON MPE (VFP 3.0+), rounding, NaN propagation, denormals
[ 1.0000000] cpufreqdt0 at cpu0
[ 1.0000000] cpu1 at cpus0
[ 1.0000000] cpufreqdt1 at cpu1
[ 1.0000000] cpu2 at cpus0
[ 1.0000000] cpufreqdt2 at cpu2
[ 1.0000000] cpu3 at cpus0
[ 1.0000000] cpufreqdt3 at cpu3
[ 1.0000000] gic0 at simplebus1: GIC
[ 1.0000000] armgic0 at gic0: Generic Interrupt Controller, 160 sources (150 valid)
[ 1.0000000] armgic0: 16 Priorities, 128 SPIs, 7 PPIs, 15 SGIs
[ 1.0000000] fclock0 at simplebus2: 24000000 Hz fixed clock (osc24M)
[ 1.0000000] fclock1 at simplebus2: 32768 Hz fixed clock (ext_osc32k)
[ 1.0000000] gtmr0 at simplebus0: Generic Timer
[ 1.0000000] gtmr0: interrupting on GIC irq 27
[ 1.0000000] armgtmr0 at gtmr0: Generic Timer (24000 kHz, virtual)
[ 1.0000030] sun8ih3ccu0 at simplebus1: H3 CCU
[ 1.0000030] sun8ih3rccu0 at simplebus1: H3 PRCM CCU
[ 1.0000030] sunxigpio0 at simplebus1: PIO
[ 1.0000030] gpio0 at sunxigpio0: 94 pins
[ 1.0000030] sunxigpio0: interrupting on GIC irq 43
[ 1.0000030] sunxigpio1 at simplebus1: PIO
[ 1.0000030] gpio1 at sunxigpio1: 12 pins
[ 1.0000030] sunxigpio1: interrupting on GIC irq 77
[ 1.0000030] sunxisramc0 at simplebus1: SRAM Controller
[ 1.0000030] sun6idma0 at simplebus1: DMA controller (12 channels)
[ 1.0000030] sun6idma0: interrupting on GIC irq 82
[ 1.0000030] gregulator0 at simplebus0: vdd-cpux
[ 1.0000030] sunxisid0 at simplebus1: Security ID EFUSE
[ 1.0000030] com0 at simplebus1: DesignWare APB UART, no fifo
[ 1.0000030] com0: console
[ 1.0000030] com0: interrupting on GIC irq 32
[ 1.0000030] fregulator0 at simplebus0: vcc3v3
[ 1.0000030] fregulator1 at simplebus0: vcc-wifi
[ 1.0000030] fregulator2 at simplebus0: vcc3v0
[ 1.0000030] fregulator3 at simplebus0: vcc5v0
[ 1.0000030] sunxiusbphy0 at simplebus1: USB PHY
[ 1.0000030] /soc/clock@1000000 at simplebus1 not configured
[ 1.0000030] /soc/mixer@1100000 at simplebus1 not configured
[ 1.0000030] /soc/lcd-controller@1c0c000 at simplebus1 not configured
[ 1.0000030] mmcpwrseq0 at simplebus0: Simple MMC power sequence provider
[ 1.0000030] sunxirtc0 at simplebus1: RTC
[ 1.0000030] emac0 at simplebus1: EMAC
[ 1.0000030] emac0: Ethernet address 02:42:95:64:50:6e
[ 1.0000030] emac0: interrupting on GIC irq 114
[ 1.0000030] ukphy0 at emac0 phy 1: OUI 0x0088a0, model 0x0000, rev. 0
[ 1.0000030] ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
[ 1.0000030] h3codec0 at simplebus1: H3 Audio Codec (analog part)
[ 1.0000030] /soc/hdmi-phy@1ef0000 at simplebus1 not configured
[ 1.0000030] /soc/dram-controller@1c62000 at simplebus1 not configured
[ 1.0000030] /opp_table0 at simplebus0 not configured
[ 1.0000030] sunximmc0 at simplebus1: SD/MMC controller
[ 1.0000030] sunximmc0: interrupting on GIC irq 92
[ 1.0000030] sunximmc1 at simplebus1: SD/MMC controller
[ 1.0000030] sunximmc1: interrupting on GIC irq 93
[ 1.0000030] motg0 at simplebus1: 'peripheral' mode not supported
[ 1.0000030] ehci0 at simplebus1: EHCI
[ 1.0000030] ehci0: interrupting on GIC irq 104
[ 1.0000030] ehci0: 1 companion controller, 1 port
[ 1.0000030] usb0 at ehci0: USB revision 2.0
[ 1.0000030] ohci0 at simplebus1: OHCI
[ 1.0000030] ohci0: interrupting on GIC irq 105
[ 1.0000030] ohci0: OHCI version 1.0
[ 1.0000030] usb1 at ohci0: USB revision 1.0
[ 1.0000030] ehci1 at simplebus1: EHCI
[ 1.0000030] ehci1: interrupting on GIC irq 106
[ 1.0000030] ehci1: 1 companion controller, 1 port
[ 1.0000030] usb2 at ehci1: USB revision 2.0
[ 1.0000030] ohci1 at simplebus1: OHCI
[ 1.0000030] ohci1: interrupting on GIC irq 107
[ 1.0000030] ohci1: OHCI version 1.0
[ 1.0000030] usb3 at ohci1: USB revision 1.0
[ 1.0000030] sunxiwdt0 at simplebus1: Watchdog
[ 1.0000030] sunxiwdt0: default watchdog period is 16 seconds
[ 1.0000030] /soc/deinterlace@1400000 at simplebus1 not configured
[ 1.0000030] /soc/crypto@1c15000 at simplebus1 not configured
[ 1.0000030] /soc/gpu@1c40000 at simplebus1 not configured
[ 1.0000030] psci0 at simplebus0: PSCI 0.1
[ 1.0000030] gpioleds0 at simplebus0: orangepi:green:pwr orangepi:red:status
[ 1.0000030] /soc/timer@1c20c00 at simplebus1 not configured
[ 1.0000030] /soc/video-codec@1c0e000 at simplebus1 not configured
[ 1.0000030] entropy: WARNING: extracting entropy too early
[ 1.0000030] cpu0 has 1 core siblings: cpu0
[ 1.0000030] cpu0 has 4 pkg siblings: cpu1 cpu2 cpu3 cpu0
[ 1.0000030] cpu0 has 1 1st siblings: cpu0
[ 1.0000030] cpu0 first in package: cpu0
[ 1.0000030] cpu1 has 1 core siblings: cpu1
[ 1.0000030] cpu1 has 4 pkg siblings: cpu2 cpu3 cpu0 cpu1
[ 1.0000030] cpu1 has 1 1st siblings: cpu0
[ 1.0000030] cpu1 first in package: cpu0
[ 1.0000030] cpu2 has 1 core siblings: cpu2
[ 1.0000030] cpu2 has 4 pkg siblings: cpu3 cpu0 cpu1 cpu2
[ 1.0000030] cpu2 has 1 1st siblings: cpu0
[ 1.0000030] cpu2 first in package: cpu0
[ 1.0000030] cpu3 has 1 core siblings: cpu3
[ 1.0000030] cpu3 has 4 pkg siblings: cpu0 cpu1 cpu2 cpu3
[ 1.0000030] cpu3 has 1 1st siblings: cpu0
[ 1.0000030] cpu3 first in package: cpu0
[ 1.0000030] cpu_hatch(cpu3): cpu_hatch(cpu1): cpu_hatch(cpu2): vectors vectors vbar=0x8009c240 vectors stacks vbar=0x8009c240 tlb vbar=0x8009c240 stackscpu1 stacks tlb: Cortex-A7 r0p5 (Cortex V7A core)
[ 1.0354407] tlbcpu1: DC enabled IC enabled WB enabled LABT branch prediction enabled
[ 1.0454226] cpu1: 32KB/32B 2-way L1 VIPT Instruction cache
[ 1.0554225] cpu1: 32KB/64B 4-way write-back-locking-C L1 PIPT Data cache
[ 1.0554225] cpu1: 512KB/64B 8-way write-through L2 PIPT Unified cache
[ 1.0654229] vfpvfp1 at cpu1: NEON MPE (VFP 3.0+), rounding, NaN propagation, denormals
[ 1.0754230] md(0x8007f1f4)cpu2 interrupts: Cortex-A7 r0p5 (Cortex V7A core)
[ 1.0754230] done!
[ 1.0854230] cpu2: DC enabled IC enabled WB enabled LABT branch prediction enabled
[ 1.0854230] cpu2: 32KB/32B 2-way L1 VIPT Instruction cache
[ 1.0954275] cpu2: 32KB/64B 4-way write-back-locking-C L1 PIPT Data cache
[ 1.1054236] cpu2: 512KB/64B 8-way write-through L2 PIPT Unified cache
[ 1.1054236] vfpvfp2 at cpu2: NEON MPE (VFP 3.0+), rounding, NaN propagation, denormals
[ 1.1154240] md(0x8007f1f4)cpu3 interrupts: Cortex-A7 r0p5 (Cortex V7A core)
[ 1.1254240] done!
[ 1.1254240] cpu3: DC enabled IC enabled WB enabled LABT branch prediction enabled
[ 1.1354288] cpu3: 32KB/32B 2-way L1 VIPT Instruction cache
[ 1.1354288] cpu3: 32KB/64B 4-way write-back-locking-C L1 PIPT Data cache
[ 1.1454246] cpu3: 512KB/64B 8-way write-through L2 PIPT Unified cache
[ 1.1554287] vfpvfp3 at cpu3: NEON MPE (VFP 3.0+), rounding, NaN propagation, denormals
[ 1.1554287] md(0x8007f1f4) interrupts done!
[ 1.1654252] cpufreqdt0: rate: 1008.000 MHz, 1200000 uV
[ 1.1754305] sdmmc0 at sunximmc0
[ 1.1954217] uhub0 at usb2: NetBSD (0x0000) EHCI root hub (0x0000), class 9/0, rev 2.00/1.00, addr 1
[ 1.2054203] uhub1 at usb1: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev 1.00/1.00, addr 1
[ 1.2154230] uhub2 at usb3: NetBSD (0x0000) OHCI root hub (0x0000), class 9/0, rev 1.00/1.00, addr 1
[ 1.2306155] uhub3 at usb0: NetBSD (0x0000) EHCI root hub (0x0000), class 9/0, rev 2.00/1.00, addr 1
[ 1.3054255] sdmmc0: SD card status: 4-bit, C10
[ 1.3162048] ld0 at sdmmc0: <0x9f:0x5449:00000:0x00:0x00000010:0x127>
[ 1.3162048] ld0: 7624 MB, 3872 cyl, 64 head, 63 sec, 512 bytes/sect x 15613952 sectors
[ 1.3354299] ld0: 4-bit width, High-Speed/SDR25, 50.000 MHz
[ 1.3754250] sdmmc1 at sunximmc1
[ 1.5854309] sdmmc1: 4-bit width, 50.000 MHz
[ 1.5983208] (manufacturer 0x20, product 0x2281) at sdmmc1 function 1 not configured
[ 3.4655210] uhub4 at uhub0 port 1: vendor 1a40 (0x1a40) USB 2.0 Hub (0x0101), class 9/0, rev 2.00/1.11, addr 2
[ 3.4755195] uhub4: single transaction translator
[ 4.7955545] run0 at uhub4 port 2
[ 4.7955545] run0: Ralink (0x148f) 802.11 n WLAN (0x5370), rev 2.00/1.01, addr 3
[ 4.8855570] run0: link state DOWN (was UNKNOWN)
[ 5.8555833] urtwn0 at uhub4 port 3
[ 5.8555833] urtwn0: Vealtek (0x7392) product 7811 (0x7811), rev 2.00/2.00, addr 4
[ 5.9955870] urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 08:be:ac:0b:13:1d
[ 6.0055869] urtwn0: 1 rx pipe, 2 tx pipes
[ 6.0055869] urtwn0: link state DOWN (was UNKNOWN)
[ 6.0255912] boot device: <unknown>
[ 6.0355941] root on emac0
[ 6.0355941] mountroot: trying ffs...
[ 6.0355941] mountroot: trying ext2fs...
[ 6.0455885] mountroot: trying nfs...
[ 6.0492273] nfs_boot: trying DHCP/BOOTP
[ 6.0492273] emac0: link state DOWN (was UNKNOWN)
[ 9.0456641] emac0: link state UP (was DOWN)
[ 9.0456641] nfs_boot: DHCP next-server: 192.168.111.42
[ 9.0556662] nfs_boot: my_domain=aprisoft.de
[ 9.0556662] nfs_boot: my_addr=192.168.111.63
[ 9.0656650] nfs_boot: my_mask=255.255.255.0
[ 9.0656650] nfs_boot: gateway=192.168.111.42
[ 21.0759940] root on 192.168.111.42:/zero2
[ 21.0759940] root time: 0x6002fb99
[ 21.0759940] root file system type: nfs
[ 21.0859934] kern.module.path=/stand/evbarm/9.99.77/modules
[ 21.0859934] WARNING: clock lost 7686 days
[ 21.0959941] WARNING: using filesystem time
[ 21.1017747] WARNING: CHECK AND RESET THE DATE!
[ 21.1059931] init: copying out path `/sbin/init' 11
Created tmpfs /dev (1359872 byte, 2624 inodes)
Sat Jan 16 14:43:40 UTC 2021
Not checking /: fs_passno = 0 in /etc/fstab
mount_nfs: emmas:/zero2 on /: Operation not supported
Starting file system checks:
random_seed: /var/db/entropy-file: Not present; creating
[ 29.5566240] entropy: WARNING: consolidating less than full entropy
Saved entropy to /var/db/entropy-file.
Setting tty flags.
Setting sysctl variables:
ddb.onpanic: 1 -> 0
^C
~Stopped in pid 0.2 (system) at netbsd:cpu_Debugger+0x4: bx r14
db{0}> ps
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
620 > 620 7 2 40000 90f679c0 sh
425 425 3 1 180 91514240 sh pipe_rd
477 > 477 7 1 40000 91514500 sh
478 478 3 0 180 910a3780 sleep nanoslp
367 367 3 2 180 910a3200 sh wait
171 171 3 1 180 910a3d00 sh pipe_rd
170 170 3 3 180 9102dcc0 sh wait
166 166 3 2 180 9102da00 sh wait
1 1 3 3 180 90f55980 init wait
0 187 5 1 240 90f67c80 (zombie)
0 215 3 3 200 9102d740 ioflush syncer
0 214 3 1 200 9102d480 pooldrain pooldrain
0 213 3 0 200 9102d1c0 pgdaemon pgdaemon
0 210 3 1 200 90f67700 nfsio nfsiod
0 209 3 0 200 90f67440 nfsio nfsiod
0 208 3 1 200 90ecb500 nfsio nfsiod
0 207 3 3 200 907efd00 nfsio nfsiod
0 206 3 0 200 90ecb7c0 usb0 usbevt
0 205 3 1 200 90ecba80 usb3 usbevt
0 204 3 0 200 90ecb240 usb1 usbevt
0 203 3 1 200 907efa40 usb2 usbevt
0 202 3 0 200 90f67180 sdmmc1 mmctaskq
0 200 3 3 200 90f55c40 npfgc0 npfgcw
0 199 3 0 200 90f556c0 rt_free rt_free
0 198 3 0 200 90f55400 unpgc unpgc
0 197 3 3 200 90f55140 key_timehandler key_timehandler
0 196 3 3 200 90f4cc00 icmp6_wqinput/3 icmp6_wqinput
0 195 3 2 200 90f4c940 icmp6_wqinput/2 icmp6_wqinput
0 194 3 1 200 90f4c680 icmp6_wqinput/1 icmp6_wqinput
0 193 3 0 200 90f4c3c0 icmp6_wqinput/0 icmp6_wqinput
0 192 3 0 200 90f4c100 nd6_timer nd6_timer
0 31 3 3 200 90f43bc0 carp6_wqinput/3 carp6_wqinput
0 63 3 2 200 90f43900 carp6_wqinput/2 carp6_wqinput
0 126 3 1 200 90f43640 carp6_wqinput/1 carp6_wqinput
0 125 3 0 200 90f43380 carp6_wqinput/0 carp6_wqinput
0 124 3 3 200 90f430c0 carp_wqinput/3 carp_wqinput
0 123 3 2 200 90f39b80 carp_wqinput/2 carp_wqinput
0 122 3 1 200 90f398c0 carp_wqinput/1 carp_wqinput
0 121 3 0 200 90f39600 carp_wqinput/0 carp_wqinput
0 120 3 3 200 90f39340 icmp_wqinput/3 icmp_wqinput
0 119 3 2 200 90f39080 icmp_wqinput/2 icmp_wqinput
0 118 3 1 200 90ecdb40 icmp_wqinput/1 icmp_wqinput
0 117 3 0 200 90ecd880 icmp_wqinput/0 icmp_wqinput
0 116 3 2 200 90ecd5c0 rt_timer rt_timer
0 115 3 2 200 90ecbd40 vmem_rehash vmem_rehash
0 114 3 0 200 90ecd040 entbutler entropy
0 113 3 0 200 90ecd300 sdmmc0 mmctaskq
0 104 3 0 200 907ef780 usbtask-dr usbtsk
0 103 3 0 200 907ef4c0 usbtask-hc usbtsk
0 102 3 3 200 907ef200 xcall/3 xcall
0 101 1 3 200 9079ccc0 softser/3
0 100 1 3 200 9079ca00 softclk/3
0 99 1 3 200 9079c740 softbio/3
0 98 1 3 200 9079c480 softnet/3
0 > 97 1 3 201 9079c1c0 idle/3
0 96 3 2 200 90798c80 xcall/2 xcall
0 30 1 2 200 907989c0 softser/2
0 29 1 2 200 90798700 softclk/2
0 28 1 2 200 90798440 softbio/2
0 27 1 2 200 90798180 softnet/2
0 26 1 2 201 90793c40 idle/2
0 25 3 1 200 90793980 xcall/1 xcall
0 24 1 1 200 907936c0 softser/1
0 23 1 1 200 90793400 softclk/1
0 22 1 1 200 90793140 softbio/1
0 21 1 1 200 9077dc00 softnet/1
0 20 1 1 201 9077d940 idle/1
0 19 3 0 200 9077d680 lnxpwrwq lnxpwrwq
0 18 3 3 200 9077d3c0 lnxlngwq lnxlngwq
0 17 3 0 200 9077d100 lnxsyswq lnxsyswq
0 16 3 0 200 90778bc0 lnxrcugc lnxrcugc
0 15 3 0 200 90778900 sysmon smtaskq
0 14 3 0 200 90778640 pmfsuspend pmfsuspend
0 13 3 0 200 90778380 pmfevent pmfevent
0 12 3 0 200 907780c0 sopendfree sopendfr
0 11 3 3 200 90765b80 iflnkst iflnkst
0 10 3 0 200 907658c0 nfssilly nfssilly
0 9 3 0 200 90765600 vdrain vdrain
0 8 3 0 200 90765340 modunload mod_unld
0 7 3 0 200 90765080 xcall/0 xcall
0 6 1 0 200 90763b40 softser/0
0 5 1 0 200 90763880 softclk/0
0 4 1 0 200 907635c0 softbio/0
0 3 1 0 200 90763300 softnet/0
0 > 2 1 0 201 90763040 idle/0
0 0 3 2 200 80891100 swapper uvm
db{0}> c
Mutex error: mi_userret,116: sleep lock held
[ 94.4913070] lock address : 0x00000000915a1068 type : sleep/adaptive
[ 94.5013040] initialized : 0x00000000800a3098
[ 94.5013040] shared holds : 0 exclusive: 1
[ 94.5113015] shares wanted: 0 exclusive: 0
[ 94.5113015] relevant cpu : 1 last held: 1
[ 94.5213018] relevant lwp : 0x0000000091514500 last held: 0x0000000091514500
[ 94.5313018] last locked* : 0x00000000800a4e74 unlocked : 0x00000000800a5090
[ 94.5413024] owner field : 0x0000000091514500 wait/spin: 0/0
[ 94.5413024] Turnstile: no active turnstile for this lock.
[ 94.5513024] panic: LOCKDEBUG: Mutex error: mi_userret,116: sleep lock held
[ 94.5513024] cpu1: Begin traceback...
[ 94.5613030] 0x9d3e7e14: netbsd:db_panic+0x14
[ 94.5613030] 0x9d3e7e2c: netbsd:vpanic+0xe4
[ 94.5713055] 0x9d3e7e44: netbsd:snprintf
[ 94.5713055] 0x9d3e7e84: netbsd:lockdebug_more
[ 94.5813036] 0x9d3e7ebc: netbsd:lockdebug_barrier+0x14c
[ 94.5813036] 0x9d3e7ee4: netbsd:userret+0x110
[ 94.5913037] 0x9d3e7fac: netbsd:prefetch_abort_handler+0x308
[ 94.5913037] cpu1: End traceback...
[ 94.6013039] rebooting...
>How-To-Repeat:
s/a
>Fix:
n/a
>Release-Note:
>Audit-Trail:
From: Martin Husemann <martin@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/55933: diskless machine hangs at boot
Date: Sat, 16 Jan 2021 17:38:08 +0000
This (strangely) seems to be a userland issue.
The version I was running before was from sometime Dec 19-ish, so
I checked out the state of 20201220 12:00UTC and replaced the kernel -
this did not help.
Then I build all sets for that source and extracted only base.tgz, which
fixed the issue.
So some change in userland after that broke it. Will narrow down further.
Martin
Responsible-Changed-From-To: kern-bug-people->joerg
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Sun, 17 Jan 2021 08:27:59 +0000
Responsible-Changed-Why:
over to commiter of the change that broke it
From: Martin Husemann <martin@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: joerg@NetBSD.org
Subject: Re: bin/55933: diskless machine hangs at boot
Date: Sun, 17 Jan 2021 08:31:21 +0000
I have no idea why, but this commit makes earmv7hfeb machines hang at
boot while running /etc/rc.d/*.
Martin
----- Forwarded message from Joerg Sonnenberger <joerg@netbsd.org> -----
Date: Thu, 7 Jan 2021 14:41:50 +0000
From: Joerg Sonnenberger <joerg@netbsd.org>
To: source-changes@NetBSD.org
Subject: CVS commit: src/lib/libc/cdb
X-Mailer: log_accum
Module Name: src
Committed By: joerg
Date: Thu Jan 7 14:41:50 UTC 2021
Modified Files:
src/lib/libc/cdb: cdbw.c
Log Message:
Optimize CPU and memory use of cdbw(3)
Reduce memory footprint and processing time by dropping the vertex parts
of the edges kept during the peeling. Hook up the
division-by-multiplication logic to help older platforms.
To generate a diff of this commit:
cvs rdiff -u -r1.6 -r1.7 src/lib/libc/cdb/cdbw.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
----- End forwarded message -----
Responsible-Changed-From-To: joerg->kern-bug-people
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Mon, 18 Jan 2021 12:56:34 +0000
Responsible-Changed-Why:
Seems to be a kernel issue
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/55933: diskless machine hangs at boot
Date: Mon, 18 Jan 2021 14:02:31 +0100
The effect is very volatile, adding simple debug code to cdbw_open()
makes the issue go away, but also confirms that *no* use of the cdbw
code happens before the hang.
Maybe some libc code layout issue and the cdbw change pushing things
to a new (not yet used before) page? Plus some emac(4) driver bug breaking
NFS traffic?
In case of the hang the shell executing /etc/rc gets into rc.d/network,
close to the end and hangs in
load_rc_config_var dhcpcd dhcpcd
right after testing the function was called with two args (so maybe
the following eval failed to exec a new sh?)
With "set -X" added to the start of rc.d/network:
---8<---
Setting sysctl variables:
ddb.onpanic: 1 -> 0
+ name=network
+ start_cmd=network_start
+ stop_cmd=network_stop
+ nl='
'
+ load_rc_config network
+ _command=network
+ '[' -z network ']'
+ true
+ ':'
+ '[' -f /etc/rc.conf.d/network ']'
+ load_rc_config_var dhcpcd dhcpcd
+ '[' 2 -ne 2 ']'
--->8---
Any ideas how to trace this further?
Martin
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/55933: diskless machine hangs at boot
Date: Mon, 18 Jan 2021 14:44:30 +0100
I have a tcpdump capture from the NFS server while the client "hangs",
if anyone feels like having a look at that, please contact me off list.
Martin
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/55933: diskless machine hangs at boot
Date: Thu, 21 Jan 2021 20:37:39 +0100
It seems to be looping in pmap of the shell process executing rc.d/network.
On every third (or so) entry to ddb the "continue" kills the kernel due
to reaching userret() with the pmap lock (of that shell process) still
held.
Below is the tail of a uvmhist log before the lock held panic.
Martin
1611256850.687880 prefetch_abort_handler#4094920@3: called!
1611256850.687880 prefetch_abort_handler#4094920@3: (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
1611256850.687881 pmap_fault_fixup#4108685@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
1611256850.687881 pmap_fault_fixup#4108685@3: ti=0x8089c670 pai=0x915c849c asid=0x60
1611256850.687887 prefetch_abort_handler#4094920@3: <- emulated
1611256850.687888 prefetch_abort_handler#4094921@3: called!
1611256850.687888 prefetch_abort_handler#4094921@3: (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
1611256850.687889 pmap_fault_fixup#4108686@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
1611256850.687889 pmap_fault_fixup#4108686@3: ti=0x8089c670 pai=0x915c849c asid=0x60
1611256850.687895 prefetch_abort_handler#4094921@3: <- emulated
1611256850.687896 prefetch_abort_handler#4094922@3: called!
1611256850.687897 prefetch_abort_handler#4094922@3: (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
1611256850.687897 pmap_fault_fixup#4108687@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
1611256850.687898 pmap_fault_fixup#4108687@3: ti=0x8089c670 pai=0x915c849c asid=0x60
1611256850.687903 prefetch_abort_handler#4094922@3: <- emulated
1611256850.687905 prefetch_abort_handler#4094923@3: called!
1611256850.687905 prefetch_abort_handler#4094923@3: (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
1611256850.687905 pmap_fault_fixup#4108688@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
1611256850.687906 pmap_fault_fixup#4108688@3: ti=0x8089c670 pai=0x915c849c asid=0x60
1611256850.687912 prefetch_abort_handler#4094923@3: <- emulated
1611256850.687913 prefetch_abort_handler#4094924@3: called!
1611256850.687913 prefetch_abort_handler#4094924@3: (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
1611256850.687914 pmap_fault_fixup#4108689@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
1611256850.687914 pmap_fault_fixup#4108689@3: ti=0x8089c670 pai=0x915c849c asid=0x60
1611256850.687920 prefetch_abort_handler#4094924@3: <- emulated
1611256850.687921 prefetch_abort_handler#4094925@3: called!
1611256850.687922 prefetch_abort_handler#4094925@3: (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
1611256850.687923 pmap_fault_fixup#4108690@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
1611256850.687923 pmap_fault_fixup#4108690@3: ti=0x8089c670 pai=0x915c849c asid=0x60
1611256850.687929 prefetch_abort_handler#4094925@3: <- emulated
1611256850.687930 prefetch_abort_handler#4094926@3: called!
1611256850.687931 prefetch_abort_handler#4094926@3: (pc=0x6d2fdffe, l=0x914fc880, tf=0x9d431fb0)
1611256850.687931 pmap_fault_fixup#4108691@3: called: pm=0x915c8070, va=0x6d2fd000, ftype=0x5, user=1
1611256850.687932 pmap_fault_fixup#4108691@3: ti=0x8089c670 pai=0x915c849c asid=0x60
1611256850.687936 prefetch_abort_handler#4094926@3: <- emulated
db{0}> c
Mutex error: mi_userret,116: sleep lock held
[ 66.9145944] lock address : 0x00000000915c8070 type : sleep/adaptive
[ 66.9245970] initialized : 0x00000000800a5e8c
[ 66.9245970] shared holds : 0 exclusive: 1
[ 66.9345900] shares wanted: 0 exclusive: 0
[ 66.9345900] relevant cpu : 3 last held: 3
[ 66.9445905] relevant lwp : 0x00000000914fc880 last held: 0x00000000914fc880
[ 66.9545904] last locked* : 0x00000000800a88b8 unlocked : 0x00000000800a8cac
[ 66.9645909] owner field : 0x00000000914fc880 wait/spin: 0/0
[ 66.9645909] Turnstile: no active turnstile for this lock.
[ 66.9745909] panic: LOCKDEBUG: Mutex error: mi_userret,116: sleep lock held
[ 66.9745909] cpu3: Begin traceback...
[ 66.9845916] 0x9d431e04: netbsd:db_panic+0x14
[ 66.9845916] 0x9d431e1c: netbsd:vpanic+0xe4
[ 66.9945916] 0x9d431e34: netbsd:snprintf
[ 66.9945916] 0x9d431e74: netbsd:lockdebug_more
[ 67.0045918] 0x9d431eac: netbsd:lockdebug_barrier+0x14c
[ 67.0045918] 0x9d431ed4: netbsd:userret+0x110
[ 67.0145919] 0x9d431fac: netbsd:prefetch_abort_handler+0x444
[ 67.0145919] cpu3: End traceback...
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.