NetBSD Problem Report #39283

From root@frost.netbsd.se  Mon Aug  4 14:00:42 2008
Return-Path: <root@frost.netbsd.se>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id EE2B763B853
	for <gnats-bugs@gnats.NetBSD.org>; Mon,  4 Aug 2008 14:00:41 +0000 (UTC)
Message-Id: <20080804124301.E522A4F344@frost.netbsd.se>
Date: Mon,  4 Aug 2008 14:43:01 +0200 (CEST)
From: fredrik@netbsd.se
Reply-To: fredrik@netbsd.se
To: gnats-bugs@gnats.NetBSD.org
Subject: Kernel crash on Dell Poweredge 2950
X-Send-Pr-Version: 3.95

>Number:         39283
>Category:       port-amd64
>Synopsis:       4.99.71 crashed after about 3-4 days when running MP-kernel
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-amd64-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Aug 04 14:05:01 +0000 2008
>Closed-Date:    Mon Jan 11 11:27:50 +0000 2010
>Last-Modified:  Mon Jan 11 11:27:50 +0000 2010
>Originator:     Fredrik Carlsson
>Release:        NetBSD 4.99.71
>Organization:

>Environment:


System: NetBSD frost.netbsd.se 4.99.71 NetBSD 4.99.71 (FROST.debug) #1: Wed Jul 30 22:01:59 CEST 2008 viktor@frost.netbsd.se:/usr/src/work/obj/sys/arch/amd64/compile/FROST.debug amd64
Architecture: x86_64
Machine: amd64
>Description:
	The kernel crashes after 3-4 days of operation, no matter whether the system is idle or not.
	Information from the crash can be found below, and crash dumps can be provided if needed. The kernel has been crashing ever since 4.0_beta2, which was stable: 4.0 and 4.99.XX both crash with an MP kernel. Booting with only one CPU is stable and works fine.



	db{0}> ps/l
 PID         LID S     FLAGS       STRUCT LWP *               NAME WAIT
 6359          1 7         4   ffff800078075be0                 sh     
 5402          1 3         4   ffff800078075420                 sh wait
 4768          1 3        84   ffff800079836800               make piperd
 27852         1 3        84   ffff8000783087c0                 sh wait  
 5060          1 3        84   ffff80005be6b7c0               make piperd
 5972          1 3     40084   ffff80005bb2c3e0                 sh wait  
 3577          1 3        84   ffff80005bd947c0               make piperd
 4072          1 7         4   ffff80005ce92ba0                ssh       
 4670          1 3        84   ffff80005ce9b400                cvs piperd
 4791          1 3         4   ffff80005b38a7e0                cvs biowait
 3077          1 3        80   ffff80007830d420             vmstat nanoslp
 29262         1 3        80   ffff80005bd8a7e0                 sh wait   
 2658          1 3        84   ffff80005b64cbe0                 sh wait
 3492          1 3        84   ffff80005ce9b7e0                 sh wait
 1752          1 3        80   ffff80005c084400                 sh wait
 2926          1 3        80   ffff80005b64c040                 sh wait
 3367          1 3        80   ffff80005bd8a400                 sh wait
 3147          1 3        80   ffff80005ce927c0               cron piperd
 2759          1 3        80   ffff80005bd943e0               cron piperd
 835           1 3        84   ffff80005c0847e0           postgres select
 811           1 3        84   ffff80005c084bc0           postgres select
 810           1 3        84   ffff80005be6b000           postgres select
 820           1 3        84   ffff80005be6bba0           postgres select
 847           1 3        84   ffff80005be6b3e0           postgres select
 425           1 3        80   ffff800053918000              getty ttyraw
 421           1 3        84   ffff80005bce5420               cron nanoslp
 402           1 3        84   ffff80005bce5be0       hobbitlaunch nanoslp
 358           1 3        80   ffff80005b3f8420              inetd kqueue 
 345           1 3        80   ffff80005bc18020               qmgr kqueue
 347           1 3        84   ffff800053901400             pickup kqueue
 342           1 3        84   ffff80005bb2c7c0             master kqueue
 255           1 3        80   ffff80005b3f8040               sshd select
 120           1 7         4   ffff80005b38a400            syslogd       
 1             1 3        84   ffff8000539067c0               init wait
>0            58 3       204   ffff80005b38a020            physiod physiod
              57 3       204   ffff8000539183e0        vmem_rehash vmem_rehash
              56 3       204   ffff8000539187c0           aiodoned aiodoned   
              55 3       204   ffff800053918ba0            ioflush syncer  
              54 3       204   ffff800053913040           pgdaemon pgdaemon
              53 3       204   ffff800053913420            pfpurge pftm    
              52 3       204   ffff800053905420          cryptoret crypto_wait
              51 3       204   ffff800053905040               usb1 usbevt     
              50 3       204   ffff800053901020               usb3 usbevt
              49 3       204   ffff800053905800               usb0 usbevt
              48 3       204   ffff800053913800         usbtask-dr usbtsk
              47 3       204   ffff800053913be0         usbtask-hc usbtsk
              46 3       204   ffff800053905be0               usb2 usbevt
              45 3       204   ffff800053907020          atapibus0 sccomp
              44 3       204   ffff800053907400               mfi0 mfi_mgmt
              43 3       204   ffff8000539077e0          coretemp3 coretemp3
              42 3       204   ffff800053907bc0          coretemp2 coretemp2
              41 3       204   ffff800053906000          coretemp1 coretemp1
              40 3       204   ffff8000539063e0          coretemp0 coretemp0
              31 3       204   ffff8000539017e0            atabus0 atath    
              30 2       204   ffff800053901bc0           scsibus0      
              29 3       204   ffff80004b27e000            xcall/3 xcall
              28 1       204   ffff80004b27e3e0          softser/3      
              27 1       204   ffff80004b27e7c0          softclk/3
              26 1       204   ffff80004b27eba0          softbio/3
              25 1       204   ffff80004b27c040          softnet/3
              24 1       205   ffff80004b27c420             idle/3
              23 3       204   ffff80004b27c800            xcall/2 xcall
              22 1       204   ffff80004b27cbe0          softser/2      
              21 1       204   ffff80004b27b020          softclk/2
              20 1       204   ffff80004b27b400          softbio/2
              19 1       204   ffff80004b27b7e0          softnet/2
              18 1       205   ffff80004b27bbc0             idle/2
              17 3       204   ffff80004b279000            xcall/1 xcall
              16 1       204   ffff80004b2793e0          softser/1      
              15 1       204   ffff80004b2797c0          softclk/1
              14 1       204   ffff80004b279ba0          softbio/1
              13 1       204   ffff80004b273040          softnet/1
              12 1       205   ffff80004b273420             idle/1
              11 3       204   ffff80004b273800             sysmon smtaskq
              10 3       204   ffff80004b273be0           pmfevent pmfevent
               9 3       204   ffff80004b26c020            cachegc cachegc 
               8 2       204   ffff80004b26c400              vrele        
               7 3       204   ffff80004b26c7e0            xcall/0 xcall
               6 1       204   ffff80004b26cbc0          softser/0      
               5 1       204   ffff80004b26a000          softclk/0
               4 1       204   ffff80004b26a3e0          softbio/0
               3 1       204   ffff80004b26a7c0          softnet/0
           >   2 7       205   ffff80004b26aba0             idle/0
               1 3       204   ffffffff80bd8e00            swapper schedule




db{0}> dmesg                                                               
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008                                                     
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993           
    The Regents of the University of California.  All rights reserved.

NetBSD 4.99.71 (FROST.debug) #1: Wed Jul 30 22:01:59 CEST 2008
        viktor@frost.netbsd.se:/usr/src/work/obj/sys/arch/amd64/compile/FROST.debug
total memory = 4095 MB
avail memory = 3954 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
SMBIOS rev. 2.4 @ 0xcffbc000 (62 entries)                        
Dell Inc. PowerEdge 2950                 
mainbus0 (root)         
cpu0 at mainbus0 apid 0: Intel 686-class, 1862MHz, id 0x6f6
cpu1 at mainbus0 apid 6: Intel 686-class, 1862MHz, id 0x6f6
cpu2 at mainbus0 apid 1: Intel 686-class, 1862MHz, id 0x6f6
cpu3 at mainbus0 apid 7: Intel 686-class, 1862MHz, id 0x6f6
ioapic0 at mainbus0 apid 8: pa 0xfec00000, version 20, 24 pins
ioapic1 at mainbus0 apid 9: pa 0xfec81000, version 20, 24 pins
acpi0 at mainbus0: Intel ACPICA 20080321                      
acpi0: X/RSDT: OemId <DELL  ,PE_SC3  ,00000001>, AslId <DELL,00000001>
acpi0: SCI interrupting at int 9                                      
acpi0: fixed-feature power button present
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
ACPI-Fast 24-bit timer                                                
attimer1 at acpi0 (TMR, PNP0100): AT Timer
attimer1: io 0x40-0x5f irq 0              
COMA (PNP0501) at acpi0 not configured
COMB (PNP0501) at acpi0 not configured
hpet0 at acpi0 (HPET, PNP0103-0)      
hpet0: mem 0xfed00000-0xfed003ff
timecounter: Timecounter "hpet0" frequency 14318179 Hz quality 2000
pci0 at mainbus0 bus 0: configuration mode 1                       
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0                                    
pchb0: vendor 0x8086 product 0x25c0 (rev. 0x12)
ppb0 at pci0 dev 2 function 0: vendor 0x8086 product 0x25e2 (rev. 0x12)
pci1 at ppb0 bus 6                                                     
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
ppb1 at pci1 dev 0 function 0: vendor 0x8086 product 0x3500 (rev. 0x01)
pci2 at ppb1 bus 7                                                     
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
ppb2 at pci2 dev 0 function 0: vendor 0x8086 product 0x3510 (rev. 0x01)
pci3 at ppb2 bus 8                                                     
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
ppb3 at pci3 dev 0 function 0: vendor 0x1166 product 0x0103 (rev. 0xc3)
ppb3: disabling notification events                                    
pci4 at ppb3 bus 9                 
pci4: i/o space, memory space enabled, rd/line, wr/inv ok
bnx0 at pci4 dev 0 function 0: Broadcom NetXtreme II BCM5708 1000Base-T
bnx0: Ethernet address 00:18:8b:8a:04:f9                               
brgphy0 at bnx0 phy 1: BCM5708C 1000BASE-T media interface, rev. 6
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ppb4 at pci2 dev 1 function 0: vendor 0x8086 product 0x3514 (rev. 0x01)
pci5 at ppb4 bus 10                                                    
pci5: i/o space, memory space enabled, rd/line, wr/inv ok
ppb5 at pci1 dev 0 function 3: vendor 0x8086 product 0x350c (rev. 0x01)
ppb5: disabling notification events                                    
pci6 at ppb5 bus 11                
pci6: i/o space, memory space enabled, rd/line, wr/inv ok
ppb6 at pci0 dev 3 function 0: vendor 0x8086 product 0x25e3 (rev. 0x12)
pci7 at ppb6 bus 1                                                     
pci7: i/o space, memory space enabled, rd/line, wr/inv ok
ppb7 at pci7 dev 0 function 0: vendor 0x8086 product 0x0370 (rev. 0x00)
ppb7: disabling notification events                                    
pci8 at ppb7 bus 2                 
pci8: i/o space, memory space enabled, rd/line, wr/inv ok
mfi0 at pci8 dev 14 function 0: Dell PERC 5/i integrated 
mfi0: interrupting at ioapic1 pin 14                    
mfi0: logical drives 2, version 5.0.2-0003, 256MB RAM
scsibus0 at mfi0: 64 targets, 8 luns per target      
ppb8 at pci7 dev 0 function 2: vendor 0x8086 product 0x0372 (rev. 0x00)
ppb8: disabling notification events                                    
pci9 at ppb8 bus 3                 
pci9: i/o space, memory space enabled, rd/line, wr/inv ok
ppb9 at pci0 dev 4 function 0: vendor 0x8086 product 0x25f8 (rev. 0x12)
pci10 at ppb9 bus 12                                                   
pci10: i/o space, memory space enabled, rd/line, wr/inv ok
ppb10 at pci0 dev 5 function 0: vendor 0x8086 product 0x25e5 (rev. 0x12)
pci11 at ppb10 bus 13                                                   
pci11: i/o space, memory space enabled, rd/line, wr/inv ok
ppb11 at pci0 dev 6 function 0: vendor 0x8086 product 0x25f9 (rev. 0x12)
pci12 at ppb11 bus 14                                                   
pci12: i/o space, memory space enabled, rd/line, wr/inv ok
ppb12 at pci0 dev 7 function 0: vendor 0x8086 product 0x25e7 (rev. 0x12)
pci13 at ppb12 bus 15                                                   
pci13: i/o space, memory space enabled, rd/line, wr/inv ok
pchb1 at pci0 dev 16 function 0                           
pchb1: vendor 0x8086 product 0x25f0 (rev. 0x12)
pchb2 at pci0 dev 16 function 1                
pchb2: vendor 0x8086 product 0x25f0 (rev. 0x12)
pchb3 at pci0 dev 16 function 2                
pchb3: vendor 0x8086 product 0x25f0 (rev. 0x12)
pchb4 at pci0 dev 17 function 0                
pchb4: vendor 0x8086 product 0x25f1 (rev. 0x12)
pchb5 at pci0 dev 19 function 0                
pchb5: vendor 0x8086 product 0x25f3 (rev. 0x12)
pchb6 at pci0 dev 21 function 0                
pchb6: vendor 0x8086 product 0x25f5 (rev. 0x12)
pchb7 at pci0 dev 22 function 0                
pchb7: vendor 0x8086 product 0x25f6 (rev. 0x12)
ppb13 at pci0 dev 28 function 0: vendor 0x8086 product 0x2690 (rev. 0x09)
pci14 at ppb13 bus 4                                                     
pci14: i/o space, memory space enabled, rd/line, wr/inv ok
ppb14 at pci14 dev 0 function 0: vendor 0x1166 product 0x0103 (rev. 0xc3)
ppb14: disabling notification events                                     
pci15 at ppb14 bus 5                
pci15: i/o space, memory space enabled, rd/line, wr/inv ok
bnx1 at pci15 dev 0 function 0: Broadcom NetXtreme II BCM5708 1000Base-T
bnx1: Ethernet address 00:18:8b:8a:04:f7                                
brgphy1 at bnx1 phy 1: BCM5708C 1000BASE-T media interface, rev. 6
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
uhci0 at pci0 dev 29 function 0: vendor 0x8086 product 0x2688 (rev. 0x09)
uhci0: interrupting at ioapic0 pin 21                                    
usb0 at uhci0: USB revision 1.0      
uhci1 at pci0 dev 29 function 1: vendor 0x8086 product 0x2689 (rev. 0x09)
uhci1: interrupting at ioapic0 pin 20                                    
usb1 at uhci1: USB revision 1.0      
uhci2 at pci0 dev 29 function 2: vendor 0x8086 product 0x268a (rev. 0x09)
uhci2: interrupting at ioapic0 pin 21                                    
usb2 at uhci2: USB revision 1.0      
ehci0 at pci0 dev 29 function 7: vendor 0x8086 product 0x268c (rev. 0x09)
ehci0: interrupting at ioapic0 pin 21                                    
ehci0: EHCI version 1.0              
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2
usb3 at ehci0: USB revision 2.0                              
ppb15 at pci0 dev 30 function 0: vendor 0x8086 product 0x244e (rev. 0xd9)
pci16 at ppb15 bus 16                                                    
pci16: i/o space, memory space enabled
vga0 at pci16 dev 13 function 0: vendor 0x1002 product 0x515e (rev. 0x02)
wsdisplay0 at vga0 kbdmux 1                                              
wsmux1: connecting to wsdisplay0
drm at vga0 not configured      
pcib0 at pci0 dev 31 function 0
pcib0: vendor 0x8086 product 0x2670 (rev. 0x09)
piixide0 at pci0 dev 31 function 1             
piixide0: Intel 631xESB/632xESB IDE Controller (rev. 0x09)
piixide0: bus-master DMA support present                  
piixide0: primary channel configured to compatibility mode
piixide0: primary channel interrupting at ioapic0 pin 14  
atabus0 at piixide0 channel 0                           
piixide0: secondary channel configured to compatibility mode
piixide0: secondary channel ignored (disabled)              
isa0 at pcib0                                 
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console                                              
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64                              
pckbdprobe: reset error 5    
pmsprobe: reset error 5  
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker (CPU-intensive output)
sysbeep0 at pcppi0                                
attimer1: attached to pcppi0
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
timecounter: Timecounter "TSC" frequency 1862173460 Hz quality 3000 
scsibus0: waiting 2 seconds for devices to settle...               
atapibus0 at atabus0: 2 targets                     
cd0 at atapibus0 drive 0: <HL-DT-ST DVD-ROM GDR-8084N, , 1.01> cdrom removable
uhub0 at usb2: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1  
uhub0: 2 ports with 2 removable, self powered                               
uhub1 at usb0: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered                               
uhub2 at usb3: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 6 ports with 6 removable, self powered                               
uhub3 at usb1: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered                               
cd0: 32-bit data port                        
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd0(piixide0:0:0): using PIO mode 4, DMA mode 2 (using DMA)            
uhub4 at uhub2 port 5: vendor 0x04b4 product 0x6560, class 9/0, rev 2.00/0.0b, addr 2
uhub4: multiple transaction translators
sd0 at scsibus0 target 0 lun 0: <DELL, PERC 5/i, 1.00> disk fixed
sd0: fabricating a geometry                                      
sd0: 1116 GB, 1142784 cyl, 64 head, 32 sec, 512 bytes/sect x 2340421632 sectors
sd0: fabricating a geometry                                                    
uhub4: 4 ports with 4 removable, self powered
sd1 at scsibus0 target 1 lun 0: <DELL, PERC 5/i, 1.00> disk fixed
sd1: fabricating a geometry                                      
sd1: 465 GB, 476416 cyl, 64 head, 32 sec, 512 bytes/sect x 975699968 sectors
sd1: fabricating a geometry                                                 
raidattach: Asked for 8 units
Kernelized RAIDframe activated
pad: requested 1 units        
pad0: outputs: 44100Hz, 16-bit, stereo
audio0 at pad0: half duplex           
Searching for RAID components...
boot device: sd0                
root on sd0a dumps on sd0b
mountroot: trying lfs...  
mountroot: trying ffs...
root file system type: ffs
init: copying out path `/sbin/init' 11
mag 0 78:1                            
mag 1 78:2
mag 2 78:3
mag 3 64:4
mag 4 64:5
mag 5 62:7f
arp info overwritten for 83.140.44.3 by 00:15:c5:5d:d4:e5
arp info overwritten for 83.140.44.3 by 00:15:c5:5d:d4:e3
arp info overwritten for 83.140.44.3 by 00:15:c5:5d:d4:e5
arp info overwritten for 83.140.44.3 by 00:15:c5:5d:d4:e3
fatal page fault in supervisor mode                      
trap type 6 code 0 rip 0 cs 8 rflags 10202 cr2  0 cpl 6 rsp ffff80004f8a3ab0




db{0}> mach cpu 0;           
Bad character     
using CPU 0  
db{0}> bt  
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff8053135c cs 8 rflags 10246 cr2  0 cpl 8 rsp ffff80004f8a34c0
kernel: page fault trap, code=0                                                            
Faulted in DDB; continuing...  


db{0}> mach cpu 1            
using CPU 1      
db{0}> bt  
_kernel_lock() at netbsd:_kernel_lock+0x12d
tcp_usrreq_wrapper() at netbsd:tcp_usrreq_wrapper+0x39
soreceive() at netbsd:soreceive+0x9d6                 
dofileread() at netbsd:dofileread+0x80
sys_read() at netbsd:sys_read+0x72    
syscall() at netbsd:syscall+0x9a  


db{0}> mach cpu 2               
using CPU 2      
db{0}> bt  
x86_pause() at netbsd:x86_pause+0x2
cdev_open() at netbsd:cdev_open+0x8d
spec_open() at netbsd:spec_open+0x15e
VOP_OPEN() at netbsd:VOP_OPEN+0x62   
vn_open() at netbsd:vn_open+0x236 
sys_open() at netbsd:sys_open+0xeb
syscall() at netbsd:syscall+0x9a  


db{0}> mach cpu 3               
using CPU 3      
db{0}> bt  
x86_pause() at netbsd:x86_pause+0x2
kevent1() at netbsd:kevent1+0x5e4  
sys_kevent() at netbsd:sys_kevent+0x33
syscall() at netbsd:syscall+0x9a      


db{0}> ps/w                                       
 PID        LID          COMMAND     EMUL  PRI WAIT-MSG    WAIT-CHANNEL
 6359         1               sh   netbsd   25              0          
 5402         1               sh   netbsd   25 wait         ffff80005ea61d80
 4768         1             make   netbsd   25 piperd       ffff80005bc504b0
 27852        1               sh   netbsd   25 wait         ffff80005ea61038
 5060         1             make   netbsd   25 piperd       ffff80005ca7d3b0
 5972         1               sh   netbsd   25 wait         ffff80005bb36d88
 3577         1             make   netbsd   25 piperd       ffff80005ce963c0
 4072         1              ssh   netbsd   25              0               
 4670         1              cvs   netbsd   43 piperd       ffff80005ca7daa0
 4791         1              cvs   netbsd   29 biowait      ffff800009f32e68
 3077         1           vmstat   netbsd   42 nanoslp      ffff80007830d420
 29262        1               sh   netbsd   42 wait         ffff80005bb36ae0
 2658         1               sh   netbsd   25 wait         ffff80005b648ae0
 3492         1               sh   netbsd   43 wait         ffff80005ce9a2d8
 1752         1               sh   netbsd   43 wait         ffff80005be6d030
 2926         1               sh   netbsd   43 wait         ffff80005b6482e8
 3367         1               sh   netbsd   43 wait         ffff80005c0ee038
 3147         1             cron   netbsd   43 piperd       ffff80005ce70290
 2759         1             cron   netbsd   43 piperd       ffff80005ce96e28
 835          1         postgres   netbsd   43 select       ffff8000509c4940
 811          1         postgres   netbsd   43 select       ffff80004f8a0540
 810          1         postgres   netbsd   43 select       ffff8000509c4940
 820          1         postgres   netbsd   43 select       ffff8000509c4840
 847          1         postgres   netbsd   43 select       ffff8000509c4940
 425          1            getty   netbsd   43 ttyraw       ffff800050a6e928
 421          1             cron   netbsd   43 nanoslp      ffff80005bce5420
 402          1     hobbitlaunch   netbsd   43 nanoslp      ffff80005bce5be0
 358          1            inetd   netbsd   35 kqueue       ffff80005b32fd58
 345          1             qmgr   netbsd   43 kqueue       ffff80005b32fdd8
 347          1           pickup   netbsd   43 kqueue       ffff80005b32fcd8
 342          1           master   netbsd   43 kqueue       ffff80005b32fc58
 255          1             sshd   netbsd   43 select       ffff8000509c4840
 120          1          syslogd   netbsd   42              0               
 1            1             init   netbsd   43 wait         ffff80005390bd78
>0           58           system   netbsd  123 physiod      ffff800050abdbc8
>0           57           system   netbsd  125 vmem_rehash  ffff800050abdac8
>0           56           system   netbsd  125 aiodoned     ffff800050abda08
>0           55           system   netbsd  124 syncer       ffffffff80c83240
>0           54           system   netbsd  126 pgdaemon     ffffffff80caa754
>0           53           system   netbsd   96 pftm         ffffffff801c2220
>0           52           system   netbsd   96 crypto_wait  ffffffff80c89350
>0           51           system   netbsd   96 usbevt       ffff8000093b9428
>0           50           system   netbsd   96 usbevt       ffff8000093bf428
>0           49           system   netbsd   96 usbevt       ffff8000093b5428
>0           48           system   netbsd   96 usbtsk       ffffffff80c864c8
>0           47           system   netbsd   96 usbtsk       ffffffff80c864a0
>0           46           system   netbsd   96 usbevt       ffff8000093bb428
>0           45           system   netbsd   96 sccomp       ffff8000093cb358
>0           44           system   netbsd   96 mfi_mgmt     ffff8000086e63a0
>0           43           system   netbsd   96 coretemp3    ffff800050abd2c8
>0           42           system   netbsd   96 coretemp2    ffff800050abd148
>0           41           system   netbsd   96 coretemp1    ffff800050a46cc8
>0           40           system   netbsd   96 coretemp0    ffff800050a46b48
>0           31           system   netbsd   96 atath        ffff8000093cb3a0
>0           30           system   netbsd   96              0               
>0           29           system   netbsd  127 xcall        ffff800007e280f0
>0           28           system   netbsd  223              0               
>0           27           system   netbsd  220              0
>0           26           system   netbsd  221              0
>0           25           system   netbsd  222              0
>0           24           system   netbsd    0              0
>0           23           system   netbsd  127 xcall        ffff800007e270f0
>0           22           system   netbsd  223              0               
>0           21           system   netbsd  220              0
>0           20           system   netbsd  221              0
>0           19           system   netbsd  222              0
>0           18           system   netbsd    0              0
>0           17           system   netbsd  127 xcall        ffff800007e250f0
>0           16           system   netbsd  223              0               
>0           15           system   netbsd  220              0
>0           14           system   netbsd  221              0
>0           13           system   netbsd  222              0
>0           12           system   netbsd    0              0
>0           11           system   netbsd   96 smtaskq      ffffffff80c2e490
>0           10           system   netbsd   96 pmfevent     ffff80004f8a0a48
>0            9           system   netbsd  125 cachegc      ffff80004b26c020
>0            8           system   netbsd  125              0               
>0            7           system   netbsd  127 xcall        ffffffff80bdcd70
>0            6           system   netbsd  223              0               
>0            5           system   netbsd  220              0
>0            4           system   netbsd  221              0
>0            3           system   netbsd  222              0
>0            2           system   netbsd    0              0
>0            1           system   netbsd  125 schedule     ffffffff80caa780



db{0}> sync                                                      
syncing disks... 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 giving up
Printing vnodes for busy buffers                                                      
vnode @ 0xffff80005b3e8000, flags (30<MPSAFE,LOCKSWORK>)
        tag VT_UFS(1), type VBLK(3), usecount 78355, writecount 0, holdcount 3379
        freelisthd 0x0, mount 0xffff80005b2dc000, data 0xffff80005b3edec8 lock 0xffff80005b3e8108 recursecnt 0
        tag VT_UFS, ino 166336, on dev 4, 0 flags 0x0, effnlink 1, nlink 1                                    
        mode 060640, owner 0, group 5, size 0                             
vnode @ 0xffff80005b3e8000, flags (30<MPSAFE,LOCKSWORK>)
        tag VT_UFS(1), type VBLK(3), usecount 78355, writecount 0, holdcount 3379
        freelisthd 0x0, mount 0xffff80005b2dc000, data 0xffff80005b3edec8 lock 0xffff80005b3e8108 recursecnt 0
        tag VT_UFS, ino 166336, on dev 4, 0 flags 0x0, effnlink 1, nlink 1                                    
        mode 060640, owner 0, group 5, size 0                             
vnode @ 0xffff80005b3e8000, flags (30<MPSAFE,LOCKSWORK>)
        tag VT_UFS(1), type VBLK(3), usecount 78355, writecount 0, holdcount 3379
        freelisthd 0x0, mount 0xffff80005b2dc000, data 0xffff80005b3edec8 lock 0xffff80005b3e8108 recursecnt 0
        tag VT_UFS, ino 166336, on dev 4, 0 flags 0x0, effnlink 1, nlink 1                                    
        mode 060640, owner 0, group 5, size 0                             
vnode @ 0xffff80005b3e8000, flags (30<MPSAFE,LOCKSWORK>)
        tag VT_UFS(1), type VBLK(3), usecount 78355, writecount 0, holdcount 3379
        freelisthd 0x0, mount 0xffff80005b2dc000, data 0xffff80005b3edec8 lock 0xffff80005b3e8108 recursecnt 0
        tag VT_UFS, ino 166336, on dev 4, 0 flags 0x0, effnlink 1, nlink 1                                    
        mode 060640, owner 0, group 5, size 0                             
vnode @ 0xffff80005b3e8000, flags (30<MPSAFE,LOCKSWORK>)
        tag VT_UFS(1), type VBLK(3), usecount 78355, writecount 0, holdcount 3379
        freelisthd 0x0, mount 0xffff80005b2dc000, data 0xffff80005b3edec8 lock 0xffff80005b3e8108 recursecnt 0
        tag VT_UFS, ino 166336, on dev 4, 0 flags 0x0, effnlink 1, nlink 1                                    
        mode 060640, owner 0, group 5, size 0                             
vnode @ 0xffff80005b3e8000, flags (30<MPSAFE,LOCKSWORK>)
        tag VT_UFS(1), type VBLK(3), usecount 78355, writecount 0, holdcount 3379
        freelisthd 0x0, mount 0xffff80005b2dc000, data 0xffff80005b3edec8 lock 0xffff80005b3e8108 recursecnt 0
        tag VT_UFS, ino 166336, on dev 4, 0 flags 0x0, effnlink 1, nlink 1                                    
        mode 060640, owner 0, group 5, size 0                             

[the identical vnode record is printed 23 more times]
giving up                                    

dumping to dev 4,1 offset 8390079
dump 4095 4094 4093 4092 4091 4090 4089 4088 4087 4086 4085 4084 4083 4082 4081 4080
.
.
.
.
16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 succeeded


sd1(mfi0:0:1:0): should have flushed queue?
sd1: cache synchronization failed          
sd0: cache synchronization failed
                                 rebooting...



>How-To-Repeat:
	Run NetBSD > 4.0 on a Dell Poweredge 2950.

>Fix:


>Release-Note:

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: fredrik@netbsd.se, gnats-bugs@netbsd.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Sun, 24 Aug 2008 21:30:44 +0000

 On Mon, Aug 04, 2008 at 02:05:01PM +0000, fredrik@netbsd.se wrote:
  > Kernel crashes after 3-4 days of operation, no matter whether the
  > system is idle or not.

 Is it connected with some kind of network activity? That seems fairly
 likely if the machine's otherwise idle.

  > trap type 6 code 0 rip 0 cs 8 rflags 10202 cr2  0 cpl 6 rsp ffff80004f8a3ab0
                       ^^^^^
 It has jumped to NULL.

 Unfortunately,

  > db{0}> bt  
  > fatal page fault in supervisor mode

 ddb doesn't seem to be able to read the stack and tell us where it
 came from, either. So there's not much we can do but guess and try to
 collect more data...

 My guess is a bad callout, but that doesn't narrow it down very much.

 -- 
 David A. Holland
 dholland@netbsd.org
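
The reading above (rip 0 means the CPU jumped to address 0) can also be
checked mechanically. The C sketch below parses a trap line of the form
quoted in this PR. The struct and function names are invented for this
note, and the field layout is assumed from the one line shown here, not
taken from the kernel source.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of interest from the trap line (hypothetical helper type). */
struct trap_info {
	unsigned type;
	unsigned code;
	unsigned long long rip;
	unsigned long long cr2;
};

/* Returns 0 on success, -1 if the line does not match the assumed layout. */
static int
parse_trap_line(const char *line, struct trap_info *t)
{
	unsigned cs, cpl;
	unsigned long long rflags, rsp;
	int n;

	assert(line != NULL && t != NULL);
	n = sscanf(line,
	    "trap type %u code %x rip %llx cs %x rflags %llx cr2 %llx cpl %x rsp %llx",
	    &t->type, &t->code, &t->rip, &cs, &rflags, &t->cr2, &cpl, &rsp);
	return n == 8 ? 0 : -1;
}

/* An instruction fetch from address 0 leaves both rip and cr2 at zero. */
static int
jumped_to_null(const struct trap_info *t)
{
	return t->rip == 0 && t->cr2 == 0;
}
```

Given the trap line from this report, parse_trap_line() succeeds and
jumped_to_null() returns nonzero, which matches the "jumped to NULL"
diagnosis.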

From: fredrik@netbsd.se
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 fredrik@netbsd.se
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Mon, 25 Aug 2008 08:05:33 +0200 (CEST)

 >  Is it connected with some kind of network activity? That seems fairly
 >  likely if the machine's otherwise idle.
 >

 We have seen the machine crash both with and without network activity.


 >   > trap type 6 code 0 rip 0 cs 8 rflags 10202 cr2  0 cpl 6 rsp ffff80004f8a3ab0
 >                        ^^^^^
 >  It has jumped to NULL.
 >
 >  Unfortunately,
 >
 >   > db{0}> bt
 >   > fatal page fault in supervisor mode
 >
 >  ddb doesn't seem to be able to read the stack and tell us where it
 >  came from, either. So there's not much we can do but guess and try to
 >  collect more data...
 >
 >  My guess is a bad callout, but that doesn't narrow it down very much.
 >
 >  --
 >  David A. Holland
 >  dholland@netbsd.org
 >
 >


 The machine has been running stable (with -1 in boot.cfg) for the last 21
 days, with a lot of network, I/O and CPU load. What can we do to collect
 more data to help the troubleshooting?


 Regards
 Fredrik Carlsson




From: David Holland <dholland-bugs@netbsd.org>
To: fredrik@netbsd.se
Cc: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Sun, 31 Aug 2008 22:31:04 +0000

 On Mon, Aug 25, 2008 at 08:05:33AM +0200, fredrik@netbsd.se wrote:
  > >  My guess is a bad callout, but that doesn't narrow it down very much.
  > 
  > The machine has been running stable (with -1 in boot.cfg) for the last 21
  > days, with a lot of network, I/O and CPU load. What can we do to collect
  > more data to help the troubleshooting?

 Probably the best thing to do is build a kernel with DIAGNOSTIC; and
 on the conjecture that it may be a bad callout, add this patch, which
 should cause it to panic in a recognizable way instead of crashing if
 that's the problem.

 Or it may panic somewhere else, if you weren't previously running a
 DIAGNOSTIC kernel.

 Then, wait for it to crash. :-/

 Index: kern_timeout.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/kern_timeout.c,v
 retrieving revision 1.41
 diff -u -p -r1.41 kern_timeout.c
 --- kern_timeout.c	2 Jul 2008 14:47:34 -0000	1.41
 +++ kern_timeout.c	31 Aug 2008 22:26:51 -0000
 @@ -722,6 +722,7 @@ callout_softclock(void *v)
  		cc->cc_active = c;

  		mutex_spin_exit(&cc->cc_lock);
 +		KASSERT(func != NULL);
  		if (!mpsafe) {
  			KERNEL_LOCK(1, NULL);
  			(*func)(arg);


 -- 
 David A. Holland
 dholland@netbsd.org
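
The idea behind the one-line patch above can be shown in a small userland
model. This is only a sketch: struct callout_model and run_callout() are
hypothetical names, not the kernel's callout code, and where the kernel
patch uses KASSERT (which panics on a DIAGNOSTIC kernel) the model returns
an error so that the check is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a callout: a function and its argument. */
struct callout_model {
	void (*func)(void *);
	void *arg;
};

/*
 * Run one callout. Returns 0 if it ran, -1 if func was NULL.
 * Without the check, calling through a NULL func is a jump to address 0.
 */
static int
run_callout(const struct callout_model *c)
{
	assert(c != NULL);
	if (c->func == NULL)
		return -1;
	c->func(c->arg);
	return 0;
}

/* Example handler: increments the int that arg points to. */
static void
bump(void *arg)
{
	++*(int *)arg;
}
```

A callout with func set to bump runs and returns 0; one with func left
NULL is caught before the call instead of faulting at rip 0.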

From: fredrik@netbsd.se
To: "David Holland" <dholland-bugs@netbsd.org>
Cc: fredrik@netbsd.se,
 gnats-bugs@netbsd.org,
 port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Mon, 1 Sep 2008 08:01:01 +0200 (CEST)

 > On Mon, Aug 25, 2008 at 08:05:33AM +0200, fredrik@netbsd.se wrote:
 >  > >  My guess is a bad callout, but that doesn't narrow it down very much.
 >  >
 >  > The machine has been running stable (with -1 in boot.cfg) for the last 21
 >  > days, with a lot of network, I/O and CPU load. What can we do to collect
 >  > more data to help the troubleshooting?
 >
 > Probably the best thing to do is build a kernel with DIAGNOSTIC; and
 > on the conjecture that it may be a bad callout, add this patch, which
 > should cause it to panic in a recognizable way instead of crashing if
 > that's the problem.
 >
 > Or it may panic somewhere else, if you weren't previously running a
 > DIAGNOSTIC kernel.
 >
 > Then, wait for it to crash. :-/
 >
 > Index: kern_timeout.c
 > ===================================================================
 > RCS file: /cvsroot/src/sys/kern/kern_timeout.c,v
 > retrieving revision 1.41
 > diff -u -p -r1.41 kern_timeout.c
 > --- kern_timeout.c	2 Jul 2008 14:47:34 -0000	1.41
 > +++ kern_timeout.c	31 Aug 2008 22:26:51 -0000
 > @@ -722,6 +722,7 @@ callout_softclock(void *v)
 >  		cc->cc_active = c;
 >
 >  		mutex_spin_exit(&cc->cc_lock);
 > +		KASSERT(func != NULL);
 >  		if (!mpsafe) {
 >  			KERNEL_LOCK(1, NULL);
 >  			(*func)(arg);
 >
 >
 > --
 > David A. Holland
 > dholland@netbsd.org
 >

 We are already running with DEBUG and DIAGNOSTIC, hopefully it will be a
 little more verbose after your patch.

 The current options:
 options DEBUG
 options DIAGNOSTIC
 options LOCKDEBUG
 makeoptions DEBUG="-g"

 Regards
 Fredrik








From: David Holland <dholland-bugs@netbsd.org>
To: fredrik@netbsd.se
Cc: David Holland <dholland-bugs@netbsd.org>, gnats-bugs@netbsd.org,
	port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Wed, 19 Nov 2008 06:50:28 +0000

 On Mon, Sep 01, 2008 at 08:01:01AM +0200, fredrik@netbsd.se wrote:
  > > > > My guess is a bad callout, but that doesn't narrow it down very much.
  > > [...]
  > > @@ -722,6 +722,7 @@ callout_softclock(void *v)
  > > [...]
  > > +		KASSERT(func != NULL);
  > 
  > We are already running with DEBUG and DIAGNOSTIC, hopefully it will be a
  > little more verbose after your patch.

 Have you seen this again recently? PR 39655, which might be the same
 problem, was fixed in -current on October 10.

 -- 
 David A. Holland
 dholland@netbsd.org

From: fredrik@netbsd.se
To: "David Holland" <dholland-bugs@netbsd.org>
Cc: admin@netbsd.se,
 "David Holland" <dholland-bugs@netbsd.org>,
 gnats-bugs@netbsd.org,
 port-amd64-maintainer@netbsd.org
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Wed, 19 Nov 2008 07:57:03 +0100 (CET)

 > On Mon, Sep 01, 2008 at 08:01:01AM +0200, fredrik@netbsd.se wrote:
 >  > > > > My guess is a bad callout, but that doesn't narrow it down very much.
 >  > > [...]
 >  > > @@ -722,6 +722,7 @@ callout_softclock(void *v)
 >  > > [...]
 >  > > +		KASSERT(func != NULL);
 >  >
 >  > We are already running with DEBUG and DIAGNOSTIC, hopefully it will be a
 >  > little more verbose after your patch.
 >
 > Have you seen this again recently? PR 39655, which might be the same
 > problem, was fixed in -current on October 10.
 >
 > --
 > David A. Holland
 > dholland@netbsd.org
 >

 Hi!

 We plan to upgrade to the 5.0 branch this weekend and will report back
 on how it works.

 Regards
 Fredrik Carlsson




From: fredrik@netbsd.se
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 fredrik@netbsd.se
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Sun, 23 Nov 2008 15:52:10 +0100 (CET)

 > The following reply was made to PR port-amd64/39283; it has been noted by
 > GNATS.
 >
 > From: David Holland <dholland-bugs@netbsd.org>
 > To: fredrik@netbsd.se
 > Cc: David Holland <dholland-bugs@netbsd.org>, gnats-bugs@netbsd.org,
 > 	port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
 > 	netbsd-bugs@netbsd.org
 > Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
 > Date: Wed, 19 Nov 2008 06:50:28 +0000
 >
 >  On Mon, Sep 01, 2008 at 08:01:01AM +0200, fredrik@netbsd.se wrote:
 >   > > > > My guess is a bad callout, but that doesn't narrow it down very much.
 >   > > [...]
 >   > > @@ -722,6 +722,7 @@ callout_softclock(void *v)
 >   > > [...]
 >   > > +		KASSERT(func != NULL);
 >   >
 >   > We are already running with DEBUG and DIAGNOSTIC, hopefully it will be a
 >   > little more verbose after your patch.
 >
 >  Have you seen this again recently? PR 39655, which might be the same
 >  problem, was fixed in -current on October 10.
 >
 >  --
 >  David A. Holland
 >  dholland@netbsd.org
 >
 >

 We did a test with 5.0_BETA; the machine stayed up for about 23 hours
 and then it panicked.

 Regards
 Fredrik




From: David Holland <dholland-bugs@netbsd.org>
To: fredrik@netbsd.se, gnats-bugs@netbsd.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Thu, 27 Nov 2008 23:01:40 +0000

 On Sun, Nov 23, 2008 at 02:55:02PM +0000, fredrik@netbsd.se wrote:
  >>> We are already running with DEBUG and DIAGNOSTIC, hopefully it will be a
  >>> little more verbose after your patch.
  >>
  >>  Have you seen this again recently? PR 39655, which might be the same
  >>  problem, was fixed in -current on October 10.
  >  
  >  We did a test with 5.0_BETA; the machine stayed up for about 23 hours
  >  and then it panicked.

 Guess it's not the same problem. :(

 Any new info from the panic?

 -- 
 David A. Holland
 dholland@netbsd.org

State-Changed-From-To: open->feedback
State-Changed-By: dsl@NetBSD.org
State-Changed-When: Fri, 06 Nov 2009 22:49:43 +0000
State-Changed-Why:
Any further info ??


From: fredrik@netbsd.se
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org,
 netbsd-bugs@netbsd.org,
 gnats-admin@netbsd.org,
 dsl@netbsd.org,
 fredrik@netbsd.se
Subject: Re: port-amd64/39283 (4.99.71 crashed after about 3-4 days when 
 running MP-kernel)
Date: Sat, 7 Nov 2009 09:21:05 +0100

 > Synopsis: 4.99.71 crashed after about 3-4 days when running MP-kernel
 >
 > State-Changed-From-To: open->feedback
 > State-Changed-By: dsl@NetBSD.org
 > State-Changed-When: Fri, 06 Nov 2009 22:49:43 +0000
 > State-Changed-Why:
 > Any further info ??
 >
 >
 >
 >

 No more than the info that we put in the PR and on the mailing lists:

 http://archive.netbsd.se/?ml=netbsd-current-users&a=2008-07&t=7775691
 http://archive.netbsd.se/?ml=port-amd64&a=2009-02&t=9259482

 The server crashed every 8th day with an SP kernel and every 24 hours with
 an MP kernel. The problem has evolved with NetBSD releases: these days it is
 "only" the filesystem that seems to hang, and no I/O is possible without a
 reboot. If it runs too long without a reboot, it eventually crashes.

 Regards
 Fredrik


From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: fredrik@netbsd.se
Cc: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Tue, 24 Nov 2009 20:21:24 +0000

 Hello,

 >  > Synopsis: 4.99.71 crashed after about 3-4 days when running MP-kernel
 >  >
 >  > State-Changed-From-To: open->feedback
 >  > State-Changed-By: dsl@NetBSD.org
 >  > State-Changed-When: Fri, 06 Nov 2009 22:49:43 +0000
 >  > State-Changed-Why:
 >  > Any further info ??
 >  
 >  No more than the info that we put in the PR and on the mailing lists:
 >  
 >  http://archive.netbsd.se/?ml=netbsd-current-users&a=2008-07&t=7775691
 >  http://archive.netbsd.se/?ml=port-amd64&a=2009-02&t=9259482
 >  
 >  The server crashed every 8th day with an SP kernel and every 24 hours with
 >  an MP kernel. The problem has evolved with NetBSD releases: these days it is
 >  "only" the filesystem that seems to hang, and no I/O is possible without a
 >  reboot. If it runs too long without a reboot, it eventually crashes.

 It seems there is not yet enough information to figure out where the bug is
 hiding. Also, it is the only problem report with such symptoms, but if it is
 happening consistently and crashing the same way, it is unlikely to be a
 hardware problem.

 http://mail-index.netbsd.org/port-amd64/2008/12/19/msg000684.html

 Looking at the data from your email, there is callback_run_roundrobin() which
 calls a function pointer, but it is very unlikely to be NULL.  I have added an
 assert, "just in case":

 http://mail-index.netbsd.org/source-changes/2009/11/24/msg003498.html

 Would you be able to try -current kernel, with that change in the link, and
 with the following debug options:

 options DIAGNOSTIC
 options DEBUG
 makeoptions DEBUG="-g -fno-omit-frame-pointer"

 Then repeat the x/Lx and x/I dance again.  But leave out the LOCKDEBUG option,
 as that might avoid some overhead and hopefully make the stack dump a little
 more readable.  Additionally, 'show uvm' output from DDB might be useful.

 Thanks.

 -- 
 Mindaugas

From: fredrik@netbsd.se
To: "Mindaugas Rasiukevicius" <rmind@netbsd.org>
Cc: fredrik@netbsd.se,
 gnats-bugs@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Tue, 24 Nov 2009 22:02:08 +0100

 > Hello,
 >
 >>  > Synopsis: 4.99.71 crashed after about 3-4 days when running MP-kernel
 >>  >
 >>  > State-Changed-From-To: open->feedback
 >>  > State-Changed-By: dsl@NetBSD.org
 >>  > State-Changed-When: Fri, 06 Nov 2009 22:49:43 +0000
 >>  > State-Changed-Why:
 >>  > Any further info ??
 >>
 >>  No more than the info that we put in the PR and on the mailing lists:
 >>
 >>  http://archive.netbsd.se/?ml=netbsd-current-users&a=2008-07&t=7775691
 >>  http://archive.netbsd.se/?ml=port-amd64&a=2009-02&t=9259482
 >>
 >>  The server crashed every 8th day with an SP kernel and every 24 hours with
 >>  an MP kernel. The problem has evolved with NetBSD releases: these days it is
 >>  "only" the filesystem that seems to hang, and no I/O is possible without a
 >>  reboot. If it runs too long without a reboot, it eventually crashes.
 >
 > It seems there is not yet enough information to figure out where the bug is
 > hiding. Also, it is the only problem report with such symptoms, but if it is
 > happening consistently and crashing the same way, it is unlikely to be a
 > hardware problem.
 >
 > http://mail-index.netbsd.org/port-amd64/2008/12/19/msg000684.html
 >
 > Looking at the data from your email, there is callback_run_roundrobin() which
 > calls a function pointer, but it is very unlikely to be NULL.  I have added an
 > assert, "just in case":
 >
 > http://mail-index.netbsd.org/source-changes/2009/11/24/msg003498.html
 >
 > Would you be able to try -current kernel, with that change in the link, and
 > with the following debug options:
 >
 > options DIAGNOSTIC
 > options DEBUG
 > makeoptions DEBUG="-g -fno-omit-frame-pointer"
 >
 > Then repeat the x/Lx and x/I dance again.  But leave out the LOCKDEBUG
 > option, as that might avoid some overhead and hopefully make the stack dump
 > a little more readable.
 > Additionally, 'show uvm' output from DDB might be useful.
 >
 > Thanks.
 >
 > --
 > Mindaugas
 >

 Hi,

 Thanks for the response.

 We will try a current kernel tonight and report back with the result.

 Regards
 Fredrik


From: Tobias Nygren <tnn@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Mon, 30 Nov 2009 20:04:03 +0100

 The machine crashed again. This time it hit an assertion but not the
 one recently added. Here is the backtrace. It looks like it ran out
 of kva and tried to sleep from interrupt context.

 panic: kernel diagnostic assertion "(l->l_pflag & LP_INTR) == 0 || panicstr != NULL" failed: file "/usr/src/sys/kern/kern_condvar.c", line 145
 fatal breakpoint trap in supervisor mode
 tWrAaRpN INtGy:pe S 1PL c oNOdeT  L0O rWEiRpE fD ffONff fSYffSC80AL22L
 6f0 650  c EXs IT8  0rf l7
 ags 246 cr2  7f7ffd3a1870 cpl 4 rsp ffff800050803240
 db{1}> mach cpu 1
 using CPU 1
 db{1}> bt
 breakpoint() at netbsd:breakpoint+0x5
 panic() at netbsd:panic+0x2a0
 __kernassert() at netbsd:__kernassert+0x2d
 cv_wait() at netbsd:cv_wait+0x144
 xc_wait() at netbsd:xc_wait+0x44
 pool_cache_invalidate() at netbsd:pool_cache_invalidate+0xea
 pool_reclaim() at netbsd:pool_reclaim+0x65
 pool_reclaim_callback() at netbsd:pool_reclaim_callback+0x22
 callback_run_roundrobin() at netbsd:callback_run_roundrobin+0x57
 uvm_map_prepare() at netbsd:uvm_map_prepare+0x190
 uvm_map() at netbsd:uvm_map+0xbb
 km_vacache_alloc() at netbsd:km_vacache_alloc+0x4e
 pool_grow() at netbsd:pool_grow+0x38
 pool_get() at netbsd:pool_get+0x66
 uvm_km_alloc_poolpage_cache() at netbsd:uvm_km_alloc_poolpage_cache+0x40
 pool_grow() at netbsd:pool_grow+0x38
 pool_get() at netbsd:pool_get+0x66
 tcp_newtcpcb() at netbsd:tcp_newtcpcb+0x29
 tcp_attach() at netbsd:tcp_attach+0xce
 tcp_usrreq() at netbsd:tcp_usrreq+0x5d4
 tcp_usrreq_wrapper() at netbsd:tcp_usrreq_wrapper+-0x365f
 sonewconn() at netbsd:sonewconn+0x1dc
 syn_cache_get() at netbsd:syn_cache_get+0x13b
 tcp_input() at netbsd:tcp_input+0x1134
 tcp6_input() at netbsd:tcp6_input+0x6b
 ip6_input() at netbsd:ip6_input+0x6f4
 ip6intr() at netbsd:ip6intr+0x71
 softint_dispatch() at netbsd:softint_dispatch+0xf5
 DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xffff800050803d70
 Xsoftintr() at netbsd:Xsoftintr+0x4f
 --- interrupt ---
 db{1}> mach cpu 0
 using CPU 0
 db{1}> bt
 x86_pause() at netbsd:x86_pause+0x2
 callout_softclock() at netbsd:callout_softclock+0x397
 softint_dispatch() at netbsd:softint_dispatch+0xf5
 DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xffff80004f89cd70
 Xsoftintr() at netbsd:Xsoftintr+0x4f
 --- interrupt ---

State-Changed-From-To: feedback->open
State-Changed-By: tnn@NetBSD.org
State-Changed-When: Mon, 30 Nov 2009 19:24:07 +0000
State-Changed-Why:
feedback provided. Building new kernel with COPTS=-O1 -fno-omit-frame-pointer -g


From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: Tobias Nygren <tnn@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
 fredrik@netbsd.se
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Wed, 2 Dec 2009 04:21:42 +0000

 This is a multi-part message in MIME format.

 --Multipart=_Wed__2_Dec_2009_04_21_42_+0000_3i5cvLjM=p/z6/nU
 Content-Type: text/plain; charset=US-ASCII
 Content-Transfer-Encoding: 7bit

 Hello,

 Tobias Nygren <tnn@NetBSD.org> wrote:
 > ...
 >  cv_wait() at netbsd:cv_wait+0x144
 >  xc_wait() at netbsd:xc_wait+0x44
 >  pool_cache_invalidate() at netbsd:pool_cache_invalidate+0xea
 >  pool_reclaim() at netbsd:pool_reclaim+0x65
 >  pool_reclaim_callback() at netbsd:pool_reclaim_callback+0x22
 >  callback_run_roundrobin() at netbsd:callback_run_roundrobin+0x57
 >  uvm_map_prepare() at netbsd:uvm_map_prepare+0x190
 >  uvm_map() at netbsd:uvm_map+0xbb
 >  km_vacache_alloc() at netbsd:km_vacache_alloc+0x4e
 >  pool_grow() at netbsd:pool_grow+0x38
 >  pool_get() at netbsd:pool_get+0x66
 >  uvm_km_alloc_poolpage_cache() at netbsd:uvm_km_alloc_poolpage_cache+0x40
 >  pool_grow() at netbsd:pool_grow+0x38
 >  pool_get() at netbsd:pool_get+0x66
 >  tcp_newtcpcb() at netbsd:tcp_newtcpcb+0x29
 > ...

 This is a recent regression in -current.  Please try the attached workaround,
 which disables draining of per-CPU caches (this is safe, since nothing depends
 on that behaviour yet).  I will look for a proper fix.

 -- 
 Mindaugas

 --Multipart=_Wed__2_Dec_2009_04_21_42_+0000_3i5cvLjM=p/z6/nU
 Content-Type: text/plain;
  name="pool_inv_workaround.diff"
 Content-Disposition: attachment;
  filename="pool_inv_workaround.diff"
 Content-Transfer-Encoding: 7bit

 Index: subr_pool.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/subr_pool.c,v
 retrieving revision 1.177
 diff -u -p -r1.177 subr_pool.c
 --- subr_pool.c	20 Oct 2009 17:24:22 -0000	1.177
 +++ subr_pool.c	2 Dec 2009 04:22:01 -0000
 @@ -2298,6 +2298,7 @@ void
  pool_cache_invalidate(pool_cache_t pc)
  {
  	pcg_t *full, *empty, *part;
 +#if 0
  	uint64_t where;

  	if (ncpu < 2 || !mp_online) {
 @@ -2316,6 +2317,7 @@ pool_cache_invalidate(pool_cache_t pc)
  		where = xc_broadcast(0, (xcfunc_t)pool_cache_xcall, pc, NULL);
  		xc_wait(where);
  	}
 +#endif

  	mutex_enter(&pc->pc_lock);
  	full = pc->pc_fullgroups;


 --Multipart=_Wed__2_Dec_2009_04_21_42_+0000_3i5cvLjM=p/z6/nU--

State-Changed-From-To: open->feedback
State-Changed-By: tnn@NetBSD.org
State-Changed-When: Wed, 02 Dec 2009 11:13:00 +0000
State-Changed-Why:
workaround for new bug added, waiting for machine to fall over.


From: Tobias Nygren <tnn@NetBSD.org>
To: Mindaugas Rasiukevicius <rmind@netbsd.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Wed, 9 Dec 2009 16:18:02 +0100

 It tripped over again. Backtrace is similar to before but not identical.
 Looks like lock recursion now (notice the bnx interrupt).
 Would it be possible (and safe?) to return immediately without doing any
 work if mutex_owned()?

 panic: lock error
 cpu_Debugger() at netbsd:cpu_Debugger+0x9
 panic() at netbsd:panic+0x1f6
 lockdebug_abort() at netbsd:lockdebug_abort+0x8f
 mutex_abort() at netbsd:mutex_abort+0x29
 mutex_vector_enter() at netbsd:mutex_vector_enter+0x1c4
 pool_cache_invalidate() at netbsd:pool_cache_invalidate+0x23
 pool_reclaim() at netbsd:pool_reclaim+0x69
 pool_reclaim_callback() at netbsd:pool_reclaim_callback+0x41
 callback_run_roundrobin() at netbsd:callback_run_roundrobin+0x100
 uvm_km_va_drain() at netbsd:uvm_km_va_drain+0x1a
 uvm_map_prepare() at netbsd:uvm_map_prepare+0x1ed
 uvm_map() at netbsd:uvm_map+0x127
 km_vacache_alloc() at netbsd:km_vacache_alloc+0x53
 pool_grow() at netbsd:pool_grow+0x36
 pool_get() at netbsd:pool_get+0x1ca
 uvm_km_alloc_poolpage_cache() at netbsd:uvm_km_alloc_poolpage_cache+0x4a
 pool_page_alloc() at netbsd:pool_page_alloc+0x13
 pool_grow() at netbsd:pool_grow+0x36
 pool_get() at netbsd:pool_get+0x1ca
 pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x1b6
 pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x14d
 m_get() at netbsd:m_get+0x26
 m_gethdr() at netbsd:m_gethdr+0x9
 bnx_get_buf() at netbsd:bnx_get_buf+0x75
 bnx_rx_intr() at netbsd:bnx_rx_intr+0x2ac
 bnx_intr() at netbsd:bnx_intr+0xf1
 intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
 Xintr_ioapic_level1() at netbsd:Xintr_ioapic_level1+0xf4
 --- interrupt ---
 mutex_enter() at netbsd:mutex_enter+0x11
 pool_reclaim() at netbsd:pool_reclaim+0x69
 pool_reclaim_callback() at netbsd:pool_reclaim_callback+0x41
 callback_run_roundrobin() at netbsd:callback_run_roundrobin+0x100
 uvm_km_va_drain() at netbsd:uvm_km_va_drain+0x1a
 uvm_map_prepare() at netbsd:uvm_map_prepare+0x1ed
 uvm_map() at netbsd:uvm_map+0x127
 km_vacache_alloc() at netbsd:km_vacache_alloc+0x53
 pool_grow() at netbsd:pool_grow+0x36
 pool_get() at netbsd:pool_get+0x1ca
 uvm_km_alloc_poolpage_cache() at netbsd:uvm_km_alloc_poolpage_cache+0x4a
 pool_page_alloc() at netbsd:pool_page_alloc+0x13
 pool_grow() at netbsd:pool_grow+0x36
 pool_get() at netbsd:pool_get+0x1ca
 pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x1b6
 pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x14d
 bt_alloc() at netbsd:bt_alloc+0x1d
 vmem_add1() at netbsd:vmem_add1+0xa3
 vmem_xalloc() at netbsd:vmem_xalloc+0x58b
 vmem_alloc() at netbsd:vmem_alloc+0x14d
 qc_poolpage_alloc() at netbsd:qc_poolpage_alloc+0x61
 pool_grow() at netbsd:pool_grow+0x36
 pool_get() at netbsd:pool_get+0x1ca
 pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x1b6
 pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x14d
 vmem_alloc() at netbsd:vmem_alloc+0x11e
 kmem_alloc() at netbsd:kmem_alloc+0x198
 amap_copy() at netbsd:amap_copy+0x214
 uvm_fault_internal() at netbsd:uvm_fault_internal+0x30c
 trap() at netbsd:trap+0x7da
 --- trap (number 4233368) ---
 0x152de:
 db{0}> mach cpu 1
 using CPU 1
 db{0}> bt
 x86_pause() at netbsd:x86_pause
 mutex_vector_enter() at netbsd:mutex_vector_enter+0x207
 pool_cache_invalidate() at netbsd:pool_cache_invalidate+0x23
 pool_reclaim() at netbsd:pool_reclaim+0x69
 pool_reclaim_callback() at netbsd:pool_reclaim_callback+0x41
 callback_run_roundrobin() at netbsd:callback_run_roundrobin+0x100
 uvm_km_va_drain() at netbsd:uvm_km_va_drain+0x1a
 uvm_map_prepare() at netbsd:uvm_map_prepare+0x1ed
 uvm_map() at netbsd:uvm_map+0x127
 km_vacache_alloc() at netbsd:km_vacache_alloc+0x53
 pool_grow() at netbsd:pool_grow+0x36
 pool_get() at netbsd:pool_get+0x1ca
 uvm_km_alloc_poolpage_cache() at netbsd:uvm_km_alloc_poolpage_cache+0x4a
 pool_page_alloc() at netbsd:pool_page_alloc+0x13
 pool_grow() at netbsd:pool_grow+0x36
 pool_get() at netbsd:pool_get+0x1ca
 pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x1b6
 pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x14d
 bt_alloc() at netbsd:bt_alloc+0x1d
 vmem_xalloc() at netbsd:vmem_xalloc+0x2f5
 vmem_alloc() at netbsd:vmem_alloc+0x14d
 qc_poolpage_alloc() at netbsd:qc_poolpage_alloc+0x61
 pool_grow() at netbsd:pool_grow+0x36
 pool_get() at netbsd:pool_get+0x1ca
 pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x1b6
 pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x14d
 vmem_alloc() at netbsd:vmem_alloc+0x11e
 kmem_alloc() at netbsd:kmem_alloc+0x198
 amap_copy() at netbsd:amap_copy+0x214
 uvm_fault_internal() at netbsd:uvm_fault_internal+0x30c
 trap() at netbsd:trap+0x7da
 --- trap (number 4233368) ---
 0x152de:
 db{0}>

State-Changed-From-To: feedback->open
State-Changed-By: tnn@NetBSD.org
State-Changed-When: Wed, 09 Dec 2009 16:31:50 +0000
State-Changed-Why:
need help deciphering recent stack trace


From: Tobias Nygren <tnn@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Mon, 14 Dec 2009 15:50:54 +0100

 With some local patches I've been able to reduce the panics to a
 deadlock waiting for kva that never becomes available. It looks like
 the underlying problem is in fact a resource leak.

 From ddb I've identified the following pool cache whose resource usage
 looks highly suspicious to me.

 POOL CACHE ksiginfo: size 72, align 8, ioff 0, roflags 0x00000040
         alloc 0xffffffff80d42f20
         minitems 0, minpages 0, maxpages 4294967295, npages 264334
         itemsperpage 56, nitems 21, nout 14802683, hardlimit 4294967295
         nget 14809537, nfail 13, nput 6854
         npagealloc 264337, npagefree 3, hiwat 264334, nidle 0
         cpu layer hits 5450561 misses 14853292
         cache layer hits 43741 misses 14809551
         cache layer entry uncontended 14853283 contended 9
         cache layer empty groups 0 full groups 0

 Are we aware of any issues related to ksiginfo leakage in 5.0/current?

From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: Tobias Nygren <tnn@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
 fredrik@netbsd.se
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Mon, 14 Dec 2009 20:54:02 +0000

 Hello,

 Tobias Nygren <tnn@NetBSD.org> wrote:
 >  It tripped over again. Backtrace is similar to before but not identical.
 >  Looks like lock recursion now (notice the bnx interrupt).
 >  Would it be possible (and safe?) to return immediately without doing any
 >  work if mutex_owned()?

 Now this is a locking bug.  Do you mean using mutex_owned() to make locking
 decisions?  In that case, no: it would be very wrong, and it would also not
 work on a spin mutex.

 >  panic: lock error
 >  cpu_Debugger() at netbsd:cpu_Debugger+0x9
 >  panic() at netbsd:panic+0x1f6
 >  lockdebug_abort() at netbsd:lockdebug_abort+0x8f
 >  mutex_abort() at netbsd:mutex_abort+0x29
 >  mutex_vector_enter() at netbsd:mutex_vector_enter+0x1c4
 >  pool_cache_invalidate() at netbsd:pool_cache_invalidate+0x23
 >  pool_reclaim() at netbsd:pool_reclaim+0x69
 >  pool_reclaim_callback() at netbsd:pool_reclaim_callback+0x41
 >  callback_run_roundrobin() at netbsd:callback_run_roundrobin+0x100
 >  ...

 From the backtrace, it seems there are three paths competing for the same
 thing: reclaim on the VA cache of kmem_map (since more layers are involved,
 such as the vmem quantum cache, it goes through the pool subsystem a couple
 of times).  The following interrupt (the 3rd path) arrives while reclaiming;
 it tries to reclaim again from interrupt context and probably locks against
 itself ("lock error" would be meaningful with LOCKDEBUG, in this case):

 > bnx_intr() at netbsd:bnx_intr+0xf1
 > intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x1d
 > Xintr_ioapic_level1() at netbsd:Xintr_ioapic_level1+0xf4
 > --- interrupt ---
 > mutex_enter() at netbsd:mutex_enter+0x11
 > pool_reclaim() at netbsd:pool_reclaim+0x69
 > pool_reclaim_callback() at netbsd:pool_reclaim_callback+0x41

 This is a bit confusing.  Since kmem_map is VM_MAP_INTRSAFE, the pool should
 be interrupt-safe too, i.e. run at IPL_VM, and that mutex should be a spin
 lock, blocking bnx_intr() since it runs at IPL_NET (== IPL_VM).

 Unfortunately, I have not had time to figure out more yet, but I can add some
 KASSERT()s if you are OK with crashing the machine a little bit more. :)

 -- 
 Mindaugas

From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, fredrik@netbsd.se
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Mon, 14 Dec 2009 21:00:07 +0000

 On Mon, Dec 14, 2009 at 02:55:01PM +0000, Tobias Nygren wrote:
 > The following reply was made to PR port-amd64/39283; it has been noted by GNATS.
 > 
 > From: Tobias Nygren <tnn@NetBSD.org>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
 > Date: Mon, 14 Dec 2009 15:50:54 +0100
 > 
 >  With some local patches I've been able to reduce the panics to a
 >  deadlock waiting for kva that never becomes available. It looks like
 >  the underlying problem is in fact a resource leak.
 >  
 >  From ddb I've identified the following pool cache whose resource usage
 >  looks highly suspicious to me.
 >  
 >  POOL CACHE ksiginfo: size 72, align 8, ioff 0, roflags 0x00000040
 >          alloc 0xffffffff80d42f20
 >          minitems 0, minpages 0, maxpages 4294967295, npages 264334
 >          itemsperpage 56, nitems 21, nout 14802683, hardlimit 4294967295
 >          nget 14809537, nfail 13, nput 6854
 >          npagealloc 264337, npagefree 3, hiwat 264334, nidle 0
 >          cpu layer hits 5450561 misses 14853292
 >          cache layer hits 43741 misses 14809551
 >          cache layer entry uncontended 14853283 contended 9
 >          cache layer empty groups 0 full groups 0
 >  
 >  Are we aware of any issues related to ksiginfo leakage in 5.0/current?

 No one has mentioned any, but I do recall something about signals being
 queued to apps (rather than just being a bitmask). If something is
 looping and generating signals, maybe there is a path which causes an
 indefinite number to be queued (ksiginfo sounds like the item being queued!)

 Perhaps something has ignored SIGSEGV!

 	David

 -- 
 David Laight: david@l8s.co.uk

From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: Tobias Nygren <tnn@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
 fredrik@netbsd.se
Subject: Re: port-amd64/39283: Kernel crash on Dell Poweredge 2950
Date: Mon, 14 Dec 2009 21:02:46 +0000

 Tobias Nygren <tnn@NetBSD.org> wrote:
 >  With some local patches I've been able to reduce the panics to a
 >  deadlock waiting for kva that never becomes available. It looks like
 >  the underlying problem is in fact a resource leak.

 Cool.  Before fixing this, though, it would be good to fix the locking issues
 in KVA reclamation.

 >  POOL CACHE ksiginfo: size 72, align 8, ioff 0, roflags 0x00000040
 >          alloc 0xffffffff80d42f20
 >          minitems 0, minpages 0, maxpages 4294967295, npages 264334
 >          itemsperpage 56, nitems 21, nout 14802683, hardlimit 4294967295
 >          nget 14809537, nfail 13, nput 6854
 >          npagealloc 264337, npagefree 3, hiwat 264334, nidle 0
 >          cpu layer hits 5450561 misses 14853292
 >          cache layer hits 43741 misses 14809551
 >          cache layer entry uncontended 14853283 contended 9
 >          cache layer empty groups 0 full groups 0
 >  
 >  Are we aware of any issues related to ksiginfo leakage in 5.0/current?

 In a conversation some time ago, ad@ mentioned a possible ksiginfo leak.
 I will check a few code paths involving ksiginfo_cache later.

 -- 
 Mindaugas

From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/39283 CVS commit: src/sys/kern
Date: Sat, 19 Dec 2009 18:25:55 +0000

 Module Name:	src
 Committed By:	rmind
 Date:		Sat Dec 19 18:25:55 UTC 2009

 Modified Files:
 	src/sys/kern: sys_sig.c

 Log Message:
 sigtimedwait: fix a memory leak (which happens since newlock2 times).
 Allocate ksiginfo on stack since it is safe and sigget() assumes that it is
 not allocated from pool (pending signals via sigput()/sigget() "mill" should
 be dynamically allocated, however).  Might be useful to revisit later.

 Likely the cause of PR/40750 and indirect cause of PR/39283.


 To generate a diff of this commit:
 cvs rdiff -u -r1.23 -r1.24 src/sys/kern/sys_sig.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/39283 CVS commit: [netbsd-5] src/sys/kern
Date: Thu, 7 Jan 2010 07:04:51 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Thu Jan  7 07:04:51 UTC 2010

 Modified Files:
 	src/sys/kern [netbsd-5]: sys_sig.c

 Log Message:
 Pull up following revision(s) (requested by rmind in ticket #1199):
 	sys/kern/sys_sig.c: revision 1.24
 sigtimedwait: fix a memory leak (which happens since newlock2 times).
 Allocate ksiginfo on stack since it is safe and sigget() assumes that it is
 not allocated from pool (pending signals via sigput()/sigget() "mill" should
 be dynamically allocated, however).  Might be useful to revisit later.
 Likely the cause of PR/40750 and indirect cause of PR/39283.


 To generate a diff of this commit:
 cvs rdiff -u -r1.17.4.2 -r1.17.4.3 src/sys/kern/sys_sig.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/39283 CVS commit: [netbsd-5-0] src/sys/kern
Date: Thu, 7 Jan 2010 07:08:34 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Thu Jan  7 07:08:34 UTC 2010

 Modified Files:
 	src/sys/kern [netbsd-5-0]: sys_sig.c

 Log Message:
 Pull up following revision(s) (requested by rmind in ticket #1199):
 	sys/kern/sys_sig.c: revision 1.24
 sigtimedwait: fix a memory leak (which happens since newlock2 times).
 Allocate ksiginfo on stack since it is safe and sigget() assumes that it is
 not allocated from pool (pending signals via sigput()/sigget() "mill" should
 be dynamically allocated, however).  Might be useful to revisit later.
 Likely the cause of PR/40750 and indirect cause of PR/39283.


 To generate a diff of this commit:
 cvs rdiff -u -r1.17.4.2 -r1.17.4.2.2.1 src/sys/kern/sys_sig.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: tnn@NetBSD.org
State-Changed-When: Mon, 11 Jan 2010 11:27:50 +0000
State-Changed-Why:
22 days uptime and counting, I think this one is nailed.


>Unformatted:
