NetBSD Problem Report #51522

From christos@astron.com  Sun Oct  2 00:51:21 2016
Return-Path: <christos@astron.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C47EF7A214
	for <gnats-bugs@gnats.NetBSD.org>; Sun,  2 Oct 2016 00:51:21 +0000 (UTC)
Message-Id: <20161001234720.D5FD21556E@quasar.astron.com>
Date: Sat,  1 Oct 2016 19:43:12 -0400 (EDT)
From: christos@netbsd.org
Reply-To: christos@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: pinging while booting makes the machine crash
X-Send-Pr-Version: 3.95

>Number:         51522
>Category:       kern
>Synopsis:       pinging a machine while it boots makes it crash
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Oct 02 00:55:00 +0000 2016
>Last-Modified:  Sat Oct 29 07:25:01 +0000 2016
>Originator:     Christos Zoulas
>Release:        NetBSD 7.99.39
>Organization:
	Ping, Pong, Inc.
>Environment:
System: NetBSD quasar.astron.com 7.99.39 NetBSD 7.99.39 (QUASAR) #52: Sat Oct 1 19:03:35 EDT 2016 christos@quasar.astron.com:/usr/src/sys/arch/amd64/compile/QUASAR amd64
Architecture: x86_64
Machine: amd64
>Description:
Reboot a machine while pinging it from another host. It will crash
in m_freem(), called from udp_input(). Apparently there is some race in icmp_reflect()?
>How-To-Repeat:
	Ping the machine from another host while it reboots.
>Fix:
	?

>Release-Note:

>Audit-Trail:

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Sun, 09 Oct 2016 10:51:45 +0200

 I've been seeing these for a while.  Here's a recent one:

 barsoom# dmesg -N netbsd.34 -M netbsd.34.core
 [...]
 fatal protection fault in supervisor mode
 trap type 4 code 0 rip ffffffff808ae10d cs 8 rflags 10282 cr2
 7a14f761e102 ilevel 4 rsp fffffe8100007a10
 curlwp 0xfffffe823f72b420 pid 0.3 lowest kstack 0xfffffe81000042c0
 panic: trap
 cpu0: Begin traceback...
 vpanic() at netbsd:vpanic+0x140
 snprintf() at netbsd:snprintf
 trap() at netbsd:trap+0xc4b
 --- trap (number 4) ---
 psref_release() at netbsd:psref_release+0x7f
 ip_output() at netbsd:ip_output+0x3c3
 icmp_reflect() at netbsd:icmp_reflect+0x568
 icmp_error() at netbsd:icmp_error+0x306
 udp_input() at netbsd:udp_input+0x4d2
 ipintr() at netbsd:ipintr+0xa4e
 softint_dispatch() at netbsd:softint_dispatch+0xd3
 DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe8100007ff0
 Xsoftintr() at netbsd:Xsoftintr+0x4f
 --- interrupt ---

 I suspect the problem may have been there for a long time.  This box is
 my main VLAN router, and general service host, and runs lots of network
 services.  The console is a never-ending stream of complaints about
 various bits of software (PowerDNS, dhcpd, OpenVPN, etc) not being able
 to allocate mbufs when needed, which is somewhat weird, considering that
 most of the traffic is limited by my 2Mbps Internet connection, anyway.

 Back when it had slower hardware, and used ipfilter, I had to forgo
 having ipfilter return ICMP messages for dropped packets, because it
 made the machine really unstable.  Now, with much faster hardware, I'm
 letting pf do so ("set block-policy return"), and the crashes look as if
 that's when it happens.  Might be a coincidence, of course.

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Tue, 11 Oct 2016 06:43:15 +0200

 Incidentally, the crashes looked different under 7.99.36:

 crash> crash -M netbsd.33.core -N netbsd.33
 Crash version 7.99.39, image version 7.99.36.
 WARNING: versions differ, you may not be able to examine this image.
 System panicked: kernel diagnostic assertion "(m)->m_type != MT_FREE"
 failed: file "/usr/src/sys/kern/uipc_mbuf.c", line 655
 Backtrace from time of crash is available.
 crash> bt
 _KERNEL_OPT_NARCNET() at 0
 _KERNEL_OPT_RASOPS_DEFAULT_HEIGHT() at
 _KERNEL_OPT_RASOPS_DEFAULT_HEIGHT+0x3
 vpanic() at vpanic+0x149
 cd_play_msf() at cd_play_msf
 m_freem() at m_freem+0xab
 udp_input() at udp_input+0x4d2
 ipintr() at ipintr+0xa47
 softint_dispatch() at softint_dispatch+0xd3
 DDB lost frame for Xsoftintr+0x4f, trying 0xfffffe8100007ff0
 Xsoftintr() at Xsoftintr+0x4f
 --- interrupt ---

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Tue, 11 Oct 2016 17:21:16 +0200

 > Incidentally, the crashes looked different under 7.99.36:

 Ah, no.  Seems I'm just seeing two different patterns.  Just now, with a
 7.99.39 from September 28th:

 barsoom# dmesg -M netbsd.36.core -N netbsd.36
 [...]
 panic: kernel diagnostic assertion "(m)->m_type != MT_FREE" failed: file "/usr/src/sys/kern/uipc_mbuf.c", line 655
 cpu0: Begin traceback...
 vpanic() at netbsd:vpanic+0x140
 cd_play_msf() at netbsd:cd_play_msf
 m_freem() at netbsd:m_freem+0xab
 udp_input() at netbsd:udp_input+0x4d2
 ipintr() at netbsd:ipintr+0xa4e
 softint_dispatch() at netbsd:softint_dispatch+0xd3
 DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xfffffe8100007ff0
 Xsoftintr() at netbsd:Xsoftintr+0x4f
 --- interrupt ---

Responsible-Changed-From-To: gnats-admin->kern-bug-people
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Fri, 14 Oct 2016 16:08:49 +0000
Responsible-Changed-Why:
came in wrong


From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Sun, 16 Oct 2016 10:09:14 +0200

 Tiring of frequent crashes, I changed "set block-policy return" to "set
 block-policy drop" in pf.conf, hoping that the reduced ICMP load would
 make things better.  After a few hours, I get the crash shown below.  As
 you can see, I have (at least) two CPUs complaining at me, and after
 deciphering the text "WARNING: SPL NOT LOWERED", I find:

 /sys/arch/amd64/amd64/locore.S:4:       .asciz  "WARNING: SPL NOT LOWERED ON SYSCALL %d %d EXIT %x %x\n"
 /sys/arch/amd64/amd64/amd64_trap.S:4:   .asciz  "WARNING: SPL NOT LOWERED ON TRAP EXIT %x %x\n"

 Anyway, the panic:

 panic: pool_get(rndsample): free list modified: magic=f01735f; page 0xfffffe8105746000; item addr 0xfffffe8105746480

 cpu2: Begin traceback...
 vpanic() at netbsd:vpanic+0x140
 snprintf() at netbsd:snprintf
 pool_get() at netbsd:pool_get+0x41b
 pool_cache_get_slow() aWtA RNneINtbG:sd S:PpoL olN_OTca cLOheWWE_RAEgDe RtO_NNIs NSGlY:oS wCS+AP0LxLL1  b40N O
 -T4 6Lp9O7o8o5Wl6_E cERaXcEIDhT e O_3g1ebNt9 5_S6Yp1S2a dC7Ad
 rL()WALR N1I  N5G :a tS EPXLIn TN O7T8 6eL1tO0bW6sEd9R:0E D7 
 pONoW AoSRlYN_SICcAaNLcGLh e0:_ g -Se2Pt0_L7p4 1a2N1dO7Td4 4r LE+X0IxOT2 W9E21R08
 E106D 9r0n d7O
 _aN WASdRdYN_SICNAdGaL:tL a S_Pt4L  s1N4(O TE )XL OaIWtE RnETeDt  bOs8Nf d4S:1Yr0SnC6dA_L9aL0d d1_ d9 a9t773
 a3_5t04sW0+A 0ERxXN1II1TN7 G
 3:d 2S_1rP0Ln6d9_ 0N O7T
 a LOdWAdR_WNuIiNnGt:E RS3EP2DL ( ONN)O  Ta tSLY OSWCnEeRAELDtL b O0sNd : S_0Yr SECXnAIdL_TLa d 1d 81f_7u diE0nX3tI0T3 0320d+2 1070x6
 9103W A37R

 NINWAGsRy:N sISmNoPGL:  nNSOP_TL e nLNOvOsWTE yRLsEO_WrEDR EOeDN fOr NS eSYsYhSSCC_AALsLLe Ln0  04  s0E oXrEI(XTI )8T 2adt9 b 0n 87fe
 tdb0s3WA0dR0N:0I sNyG7s:
  mSoWPnALR N_NIOeNTnG :L vOsWSEPyRsLE_Dr eOf Nr NeSOsYTSh CL_AsLOeLWn s1Eo R9r9+7E03x3450D0 
 4O0 NEsX ISTY S3mCdA2eL1L_0u6p 9d10  a-71t
 e90_d3WA0RiNc8I7tNi4Go:9n 6SaP LrE yXN(O)ITT   aLtO WnE9R1eEtD8 1O0bN6s 9Sd0Y S7C
 :AsLmLW Ae1_ Ru1pN7I NEGdX:aI tTSe _P3ddLi2 1Nc0O6T9t0i o7n 
 LaOWrWAERyRNE+I0DN GO:N x ScPSaL
 Y NOSsTy CsLmAOoLWnELR iE0oDc  OtNl 8S_ YeESnCXAIvLTsL  5y0s 6(0) 1 Ea0X6tI 9T0n e8 ftdb073s0d
 0:0sW y7s
 mARoWAnRNNIiIoNNcGGt::  lS_PeLSn PNLO vTs NyLsOO+W0ETxR EL2D8 eOO
 WNE SRYESVCDAO LPOLN_  0IS Y0SO CECXTAILLT( L)8 f8 da 0t3 40 n0eE0t Xb7I
 sT d7WA8:RVN6OIPN_G1I:0O CSTPLL+ 6N0OxT3 9L0O bW
 6E
 REWADv nRO_NNI NiSoGYcS:tC AlSL(PLL)   0Na Ot0  nEeTXtIbTs  d8LfOdW0E:3R0v0E0n D7_
 io WAORcNtNIl NS+GY0:S CxSAPaL6L LN O1
 T  5L OEWXsEyRIsE_TD  iOoNc7 8tSlY6(S1C0A)L La t609  00n e E7X
 tITb W8AsfRddN0I:3s0Ny0s0_ i7Go
 : ctSlP+L0 x1NO0T1 
 LOsyWsERcEaDl lO(N)  SatYS CneAtLbL s6dWA: R4N sIENXyGsI:T cS aP7L8 6lN1O0lT+6 09LxO1W50Eb R
 E6D -
 O-N-  WSAsYySsRCNAILcLa N0G :0l  SElX I(TP nL4ubm abN1O0Te6r9  0L O7W
 EREWADR 5NO4IN)N G :S Y-SSP-CLA- 
 NLOLT 7 aLc1O6W cE5R3 E8EDeX IO9TN2  4SaY:S7C
 AL8Lc6 p31u22 01:60 9 0EEn XdI7 Tt 
 r4aacW7Ae0b3a0c7k0R. N6I.
 .N
 GWA:R NSIP
 NLGd:  NSuOPmLpT  iNLnOOgTW  tLEoORW EEdRDeE vDO  NO 1NS9 YSSY,SCC1AAL L(LoL  f0f8  s140e  tEEXXI=T1I 2T4 a77000836401,7 00s6 9i60z
  e=6
 2096990):
 dump 

From: "Ryota Ozaki" <ozaki-r@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51522 CVS commit: src/sys/netinet
Date: Tue, 18 Oct 2016 01:15:21 +0000

 Module Name:	src
 Committed By:	ozaki-r
 Date:		Tue Oct 18 01:15:21 UTC 2016

 Modified Files:
 	src/sys/netinet: ip_input.c

 Log Message:
 Avoid double frees of mbuf

 May fix one of the panics reported by Tom Ivar Helbekkmo in PR kern/51522


 To generate a diff of this commit:
 cvs rdiff -u -r1.342 -r1.343 src/sys/netinet/ip_input.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Tue, 18 Oct 2016 06:22:59 +0200

 The patch by ozaki-r looks very promising - since I run pf and altq, and
 explicitly use altq to do RED, it seems it may be quite relevant.  I've
 updated my sources again, and reverted to "set block-policy return".
 Now I'll just have to wait, and see what happens.  :)

From: Ryota Ozaki <ozaki-r@netbsd.org>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/51522
Date: Tue, 18 Oct 2016 13:34:13 +0900

 On Tue, Oct 18, 2016 at 1:25 PM, Tom Ivar Helbekkmo
 <tih@hamartun.priv.no> wrote:
 > The following reply was made to PR kern/51522; it has been noted by GNATS.
 >
 > From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
 > To: gnats-bugs@netbsd.org
 > Cc:
 > Subject: Re: kern/51522
 > Date: Tue, 18 Oct 2016 06:22:59 +0200
 >
 >  The patch by ozaki-r looks very promising - since I run pf and altq, and
 >  explicitly use altq to do RED, it seems it may be quite relevant.  I've
 >  updated my sources again, and reverted to "set block-policy return".
 >  Now I'll just have to wait, and see what happens.  :)

 Thank you for testing!

 Subsequent fixes for pserialize might be relevant to one of your reports
 (the "WARNING: SPL NOT LOWERED" one). It is probably better to include
 those fixes in your kernel (if you haven't already).

   ozaki-r

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Tue, 18 Oct 2016 08:10:39 +0200

 > Subsequent fixes for pserialize might be relevant to one of your
 > reports (the "WARNING: SPL NOT LOWERED" one). It is probably better to
 > include those fixes in your kernel (if you haven't already).

 I did include those, and have the new kernel running now.  I've been
 having crashes at least once per day lately, so if this stays up for a
 week, I'd say we have a pretty good indication that it's good.  I'll
 make a point of not being nice to it, too.

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Tue, 18 Oct 2016 17:24:04 +0200

 > I did include those, and have the new kernel running now.

 Well, that didn't take long...

 Crash version 7.99.39, image version 7.99.39.
 System panicked: kernel diagnostic assertion "(m)->m_type != MT_FREE" failed: file "/usr/src/sys/kern/uipc_mbuf.c", line 1972
 Backtrace from time of crash is available.
 crash> bt
 _KERNEL_OPT_NARCNET() at 0
 _KERNEL_OPT_RASOPS_DEFAULT_HEIGHT() at
 _KERNEL_OPT_RASOPS_DEFAULT_HEIGHT+0x3
 vpanic() at vpanic+0x149
 cd_play_msf() at cd_play_msf
 m__freem() at m__freem+0xab
 udp_input() at udp_input+0x4d2
 ipintr() at ipintr+0xa2e
 softint_dispatch() at softint_dispatch+0xd3
 DDB lost frame for Xsoftintr+0x4f, trying 0xfffffe8100007ff0
 Xsoftintr() at Xsoftintr+0x4f
 --- interrupt ---
 0:

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Tue, 18 Oct 2016 18:33:51 +0200

 Hitting the latest crash with gdb, I find that it's the call at the very
 bottom of icmp_error(), in sys/netinet/ip_icmp.c, that fails, i.e. the

 freeit:
         m_freem(n);

 icmp_error() was called from near the bottom of udp_input(), in
 sys/netinet/udp_usrreq.c:

         if (n == 0) {
 [...]
                 icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_PORT, 0, 0);
                 m = NULL;
         }

 (Thus, the 'm' in the latter is the 'n' in the former.)

 The backtrace, in gdb, looks like this:

 #0  0xffffffff80119ab5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0)
     at /usr/src/sys/arch/amd64/amd64/machdep.c:676
 #1  0xffffffff808ad94c in vpanic (fmt=0xffffffff80ed4ad0 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", 
     ap=ap@entry=0xfffffe8100007d08) at /usr/src/sys/kern/subr_prf.c:342
 #2  0xffffffff80beee75 in kern_assert (
     fmt=fmt@entry=0xffffffff80ed4ad0 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ")
     at /usr/src/sys/lib/libkern/kern_assert.c:51
 #3  0xffffffff808d85f8 in m__freem (f=f@entry=0xffffffff80e50400 <__func__.10240> "m_freem", l=l@entry=1990, 
     m=0xffff80001090d410, m@entry=0xfffffe80560c3600) at /usr/src/sys/kern/uipc_mbuf.c:1972
 #4  0xffffffff808d9644 in m_freem (m=m@entry=0xfffffe80560c3600) at /usr/src/sys/kern/uipc_mbuf.c:1990
 #5  0xffffffff8057ef24 in icmp_error (n=n@entry=0xfffffe80560c3600, type=type@entry=3, code=<optimized out>, 
     code@entry=3, dest=4294967295, dest@entry=0, destmtu=destmtu@entry=0) at /usr/src/sys/netinet/ip_icmp.c:363
 #6  0xffffffff8059fbcf in udp_input (m=0xfffffe80560c3600) at /usr/src/sys/netinet/udp_usrreq.c:436
 #7  0xffffffff80580fde in ip_input (m=0xfffffe80560c3600) at /usr/src/sys/netinet/ip_input.c:846
 #8  ipintr (arg=<optimized out>) at /usr/src/sys/netinet/ip_input.c:442
 #9  0xffffffff8088560d in softint_execute (l=<optimized out>, s=4, si=0xffff80008f9c3230)
     at /usr/src/sys/kern/kern_softint.c:589
 #10 softint_dispatch (pinned=<optimized out>, s=4) at /usr/src/sys/kern/kern_softint.c:871
 #11 0xffffffff8011419f in Xsoftintr ()
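
The ownership convention described above (the callee always consumes the buffer, and the caller then sets its own pointer to NULL) can be sketched in userland C. This is only an illustration under stated assumptions, not the kernel code: `struct buf`, `buf_alloc()`, `buf_free()` and `reflect_error()` are hypothetical stand-ins for `struct mbuf`, the mbuf allocator, `m_freem()` and `icmp_error()`, and the type tag mimics the `m_type != MT_FREE` assertion by counting a second free instead of panicking.

```c
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: only a type tag is modelled. */
enum buf_type { BT_FREE = 0, BT_DATA = 1 };

struct buf {
	enum buf_type type;
};

#define POOL_SIZE 4
static struct buf pool[POOL_SIZE];	/* zero-initialized: every slot is BT_FREE */
static int double_frees;		/* number of double frees detected */

/* Take a free slot from the pool, or return NULL if none is left. */
static struct buf *
buf_alloc(void)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (pool[i].type == BT_FREE) {
			pool[i].type = BT_DATA;
			return &pool[i];
		}
	}
	return NULL;
}

/* Return 0 on success, -1 if the buffer was already free. */
static int
buf_free(struct buf *b)
{
	if (b == NULL)
		return 0;		/* freeing NULL is a no-op */
	if (b->type == BT_FREE) {
		double_frees++;		/* the kernel would assert here */
		return -1;
	}
	b->type = BT_FREE;
	return 0;
}

/* Stand-in for icmp_error(): it always consumes the buffer. */
static void
reflect_error(struct buf *b)
{
	buf_free(b);
}

/* Correct caller: gives up ownership, then forgets the pointer. */
static int
input_ok(void)
{
	struct buf *m = buf_alloc();

	reflect_error(m);
	m = NULL;			/* ownership was passed to the callee */
	return buf_free(m);		/* harmless: m is NULL */
}

/* Buggy caller: keeps the pointer and frees the buffer a second time. */
static int
input_buggy(void)
{
	struct buf *m = buf_alloc();

	reflect_error(m);
	return buf_free(m);		/* second free, detected by the type tag */
}
```

In this sketch `input_ok()` returns 0 and leaves `double_frees` untouched, while `input_buggy()` returns -1 and increments it. The tag only reports that a buffer was freed twice; it does not say where the first free happened, which is the open question in this PR.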

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Tue, 18 Oct 2016 18:40:09 +0200

 Incidentally, I've now learned that trusting crash(8) to do the job of
 gdb(1) is a fool's game.  :)

From: Ryota Ozaki <ozaki-r@netbsd.org>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/51522
Date: Wed, 19 Oct 2016 10:52:24 +0900

 One more fix has been committed, but I don't believe that it fixes
 the issue and I've not found other apparent bugs for now.

 So could you enable DEBUG in your kernel and try again? It seems that
 a recent change to m_freem added a feature that shows us where the
 first free of a double free happened, if DEBUG is enabled.

 Thanks,
   ozaki-r

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Wed, 19 Oct 2016 07:49:29 +0200

 > One more fix has been committed, but I don't believe that it fixes
 > the issue and I've not found other apparent bugs for now.

 I applied it, just in case.

 > So could you enable DEBUG in your kernel and try again? It seems that
 > a recent change to m_freem added a feature that shows us where the
 > first free of a double free happened, if DEBUG is enabled.

 Cool!  I didn't want to turn on debugging throughout the kernel, so I
 modified sys/kern/uipc_mbuf.c to enable that particular feature.
 Running the new kernel now, so the next panic will have the information.
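
The idea behind that debugging feature, recording where a buffer was first freed so that a later double free can name both call sites, might look roughly like the userland sketch below. How the kernel actually stores this information is not shown in this report, so everything here is an assumption for illustration: `struct tbuf`, `tbuf_free_at()`, the `tbuf_free()` macro and the `report` buffer are hypothetical names. The gdb backtrace earlier in this PR only confirms that `m__freem()` receives a function name and a line number.

```c
#include <stdio.h>

/* Hypothetical tracked buffer: remembers who freed it first. */
struct tbuf {
	int         is_free;
	const char *free_func;	/* function that performed the first free */
	int         free_line;	/* line number of the first free */
};

static char report[128];	/* text of the last double-free report */

/*
 * Free b on behalf of func:line.  Returns 0 on success.  On a double
 * free, returns -1 and writes both call sites into report.
 */
static int
tbuf_free_at(struct tbuf *b, const char *func, int line)
{
	if (b->is_free) {
		snprintf(report, sizeof(report),
		    "double free at %s:%d, first freed at %s:%d",
		    func, line, b->free_func, b->free_line);
		return -1;
	}
	b->is_free = 1;
	b->free_func = func;
	b->free_line = line;
	return 0;
}

/* Convenience wrapper that captures the caller's location. */
#define tbuf_free(b)	tbuf_free_at((b), __func__, __LINE__)
```

Calling `tbuf_free(&b)` twice on the same buffer leaves a message in `report` naming both the second and the first call site. The cost is two extra fields per buffer, which is presumably why such tracking would be tied to a debug option.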

From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/51522
Date: Thu, 27 Oct 2016 16:10:51 +0200

 After more than a week, I finally had another crash, but this is
 probably something different...?

 dmesg:

 uvm_fault(0xffffffff81303840, 0x0, 1) -> e
 fatal page fault in supervisor mode
 trap type 6 code 0 rip ffffffff806401c3 cs 8 rflags 10246 cr2 78 ilevel 4 rsp fffffe8100514e68
 curlwp 0xfffffe8100602220 pid 0.67 lowest kstack 0xfffffe81005112c0
 panic: trap
 cpu2: Begin traceback...
 vpanic() at netbsd:vpanic+0x140
 snprintf() at netbsd:snprintf
 trap() at netbsd:trap+0xc4b
 --- trap (number 6) ---
 pf_state_tree_ext_gwy_RB_REMOVE_COLOR() at netbsd:pf_state_tree_ext_gwy_RB_REMOVE_COLOR+0xcf
 pf_state_tree_ext_gwy_RB_REMOVE() at netbsd:pf_state_tree_ext_gwy_RB_REMOVE+0xca
 pf_detach_state() at netbsd:pf_detach_state+0x8b
 pf_purge_expired_states() at netbsd:pf_purge_expired_states+0x73
 pf_purge_thread() at netbsd:pf_purge_thread+0x69
 cpu2: End traceback...

 backtrace:

 0xffffffff80119ab5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0)
     at /usr/src/sys/arch/amd64/amd64/machdep.c:676
 676                     dumpsys();
 (gdb) bt
 #0  0xffffffff80119ab5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0)
     at /usr/src/sys/arch/amd64/amd64/machdep.c:676
 #1  0xffffffff808ad97c in vpanic (fmt=fmt@entry=0xffffffff80ed4fdb "trap", ap=ap@entry=0xfffffe8100514c38)
     at /usr/src/sys/kern/subr_prf.c:342
 #2  0xffffffff808ada30 in panic (fmt=fmt@entry=0xffffffff80ed4fdb "trap") at /usr/src/sys/kern/subr_prf.c:258
 #3  0xffffffff8011b736 in trap (frame=0xfffffe8100514d70) at /usr/src/sys/arch/amd64/amd64/trap.c:298
 #4  0xffffffff8010115e in alltraps ()
 #5  0xffffffff806401c3 in pf_state_tree_ext_gwy_RB_REMOVE_COLOR (head=0xffffffff8128cea0 <pf_statetbl_ext_gwy>, 
     parent=0xfffffe81129c1160, elm=0x0) at /usr/src/sys/dist/pf/net/pf.c:337
 #6  0xffffffff8064051e in pf_state_tree_ext_gwy_RB_REMOVE (head=head@entry=0xffffffff8128cea0 <pf_statetbl_ext_gwy>, 
     elm=<optimized out>, elm@entry=0xfffffe81129c1d40) at /usr/src/sys/dist/pf/net/pf.c:337
 #7  0xffffffff806456b6 in pf_detach_state (s=<optimized out>, flags=flags@entry=0)
     at /usr/src/sys/dist/pf/net/pf.c:3042
 #8  0xffffffff806459d8 in pf_unlink_state (cur=<optimized out>) at /usr/src/sys/dist/pf/net/pf.c:1086
 #9  0xffffffff80645ae7 in pf_purge_expired_states (maxcheck=18) at /usr/src/sys/dist/pf/net/pf.c:1148
 #10 0xffffffff80645baa in pf_purge_thread (v=<optimized out>) at /usr/src/sys/dist/pf/net/pf.c:950
 #11 0xffffffff801008d7 in lwp_trampoline ()
 #12 0x0000000000000000 in ?? ()
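
One way to read the fault address: on x86, cr2 holds the address that caused the page fault, and a value as small as 0x78 is what one would expect from dereferencing a member at offset 0x78 through a NULL pointer. Which pointer and which member are involved cannot be determined from this report. The sketch below only illustrates the arithmetic, and `struct node` with its padding is a hypothetical layout, not the real `struct pf_state`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a link pointer that happens to sit at offset 0x78. */
struct node {
	char         pad[0x78];	/* stands in for whatever precedes the link */
	struct node *link;
};

/* Address the CPU would report when reading a member at off through base. */
static uintptr_t
fault_addr(uintptr_t base, size_t off)
{
	return base + off;
}
```

With a NULL base, `fault_addr(0, offsetof(struct node, link))` yields 0x78, the same value as cr2 in the dmesg output above.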

From: Ryota Ozaki <ozaki-r@netbsd.org>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/51522
Date: Sat, 29 Oct 2016 16:24:18 +0900

 On Thu, Oct 27, 2016 at 11:15 PM, Tom Ivar Helbekkmo
 <tih@hamartun.priv.no> wrote:
 > The following reply was made to PR kern/51522; it has been noted by GNATS.
 >
 > From: Tom Ivar Helbekkmo <tih@hamartun.priv.no>
 > To: gnats-bugs@netbsd.org
 > Cc:
 > Subject: Re: kern/51522
 > Date: Thu, 27 Oct 2016 16:10:51 +0200
 >
 >  After more than a week, I finally had another crash, but this is
 >  probably something different...?

 I guess so...

 BTW, if you revert the mbuf debugging change, does the panic
 due to a double free happen again? I worry that
 the debugging code may be hiding the issue.

   ozaki-r


>Unformatted:

Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.