NetBSD Problem Report #50809

From hf@spg.tu-darmstadt.de  Mon Feb 15 15:20:09 2016
Return-Path: <hf@spg.tu-darmstadt.de>
Received: from mail.netbsd.org (mail.NetBSD.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id E8ADC7ABDA
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 15 Feb 2016 15:20:09 +0000 (UTC)
Message-Id: <201602151444.u1FEiWST000520@Vertatscha.nt.e-technik.tu-darmstadt.de>
Date: Mon, 15 Feb 2016 15:44:32 +0100 (CET)
From: Hauke Fath <hf@spg.tu-darmstadt.de>
Reply-To: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
Subject: pf panics while purging state
X-Send-Pr-Version: 3.95

>Number:         50809
>Category:       kern
>Synopsis:       pf panics while purging state
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Feb 15 15:25:00 +0000 2016
>Last-Modified:  Mon Feb 15 17:50:00 +0000 2016
>Originator:     Hauke Fath <hf@spg.tu-darmstadt.de>
>Release:        NetBSD 7.0_STABLE
>Organization:
Technische Universitaet Darmstadt
>Environment:


System: NetBSD Vertatscha 7.0_STABLE NetBSD 7.0_STABLE (FIFI-$Revision: 1.85 $) #0: Mon Feb 8 12:13:12 CET 2016 hf@Hochstuhl:/var/obj/netbsd-builds/7/amd64/sys/arch/amd64/compile/FIFI amd64
Architecture: x86_64
Machine: amd64
>Description:

	On a busy router machine, we have seen at least two kernel
	panics with the stack trace

uvm_fault(0xffffffff806ce920, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff802c5eef cs 8 rflags 10246 cr2 48 ilevel 4 rsp ff
curlwp 0xfffffe810f0855c0 pid 0.62 lowest kstack 0xfffffe810f1452c0
panic: trap
cpu3: Begin traceback...
vpanic() at netbsd:vpanic+0x13c
snprintf() at netbsd:snprintf
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0x96
pf_state_tree_id_RB_REMOVE() at netbsd:pf_state_tree_id_RB_REMOVE+0xd6
pf_unlink_state() at netbsd:pf_unlink_state+0x21
pf_purge_expired_states() at netbsd:pf_purge_expired_states+0x79
pf_purge_thread() at netbsd:pf_purge_thread+0x69
cpu3: End traceback...
rebooting...
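
	A note on reading the trap line above: on x86, cr2 holds the
	faulting address, and here it is 0x48, inside the page at 0x0
	that uvm_fault() reports. That pattern is commonly what a
	field read through a NULL pointer looks like, though the trace
	alone does not prove it. The sketch below is only an
	illustration: the helper names are hypothetical, the values
	are assumed to be printed in hex, and a 4 KiB page size is
	assumed. The rsp value is cut off in the console output, so it
	is not parsed.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB base pages on amd64


def parse_trap_line(line):
    """Extract selected fields from a 'trap type ...' console line.

    Values are assumed to be hexadecimal. Only the fields needed
    here are parsed; rsp is truncated in the report and skipped.
    """
    fields = {}
    for name, value in re.findall(r"\b(type|code|rip|cr2) ([0-9a-f]+)\b", line):
        fields[name] = int(value, 16)
    return fields


def looks_like_null_deref(fields):
    """Heuristic: a fault address inside the first page suggests
    a small offset applied to a NULL pointer."""
    return fields["cr2"] < PAGE_SIZE


# The trap line as it appears in this report.
line = ("trap type 6 code 0 rip ffffffff802c5eef cs 8 "
        "rflags 10246 cr2 48 ilevel 4 rsp ff")
fields = parse_trap_line(line)
print(hex(fields["cr2"]), looks_like_null_deref(fields))
```

	For the line in this report the heuristic fires, which fits a
	NULL node or link pointer being followed somewhere inside
	pf_state_tree_id_RB_REMOVE(). Which pointer, and why it was
	NULL, cannot be told from the trace.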

	After the reboot, pflogd(8) hogs 100% of a CPU core. When I
	restart it manually, CPU load returns to normal.


>How-To-Repeat:

	Run netbsd-7 on a busy router till it crashes.


>Fix:

	If only I knew how to.



>Audit-Trail:
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/50809: pf panics while purging state
Date: Mon, 15 Feb 2016 10:40:55 -0500

 On Feb 15,  3:25pm, hf@spg.tu-darmstadt.de (Hauke Fath) wrote:
 -- Subject: kern/50809: pf panics while purging state


 We really need to decide what to do with pf and ipf. People keep using
 them but it seems that the versions in the tree have bit rotted and we
 get kernel bugs that nobody seems to care about fixing. Particularly
 in the pf case, the code is really old and should really be updated to
 the latest pf if we want to maintain this packet filter in the tree.

 If we are not going to maintain them or spend cycles trying to fix the
 bugs people report, we should get people to use npf which we actively
 maintain. For that we need to get npf to have feature parity with the
 other packet filters. Hauke can you try switching in this case?

 christos

From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
Subject: Re: kern/50809: pf panics while purging state
Date: Mon, 15 Feb 2016 16:56:25 +0100

 On Mon, 15 Feb 2016 15:45:01 +0000 (UTC), Christos Zoulas wrote:
 >  If we are not going to maintain them or spend cycles trying to fix the
 >  bugs people report, we should get people to use npf which we actively
 >  maintain. For that we need to get npf to have feature parity with the
 >  other packet filters.

 Sounds good so far. What would help besides feature parity is a
 conversion how-to.

 > Hauke can you try switching in this case?

 I'd love to - this is a six-core cpu - but it's not going to happen
 quickly:

 % wc -l /etc/pf.conf
      829 /etc/pf.conf
 %

 I have to be familiar enough with npf to convert the setup over a
 weekend, or there'll be one less sysadmin.  ;)

 Last I looked, there wasn't much npf documentation to speak of, but I
 admit it's been a while.

 hauke

 -- 
      The ASCII Ribbon Campaign                    Hauke Fath
 ()     No HTML/RTF in email            Institut für Nachrichtentechnik
 /\     No Word docs in email                     TU Darmstadt
      Respect for open standards              Ruf +49-6151-16-21344

From: christos@zoulas.com (Christos Zoulas)
To: Hauke Fath <hf@spg.tu-darmstadt.de>, gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
Subject: Re: kern/50809: pf panics while purging state
Date: Mon, 15 Feb 2016 10:59:23 -0500

 On Feb 15,  4:56pm, hf@spg.tu-darmstadt.de (Hauke Fath) wrote:
 -- Subject: Re: kern/50809: pf panics while purging state

 | On Mon, 15 Feb 2016 15:45:01 +0000 (UTC), Christos Zoulas wrote:
 | >  If we are not going to maintain them or spend cycles trying to fix the
 | >  bugs people report, we should get people to use npf which we actively
 | >  maintain. For that we need to get npf to have feature parity with the
 | >  other packet filters.
 | 
 | Sounds good so far. What would help besides feature parity is a
 | conversion how-to.
 | 
 | > Hauke can you try switching in this case?
 | 
 | I'd love to - this is a six-core cpu - but it's not going to happen
 | quickly:
 | 
 | % wc -l /etc/pf.conf
 |      829 /etc/pf.conf
 | %
 | 
 | I have to be familiar enough with npf to convert the setup over a
 | weekend, or there'll be one less sysadmin.  ;)
 |
 | Last I looked, there wasn't much npf documentation to speak of, but I
 | admit it's been a while.

 Have you seen this: https://www.netbsd.org/~rmind/npf/

 christos

From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org
Subject: Re: kern/50809: pf panics while purging state
Date: Mon, 15 Feb 2016 17:01:31 +0100

 On Mon, 15 Feb 2016 10:59:23 -0500, Christos Zoulas wrote:
 > | Last I looked, there wasn't much npf documentation to speak of, but I
 > | admit it's been a while.
 >
 > Have you seen this: https://www.netbsd.org/~rmind/npf/

 A while ago. It's grown since, which is good.

 I guess I might risk annoying SWMBO, and convert my home router first...

 hauke

 -- 
      The ASCII Ribbon Campaign                    Hauke Fath
 ()     No HTML/RTF in email            Institut für Nachrichtentechnik
 /\     No Word docs in email                     TU Darmstadt
      Respect for open standards              Ruf +49-6151-16-21344

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, 
	Hauke Fath <hf@spg.tu-darmstadt.de>
Cc: 
Subject: Re: kern/50809: pf panics while purging state
Date: Mon, 15 Feb 2016 11:08:34 -0500

 On Feb 15,  4:05pm, hf@spg.tu-darmstadt.de (Hauke Fath) wrote:
 -- Subject: Re: kern/50809: pf panics while purging state

 |  A while ago. It's grown since, which is good.
 |  
 |  I guess I might risk annoying SWMBO, and convert my home router first...

 If you need help, let me know. I run npf on pretty much everything these
 days.

 christos

From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: christos@zoulas.com (Christos Zoulas)
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/50809: pf panics while purging state
Date: Mon, 15 Feb 2016 16:32:12 +0000

 christos@zoulas.com (Christos Zoulas) wrote:
 > 
 > If we are not going to maintain them or spend cycles trying to fix the
 > bugs people report, we should get people to use npf which we actively
 > maintain. For that we need to get npf to have feature parity with the
 > other packet filters. Hauke can you try switching in this case?
 > 

 I have not had enough time recently to work on feature parity,
 but I am more than happy to spread the knowledge on the
 NPF internals and help with the work.  I also have some unfinished
 patches which add features; they need some mechanical completion
 and just testing really.

 -- 
 Mindaugas

From: Brad Spencer <brad@anduin.eldar.org>
To: rmind@NetBSD.org
Cc: christos@zoulas.com, gnats-bugs@NetBSD.org, kern-bug-people@NetBSD.org,
        gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/50809: pf panics while purging state
Date: Mon, 15 Feb 2016 12:25:57 -0500 (EST)

    christos@zoulas.com (Christos Zoulas) wrote:
    > 
    > If we are not going to maintain them or spend cycles trying to fix the
    > bugs people report, we should get people to use npf which we actively
    > maintain. For that we need to get npf to have feature parity with the
    > other packet filters. Hauke can you try switching in this case?
    > 

    I have not had enough time recently to work on feature parity,
    but I am more than happy to spread the knowledge on the
    NPF internals and help with the work.  I also have some unfinished
    patches which add features; they need some mechanical completion
    and just testing really.

    -- 
    Mindaugas



 I probably use IPF in a somewhat unusual manner, but the only reason I
 don't use NPF is the seeming lack of BRIDGE_IPF.  I have placed an IPF
 filter in between me and the Internet with another system lower down doing
 NAT, and internal routing and more firewalling.  I actually have a small
 set of fully routable IPs that live on systems and would rather not do NAT
 on the edge if I can help it, nor would I like to maintain firewall sets
 on these systems for those things I would like to prevent from leaving or
 prevent from entering the edge network.



 -- 
 Brad Spencer - brad@anduin.eldar.org - KC8VKS
 http://anduin.eldar.org  - & -  http://anduin.ipv6.eldar.org [IPv6 only]

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: rmind@NetBSD.org, christos@zoulas.com, gnats-bugs@NetBSD.org,
        kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
        netbsd-bugs@NetBSD.org
Subject: Re: kern/50809: pf panics while purging state
Date: Mon, 15 Feb 2016 18:48:01 +0100

 On Mon, Feb 15, 2016 at 12:25:57PM -0500, Brad Spencer wrote:
 > I probably use IPF in a somewhat unusual manner, but the only reason I
 > don't use NPF is the seeming lack of BRIDGE_IPF.  I have placed an IPF

 It's not unusual to use BRIDGE_IPF. I have it on all my Xen dom0 systems
 (to do filtering for domUs).

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

>Unformatted:
