NetBSD Problem Report #38245

From reed@reedmedia.net  Sun Mar 16 02:55:22 2008
Return-Path: <reed@reedmedia.net>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id EC8AC63B863
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 16 Mar 2008 02:55:21 +0000 (UTC)
Message-Id: <23345-1205635929@reedmedia.net>
Date: Sat, 15 Mar 2008 21:52:10 -0500
From: reed@reedmedia.net
Reply-To: reed@reedmedia.net
To: gnats-bugs@gnats.NetBSD.org
Subject: system lock up with pf.o module on amd64
X-Send-Pr-Version: 3.95

>Number:         38245
>Category:       kern
>Synopsis:       system lock up with pf.o module on amd64
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Mar 16 03:00:00 +0000 2008
>Closed-Date:    Mon Nov 21 07:34:30 +0000 2016
>Last-Modified:  Mon Nov 21 07:34:30 +0000 2016
>Originator:     reed@reedmedia.net
>Release:        NetBSD 4.99.55
>Organization:
  Jeremy C. Reed
>Environment:


System: NetBSD tx.reedmedia.net 4.99.55 NetBSD 4.99.55 (GENERIC) #0: Fri Mar 7 09:06:46 CST 2008 reed@tx.reedmedia.net:/usr/src/obj/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
My dmesg is at http://reedmedia.net/~reed/tmp-nb73459yt/dmesg.boot

(Also had problem on 4.0_RC2.)

Using the pf.o module on amd64 to do NAT on ral0 for traffic from my re0,
which is plugged into a Cisco IP phone, causes the system to lock up. It
can't even be pinged.

I don't know what triggers it. Sometimes I am using the system in X and
sometimes I am away. I am in X and see no log messages.
(I can't test at the console to see whether any messages appear, as I
can't use the console after loading the module -- see my ticket #38244.)

I repeated this many times with 4.0_RC2 and with 4.99.55.

Before using the module, and once I stop using it, the same system
works fine. Also, I don't see this pf.o problem with my NetBSD/i386 4.0
system (which is using rum0 and bge0).
(The working system's dmesg is at 
http://reedmedia.net/~reed/tmp-nb73459yt/dmesg.boot-Dell-Latitude-D610-NetBSD-4.0-i386 )

For what it's worth, I replaced pf.o with the built-in ipfilter, using
ipnat to do the same task, and after 12 hours there has been no lock up.
(The pf lock up came after probably about 30 minutes.)

I filed this as a "kern" issue, but maybe it is an "amd64" issue?

I may build pf into the kernel and try that instead of the module.
But it is very inconvenient to have this system lock up quickly
and frequently.
>How-To-Repeat:
For me it locks up every time, maybe 30 minutes after I modload pf.o,
enable forwarding, and enable PF with one rule:
nat on ral0 from !(ral0) -> (ral0:0)
with the Cisco IP phone plugged into my re0.
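
The steps above can be sketched as a small script. This is only an
illustration, not the exact commands used in the report: the module path
/usr/lkm/pf.o, the rule file location, and the DRY_RUN switch are
assumptions, and the modload arguments needed may differ between releases.
By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch of the reproduction steps; run as root with DRY_RUN=0 to execute.
# Assumed (not stated in the report): module path and rule file location.
DRY_RUN="${DRY_RUN:-1}"
PF_MODULE="${PF_MODULE:-/usr/lkm/pf.o}"
PF_CONF="${PF_CONF:-${TMPDIR:-/tmp}/pf-nat-test.conf}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

repro_steps() {
    # The single NAT rule quoted in the report.
    printf '%s\n' 'nat on ral0 from !(ral0) -> (ral0:0)' > "$PF_CONF"

    run modload "$PF_MODULE"                 # load the pf module
    run sysctl -w net.inet.ip.forwarding=1   # enable IPv4 forwarding
    run pfctl -f "$PF_CONF"                  # load the one-rule ruleset
    run pfctl -e                             # enable PF
}

repro_steps
```

With the phone plugged into re0, the report says the lock up follows
roughly 30 minutes after the last step.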
>Fix:


>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 01 Apr 2009 06:38:12 +0000
State-Changed-Why:
Has this been a problem since ad fixed module loading on amd64?


From: "Jeremy C. Reed" <reed@reedmedia.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/38245 (system lock up with pf.o module on amd64)
Date: Wed, 1 Apr 2009 07:58:42 -0500 (CDT)

 > Has this been a problem since ad fixed module loading on amd64?

 Have not been able to test yet. One system is running 4.0. Other system 
 won't run NetBSD HEAD newer than May 13, 2008. 
 (http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=39275)

State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 31 Jan 2010 01:25:47 +0000
State-Changed-Why:
new(ish) development


State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 31 Jan 2010 01:27:33 +0000
State-Changed-Why:
Another amd64 module fix was pulled up in November (pullup-5 #1140) - please
try that when you get a chance.


State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 14 Aug 2016 23:40:24 +0000
State-Changed-Why:
6.5-year feedback timeout.


State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 21 Nov 2016 07:34:30 +0000
State-Changed-Why:
In the absence of other information, assume the problem was that this
was observed during the time when modules were loaded into memory
overwriting critical parts of the pmap (or whatever it was, something
comparably fatal at least) on amd64, not a problem with pf.


>Unformatted:
