NetBSD Problem Report #32429
From woods@building.weird.com Mon Jan 2 03:58:10 2006
Return-Path: <woods@building.weird.com>
Received: from building.weird.com (building.weird.com [204.92.254.24])
by narn.netbsd.org (Postfix) with ESMTP id A929363B942
for <gnats-bugs@gnats.netbsd.org>; Mon, 2 Jan 2006 03:58:10 +0000 (UTC)
Message-Id: <m1EtGpl-002IeQC@building.weird.com>
Date: Sun, 1 Jan 2006 22:58:09 -0500 (EST)
From: "Greg A. Woods" <woods@planix.com>
Sender: "Greg A. Woods" <woods@building.weird.com>
Reply-To: "Greg A. Woods" <woods@planix.com>
To: gnats-bugs@netbsd.org
Subject: setting MAXDSIZ > 1GB on 1.6.x alpha causes a "panic: trap"
X-Send-Pr-Version: 3.95
>Number: 32429
>Category: port-alpha
>Synopsis: setting MAXDSIZ over 1GB on 1.6.x alpha causes a "panic: trap"
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: thorpej
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Jan 02 07:15:51 +0000 2006
>Last-Modified: Tue Oct 13 15:54:45 +0000 2020
>Originator: Greg A. Woods
>Release: NetBSD 1.6.2_STABLE (cvs update on 20051127)
>Organization:
Planix, Inc.; Toronto, Ontario; Canada
>Environment:
System: NetBSD building 1.6.2_STABLE
Architecture: alpha
Machine: alpha
>Description:
NetBSD/alpha has a default MAXDSIZ of 1GB, which limits every
process to a hard RLIMIT_DATA of 1GB.
When MAXDSIZ is raised above 1GB so that a process can have an
RLIMIT_DATA of more than 1GB, the kernel panics quickly once it
is put under any significant load.
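The per-process limit described above can be inspected from userland. A minimal sketch, assuming a Python interpreter with the standard `resource` module is available on the machine (the helper name `data_limit` is made up for illustration); it only reads the limits and changes nothing:

```python
import resource


def data_limit():
    """Return the (soft, hard) RLIMIT_DATA pair in bytes.

    resource.RLIM_INFINITY is reported as the string "unlimited".
    """
    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else value

    soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
    return fmt(soft), fmt(hard)


if __name__ == "__main__":
    soft, hard = data_limit()
    print("RLIMIT_DATA soft:", soft)
    print("RLIMIT_DATA hard:", hard)
```

On a kernel built with the default MAXDSIZ, the hard value would be expected to be no larger than 1GB expressed in bytes.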
Note that everything works fine in single user mode with just
one process running:
[console]<@> # ulimit -d $((8*1024*1024*1024))
[console]<@> # ulimit -a
time(cpu-seconds) unlimited
file(blocks) unlimited
coredump(blocks) unlimited
data(kbytes) 8388608
stack(kbytes) 2048
lockedmem(kbytes) 4860504
memory(kbytes) 14581512
nofiles(descriptors) 64
processes 160
[console]<@> # time zonec -v -f dnsbl.sorbs.net.nsd sorbs.zonec &
[1] zonec -v -f dnsbl.sorbs.net.nsd sorbs.zonec
[[ ... wait for some time ... ]]
[console]<@> # ps -u
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 76 99.0 -38534.9 1191528 1149864 C0 R 10:24PM 3:17.11 zonec -v -f dnsbl
root 72 0.0 -19.3 608 560 C0 S 10:23PM 0:00.49 ksh
root 15 0.0 -21.7 728 632 C0 Is 10:19PM 0:01.49 -sh
root 107 0.0 -13.4 384 384 C0 R+ 10:28PM 0:00.00 ps -u
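The numbers in the transcript above can be cross-checked with a little arithmetic. This is a sketch only, assuming that ulimit and ps both report sizes in units of 1024 bytes (as the "kbytes" labels suggest); the variable names are illustrative:

```python
KIB = 1024
GIB = 1024 ** 3

# MAXDSIZ as given in the kernel option in this report, in bytes.
maxdsiz_bytes = 8 * 1024 * 1024 * 1024
maxdsiz_kbytes = maxdsiz_bytes // KIB      # compare with "data(kbytes)"

# VSZ of the zonec process from the ps output, assumed to be in KiB.
zonec_vsz_kbytes = 1191528
default_limit_kbytes = GIB // KIB          # the default 1GB MAXDSIZ in KiB

print(maxdsiz_kbytes)                            # 8388608
print(zonec_vsz_kbytes > default_limit_kbytes)   # True
print(round(zonec_vsz_kbytes * KIB / GIB, 2))    # 1.14
```

Under that assumption, the reported data limit matches the 8GB MAXDSIZ, and the single zonec process had already grown past the default 1GB limit (to roughly 1.14GB) without a panic.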
>How-To-Repeat:
Build a kernel with:
options MAXDSIZ="(8UL*1024*1024*1024)"
then boot to multiuser and observe a panic shortly afterwards:
CPU 3: fatal kernel trap:
CPU 3 trap entry = 0x2 (memory management fault)
CPU 3 a0 = 0x2a0
CPU 3 a1 = 0x1
CPU 3 a2 = 0x0
CPU 3 pc = 0xfffffc0000300a50
CPU 3 ra = 0xfffffc0000300a44
CPU 3 pv = 0xfffffc0000300994
CPU 3 curproc = 0xfffffc00b3be8ba8
CPU 3 pid = 328, comm = imapd
panic: trap
Stopped in pid 328 (imapd) at cpu_Debugger+0x4: ret zero,(ra)
db{3}> trace
cpu_Debugger() at cpu_Debugger+0x4
panic() at panic+0x160
trap() at trap+0x6ec
XentMM() at XentMM+0x20
--- memory management fault (from ipl 0) ---
copyinstr() at copyinstr+0x54
namei() at namei+0xb8
sys___stat13() at sys___stat13+0x5c
syscall_plain() at syscall_plain+0x158
XentSys() at XentSys+0x5c
--- syscall (278) ---
--- user mode ---
db{3}>
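For readers unfamiliar with Alpha trap frames, the a0..a2 values printed above can be decoded roughly as follows. This is a sketch written from memory of the Alpha PALcode convention for memory management faults (a0 = faulting virtual address, a1 = MMCSR fault type, a2 = access cause). The two tables and the helper names `to_signed64` and `decode_mm_fault` are assumptions to be checked against the alpha headers in the kernel source; they are not taken from this report:

```python
# Assumed MMCSR fault types (register a1); verify against the kernel headers.
MMCSR = {
    0: "translation not valid",
    1: "access violation",
    2: "fault on read",
    3: "fault on execute",
    4: "fault on write",
}

# Assumed access causes (register a2, a signed value).
CAUSE = {
    -1: "instruction fetch",
    0: "load (read)",
    1: "store (write)",
}


def to_signed64(value):
    """Interpret an unsigned 64-bit register dump as a signed integer."""
    return value - (1 << 64) if value >= (1 << 63) else value


def decode_mm_fault(a0, a1, a2):
    """Turn the raw a0/a1/a2 register values into readable fields."""
    return {
        "address": hex(a0),
        "type": MMCSR.get(a1, "unknown"),
        "cause": CAUSE.get(to_signed64(a2), "unknown"),
    }


# Values copied from the trap output in this report.
print(decode_mm_fault(0x2a0, 0x1, 0x0))
```

If those tables are right, the fault recorded here is an access violation on a read of virtual address 0x2a0, taken inside copyinstr().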
>Fix:
unknown
>Release-Note:
>Audit-Trail:
From: Elad Efrat <elad@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/32429: setting MAXDSIZ > 1GB on 1.6.x alpha causes a "panic: trap"
Date: Mon, 02 Jan 2006 18:05:22 +0200
please try to reproduce on a -current kernel.
-e.
--
Elad Efrat
From: "Greg A. Woods" <woods@planix.com>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/32429: setting MAXDSIZ > 1GB on 1.6.x alpha causes a "panic: trap"
Date: Mon, 02 Jan 2006 12:17:16 -0500
At Mon, 2 Jan 2006 16:10:03 +0000 (UTC),
Elad Efrat wrote:
>
> please try to reproduce on a -current kernel.
Unfortunately that's just not going to be easy.
The system it's happening on is running in production with 15,000 or so
users.
I do have a test system that I'll try to get a newer release kernel onto
(maybe even 3.0), but I'm not really enthused about trying -current at
all. Even if it worked it would be totally useless to me as I can't run
it in the production environment.
--
Greg A. Woods
H:+1 416 218-0098 W:+1 416 489-5852 x122 VE3TCP RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Secrets of the Weird <woods@weird.com>
From: "Greg A. Woods" <woods@planix.com>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/32429: setting MAXDSIZ > 1GB on 1.6.x alpha causes a "panic: trap"
Date: Mon, 02 Jan 2006 19:22:43 -0500
At Mon, 2 Jan 2006 17:25:09 +0000 (UTC),
I wrote:
>
> I do have a test system that I'll try to get a newer release kernel onto
> (maybe even 3.0), but I'm not really enthused about trying -current at
> all. Even if it worked it would be totally useless to me as I can't run
> it in the production environment.
Unfortunately the test system won't crash even though I've been running
"build.sh" and building packages with pkg_chk on it all afternoon.
I'm guessing this means the problem is somehow more closely related to
the networking code.
(The test system does, however, mount /usr/src, /usr/pkgsrc, and the
distfiles over NFS.)
The production server where the problem was observed runs a rather
heavily used Cyrus IMAPd, among other network services such as HTTP,
FTP, DNS, SMTP, etc. The crash happened in imapd both times, almost
immediately after the system reached multi-user mode (I was able to
log in via SSH once, but only just before it crashed).
I'll try installing Cyrus on the test box tomorrow and running as many
simultaneous connections against it as I can. Maybe the POP benchmark
from benchmarks/postal will help.
--
Greg A. Woods
H:+1 416 218-0098 W:+1 416 489-5852 x122 VE3TCP RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com> Secrets of the Weird <woods@weird.com>
Responsible-Changed-From-To: kern-bug-people->thorpej
Responsible-Changed-By: thorpej@NetBSD.org
Responsible-Changed-When: Tue, 13 Oct 2020 15:54:45 +0000
Responsible-Changed-Why:
Take.
>Unformatted:
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.