NetBSD Problem Report #7897

Received: (qmail 7188 invoked from network); 3 Jul 1999 08:42:15 -0000
Message-Id: <199907030842.BAA09701@nooksack.ldc.cs.wwu.edu>
Date: Sat, 3 Jul 1999 01:42:10 -0700 (PDT)
From: cgd@netbsd.org
Reply-To: cgd@netbsd.org
To: gnats-bugs@gnats.netbsd.org
Subject: 1.4-branch kernel panic in amap_copy()
X-Send-Pr-Version: www-1.0

>Number:         7897
>Category:       kern
>Synopsis:       1.4-branch kernel panic in amap_copy()
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    mrg
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jul 03 01:50:01 +0000 1999
>Closed-Date:    Sun Apr 05 21:13:51 +0000 2009
>Last-Modified:  Sun Apr 05 21:13:51 +0000 2009
>Originator:     Chris Demetriou
>Release:        1.4-branch as of 2-3 hours ago.
>Organization:
>Environment:
NetBSD speedy 1.4 NetBSD 1.4 (SPEEDY) #1: Sat Jul  3 00:36:11 PDT 1999     cgd@speedy:/a/src/src-1-4-branch/sys/arch/i386/compile/SPEEDY i386
userland, including X server and libraries, from the NetBSD/i386 1.4
distribution sets.  Relevant packages:
ghostscript-5.50    Aladdin Postscript interpreter.
gv-3.5.8            A PostScript and PDF previewer.

>Description:
The NetBSD 1.4-branch kernel I'm running (compiled from sources checked
out this evening) crashes in UVM's amap_copy() function.

The relevant bits of the dmesg info are:

uvm_fault(0xf4a87a54, 0x0, 0, 3) -> 1
fatal page fault in supervisor mode
trap type 6 code 2 eip f01cd016 cs 8 eflags 10202 cr2 0 cpl 0
panic: trap

That eip (PC) is in the UVM amap_copy() function.  Unfortunately, since
gdb on the i386 isn't capable of generating stack traces through trap
entries, I can't easily provide a complete stack trace.
>How-To-Repeat:
This happens fairly reliably for me (66% of the time, or so)
when I try to run 'gv' on a particular large PostScript file:

	ftp://ftp.netbsd.org/pub/NetBSD/misc/dec-docs/ec-qd2ka-te.ps.gz

After downloading it and un-gzipping it, I start up gv on it.
gv complains about something like not being able to allocate space
for backing store for the window, then the system either crashes
and dumps a core or hangs for a while with the disk activity light
on, then 'wakes up' with the X server entirely dead.  (In that case,
the X server is reported to be killed because the system has run
out of swap.)  Attempting the same test again seems to randomly yield
one of the two results.  If at first you don't succeed... 8-)

The system has 64MB of memory, and is configured to have just over 256MB
of swap.  It's possible that the X server and gv are running
out of memory+swap, but it seems unreasonable.  My display is
1600x1200x8bpp, so even if it has to allocate a _bunch_ of
pixmaps the size of the entire display, it should be OK!
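
The arithmetic behind that estimate can be sketched as follows.  This is
only a rough illustration: the function names are made up for this note,
row padding and per-pixmap overhead in the X server are ignored, and so is
memory used by the kernel and other processes.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for one uncompressed pixmap (row padding ignored). */
uint64_t
pixmap_bytes(uint64_t width, uint64_t height, uint64_t bits_per_pixel)
{
	assert(bits_per_pixel % 8 == 0);
	return width * height * (bits_per_pixel / 8);
}

/* How many pixmaps of the given size fit in total_bytes. */
uint64_t
pixmaps_that_fit(uint64_t total_bytes, uint64_t one_pixmap_bytes)
{
	assert(one_pixmap_bytes > 0);
	return total_bytes / one_pixmap_bytes;
}
```

For 1600x1200 at 8bpp, pixmap_bytes() gives 1,920,000 bytes.  Taking
64MB of memory plus 256MB of swap as 335,544,320 bytes,
pixmaps_that_fit() gives 174 full-screen pixmaps, which is a lower bound
since swap is "just over" 256MB.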

I can provide kernel crash dumps for people interested in hunting
down the problem, and since it seems I can reproduce this pretty
easily I can try to test fixes.
>Fix:
???
>Release-Note:
>Audit-Trail:

From: Chuck Silvers <chuq@chuq.com>
To: cgd@netbsd.org
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: kern/7897: 1.4-branch kernel crash in amap_copy() 
Date: Sat, 03 Jul 1999 23:43:48 -0700

 I'll take a look if you'll send me a dump.

 -Chuck

From: cgd@netbsd.org (Chris G. Demetriou)
To: gnats-bugs@gnats.netbsd.org
Cc: 
Subject: Re: kern/7897: 1.4-branch kernel crash in amap_copy()
Date: 05 Jul 1999 19:41:27 -0700

 This problem consisted of several serious bugs:

 1. some problem in gv/ghostscript.  (I put a problem here because
 starting up gv with the PS file named on the command line fails, but
 opening it works fine, or so it seems.)  This in turn causes
 the X server to make some really huge (1GB, 512MB) memory allocation
 attempts.

 2. the lack of mmap()'d memory vs. datasize checking, which prevented
 the ridiculous allocations from being rejected immediately.

 3. a bug in the allocation failure handling in amap_alloc1(), which
 caused success to be returned incorrectly in many failure cases.

 4. unidentified problems that cause the system to take 5+ (often 10+)
 minutes to recover after killing the X server... if it ever actually
 kills the X server.


 (3) has been fixed correctly.
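
The class of bug described in (3) can be illustrated with a toy
allocator.  This is not the amap_alloc1() code and not the fix that was
committed; the structure, its fields and every function name below are
hypothetical.  It only sketches the pattern a multi-step allocation
needs: each failure must undo the earlier steps and report failure, so
the caller never receives a half-built object.  (The fault address of 0
in the dmesg output is consistent with a NULL pointer dereference, but
tying it to this bug is an inference, not something the report states.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object backed by three separately allocated arrays. */
struct toy_map {
	size_t nslot;
	int *index;
	int *backptr;
	void **items;
};

/* Test hooks: fail the Nth allocation (0 = never), count live blocks. */
static int fail_on_call, calls, live;

void toy_reset(int fail_at) { fail_on_call = fail_at; calls = 0; live = 0; }
int toy_live(void) { return live; }

static void *
toy_alloc(size_t n)
{
	void *p;

	if (++calls == fail_on_call)
		return NULL;		/* simulated out-of-memory */
	p = malloc(n);
	if (p != NULL)
		live++;
	return p;
}

static void
toy_free(void *p)
{
	if (p != NULL) {
		live--;
		free(p);
	}
}

/* Returns a fully built map, or NULL with nothing left allocated. */
struct toy_map *
toy_map_alloc(size_t nslot)
{
	struct toy_map *m;

	assert(nslot > 0);
	m = toy_alloc(sizeof(*m));
	if (m == NULL)
		return NULL;
	m->nslot = nslot;
	m->index = toy_alloc(nslot * sizeof(*m->index));
	if (m->index == NULL)
		goto fail1;
	m->backptr = toy_alloc(nslot * sizeof(*m->backptr));
	if (m->backptr == NULL)
		goto fail2;
	m->items = toy_alloc(nslot * sizeof(*m->items));
	if (m->items == NULL)
		goto fail3;
	return m;

fail3:	toy_free(m->backptr);
fail2:	toy_free(m->index);
fail1:	toy_free(m);
	return NULL;			/* never report success on failure */
}

void
toy_map_free(struct toy_map *m)
{
	if (m == NULL)
		return;
	toy_free(m->items);
	toy_free(m->backptr);
	toy_free(m->index);
	toy_free(m);
}
```

Calling toy_reset(3) before toy_map_alloc() makes the third allocation
fail; the function then returns NULL and toy_live() reports that no
blocks were leaked.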

 A highly bogus but effective fix (effective against this particular
 type of allocation, not against a malicious hacker) has been implemented
 for (2).  The only correct fix for this is proper accounting of mmap'd
 memory.

 The other problems have not been investigated.


 cgd
 -- 
 Chris Demetriou - cgd@netbsd.org -
 http://www.netbsd.org/People/Pages/cgd.html Disclaimer: Not speaking
 for NetBSD, just expressing my own opinion.
State-Changed-From-To: open->feedback 
State-Changed-By: thorpej 
State-Changed-When: Mon Aug 2 21:36:56 PDT 1999 
State-Changed-Why:  
Is this fixed?  I recall a commit related to this. 

From: cgd@netbsd.org (Chris G. Demetriou)
To: thorpej@netbsd.org
Cc: kern-bug-people@netbsd.org
Subject: Re: kern/7897
Date: 02 Aug 1999 23:51:43 -0700

 thorpej@netbsd.org writes:
 > Is this fixed?  I recall a commit related to this.

 see the note in the PR about what has and hasn't been done.

 The only modification to that is that I think CHS thought about (4),
 but there was no solution for it.  Hangs of over 10 minutes weren't
 atypical when the problem was triggered.


 cgd
 -- 
 Chris Demetriou - cgd@netbsd.org - http://www.netbsd.org/People/Pages/cgd.html
 Disclaimer: Not speaking for NetBSD, just expressing my own opinion.
State-Changed-From-To: feedback->analyzed 
State-Changed-By: fair 
State-Changed-When: Sun Apr 23 01:10:52 PDT 2000 
State-Changed-Why:  
Feedback provided, problem not completely resolved yet. 

From: Manuel Bouyer <bouyer@antioche.lip6.fr>
To: chs@netbsd.org
Cc:  
Subject: kern/7897
Date: Fri, 2 Aug 2002 15:22:50 +0200

 Hi,
 while browsing the PR summaries, I found this one:
 http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=7897

 Isn't mmap'ed memory accounted as datasize now?  If so, I guess this PR
 can be closed.

 --
 Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyer@lip6.fr
 --

From: Manuel Bouyer <bouyer@antioche.lip6.fr>
To: Chuck Silvers <chuq@chuq.com>
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: kern/7897
Date: Fri, 2 Aug 2002 17:19:19 +0200

 On Fri, Aug 02, 2002 at 07:50:23AM -0700, Chuck Silvers wrote:
 > nope, there's only a check in the mmap() path that any one allocation
 > request won't exceed your remaining data limit all by itself.
 > but existing mmap() allocations are not included in this calculation.

 OK, so this PR is likely not solved yet.
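
The difference between the check Chuck describes and full accounting can
be sketched with a toy model.  Everything here is an assumption made for
illustration (the structure, the field names, the two policies); it is
not the kernel's mmap() path.  The per-request policy compares each
request against the remaining data limit on its own, while the
cumulative policy also counts bytes already mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-process accounting state. */
struct toy_proc {
	uint64_t data_limit;	/* datasize resource limit */
	uint64_t data_used;	/* current data segment size */
	uint64_t mmap_used;	/* bytes currently mmap()ed */
};

/* Per-request check: existing mmap() usage is not considered. */
bool
per_request_ok(const struct toy_proc *p, uint64_t len)
{
	assert(p->data_used <= p->data_limit);
	return len <= p->data_limit - p->data_used;
}

/* Cumulative check: existing mmap() usage counts against the limit. */
bool
cumulative_ok(const struct toy_proc *p, uint64_t len)
{
	uint64_t used = p->data_used + p->mmap_used;

	if (used > p->data_limit)
		return false;
	return len <= p->data_limit - used;
}

/* Issue up to `tries` requests of `len` bytes; count those accepted. */
unsigned
count_accepted(struct toy_proc p, uint64_t len, unsigned tries,
    bool cumulative)
{
	unsigned accepted = 0;

	for (unsigned i = 0; i < tries; i++) {
		bool ok = cumulative ?
		    cumulative_ok(&p, len) : per_request_ok(&p, len);
		if (!ok)
			break;
		p.mmap_used += len;	/* record the new mapping */
		accepted++;
	}
	return accepted;
}
```

With a 256MB limit and 16MB of data already in use, repeated 100MB
requests are all accepted under the per-request policy, however many are
made, but only two are accepted under the cumulative policy.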

 Thanks.

 --
 Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyer@lip6.fr
 --
State-Changed-From-To: analyzed->closed
State-Changed-By: dsl@netbsd.org
State-Changed-When: Mon, 11 Sep 2006 20:47:48 +0000
State-Changed-Why:
mmap has been counted for process size for a while.


State-Changed-From-To: closed->open
State-Changed-By: mrg@NetBSD.org
State-Changed-When: Sat, 21 Mar 2009 09:43:33 +0000
State-Changed-Why:
mmap allocations are not properly accounted for in any limit (yet).


Responsible-Changed-From-To: kern-bug-people->mrg
Responsible-Changed-By: mrg@NetBSD.org
Responsible-Changed-When: Mon, 23 Mar 2009 02:24:08 +0000
Responsible-Changed-Why:
i have a patch for this in testing.


From: matthew green <mrg@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/7897 CVS commit: src
Date: Sun, 29 Mar 2009 01:02:51 +0000

 Module Name:	src
 Committed By:	mrg
 Date:		Sun Mar 29 01:02:51 UTC 2009

 Modified Files:
 	src/bin/csh: csh.1 func.c
 	src/bin/ps: print.c ps.c
 	src/bin/sh: miscbltin.c sh.1
 	src/external/bsd/top/dist/machine: m_netbsd.c
 	src/lib/libkvm: kvm_proc.c
 	src/sys/arch/mips/mips: cpu_exec.c
 	src/sys/compat/darwin: darwin_exec.c
 	src/sys/compat/ibcs2: ibcs2_exec.c
 	src/sys/compat/irix: irix_resource.c
 	src/sys/compat/linux/arch/amd64: linux_exec_machdep.c
 	src/sys/compat/linux/arch/i386: linux_exec_machdep.c
 	src/sys/compat/linux/common: linux_limit.h
 	src/sys/compat/osf1: osf1_resource.c
 	src/sys/compat/svr4: svr4_resource.c
 	src/sys/compat/svr4_32: svr4_32_resource.c
 	src/sys/kern: exec_subr.c init_sysctl.c kern_exec.c kern_resource.c
 	src/sys/sys: resource.h sysctl.h
 	src/sys/uvm: uvm_extern.h uvm_glue.c uvm_mmap.c
 	src/usr.bin/systat: ps.c

 Log Message:
 - add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
 address space available to processes.  this limit exists in most other
 modern unix variants, and like most of them, our defaults are unlimited.
 remove the old mmap / rlimit.datasize hack.

 - adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
 it is currently unused, but was added a few years ago.

 - add a pair of new process size values to kinfo_proc2{}. one is the
 total size of the process memory map, and the other is the total size
 adjusted for unused stack space (since most processes have a lot of
 this...)

 - patch sh and csh to notice RLIMIT_AS.  (in some cases, the alias
 RLIMIT_VMEM was already present and used if available.)

 - patch ps, top and systat to notice the new k_vm_vsize member of
 kinfo_proc2{}.

 - update irix, svr4, svr4_32, linux and osf1 emulations to support
 this information.  (freebsd could be done, but it's best left
 as part of the full-update of compat/freebsd.)

 this addresses PR 7897.  it also gives correct memory usage values,
 which have never been entirely correct (since mmap), and have been
 very incorrect since jemalloc() was enabled.

 tested on i386 and sparc64, build tested on several other platforms.

 thanks to many folks for feedback and testing but most especially
 chuq and yamt for critical suggestions that led to this patch not
 having a special ugliness i wasn't happy with anyway :-)
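
From userland, the effect of an address space limit can be observed with
getrlimit(2), setrlimit(2) and mmap(2).  The following is a minimal
sketch, assuming a system that defines RLIMIT_AS; the helper name and its
return convention are invented for this note and are not part of the
commit.  It lowers only the soft limit, attempts one anonymous mapping,
and restores the previous limit.

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/*
 * Hypothetical helper: returns 1 if mmap() was refused with ENOMEM
 * while the soft RLIMIT_AS was soft_limit, 0 otherwise, and -1 if the
 * limit could not be read, set or restored.
 */
int
mmap_refused_under_as_limit(rlim_t soft_limit, size_t request)
{
	struct rlimit old, tmp;
	void *p;
	int refused;

	assert(request > 0);
	if (getrlimit(RLIMIT_AS, &old) == -1)
		return -1;
	tmp = old;
	tmp.rlim_cur = soft_limit;	/* hard limit is left untouched */
	if (setrlimit(RLIMIT_AS, &tmp) == -1)
		return -1;

	p = mmap(NULL, request, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	refused = (p == MAP_FAILED && errno == ENOMEM);
	if (p != MAP_FAILED)
		munmap(p, request);

	if (setrlimit(RLIMIT_AS, &old) == -1)	/* restore soft limit */
		return -1;
	return refused;
}
```

For example, a 512MB request made under a 256MB soft limit would be
expected to be refused.  Because only the soft limit is changed, an
unprivileged process can raise it back afterwards.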


 To generate a diff of this commit:
 cvs rdiff -u -r1.45 -r1.46 src/bin/csh/csh.1
 cvs rdiff -u -r1.36 -r1.37 src/bin/csh/func.c
 cvs rdiff -u -r1.110 -r1.111 src/bin/ps/print.c
 cvs rdiff -u -r1.73 -r1.74 src/bin/ps/ps.c
 cvs rdiff -u -r1.37 -r1.38 src/bin/sh/miscbltin.c
 cvs rdiff -u -r1.91 -r1.92 src/bin/sh/sh.1
 cvs rdiff -u -r1.6 -r1.7 src/external/bsd/top/dist/machine/m_netbsd.c
 cvs rdiff -u -r1.81 -r1.82 src/lib/libkvm/kvm_proc.c
 cvs rdiff -u -r1.54 -r1.55 src/sys/arch/mips/mips/cpu_exec.c
 cvs rdiff -u -r1.56 -r1.57 src/sys/compat/darwin/darwin_exec.c
 cvs rdiff -u -r1.72 -r1.73 src/sys/compat/ibcs2/ibcs2_exec.c
 cvs rdiff -u -r1.14 -r1.15 src/sys/compat/irix/irix_resource.c
 cvs rdiff -u -r1.15 -r1.16 \
     src/sys/compat/linux/arch/amd64/linux_exec_machdep.c
 cvs rdiff -u -r1.11 -r1.12 \
     src/sys/compat/linux/arch/i386/linux_exec_machdep.c
 cvs rdiff -u -r1.4 -r1.5 src/sys/compat/linux/common/linux_limit.h
 cvs rdiff -u -r1.13 -r1.14 src/sys/compat/osf1/osf1_resource.c
 cvs rdiff -u -r1.17 -r1.18 src/sys/compat/svr4/svr4_resource.c
 cvs rdiff -u -r1.16 -r1.17 src/sys/compat/svr4_32/svr4_32_resource.c
 cvs rdiff -u -r1.61 -r1.62 src/sys/kern/exec_subr.c
 cvs rdiff -u -r1.159 -r1.160 src/sys/kern/init_sysctl.c
 cvs rdiff -u -r1.287 -r1.288 src/sys/kern/kern_exec.c
 cvs rdiff -u -r1.150 -r1.151 src/sys/kern/kern_resource.c
 cvs rdiff -u -r1.30 -r1.31 src/sys/sys/resource.h
 cvs rdiff -u -r1.183 -r1.184 src/sys/sys/sysctl.h
 cvs rdiff -u -r1.152 -r1.153 src/sys/uvm/uvm_extern.h
 cvs rdiff -u -r1.135 -r1.136 src/sys/uvm/uvm_glue.c
 cvs rdiff -u -r1.127 -r1.128 src/sys/uvm/uvm_mmap.c
 cvs rdiff -u -r1.31 -r1.32 src/usr.bin/systat/ps.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/7897 CVS commit: [netbsd-5] src
Date: Wed, 1 Apr 2009 00:25:24 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Wed Apr  1 00:25:23 UTC 2009

 Modified Files:
 	src/bin/csh [netbsd-5]: csh.1 func.c
 	src/bin/ps [netbsd-5]: print.c ps.c
 	src/bin/sh [netbsd-5]: miscbltin.c sh.1
 	src/external/bsd/top/dist/machine [netbsd-5]: m_netbsd.c
 	src/lib/libkvm [netbsd-5]: kvm_proc.c
 	src/sys/arch/mips/mips [netbsd-5]: cpu_exec.c
 	src/sys/compat/darwin [netbsd-5]: darwin_exec.c
 	src/sys/compat/ibcs2 [netbsd-5]: ibcs2_exec.c
 	src/sys/compat/irix [netbsd-5]: irix_resource.c
 	src/sys/compat/linux/arch/amd64 [netbsd-5]: linux_exec_machdep.c
 	src/sys/compat/linux/arch/i386 [netbsd-5]: linux_exec_machdep.c
 	src/sys/compat/linux/common [netbsd-5]: linux_limit.h
 	src/sys/compat/osf1 [netbsd-5]: osf1_resource.c
 	src/sys/compat/svr4 [netbsd-5]: svr4_resource.c
 	src/sys/compat/svr4_32 [netbsd-5]: svr4_32_resource.c
 	src/sys/kern [netbsd-5]: exec_subr.c init_sysctl.c kern_exec.c
 	    kern_resource.c
 	src/sys/sys [netbsd-5]: param.h resource.h sysctl.h
 	src/sys/uvm [netbsd-5]: uvm_extern.h uvm_glue.c uvm_mmap.c
 	src/usr.bin/systat [netbsd-5]: ps.c

 Log Message:
 Pull up following revision(s) (requested by mrg in ticket #622):
 	bin/csh/csh.1: revision 1.46
 	bin/csh/func.c: revision 1.37
 	bin/ps/print.c: revision 1.111
 	bin/ps/ps.c: revision 1.74
 	bin/sh/miscbltin.c: revision 1.38
 	bin/sh/sh.1: revision 1.92 via patch
 	external/bsd/top/dist/machine/m_netbsd.c: revision 1.7
 	lib/libkvm/kvm_proc.c: revision 1.82
 	sys/arch/mips/mips/cpu_exec.c: revision 1.55
 	sys/compat/darwin/darwin_exec.c: revision 1.57
 	sys/compat/ibcs2/ibcs2_exec.c: revision 1.73
 	sys/compat/irix/irix_resource.c: revision 1.15
 	sys/compat/linux/arch/amd64/linux_exec_machdep.c: revision 1.16
 	sys/compat/linux/arch/i386/linux_exec_machdep.c: revision 1.12
 	sys/compat/linux/common/linux_limit.h: revision 1.5
 	sys/compat/osf1/osf1_resource.c: revision 1.14
 	sys/compat/svr4/svr4_resource.c: revision 1.18
 	sys/compat/svr4_32/svr4_32_resource.c: revision 1.17
 	sys/kern/exec_subr.c: revision 1.62
 	sys/kern/init_sysctl.c: revision 1.160
 	sys/kern/kern_exec.c: revision 1.288
 	sys/kern/kern_resource.c: revision 1.151
 	sys/sys/param.h: patch
 	sys/sys/resource.h: revision 1.31
 	sys/sys/sysctl.h: revision 1.184
 	sys/uvm/uvm_extern.h: revision 1.153
 	sys/uvm/uvm_glue.c: revision 1.136
 	sys/uvm/uvm_mmap.c: revision 1.128
 	usr.bin/systat/ps.c: revision 1.32
 - add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
 address space available to processes.  this limit exists in most other
 modern unix variants, and like most of them, our defaults are unlimited.
 remove the old mmap / rlimit.datasize hack.
 - adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
 it is currently unused, but was added a few years ago.
 - add a pair of new process size values to kinfo_proc2{}. one is the
 total size of the process memory map, and the other is the total size
 adjusted for unused stack space (since most processes have a lot of
 this...)
 - patch sh and csh to notice RLIMIT_AS.  (in some cases, the alias
 RLIMIT_VMEM was already present and used if available.)
 - patch ps, top and systat to notice the new k_vm_vsize member of
 kinfo_proc2{}.
 - update irix, svr4, svr4_32, linux and osf1 emulations to support
 this information.  (freebsd could be done, but it's best left
 as part of the full-update of compat/freebsd.)
 this addresses PR 7897.  it also gives correct memory usage values,
 which have never been entirely correct (since mmap), and have been
 very incorrect since jemalloc() was enabled.
 tested on i386 and sparc64, build tested on several other platforms.
 thanks to many folks for feedback and testing but most especially
 chuq and yamt for critical suggestions that led to this patch not
 having a special ugliness i wasn't happy with anyway :-)


 To generate a diff of this commit:
 cvs rdiff -u -r1.43 -r1.43.32.1 src/bin/csh/csh.1
 cvs rdiff -u -r1.36 -r1.36.12.1 src/bin/csh/func.c
 cvs rdiff -u -r1.106 -r1.106.2.1 src/bin/ps/print.c
 cvs rdiff -u -r1.71 -r1.71.4.1 src/bin/ps/ps.c
 cvs rdiff -u -r1.36 -r1.36.26.1 src/bin/sh/miscbltin.c
 cvs rdiff -u -r1.87 -r1.87.18.1 src/bin/sh/sh.1
 cvs rdiff -u -r1.5 -r1.5.8.1 src/external/bsd/top/dist/machine/m_netbsd.c
 cvs rdiff -u -r1.78.6.1 -r1.78.6.2 src/lib/libkvm/kvm_proc.c
 cvs rdiff -u -r1.50 -r1.50.54.1 src/sys/arch/mips/mips/cpu_exec.c
 cvs rdiff -u -r1.55 -r1.55.4.1 src/sys/compat/darwin/darwin_exec.c
 cvs rdiff -u -r1.71 -r1.71.4.1 src/sys/compat/ibcs2/ibcs2_exec.c
 cvs rdiff -u -r1.14 -r1.14.10.1 src/sys/compat/irix/irix_resource.c
 cvs rdiff -u -r1.13 -r1.13.2.1 \
     src/sys/compat/linux/arch/amd64/linux_exec_machdep.c
 cvs rdiff -u -r1.11 -r1.11.4.1 \
     src/sys/compat/linux/arch/i386/linux_exec_machdep.c
 cvs rdiff -u -r1.3 -r1.3.10.1 src/sys/compat/linux/common/linux_limit.h
 cvs rdiff -u -r1.13 -r1.13.12.1 src/sys/compat/osf1/osf1_resource.c
 cvs rdiff -u -r1.16 -r1.16.10.1 src/sys/compat/svr4/svr4_resource.c
 cvs rdiff -u -r1.15 -r1.15.10.1 src/sys/compat/svr4_32/svr4_32_resource.c
 cvs rdiff -u -r1.61 -r1.61.8.1 src/sys/kern/exec_subr.c
 cvs rdiff -u -r1.149.4.3 -r1.149.4.4 src/sys/kern/init_sysctl.c
 cvs rdiff -u -r1.280.4.1 -r1.280.4.2 src/sys/kern/kern_exec.c
 cvs rdiff -u -r1.147 -r1.147.4.1 src/sys/kern/kern_resource.c
 cvs rdiff -u -r1.330.4.5 -r1.330.4.6 src/sys/sys/param.h
 cvs rdiff -u -r1.29 -r1.29.72.1 src/sys/sys/resource.h
 cvs rdiff -u -r1.177 -r1.177.4.1 src/sys/sys/sysctl.h
 cvs rdiff -u -r1.148.4.1 -r1.148.4.2 src/sys/uvm/uvm_extern.h
 cvs rdiff -u -r1.133 -r1.133.6.1 src/sys/uvm/uvm_glue.c
 cvs rdiff -u -r1.126 -r1.126.8.1 src/sys/uvm/uvm_mmap.c
 cvs rdiff -u -r1.30 -r1.30.18.1 src/usr.bin/systat/ps.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: mrg@NetBSD.org
State-Changed-When: Sun, 05 Apr 2009 21:13:51 +0000
State-Changed-Why:
this is now fixed.


>Unformatted:
