NetBSD Problem Report #46096

From gson@gson.org  Sat Feb 25 16:08:52 2012
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 3952963BA17
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 25 Feb 2012 16:08:52 +0000 (UTC)
Message-Id: <20120225160850.9368475E5E@guava.gson.org>
Date: Sat, 25 Feb 2012 18:08:50 +0200 (EET)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@gnats.NetBSD.org
Subject: uvmwait test case sometimes panics kernel
X-Send-Pr-Version: 3.95

>Number:         46096
>Category:       kern
>Synopsis:       uvmwait test case sometimes panics kernel
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Feb 25 16:10:01 +0000 2012
>Closed-Date:    Sat Nov 03 15:00:56 +0000 2012
>Last-Modified:  Sat Aug 23 15:10:05 +0000 2014
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current, source date 2012.02.24.19.40.49
>Organization:
>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:

There have now been multiple incidents where a kernel panic has
occurred while running the uvmwait test case of the rump/rumpkern/t_vm
test, for example:

  http://releng.netbsd.org/b5reports/i386/build/2012.01.30.12.19.45/test.log
  http://www.gson.org/netbsd/bugs/build/build/2012.02.06.17.51.47/test.log
  http://www.gson.org/netbsd/bugs/build.i386-debug/build/2012.02.24.19.40.49/test.log

The uvmwait test case has been consistently failing since the vmem
commits of January 29, which would be worthy of a PR in itself, but
this PR is specifically about the kernel panics, not the test failures.

Tracking down the problem should be easier than usual, because the
latest failure occurred on a test system that was built with full
debug symbols (using "build.sh -V MKDEBUG=yes -V DBG=-g"), installed
with full source, and run under a new test fixture that automatically
archived a full disk image of the failed system, including the kernel
crash dump.

This disk image is available for downloading at:

   http://www.gson.org/netbsd/bugs/i386-debug-2012.02.24.19.40.49.img.gz

The compressed image is 832 MB in size and decompresses to 4 GB.

To debug the problem while enjoying the comforts of source-level
debugging, download and gunzip the disk image, and then boot it with

  qemu -snapshot -nographic -hda i386-debug-2012.02.24.19.40.49.img

Note that you don't need to be running the i386 port, or even NetBSD,
to do this.

Log in as root (there is no password).  To help gdb find the kernel
sources, type:

  mkdir -p /tmp/bracket/build/2012.02.24.19.40.49-i386-debug
  ln -s /usr/src /tmp/bracket/build/2012.02.24.19.40.49-i386-debug/src

Then type:

  cd /var/crash
  gunzip netbsd*
  gdb /netbsd
  target kvm netbsd.0.core
  where

>How-To-Repeat:

Run the ATF tests enough times.  But there should be no need to
reproduce the problem since an exceptionally complete set of evidence
was collected from the latest crime scene.
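
If reproduction does turn out to be necessary, "enough times" can be
automated with a small loop.  The following is only a sketch: the
function name repeat_runs is hypothetical, and the atf-run invocation
in the closing comment assumes the tests are installed under
/usr/tests, which this report does not state.  A kernel panic ends the
loop by itself, so the function only counts ordinary non-zero exits.

```shell
#!/bin/sh
# repeat_runs COUNT CMD [ARGS...]
# Hypothetical helper: runs CMD COUNT times, discarding its output, and
# reports how many runs exited with a non-zero status.
repeat_runs() {
    count=$1
    shift
    i=1
    failures=0
    while [ "$i" -le "$count" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures of $count runs exited non-zero"
}

# Harmless demonstration with a command that always succeeds.
repeat_runs 3 true

# Assumed usage on an installed system (not executed here):
#   cd /usr/tests/rump/rumpkern && repeat_runs 100 atf-run t_vm
```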

>Fix:

>Release-Note:

>Audit-Trail:
From: Lars Heidieker <lars@heidieker.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46096: uvmwait test case sometimes panics kernel
Date: Wed, 29 Feb 2012 17:50:52 +0100

 The address in v is fine, but how did pr_itemoffset in the pool struct
 change to 1?

 (gdb)
 #4  0xc099ec2c in pool_get (pp=0xc0f805c0, flags=2)
      at /tmp/bracket/build/2012.02.24.19.40.49-i386-debug/src/sys/kern/subr_pool.c:1113
 1113		KASSERT((((vaddr_t)v + pp->pr_itemoffset) & (pp->pr_align - 1)) == 0);
 (gdb) list
 1108			 * a caller's assumptions about interrupt protection, etc.
 1109			 */
 1110		}
 1111	
 1112		mutex_exit(&pp->pr_lock);
 1113		KASSERT((((vaddr_t)v + pp->pr_itemoffset) & (pp->pr_align - 1)) == 0);
 1114		FREECHECK_OUT(&pp->pr_freecheck, v);
 1115		return (v);
 1116	}
 1117	
 (gdb) print v
 $1 = (void *) 0xc1781000
 (gdb) print *pp
 $2 = {pr_poollist = {tqe_next = 0xc0f80800, tqe_prev = 0xc0f80c80},
    pr_emptypages = {lh_first = 0x0}, pr_fullpages = {lh_first = 0xc14b8414},
    pr_partpages = {lh_first = 0xc14b84ec}, pr_curpage = 0xc14b84ec,
    pr_phpool = 0xc0f7849c, pr_cache = 0xc0f805c0, pr_size = 4096,
    pr_align = 4096, pr_itemoffset = 1, pr_minitems = 0, pr_minpages = 0,
    pr_maxpages = 4294967295, pr_npages = 64, pr_itemsperpage = 16,
    pr_slack = 0, pr_nitems = 181, pr_nout = 843, pr_hardlimit = 4294967295,
    pr_refcnt = 0, pr_alloc = 0xc0f79c5c, pr_alloc_list = {tqe_next = 0x0,
      tqe_prev = 0xc0f80858}, pr_drain_hook = 0, pr_drain_hook_arg = 0x0,
    pr_wchan = 0xc0f79c88 "kva-4096", pr_flags = 0, pr_roflags = 3584,
    pr_lock = {u = {mtxa_owner = 1537}}, pr_cv = {cv_opaque = {0x0, 0xc0f80638,
        0xc0f79c88}}, pr_ipl = 6, pr_phtree = {sph_root = 0xc14b8798},
    pr_maxcolor = 0, pr_curcolor = 0, pr_phoffset = 0,
    pr_hardlimit_warning = 0x0, pr_hardlimit_ratecap = {tv_sec = 0,
      tv_usec = 0}, pr_hardlimit_warning_last = {tv_sec = 0, tv_usec = 0},
    pr_nget = 2651, pr_nfail = 0, pr_nput = 1808, pr_npagealloc = 73,
    pr_npagefree = 9, pr_hiwat = 68, pr_nidle = 0, pr_log = 0x0,
    pr_curlogentry = 0, pr_logsize = 0, pr_entered_file = 0x0,
    pr_entered_line = 0, pr_reclaimerentry = {ce_q = {tqe_next = 0x0,
        tqe_prev = 0x0}, ce_func = 0, ce_obj = 0x0}, pr_freecheck = 0x0,
    pr_qcache = 0xc0f79c80}
 (gdb)
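
The failing assertion can be checked by hand from the values printed
above.  The sketch below evaluates the KASSERT expression
((vaddr_t)v + pp->pr_itemoffset) & (pp->pr_align - 1) with
v = 0xc1781000, pr_itemoffset = 1 and pr_align = 4096.  The function
name misalignment is hypothetical, and the second call uses an item
offset of 0 purely for comparison; that value is an assumption, not
something read from the crash dump.

```shell
#!/bin/sh
# misalignment ADDR ITEMOFFSET ALIGN
# Prints (ADDR + ITEMOFFSET) & (ALIGN - 1), the value the KASSERT in
# pool_get() requires to be 0.  ALIGN is assumed to be a power of two.
misalignment() {
    echo $(( ($1 + $2) & ($3 - 1) ))
}

# Values from the gdb session: the result is non-zero, so the assertion fires.
misalignment 0xc1781000 1 4096    # prints 1

# Same address with an assumed item offset of 0: the assertion would hold.
misalignment 0xc1781000 0 4096    # prints 0
```

So the page-aligned address in v is consistent with the pool's
alignment, and the non-zero pr_itemoffset alone is enough to trip the
assertion.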


 -- 
 ------------------------------------

 Mystical explanations:
 Mystical explanations are considered deep;
 the truth is that they are not even superficial.

     -- Friedrich Nietzsche
     [ Die Fröhliche Wissenschaft (The Gay Science), Book 3, 126 ]

State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Sat, 03 Nov 2012 15:00:56 +0000
State-Changed-Why:
No occurrences of 'uvmwait: uvm_fault' in the babylon5 i386 test logs
since 2012.03.09.08.03.53, so presumably this is fixed, perhaps by
src/sys/netinet/rfc6056.c 1.5.


From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/46096 CVS commit: src
Date: Sat, 23 Aug 2014 11:05:41 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Sat Aug 23 15:05:41 UTC 2014

 Modified Files:
 	src/tests/usr.bin/make: t_make.sh
 	src/usr.bin/make: compat.c lst.h main.c make.1 make.c make.h nonints.h
 	    parse.c suff.c targ.c
 	src/usr.bin/make/lst.lib: lstInt.h lstRemove.c
 Added Files:
 	src/usr.bin/make/unit-tests: impsrc.exp impsrc.mk suffixes.exp
 	    suffixes.mk

 Log Message:
 PR/46096: Jarmo Jaakkola: fix many problems with dependencies (PR 49086)

 Quite extensive rewrite of the Suff module.  Some ripple effects into
 Parse and Targ modules too.

 Dependency searches in general were made to honor explicit rules so
 implicit and explicit sources are no longer applied on targets that
 do not invoke a transformation rule.

 Archive member dependency search was rewritten.  Explicit rules now
 work properly and $(.TARGET) is set correctly.  POSIX semantics for
 lib(member.o) and .s1.a rules are supported.

 .SUFFIXES list maintenance was rewritten so that scanning of existing
 rules works when suffixes are added and that clearing the suffix list
 removes single suffix rules too.  Transformation rule nodes are now
 mixed with regular nodes so they are available as regular targets too
 if needed (especially after the known suffixes are cleared).

 The .NULL target was documented in the manual page, especially to
 warn against using it when a single suffix rule would work.
 A deprecation warning was also added to the manual and make also
 warns the user if it encounters .NULL.

 Search for suffix rules no longer allows the explicit dependencies
 to override the selected transformation rule.  A check is made in
 the search that the transformation that would be tried does not
 already exist in the chain.  This prevents getting stuck in an infinite
 loop under specific circumstances.  Local variables are now set
 before a node's children are expanded so dynamic sources work in
 multi-stage transformations.  Make_HandleUse() no longer expands
 the added children for transformation nodes, preventing triple
 expansion and allowing the Suff module to properly postpone their
 expansion until proper values are set for the local variables.

 Directory prefix is no longer removed from $(.PREFIX) if the target
 is found via directory search.

 The last rule defined is now used instead of the first one (POSIX
 requirement) in case a rule is defined multiple times.  Everything
 defined in the first instance is undone, but things added "globally"
 are honored.  To implement this, each node tracks attribute bits
 which have been set by special targets (global) instead of special
 sources (local).  They also track dependencies that were added by
 a rule with commands (local) instead of a rule with no commands (global).

 A new attribute, OP_FROM_SYS_MK, is introduced.  It is set on all targets
 found in system makefiles so that they are not eligible to become
 the main target.  We cannot just set OP_NOTMAIN because it is one of
 the attributes inherited from transformation and .USE rules and would
 make any eligible target that uses a built-in inference rule ineligible.

 The $(.IMPSRC) local variable now works like in gmake: it is set to
 the first prerequisite for explicit rules.  For implicit rules it
 is still the implied source.

 The manual page is improved regarding the fixed features.  Test cases
 for the fixed problems are added.

 Other improvements in the Suff module include:
   - better debug messages for transformation rule search (length of
     the chain is now visualized by indentation)
   - Suff structures are created, destroyed and moved around by a set
     of maintenance functions so their reference counts are easier
     to track (this also gets rid of a lot of code duplication)
   - some unreasonably long functions were split into smaller ones
   - many local variables had their names changed to describe their
     purpose instead of their type


 To generate a diff of this commit:
 cvs rdiff -u -r1.3 -r1.4 src/tests/usr.bin/make/t_make.sh
 cvs rdiff -u -r1.94 -r1.95 src/usr.bin/make/compat.c
 cvs rdiff -u -r1.18 -r1.19 src/usr.bin/make/lst.h
 cvs rdiff -u -r1.228 -r1.229 src/usr.bin/make/main.c
 cvs rdiff -u -r1.232 -r1.233 src/usr.bin/make/make.1
 cvs rdiff -u -r1.88 -r1.89 src/usr.bin/make/make.c
 cvs rdiff -u -r1.93 -r1.94 src/usr.bin/make/make.h
 cvs rdiff -u -r1.65 -r1.66 src/usr.bin/make/nonints.h
 cvs rdiff -u -r1.199 -r1.200 src/usr.bin/make/parse.c
 cvs rdiff -u -r1.70 -r1.71 src/usr.bin/make/suff.c
 cvs rdiff -u -r1.57 -r1.58 src/usr.bin/make/targ.c
 cvs rdiff -u -r1.20 -r1.21 src/usr.bin/make/lst.lib/lstInt.h
 cvs rdiff -u -r1.14 -r1.15 src/usr.bin/make/lst.lib/lstRemove.c
 cvs rdiff -u -r0 -r1.1 src/usr.bin/make/unit-tests/impsrc.exp \
     src/usr.bin/make/unit-tests/impsrc.mk \
     src/usr.bin/make/unit-tests/suffixes.exp \
     src/usr.bin/make/unit-tests/suffixes.mk

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:

$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.