NetBSD Problem Report #44466

From Wolfgang.Stukenbrock@nagler-company.com  Wed Jan 26 12:32:57 2011
Return-Path: <Wolfgang.Stukenbrock@nagler-company.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 5B2E763B873
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 26 Jan 2011 12:32:57 +0000 (UTC)
Message-Id: <20110126123249.479AB1E80CE@test-s0.nagler-company.com>
Date: Wed, 26 Jan 2011 13:32:49 +0100 (CET)
From: Wolfgang.Stukenbrock@nagler-company.com
Reply-To: Wolfgang.Stukenbrock@nagler-company.com
To: gnats-bugs@gnats.NetBSD.org
Subject: savecore tries to save NULL kernel -> clear of core-flag fails
X-Send-Pr-Version: 3.95

>Number:         44466
>Category:       bin
>Synopsis:       savecore tries to save NULL kernel -> clear of core-flag fails
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jan 26 12:35:00 +0000 2011
>Originator:     Dr. Wolfgang Stukenbrock
>Release:        NetBSD 5.1
>Organization:
Dr. Nagler & Company GmbH
>Environment:


System: NetBSD test-s0 4.0 NetBSD 4.0 (NSW-WS) #0: Tue Aug 17 17:28:09 CEST 2010 wgstuken@test-s0:/usr/src/sys/arch/amd64/compile/NSW-WS amd64
Architecture: x86_64
Machine: amd64
>Description:
	Due to earlier changes to savecore, the variable kernel can end up set to NULL.
	That is fine for kvm_openfiles() etc., but not for the stat() and open() syscalls.
	Output of ktruss:

   ..... lots of lines deleted - including the stat(NULL, ...) call ....
   437      1 savecore write(0x5, 0x7f7ffd46b000, 0x319d) = 12701
       "\^]\^?T^\M^W\^W\M-_\M-;\M-w\M-i\M-E\M-O\M-c3\M-?\M-l\M-R\M-O\M-w\M-x\M->\M-8{\M-~x\M-~M\M-CS\M-N\M^?zy\M-Q\M-=\M-;\M^O"
   437      1 savecore close(0x5)                  = 0
   437      1 savecore gettimeofday(0x7f7fffffc8e0, 0) = 0
   437      1 savecore writev(0x2, 0x7f7fffffc980, 0x2) = 63
       "savecore: writing compressed kernel to /var/crash/netbsd.13.gz\n"
   437      1 savecore fcntl(0x3, 0x3, 0)          = 2
   437      1 savecore sendto(0x3, 0x7f7fffffc9b0, 0x52, 0, 0, 0) = 82
       "<29>Jan 26 12:52:27 savecore: writing compressed kernel to /var/crash/netbsd.13.gz"
   437      1 savecore open("/var/crash/netbsd.13.gz", 0x601, 0x1b6) = 5
   437      1 savecore __fstat30(0x5, 0x7f7fffffca10) = 0
   437      1 savecore open(0, 0, 0)               Err#14 EFAULT
   437      1 savecore gettimeofday(0x7f7fffffc8d0, 0) = 0
   437      1 savecore issetugid()                 = 0
   437      1 savecore issetugid()                 = 0
   437      1 savecore open("/usr/share/nls/nls.alias.db", 0, 0) Err#2 ENOENT
   437      1 savecore open("/usr/share/nls/nls.alias", 0, 0) = 6
   437      1 savecore fcntl(0x6, 0x2, 0x1)        = 0
   437      1 savecore __fstat30(0x6, 0x7f7fffffbb20) = 0
   437      1 savecore mmap(0, 0x5f0, 0x1, 0x2, 0x6, 0, 0) = 0x7f7ffdff5000
   437      1 savecore close(0x6)                  = 0
   437      1 savecore munmap(0x7f7ffdff5000, 0x5f0) = 0
   437      1 savecore open("/usr/share/nls/C/libc.cat", 0, 0x7f7ffd5e0a01) = 6
   437      1 savecore __fstat30(0x6, 0x7f7fffffbfc0) = 0
   437      1 savecore mmap(0, 0x10be, 0x1, 0x1, 0x6, 0, 0) = 0x7f7ffdff4000
   437      1 savecore close(0x6)                  = 0
   437      1 savecore munmap(0x7f7ffdff4000, 0x10be) = 0
   437      1 savecore writev(0x2, 0x7f7fffffc970, 0x2) = 30
       "savecore: (null): Bad address\n"
   437      1 savecore fcntl(0x3, 0x3, 0)          = 2
   437      1 savecore sendto(0x3, 0x7f7fffffc9a0, 0x31, 0, 0, 0) = 49
       "<27>Jan 26 12:52:27 savecore: (null): Bad address"
   437      1 savecore write(0x5, 0x7f7ffd46b000, 0xa) = 10
       "\^_\M^K\b\0\0\0\0\0\0\^C"
   437      1 savecore exit(0x1)

	The really bad part is that the helper function Open() calls exit(1) when the
	open() fails. So no kernel is saved (unfortunate, but tolerable) and the core-present flag is not cleared!
	As a result, a core is saved on every boot from then on, until "savecore -c" is called ...

	The following patch removes the NULL calls to stat() and open().
	As a side effect, a kernel is now saved only if -N is given on the command line.
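
	The space estimate changes along the same lines. A minimal sketch, assuming
	a hypothetical helper estimate_kernel_size() (savecore does this inline) and
	a fallback definition of S_BLKSIZE for systems that lack it: no kernel name
	means no space is reserved for a kernel copy, and the 20 MB default applies
	only when a name is given but stat() fails.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <stddef.h>

#ifndef S_BLKSIZE
#define S_BLKSIZE 512	/* assumption: fallback where the header lacks it */
#endif

#define DEFAULT_KERNEL_SIZE (20LL * 1024 * 1024)

/* Hypothetical helper; returns the number of bytes to reserve. */
static long long
estimate_kernel_size(const char *kernel)
{
	struct stat st;

	if (kernel == NULL)
		return 0;		/* no kernel will be copied */
	if (stat(kernel, &st) == 0)
		return (long long)st.st_blocks * S_BLKSIZE;
	return DEFAULT_KERNEL_SIZE;	/* name given, but stat() failed */
}
```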

	This PR is related to the still-open PRs 41310, 41441 and 41583.

	- PR 41310: describes a workaround, namely adding the -N option on the command line
	- PR 41441: another way to fix this problem: force kernel to be initialized as described in the manual for savecore.
		    This one conflicts with passing NULL to kvm_openfiles() if no -N is given.
	- PR 41583: the problem there seems to be that the flag was not cleared after an open failure in a previous run.

>How-To-Repeat:
	Force the system to write a 
>Fix:
	Apply the following patch to /usr/src/sbin/savecore/savecore.c.

diff -u -r1.1 savecore.c
--- savecore.c  2011/01/26 12:13:56     1.1
+++ savecore.c  2011/01/26 12:20:38
@@ -774,6 +774,7 @@
                (void)close(ifd);
        (void)fclose(fp);

+      if (kernel != NULL) { /* only if -N is specified ... */
        /* Create a kernel. */
        (void)snprintf(path, sizeof(path), "%s/netbsd.%d%s",
            dirname, bounds, compress ? ".gz" : "");
@@ -804,6 +805,7 @@
                (void)fclose(fp);
        else
                (void)close(ofd);
+      }

        /*
         * For development systems where the crash occurs during boot
@@ -911,8 +913,12 @@
        char mbuf[100], path[MAXPATHLEN];

        /* XXX assume a reasonable default, unless we find a kernel. */
-       kernelsize = 20 * 1024 * 1024;
-       if (!stat(kernel, &st)) kernelsize = st.st_blocks * S_BLKSIZE;
+       if (kernel == NULL)
+               kernelsize = 0;
+       else {
+               kernelsize = 20 * 1024 * 1024;
+               if (kernel != NULL && !stat(kernel, &st)) kernelsize = st.st_blocks * S_BLKSIZE;
+       }
        if (statvfs(dirname, &fsbuf) < 0) {
                syslog(LOG_ERR, "%s: %m", dirname);
                exit(1);

>Unformatted:
