NetBSD Problem Report #6832

Received: (qmail 29965 invoked from network); 18 Jan 1999 04:57:31 -0000
Message-Id: <199901180457.WAA01826@marvin.ece.utexas.edu>
Date: Sun, 17 Jan 1999 22:57:30 -0600 (CST)
From: bgrayson@ece.utexas.edu
To: gnats-bugs@gnats.netbsd.org
Subject: savecore deficiencies
X-Send-Pr-Version: 3.95

>Number:         6832
>Category:       bin
>Synopsis:       savecore could be better
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Jan 17 21:05:01 +0000 1999
>Closed-Date:    
>Last-Modified:  Sun Oct 01 07:45:00 +0000 2000
>Originator:     Brian Grayson
>Release:        Jan 15, 1998
>Organization:
	Parallel and Distributed Systems
	Electrical and Computer Engineering
	The University of Texas at Austin
>Environment:

>Description:
	As a NetBSD-helper-wannabe, I've been quite frustrated
	with how savecore does and doesn't work.  There are
	several things that could be better!

	1.  There is no reference to a crash(8) man page.  Each
	architecture should have a crash man page talking about
	what occurs when crashes happen, how (in general terms)
	savecore saves the information for further perusal, and
	how to debug a dead kernel using the saved info (i.e.,
	print a backtrace, do a ps).  

	The hp300 and vax archs have crash(8) man pages, which
	would be a good start for the other archs.  panic(9)
	and other things should also x-ref the crash(8) man page.

	2.  Some of the exit conditions don't give much
	information, even with -v.  For example, the error message
	"/dev/wd0e: Device busy" could be something like "Failure
	while attempting to open dump device /dev/wd0e:  Device
	busy".  (Also, this particular failure is one failure
	mode for when /netbsd and the crashed kernel aren't the
	same -- wd0e is an FFS file system, not a swap partition.
	A note like "Check that /netbsd is your running
	kernel, and the kernel xxx passed via -N (if any) is the
	kernel that caused the panic" would be nice.  There are
	several other common exit points that should have similar
	messages about kernel names.)

	3.  There is no -N-style option to deal with the
	currently-running kernel being different from /netbsd.
	For example, I currently am running /netbsd-test3, with a
	crashdump from /netbsd-test3.  We need two -N-style
	options, one for the current kernel, and the
	already-existing -N option for the dead kernel.

	4.  savecore's failure to save when there is a crash is too
	quiet.  Perhaps each exit(1) call should be replaced
	with something that prints out a more noticeable message
	(*** savecore failure:  xxx)?
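
	A minimal sketch of what items 2 and 4 might look like.
	format_open_failure() and savecore_fatal() are hypothetical
	names used only for illustration; they are not existing
	savecore functions, and the wording of the hint is taken from
	the suggestion in item 2.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <syslog.h>

/*
 * Hypothetical helper: build a message that says what was being
 * attempted when the failure happened, and append the kernel-name hint.
 * Returns the value of snprintf().
 */
int
format_open_failure(char *buf, size_t len, const char *device, int error)
{
	return snprintf(buf, len,
	    "*** savecore failure: unable to open dump device %s: %s\n"
	    "    Check that /netbsd is your running kernel, and that the\n"
	    "    kernel passed via -N (if any) is the kernel that panicked.",
	    device, strerror(error));
}

/*
 * Hypothetical replacement for a bare exit(1): log the message so the
 * failure is hard to miss, then exit with the same status as before.
 */
void
savecore_fatal(const char *msg)
{
	syslog(LOG_ERR, "%s", msg);
	exit(1);
}
```

	A failure point would then fill a buffer with
	format_open_failure(buf, sizeof(buf), "/dev/wd0e", errno) and
	pass it to savecore_fatal() rather than calling exit(1) directly.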
>How-To-Repeat:
>Fix:
	Some of the above are over my head.  I'll try to send in
	patches as I write them over the next few weeks, if no
	one else beats me to it.
>Release-Note:
>Audit-Trail:

From: "Erik E. Fair" <fair@clock.org>
To: bgrayson@ece.utexas.edu
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: bin/6832: savecore deficiencies
Date: Mon, 18 Jan 1999 00:31:51 -0800

 Last winter I had a long conversation with my friend Stu Grossman, who is
 Mr. GDB at Cygnus. He opined that the Linux people had a cute idea with
 their "core" files, /proc/<pid>/mem and /dev/{,k}mem files - make them
 produce an ELF header/format. The idea here was to be able to remove the
 "special knowledge" of the core file formats, be they dumps or kernels, or
 whatever, from the programs that must manipulate them. Apparently ELF
 is flexible enough to represent everything you need...

 	Erik <fair@clock.org>

From: Darren Reed <darrenr@reed.wattle.id.au>
To: bgrayson@ece.utexas.edu, fair@clock.org
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: bin/6832: savecore deficiencies
Date: Sun, 1 Oct 2000 16:01:08 +1100 (EST)

 Hi,
    Looking at your PR on savecore, I can agree with some of your points.
 One in particular, #2, is a bug.  I've just committed a fix for this one.
 The problem was kvm_openfiles() not being called with the "new kernel".
 I'm not sure I understand what you mean by #3; I think this is actually
 the same problem as #2.

    Given your comments, I'll look at addressing some of the other issues
 (which I agree with wholeheartedly).

    btw, the patch below has already been committed.

 Cheers,
 Darren
 Index: savecore.c
 ===================================================================
 RCS file: /cvsroot/basesrc/sbin/savecore/savecore.c,v
 retrieving revision 1.41
 retrieving revision 1.42
 diff -c -r1.41 -r1.42
 *** savecore.c	2000/08/01 16:46:27	1.41
 --- savecore.c	2000/10/01 02:27:06	1.42
 ***************
 *** 1,4 ****
 ! /*	$NetBSD: savecore.c,v 1.41 2000/08/01 16:46:27 eeh Exp $	*/

   /*-
    * Copyright (c) 1986, 1992, 1993
 --- 1,4 ----
 ! /*	$NetBSD: savecore.c,v 1.42 2000/10/01 02:27:06 darrenr Exp $	*/

   /*-
    * Copyright (c) 1986, 1992, 1993
 ***************
 *** 43,49 ****
   #if 0
   static char sccsid[] = "@(#)savecore.c	8.5 (Berkeley) 4/28/95";
   #else
 ! __RCSID("$NetBSD: savecore.c,v 1.41 2000/08/01 16:46:27 eeh Exp $");
   #endif
   #endif /* not lint */

 --- 43,49 ----
   #if 0
   static char sccsid[] = "@(#)savecore.c	8.5 (Berkeley) 4/28/95";
   #else
 ! __RCSID("$NetBSD: savecore.c,v 1.42 2000/10/01 02:27:06 darrenr Exp $");
   #endif
   #endif /* not lint */

 ***************
 *** 223,229 ****
   	 * presumed to be the same (since the disk partitions are probably
   	 * the same!)
   	 */
 ! 	kd_kern = kvm_openfiles(NULL, NULL, NULL, O_RDONLY, errbuf);
   	if (kd_kern == NULL) {
   		syslog(LOG_ERR, "%s: kvm_openfiles: %s", _PATH_UNIX, errbuf);
   		exit(1);
 --- 223,229 ----
   	 * presumed to be the same (since the disk partitions are probably
   	 * the same!)
   	 */
 ! 	kd_kern = kvm_openfiles(kernel, NULL, NULL, O_RDONLY, errbuf);
   	if (kd_kern == NULL) {
   		syslog(LOG_ERR, "%s: kvm_openfiles: %s", _PATH_UNIX, errbuf);
   		exit(1);

From: Darren Reed <darrenr@reed.wattle.id.au>
To: gnats-bugs@netbsd.org
Cc:  
Subject: Re: bin/6832: savecore deficiencies
Date: Sun, 1 Oct 2000 18:43:37 +1100 (EST)

 As it turns out, savecore currently calls openlog with LOG_PERROR, causing
 all errors to go to stderr as well as syslog.

>Unformatted:
