NetBSD Problem Report #6832
Received: (qmail 29965 invoked from network); 18 Jan 1999 04:57:31 -0000
Message-Id: <199901180457.WAA01826@marvin.ece.utexas.edu>
Date: Sun, 17 Jan 1999 22:57:30 -0600 (CST)
From: bgrayson@ece.utexas.edu
To: gnats-bugs@gnats.netbsd.org
Subject: savecore deficiencies
X-Send-Pr-Version: 3.95
>Number: 6832
>Category: bin
>Synopsis: savecore could be better
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Jan 17 21:05:01 +0000 1999
>Closed-Date:
>Last-Modified: Sun Oct 01 07:45:00 +0000 2000
>Originator: Brian Grayson
>Release: Jan 15, 1998
>Organization:
Parallel and Distributed Systems
Electrical and Computer Engineering
The University of Texas at Austin
>Environment:
>Description:
As a NetBSD-helper-wannabe, I've been quite frustrated
with how savecore does and doesn't work. There are
several things that could be better!
1. There is no reference to a crash(8) man page. Each
architecture should have a crash man page talking about
what occurs when crashes happen, how (in general terms)
savecore saves the information for further perusal, and
how to debug a dead kernel using the saved info (e.g.,
print a backtrace, do a ps).
The hp300 and vax archs have crash(8) man pages, which
would be a good start for the other archs. panic(9)
and other related pages should also cross-reference the
crash(8) man page.
2. Some of the exit conditions don't give much
information, even with -v. For example, the error message
"/dev/wd0e: Device busy" could be something like "Failure
while attempting to open dump device /dev/wd0e: Device
busy". (Also, this particular failure is one of the ways
savecore fails when /netbsd and the crashed kernel aren't
the same: here wd0e is an FFS file system, not a swap
partition. A note like "Check that /netbsd is your running
kernel, and the kernel xxx passed via -N (if any) is the
kernel that caused the panic" would be nice. There are
several other common exit points that should have similar
messages about kernel names.)
3. There is no -N-style option to deal with the
currently-running kernel being different from /netbsd.
For example, I currently am running /netbsd-test3, with a
crashdump from /netbsd-test3. We need two -N-style
options, one for the current kernel, and the
already-existing -N option for the dead kernel.
4. savecore's failure to save when there is a crash is too
quiet. Perhaps each exit(1) call should be replaced
with something that prints out a more noticeable message
(*** savecore failure: xxx)?
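Points 2 and 4 could be sketched roughly as follows. This is only
an illustration, not code from savecore: the helper name
format_dump_open_error and the way the two suggestions are combined
into one line are assumptions. Only the "*** savecore failure:"
prefix and the "Failure while attempting to open dump device"
wording come from the suggestions above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of savecore): build a message that
 * names the operation that failed, the device, and the errno text,
 * prefixed with a banner that is hard to miss in boot output.
 * Returns what snprintf() returns: the length the full message
 * needs, excluding the terminating NUL.
 */
int
format_dump_open_error(char *buf, size_t len, const char *device, int err)
{
	return snprintf(buf, len,
	    "*** savecore failure: Failure while attempting to open "
	    "dump device %s: %s", device, strerror(err));
}
```

A caller would pass the device name and the saved errno value, then
hand the resulting buffer to whatever reporting routine savecore
already uses before exiting.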
>How-To-Repeat:
>Fix:
Some of the above are over my head. I'll try to send in
patches as I write them over the next few weeks, if no
one else beats me to it.
>Release-Note:
>Audit-Trail:
From: "Erik E. Fair" <fair@clock.org>
To: bgrayson@ece.utexas.edu
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: bin/6832: savecore deficiencies
Date: Mon, 18 Jan 1999 00:31:51 -0800
Last winter I had a long conversation with my friend Stu Grossman, who is
Mr. GDB at Cygnus. He opined that the Linux people had a cute idea with
their "core" files, /proc/<pid>/mem and /dev/{,k}mem files - make them
produce an ELF header/format. The idea here was to be able to remove the
"special knowledge" of the core file formats, be they dumps or kernels, or
whatever, from the programs that must manipulate them. Apparently the ELF
is flexible enough to represent everything you need...
Erik <fair@clock.org>
From: Darren Reed <darrenr@reed.wattle.id.au>
To: bgrayson@ece.utexas.edu, fair@clock.org
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: bin/6832: savecore deficiencies
Date: Sun, 1 Oct 2000 16:01:08 +1100 (EST)
Hi,
Looking at your PR on savecore, I can agree with some of your points.
One in particular, #2, is a bug. I've just committed a fix for this one.
The problem was kvm_openfiles() not being called with the "new kernel".
I'm not sure I understand what you mean by #3; I think this is actually
the same problem as #2.
Given your comments, I'll look at addressing some of the other issues
(which I agree with wholeheartedly).
btw, the patch below has already been committed.
Cheers,
Darren
Index: savecore.c
===================================================================
RCS file: /cvsroot/basesrc/sbin/savecore/savecore.c,v
retrieving revision 1.41
retrieving revision 1.42
diff -c -r1.41 -r1.42
*** savecore.c 2000/08/01 16:46:27 1.41
--- savecore.c 2000/10/01 02:27:06 1.42
***************
*** 1,4 ****
! /* $NetBSD: savecore.c,v 1.41 2000/08/01 16:46:27 eeh Exp $ */
/*-
* Copyright (c) 1986, 1992, 1993
--- 1,4 ----
! /* $NetBSD: savecore.c,v 1.42 2000/10/01 02:27:06 darrenr Exp $ */
/*-
* Copyright (c) 1986, 1992, 1993
***************
*** 43,49 ****
#if 0
static char sccsid[] = "@(#)savecore.c 8.5 (Berkeley) 4/28/95";
#else
! __RCSID("$NetBSD: savecore.c,v 1.41 2000/08/01 16:46:27 eeh Exp $");
#endif
#endif /* not lint */
--- 43,49 ----
#if 0
static char sccsid[] = "@(#)savecore.c 8.5 (Berkeley) 4/28/95";
#else
! __RCSID("$NetBSD: savecore.c,v 1.42 2000/10/01 02:27:06 darrenr Exp $");
#endif
#endif /* not lint */
***************
*** 223,229 ****
  	 * presumed to be the same (since the disk partitions are probably
  	 * the same!)
  	 */
! 	kd_kern = kvm_openfiles(NULL, NULL, NULL, O_RDONLY, errbuf);
  	if (kd_kern == NULL) {
  		syslog(LOG_ERR, "%s: kvm_openfiles: %s", _PATH_UNIX, errbuf);
  		exit(1);
--- 223,229 ----
  	 * presumed to be the same (since the disk partitions are probably
  	 * the same!)
  	 */
! 	kd_kern = kvm_openfiles(kernel, NULL, NULL, O_RDONLY, errbuf);
  	if (kd_kern == NULL) {
  		syslog(LOG_ERR, "%s: kvm_openfiles: %s", _PATH_UNIX, errbuf);
  		exit(1);
From: Darren Reed <darrenr@reed.wattle.id.au>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/6832: savecore deficiencies
Date: Sun, 1 Oct 2000 18:43:37 +1100 (EST)
As it turns out, savecore currently calls openlog with LOG_PERROR, causing
all errors to go to stderr as well as syslog.
>Unformatted: