NetBSD Problem Report #43877
From apb@cequrux.com Tue Sep 14 07:30:38 2010
Return-Path: <apb@cequrux.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 5CE8363BC98
for <gnats-bugs@gnats.NetBSD.org>; Tue, 14 Sep 2010 07:30:38 +0000 (UTC)
Message-Id: <20100914071601.BC38D100AA6F@apb-laptoy.apb.alt.za>
Date: Tue, 14 Sep 2010 06:11:45 +0000 (UTC)
From: apb@cequrux.com
Reply-To: apb@cequrux.com
To: gnats-bugs@gnats.NetBSD.org
Subject: named hangs with 5.99.39 kernel, 5.99.27 userland
X-Send-Pr-Version: 3.95
>Number: 43877
>Category: kern
>Synopsis: named hangs with 5.99.39 kernel, 5.99.27 userland
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Sep 14 07:35:00 +0000 2010
>Last-Modified: Sun Dec 26 15:30:02 +0000 2010
>Originator: Alan Barrett
>Release: NetBSD 5.99.39
>Organization:
Not much
>Environment:
Kernel: NetBSD 5.99.39 i386
Userland: NetBSD 5.99.27 i386
>Description:
I booted a 5.99.39 kernel (built from sources checked out with cvs
update -D '2010-09-12 12:00 UTC') on a system with a userland from a few
months ago (version 5.99.27, built from sources checked out with cvs
update -D '2010-04-18 12:00 UTC').
One of the first things that I noticed was that "/etc/rc.d/named
forcerestart" hung.
>How-To-Repeat:
# /etc/rc.d/named forcerestart
Stopping named.
Waiting for PIDS: 177, 177, 177, 177, 177, 177, 177, 177, 177, 177,
177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177,
177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177,
177, 177, 177, 177, 177, 177, 177 [... this continues forever]
In another window:
# ps -axlsww | awk 'NR==1 || /named/ {print}'
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
14 177 1 542 85 0 31140 14348 kqueue Isl ? 0:00.15 /usr/sbin/named -u named -t /var/chroot/named
14 177 1 542 43 0 31140 14348 parked Isl ? 0:00.15 /usr/sbin/named -u named -t /var/chroot/named
14 177 1 542 43 0 31140 14348 parked Isl ? 0:00.15 /usr/sbin/named -u named -t /var/chroot/named
14 177 1 542 43 0 31140 14348 parked Isl ? 0:00.15 /usr/sbin/named -u named -t /var/chroot/named
14 177 1 542 85 0 31140 14348 sigwait Isl ? 0:00.15 /usr/sbin/named -u named -t /var/chroot/named
0 387 1171 431 85 0 3144 1256 wait I+ ttyp2 0:00.01 /bin/sh /etc/rc.d/named forcerestart
0 1452 387 0 85 0 3144 1252 wait S+ ttyp2 0:00.02 /bin/sh /etc/rc.d/named forcestop
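The listing above shows one line per LWP of named. As a reading aid, here is a minimal sketch that tallies those LWPs by wait channel. The function name summarize_wchan is hypothetical and is not part of the original report. It assumes the column layout shown above, with WCHAN in field 9 and the command path in field 13.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): count the LWPs of
# /usr/sbin/named per wait channel. Assumes the `ps -l` style layout shown
# above: WCHAN is field 9 and the command path is field 13.
summarize_wchan() {
    awk '$13 == "/usr/sbin/named" { count[$9]++ }
         END { for (w in count) print w, count[w] }' | sort
}

# Sample input copied from the ps output above (named arguments shortened).
summarize_wchan <<'EOF'
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
14 177 1 542 85 0 31140 14348 kqueue Isl ? 0:00.15 /usr/sbin/named -u named
14 177 1 542 43 0 31140 14348 parked Isl ? 0:00.15 /usr/sbin/named -u named
14 177 1 542 43 0 31140 14348 parked Isl ? 0:00.15 /usr/sbin/named -u named
14 177 1 542 43 0 31140 14348 parked Isl ? 0:00.15 /usr/sbin/named -u named
14 177 1 542 85 0 31140 14348 sigwait Isl ? 0:00.15 /usr/sbin/named -u named
0 387 1171 431 85 0 3144 1256 wait I+ ttyp2 0:00.01 /bin/sh /etc/rc.d/named forcerestart
EOF
```

For the sample data this reports one LWP in kqueue, three in parked and one in sigwait. On a live system the same function could be fed directly, as in "ps -axlsww | summarize_wchan".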
>Fix:
Unknown. It may also be relevant that "kqueue" appears in the report of
test failures from anita:
Failed test cases:
fs/vfs/t_renamerace:lfs_renamerace,
fs/vfs/t_renamerace:lfs_renamerace_dirs, lib/libevent/t_event:kqueue,
lib/libevent/t_event:poll, lib/libevent/t_event:select,
util/sort/t_sort:any_char
>Audit-Trail:
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: re: kern/43877: named hangs with 5.99.39 kernel, 5.99.27 userland
Date: Tue, 14 Sep 2010 18:07:33 +1000
i saw this on sparc64, too. i thought it was the old sparc vs pthreads
vs. bind issue, and disabled pthreads in bind and it made my bind work
again. i have a small patch to the bind build to fix the
NAMED_USE_PTHREADS=no setting again i plan to commit once i've confirmed
it doesn't break anything else. (see below)
in my case, i didn't get to starting named because dhcpcd tried to run
"dig" which never exited for me, and my boot hung there.
.mrg.
Index: Makefile.inc
===================================================================
RCS file: /cvsroot/src/external/bsd/bind/Makefile.inc,v
retrieving revision 1.5
diff -p -r1.5 Makefile.inc
*** Makefile.inc 6 Aug 2010 10:58:03 -0000 1.5
--- Makefile.inc 14 Sep 2010 08:06:28 -0000
*************** CPPFLAGS+= -DLIBINTERFACE=${LIBINTERFACE
*** 74,79 ****
--- 74,80 ----
  .if ${NAMED_USE_PTHREADS} == "yes"
  # XXX: Not ready yet
  # CPPFLAGS+= -DISC_PLATFORM_USE_NATIVE_RWLOCKS
+ CPPFLAGS+= -DISC_PLATFORM_USETHREADS
  .if !defined (LIB) || empty(LIB)
  LDADD+= -lpthread
  DPADD+= ${LIBPTHREAD}
Index: include/isc/platform.h
===================================================================
RCS file: /cvsroot/src/external/bsd/bind/include/isc/platform.h,v
retrieving revision 1.5
diff -p -r1.5 platform.h
*** include/isc/platform.h 6 Aug 2010 10:58:13 -0000 1.5
--- include/isc/platform.h 14 Sep 2010 08:06:32 -0000
***************
*** 207,213 ****
/*
* Defined if we are using threads.
*/
! #define ISC_PLATFORM_USETHREADS 1
/*
* Defined if unistd.h does not cause fd_set to be delared.
--- 207,214 ----
/*
* Defined if we are using threads.
*/
! /* Put in the Makefile */
! /* #define ISC_PLATFORM_USETHREADS 1 */
/*
* Defined if unistd.h does not cause fd_set to be delared.
From: Alan Barrett <apb@cequrux.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/43877: named hangs with 5.99.39 kernel, 5.99.27 userland
Date: Sun, 26 Dec 2010 17:06:20 +0200
On Tue, 14 Sep 2010, apb@cequrux.com wrote:
> >Synopsis: named hangs with 5.99.39 kernel, 5.99.27 userland
> I booted a 5.99.39 kernel (built from sources checked out with cvs
> update -D '2010-09-12 12:00 UTC') on a system with a userland from a few
> months ago (version 5.99.27, built from sources checked out with cvs
> update -D '2010-04-18 12:00 UTC').
This is still an issue with a 5.99.40 kernel and 5.99.27 userland. What
can I do to help debug this failure of backward compatibility?
I'd really like to upgrade from 5.99.27, but as long as an old userland
doesn't work with a new kernel, I can't upgrade safely.
--apb (Alan Barrett)
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, apb@cequrux.com
Cc:
Subject: Re: kern/43877: named hangs with 5.99.39 kernel, 5.99.27 userland
Date: Sun, 26 Dec 2010 10:28:06 -0500
On Dec 26, 3:10pm, apb@cequrux.com (Alan Barrett) wrote:
-- Subject: Re: kern/43877: named hangs with 5.99.39 kernel, 5.99.27 userland
| The following reply was made to PR kern/43877; it has been noted by GNATS.
|
| From: Alan Barrett <apb@cequrux.com>
| To: gnats-bugs@NetBSD.org
| Cc:
| Subject: Re: kern/43877: named hangs with 5.99.39 kernel, 5.99.27 userland
| Date: Sun, 26 Dec 2010 17:06:20 +0200
|
| On Tue, 14 Sep 2010, apb@cequrux.com wrote:
| > >Synopsis: named hangs with 5.99.39 kernel, 5.99.27 userland
| > I booted a 5.99.39 kernel (built from sources checked out with cvs
| > update -D '2010-09-12 12:00 UTC') on a system with a userland from a few
| > months ago (version 5.99.27, built from sources checked out with cvs
| > update -D '2010-04-18 12:00 UTC').
|
| This is still an issue with a 5.99.40 kernel and 5.99.27 userland. What
| can I do to help debug this failure of backward compatibility?
|
| I'd really like to upgrade from 5.99.27, but as long as an old userland
| doesn't work with a new kernel, I can't upgrade safely.
What does a ktrace show? Where is it getting stuck?
christos
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.