NetBSD Problem Report #38681

From khorben@defora.org  Sat May 17 17:46:01 2008
Return-Path: <khorben@defora.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id E38D563B8BC
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 17 May 2008 17:46:00 +0000 (UTC)
Message-Id: <20080517174607.BD668FCBD@syn.defora.rom>
Date: Sat, 17 May 2008 19:46:07 +0200 (CEST)
From: Pierre Pronchery <khorben@defora.org>
To: gnats-bugs@gnats.NetBSD.org
Subject: sparc64 machine won't reboot
X-Send-Pr-Version: 3.95

>Number:         38681
>Category:       port-sparc64
>Synopsis:       Rebooting a sparc64 machine stalls
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-sparc64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat May 17 17:50:00 +0000 2008
>Last-Modified:  Sun May 18 23:40:01 +0000 2008
>Originator:     Pierre Pronchery
>Release:        NetBSD 4.99.63
>Organization:
>Environment:
NetBSD 4.99.63 (GENERIC) #0: Sat May 17 17:38:39 CEST 2008
Architecture: sparc64
Machine: sparc64
>Description:
When trying to reboot, the serial console shows:

--- BEGIN PASTE ---
# shutdown -r now
Shutdown NOW!
shutdown: [pid 429]
System going down IMMEDIATELY


System shutdown time has arrived

About to run shutdown hooks...
Stopping cron.
Waiting for PIDS: 374.
Saving mixer settings: mixer0.
Removing block-type swap devices
swapctl: removing /dev/sd0b as swap device
Done running shutdown hooks.
--- END PASTE ---

Here the system stalls. Sending a break gives:

--- BEGIN PASTE ---
~Stopped in pid 155.1 (mount_mfs) at     netbsd:cpu_Debugger+0x4:        nop
db> reboot
syncing disks... Mutex error: lockdebug_wantlock: acquiring sleep lock from interrupt context

lock address : 0x000000000c187f80 type     :     sleep/adaptive
shared holds :                  0 exclusive:                  0
shares wanted:                  0 exclusive:                  0
current cpu  :                  0 last held:                  0
current lwp  : 0x000000000d6cc7e0 last held: 000000000000000000
last locked  : 0x00000000012dca94 unlocked : 0x00000000012dcadc
initialized  : 0x00000000012cec44
owner field  : 000000000000000000 wait/spin:                0/0

Turnstile chain at 0x18b96e0.
=> No active turnstile for this lock.

panic: LOCKDEBUG
Stopped in pid 155.1 (mount_mfs) at     netbsd:cpu_Debugger+0x4:        nop
db> bt
lockdebug_abort1(18c1218, f, 1517168, 1685088, 1, 2e79800) at netbsd:lockdebug_abort1+0x7c
mutex_vector_enter(d6cc7e0, 5, c187f80, 0, e0017398, 0) at netbsd:mutex_vector_enter+0x244
suspendsched(168a5e8, d, 1553ff2e7dd, 6, fffffffffffffffc, f) at netbsd:suspendsched+0x8
vfs_shutdown(0, d, 1, e0017680, ffffffffffffffff, 0) at netbsd:vfs_shutdown+0x24
cpu_reboot(0, 0, e0017548, 1880c00, 1880fd0, 1880fac) at netbsd:cpu_reboot+0x170
db_reboot_cmd(1, 0, 4, e0017610, e0017738, 0) at netbsd:db_reboot_cmd+0x40
db_command(180f020, 180f058, 0, 0, e0017828, 0) at netbsd:db_command+0xa0
db_command_loop(144a928, 0, 1, c1a0ef9, 0, 0) at netbsd:db_command_loop+0x10c
db_trap(e0018000, 0, 0, 0, 1515098, 2e79800) at netbsd:db_trap+0x128
kdb_trap(101, e0017b60, 0, 0, 1c14000, 181c000) at netbsd:kdb_trap+0xe4
trap(e0017b60, 101, 144a920, 1d0006, 1c14000, 181c000) at netbsd:trap+0x358
?(d98e910, 134ebfc, 1517000, d6cc7e0, 1c14000, 1d) at 0x1008b64
sabtty_intr(2d4a000, e0017e0c, d98e910, 400, 143cd20, 40) at netbsd:sabtty_intr+0x48c
sab_intr(0, 0, e0017ed0, 0, 1416860, 1805000) at netbsd:sab_intr+0x48
sparc_interrupt(d98e6f0, 1249330, 1517000, d6cc7e0, 1c14000, d977578) at netbsd:sparc_interrupt+0x1e0
mutex_vector_exit(d98e6f0, 124919c, d98e6f0, d6cc7e0, 1c14000, 0) at netbsd:mutex_vector_exit+0x140
ffs_sync(d972d80, 1, c18bef0, d6cc7e0, 1515098, 2e79800) at netbsd:ffs_sync+0x250          
VFS_SYNC(d972d80, 1, c18bef0, 1, 0, d9777c8) at netbsd:VFS_SYNC+0x14
dounmount(0, 0, d6cc7e0, d6cc7e0, 1c14000, 18cbc00) at netbsd:dounmount+0xe0
mfs_start(d972d80, 0, d6cc7e0, 18, 1c14000, 189e400) at netbsd:mfs_start+0xd8
VFS_START(d972d80, 0, 0, 1, 1c14000, 0) at netbsd:VFS_START+0x10
do_sys_mount(0, 18177b0, 2e76c00, ffffffffffffde9e, 40, ffffffffffffb9a8) at netbsd:do_sys_mount+0x768
sys___mount50(d6cc7e0, d977dc0, d977e00, 1, 20aae0, 0) at netbsd:sys___mount50+0x28        
syscall_plain(d977ed0, 5, 4073db4c, 4073db50, 0, 4073db4c) at netbsd:syscall_plain+0x114   
?(107b98, ffffffffffffde9e, 40, ffffffffffffb9a8, 98, 20b400) at 0x1008cc0
--- END PASTE ---
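
For triage, the frames in a trace like the one above can be listed
mechanically. The following is a minimal sketch, not part of the original
report: it assumes ddb prints each frame as "name(args) at netbsd:name+0xOFF",
and the names FRAME_RE, frame_names and frames_with_prefix are hypothetical
helpers introduced only for illustration.

```python
import re

# Assumed ddb frame format, e.g.:
#   vfs_shutdown(0, d, 1, ...) at netbsd:vfs_shutdown+0x24
# Frames printed as "?(...) at 0x..." carry no symbol and are skipped.
FRAME_RE = re.compile(r"\) at netbsd:(\w+)\+0x[0-9a-f]+\s*$")


def frame_names(trace):
    """Return the symbol names found in a ddb backtrace, innermost first."""
    names = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            names.append(match.group(1))
    return names


def frames_with_prefix(trace, prefixes):
    """Return only the frames whose symbol starts with one of the prefixes."""
    return [name for name in frame_names(trace)
            if name.startswith(tuple(prefixes))]


# A few frames copied from the trace in this report.
SAMPLE = """\
mutex_vector_enter(d6cc7e0, 5, c187f80, 0, e0017398, 0) at netbsd:mutex_vector_enter+0x244
suspendsched(168a5e8, d, 1553ff2e7dd, 6, fffffffffffffffc, f) at netbsd:suspendsched+0x8
sab_intr(0, 0, e0017ed0, 0, 1416860, 1805000) at netbsd:sab_intr+0x48
ffs_sync(d972d80, 1, c18bef0, d6cc7e0, 1515098, 2e79800) at netbsd:ffs_sync+0x250
mfs_start(d972d80, 0, d6cc7e0, 18, 1c14000, 189e400) at netbsd:mfs_start+0xd8
?(107b98, ffffffffffffde9e, 40, ffffffffffffb9a8, 98, 20b400) at 0x1008cc0
"""

if __name__ == "__main__":
    print(frame_names(SAMPLE))
    print(frames_with_prefix(SAMPLE, ["mfs_", "ffs_"]))
```

Filtering on the "mfs_" and "ffs_" prefixes only shows which filesystem code
is on the stack; it does not by itself say where the fault lies.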

>How-To-Repeat:
# shutdown -r now
>Fix:
Not yet.

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-sparc64/38681: sparc64 machine won't reboot
Date: Sat, 17 May 2008 20:45:11 +0200

 Sounds like a generic kernel bug to me (that's ok, compensation for all the
 kern/* PRs I recently filed that are sparc64 specific)

 Martin

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: port-sparc64-maintainer@NetBSD.org, gnats-admin@NetBSD.org,
        netbsd-bugs@NetBSD.org, Pierre Pronchery <khorben@defora.org>
Subject: Re: port-sparc64/38681: sparc64 machine won't reboot
Date: Sun, 18 May 2008 21:13:59 +0200

 On Sat, May 17, 2008 at 06:50:02PM +0000, Martin Husemann wrote:
 > The following reply was made to PR port-sparc64/38681; it has been noted by GNATS.
 > 
 > From: Martin Husemann <martin@duskware.de>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: port-sparc64/38681: sparc64 machine won't reboot
 > Date: Sat, 17 May 2008 20:45:11 +0200
 > 
 >  Sounds like a generic kernel bug to me (that's ok, compensation for all the
 >  kern/* PRs I recently filed that are sparc64 specific)

 FWIW I've seen it on a Xen/amd64 dom0 system today too.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --

From: Pierre Pronchery <khorben@defora.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-sparc64/38681: sparc64 machine won't reboot
Date: Mon, 19 May 2008 01:36:38 +0200

 Hello again,

 In fact, if I issue a backtrace command instead of trying to
 force the reboot again, I get this:

 Stopped in pid 155.1 (mount_mfs) at     netbsd:cpu_Debugger+0x4:        nop
 db{0}> bt
 sab_intr(0, d9abb30, 1896c60, 5874, 18c6c00, d730800) at netbsd:sab_intr+0x48
 intr_biglock_wrapper(2d57680, 0, e0017ed0, d730800, 1440f60, 181c000) at netbsd:intr_biglock_wrapper+0x10
 sparc_interrupt(0, d9abb30, d9abb30, 5824, 143ff40, d730800) at netbsd:sparc_interrupt+0x1ec
 lockdebug_wantlock(d9d0a00, 0, 0, 0, 1c14000, 4073c5dc) at netbsd:lockdebug_wantlock+0xfc
 mutex_vector_enter(d9abb30, 1249e20, 151c400, d730800, 1c14000, 0) at netbsd:mutex_vector_enter+0x2b8
 vn_lock(d9abb30, 20002, 18de128, 18db800, 0, 0) at netbsd:vn_lock+0x88
 ffs_sync(da10f40, 1, c18bec0, d730800, 151a720, 2e7d800) at netbsd:ffs_sync+0x1d4
 VFS_SYNC(da10f40, 1, c18bec0, 0, 0, da157c8) at netbsd:VFS_SYNC+0x28
 dounmount(0, 0, d730800, 18db800, 1c14000, 0) at netbsd:dounmount+0xe0
 mfs_start(da10f40, 0, d730800, 18, 1c14000, 0) at netbsd:mfs_start+0xd8
 VFS_START(da10f40, 0, 0, 1, 0, 0) at netbsd:VFS_START+0x24
 do_sys_mount(0, 1817808, 2e78a00, ffffffffffffde9e, 40, ffffffffffffb9a8) at netbsd:do_sys_mount+0x768
 sys___mount50(d730800, da15dc0, da15e00, 1, 20aae0, 0) at netbsd:sys___mount50+0x28
 syscall_plain(da15ed0, 5, 4073db4c, 4073db50, 1, 4073db4c) at netbsd:syscall_plain+0x114
 ?(107b98, ffffffffffffde9e, 40, ffffffffffffb9a8, 98, 20b400) at 0x1008cc0

 I guess this implicates the mfs filesystem, which I am using
 for /tmp. So this problem indeed isn't sparc64-specific.

 HTH,
 -- 
 khorben

>Unformatted:
 With a kernel from today (an hour ago), and a 4.99.62 userland from May 9th.
