NetBSD Problem Report #55340
From martin@duskware.de Wed Jun 3 14:35:45 2020
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 6D9D01A9217
for <gnats-bugs@gnats.NetBSD.org>; Wed, 3 Jun 2020 14:35:45 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: kernel hang during rump test run + ddb failure
X-Send-Pr-Version: 3.95
>Number: 55340
>Category: port-macppc
>Synopsis: kernel hang during rump test run + ddb failure
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-macppc-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jun 03 14:40:00 +0000 2020
>Last-Modified: Sat Jun 20 11:40:01 +0000 2020
>Originator: Martin Husemann
>Release: NetBSD 9.99.64
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD gethsemane.duskware.de 9.99.64 NetBSD 9.99.64 (GETHSEMANE) #45: Wed Jun 3 12:29:44 CEST 2020 martin@seven-days-to-the-wolves.aprisoft.de:/work/src/sys/arch/macppc/compile/GETHSEMANE macppc
Architecture: powerpc
Machine: macppc
>Description:
Running the ATF tests on a dual G4 macppc today locked up the system.
It stopped in /usr/tests/rump/rumpkern/t_sp:
rump/rumpkern/t_sp (749/847): 10 test cases
basic: [0.376696s] Passed.
fork_fakeauth: [0.381822s] Passed.
fork_pipecomm: [0.375336s] Passed.
fork_simple: [0.380721s] Passed.
reconnect: [301.288133s] Failed: Test case timed out after 300 seconds
signal: [0.414238s] Passed.
sigsafe: [5.543676s] Passed.
stress_killer: [11.301062s] Passed.
stress_long:
After watching it hang for > 10 minutes I dropped into ddb on the console,
but that did not work well either:
Stopped in pid 2868.16103 (rump_server) at netbsd:zstty_stint+0x1d4:
b netbsd:zstty_stint+0x140
db{0}> bt
0x16c23ea0: at zsc_intr_hard+0x74
0x16c23ec0: at zshard+0x18
0x16c23ed0: at intr_deliver.isra.1+0x90
0x16c23ef0: at pic_handle_intr+0x178
0x16c23f20: at trapstart+0x6b0
[ 12392.8289365] trap: kernel read DSI trap @ 0xea73ff14 by 0x12b1e8 (DSISR 0x40000000, err=14), lr 0x12b798
[ 12392.8289365] panic: trap
[ 12392.8289365] cpu0: Begin traceback...
[ 12392.8289365] 0x16c23960: at vpanic+0x12c
[ 12392.8289365] 0x16c23990: at panic+0x50
[ 12392.8289365] 0x16c239d0: at trap+0x100
[ 12392.8289365] 0x16c23a80: kernel DSI read trap @ 0xea73ff14 by db_stack_trace_print+0x11c: srr1=0x32
[ 12392.8289365] r1=0x16c23b50 cr=0x20244444 xer=0 ctr=0x10a274 dsisr=0x40000000
[ 12392.8289365] 0x16c23b50: at db_stack_trace_print+0x6c8
[ 12392.8289365] 0x16c23bc0: at db_command+0x12c
[ 12392.8289365] 0x16c23c60: at db_command_loop+0xd8
[ 12392.8289365] 0x16c23d40: at db_trap+0xe0
[ 12392.8289365] 0x16c23d70: at kdb_trap+0x120
[ 12392.8289365] 0x16c23db0: at trapstart+0x95c
[ 12392.8289365] saved LR(0x804d924a) is invalid.cpu0: End traceback...
[ 12392.8289365] halting CPU 1
[ 12392.8289365] dumpsys: TBD
[ 12392.8289365] rebooting
>How-To-Repeat:
See above.
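A minimal sketch of how the t_sp test program could be re-run on its own, assuming the tests are installed under /usr/tests and the base-system atf-run(1) and atf-report(1) tools are used. The function name rerun_t_sp and the TESTS_DIR variable are hypothetical helpers for illustration, not part of the original report.

```shell
# Assumption: tests live under /usr/tests and atf-run(1)/atf-report(1)
# from the base system are on PATH. TESTS_DIR and rerun_t_sp are
# illustrative names only.
TESTS_DIR=${TESTS_DIR:-/usr/tests/rump/rumpkern}

rerun_t_sp() {
    # Bail out early if the ATF tools or the tests directory are missing.
    command -v atf-run >/dev/null 2>&1 || {
        echo "atf-run not found" >&2
        return 127
    }
    [ -d "$TESTS_DIR" ] || {
        echo "$TESTS_DIR not found" >&2
        return 1
    }
    # Run only the t_sp test program and format the results.
    (cd "$TESTS_DIR" && atf-run t_sp | atf-report)
}
```

Calling rerun_t_sp from a root shell on a spare console would then exercise only the ten t_sp test cases rather than the full suite. Since the hang was seen in stress_long, expect the machine to lock up again if the problem is still present.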
>Fix:
n/a
>Audit-Trail:
From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@netbsd.org, port-macppc-maintainer@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: port-macppc/55340: kernel hang during rump test run + ddb failure
Date: Mon, 8 Jun 2020 10:10:01 +0900
Hi.
Could you try the patch attached in port-powerpc/55325:
http://gnats.netbsd.org/55325
With this patch, the kernel does not hang on my Mac mini (uniprocessor):
| Tests root: /usr/tests/rump/rumpkern
|
| t_sp (1/1): 10 test cases
| basic: [1.030238s] Passed.
| fork_fakeauth: [0.528322s] Passed.
| fork_pipecomm: [0.530721s] Passed.
| fork_simple: [0.545351s] Passed.
| reconnect: [3.729749s] Passed.
| signal: [0.952547s] Passed.
| sigsafe: [6.074471s] Passed.
| stress_killer: [323.604229s] Passed.
| stress_long: [604.847985s] Failed: Test case timed out after 300 seconds
| stress_short: [603.281914s] Failed: Test case timed out after 300 seconds
| [1545.160124s]
|
| Failed test cases:
| t_sp:stress_long, t_sp:stress_short
|
| Summary for 1 test programs:
| 8 passed test cases.
| 2 failed test cases.
| 0 expected failed test cases.
| 0 skipped test cases.
Alternatively, DIAGNOSTIC may help you. (Unfortunately, DIAGNOSTIC has been
disabled on macppc for a very long time.) If your machine has gem(4), the
dirty hack described in port-macppc/55326:
http://gnats.netbsd.org/55326
is necessary in order to enable DIAGNOSTIC.
Thanks,
rin
From: Martin Husemann <martin@duskware.de>
To: Rin Okuyama <rokuyama.rk@gmail.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-macppc/55340: kernel hang during rump test run + ddb failure
Date: Sat, 20 Jun 2020 13:37:12 +0200
On Mon, Jun 08, 2020 at 10:10:01AM +0900, Rin Okuyama wrote:
> Hi.
>
> Could you try the patch attached in port-powerpc/55325:
>
> http://gnats.netbsd.org/55325
>
> With this patch, the kernel does not hang on my Mac mini (uniprocessor):
I applied the patch and enabled DIAGNOSTIC - this makes the test run a lot
better!
It still dies due to PR 55272, but that is unrelated.
Martin
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.