NetBSD Problem Report #51724

From www@NetBSD.org  Fri Dec 16 05:08:10 2016
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id AAE007A2F5
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 16 Dec 2016 05:08:10 +0000 (UTC)
Message-Id: <20161216050809.4F47F7A332@mollari.NetBSD.org>
Date: Fri, 16 Dec 2016 05:08:09 +0000 (UTC)
From: rokuyama@rk.phys.keio.ac.jp
Reply-To: rokuyama@rk.phys.keio.ac.jp
To: gnats-bugs@NetBSD.org
Subject: landisk fails to reboot
X-Send-Pr-Version: www-1.0

>Number:         51724
>Category:       port-sh3
>Synopsis:       landisk fails to reboot
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-sh3-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Dec 16 05:10:00 +0000 2016
>Closed-Date:    Sat Dec 17 01:39:40 +0000 2016
>Last-Modified:  Sun Jan 01 00:10:00 +0000 2017
>Originator:     Rin Okuyama
>Release:        7.99.49
>Organization:
Faculty of Science and Technology, Keio University
>Environment:
NetBSD usl5p 7.99.49 NetBSD 7.99.49 (USL-5P_RO) #0: Fri Dec 16 04:39:06 JST 2016  rin@latipes:/var/build/src/sys/arch/landisk/compile/USL-5P_RO landisk
>Description:
USL-5P does not reboot, but falls into infinite loop instead. This is
because the reset instructions are not executed. The followings are the
original source code and its objdump:

src/sys/landisk/landisk/machdep.c

  377  void
  378  machine_reset(void)
  379  {
  380
  381          _cpu_exception_suspend();
  382          _reg_write_4(SH_(EXPEVT), EXPEVT_RESET_MANUAL);
  383          (void)*(volatile uint32_t *)0x80000001; /* CPU shutdown */
  384
  385          /*NOTREACHED*/
  386          for (;;) {
  387                  continue;
  388          }
  389  }

objdump -d machdep.o

  00000244 <machine_reset>:
   244:   22 4f           sts.l   pr,@-r15
   246:   06 d0           mov.l   260 <machine_reset+0x1c>,r0     ! 0 <sysctl_machdep_setup>
   248:   0b 40           jsr     @r0
   24a:   09 00           nop
   24c:   05 d1           mov.l   264 <machine_reset+0x20>,r1     ! ff000024
   24e:   20 e2           mov     #32,r2
   250:   22 21           mov.l   r2,@r1
   252:   05 d2           mov.l   268 <machine_reset+0x24>,r2     ! 80000001
   254:   20 61           mov.b   @r2,r1
   256:   21 84           mov.b   @(1,r2),r0
   258:   22 84           mov.b   @(2,r2),r0
   25a:   23 84           mov.b   @(3,r2),r0
   25c:   fe af           bra     25c <machine_reset+0x18>
   25e:   09 00           nop
   260:   00 00           .word 0x0000
   262:   00 00           .word 0x0000
   264:   24 00           mov.b   r2,@(r0,r0)
   266:   00 ff           .word 0xff00
   268:   01 00           .word 0x0001
   26a:   00 80           mov.b   r0,@(0,r0)

You can see that 0x80000001 at 0x268 is never executed. This is probably
because the behavior of GCC has been changed. As a work around, replace

  383          (void)*(volatile uint32_t *)0x80000001; /* CPU shutdown */

with

  383          goto *(void *)0x80000001; /* CPU shutdown */

as in a similar manner to src/sys/arch/sh3/sh3/sh3_machdep.c. Then,
objdump yields

  00000244 <machine_reset>:
   244:   22 4f           sts.l   pr,@-r15
   246:   04 d0           mov.l   258 <machine_reset+0x14>,r0     ! 0 <sysctl_machdep_setup>
   248:   0b 40           jsr     @r0
   24a:   09 00           nop
   24c:   03 d1           mov.l   25c <machine_reset+0x18>,r1     ! ff000024
   24e:   20 e2           mov     #32,r2
   250:   22 21           mov.l   r2,@r1
   252:   03 d1           mov.l   260 <machine_reset+0x1c>,r1     ! 80000001
   254:   2b 41           jmp     @r1
   256:   09 00           nop
   258:   00 00           .word 0x0000
   25a:   00 00           .word 0x0000
   25c:   24 00           mov.b   r2,@(r0,r0)
   25e:   00 ff           .word 0xff00
   260:   01 00           .word 0x0001
   262:   00 80           mov.b   r0,@(0,r0)

and the system reboots successfully.

Better fix may be use cpu_reset() in sh3_machdep.c. Please find the
attached patch. I've confirmed that it does not break other sh3 ports.
>How-To-Repeat:
Attempt "reboot" or "shutdown -r" on landisk boxes.
>Fix:
--- src/sys/arch/landisk/conf/std.landisk.orig	2016-12-16 03:24:16.391310471 +0900
+++ src/sys/arch/landisk/conf/std.landisk	2016-12-16 04:31:02.580040861 +0900
@@ -15,4 +15,6 @@
 options 	IOM_RAM_BEGIN=0x0c000000
 options 	IOM_RAM_SIZE=0x04000000         # 64MB

+options 	RESET_INSTR=0x80000001
+
 makeoptions	DEFTEXTADDR="0x8c001000"
--- src/sys/arch/landisk/landisk/machdep.c.orig	2016-12-16 03:14:04.567471879 +0900
+++ src/sys/arch/landisk/landisk/machdep.c	2016-12-16 04:31:19.030692713 +0900
@@ -366,21 +366,7 @@
 #endif

 	printf("rebooting...\n");
-	machine_reset();
-
-	/*NOTREACHED*/
-	for (;;) {
-		continue;
-	}
-}
-
-void
-machine_reset(void)
-{
-
-	_cpu_exception_suspend();
-	_reg_write_4(SH_(EXPEVT), EXPEVT_RESET_MANUAL);
-	(void)*(volatile uint32_t *)0x80000001;	/* CPU shutdown */
+	cpu_reset();

 	/*NOTREACHED*/
 	for (;;) {
--- src/sys/arch/sh3/sh3/sh3_machdep.c.orig	2016-12-16 03:19:31.920134296 +0900
+++ src/sys/arch/sh3/sh3/sh3_machdep.c	2016-12-16 04:30:45.202026098 +0900
@@ -134,6 +134,11 @@
 #ifdef SH4
 extern char sh4_vector_tlbmiss[], sh4_vector_tlbmiss_end[];
 #endif
+
+#ifndef RESET_INSTR
+#define RESET_INSTR 0xa0000000
+#endif
+
 /*
  * These variables are needed by /sbin/savecore
  */
@@ -559,7 +564,7 @@
 	_reg_write_4(SH_(EXPEVT), EXPEVT_RESET_MANUAL);

 #ifndef __lint__
-	goto *(void *)0xa0000000;
+	goto *(void *)RESET_INSTR;
 #endif
 	/* NOTREACHED */
 }

>Release-Note:

>Audit-Trail:
From: "Valeriy E. Ushakov" <uwe@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51724 CVS commit: src/sys/arch/landisk/landisk
Date: Sat, 17 Dec 2016 01:32:22 +0000

 Module Name:	src
 Committed By:	uwe
 Date:		Sat Dec 17 01:32:22 UTC 2016

 Modified Files:
 	src/sys/arch/landisk/landisk: machdep.c

 Log Message:
 PR 51724 - landisk fails to reboot.

 machine_reset() - instead of trying to cause an invalid data access
 that gcc will optimize away, just use "trapa" instruction.

 G/c assignment to EXPEVT, it will be set by the actual reset.  That
 assignment has been probably cargo-culted from cpu_reset() that
 manually jumps to 0xa0000000 (the reset address) instead of triggering
 a reset.

 TODO: This code should be SuperH generic cpu_reset().


 To generate a diff of this commit:
 cvs rdiff -u -r1.19 -r1.20 src/sys/arch/landisk/landisk/machdep.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: uwe@NetBSD.org
State-Changed-When: Sat, 17 Dec 2016 01:39:40 +0000
State-Changed-Why:
Thanks.  Should be fixed now (though with a different fix).


From: Rin Okuyama <rokuyama@rk.phys.keio.ac.jp>
To: gnats-bugs@NetBSD.org, uwe@NetBSD.org
Cc: 
Subject: Re: port-sh3/51724 (landisk fails to reboot)
Date: Sat, 17 Dec 2016 12:08:03 +0900

 Thank you. Fix confirmed.

 # I misunderstood what mov.l and "goto *(void *)XXX" mean.
 # Is my face red!

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-sh3/51724: landisk fails to reboot
Date: Sun, 1 Jan 2017 00:04:52 +0000

 On Fri, Dec 16, 2016 at 05:10:00AM +0000, rokuyama@rk.phys.keio.ac.jp wrote:
  >   383          (void)*(volatile uint32_t *)0x80000001; /* CPU shutdown */
  > [...]
  >    252:   05 d2           mov.l   268 <machine_reset+0x24>,r2     ! 80000001
  >    254:   20 61           mov.b   @r2,r1
  >    256:   21 84           mov.b   @(1,r2),r0
  >    258:   22 84           mov.b   @(2,r2),r0
  >    25a:   23 84           mov.b   @(3,r2),r0

 I don't understand, but maybe that's because I don't actually speak
 sh3 assembler. It looks like it's doing exactly what the code says,
 modulo reading bytes at a time (I guess because it's unaligned?)

 Is the problem that it needs to be mov.l @r2, r1? Or is it supposed to
 be jumping to 80000001, which the code on line 383 doesn't do?

 (in which case, wouldn't it be better to do
    ((void (*)(void))0x80000001)();
 rather than use the gcc extension for goto?)

 -- 
 David A. Holland
 dholland@netbsd.org

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sh3/51724: landisk fails to reboot
Date: Sun, 1 Jan 2017 00:05:52 +0000

 On Sun, Jan 01, 2017 at 12:05:00AM +0000, David Holland wrote:
  > [rubbish]

 never mind. "read the whole thread before replying", I should be old
 enough by now to know that :-)

 -- 
 David A. Holland
 dholland@netbsd.org

>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.