NetBSD Problem Report #44292
From kilbi@kilbi.de Wed Dec 29 11:40:10 2010
Return-Path: <kilbi@kilbi.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id AA8F163B89A
for <gnats-bugs@gnats.NetBSD.org>; Wed, 29 Dec 2010 11:40:10 +0000 (UTC)
Message-Id: <20101229113454.8C43F38E63@mail.kilbi.de>
Date: Wed, 29 Dec 2010 12:34:53 +0100 (MET)
From: mk@kilbi.de
Reply-To: mk@kilbi.de
To: gnats-bugs@gnats.NetBSD.org
Subject: -current kernels do not work on (my) cobalt qube 2 since one (1!) year
X-Send-Pr-Version: 3.95
>Number: 44292
>Category: port-cobalt
>Synopsis: -current kernels do not work on (my) cobalt qube 2 since one (1!) year
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: tsutsui
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Dec 29 11:45:00 +0000 2010
>Closed-Date: Sat Jan 22 18:31:51 +0000 2011
>Last-Modified: Sun Jan 23 23:10:02 +0000 2011
>Originator: Markus W Kilbinger
>Release: NetBSD 5.99.22
>Organization:
>Environment:
System: NetBSD qube 5.99.22 NetBSD 5.99.22 (QUBE) #5: Fri Dec 11 09:24:20 MET 2009 root@q:/usr/u/NetBSD/HEAD/src/sys/arch/cobalt/compile/QUBE cobalt
Architecture: mipsel
Machine: cobalt
>Description:
After one year of waiting (hoping)
http://mail-index.netbsd.org/port-cobalt/2010/06/05/msg000425.html
I followed Izumi's advice
http://mail-index.netbsd.org/port-cobalt/2010/06/05/msg000426.html
to send-pr: Longer than one year now I cannot run a -current
kernel on my cobalt qube 2!
Besides some panics in the meantime (see my older post/link
above), a -current kernel (compiled from yesterday's
sources) now gets stuck at:
[...]
root on wd0a dumps on wd0b
root file system type: ffs
pid 1(init): ABI set to O32 (e_flags=0x1007)
... and no further go!
But: The same kernel boots fine under gxemul (simulating
cobalt hardware)!?
Last working -current kernel on my qube:
NetBSD qube 5.99.22 NetBSD 5.99.22 (QUBE) #5: Fri Dec 11 09:24:20 MET 2009
>How-To-Repeat:
Try to boot/run a recent -current kernel on a cobalt qube 2 machine.
>Fix:
unknown
>Release-Note:
>Audit-Trail:
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: port-cobalt-maintainer@NetBSD.org, gnats-admin@NetBSD.org,
netbsd-bugs@NetBSD.org, tsutsui@ceres.dti.ne.jp
Subject: Re: port-cobalt/44292: -current kernels do not work on (my) cobalt
qube 2 since one (1!) year
Date: Sat, 1 Jan 2011 00:36:07 +0900
FYI,
> to send-pr: Longer than one year now I cannot run a -current
> kernel on my cobalt qube 2!
At least 201006290000Z GENERIC kernel seems to work:
ftp://ftp.jp.NetBSD.org/pub/NetBSD-daily/HEAD/201006290000Z/cobalt/binary/kernel/
so there is some problem newer than the mips64 merge.
201007300000Z GENERIC doesn't start init(8) though.
ftp://ftp.jp.NetBSD.org/pub/NetBSD-daily/HEAD/201007300000Z/cobalt/binary/kernel/
> But: The same kernel boots fine under gxemul (simulating
> cobalt hardware)!?
Emulation in gxemul is not so precise since it's designed to run OSes
rather than to emulate exact hardware. Probably an Rm52xx-specific quirk?
---
Izumi Tsutsui
Responsible-Changed-From-To: port-cobalt-maintainer->tsutsui
Responsible-Changed-By: tsutsui@NetBSD.org
Responsible-Changed-When: Wed, 19 Jan 2011 23:36:13 +0900
Responsible-Changed-Why:
State-Changed-From-To: open->feedback
State-Changed-By: tsutsui@NetBSD.org
State-Changed-When: Wed, 19 Jan 2011 23:36:13 +0900
State-Changed-Why:
Can you try patch?
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: port-cobalt-maintainer@NetBSD.org, gnats-admin@NetBSD.org,
netbsd-bugs@NetBSD.org, jklos@NetBSD.org, tsutsui@ceres.dti.ne.jp
Subject: Re: port-cobalt/44292: -current kernels do not work on (my) cobalt
qube 2 since one (1!) year
Date: Wed, 19 Jan 2011 23:34:11 +0900
Reverting part of src/sys/dev/ic/com.c rev 1.298 seems to fix the problem.
(201007200000Z works, 201007210000Z doesn't)
Needs to use device properties to enable prescaler?
More initialization is required?
---
Index: com.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ic/com.c,v
retrieving revision 1.298
retrieving revision 1.297
diff -u -p -r1.298 -r1.297
--- com.c 20 Jul 2010 06:17:20 -0000 1.298
+++ com.c 19 Apr 2010 18:24:26 -0000 1.297
@@ -465,8 +465,6 @@ com_attach_subr(struct com_softc *sc)
sc->sc_fifolen = 0;
} else {
SET(sc->sc_hwflags, COM_HW_FLOW);
- SET(sc->sc_mcr, MCR_PRESCALE);
- sc->sc_frequency /= 4;
sc->sc_fifolen = 32;
}
} else
---
Izumi Tsutsui
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: jklos@NetBSD.org
Cc: gnats-bugs@NetBSD.org, port-cobalt-maintainer@NetBSD.org,
gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org,
tsutsui@ceres.dti.ne.jp
Subject: Re: port-cobalt/44292: -current kernels do not work on (my) cobaltqube
2 since one (1!) year
Date: Thu, 20 Jan 2011 00:58:19 +0900
I wrote:
> Needs to use device properties to enable prescaler?
> More initialization is required?
We can't touch frequency or prescaler in the attach function
if the device is already initialized in earlier cnattach.
I'm not sure which variants actually require prescaler
but I'll fix the code to disable it if comconsattached.
---
Izumi Tsutsui
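
The effect described above can be illustrated with the usual 16550 divisor arithmetic. This is only a sketch: the clock and baud values are assumed example numbers that do not appear in this PR, the function names are hypothetical, and the real comspeed() in com.c also checks the rounding error, which is omitted here. It shows what happens when the driver divides its idea of the clock by 4 while the chip keeps running unprescaled, as it would if the console was set up earlier without the prescaler.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Classic 16550 divisor: input clock / (16 * baud), truncated.
 * Simplified; the real driver also validates the resulting error.
 */
uint32_t
divisor_for(uint32_t clock_hz, uint32_t baud)
{
	assert(baud > 0);
	return clock_hz / (16u * baud);
}

/* Line rate that actually results from a given clock and divisor. */
uint32_t
actual_baud(uint32_t clock_hz, uint32_t divisor)
{
	assert(divisor > 0);
	return clock_hz / (16u * divisor);
}

/*
 * Rate seen on the wire when the driver computes the divisor from
 * clock/4 but the hardware prescaler is not really in effect.
 */
uint32_t
baud_after_bogus_prescale(uint32_t clock_hz, uint32_t baud)
{
	uint32_t d = divisor_for(clock_hz / 4, baud);

	return actual_baud(clock_hz, d);
}
```

With an assumed 18432000 Hz clock and 115200 baud, the correct divisor is 10, while the divisor computed from the quartered clock is 2, so the line would run at 576000 baud instead of 115200 and the console would appear dead.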
From: Nick Hudson <nick.hudson@gmx.co.uk>
To: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
Cc: gnats-bugs@netbsd.org,
port-cobalt-maintainer@netbsd.org,
gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org,
jklos@netbsd.org
Subject: Re: port-cobalt/44292: -current kernels do not work on (my) cobalt qube 2 since one (1!) year
Date: Wed, 19 Jan 2011 15:59:11 +0000
On Wednesday 19 January 2011 14:34:11 Izumi Tsutsui wrote:
[...]
> Index: com.c
> ===================================================================
> RCS file: /cvsroot/src/sys/dev/ic/com.c,v
> retrieving revision 1.298
> retrieving revision 1.297
> diff -u -p -r1.298 -r1.297
> --- com.c 20 Jul 2010 06:17:20 -0000 1.298
> +++ com.c 19 Apr 2010 18:24:26 -0000 1.297
> @@ -465,8 +465,6 @@ com_attach_subr(struct com_softc *sc)
> sc->sc_fifolen = 0;
> } else {
> SET(sc->sc_hwflags, COM_HW_FLOW);
> - SET(sc->sc_mcr, MCR_PRESCALE);
> - sc->sc_frequency /= 4;
> sc->sc_fifolen = 32;
> }
> } else
>
My Cobalt RaQ 2 boots 5.99.44 with this patch.
Nick
From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-cobalt/44292: -current kernels do not work on (my) cobalt qube 2 since one (1!) year
Date: Wed, 19 Jan 2011 16:19:57 +0000
On Wed, Jan 19, 2011 at 04:00:26PM +0000, Nick Hudson wrote:
> > - SET(sc->sc_mcr, MCR_PRESCALE);
> > - sc->sc_frequency /= 4;
Presumably MCR_PRESCALE is generating a divide by 4.
So doing this more than once generates an invalid sc_frequency ??
So should this just be dependent on whether MCR_PRESCALE is already set?
I guess SET(a,b) is '(a) |= (b)' just to confuse things.
David
--
David Laight: david@l8s.co.uk
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: tsutsui@ceres.dti.ne.jp
Subject: Re: port-cobalt/44292: -current kernels do not work on (my) cobalt
qube 2 since one (1!) year
Date: Thu, 20 Jan 2011 01:27:05 +0900
> > > - SET(sc->sc_mcr, MCR_PRESCALE);
> > > - sc->sc_frequency /= 4;
>
> Presumably MCR_PRESCALE is generating a divide by 4.
> So doing this more than once generates an invalid sc_frequency ??
It's in com_attach_subr() so only once per attach,
but it causes a problem if com(4) is already initialized
in cnattach(), which doesn't care about the prescaler.
> So should this just be dependent on whether MCR_PRESCALE is already set?
The prescaler is quite device dependent, so MCR_PRESCALE should be set
and sc_frequency should be adjusted in the MD sys/arch/amiga/dev/com_supio.c
attachment, I think.
> I guess SET(a,b) is '(a) |= (b)' just to confuse things.
It's a separate discussion (unless you are going to clean up the whole source tree).
---
Izumi Tsutsui
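
For readers unfamiliar with the SET() idiom discussed above, the following is a minimal sketch of such flag macros together with the kind of guard David asks about. It is not the code that was committed (the commit log later in this PR shows the two lines were simply reverted). The struct, the function name, and the bit value 0x80 are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag helpers in the style discussed above: SET(a, b) is (a) |= (b). */
#define SET(t, f)	((t) |= (f))
#define CLR(t, f)	((t) &= ~(f))
#define ISSET(t, f)	((t) & (f))

#define EX_MCR_PRESCALE	0x80u	/* assumed bit value, illustration only */

/* Hypothetical stand-in for the two softc fields touched by the patch. */
struct ex_softc {
	uint8_t		mcr;
	uint32_t	frequency;
};

/*
 * Enable the prescaler at most once, and never when the port was
 * already set up as the console by an earlier cnattach.
 */
void
ex_enable_prescaler(struct ex_softc *sc, bool console_attached)
{
	assert(sc != NULL);

	if (console_attached)
		return;		/* console already programmed; leave it alone */
	if (ISSET(sc->mcr, EX_MCR_PRESCALE))
		return;		/* already prescaled; do not divide again */

	SET(sc->mcr, EX_MCR_PRESCALE);
	sc->frequency /= 4;
}
```

Keeping the flag and the frequency adjustment in one guarded place means the two can never get out of step, which is the failure mode the thread describes.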
From: Markus W Kilbinger <mk@kilbi.de>
To: tsutsui@NetBSD.org, Martin Mersberger <gremlin@portal-to-web.de>,
gnats-bugs@NetBSD.org
Cc: port-cobalt-maintainer@netbsd.org,
netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org,
Subject: Re: port-cobalt/44292 (-current kernels do not work on (my) cobalt qube 2 since one (1!) year) [and 1 more messages]
Date: Thu, 20 Jan 2011 21:25:11 +0100
>>>>> "tsutsui" == tsutsui <tsutsui@NetBSD.org> writes:
tsutsui> State-Changed-Why: Can you try patch?
Sorry for the delay, now got some minutes to test your patch and
indeed it makes the kernel pass the old stuck point:
[...]
root on wd0a dumps on wd0b
root file system type: ffs
pid 1(init): ABI set to O32 (e_flags=0x1007)
Thu Jan 20 19:50:40 GMT 2011
Starting root file system check:
/dev/rwd0a: file system is clean; not checking
swapctl: adding /dev/wd0b as swap device at priority 0
Starting file system checks:
/dev/rwd0e: file system is clean; not checking
/dev/rwd0f: file system is clean; not checking
/dev/rwd0a: file system is mounted read-write on /; not checking
Setting tty flags.
Setting sysctl variables:
[...]
but later (my qube is a heavily loaded machine ;-)) it crashes (at
starting swapping?):
[...]
Starting squid.
Starting spamd.
pid 381(squid): trap: TLB miss (store) in kernel mode
status=0xfc03, cause=0xc, epc=0x8000152c, vaddr=0xcc54f00c tf=0xcc54ef98 ksp=0xcc54eff8 ra=0x80001528
Stopped in pid 381.1 (squid) at netbsd:MachFPInterrupt+0xc0: sw ra,20(sp)
db>
... as Martin's cobalt machine under some load:
>>>>> "Martin" == Martin Mersberger <gremlin@portal-to-web.de> writes:
Martin> My cobalt with -current just crashed while building
Martin> pkgsrc/shells/bash
Martin> db> bt
Martin> pid 18288(conftest): trap: TLB miss (load or instr.
Martin> fetch) in kernel mode status=0x3, cause=0x8808,
Martin> epc=0x802a1b98, vaddr=0xcb0bb014 tf=0xcb0bac78
Martin> ksp=0xcb0bacd8 ra=0x802a20bc Stopped in pid 18288.1
Martin> (conftest) at netbsd:kdbrpeek+0x30: lw v0,0(v1)
These reproducible crashes were the other reason for writing the PR.
Any more info I can provide?
Markus.
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: mk@kilbi.de
Cc: gremlin@portal-to-web.de, gnats-bugs@NetBSD.org, tsutsui@ceres.dti.ne.jp
Subject: Re: port-cobalt/44292 (-current kernels do not work on (my) cobalt
qube 2 since one (1!) year) [and 1 more messages]
Date: Fri, 21 Jan 2011 18:57:46 +0900
> but later (my qube is a heavily loaded machine ;-)) it crashes (at
> starting swapping?):
:
> These reproducible crashes were the other reason for writing the PR.
Could you file a new PR for this TLB miss problem?
It's a bit annoying to find necessary info from
a long PR including multiple replies.
The original problem (init(8) not starting) was caused by
two independent mistakes, but this one seems to be a mips pmap issue.
> Any more info I can provide?
- userland version
- kernel config (GENERIC or your custom)
- (if custom) test with GENERIC
- test GENERIC + options DIAGNOSTIC kernel
etc?
---
Izumi Tsutsui
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: mk@kilbi.de, gremlin@portal-to-web.de
Cc: gnats-bugs@NetBSD.org, tsutsui@ceres.dti.ne.jp
Subject: Re: port-cobalt/44292 (-current kernels do not work on (my) cobaltqube
2 since one (1!) year) [and 1 more messages]
Date: Sat, 22 Jan 2011 01:16:59 +0900
> Could you file a new PR for this TLB miss problem?
Ah, never mind, could you try the following patch?
Index: arch/mips/mips/locore.S
===================================================================
RCS file: /cvsroot/src/sys/arch/mips/mips/locore.S,v
retrieving revision 1.173
diff -u -p -r1.173 locore.S
--- arch/mips/mips/locore.S 22 Dec 2010 01:34:17 -0000 1.173
+++ arch/mips/mips/locore.S 21 Jan 2011 16:13:39 -0000
@@ -750,7 +750,7 @@ XNESTED(MachFPTrap)
*/
FPReturn:
mfc0 t0, MIPS_COP_0_STATUS
- REG_S ra, CALLFRAME_RA(sp)
+ REG_L ra, CALLFRAME_RA(sp)
and t0, t0, ~MIPS_SR_COP_1_BIT
mtc0 t0, MIPS_COP_0_STATUS
COP0_SYNC
---
Izumi Tsutsui
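
The trap report quoted earlier in this PR is consistent with this one-instruction typo, which can be checked with a little address arithmetic. The sketch below uses the ksp and vaddr values from Markus's squid crash and the offset 20 from the faulting "sw ra,20(sp)" instruction. The 4096-byte page size and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
	EX_RA_OFFSET = 20,	/* from the faulting "sw ra,20(sp)" */
	EX_PAGE_SIZE = 4096	/* assumed page size */
};

/* Address written by a store to offset 20 from the stack pointer. */
uint32_t
ex_store_target(uint32_t sp)
{
	return sp + EX_RA_OFFSET;
}

/* True if both addresses fall within the same page. */
bool
ex_same_page(uint32_t a, uint32_t b)
{
	/* Page size must be a power of two for this to be meaningful. */
	assert((EX_PAGE_SIZE & (EX_PAGE_SIZE - 1)) == 0);
	return (a / EX_PAGE_SIZE) == (b / EX_PAGE_SIZE);
}
```

With ksp=0xcc54eff8 the store target is 0xcc54f00c, which matches the reported vaddr exactly and lies on the page after the one containing the stack pointer. That fits a stray store (REG_S) where a load (REG_L) was intended, although the thread itself does not spell out this reasoning.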
From: Markus W Kilbinger <mk@kilbi.de>
To: gnats-bugs@NetBSD.org
Cc: tsutsui@NetBSD.org,
gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: port-cobalt/44292 (-current kernels do not work on (my) cobaltqube
2 since one (1!) year) [and 1 more messages]
Date: Sat, 22 Jan 2011 16:22:12 +0100
>>>>> "Izumi" == Izumi Tsutsui <tsutsui@ceres.dti.ne.jp> writes:
Izumi> > Could you file a new PR for this TLB miss problem?
Izumi> Ah, never mind, could you try the following patch?
Izumi> Index: arch/mips/mips/locore.S
Izumi> ===================================================================
Izumi> RCS file: /cvsroot/src/sys/arch/mips/mips/locore.S,v
Izumi> retrieving revision 1.173
Izumi> diff -u -p -r1.173 locore.S
Izumi> --- arch/mips/mips/locore.S 22 Dec 2010 01:34:17 -0000 1.173
Izumi> +++ arch/mips/mips/locore.S 21 Jan 2011 16:13:39 -0000
Izumi> @@ -750,7 +750,7 @@ XNESTED(MachFPTrap)
Izumi> */
Izumi> FPReturn:
Izumi> mfc0 t0, MIPS_COP_0_STATUS
Izumi> - REG_S ra, CALLFRAME_RA(sp)
Izumi> + REG_L ra, CALLFRAME_RA(sp)
Izumi> and t0, t0, ~MIPS_SR_COP_1_BIT
Izumi> mtc0 t0, MIPS_COP_0_STATUS
Izumi> COP0_SYNC
That helped!! My qube 2 is up and running a -current kernel and now
userland quite flawlessly. No crash/panic so far!
Thanks a lot!
I guess, you can close the pr.
What about -current's cobalt64 capabilities? Worth trying?
Markus.
From: "Izumi Tsutsui" <tsutsui@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44292 CVS commit: src/sys/dev/ic
Date: Sat, 22 Jan 2011 16:59:27 +0000
Module Name: src
Committed By: tsutsui
Date: Sat Jan 22 16:59:27 UTC 2011
Modified Files:
src/sys/dev/ic: com.c
Log Message:
Revert part of changes in rev 1.298:
- it breaks cobalt's serial console as mentioned in PR port-cobalt/44292
- MCR_PRESCALE doesn't affect unless EFR_EFCR is set in the EFR register
- even if MCR_PRESCALE is enabled we should define appropriate sc_type
variants and BRG values should be adjusted in comspeed() per sc_type
- sc_frequency should be adjusted in MD attachment if necessary
Tested on cobalt by several people, ok from jklos@
To generate a diff of this commit:
cvs rdiff -u -r1.298 -r1.299 src/sys/dev/ic/com.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Izumi Tsutsui" <tsutsui@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44292 CVS commit: src/sys/arch/mips/mips
Date: Sat, 22 Jan 2011 17:31:32 +0000
Module Name: src
Committed By: tsutsui
Date: Sat Jan 22 17:31:31 UTC 2011
Modified Files:
src/sys/arch/mips/mips: locore.S
Log Message:
Fix a fatal typo that causes TLB miss panic in MachFPInterrupt().
Reported in followups of PR port-cobalt/44292.
To generate a diff of this commit:
cvs rdiff -u -r1.173 -r1.174 src/sys/arch/mips/mips/locore.S
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: feedback->closed
State-Changed-By: tsutsui@NetBSD.org
State-Changed-When: Sun, 23 Jan 2011 03:31:51 +0900
State-Changed-Why:
Now they work properly by fixing misc small but critical bugs.
From: Martin Mersberger <gremlin@portal-to-web.de>
To: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
Cc: mk@kilbi.de, gnats-bugs@NetBSD.org
Subject: Re: port-cobalt/44292 (-current kernels do not work on (my) cobaltqube
2 since one (1!) year) [and 1 more messages]
Date: Sun, 23 Jan 2011 11:42:54 +0100
Hi...
>> Could you file a new PR for this TLB miss problem?
> Ah, never mind, could you try the following patch?
>
> Index: arch/mips/mips/locore.S
> ===================================================================
> RCS file: /cvsroot/src/sys/arch/mips/mips/locore.S,v
> retrieving revision 1.173
....
Since my RAQ runs with that patch applied, I've not seen any problems
anymore (since 2 days..) - Thanks for your help!!
(... I'm preparing to run a ./build.sh build on that box and check if it
survives that one as well ;-) )
Markus, how is your Cube? ;-)
regards
Martin
From: Markus W Kilbinger <mk@kilbi.de>
To: gnats-bugs@NetBSD.org,tsutsui@NetBSD.org,gnats-admin@netbsd.org,netbsd-bugs@netbsd.org
Cc:
Subject: Re: port-cobalt/44292 (-current kernels do not work on (my) cobaltqube 2 since one (1!) year) [and 1 more messages]
Date: Mon, 24 Jan 2011 00:08:38 +0100
With the new kernel, both the old and new userland and pkgs seem to have problems: some daemons suddenly die, and my mail queue stalls randomly. From that point on the system is quite unusable :-/
I had to go back to a 5.1 kernel/system just to be able to write this mail.
Maybe I will find some time to investigate further.
Markus.
>Unformatted: