NetBSD Problem Report #32894

From www@netbsd.org  Tue Feb 21 19:27:20 2006
Return-Path: <www@netbsd.org>
Received: by narn.netbsd.org (Postfix, from userid 31301)
	id CD40663B871; Tue, 21 Feb 2006 19:27:20 +0000 (UTC)
Message-Id: <20060221192720.CD40663B871@narn.netbsd.org>
Date: Tue, 21 Feb 2006 19:27:20 +0000 (UTC)
From: sysadmin@terc.edu
Reply-To: sysadmin@terc.edu
To: gnats-bugs@netbsd.org
Subject: protection fault trap in tmx86_get_longrun_mode
X-Send-Pr-Version: www-1.0

>Number:         32894
>Category:       port-i386
>Synopsis:       protection fault trap in tmx86_get_longrun_mode
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-i386-maintainer
>State:          feedback
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Feb 21 19:30:00 +0000 2006
>Closed-Date:    
>Last-Modified:  Sat Nov 05 17:48:06 +0000 2011
>Originator:     Robby Griffin
>Release:        NetBSD 3.0
>Organization:
TERC
>Environment:
NetBSD  3.0 NetBSD 3.0 (TERC_RLX) #4: Tue Feb 21 12:17:21 EST 2006  root@foiegras:/usr/src/sys/arch/i386/compile/TERC_RLX i386

>Description:
Booting a minimally edited NetBSD 3.0 GENERIC kernel on an RLX ServerBlade 1000t results in a protection fault during CPU initialization:

cpu0 at mainbus0: (uniprocessor)
cpu0: Transmeta Crusoe (586-class), 1000.14 MHz, id 0x543
cpu0: Processor revision 1.5.0.2
cpu0: Code Morphing Software Rev: 4.3.6-9-571
cpu0: 20030113 18:40 official release 4.3.6#1
kernel: protection fault trap, code=0
Stopped in pid 0.1 (swapper) at netbsd:tmx86_get_longrun_mode+0xe:      rdmsr
db> bt
tmx86_get_longrun_mode(0,c090c8c7,78,80860007,33303032) at netbsd:tmx86_get_longrun_mode+0xe
transmeta_cpu_info(c07cd6a0,543,c0842a80,24,0) at netbsd:transmeta_cpu_info+0x91
identifycpu(c07cd6a0,c07c0f60,1,c07b2a40,c090cdcc) at netbsd:identifycpu+0x68d
cpu_attach(c1fc9f80,c1fc9f40,c090ce50,c07b6684,29) at netbsd:cpu_attach+0x100
config_attach_loc(c1fc9f80,c07b6684,0,c090ce50,c045d334) at netbsd:config_attach_loc+0x284
mainbus_attach(0,c1fc9f80,0,c07c4b80,0) at netbsd:mainbus_attach+0x63
config_attach_loc(0,c07b5788,0,0,0) at netbsd:config_attach_loc+0x284
config_rootfound(c071d650,0,c090cf68,c042ae01,c072f3d1) at netbsd:config_rootfound+0x2c
cpu_configure(0,c083ca80,c090cfa0,c03700e8,0) at netbsd:cpu_configure+0x24
configure(0,0,0,0,0) at netbsd:configure+0x4a
main(0,0,0,0,0) at netbsd:main+0x2d4

If I disable the call to tmx86_get_longrun_mode during CPU initialization, the machine boots, but a sysctl query still crashes it in the same way:

# sysctl -a
sysctl: warning: /var/run/dev.db: No such file or directory
kernel: protection fault trap, code=0
Stopped in pid 52.1 (sysctl) at netbsd:tmx86_get_longrun_mode+0xe:      rdmsr
db> t
tmx86_get_longrun_mode(0,1,0,0,1000272) at netbsd:tmx86_get_longrun_mode+0xe
sysctl_machdep_tm_longrun(cc37defc,0,bfbfe058,cc37def0,0) at netbsd:sysctl_machdep_tm_longrun+0x57
...

Here's dmesg from a successful boot with the call to tmx86_get_longrun_mode commented out:

NetBSD 3.0 (TERC_RLX) #4: Tue Feb 21 12:17:21 EST 2006
        root@foiegras:/usr/src/sys/arch/i386/compile/TERC_RLX
total memory = 1143 MB
avail memory = 1109 MB
BIOS32 rev. 0 found at 0xfd7b0
mainbus0 (root)
cpu0 at mainbus0: (uniprocessor)
cpu0: Transmeta Crusoe (586-class), 1000.15 MHz, id 0x543
cpu0: Processor revision 1.5.0.2
cpu0: Code Morphing Software Rev: 4.3.6-9-571
cpu0: 20030113 18:40 official release 4.3.6#1
cpu0: features 84893f<FPU,VME,DE,PSE,TSC,MSR,CX8,SEP>
cpu0: features 84893f<CMOV,PN,MMX>
cpu0: "Transmeta(tm) Crusoe(tm) Processor TM5800"
cpu0: serial number 0000-0543-0000-342E-00AC-8C24
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: Transmeta LongRun Northbridge (rev. 0x03)
Transmeta SDRAM Controller (RAM memory) at pci0 dev 0 function 1 not configured
Transmeta BIOS Scratchpad (RAM memory) at pci0 dev 0 function 2 not configured
pcib0 at pci0 dev 7 function 0
pcib0: Acer Labs M1543 PCI-ISA Bridge (rev. 0x00)
fxp0 at pci0 dev 9 function 0: i82559 Ethernet, rev 8
fxp0: interrupting at irq 11
fxp0: May need receiver lock-up workaround
fxp0: Ethernet address 00:42:52:01:1b:a4
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp1 at pci0 dev 10 function 0: i82559 Ethernet, rev 8
fxp1: interrupting at irq 10
fxp1: May need receiver lock-up workaround
fxp1: Ethernet address 00:42:52:01:1b:a5
inphy1 at fxp1 phy 1: i82555 10/100 media interface, rev. 4
inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp2 at pci0 dev 11 function 0: i82559 Ethernet, rev 8
fxp2: interrupting at irq 7
fxp2: May need receiver lock-up workaround
fxp2: Ethernet address 00:42:52:01:1b:a6
inphy2 at fxp2 phy 1: i82555 10/100 media interface, rev. 4
inphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
aceride0 at pci0 dev 15 function 0
aceride0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc3)
aceride0: bus-master DMA support present
aceride0: primary channel wired to compatibility mode
aceride0: primary channel interrupting at irq 14
atabus0 at aceride0 channel 0
aceride0: secondary channel wired to compatibility mode
aceride0: secondary channel interrupting at irq 15
atabus1 at aceride0 channel 1
Acer Labs M7101 Power Management Controller (miscellaneous bridge) at pci0 dev 17 function 0 not configured
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
isapnp0: no ISA Plug 'n Play devices found
Kernelized RAIDframe activated
wd0 at atabus0 drive 0: <FUJITSU MHT2060AS>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 57231 MB, 116280 cyl, 16 head, 63 sec, 512 bytes/sect x 117210240 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)
wd1 at atabus1 drive 0: <FUJITSU MHT2060AS>
wd1: drive supports 16-sector PIO transfers, LBA addressing
wd1: 57231 MB, 116280 cyl, 16 head, 63 sec, 512 bytes/sect x 117210240 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd1(aceride0:1:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)
boot device: fxp2
root on fxp2

>How-To-Repeat:
Obtain an RLX ServerBlade 1000t. Edit the GENERIC or INSTALL kernel config to set CONS_OVERRIDE and force the console onto com0 (a workaround for the oddities of BIOS console redirection), then try to boot.

>Fix:
I am not sure why we get a protection fault while trying to read an MSR. A horrible workaround is to forcibly disable LongRun support; if that is absolutely necessary, it would probably be better done through the kernel config:

--- usr/src/sys/arch/i386/i386/identcpu.c.orig	2005-07-18 16:48:58.000000000 -0400
+++ usr/src/sys/arch/i386/i386/identcpu.c	2006-02-21 14:17:26.000000000 -0500
@@ -1079,7 +1079,10 @@
 		info.text[64] = 0;
 		printf("%s: %s\n", ci->ci_dev->dv_xname, info.text);
 	}
-
+#if 0
+	/* XXX TERC - disabling this to prevent protection fault
+	 * XXX TERC - during boot of RLX ServerBlade 1000t (TM5800)
+	 */
 	if (nreg >= 0x80860007) {
 		crusoe_longrun = tmx86_get_longrun_mode();
 		tmx86_get_longrun_status(&crusoe_frequency,
@@ -1089,6 +1092,7 @@
 		    crusoe_longrun, crusoe_frequency, crusoe_voltage,
 		    crusoe_percentage);
 	}
+#endif
 }

 void
@@ -1097,8 +1101,13 @@
 	u_int nreg = 0, dummy;

 	CPUID(0x80860000, nreg, dummy, dummy, dummy);
+# if 0
+	/* XXX TERC - disabling this to prevent protection fault in
+	 * XXX TERC - sysctl on RLX ServerBlade 1000t (TM5800)
+	 */
 	if (nreg >= 0x80860007)
 		tmx86_has_longrun = 1;
+# endif
 }

 static const char n_support[] __attribute__((__unused__)) =
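
A less drastic alternative to the "#if 0" workaround above would be to probe the LongRun MSRs with a fault-tolerant read before enabling the feature. The following is only a user-space sketch of that idea, not kernel code and not the actual change to longrun.c. The names sim_cpu, sim_msr, sim_rdmsr_safe and sim_longrun_probe are hypothetical, the MSR numbers are assumed values that should be checked against specialreg.h, and the "return 0 on success, EFAULT if the MSR is absent" convention is an assumption about how a safe MSR read behaves.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed Transmeta LongRun MSR numbers; verify against specialreg.h. */
#define MSR_TMx86_LONGRUN         0x80868010u
#define MSR_TMx86_LONGRUN_FLAGS   0x80868011u
/* CPUID level tested by the existing "nreg >= 0x80860007" check. */
#define TMX86_CPUID_LONGRUN_LEVEL 0x80860007u

/* Hypothetical model of one MSR that the simulated CPU implements. */
struct sim_msr {
	uint32_t msr;
	uint64_t value;
};

/* Hypothetical model of a CPU: its Transmeta CPUID level and its MSRs. */
struct sim_cpu {
	uint32_t max_tm_level;		/* EAX from CPUID 0x80860000 */
	const struct sim_msr *msrs;
	size_t nmsrs;
};

/*
 * Stand-in for a fault-tolerant MSR read. Returns 0 and stores the value
 * on success, or EFAULT when the MSR is absent, which is the case where a
 * raw rdmsr instruction would raise a protection fault.
 */
int
sim_rdmsr_safe(const struct sim_cpu *cpu, uint32_t msr, uint64_t *data)
{
	for (size_t i = 0; i < cpu->nmsrs; i++) {
		if (cpu->msrs[i].msr == msr) {
			*data = cpu->msrs[i].value;
			return 0;
		}
	}
	return EFAULT;
}

/*
 * Report LongRun as usable (1) only if the CPUID level is high enough
 * AND both MSRs can actually be read; otherwise report 0.
 */
int
sim_longrun_probe(const struct sim_cpu *cpu)
{
	uint64_t scratch;

	if (cpu->max_tm_level < TMX86_CPUID_LONGRUN_LEVEL)
		return 0;
	if (sim_rdmsr_safe(cpu, MSR_TMx86_LONGRUN, &scratch) != 0)
		return 0;
	if (sim_rdmsr_safe(cpu, MSR_TMx86_LONGRUN_FLAGS, &scratch) != 0)
		return 0;
	return 1;
}
```

In this model, a CPU like the one in this report (CPUID level 0x80860007 but no readable LongRun MSRs) makes sim_longrun_probe return 0, so both the boot-time status printout and the sysctl handler could be gated on the probe result instead of on the CPUID level alone.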

>Release-Note:

>Audit-Trail:
From: "Jared D. McNeill" <jmcneill@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/32894 CVS commit: src/sys/arch/i386/i386
Date: Sun, 23 Oct 2011 13:02:32 +0000

 Module Name:	src
 Committed By:	jmcneill
 Date:		Sun Oct 23 13:02:32 UTC 2011

 Modified Files:
 	src/sys/arch/i386/i386: longrun.c

 Log Message:
 PR #32894: protection fault trap in tmx86_get_longrun_mode

 Use rdmsr_safe in tmx86_init_longrun to verify that the MSRs are present.


 To generate a diff of this commit:
 cvs rdiff -u -r1.3 -r1.4 src/sys/arch/i386/i386/longrun.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sat, 05 Nov 2011 17:48:06 +0000
State-Changed-Why:
Did the commit help?


>Unformatted:
