NetBSD Problem Report #52266

From gson@gson.org  Wed May 31 12:50:35 2017
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 122057A1BE
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 31 May 2017 12:50:35 +0000 (UTC)
Message-Id: <20170531125029.A997A743D38@guava.gson.org>
Date: Wed, 31 May 2017 15:50:29 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: Double fault early in boot with Transmeta Crusoe CPU
X-Send-Pr-Version: 3.95

>Number:         52266
>Category:       port-i386
>Synopsis:       Double fault early in boot with Transmeta Crusoe CPU
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    nonaka
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed May 31 12:55:00 +0000 2017
>Closed-Date:    Sat Jul 22 12:03:49 +0000 2017
>Last-Modified:  Sat Jul 22 12:03:49 +0000 2017
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current, source date >= 2017.05.23.08.54.39
>Organization:

>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:

I have an old NEC Versa Daylite laptop with a Transmeta Crusoe CPU
running NetBSD/i386.  I recently tried to upgrade the kernel to
-current, but it crashed very early in the boot, immediately after
printing the kernel segment sizes, with the following message
(transcribed manually):

  Fatal double fault in supervisor mode
  Trap type 13 code 0xc0116f29 eip 0x8 cs 0x296 eflags 0xc0150010 cr2 0 ilevel 0 esp 0xc08b0030
  curlwp 0xc1231920 pid 0 lid 1 lowest kstack 0xc149e2c0
  kernel: user trap double fault, code=0
  Stopped in pid 0.1 (system) at  8:      invalid address
  db{0}>

The keyboard does not respond to ddb commands at this point.

By bisection, I have determined that the problem appeared with the
following recent commits:

  2017.05.23.08.54.38 nonaka src/sys/arch/amd64/amd64/db_interface.c 1.25
  2017.05.23.08.54.38 nonaka src/sys/arch/amd64/amd64/mainbus.c 1.38
  2017.05.23.08.54.38 nonaka src/sys/arch/amd64/amd64/vector.S 1.49
  2017.05.23.08.54.38 nonaka src/sys/arch/amd64/include/i82093reg.h 1.8
  2017.05.23.08.54.38 nonaka src/sys/arch/i386/i386/db_interface.c 1.72
  2017.05.23.08.54.38 nonaka src/sys/arch/i386/i386/mainbus.c 1.103
  2017.05.23.08.54.38 nonaka src/sys/arch/i386/i386/vector.S 1.69
  2017.05.23.08.54.39 nonaka src/sys/arch/i386/include/i82093reg.h 1.10
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/include/cpuvar.h 1.50
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/include/i82489var.h 1.19
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/include/intr.h 1.50
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/include/mpacpi.h 1.11
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/pci/msipic.c 1.9
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/x86/cpu.c 1.125
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/x86/lapic.c 1.58
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/x86/pmc.c 1.7
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/x86/tprof_amdpmi.c 1.7
  2017.05.23.08.54.39 nonaka src/sys/arch/x86/x86/tprof_pmi.c 1.14
  2017.05.23.08.54.39 nonaka src/sys/arch/xen/include/intr.h 1.40
  2017.05.23.08.54.39 nonaka src/sys/arch/xen/include/mpacpi.h 1.2
  2017.05.23.08.54.39 nonaka src/sys/arch/xen/x86/intr.c 1.31
  2017.05.23.08.54.39 nonaka src/sys/arch/xen/x86/mainbus.c 1.19

>How-To-Repeat:

Attempt to boot NetBSD-current/i386 on a Transmeta Crusoe CPU.

>Fix:

>Release-Note:

>Audit-Trail:
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta
 Crusoe CPU
Date: Wed, 31 May 2017 13:17:37 +0000

 BTW, you can hard-code DDB_COMMANDONENTER in the kernel config:
 options 	DDB_COMMANDONENTER="bt"

From: Andreas Gustafsson <gson@gson.org>
To: coypu@sdf.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta
 Crusoe CPU
Date: Wed, 31 May 2017 17:54:33 +0300

 coypu@sdf.org wrote:
 >  BTW, you can hard-code DDB_COMMANDONENTER in the kernel config:
 >  options 	DDB_COMMANDONENTER="bt"

 I have now tried that, but it just made the machine spontaneously
 reboot.
 -- 
 Andreas Gustafsson, gson@gson.org

Responsible-Changed-From-To: port-i386-maintainer->nonaka
Responsible-Changed-By: gson@NetBSD.org
Responsible-Changed-When: Wed, 31 May 2017 14:56:04 +0000
Responsible-Changed-Why:
Problem started with nonaka's commits


From: Kimihiro Nonaka <nonakap@gmail.com>
To: "gnats-bugs@netbsd.org" <gnats-bugs@netbsd.org>
Cc: port-i386-maintainer@netbsd.org, 
	"gnats-admin@netbsd.org" <gnats-admin@netbsd.org>, "netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe CPU
Date: Thu, 1 Jun 2017 12:37:06 +0900

 Could you send the dmesg and the output of "cpuctl identify 0" from the old kernel?


From: Andreas Gustafsson <gson@gson.org>
To: nonaka@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe CPU
Date: Thu, 1 Jun 2017 09:27:57 +0300

 Kimihiro Nonaka wrote:
 >  Could you send the dmesg and the output of "cpuctl identify 0" from the old kernel?

 dmesg:

 Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
     2006, 2007, 2008, 2009, 2010, 2011, 2012
     The NetBSD Foundation, Inc.  All rights reserved.
 Copyright (c) 1982, 1986, 1989, 1991, 1993
     The Regents of the University of California.  All rights reserved.

 NetBSD 6.1.5 (GUNNEL) #0: Fri Jun 12 15:45:43 EEST 2015
 	gson@guido.araneus.fi:/bracket/prod/6.1.5-edis/i386/obj/sys/arch/i386/compile/GUNNEL
 total memory = 175 MB
 avail memory = 159 MB
 timecounter: Timecounters tick every 10.000 msec
 timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
 NEC                              VUS6BCE-00B-000 (                                )
 mainbus0 (root)
 cpu0 at mainbus0: Transmeta(tm) Crusoe(tm) Processor TMTM5600, id 0x543
 acpi0 at mainbus0: Intel ACPICA 20110623
 acpi0: X/RSDT: OemId <NEC   ,ND000034,06040005>, AslId < LTP,00000000>
 LNK1: ACPI: Found matching pin for 0.3.INTA at func 0: 255
 LNK2: ACPI: Found matching pin for 0.4.INTA at func 0: 11
 LNK3: ACPI: Found matching pin for 0.5.INTA at func 0: 10
 LNK1: ACPI: Found matching pin for 0.6.INTA at func 0: 9
 LNKU: ACPI: Found matching pin for 0.20.INTA at func 0: 5
 acpi0: SCI interrupting at int 9
 timecounter: Timecounter "ACPI-Safe" frequency 3579545 Hz quality 900
 attimer1 at acpi0 (TIME, PNP0100): io 0x40-0x43 irq 0
 npx1 at acpi0 (MATH, PNP0C04): io 0xf0-0xfe irq 13
 npx1: reported by CPUID; using exception 16
 pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
 midi0 at pcppi1: PC speaker
 sysbeep0 at pcppi1
 USKB (PNP0303) at acpi0 not configured
 SYSR (PNP0C02) at acpi0 not configured
 acpilid0 at acpi0 (LID, PNP0C0D): ACPI Lid Switch
 acpiacad0 at acpi0 (ADP, ACPI0003): ACPI AC Adapter
 acpibat0 at acpi0 (BAT2, PNP0C0A-2): ACPI Battery
 acpifan0 at acpi0 (LRA0, PNP0C0B): ACPI Fan
 acpitz0 at acpi0 (THRM)
 acpitz0: levels: critical 100.0 C, passive cooling
 apm0 at acpi0: Power Management spec V1.2
 attimer1: attached to pcppi1
 pci0 at mainbus0 bus 0: configuration mode 1
 pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
 pchb0 at pci0 dev 0 function 0: vendor 0x1279 product 0x0395 (rev. 0x00)
 vendor 0x1279 product 0x0396 (RAM memory) at pci0 dev 0 function 1 not configured
 vendor 0x1279 product 0x0397 (RAM memory) at pci0 dev 0 function 2 not configured
 cbb0 at pci0 dev 3 function 0: vendor 0x104c product 0xac50 (rev. 0x01)
 eso0 at pci0 dev 4 function 0: ESS Solo-1 PCI AudioDrive ES1946 Revision E
 eso0: interrupting at irq 11
 eso0: mapping Audio 1 DMA using VC I/O space at 0x1480
 audio0 at eso0: full duplex, playback, capture, mmap, independent
 opl0 at eso0: model OPL3
 midi1 at opl0: ESO Yamaha OPL3
 mpu0 at eso0
 midi2 at mpu0: ESO MPU-401 MIDI UART
 joy0 at eso0
 joy0: joystick not connected
 vga1 at pci0 dev 5 function 0: vendor 0x1002 product 0x4c52 (rev. 0x64)
 wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
 wsmux1: connecting to wsdisplay0
 mach64drm0 at vga1: Rage Mobility P/M
 mach64drm0: Initialized mach64 2.0.0 20060718
 fxp0 at pci0 dev 6 function 0: i82559S Ethernet (rev. 0x09)
 fxp0: interrupting at irq 9
 fxp0: May need receiver lock-up workaround
 fxp0: Ethernet address 00:10:a4:16:62:84
 inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
 inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 vendor 0x115d product 0x002b (serial communications, interface 0x02) at pci0 dev 6 function 1 not configured
 pcib0 at pci0 dev 7 function 0: vendor 0x10b9 product 0x1533 (rev. 0x00)
 aceride0 at pci0 dev 16 function 0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc3)
 aceride0: bus-master DMA support present
 aceride0: using PIO transfers above 137GB as workaround for 48bit DMA access bug, expect reduced performance
 aceride0: primary channel wired to compatibility mode
 aceride0: primary channel interrupting at irq 14
 atabus0 at aceride0 channel 0
 aceride0: secondary channel wired to compatibility mode
 aceride0: secondary channel interrupting at irq 15
 atabus1 at aceride0 channel 1
 alipm0 at pci0 dev 17 function 0: 74KHz clock
 iic0 at alipm0: I2C bus
 ohci0 at pci0 dev 20 function 0: vendor 0x10b9 product 0x5237 (rev. 0x03)
 ohci0: interrupting at irq 5
 ohci0: OHCI version 1.0, legacy support
 usb0 at ohci0: USB revision 1.0
 cbb0: cacheline 0x8 lattimer 0x51
 cbb0: bhlc 0x25108
 cbb0: interrupting at irq 9
 cardslot0 at cbb0
 cardbus0 at cardslot0: bus 1
 pcmcia0 at cardslot0
 isa0 at pcib0
 pckbc0 at isa0 port 0x60-0x64
 pckbd0 at pckbc0 (kbd slot)
 pckbc0: using irq 1 for kbd slot
 wskbd0 at pckbd0: console keyboard, using wsdisplay0
 pms0 at pckbc0 (aux slot)
 pckbc0: using irq 12 for aux slot
 wsmouse0 at pms0 mux 0
 acpicpu0 at cpu0: ACPI CPU
 acpicpu0: C1: HLT, lat   0 us, pow     0 mW
 acpicpu0: C2: I/O, lat  10 us, pow     0 mW
 acpicpu0: C3: I/O, lat  32 us, pow     0 mW
 acpicpu0: T0: I/O, lat   1 us, pow     0 mW, 100 %
 acpicpu0: T1: I/O, lat   1 us, pow     0 mW,  88 %
 acpicpu0: T2: I/O, lat   1 us, pow     0 mW,  76 %
 acpicpu0: T3: I/O, lat   1 us, pow     0 mW,  64 %
 acpicpu0: T4: I/O, lat   1 us, pow     0 mW,  52 %
 acpicpu0: T5: I/O, lat   1 us, pow     0 mW,  40 %
 acpicpu0: T6: I/O, lat   1 us, pow     0 mW,  28 %
 acpicpu0: T7: I/O, lat   1 us, pow     0 mW,  16 %
 timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
 acpiacad0: AC adapter online.
 uhub0 at usb0: vendor 0x10b9 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
 uhub0: 2 ports with 2 removable, self powered
 wi0 at pcmcia0 function 0: <Lucent Technologies, WaveLAN/IEEE, Version 01.01, >
 wi0: 802.11 address 00:60:1d:f2:27:87
 wi0: using Lucent Technologies, WaveLAN/IEEE
 wi0: Lucent Firmware: Station (4.52.1)
 wi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
 wd0 at atabus0 drive 0
 wd0: <IC25N020ATCS04-0>
 wd0: drive supports 16-sector PIO transfers, LBA addressing
 wd0: 19077 MB, 41344 cyl, 15 head, 63 sec, 512 bytes/sect x 39070080 sectors
 wd0: 32-bit data port
 wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
 wd0(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)
 Kernelized RAIDframe activated
 boot device: wd0
 root on wd0a dumps on wd0b
 root file system type: ffs
 acpibat0: normal capacity on 'charge state'
 wsdisplay0: screen 1 added (80x25, vt100 emulation)
 wsdisplay0: screen 2 added (80x25, vt100 emulation)
 wsdisplay0: screen 3 added (80x25, vt100 emulation)
 wsdisplay0: screen 4 added (80x25, vt100 emulation)

 Output from "cpuctl identify 0":

 cpu0: Transmeta Crusoe (586-class), 592.78 MHz, id 0x543
 cpu0: Processor revision 1.3.1.3
 cpu0: Code Morphing Software Rev: 4.1.4-7-51
 cpu0: 20000805 23:30 official release 4.1.4#2
 cpu0: LongRun <600MHz 1600mV 100%>
 cpu0: features  0x84803f<FPU,VME,DE,PSE,TSC,MSR,CMOV,PN,MMX>
 cpu0: "Transmeta(tm) Crusoe(tm) Processor TMTM5600"
 cpu0: serial number 0000-0543-0000-0F29-0A21-475A
 cpu0: Initial APIC ID 0
 cpu0: family 05 model 04 extfamily 00 extmodel 00 stepping 03
 cpu0: UCode version: ?

 -- 
 Andreas Gustafsson, gson@gson.org

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe
 CPU
Date: Mon, 26 Jun 2017 09:12:05 -0500 (CDT)

 As reported here:

   http://mail-index.netbsd.org/current-users/2017/06/24/msg031962.html

 this problem also affects the i486-class CPU found in the Soekris net4501.
 A kernel built with sources from 2017.05.23.08.54.37 boots and runs
 fine.  A kernel built with sources from 2017.05.23.08.54.40 fails
 immediately after printing the segment sizes (and any module load
 messages) with:

 fatal privileged instruction fault in sufatal double fault in supervisor mode
 trap type 13 code 0xc0117238 eip 0x8 cs 0x246 eflags 0xc017a4d1 cr2 0 ilevel 0x8 esp 0xc0405f20
 curlwp 0xc0413ba0 pid 0 lid 1 lowest kstack 0xc058c2c0
 kernel: user trap double fault, code=0
 Stopped in pid 0.1 (system) at  8:      invalid address
 db{0}> 

 Attempting 'bt' immediately reboots the machine.

 Output of 'cpuctl identify 0' under last working kernel:

 cpu0: highest basic info 00000001
 cpu0: AMD Am5x86 W/B 133/160 (486-class)
 cpu0: family 0x4 model 0xf stepping 0x4 (id 0x4f4)
 cpu0: features 0x1<FPU>
 cpu0: Initial APIC ID 0

 "/var/run/dmesg.boot" with last working kernel:

 Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
     2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017
     The NetBSD Foundation, Inc.  All rights reserved.
 Copyright (c) 1982, 1986, 1989, 1991, 1993
     The Regents of the University of California.  All rights reserved.

 NetBSD 7.99.72 (NET4501) #3: Sun Jun 25 23:25:56 CDT 2017
 	sysop@plex760.technoskunk.fur:/r0/build/nbsd-tst/obj/i386/sys/arch/i386/compile/NET4501
 total memory = 65148 KB
 avail memory = 59352 KB
 timecounter: Timecounters tick every 10.000 msec
 timecounter: Timecounter "i8254" frequency 1189200 Hz quality 100
 Generic PC
 mainbus0 (root)
 cpu0 at mainbus0
 cpu0: AMD 486-class, id 0x4f4
 cpu0: package 0, core 0, smt 0
 elansc0 at mainbus0 bus 0: AMD Elan SC520 System Controller
 elansc0: product 0 stepping 1.1, CPU clock 133MHz
 gpio0 at elansc0: 32 pins
 pci0 at elansc0 bus 0: configuration mode 1
 pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
 vendor 1022 product 3000 (host bridge) at pci0 dev 0 function 0 not configured
 sip0 at pci0 dev 18 function 0: NatSemi DP83815 10/100 Ethernet, rev 00
 sip0: interrupting at irq 10
 sip0: Ethernet address xx:xx:xx:xx:xx:xx
 nsphyter0 at sip0 phy 0: DP83815 10/100 media interface, rev. 1
 nsphyter0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 sip1 at pci0 dev 19 function 0: NatSemi DP83815 10/100 Ethernet, rev 00
 sip1: interrupting at irq 11
 sip1: Ethernet address xx:xx:xx:xx:xx:xx
 nsphyter1 at sip1 phy 0: DP83815 10/100 media interface, rev. 1
 nsphyter1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 sip2 at pci0 dev 20 function 0: NatSemi DP83815 10/100 Ethernet, rev 00
 sip2: interrupting at irq 5
 sip2: Ethernet address xx:xx:xx:xx:xx:xx
 nsphyter2 at sip2 phy 0: DP83815 10/100 media interface, rev. 1
 nsphyter2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 isa0 at mainbus0
 com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
 com0: console
 com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
 wdc0 at isa0 port 0x1f0-0x1f7 irq 14
 atabus0 at wdc0 channel 0
 timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
 wd0 at atabus0 drive 0
 wd0: <ELITE PRO CF CARD 8GB>
 wd0: drive supports 1-sector PIO transfers, LBA addressing
 wd0: 7647 MB, 15538 cyl, 16 head, 63 sec, 512 bytes/sect x 15662304 sectors
 wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
 boot device: sip0
 root on sip0
 nfs_boot: trying DHCP/BOOTP
 nfs_boot: DHCP next-server: a.b.c.d
 nfs_boot: my_name=net4501d
 nfs_boot: my_domain=technoskunk.fur
 nfs_boot: my_addr=e.f.g.h
 nfs_boot: my_mask=m.n.o.p
 nfs_boot: gateway=r.p.q.t
 root on a.b.c.d:/r0/diskless/net4501d
 root file system type: nfs
 kern.module.path=/stand/i386/7.99.72/modules

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: Simon Burge <simonb@NetBSD.org>
To: Kimihiro Nonaka <nonakap@gmail.com>
Cc: "gnats-bugs@netbsd.org" <gnats-bugs@netbsd.org>,
    port-i386-maintainer@netbsd.org,
    "gnats-admin@netbsd.org" <gnats-admin@netbsd.org>,
    "netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe CPU
Date: Sat, 08 Jul 2017 17:10:33 +1000

 Kimihiro Nonaka wrote:

 > Could you send the dmesg and the output of "cpuctl identify 0" from the old kernel?

 I get the same problem on my Soekris net4801 with the following cpu:

 	cpu0: highest basic info 00000002
 	cpu0: highest extended info 80000005
 	cpu0: "Geode(TM) Integrated Processor by National Semi"
 	cpu0: National Semiconductor Geode GX1 (586-class)
 	cpu0: family 0x5 model 0x4 stepping 0 (id 0x540)
 	cpu0: features 0x808131<FPU,TSC,MSR,CX8,CMOV,MMX>
 	cpu0: ITLB 1 4KB entries 112-way
 	cpu0: Initial APIC ID 0

 If I change lapic_is_x2apic() to unconditionally return false, my
 Soekris boots (at least to single user mode).

 I tried to get lapic_is_x2apic() to store the value of the MSR it
 reads by changing that function to:

 	uint64_t x2apic_msr;

 	bool
 	lapic_is_x2apic(void)
 	{
 		x2apic_msr = rdmsr(MSR_APICBASE);
 		return false;
 	}

 but that also faulted/panicked, though slightly differently:

 	> boot net8 -s
 	17730720+696076+839124 [776736+802655]=0x13e1cbc
 	fatal protection faufatal double fault in supervisor mode
 	trap type 13 code 0xc0118298 eip 0x8 cs 0x246 eflags 0xc054bbf6 cr2 0 ilevel 0x8 esp 0xc11ea760
 	curlwp 0xc125f360 pid 0 lid 1 lowest kstack 0xc14e32c0
 	kernel: user trap double fault, code=0
 	Stopped in pid 0.1 (system) at  8:      invalid address
 	db{0}> 

 The chopped off "fatal protection fau" is new.  Could the rdmsr() itself
 be faulting then?

 Is there any further info I can get to help?

 Cheers,
 Simon.

From: Kimihiro Nonaka <nonakap@gmail.com>
To: "gnats-bugs@netbsd.org" <gnats-bugs@netbsd.org>
Cc: NONAKA Kimihiro <nonaka@netbsd.org>, "gnats-admin@netbsd.org" <gnats-admin@netbsd.org>, 
	"netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>, Andreas Gustafsson <gson@gson.org>
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe CPU
Date: Sat, 8 Jul 2017 20:06:04 +0900

 Hi,

 2017-07-08 18:15 GMT+09:00 Simon Burge <simonb@netbsd.org>:

 >  If I change lapic_is_x2apic() to unconditionally return false, my
 >  Soekris boots (at least to single user mode).
 >
 >  I tried to get lapic_is_x2apic() to store the value of the MSR it
 >  reads by changing that function to:
 >
 >         uint64_t x2apic_msr;
 >
 >         bool
 >         lapic_is_x2apic(void)
 >         {
 >                 x2apic_msr = rdmsr(MSR_APICBASE);
 >                 return false;
 >         }
 >
 >  but that also faulted/panicked, though slightly differently:
 >
 >         > boot net8 -s
 >         17730720+696076+839124 [776736+802655]=0x13e1cbc
 >         fatal protection faufatal double fault in supervisor mode
 >         trap type 13 code 0xc0118298 eip 0x8 cs 0x246 eflags 0xc054bbf6 cr2 0 ilevel 0x8 esp 0xc11ea760
 >         curlwp 0xc125f360 pid 0 lid 1 lowest kstack 0xc14e32c0
 >         kernel: user trap double fault, code=0
 >         Stopped in pid 0.1 (system) at  8:      invalid address
 >         db{0}>
 >
 >  The chopped off "fatal protection fau" is new.  Could the rdmsr() itself
 >  be faulting then?
 >
 >  Is there any further info I can get to help?

 Could you try the following patch?

 diff --git a/sys/arch/x86/x86/lapic.c b/sys/arch/x86/x86/lapic.c
 index 20822a67184..372e9f8c0c2 100644
 --- a/sys/arch/x86/x86/lapic.c
 +++ b/sys/arch/x86/x86/lapic.c
 @@ -235,10 +235,12 @@ lapic_enable_x2apic(void)
  bool
  lapic_is_x2apic(void)
  {
 -    uint64_t r;
 +    uint64_t msr;

 -    r = rdmsr(MSR_APICBASE);
 -    return (r & (APICBASE_EN | APICBASE_EXTD)) == (APICBASE_EN | APICBASE_EXTD);
 +    if (rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
 +        return false;
 +    return (msr & (APICBASE_EN | APICBASE_EXTD)) ==
 +        (APICBASE_EN | APICBASE_EXTD);
  }

  /*


 Regards,
 -- 
 Kimihiro Nonaka

From: "NONAKA Kimihiro" <nonaka@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52266 CVS commit: src/sys/arch/x86/x86
Date: Sat, 8 Jul 2017 14:35:33 +0000

 Module Name:	src
 Committed By:	nonaka
 Date:		Sat Jul  8 14:35:33 UTC 2017

 Modified Files:
 	src/sys/arch/x86/x86: lapic.c

 Log Message:
 PR/52266: use rdmsr_safe(9) instead of rdmsr(9) for old machine.

 tested by simonb@


 To generate a diff of this commit:
 cvs rdiff -u -r1.58 -r1.59 src/sys/arch/x86/x86/lapic.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
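
 For illustration, the fallback pattern in this change can be sketched in
 userland C.  This is a minimal sketch and not the kernel code: the
 mock_cpu structure, mock_rdmsr_safe() and mock_lapic_is_x2apic() are
 hypothetical names invented for the example, with mock_rdmsr_safe()
 standing in for rdmsr_safe(9) and returning EFAULT where a real read
 would have faulted.  The bit values assume the architectural
 IA32_APIC_BASE layout (MSR 0x1b, bit 10 for x2APIC mode, bit 11 for
 global enable).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_APICBASE	0x01b		/* IA32_APIC_BASE */
#define APICBASE_EXTD	0x00000400ULL	/* bit 10: x2APIC mode enabled */
#define APICBASE_EN	0x00000800ULL	/* bit 11: local APIC globally enabled */

/* Hypothetical model of a CPU: does the MSR exist, and what does it hold? */
struct mock_cpu {
	bool has_apicbase;
	uint64_t apicbase;
};

/*
 * Hypothetical stand-in for rdmsr_safe(9): returns 0 and stores the value
 * on success, or EFAULT where a real read would have faulted.
 */
static int
mock_rdmsr_safe(const struct mock_cpu *cpu, unsigned int msr, uint64_t *val)
{
	assert(cpu != NULL && val != NULL);
	if (msr != MSR_APICBASE || !cpu->has_apicbase)
		return EFAULT;
	*val = cpu->apicbase;
	return 0;
}

/* Same shape as the patched check: a failed read means "not x2APIC". */
static bool
mock_lapic_is_x2apic(const struct mock_cpu *cpu)
{
	uint64_t msr;

	if (mock_rdmsr_safe(cpu, MSR_APICBASE, &msr) == EFAULT)
		return false;
	return (msr & (APICBASE_EN | APICBASE_EXTD)) ==
	    (APICBASE_EN | APICBASE_EXTD);
}
```

 In this sketch a CPU model without the MSR yields false instead of a
 trap, and a value with only one of the two bits set is also reported as
 not being in x2APIC mode.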

From: Simon Burge <simonb@NetBSD.org>
To: Kimihiro Nonaka <nonakap@gmail.com>
Cc: "gnats-bugs@netbsd.org" <gnats-bugs@netbsd.org>,
    "gnats-admin@netbsd.org" <gnats-admin@netbsd.org>,
    "netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>,
    Andreas Gustafsson <gson@gson.org>
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe CPU
Date: Sun, 09 Jul 2017 00:24:35 +1000

 Kimihiro Nonaka wrote:

 > Could you try the following patch?
 > 
 > diff --git a/sys/arch/x86/x86/lapic.c b/sys/arch/x86/x86/lapic.c
 > index 20822a67184..372e9f8c0c2 100644
 > --- a/sys/arch/x86/x86/lapic.c
 > +++ b/sys/arch/x86/x86/lapic.c
 > @@ -235,10 +235,12 @@ lapic_enable_x2apic(void)
 >  bool
 >  lapic_is_x2apic(void)
 >  {
 > -    uint64_t r;
 > +    uint64_t msr;
 > 
 > -    r = rdmsr(MSR_APICBASE);
 > -    return (r & (APICBASE_EN | APICBASE_EXTD)) == (APICBASE_EN | APICBASE_EXTD);
 > +    if (rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
 > +        return false;
 > +    return (msr & (APICBASE_EN | APICBASE_EXTD)) ==
 > +        (APICBASE_EN | APICBASE_EXTD);
 >  }
 > 
 >  /*

 This works (tested on netbsd-8 branch).  Thank you!

 Can you please commit and pull up to the netbsd-8 branch?

 Cheers,
 Simon.

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe
 CPU
Date: Sat, 8 Jul 2017 13:59:41 -0500 (CDT)

 On Mon, 26 Jun 2017, John D. Baker wrote:

 > this problem also affects the i486-class CPU found in the Soekris net4501.

 I rebuilt with the patch, but my net4501 still panics.

 Perhaps I need to remove the kernel objdirs?

 I see the patch has been committed, so I'll update and try again.

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe
 CPU
Date: Sat, 8 Jul 2017 17:52:43 -0500 (CDT)

 On Sat, 8 Jul 2017, John D. Baker wrote:

 > On Mon, 26 Jun 2017, John D. Baker wrote:
 > 
 > > this problem also affects the i486-class CPU found in the Soekris net4501.
 > 
 > I rebuilt with the patch, but my net4501 still panics.
 > 
 > Perhaps I need to remove the kernel objdirs?
 > 
 > I see the patch has been committed, so I'll update and try again.

 I couldn't update just yet, but removed the kernel objdirs and
 rebuilt.

 No change.  My net4501 still panics the same as before.  Perhaps it's
 hitting another instruction or reading another register present in 586+
 CPUs (Geode GX1, Crusoe) but not in 486 CPUs (AMD Am5x86 W/B)?

 The panic message remains:

 > boot /netbsd.tst -s
 17775532+697548+846740 [821024+839243+13086]=0x14060c0
 fatal privileged instruction fault fatal double fault in supervisor mode
 trap type 13 code 0xc0118298 eip 0x8 cs 0x246 eflags 0xc054cd46 cr2 0 ilevel 0x8 esp 0xc11f5760
 curlwp 0xc126a820 pid 0 lid 1 lowest kstack 0xc15772c0
 kernel: user trap double fault, code=0
 Stopped in pid 0.1 (system) at  8:      invalid address


 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645
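
 The "cpuctl identify 0" feature words reported in this thread can be
 decoded to see what each CPU advertises.  The sketch below is only an
 illustration: it assumes the standard CPUID feature bits (bit 5 for MSR
 support, bit 9 for an on-chip local APIC), and may_probe_apicbase() and
 count_probe_safe() are hypothetical helpers invented here, not existing
 kernel functions.  Whether such a guard would be sufficient on these
 machines is not established in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID_MSR	(UINT32_C(1) << 5)	/* RDMSR/WRMSR supported */
#define CPUID_APIC	(UINT32_C(1) << 9)	/* on-chip local APIC */

struct cpu_report {
	const char *name;
	uint32_t features;	/* feature word as printed by cpuctl */
};

/* Feature words copied from the reports in this thread. */
static const struct cpu_report thread_cpus[] = {
	{ "Transmeta Crusoe",			0x84803f },
	{ "AMD Am5x86 W/B",			0x1 },
	{ "National Semiconductor Geode GX1",	0x808131 },
};

/* True if the CPU advertises the RDMSR/WRMSR instructions at all. */
static bool
cpu_has_msr(uint32_t features)
{
	return (features & CPUID_MSR) != 0;
}

/*
 * Hypothetical guard: only consider reading the APIC base MSR when the
 * CPU advertises both MSR support and a local APIC.
 */
static bool
may_probe_apicbase(uint32_t features)
{
	return cpu_has_msr(features) && (features & CPUID_APIC) != 0;
}

/* Count the CPUs in a table for which the hypothetical guard passes. */
static size_t
count_probe_safe(const struct cpu_report *cpus, size_t n)
{
	size_t i, count = 0;

	assert(cpus != NULL);
	for (i = 0; i < n; i++)
		if (may_probe_apicbase(cpus[i].features))
			count++;
	return count;
}
```

 Under these assumptions the Crusoe and Geode GX1 advertise MSR support
 while the Am5x86 does not, and none of the three advertises a local
 APIC, so the guard rejects all of them.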

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe
 CPU
Date: Sat, 8 Jul 2017 19:37:35 -0500 (CDT)

 On Sat, 8 Jul 2017, John D. Baker wrote:

 > I couldn't update just yet, but removed the kernel objdirs and
 > rebuilt.

 There was some confusion over whether I'd patched the file or not.
 I updated lapic.c only to be sure I got the most recent one and
 rebuilt GENERIC and my custom kernel based on the NET4501 config
 from scratch.

 The net4501 still panics as follows:

 > boot /netbsd.tst -s
 17775628+697548+846740 [1040847+780528+806439]=0x14eefd8
 fatal privileged instruction fault in supervisofatal double fault in supervisor mode
 trap type 13 code 0xc0118298 eip 0x8 cs 0x246 eflags 0xc054cd46 cr2 0 ilevel 0x8 esp 0xc11f5760
 curlwp 0xc126a820 pid 0 lid 1 lowest kstack 0xc165f2c0
 kernel: user trap double fault, code=0


 I should be able to do a complete update and rebuild soon, just to
 make sure.

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe
 CPU
Date: Sat, 8 Jul 2017 23:43:27 -0500 (CDT)

 On Sat, 8 Jul 2017, John D. Baker wrote:

 > I should be able to do a complete update and rebuild soon, just to
 > make sure.

 Fully updated with:

      $NetBSD: lapic.c,v 1.59 2017/07/08 14:35:33 nonaka Exp $

 the net4501 still panics early in the boot process (GENERIC kernel):

 > boot /netbsd.tst -s
 17775624+697548+846740 [821040+839258+13087]=0x14060e0
 fatal privileged instruction fault in supervisor mode
 trap type 0 code 0 eip 0xcfatal double fault in supervisor mode
 trap type 13 code 0xc0118298 eip 0x8 cs 0x246 eflags 0xc054cd46 cr2 0 ilevel 0x8 esp 0xc11f5760
 curlwp 0xc126a820 pid 0 lid 1 lowest kstack 0xc15772c0
 kernel: user trap double fault, code=0
 Stopped in pid 0.1 (system) at  8:      invalid address

 This time, instead of rebooting immediately, trying to get a backtrace
 produces:

 db{0}> bt
 panic: kernel diagnostic assertion "l->l_nopreempt > 0" failed: file "/x/current/src/sys/sys/lwp.h", line 513 
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 eip 0xc0118734 cs 0x8 eflags 0x246 cr2 0x1c ilevel 0x8 esp 0xc126aa08
 curlwp 0xc126a820 pid 733240715 lid 102 lowest kstack 0xc126ab10
 fatal page fault in supervisor modet    c0118734:       popl    %ebp

 after which it reboots.

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with AMD Am5x86 CPU
Date: Sun, 9 Jul 2017 00:42:13 -0500 (CDT)

 Just to double check, I grabbed:

   http://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/201707082110Z/i386/binary/kernel/netbsd-GENERIC.gz

 and it panics on the net4501 the same as my locally-built kernels:

 > boot /netbsd.tst -s
 17769700+697548+846740 [821024+839243+13086]=0x14050c0
 fatal prfatal double fault in supervisor mode
 trap type 13 code 0xc0118298 eip 0x8 cs 0x246 eflags 0xc054cd46 cr2 0 ilevel 0x8 esp 0xc11f4760
 curlwp 0xc1269820 pid 0 lid 1 lowest kstack 0xc15762c0
 kernel: user trap double fault, code=0
 Stopped in pid 0.1 (system) at  8:      invalid address
 db{0}> bt
 panic: kernel diagnostic assertion "l->l_nopreempt > 0" failed: file "/usr/src/sys/sys/lwp.h", line 513 
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 eip 0xc0118734 cs 0x8 eflags 0x246 cr2 0x1c ilevel 0x8 esp 0xc1269a08
 curlwp 0xc1269820 pid 733240715 lid 96 lowest kstack 0xc1269b10
 fatal page fault in supervisor mode     c0118734:       popl    %ebp

 The last line is repeatedly overprinted several times and then the
 machine reboots.

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: Andreas Gustafsson <gson@gson.org>
To: nonaka@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe CPU
Date: Sun, 9 Jul 2017 17:43:01 +0300

 I tested a kernel built from source date 2017.07.08.14.35.33 (which
 has src/sys/arch/x86/x86/lapic.c 1.59) on my Crusoe laptop, and it now
 reboots within a fraction of a second after the kernel starts, too
 fast to read the console messages.
 -- 
 Andreas Gustafsson, gson@gson.org

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52266 CVS commit: [netbsd-8] src/sys/arch/x86/x86
Date: Mon, 10 Jul 2017 12:26:21 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Mon Jul 10 12:26:21 UTC 2017

 Modified Files:
 	src/sys/arch/x86/x86 [netbsd-8]: lapic.c

 Log Message:
 Pull up following revision(s) (requested by nonaka in ticket #110):
 	sys/arch/x86/x86/lapic.c: revision 1.59
 PR/52266: use rdmsr_safe(9) instead of rdmsr(9) for old machine.
 tested by simonb@


 To generate a diff of this commit:
 cvs rdiff -u -r1.58 -r1.58.2.1 src/sys/arch/x86/x86/lapic.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Kimihiro Nonaka <nonakap@gmail.com>
To: "gnats-bugs@netbsd.org" <gnats-bugs@netbsd.org>
Cc: NONAKA Kimihiro <nonaka@netbsd.org>, "gnats-admin@netbsd.org" <gnats-admin@netbsd.org>, 
	"netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>, Andreas Gustafsson <gson@gson.org>
Subject: Re: port-i386/52266: Double fault early in boot with AMD Am5x86 CPU
Date: Tue, 11 Jul 2017 18:12:37 +0900

 Hi,

 2017-07-09 14:45 GMT+09:00 John D. Baker <jdbaker@mylinuxisp.com>:

 >  Just to double check, I grabbed:
 >
 >    http://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/201707082110Z/i386/binary/kernel/netbsd-GENERIC.gz
 >
 >  and it panics on the net4501 the same as my locally-built kernels:

 Could you try the following patch?

 diff --git a/sys/arch/x86/x86/lapic.c b/sys/arch/x86/x86/lapic.c
 index 415bb65b4e5..f87de042054 100644
 --- a/sys/arch/x86/x86/lapic.c
 +++ b/sys/arch/x86/x86/lapic.c
 @@ -237,7 +237,8 @@ lapic_is_x2apic(void)
  {
      uint64_t msr;

 -    if (rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
 +    if (!ISSET(cpu_feature[0], CPUID_MSR) ||
 +        rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
          return false;
      return (msr & (APICBASE_EN | APICBASE_EXTD)) ==
          (APICBASE_EN | APICBASE_EXTD);

 Regards,
 -- 
 Kimihiro Nonaka

From: Andreas Gustafsson <gson@gson.org>
To: Kimihiro Nonaka <nonakap@gmail.com>
Cc: "gnats-bugs\@netbsd.org" <gnats-bugs@netbsd.org>
Subject: Re: port-i386/52266: Double fault early in boot with AMD Am5x86 CPU
Date: Tue, 11 Jul 2017 16:31:20 +0300

 Kimihiro Nonaka wrote:
 > Could you try the following patch?
 > 
 > diff --git a/sys/arch/x86/x86/lapic.c b/sys/arch/x86/x86/lapic.c
 > index 415bb65b4e5..f87de042054 100644
 > --- a/sys/arch/x86/x86/lapic.c
 > +++ b/sys/arch/x86/x86/lapic.c
 > @@ -237,7 +237,8 @@ lapic_is_x2apic(void)
 >  {
 >      uint64_t msr;
 > 
 > -    if (rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
 > +    if (!ISSET(cpu_feature[0], CPUID_MSR) ||
 > +        rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
 >          return false;
 >      return (msr & (APICBASE_EN | APICBASE_EXTD)) ==
 >          (APICBASE_EN | APICBASE_EXTD);
 > 

 My Crusoe still reboots immediately after loading the kernel, even
 with the patch.
 -- 
 Andreas Gustafsson, gson@gson.org

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with AMD Am5x86 CPU
Date: Tue, 11 Jul 2017 10:48:54 -0500 (CDT)

 On Tue, 11 Jul 2017 18:12:37 +0900, Kimihiro Nonaka <nonakap@gmail.com>
 wrote:

 > On 2017-07-09 14:45 GMT+09:00 John D. Baker <jdbaker@mylinuxisp.com>:
 > 
 > >  Just to double check, I grabbed:
 > >
 > >    http://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/201707082110Z/i386/binary/kernel/netbsd-GENERIC.gz
 > >
 > >  and it panics on the net4501 the same as my locally-built kernels:
 > 
 > Could you try the following patch?
 > 
 > diff --git a/sys/arch/x86/x86/lapic.c b/sys/arch/x86/x86/lapic.c
 > index 415bb65b4e5..f87de042054 100644
 > --- a/sys/arch/x86/x86/lapic.c
 > +++ b/sys/arch/x86/x86/lapic.c
 > @@ -237,7 +237,8 @@ lapic_is_x2apic(void)
 >  {
 >      uint64_t msr;
 > 
 > -    if (rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
 > +    if (!ISSET(cpu_feature[0], CPUID_MSR) ||
 > +        rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
 >          return false;
 >      return (msr & (APICBASE_EN | APICBASE_EXTD)) ==
 >          (APICBASE_EN | APICBASE_EXTD);

 With the above patch, my net4501 boots -current (8.99.1) again!

 Thanks for the patch.  Please commit and pull up to netbsd-8!

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: Andreas Gustafsson <gson@gson.org>
To: nonaka@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: port-i386/52266: Double fault early in boot with AMD Am5x86 CPU
Date: Tue, 11 Jul 2017 20:34:11 +0300

 Nonaka,

 I added some debug printfs to lapic_is_x2apic(), and found that
 on my Crusoe machine, cpu_feature[0] has the value 0x0084803f.
 If I change the "ISSET(cpu_feature[0], CPUID_MSR)"
 in your patch to "ISSET(cpu_feature[0], CPUID_APIC)", the
 kernel boots successfully.
 -- 
 Andreas Gustafsson, gson@gson.org

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with AMD Am5x86 CPU
Date: Tue, 11 Jul 2017 17:18:13 -0500 (CDT)

 Looking at the reports so far and the meaning of each of the various
 CPUID_* bits, it looks like:

   AMD Am5x86 (Elan SC520):  No MSR, No APIC
   NS Geode:  MSR, No APIC, rdmsr_safe() works
   TM Crusoe:  MSR, No APIC, rdmsr_safe() fails

 Are there any CPUs which implement APIC w/o MSR?  If not, then perhaps
 the condition could be made:

   if (!(ISSET(cpu_feature[0], CPUID_APIC) &&
         ISSET(cpu_feature[0], CPUID_MSR)) ||
       rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
     return false;

 to satisfy the above cases?

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: "John D. Baker" <jdbaker@mylinuxisp.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/52266: Double fault early in boot with AMD Am5x86 CPU
Date: Tue, 11 Jul 2017 17:40:17 -0500 (CDT)

 On Tue, 11 Jul 2017, John D. Baker wrote:

 > Looking at the reports so far and the meaning of each of the various
 > CPUID_* bits, it looks like:
 > 
 >   AMD Am5x86 (Elan SC520):  No MSR, No APIC
 >   NS Geode:  MSR, No APIC, rdmsr_safe() works
 >   TM Crusoe:  MSR, No APIC, rdmsr_safe() fails
 > 
 > Are there any CPUs which implement APIC w/o MSR?

 Thinking about it more, if the CPU doesn't implement APIC, why bother
 reading MSR at all?  As such, Andreas' change to the patch would also
 satisfy all the cases above and is simpler.

 Since the x2apic code hinges on successfully reading MSR to obtain
 MSR_APICBASE, if there's no APIC of which to obtain the base, return
 false.

 Or is there some subtlety I'm missing?

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

From: Kimihiro Nonaka <nonakap@gmail.com>
To: "gnats-bugs@netbsd.org" <gnats-bugs@netbsd.org>
Cc: NONAKA Kimihiro <nonaka@netbsd.org>, "gnats-admin@netbsd.org" <gnats-admin@netbsd.org>, 
	"netbsd-bugs@netbsd.org" <netbsd-bugs@netbsd.org>, Andreas Gustafsson <gson@gson.org>
Subject: Re: port-i386/52266: Double fault early in boot with AMD Am5x86 CPU
Date: Wed, 12 Jul 2017 11:09:23 +0900

 2017-07-12 7:45 GMT+09:00 John D. Baker <jdbaker@mylinuxisp.com>:

 >  > Looking at the reports so far and the meaning of each of the various
 >  > CPUID_* bits, it looks like:
 >  >
 >  >   AMD Am5x86 (Elan SC520):  No MSR, No APIC
 >  >   NS Geode:  MSR, No APIC, rdmsr_safe() works
 >  >   TM Crusoe:  MSR, No APIC, rdmsr_safe() fails
 >  >
 >  > Are there any CPUs which implement APIC w/o MSR?
 >
 >  Thinking about it more, if the CPU doesn't implement APIC, why bother
 >  reading MSR at all?  As such, Andreas' change to the patch would also
 >  satisfy all the cases above and is simpler.

 I agree.

 Updated the patch.

 diff --git a/sys/arch/x86/x86/lapic.c b/sys/arch/x86/x86/lapic.c
 index 415bb65b4e5..e3423d8ce07 100644
 --- a/sys/arch/x86/x86/lapic.c
 +++ b/sys/arch/x86/x86/lapic.c
 @@ -237,7 +237,8 @@ lapic_is_x2apic(void)
  {
      uint64_t msr;

 -    if (rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
 +    if (!ISSET(cpu_feature[0], CPUID_APIC) ||
 +        rdmsr_safe(MSR_APICBASE, &msr) == EFAULT)
          return false;
      return (msr & (APICBASE_EN | APICBASE_EXTD)) ==
          (APICBASE_EN | APICBASE_EXTD);

 Regards,
 -- 
 Kimihiro Nonaka

From: "NONAKA Kimihiro" <nonaka@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52266 CVS commit: src/sys/arch/x86/x86
Date: Thu, 13 Jul 2017 00:44:14 +0000

 Module Name:	src
 Committed By:	nonaka
 Date:		Thu Jul 13 00:44:14 UTC 2017

 Modified Files:
 	src/sys/arch/x86/x86: lapic.c

 Log Message:
 PR/52266: Before access MSR[APICBASE], need to check if APIC is present.


 To generate a diff of this commit:
 cvs rdiff -u -r1.59 -r1.60 src/sys/arch/x86/x86/lapic.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52266 CVS commit: [netbsd-8] src/sys/arch/x86/x86
Date: Fri, 14 Jul 2017 08:41:18 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Fri Jul 14 08:41:18 UTC 2017

 Modified Files:
 	src/sys/arch/x86/x86 [netbsd-8]: lapic.c

 Log Message:
 Pull up following revision(s) (requested by nonaka in ticket #135):
 	sys/arch/x86/x86/lapic.c: revision 1.60
 PR/52266: Before access MSR[APICBASE], need to check if APIC is present.


 To generate a diff of this commit:
 cvs rdiff -u -r1.58.2.1 -r1.58.2.2 src/sys/arch/x86/x86/lapic.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Kimihiro Nonaka <nonakap@gmail.com>
To: Andreas Gustafsson <gson@gson.org>
Cc: NONAKA Kimihiro <nonaka@netbsd.org>, "gnats-bugs@netbsd.org" <gnats-bugs@netbsd.org>
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe CPU
Date: Fri, 21 Jul 2017 17:06:52 +0900

 Could you try src/sys/arch/x86/x86/lapic.c r1.60?

 2017-07-09 23:43 GMT+09:00 Andreas Gustafsson <gson@gson.org>:
 > I tested a kernel built from source date 2017.07.08.14.35.33 (which
 > has src/sys/arch/x86/x86/lapic.c 1.59) on my Crusoe laptop, and it now
 > reboots within a fraction of a second after the kernel starts, too
 > fast to read the console messages.
 > --
 > Andreas Gustafsson, gson@gson.org

From: Andreas Gustafsson <gson@gson.org>
To: Kimihiro Nonaka <nonakap@gmail.com>
Cc: NONAKA Kimihiro <nonaka@netbsd.org>,
    "gnats-bugs\@netbsd.org" <gnats-bugs@netbsd.org>
Subject: Re: port-i386/52266: Double fault early in boot with Transmeta Crusoe CPU
Date: Sat, 22 Jul 2017 14:57:40 +0300

 Kimihiro Nonaka wrote:
 > Could you try src/sys/arch/x86/x86/lapic.c r1.60?

 I have now tested a kernel built from 2017.07.21.02.51.12 sources,
 which include lapic.c 1.60, and it booted fine.  Thank you!
 -- 
 Andreas Gustafsson, gson@gson.org

State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Sat, 22 Jul 2017 12:03:49 +0000
State-Changed-Why:
The bug has been fixed.


>Unformatted:


Copyright © 1994-2014 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.