NetBSD Problem Report #41038

From www@NetBSD.org  Thu Mar 19 02:51:34 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id EA56363BA0A
	for <gnats-bugs@gnats.netbsd.org>; Thu, 19 Mar 2009 02:51:33 +0000 (UTC)
Message-Id: <20090319025133.B256663B8EC@www.NetBSD.org>
Date: Thu, 19 Mar 2009 02:51:33 +0000 (UTC)
From: r.phillips@uq.edu.au
Reply-To: r.phillips@uq.edu.au
To: gnats-bugs@NetBSD.org
Subject: Pentium 1 crashes during boot install kernel
X-Send-Pr-Version: www-1.0

>Number:         41038
>Category:       kern
>Synopsis:       Pentium 1 crashes during boot install kernel
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 19 02:55:00 +0000 2009
>Last-Modified:  Tue Apr 21 15:50:01 +0000 2009
>Originator:     Ray Phillips
>Release:        -current CVS updated on 13 March 2009
>Organization:
>Environment:
Sorry, this is the best I have:

NetBSD 5.99.8 (GENERIC) #0: Fri Mar 13 21:22:48 EST 2009
        ray@tst.jkmrc.uq.edu.au:/usr/obj/sys/arch/i386/compile/GENERIC

but see Full Description section for dmesg output.
>Description:
When attempting to boot a Pentium 1 using a CD created from i386/installation/cdrom/boot-com.iso the machine crashes, as detailed here:

  http://mail-index.netbsd.org/port-i386/2009/03/17/msg001271.html

Following Martin's suggestion:

  http://mail-index.netbsd.org/port-i386/2009/03/17/msg001274.html

allowed the machine to run sysinst:

>> VESA VBE Version 2.0 8192 k
Welcome to the NetBSD 5.99.8 boot-only install CD
===============================================================================

This CD contains only the installation program.  Binary sets to complete the
installation must be downloaded separately.  The installer can download them
if this machine has a working internet connection.

ACPI (Advanced Configuration and Power Interface) should work on all modern
and legacy hardware.  However if you do encounter a problem while booting,
try disabling it and report a bug at http://www.netbsd.org/.

     1. Install NetBSD
     2. Install NetBSD (no ACPI)
     3. Install NetBSD (no ACPI, no SMP)
     4. Drop to boot prompt

Choose an option; RETURN for default; SPACE to stop countdown.
Option 1 will be chosen in 0
type "?" or "help" for help.
> boot -c
booting cd0a:netbsd (howto 0x1000)
8874208+405940+538728 [457360+443789]=0xa3af20
Loading cd9660
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 5.99.8 (GENERIC) #0: Fri Mar 13 21:22:48 EST 2009
        ray@tst.jkmrc.uq.edu.au:/usr/obj/sys/arch/i386/compile/GENERIC
total memory = 65148 KB
avail memory = 53396 KB
userconf: configure system autoconfiguration:
uc> disable fpa
[274] fpa* disabled
uc> exit
Continuing...
mainbus0 (root)

From the sysinst menus I dropped into sh and ran dmesg:

# dmesg
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 5.99.8 (GENERIC) #0: Fri Mar 13 21:22:48 EST 2009
        ray@tst.jkmrc.uq.edu.au:/usr/obj/sys/arch/i386/compile/GENERIC
total memory = 65148 KB
avail memory = 53396 KB
timecounter: Timecounters tick every 10.000 msec
userconf: configure system autoconfiguration:
uc> disable fpa
[274] fpa* disabled
uc> exit
Continuing...
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Generic PC
mainbus0 (root)
ACPI Error (tbxfroot-0308): A valid RSDP was not found [20080321]
ACPI: unable to initialize ACPI tables: AE_NOT_FOUND
cpu0 at mainbus0: Intel 586-class, 233MHz, id 0x543
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: vendor 0x8086 product 0x7100 (rev. 0x01)
pcib0 at pci0 dev 7 function 0
pcib0: vendor 0x8086 product 0x7110 (rev. 0x01)
piixide0 at pci0 dev 7 function 1
piixide0: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
piixide0: bus-master DMA support present
piixide0: primary channel wired to compatibility mode
piixide0: primary channel interrupting at irq 14
atabus0 at piixide0 channel 0
piixide0: secondary channel wired to compatibility mode
piixide0: secondary channel interrupting at irq 15
atabus1 at piixide0 channel 1
uhci0 at pci0 dev 7 function 2: vendor 0x8086 product 0x7112 (rev. 0x01)
uhci0: interrupting at irq 11
usb0 at uhci0: USB revision 1.0
piixpm0 at pci0 dev 7 function 3
piixpm0: vendor 0x8086 product 0x7113 (rev. 0x01)
timecounter: Timecounter "piixpm0" frequency 3579545 Hz quality 900
piixpm0: 24-bit timer
piixpm0: interrupting at SMI, polling
iic0 at piixpm0: I2C bus
vga1 at pci0 dev 15 function 0: vendor 0x102b product 0x0520 (rev. 0x01)
wsdisplay0 at vga1 kbdmux 1
wsmux1: connecting to wsdisplay0
drm at vga1 not configured
ne2 at pci0 dev 16 function 0: Realtek 8029 Ethernet
ne2: Ethernet address 00:40:05:6b:f9:b3
ne2: 10base2, 10baseT, 10baseT-FDX, auto, default [0x00 0x30] auto
ne2: interrupting at irq 5
vendor 0x1011 product 0x000f (FDDI network, revision 0x01) at pci0 dev 17 function 0 not configured
isa0 at pcib0
lpt0 at isa0 port 0x378-0x37b irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 mux 1
wskbd0: connecting to wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 mux 0
attimer0 at isa0 port 0x40-0x43: AT Timer
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker (CPU-intensive output)
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff
npx0: reported by CPUID; using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
attimer0: attached to pcppi0
isapnp0: read port 0x203
isapnp0: <ESS ES1869 Plug and Play AudioD, ESS0006, , > port 0x800/8 not configured
ess0 at isapnp0 port 0x220/16,0x388/4,0x300/2 irq 9 drq 1,0
ess0: ESS Technology ES1869 [version 0x688b]
ess0: audio1 interrupting at irq 9
audio0 at ess0: half duplex, mmap, independent
opl0 at ess0: model OPL3
midi1 at opl0: ESS Yamaha OPL3 (CPU-intensive output)
joy0 at isapnp0 port 0x201/1
joy0: ESS ES1869 Plug and Play AudioD
joy0: joystick not connected
wdc2 at isapnp0 port 0x168/8,0x36e/2 irq 10: ESS ES1869 Plug and Play AudioD
atabus2 at wdc2 channel 0
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
timecounter: Timecounter "TSC" frequency 233040080 Hz quality 3000
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
uhub0 at usb0: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
wd0 at atabus0 drive 0: <QUANTUM FIREBALL SE3.2A>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 3079 MB, 6256 cyl, 16 head, 63 sec, 512 bytes/sect x 6306048 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
atapibus0 at atabus1: 2 targets
cd0 at atapibus0 drive 0: <PLEXTOR CD-R   PX-W8432T, , 1.09> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2
cd0(piixide0:1:0): using PIO mode 4, DMA mode 2 (using DMA)
Kernelized RAIDframe activated
pad0: outputs: 44100Hz, 16-bit, stereo
audio1 at pad0: half duplex
boot device: cd0
root on cd0a dumps on cd0b
cd0(piixide0:1:0):  Check Condition on CDB: 0x46 00 00 00 00 00 00 00 08 00
    SENSE KEY:  Illegal Request
     ASC/ASCQ:  Invalid Command Operation Code

root file system type: cd9660
warning: no /dev/console
#


>How-To-Repeat:
Boot a Pentium 1 using the i386 -current install CD dated 13 March 2009.  (I haven't tried any other recent versions such as 5.0_RC2, so I don't know whether the problem exists there too.  I will if you want me to.)
>Fix:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41038: Pentium 1 crashes during boot install kernel
Date: Tue, 21 Apr 2009 09:52:34 +0200

 I worked with Ray and we got a bit of debug output for this:

 fpa0 at pci0 dev 17 function 0: FDDI controller

 PDQ Descriptor Block = 0xc63da000 (PA = 0x1148000)
     Receive Queue          = 0xc63da000
     Transmit Queue         = 0xc63da800
     Host SMT Queue         = 0xc63db000
     Command Response Queue = 0xc63db280
     Command Request Queue  = 0xc63db300
 PDQ Consumer Block = 0xc63dd000
 PDQ CSRs: BASE = 0xc62d7000
     Port Reset                = 0x0 [0x00000000]
     Host Data                 = 0x4 [0x00000000]
     Port Control              = 0x8 [0x00000000]
     Port Data A               = 0xc [0x00000000]
     Port Data B               = 0x10 [0x00000000]
     Port Status               = 0x14 [0x00000200]
     Host Int Type 0           = 0x18 [0x000000d2]
     Host Int Enable           = 0x1c [0x00000000]
     Type 2 Producer           = 0x20 [0x00000000]
     Command Response Producer = 0x28 [0x00000000]
     Command Request Producer  = 0x2c [0x00000000]
     Host SMT Producer         = 0x30 [0x00000000]
     Unsolicited Producer      = 0x34 [0x00000000]
 PDQ Command Request Buffer = 0xc63db500 (PA=0x1149500)
 PDQ Command Response Buffer = 0xc63db900 (PA=0x1149900)
 PDQ Unsolicit Event Buffer = 0xc63dc000 (PA=0x1147000)
 PDQ Adapter State = DMA Unavailable
 CSR cmd spun 423 times
 CSR cmd spun 422 times
 CSR cmd spun 556 times
 CSR cmd spun 490 times
 uvm_fault(0xc09b0800, 0xc63dd000, 2) -> 0xe
 fatal page fault in supervisor mode
 trap type 6 code b eip c04a4c34 cs 8 eflags 10246 cr2 c63dd000 ilevel 8
 kernel: supervisor trap page fault, code=0
 Stopped in pid 0.1 (system) at  netbsd:pdq_stop+0x260:  movw    $0,0(%eax)
 db{0}> bt
 pdq_stop(c0d5e800,c08d6b57,14,4,34,c63db280,1148800,1149000,1200,c0d5e800) at netbsd:pdq_stop+0x260
 pdq_initialize(1,c62d7000,c63d9a8c,0,c63d99d4,0,4,0,c5e73bc0,c61c8f10) at netbsd:pdq_initialize+0x662
 pdq_pci_attach(c61c8ef4,c63d99d4,c0ad5a8c,c63d99d4,8,0,c63d99d4,4000,c6282004,0) at netbsd:pdq_pci_attach+0x10c

 This is pretty strange: PDQ Consumer Block = 0xc63dd000
 is set up in sys/dev/ic/pdq_ifsubr.c:594 (side note: I don't understand the
 #ifdef sparc stuff there).

 The driver dies at the first write access to pdq->pdq_cbp (i.e. writing a
 zero to 0xc63dd000, which matches cr2 in the trap output above).

 Anyone spot what's wrong with the bus_dmamem_map() call?

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/41038: Pentium 1 crashes during boot install kernel
Date: Tue, 21 Apr 2009 17:47:50 +0200

 There is invalid code that very badly violates the bus_dmamap_t opaqueness
 (and this is what the #ifdef sparc* stuff works around):

     532 int                             
     533 pdq_os_memalloc_contig(  
     534     pdq_t *pdq)                 
     535 {   
     536     pdq_softc_t * const sc = pdq->pdq_os_ctx;
     537     bus_dma_segment_t db_segs[1], ui_segs[1], cb_segs[1];
     538     int db_nsegs = 0, ui_nsegs = 0;
     539     int steps = 0;                 
     540     int not_ok;


 now db_segs is properly created:
     542     not_ok = bus_dmamem_alloc(sc->sc_dmatag,
     543                          sizeof(*pdq->pdq_dbp), sizeof(*pdq->pdq_dbp),
     544                          sizeof(*pdq->pdq_dbp), db_segs, 1, &db_nsegs,

 but then cb_segs is just copied:
     593     if (!not_ok) {              
     594         steps = 8;
     595         pdq->pdq_unsolicited_info.ui_pa_bufstart = sc->sc_uimap->dm_segs[0].ds_addr;
     596         cb_segs[0] = db_segs[0];
     597         cb_segs[0].ds_addr += offsetof(pdq_descriptor_block_t, pdqdb_consumer);
     598         cb_segs[0].ds_len = sizeof(pdq_consumer_block_t);

 I'd call this working by sheer luck with certain bus_dmamem_* implementations.

 I'm not sure if this should be rewritten properly or just the sparc hacks
 be used everywhere and the other variant removed ;-)

 Thanks to Jochen Kunz for insights into the sparc* hacks. I'm not sure this
 is why it's broken on i386 right now either.

 Martin
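
For illustration only, here is one possible shape of a "rewritten properly" variant: instead of copying db_segs[0] into cb_segs[0] and mapping it a second time, the consumer block's addresses are derived by offset from the descriptor block mapping that already exists. This is a toy sketch, not driver code. Every toy_* name is hypothetical, the struct layout is an assumption (only the queue offsets 0x0, 0x800, 0x1000, 0x1280 and 0x1300 are taken from the debug output above), and whether this resembles the sparc variant is not confirmed by this report.

```c
/*
 * Toy sketch only.  These are NOT the NetBSD pdq or bus_dma(9) types;
 * the layout and all toy_* names are hypothetical.
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical consumer block: a few 32-bit indexes. */
struct toy_consumer_block {
    uint32_t idx[4];
};

/*
 * Hypothetical descriptor block.  The offsets of rx, tx, host_smt,
 * cmd_rsp and cmd_req match the queue addresses printed in the debug
 * output; the unsolicited queue size and the position of the consumer
 * block are assumptions.
 */
struct toy_descriptor_block {
    uint8_t rx[0x800];
    uint8_t tx[0x800];
    uint8_t host_smt[0x200];
    uint8_t unsolicited[0x80];          /* assumed */
    uint8_t cmd_rsp[0x80];
    uint8_t cmd_req[0x80];
    struct toy_consumer_block consumer; /* assumed position */
};

/* One existing mapping: kernel virtual address plus bus address. */
struct toy_mapping {
    uintptr_t va;
    uint32_t  pa;
    size_t    len;
};

/*
 * Describe a sub-range of an existing mapping by offset arithmetic,
 * without building or mapping a second segment.
 * Returns 0 on success, -1 if the range does not fit in the parent.
 */
static int
toy_submap(const struct toy_mapping *parent, size_t off, size_t len,
    struct toy_mapping *child)
{
    if (off > parent->len || len > parent->len - off)
        return -1;
    child->va = parent->va + off;
    child->pa = parent->pa + (uint32_t)off;
    child->len = len;
    return 0;
}
```

Fed the descriptor block addresses from the debug output (VA 0xc63da000, PA 0x1148000), toy_submap() would place the consumer block inside the already-mapped descriptor block under this assumed layout, rather than at a separately mapped page such as 0xc63dd000.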
