NetBSD Problem Report #54577

From www@netbsd.org  Thu Sep 26 15:12:55 2019
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 927677A189
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 26 Sep 2019 15:12:55 +0000 (UTC)
Message-Id: <20190926151254.34A217A240@mollari.NetBSD.org>
Date: Thu, 26 Sep 2019 15:12:54 +0000 (UTC)
From: demonicjerseycow@gmail.com
Reply-To: demonicjerseycow@gmail.com
To: gnats-bugs@NetBSD.org
Subject: NetBSD 8.1 hangs constantly on sgimips
X-Send-Pr-Version: www-1.0

>Number:         54577
>Category:       port-sgimips
>Synopsis:       NetBSD 8.1 hangs constantly on sgimips
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-sgimips-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Sep 26 15:15:00 +0000 2019
>Originator:     Mark Kirby
>Release:        NetBSD 8.1
>Organization:
>Environment:
NetBSD dribble 8.1_STABLE NetBSD 8.1_STABLE (GENERIC32_IP3X) #0: Sun Sep 15 11:14:31 BST 2019  root@dribble:/usr/obj/sys/arch/sgimips/compile/GENERIC32_IP3X sgimips
>Description:
Hi,

I am running NetBSD 8.1 on an R10000 SGI O2. I am getting repeatable hard lockups: the system freezes completely and has to be power-cycled before it will boot again.

The lockups appear when:
1) Compiling packages, especially when linking
2) Extracting distfiles
3) Running cd /usr/pkgsrc && make clean

The hangs are sometimes preceded by the following console errors:
Sep 26 13:01:22 dribble /netbsd: crime: cpu error 4 at address 320046432
Sep 26 13:06:12 dribble /netbsd: crime: cpu error 4 at address 320057696

The address is different each time. These errors do not always precede a lockup, but if I see them the machine will eventually lock up.
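
In case it helps whoever picks this up, here is a minimal sketch for pulling the status and address out of console lines like the two quoted above and printing the address in hex. It assumes the kernel prints both fields in decimal, which I have not confirmed against the driver source; the names CRIME_RE, parse_crime_errors and describe are made up for this example.

```python
import re

# Pattern for the "crime: cpu error" console lines. The field layout is
# inferred from the two lines quoted in this report only.
CRIME_RE = re.compile(r"crime: cpu error (\d+) at address (\d+)")


def parse_crime_errors(log_text):
    """Return a list of (status, address) integer pairs, one per matching line."""
    return [(int(status), int(addr)) for status, addr in CRIME_RE.findall(log_text)]


def describe(log_text):
    """Format each error with its address in hex.

    Assumption: the address in the log is decimal, not hex without a prefix.
    """
    return ["status=%d addr=0x%08x" % (status, addr)
            for status, addr in parse_crime_errors(log_text)]


SAMPLE = """\
Sep 26 13:01:22 dribble /netbsd: crime: cpu error 4 at address 320046432
Sep 26 13:06:12 dribble /netbsd: crime: cpu error 4 at address 320057696
"""

for line in describe(SAMPLE):
    print(line)
```

Feeding it /var/log/messages instead of SAMPLE should list every occurrence, which may make it easier to see whether the addresses cluster in one range.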

I get anywhere from 40 minutes to 7 hours of compile time before a hang. The interval appears to be random, although shorter runs are more common.
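
Because a hard lockup leaves nothing in the logs, one way to measure the time to hang more precisely would be a small heartbeat writer that appends a timestamp to a file and syncs it to disk. This is only a sketch: the function names and file path are hypothetical, and the extra disk writes could themselves change how often the hang occurs.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamp line to path and force it to disk with fsync."""
    stamp = time.time() if now is None else now
    with open(path, "a") as f:
        f.write("%.0f\n" % stamp)
        f.flush()
        os.fsync(f.fileno())
    return stamp


def read_heartbeats(path):
    """Return all recorded timestamps as floats (empty list if no file)."""
    try:
        with open(path) as f:
            return [float(line) for line in f.read().splitlines() if line.strip()]
    except FileNotFoundError:
        return []


def seconds_survived(path):
    """Seconds between the first and last heartbeat, or None if fewer than two."""
    beats = read_heartbeats(path)
    if len(beats) < 2:
        return None
    return beats[-1] - beats[0]
```

Calling write_heartbeat("/var/tmp/heartbeat.log") from a loop with a 10-second sleep while a build runs, then calling seconds_survived() on the same file after the reboot, would give the run time to within one interval. The file needs to be removed or rotated before each new run.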

I'm happy to try to debug this, but I have no idea where to start looking.

Thanks 

Mark

-bash-5.0$ dmesg
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
    2018, 2019 The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 8.1_STABLE (GENERIC32_IP3X) #0: Sun Sep 15 11:14:31 BST 2019
        root@dribble:/usr/obj/sys/arch/sgimips/compile/GENERIC32_IP3X
total memory = 255 MB
(6848 KB reserved for ARCS)
avail memory = 239 MB
timecounter: Timecounters tick every 10.000 msec
mainbus0 (root): SGI-IP32 [SGI, 2], 1 processor
cpu0 at mainbus0: MIPS R10000 CPU (0x934) Rev. 3.4 with built-in FPU Rev. 0.0
cpu0: 64 TLB entries, 16MB max page size
cpu0: 32KB/64B 2-way set-associative L1 instruction cache
cpu0: 32KB/32B 2-way set-associative write-back L1 data cache
cpu0: 1024KB/64B 2-way set-associative write-back L2 data cache
crime0 at mainbus0 addr 0x14000000: rev 1.1 (CRIME_ID: 161)
crmfb0 at mainbus0 addr 0x16000000: SGI CRIME Graphics Display Engine
crmfb0: initial resolution 1280x1024
crmfb0: allocated 5242880 byte fb @ 0x80050000 (0xa1400000)
wsdisplay0 at crmfb0 kbdmux 1: console (default, vt100 emulation)
wsmux1: connecting to wsdisplay0
mace0 at mainbus0 addr 0x1f000000
lpt0 at mace0 offset 0x380000 intr 4 intrmask 0xf0000
com0 at mace0 offset 0x390000 intr 4 intrmask 0x3f00000: ns16550a, working fifo
com1 at mace0 offset 0x398000 intr 4 intrmask 0xfc000000: ns16550a, working fifo
macekbc0 at mace0 offset 0x320000 intr 5 intrmask 0x0: PS2 controller
pckbd0 at macekbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at macekbc0 (aux slot)
wsmouse0 at pms0 mux 0
mcclock0 at mace0 offset 0x3a0000 intrmask 0x0
mec0 at mace0 offset 0x280000 intr 3 intrmask 0x0: MAC-110 Ethernet, rev 1
mec0: Ethernet address 08:00:69:05:c5:92
nsphy0 at mec0 phy 31: DP83840 10/100 media interface, rev. 1
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 100baseT4, auto
mavb0 at mace0 offset 0x300000 intr 6 intrmask 0x0: AD1843 rev 1
audio0 at mavb0: full duplex, playback, capture, mmap, independent
mavb0: Virtual format configured - Format SLINEAR, precision 16, channels 2, frequency 48000
mavb0: Latency: 256 milliseconds
spkr0 at audio0: PC Speaker (synthesized)
macepci0 at mace0 offset 0x80000 intr 7 intrmask 0x0: rev 1
pci0 at macepci0 bus 0
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
ahc0 at pci0 dev 1 function 0: Adaptec aic7880 Ultra SCSI adapter
ahc0: interrupting at crime interrupt 8
ahc0: Using left over BIOS settings
ahc0: Host Adapter has no SEEPROM. Using default SCSI target parameters
ahc0: aic7880: Ultra Wide Channel A, SCSI Id=0, 16/253 SCBs
scsibus0 at ahc0: 16 targets, 8 luns per target
ahc1 at pci0 dev 2 function 0: Adaptec aic7880 Ultra SCSI adapter
ahc1: interrupting at crime interrupt 9
ahc1: Using left over BIOS settings
ahc1: Host Adapter has no SEEPROM. Using default SCSI target parameters
ahc1: aic7880: Ultra Wide Channel A, SCSI Id=0, 16/253 SCBs
scsibus1 at ahc1: 16 targets, 8 luns per target
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
timecounter: Timecounter "mips3_cp0_counter" frequency 112511570 Hz quality 100
scsibus0: waiting 2 seconds for devices to settle...
scsibus1: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 2 lun 0: <COMPAQ, BB018135B5, B017> disk fixed
sd0: 17365 MB, 7001 cyl, 20 head, 254 sec, 512 bytes/sect x 35565080 sectors
sd0: sync (50.00ns offset 8), 16-bit (40.000MB/s) transfers, tagged queueing
cd0 at scsibus0 target 4 lun 0: <TOSHIBA, CD-ROM XM-5701TA, 0167> cdrom removable
cd0: sync (100.00ns offset 8), 8-bit (10.000MB/s) transfers
boot device: sd0
root on sd0a dumps on sd0b
root file system type: ffs
kern.module.path=/stand/sgimips/8.1/modules
pid 1(init): ABI set to O32 (e_flags=0x1007)
cd0(ahc0:0:4:0):  Check Condition on CDB: 0x1b 00 00 00 01 00
    SENSE KEY:  Not Ready
     ASC/ASCQ:  Media Load or Eject Failed
wsdisplay0: screen 1 added (default, vt100 emulation)
wsdisplay0: screen 2 added (default, vt100 emulation)
wsdisplay0: screen 3 added (default, vt100 emulation)
wsdisplay0: screen 4 added (default, vt100 emulation)
cd0(ahc0:0:4:0):  Check Condition on CDB: 0x1b 00 00 00 01 00
    SENSE KEY:  Not Ready
     ASC/ASCQ:  Media Load or Eject Failed
-bash-5.0$ 
>How-To-Repeat:
Compile any large package from pkgsrc; the machine will eventually lock up.
>Fix:
unknown
