NetBSD Problem Report #40604

From www@NetBSD.org  Tue Feb 10 22:53:26 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 4BB5963BAB8
	for <gnats-bugs@gnats.netbsd.org>; Tue, 10 Feb 2009 22:53:26 +0000 (UTC)
Message-Id: <20090210225326.19D9463B400@narn.NetBSD.org>
Date: Tue, 10 Feb 2009 22:53:26 +0000 (UTC)
From: polimarco@gmail.com
Reply-To: polimarco@gmail.com
To: gnats-bugs@NetBSD.org
Subject: AlphaServer DS20E loses HDs and other Drives when adding 1 GB more RAM
X-Send-Pr-Version: www-1.0

>Number:         40604
>Category:       port-alpha
>Synopsis:       AlphaServer DS20E loses HDs and other Drives when adding 1 GB more RAM
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    port-alpha-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Feb 10 22:55:00 +0000 2009
>Last-Modified:  Thu Oct 07 20:05:02 +0000 2010
>Originator:     Marco Poli
>Release:        5.0 RC1
>Organization:
Design Lab - dlab.poli.usp.br
>Environment:
>Description:
Ok, that's one of the weirdest bugs I have ever faced.

I recently got an AlphaServer DS20E with 2 833 MHz CPUs, 5 hard drives and 3 power supplies. Nice server.

The machine came with 1 GB RAM, 4x256 MB DIMM boards placed in Bank 0.

I installed NetBSD 4.0.1 without any issues and immediately got the 5 Hard Drives in a RAID configuration, with root and swap under RAID1 and /usr under a RAID5. All working very nicely.

One day I received 8 more of those 256 MB DIMMs and hurried to upgrade the server. I installed the boards in Banks 1 and 2.

To my surprise, on the next boot I was faced with a mysterious

-----
probe(esiop0:0:0:0): request sense for a request sense ?
probe(esiop0:0:0:0): request sense failed with error 22
probe(esiop0:0:0:0): generic HBA error
-----
Those 3 messages repeat for each of my other 4 hard drives.

Then everything closes with the mysterious:
-----
WARNING: can't figure out what device matches "SCSI 1 7 0 0 0 0 0"
-----
That should be my boot and root device, dka0.
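
For what it's worth, the string in that warning appears to be a space-separated record of eight fields. Below is a minimal Python sketch of splitting it up. The field names are assumptions of mine for illustration only; nothing in the kernel output states them. If the second and third fields are the PCI bus and device numbers, "SCSI 1 7" would be consistent with esiop0 sitting at pci1 dev 7 (see the SCSI device listing further down).

```python
from typing import NamedTuple


class PromBootDev(NamedTuple):
    """One boot-device string from the warning; field names are assumed labels."""
    protocol: str
    bus: int
    slot: int
    channel: int
    remote_address: str
    unit: int
    boot_dev_type: int
    ctrl_dev_type: str


def parse_prom_bootdev(text: str) -> PromBootDev:
    """Split a string such as 'SCSI 1 7 0 0 0 0 0' into its eight fields."""
    fields = text.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {text!r}")
    protocol, bus, slot, channel, remote, unit, dev_type, ctrl_type = fields
    return PromBootDev(
        protocol, int(bus), int(slot), int(channel),
        remote, int(unit), int(dev_type), ctrl_type,
    )


if __name__ == "__main__":
    # The string quoted in the warning above.
    print(parse_prom_bootdev("SCSI 1 7 0 0 0 0 0"))
```

This only decodes the string; it says nothing about why the kernel failed to match it to a device.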

The next line asks me to set the root device, but when I hit any key, the following line immediately appears 3 times:

----
root device:
stray isa irq 1
stray isa irq 1
stray isa irq 1
use one of: fxp0 fd0[a-h] cd0[a-h] ddb halt reboot
stray isa irq 1; stopped logging
----

As you can see, none of my hard drives is listed. The first time that happened I assumed a static discharge had physically damaged my SCSI bus, but after booting into Linux everything seemed fine hardware-wise.

Ok, so let's try booting from the CD-ROM (dqa0 in my case). The same thing happens, but with:

----
WARNING: can't figure out what device matches "IDE 0 105 0 0 0 0 0"
----

Everything else is the same, except that in this case the CD doesn't show up in the list of available root devices.


When I remove the extra memory and leave only Bank 0 full, that is only 1 GB, everything gets back to normal.

Linux 2.6 works just fine with 2 GB or 3 GB of total memory, no issues noticed in the 2 or 3 days of uptime with this configuration.

This bug *might* be related to #38941, with the difference that in my case it never really hangs; it just ends up stuck at "no root disk". The install CD even gets the installation script running, but tells me there is nowhere to install to. I am able to use ddb at any point and quit and restart the installation script.

I can't say for sure. But I don't think it is related to #37915.

Machine is a:

---
COMPAQ AlphaServer DS20E 833 MHz, s/n ...
---

The SCSI device:
---
esiop0 at pci1 dev 7 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
esiop0: using on-board RAM
esiop0: interrupting at dec 6600 irq 47
---

I am sorry for any typos; I was unable to copy-paste the actual screen, so this is a hand-typed reproduction.

Please tell me if I can provide any other useful information.

Thanks!
>How-To-Repeat:


Try to boot a DS20E with more than 1 GB of memory, or with memory in banks other than Bank 0; I can't tell which of the two triggers it.
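
To help tell those two cases apart, the per-array sizes that the tsc0 driver prints at boot could be compared between memory configurations. A minimal Python sketch follows. It assumes the "arrays present" line always has the form seen in the working 1 GB dmesg in this report, and it assumes (unverified) that the arrays map onto the physical banks in order.

```python
import re
from typing import List


def arrays_present(dmesg_line: str) -> List[int]:
    """Return the per-array sizes in MB from a tsc0 'arrays present' line."""
    match = re.search(r"arrays present:\s*(.*)", dmesg_line)
    if match is None:
        raise ValueError("no 'arrays present' report in this line")
    # Only tokens of the form '<digits>MB' are sizes; 'Dchip 0 rev 1' is skipped.
    return [int(size) for size in re.findall(r"(\d+)MB", match.group(1))]


def populated(sizes: List[int]) -> List[int]:
    """Indices of the arrays that report a non-zero size."""
    return [index for index, size in enumerate(sizes) if size > 0]


if __name__ == "__main__":
    # Line taken from the working 1 GB boot in this report.
    line = "tsc0: arrays present: 1024MB, 0MB, 0MB, 0MB, Dchip 0 rev 1"
    sizes = arrays_present(line)
    print("sizes (MB):", sizes)
    print("populated arrays:", populated(sizes))
    print("total (MB):", sum(sizes))
```

Running it against the line from a boot with only a non-zero bank populated, and against one with more than 1 GB in total, would show which of the two conditions the kernel actually sees.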
>Fix:

>Audit-Trail:
From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@NetBSD.org, polimarco@gmail.com
Cc: port-alpha-maintainer@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-alpha/40604: AlphaServer DS20E loses HDs and other Drives
 when adding 1 GB more RAM
Date: Wed, 11 Feb 2009 15:45:18 +0100 (CET)

 > Ok, that's one of the weirdest bugs I have ever faced.
 >
 > I recently got a AlphaServer DS20E with 2 833 MHz CPUs, 5 Hard
 > Drives and 3 Power Sources. Nice server.
 >
 > The machine came with 1 GB RAM, 4x256 MB DIMM boards placed in Bank 0.
 >
 > I installed NetBSD 4.0.1 without any issues and immediately got
 > the 5 Hard Drives in a RAID configuration, with root and swap
 > under RAID1 and /usr under a RAID5. All working very nicely.
 >
 > One day I received 8 more of that 256 MB memories, and hurried
 > to upgrade the server. I installed the boards in Banks 1 and 2.
 >
 > What wasn't my surprise in the next boot, when I was faced with a mysterious
 >
 > -----
 > probe(esiop0:0:0:0): request sense for a request sense ?
 > probe(esiop0:0:0:0): request sense failed with error 22
 > probe(esiop0:0:0:0): generic HBA error
 > -----
 > and that 3 messages repeat for each of my other 4 Hard Drives.
 >
 > and then everything closes with the misterious:
 > -----
 > WARNING: can't figure out what device matches "SCSI 1 7 0 0 0 0 0"
 > -----
 > That should be my boot and root device, dka0.

 I experienced some similar weirdness on an Alpha DP264 box I
 currently have in operation.  I think my conclusion was that this
 was due to some of the memory being bad.  You could try to see if
 this is the case by trying out the internal memory tester in the
 SRM firmware, or the more brute-force approach of trying to run
 the machine with only the new memory in bank 0, and see if it
 then behaves any better (or worse).

 However, your statement that it works fine with 2 or 3GB total
 memory (presumably the same memory tested above) with no issues
 while running Linux 2.6 makes this maybe implausible as an
 explanation.  Did you try to push the VM system while it ran
 Linux?  You could try pkgsrc/sysutils/memtester to exercise the
 system a bit, even though it's not a "real" memory tester (when
 you run it on a virtual memory system).  You may need to adjust
 the per-process memory limit up, and run sufficiently many
 instances to put strain on larger portions of your memory.

 Myself?  I yanked out 1GB, so my DP264 box now runs with 1GB
 memory.  Admittedly not ideal if this is indeed a bug in NetBSD,
 and not a hardware problem...

 Regards,

 - Håvard

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: port-alpha-maintainer@NetBSD.org, gnats-admin@NetBSD.org,
        netbsd-bugs@NetBSD.org
Subject: Re: port-alpha/40604: AlphaServer DS20E loses HDs and other Drives
	when adding 1 GB more RAM
Date: Thu, 12 Feb 2009 16:21:15 +0100

 On Tue, Feb 10, 2009 at 10:55:00PM +0000, polimarco@gmail.com wrote:
 > [...]
 > What wasn't my surprise in the next boot, when I was faced with a mysterious
 > 
 > -----
 > probe(esiop0:0:0:0): request sense for a request sense ?
 > probe(esiop0:0:0:0): request sense failed with error 22
 > probe(esiop0:0:0:0): generic HBA error
 > -----

 Do you have messages from the scsi layer or controller before this one ?

 -- 
 Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyer@lip6.fr
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Marco Poli <polimarco@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-alpha/40604: AlphaServer DS20E loses HDs and other Drives 
	when adding 1 GB more RAM
Date: Thu, 12 Feb 2009 21:02:42 -0200


 Hello Manuel,


 As far as I can tell (I have no serial terminal and the install CD doesn't
 have dmesg), the only ones are:

 ---
 ahc0 at pci0 dev 6 function 0: Adaptec aic7895 Ultra SCSI adapter
 ahc0: interrupting at dec 6600 irq 19
 ahc0: aic7895C: Ultra Wide Channel A, SCSI Id=7, 32/253 SCBs
 scsibus0 at ahc0: 16 targets, 8 luns per target
 ahc1 at pci0 dev 6 function 1: Adaptec aic7895 Ultra SCSI adapter
 ahc1: interrupting at dec 6600 irq 18
 ahc1: aic7895C: Ultra Wide Channel B, SCSI Id=7, 32/253 SCBs
 scsibus1 at ahc1: 16 targets, 8 luns per target

 scsibus 0: waiting 2 seconds for devices to settle...
 scsibus 1: waiting 2 seconds for devices to settle...
 scsibus 2: waiting 2 seconds for devices to settle...
 ---


 The working (1 GB boot) dmesg is:

 ---
 Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
     2006, 2007
     The NetBSD Foundation, Inc. All rights reserved.
 Copyright (c) 1982, 1986, 1989, 1991, 1993
     The Regents of the University of California. All rights reserved.

 NetBSD 4.0.1 (GENERIC.MP) #0: Tue Oct 7 20:56:07 PDT 2008
 builds@wb27:/home/builds/ab/netbsd-4-0-1-RELEASE/alpha/200810080053Z-obj/home/builds/ab/netbsd-4-0-1-RELEASE/src/sys/arch/alpha/compile/GENERIC.MP
 COMPAQ AlphaServer DS20E 833 MHz, s/n ...
 8192 byte page size, 2 processors.
 total memory = 1024 MB
 (2752 KB reserved for PROM, 1021 MB used by NetBSD)
 avail memory = 996 MB
 mainbus0 (root)
 cpu0 at mainbus0: ID 0 (primary), 21264B-4
 cpu0: Architecture extensions: 1307<PAT,MVI,CIX,FIX,BWX>
 cpu1 at mainbus0: ID 1, 21264B-4
 cpu1: Architecture extensions: 1307<PAT,MVI,CIX,FIX,BWX>
 tsc0 at mainbus0: 21272 Core Logic Chipset, Cchip rev 0
 tsc0: 8 Dchips, 2 memory buses of 32 bytes
 tsc0: arrays present: 1024MB, 0MB, 0MB, 0MB, Dchip 0 rev 1
 tsp0 at tsc0
 pci0 at tsp0 bus 0
 pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
 sio0 at pci0 dev 5 function 0: Contaq Microsystems 82C693 PCI-ISA Bridge (rev. 0x00)
 cypide0 at pci0 dev 5 function 1
 cypide0: Cypress 82C693 IDE Controller (rev. 0x00)
 cypide0: bus-master DMA support present
 cypide0: primary channel wired to compatibility mode
 cypide0: primary channel interrupting at isa irq 14
 atabus0 at cypide0 channel 0
 cypide1 at pci0 dev 5 function 2
 cypide1: Cypress 82C693 IDE Controller (rev. 0x00)
 cypide1: hardware does not support DMA
 cypide1: primary channel wired to compatibility mode
 cypide1: secondary channel interrupting at isa irq 15
 atabus1 at cypide1 channel 0
 ohci0 at pci0 dev 5 function 3: Contaq Microsystems 82C693 PCI-ISA Bridge (rev. 0x00)
 ohci0: interrupting at isa irq 10
 ohci0: OHCI version 1.0, legacy support
 usb0 at ohci0: USB revision 1.0
 uhub0 at usb0
 uhub0: Contaq Microsys OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
 uhub0: 2 ports with 2 removable, self powered
 ahc0 at pci0 dev 6 function 0: Adaptec aic7895 Ultra SCSI adapter
 ahc0: interrupting at dec 6600 irq 19
 ahc0: aic7895C: Ultra Wide Channel A, SCSI Id=7, 32/253 SCBs
 scsibus0 at ahc0: 16 targets, 8 luns per target
 ahc1 at pci0 dev 6 function 1: Adaptec aic7895 Ultra SCSI adapter
 ahc1: interrupting at dec 6600 irq 18
 ahc1: aic7895C: Ultra Wide Channel B, SCSI Id=7, 32/253 SCBs
 scsibus1 at ahc1: 16 targets, 8 luns per target
 vga0 at pci0 dev 7 function 0: 3D Labs GLINT Permedia 3 (rev. 0x01)
 wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
 wsmux1: connecting to wsdisplay0
 isa0 at sio0
 lpt0 at isa0 port 0x3bc-0x3bf irq 7
 com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
 com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
 pckbc0 at isa0 port 0x60-0x64
 pckbd0 at pckbc0 (kbd slot)
 pckbc0: using irq 1 for kbd slot
 wskbd0 at pckbd0: console keyboard, using wsdisplay0
 pms0 at pckbc0 (aux slot)
 pckbc0: using irq 12 for aux slot
 wsmouse0 at pms0 mux 0
 attimer0 at isa0 port 0x40-0x43: AT Timer
 pcppi0 at isa0 port 0x61
 pcppi0: children must have an explicit unit
 midi0 at pcppi0: PC speaker (CPU-intensive output)
 spkr0 at pcppi0
 isabeep0 at pcppi0
 fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
 mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
 pcppi0: attached to attimer0
 tsp1 at tsc0
 pci1 at tsp1 bus 0
 pci1: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
 esiop0 at pci1 dev 7 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
 esiop0: using on-board RAM
 esiop0: interrupting at dec 6600 irq 47
 scsibus2 at esiop0: 16 targets, 8 luns per target
 fxp0 at pci1 dev 9 function 0: i82559 Ethernet, rev 8
 fxp0: interrupting at dec 6600 irq 39
 fxp0: Ethernet address 00:50:8b:ae:dc:8a
 inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
 inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
 Kernelized RAIDframe activated
 stray isa irq 14
 scsibus0: waiting 2 seconds for devices to settle...
 scsibus1: waiting 2 seconds for devices to settle...
 scsibus2: waiting 2 seconds for devices to settle...
 atapibus0 at atabus0: 2 targets
 cd0 at atapibus0 drive 0: <CD-224E, , 9.5B> cdrom removable
 cd0: 32-bit data port
 cd0: drive supports PIO mode 4, DMA mode 2
 cd0(cypide0:0:0): using PIO mode 4, DMA mode 2 (using DMA)
 sd0 at scsibus2 target 0 lun 0: <COMPAQ, BD0366459B, B010> disk fixed
 sd0: 34732 MB, 14002 cyl, 20 head, 254 sec, 512 bytes/sect x 71132000 sectors
 sd0: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
 sd1 at scsibus2 target 1 lun 0: <COMPAQ, BD0366459B, B010> disk fixed
 sd1: 34732 MB, 14002 cyl, 20 head, 254 sec, 512 bytes/sect x 71132000 sectors
 sd1: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
 sd2 at scsibus2 target 2 lun 0: <COMPAQ, BD0366459B, B010> disk fixed
 sd2: 34732 MB, 14002 cyl, 20 head, 254 sec, 512 bytes/sect x 71132000 sectors
 sd2: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
 sd3 at scsibus2 target 3 lun 0: <COMPAQ, BD0366459B, B010> disk fixed
 sd3: 34732 MB, 14002 cyl, 20 head, 254 sec, 512 bytes/sect x 71132000 sectors
 sd3: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
 sd4 at scsibus2 target 4 lun 0: <COMPAQ, BD0366459B, B010> disk fixed
 sd4: 34732 MB, 14002 cyl, 20 head, 254 sec, 512 bytes/sect x 71132000 sectors
 sd4: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
 raid0: RAID Level 1
 raid0: Components: /dev/sd0a /dev/sd2a
 raid0: Total Sectors: 4460160 (2177 MB)
 raid1: RAID Level 1
 raid1: Components: /dev/sd1a /dev/sd3a
 raid1: Total Sectors: 4460160 (2177 MB)
 raid2: RAID Level 5
 raid2: Components: /dev/sd0d /dev/sd2d /dev/sd3d /dev/sd4d
 raid2: Total Sectors: 200015040 (97663 MB)
 root on raid0a dumps on raid0b
 root file system type: ffs
 WARNING: clock gained 2 days -- CHECK AND RESET THE DATE!
 wsdisplay0: screen 1 added (80x25, vt100 emulation)
 wsdisplay0: screen 2 added (80x25, vt100 emulation)
 wsdisplay0: screen 3 added (80x25, vt100 emulation)
 wsdisplay0: screen 4 added (80x25, vt100 emulation)
 ---




 One more thing that might be of interest: the /dev/md0a from the boot CD
 works and gives me a working environment, but when I try to mount /dev/cd0a
 on /mnt, I get

 stray isa irq 14

 and then the CD spins and everything hangs.






From: Marco Poli <polimarco@gmail.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-alpha/40604: AlphaServer DS20E loses HDs and other Drives 
	when adding 1 GB more RAM
Date: Thu, 12 Feb 2009 21:36:33 -0200

 Hello Havard!

 I just switched Bank 0 with the 4 memory boards that used to be in
 Bank 1. NetBSD booted fine.

 I installed and ran sysutils/memtester, no errors reported.

 ---

 On Wed, Feb 11, 2009 at 12:50 PM, Havard Eidnes <he@netbsd.org> wrote:
 > The following reply was made to PR port-alpha/40604; it has been noted by GNATS.
 >
 > From: Havard Eidnes <he@NetBSD.org>
 > To: gnats-bugs@NetBSD.org, polimarco@gmail.com
 > Cc: port-alpha-maintainer@netbsd.org, netbsd-bugs@netbsd.org
 > Subject: Re: port-alpha/40604: AlphaServer DS20E loses HDs and other Drives
 >  when adding 1 GB more RAM
 > Date: Wed, 11 Feb 2009 15:45:18 +0100 (CET)
 >
 >  > Ok, that's one of the weirdest bugs I have ever faced.
 >  >
 >  > I recently got an AlphaServer DS20E with 2 833 MHz CPUs, 5 Hard
 >  > Drives and 3 Power Sources. Nice server.
 >  >
 >  > The machine came with 1 GB RAM, 4x256 MB DIMM boards placed in Bank 0.
 >  >
 >  > I installed NetBSD 4.0.1 without any issues and immediately got
 >  > the 5 Hard Drives in a RAID configuration, with root and swap
 >  > under RAID1 and /usr under a RAID5. All working very nicely.
 >  >
 >  > One day I received 8 more of those 256 MB memory modules, and hurried
 >  > to upgrade the server. I installed the boards in Banks 1 and 2.
 >  >
 >  > Imagine my surprise on the next boot, when I was faced with a mysterious
 >  >
 >  > -----
 >  > probe(esiop0:0:0:0): request sense for a request sense ?
 >  > probe(esiop0:0:0:0): request sense failed with error 22
 >  > probe(esiop0:0:0:0): generic HBA error
 >  > -----
 >  > and those 3 messages repeat for each of my other 4 Hard Drives.
 >  >
 >  > and then everything closes with the mysterious:
 >  > -----
 >  > WARNING: can't figure out what device matches "SCSI 1 7 0 0 0 0 0"
 >  > -----
 >  > That should be my boot and root device, dka0.
 >
 >  I experienced some similar weirdness on an Alpha DP264 box I
 >  currently have in operation.  I think my conclusion was that this
 >  was due to some of the memory being bad.  You could try to see if
 >  this is the case by trying out the internal memory tester in the
 >  SRM firmware, or the more brute-force approach of trying to run
 >  the machine with only the new memory in bank 0, and see if it
 >  then behaves any better (or worse).
 >
 >  However, your statement that it works fine with 2 or 3GB total
 >  memory (presumably the same memory tested above) with no issues
 >  while running Linux 2.6 makes this maybe implausible as an
 >  explanation.  Did you try to push the VM system while it ran
 >  Linux?  You could try pkgsrc/sysutils/memtester to exercise the
 >  system a bit, even though it's not a "real" memory tester (when
 >  you run it on a virtual memory system).  You may need to adjust
 >  the per-process memory limit up, and run sufficiently many
 >  instances to put strain on larger portions of your memory.
 >
 >  Myself?  I yanked out 1GB, so my DP264 box now runs with 1GB
 >  memory.  Admittedly not ideal if this is indeed a bug in NetBSD,
 >  and not a hardware problem...
 >
 >  Regards,
 >
 >  - Håvard
 >
 >
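
 The suggestion quoted above (several memtester instances, each sized to fit
 under the per-process limit) could be scripted roughly as follows. This is
 only a sketch: the function names and the 512 MB instance size are made up
 for illustration, and raising the per-process data limit (for example with
 the shell's ulimit) is assumed to have been done beforehand.

```python
import subprocess


def plan_memtester_runs(total_mb, per_instance_mb, loops=1):
    """Return one memtester argv per instance needed to cover total_mb.

    memtester is invoked as `memtester <size>M <loops>`; the split into
    equal-sized instances is an illustrative assumption.
    """
    count = max(1, total_mb // per_instance_mb)
    return [["memtester", f"{per_instance_mb}M", str(loops)]
            for _ in range(count)]


def run_all(plans):
    """Start every planned instance at once and collect the exit codes."""
    procs = [subprocess.Popen(argv) for argv in plans]
    return [p.wait() for p in procs]


# Planning only; nothing is started until run_all(plans) is called.
plans = plan_memtester_runs(total_mb=3072, per_instance_mb=512)
print(len(plans), "instances planned")
```

 A non-zero exit code from any instance would be the thing to look at first.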

From: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-alpha/40604
Date: Thu, 7 Oct 2010 22:01:18 +0200

 I just committed code to enable and decode the peripheral machine checks
 on this kind of system. This clearly shows that this is not bad memory or
 anything like that:

 ...
 mlx0 at pci0 dev 9 function 0: Mylex RAID (v2 interface)
 mlx0: interrupting at dec 6600 irq 23
 sgmap_load: ----- buf = 0xfffffe0000031300 -----
 sgmap_load: dmaoffset = 0x1300, buflen = 0x80
 sgmap_load: va:endva = 0xfffffe0000030000:0xfffffe0000032000
 sgmap_load: sgvalen = 0x2000, boundary = 0x0
 sgmap_load: sgva = 0x0, pteidx = 0, pte = 0xfffffc00002d0000 (pt = 0xfffffc00002d0000)
 sgmap_load: wbase = 0x800000, vpage = 0x0, DMA addr = 0x801300
 sgmap_load:     pa = 0xfffa6000, pte = 0xfffffc00002d0000, *pte = 0xfffa7

 System Machine Check (660): Rev 0x1, Code 0x202, Flags 0x0

     Software Error Summary Flags   = 0x0000000000000001
     CPU Device Interrupt Requests  = 0x4000000000000000
         DIR = 0x4000000000000000<Pchip 0 error>
     Cchip Miscellaneous Register   = 0x0000000100000030
     Pchip 0 Error Register         = 0x0070fffa73700041
         error    = 0x41<Error lost,Target abort>
         address  = 0xfffa7370, 0x0<No DAC>
         command  = 0x7<PCI memory write>
     Pchip 1 Error Register         = 0x0000000000000000

 unexpected machine check:

     mces    = 0x1
     vector  = 0x660
     param   = 0xfffffc0000006080
     pc      = 0xfffffc00009db220
     ra      = 0xfffffc00009db1e8
     code    = 0x100000202
     curlwp = 0xfffffc0001234a60
         pid = 0.1, comm = system

 panic: machine check
 ...
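
 For reference, the three decoded fields of the Pchip 0 Error Register can
 be reproduced from the raw value with a few shifts and masks. The bit
 positions below are inferred from the numbers printed in this log, not
 taken from the chipset documentation, so treat the layout as an assumption;
 only the two error bits named in the log are given names.

```python
# Assumed layout, inferred from the log above:
#   bits 11:0   error flags
#   bits 47:16  address
#   bits 55:52  PCI command
ERROR_BIT_NAMES = {0: "Error lost", 6: "Target abort"}  # names from the log


def decode_pchip_error(reg):
    """Split a raw Pchip error register value into its logged fields."""
    error = reg & 0xFFF
    address = (reg >> 16) & 0xFFFFFFFF
    command = (reg >> 52) & 0xF
    names = [name for bit, name in sorted(ERROR_BIT_NAMES.items())
             if error & (1 << bit)]
    return {"error": error, "error_names": names,
            "address": address, "command": command}


fields = decode_pchip_error(0x0070FFFA73700041)
print({k: (hex(v) if isinstance(v, int) else v) for k, v in fields.items()})
```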

 This only happens when using SG DMA, which is used for all addresses
 above 1GB. I have no idea why it is failing; the code looks OK.
 Maybe I am getting something wrong here, but why is the Pchip doing a PCI
 memory write when it should be doing a DMA transfer to host memory?
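
 As a sanity check, the sgmap_load numbers in the log are at least
 self-consistent. The sketch below recomputes the DMA address and the PTE
 value, assuming 8 KB pages and a PTE made of the page frame number shifted
 left by one with a valid bit in bit 0. Both assumptions are chosen because
 they reproduce the logged values; the function names are illustrative.

```python
PAGE_SHIFT = 13                 # assumption: 8 KB pages
PAGE_SIZE = 1 << PAGE_SHIFT
PTE_VALID = 0x1                 # assumption: bit 0 marks a valid entry


def sgmap_pte(pa):
    """PTE value for the page that contains physical address pa."""
    return ((pa >> PAGE_SHIFT) << 1) | PTE_VALID


def sgmap_dma_addr(wbase, sgva, buf_va):
    """Bus address handed to the device: window base, SG offset, page offset."""
    return wbase + sgva + (buf_va & (PAGE_SIZE - 1))


# Values taken from the sgmap_load lines in the log.
print(hex(sgmap_dma_addr(0x800000, 0x0, 0xFFFFFE0000031300)))
print(hex(sgmap_pte(0xFFFA6000)))
```

 Both results match the log (DMA addr = 0x801300, *pte = 0xfffa7), so the
 mapping itself appears to be set up as intended.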


 Hans


 -- 
 %SYSTEM-F-ANARCHISM, The operating system has been overthrown

$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.