NetBSD Problem Report #26825

Received: (qmail 27155 invoked by uid 605); 1 Sep 2004 16:04:15 -0000
Message-Id: <200409011549.i81FnXVc027715@heiligenberg.nt.e-technik.tu-darmstadt.de>
Date: Wed, 1 Sep 2004 17:49:33 +0200 (CEST)
From: Hauke Fath <hf@spg.tu-darmstadt.de>
Sender: gnats-bugs-owner@NetBSD.org
Reply-To: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@gnats.NetBSD.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
Subject: mpt driver hangs during 2.0 install
X-Send-Pr-Version: 3.95

>Number:         26825
>Category:       kern
>Synopsis:       mpt driver hangs during 2.0 install
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Sep 01 16:05:00 +0000 2004
>Closed-Date:    
>Last-Modified:  Fri Jan 04 21:32:34 +0000 2008
>Originator:     Hauke Fath <hf@spg.tu-darmstadt.de>
>Release:        NetBSD 2.0_BETA
>Organization:
-- 
/~\  The ASCII Ribbon Campaign                    Hauke Fath
\ /    No HTML/RTF in email	        Institut für Nachrichtentechnik
 X     No Word docs in email	                  TU Darmstadt
/ \  Respect for open standards              Ruf +49-6151-16-3281

>Environment:


System: NetBSD heiligenberg 2.0_BETA NetBSD 2.0_BETA (HEILIGENBERG) #3: Wed Jul 28 16:42:21 CEST 2004 hf@heiligenberg:/var/obj/netbsd-builds/2_0/i386/sys/arch/i386/compile/HEILIGENBERG i386
Architecture: i386
Machine: i386
>Description:

	Attempting a 2.0beta install on an i386 19" server with an LSI
	Logic LSI20320-R RAID 1 (firmware version '1030F00') stalls
	after unpacking a few tarballs; at most, it gets as far as
	running MAKEDEV.

	/kern/msgbuf has a load of entries like (hand-copied):

sd0(mpt0:0:0:0): command timeout
mpt0: timeout on request index 0x9b, seq 0x000007dd
mpt0: status 0x00000000, Mask 0x00000001, Doorbell 0x24000000
mpt0: request state: On Chip

and

sd0(mpt0:0:0:0): command timeout
mpt0: timeout on request index 0x9b, seq 0x00000b55
mpt0: Device not running
mpt0: mailbox: (0x4000777) State Fault WhoInit No One
mpt0: status 0x00000000, Mask 0x00000001, Doorbell 0x40007777
mpt0: request state: On Chip
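
The two Doorbell values above can be split into fields. This is a minimal sketch, assuming the bit layout from LSI's MPI 1.x headers (IOC state in the top nibble, WhoInit in bits 24-26, fault code in the low 16 bits); the constants and the helper name `decode_doorbell` are illustrative and are not taken from this report or from the NetBSD driver source.

```python
# Bit layout assumed from LSI's MPI 1.x headers; not confirmed by this PR.
IOC_STATE_SHIFT = 28   # top nibble: IOC state
WHO_INIT_SHIFT = 24    # bits 24-26: who initialised the IOC
WHO_INIT_MASK = 0x7
DATA_MASK = 0xFFFF     # low 16 bits: fault code when the state is Fault

IOC_STATES = {0x0: "Reset", 0x1: "Ready", 0x2: "Operational", 0x4: "Fault"}
WHO_INIT = {
    0: "No One",
    1: "System BIOS",
    2: "ROM BIOS",
    3: "PCI Peer",
    4: "Host Driver",
    5: "Manufacturer",
}


def decode_doorbell(value):
    """Split a 32-bit doorbell value into (state, who_init, data)."""
    state = IOC_STATES.get((value >> IOC_STATE_SHIFT) & 0xF, "Unknown")
    who = WHO_INIT.get((value >> WHO_INIT_SHIFT) & WHO_INIT_MASK, "Unknown")
    return state, who, value & DATA_MASK


# The two values hand-copied from /kern/msgbuf.
for raw in (0x24000000, 0x40007777):
    state, who, data = decode_doorbell(raw)
    print(f"Doorbell {raw:#010x}: state={state}, who_init={who}, data={data:#06x}")
```

Under these assumptions, the second value decodes to state Fault with WhoInit No One, which agrees with the "State Fault WhoInit No One" line the driver printed; the first decodes to an operational IOC.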


I had similar problems about nine months ago installing
1.6Z{something} on an identical machine, but managed to work around
the issue. This time it is a no-go, although I haven't tried
unpacking the tarballs manually yet. I suspect that sysinst mounting
with '-o async' overloads the driver.

The OpenBSD mpt(8) manpage notes, without giving details, that their
driver does not work with firmware versions > 1.03.00.

>How-To-Repeat:
	Try to install 2.0beta on an i386 machine equipped with an
	LSI20320-R RAID 1.

>Fix:
	No idea.
>Release-Note:
>Audit-Trail:
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-admin@netbsd.org, kern-bug-people@netbsd.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>, gnats-bugs@netbsd.org
Subject: Re: kern/26825: mpt driver hangs during 2.0 install
Date: Sun, 30 Jan 2005 18:19:37 +0100

 Another data point (2.0.1 kernel) -- untarring a 2.0.1 binary
 distribution to an ffs mounted with softdep gives me:

 # sh install-i386-sets.sh
 Compare MD5 checksums...
 base.tgz         -- set OK.
 comp.tgz         -- set OK.
 games.tgz        -- set OK.
 man.tgz          -- set OK.
 misc.tgz         -- set OK.
 text.tgz         -- set OK.
 xbase.tgz        -- set OK.
 xcomp.tgz        -- set OK.
 xetc.tgz         -- set OK.
 xfont.tgz        -- set OK.
 xserver.tgz      -- set OK.
 Unpacking sets in / ...
 - Unpacking base.tgz...  Done.
 - Unpacking comp.tgz...  Done.
 - Unpacking games.tgz...  Done.
 - Unpacking man.tgz...  Done.
 - Unpacking misc.tgz...  Done.
 - Unpacking text.tgz...  Done.
 - Unpacking xbase.tgz...  Done.
 - Unpacking xcomp.tgz...  Done.
 - Unpacking xetc.tgz...  Done.
 - Unpacking xfont.tgz...  Done.
 - Unpacking xserver.tgz...  Done.
 Now reboot, then mount sources to /usr/src
 and run "env USETOOLS=no etcupdate -w100 -v -a -l".
 # mpt0: Unknown async event: 0xb
 mpt0: Unknown async event: 0xb
 mpt0: Unknown async event: 0xb
 sd0(mpt0:0:0:0):  Check Condition on CDB: 0x2a 00 00 2a e4 0f 00 00 10 00
     SENSE KEY:  Hardware Error
      ASC/ASCQ:  Internal Target Failure

 sd0(mpt0:0:0:0):  Check Condition on CDB: 0x2a 00 00 2a e4 1f 00 00 20 00
     SENSE KEY:  Hardware Error
      ASC/ASCQ:  Internal Target Failure

 [...]

 /usr: got error 5 while accessing filesystem
 panic: softdep_deallocate_dependencies: unrecovered I/O error
 Begin traceback...
 softdep_deallocate_dependencies(c10ab578,5,cb0faeec,c02290f2,41fd042a) at netbsd:softdep_deallocate_dependencies+0x22
 brelse(c10ab578,2ae7ff,0,0,c10ab578) at netbsd:brelse+0x1f3
 biodone(c10ab578,1,cb0faf34,cb0f2108,0) at netbsd:biodone+0x6c
 scsipi_complete(c0e40a68,10,c0363358,0,0) at netbsd:scsipi_complete+0x114
 scsipi_completion_thread(c0d3b2d4,492000,49b000,0,c0100321) at netbsd:scsipi_completion_thread+0xc0
 End traceback...
 syncing disks... mpt0: Unknown async event: 0xb
 mpt0: Unknown async event: 0xb
 mpt0: Unknown async event: 0xb
 mpt0: Unknown async event: 0xb
 mpt0: Unknown async event: 0xb
 mpt0: Unknown async event: 0xb
 mpt0: Unknown async event: 0xb

 [machine locked]

 -- the machine has 512 MB RAM and unpacks the tarballs very quickly into
 the buffer cache. When the cache gets flushed, the LSI20320 jams quickly.
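
The two failing CDBs in the console output can be decoded to see what the controller was being asked to do. This is a minimal sketch that assumes the standard 10-byte READ/WRITE CDB layout (opcode, flags, 32-bit big-endian LBA, group, 16-bit transfer length, control); the helper name `decode_cdb10` is hypothetical.

```python
import struct

# Only the two opcodes relevant here; other opcodes are reported numerically.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_cdb10(text):
    """Decode a 10-byte SCSI CDB given as hex text, e.g. '0x2a 00 ...'.

    Returns (command name, starting LBA, transfer length in blocks).
    """
    raw = bytes.fromhex(text.replace("0x", ""))
    if len(raw) != 10:
        raise ValueError(f"expected 10 CDB bytes, got {len(raw)}")
    # Big-endian: opcode, flags, LBA, group number, transfer length, control.
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", raw)
    return OPCODES.get(opcode, f"opcode {opcode:#04x}"), lba, blocks


for cdb in ("0x2a 00 00 2a e4 0f 00 00 10 00",
            "0x2a 00 00 2a e4 1f 00 00 20 00"):
    name, lba, blocks = decode_cdb10(cdb)
    print(f"{name}: LBA {lba:#x}, {blocks} blocks")
```

Both commands decode as WRITE(10), and the second starts at the block immediately after the first ends (0x2ae40f + 0x10 = 0x2ae41f), which would be consistent with a sequential cache flush hitting the controller.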

 Unfortunately, we've got four machines equipped with those controllers.
 If it helps, I could run test kernels, or provide a remote login /
 serial console.

 	hauke

 -- 
 /~\  The ASCII Ribbon Campaign                    Hauke Fath
 \ /    No HTML/RTF in email	        Institut für Nachrichtentechnik
  X     No Word docs in email	                  TU Darmstadt
 / \  Respect for open standards              Ruf +49-6151-16-3281

From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/26825: mpt driver hangs during 2.0 install
Date: Thu, 6 Jul 2006 11:06:26 +0200

 On 01.09.2004 at 17:49 +0200, Hauke Fath wrote:
 >
 >	Attempting a 2.0beta install on an i386 19" server with a LSI
 >	Logic LSI20320-R RAID 1 (firmware version '1030F00') stalls
 >	after unpacking a few tarballs;
 >	at most, it gets to running MAKEDEV.
 >
 >	/kern/msgbuf has a load of entries like (hand-copied):

 [...]

 From fresh, bad experience, I can confirm that the problem is still
 present in netbsd-3 and -current (a 2006-07-05 build). I tried
 netbsd-3 with the patch at the end of
 http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=30531, and it
 did not help at all.

 	hauke

 -- 
 /~\  The ASCII Ribbon Campaign                    Hauke Fath
 \ /    No HTML/RTF in email	        Institut für Nachrichtentechnik
   X     No Word docs in email	                  TU Darmstadt
 / \  Respect for open standards              Ruf +49-6151-16-3281

Responsible-Changed-From-To: kern-bug-people->tron
Responsible-Changed-By: tron@netbsd.org
Responsible-Changed-When: Thu, 03 Jan 2008 20:35:10 +0000
Responsible-Changed-Why:
I might have fixed that by committing the patch from PR kern/30531.


State-Changed-From-To: open->feedback
State-Changed-By: tron@netbsd.org
State-Changed-When: Thu, 03 Jan 2008 20:35:10 +0000
State-Changed-Why:
Can you please test whether this is fixed in revision 1.13 of
"sys/dev/pci/mpt_pci.c"?


From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: tron@NetBSD.org, kern-bug-people@NetBSD.org,
	gnats-admin@NetBSD.org
Subject: Re: kern/26825 (mpt driver hangs during 2.0 install)
Date: Fri, 4 Jan 2008 10:37:07 +0100

 At 20:35 +0000 on 03.01.2008, tron@NetBSD.org wrote:
 >Synopsis: mpt driver hangs during 2.0 install
 >
 >Responsible-Changed-From-To: kern-bug-people->tron
 >Responsible-Changed-By: tron@netbsd.org
 >Responsible-Changed-When: Thu, 03 Jan 2008 20:35:10 +0000
 >Responsible-Changed-Why:
 >I might have fixed that by committing the patch from PR kern/30531.
 >
 >
 >State-Changed-From-To: open->feedback
 >State-Changed-By: tron@netbsd.org
 >State-Changed-When: Thu, 03 Jan 2008 20:35:10 +0000
 >State-Changed-Why:
 >Can you please test whether this is fixed in revision 1.13 of
 >"sys/dev/pci/mpt_pci.c"?

 Thanks for looking into the problem.

 As noted in the PR, I have been running the machine in question with
 the patch from kern/30531 since mid-2006. It made no difference for
 either netbsd-3 or netbsd-4 -- right now, the machine runs a 4rc1
 kernel. I can update the installation by untarring the base, non-X11
 tarballs and avoiding bursts by mounting without softdep. Anything
 more data-intensive than that clogs up the controller.

 As to thorpej's hint in PR30531, the machine in question has plain
 PCI 33 MHz slots, no PCIX. In a newer PCIX-equipped board, the
 controller would not work at all in RAID1 mode (the machine in
 question runs Debian 3.x fine). I may be able to free a machine with
 a newer PCIX board for tests in a few weeks, though.

 	hauke

 -- 
       The ASCII Ribbon Campaign                    Hauke Fath
 ()     No HTML/RTF in email            Institut für Nachrichtentechnik
 /\     No Word docs in email                     TU Darmstadt
       Respect for open standards              Ruf +49-6151-16-3281

Responsible-Changed-From-To: tron->kern-bug-people
Responsible-Changed-By: tron@netbsd.org
Responsible-Changed-When: Fri, 04 Jan 2008 21:32:34 +0000
Responsible-Changed-Why:
I don't have mpt(4) hardware to test this.


State-Changed-From-To: feedback->open
State-Changed-By: tron@netbsd.org
State-Changed-When: Fri, 04 Jan 2008 21:32:34 +0000
State-Changed-Why:
Feedback was provided, needs more work.


>Unformatted:
