NetBSD Problem Report #26825
Received: (qmail 27155 invoked by uid 605); 1 Sep 2004 16:04:15 -0000
Message-Id: <200409011549.i81FnXVc027715@heiligenberg.nt.e-technik.tu-darmstadt.de>
Date: Wed, 1 Sep 2004 17:49:33 +0200 (CEST)
From: Hauke Fath <hf@spg.tu-darmstadt.de>
Sender: gnats-bugs-owner@NetBSD.org
Reply-To: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@gnats.NetBSD.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>
Subject: mpt driver hangs during 2.0 install
X-Send-Pr-Version: 3.95
>Number: 26825
>Category: kern
>Synopsis: mpt driver hangs during 2.0 install
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Sep 01 16:05:00 +0000 2004
>Closed-Date:
>Last-Modified: Fri Jan 04 21:32:34 +0000 2008
>Originator: Hauke Fath <hf@spg.tu-darmstadt.de>
>Release: NetBSD 2.0_BETA
>Organization:
--
/~\ The ASCII Ribbon Campaign Hauke Fath
\ / No HTML/RTF in email Institut für Nachrichtentechnik
X No Word docs in email TU Darmstadt
/ \ Respect for open standards Ruf +49-6151-16-3281
>Environment:
System: NetBSD heiligenberg 2.0_BETA NetBSD 2.0_BETA (HEILIGENBERG) #3: Wed Jul 28 16:42:21 CEST 2004 hf@heiligenberg:/var/obj/netbsd-builds/2_0/i386/sys/arch/i386/compile/HEILIGENBERG i386
Architecture: i386
Machine: i386
>Description:
Attempting a 2.0beta install on an i386 19" server with an LSI
Logic LSI20320-R RAID 1 (firmware version '1030F00') stalls
after unpacking a few tarballs;
at most, it gets as far as running MAKEDEV.
/kern/msgbuf has a load of entries like (hand-copied):
sd0(mpt0:0:0:0): command timeout
mpt0: timeout on request index 0x9b, seq 0x000007dd
mpt0: status 0x00000000, Mask 0x00000001, Doorbell 0x24000000
mpt0: request state: On Chip
and
sd0(mpt0:0:0:0): command timeout
mpt0: timeout on request index 0x9b, seq 0x00000b55
mpt0: Device not running
mpt0: mailbox: (0x4000777) State Fault WhoInit No One
mpt0: status 0x00000000, Mask 0x00000001, Doorbell 0x40007777
mpt0: request state: On Chip
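For readers decoding the Doorbell values above, here is a minimal sketch. The field layout (IOC state in the top four bits, WhoInit in bits 24-26, fault code in the low 16 bits) and the WhoInit names are assumptions taken from the LSI MPI header conventions, not something stated in this PR, and decode_doorbell is a hypothetical helper name.

```python
# Assumed layout of the Fusion-MPT doorbell register (per LSI's mpi.h);
# none of these constants come from the PR itself.
IOC_STATE_MASK = 0xF0000000
IOC_STATES = {
    0x00000000: "Reset",
    0x10000000: "Ready",
    0x20000000: "Operational",
    0x40000000: "Fault",
}
WHO_INIT_MASK = 0x07000000
WHO_INIT_SHIFT = 24
WHO_INIT = {
    0: "No One",
    1: "System BIOS",
    2: "ROM BIOS",
    3: "PCI Peer",
    4: "Host Driver",
    5: "Manufacturer",
}
DATA_MASK = 0x0000FFFF


def decode_doorbell(value):
    """Split a raw doorbell value into (state, who_init, data)."""
    state = IOC_STATES.get(value & IOC_STATE_MASK, "Unknown")
    who = WHO_INIT.get((value & WHO_INIT_MASK) >> WHO_INIT_SHIFT, "Unknown")
    data = value & DATA_MASK  # carries the fault code when state is Fault
    return state, who, data


# The two Doorbell values hand-copied from /kern/msgbuf.
for raw in (0x24000000, 0x40007777):
    state, who, data = decode_doorbell(raw)
    print(f"0x{raw:08x}: state={state} who_init={who} data=0x{data:04x}")
```

Under these assumptions the first value reads as an operational IOC owned by the host driver, and the second as a faulted IOC with no owner, which agrees with the "State Fault WhoInit No One" line in the second excerpt.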
I had similar problems about nine months ago installing
1.6Z{something} on an identical machine, but managed to work around
the issue. This time it is a no-go, although I haven't yet tried
unpacking the tarballs manually. I suspect that sysinst mounting '-o
async' overloads the driver.
The OpenBSD mpt(8) manpage points out that their driver does not work
with firmware versions > 1.03.00 without giving details.
>How-To-Repeat:
Try to install 2.0beta on an i386 machine equipped with a
LSI20320-R RAID 1.
>Fix:
No idea.
>Release-Note:
>Audit-Trail:
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-admin@netbsd.org, kern-bug-people@netbsd.org
Cc: Hauke Fath <hf@spg.tu-darmstadt.de>, gnats-bugs@netbsd.org
Subject: Re: kern/26825: mpt driver hangs during 2.0 install
Date: Sun, 30 Jan 2005 18:19:37 +0100
Another data point (2.0.1 kernel) -- untarring a 2.0.1 binary
distribution to an ffs mounted with softdep gives me:
# sh install-i386-sets.sh
Compare MD5 checksums...
base.tgz -- set OK.
comp.tgz -- set OK.
games.tgz -- set OK.
man.tgz -- set OK.
misc.tgz -- set OK.
text.tgz -- set OK.
xbase.tgz -- set OK.
xcomp.tgz -- set OK.
xetc.tgz -- set OK.
xfont.tgz -- set OK.
xserver.tgz -- set OK.
Unpacking sets in / ...
- Unpacking base.tgz... Done.
- Unpacking comp.tgz... Done.
- Unpacking games.tgz... Done.
- Unpacking man.tgz... Done.
- Unpacking misc.tgz... Done.
- Unpacking text.tgz... Done.
- Unpacking xbase.tgz... Done.
- Unpacking xcomp.tgz... Done.
- Unpacking xetc.tgz... Done.
- Unpacking xfont.tgz... Done.
- Unpacking xserver.tgz... Done.
Now reboot, then mount sources to /usr/src
and run "env USETOOLS=no etcupdate -w100 -v -a -l".
# mpt0: Unknown async event: 0xb
mpt0: Unknown async event: 0xb
mpt0: Unknown async event: 0xb
sd0(mpt0:0:0:0): Check Condition on CDB: 0x2a 00 00 2a e4 0f 00 00 10 00
    SENSE KEY: Hardware Error
    ASC/ASCQ: Internal Target Failure
sd0(mpt0:0:0:0): Check Condition on CDB: 0x2a 00 00 2a e4 1f 00 00 20 00
    SENSE KEY: Hardware Error
    ASC/ASCQ: Internal Target Failure
[...]
/usr: got error 5 while accessing filesystem
panic: softdep_deallocate_dependencies: unrecovered I/O error
Begin traceback...
softdep_deallocate_dependencies(c10ab578,5,cb0faeec,c02290f2,41fd042a) at netbsd:softdep_deallocate_dependencies+0x22
brelse(c10ab578,2ae7ff,0,0,c10ab578) at netbsd:brelse+0x1f3
biodone(c10ab578,1,cb0faf34,cb0f2108,0) at netbsd:biodone+0x6c
scsipi_complete(c0e40a68,10,c0363358,0,0) at netbsd:scsipi_complete+0x114
scsipi_completion_thread(c0d3b2d4,492000,49b000,0,c0100321) at netbsd:scsipi_completion_thread+0xc0
End traceback...
syncing disks... mpt0: Unknown async event: 0xb
mpt0: Unknown async event: 0xb
mpt0: Unknown async event: 0xb
mpt0: Unknown async event: 0xb
mpt0: Unknown async event: 0xb
mpt0: Unknown async event: 0xb
mpt0: Unknown async event: 0xb
[machine locked]
-- the machine has 512 MB RAM and unpacks the tarballs very quickly to
buffer cache. When the cache gets flushed, the LSI20320 jams quickly.
Unfortunately, we've got four machines equipped with those controllers.
If it helps, I could run test kernels, or provide a remote login /
serial console.
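As an aid to reading the two failing commands in the log above, here is a small sketch that decodes the CDB bytes. It assumes the standard 10-byte SCSI read/write CDB layout (opcode in byte 0, big-endian LBA in bytes 2-5, transfer length in bytes 7-8); parse_rw10 is a hypothetical helper name, not part of any NetBSD tool.

```python
def parse_rw10(cdb_hex):
    """Decode a 10-byte SCSI READ(10)/WRITE(10) CDB given as hex text.

    Returns (opcode, lba, blocks).
    """
    cdb = bytes(int(b, 16) for b in cdb_hex.replace("0x", "").split())
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcode = cdb[0]                          # 0x2a is WRITE(10)
    lba = int.from_bytes(cdb[2:6], "big")    # logical block address
    blocks = int.from_bytes(cdb[7:9], "big") # transfer length in blocks
    return opcode, lba, blocks


# The two CDBs reported with "Check Condition" in the kernel messages.
for text in ("0x2a 00 00 2a e4 0f 00 00 10 00",
             "0x2a 00 00 2a e4 1f 00 00 20 00"):
    opcode, lba, blocks = parse_rw10(text)
    print(f"opcode=0x{opcode:02x} lba=0x{lba:06x} blocks={blocks}")
```

Read this way, both commands are WRITE(10) requests of 16 and 32 blocks, and the second starts exactly where the first ends, so the failures hit back-to-back writes during the cache flush.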
hauke
--
/~\ The ASCII Ribbon Campaign Hauke Fath
\ / No HTML/RTF in email Institut für Nachrichtentechnik
X No Word docs in email TU Darmstadt
/ \ Respect for open standards Ruf +49-6151-16-3281
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/26825: mpt driver hangs during 2.0 install
Date: Thu, 6 Jul 2006 11:06:26 +0200
On 01.09.2004 at 17:49 +0200, Hauke Fath wrote:
>
> Attempting a 2.0beta install on an i386 19" server with a LSI
> Logic LSI20320-R RAID 1 (firmware version '1030F00') stalls
> after unpacking a few tarballs;
> at most, it gets to running MAKEDEV.
>
> /kern/msgbuf has a load of entries like (hand-copied):
[...]
From fresh, bad experience, I can confirm that the problem is still
present in netbsd-3 and -current (a 2006-07-05 build). I tried
netbsd-3 with the patch at the end of
http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=30531, and it
didn't help any.
hauke
--
/~\ The ASCII Ribbon Campaign Hauke Fath
\ / No HTML/RTF in email Institut für Nachrichtentechnik
X No Word docs in email TU Darmstadt
/ \ Respect for open standards Ruf +49-6151-16-3281
Responsible-Changed-From-To: kern-bug-people->tron
Responsible-Changed-By: tron@netbsd.org
Responsible-Changed-When: Thu, 03 Jan 2008 20:35:10 +0000
Responsible-Changed-Why:
I might have fixed that by committing the patch from PR kern/30531.
State-Changed-From-To: open->feedback
State-Changed-By: tron@netbsd.org
State-Changed-When: Thu, 03 Jan 2008 20:35:10 +0000
State-Changed-Why:
Can you please test whether this is fixed in revision 1.13 of
"sys/dev/pci/mpt_pci.c"?
From: Hauke Fath <hf@spg.tu-darmstadt.de>
To: gnats-bugs@NetBSD.org
Cc: tron@NetBSD.org, kern-bug-people@NetBSD.org,
gnats-admin@NetBSD.org
Subject: Re: kern/26825 (mpt driver hangs during 2.0 install)
Date: Fri, 4 Jan 2008 10:37:07 +0100
At 20:35 +0000 on 03.01.2008, tron@NetBSD.org wrote:
>Synopsis: mpt driver hangs during 2.0 install
>
>Responsible-Changed-From-To: kern-bug-people->tron
>Responsible-Changed-By: tron@netbsd.org
>Responsible-Changed-When: Thu, 03 Jan 2008 20:35:10 +0000
>Responsible-Changed-Why:
>I might have fixed that by committing the patch from PR kern/30531.
>
>
>State-Changed-From-To: open->feedback
>State-Changed-By: tron@netbsd.org
>State-Changed-When: Thu, 03 Jan 2008 20:35:10 +0000
>State-Changed-Why:
>Can you please test whether this is fixed in revision 1.13 of
>"sys/dev/pci/mpt_pci.c"?
Thanks for looking into the problem.
As noted in the PR, I have been running the machine in question with
the patch from kern/30531 since mid-2006. It made no difference for
either netbsd-3 or netbsd-4 -- right now, the machine runs a 4rc1
kernel. I can update the installation by untarring the base, non-X11
tarballs and avoiding bursts by mounting without softdep. Anything
more data-intensive than that clogs up the controller.
As to thorpej's hint in PR30531, the machine in question has plain
PCI 33 MHz slots, no PCIX. In a newer PCIX-equipped board, the
controller would not work at all in RAID1 mode (the machine in
question runs Debian 3.x fine). I may be able to free a machine with
a newer PCIX board for tests in a few weeks, though.
hauke
--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-3281
Responsible-Changed-From-To: tron->kern-bug-people
Responsible-Changed-By: tron@netbsd.org
Responsible-Changed-When: Fri, 04 Jan 2008 21:32:34 +0000
Responsible-Changed-Why:
I don't have mpt(4) hardware to test this.
State-Changed-From-To: feedback->open
State-Changed-By: tron@netbsd.org
State-Changed-When: Fri, 04 Jan 2008 21:32:34 +0000
State-Changed-Why:
Feedback was provided, needs more work.
>Unformatted: