NetBSD Problem Report #31477

From www@netbsd.org  Wed Oct  5 03:20:38 2005
Return-Path: <www@netbsd.org>
Received: by narn.netbsd.org (Postfix, from userid 31301)
	id DBD3163B850; Wed,  5 Oct 2005 03:20:37 +0000 (UTC)
Message-Id: <20051005032037.DBD3163B850@narn.netbsd.org>
Date: Wed,  5 Oct 2005 03:20:37 +0000 (UTC)
From: steven_grunza@ieee.org
Reply-To: steven_grunza@ieee.org
To: gnats-bugs@netbsd.org
Subject: Heavy disk IO to raid0 locks up system
X-Send-Pr-Version: www-1.0

>Number:         31477
>Category:       kern
>Synopsis:       Heavy disk IO to raid0 locks up system
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Oct 05 03:21:00 +0000 2005
>Closed-Date:    
>Last-Modified:  Sun Mar 20 06:46:11 +0000 2016
>Originator:     Steven Grunza
>Release:        1.6.2 of port-i386
>Organization:
>Environment:
NetBSD barc 1.6.2 NetBSD 1.6.2 (BARC) #0: Thu Nov 4 15:18:00 EST 2004 toor@barc:/usr/src/sys/arch/i386/compile/BARC i386
>Description:
I have a Promise Ultra100TX2/ATA controller on the PCI bus with two 74GB (ST380011A) drives.  Each drive is on its own bus as a master.  The motherboard (Dell XPS T500) has a DVD+RW drive as a secondary master and a 13GB IBM drive as a primary master.

Networking is with a 3C905C 10/100 Ethernet card running at 100 Mbps full-duplex.

The system is normally stable and is used as a file server.  If a client connects (Samba 3.0.14a) to a Samba share and transfers large amounts of data for several minutes, the file server (barc) stops responding to the client and to pings.  The console (VGA text mode with a PS/2 keyboard) also stops responding to the keyboard.  The only way to recover is to reboot using the PC's front-panel reset button.

The data transfers are to a file system on raid0 which is a RAID 1 (mirrored) set of disks.

I often see lines in dmesg about "stray interrupt 7".  I haven't found anything that suggests a connection between "stray interrupt 7" and the crashes.

Any help would be welcome.  I would consider moving to NetBSD 2.x or even NetBSD 3.x if that would bring some stability.  I would like to stick with RAID level 1 for better protection against data loss.
>How-To-Repeat:
Use FTP or Samba to transfer a large amount of data to the file server (a 4GB transfer in one case, installing Cygwin from the Samba share in another).
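
To check whether the hang needs the network path at all, a comparable write load can be generated locally on the file server. The following is a minimal sketch, not part of the original report: the function name write_load, the chunk size, the sync interval, and the example path /raid/loadtest.bin are all assumptions chosen for illustration. It writes zero-filled data sequentially and forces it to disk periodically so that the writes reach the RAID set rather than staying in the buffer cache.

```python
import os


def write_load(path, total_bytes, chunk_size=1 << 20, sync_every=64):
    """Write total_bytes of zeros to path and return the number of bytes written.

    Hypothetical helper for generating disk-only write load; the defaults
    (1 MiB chunks, fsync every 64 chunks) are arbitrary assumptions.
    """
    chunk = b"\0" * chunk_size
    written = 0
    chunks_since_sync = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            # The last chunk may be shorter than chunk_size.
            n = min(chunk_size, total_bytes - written)
            f.write(chunk[:n])
            written += n
            chunks_since_sync += 1
            if chunks_since_sync >= sync_every:
                # Push buffered data to the device so the disks stay busy.
                f.flush()
                os.fsync(f.fileno())
                chunks_since_sync = 0
        # Final sync so everything is on disk before returning.
        f.flush()
        os.fsync(f.fileno())
    return written

# Example call (path is hypothetical; point it at the file system on raid0):
#   write_load("/raid/loadtest.bin", 4 * 1024**3)
```

A 4 GiB call as in the comment would roughly match the size of the FTP/Samba transfer described above. If the system survives it, the network card or its interaction with the disk controller becomes the more likely suspect.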
>Fix:

>Release-Note:

>Audit-Trail:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: kern/31477: Heavy disk IO to raid0 locks up system
Date: Wed, 5 Oct 2005 21:53:47 +0200

 On Wed, Oct 05, 2005 at 03:21:00AM +0000, steven_grunza@ieee.org wrote:
 > >Description:
 > I have a Promise Ultra100TX2/ATA controller on the PCI bus with two 74GB (ST380011A) drives.  Each drive is on its own bus as a master.  The motherboard (Dell XPS T500) has a DVD+RW drive as a secondary master and a 13 GB IBM drive as a primary master.
 > 
 > Networking is with a 3C905C 10/100 Ethernet card running at 100 Mbps full-duplex.
 > 
 > The system is normally stable and is used as a file server.  If a client connects (samba 3.0.14a) to a samba share and transfers large amounts of data for several minutes then the file server (barc) stops responding to the client and doesn't respond to pings.  The console (vga text mode /w/ PS2 keyboard) also stops responding to the keyboard.  The only way to recover is to reboot using the PC's front panel reset button.
 > 
 > The data transfers are to a file system on raid0 which is a RAID 1 (mirrored) set of disks.
 > 
 > I often see lines in dmesg about "stray interrupt 7".  I haven't found anything that suggests a connection between "stray interrupt 7" and the crashes.

 Should not be related.
 For the record, I have several file servers using raidframe raid-1 volumes,
 and none shows this behavior. They are still running 1.6.2_STABLE. I use Intel
 gigabit network cards, and an HPT370 adapter (for the IDE-based ones).
 Can you reproduce the hang by doing only disk I/O, or only network I/O?
 Can you post the dmesg?

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --

State-Changed-From-To: open->feedback
State-Changed-By: prlw1@NetBSD.org
State-Changed-When: Fri, 11 Mar 2016 12:28:46 +0000
State-Changed-Why:
bouyer@ asked for info


State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 20 Mar 2016 06:46:11 +0000
State-Changed-Why:
Unfortunately, bouyer@ asked for info in 2005 and the submitter's mail
doesn't work any more.


>Unformatted:
