NetBSD Problem Report #36690

From martin@duskware.de  Wed Jul 25 07:06:33 2007
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id 45EF363B874
	for <gnats-bugs@gnats.netbsd.org>; Wed, 25 Jul 2007 07:06:33 +0000 (UTC)
Message-Id: <20070725053056.E89B163B874@narn.NetBSD.org>
Date: Wed, 25 Jul 2007 05:30:56 +0000 (UTC)
From: permezel@mac.com
Reply-To: permezel@mac.com
To: netbsd-bugs-owner@NetBSD.org
Subject: KASSERT(delta > 0) in kern_physio
X-Send-Pr-Version: www-1.0

>Number:         36690
>Category:       kern
>Synopsis:       KASSERT(delta > 0) in kern_physio
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    bouyer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jul 25 07:10:01 +0000 2007
>Closed-Date:    Sun Sep 21 20:35:41 +0000 2008
>Last-Modified:  Mon Sep 22 00:50:07 +0000 2008
>Originator:     Damon Permezel
>Release:        4.0 beta 2 from June 26
>Organization:
>Environment:
NetBSD zardoz.damon.com 4.0_BETA2 NetBSD 4.0_BETA2 (ZARDOZ) #0: Tue Jun 26 15:37:57 EST 2007  dap@zardoz.damon.com:/home/dap/proj/3.1/obj/sys/arch/i386/compile/ZARDOZ i386

>Description:
running: dd bs=32k </dev/nrst0 >0

After panic/reboot, I did: mt rew; dd bs=32k </dev/nrst0 >0 count=1
and she immediately wedged tight requiring fresh electrons.

I have found in the past that tape support was somewhat touch and go.  If I try to use the same tape drive to erase a tape (dd </dev/zero bs=128k >/dev/nrst0), quite likely I will get errors at end of media and never be able to recover the tape drive unless I reboot with fresh electrons.  Just reboot with the same old electrons and she's still hosed, which might have something to do with the hang-after-panic I report in the second paragraph, which is why I am rambling on so.

No locals.
#1  0xc0292941 in panic (fmt=0x0)
    at /home/dap/proj/3.1/src/sys/kern/subr_prf.c:246
        bootopt = 256
        ap = 0xcba629d8 "\224-<¿c\202>¿ÿ\201>¿?\001"
        intrace = 0
#2  0xc038abac in __assert (t=0xc03c2d94 "diagnostic ", 
    f=0xc03e81d8 "/home/dap/proj/3.1/src/sys/kern/kern_physio.c", l=441, 
    e=0xc03e8263 "delta > 0")
    at /home/dap/proj/3.1/src/sys/lib/libkern/__assert.c:45
No locals.
#3  0xc0274b7e in physio (strategy=0xc030b49b <ststrategy>, obp=0x0, dev=3585, 
    flags=1048576, min_phys=0xc018e47f <ahc_minphys>, uio=0xcba62b90)
    at /home/dap/proj/3.1/src/sys/kern/kern_physio.c:445
        iovp = (struct iovec *) 0xcba62bb4
        l = (struct lwp *) 0xcd2bc010
        p = (struct proc *) 0xcd4df034
        i = 1
        s = <value optimized out>
        error = 0
        error2 = <value optimized out>
        bp = (struct buf *) 0x0
        mbp = (struct buf *) 0xc21bd70c
        concurrency = 15
#4  0xc030a59c in stread (dev=3585, uio=0xcba62b90, iomode=0)
    at /home/dap/proj/3.1/src/sys/dev/scsipi/st.c:1375
No locals.
#5  0xc02c6aeb in spec_read (v=0xcba62b08)
    at /home/dap/proj/3.1/src/sys/miscfs/specfs/spec_vnops.c:294
        vp = (struct vnode *) 0xcff482d0
        uio = (struct uio *) 0xcba62b90
        l = (struct lwp *) 0xcd2bc010
        bp = <value optimized out>
        bdev = <value optimized out>
        cdev = (const struct cdevsw *) 0x0
        bsize = <value optimized out>
        bscale = <value optimized out>
        dpart = {disklab = 0xc03f0689, part = 0x135}
        n = <value optimized out>
        on = <value optimized out>
        error = <value optimized out>
#6  0xc02c0c03 in VOP_READ (vp=0xcff482d0, uio=0xcba62b90, ioflag=0, 
    cred=0xcd1abc24) at /home/dap/proj/3.1/src/sys/kern/vnode_if.c:424
        a = {a_desc = 0xc03a3d60, a_vp = 0xcff482d0, a_uio = 0xcba62b90, 
  a_ioflag = 0, a_cred = 0xcd1abc24}
#7  0xc02bec14 in vn_read (fp=0xcd1f6114, offset=0xcd1f6140, uio=0xcba62b90, 
    cred=0xcd1abc24, flags=1)
    at /home/dap/proj/3.1/src/sys/kern/vfs_vnops.c:448
        vp = (struct vnode *) 0xcff482d0
        error = <value optimized out>
        ioflag = 0
#8  0xc0297198 in dofileread (l=0xcd2bc010, fd=0, fp=0xcd1f6114, 
    buf=0x804f000, nbyte=32768, offset=0xcd1f6140, flags=1, retval=0xcba62c68)
    at /home/dap/proj/3.1/src/sys/kern/sys_generic.c:153
        aiov = {iov_base = 0x8057000, iov_len = 0}
        auio = {uio_iov = 0xcba62bb4, uio_iovcnt = 1, uio_offset = 32768, 
  uio_resid = 0, uio_rw = UIO_READ, uio_vmspace = 0xcd8f6150}
        p = (struct proc *) 0xcd4df034
        vm = (struct vmspace *) 0xcd8f6150
        cnt = <value optimized out>
        error = 0
        ktriov = {iov_base = 0x0, iov_len = 0}
#9  0xc02972fe in sys_read (l=0xcd2bc010, v=0xcba62c48, retval=0xcba62c68)
    at /home/dap/proj/3.1/src/sys/kern/sys_generic.c:103
        fd = 0
        fp = (struct file *) 0xcd1f6114
        p = <value optimized out>

(gdb) print *mbp
$2 = {b_u = {u_actq = {tqe_next = 0xdeadbeef, tqe_prev = 0xc21bdc40}, 
    u_work = {wk_entry = {sqe_next = 0xdeadbeef}}}, b_interlock = {
    lock_data = 0x0, 
    lock_file = 0xc03e9d07 "/home/dap/proj/3.1/src/sys/kern/kern_synch.c", 
    unlock_file = 0xc03e81d8 "/home/dap/proj/3.1/src/sys/kern/kern_physio.c", 
    lock_line = 0x27f, unlock_line = 0x1b3, list = {tqe_next = 0x0, 
      tqe_prev = 0x0}, lock_holder = 0xffffffff}, b_flags = 0x810, 
  b_error = 0x5, b_prio = 0x1, b_bufsize = 0xdeadbeef, b_bcount = 0xdeadbeef, 
  b_resid = 0xdeadbeef, b_dev = 0xffffffff, b_un = {
    b_addr = 0xdeadbeef <Address 0xdeadbeef out of bounds>}, 
  b_blkno = 0xdeadbeefdeadbeef, b_rawblkno = 0xdeadbeefdeadbeef, 
  b_iodone = 0xdeadbeef, b_proc = 0xdeadbeef, b_vp = 0xdeadbeef, b_dep = {
    lh_first = 0x0}, b_saveaddr = 0xdeadbeef, b_fspriv = {
    bf_private = 0xdeadbeef, bf_dcookie = 0xdeadbeefdeadbeef}, b_hash = {
    le_next = 0xdeadbeef, le_prev = 0xdeadbeef}, b_vnbufs = {
    le_next = 0xdeadbeef, le_prev = 0xdeadbeef}, b_freelist = {
    tqe_next = 0xdeadbeef, tqe_prev = 0xdeadbeef}, b_lblkno = 0x10000, 
  b_freelistindex = 0x0}
(gdb) print *uio
$3 = {uio_iov = 0xcba62bb4, uio_iovcnt = 0x1, uio_offset = 0x8000, 
  uio_resid = 0x0, uio_rw = UIO_READ, uio_vmspace = 0xcd8f6150}

(gdb) p *iovp
$4 = {iov_base = 0x8057000, iov_len = 0x0}


So uio_offset = 0x8000 and b_endoffset == b_lblkno == 0x10000

                delta = uio->uio_offset - mbp->b_endoffset;
                KASSERT(delta > 0);

delta = 0x8000 - 0x10000 = -0x8000, so the assertion fails.
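
The arithmetic can be checked outside the kernel. The following is a minimal sketch, not the kernel source: the function names physio_delta and delta_assert_holds are hypothetical, and plain int64_t stands in for the kernel's offset type. It only repeats the subtraction quoted above with the values printed by gdb.

```c
#include <stdint.h>

/*
 * Illustrative model only (hypothetical names): the subtraction that
 * feeds KASSERT(delta > 0) in the kern_physio.c excerpt.
 */
int64_t
physio_delta(int64_t uio_offset, int64_t b_endoffset)
{
	return uio_offset - b_endoffset;
}

/* Returns 1 when KASSERT(delta > 0) would hold, 0 when it would fire. */
int
delta_assert_holds(int64_t uio_offset, int64_t b_endoffset)
{
	return physio_delta(uio_offset, b_endoffset) > 0;
}
```

With uio_offset = 0x8000 and b_endoffset = 0x10000, physio_delta() yields -0x8000 and delta_assert_holds() returns 0, which matches the panic.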




>How-To-Repeat:
dd detape.

>Fix:
Don't use tapes?

>Release-Note:

>Audit-Trail:
From: Paul Ripke <stix@stix.id.au>
To: NetBSD gnats-bugs <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: kern/36690: KASSERT(delta > 0) in kern_physio
Date: Sat, 29 Sep 2007 23:42:39 +1000

 This appears to be due to instances like the following:

 st0: 65536-byte tape record too big for 32768-byte user buffer
 st0(ahc0:0:6:0):  Check Condition on CDB: 0x08 00 00 80 00 00
     SENSE KEY:  No Additional Sense
                 Incorrect Length Indicator Set
    INFO FIELD:  -32768
      ASC/ASCQ:  No Additional Sense Information
 panic: kernel diagnostic assertion "delta > 0" failed: file "/l/netbsd/netbsd-4/src/sys/kern/kern_physio.c", line 441

 100% reproducible on head of netbsd-4 branch.

 -- 
 Paul Ripke
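
One way the sense data above could lead to the failed assertion is sketched below. This is an assumed model, not code taken from the kernel: it supposes that the completion path counts the bytes transferred as b_bcount - b_resid and that the negative INFO FIELD (-32768) was stored unchanged as the residual. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Assumed model (hypothetical names): bytes transferred are taken to be
 * the request size minus the residual reported by the driver.
 */
int64_t
bytes_done(int64_t b_bcount, int64_t b_resid)
{
	return b_bcount - b_resid;
}

/* Offset at which the transfer is believed to have stopped. */
int64_t
end_offset(int64_t start, int64_t b_bcount, int64_t b_resid)
{
	return start + bytes_done(b_bcount, b_resid);
}
```

Under these assumptions a 32768-byte read at offset 0 with a residual of -32768 appears to have moved 65536 bytes, so end_offset() returns 0x10000. That is the b_lblkno value in the submitter's gdb output, and it lies beyond the uio_offset of 0x8000.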

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
	netbsd-bugs@NetBSD.org, permezel@mac.com
Subject: Re: kern/36690: KASSERT(delta > 0) in kern_physio
Date: Mon, 1 Oct 2007 20:44:13 +0200

 --ew6BAiZeqk4r7MaW
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 On Sat, Sep 29, 2007 at 01:45:02PM +0000, Paul Ripke wrote:
 > The following reply was made to PR kern/36690; it has been noted by GNATS.
 > 
 > From: Paul Ripke <stix@stix.id.au>
 > To: NetBSD gnats-bugs <gnats-bugs@NetBSD.org>
 > Cc: 
 > Subject: Re: kern/36690: KASSERT(delta > 0) in kern_physio
 > Date: Sat, 29 Sep 2007 23:42:39 +1000
 > 
 >  This appears to be due to instances like the following:
 >  
 >  st0: 65536-byte tape record too big for 32768-byte user buffer
 >  st0(ahc0:0:6:0):  Check Condition on CDB: 0x08 00 00 80 00 00
 >      SENSE KEY:  No Additional Sense
 >                  Incorrect Length Indicator Set
 >     INFO FIELD:  -32768
 >       ASC/ASCQ:  No Additional Sense Information
 >  panic: kernel diagnostic assertion "delta > 0" failed: file "/l/netbsd/netbsd-4/src/sys/kern/kern_physio.c", line 441
 >  
 >  100% reproducible on head of netbsd-4 branch.

 Should be fixed in st.c 1.200. Can you please try the attached patch ?

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

 --ew6BAiZeqk4r7MaW
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=diff

 Index: st.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/scsipi/st.c,v
 retrieving revision 1.198
 retrieving revision 1.200
 diff -u -p -u -r1.198 -r1.200
 --- st.c	29 Jul 2007 12:50:23 -0000	1.198
 +++ st.c	1 Oct 2007 18:43:30 -0000	1.200
 @@ -1,4 +1,4 @@
 -/*	$NetBSD: st.c,v 1.198 2007/07/29 12:50:23 ad Exp $ */
 +/*	$NetBSD: st.c,v 1.200 2007/10/01 18:43:30 bouyer Exp $ */

  /*-
   * Copyright (c) 1998, 2004 The NetBSD Foundation, Inc.
 @@ -57,7 +57,7 @@
   */

  #include <sys/cdefs.h>
 -__KERNEL_RCSID(0, "$NetBSD: st.c,v 1.198 2007/07/29 12:50:23 ad Exp $");
 +__KERNEL_RCSID(0, "$NetBSD: st.c,v 1.200 2007/10/01 18:43:30 bouyer Exp $");

  #include "opt_scsi.h"

 @@ -1219,6 +1219,7 @@ ststart(struct scsipi_periph *periph)
  					if (st_space(st, 0, SP_FILEMARKS, 0)) {
  						BUFQ_GET(st->buf_queue);
  						bp->b_error = EIO;
 +						bp->b_resid = bp->b_bcount;
  						biodone(bp);
  						continue;
  					}
 @@ -2234,8 +2235,16 @@ st_interpret_sense(struct scsipi_xfer *x
  				}
  			}
  		}
 -		if (bp)
 +		if (bp) {
  			bp->b_resid = info;
 +			/*
 +			 * buggy device ? A SDLT320 can report an info
 +			 * field of 0x3de8000 on a Media Error/Write Error
 +			 * for this CBD: 0x0a 00 00 80 00 00
 +			 */
 +			if (bp->b_resid > bp->b_bcount || bp->b_resid < 0)
 +				bp->b_resid = bp->b_bcount;
 +		}
  	}

  #ifndef SCSIPI_DEBUG

 --ew6BAiZeqk4r7MaW--
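
The range check added to st_interpret_sense() in the patch above can be exercised on its own. The sketch below is a standalone restatement under assumptions: clamp_resid is a hypothetical helper, and int64_t stands in for the kernel's field types. It is not a function that exists in st.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the check in the patch: a residual that
 * is negative or larger than the request is replaced by the full request
 * size, meaning "nothing was transferred".
 */
int64_t
clamp_resid(int64_t resid, int64_t bcount)
{
	if (resid > bcount || resid < 0)
		return bcount;
	return resid;
}
```

For the 32768-byte read in this report, clamp_resid(-32768, 32768) returns 32768, and so does the 0x3de8000 value mentioned in the patch comment. A plausible residual such as 4096 passes through unchanged. If the bytes transferred are computed as b_bcount - b_resid (an assumption about the caller), the clamped case comes out as zero bytes rather than a count larger than the request.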

Responsible-Changed-From-To: kern-bug-people->bouyer
Responsible-Changed-By: bouyer@netbsd.org
Responsible-Changed-When: Mon, 01 Oct 2007 18:48:17 +0000
Responsible-Changed-Why:
I did see this panic and committed a possible fix.


State-Changed-From-To: open->feedback
State-Changed-By: bouyer@netbsd.org
State-Changed-When: Mon, 01 Oct 2007 18:48:17 +0000
State-Changed-Why:
I sent a patch.


From: Paul Ripke <stix@stix.id.au>
To: NetBSD gnats-bugs <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: kern/36690: KASSERT(delta > 0) in kern_physio
Date: Fri, 19 Oct 2007 18:31:34 +1000

 Rolled diffs from:
 cvs diff -r 1.198 -r 1.201 st.c
 into a NetBSD 4.0_RC3 kernel, and tests out fine for the above
 blocksize issue.

 Can I suggest this be pulled up to netbsd-4?

 Thanks!
 -- 
 Paul Ripke

From: Damon Permezel <permezel@mac.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: re: kern/36690
Date: Tue, 20 Nov 2007 17:12:03 +1000

 Sorry, I no longer run NetBSD on that machine so I cannot test this.
 I went back to FreeBSD and now have no problems whatsoever with the  
 tape drive.

 Cheers,
 Damon

From: Damon Permezel <permezel@mac.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: re: kern/36690
Date: Thu, 20 Dec 2007 15:18:53 +1000

 sorry.  no longer running NetBSD on that box (or any boxen).  gave up  
 and went back to FreeBSD and now experience no problems whatsoever  
 with the tape drive, including with some tapes that hung the box with  
 NetBSD so badly that I required a power cycle to get the thing working  
 again.

 Sorry I cannot help any further.

State-Changed-From-To: feedback->open
State-Changed-By: pavel@netbsd.org
State-Changed-When: Thu, 20 Dec 2007 10:50:16 +0000
State-Changed-Why:
submitter can't test anymore.


State-Changed-From-To: open->closed
State-Changed-By: bouyer@NetBSD.org
State-Changed-When: Sun, 21 Sep 2008 20:35:41 +0000
State-Changed-Why:
Likely fixed with st.c 1.194.2.1. 


From: Damon Permezel <permezel@mac.com>
To: gnats-bugs@NetBSD.org
Cc: bouyer@NetBSD.org
Subject: Re: kern/36690 (KASSERT(delta > 0) in kern_physio)
Date: Mon, 22 Sep 2008 09:26:45 +1000

 Really like to test this for y'all.  Unfortunately, got a job, so all  
 time has evaporated.  Machine in question has some nasty loonix crap  
 on it so it is sounding very tempting to try and get some minutes and  
 re-install a sane OS on it.

 Gotta run to work now, though.... i are a loonix kernel hacker.   
 kprint just runs off my fingers as if infinite hordes of monkeys are  
 at work --- not.


 On 2008-Sep-22, at 6:35 AM, bouyer@NetBSD.org wrote:

 > Synopsis: KASSERT(delta > 0) in kern_physio
 >
 > State-Changed-From-To: open->closed
 > State-Changed-By: bouyer@NetBSD.org
 > State-Changed-When: Sun, 21 Sep 2008 20:35:41 +0000
 > State-Changed-Why:
 > Likely fixed with st.c 1.194.2.1.
 >
 >
 >

>Unformatted:

Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.