NetBSD Problem Report #36690
From martin@duskware.de Wed Jul 25 07:06:33 2007
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by narn.NetBSD.org (Postfix) with ESMTP id 45EF363B874
for <gnats-bugs@gnats.netbsd.org>; Wed, 25 Jul 2007 07:06:33 +0000 (UTC)
Message-Id: <20070725053056.E89B163B874@narn.NetBSD.org>
Date: Wed, 25 Jul 2007 05:30:56 +0000 (UTC)
From: permezel@mac.com
Reply-To: permezel@mac.com
To: netbsd-bugs-owner@NetBSD.org
Subject: KASSERT(delta > 0) in kern_physio
X-Send-Pr-Version: www-1.0
>Number: 36690
>Category: kern
>Synopsis: KASSERT(delta > 0) in kern_physio
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: bouyer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jul 25 07:10:01 +0000 2007
>Closed-Date: Sun Sep 21 20:35:41 +0000 2008
>Last-Modified: Mon Sep 22 00:50:07 +0000 2008
>Originator: Damon Permezel
>Release: 4.0 beta 2 from June 26
>Organization:
>Environment:
NetBSD zardoz.damon.com 4.0_BETA2 NetBSD 4.0_BETA2 (ZARDOZ) #0: Tue Jun 26 15:37:57 EST 2007 dap@zardoz.damon.com:/home/dap/proj/3.1/obj/sys/arch/i386/compile/ZARDOZ i386
>Description:
running: dd bs=32k </dev/nrst0 >0
After panic/reboot, I did: mt rew; dd bs=32k </dev/nrst0 >0 count=1
and she immediately wedged tight requiring fresh electrons.
I have found in the past that tape support was somewhat touch and go. If I try to use the same tape drive to erase a tape (dd </dev/zero bs=128k >/dev/nrst0), quite likely I will get errors on end of media and never be able to recover the tape drive unless I reboot with fresh electrons. Just reboot with the same old electrons and she's still hosed, which might have something to do with the hang-after-panic I report in the second paragraph, which is why I am rambling on so.
No locals.
#1 0xc0292941 in panic (fmt=0x0)
at /home/dap/proj/3.1/src/sys/kern/subr_prf.c:246
bootopt = 256
ap = 0xcba629d8 "\224-<¿c\202>¿ÿ\201>¿?\001"
intrace = 0
#2 0xc038abac in __assert (t=0xc03c2d94 "diagnostic ",
f=0xc03e81d8 "/home/dap/proj/3.1/src/sys/kern/kern_physio.c", l=441,
e=0xc03e8263 "delta > 0")
at /home/dap/proj/3.1/src/sys/lib/libkern/__assert.c:45
No locals.
#3 0xc0274b7e in physio (strategy=0xc030b49b <ststrategy>, obp=0x0, dev=3585,
flags=1048576, min_phys=0xc018e47f <ahc_minphys>, uio=0xcba62b90)
at /home/dap/proj/3.1/src/sys/kern/kern_physio.c:445
iovp = (struct iovec *) 0xcba62bb4
l = (struct lwp *) 0xcd2bc010
p = (struct proc *) 0xcd4df034
i = 1
s = <value optimized out>
error = 0
error2 = <value optimized out>
bp = (struct buf *) 0x0
mbp = (struct buf *) 0xc21bd70c
concurrency = 15
#4 0xc030a59c in stread (dev=3585, uio=0xcba62b90, iomode=0)
at /home/dap/proj/3.1/src/sys/dev/scsipi/st.c:1375
No locals.
#5 0xc02c6aeb in spec_read (v=0xcba62b08)
at /home/dap/proj/3.1/src/sys/miscfs/specfs/spec_vnops.c:294
vp = (struct vnode *) 0xcff482d0
uio = (struct uio *) 0xcba62b90
l = (struct lwp *) 0xcd2bc010
bp = <value optimized out>
bdev = <value optimized out>
cdev = (const struct cdevsw *) 0x0
bsize = <value optimized out>
bscale = <value optimized out>
dpart = {disklab = 0xc03f0689, part = 0x135}
n = <value optimized out>
on = <value optimized out>
error = <value optimized out>
#6 0xc02c0c03 in VOP_READ (vp=0xcff482d0, uio=0xcba62b90, ioflag=0,
cred=0xcd1abc24) at /home/dap/proj/3.1/src/sys/kern/vnode_if.c:424
a = {a_desc = 0xc03a3d60, a_vp = 0xcff482d0, a_uio = 0xcba62b90,
a_ioflag = 0, a_cred = 0xcd1abc24}
#7 0xc02bec14 in vn_read (fp=0xcd1f6114, offset=0xcd1f6140, uio=0xcba62b90,
cred=0xcd1abc24, flags=1)
at /home/dap/proj/3.1/src/sys/kern/vfs_vnops.c:448
vp = (struct vnode *) 0xcff482d0
error = <value optimized out>
ioflag = 0
#8 0xc0297198 in dofileread (l=0xcd2bc010, fd=0, fp=0xcd1f6114,
buf=0x804f000, nbyte=32768, offset=0xcd1f6140, flags=1, retval=0xcba62c68)
at /home/dap/proj/3.1/src/sys/kern/sys_generic.c:153
aiov = {iov_base = 0x8057000, iov_len = 0}
auio = {uio_iov = 0xcba62bb4, uio_iovcnt = 1, uio_offset = 32768,
uio_resid = 0, uio_rw = UIO_READ, uio_vmspace = 0xcd8f6150}
p = (struct proc *) 0xcd4df034
vm = (struct vmspace *) 0xcd8f6150
cnt = <value optimized out>
error = 0
ktriov = {iov_base = 0x0, iov_len = 0}
#9 0xc02972fe in sys_read (l=0xcd2bc010, v=0xcba62c48, retval=0xcba62c68)
at /home/dap/proj/3.1/src/sys/kern/sys_generic.c:103
fd = 0
fp = (struct file *) 0xcd1f6114
p = <value optimized out>
(gdb) print *mbp
$2 = {b_u = {u_actq = {tqe_next = 0xdeadbeef, tqe_prev = 0xc21bdc40},
u_work = {wk_entry = {sqe_next = 0xdeadbeef}}}, b_interlock = {
lock_data = 0x0,
lock_file = 0xc03e9d07 "/home/dap/proj/3.1/src/sys/kern/kern_synch.c",
unlock_file = 0xc03e81d8 "/home/dap/proj/3.1/src/sys/kern/kern_physio.c",
lock_line = 0x27f, unlock_line = 0x1b3, list = {tqe_next = 0x0,
tqe_prev = 0x0}, lock_holder = 0xffffffff}, b_flags = 0x810,
b_error = 0x5, b_prio = 0x1, b_bufsize = 0xdeadbeef, b_bcount = 0xdeadbeef,
b_resid = 0xdeadbeef, b_dev = 0xffffffff, b_un = {
b_addr = 0xdeadbeef <Address 0xdeadbeef out of bounds>},
b_blkno = 0xdeadbeefdeadbeef, b_rawblkno = 0xdeadbeefdeadbeef,
b_iodone = 0xdeadbeef, b_proc = 0xdeadbeef, b_vp = 0xdeadbeef, b_dep = {
lh_first = 0x0}, b_saveaddr = 0xdeadbeef, b_fspriv = {
bf_private = 0xdeadbeef, bf_dcookie = 0xdeadbeefdeadbeef}, b_hash = {
le_next = 0xdeadbeef, le_prev = 0xdeadbeef}, b_vnbufs = {
le_next = 0xdeadbeef, le_prev = 0xdeadbeef}, b_freelist = {
tqe_next = 0xdeadbeef, tqe_prev = 0xdeadbeef}, b_lblkno = 0x10000,
b_freelistindex = 0x0}
(gdb) print *uio
$3 = {uio_iov = 0xcba62bb4, uio_iovcnt = 0x1, uio_offset = 0x8000,
uio_resid = 0x0, uio_rw = UIO_READ, uio_vmspace = 0xcd8f6150}
(gdb) p *iovp
$4 = {iov_base = 0x8057000, iov_len = 0x0}
So uio_offset = 0x8000 and b_endoffset == b_lblkno == 0x10000
delta = uio->uio_offset - mbp->b_endoffset;
KASSERT(delta > 0);
delta = 0x8000 - 0x10000 = -0x8000, which is negative, so the assertion fires.
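The arithmetic can be checked in isolation. The following is a minimal sketch, not kernel code: physio_delta and delta_assertion_holds are hypothetical helpers, and the values to feed them are the uio_offset and b_endoffset figures from the gdb output above.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of kern_physio.c): the same subtraction
 * that feeds KASSERT(delta > 0), done on signed 64-bit offsets.
 */
static int64_t
physio_delta(int64_t uio_offset, int64_t b_endoffset)
{
	return uio_offset - b_endoffset;
}

/* Returns nonzero when the assertion from the report would hold. */
static int
delta_assertion_holds(int64_t uio_offset, int64_t b_endoffset)
{
	return physio_delta(uio_offset, b_endoffset) > 0;
}
```

With the values from the dump, physio_delta(0x8000, 0x10000) evaluates to -0x8000 (-32768), so delta_assertion_holds() returns 0.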
>How-To-Repeat:
dd the tape (see the commands in the description).
>Fix:
Don't use tapes?
>Release-Note:
>Audit-Trail:
From: Paul Ripke <stix@stix.id.au>
To: NetBSD gnats-bugs <gnats-bugs@NetBSD.org>
Cc:
Subject: Re: kern/36690: KASSERT(delta > 0) in kern_physio
Date: Sat, 29 Sep 2007 23:42:39 +1000
This appears to be due to instances like the following:
st0: 65536-byte tape record too big for 32768-byte user buffer
st0(ahc0:0:6:0): Check Condition on CDB: 0x08 00 00 80 00 00
SENSE KEY: No Additional Sense
Incorrect Length Indicator Set
INFO FIELD: -32768
ASC/ASCQ: No Additional Sense Information
panic: kernel diagnostic assertion "delta > 0" failed: file "/l/netbsd/netbsd-4/src/sys/kern/kern_physio.c", line 441
100% reproducible on head of netbsd-4 branch.
--
Paul Ripke
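One possible reading of these messages, sketched under stated assumptions: the INFO FIELD is the requested length minus the actual record length (32768 - 65536 = -32768), and if that negative value ends up as the buffer residual, a transfer count computed as "count minus residual" comes out at 65536, which would match the 0x10000 end offset in the original report. The helper names below are hypothetical, and both formulas are assumptions about how the residual is produced and consumed, not a copy of the kernel code.

```c
#include <stdint.h>

/* Assumed: residual reported by the drive = requested - actual record size. */
static int32_t
sense_info(int32_t requested, int32_t record_len)
{
	return requested - record_len;
}

/* Assumed: bytes considered transferred = byte count - residual. */
static int64_t
bytes_done(int32_t bcount, int32_t resid)
{
	return (int64_t)bcount - (int64_t)resid;
}
```

For a 32768-byte read of a 65536-byte record, sense_info() gives -32768 and bytes_done() gives 65536, that is, more bytes "done" than were ever requested.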
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
netbsd-bugs@NetBSD.org, permezel@mac.com
Subject: Re: kern/36690: KASSERT(delta > 0) in kern_physio
Date: Mon, 1 Oct 2007 20:44:13 +0200
--ew6BAiZeqk4r7MaW
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Sat, Sep 29, 2007 at 01:45:02PM +0000, Paul Ripke wrote:
> The following reply was made to PR kern/36690; it has been noted by GNATS.
>
> From: Paul Ripke <stix@stix.id.au>
> To: NetBSD gnats-bugs <gnats-bugs@NetBSD.org>
> Cc:
> Subject: Re: kern/36690: KASSERT(delta > 0) in kern_physio
> Date: Sat, 29 Sep 2007 23:42:39 +1000
>
> This appears to be due to instances like the following:
>
> st0: 65536-byte tape record too big for 32768-byte user buffer
> st0(ahc0:0:6:0): Check Condition on CDB: 0x08 00 00 80 00 00
> SENSE KEY: No Additional Sense
> Incorrect Length Indicator Set
> INFO FIELD: -32768
> ASC/ASCQ: No Additional Sense Information
> panic: kernel diagnostic assertion "delta > 0" failed: file "/l/netbsd/netbsd-4/src/sys/kern/kern_physio.c", line 441
>
> 100% reproducible on head of netbsd-4 branch.
Should be fixed in st.c 1.200. Can you please try the attached patch ?
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--ew6BAiZeqk4r7MaW
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff
Index: st.c
===================================================================
RCS file: /cvsroot/src/sys/dev/scsipi/st.c,v
retrieving revision 1.198
retrieving revision 1.200
diff -u -p -u -r1.198 -r1.200
--- st.c 29 Jul 2007 12:50:23 -0000 1.198
+++ st.c 1 Oct 2007 18:43:30 -0000 1.200
@@ -1,4 +1,4 @@
-/* $NetBSD: st.c,v 1.198 2007/07/29 12:50:23 ad Exp $ */
+/* $NetBSD: st.c,v 1.200 2007/10/01 18:43:30 bouyer Exp $ */
/*-
* Copyright (c) 1998, 2004 The NetBSD Foundation, Inc.
@@ -57,7 +57,7 @@
*/
#include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: st.c,v 1.198 2007/07/29 12:50:23 ad Exp $");
+__KERNEL_RCSID(0, "$NetBSD: st.c,v 1.200 2007/10/01 18:43:30 bouyer Exp $");
#include "opt_scsi.h"
@@ -1219,6 +1219,7 @@ ststart(struct scsipi_periph *periph)
if (st_space(st, 0, SP_FILEMARKS, 0)) {
BUFQ_GET(st->buf_queue);
bp->b_error = EIO;
+ bp->b_resid = bp->b_bcount;
biodone(bp);
continue;
}
@@ -2234,8 +2235,16 @@ st_interpret_sense(struct scsipi_xfer *x
}
}
}
- if (bp)
+ if (bp) {
bp->b_resid = info;
+ /*
+ * buggy device ? A SDLT320 can report an info
+ * field of 0x3de8000 on a Media Error/Write Error
+ * for this CBD: 0x0a 00 00 80 00 00
+ */
+ if (bp->b_resid > bp->b_bcount || bp->b_resid < 0)
+ bp->b_resid = bp->b_bcount;
+ }
}
#ifndef SCSIPI_DEBUG
--ew6BAiZeqk4r7MaW--
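The clamp added to st_interpret_sense() can be exercised on its own. The sketch below uses a hypothetical function, clamp_resid, with plain integers standing in for the struct buf fields; it mirrors the condition in the patch and is not the driver code itself.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the patched logic: "info" plays the role of
 * the sense INFO FIELD, "bcount" the role of bp->b_bcount. Any residual
 * outside [0, bcount] is treated as "nothing transferred".
 */
static int32_t
clamp_resid(int32_t info, int32_t bcount)
{
	int32_t resid = info;

	if (resid > bcount || resid < 0)
		resid = bcount;
	return resid;
}
```

With a 32768-byte request, both the -32768 reported in this PR and the 0x3de8000 mentioned in the patch comment are clamped to 32768, while an in-range residual such as 1024 passes through unchanged.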
Responsible-Changed-From-To: kern-bug-people->bouyer
Responsible-Changed-By: bouyer@netbsd.org
Responsible-Changed-When: Mon, 01 Oct 2007 18:48:17 +0000
Responsible-Changed-Why:
I did see this panic and committed a possible fix.
State-Changed-From-To: open->feedback
State-Changed-By: bouyer@netbsd.org
State-Changed-When: Mon, 01 Oct 2007 18:48:17 +0000
State-Changed-Why:
I sent a patch.
From: Paul Ripke <stix@stix.id.au>
To: NetBSD gnats-bugs <gnats-bugs@NetBSD.org>
Cc:
Subject: Re: kern/36690: KASSERT(delta > 0) in kern_physio
Date: Fri, 19 Oct 2007 18:31:34 +1000
Rolled diffs from:
cvs diff -r 1.198 -r 1.201 st.c
into a NetBSD 4.0_RC3 kernel, and tests out fine for the above
blocksize issue.
Can I suggest this be pulled up to netbsd-4?
Thanks!
--
Paul Ripke
From: Damon Permezel <permezel@mac.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: re: kern/36690
Date: Tue, 20 Nov 2007 17:12:03 +1000
Sorry, I no longer run NetBSD on that machine so I cannot test this.
I went back to FreeBSD and now have no problems whatsoever with the
tape drive.
Cheers,
Damon
From: Damon Permezel <permezel@mac.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: re: kern/36690
Date: Thu, 20 Dec 2007 15:18:53 +1000
sorry. no longer running NetBSD on that box (or any boxen). gave up
and went back to FreeBSD and now experience no problems whatsoever
with the tape drive, including with some tapes that hung the box with
NetBSD so badly that I required a power cycle to get the thing working
again.
Sorry I cannot help any further.
State-Changed-From-To: feedback->open
State-Changed-By: pavel@netbsd.org
State-Changed-When: Thu, 20 Dec 2007 10:50:16 +0000
State-Changed-Why:
submitter can't test anymore.
State-Changed-From-To: open->closed
State-Changed-By: bouyer@NetBSD.org
State-Changed-When: Sun, 21 Sep 2008 20:35:41 +0000
State-Changed-Why:
Likely fixed with st.c 1.194.2.1.
From: Damon Permezel <permezel@mac.com>
To: gnats-bugs@NetBSD.org
Cc: bouyer@NetBSD.org
Subject: Re: kern/36690 (KASSERT(delta > 0) in kern_physio)
Date: Mon, 22 Sep 2008 09:26:45 +1000
Really like to test this for y'all. Unfortunately, got a job, so all
time has evaporated. Machine in question has some nasty loonix crap
on it so it is sounding very tempting to try and get some minutes and
re-install a sane OS on it.
Gotta run to work now, though.... i are a loonix kernel hacker.
kprint just runs off my fingers as if infinite hordes of monkeys are
at work --- not.
On 2008-Sep-22, at 6:35 AM, bouyer@NetBSD.org wrote:
> Synopsis: KASSERT(delta > 0) in kern_physio
>
> State-Changed-From-To: open->closed
> State-Changed-By: bouyer@NetBSD.org
> State-Changed-When: Sun, 21 Sep 2008 20:35:41 +0000
> State-Changed-Why:
> Likely fixed with st.c 1.194.2.1.
>
>
>
>Unformatted: