NetBSD Problem Report #47646

From jmmv@netbsd.org  Thu Mar 14 06:36:41 2013
Return-Path: <jmmv@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id D4D8863EE4D
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 14 Mar 2013 06:36:40 +0000 (UTC)
Message-Id: <20130314063640.A07E814A32C@mail.netbsd.org>
Date: Thu, 14 Mar 2013 06:36:40 +0000 (UTC)
From: jmmv@netbsd.org
Reply-To: jmmv@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: Test dev/scsipi/t_cd:noisyeject broken
X-Send-Pr-Version: 3.95

>Number:         47646
>Category:       kern
>Synopsis:       Test dev/scsipi/t_cd:noisyeject broken
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 14 06:40:00 +0000 2013
>Closed-Date:    Fri Mar 15 16:19:36 +0000 2013
>Last-Modified:  Fri Mar 15 16:20:03 +0000 2013
>Originator:     Julio Merino
>Release:        NetBSD 6.99.17
>Organization:

>Environment:


System: NetBSD nbair.virtual.network 6.99.17 NetBSD 6.99.17 (GENERIC) #10: Thu Mar 14 13:50:55 JST 2013 jmmv@nbair.virtual.network:/home/jmmv/os/netbsd/amd64/obj/home/jmmv/os/netbsd/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
	The dev/scsipi/t_cd:noisyeject test is broken.  It segfaults, and
	I do not know if this is due to the test being broken or due to
	some problem in rump or the device driver (according to the stack
	trace).  But the test cannot remain broken, so filing this report
	to mark the test as known to fail.

	The stack trace looks like this:

	[...]
	feature 0 having length 133
	feature 18825 having length 73
	feature 59893 having length 255
	feature 9224 having length 19
	feature 35271 having length 157
	feature 30013 having length 1
	feature 72 having length 157
	feature 0 having length 133
	feature 72 haProcess with PID 21857 dumped core; attempting to gather stack trace
	[New process 1]
	[New process 24]
	[New process 23]
	[New process 22]
	[New process 21]
	[New process 20]
	[New process 19]
	[New process 18]
	[New process 17]
	[New process 16]
	[New process 15]
	[New process 14]
	[New process 13]
	[New process 12]
	[New process 11]
	[New process 10]
	[New process 9]
	[New process 8]
	[New process 7]
	[New process 6]
	[New process 5]
	[New process 4]
	[New process 3]
	[New process 2]
	Core was generated by `t_cd'.
	Program terminated with signal 11, Segmentation fault.
	#0  0x00007f7ff7412ade in ?? () from /usr/lib/librumpdev_scsipi.so.0
	#0  0x00007f7ff7412ade in ?? () from /usr/lib/librumpdev_scsipi.so.0
	#1  0x00007f7ff64666d7 in rumpns_bdev_ioctl () from /usr/lib/librump.so.0
	#2  0x00007f7ff70055a1 in rumpns_readdisklabel () from /usr/lib/librumpdev_disk.so.0
	#3  0x00007f7ff7410552 in ?? () from /usr/lib/librumpdev_scsipi.so.0
	#4  0x00007f7ff74149c0 in ?? () from /usr/lib/librumpdev_scsipi.so.0
	#5  0x00007f7ff6466880 in rumpns_cdev_open () from /usr/lib/librump.so.0
	#6  0x00007f7ff6826696 in rumpns_spec_open () from /usr/lib/librumpvfs.so.0
	#7  0x00007f7ff645218c in rumpns_VOP_OPEN () from /usr/lib/librump.so.0
	#8  0x00007f7ff682cc8d in rumpns_vn_open () from /usr/lib/librumpvfs.so.0
	#9  0x00007f7ff6831622 in ?? () from /usr/lib/librumpvfs.so.0
	#10 0x00007f7ff6831769 in ?? () from /usr/lib/librumpvfs.so.0
	#11 0x00007f7ff6835107 in rumpns_sys_open () from /usr/lib/librumpvfs.so.0
	#12 0x00007f7ff64819f1 in ?? () from /usr/lib/librump.so.0
	#13 0x00007f7ff6481b43 in rump___sysimpl_open () from /usr/lib/librump.so.0
	#14 0x00000000004010c3 in atfu_noisyeject_body ()
	#15 0x00007f7ff540a39a in atf_tc_run () from /usr/lib/libatf-c.so.0
	#16 0x00007f7ff5406e7b in atf_tp_main () from /usr/lib/libatf-c.so.0
	#17 0x0000000000400f22 in ___start ()
	#18 0x00007f7ff7ffa000 in ?? ()
	#19 0x0000000000000006 in ?? ()
	#20 0x00007f7ffffffb20 in ?? ()
	#21 0x00007f7ffffffb6d in ?? ()
	#22 0x00007f7ffffffb70 in ?? ()
	#23 0x00007f7ffffffb97 in ?? ()
	#24 0x00007f7ffffffb9a in ?? ()
	#25 0x00007f7ffffffbac in ?? ()
	#26 0x0000000000000000 in ?? ()
	GDB exited successfully
	t_cd:noisyeject  ->  broken: Premature exit; test case received signal 11 (core dumped)
>How-To-Repeat:
	$ kyua debug -k /usr/tests/Kyuafile dev/scsipi/t_cd:noisyeject

	either on i386 or amd64.  Or look at the test logs online.
>Fix:
	Dunno.

>Release-Note:

>Audit-Trail:
From: "Julio Merino" <jmmv@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47646 CVS commit: src/tests/dev/scsipi
Date: Thu, 14 Mar 2013 06:43:32 +0000

 Module Name:	src
 Committed By:	jmmv
 Date:		Thu Mar 14 06:43:32 UTC 2013

 Modified Files:
 	src/tests/dev/scsipi: t_cd.c

 Log Message:
 Expect a crash in the noisyeject test.  Mention PR kern/47646.


 To generate a diff of this commit:
 cvs rdiff -u -r1.4 -r1.5 src/tests/dev/scsipi/t_cd.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Thu, 14 Mar 2013 10:12:10 +0100

 When did this start happening?
 I have never seen it so far.

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: pooka@NetBSD.org
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Thu, 14 Mar 2013 10:41:50 +0100

 To answer the question:

 build: OK with 358301 lines of log, install: OK, tests: 3456 passed, 80 skipped, 42 expected_failure, 21 failed, ATF output: raw, xml, html
 commit 2013.03.10.16.27.11 pooka src/sys/rump/librump/rumpkern/Makefile.rumpkern 1.122
 commit 2013.03.10.16.27.11 pooka src/sys/rump/librump/rumpkern/memalloc.c 1.19
 commit 2013.03.10.16.51.31 pooka src/sys/rump/librump/rumpkern/rump.c 1.256
 commit 2013.03.10.16.51.31 pooka src/sys/rump/librump/rumpkern/rump_private.h 1.74
 commit 2013.03.10.16.51.31 pooka src/sys/rump/librump/rumpkern/threads.c 1.18
 commit 2013.03.10.17.05.12 pooka src/sys/rump/librump/rumpkern/Makefile.rumpkern 1.123
 commit 2013.03.10.17.05.12 pooka src/sys/rump/librump/rumpkern/memalloc.c 1.20
 commit 2013.03.10.17.05.12 pooka src/sys/rump/librump/rumpkern/rump.c 1.257
 commit 2013.03.10.17.05.12 pooka src/sys/rump/librump/rumpkern/vm.c 1.137
 commit 2013.03.10.17.55.42 pooka src/sys/kern/kern_uidinfo.c 1.8
 build: OK with 358770 lines of log, install: OK, tests: 3454 passed, 80 skipped, 42 expected_failure, 23 failed, ATF output: raw, xml, html


 The first one ran the test OK, with the second one it dumped core.

 Antti?

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: pooka@NetBSD.org
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Thu, 14 Mar 2013 15:56:54 +0100

 Note that it does not happen on all archs - the test now fails for me
 with:

 tc-end: 1363272941.977413, noisyeject, failed, Test case was expected to receive a termination signal but it continued execution

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Thu, 14 Mar 2013 22:34:33 +0100

 It dies in:

 #0  0x00007f7ff74185b2 in mmc_getdiscinfo (periph=0x7f7ff6301f08, 
     mmc_discinfo=0x7f7fffffd0b0)
     at /usr/src/sys/rump/dev/lib/libscsipi/../../../../dev/scsipi/cd.c:3056
 #1  0x00007f7ff741515d in cdioctl (dev=1539, cmd=1081631516, 
     addr=0x7f7fffffd0b0, flag=-2147483648, l=0x7f7ff7b4b000)
     at /usr/src/sys/rump/dev/lib/libscsipi/../../../../dev/scsipi/cd.c:1624
 #2  0x00007f7ff64666d7 in rumpns_bdev_ioctl () from /usr/lib/librump.so.0
 #3  0x00007f7ff70055a1 in rumpns_readdisklabel ()
    from /usr/lib/librumpdev_disk.so.0
 #4  0x00007f7ff741566a in cdgetdisklabel (cd=0x7f7ff6b26300)
     at /usr/src/sys/rump/dev/lib/libscsipi/../../../../dev/scsipi/cd.c:1765

 especially:

 3051                    fpos = &gc->feature_desc[0];
 3052                    while (pos < features_len - 4) {
 3053                            gcf = (struct scsipi_get_conf_feature *) fpos;

 and both pos and features_len seem to be way out of bounds:

 (gdb) print features_len
 $1 = 10547445
 (gdb) print pos
 $2 = 1087470


 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: Manuel Bouyer <bouyer@antioche.eu.org>
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Fri, 15 Mar 2013 11:03:53 +0100

 --gBBFr7Ir9EOA20Yy
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 And this patch works around it - looks like somewhere on the way an
 error return value from cdstrategy is lost. With this change, the
 feature_len comes up as 0 and we break out of the loop imediately.

 Manuel?

 Martin

 --gBBFr7Ir9EOA20Yy
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=patch

 Index: cd.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/scsipi/cd.c,v
 retrieving revision 1.309
 diff -u -r1.309 cd.c
 --- cd.c	6 May 2012 17:23:10 -0000	1.309
 +++ cd.c	15 Mar 2013 10:00:32 -0000
 @@ -3030,6 +3030,7 @@
  		/* determine mmc profile and class */
  		flags = XS_CTL_DATA_IN;
  		memset(&gc_cmd, 0, sizeof(gc_cmd));
 +		_lto4b(0, gc->data_len);
  		gc_cmd.opcode = GET_CONFIGURATION;
  		_lto2b(last_feature, gc_cmd.start_at_feature);
  		_lto2b(feat_tbl_len, gc_cmd.data_len);

 --gBBFr7Ir9EOA20Yy--

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: Manuel Bouyer <bouyer@antioche.eu.org>, Antti Kantee <pooka@NetBSD.org>
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Fri, 15 Mar 2013 11:32:16 +0100

 Actually there is no error involved, the value comes from here:

 182             case SCSI_READ_6_COMMAND: {
 183     #ifdef USE_TOSI_ISO
 184                     struct scsi_rw_6 *param = (void *)cmd;
 185     
 186                     printf("reading %d bytes from %d\n",
 187                         param->length * CDBLOCKSIZE,
 188                         _3btol(param->addr) * CDBLOCKSIZE);
 189                     rumpuser_pread(isofd, xs->data,
 190                          param->length * CDBLOCKSIZE,
 191                          _3btol(param->addr) * CDBLOCKSIZE,
 192                          &error);
 193     #endif
 194     
 195                     break;


 with this backtrace:

 #0  scsitest_request (chan=0x7f7ff6b26480, req=ADAPTER_REQ_RUN_XFER, 
     arg=0x7f7ff7be2f00) at /usr/src/sys/rump/dev/lib/libscsitest/scsitest.c:205
 #1  0x00007f7ff741fda5 in scsipi_run_queue (chan=0x7f7ff6b26480)
     at /usr/src/sys/rump/dev/lib/libscsipi/../../../../dev/scsipi/scsipi_base.c:1845
 #2  0x00007f7ff742010c in scsipi_execute_xs (xs=0x7f7ff7be2f00)
     at /usr/src/sys/rump/dev/lib/libscsipi/../../../../dev/scsipi/scsipi_base.c:1970
 #3  0x00007f7ff7420f3b in scsipi_command (periph=0x7f7ff6301f08, 
     cmd=0x7f7fffffcae0, cmdlen=10, data_addr=0x7f7ff630e808 "", datalen=1024, 
     retries=4, timeout=30000, bp=0x0, flags=4096)
     at /usr/src/sys/rump/dev/lib/libscsipi/../../../../dev/scsipi/scsipiconf.c:101
 #4  0x00007f7ff741854f in mmc_getdiscinfo (periph=0x7f7ff6301f08, 
     mmc_discinfo=0x7f7fffffd080)
     at /usr/src/sys/rump/dev/lib/libscsipi/../../../../dev/scsipi/cd.c:3038


 and the xs is:

 $8 = {channel_q = {tqe_next = 0x0, tqe_prev = 0x7f7ff6b26548}, device_q = {
     tqe_next = 0x0, tqe_prev = 0x7f7ff6301f90}, xs_callout = {_c_store = {0x0, 
       0x0, 0x0, 0x0, 0x7f7ff66c68a0, 0x100000000, 0x11deeba1, 0x0, 0x0, 0x0}}, 
   xs_control = 4096, xs_status = 0, xs_periph = 0x7f7ff6301f08, 
   xs_retries = 4, xs_requeuecnt = 0, timeout = 30000, cmd = 0x7f7ff7be2fec, 
   cmdlen = 10, data = 0x7f7ff630e808 "", datalen = 1024, resid = 1024, 
   error = XS_NOERROR, bp = 0x0, sense = {scsi_sense = {
 ...


 So the (simulated) device, when told to read 1024 bytes, reads 0 bytes
 and signals no error - is that OK?

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: Antti Kantee <pooka@NetBSD.org>, Manuel Bouyer <bouyer@NetBSD.org>
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Fri, 15 Mar 2013 12:08:53 +0100

 --pf9I7BMVVzbSWLtt
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 Is there some easy way (easier than allocating a temporary buf) to get
 the residual here? If so we should test that and make sure we got enough
 of an answer to be meaningfull.

 I propose the attached patch to fix this the issue in two, unrelated ways.

 Martin


 --pf9I7BMVVzbSWLtt
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=patch

 Index: cd.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/scsipi/cd.c,v
 retrieving revision 1.309
 diff -u -r1.309 cd.c
 --- cd.c	6 May 2012 17:23:10 -0000	1.309
 +++ cd.c	15 Mar 2013 11:03:17 -0000
 @@ -3033,6 +3033,7 @@
  		gc_cmd.opcode = GET_CONFIGURATION;
  		_lto2b(last_feature, gc_cmd.start_at_feature);
  		_lto2b(feat_tbl_len, gc_cmd.data_len);
 +		memset(gc, 0, feat_tbl_len);

  		error = scsipi_command(periph,
  			(void *)&gc_cmd, sizeof(gc_cmd),
 @@ -3044,7 +3045,7 @@
  		}

  		features_len = _4btol(gc->data_len);
 -		if (features_len < 4)
 +		if (features_len < 4 || features_len > feat_tbl_len)
  			break;

  		pos  = 0;

 --pf9I7BMVVzbSWLtt--

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@NetBSD.org, Antti Kantee <pooka@NetBSD.org>
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Fri, 15 Mar 2013 15:51:29 +0100

 On Fri, Mar 15, 2013 at 11:32:16AM +0100, Martin Husemann wrote:
 > [...]
 > 
 > So the (simulated) device, when told to read 1024 bytes, reads 0 bytes
 > and signals no error - is that OK?

 it's not, it should at last return the feature header. See
 http://www.google.fr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CDoQFjAA&url=ftp%3A%2F%2Fftp.t10.org%2Ft10%2Fdocument.97%2F97-263r0.pdf&ei=LDNDUcbmF8eQhQeG84DgCA&usg=AFQjCNEFP6HJkVddmEJfQ1fCNHPYRTnonA&bvm=bv.43828540,d.ZG4
 (document T10/97-263r0)

 I guess the feature header could have a 0 data length, which would break
 out of the loop immediatly.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@NetBSD.org, Antti Kantee <pooka@NetBSD.org>
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Fri, 15 Mar 2013 15:54:03 +0100

 On Fri, Mar 15, 2013 at 12:08:53PM +0100, Martin Husemann wrote:
 > Is there some easy way (easier than allocating a temporary buf) to get
 > the residual here? If so we should test that and make sure we got enough
 > of an answer to be meaningfull.
 > 
 > I propose the attached patch to fix this the issue in two, unrelated ways.

 I think the problem comes from the emulator, which doesn't return the
 mandatory feature header. But zeroing out the buffer before the
 command, and sanity-checking the result is probably good to have anyway.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Martin Husemann <martin@duskware.de>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@NetBSD.org, Antti Kantee <pooka@NetBSD.org>
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Fri, 15 Mar 2013 16:43:04 +0100

 --+g7M9IMkV8truYOl
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 Ok, so additionally to the last patch do this?

 Martin

 --+g7M9IMkV8truYOl
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=patch

 Index: scsitest.c
 ===================================================================
 RCS file: /cvsroot/src/sys/rump/dev/lib/libscsitest/scsitest.c,v
 retrieving revision 1.5
 diff -u -r1.5 scsitest.c
 --- scsitest.c	27 Oct 2012 17:18:40 -0000	1.5
 +++ scsitest.c	15 Mar 2013 15:42:15 -0000
 @@ -176,7 +176,7 @@
  		break;
  	}
  	case GET_CONFIGURATION: {
 -
 +		memset(xs->data, 0, sizeof(struct scsipi_get_conf_data));
  		break;
  	}
  	case SCSI_READ_6_COMMAND: {

 --+g7M9IMkV8truYOl--

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@NetBSD.org, Antti Kantee <pooka@NetBSD.org>
Subject: Re: kern/47646: Test dev/scsipi/t_cd:noisyeject broken
Date: Fri, 15 Mar 2013 17:04:36 +0100

 On Fri, Mar 15, 2013 at 04:43:04PM +0100, Martin Husemann wrote:
 > Ok, so additionally to the last patch do this?

 Yes, this should fix the root cause of the problem.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47646 CVS commit: src/sys/rump/dev/lib/libscsitest
Date: Fri, 15 Mar 2013 16:14:13 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Fri Mar 15 16:14:13 UTC 2013

 Modified Files:
 	src/sys/rump/dev/lib/libscsitest: scsitest.c

 Log Message:
 For the MMC GET_CONFIGURATION command, return a nulled feature header.
 Fixes PR kern/47646.


 To generate a diff of this commit:
 cvs rdiff -u -r1.5 -r1.6 src/sys/rump/dev/lib/libscsitest/scsitest.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Fri, 15 Mar 2013 16:19:36 +0000
State-Changed-Why:
fixed


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47646 CVS commit: src/sys/dev/scsipi
Date: Fri, 15 Mar 2013 16:16:12 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Fri Mar 15 16:16:12 UTC 2013

 Modified Files:
 	src/sys/dev/scsipi: cd.c

 Log Message:
 Prompted by PR kern/47646, zero the value buffer before running the
 GET_CONFIGURATION command and check for an unexpectedly large feature
 length answer afterwards.


 To generate a diff of this commit:
 cvs rdiff -u -r1.309 -r1.310 src/sys/dev/scsipi/cd.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.