NetBSD Problem Report #59219

From www@netbsd.org  Wed Mar 26 16:01:34 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits)
	 client-signature RSA-PSS (2048 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id D57361A9239
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 26 Mar 2025 16:01:34 +0000 (UTC)
Message-Id: <20250326160133.979961A923D@mollari.NetBSD.org>
Date: Wed, 26 Mar 2025 16:01:33 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: umass(4): fails to give up on all I/O promptly when device is yanked
X-Send-Pr-Version: www-1.0

>Number:         59219
>Category:       kern
>Synopsis:       umass(4): fails to give up on all I/O promptly when device is yanked
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Mar 26 16:05:01 +0000 2025
>Originator:     Taylor R Campbell
>Release:        current, 10, 9, ...
>Organization:
The NetUMass Yankation
>Environment:
>Description:
When a umass(4) device is yanked, the kernel has enough information to instantly conclude that the device is gone and all pending I/O should be abandoned, and yet it continues to hang and spew garbage like this:

[ 496464.497266] umass0: BBB reset failed, TIMEOUT
[ 496469.497333] umass0: BBB bulk-in clear stall failed, TIMEOUT
[ 496474.497402] umass0: BBB bulk-out clear stall failed, TIMEOUT
[ 496494.497688] umass0: BBB reset failed, TIMEOUT
[ 496499.497731] umass0: BBB bulk-in clear stall failed, TIMEOUT
[ 496504.497798] umass0: BBB bulk-out clear stall failed, TIMEOUT
[ 496524.498063] umass0: BBB reset failed, TIMEOUT
[ 496529.498130] umass0: BBB bulk-in clear stall failed, TIMEOUT
[ 496534.498196] umass0: BBB bulk-out clear stall failed, TIMEOUT
[ 496554.498462] umass0: BBB reset failed, TIMEOUT

>How-To-Repeat:
Yank a umass(4) device while I/O is in progress.
>Fix:
Yes, please!

umass_detach already sets sc_dying (which really shouldn't be necessary), aborts the pipes, and detaches the children (which really should be enough), but something, possibly in the scsipi state machine, keeps retrying when it shouldn't.

Also, it looks like umass_detach attempts to abort the pipes multiple times -- first directly, and then via umass_disco.  There should be only one attempt to do this at exactly the correct time, not some flailing to bang on it like whack-a-mole until it's gone.
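
To illustrate the desired behavior, here is a minimal userland sketch, not the actual umass(4) or scsipi code. It models the reset / bulk-in clear stall / bulk-out clear stall sequence seen in the log and shows how checking a "gone" flag before each step avoids burning a timeout on every one. All names (model_softc, model_step, model_recover) are hypothetical, and the 5-second step timeout is an assumption read off the log timestamps.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; NOT the real struct umass_softc. */
struct model_softc {
	bool     dying;      /* set once the device is known to be gone */
	unsigned waited_sec; /* simulated seconds spent waiting on timeouts */
	unsigned attempts;   /* recovery steps actually issued */
};

enum model_status { MODEL_OK, MODEL_TIMEOUT, MODEL_GONE };

/* Assumed values: three recovery steps, each timing out after about 5 s. */
#define MODEL_NSTEPS           3
#define MODEL_STEP_TIMEOUT_SEC 5

/*
 * One recovery step (reset or clear stall).  In this model the device
 * has been unplugged, so any step that is issued can only time out.
 */
static enum model_status
model_step(struct model_softc *sc, unsigned timeout_sec)
{
	if (sc->dying)
		return MODEL_GONE;  /* nothing issued, nothing waited for */
	sc->attempts++;
	sc->waited_sec += timeout_sec;
	return MODEL_TIMEOUT;
}

/* Run the recovery sequence up to max_retries times, giving up at once if gone. */
enum model_status
model_recover(struct model_softc *sc, unsigned max_retries)
{
	if (sc->dying)
		return MODEL_GONE;
	for (unsigned r = 0; r < max_retries; r++) {
		for (size_t i = 0; i < MODEL_NSTEPS; i++) {
			if (model_step(sc, MODEL_STEP_TIMEOUT_SEC) == MODEL_GONE)
				return MODEL_GONE;
		}
	}
	return MODEL_TIMEOUT;
}
```

In this model, calling model_recover() with dying set returns MODEL_GONE without issuing a step or waiting, while calling it with dying clear and four retries issues twelve steps and accumulates 60 simulated seconds of timeouts, which resembles the pattern in the log above.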
