NetBSD Problem Report #59570

From www@netbsd.org  Sun Aug  3 21:15:18 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits)
	 client-signature RSA-PSS (2048 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 442AF1A923C
	for <gnats-bugs@gnats.NetBSD.org>; Sun,  3 Aug 2025 21:15:18 +0000 (UTC)
Message-Id: <20250803211516.E89ED1A923E@mollari.NetBSD.org>
Date: Sun,  3 Aug 2025 21:15:16 +0000 (UTC)
From: brandon@burn.net
Reply-To: brandon@burn.net
To: gnats-bugs@NetBSD.org
Subject: tar fails with physical tape drives
X-Send-Pr-Version: www-1.0

>Number:         59570
>Category:       bin
>Synopsis:       tar fails with physical tape drives
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 03 21:20:00 +0000 2025
>Closed-Date:    Sat May 09 21:43:29 +0000 2026
>Last-Modified:  Sat May 09 21:43:29 +0000 2026
>Originator:     Brandon Applegate
>Release:        10.1
>Organization:
>Environment:
>Description:
When using tar with a SCSI-attached tape drive, writing appears to succeed.  However, reading fails with "damaged archive" / "skipping header" messages.  I have tried:

- Multiple tape drives.
- Multiple block sizes. (tar -b)
- Using tar to create a tar file, dd'ing it to tape, then dd'ing it back off to a file.
- Using gtar from pkgsrc.
- Multiple architectures (amd64, hppa, sparc).

Using the same drive and tape on a Solaris or Linux system behaves correctly (no errors).

I had a thread on netbsd-users here:

https://mail-index.netbsd.org/netbsd-users/2025/07/22/msg032890.html

I tested a few releases going backwards from 10.1 and it seems that 7.2 let me use the tape drive with no errors.

The fact that tar/gtar/dd all have essentially the same failure modes makes me think something is wrong in the st driver nowadays?

I have multiple drives and machines and am happy to run any tests requested.


>How-To-Repeat:
Connect known good tape drive, use known good tape.

tar -cpvf /dev/st0 /usr/pkg

tar -xpvf /dev/st0
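
The two steps above can be scripted as a single round trip. This is only a sketch: the function name tape_roundtrip and the scratch directory are hypothetical, and the device path is a parameter, so a regular file can stand in for the tape when trying the script out.

```shell
# Hypothetical helper: archive a directory to a device (or plain file),
# extract it again into a scratch directory, and compare with the source.
tape_roundtrip() {
    dev=$1      # e.g. /dev/st0 as used in this report, or a file for a dry run
    src=$2      # directory to archive
    out=$(mktemp -d) || return 1

    # Write, read back, then compare file contents recursively.
    tar -cpf "$dev" -C "$src" . &&
    tar -xpf "$dev" -C "$out" &&
    diff -r "$src" "$out"
    rc=$?

    rm -rf "$out"
    return $rc
}
```

A call such as `tape_roundtrip /dev/st0 /usr/pkg` returns non-zero if either tar step fails or the extracted tree differs from the source.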
>Fix:

>Release-Note:

>Audit-Trail:
From: Brandon Applegate <brandon@burn.net>
To: "gnats-bugs@netbsd.org" <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: bin/59570
Date: Mon, 29 Dec 2025 21:27:16 -0500

 A few more tests / data points.

 I also tried bsdtar and star from pkgsrc and both have the same
 behavior.

 PS: I submitted this with the category of ‘bin’.  I’m not sure if
 that’s correct so please feel free to change this (if possible) to
 the correct category.

From: Brandon Applegate <brandon@burn.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/59570
Date: Wed, 6 May 2026 10:55:51 -0400 (EDT)


 FYI: this should likely be category "kern", not "bin".  I don't think I can
 change it at this point.

 Possible root cause found.  Looks like after 7.2 - leading into 8.0 - 
 there was work done on scsipi to make things MPSAFE:

 "Make scsipi framework MPSAFE. [mlelstv 20161120]"

 I just took a stock 10.1 machine (sparc64) + DAT40 / DDS4 drive (HP
 C5683A) and did a test.  I applied the patch below (fireaxe I know...) to
 disable the D_MPSAFE flags in st.c.  After doing this I no longer get the tar
 bad header errors.  I wrote the kernel source directory to tape, and extracted
 it back to disk.  I didn't get any of the tar errors, and the md5s also seem
 to all match:

 bash-5.3# diff <(cd /usr/src/sys/ && find . -type f -exec md5 {} + | sort -k 4) \
     <(cd /var/tmp/sttest/ && find . -type f -exec md5 {} + | sort -k 4)
 bash-5.3#
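
The same tree comparison can be wrapped in a small function. This is a sketch only: it swaps md5 for the POSIX cksum utility (a CRC, so weaker than MD5) so that it also runs outside NetBSD, and the names tree_sums and trees_match are made up for illustration.

```shell
# Print one "checksum size path" line per regular file below directory $1,
# sorted by path so that two listings can be compared line by line.
tree_sums() {
    (cd "$1" && find . -type f -exec cksum {} + | sort -k 3)
}

# Succeed only if both directories exist and their listings are identical.
trees_match() {
    [ -d "$1" ] && [ -d "$2" ] &&
        [ "$(tree_sums "$1")" = "$(tree_sums "$2")" ]
}
```

For the test described here, the call would be `trees_match /usr/src/sys /var/tmp/sttest && echo "trees match"`.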

 So, disclaimer: I am not a C dev, so I don't pretend to fully understand
 the 2016 refactor of this code or what all my change has wrought.  But it
 definitely seems that this refactor, made after release 7.2, is what is
 causing real tape drives to fail on reads as I have seen.  Hopefully someone
 can take a deeper dive and figure out the issue that's happening here, and
 how to integrate a fix into the MPSAFE work done in 2016.

 I will keep this tape drive connected and ready to go - I'm happy to test 
 any kernel tweaks or patches for this.  Thanks.

 --- st.c.orig   2026-05-06 07:53:07.402910289 -0400
 +++ st.c        2026-05-06 07:56:16.006832754 -0400
 @@ -114,7 +114,7 @@
          .d_dump = stdump,
          .d_psize = nosize,
          .d_discard = nodiscard,
 -       .d_flag = D_TAPE | D_MPSAFE
 +       .d_flag = D_TAPE
   };

   const struct cdevsw st_cdevsw = {
 @@ -129,7 +129,7 @@
          .d_mmap = nommap,
          .d_kqfilter = nokqfilter,
          .d_discard = nodiscard,
 -       .d_flag = D_TAPE | D_MPSAFE
 +       .d_flag = D_TAPE
   };

   /*

 --
 Brandon Applegate - CCIE 10273
 PGP Key fingerprint:
 A269 60C3 7335 29E3 0DDA 0197 1834 C1F9 A1B2 559A
 "For thousands of years men dreamed of pacts with demons.
 Only now are such things possible."

From: Brandon Applegate <brandon@burn.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/59570
Date: Sat, 9 May 2026 10:18:41 -0400 (EDT)

 I can't believe it took me this long to realize what I was doing :(  I was
 using /dev/st0 instead of /dev/rst0 (Linux muscle memory).  I can extract
 correctly using tar from /dev/rst0.  We can close this PR.  Sorry for the
 noise.
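
The naming difference can be sketched as a small shell helper. It assumes the usual NetBSD convention in which the raw (character) tape node carries an "r" before "st" (/dev/rst0, or /dev/nrst0 for the no-rewind node); the function name raw_tape_dev is made up for illustration, and any other name is passed through unchanged.

```shell
# Hypothetical helper: map a block tape node name to its raw counterpart.
raw_tape_dev() {
    dir=$(dirname "$1")
    name=$(basename "$1")
    case $name in
        st[0-9]*)  name=r$name ;;         # st0  -> rst0
        nst[0-9]*) name=nr${name#n} ;;    # nst0 -> nrst0
    esac
    printf '%s/%s\n' "$dir" "$name"
}
```

With this, the extraction that worked above can be written as `tar -xpvf "$(raw_tape_dev /dev/st0)"`.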

State-Changed-From-To: open->closed
State-Changed-By: gutteridge@NetBSD.org
State-Changed-When: Sat, 09 May 2026 21:43:29 +0000
State-Changed-Why:
Local issue, submitter good to close.

>Unformatted:
