NetBSD Problem Report #58111

From www@netbsd.org  Thu Apr  4 07:09:20 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 07B231A9239
	for <gnats-bugs@gnats.NetBSD.org>; Thu,  4 Apr 2024 07:09:20 +0000 (UTC)
Message-Id: <20240404070919.1DE5E1A923B@mollari.NetBSD.org>
Date: Thu,  4 Apr 2024 07:09:19 +0000 (UTC)
From: mp@petermann-it.de
Reply-To: mp@petermann-it.de
To: gnats-bugs@NetBSD.org
Subject: Tracking issue for potential ZFS data corruption
X-Send-Pr-Version: www-1.0

>Number:         58111
>Category:       kern
>Synopsis:       Tracking issue for potential ZFS data corruption
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Apr 04 07:10:00 +0000 2024
>Last-Modified:  Fri Jul 05 10:05:01 +0000 2024
>Originator:     Matthias Petermann
>Release:        netbsd-10 and potentially all previous releases that contain ZFS code
>Organization:
-
>Environment:
n/a
>Description:
This ticket is a follow-up to an IRC discussion. It was discussed that NetBSD might be affected by a data corruption issue with ZFS that was recently fixed in FreeBSD. The issue consists of the implementation of the IOCTLs FIOSEEKDATA/FIOSEEKHOLE. The initial assessment on IRC was that these IOCTLs are probably not used in the base system. Theoretically, however, there is the possibility that 3rd party tools, for example from the pkgsrc, will make use of them.

Tracking tickets on FreeBSD: 

  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275308
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256205

Accompanying articles:

  https://www.theregister.com/2023/11/27/openzfs_2_2_0_data_corruption/
  https://www.theregister.com/2023/12/04/two_new_versions_of_openzfs/

Thanks to Riastradh for the initial assessment and guidance!
>How-To-Repeat:
see FreeBSD tickets
>Fix:
see FreeBSD tickets

>Audit-Trail:
From: Simon Burge <simonb@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: Re: kern/58111: Tracking issue for potential ZFS data corruption
Date: Fri, 05 Apr 2024 00:59:26 +1100

 mp@petermann-it.de wrote:

 > >Number:         58111
 > >Category:       kern
 > >Synopsis:       Tracking issue for potential ZFS data corruption
 > ...
 > Tracking tickets on FreeBSD: =

 >
 >   https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D275308
 >   https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D256205
 >
 > Accompanying articles:
 >
 >   https://www.theregister.com/2023/11/27/openzfs_2_2_0_data_corruption/
 >   https://www.theregister.com/2023/12/04/two_new_versions_of_openzfs/
 >
 > Thanks to Riastradh for the initial assessment and guidance!
 > >How-To-Repeat:
 > see FreeBSD tickets
 > >Fix:
 > see FreeBSD tickets

 FreeBSD 12.4 which uses similar ZFS code to us is affected:
 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D275308#c22

 The fix is here:
 https://github.com/robn/freebsd-src/commit/b51ff59cce9c288111f84c8541986ac=
 6647703e7

 https://www.freebsd.org/security/patches/EN-23:16/openzfs.12.patch
 applies cleanly to our tree, builds and boots.

 There's a reproducer script in https://www.illumos.org/issues/16087
 that doesn't trigger for NetBSD using coreutils "cp" as we don't expose
 SEEK_DATA and SEEK_HOLE directly.  We _may_ be able to get to the code
 path via "ioctl(fd, FIOSEEKDATA, &arg)" and "ioctl(fd, FIOSEEKHOLE, &arg)"=
 ?

 Cheers,
 Simon.

From: "Simon Burge" <simonb@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/58111 CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs
Date: Fri, 5 Apr 2024 11:20:34 +0000

 Module Name:	src
 Committed By:	simonb
 Date:		Fri Apr  5 11:20:34 UTC 2024

 Modified Files:
 	src/external/cddl/osnet/dist/uts/common/fs/zfs: dmu.c

 Log Message:
 Apply FreeBSD svn r373278 fix for ZFS corruption.  Fix for NetBSD
 PR kern/58111 .

 It would be extremely unlikely to trip this bug on NetBSD, as we don't
 expose SEEK_DATA and SEEK_HOLE and you need to call ioctl(2) with
 FIOSEEKDATA and FIOSEEKHOLE directly which no currently known code does,
 and even then be unlucky enough to trip a race condition.

 With a reproducer based on that in https://www.illumos.org/issues/16087,
 I saw 11 groups of failures over 8 hours.  With this patch, no
 failures in 10 hours.  The repro for NetBSD will be attached to
 https://gnats.netbsd.org/58111 .

 Original FreeBSD commit message:
 --------------------------------
 dnode_is_dirty: check dnode and its data for dirtiness

 Over its history this the dirty dnode test has been changed between
 checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and
 `dn_dirty_record`.

 It turns out both are actually required.

 In the case of appending data to a newly created file, the dnode proper
 is dirtied (at least to change the blocksize) and dirty records are
 added.  Thus, a single logical operation is represented by separate
 dirty indicators, and must not be separated.

 The incorrect dirty check becomes a problem when the first block of a
 file is being appended to while another process is calling lseek to skip
 holes. There is a small window where the dnode part is undirtied while
 there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)`
 would not know that the file is dirty, and would go to
 `dnode_next_offset()`. Since the object has no data blocks yet, it
 returns `ESRCH`, indicating no data found, which results in `ENXIO`
 being returned to `lseek()`'s caller.

 This change simply updates the dirty check to check both types of dirty.
 If there's anything dirty at all, we immediately go to the "wait for
 sync" stage, It doesn't really matter after that; both changes are on
 disk, so the dirty fields should be correct.

 Sponsored by:   Klara, Inc.
 Sponsored by:   Wasabi Technology, Inc.


 To generate a diff of this commit:
 cvs rdiff -u -r1.6 -r1.7 src/external/cddl/osnet/dist/uts/common/fs/zfs/dmu.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Simon Burge <simonb@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: Re: kern/58111: Tracking issue for potential ZFS data corruption
Date: Fri, 05 Apr 2024 22:31:14 +1100

 > >Number:         58111
 > >Category:       kern
 > >Synopsis:       Tracking issue for potential ZFS data corruption

 Attached is a reproducer for the ZFS corruption bug.  There's a
 coreutils patch to make "cp" to use the FIOSEEKHOLE and FIOSEEKDATA
 ioctls.  Coreutils "cp" and "dd" should be installed on the target as
 "/usr/bin/gcp" and "gdd" (in your path) and needs bash as /bin/bash.
 Should probably use NetBSD's dd with msgfmt=quiet and use NetBSD's
 /bin/sh with some tweaks, but I was lazy.

 Running this reproducer on an 8 CPU qemu NetBSD amd64 VM with an 8GB ZFS
 pool got 11 groups of failures over 8 hours on an unpatched host.

 Cheers,
 Simon.
 --
 # This is a shell archive.  Save it in a file, remove anything before
 # this line, and then unpack it by entering "sh file".  Note, it may
 # create directories; files and directories will be owned by you and
 # have default permissions.
 #
 # This archive contains:
 #
 #	reproducer.sh
 #	repro10.sh
 #	coreutils-copy.c.diff
 #
 echo x - 'reproducer.sh'
 sed 's/^X//' >'reproducer.sh' << 'END-of-reproducer.sh'
 X#!/bin/bash
 X#
 X# Run this script multiple times in parallel inside your pool's mount
 X# to reproduce https://github.com/openzfs/zfs/issues/15526.  Like:
 X#
 X# ./reproducer.sh & ./reproducer.sh & ./reproducer.sh & /reproducer.sh & wait
 X#
 X
 X#if [ $(cat /sys/module/zfs/parameters/zfs_bclone_enabled) != "1" ] ; then
 X#	echo "please set /sys/module/zfs/parameters/zfs_bclone_enabled = 1"
 X#	exit
 X#fi
 X
 X#CP=/home/rich/coreutils-9.1/src/cp
 X#CP=/home/rich/coreutils-9.3/src/cp
 X#CP=/home/rich/coreutils/src/cp
 XCP=/usr/bin/gcp
 X
 Xprefix="reproducer_${BASHPID}_"
 Xgdd if=/dev/urandom of=${prefix}0 bs=1M count=1 status=none
 X
 X##### echo "writing files"
 Xend=500
 Xh=0
 Xfor i in `seq 1 2 $end` ; do
 X	let "j=$i+1"
 X	${CP} ${prefix}$h ${prefix}$i
 X	${CP} ${prefix}$i ${prefix}$j
 X	let "h++"
 Xdone
 X
 X##### echo "checking files"
 Xfor i in `seq 1 $end` ; do
 X	diff ${prefix}0 ${prefix}$i
 Xdone
 END-of-reproducer.sh
 echo x - 'repro10.sh'
 sed 's/^X//' >'repro10.sh' << 'END-of-repro10.sh'
 X#!/bin/sh
 X
 Xdate
 X# echo cleaning up previous test
 Xrm -f reproducer_* 2> /dev/null
 X
 Xscriptdir=$(dirname $0)
 X
 X${scriptdir}/reproducer.sh &
 X${scriptdir}/reproducer.sh &
 X${scriptdir}/reproducer.sh &
 X${scriptdir}/reproducer.sh &
 X${scriptdir}/reproducer.sh &
 X${scriptdir}/reproducer.sh &
 X${scriptdir}/reproducer.sh &
 X${scriptdir}/reproducer.sh &
 X${scriptdir}/reproducer.sh &
 X${scriptdir}/reproducer.sh &
 Xwait
 END-of-repro10.sh
 echo x - 'coreutils-copy.c.diff'
 sed 's/^X//' >'coreutils-copy.c.diff' << 'END-of-coreutils-copy.c.diff'
 X--- src/copy.c.orig	2023-08-29 21:39:27.000000000 +1000
 X+++ src/copy.c	2024-04-05 02:48:41.652462664 +1100
 X@@ -534,6 +534,7 @@
 X   return true;
 X }
 X 
 X+#define SEEK_HOLE	// XXX netbsd
 X #ifdef SEEK_HOLE
 X /* Perform an efficient extent copy, if possible.  This avoids
 X    the overhead of detecting holes in hole-introducing/preserving
 X@@ -562,7 +563,10 @@
 X 
 X   while (0 <= ext_start)
 X     {
 X-      off_t ext_end = lseek (src_fd, ext_start, SEEK_HOLE);
 X+      //XXX off_t ext_end = lseek (src_fd, ext_start, SEEK_HOLE);
 X+      off_t ext_end = ext_start;
 X+      if (ioctl(src_fd, FIOSEEKHOLE, &ext_end) < 0)
 X+	ext_end = -1;
 X       if (ext_end < 0)
 X         {
 X           if (errno != ENXIO)
 X@@ -641,7 +645,10 @@
 X           break;
 X         }
 X 
 X-      ext_start = lseek (src_fd, dest_pos, SEEK_DATA);
 X+      //XXX ext_start = lseek (src_fd, dest_pos, SEEK_DATA);
 X+      ext_start = dest_pos;
 X+      if (ioctl(src_fd, FIOSEEKDATA, &ext_start) < 0)
 X+	ext_start = -1;
 X       if (ext_start < 0 && errno != ENXIO)
 X         goto cannot_lseek;
 X     }
 X@@ -1141,13 +1148,19 @@
 X 
 X   /* Only attempt SEEK_HOLE if this heuristic
 X      suggests the file is sparse.  */
 X+#if 0	// XXX skip this check!
 X   if (! (HAVE_STRUCT_STAT_ST_BLOCKS
 X          && S_ISREG (sb->st_mode)
 X          && ST_NBLOCKS (*sb) < sb->st_size / ST_NBLOCKSIZE))
 X     return PLAIN_SCANTYPE;
 X+#endif	// XXX
 X 
 X #ifdef SEEK_HOLE
 X-  off_t ext_start = lseek (fd, 0, SEEK_DATA);
 X+  //XXX off_t ext_start = lseek (fd, 0, SEEK_DATA);
 X+  off_t ext_start = 0;
 X+  if (ioctl(fd, FIOSEEKDATA, &ext_start) < 0)
 X+    ext_start = -1;
 X+
 X   if (0 <= ext_start || errno == ENXIO)
 X     {
 X       scan_inference->ext_start = ext_start;
 END-of-coreutils-copy.c.diff
 exit

From: Matthias Petermann <mp@petermann-it.de>
To: simonb@NetBSD.org
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/58111: Tracking issue for potential ZFS data corruption
Date: Tue, 9 Apr 2024 06:00:39 +0200

 This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
 --------------8KvO0vw2jB90yve9iN5bW3pL
 Content-Type: multipart/mixed; boundary="------------bAjJ5D8LtPXfb00wL6Z9cc4J";
  protected-headers="v1"
 From: Matthias Petermann <mp@petermann-it.de>
 To: simonb@NetBSD.org
 Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
  gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
 Message-ID: <4f72c4d4-a618-4b31-ad2f-ce15c93ce6ff@petermann-it.de>
 Subject: Re: kern/58111: Tracking issue for potential ZFS data corruption
 References: <pr-kern-58111@gnats.netbsd.org>
  <20240404070919.1DE5E1A923B@mollari.NetBSD.org>
  <20240405113502.1C2F61A923B@mollari.NetBSD.org>
 In-Reply-To: <20240405113502.1C2F61A923B@mollari.NetBSD.org>

 --------------bAjJ5D8LtPXfb00wL6Z9cc4J
 Content-Type: multipart/mixed; boundary="------------qZbiPvZ00jLd3sXgG1lSjzUN"

 --------------qZbiPvZ00jLd3sXgG1lSjzUN
 Content-Type: text/plain; charset=UTF-8; format=flowed
 Content-Transfer-Encoding: base64

 SGVsbG8gU2ltb24sIHRoYW5rIHlvdSB2ZXJ5IG11Y2ggZm9yIHRha2luZyBjYXJlIG9mIHRo
 aXMgYnVnIHNvIHF1aWNrbHkuIA0KVGhlIHJlcHJvZHVjZXIgaXMgdmVyeSBoZWxwZnVsLCBh
 bmQgaXQncyBnb29kIHRvIGtub3cgdGhhdCB0aGUgYnVnIGNhbiANCm9ubHkgb2NjdXIgdGhl
 b3JldGljYWxseSBpbiBwcm9kdWN0aW9uIHN5c3RlbXMgYXQgdGhlIG1vbWVudC4gTXkgDQp0
 aG91Z2h0cyB3ZXJlIHdpdGggdGhlIHBlb3BsZSB3aG8gdXNlIFpGUyBhcyBiYWNraW5nIHN0
 b3JhZ2UgZm9yIFZNcyBhbmQgDQpoYXZlIGxhcmdlIHNwYXJzZSBmaWxlcyBhbmQgSSBhbSBn
 bGFkIHRoYXQgdGhlIGJ1ZyB3aWxsIG5vdCBiZSBhIHRocmVhdCANCnRvIHVzIGluIHRoZSBm
 dXR1cmUuIFRoYW5rcyBhZ2FpbiBmb3IgdGhlIGdyZWF0IHdvcmshDQoNCk1hdHRoaWFzDQo=

 --------------qZbiPvZ00jLd3sXgG1lSjzUN
 Content-Type: application/pgp-keys; name="OpenPGP_0x92FCF71C2A41A789.asc"
 Content-Disposition: attachment; filename="OpenPGP_0x92FCF71C2A41A789.asc"
 Content-Description: OpenPGP public key
 Content-Transfer-Encoding: quoted-printable

 -----BEGIN PGP PUBLIC KEY BLOCK-----

 xjMEY9T0phYJKwYBBAHaRw8BAQdAjAYSym+QH+Gpfsa3/pCMaYmA2iH7LumAhd73
 YtzS8IXNFDxtcEBwZXRlcm1hbm4taXQuZGU+wosEEBYIADMCGQEFAmPU9KYCGwME
 CwkIBwYVCAkKCwIDFgIBFiEEAA2o7sa6xIy7/cPokvz3HCpBp4kACgkQkvz3HCpB
 p4nBEgD/aN4AurifvzZjz31ZGDP++zgSWU8mZ2QTfiP1nRwPtm0A/0RLAsOqprX6
 ern4+ePrrmNWHCcvvQI04V+C5uwiaHkCzjgEY9T0phIKKwYBBAGXVQEFAQEHQMZD
 uHdoWTDIYm7xK1F5xIAzJMV2rR29YOzQNKhtoo0CAwEIB8J4BBgWCAAgBQJj1PSm
 AhsMFiEEAA2o7sa6xIy7/cPokvz3HCpBp4kACgkQkvz3HCpBp4m3LgD+M/91df+w
 AHgFPKT92PLF7PKIFVXK7S1dY0bgwYw5ZOYA/3PEjwJ+0fIkEyFPN2VaOU8Q4VfW
 p2/PE4Hk/tFEX8oL
 =3DnX/1
 -----END PGP PUBLIC KEY BLOCK-----

 --------------qZbiPvZ00jLd3sXgG1lSjzUN--

 --------------bAjJ5D8LtPXfb00wL6Z9cc4J--

 --------------8KvO0vw2jB90yve9iN5bW3pL
 Content-Type: application/pgp-signature; name="OpenPGP_signature.asc"
 Content-Description: OpenPGP digital signature
 Content-Disposition: attachment; filename="OpenPGP_signature.asc"

 -----BEGIN PGP SIGNATURE-----

 wnsEABYIACMWIQQADajuxrrEjLv9w+iS/PccKkGniQUCZhS9aAUDAAAAAAAKCRCS/PccKkGniVcv
 AP9q1GWtJ/mWy4S0zQy3d86fNeRdHyTc7jK44gOOG4Gr5AD9G9pC4RXtgheCW86nSvP96jDefTCF
 o6mTc0qanjhGKAc=
 =doUQ
 -----END PGP SIGNATURE-----

 --------------8KvO0vw2jB90yve9iN5bW3pL--

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@NetBSD.org, simonb@NetBSD.org
Cc: 
Subject: Re: kern/58111 Tracking issue for potential ZFS data corruption
Date: Fri, 5 Jul 2024 19:03:40 +0900

 Hi Simon!

 Do you think this is worth pulling up to netbsd-10?
 If not, I will close this PR.

 Thanks,
 rin

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2024 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.