NetBSD Problem Report #57496

From www@netbsd.org  Sat Jul  1 14:38:02 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 9456D1A923E
	for <gnats-bugs@gnats.NetBSD.org>; Sat,  1 Jul 2023 14:38:02 +0000 (UTC)
Message-Id: <20230701143800.EF6CE1A9241@mollari.NetBSD.org>
Date: Sat,  1 Jul 2023 14:38:00 +0000 (UTC)
From: hpaluch@seznam.cz
Reply-To: hpaluch@seznam.cz
To: gnats-bugs@NetBSD.org
Subject: fuse-exfat: broken on wedges: 10GB wedge seen as 4.7TB breaking all exfat commands
X-Send-Pr-Version: www-1.0

>Number:         57496
>Category:       pkg
>Synopsis:       fuse-exfat: broken on wedges: 10GB wedge seen as 4.7TB breaking all exfat commands
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    pkg-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jul 01 14:40:00 +0000 2023
>Last-Modified:  Sun Jul 02 11:20:01 +0000 2023
>Originator:     Henryk Paluch
>Release:        9.3-RELEASE + pkgsrc-2023Q2
>Organization:
N/A
>Environment:
NetBSD localhost 9.3 NetBSD 9.3 (GENERIC) #0: Thu Aug  4 15:30:37 UTC 2022  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
When formatting and using an exFAT partition on a wedge (GPT in my case), all utilities (mkexfatfs, exfatfsck, mount.exfat-fuse) see the wedge with a wildly incorrect size (4.7TB instead of 10GB in my case) and thus choke on any useful operation.

The problem is a bug in libexfat/io.c, which treats a wedge as a disk (or disklabel) and thus computes an incorrect device size for the wedge.


>How-To-Repeat:
1. Create a GPT partition for exFAT, for example (adjust it to your environment):

gpt add -a 4k -t windows -l exfat1 -b 43057216 ld0

(Replace 43057216 with the first free block in your GPT table and "ld0" with your disk device.)
Please note that the minimum exFAT size is 1GB or so.

2. Check the new wedge that corresponds to the new exFAT partition in the GPT table (replace "ld0" with your disk name).

dkctl ld0 listwedges

  /dev/rld0: 3 wedges:
  ...
  dk2: exfat1, 19857304 blocks at 43057216, type: ntfs

(Only the new wedge for exFAT is listed.) In my case the wedge is 10GB in size.

3. Install the fuse-exfat package:

pkgin in fuse-exfat

In my case the package version is fuse-exfat-1.3.0nb1.

4. Format the new wedge as exFAT (please note that libexfat only accepts the block device; it internally replaces it with the raw device):

mkexfatfs -n EXFAT1 /dev/dk2

5. No error is reported, but the formatted filesystem has a totally wrong size. You can verify this in two ways:

a) using exfatfsck, for example:

exfatfsck -n /dev/dk2

exfatfsck 1.3.0
Checking file system on /dev/dk2.
File system version           1.0
Sector size                 512 bytes
Cluster size                128 KB
Volume size                4848 GB
Used space                  368 MB
Available space            4848 GB
Totally 2 directories and 21 files.
File system checking finished. No errors found.

Note the wrong "Volume size" and "Available space" values (4848 GB, about 4.7TB); they should be just 10GB.

b) mounting this filesystem and running the `df` command:

mkdir -p /mnt/exfat
mount.exfat-fuse /dev/dk2 /mnt/exfat/

df -h /mnt/exfat

Filesystem         Size       Used      Avail %Cap Mounted on
/dev/puffs         4.7T       157M       4.7T   0% /mnt/exfat

Notice the totally bogus Size (again around 4.7TB) and Avail values. If you attempt to write a lot of data to the filesystem, you will be completely screwed...


>Fix:
I created this patch (against the already patched pkgsrc-2023Q2 sources), which fixes the problem for me. However, I'm new to NetBSD, so I need a mentor to review this patch and a volunteer to test it (at the risk of data loss):

--- libexfat/io.c.orig	2023-07-01 15:28:46.089450095 +0200
+++ libexfat/io.c	2023-07-01 16:11:06.598265322 +0200
@@ -36,7 +36,9 @@
 #include <sys/dkio.h>
 #include <sys/ioctl.h>
 #elif defined(__NetBSD__)
+#include <sys/disk.h>
 #include <sys/param.h>
+#include <sys/ioctl.h>
 #include <util.h>
 #include "nbpartutil.h"
 #elif __linux__
@@ -239,11 +241,40 @@ struct exfat_dev* exfat_open(const char*
 			char device[MAXPATHLEN];
 			u_int secsize;
 			off_t dksize;
+			int err;
+			const char *WEDGE_PREFIX = "/dev/rdk";

 			/* mkexfatfs can only use the block device, but */
 			/* getdisksize() needs the raw device name      */
-			getdiskrawname(device, sizeof(device), spec);
-			getdisksize(device, &secsize, &dksize);
+			if (getdiskrawname(device, sizeof(device), spec)==NULL){
+				exfat_error("getdiskrawname('%s'): %s",spec,strerror(errno));
+				return NULL;
+			}
+
+			err = getdisksize(device, &secsize, &dksize);
+			if (err){
+				exfat_error("getdisksize('%s'): error=%d",device,err);
+				return NULL;
+			}
+
+			/* NOTE: dksize is incorrect for wedges - we have to use dedicated ioctl */
+			if (strncmp(device,WEDGE_PREFIX,strlen(WEDGE_PREFIX)) == 0){
+				int fd;
+				struct dkwedge_info dkw;
+				if ((fd = open_ro(device)) == -1){
+					exfat_error("wedge: open_ro('%s'): %s",
+							device,strerror(errno));
+					return NULL;
+				}
+				if (ioctl(fd, DIOCGWEDGEINFO, &dkw) < 0) {
+					exfat_error("wedge: ioctl('%s',DIOCGWEDGEINFO): %s",
+							device,strerror(errno));
+					close(fd);
+					return NULL;
+				}
+				close(fd);
+				dksize = dkw.dkw_size;
+			}
 			dev->size = secsize * dksize;
 		}
 		if (dev->size <= 0) {

>Audit-Trail:
From: Henryk Paluch <hpaluch@seznam.cz>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Subject: Re: pkg/57496
Date: Sun, 2 Jul 2023 09:47:47 +0200

 Hello!

 Actually there is a bug for all partition types, not just wedges: the
 code scales the already scaled partition size a second time.

 Here is the proper fix:

 --- /home/builder/orig/io.c	2023-07-01 08:28:46.534337339 +0000
 +++ libexfat/io.c	2023-07-02 07:39:07.883316521 +0000
 @@ -244,7 +244,7 @@ struct exfat_dev* exfat_open(const char*
   			/* getdisksize() needs the raw device name      */
   			getdiskrawname(device, sizeof(device), spec);
   			getdisksize(device, &secsize, &dksize);
 -			dev->size = secsize * dksize;
 +			dev->size = dksize;
   		}
   		if (dev->size <= 0) {
   			exfat_error("Unable to determine file system size");
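
The double scaling can be cross-checked against the numbers in the original report. The following is only a rough sketch, assuming 512-byte sectors (as printed by exfatfsck) and the 19857304-block wedge shown by dkctl; the function names are made up for illustration and are not part of libexfat.

```c
#include <stdint.h>

/* Assumed sector size; exfatfsck in this report prints 512 bytes. */
#define SECTOR_SIZE 512u

/* Size in bytes of a wedge that spans the given number of sectors. */
uint64_t wedge_bytes(uint64_t sectors)
{
	return sectors * SECTOR_SIZE;
}

/* Result of multiplying an already byte-sized value by the sector size again. */
uint64_t double_scaled_bytes(uint64_t sectors)
{
	return wedge_bytes(sectors) * SECTOR_SIZE;
}

/* Convert bytes to whole GiB (2^30 bytes), rounding down. */
uint64_t to_gib(uint64_t bytes)
{
	return bytes >> 30;
}
```

With the values from this report, wedge_bytes(19857304) is 10166939648 (roughly 10GB in decimal units), while to_gib(double_scaled_bytes(19857304)) is 4847, which is consistent with the 4848 GB that exfatfsck prints, presumably after rounding.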

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: Henryk Paluch <hpaluch@seznam.cz>
Subject: Re: Subject: Re: pkg/57496
Date: Sun, 02 Jul 2023 18:16:38 +0700

 I suspect you're right.

 This might have been caused by an unfortunate disparity between
 the NetBSD kernel's internal getdisksize() routine, and the one
 essentially all of userland shares, which comes from src/sbin/fsck/partutil.c

 Both of them have 3 params, the first identifies the disk in question
 (quite differently, but that doesn't matter), the other two are pointers
 to where the two results are to be stored.

 The kernel's version returns the sector size, and the number of sectors.

 partutil's version returns the sector size, and the media size (the product
 of the number of sectors and the sector size).

 With the kernel version, multiplying the two results is the correct way
 to determine the total size, but as you determined, is completely the wrong
 thing to do for the partutil version (which instead requires the user to
 divide the media size by the sector size, if the number of sectors is wanted).

 The partutil version is included (by file copy, effectively) in fuse-exfat in
 	patches/patch-libexfat__nbpartutil.[ch]
 But the patch in
 	patches/patch-libexfat__io.c

 is assuming that it works more like the kernel version (though correctly
 using the partutil version parameter order).   The latter patch needs to be
 regenerated with the change you identified.

 kre
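
The difference between the two conventions described above can be sketched as follows. This is a minimal illustration with hypothetical helper names, not the actual kernel or partutil code: the first helper fits a routine that reports a sector count, the other two fit a routine that reports a media size in bytes.

```c
#include <stdint.h>

/*
 * Kernel-style result: sector size and NUMBER OF SECTORS.
 * The total size is the product of the two.
 */
int64_t total_from_sector_count(uint32_t secsize, int64_t numsec)
{
	return (int64_t)secsize * numsec;
}

/*
 * partutil-style result: sector size and MEDIA SIZE in bytes.
 * The media size already is the total; do not multiply it again.
 */
int64_t total_from_media_size(uint32_t secsize, int64_t mediasize)
{
	(void)secsize; /* not needed for the total */
	return mediasize;
}

/*
 * partutil-style result: derive the sector count by dividing.
 * Returns -1 if the sector size is zero.
 */
int64_t sectors_from_media_size(uint32_t secsize, int64_t mediasize)
{
	if (secsize == 0)
		return -1;
	return mediasize / secsize;
}
```

Mixing the two, that is applying the multiplication to a media size, yields a value that is too large by a factor of the sector size.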
