NetBSD Problem Report #59153

From paul@whooppee.com  Fri Mar  7 22:43:46 2025
Return-Path: <paul@whooppee.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits)
	 client-signature RSA-PSS (2048 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 746681A9239
	for <gnats-bugs@gnats.NetBSD.org>; Fri,  7 Mar 2025 22:43:46 +0000 (UTC)
Message-Id: <20250307224313.74CA049A968@speedy.whooppee.com>
Date: Fri,  7 Mar 2025 14:43:13 -0800 (PST)
From: paul@whooppee.com
Reply-To: paul@whooppee.com
To: gnats-bugs@NetBSD.org
Subject: crash dump doesn't dump
X-Send-Pr-Version: 3.95

>Number:         59153
>Category:       kern
>Synopsis:       crash doesn't dump
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Mar 07 22:45:01 +0000 2025
>Last-Modified:  Sun Apr 13 07:55:01 +0000 2025
>Originator:     Paul Goyette
>Release:        NetBSD 10.99.12
>Organization:
+---------------------+--------------------------+----------------------+
| Paul Goyette (.sig) | PGP Key fingerprint:     | E-mail addresses:    |
| (Retired)           | 1B11 1849 721C 56C8 F63A | paul@whooppee.com    |
| Software Developer  | 6E2E 05FD 15CE 9F2D 5102 | pgoyette@netbsd.org  |
| & Network Engineer  |                          | pgoyette99@gmail.com |
+---------------------+--------------------------+----------------------+
>Environment:


System: NetBSD speedy.whooppee.com 10.99.12 NetBSD 10.99.12 (SPEEDY 2025-02-09 23:04:47 UTC) #0: Mon Feb 10 04:36:23 UTC 2025 paul@speedy.whooppee.com:/build/netbsd-local/obj/amd64/sys/arch/amd64/compile/SPEEDY amd64
Architecture: x86_64
Machine: amd64
>Description:
When the system panics, it tries to write a crash dump.  However, the
dump fails without a detailed error, and the failure is not recorded in
the dmesg buffer.

# swapctl -z
dump device is dk6

[     1.658122] ld0: GPT GUID: 4ebed702-001b-41b4-8b42-2d8886317b60
...
[     1.701091] dk6 at ld0: "NVME-dump", 629145600 blocks at 2411726848, type: swap
...
[   229.000000] dump [   1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,

>How-To-Repeat:
Just crash or panic the system.  Perhaps the dump partition needs to
reside on a GPT drive rather than an MBR one?

>Fix:
Please

>Release-Note:

>Audit-Trail:
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/59153: crash dump doesn't dump
Date: Sat, 8 Mar 2025 00:26:23 -0000 (UTC)

 paul@whooppee.com writes:

 >[     1.658122] ld0: GPT GUID: 4ebed702-001b-41b4-8b42-2d8886317b60
 >...
 >[     1.701091] dk6 at ld0: "NVME-dump", 629145600 blocks at 2411726848, type: swap
 >...
 >[   229.000000] dump [   1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,

 Maybe lucky that it didn't dump.

 While the dump interface works with daddr_t block addresses, the ld(4)
 backend takes the block number as an 'int'.

 The partition starts at block 2411726848, which exceeds INT_MAX
 (2147483647).  Truncated to an 'int' it becomes a negative value,
 which is then sign-extended when it is converted back to a daddr_t.

 static int
 ld_nvme_dump(struct ld_softc *ld, void *data, int blkno, int blkcnt)
 {
         struct ld_nvme_softc *sc = device_private(ld->sc_dv);

         return nvme_ns_dobio(sc->sc_nvme, sc->sc_nsid, sc,
             NULL, data, blkcnt * ld->sc_secsize,
             sc->sc_ld.sc_secsize, blkno,
             NVME_NS_CTX_F_POLL,
             ld_nvme_biodone);
 }

From: matthew green <mrg@eterna23.net>
To: gnats-bugs@netbsd.org, mlelstv@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org, paul@whooppee.com
Subject: re: kern/59153: crash dump doesn't dump
Date: Sat, 08 Mar 2025 16:27:29 +1100

 >  ld_nvme_dump(struct ld_softc *ld, void *data, int blkno, int blkcnt)

 wow.  this is so dangerous.

 can we even fix it in netbsd-9 and netbsd-10 properly?  i see
 many of the ld backends have modules.


 .mrg.

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/59153: crash dump doesn't dump
Date: Sat, 8 Mar 2025 05:51:31 -0000 (UTC)

 mrg@eterna23.net (matthew green) writes:

 >>  ld_nvme_dump(struct ld_softc *ld, void *data, int blkno, int blkcnt)

 >wow.  this is so dangerous.

 >can we even fix it in netbsd-9 and netbsd-10 properly?  i see
 >many of the ld backends have modules.


 Fixing it properly will be an incompatible change that would require
 some magic that you have to maintain forever.

 Otherwise, just prevent the crash and limit blkno to a positive
 32-bit value for -9 and -10. As good as before, but without
 the danger.

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/59153 crash doesn't dump (Fwd: CVS commit: src/sys/dev)
Date: Sun, 13 Apr 2025 11:17:00 +0900

 XXX
 pullup netbsd-10, netbsd-9

 -------- Forwarded Message --------
 Subject: CVS commit: src/sys/dev
 Date: Sat, 12 Apr 2025 07:30:01 +0000
 From: Michael van Elst <mlelstv@netbsd.org>
 Reply-To: source-changes-d@NetBSD.org
 To: source-changes-full@NetBSD.org

 Module Name:	src
 Committed By:	mlelstv
 Date:		Sat Apr 12 07:30:01 UTC 2025

 Modified Files:
 	src/sys/dev: ld.c

 Log Message:
 The ld sc_dump backend takes an 'int' as the disk address; fail when
 the disk address is outside the possible range of an int.


 To generate a diff of this commit:
 cvs rdiff -u -r1.114 -r1.115 src/sys/dev/ld.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.


From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59153 CVS commit: src/sys
Date: Sun, 13 Apr 2025 02:34:03 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Sun Apr 13 02:34:03 UTC 2025

 Modified Files:
 	src/sys/arch/usermode/dev: ld_thunkbus.c
 	src/sys/dev: ld.c ldvar.h
 	src/sys/dev/ata: ld_ataraid.c
 	src/sys/dev/i2o: ld_iop.c
 	src/sys/dev/ic: ld_aac.c ld_cac.c ld_icp.c ld_mlx.c ld_nvme.c
 	src/sys/dev/pci: ld_amr.c ld_twa.c ld_twe.c ld_virtio.c
 	src/sys/dev/sdmmc: ld_sdmmc.c

 Log Message:
 ld(4): Convert blkno argument for sc_dump() to daddr_t

 PR kern/59153

 (1) For backends that accept 64-bit block addresses, i.e.,
 nvme(4), virtio(4), aac(4), iop(4), and mainbus(usermode/4),
 this should enable dumping beyond 2Gi blocks.

 (2) The sdmmc(4) backend allows dumping up to the last block.

 (3) For other backends, the block address is handled as `int`.
 Some of them may support blocks up to 4Gi, but I do not have
 enough time to examine datasheets. So, continue to reject >2Gi
 blocks as before.

 XXX
 This is a KABI change, and cannot be pulled up into netbsd-{10,9}.

 XXX
 Compile-test only (for amd64/ALL) due to lack of large SSDs ;)

 Thanks mlelstv@ for discussion and careful review!!


 To generate a diff of this commit:
 cvs rdiff -u -r1.33 -r1.34 src/sys/arch/usermode/dev/ld_thunkbus.c
 cvs rdiff -u -r1.115 -r1.116 src/sys/dev/ld.c
 cvs rdiff -u -r1.36 -r1.37 src/sys/dev/ldvar.h
 cvs rdiff -u -r1.51 -r1.52 src/sys/dev/ata/ld_ataraid.c
 cvs rdiff -u -r1.41 -r1.42 src/sys/dev/i2o/ld_iop.c
 cvs rdiff -u -r1.31 -r1.32 src/sys/dev/ic/ld_aac.c src/sys/dev/ic/ld_cac.c
 cvs rdiff -u -r1.32 -r1.33 src/sys/dev/ic/ld_icp.c
 cvs rdiff -u -r1.23 -r1.24 src/sys/dev/ic/ld_mlx.c
 cvs rdiff -u -r1.25 -r1.26 src/sys/dev/ic/ld_nvme.c
 cvs rdiff -u -r1.25 -r1.26 src/sys/dev/pci/ld_amr.c
 cvs rdiff -u -r1.20 -r1.21 src/sys/dev/pci/ld_twa.c
 cvs rdiff -u -r1.40 -r1.41 src/sys/dev/pci/ld_twe.c
 cvs rdiff -u -r1.42 -r1.43 src/sys/dev/pci/ld_virtio.c
 cvs rdiff -u -r1.44 -r1.45 src/sys/dev/sdmmc/ld_sdmmc.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59153 CVS commit: src/sys/sys
Date: Sun, 13 Apr 2025 07:52:55 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Sun Apr 13 07:52:55 UTC 2025

 Modified Files:
 	src/sys/sys: param.h

 Log Message:
 Welcome to 10.99.13; bump for ld(4) change for PR kern/59153


 To generate a diff of this commit:
 cvs rdiff -u -r1.735 -r1.736 src/sys/sys/param.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
