NetBSD Problem Report #55055

From mpumford@mudcovered.org.uk  Sat Mar  7 15:55:05 2020
Return-Path: <mpumford@mudcovered.org.uk>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 08E501A9218
	for <gnats-bugs@gnats.NetBSD.org>; Sat,  7 Mar 2020 15:55:04 +0000 (UTC)
Message-Id: <20200307155459.75D68BE2BE9@trigati.mudcovered.org.uk>
Date: Sat,  7 Mar 2020 15:54:59 +0000 (GMT)
From: mpumford@mudcovered.org.uk
Reply-To: mpumford@mudcovered.org.uk
To: gnats-bugs@NetBSD.org
Subject: Panic running dump on snapshot
X-Send-Pr-Version: 3.95

>Number:         55055
>Category:       kern
>Synopsis:       Panic running dump on snapshot
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Mar 07 16:00:00 +0000 2020
>Originator:     Mike Pumford
>Release:        NetBSD 9.0_STABLE
>Organization:
>Environment:
System: NetBSD 9.0_STABLE (GENERIC) #0: Sat Feb 29 09:47:11 GMT 2020
Architecture: x86_64
Machine: amd64
>Description:
[ 282464.801538] panic: ffs_blkfree_snap: bad size: dev = 0xa803, bno = 514 bsize = 32768, size = 32768, fs = /work
[ 282464.801538] cpu0: Begin traceback...
[ 282464.801538] vpanic() at netbsd:vpanic+0x160
[ 282464.801538] snprintf() at netbsd:snprintf
[ 282464.801538] ffs_mapsearch() at netbsd:ffs_mapsearch
[ 282464.801538] ffs_blkfree_snap() at netbsd:ffs_blkfree_snap+0xb5
[ 282464.811542] mapacct() at netbsd:mapacct+0x104
[ 282464.811542] expunge() at netbsd:expunge+0x2eb
[ 282464.811542] ffs_snapshot() at netbsd:ffs_snapshot+0xfbb
[ 282464.811542] VFS_SNAPSHOT() at netbsd:VFS_SNAPSHOT+0x38
[ 282464.811542] fss_ioctl() at netbsd:fss_ioctl+0xa4f
[ 282464.811542] VOP_IOCTL() at netbsd:VOP_IOCTL+0x54
[ 282464.811542] vn_ioctl() at netbsd:vn_ioctl+0xa5
[ 282464.811542] sys_ioctl() at netbsd:sys_ioctl+0x5ab
[ 282464.811542] syscall() at netbsd:syscall+0x157
[ 282464.811542] --- syscall (number 54) ---
[ 282464.811542] 7afc7a7681ba:
[ 282464.811542] cpu0: End traceback...

I have the kernel core dump from this panic, so I can provide further diagnostics.
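
For anyone triaging the panic line, the fields can be pulled apart mechanically. The following is a minimal sketch, not part of the original report: it parses the panic message quoted above and splits the dev value into major and minor numbers, assuming the bit layout of NetBSD's major()/minor() macros (major in bits 8-19, minor in bits 0-7 plus bits 20-31). The helper names are hypothetical, and mapping the major number to a driver name still requires the kernel's majors table.

```python
import re

# The panic line exactly as it appears in the report.
PANIC_LINE = (
    "panic: ffs_blkfree_snap: bad size: dev = 0xa803, bno = 514 "
    "bsize = 32768, size = 32768, fs = /work"
)

# Hypothetical helper: tolerant of the missing comma after "bno = 514".
PANIC_RE = re.compile(
    r"dev = (?P<dev>0x[0-9a-fA-F]+),?\s+"
    r"bno = (?P<bno>\d+),?\s+"
    r"bsize = (?P<bsize>\d+),?\s+"
    r"size = (?P<size>\d+),?\s+"
    r"fs = (?P<fs>\S+)"
)


def netbsd_major(dev: int) -> int:
    """Major number, assuming NetBSD's layout (bits 8-19 of dev_t)."""
    return (dev & 0x000FFF00) >> 8


def netbsd_minor(dev: int) -> int:
    """Minor number, assuming NetBSD's layout (bits 20-31 and bits 0-7)."""
    return ((dev & 0xFFF00000) >> 12) | (dev & 0x000000FF)


def parse_panic(line: str) -> dict:
    """Extract the numeric fields and mount point from the panic line."""
    m = PANIC_RE.search(line)
    if m is None:
        raise ValueError("not an ffs_blkfree_snap 'bad size' panic line")
    dev = int(m.group("dev"), 16)
    return {
        "dev": dev,
        "major": netbsd_major(dev),
        "minor": netbsd_minor(dev),
        "bno": int(m.group("bno")),
        "bsize": int(m.group("bsize")),
        "size": int(m.group("size")),
        "fs": m.group("fs"),
    }


if __name__ == "__main__":
    for key, value in parse_panic(PANIC_LINE).items():
        print(f"{key:>6}: {value}")
```

Under that assumed layout, dev 0xa803 decodes to major 168, minor 3. Note also that the reported size equals bsize, so the request was for a full block rather than a fragment.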

>How-To-Repeat:
I was running the following command:
/sbin/dump -2 -X -a -u -f - <mount point> |  | gzip -c >outfile

Source was an ffsv2 filesystem on:
[     1.045981] pci2: i/o space, memory space enabled, rd/line, wr/inv ok
[     1.045981] nvme0 at pci2 dev 0 function 0: vendor 144d product a808 (rev. 0x00)
[     1.045981] nvme0: NVMe 1.3
[     1.045981] nvme0: for admin queue interrupting at msix4 vec 0
[     1.045981] nvme0: Samsung SSD 970 EVO Plus 1TB, firmware 2B2QEXM7, serial S4EWNF0MA11810R

Destination was an ffsv2 filesystem on a RAIDframe RAID 0 set built from a pair of identical drives:
[     4.413792] wd0 at atabus3 drive 0
[     4.433798] wd0: <ST2000DM006-2DM164>
[     4.453806] wd0: drive supports 16-sector PIO transfers, LBA48 addressing
[     4.453806] wd0: 1863 GB, 3876021 cyl, 16 head, 63 sec, 512 bytes/sect x 3907029168 sectors
[     4.523836] wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133), WRITE DMA FUA, NCQ (32 tags)
[     4.523836] wd0(ahcisata1:2:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133) (using DMA), NCQ (31 tags)

There was no particular filesystem activity going on at the time, and the
filesystem being backed up had not seen a large amount of activity since the
previous level 1 dump the day before.
>Fix:

>Unformatted:
