NetBSD Problem Report #57475

From www@netbsd.org  Tue Jun 20 05:36:10 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id F01F11A923E
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 20 Jun 2023 05:36:09 +0000 (UTC)
Message-Id: <20230620053538.6A7831A9241@mollari.NetBSD.org>
Date: Tue, 20 Jun 2023 05:35:38 +0000 (UTC)
From: mp@petermann-it.de
Reply-To: mp@petermann-it.de
To: gnats-bugs@NetBSD.org
Subject: Error message ffs_snapshot_mount: vget failed 2 after fsck
X-Send-Pr-Version: www-1.0

>Number:         57475
>Category:       kern
>Synopsis:       Error message ffs_snapshot_mount: vget failed 2 after fsck
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jun 20 05:40:00 +0000 2023
>Last-Modified:  Tue Jun 20 20:15:01 +0000 2023
>Originator:     Matthias Petermann
>Release:        10.0_BETA
>Organization:
>Environment:
NetBSD vhost2.lan 10.0_BETA NetBSD 10.0_BETA (XEN3_DOM0) #0: Sun Jun 11 22:32:13 UTC 2023  root@ws.local:/build/netbsd-10/obj/sys/arch/amd64/compile/XEN3_DOM0 amd64
>Description:
I used fssconfig to create a persistent snapshot at /data/.snap.
Afterwards, several actions caused the system to crash. I then ran
fsck, which, among other things, removed the file /data/.snap. Since
then, the following message appears when mounting the file system:

[ 12.132674] ffs_snapshot_mount: vget failed 2

The message seems to originate from sys/ufs/ffs/ffs_snapshot.c,
when a snapshot is registered in the superblock and the corresponding
inode no longer exists.
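
As a small aid for reading the message, the sketch below decodes the trailing
number. It assumes that the "2" printed by the kernel is a standard errno
value, which would make it ENOENT and fit the explanation above. The helper
name describe_errno is hypothetical and is not part of any NetBSD tool.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return the symbolic name and system message for an errno value."""
    # errno.errorcode maps numeric values to names such as "ENOENT".
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({code}): {os.strerror(code)}"


# Assumption: the number in "vget failed 2" is an errno value.
print(describe_errno(2))
```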

However, fsdb still shows an entry ".snap" of type unknown in the
root directory:

vhost2$ doas fsdb -n /dev/rdk3
** /dev/rdk3 (NO WRITE)
** File system is journaled; replaying journal
CANNOT REPLAY JOURNAL IN -n MODE; continuing anyway
Editing file system `/dev/rdk3'
Last Mounted on /data
current inode: directory
I=2 MODE=40755 SIZE=512 EXTSIZE=0
            MTIME=Jun 19 18:01:25 2023 [707192237 nsec]
            CTIME=Jun 19 18:01:25 2023 [707192237 nsec]
            ATIME=Jun 20 04:15:14 2023 [239283678 nsec]
        BIRTHTIME=Jun 14 09:44:25 2023 [727546000 nsec]
OWNER=root GRP=wheel LINKCNT=4 FLAGS=0x0 BLKCNT=0x8 GEN=0x4a3fafc8
fsdb (inum: 2)> ls
slot 0 off 0 ino 2 reclen 12: directory, `.'
slot 1 off 12 ino 2 reclen 12: directory, `..'
slot 2 off 24 ino 26797056 reclen 20: directory, `prometheus'
slot 3 off 44 ino 36166656 reclen 12: directory, `vhd'
slot 4 off 56 ino 0 reclen 456: unknown, `.snap'
fsdb (inum: 2)>

Besides the bug report itself, there are two points I'd like to ask for help with:

- What can I do to get rid of the "unknown" .snap entry in the root directory
of the FS without damaging the Filesystem?

- How can I unregister the snapshot from the superblock to get rid of the "vget failed 2"
message?
>How-To-Repeat:
Not sure if all of this contributes to the actual problem, but here is the bigger picture:

1) Set up an FFSv2 file system with WAPBL (/data)
2) Create some sparse vnd images (/data/vhd1.img, /data/vhd2.img ...)
3) Configure the vnd images so that they become available as vnd1...vndX (implicitly by Xen Dom0 xbd backend)
4) Format the vnd images with FFSv2 and WAPBL (within Xen DomU)
5) Create a persistent snapshot of /data as /data/.snap (fss0)
6) Mount the snapshot read-only on /mnt
7) Configure the vnd images from the snapshot under /tmp/vhd1.img, /tmp/vhd2.img ... so they become vndX+1, vndX+2 ...
8) Use /sbin/dump to dump the file systems from vndX+1, vndX+2

While dump is running, the system crashes. After booting into single user mode and doing fsck_ffs, the filesystem can be recovered. At this point /data/.snap gets unlinked by fsck. At the next boot, the "vget failed 2" message occurs in the kernel log.
>Fix:
n/a

>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/57475: Error message ffs_snapshot_mount: vget failed 2 after fsck
Date: Wed, 21 Jun 2023 03:10:39 +0700

     Date:        Tue, 20 Jun 2023 05:40:01 +0000 (UTC)
     From:        mp@petermann-it.de
     Message-ID:  <20230620054001.335621A9242@mollari.NetBSD.org>

 I cannot comment on the problem with the snapshot, nor with why
 what you're doing is causing a panic in the first place, but this
 one is easy:

   | - What can I do to get rid of the "unknown" .snap entry in the root
   |   directory of the FS without damaging the Filesystem?

 Nothing.

 It is already gone.   The fsdb entry:

 slot 4 off 56 ino 0 reclen 456: unknown, `.snap'

 is simply describing the free space at the end of the directory (that's
 what "ino 0" tells you, a free record), the reclen is also a hint, a
 directory entry for a filename that long would probably have a reclen
 of 16 (certainly >= 12 and <= 20, and a multiple of 4).   But instead
 it is the whole rest of the block (512 - 56) ... 56 is its offset.

 That the data field happens to contain ".snap" is irrelevant, that's
 just what was there last.    If you really want that string to go away,
 simply create "/data/foobar" (irrelevant what that is) and then delete
 it again, and the ".snap" in the empty (unused) space will be changed
 to "foobar".

 kre
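
The arithmetic in the reply above can be checked with a short sketch. It
assumes the usual FFS directory entry layout (an 8-byte header holding the
inode number, record length, type and name length, followed by the
NUL-terminated name, padded to a multiple of 4) and a 512-byte directory
block, as implied by "512 - 56". The names DIRBLKSIZ_ASSUMED, HEADER_SIZE and
min_reclen are illustrative only; the slot data is copied from the fsdb
listing in the report.

```python
DIRBLKSIZ_ASSUMED = 512  # directory block size implied by "512 - 56"
HEADER_SIZE = 8          # assumed: 4-byte inode, 2-byte reclen, 1-byte type, 1-byte namlen


def min_reclen(name: str) -> int:
    """Smallest record holding header, name and a NUL, rounded up to 4 bytes."""
    raw = HEADER_SIZE + len(name.encode()) + 1
    return (raw + 3) & ~3


# (offset, inode, reclen, name) as printed by "ls" in fsdb
slots = [
    (0, 2, 12, "."),
    (12, 2, 12, ".."),
    (24, 26797056, 20, "prometheus"),
    (44, 36166656, 12, "vhd"),
    (56, 0, 456, ".snap"),
]

for off, ino, reclen, name in slots:
    state = "free" if ino == 0 else "in use"
    # A record much larger than min_reclen carries trailing free space.
    print(f"off {off:3d} reclen {reclen:3d} min {min_reclen(name):2d} {state:6s} {name}")
```

Under these assumptions, every in-use entry has exactly its minimum record
length, while the last record has inode 0 and extends from offset 56 to the
end of the block, which matches the description of it as free space.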
