NetBSD Problem Report #57475
From www@netbsd.org Tue Jun 20 05:36:10 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id F01F11A923E
for <gnats-bugs@gnats.NetBSD.org>; Tue, 20 Jun 2023 05:36:09 +0000 (UTC)
Message-Id: <20230620053538.6A7831A9241@mollari.NetBSD.org>
Date: Tue, 20 Jun 2023 05:35:38 +0000 (UTC)
From: mp@petermann-it.de
Reply-To: mp@petermann-it.de
To: gnats-bugs@NetBSD.org
Subject: Error message ffs_snapshot_mount: vget failed 2 after fsck
X-Send-Pr-Version: www-1.0
>Number: 57475
>Category: kern
>Synopsis: Error message ffs_snapshot_mount: vget failed 2 after fsck
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Jun 20 05:40:00 +0000 2023
>Last-Modified: Tue Jun 20 20:15:01 +0000 2023
>Originator: Matthias Petermann
>Release: 10.0_BETA
>Organization:
>Environment:
NetBSD vhost2.lan 10.0_BETA NetBSD 10.0_BETA (XEN3_DOM0) #0: Sun Jun 11 22:32:13 UTC 2023 root@ws.local:/build/netbsd-10/obj/sys/arch/amd64/compile/XEN3_DOM0 amd64
>Description:
I used fssconfig to create a persistent snapshot at /data/.snap.
Afterwards, several actions caused the system to crash. I then
ran fsck, which, among other things, removed the file /data/.snap.
Since then, the following message appears when the file system is
mounted:
[ 12.132674] ffs_snapshot_mount: vget failed 2
The message seems to originate from sys/ufs/ffs/ffs_snapshot.c,
when a snapshot is registered in the superblock and the corresponding
inode no longer exists.
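As a rough illustration of how to read the number in that message: assuming the trailing "2" is a plain errno value (an assumption, not something the report confirms), it can be decoded with Python's standard errno tables. The variable names here are illustrative only.

```python
import errno
import os

# Assumption: the "2" in "ffs_snapshot_mount: vget failed 2" is a raw errno.
code = 2

# Look up the symbolic name and the human-readable description.
name = errno.errorcode.get(code, "unknown")
print(f"{code} -> {name}: {os.strerror(code)}")
```

Under that assumption the code maps to ENOENT, which would fit the reading above that the inode recorded for the snapshot no longer exists.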
However, fsdb still shows an entry ".snap" of type unknown in the
root directory:
vhost2$ doas fsdb -n /dev/rdk3
** /dev/rdk3 (NO WRITE)
** File system is journaled; replaying journal
CANNOT REPLAY JOURNAL IN -n MODE; continuing anyway
Editing file system `/dev/rdk3'
Last Mounted on /data
current inode: directory
I=2 MODE=40755 SIZE=512 EXTSIZE=0
MTIME=Jun 19 18:01:25 2023 [707192237 nsec]
CTIME=Jun 19 18:01:25 2023 [707192237 nsec]
ATIME=Jun 20 04:15:14 2023 [239283678 nsec]
BIRTHTIME=Jun 14 09:44:25 2023 [727546000 nsec]
OWNER=root GRP=wheel LINKCNT=4 FLAGS=0x0 BLKCNT=0x8 GEN=0x4a3fafc8
fsdb (inum: 2)> ls
slot 0 off 0 ino 2 reclen 12: directory, `.'
slot 1 off 12 ino 2 reclen 12: directory, `..'
slot 2 off 24 ino 26797056 reclen 20: directory, `prometheus'
slot 3 off 44 ino 36166656 reclen 12: directory, `vhd'
slot 4 off 56 ino 0 reclen 456: unknown, `.snap'
fsdb (inum: 2)>
Besides the actual bug report, there are two points I'd like to ask for help with:
- What can I do to get rid of the "unknown" .snap entry in the root directory
of the FS without damaging the Filesystem?
- How can I unregister the snapshot from the superblock to get rid of the "vget failed 2"
message?
>How-To-Repeat:
I am not sure whether all of this contributes to the actual problem, but here is the bigger picture:
1) Set up an FFSv2 file system with WAPBL (/data)
2) Create some sparse vnd images (/data/vhd1.img, /data/vhd2.img ...)
3) Configure the vnd images so that they become available as vnd1....vndX (implicitly by the Xen Dom0 xbd backend)
4) Format the vnd images with FFSv2 and WAPBL (within a Xen DomU)
5) Create a persistent snapshot of /data as /data/.snap (fss0)
6) Mount the snapshot read-only on /mnt
7) Configure the vnd images from the snapshot under /tmp/vhd1.img, /tmp/vhd2.img ... so that they become vndX+1, vndX+2 ...
8) Use /sbin/dump to dump the file systems from vndX+1, vndX+2
While dump is running, the system crashes. After booting into single-user mode and running fsck_ffs, the file system can be recovered. At this point /data/.snap gets unlinked by fsck. At the next boot, the "vget failed 2" message appears in the kernel log.
>Fix:
n/a
>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/57475: Error message ffs_snapshot_mount: vget failed 2 after fsck
Date: Wed, 21 Jun 2023 03:10:39 +0700
Date: Tue, 20 Jun 2023 05:40:01 +0000 (UTC)
From: mp@petermann-it.de
Message-ID: <20230620054001.335621A9242@mollari.NetBSD.org>
I cannot comment on the problem with the snapshot, nor with why
what you're doing is causing a panic in the first place, but this
one is easy:
| - What can I do to get rid of the "unknown" .snap entry in the root
| directory of the FS without damaging the Filesystem?
Nothing.
It is already gone. The fsdb entry:
slot 4 off 56 ino 0 reclen 456: unknown, `.snap'
is simply describing the free space at the end of the directory (that's
what "ino 0" tells you, a free record), the reclen is also a hint, a
directory entry for a filename that long would probably have a reclen
of 16 (certainly >= 12 and <= 20, and a multiple of 4). But instead
it is the whole rest of the block (512 - 56) ... 56 is its offset.
That the data field happens to contain ".snap" is irrelevant, that's
just what was there last. If you really want that string to go away,
simply create "/data/foobar" (irrelevant what that is) and then delete
it again, and the ".snap" in the empty (unused) space will be changed
to "foobar".
kre
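The reclen arithmetic in the reply above can be checked with a small sketch. It assumes the classic 4.4BSD-style directory entry layout (an 8-byte header followed by the NUL-terminated name, padded to a multiple of 4 bytes) and the 512-byte directory block implied by "512 - 56"; the helper name `min_reclen` and the constants are illustrative, not taken from the NetBSD sources.

```python
# Assumption: classic 4.4BSD-style directory entry: an 8-byte header
# (d_ino 4, d_reclen 2, d_type 1, d_namlen 1) followed by the
# NUL-terminated name, padded to a multiple of 4 bytes.
HEADER_SIZE = 8
DIRBLKSIZ = 512  # directory block size implied by "512 - 56" in the reply


def min_reclen(name: str) -> int:
    """Smallest record length that can hold `name` under the assumed layout."""
    namlen = len(name.encode())
    return HEADER_SIZE + ((namlen + 1 + 3) & ~3)


# (offset, reclen, name) as printed by fsdb's "ls" in the report.
listing = [
    (0, 12, "."),
    (12, 12, ".."),
    (24, 20, "prometheus"),
    (44, 12, "vhd"),
    (56, 456, ".snap"),
]

for off, reclen, name in listing:
    needed = min_reclen(name)
    verdict = "exact fit" if reclen == needed else "larger than needed"
    print(f"off {off:3d} reclen {reclen:3d} min {needed:2d} {name!r}: {verdict}")

# The last record starts at offset 56 and runs to the end of the block.
free_tail = DIRBLKSIZ - listing[-1][0]
print("bytes from offset 56 to end of block:", free_tail)
```

Under these assumptions the four live entries fit their names exactly, while the ".snap" record needs only 16 bytes but spans the remaining 456, which is consistent with it being free space rather than a live entry.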
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.