NetBSD Problem Report #55735

From bernd@bor.bersie.loc  Mon Oct 19 12:59:55 2020
Return-Path: <bernd@bor.bersie.loc>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id F309E1A9217
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 19 Oct 2020 12:59:54 +0000 (UTC)
Message-Id: <20201019114156.3DC851B8AF8@bor.bersie.loc>
Date: Mon, 19 Oct 2020 13:41:56 +0200 (CEST)
From: bernd.sieker@posteo.net
Reply-To: bernd.sieker@posteo.net
To: gnats-bugs@NetBSD.org
Subject: union fs on top of nfs causes kernel panic
X-Send-Pr-Version: 3.95

>Number:         55735
>Category:       kern
>Synopsis:       union fs on top of nfs causes kernel panic
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 19 13:00:01 +0000 2020
>Last-Modified:  Sun May 30 07:10:01 +0000 2021
>Originator:     bernd.sieker@posteo.net
>Release:        NetBSD 9.1
>Organization:

>Environment:
System: NetBSD bor.bersie.loc 9.1 NetBSD 9.1 (BOR) #8: Fri Oct 16 18:32:47 CEST 2020 bernd@bor.bersie.loc:/usr/src/sys/arch/amd64/compile/BOR amd64
Architecture: x86_64
Machine: amd64
>Description:
A custom-built 9.1 kernel, including zfs (module), union and nfs file systems occasionally panics when a union filesystem is mounted on top of an nfs-mounted filesystem.

The machine is a PowerEdge T110 server with a 4-core Intel Xeon X3430 at 2.4 GHz, with 4 GB ECC RAM.

The machine also uses zfs, so I cannot rule that out as a contributing factor, but crashes have only happened (4 times so far within a few days) while the union filesystem was mounted, and the tracebacks show union- and nfs-related calls.

Here is an excerpt from fstab:

  niob:/usr/source               /usr/source             nfs     rw
  /usr/source/pkgsrc             /usr/pkgsrc             union   rw,-b

The zfs module is loaded from /etc/modules.conf along with the solaris module, and one raidz1 zpool is created on three disks.

A typical traceback from the crashdump follows:

[ 13176.105223] uvm_fault(0xffffa01f0e5faa10, 0x0, 1) -> e
[ 13176.105223] fatal page fault in supervisor mode
[ 13176.105223] trap type 6 code 0 rip 0xffffffff8055e9da cs 0x8 rflags 0x10246 cr2 0x56 ilevel 0 rsp 0xffffa300684aa670
[ 13176.105223] curlwp 0xffffa01ff342c2c0 pid 1180.1 lowest kstack 0xffffa300684a82c0
[ 13176.105223] panic: trap
[ 13176.105223] cpu1: Begin traceback...
[ 13176.105223] vpanic() at netbsd:vpanic+0x143
[ 13176.105223] snprintf() at netbsd:snprintf
[ 13176.105223] startlwp() at netbsd:startlwp
[ 13176.105223] alltraps() at netbsd:alltraps+0xbb
[ 13176.105223] nfs_request() at netbsd:nfs_request+0x18d
[ 13176.115228] nfs_getattr() at netbsd:nfs_getattr+0x16e
[ 13176.115228] VOP_GETATTR() at netbsd:VOP_GETATTR+0x53
[ 13176.115228] union_loadvnode() at netbsd:union_loadvnode+0x161
[ 13176.115228] vcache_get() at netbsd:vcache_get+0x1d6
[ 13176.115228] union_allocvp() at netbsd:union_allocvp+0x218
[ 13176.115228] union_root() at netbsd:union_root+0x52
[ 13176.115228] VFS_ROOT() at netbsd:VFS_ROOT+0x1c
[ 13176.115228] lookup_once() at netbsd:lookup_once+0x262
[ 13176.115228] namei_tryemulroot() at netbsd:namei_tryemulroot+0x32e
[ 13176.115228] namei() at netbsd:namei+0x41
[ 13176.125231] fd_nameiat.isra.2() at netbsd:fd_nameiat.isra.2+0x54
[ 13176.125231] do_sys_statat() at netbsd:do_sys_statat+0x77
[ 13176.125231] sys___stat50() at netbsd:sys___stat50+0x28
[ 13176.125231] syscall() at netbsd:syscall+0x13f
[ 13176.125231] --- syscall (number 439) ---
[ 13176.125231] 70bdb056231a:
[ 13176.125231] cpu1: End traceback...


>How-To-Repeat:
Mount a union filesystem on top of an nfs-mounted filesystem and access it. It panics occasionally, but the exact circumstances are not known. In some cases the panic happened when running "df -h", but not every time.
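
The steps above can be sketched as a small script. This is only a sketch, not a confirmed reproducer: the panic is intermittent and the script may never trigger it. The host and paths come from the fstab excerpt in the Description; the helper names run and repro and the variables DRYRUN and ROUNDS are hypothetical. By default it only prints the commands; set DRYRUN=no and run it as root to execute them.

```shell
#!/bin/sh
# Hypothetical reproduction helper; NOT a confirmed reproducer.
# Host and paths are taken from the fstab excerpt in this report.
# run, repro, DRYRUN and ROUNDS are illustrative names only.

DRYRUN=${DRYRUN:-yes}

# Print the command in dry-run mode, otherwise execute it.
run() {
	if [ "$DRYRUN" = "yes" ]; then
		echo "$*"
	else
		"$@"
	fi
}

repro() {
	# NFS mount that the union mount takes its other layer from.
	run mount -t nfs niob:/usr/source /usr/source
	# Union mount with the same options as the fstab entry (rw,-b).
	run mount -t union -o rw,-b /usr/source/pkgsrc /usr/pkgsrc
	# Repeatedly look up the union mount point; the traceback shows
	# the panic inside a stat call, and "df -h" sometimes triggered it.
	i=0
	while [ "$i" -lt "${ROUNDS:-3}" ]; do
		run df -h /usr/pkgsrc
		run ls -ld /usr/pkgsrc
		i=$((i + 1))
	done
}

repro
```

Raising ROUNDS repeats the lookups more often. Whether repetition alone makes the panic more likely is not known.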
>Fix:
Unknown

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55735: union fs on top of nfs causes kernel panic
Date: Sun, 30 May 2021 07:09:29 +0000

 On Mon, Oct 19, 2020 at 01:00:01PM +0000, bernd.sieker@posteo.net wrote:
  > A custom-built 9.1 kernel, including zfs (module), union and nfs
  > file systems occasionally panics when a union filesystem is mounted
  > on top of an nfs-mounted filesystem.

 so... unfortunately panics from onionfs are not exactly unexpected,
 though this one seems a bit odd.

  > [ 13176.105223] alltraps() at netbsd:alltraps+0xbb
  > [ 13176.105223] nfs_request() at netbsd:nfs_request+0x18d

 Do you have debug info? If you can figure out which line of
 nfs_request() that is, it would be helpful, as nfs_request() is about
 450 lines of fairly messy code.

 -- 
 David A. Holland
 dholland@netbsd.org
