NetBSD Problem Report #55607
From brad@anduin.eldar.org Tue Aug 25 20:30:23 2020
Return-Path: <brad@anduin.eldar.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id B64E01A923A
for <gnats-bugs@gnats.NetBSD.org>; Tue, 25 Aug 2020 20:30:23 +0000 (UTC)
Message-Id: <202008252030.07PKUIOw028746@anduin.eldar.org>
Date: Tue, 25 Aug 2020 16:30:18 -0400 (EDT)
From: brad@anduin.eldar.org
Reply-To: brad@anduin.eldar.org
To: gnats-bugs@NetBSD.org
Subject: Panic 9.0_STABLE Xen PV with ZFS module
X-Send-Pr-Version: 3.95
>Number: 55607
>Category: kern
>Synopsis: Panic 9.0_STABLE Xen PV with ZFS module
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Aug 25 20:35:00 +0000 2020
>Last-Modified: Mon Jan 17 18:40:00 +0000 2022
>Originator: brad@anduin.eldar.org
>Release: NetBSD 9.0_STABLE
>Organization:
eldar.org
>Environment:
System: NetBSD samwise.nat.eldar.org 9.0_STABLE NetBSD 9.0_STABLE (XEN3_DOMU) #0: Sat May 9 22:12:48 EDT 2020 brad@samwise.nat.eldar.org:/lhome/NetBSD_9_branch_20200509/amd64/OBJ/sys/arch/amd64/compile/XEN3_DOMU amd64
Architecture: x86_64
Machine: amd64
>Description:
I have been using ZFS on a 9.0_STABLE Xen PV system for some time
with pretty good results. The system was last updated on 2020-05-09
and everything worked. After updating the system to 2020-08-20, the
DOMU guest panics as soon as the zfs module loads:
[ 3.8500388] fatal page fault in supervisor mode
[ 3.8500388] trap type 6 code 0x2 rip 0xffffffff8022abba cs 0xe030 rflags 0x10202 cr2 0xca000000eb ilevel 0x8 rsp 0xffff8c00d2063c30
[ 3.8500388] curlwp 0xffff8c0007460040 pid 0.2 lowest kstack 0xffff8c00d20602c0
kernel: page fault trap, code=0
Stopped in pid 0.2 (system) at netbsd:xbd_handler+0x223: movl $0x5,20(%r13)
xbd_handler() at netbsd:xbd_handler+0x223
xen_intr_biglock_wrapper() at netbsd:xen_intr_biglock_wrapper+0x1d
evtchn_do_event() at netbsd:evtchn_do_event+0xe8
do_hypervisor_callback() at netbsd:do_hypervisor_callback+0x161
Xhypervisor_pvhvm_callback() at netbsd:Xhypervisor_pvhvm_callback+0xa0
idle_loop() at netbsd:idle_loop+0xb6
ds 16
es b910
fs 0
gs d3
rdi 6
rsi ffff8c00d3556000
rbp ffff8c00d2063ca0
rbx 1d
rdx 9a0
rcx 16
rax ffff8c000846b910
r8 698
r9 32
r10 0
r11 246
r12 0
r13 ca000000cb
r14 ffffffffffffffff
r15 ffff8c000846b000
rip ffffffff8022abba xbd_handler+0x223
cs e030
rflags 10202
rsp ffff8c00d2063c30
ss e02b
netbsd:xbd_handler+0x223: movl $0x5,20(%r13)
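As a cross-check of the dump above, the faulting address in cr2 can be compared with the operand of the faulting instruction. The sketch below assumes that ddb prints the displacement in "20(%r13)" in hexadecimal; that is an assumption, not something stated in this report, although the arithmetic is consistent with it. The helper names are made up for illustration. It also checks whether r13 lies in the upper canonical half of the address space, where amd64 kernel pointers such as rax and r15 in this dump live.

```python
# Values copied from the register dump in this report.
R13 = 0xca000000cb
CR2 = 0xca000000eb
RAX = 0xffff8c000846b910

# Assumption: ddb shows the displacement of "movl $0x5,20(%r13)" in hex.
DISPLACEMENT = 0x20


def effective_address(base: int, displacement: int) -> int:
    """Address of a base+displacement operand, wrapped to 64 bits."""
    return (base + displacement) & 0xFFFFFFFFFFFFFFFF


def in_upper_canonical_half(addr: int) -> bool:
    """True if bits 63..47 are all ones (48-bit virtual addressing)."""
    return (addr >> 47) == 0x1FFFF


target = effective_address(R13, DISPLACEMENT)
print("store target      :", hex(target))
print("matches cr2       :", target == CR2)
print("r13 in kernel half:", in_upper_canonical_half(R13))
print("rax in kernel half:", in_upper_canonical_half(RAX))
```

Under that assumption the store target equals cr2, and r13 is not in the kernel half of the address space, so xbd_handler appears to be dereferencing a value that is not a valid kernel pointer.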
The zfs module from 2020-05-09 appears to work with the 2020-08-20
DOMU kernel and doesn't panic, which may suggest that the problem
came in with a change to the module rather than to the kernel.
>How-To-Repeat:
Load the zfs module in a Xen PV DOMU running a very recent 9.0_STABLE
(2020-08-20 here).
>Fix:
Don't know, except that -current appears to work fine.
>Audit-Trail:
From: Christian Kujau <lists@nerdbynature.de>
To: gnats-bugs@NetBSD.org
Cc: lists@nerdbynature.de
Subject: Re: kern/55607
Date: Mon, 17 Jan 2022 19:34:56 +0100 (CET)
This still happens with 9.2_STABLE (XEN3_DOMU) when run as a Xen PV
DomU:
$ doas zpool create test /dev/xbd2a
fatal page fault in supervisor mode
[ 86.3298078] trap type 6 code 0x2 rip 0xffffffff8022ae1a cs 0xe030 rflags 0x10286 cr2 0x15300000174 ilevel 0x8 rsp 0xffffcf80790e3c30
[ 86.3298078] curlwp 0xffffcf8003a35040 pid 0.2 lowest kstack 0xffffcf80790e02c0
[ 86.3298078] panic: trap
[ 86.3298078] cpu0: Begin traceback...
[ 86.3298078] vpanic() at netbsd:vpanic+0x143
[ 86.3298078] snprintf() at netbsd:snprintf
[ 86.3298078] startlwp() at netbsd:startlwp
[ 86.3298078] alltraps() at netbsd:alltraps+0xae
[ 86.3298078] xen_intr_biglock_wrapper() at netbsd:xen_intr_biglock_wrapper+0x1d
[ 86.3298078] evtchn_do_event() at netbsd:evtchn_do_event+0xe8
[ 86.3298078] do_hypervisor_callback() at netbsd:do_hypervisor_callback+0x161
[ 86.3298078] Xhypervisor_pvhvm_callback() at netbsd:Xhypervisor_pvhvm_callback+0xa0
[ 86.3298078] idle_loop() at netbsd:idle_loop+0xb6
[ 86.3298078] cpu0: End traceback...
[ 86.3298078] dumping to dev 142,1 (offset=525199, size=0): not possible
[ 86.3298078] rebooting...
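Both traces in this report show "trap type 6 code 0x2". As a reading aid, the sketch below decodes the low three bits of the x86 page-fault error code and checks whether the two reported cr2 values fall in the upper canonical half, where amd64 kernel addresses live. The function names are hypothetical and chosen for illustration; only the bit layout (bit 0 present, bit 1 write, bit 2 user) comes from the x86 architecture.

```python
def decode_page_fault_code(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # 0: page not present
        "write": bool(code & 0x2),         # 1: access was a write
        "user_mode": bool(code & 0x4),     # 0: fault in supervisor mode
    }


def in_upper_canonical_half(addr: int) -> bool:
    """True if bits 63..47 are all ones (48-bit virtual addressing)."""
    return (addr >> 47) == 0x1FFFF


# cr2 values copied from the two traces in this report.
CR2_VALUES = {
    "9.0_STABLE": 0xca000000eb,
    "9.2_STABLE": 0x15300000174,
}

print(decode_page_fault_code(0x2))
for release, cr2 in CR2_VALUES.items():
    print(release, hex(cr2), "kernel half:", in_upper_canonical_half(cr2))
```

Code 0x2 decodes to a supervisor-mode write to a page that is not present, and neither cr2 value lies in the kernel half of the address space, so both panics are consistent with a store through a bad pointer.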