NetBSD Problem Report #55702

From www@netbsd.org  Wed Oct  7 21:07:30 2020
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 425421A923A
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  7 Oct 2020 21:07:30 +0000 (UTC)
Message-Id: <20201007210729.0157C1A923C@mollari.NetBSD.org>
Date: Wed,  7 Oct 2020 21:07:28 +0000 (UTC)
From: netbsd@eq.cz
Reply-To: netbsd@eq.cz
To: gnats-bugs@NetBSD.org
Subject: panic: kernel diagnostic assertion "(pg->flags & PG_PAGEOUT) == 0" failed
X-Send-Pr-Version: www-1.0

>Number:         55702
>Category:       kern
>Synopsis:       panic: kernel diagnostic assertion "(pg->flags & PG_PAGEOUT) == 0" failed
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    chs
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Oct 07 21:10:00 +0000 2020
>Closed-Date:    Wed May 17 11:28:13 +0000 2023
>Last-Modified:  Wed May 17 11:28:13 +0000 2023
>Originator:     rudolf
>Release:        current
>Organization:
>Environment:
NetBSD  9.99.73 NetBSD 9.99.73 (GENERIC) #0: Tue Oct  6 16:39:23 UTC 2020  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
Copying ~ 600 GB of data using rsync (with flags: -aH)

FROM:
  a read-only mounted ffs filesystem on a cgd on a SATA drive ("spinning rust") attached to a notebook through a USB-SATA adapter (connected to xhci)

TO:
  a zfs filesystem on a cgd on an SSD

results in the following panic (it happened twice in a row, each time after 10 - 20 GB of data had been transferred):

kernel diagnostic assertion "(pg->flags & PG_PAGEOUT) == 0" failed: file "/home/source/ab/HEAD/src/sys/uvm/uvm_page.c", line 1616

The assertion was added in the following commit ~ 16 years ago:
"uvm_page_unbusy: add assertions and comments about PG_RELEASED anon pages."
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/uvm/uvm_page.c.diff?r1=1.97&r2=1.98

The system has a dual-core CPU, 16 GB RAM and 16 GB of swap (vm.swap_encrypt=1). There was no other significant activity during the copying.

I have photos of the panic, backtrace and registers. Backtrace:
vpanic()
__x86_indirect_thunk_rax()
uvm_page_unbusy()
zfs_putpage()
genfs_do_putpages()
zfs_netbsd_putpages()
VOP_PUTPAGES()
uvm_pageout()

>How-To-Repeat:
The panic occurred twice in a row, so I believe I can trigger it again. If more information is required, please let me know what you need.
>Fix:

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->chs
Responsible-Changed-By: chs@NetBSD.org
Responsible-Changed-When: Wed, 07 Oct 2020 22:20:32 +0000
Responsible-Changed-Why:
mine


From: Chuck Silvers <chuq@chuq.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55702: panic: kernel diagnostic assertion "(pg->flags &
 PG_PAGEOUT) == 0" failed
Date: Wed, 7 Oct 2020 15:22:20 -0700

 does this patch fix it for you?
 http://ftp.netbsd.org/pub/NetBSD/misc/chs/diff.zfs-vs-pageout.1

 -Chuck

From: rudolf <netbsd@eq.cz>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55702: panic: kernel diagnostic assertion "(pg->flags &
 PG_PAGEOUT) == 0" failed
Date: Thu, 8 Oct 2020 18:05:37 +0000

 Chuck Silvers wrote:
 >   does this patch fix it for you?
 >   http://ftp.netbsd.org/pub/NetBSD/misc/chs/diff.zfs-vs-pageout.1

 Yes, I've now transferred all the data without a panic. Thanks!

 r.

From: "Chuck Silvers" <chs@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55702 CVS commit: src/sys
Date: Sun, 18 Oct 2020 18:22:29 +0000

 Module Name:	src
 Committed By:	chs
 Date:		Sun Oct 18 18:22:29 UTC 2020

 Modified Files:
 	src/sys/rump/librump/rumpvfs: vm_vfs.c
 	src/sys/uvm: uvm_page.c uvm_pager.c

 Log Message:
 Move the handling of PG_PAGEOUT from uvm_aio_aiodone_pages() to
 uvm_page_unbusy() so that all callers of uvm_page_unbusy() don't need to
 handle this flag separately.  Split out the pages part of uvm_aio_aiodone()
 into uvm_aio_aiodone_pages() in rump just like in the real kernel.
 In ZFS functions that can fail to copy data between the ARC and VM pages,
 use uvm_aio_aiodone_pages() rather than uvm_page_unbusy() so that we can
 handle these "I/O" errors.  Fixes PR 55702.


 To generate a diff of this commit:
 cvs rdiff -u -r1.38 -r1.39 src/sys/rump/librump/rumpvfs/vm_vfs.c
 cvs rdiff -u -r1.247 -r1.248 src/sys/uvm/uvm_page.c
 cvs rdiff -u -r1.129 -r1.130 src/sys/uvm/uvm_pager.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
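
The log message above describes moving the PG_PAGEOUT handling into uvm_page_unbusy() so that callers no longer deal with the flag themselves. The following is only a rough user-space model of that idea, not the committed kernel code: the struct, the flag values, the toy_ names and the pending counter are all hypothetical and exist purely for illustration.

```c
#include <assert.h>

/* Illustrative flag bits; the values are NOT those used by the kernel. */
#define TOY_PG_BUSY    0x1u
#define TOY_PG_PAGEOUT 0x2u

/* Hypothetical stand-in for a VM page; only the flags matter here. */
struct toy_page {
	unsigned int flags;
};

/* Hypothetical count of pages with a pageout still outstanding. */
static int toy_pageouts_pending;

/* Mark a page busy and queued for pageout. */
static void
toy_start_pageout(struct toy_page *pg)
{
	pg->flags |= TOY_PG_BUSY | TOY_PG_PAGEOUT;
	toy_pageouts_pending++;
}

/*
 * Model of the post-fix shape: instead of asserting that the caller has
 * already cleared the pageout flag, the unbusy routine clears it itself
 * and does the associated accounting.
 */
static void
toy_page_unbusy(struct toy_page *pg)
{
	assert((pg->flags & TOY_PG_BUSY) != 0);
	if (pg->flags & TOY_PG_PAGEOUT) {
		pg->flags &= ~TOY_PG_PAGEOUT;
		toy_pageouts_pending--;
	}
	pg->flags &= ~TOY_PG_BUSY;
}
```

In this model a caller such as a putpage routine can call toy_page_unbusy() on a page that is still marked for pageout without tripping an assertion, which is the situation the original backtrace ran into.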

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Chuck Silvers <chuq@chuq.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: PR/55702 CVS commit: src/sys
Date: Sat, 24 Oct 2020 21:17:40 +0900

 On 2020/10/19 3:25, Chuck Silvers wrote:
 ...
 >   Module Name:	src
 >   Committed By:	chs
 >   Date:		Sun Oct 18 18:22:29 UTC 2020
 >   
 >   Modified Files:
 >   	src/sys/rump/librump/rumpvfs: vm_vfs.c
 >   	src/sys/uvm: uvm_page.c uvm_pager.c
 >   
 >   Log Message:
 >   Move the handling of PG_PAGEOUT from uvm_aio_aiodone_pages() to
 >   uvm_page_unbusy() so that all callers of uvm_page_unbusy() don't need to
 >   handle this flag separately.  Split out the pages part of uvm_aio_aiodone()
 >   into uvm_aio_aiodone_pages() in rump just like in the real kernel.
 >   In ZFS functions that can fail to copy data between the ARC and VM pages,
 >   use uvm_aio_aiodone_pages() rather than uvm_page_unbusy() so that we can
 >   handle these "I/O" errors.  Fixes PR 55702.
 >   
 >   
 >   To generate a diff of this commit:
 >   cvs rdiff -u -r1.38 -r1.39 src/sys/rump/librump/rumpvfs/vm_vfs.c
 >   cvs rdiff -u -r1.247 -r1.248 src/sys/uvm/uvm_page.c
 >   cvs rdiff -u -r1.129 -r1.130 src/sys/uvm/uvm_pager.c
 >   
 >   Please note that diffs are not public domain; they are subject to the
 >   copyright notices on the relevant files.

 Hi,

 Recent -current on aarch64 easily hits KASSERT as:

 	panic: kernel diagnostic assertion "(pg->flags & PG_PAGEOUT) == 0" failed: file "../../../../uvm/uvm_page.c", line 1448

 Full backtrace is attached below.

 I bisected to find this commit; after reverting it for
 sys/uvm/*, -current works without problems as far as I can see.

 Chuck, can you please take a look?

 Thanks,
 rin
 ----
 panic: kernel diagnostic assertion "(pg->flags & PG_PAGEOUT) == 0" failed: file "../../../../uvm/uvm_page.c", line 1448
 cpu0: Begin traceback...
 trace fp ffffc00031ea78c0
 fp ffffc00031ea78f0 vpanic() at ffffc000004f4e0c netbsd:vpanic+0x14c
 fp ffffc00031ea7950 kern_assert() at ffffc00000819cdc netbsd:kern_assert+0x5c
 fp ffffc00031ea79e0 uvm_pagefree() at ffffc0000046c328 netbsd:uvm_pagefree+0x4a8
 fp ffffc00031ea7a10 uvm_anon_release() at ffffc00000454588 netbsd:uvm_anon_release+0x58
 fp ffffc00031ea7a30 uvm_aio_aiodone_pages() at ffffc0000046ef00 netbsd:uvm_aio_aiodone_pages+0x410
 fp ffffc00031ea7ac0 uvm_aio_aiodone() at ffffc0000046f22c netbsd:uvm_aio_aiodone+0xac
 fp ffffc00031ea7bb0 scsipi_complete() at ffffc0000010a120 netbsd:scsipi_complete+0x120
 fp ffffc00031ea7c00 scsipi_done() at ffffc0000010b174 netbsd:scsipi_done+0xe4
 fp ffffc00031ea7c40 usb_transfer_complete() at ffffc0000011d638 netbsd:usb_transfer_complete+0x1f8
 fp ffffc00031ea7c80 dwc2_softintr() at ffffc000003aa758 netbsd:dwc2_softintr+0x148
 fp ffffc00031ea7ce0 usb_soft_intr() at ffffc000001196cc netbsd:usb_soft_intr+0x24
 fp ffffc00031ea7d40 softint_dispatch() at ffffc000004c21c8 netbsd:softint_dispatch+0xe0
 fp ffffc00043c9fe80 cpu_switchto_softint() at ffffc0000009c650 netbsd:cpu_switchto_softint+0x6c
 tf ffffc00043c9fed0 el0_trap() at ffffc000000a2ff0 netbsd:el0_trap
 ---- trapframe 0xffffc00043c9fed0 (304 bytes) ----
      pc=000000000b89376c,   spsr=0000000080000010 (AArch32)
     esr=0000000092000047,    far=00000000ea013000
      r0=00000000f4d5b000,     r1=00000000f4a70708
      r2=0000000000000042,     r3=000000000000006c
      r4=0000000000000043,     r5=00000000f4a7074a
      r6=0000000000000001,     r7=000000004443fb80
      r8=0000000000000000,     r9=00000000eca43bb8
     r10=0000000000000007,    r11=000000000bd4552c
     r12=000000005acc8daf, sp=r13=00000000fff77b18
 lr=r14=00000000f4a70739, pc=r15=000000000b89376c
 ------------------------------------------------
 cpu0: End traceback...
 Stopped in pid 0.6 (system) at  netbsd:cpu_Debugger+0x4:        ret
 db{0}>

From: Chuck Silvers <chuq@chuq.com>
To: Rin Okuyama <rokuyama.rk@gmail.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: PR/55702 CVS commit: src/sys
Date: Sat, 24 Oct 2020 17:13:27 -0700

 On Sat, Oct 24, 2020 at 09:17:40PM +0900, Rin Okuyama wrote:
 > Recent -current on aarch64 easily hits KASSERT as:
 > 
 > 	panic: kernel diagnostic assertion "(pg->flags & PG_PAGEOUT) == 0" failed: file "../../../../uvm/uvm_page.c", line 1448

 I committed a fix for this just now, please let me know if there is still a problem.

 -Chuck

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Chuck Silvers <chuq@chuq.com>
Cc: gnats-bugs@netbsd.org
Subject: Re: PR/55702 CVS commit: src/sys
Date: Sun, 25 Oct 2020 19:16:27 +0900

 On 2020/10/25 9:13, Chuck Silvers wrote:
 > On Sat, Oct 24, 2020 at 09:17:40PM +0900, Rin Okuyama wrote:
 >> Recent -current on aarch64 easily hits KASSERT as:
 >>
 >> 	panic: kernel diagnostic assertion "(pg->flags & PG_PAGEOUT) == 0" failed: file "../../../../uvm/uvm_page.c", line 1448
 > 
 > I committed a fix for this just now, please let me know if there is still a problem.

 It seems fine for now. Thank you so much for the quick fix!

 rin

From: "Chuck Silvers" <chs@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55702 CVS commit: src/external/cddl/osnet/dist/uts/common/fs/zfs
Date: Sun, 15 Nov 2020 00:54:13 +0000

 Module Name:	src
 Committed By:	chs
 Date:		Sun Nov 15 00:54:13 UTC 2020

 Modified Files:
 	src/external/cddl/osnet/dist/uts/common/fs/zfs: zfs_vnops.c

 Log Message:
 Commit the ZFS file that I forgot in this previous commit:

 Move the handling of PG_PAGEOUT from uvm_aio_aiodone_pages() to
 uvm_page_unbusy() so that all callers of uvm_page_unbusy() don't need to
 handle this flag separately.  Split out the pages part of uvm_aio_aiodone()
 into uvm_aio_aiodone_pages() in rump just like in the real kernel.
 In ZFS functions that can fail to copy data between the ARC and VM pages,
 use uvm_aio_aiodone_pages() rather than uvm_page_unbusy() so that we can
 handle these "I/O" errors.  Fixes PR 55702.


 To generate a diff of this commit:
 cvs rdiff -u -r1.70 -r1.71 \
     src/external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55702 CVS commit: [netbsd-9] src
Date: Tue, 6 Jul 2021 04:22:35 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Tue Jul  6 04:22:35 UTC 2021

 Modified Files:
 	src/external/cddl/osnet/dist/uts/common/fs/zfs [netbsd-9]: zfs_vnops.c
 	src/sys/rump/librump/rumpkern [netbsd-9]: vm.c
 	src/sys/rump/librump/rumpvfs [netbsd-9]: vm_vfs.c
 	src/sys/uvm [netbsd-9]: uvm_anon.c uvm_page.c uvm_pager.c
 	src/tests/rump/rumpkern [netbsd-9]: t_vm.c

 Log Message:
 Pull up following revision(s) - all via patch -
 (requested by riastradh in ticket #1317):

 	sys/uvm/uvm_page.c: revision 1.248
 	sys/uvm/uvm_anon.c: revision 1.80
 	sys/rump/librump/rumpvfs/vm_vfs.c: revision 1.40
 	sys/rump/librump/rumpvfs/vm_vfs.c: revision 1.41
 	sys/rump/librump/rumpkern/vm.c: revision 1.191
 	sys/uvm/uvm_pager.c: revision 1.130
 	external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vnops.c: revision 1.71
 	tests/rump/rumpkern/t_vm.c: revision 1.5
 	tests/rump/rumpkern/t_vm.c: revision 1.6
 	sys/rump/librump/rumpvfs/vm_vfs.c: revision 1.39

 Move the handling of PG_PAGEOUT from uvm_aio_aiodone_pages() to
 uvm_page_unbusy() so that all callers of uvm_page_unbusy() don't need to
 handle this flag separately.  Split out the pages part of uvm_aio_aiodone()
 into uvm_aio_aiodone_pages() in rump just like in the real kernel.

 In ZFS functions that can fail to copy data between the ARC and VM pages,
 use uvm_aio_aiodone_pages() rather than uvm_page_unbusy() so that we can
 handle these "I/O" errors.  Fixes PR 55702.

 fix an incorrect assertion in the previous commit.

 Handle PG_PAGEOUT in uvm_anon_release() too.

 Commit the ZFS file that I forgot in this previous commit:

 Move the handling of PG_PAGEOUT from uvm_aio_aiodone_pages() to
 uvm_page_unbusy() so that all callers of uvm_page_unbusy() don't need to
 handle this flag separately.  Split out the pages part of uvm_aio_aiodone()
 into uvm_aio_aiodone_pages() in rump just like in the real kernel.

 In ZFS functions that can fail to copy data between the ARC and VM pages,
 use uvm_aio_aiodone_pages() rather than uvm_page_unbusy() so that we can
 handle these "I/O" errors.  Fixes PR 55702.

 update the rump copy of uvm_page_unbusy() to match the real version,
 in particular handle PG_PAGEOUT.  fixes a few atf tests.
 the busypage test is buggy, expect it to fail.

 make rump's uvm_aio_aiodone_pages() look more like the kernel version.
 fixes some more rumpy assertions.

 for the busypage test, replace atf_tc_expect_fail() with atf_tc_skip()
 because atf apparently has no way to expect a test program to crash.
 fixes PR 55945.


 To generate a diff of this commit:
 cvs rdiff -u -r1.50.2.9 -r1.50.2.10 \
     src/external/cddl/osnet/dist/uts/common/fs/zfs/zfs_vnops.c
 cvs rdiff -u -r1.173 -r1.173.14.1 src/sys/rump/librump/rumpkern/vm.c
 cvs rdiff -u -r1.34 -r1.34.34.1 src/sys/rump/librump/rumpvfs/vm_vfs.c
 cvs rdiff -u -r1.64 -r1.64.8.1 src/sys/uvm/uvm_anon.c
 cvs rdiff -u -r1.199 -r1.199.4.1 src/sys/uvm/uvm_page.c
 cvs rdiff -u -r1.111.8.1 -r1.111.8.2 src/sys/uvm/uvm_pager.c
 cvs rdiff -u -r1.4 -r1.4.16.1 src/tests/rump/rumpkern/t_vm.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 07 Jul 2021 00:27:01 +0000
State-Changed-Why:
Is this fixed? Looks like it to me...


From: rudolf <netbsd@eq.cz>
To: gnats-bugs@netbsd.org, chs@netbsd.org, netbsd-bugs@netbsd.org,
 gnats-admin@netbsd.org, dholland@NetBSD.org
Cc: 
Subject: Re: kern/55702 (panic: kernel diagnostic assertion "(pg->flags &
 PG_PAGEOUT) == 0" failed)
Date: Sat, 25 Sep 2021 11:55:57 +0200

 On 7/7/21 2:27 AM, dholland@NetBSD.org wrote:
 > Is this fixed? Looks like it to me...

 Looks like it to me too.

 r.

State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 17 May 2023 11:28:13 +0000
State-Changed-Why:
confirmed fixed in 2021, oops :-(


>Unformatted:
