NetBSD Problem Report #55366

From hannken@eis.cs.tu-bs.de  Thu Jun 11 08:40:58 2020
Return-Path: <hannken@eis.cs.tu-bs.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 122391A9219
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 11 Jun 2020 08:40:58 +0000 (UTC)
Message-Id: <20200611084050.A30C9CBD9F@builder.isf.cs.tu-bs.de>
Date: Thu, 11 Jun 2020 10:40:50 +0200 (MEST)
From: hannken@eis.cs.tu-bs.de
Reply-To: hannken@eis.cs.tu-bs.de
To: gnats-bugs@NetBSD.org
Subject: Assertion "ref >= 0" file "sys/uvm/uvm_amap.c" failed.
X-Send-Pr-Version: 3.95

>Number:         55366
>Category:       kern
>Synopsis:       Assertion "ref >= 0" file "sys/uvm/uvm_amap.c" failed.
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    chs
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jun 11 08:45:00 +0000 2020
>Closed-Date:    Tue Oct 20 09:24:09 +0000 2020
>Last-Modified:  Tue Oct 20 09:24:09 +0000 2020
>Originator:     Juergen Hannken-Illjes
>Release:        NetBSD 9.99.64
>Organization:

>Environment:


System: NetBSD burner.dd 9.99.64 NetBSD 9.99.64 (work.amd64) #115: Wed Jun 10 14:39:27 MEST 2020 hannken@builder.isf.cs.tu-bs.de:/work/build/obj/obj.amd64/sys/arch/amd64/compile/work.amd64 amd64
Architecture: x86_64
Machine: amd64
>Description:

The assertion "ref >= 0" fails in amap_pp_adjref().

Here we have this amap:

(gdb) print *amap
$2 = {
  am_lock = 0xffff93ad17182100,
  am_ref = 1,
  am_flags = 0,
  am_maxslot = 18,
  am_nslot = 18,
  am_nused = 3,
  am_slots = 0xffff93ad173c2100,
  am_bckptr = 0xffff93ad1a702b80,
  am_anon = 0xffff93ad219b4c00,
  am_ppref = 0xffff93ad2611ddc0,
  am_list = {
    le_next = 0xffff93ad21f29300,
    le_prev = 0xffff93ad21f293f0
  }
}
(gdb) print *amap->am_ppref@18
$3 = {1, -3, 17, 0 <repeats 15 times>}
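
The am_ppref dump above is run-length encoded. As a reading aid, here is a
minimal sketch of a decoder. It assumes the encoding used by pp_getreflen()
in sys/uvm/uvm_amap.c, which is recalled from the source and not stated in
this PR: a positive entry is a run of length 1 holding ref + 1, and any other
entry holds -(ref + 1) with the run length in the next entry. The names
decode_run and dump_ppref are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Decode the run that starts at "slot".
 * ASSUMED encoding (see the note above, not confirmed by this PR):
 *   ppref[slot] > 0  : length 1, refcount ppref[slot] - 1
 *   ppref[slot] <= 0 : refcount -ppref[slot] - 1, length ppref[slot + 1]
 */
static void
decode_run(const int *ppref, int slot, int *refp, int *lenp)
{
	if (ppref[slot] > 0) {
		*refp = ppref[slot] - 1;
		*lenp = 1;
	} else {
		*refp = -ppref[slot] - 1;
		*lenp = ppref[slot + 1];
	}
}

/* Print every run in an array of nslot entries; return the number of runs. */
static int
dump_ppref(const int *ppref, int nslot)
{
	int slot = 0, ref, len, runs = 0;

	while (slot < nslot) {
		decode_run(ppref, slot, &ref, &len);
		assert(len > 0);	/* a zero length would never advance */
		printf("slots %d..%d: ref %d\n", slot, slot + len - 1, ref);
		slot += len;
		runs++;
	}
	return runs;
}
```

Under that assumption, {1, -3, 17, 0 ...} with 18 slots decodes to slot 0
with ref 0 and slots 1..17 with ref 2. An adjustment of -1 that starts at
slot 0, as in frame #12 of the backtrace, would then take slot 0 to -1,
which is consistent with the failed assertion.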

The backtrace is:

#10 0xffffffff80b443b5 in vpanic (fmt=0xffffffff81112bc0 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=ap@entry=0xffffbf01508a4b78) at src/sys/kern/subr_prf.c:288
#11 0xffffffff80ca9686 in kern_assert (fmt=fmt@entry=0xffffffff81112bc0 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ") at src/sys/lib/libkern/kern_assert.c:51
#12 0xffffffff80a98f2a in amap_pp_adjref (amap=amap@entry=0xffff93ad21f29358, curslot=curslot@entry=0, slotlen=<optimized out>, adjval=adjval@entry=-1) at src/sys/uvm/uvm_amap.c:1218
#13 0xffffffff80a99f37 in amap_adjref_anons (amap=0xffff93ad21f29358, offset=17, len=1, refv=-1, all=<optimized out>) at src/sys/uvm/uvm_amap.c:1577
#14 0xffffffff80aa9c85 in uvm_map_unreference_amap (flags=2, entry=0xffff93ad1b008e80) at src/sys/uvm/uvm_map.c:2368
#15 uvm_unmap_detach (first_entry=0xffff93ad1b008e80, flags=flags@entry=2) at src/sys/uvm/uvm_map.c:2368
#16 0xffffffff80aa4603 in uvm_io (map=0xffff93ad20730e48, uio=uio@entry=0xffffbf01508a4d50, flags=<optimized out>, flags@entry=0) at src/sys/uvm/uvm_io.c:135
#17 0xffffffff80b2d4cc in copyin_vmspace (len=<optimized out>, kaddr=<optimized out>, uaddr=<optimized out>, vm=<optimized out>) at src/sys/kern/subr_copy.c:229
#18 copyin_vmspace (vm=<optimized out>, uaddr=<optimized out>, kaddr=<optimized out>, len=<optimized out>) at src/sys/kern/subr_copy.c:205
#19 0xffffffff80b2d723 in copyin_proc (p=<optimized out>, uaddr=0x7f7fff780fe0, kaddr=0xffffbf01508a4e20, len=32) at src/sys/kern/subr_copy.c:280
#20 0xffffffff80b00794 in sysctl_kern_proc_args (namelen=2, newp=0x0, newlen=<optimized out>, oname=0xffffbf01508a4f30, rnode=0xffffbf001ef44f60, l=0xffff93ad1e27b900, oldlenp=0xffffbf01508a4f28, oldp=0x7f7fff46c324, name=<optimized out>) at src/sys/kern/kern_proc.c:2398
#21 sysctl_kern_proc_args (name=<optimized out>, namelen=<optimized out>, oldp=0x7f7fff46c324, oldlenp=0xffffbf01508a4f28, newp=<optimized out>, newlen=<optimized out>, oname=0xffffbf01508a4f30, l=0xffff93ad1e27b900, rnode=0xffffbf001ef44f60) at src/sys/kern/kern_proc.c:2306
#22 0xffffffff80b179e8 in sysctl_dispatch (name=name@entry=0xffffbf01508a4f30, namelen=<optimized out>, oldp=0x7f7fff46c324, oldlenp=oldlenp@entry=0xffffbf01508a4f28, newp=0x0, newlen=0, oname=oname@entry=0xffffbf01508a4f30, l=l@entry=0xffff93ad1e27b900, rnode=<optimized out>, rnode@entry=0x0) at src/sys/kern/kern_sysctl.c:454
#23 0xffffffff80b17c35 in sys___sysctl (l=0xffff93ad1e27b900, uap=0xffffbf01508a5000, retval=<optimized out>) at src/sys/kern/kern_sysctl.c:310
#24 0xffffffff8066d143 in sy_call (rval=0xffffbf01508a4fb0, uap=0xffffbf01508a5000, l=0xffff93ad1e27b900, sy=0xffffffff81d0cd30 <sysent+4848>) at src/sys/sys/syscallvar.h:65
#25 sy_invoke (code=202, rval=0xffffbf01508a4fb0, uap=0xffffbf01508a5000, l=0xffff93ad1e27b900, sy=0xffffffff81d0cd30 <sysent+4848>) at src/sys/sys/syscallvar.h:94
#26 syscall (frame=0xffffbf01508a5000) at src/sys/arch/x86/x86/syscall.c:138
#27 0xffffffff8032425d in handle_syscall () at src/sys/../external/cddl/osnet/dist/uts/common/fs/zfs/dmu_traverse.c:706

>How-To-Repeat:

Run this script on a 16-core VM with a DIAGNOSTIC+DEBUG+LOCKDEBUG kernel:

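# Repeatedly search the process list for the name "nope".
pgloop(){
	while :; do
		pgrep nope
	done
}

# Start 100 concurrent search loops in the background.
for I in $( seq 100 ); do
	pgloop &
done

# Print the uptime once a minute while waiting for the panic.
while :; do
	uptime
	sleep 60
done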

The assertion fires after 4 to 24 hours.

The same problem has been seen on -7 and -8 release kernels.
>Fix:


>Release-Note:

>Audit-Trail:
From: "Chuck Silvers" <chs@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55366 CVS commit: src/sys/uvm
Date: Sun, 20 Sep 2020 23:03:01 +0000

 Module Name:	src
 Committed By:	chs
 Date:		Sun Sep 20 23:03:01 UTC 2020

 Modified Files:
 	src/sys/uvm: uvm_amap.c

 Log Message:
 Effectively disable the AMAP_REFALL flag because it is unsafe.
 This flag tells the amap code that it does not need to allocate ppref
 as part of adding or removing a reference, but that is only correct
 if the range of the reference being added or removed is the same
 as the range of all other references to the amap, and the point of
 this flag is exactly to try to optimize the case where the range is
 different and thus this flag would not be correct to use.
 Fixes PR 55366.


 To generate a diff of this commit:
 cvs rdiff -u -r1.123 -r1.124 src/sys/uvm/uvm_amap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: kern-bug-people->chs
Responsible-Changed-By: chs@NetBSD.org
Responsible-Changed-When: Sun, 20 Sep 2020 23:07:03 +0000
Responsible-Changed-Why:
I've been working on this.


State-Changed-From-To: open->feedback
State-Changed-By: chs@NetBSD.org
State-Changed-When: Sun, 20 Sep 2020 23:07:03 +0000
State-Changed-Why:
can you confirm that the change I committed to fix this fixes it for you as well?


From: "Chuck Silvers" <chs@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55366 CVS commit: src/sys/uvm
Date: Mon, 21 Sep 2020 18:41:59 +0000

 Module Name:	src
 Committed By:	chs
 Date:		Mon Sep 21 18:41:59 UTC 2020

 Modified Files:
 	src/sys/uvm: uvm_amap.c uvm_io.c

 Log Message:
 The previous fix for PR 55366 in uvm_amap.c 1.124 was incomplete:
  - amap_adjref_anons() must also ignore AMAP_REFALL when updating
    the ppref, not just when deciding whether or not to initialize ppref.
  - UVM_EXTRACT_QREF relies on AMAP_REFALL to work properly,
    and since we can't use AMAP_REFALL then we can't use QREF either.
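
The first point above can be illustrated with a toy model. This is only a
sketch under stated assumptions, not kernel code and not a reconstruction of
the actual race: it keeps one plain counter per slot instead of the encoded
ppref array, and it assumes that an "all" adjustment (modelling AMAP_REFALL)
ignores the requested range and touches every slot. The names toy_amap,
toy_adjref and toy_scenario are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NSLOT 4

/* Toy stand-in for an amap: one plain refcount per slot. */
struct toy_amap {
	int ppref[NSLOT];
};

/*
 * Adjust the per-slot counts by adjval. When "all" is set the range is
 * ignored and every slot is adjusted (ASSUMED model of AMAP_REFALL).
 * Returns false if any count goes negative, the condition that
 * KASSERT(ref >= 0) guards against in the kernel.
 */
static bool
toy_adjref(struct toy_amap *am, int off, int len, int adjval, bool all)
{
	int start = all ? 0 : off;
	int end = all ? NSLOT : off + len;
	bool ok = true;

	for (int i = start; i < end; i++) {
		am->ppref[i] += adjval;
		if (am->ppref[i] < 0)
			ok = false;
	}
	return ok;
}

/* Take a reference on slot 3 only, then drop it by range or by "all". */
static bool
toy_scenario(bool drop_all)
{
	/* One existing mapping covers slots 1..3; slot 0 is unreferenced. */
	struct toy_amap am = { .ppref = {0, 1, 1, 1} };

	if (!toy_adjref(&am, 3, 1, +1, false))
		return false;
	return toy_adjref(&am, 3, 1, -1, drop_all);
}
```

In this model toy_scenario(false) keeps every count non-negative, while
toy_scenario(true) drives slot 0 from 0 to -1 even though the reference being
dropped never covered slot 0. Always adjusting only the requested range
avoids that mismatch.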


 To generate a diff of this commit:
 cvs rdiff -u -r1.124 -r1.125 src/sys/uvm/uvm_amap.c
 cvs rdiff -u -r1.28 -r1.29 src/sys/uvm/uvm_io.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "J. Hannken-Illjes" <hannken@eis.cs.tu-bs.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/55366 (Assertion "ref >= 0" file "sys/uvm/uvm_amap.c"
 failed.)
Date: Sat, 26 Sep 2020 09:27:55 +0200

 > On 21. Sep 2020, at 01:07, chs@netbsd.org <chs@NetBSD.org> wrote:
 <snip>
 > can you confirm that the change I committed to fix this fixes it for you as well?

 Confirmed -- the pgrep test runs for 35+ hours without problems.

 Please request pullup to -8 and -9.

 Thanks for fixing this annoying problem.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55366 CVS commit: [netbsd-9] src/sys/uvm
Date: Sun, 4 Oct 2020 18:14:13 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sun Oct  4 18:14:13 UTC 2020

 Modified Files:
 	src/sys/uvm [netbsd-9]: uvm_amap.c uvm_io.c

 Log Message:
 Pull up following revision(s) (requested by chs in ticket #1095):

 	sys/uvm/uvm_amap.c: revision 1.124 (via patch)
 	sys/uvm/uvm_amap.c: revision 1.125 (via patch)
 	sys/uvm/uvm_io.c: revision 1.29 (via patch)

 Effectively disable the AMAP_REFALL flag because it is unsafe.

 This flag tells the amap code that it does not need to allocate ppref
 as part of adding or removing a reference, but that is only correct
 if the range of the reference being added or removed is the same
 as the range of all other references to the amap, and the point of
 this flag is exactly to try to optimize the case where the range is
 different and thus this flag would not be correct to use.
 Fixes PR 55366.

 The previous fix for PR 55366 in uvm_amap.c 1.124 was incomplete:
  - amap_adjref_anons() must also ignore AMAP_REFALL when updating
    the ppref, not just when deciding whether or not to initialize ppref.
  - UVM_EXTRACT_QREF relies on AMAP_REFALL to work properly,
    and since we can't use AMAP_REFALL then we can't use QREF either.


 To generate a diff of this commit:
 cvs rdiff -u -r1.109.4.1 -r1.109.4.2 src/sys/uvm/uvm_amap.c
 cvs rdiff -u -r1.28 -r1.28.22.1 src/sys/uvm/uvm_io.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: feedback->closed
State-Changed-By: hannken@NetBSD.org
State-Changed-When: Tue, 20 Oct 2020 09:24:09 +0000
State-Changed-Why:
Fixed and pullup complete -- thanks.


>Unformatted:

Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.