NetBSD Problem Report #55004
From gson@gson.org Sat Feb 22 16:12:51 2020
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 41D7B1A9213
for <gnats-bugs@gnats.NetBSD.org>; Sat, 22 Feb 2020 16:12:51 +0000 (UTC)
Message-Id: <20200222161245.8DF7B253FA3@guava.gson.org>
Date: Sat, 22 Feb 2020 18:12:45 +0200 (EET)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: Hundreds of file system tests now fail on real hardware
X-Send-Pr-Version: 3.95
>Number: 55004
>Category: kern
>Synopsis: Hundreds of file system tests now fail on real hardware
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: jdolecek
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Feb 22 16:15:00 +0000 2020
>Closed-Date: Wed Feb 26 15:29:23 +0000 2020
>Last-Modified: Wed Feb 26 15:29:23 +0000 2020
>Originator: Andreas Gustafsson
>Release: NetBSD-current, source date >= 2020.02.21.02.04.40
>Organization:
>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:
My bare-metal amd64 testbed is showing hundreds of new test failures
in file system tests, such as:
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2020/2020.02.21.02.04.40/test.html#fs_vfs_t_full_ext2fs_fillfs
The qemu-based TNF i386 testbed does not appear to be affected.
The problem started during a recent period when the system did not
even install, so it's not easily auto-bisected. During this period,
the following commits were made:
commit 2020.02.20.15.48.38 riastradh src/sys/kern/vfs_bio.c 1.288
commit 2020.02.20.15.48.52 riastradh src/lib/libp2k/p2k.c 1.72
commit 2020.02.20.18.24.20 pgoyette src/bin/sh/sh.1 1.224
commit 2020.02.20.19.59.12 christos src/external/historical/nawk/dist/lex.c 1.7
commit 2020.02.20.19.59.12 christos src/external/historical/nawk/dist/lib.c 1.11
commit 2020.02.20.19.59.12 christos src/external/historical/nawk/dist/main.c 1.11
commit 2020.02.20.19.59.12 christos src/external/historical/nawk/dist/proto.h 1.11
commit 2020.02.20.19.59.12 christos src/external/historical/nawk/dist/run.c 1.12
commit 2020.02.20.21.14.23 jdolecek src/sys/kern/subr_autoconf.c 1.266
commit 2020.02.20.22.38.54 kamil src/tests/lib/libc/sys/t_ptrace_wait.c 1.164
commit 2020.02.20.22.52.10 joerg src/sys/rump/Makefile.rump 1.125
commit 2020.02.20.22.52.10 joerg src/sys/rump/librump/rumpkern/kobj_rename.c 1.3
commit 2020.02.20.23.57.16 kamil src/tests/lib/libc/sys/t_ptrace_x86_wait.h 1.24
commit 2020.02.21.00.26.21 joerg src/external/bsd/libevent/dist/test/regress_http.c 1.6
commit 2020.02.21.00.26.21 joerg src/external/bsd/libevent/dist/test/regress_ssl.c 1.4
commit 2020.02.21.00.26.22 joerg src/external/bsd/wpa/dist/src/radius/radius_client.c 1.2
commit 2020.02.21.00.26.22 joerg src/external/cddl/osnet/dist/tools/ctf/cvt/iidesc.c 1.4
commit 2020.02.21.00.26.22 joerg src/sys/arch/x86/x86/spectre.c 1.34
commit 2020.02.21.00.26.22 joerg src/sys/arch/x86/x86/tsc.c 1.38
commit 2020.02.21.00.26.22 joerg src/sys/dev/clockctl.c 1.38
commit 2020.02.21.00.26.22 joerg src/sys/dev/nvmm/x86/nvmm_x86_svm.c 1.56
commit 2020.02.21.00.26.22 joerg src/sys/dev/nvmm/x86/nvmm_x86_vmx.c 1.49
commit 2020.02.21.00.26.22 joerg src/sys/dist/pf/net/pf_ioctl.c 1.57
commit 2020.02.21.00.26.22 joerg src/sys/external/bsd/ipf/netinet/ip_fil_netbsd.c 1.34
commit 2020.02.21.00.26.22 joerg src/sys/kern/kern_ktrace.c 1.175
commit 2020.02.21.00.26.22 joerg src/sys/kern/kern_proc.c 1.241
commit 2020.02.21.00.26.22 joerg src/sys/kern/kern_resource.c 1.186
commit 2020.02.21.00.26.22 joerg src/sys/kern/kern_veriexec.c 1.23
commit 2020.02.21.00.26.22 joerg src/sys/kern/sys_pset.c 1.23
commit 2020.02.21.00.26.22 joerg src/sys/kern/sysv_ipc.c 1.41
commit 2020.02.21.00.26.22 joerg src/sys/kern/uipc_socket.c 1.287
commit 2020.02.21.00.26.22 joerg src/sys/kern/vfs_init.c 1.50
commit 2020.02.21.00.26.23 joerg src/sys/net/if.c 1.473
commit 2020.02.21.00.26.23 joerg src/sys/netsmb/smb_conn.c 1.31
commit 2020.02.21.00.26.23 joerg src/sys/secmodel/extensions/secmodel_extensions.c 1.11
commit 2020.02.21.00.26.23 joerg src/sys/secmodel/keylock/secmodel_keylock.c 1.10
commit 2020.02.21.00.26.23 joerg src/sys/secmodel/securelevel/secmodel_securelevel.c 1.33
commit 2020.02.21.00.26.23 joerg src/sys/secmodel/suser/secmodel_suser.c 1.51
commit 2020.02.21.02.04.40 riastradh src/sys/kern/vfs_bio.c 1.289
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback
State-Changed-By: maya@NetBSD.org
State-Changed-When: Sat, 22 Feb 2020 19:46:55 +0000
State-Changed-Why:
Assuming fixed: http://mail-index.netbsd.org/current-users/2020/02/21/msg037802.html
With vfs_bio.c 1.289
From: Andreas Gustafsson <gson@gson.org>
To: maya@NetBSD.org
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/55004 (Hundreds of file system tests now fail on real hardware)
Date: Sat, 22 Feb 2020 22:27:10 +0200
maya@NetBSD.org wrote:
> Assuming fixed: http://mail-index.netbsd.org/current-users/2020/02/21/msg037802.html
> With vfs_bio.c 1.289
Your assumption is incorrect. I said the problem started during the
period ending with the commit of vfs_bio.c 1.289, and that means it
was failing *after* that commit:
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/commits-2020.02.html#2020.02.21.02.04.40
--
Andreas Gustafsson, gson@gson.org
State-Changed-From-To: feedback->open
State-Changed-By: gson@NetBSD.org
State-Changed-When: Sat, 22 Feb 2020 20:37:36 +0000
State-Changed-Why:
Still broken.
Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Sat, 22 Feb 2020 20:58:56 +0000
Responsible-Changed-Why:
It seems this started with my config_mountroot() mutex changes; I'll fix it.
From: "Andrew Doran" <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/55004 CVS commit: src/sys/rump/librump/rumpkern
Date: Sat, 22 Feb 2020 21:45:35 +0000
Module Name: src
Committed By: ad
Date: Sat Feb 22 21:45:35 UTC 2020
Modified Files:
src/sys/rump/librump/rumpkern: rump.c
Log Message:
rump_init(): need to call config_init() now.
PR kern/55004 (Hundreds of file system tests now fail on real hardware)
To generate a diff of this commit:
cvs rdiff -u -r1.341 -r1.342 src/sys/rump/librump/rumpkern/rump.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
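The commit above is about initialization order: after the config_mountroot() mutex changes, code running in a rump kernel apparently depended on state that only config_init() sets up. The following is a minimal user-space sketch of that general class of bug, not the actual kernel change. Every name in it (demo_config, demo_config_init(), demo_config_add_hook(), demo_config_pending()) is hypothetical, and it uses POSIX pthread mutexes in place of kernel mutexes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a subsystem whose lock is created by an init call. */
struct demo_config {
	pthread_mutex_t lock;
	bool initialized;
	int pending_hooks;
};

/* Static storage is zero-filled, so initialized starts out false. */
static struct demo_config demo_cfg;

/* Must run once, before any other demo_config_*() function is used. */
void
demo_config_init(void)
{
	assert(!demo_cfg.initialized);	/* catch accidental double init */
	pthread_mutex_init(&demo_cfg.lock, NULL);
	demo_cfg.pending_hooks = 0;
	demo_cfg.initialized = true;
}

/* Registers one hook. Returns 0 on success, -1 if init has not run yet. */
int
demo_config_add_hook(void)
{
	if (!demo_cfg.initialized)
		return -1;	/* a kernel would more likely assert or crash here */
	pthread_mutex_lock(&demo_cfg.lock);
	demo_cfg.pending_hooks++;
	pthread_mutex_unlock(&demo_cfg.lock);
	return 0;
}

/* Returns the number of registered hooks, or 0 before init. */
int
demo_config_pending(void)
{
	int n;

	if (!demo_cfg.initialized)
		return 0;
	pthread_mutex_lock(&demo_cfg.lock);
	n = demo_cfg.pending_hooks;
	pthread_mutex_unlock(&demo_cfg.lock);
	return n;
}
```

In this sketch, a bootstrap path that never calls demo_config_init() makes every later demo_config_add_hook() call fail. This mirrors the shape of the fix described in the log message: the bootstrap routine gains the missing init call.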
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: jdolecek@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org, Andreas Gustafsson <gson@gson.org>,
chs@netbsd.org
Subject: Re: kern/55004 (Hundreds of file system tests now fail on real
hardware)
Date: Sat, 22 Feb 2020 22:11:44 +0000
I still see more breakage, probably due to the recent removal of aiodoned
intersecting with LFS brain damage.
fs_cleanerd[27468]: /mnt: attaching cleaner
[ 1.1700090] panic: kernel diagnostic assertion "giantcnt == 1" failed: file "klock.c", line 127
[ 1.1700090] rump kernel halting...
halted
Thread 42 "" received signal SIGABRT, Aborted.
[Switching to LWP 42 of process 27468]
0x000070110c58609a in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0 0x000070110c58609a in _lwp_kill () from /usr/lib/libc.so.12
#1 0x000070110c58643a in abort () from /usr/lib/libc.so.12
#2 0x000070110e608713 in ?? () from /usr/lib/librumpuser.so.0
#3 0x000070110eaf0efa in cpu_reboot (howto=4, bootstr=0x0) at emul.c:431
#4 0x000070110ea8a8dd in kern_reboot (howto=4, bootstr=0x0) at /home/ad/src/sys/rump/librump/rumpkern/../../../kern/kern_reboot.c:61
#5 0x000070110ea874d5 in vpanic (fmt=0x70110eb097a8 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=0x7010ffb4fd48) at /home/ad/src/sys/rump/librump/rumpkern/../../../kern/subr_prf.c:336
#6 0x000070110ea67c63 in kern_assert (fmt=0x70110eb097a8 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ") at /home/ad/src/sys/rump/librump/rumpkern/../../../lib/libkern/kern_assert.c:51
#7 0x000070110eaf2d5d in _kernel_unlock (nlocks=-1, countp=0x0) at klock.c:127
#8 0x00007011126535cd in lfs_free_aiodone (bp=0x70111291f3c8) at /home/ad/src/sys/rump/fs/lib/liblfs/../../../../ufs/lfs/lfs_segment.c:2516
#9 0x000070110f266737 in biodone2 (bp=0x70111291f3c8) at /home/ad/src/sys/rump/librump/rumpvfs/../../../kern/vfs_bio.c:1702
#10 0x000070110f266616 in biodone (bp=0x70111291f3c8) at /home/ad/src/sys/rump/librump/rumpvfs/../../../kern/vfs_bio.c:1666
#11 0x0000701112653aa9 in lfs_cluster_aiodone (bp=0x70111291f758) at /home/ad/src/sys/rump/fs/lib/liblfs/../../../../ufs/lfs/lfs_segment.c:2621
#12 0x000070110f266737 in biodone2 (bp=0x70111291f758) at /home/ad/src/sys/rump/librump/rumpvfs/../../../kern/vfs_bio.c:1702
#13 0x000070110f266616 in biodone (bp=0x70111291f758) at /home/ad/src/sys/rump/librump/rumpvfs/../../../kern/vfs_bio.c:1666
#14 0x000070110f25f3aa in rump_biodone (arg=0x70111291f758, count=1536, error=0) at rump_vfs.c:521
#15 0x000070110e60722f in ?? () from /usr/lib/librumpuser.so.0
#16 0x000070110e607313 in ?? () from /usr/lib/librumpuser.so.0
#17 0x000070110e20caf2 in ?? () from /usr/lib/libpthread.so.1
#18 0x000070110c48fd10 in ?? () from /usr/lib/libc.so.12
#19 0x0000000000000000 in ?? ()
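The assertion in frame #7 fires in _kernel_unlock() when it is called with nlocks=-1. Below is a rough single-threaded sketch of the kind of hold counter such a check guards. It assumes that nlocks == -1 means "release exactly one hold, which must be the only one" and that nlocks == 0 means "release all holds". Those semantics are an assumption made for illustration, not a quote from klock.c, and the demo_* names are hypothetical. The trace does not show what value giantcnt actually had.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical hold counter for one recursive "big lock"; no real locking. */
static int demo_giantcnt;

/* Take nlocks additional holds on the big lock. */
void
demo_kernel_lock(int nlocks)
{
	assert(nlocks > 0);
	demo_giantcnt += nlocks;
}

/* True when a "release the last hold" request (nlocks == -1) would be valid. */
bool
demo_unlock_last_ok(void)
{
	return demo_giantcnt == 1;
}

/*
 * Release holds and return how many were released.
 *   nlocks == 0   release every hold currently taken
 *   nlocks == -1  release one hold, asserting that it is the only one
 *   nlocks > 0    release exactly nlocks holds
 */
int
demo_kernel_unlock(int nlocks)
{
	if (nlocks == 0) {
		nlocks = demo_giantcnt;
	} else if (nlocks == -1) {
		assert(demo_giantcnt == 1);	/* the kind of check seen in the panic */
		nlocks = 1;
	}
	assert(nlocks >= 0 && nlocks <= demo_giantcnt);
	demo_giantcnt -= nlocks;
	return nlocks;
}
```

Under these assumptions, calling demo_kernel_unlock(-1) from a context that holds the lock zero times, or more than once, trips the assertion. That is the general shape of failure a completion callback can hit when it starts running in a different thread context than before.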
From: "Andrew Doran" <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/55004 CVS commit: src/sys/ufs/lfs
Date: Sat, 22 Feb 2020 22:20:47 +0000
Module Name: src
Committed By: ad
Date: Sat Feb 22 22:20:47 UTC 2020
Modified Files:
src/sys/ufs/lfs: lfs_segment.c
Log Message:
Make LFS/rump play nice with aiodoned removal.
PR kern/55004 (Hundreds of file system tests now fail on real hardware)
To generate a diff of this commit:
cvs rdiff -u -r1.282 -r1.283 src/sys/ufs/lfs/lfs_segment.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Sun, 23 Feb 2020 01:46:02 +0000
State-Changed-Why:
ad committed a change to rump_init() to call config_init() early; this should
fix the problem. Can you confirm it works now?
From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/55004 CVS commit: src/sys/rump/librump/rumpdev
Date: Sun, 23 Feb 2020 01:53:03 +0000
Module Name: src
Committed By: jdolecek
Date: Sun Feb 23 01:53:03 UTC 2020
Modified Files:
src/sys/rump/librump/rumpdev: rump_dev.c
Log Message:
no need to call config_init_mi() in rumpdev any more - rump_init() now calls
config_init(), and the sysctl shouldn't be needed
PR kern/55004
To generate a diff of this commit:
cvs rdiff -u -r1.27 -r1.28 src/sys/rump/librump/rumpdev/rump_dev.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: feedback->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 26 Feb 2020 15:29:23 +0000
State-Changed-Why:
Confirmed fixed, thanks.
>Unformatted:
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.