NetBSD Problem Report #57257

From www@netbsd.org  Fri Mar  3 23:58:59 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 3C0521A9239
	for <gnats-bugs@gnats.NetBSD.org>; Fri,  3 Mar 2023 23:58:59 +0000 (UTC)
Message-Id: <20230303235827.BB7DC1A923C@mollari.NetBSD.org>
Date: Fri,  3 Mar 2023 23:58:27 +0000 (UTC)
From: sunqingyao19970825@icloud.com
Reply-To: sunqingyao19970825@icloud.com
To: gnats-bugs@NetBSD.org
Subject: Nine unexpected tests(7) failure cases on Vultr
X-Send-Pr-Version: www-1.0

>Number:         57257
>Category:       lib
>Synopsis:       Nine unexpected tests(7) failure cases on Vultr
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Mar 04 00:00:00 +0000 2023
>Last-Modified:  Sat Mar 04 07:25:01 +0000 2023
>Originator:     Qingyao Sun
>Release:        9.3
>Organization:
The University of Chicago
>Environment:
NetBSD vigor3.your.domain 9.3 NetBSD 9.3 (GENERIC) #0: Thu Aug  4 15:30:37 UTC 2022  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
This should really be a follow-up of #57256, but I cannot figure out how to amend an existing problem report, so here we are.

After submitting problem report #57256, I decided to re-run the whole test suite on another Vultr instance to see if I can replicate the issue. It turns out that this time I get 9 unexpected failures (with some overlap with #57256)! See the relevant part from the test log below.

================================================
Content-Type: application/X-atf-tps; version="3"

info: atf.version, Automated Testing Framework 0.20 (atf-0.20)
info: tests.root, /usr/tests
info: time.start, Fri Mar 3 15:54:00 CST 2023
info: uname.sysname, NetBSD
info: uname.nodename, vigor3.your.domain
info: uname.release, 9.3
info: uname.version, NetBSD 9.3 (GENERIC) #0: Thu Aug 4 15:30:37 UTC 2022 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC
info: uname.machine, amd64
info: env, ENV=/root/.shrc
info: env, BLOCKSIZE=1k
info: env, PWD=/usr/tests
info: env, MAIL=/var/mail/qys
info: env, HOME=/root
info: env, _=/etc/shrc
info: env, PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/pkg/sbin:/usr/pkg/bin:/usr/X11R7/bin:/usr/local/sbin:/usr/local/bin
info: env, HOST=vigor3.your.domain
info: env, SSH_CONNECTION=205.208.121.48 57692 216.128.150.65 22
info: env, SSH_TTY=/dev/pts/0
info: env, TERM=screen
info: env, USER=qys
info: env, TMUX=/tmp//tmux-0/default,824,0
info: env, SSH_AUTH_INFO_0=publickey ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIO0/4Oa71RcyVFoFnt6GhUT1CMHFIcJHTfru9yYDsaQd
info: env, SSH_CLIENT=205.208.121.48 57692 22
info: env, EDITOR=vi
info: env, LOGNAME=qys
info: env, SU_FROM=qys
info: env, SHELL=/bin/sh
info: env, TMUX_PANE=%0
tps-count: 844
[......]
tc-start: 1677881109.776540, tracer_sysctl_lookup_without_duplicates
tc-end: 1677881112.850039, tracer_sysctl_lookup_without_duplicates, failed, /usr/src/tests/lib/libc/sys/t_ptrace_wait.c:2428: msg_read_chil
d("tracer ready" " from parent " "parent_tracer", &parent_tracer, &msg, sizeof(msg)) == 0: Undefined error: 0
[......]
tc-start: 1677881132.416845, tracer_sysctl_lookup_without_duplicates
tc-end: 1677881433.428762, tracer_sysctl_lookup_without_duplicates, failed, Test case timed out after 300 seconds
[......]
tc-start: 1677881674.572806, threxec
tc-so:Executing command [ rump_server unix://csock ]
tc-so:Executing command [ /usr/tests/lib/librumpclient/h_execthr ]
tc-end: 1677881975.607304, threxec, failed, Test case timed out after 300 seconds
[......]
tc-start: 1677882399.319857, pingsize
tc-so:[   1.1500090] panic: kernel diagnostic assertion "!(nextpkt > busmem->shm_last && sc->sc_devgen == busmem->shm_gen)" failed: file "/
usr/src/sys/rump/net/lib/libshmif/if_shmem.c", line 763
tc-so:[   1.1500090] rump kernel halting...
tc-so:halted
tc-end: 1677882699.329178, pingsize, failed, Test case timed out after 300 seconds
[......]
tc-start: 1677884463.119527, npf_gc
tc-so:Executing command [ npfctl debug -c /usr/tests/net/npf/npftest.conf -o ./npf.plist ]
tc-so:Executing command [ npftest -c npf.plist -T gc ]
tc-se:Fail: program did not exit cleanly
tc-se:stdout:
tc-se:
tc-se:stderr:
tc-se:
tc-end: 1677884463.173990, npf_gc, failed, atf-check failed; see the output of the test for details
[......]
tc-start: 1677884777.16265, get_nfs_le_1_group
tc-so:Executing command [ newfs -q group -B le -O 1 -s 4000 -F fsimage ]
tc-so:Executing command [ rump_server -lrumpvfs -lrumpdev -lrumpnet -lrumpnet_net -lrumpnet_netinet -lrumpnet_netinet6 -lrumpnet_local -lru
mpnet_shmif -lrumpdev_disk -lrumpfs_ffs -lrumpfs_nfs -lrumpfs_nfsserver -d key=/dk,hostpath=fsimage,size=host unix://test ]
tc-so:Executing command [ rump.ifconfig shmif0 create ]
tc-so:Executing command [ rump.ifconfig shmif0 linkstr shmbus ]
tc-so:Executing command [ rump.ifconfig shmif0 inet 10.1.1.1 ]
tc-so:Executing command [ mkdir /rump/etc ]
tc-so:Executing command [ mkdir /rump/export ]
tc-so:Executing command [ mkdir -p /rump/var/run ]
tc-so:Executing command [ mkdir -p /rump/var/db ]
tc-so:Executing command [ touch /rump/var/db/mountdtab ]
tc-so:Executing command [ mount_ffs /dk /rump/export ]
tc-so:Executing command [ env LD_PRELOAD=/usr/lib/librumphijack.so RUMPHIJACK=vfs=getvfsstat,blanket=/export edquota -g -s10k/20 -h40M/50k
-t 2W/3D 0 ]
tc-so:Executing command [ env LD_PRELOAD=/usr/lib/librumphijack.so RUMPHIJACK=vfs=getvfsstat,blanket=/export quota -g -v ]
tc-so:Executing command [ rpcbind ]
tc-so:Executing command [ mountd /rump/etc/exports ]
tc-so:Executing command [ nfsd ]
tc-so:Executing command [ /usr/libexec/rpc.rquotad ]
tc-so:Executing command [ rump_server -lrumpvfs -lrumpnet -lrumpnet_net -lrumpnet_netinet -lrumpnet_shmif -lrumpfs_nfs unix://clientsock ]
tc-so:Executing command [ rump.ifconfig shmif0 create ]
tc-so:Executing command [ rump.ifconfig shmif0 linkstr shmbus ]
tc-so:Executing command [ rump.ifconfig shmif0 inet 10.1.1.100 ]
tc-so:Executing command [ mkdir /rump/mnt ]
tc-so:Executing command [ mount_nfs 10.1.1.1:/export /rump/mnt ]
tc-se:Fail: stderr not empty
tc-se:--- /dev/null     2023-03-03 23:06:17.119422972 +0000
tc-se:+++ /tmp/check.EoZjfB/stderr      2023-03-03 23:06:17.284291858 +0000
tc-se:@@ -0,0 +1 @@
tc-se:+mount_nfs: rpcbind to nfs on server: RPC: Program not registered
tc-end: 1677884837.322765, get_nfs_le_1_group, failed, atf-check failed; see the output of the test for details
[......]
tc-start: 1677884985.251057, nfs_renamerace_cycle
tc-se:Getting export list.
tc-se:Got line
tc-se:Got line /myexport -noresvport -noresvmnt -maproot=0:0 10.3.2.2
tc-se:Making new ep fs=0xc500,0x78b
tc-se:doing opt -noresvport -noresvmnt -maproot=0:0 10.3.2.2
tc-se:doing opt -noresvmnt -maproot=0:0 10.3.2.2
tc-se:doing opt -maproot=0:0 10.3.2.2
tc-se:got host 10.3.2.2
tc-se:Got line /myexport -ro -noresvport -noresvmnt -maproot=0:0 10.4.2.2
tc-se:Found ep fs=0xc500,0x78b
tc-se:doing opt -ro -noresvport -noresvmnt -maproot=0:0 10.4.2.2
tc-se:doing opt -noresvport -noresvmnt -maproot=0:0 10.4.2.2
tc-se:doing opt -noresvmnt -maproot=0:0 10.4.2.2
tc-se:doing opt -maproot=0:0 10.4.2.2
tc-se:got host 10.4.2.2
tc-se:Getting mount list.
tc-se:Here we go.
tc-se:t_renamerace: rpcbind to nfs on server: RPC: Program not registered
tc-se:can't register with udp6 portmap
tc-se:can't register with tcp6 portmap
tc-se:got mount request from 10.3.2.2
tc-se:-> rpcpath: /myexport
tc-se:-> dirpath: /myexport
tc-se:Mount successful.
tc-so:[   1.0000090] WARNING: ZFS on NetBSD is under development
tc-so:[   1.0000090] ZFS filesystem version: 5
tc-so:[   2.0300090] 10.3.2.1:/myexport: inaccurate wcc data (ctime) detected, disabling wcc (ctime 1677884986.316616157 1677884986.3166161
57, mtime 1677884986.316616157 1677884986.316616157)
tc-end: 1677884996.343187, nfs_renamerace_cycle, failed, Test case was expecting a failure but none were raised
[......]
tc-start: 1677885019.587842, puffs_renamerace_cycle
tc-se:readshovel r1 -1 / 6
tc-so:[   1.0000090] WARNING: ZFS on NetBSD is under development
tc-so:[   1.0000090] ZFS filesystem version: 5
tc-end: 1677885029.667648, puffs_renamerace_cycle, failed, Test case was expecting a failure but none were raised
[......]
tc-start: 1677885259.295544, puffs_writeopen
tc-so:[   1.0000090] WARNING: ZFS on NetBSD is under development
tc-so:[   1.0000090] ZFS filesystem version: 5
tc-end: 1677885560.345861, puffs_writeopen, failed, Test case timed out after 300 seconds
[......]
>How-To-Repeat:
Spin up a virtual machine on Vultr.com with the configuration "Cloud Compute + Intel High Frequency + 32GB NVMe ($6/month)", install NetBSD 9.3 on it with https://cdn.netbsd.org/pub/NetBSD/NetBSD-9.3/images/NetBSD-9.3-amd64.iso, and run

```bash
# cd /usr/tests
# atf-run | tee ~/tests.log | atf-report
```
>Fix:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/57257: Nine unexpected tests(7) failure cases on Vultr
Date: Sat, 4 Mar 2023 08:20:20 +0100

 Some of these failures are due to races, so won't be the same each run.
 Please compare your results against our own test runs:

 	http://releng.netbsd.org/test-results.html

 in your case probably the most comparable is:

 	https://www-soc.lip6.fr/~bouyer/NetBSD-tests/qemu/netbsd-9/amd64/

 and it has a varying number of unexpected failures (latest run right now
 shows 16) or

 	https://www-soc.lip6.fr/~bouyer/NetBSD-tests/xen/netbsd-9/

 which does slightly better (latest run 3).

 Martin

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.