NetBSD Problem Report #54111

From martin@duskware.de  Wed Apr 10 05:07:06 2019
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 7CC297A1AB
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 10 Apr 2019 05:07:06 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: t_ptrace_wait* hangs and kills test runs
X-Send-Pr-Version: 3.95

>Number:         54111
>Category:       kern
>Synopsis:       t_ptrace_wait* hangs and kills test runs
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kamil
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Apr 10 05:10:00 +0000 2019
>Closed-Date:    Fri Jul 05 05:14:32 +0000 2019
>Last-Modified:  Fri Jul 05 05:14:32 +0000 2019
>Originator:     Martin Husemann
>Release:        NetBSD 8.99.37
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD thirdstage.duskware.de 8.99.37 NetBSD 8.99.37 (MODULAR) #169: Tue Apr 9 18:41:05 CEST 2019 martin@thirdstage.duskware.de:/usr/src/sys/arch/sparc64/compile/MODULAR sparc64
Architecture: sparc64
Machine: sparc64
>Description:

Running the atf tests on a multiprocessor sparc64 machine hangs (randomly,
but nearly every time) in one of the t_ptrace_wait* tests.

The exiting test program is not collected, and any further test activity
is blocked. This kills the regular test runs on real hardware
and is a netbsd-9 branch blocker.

Excerpt from ps axwwwwd:

  PID TTY    STAT    TIME COMMAND
    0 ?      OKl  7:45.83 [system]
    1 ?      Is   0:00.06 - init 
 7955 ?      D    0:00.00 |-- t_ptrace_wait6 -r/tmp/atf-run.Aq6p7i/tcr -s/usr/tests/lib/libc/sys -vunprivileged-user clone_vfork_signalignored:body 
 3911 ?      Z    0:00.00 | `-- (t_ptrace_wait6)
   34 pts/0- I    0:00.02 |-- /bin/sh ./test.sh 
  435 pts/0- I    0:00.01 | `-- /bin/sh ./test.sh 
  425 pts/0- I    0:02.23 |   |-- atf-report -oxml:/test-bed/work/atf.xml -oticker:- 
  430 pts/0- I    0:18.51 |   |-- atf-run 
 4954 ?      Z    0:00.00 |   | `-- (t_ptrace_wait6)
  431 pts/0- I    0:00.86 |   |-- tee /test-bed/work/atf.raw 
  445 pts/0- I    0:00.22 |   `-- tee /test-bed/work/atf.log 

PID 4954 is showing the issue.

Test machine dmesg at: https://www.NetBSD.org/~martin/sparc64-atf/dmesg.txt

I have also seen similar hangs with other variants (ISTR t_ptrace_wait4
at least).

>How-To-Repeat:
See above.

>Fix:
n/a

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->kamil
Responsible-Changed-By: kamil@NetBSD.org
Responsible-Changed-When: Wed, 10 Apr 2019 11:47:53 +0200
Responsible-Changed-Why:
Take.


From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/54111: t_ptrace_wait* hangs and kills test runs
Date: Mon, 15 Apr 2019 17:22:31 +0200

 This also happens on multiprocessor evbarm machines:

   PID TTY    STAT    TIME COMMAND
 29272 ?      D    0:00.00 |-- t_ptrace_wait6 -r/tmp/atf-run.6XQgPa/tcr -s/usr/tests/lib/libc/sys -vunprivileged-user clone_vfork_signalmasked:body 
  9906 ?      Z    0:00.00 | `-- (t_ptrace_wait6)
   501 pts/0- I    0:00.03 |-- /bin/sh ./test.sh 
   596 pts/0- I    0:00.01 | `-- /bin/sh ./test.sh 
    34 pts/0- I    0:03.75 |   |-- atf-report -oxml:/test-bed/work/atf.xml -oticker:- 
   476 pts/0- I    0:02.26 |   |-- tee /test-bed/work/atf.raw 
   553 pts/0- I    0:34.14 |   |-- atf-run 
 10792 ?      Z    0:00.00 |   | `-- (t_ptrace_wait6)
   606 pts/0- I    0:00.55 |   `-- tee /test-bed/work/atf.log 

 Martin

From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54111 CVS commit: src/tests/lib/libc/sys
Date: Mon, 15 Apr 2019 16:47:47 +0000

 Module Name:	src
 Committed By:	kamil
 Date:		Mon Apr 15 16:47:47 UTC 2019

 Modified Files:
 	src/tests/lib/libc/sys: t_ptrace_wait.c

 Log Message:
 Temporarily ifdef out PTRACE_VFORK and PTRACE_VFORKDONE tests

 It's not reliable on all ports. sparc and evbarm are known to hang.

 PR kern/54111 by Martin Husemann


 To generate a diff of this commit:
 cvs rdiff -u -r1.108 -r1.109 src/tests/lib/libc/sys/t_ptrace_wait.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@netbsd.org, kamil@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org, martin@NetBSD.org
Cc: 
Subject: Re: PR/54111 CVS commit: src/tests/lib/libc/sys
Date: Tue, 16 Apr 2019 23:54:24 -0400

 On Apr 15,  4:50pm, kamil@netbsd.org ("Kamil Rytarowski") wrote:
 -- Subject: PR/54111 CVS commit: src/tests/lib/libc/sys

 | The following reply was made to PR kern/54111; it has been noted by GNATS.
 | 
 | From: "Kamil Rytarowski" <kamil@netbsd.org>
 | To: gnats-bugs@gnats.NetBSD.org
 | Cc: 
 | Subject: PR/54111 CVS commit: src/tests/lib/libc/sys
 | Date: Mon, 15 Apr 2019 16:47:47 +0000
 | 
 |  Module Name:	src
 |  Committed By:	kamil
 |  Date:		Mon Apr 15 16:47:47 UTC 2019
 |  
 |  Modified Files:
 |  	src/tests/lib/libc/sys: t_ptrace_wait.c
 |  
 |  Log Message:
 |  Temporarily ifdef out PTRACE_VFORK and PTRACE_VFORKDONE tests
 |  
 |  It's not reliable on all ports. sparc and evbarm are known to hang.
 |  

 This breaks all llvm builds.

 http://releng.netbsd.org/builds/HEAD-llvm/201904162100Z/amd64.build.failed

 christos

From: Kamil Rytarowski <n54@gmx.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: PR/54111 CVS commit: src/tests/lib/libc/sys
Date: Wed, 17 Apr 2019 14:48:58 +0200

 On 17.04.2019 06:00, Christos Zoulas wrote:
 > The following reply was made to PR kern/54111; it has been noted by GNATS.
 >
 > From: christos@zoulas.com (Christos Zoulas)
 > To: gnats-bugs@netbsd.org, kamil@netbsd.org, gnats-admin@netbsd.org,
 > 	netbsd-bugs@netbsd.org, martin@NetBSD.org
 > Cc:
 > Subject: Re: PR/54111 CVS commit: src/tests/lib/libc/sys
 > Date: Tue, 16 Apr 2019 23:54:24 -0400
 >
 >  On Apr 15,  4:50pm, kamil@netbsd.org ("Kamil Rytarowski") wrote:
 >  -- Subject: PR/54111 CVS commit: src/tests/lib/libc/sys
 >
 >  | The following reply was made to PR kern/54111; it has been noted by GNATS.
 >  |
 >  | From: "Kamil Rytarowski" <kamil@netbsd.org>
 >  | To: gnats-bugs@gnats.NetBSD.org
 >  | Cc:
 >  | Subject: PR/54111 CVS commit: src/tests/lib/libc/sys
 >  | Date: Mon, 15 Apr 2019 16:47:47 +0000
 >  |
 >  |  Module Name:	src
 >  |  Committed By:	kamil
 >  |  Date:		Mon Apr 15 16:47:47 UTC 2019
 >  |
 >  |  Modified Files:
 >  |  	src/tests/lib/libc/sys: t_ptrace_wait.c
 >  |
 >  |  Log Message:
 >  |  Temporarily ifdef out PTRACE_VFORK and PTRACE_VFORKDONE tests
 >  |
 >  |  It's not reliable on all ports. sparc and evbarm are known to hang.
 >  |
 >
 >  This breaks all llvm builds.
 >
 >  http://releng.netbsd.org/builds/HEAD-llvm/201904162100Z/amd64.build.failed
 >
 >  christos
 >
 >

 I will fix it!

State-Changed-From-To: open->closed
State-Changed-By: kamil@NetBSD.org
State-Changed-When: Fri, 05 Jul 2019 07:14:32 +0200
State-Changed-Why:
Fixed as of 8.99.50.


>Unformatted:
