NetBSD Problem Report #53391

From martin@duskware.de  Sun Jun 24 15:08:35 2018
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 735917A281
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 24 Jun 2018 15:08:35 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: lib/libc/gen/posix_spawn/t_spawnattr hangs on uniprocessor
X-Send-Pr-Version: 3.95

>Number:         53391
>Category:       bin
>Synopsis:       lib/libc/gen/posix_spawn/t_spawnattr hangs on uniprocessor
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    martin
>State:          feedback
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Jun 24 15:10:00 +0000 2018
>Closed-Date:    
>Last-Modified:  Thu Apr 04 06:10:01 +0000 2024
>Originator:     Martin Husemann
>Release:        NetBSD 8.0_RC2
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD unpluged.duskware.de 8.0_RC2 NetBSD 8.0_RC2 (UNPLUGED) #2: Sun Jun 24 13:41:20 CEST 2018 martin@seven-days-to-the-wolves.aprisoft.de:/work/src-8/sys/arch/evbarm/compile/UNPLUGED evbarm
Architecture: earm
Machine: evbarm
>Description:

The lib/libc/gen/posix_spawn/t_spawnattr test hangs reproducibly
on a single-processor ARM machine (and did the same at least once
on a multiprocessor alpha).

The test spawns a helper and ends with:

        ATF_REQUIRE_MSG(pid == getpgid(pid), "child pid: %d, child pgid: %d",
            pid, getpgid(pid));

        /* ready, let child go */
        write(pfd[1], "q", 1);
        close(pfd[0]);
        close(pfd[1]);

        /* wait and check result from child */
        waitpid(pid, &status, 0);
        ATF_REQUIRE(WIFEXITED(status) && WEXITSTATUS(status) == EXIT_SUCCESS);

        posix_spawnattr_destroy(&attr);

but when hanging, the test process itself is trying to exit, while
the helper process is still reading commands from the pipe.

  PID TTY    STAT    TIME COMMAND
    0 ?      DKl  0:15.62 [system]
    1 ?      Is   0:00.13 - init 
[..]
21297 ?      I    0:00.01 |-- h_spawnattr 3 
  581 pts/0- I    0:00.03 |-- /bin/sh ./test.sh 
  490 pts/0- I    0:00.01 | `-- /bin/sh ./test.sh 
   46 pts/0- I    0:00.79 |   |-- tee /test-bed/work/atf.raw 
   47 pts/0- I    0:10.30 |   |-- atf-run 
22326 ?      Z    0:00.00 |   | `-- (t_spawnattr)
   71 pts/0- I    0:00.16 |   |-- tee /test-bed/work/atf.log 
  628 pts/0- I    0:02.28 |   `-- atf-report -oxml:/test-bed/work/atf.xml -oticker:- 


>How-To-Repeat:

cd /usr/tests/lib/libc/gen/posix_spawn && atf-run | atf-report

>Fix:
n/a

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: gnats-admin->martin
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Mon, 25 Jun 2018 05:27:25 +0000
Responsible-Changed-Why:
Take


From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: bin/53391: lib/libc/gen/posix_spawn/t_spawnattr hangs
Date: Mon, 25 Jun 2018 07:30:26 +0200

 The issue consists of multiple things:

  - the kernel part fails, we get the wrong scheduler priority
    (I think this was fixed on HEAD quite recently by Christos)
  - the test case is broken: it aborts without notifying
    the helper/child process
  - not all siblings terminate properly, atf-run hangs


 Martin

State-Changed-From-To: open->feedback
State-Changed-By: maya@NetBSD.org
State-Changed-When: Tue, 26 Jun 2018 21:17:37 +0000
State-Changed-Why:
This one was a pullup I missed following pullup-8 892; thanks martin for figuring it out.
Is it fixed now?


State-Changed-From-To: feedback->analyzed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Wed, 27 Jun 2018 04:10:48 +0000
State-Changed-Why:
It works for now, but a test failure (in one of the ATF_REQUIRE checks)
would still be fatal and kill the whole test run. The test case is
missing proper cleanup - I'll add that in -current.


State-Changed-From-To: analyzed->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Thu, 04 Apr 2024 05:40:17 +0000
State-Changed-Why:
Is this still an issue?  I'm not sure I saw changes addressing it
but it's not immediately obvious what changes to look for.


From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: bin/53391 (lib/libc/gen/posix_spawn/t_spawnattr hangs on
 uniprocessor)
Date: Thu, 4 Apr 2024 08:05:04 +0200

 On Thu, Apr 04, 2024 at 05:40:17AM +0000, riastradh@NetBSD.org wrote:
 > Is this still an issue?  I'm not sure I saw changes addressing it
 > but it's not immediately obvious what changes to look for.

 It is not hanging, and I think my analysis back then was wrong - when the
 test program exits (due to ATF_REQUIRE failing) the pipe should be closed
 and the child should exit. The zombie process shown in the old ps output
 must have been some other (now fixed) kernel bug.

 Martin

>Unformatted:
