NetBSD Problem Report #53391
From martin@duskware.de Sun Jun 24 15:08:35 2018
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 735917A281
for <gnats-bugs@gnats.NetBSD.org>; Sun, 24 Jun 2018 15:08:35 +0000 (UTC)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: lib/libc/gen/posix_spawn/t_spawnattr hangs on uniprocessor
X-Send-Pr-Version: 3.95
>Number: 53391
>Category: bin
>Synopsis: lib/libc/gen/posix_spawn/t_spawnattr hangs on uniprocessor
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: martin
>State: feedback
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Jun 24 15:10:00 +0000 2018
>Closed-Date:
>Last-Modified: Thu Apr 04 06:10:01 +0000 2024
>Originator: Martin Husemann
>Release: NetBSD 8.0_RC2
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD unpluged.duskware.de 8.0_RC2 NetBSD 8.0_RC2 (UNPLUGED) #2: Sun Jun 24 13:41:20 CEST 2018 martin@seven-days-to-the-wolves.aprisoft.de:/work/src-8/sys/arch/evbarm/compile/UNPLUGED evbarm
Architecture: earm
Machine: evbarm
>Description:
The lib/libc/gen/posix_spawn/t_spawnattr test hangs reproducibly
on a single-processor ARM machine (and did the same at least
once on a multiprocessor alpha).
The test spawns a helper and ends with:
	ATF_REQUIRE_MSG(pid == getpgid(pid), "child pid: %d, child pgid: %d",
	    pid, getpgid(pid));

	/* ready, let child go */
	write(pfd[1], "q", 1);
	close(pfd[0]);
	close(pfd[1]);

	/* wait and check result from child */
	waitpid(pid, &status, 0);
	ATF_REQUIRE(WIFEXITED(status) && WEXITSTATUS(status) == EXIT_SUCCESS);

	posix_spawnattr_destroy(&attr);
but when hanging, the test process itself is trying to exit, while
the helper process is still reading commands from the pipe.
  PID TTY    STAT     TIME COMMAND
    0 ?      DKl   0:15.62 [system]
    1 ?      Is    0:00.13 - init
[..]
21297 ?      I     0:00.01 |-- h_spawnattr 3
  581 pts/0- I     0:00.03 |-- /bin/sh ./test.sh
  490 pts/0- I     0:00.01 | `-- /bin/sh ./test.sh
   46 pts/0- I     0:00.79 | |-- tee /test-bed/work/atf.raw
   47 pts/0- I     0:10.30 | |-- atf-run
22326 ?      Z     0:00.00 | | `-- (t_spawnattr)
   71 pts/0- I     0:00.16 | |-- tee /test-bed/work/atf.log
  628 pts/0- I     0:02.28 | `-- atf-report -oxml:/test-bed/work/atf.xml -oticker:-
>How-To-Repeat:
cd /usr/tests/lib/libc/gen/posix_spawn && atf-run | atf-report
>Fix:
n/a
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: gnats-admin->martin
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Mon, 25 Jun 2018 05:27:25 +0000
Responsible-Changed-Why:
Take
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: bin/53391: lib/libc/gen/posix_spawn/t_spawnattr hangs
Date: Mon, 25 Jun 2018 07:30:26 +0200
The issue consists of multiple things:
- the kernel part fails, we get the wrong scheduler priority
(I think this was fixed fairly recently on head by Christos)
- the test case is broken: it aborts without notifying
the helper/child process
- not all siblings terminate properly, so atf-run hangs
Martin
State-Changed-From-To: open->feedback
State-Changed-By: maya@NetBSD.org
State-Changed-When: Tue, 26 Jun 2018 21:17:37 +0000
State-Changed-Why:
this one was a missed pullup by me following pullup-8 892, thanks martin for figuring it out.
is it fixed now?
State-Changed-From-To: feedback->analyzed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Wed, 27 Jun 2018 04:10:48 +0000
State-Changed-Why:
It works for now, but a test failure (in one of the ATF_REQUIRE checks)
would still be fatal and kill the whole test run. The test case is
missing proper cleanup - I'll add that in -current.
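
For illustration only, here is one way such cleanup could be structured.
This is a sketch under assumptions, not the change that went into
-current: the function name check_then_cleanup is hypothetical, fork()
stands in for the posix_spawn() of the helper, and the check result is
passed in as a flag rather than evaluated with ATF_REQUIRE. The point is
that the helper is released and reaped before any verdict is reported.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical sketch, not the committed fix: run a check while a helper
 * child is parked on a pipe, and report the verdict only after the helper
 * has been released and reaped.
 *
 * Returns 0 if the check passed and the helper exited cleanly, 1 if the
 * check failed (the helper is still reaped), -1 on setup or helper failure.
 */
int
check_then_cleanup(bool check_ok)
{
	int pfd[2], status;
	bool released;
	pid_t pid;

	if (pipe(pfd) == -1)
		return -1;

	pid = fork();	/* stand-in for posix_spawn() of the helper */
	if (pid == -1) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}
	if (pid == 0) {
		char c;

		/* helper: wait for the "go" byte (or EOF), then exit */
		close(pfd[1]);
		_exit(read(pfd[0], &c, 1) >= 0 ? EXIT_SUCCESS : EXIT_FAILURE);
	}

	/*
	 * An ATF_REQUIRE failing here would abort immediately.  Instead the
	 * verdict is carried in check_ok and the cleanup below always runs.
	 */

	/* ready, let child go */
	released = write(pfd[1], "q", 1) == 1;
	close(pfd[0]);
	close(pfd[1]);

	/* wait and check result from child */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (!released || !WIFEXITED(status) ||
	    WEXITSTATUS(status) != EXIT_SUCCESS)
		return -1;

	return check_ok ? 0 : 1;
}
```

In a real ATF test case the returned value would be fed to ATF_REQUIRE
only after this function has returned, so a failing check can no longer
leave the helper blocked on the pipe.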
State-Changed-From-To: analyzed->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Thu, 04 Apr 2024 05:40:17 +0000
State-Changed-Why:
Is this still an issue? I'm not sure I saw changes addressing it
but it's not immediately obvious what changes to look for.
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: bin/53391 (lib/libc/gen/posix_spawn/t_spawnattr hangs on
uniprocessor)
Date: Thu, 4 Apr 2024 08:05:04 +0200
On Thu, Apr 04, 2024 at 05:40:17AM +0000, riastradh@NetBSD.org wrote:
> Is this still an issue? I'm not sure I saw changes addressing it
> but it's not immediately obvious what changes to look for.
It is not hanging, and I think my analysis back then was wrong - when the
test program exits (due to ATF_REQUIRE failing) the pipe should be closed
and the child should exit. The zombie process shown in the old ps output
must have been caused by some other (now fixed) kernel bug.
Martin
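
The pipe behaviour described above can be checked with a small
stand-alone sketch. It assumes a helper that reads until end-of-file,
uses fork() instead of posix_spawn(), and the function name
child_exits_on_pipe_eof is hypothetical. Closing the parent's write end
has the same effect on the pipe as the parent exiting, provided the
child does not keep its own copy of the write end open.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical demo: fork a child that reads from a pipe until EOF, then
 * close the parent's write end without sending anything.
 *
 * Returns 1 if the child saw EOF and exited with EXIT_SUCCESS, 0 if it
 * terminated any other way, -1 on setup error.
 */
int
child_exits_on_pipe_eof(void)
{
	int pfd[2], status;
	pid_t pid;

	if (pipe(pfd) == -1)
		return -1;

	pid = fork();
	if (pid == -1) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}
	if (pid == 0) {
		char c;
		ssize_t n;

		/* the child must drop its write end, or it never sees EOF */
		close(pfd[1]);
		while ((n = read(pfd[0], &c, 1)) > 0)
			continue;
		_exit(n == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
	}

	/* same effect on the pipe as the parent process exiting */
	close(pfd[0]);
	close(pfd[1]);

	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) && WEXITSTATUS(status) == EXIT_SUCCESS;
}
```

If the close(pfd[1]) in the child is removed, the read() never returns 0
and the parent blocks in waitpid(), which is the kind of hang the
original report suspected.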
>Unformatted:
Copyright © 1994-2024
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.