NetBSD Problem Report #54786

From gson@gson.org  Thu Dec 19 08:26:09 2019
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 1267B7A173
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 19 Dec 2019 08:26:09 +0000 (UTC)
Message-Id: <20191219082603.ADD9C253FE6@guava.gson.org>
Date: Thu, 19 Dec 2019 10:26:03 +0200 (EET)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: Panic triggered by posix_spawn_kill_spawner test case
X-Send-Pr-Version: 3.95

>Number:         54786
>Category:       kern
>Synopsis:       Panic triggered by posix_spawn_kill_spawner test case
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kamil
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Dec 19 08:30:00 +0000 2019
>Closed-Date:    Tue Apr 07 06:44:27 +0000 2020
>Last-Modified:  Tue Apr 07 06:44:27 +0000 2020
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current
>Organization:
>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:

On the TNF i386 testbed, the ATF test runs have been randomly failing
to complete due to a panic triggered by the posix_spawn_kill_spawner
test case:

  i386/results/2019/2019.10.29.22.21.53/test.log.gz:    posix_spawn_kill_spawner: [ 8372.7092110] uvm_fault(0xc243ed68, 0, 1) -> 0xe
  i386/results/2019/2019.11.04.12.45.10/test.log.gz:    posix_spawn_kill_spawner: [ 7857.2869663] uvm_fault(0xc22e1f0c, 0, 1) -> 0xe
  i386/results/2019/2019.11.06.11.55.18/test.log.gz:    posix_spawn_kill_spawner: [ 7668.6998710] uvm_fault(0xc2441378, 0, 1) -> 0xe
  i386/results/2019/2019.11.13.07.56.10/test.log.gz:    posix_spawn_kill_spawner: [ 7276.8763671] uvm_fault(0xc2432870, 0, 1) -> 0xe
  i386/results/2019/2019.11.14.06.00.16/test.log.gz:    posix_spawn_kill_spawner: [ 7601.2529275] uvm_fault(0xc22d051c, 0, 1) -> 0xe
  i386/results/2019/2019.11.29.00.36.22/test.log.gz:    posix_spawn_kill_spawner: [ 7501.2478758] uvm_fault(0xc22d9a14, 0, 1) -> 0xe
  i386/results/2019/2019.12.08.19.52.37/test.log.gz:    posix_spawn_kill_spawner: [ 8422.8664408] uvm_fault(0xc237a0f8, 0, 1) -> 0xe
  i386/results/2019/2019.12.16.13.48.44/test.log.gz:    posix_spawn_kill_spawner: [ 7705.1209098] uvm_fault(0xc251d520, 0, 1) -> 0xe

The full console output from the latest failure is at:

  http://releng.netbsd.org/b5reports/i386/2019/2019.12.16.13.48.44/test.log

>How-To-Repeat:

Boot -current/i386 in qemu, log in as root, and run:

cd /usr/tests/lib/libc/sys/
while ./t_ptrace_waitpid posix_spawn_kill_spawner
do true
done

>Fix:

>Release-Note:

>Audit-Trail:
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc: gson@gson.org (Andreas Gustafsson)
Subject: Re: kern/54786: Panic triggered by posix_spawn_kill_spawner test case
Date: Sun, 5 Apr 2020 18:32:09 +0300

 The TNF i386 testbed has now panicked in the posix_spawn_kill_spawner
 test case in the last three test runs:

   http://releng.netbsd.org/b5reports/i386/commits-2020.04.html#2020.04.05.01.21.43

 Looks like this bug has changed from random to reproducible.
 -- 
 Andreas Gustafsson, gson@gson.org

Responsible-Changed-From-To: kern-bug-people->kamil
Responsible-Changed-By: kamil@NetBSD.org
Responsible-Changed-When: Mon, 06 Apr 2020 10:22:20 +0200
Responsible-Changed-Why:
Fix committed. Please check.


State-Changed-From-To: open->feedback
State-Changed-By: kamil@NetBSD.org
State-Changed-When: Mon, 06 Apr 2020 10:22:20 +0200
State-Changed-Why:


From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/54786 CVS commit: src/sys
Date: Mon, 6 Apr 2020 08:20:05 +0000

 Module Name:	src
 Committed By:	kamil
 Date:		Mon Apr  6 08:20:05 UTC 2020

 Modified Files:
 	src/sys/kern: kern_exec.c kern_fork.c kern_proc.c kern_sig.c
 	src/sys/sys: proc.h

 Log Message:
 Reintroduce struct proc::p_oppid

 Relying on p_opptr is not safe as there is a race between:
  - the spawner giving birth to a child process and being killed
  - the spawnee accessing p_opptr and reporting TRAP_CHLD

 PR kern/54786 by Andreas Gustafsson


 To generate a diff of this commit:
 cvs rdiff -u -r1.494 -r1.495 src/sys/kern/kern_exec.c
 cvs rdiff -u -r1.220 -r1.221 src/sys/kern/kern_fork.c
 cvs rdiff -u -r1.242 -r1.243 src/sys/kern/kern_proc.c
 cvs rdiff -u -r1.386 -r1.387 src/sys/kern/kern_sig.c
 cvs rdiff -u -r1.361 -r1.362 src/sys/sys/proc.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
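
The race named in the log message above can be illustrated with a small
userland model. This is only a sketch under stated assumptions, not the
committed kernel change: struct model_proc and all model_* functions are
hypothetical names invented for illustration, and the killed spawner is
modeled as a NULL pointer, which the report itself does not confirm. Only
the field names p_opptr and p_oppid come from the log message. The idea
shown is that a PID copied at creation time stays readable after the
spawner is gone, while a pointer to the spawner does not.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical userland model; this is NOT the NetBSD struct proc. */
struct model_proc {
	pid_t p_pid;			/* this process's PID */
	struct model_proc *p_opptr;	/* original parent; may go stale */
	pid_t p_oppid;			/* original parent's PID, copied once */
};

/* Create a child: keep the parent pointer and also a copy of its PID. */
static void
model_spawn(struct model_proc *child, struct model_proc *parent, pid_t pid)
{
	assert(parent != NULL);
	child->p_pid = pid;
	child->p_opptr = parent;
	child->p_oppid = parent->p_pid;
}

/* Model the spawner being killed (assumption: pointer becomes unusable). */
static void
model_parent_exit(struct model_proc *child)
{
	child->p_opptr = NULL;
}

/*
 * Pointer-based lookup. The model returns -1 where code that
 * dereferenced the pointer unconditionally would fault instead.
 */
static pid_t
model_report_via_pointer(const struct model_proc *child)
{
	if (child->p_opptr == NULL)
		return -1;
	return child->p_opptr->p_pid;
}

/* PID-based lookup: needs nothing from the (possibly gone) parent. */
static pid_t
model_report_via_pid(const struct model_proc *child)
{
	return child->p_oppid;
}
```

In this model, once model_parent_exit() has run, model_report_via_pointer()
can no longer produce the spawner's PID, whereas model_report_via_pid()
still returns the value recorded by model_spawn().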

State-Changed-From-To: feedback->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Tue, 07 Apr 2020 06:44:27 +0000
State-Changed-Why:
The i386 tests are running to completion again.  Thanks.


>Unformatted:
