NetBSD Problem Report #51995

From www@NetBSD.org  Thu Feb 23 04:31:42 2017
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 147697A266
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 23 Feb 2017 04:31:42 +0000 (UTC)
Message-Id: <20170223043140.ECB977A2A4@mollari.NetBSD.org>
Date: Thu, 23 Feb 2017 04:31:40 +0000 (UTC)
From: n54@gmx.com
Reply-To: n54@gmx.com
To: gnats-bugs@NetBSD.org
Subject: ptrace(2) PT_RESUME is not reliable
X-Send-Pr-Version: www-1.0

>Number:         51995
>Category:       kern
>Synopsis:       ptrace(2) PT_RESUME is not reliable
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kamil
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Feb 23 04:35:00 +0000 2017
>Closed-Date:    Mon Oct 21 18:37:07 +0000 2019
>Last-Modified:  Wed Oct 23 19:30:03 +0000 2019
>Originator:     Kamil Rytarowski
>Release:        NetBSD 7.99.62 amd64
>Organization:
TNF
>Environment:
NetBSD chieftec 7.99.62 NetBSD 7.99.61 (GENERIC) #3: Thu Feb 23 02:56:52 CET 2017  root@chieftec:/public/netbsd-root/sys/arch/amd64/compile/GENERIC amd64
>Description:
ptrace(2) operation PT_RESUME is not reliable in resuming the specified LWP
>How-To-Repeat:
$ cd /usr/tests/kernel/
$ atf-run t_ptrace_wait|atf-report
>Fix:
N/A

>Release-Note:

>Audit-Trail:
From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51995 CVS commit: src/tests/kernel
Date: Tue, 28 Feb 2017 13:29:52 +0000

 Module Name:	src
 Committed By:	kamil
 Date:		Tue Feb 28 13:29:52 UTC 2017

 Modified Files:
 	src/tests/kernel: t_ptrace_wait.c

 Log Message:
 Mark resume1 and syscallemu1 tests broken in t_ptrace_wait*

 resume1:
     PR kern/51995 ptrace(2) PT_RESUME is not reliable

 syscallemu1:
     PR kern/52012 PT_SYSCALL does not stop on syscall entry

 Sponsored by <The NetBSD Foundation>


 To generate a diff of this commit:
 cvs rdiff -u -r1.74 -r1.75 src/tests/kernel/t_ptrace_wait.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51995 CVS commit: src/tests/kernel
Date: Tue, 28 Mar 2017 03:19:21 +0000

 Module Name:	src
 Committed By:	kamil
 Date:		Tue Mar 28 03:19:20 UTC 2017

 Modified Files:
 	src/tests/kernel: t_ptrace_wait.c

 Log Message:
 Set timeout expected in resume1 (t_ptrace_wait*)

 Set the timeout for this test to 5 seconds. It sometimes works and
 sometimes does not.

 Add a local sleep(3) at the end so that the test always times out and
 the report is consistent.

 PR kern/51995

 Sponsored by <The NetBSD Foundation>


 To generate a diff of this commit:
 cvs rdiff -u -r1.81 -r1.82 src/tests/kernel/t_ptrace_wait.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: kern-bug-people->kamil
Responsible-Changed-By: kamil@NetBSD.org
Responsible-Changed-When: Fri, 06 Oct 2017 23:13:33 +0200
Responsible-Changed-Why:
Take.


From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51995 CVS commit: src/tests/lib/libc/sys
Date: Fri, 22 Dec 2017 17:35:14 +0000

 Module Name:	src
 Committed By:	kamil
 Date:		Fri Dec 22 17:35:14 UTC 2017

 Modified Files:
 	src/tests/lib/libc/sys: t_ptrace_wait.c

 Log Message:
 ptrace atf: Cleanup reports of failures

 Mark resume* suspend* tests as expected failure and link with PR 51995.

 Sponsored by <The NetBSD Foundation>


 To generate a diff of this commit:
 cvs rdiff -u -r1.16 -r1.17 src/tests/lib/libc/sys/t_ptrace_wait.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51995 CVS commit: [netbsd-8] src/tests/lib/libc/sys
Date: Sun, 25 Feb 2018 20:59:47 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Sun Feb 25 20:59:46 UTC 2018

 Modified Files:
 	src/tests/lib/libc/sys [netbsd-8]: t_ptrace_amd64_wait.h
 	    t_ptrace_i386_wait.h t_ptrace_wait.c t_ptrace_x86_wait.h

 Log Message:
 Pull up following revision(s) (requested by martin in ticket #586):
 	tests/lib/libc/sys/t_ptrace_amd64_wait.h: 1.2
 	tests/lib/libc/sys/t_ptrace_i386_wait.h: 1.2
 	tests/lib/libc/sys/t_ptrace_wait.c: 1.10-1.20
 	tests/lib/libc/sys/t_ptrace_x86_wait.h: 1.2-1.3
 PR kern/52167 strikes on sparc64 too.
 --
 Temporarily disable t_ptrace_wait*::resume1 in ATF tests
 It hangs forever on releng machines.
 Sponsored by <The NetBSD Foundation>
 --
 Remove expected failure (fixed in kern_sig.c 1.339)
 --
 sync a bit more with reality; some things still fail, some new failures.
 reduce spewage, be more explanatory about syscall errors.
 --
 Add expected failures.
 --
 make it fail instead of hang under qemu; XXX: need to investigate.
 --
 t_ptrace_wait*: Disable suspend* tests
 These tests can hang the system. These interfaces will be improved;
 temporarily disable the tests until then.
 --
 ptrace atf: Cleanup reports of failures
 Mark resume* suspend* tests as expected failure and link with PR 51995.
 Sponsored by <The NetBSD Foundation>
 --
 report which errno failed
 --
 atf: t_ptrace_wait: Mark attach2 as racy
 --
 atf: ptrace: Temporarily disable signal3 as it breaks now on some ports
 This test is marked as failing with: PR kern/51918.


 To generate a diff of this commit:
 cvs rdiff -u -r1.1 -r1.1.8.1 src/tests/lib/libc/sys/t_ptrace_amd64_wait.h \
     src/tests/lib/libc/sys/t_ptrace_i386_wait.h \
     src/tests/lib/libc/sys/t_ptrace_x86_wait.h
 cvs rdiff -u -r1.9 -r1.9.2.1 src/tests/lib/libc/sys/t_ptrace_wait.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51995 CVS commit: src/tests/lib/libc/sys
Date: Wed, 1 May 2019 23:44:16 +0000

 Module Name:	src
 Committed By:	kamil
 Date:		Wed May  1 23:44:16 UTC 2019

 Modified Files:
 	src/tests/lib/libc/sys: t_ptrace_wait.c

 Log Message:
 ATF ptrace(2) tests suspend1 and resume1 now pass

 Verified on bare metal and in qemu.

 PR kern/51995


 To generate a diff of this commit:
 cvs rdiff -u -r1.117 -r1.118 src/tests/lib/libc/sys/t_ptrace_wait.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51995 CVS commit: src/tests/lib/libc/sys
Date: Thu, 2 May 2019 00:34:06 +0000

 Module Name:	src
 Committed By:	kamil
 Date:		Thu May  2 00:34:06 UTC 2019

 Modified Files:
 	src/tests/lib/libc/sys: t_ptrace_wait.c

 Log Message:
 Rename and partially enable trace_thread ATF ptrace(2) tests

 Rename trace_thread[1234] to more meaningful names:

  - trace_thread_nolwpevents
  - trace_thread_lwpexit
  - trace_thread_lwpcreate
  - trace_thread_lwpcreate_and_exit

 In my local tests LWP CREATE events work as expected.
 LWP EXIT ones are still racy, so keep them disabled.

 PR kern/51995


 To generate a diff of this commit:
 cvs rdiff -u -r1.118 -r1.119 src/tests/lib/libc/sys/t_ptrace_wait.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51995 CVS commit: src/tests/lib/libc/sys
Date: Sun, 13 Oct 2019 04:05:39 +0000

 Module Name:	src
 Committed By:	kamil
 Date:		Sun Oct 13 04:05:39 UTC 2019

 Modified Files:
 	src/tests/lib/libc/sys: t_ptrace_wait.c

 Log Message:
 Enable TEST_LWP_ENABLED in t_ptrace_wait*

 The LWP events (created, exited) are now reliable in my local tests.

 PR kern/51420
 PR kern/51995


 To generate a diff of this commit:
 cvs rdiff -u -r1.135 -r1.136 src/tests/lib/libc/sys/t_ptrace_wait.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: kamil@NetBSD.org
State-Changed-When: Mon, 21 Oct 2019 20:37:07 +0200
State-Changed-Why:
Fixed in HEAD.
Merged into -9.


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/51995 CVS commit: [netbsd-9] src
Date: Wed, 23 Oct 2019 19:25:39 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Wed Oct 23 19:25:39 UTC 2019

 Modified Files:
 	src/sys/kern [netbsd-9]: kern_sig.c sys_ptrace_common.c
 	src/tests/lib/libc/sys [netbsd-9]: t_ptrace_wait.c

 Log Message:
 Pull up following revision(s) (requested by kamil in ticket #366):

 	tests/lib/libc/sys/t_ptrace_wait.c: revision 1.136
 	sys/kern/kern_sig.c: revision 1.373
 	tests/lib/libc/sys/t_ptrace_wait.c: revision 1.138
 	tests/lib/libc/sys/t_ptrace_wait.c: revision 1.139
 	sys/kern/kern_sig.c: revision 1.376
 	tests/lib/libc/sys/t_ptrace_wait.c: revision 1.140
 	sys/kern/sys_ptrace_common.c: revision 1.64

 Fix typo in a comment

 Enable TEST_LWP_ENABLED in t_ptrace_wait*
 The LWP events (created, exited) are now reliable in my local tests.
 PR kern/51420
 PR kern/51995

 Remove the short-circuit lwp_exit() path from sigswitch()

 sigswitch() can be called from exit1() through:

    ttywait()->ttysleep()-> cv_timedwait_sig()->sleepq_block()->issignal()->sigswitch()

 lwp_exit() called for the last LWP triggers exit1() and this causes a panic.
 The debugger related signals have short-circuit demise paths in
 eventswitch() and other functions, before calling sigswitch().

 This change restores the original behavior, but there is an open question
 whether the kernel crash is a red herring of misbehavior of ttywait().
 This should fix PR kern/54618 by David H. Gutteridge

 Fix a race condition when handling concurrent LWP signals and add a test

 Fix a race condition that caused PT_GET_SIGINFO to return incorrect
 information when multiple signals were delivered concurrently
 to different LWPs.  Add a regression test that verifies that when 50
 threads concurrently use pthread_kill() on themselves, the debugger
 receives all signals with correct information.

 The kernel uses separate signal queues for each LWP.  However,
 the signal context used to implement PT_GET_SIGINFO is stored in 'struct
 proc' and therefore common to all LWPs in the process.  Previously,
 this member was filled in kpsignal2(), i.e. when the signal was sent.

 This meant that if another LWP managed to send another signal
 concurrently, the data was overwritten before the process was stopped.

 As a result, PT_GET_SIGINFO did not report the correct LWP and signal
 (it could even report a different signal than wait()).  This can be
 reproduced quite reliably with 20 LWPs, but it can also occur with 10.

 This patch moves setting of signal context to issignal(), just before
 the process is actually stopped.  The data is taken from per-LWP
 or per-process signal queue.  The added test confirms that the debugger
 correctly receives all signals, and PT_GET_SIGINFO reports both correct
 LWP and signal number.
 Reviewed by kamil.

 Remove preprocessor switch TEST_VFORK_ENABLED in t_ptrace_wait*
 vfork(2) tests are now enabled always and confirmed to be stable.

 Remove preprocessor switch TEST_LWP_ENABLED in t_ptrace_wait*
 LWP tests are now enabled always and confirmed to be stable.


 To generate a diff of this commit:
 cvs rdiff -u -r1.364.2.7 -r1.364.2.8 src/sys/kern/kern_sig.c
 cvs rdiff -u -r1.58.2.8 -r1.58.2.9 src/sys/kern/sys_ptrace_common.c
 cvs rdiff -u -r1.131.2.5 -r1.131.2.6 src/tests/lib/libc/sys/t_ptrace_wait.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
