NetBSD Problem Report #51995
From www@NetBSD.org Thu Feb 23 04:31:42 2017
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 147697A266
for <gnats-bugs@gnats.NetBSD.org>; Thu, 23 Feb 2017 04:31:42 +0000 (UTC)
Message-Id: <20170223043140.ECB977A2A4@mollari.NetBSD.org>
Date: Thu, 23 Feb 2017 04:31:40 +0000 (UTC)
From: n54@gmx.com
Reply-To: n54@gmx.com
To: gnats-bugs@NetBSD.org
Subject: ptrace(2) PT_RESUME is not reliable
X-Send-Pr-Version: www-1.0
>Number: 51995
>Category: kern
>Synopsis: ptrace(2) PT_RESUME is not reliable
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kamil
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Feb 23 04:35:00 +0000 2017
>Closed-Date: Mon Oct 21 18:37:07 +0000 2019
>Last-Modified: Wed Oct 23 19:30:03 +0000 2019
>Originator: Kamil Rytarowski
>Release: NetBSD 7.99.62 amd64
>Organization:
TNF
>Environment:
NetBSD chieftec 7.99.62 NetBSD 7.99.61 (GENERIC) #3: Thu Feb 23 02:56:52 CET 2017 root@chieftec:/public/netbsd-root/sys/arch/amd64/compile/GENERIC amd64
>Description:
The ptrace(2) operation PT_RESUME does not reliably resume the specified LWP.
>How-To-Repeat:
$ cd /usr/tests/kernel/
$ atf-run t_ptrace_wait|atf-report
>Fix:
N/A
>Release-Note:
>Audit-Trail:
From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51995 CVS commit: src/tests/kernel
Date: Tue, 28 Feb 2017 13:29:52 +0000
Module Name: src
Committed By: kamil
Date: Tue Feb 28 13:29:52 UTC 2017
Modified Files:
src/tests/kernel: t_ptrace_wait.c
Log Message:
Mark resume1 and syscallemu1 tests broken in t_ptrace_wait*
resume1:
PR kern/51995 ptrace(2) PT_RESUME is not reliable
syscallemu1:
PR kern/52012 PT_SYSCALL does not stop on syscall entry
Sponsored by <The NetBSD Foundation>
To generate a diff of this commit:
cvs rdiff -u -r1.74 -r1.75 src/tests/kernel/t_ptrace_wait.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51995 CVS commit: src/tests/kernel
Date: Tue, 28 Mar 2017 03:19:21 +0000
Module Name: src
Committed By: kamil
Date: Tue Mar 28 03:19:20 UTC 2017
Modified Files:
src/tests/kernel: t_ptrace_wait.c
Log Message:
Set timeout expected in resume1 (t_ptrace_wait*)
Set the timeout for this test to 5 seconds. It sometimes works and sometimes does not.
Add a local sleep(3) at the end to get a consistent report by always
timing out.
PR kern/51995
Sponsored by <The NetBSD Foundation>
To generate a diff of this commit:
cvs rdiff -u -r1.81 -r1.82 src/tests/kernel/t_ptrace_wait.c
Responsible-Changed-From-To: kern-bug-people->kamil
Responsible-Changed-By: kamil@NetBSD.org
Responsible-Changed-When: Fri, 06 Oct 2017 23:13:33 +0200
Responsible-Changed-Why:
Take.
From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51995 CVS commit: src/tests/lib/libc/sys
Date: Fri, 22 Dec 2017 17:35:14 +0000
Module Name: src
Committed By: kamil
Date: Fri Dec 22 17:35:14 UTC 2017
Modified Files:
src/tests/lib/libc/sys: t_ptrace_wait.c
Log Message:
ptrace atf: Clean up reports of failures
Mark resume* and suspend* tests as expected failures and link them to PR 51995.
Sponsored by <The NetBSD Foundation>
To generate a diff of this commit:
cvs rdiff -u -r1.16 -r1.17 src/tests/lib/libc/sys/t_ptrace_wait.c
From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51995 CVS commit: [netbsd-8] src/tests/lib/libc/sys
Date: Sun, 25 Feb 2018 20:59:47 +0000
Module Name: src
Committed By: snj
Date: Sun Feb 25 20:59:46 UTC 2018
Modified Files:
src/tests/lib/libc/sys [netbsd-8]: t_ptrace_amd64_wait.h
t_ptrace_i386_wait.h t_ptrace_wait.c t_ptrace_x86_wait.h
Log Message:
Pull up following revision(s) (requested by martin in ticket #586):
tests/lib/libc/sys/t_ptrace_amd64_wait.h: 1.2
tests/lib/libc/sys/t_ptrace_i386_wait.h: 1.2
tests/lib/libc/sys/t_ptrace_wait.c: 1.10-1.20
tests/lib/libc/sys/t_ptrace_x86_wait.h: 1.2-1.3
PR kern/52167 strikes on sparc64 too.
--
Temporarily disable t_ptrace_wait*::resume1 in ATF tests
It hangs forever on releng machines.
Sponsored by <The NetBSD Foundation>
--
Remove expected failure (fixed in kern_sig.c 1.339)
--
sync a bit more with reality; some things still fail, some new failures.
reduce spewage, be more explanatory about syscall errors.
--
Add expected failures.
--
make it fail instead of hang under qemu; XXX: need to investigate.
--
t_ptrace_wait*: Disable suspend* tests
These tests can hang the system. The interfaces will be improved, so
temporarily disable the tests.
--
ptrace atf: Clean up reports of failures
Mark resume* and suspend* tests as expected failures and link them to PR 51995.
Sponsored by <The NetBSD Foundation>
--
report which errno failed
--
atf: t_ptrace_wait: Mark attach2 as racy
--
atf: ptrace: Temporarily disable signal3 as it breaks now on some ports
This test is marked as failing with: PR kern/51918.
To generate a diff of this commit:
cvs rdiff -u -r1.1 -r1.1.8.1 src/tests/lib/libc/sys/t_ptrace_amd64_wait.h \
src/tests/lib/libc/sys/t_ptrace_i386_wait.h \
src/tests/lib/libc/sys/t_ptrace_x86_wait.h
cvs rdiff -u -r1.9 -r1.9.2.1 src/tests/lib/libc/sys/t_ptrace_wait.c
From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51995 CVS commit: src/tests/lib/libc/sys
Date: Wed, 1 May 2019 23:44:16 +0000
Module Name: src
Committed By: kamil
Date: Wed May 1 23:44:16 UTC 2019
Modified Files:
src/tests/lib/libc/sys: t_ptrace_wait.c
Log Message:
ATF ptrace(2) tests suspend1 and resume1 now pass
Verified on bare metal and in qemu.
PR kern/51995
To generate a diff of this commit:
cvs rdiff -u -r1.117 -r1.118 src/tests/lib/libc/sys/t_ptrace_wait.c
From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51995 CVS commit: src/tests/lib/libc/sys
Date: Thu, 2 May 2019 00:34:06 +0000
Module Name: src
Committed By: kamil
Date: Thu May 2 00:34:06 UTC 2019
Modified Files:
src/tests/lib/libc/sys: t_ptrace_wait.c
Log Message:
Rename and partially enable trace_thread ATF ptrace(2) tests
Rename trace_thread[1234] to more meaningful names:
- trace_thread_nolwpevents
- trace_thread_lwpexit
- trace_thread_lwpcreate
- trace_thread_lwpcreate_and_exit
In my local tests LWP CREATE events work as expected.
LWP EXIT ones are still racy, so keep them disabled.
PR kern/51995
To generate a diff of this commit:
cvs rdiff -u -r1.118 -r1.119 src/tests/lib/libc/sys/t_ptrace_wait.c
From: "Kamil Rytarowski" <kamil@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51995 CVS commit: src/tests/lib/libc/sys
Date: Sun, 13 Oct 2019 04:05:39 +0000
Module Name: src
Committed By: kamil
Date: Sun Oct 13 04:05:39 UTC 2019
Modified Files:
src/tests/lib/libc/sys: t_ptrace_wait.c
Log Message:
Enable TEST_LWP_ENABLED in t_ptrace_wait*
The LWP events (created, exited) are now reliable in my local tests.
PR kern/51420
PR kern/51995
To generate a diff of this commit:
cvs rdiff -u -r1.135 -r1.136 src/tests/lib/libc/sys/t_ptrace_wait.c
State-Changed-From-To: open->closed
State-Changed-By: kamil@NetBSD.org
State-Changed-When: Mon, 21 Oct 2019 20:37:07 +0200
State-Changed-Why:
Fixed in HEAD.
Merged into -9.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51995 CVS commit: [netbsd-9] src
Date: Wed, 23 Oct 2019 19:25:39 +0000
Module Name: src
Committed By: martin
Date: Wed Oct 23 19:25:39 UTC 2019
Modified Files:
src/sys/kern [netbsd-9]: kern_sig.c sys_ptrace_common.c
src/tests/lib/libc/sys [netbsd-9]: t_ptrace_wait.c
Log Message:
Pull up following revision(s) (requested by kamil in ticket #366):
tests/lib/libc/sys/t_ptrace_wait.c: revision 1.136
sys/kern/kern_sig.c: revision 1.373
tests/lib/libc/sys/t_ptrace_wait.c: revision 1.138
tests/lib/libc/sys/t_ptrace_wait.c: revision 1.139
sys/kern/kern_sig.c: revision 1.376
tests/lib/libc/sys/t_ptrace_wait.c: revision 1.140
sys/kern/sys_ptrace_common.c: revision 1.64
Fix typo in a comment
Enable TEST_LWP_ENABLED in t_ptrace_wait*
The LWP events (created, exited) are now reliable in my local tests.
PR kern/51420
PR kern/51995
Remove the short-circuit lwp_exit() path from sigswitch()
sigswitch() can be called from exit1() through:
ttywait()->ttysleep()->cv_timedwait_sig()->sleepq_block()->issignal()->sigswitch()
lwp_exit() called for the last LWP triggers exit1() and this causes a panic.
The debugger related signals have short-circuit demise paths in
eventswitch() and other functions, before calling sigswitch().
This change restores the original behavior, but there is an open question
whether the kernel crash is a red herring of misbehavior of ttywait().
This should fix PR kern/54618 by David H. Gutteridge
Fix a race condition when handling concurrent LWP signals and add a test
Fix a race condition that caused PT_GET_SIGINFO to return incorrect
information when multiple signals were delivered concurrently
to different LWPs. Add a regression test that verifies that when 50
threads concurrently use pthread_kill() on themselves, the debugger
receives all signals with correct information.
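The tracee side of such a test can be pictured with the sketch below. This is not the code added to t_ptrace_wait.c: it is a minimal, portable illustration in which the helper name run_self_signal_threads(), the MAX_THREADS limit, and the choice of SIGUSR1 are all assumptions, and a signal handler with an atomic counter stands in for the debugger that the real test attaches.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 64  /* hypothetical upper bound for this sketch */

/* Number of signals seen by the handler (stand-in for the debugger). */
static atomic_int handled;

static void on_sigusr1(int signo)
{
    (void)signo;
    atomic_fetch_add(&handled, 1);
}

/* Each thread sends one signal to itself, as in the described test. */
static void *self_signal(void *arg)
{
    (void)arg;
    pthread_kill(pthread_self(), SIGUSR1);
    return NULL;
}

/*
 * Start nthreads threads that each signal themselves, wait for them,
 * and return how many signals were handled (-1 on a setup error).
 */
int run_self_signal_threads(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    struct sigaction sa;
    int started = 0;
    int i;

    if (nthreads < 0 || nthreads > MAX_THREADS)
        return -1;

    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    atomic_store(&handled, 0);
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, self_signal, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return atomic_load(&handled);
}
```

Calling run_self_signal_threads(50) mirrors the thread count mentioned in the log message. In the real test the check is done by the tracer, which inspects every stop with PT_GET_SIGINFO instead of counting handler invocations.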
The kernel uses separate signal queues for each LWP. However,
the signal context used to implement PT_GET_SIGINFO is stored in 'struct
proc' and therefore common to all LWPs in the process. Previously,
this member was filled in kpsignal2(), i.e. when the signal was sent.
This meant that if another LWP managed to send another signal
concurrently, the data was overwritten before the process was stopped.
As a result, PT_GET_SIGINFO did not report the correct LWP and signal
(it could even report a different signal than wait()). This can be
quite reliably reproduced with 20 LWPs; however, it can
also occur with 10.
This patch moves the setting of the signal context to issignal(), just before
the process is actually stopped. The data is taken from the per-LWP
or per-process signal queue. The added test confirms that the debugger
correctly receives all signals, and that PT_GET_SIGINFO reports both the correct
LWP and signal number.
Reviewed by kamil.
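The overwrite described in this log message can be modelled outside the kernel. The following is a single-threaded sketch under stated assumptions, not kernel code: every type and function name in it (sigctx_model, lwp_model, proc_model, send_old(), stop_new() and so on) is hypothetical, with send_* standing in for the send path (kpsignal2()) and stop_* for the point where the process stops (issignal()).

```c
#include <signal.h>

/* Hypothetical model types; these are not the kernel's structures. */
struct sigctx_model {
    int signo;  /* signal number */
    int lwpid;  /* LWP the signal was directed at */
};

struct lwp_model {
    int lwpid;
    struct sigctx_model queued;  /* stands in for the per-LWP signal queue */
};

struct proc_model {
    struct sigctx_model sigctx;  /* single slot shared by every LWP */
};

/* Old scheme: the shared slot is written when the signal is sent. */
static void send_old(struct proc_model *p, struct lwp_model *l, int signo)
{
    l->queued.signo = signo;
    l->queued.lwpid = l->lwpid;
    p->sigctx = l->queued;
}

/* New scheme: sending only queues the signal on the target LWP. */
static void send_new(struct lwp_model *l, int signo)
{
    l->queued.signo = signo;
    l->queued.lwpid = l->lwpid;
}

/* Old scheme: the stop reports whatever the shared slot holds. */
static struct sigctx_model stop_old(const struct proc_model *p)
{
    return p->sigctx;
}

/* New scheme: the slot is filled from the stopping LWP's queue. */
static struct sigctx_model stop_new(struct proc_model *p,
    const struct lwp_model *l)
{
    p->sigctx = l->queued;
    return p->sigctx;
}

/* LWP 2 is signalled before LWP 1 stops; what does LWP 1's stop report? */
struct sigctx_model scenario_old(void)
{
    struct proc_model p = { { 0, 0 } };
    struct lwp_model l1 = { 1, { 0, 0 } }, l2 = { 2, { 0, 0 } };

    send_old(&p, &l1, SIGUSR1);
    send_old(&p, &l2, SIGUSR2);  /* overwrites LWP 1's context */
    return stop_old(&p);         /* LWP 1 stops: stale data for LWP 2 */
}

struct sigctx_model scenario_new(void)
{
    struct proc_model p = { { 0, 0 } };
    struct lwp_model l1 = { 1, { 0, 0 } }, l2 = { 2, { 0, 0 } };

    send_new(&l1, SIGUSR1);
    send_new(&l2, SIGUSR2);      /* shared slot is not touched here */
    return stop_new(&p, &l1);    /* LWP 1 stops: its own signal */
}
```

In this model scenario_old() reports LWP 2 with SIGUSR2 even though LWP 1 is the one stopping, while scenario_new() reports LWP 1 with SIGUSR1. The model is deterministic by design; the real race depends on timing between concurrently running LWPs.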
Remove preprocessor switch TEST_VFORK_ENABLED in t_ptrace_wait*
vfork(2) tests are now enabled always and confirmed to be stable.
Remove preprocessor switch TEST_LWP_ENABLED in t_ptrace_wait*
LWP tests are now enabled always and confirmed to be stable.
To generate a diff of this commit:
cvs rdiff -u -r1.364.2.7 -r1.364.2.8 src/sys/kern/kern_sig.c
cvs rdiff -u -r1.58.2.8 -r1.58.2.9 src/sys/kern/sys_ptrace_common.c
cvs rdiff -u -r1.131.2.5 -r1.131.2.6 src/tests/lib/libc/sys/t_ptrace_wait.c
>Unformatted: