NetBSD Problem Report #53120
From martin@duskware.de Thu Mar 22 08:01:34 2018
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 6D4AD7A1DC
for <gnats-bugs@gnats.NetBSD.org>; Thu, 22 Mar 2018 08:01:34 +0000 (UTC)
Message-Id: <20180322080124.1A3C15CC770@emmas.aprisoft.de>
Date: Thu, 22 Mar 2018 09:01:24 +0100 (CET)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: gdb issues with threaded programs
X-Send-Pr-Version: 3.95
>Number: 53120
>Notify-List: gson@gson.org
>Category: kern
>Synopsis: gdb issues with threaded programs
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kamil
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Mar 22 08:05:00 +0000 2018
>Closed-Date: Tue Jun 04 12:46:01 +0000 2019
>Last-Modified: Tue Jun 04 12:46:01 +0000 2019
>Originator: Martin Husemann
>Release: NetBSD 8.99.14
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD night-owl.duskware.de 8.99.14 NetBSD 8.99.14 (NIGHT-OWL) #590: Mon Mar 19 14:59:01 CET 2018 martin@night-owl.duskware.de:/usr/src/sys/arch/amd64/compile/NIGHT-OWL amd64
Architecture: x86_64
Machine: amd64
>Description:
Trying to debug some X / threaded programs does not work in -current:
(gdb) run
Starting program: /usr/pkg/bin/xfce4-clipman
[New LWP 2 of process 459]
(xfce4-clipman:459): GVFS-RemoteVolumeMonitor-WARNING **: remote volume monitor with dbus name org.gtk.Private.HalVolumeMonitor is not supported
Thread 2 received signal SIGTRAP, Trace/breakpoint trap.
[Switching to LWP 2 of process 459]
0x00007f7ff123e7da in _sys___kevent50 () from /usr/lib/libc.so.12
(gdb) c
Continuing.
Thread 2 received signal SIGSEGV, Segmentation fault.
0x00007f7ff123e7da in _sys___kevent50 () from /usr/lib/libc.so.12
(gdb) c
Continuing.
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
>How-To-Repeat:
See above.
>Fix:
n/a
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->kamil
Responsible-Changed-By: kamil@NetBSD.org
Responsible-Changed-When: Thu, 22 Mar 2018 14:06:16 +0100
Responsible-Changed-Why:
Take.
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, martin@NetBSD.org, christos@NetBSD.org, kamil@NetBSD.org
Cc:
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Thu, 21 Jun 2018 16:18:37 +0300
I'm also having gdb issues with threaded programs in -current, similar
to those Martin reported in March, and they are impeding my attempts
to track down other bugs.
Here is a shell script I'm currently using to test for the problem:
sysctl -w security.pax.mprotect.enabled=0 || true
sysctl -w security.pax.mprotect.ptrace=0 || true
cat <<EOF >test.gdb
b main
run +time=1 +tries=1 @127.0.0.177
b write
cont
cont
cont
cont
cont
EOF
gdb --batch -x test.gdb dig >gdb.out
cat gdb.out
! grep SIGSEGV gdb.out
This will exit with a nonzero status if gdb is triggering spurious
SIGSEGVs, as it does in -current.
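The pass/fail check above can be made slightly more informative by listing which signals gdb reported. The following is a minimal sketch, not part of the original script: it assumes gdb's "signal SIGNAME, description" wording as seen in the transcript in the Description, and the function names signals_in_log and log_is_clean are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct signal name that gdb reported
# in a batch log, one per line. Matches lines such as:
#   Thread 2 received signal SIGSEGV, Segmentation fault.
#   Program terminated with signal SIGSEGV, Segmentation fault.
signals_in_log() {
    sed -n 's/.* signal \(SIG[A-Z0-9]*\),.*/\1/p' "$1" | sort -u
}

# Hypothetical helper: succeed only if the log reports no SIGSEGV,
# mirroring the "! grep SIGSEGV gdb.out" check in the script above.
log_is_clean() {
    ! signals_in_log "$1" | grep -qx SIGSEGV
}
```

Under these assumptions, running "signals_in_log gdb.out" after the gdb batch run would show every signal seen, and "log_is_clean gdb.out" could replace the final grep line.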
I believe the problem was introduced by Christos' import of gdb 8.0.1
in November 2017. I have not run a full bisection, but I manually ran
the above test against some source dates around that time, and they show
the test passing before the import and failing after it:
2017.05.01.12.29.40 pass
2017.11.28.15.31.33 pass
2017.11.30.14.51.01 fail
2017.11.30.15.26.57 fail
2017.12.02.22.51.22 fail
2018.06.19.09.25.13 fail
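A full bisection over source dates, which was not run here, could be scripted roughly as follows. This is only a sketch under stated assumptions: the function bisect_dates is hypothetical, the test command passed to it (which would have to build and test the tree at a given source date and exit 0 on pass) is not shown and does not exist in this report, and the dates are assumed to be sorted with the first passing, the last failing, and a single pass-to-fail transition in between.

```shell
#!/bin/sh
# Hypothetical sketch: binary search over sorted source dates.
# Usage: bisect_dates TESTCMD DATE...
# TESTCMD is called with one date and must exit 0 for "pass".
bisect_dates() {
    testcmd=$1; shift
    lo=1        # index of the last date known to pass
    hi=$#       # index of the first date known to fail
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "d=\${$mid}"          # fetch the mid-th positional argument
        if "$testcmd" "$d"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    eval "echo \"last pass: \${$lo}\""
    eval "echo \"first fail: \${$hi}\""
}
```

Each iteration halves the candidate range, so the number of builds needed grows only logarithmically with the number of source dates considered.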
Would anyone object to reassigning this PR from Kamil to Christos?
--
Andreas Gustafsson, gson@gson.org
From: maya@netbsd.org
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Thu, 21 Jun 2018 14:31:41 +0000
Tested pkgsrc gdb 8.1 and gdb 7.11 (which have far fewer patches), both SIGSEGV.
From: Kamil Rytarowski <n54@gmx.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Thu, 21 Jun 2018 19:26:25 +0200
On 21.06.2018 15:35, Andreas Gustafsson wrote:
> The following reply was made to PR kern/53120; it has been noted by GNATS.
>
> From: Andreas Gustafsson <gson@gson.org>
> To: gnats-bugs@NetBSD.org, martin@NetBSD.org, christos@NetBSD.org, kamil@NetBSD.org
> Cc:
> Subject: Re: kern/53120 (gdb issues with threaded programs)
> Date: Thu, 21 Jun 2018 16:18:37 +0300
>
> I'm also having gdb issues with threaded programs in -current, similar
> to those Martin reported in March, and they are impeding my attempts
> to track down other bugs.
>
> Here is a shell script I'm currently using to test for the problem:
>
> sysctl -w security.pax.mprotect.enabled=0 || true
> sysctl -w security.pax.mprotect.ptrace=0 || true
> cat <<EOF >test.gdb
> b main
> run +time=1 +tries=1 @127.0.0.177
> b write
> cont
> cont
> cont
> cont
> cont
> EOF
> gdb --batch -x test.gdb dig >gdb.out
> cat gdb.out
> ! grep SIGSEGV gdb.out
>
> This will exit with a nonzero status if gdb is triggering spurious
> SIGSEGVs, as it does in -current.
>
> I believe the problem was introduced by Christos' import of gdb 8.0.1
> in November 2017. I have not run a full bisection, but I manually ran
> the above test against some source dates around that time, and they show
> the test passing before the import and failing after it:
>
> 2017.05.01.12.29.40 pass
> 2017.11.28.15.31.33 pass
> 2017.11.30.14.51.01 fail
> 2017.11.30.15.26.57 fail
> 2017.12.02.22.51.22 fail
> 2018.06.19.09.25.13 fail
>
> Would anyone object to reassigning this PR from Kamil to Christos?
> --
> Andreas Gustafsson, gson@gson.org
>
I've estimated that I will need around 6 months to fix the remaining
showstoppers in ptrace(2) in the MI code (signals, forks/vforks, threads).
However, I'm preempted by sanitizer work, partly due to GSoC tasks
(fuzzing etc.), reports from !x86 ports (syscall(2)/__syscall(2) issues),
and the LLVM-7svn branching scheduled for August 1st.
If something worked with threads before GDB 8.0, it was by accident.
Please be patient and I will get it functional for -9.
Responsible-Changed-From-To: kamil->christos
Responsible-Changed-By: maya@NetBSD.org
Responsible-Changed-When: Fri, 22 Jun 2018 01:54:34 +0000
Responsible-Changed-Why:
Responsible-Changed-From-To: christos->kamil
Responsible-Changed-By: kamil@NetBSD.org
Responsible-Changed-When: Fri, 22 Jun 2018 13:22:32 +0200
Responsible-Changed-Why:
Take.
Threads in debuggers are not a matter of a 'fix' in GDB.
It's like expecting MySQL to scale to 80 CPUs without SMP work in the kernel.
This is a multilayer challenge that I'm working on.
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: kamil@NetBSD.org,
christos@NetBSD.org,
martin@NetBSD.org
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Sat, 23 Jun 2018 16:37:52 +0300
I see that Christos made some commits including
2018.06.23.03.15.55 christos src/external/gpl3/gdb/dist/gdb/inf-ptrace.c 1.18
2018.06.23.03.15.55 christos src/external/gpl3/gdb/dist/gdb/nbsd-nat.c 1.6
and with those, my test case is passing. Thanks, Christos!
Martin, is it working for you?
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Fri, 8 Feb 2019 16:21:36 +0200
I have devised a new scripted test case which fails on amd64 in all of
NetBSD-7, -8, and -current:
http://www.gson.org/netbsd/bugs/53120/test.sh
--
Andreas Gustafsson, gson@gson.org
From: Kamil Rytarowski <n54@gmx.com>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Sat, 9 Feb 2019 18:47:13 +0100
On 08.02.2019 15:25, Andreas Gustafsson wrote:
> The following reply was made to PR kern/53120; it has been noted by GNATS.
>
> From: Andreas Gustafsson <gson@gson.org>
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: kern/53120 (gdb issues with threaded programs)
> Date: Fri, 8 Feb 2019 16:21:36 +0200
>
> I have devised a new scripted test case which fails on amd64 in all of
> NetBSD-7, -8, and -current:
>
> http://www.gson.org/netbsd/bugs/53120/test.sh
>
> --
> Andreas Gustafsson, gson@gson.org
>
>
Acknowledged. I'm in the process of cleaning up the existing tests,
and I will keep adding new scenarios followed by kernel fixes.
State-Changed-From-To: open->closed
State-Changed-By: kamil@NetBSD.org
State-Changed-When: Tue, 04 Jun 2019 14:46:01 +0200
State-Changed-Why:
The original issue with LWP events in X threaded applications is fixed.
Spawning xfce4-clipman works for me and the reported issues are gone.
Closing this report.
If there are more issues (and there are some), please file a new bug report.
>Unformatted:
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.