NetBSD Problem Report #53120

From martin@duskware.de  Thu Mar 22 08:01:34 2018
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 6D4AD7A1DC
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 22 Mar 2018 08:01:34 +0000 (UTC)
Message-Id: <20180322080124.1A3C15CC770@emmas.aprisoft.de>
Date: Thu, 22 Mar 2018 09:01:24 +0100 (CET)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: gdb issues with threaded programs
X-Send-Pr-Version: 3.95

>Number:         53120
>Notify-List:    gson@gson.org
>Category:       kern
>Synopsis:       gdb issues with threaded programs
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kamil
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 22 08:05:00 +0000 2018
>Closed-Date:    Tue Jun 04 12:46:01 +0000 2019
>Last-Modified:  Tue Jun 04 12:46:01 +0000 2019
>Originator:     Martin Husemann
>Release:        NetBSD 8.99.14
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD night-owl.duskware.de 8.99.14 NetBSD 8.99.14 (NIGHT-OWL) #590: Mon Mar 19 14:59:01 CET 2018 martin@night-owl.duskware.de:/usr/src/sys/arch/amd64/compile/NIGHT-OWL amd64
Architecture: x86_64
Machine: amd64
>Description:

Trying to debug some X / threaded programs does not work in -current:

(gdb) run
Starting program: /usr/pkg/bin/xfce4-clipman 
[New LWP 2 of process 459]

(xfce4-clipman:459): GVFS-RemoteVolumeMonitor-WARNING **: remote volume monitor with dbus name org.gtk.Private.HalVolumeMonitor is not supported

Thread 2 received signal SIGTRAP, Trace/breakpoint trap.
[Switching to LWP 2 of process 459]
0x00007f7ff123e7da in _sys___kevent50 () from /usr/lib/libc.so.12
(gdb) c
Continuing.

Thread 2 received signal SIGSEGV, Segmentation fault.
0x00007f7ff123e7da in _sys___kevent50 () from /usr/lib/libc.so.12
(gdb) c
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.


>How-To-Repeat:
See above.

>Fix:
n/a

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->kamil
Responsible-Changed-By: kamil@NetBSD.org
Responsible-Changed-When: Thu, 22 Mar 2018 14:06:16 +0100
Responsible-Changed-Why:
Take.


From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, martin@NetBSD.org, christos@NetBSD.org, kamil@NetBSD.org
Cc: 
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Thu, 21 Jun 2018 16:18:37 +0300

 I'm also having gdb issues with threaded programs in -current, similar
 to those Martin reported in March, and they are impeding my attempts
 to track down other bugs.

 Here is a shell script I'm currently using to test for the problem:

 sysctl -w security.pax.mprotect.enabled=0 || true
 sysctl -w security.pax.mprotect.ptrace=0 || true
 cat <<EOF >test.gdb
 b main
 run +time=1 +tries=1 @127.0.0.177
 b write
 cont
 cont
 cont
 cont
 cont
 EOF
 gdb --batch -x test.gdb dig >gdb.out
 cat gdb.out
 ! grep SIGSEGV gdb.out

 This will exit with a nonzero status if gdb is triggering spurious
 SIGSEGVs, as it does in -current.
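
The pass/fail decision in the script above rests on its last line: `! grep SIGSEGV gdb.out` succeeds only when the log contains no SIGSEGV report. A minimal sketch of the same check as a reusable function follows. The helper name `gdb_log_ok` and its return codes are hypothetical and are not part of the original script.

```shell
#!/bin/sh
# gdb_log_ok LOGFILE
# Hypothetical helper (not in the original script).
# Returns 0 when the gdb log contains no SIGSEGV report,
# 1 when it does, and 2 when the log cannot be read.
gdb_log_ok() {
    log=$1
    [ -r "$log" ] || return 2
    if grep -q SIGSEGV "$log"; then
        return 1
    fi
    return 0
}
```

Under these assumptions, `gdb_log_ok gdb.out` could replace the final `! grep` line, with the added benefit that a missing log is reported separately from a failing run.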

 I believe the problem was introduced by Christos' import of gdb 8.0.1
 in November 2017.  I have not run a full bisection, but I manually ran
 the above test against some source dates around that time, and they show
 the test passing before the import and failing after it:

 2017.05.01.12.29.40   pass
 2017.11.28.15.31.33   pass
 2017.11.30.14.51.01   fail
 2017.11.30.15.26.57   fail
 2017.12.02.22.51.22   fail
 2018.06.19.09.25.13   fail
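
The window in which the behaviour changed can be read off the list above: it lies between the last passing source date and the first failing one. A small sketch of that step is shown here, assuming the results are kept as "date result" lines in chronological order. The function name `regression_window` is hypothetical.

```shell
#!/bin/sh
# regression_window: reads "SOURCE_DATE RESULT" lines on stdin, in
# chronological order, and prints the last passing date followed by
# the first failing date. Exits with status 1 if no such pair exists.
# (Hypothetical helper, not part of the original report.)
regression_window() {
    awk '
        $2 == "pass" && !found { last_pass = $1 }
        $2 == "fail" && !found { first_fail = $1; found = 1 }
        END {
            if (found && last_pass != "")
                print last_pass, first_fail
            else
                exit 1
        }
    '
}

regression_window <<EOF
2017.05.01.12.29.40   pass
2017.11.28.15.31.33   pass
2017.11.30.14.51.01   fail
2017.11.30.15.26.57   fail
2017.12.02.22.51.22   fail
2018.06.19.09.25.13   fail
EOF
# prints: 2017.11.28.15.31.33 2017.11.30.14.51.01
```

This only narrows the range between two tested dates. It does not identify a specific commit, which would still need a full bisection.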

 Would anyone object to reassigning this PR from Kamil to Christos?
 -- 
 Andreas Gustafsson, gson@gson.org

From: maya@netbsd.org
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Thu, 21 Jun 2018 14:31:41 +0000

 Tested pkgsrc gdb 8.1 and gdb 7.11 (which have far fewer patches); both hit the SIGSEGV.

From: Kamil Rytarowski <n54@gmx.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Thu, 21 Jun 2018 19:26:25 +0200

 I've estimated that I will need around 6 months for the remaining
 showstoppers in ptrace(2) in MI code (signals, forks/vforks, threads).
 However, I'm being preempted by sanitizer work, partially due to GSoC
 tasks (fuzzing etc.), reports from !x86 ports (syscall(2)/__syscall(2)
 issues), and the LLVM-7svn branching scheduled for August 1st.

 If something worked with threads before GDB 8.0, it was by accident.
 Please be patient and I will get it functional for -9.



Responsible-Changed-From-To: kamil->christos
Responsible-Changed-By: maya@NetBSD.org
Responsible-Changed-When: Fri, 22 Jun 2018 01:54:34 +0000
Responsible-Changed-Why:


Responsible-Changed-From-To: christos->kamil
Responsible-Changed-By: kamil@NetBSD.org
Responsible-Changed-When: Fri, 22 Jun 2018 13:22:32 +0200
Responsible-Changed-Why:
Take.
Threads in debuggers are not a matter of a 'fix' in GDB.
It's like expecting MySQL to scale to 80 CPUs without SMP work in the kernel.
This is a multilayer challenge that I'm working on.


From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: kamil@NetBSD.org,
    christos@NetBSD.org,
    martin@NetBSD.org
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Sat, 23 Jun 2018 16:37:52 +0300

 I see that Christos made some commits including

   2018.06.23.03.15.55 christos src/external/gpl3/gdb/dist/gdb/inf-ptrace.c 1.18
   2018.06.23.03.15.55 christos src/external/gpl3/gdb/dist/gdb/nbsd-nat.c 1.6

 and with those, my test case is passing.  Thanks, Christos!

 Martin, is it working for you?
 --
 Andreas Gustafsson, gson@gson.org

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Fri, 8 Feb 2019 16:21:36 +0200

 I have devised a new scripted test case which fails on amd64 in all of
 NetBSD-7, -8, and -current:

   http://www.gson.org/netbsd/bugs/53120/test.sh

 -- 
 Andreas Gustafsson, gson@gson.org

From: Kamil Rytarowski <n54@gmx.com>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/53120 (gdb issues with threaded programs)
Date: Sat, 9 Feb 2019 18:47:13 +0100

 Acknowledged. I'm in the process of cleaning up the existing tests,
 and I will keep adding new scenarios followed by kernel fixes.

State-Changed-From-To: open->closed
State-Changed-By: kamil@NetBSD.org
State-Changed-When: Tue, 04 Jun 2019 14:46:01 +0200
State-Changed-Why:
The original issue with LWP events in X threaded applications is fixed.

Spawning xfce4-clipman works for me and the reported issues are gone.

Closing this report.

If there are more issues (and there are some), please file a new bug report.


>Unformatted:
