NetBSD Problem Report #56330

From dholland@netbsd.org  Sun Jul 25 04:28:17 2021
Return-Path: <dholland@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 0EDF31A921F
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 25 Jul 2021 04:28:17 +0000 (UTC)
Message-Id: <20210725042816.C19A384E56@mail.netbsd.org>
Date: Sun, 25 Jul 2021 04:28:16 +0000 (UTC)
From: dholland@NetBSD.org
Reply-To: dholland@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: amd64 gdb issues
X-Send-Pr-Version: 3.95

>Number:         56330
>Category:       toolchain
>Synopsis:       amd64 gdb issues
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    toolchain-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Jul 25 04:30:00 +0000 2021
>Last-Modified:  Wed Dec 07 18:50:01 +0000 2022
>Originator:     David A. Holland
>Release:        NetBSD 9.99.85 (20210623)
>Organization:
>Environment:
System: NetBSD valkyrie 9.99.85 NetBSD 9.99.85 (VALKYRIE) #7: Wed Jun 23 18:32:25 EDT 2021  dholland@valkyrie:/usr/src/sys/arch/amd64/compile/VALKYRIE amd64
Architecture: x86_64
Machine: amd64
>Description:

1. Debugging multithreaded processes in gdb doesn't work. I can run
the target and set breakpoints, but any attempt to step or next
immediately stops with SIGTRAP in ___lwp_park60, and if you keep
trying, the target eventually segfaults.

2. About a third of the time, killing the target causes gdb to
hang. ps -s shows one thread with wchan "wait", and a bunch of others
"parked". The target is stopped. (This may only happen with
multithreaded processes; I'm not sure.)

3. A whole stack of ptrace tests are currently failing in the testbed;
it's likely this is the root cause.

>How-To-Repeat:
>Fix:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: toolchain/56330: amd64 gdb issues
Date: Wed, 7 Dec 2022 19:45:37 +0100

 This is not a plain gdb issue; the ptrace part causes serious kernel
 trouble. After trying to gdb a hanging firefox in -current (easily
 reproducible) I ended up with all firefox procs as zombies and no
 progress made:

   PID TTY   STAT     TIME COMMAND
 [..]
  9579 ?     DXEl  0:54.86 (firefox)
  3117 ?     Z     0:00.00 - (firefox)
  4376 ?     Z     0:00.00 - (firefox)
  6129 ?     Z     0:00.00 - (firefox)
  8782 ?     Z     0:00.00 - (firefox)
  9735 ?     Z     0:00.00 - (firefox)
 10601 ?     Z     0:00.00 - (firefox)
 10994 ?     Z     0:00.00 - (firefox)
 11404 ?     Z     0:00.00 - (firefox)
 11824 ?     Z     0:00.00 - (firefox)

 Everything else was still working fine. Process 9579 is still considered
 traced (X), in uninterruptible wait (D), and already exiting (E), but is
 making no progress with that. Short of a reboot, there is not much I can
 do to fix it - duh!

 Martin
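
 For readers less familiar with the STAT column, the flag strings in the
 ps listing above can be decoded with a small lookup. This is only a
 sketch, not part of the original report: decode_stat and is_stuck_tracee
 are hypothetical helper names, the D, X and E meanings are taken from the
 message above, and the Z (zombie) and l (multiple LWPs) meanings are my
 reading of ps(1) and should be checked against the manual page.

```python
# Illustrative subset of ps STAT flags. D, X and E follow the wording of
# this report; Z and l are assumptions based on ps(1).
STAT_FLAGS = {
    "D": "uninterruptible wait",
    "X": "being traced",
    "E": "exiting",
    "Z": "zombie",
    "l": "multiple LWPs",
}


def decode_stat(stat):
    """Return one description per character of a ps STAT string."""
    return [STAT_FLAGS.get(ch, "unknown flag %r" % ch) for ch in stat]


def is_stuck_tracee(stat):
    """True if the STAT string carries all of D, X and E."""
    return all(flag in stat for flag in "DXE")


if __name__ == "__main__":
    # STAT values copied from the ps listing in this report.
    for stat in ("DXEl", "Z"):
        print(stat, "->", ", ".join(decode_stat(stat)))
```

 Under these assumptions, process 9579 (DXEl) matches the stuck pattern
 described above, while the zombie children (Z) do not.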

