NetBSD Problem Report #57535

From brad@anduin.eldar.org  Fri Jul 21 16:41:40 2023
Return-Path: <brad@anduin.eldar.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 137FC1A923D
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 21 Jul 2023 16:41:40 +0000 (UTC)
Message-Id: <202307211641.36LGfZV1008520@anduin.eldar.org>
Date: Fri, 21 Jul 2023 12:41:35 -0400 (EDT)
From: brad@anduin.eldar.org
Reply-To: brad@anduin.eldar.org
To: gnats-bugs@NetBSD.org
Subject: dtrace on Xen DOMU might need -x nolibs
X-Send-Pr-Version: 3.95

>Number:         57535
>Category:       port-xen
>Synopsis:       dtrace on Xen DOMU might need -x nolibs
>Confidential:   no
>Severity:       non-critical
>Priority:       high
>Responsible:    port-xen-maintainer
>State:          feedback
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jul 21 16:45:00 +0000 2023
>Closed-Date:    
>Last-Modified:  Wed Jul 26 00:30:01 +0000 2023
>Originator:     brad@anduin.eldar.org
>Release:        NetBSD 10.99.6
>Organization:
	eldar.org
>Environment:
System:	NetBSD samwise.nat.eldar.org 10.99.6 NetBSD 10.99.6 (SAMWISE) #0: Thu Jul 20 21:14:18 EDT 2023  brad@samwise.nat.eldar.org:/usr/src/sys/arch/amd64/compile/SAMWISE amd64
Architecture: x86_64
Machine: amd64
>Description:
	I was asked to run the following dtrace probe to help track down another problem:

dtrace -n 'sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'

This errors with:

dtrace: invalid probe specifier sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }: "/usr/lib/dtrace/psinfo.d", line 46: syntax error near "u_int"

The workaround is to use '-x nolibs' on the dtrace command line.
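
As a reading aid, the workaround can be sketched as a small helper that assembles the same command line with `-x nolibs` placed before the probe specifier. This is only an illustration: the helper name `dtrace_argv` is hypothetical, and the sketch builds and prints the command rather than running it.

```python
import shlex

# Probe specifier and action copied from the report above.
PROBES = "sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter:"
ACTION = ('{ printf("%d %d %d %d %d %d %d %d", '
          'arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }')


def dtrace_argv(nolibs=True):
    """Build the dtrace argument vector (hypothetical helper).

    With nolibs=True, '-x nolibs' is added, which is the workaround
    named in the report.
    """
    argv = ["dtrace"]
    if nolibs:
        argv += ["-x", "nolibs"]
    argv += ["-n", f"{PROBES} {ACTION}"]
    return argv


# Print a shell-quoted command line that could be pasted into a root shell.
print(shlex.join(dtrace_argv()))
```

Passing the list to `subprocess.run` as root on the DOMU would execute it; that step is left out here so the sketch runs anywhere.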

>How-To-Repeat:
	Set up a -current DOMU in normal PV mode and try the probe mentioned
	above.  You may have to compile a kernel with KDTRACE_HOOKS.

>Fix:
	I know little about how dtrace works and cannot offer a solution.
	However, I can probably test any proposed solution that might
	come around.

>Release-Note:

>Audit-Trail:
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57535 CVS commit: src/share/mk
Date: Fri, 21 Jul 2023 20:03:13 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Fri Jul 21 20:03:13 UTC 2023

 Modified Files:
 	src/share/mk: bsd.own.mk

 Log Message:
 bsd.own.mk: Use MACHINE_ARCH for default MKCTF/MKDTRACE=yes x86.

 The substantive impact of this is that it enables MKCTF=yes for Xen
 kernels.  This is a change because, when building a Xen kernel
 (XEN3_DOM0, XEN3_DOMU), MACHINE is set to `xen', not to `i386' or
 `amd64', so the conditional never took effect.

 (The side effect of setting MKDTRACE=yes when building Xen kernels is
 unlikely to matter; that affects module and userland builds.)

 PR port-xen/57535

 XXX pullup-10


 To generate a diff of this commit:
 cvs rdiff -u -r1.1342 -r1.1343 src/share/mk/bsd.own.mk

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Fri, 21 Jul 2023 20:13:06 +0000
State-Changed-Why:
Can you please try a clean XEN3_DOMU kernel build and report back?

Note: You must clean the entire XEN3_DOMU kernel build directory
first (and cvs up or equivalent); it is not enough to do an
incremental build in this case.


From: Brad Spencer <brad@anduin.eldar.org>
To: gnats-bugs@netbsd.org
Cc: port-xen-maintainer@netbsd.org, netbsd-bugs@netbsd.org,
        gnats-admin@netbsd.org, riastradh@NetBSD.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 06:42:27 -0400

 riastradh@NetBSD.org writes:

 > Synopsis: dtrace on Xen DOMU might need -x nolibs
 >
 > State-Changed-From-To: open->feedback
 > State-Changed-By: riastradh@NetBSD.org
 > State-Changed-When: Fri, 21 Jul 2023 20:13:06 +0000
 > State-Changed-Why:
 > Can you please try a clean XEN3_DOMU kernel build and report back?
 >
 > Note: You must clean the entire XEN3_DOMU kernel build directory
 > first (and cvs up or equivalent); it is not enough to do an
 > incremental build in this case.

 I did a clean build of the world after a cvs update of the source tree
 and the problem mentioned in this PR is still present:

 # dtrace -n 'sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'
 dtrace: invalid probe specifier sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }: "/usr/lib/dtrace/psinfo.d", line 46: syntax error near "u_int"



 I noted that a difference between GENERIC and XEN3_DOM0 is that there is
 a 'makeoptions DEBUG="-g"' present with a comment that it is needed for
 CTF and this option isn't present in XEN3_DOMU.  This is probably
 another problem, but if you try to build a XEN3_DOMU kernel with
 DEBUG="-g" you get this during link:

 #      link  SAMWISE/netbsd
 /lhome/CURRENT_20230720/amd64/TOOLS/bin/x86_64--netbsd-ld -Map netbsd.map --cref -T netbsd.ldscript -Ttext 0xffffffff80200000 -e start -X -o netbsd ${SYSTEM_OBJ:[@]:Nswapnetbsd.o} ${EXTRA_OBJ} vers.o swapnetbsd.o
 /lhome/CURRENT_20230720/amd64/TOOLS/bin/x86_64--netbsd-ld: warning: netbsd has a LOAD segment with RWX permissions
 NetBSD 10.99.6 (SAMWISE) #1: Sat Jul 22 06:28:38 EDT 2023
    text    data     bss     dec     hex filename
 5650174  273832 1736704 7660710  74e4a6 netbsd
 ERROR: nbctfmerge: Input file adiantum.o was partially built from C sources, but no CTF data was present

 I am pretty sure that the cvs updated correctly:

 #       $NetBSD: bsd.own.mk,v 1.1343 2023/07/21 20:03:13 riastradh Exp $

From: Brad Spencer <brad@anduin.eldar.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
        netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 07:46:28 -0400

 Taylor R Campbell <riastradh@NetBSD.org> writes:

 >> Date: Sat, 22 Jul 2023 06:42:27 -0400
 >> From: Brad Spencer <brad@anduin.eldar.org>
 >> 
 >> I did a clean build of the world after a cvs update of the source tree
 >> and the problem mentioned in this PR is still present:
 >
 > Can you please share the output of:
 >
 > $ ctfdump -t SAMWISE/adiantum.o
 > $ ctfdump -t SAMWISE/netbsd
 > $ config -x SAMWISE/netbsd
 >
 > in your clean build?
 >
 > If you delete the SAMWISE build directory and rebuild it, do you get
 > the same output?

 I should not build kernels before breakfast.  No, the build artifact
 directory for that kernel was NOT completely clean and I noticed that
 later.  The rest of the world was completely empty.

 After really making sure that the build directory for the custom kernel
 was gone, the build with "-g" worked fine.  Sorry for the
 noise... however, the original problem reported still is present, so
 adding debugging symbols didn't seem to help.

 # dtrace -n 'sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'
 dtrace: invalid probe specifier sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }: "/usr/lib/dtrace/psinfo.d", line 46: syntax error near "u_int"

 uname -a
 NetBSD samwise.nat.eldar.org 10.99.6 NetBSD 10.99.6 (SAMWISE) #0: Sat Jul 22 07:14:36 EDT 2023  brad@samwise.nat.eldar.org:/usr/src/sys/arch/amd64/compile/SAMWISE amd64

 I get this on the console of the DOMU when the dtrace is executed:

 [   272.554696] fbt: no CTF data for module netbsd

 >> I noted that a difference between GENERIC and XEN3_DOM0 is that there is
 >> a 'makeoptions DEBUG="-g"' present with a comment that it is needed for
 >> CTF and this option isn't present in XEN3_DOMU.
 >
 > Evidently I never noticed this because I always build with MKDEBUG=yes
 > (or MKDEBUGKERNEL=yes) which renders it unnecessary.
 >
 > I think maybe we should delete the makeoptions lines in the kernel
 > configs, and set MKDEBUG=yes or at least MKDEBUGKERNEL=yes by default.

 Probably, but I don't have a strong opinion.  It would seem like
 having magic makeoptions for something like this isn't all that
 desirable.  I know that I have personally broken the official build
 because I forgot to test a build with MKDEBUG=yes and missed something
 new that was part of the debug set.  It does add to the size of the
 artifacts and does take longer.  Maybe it should be enabled for -current
 and BETAs and disabled for the releases.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 11:52:29 +0000

 > Date: Sat, 22 Jul 2023 07:46:28 -0400
 > From: Brad Spencer <brad@anduin.eldar.org>
 > 
 > After really making sure that the build directory for the custom kernel
 > was gone, the build with "-g" worked fine.  Sorry for the
 > noise... however, the original problem reported still is present, so
 > adding debugging symbols didn't seem to help.

 Can you please share the output of:

 ctfdump -t SAMWISE/netbsd
 ctfdump -t /netbsd

 after you have rebuilt and reinstalled the kernel from a clean build
 directory?

 If you're not booting from /netbsd, run it on whatever is the kernel
 that you are actually booting -- and double-check to make sure that
 you are actually booting the kernel you think you are booting.

From: Brad Spencer <brad@anduin.eldar.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
        netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 08:21:25 -0400

 Taylor R Campbell <riastradh@NetBSD.org> writes:

 >> Date: Sat, 22 Jul 2023 07:46:28 -0400
 >> From: Brad Spencer <brad@anduin.eldar.org>
 >> 
 >> After really making sure that the build directory for the custom kernel
 >> was gone, the build with "-g" worked fine.  Sorry for the
 >> noise... however, the original problem reported still is present, so
 >> adding debugging symbols didn't seem to help.
 >
 > Can you please share the output of:
 >
 > ctfdump -t SAMWISE/netbsd
 > ctfdump -t /netbsd
 >
 > after you have rebuilt and reinstalled the kernel from a clean build
 > directory?

 Those are here:

 http://anduin.eldar.org/~brad/ctf/ctfdump_built_SAMWISE
 http://anduin.eldar.org/~brad/ctf/ctfdump_booted_SAMWISE

 Since this is a pure DOMU PV, the booted kernel lives on the DOM0 and
 the ctfdump was taken from there using the path indicated in the DOMU
 Xen config.  The two outputs appear to be identical.  I also placed the
 kernel in /netbsd on the DOMU, although there really should not be
 anything that cares much about that on a pure PV (and that didn't affect
 dtrace).
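
One way to confirm that two saved ctfdump outputs really are identical is a line-by-line diff. The following is a minimal sketch, assuming both outputs were saved to local text files; the function name `ctfdump_diff` and the file names in the usage note are hypothetical.

```python
import difflib
from pathlib import Path


def ctfdump_diff(path_a, path_b, limit=20):
    """Return up to `limit` unified-diff lines between two text files.

    An empty list means the files have identical lines.
    """
    a = Path(path_a).read_text(errors="replace").splitlines()
    b = Path(path_b).read_text(errors="replace").splitlines()
    diff = difflib.unified_diff(a, b, str(path_a), str(path_b), lineterm="")
    return list(diff)[:limit]
```

For example, `ctfdump_diff("ctfdump_built_SAMWISE", "ctfdump_booted_SAMWISE")` would return `[]` if the built and booted kernels produced the same dump, and the first differing hunks otherwise.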

 > If you're not booting from /netbsd, run it on whatever is the kernel
 > that you are actually booting -- and double-check to make sure that
 > you are actually booting the kernel you think you are booting.

 Yes, I did that double check.  I am running the kernel built with "-g"
 as far as I can tell.  For completeness here is the "config -x" of the
 booted kernel:

 http://anduin.eldar.org/~brad/ctf/config_SAMWISE

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 11:12:37 +0000

 > Date: Sat, 22 Jul 2023 06:42:27 -0400
 > From: Brad Spencer <brad@anduin.eldar.org>
 > 
 > I did a clean build of the world after a cvs update of the source tree
 > and the problem mentioned in this PR is still present:

 Can you please share the output of:

 $ ctfdump -t SAMWISE/adiantum.o
 $ ctfdump -t SAMWISE/netbsd
 $ config -x SAMWISE/netbsd

 in your clean build?

 If you delete the SAMWISE build directory and rebuild it, do you get
 the same output?

 > I noted that a difference between GENERIC and XEN3_DOM0 is that there is
 > a 'makeoptions DEBUG="-g"' present with a comment that it is needed for
 > CTF and this option isn't present in XEN3_DOMU.

 Evidently I never noticed this because I always build with MKDEBUG=yes
 (or MKDEBUGKERNEL=yes) which renders it unnecessary.

 I think maybe we should delete the makeoptions lines in the kernel
 configs, and set MKDEBUG=yes or at least MKDEBUGKERNEL=yes by default.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 13:26:09 +0000

 Can you also share the output of `ctfdump -t /dev/ksyms'?

From: Martin Husemann <martin@duskware.de>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: Brad Spencer <brad@anduin.eldar.org>, gnats-bugs@netbsd.org,
	port-xen-maintainer@netbsd.org, netbsd-bugs@netbsd.org,
	gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 16:49:39 +0200

 On Sat, Jul 22, 2023 at 11:12:37AM +0000, Taylor R Campbell wrote:
 > I think maybe we should delete the makeoptions lines in the kernel
 > configs, and set MKDEBUG=yes or at least MKDEBUGKERNEL=yes by default.

 If you mean that globally:
 last time we tried that the evbarm* builds exploded in size (due to millions
 of kernels built there). This could be less drastic nowadays with many
 more configs moved into GENERIC but we need to be a bit careful here.

 Martin

From: Brad Spencer <brad@anduin.eldar.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
        netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 11:10:06 -0400

 Taylor R Campbell <riastradh@NetBSD.org> writes:

 > Can you also share the output of `ctfdump -t /dev/ksyms'?

 # ctfdump -t /dev/ksyms
 /dev/ksyms does not contain a CTF preamble

 The pseudo-device itself appears to be working, because the first part
 of the output of "strings -a /dev/ksyms" is this:

 .note.netbsd.ident
 .symtab
 .strtab
 .shstrtab
 .bss
 .SUNW_ctf
 NetBSD
 xpq_queue_array
 xen_bootstrap_tables
 __func__.10
 __func__.11
 __func__.9
 __func__.8
 __func__.7
 __func__.6
 __func__.5
 __func__.4
 __func__.3
 __func__.2
 __func__.1
 __func__.0
 xen_idt_page.0
 initted.0
 xen_ipi_ast
 xen_ipi_hvcb
 xen_ipi_synch_fpu
 xen_ipi_halt
 xen_ipi_handler
 xen_ipifunc
 xen_ipi_kpreempt
 xen_ipi_generic
 xen_ipi_xcall
 .
 .
 .

From: Brad Spencer <brad@anduin.eldar.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
        netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, riastradh@NetBSD.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Tue, 25 Jul 2023 20:26:24 -0400

 Brad Spencer <brad@anduin.eldar.org> writes:

 > riastradh@NetBSD.org writes:
 >
 >> Synopsis: dtrace on Xen DOMU might need -x nolibs
 >>
 >> State-Changed-From-To: open->feedback
 >> State-Changed-By: riastradh@NetBSD.org
 >> State-Changed-When: Fri, 21 Jul 2023 20:13:06 +0000
 >> State-Changed-Why:
 >> Can you please try a clean XEN3_DOMU kernel build and report back?
 >>
 >> Note: You must clean the entire XEN3_DOMU kernel build directory
 >> first (and cvs up or equivalent); it is not enough to do an
 >> incremental build in this case.


 The system disappeared from "ruptime", but wasn't down.  In the past
 this has indicated that the clock went backwards (which confuses the
 ruptime / rwho stuff).  What I usually do is log into the console, go
 into single user and umount the filesystems before destroying the DOMU
 (or if I am lucky, a reboot, but that doesn't always work).  Upon doing
 a "umount -a" the following was printed on the console:

 [ 175409.0295886] WARNING: lwp 8442 (umount): negative runtime: (-1 + 0x857d0fb35b7fa2d0/2^64) sec
 [ 175409.0295886] WARNING: pid 8442 (umount): negative runtime; monotonic clock has gone backwards

 Before I did any of that I did a date and noticed that the clock was
 about 1 hour behind.  I personally don't think that this could have been
 due to anything gradually changing with the clock.
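
For readers who want the size of the negative runtime in the console warning, the printed expression `(-1 + 0x857d0fb35b7fa2d0/2^64) sec` can be evaluated directly. The sketch below assumes only what the message itself shows, a whole-seconds part plus a fraction counted in units of 2^-64 s; the function name `bintime_to_seconds` is hypothetical.

```python
from fractions import Fraction


def bintime_to_seconds(sec, frac):
    """Evaluate sec + frac/2^64 exactly, as printed in the warning."""
    return sec + Fraction(frac, 2**64)


# Values copied from the console warning for lwp 8442 (umount).
runtime = bintime_to_seconds(-1, 0x857D0FB35B7FA2D0)
print(f"{float(runtime):.4f} s")
```

The result is a little under half a second negative (about -0.48 s).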



 -- 
 Brad Spencer - brad@anduin.eldar.org - KC8VKS - http://anduin.eldar.org

>Unformatted:

Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.