NetBSD Problem Report #57535
From brad@anduin.eldar.org Fri Jul 21 16:41:40 2023
Return-Path: <brad@anduin.eldar.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 137FC1A923D
for <gnats-bugs@gnats.NetBSD.org>; Fri, 21 Jul 2023 16:41:40 +0000 (UTC)
Message-Id: <202307211641.36LGfZV1008520@anduin.eldar.org>
Date: Fri, 21 Jul 2023 12:41:35 -0400 (EDT)
From: brad@anduin.eldar.org
Reply-To: brad@anduin.eldar.org
To: gnats-bugs@NetBSD.org
Subject: dtrace on Xen DOMU might need -x nolibs
X-Send-Pr-Version: 3.95
>Number: 57535
>Category: port-xen
>Synopsis: dtrace on Xen DOMU might need -x nolibs
>Confidential: no
>Severity: non-critical
>Priority: high
>Responsible: port-xen-maintainer
>State: feedback
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Jul 21 16:45:00 +0000 2023
>Closed-Date:
>Last-Modified: Wed Jul 26 00:30:01 +0000 2023
>Originator: brad@anduin.eldar.org
>Release: NetBSD 10.99.6
>Organization:
eldar.org
>Environment:
System: NetBSD samwise.nat.eldar.org 10.99.6 NetBSD 10.99.6 (SAMWISE) #0: Thu Jul 20 21:14:18 EDT 2023 brad@samwise.nat.eldar.org:/usr/src/sys/arch/amd64/compile/SAMWISE amd64
Architecture: x86_64
Machine: amd64
>Description:
I was asked to run the following dtrace probe to help track down another problem:
dtrace -n 'sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'
This errors with:
dtrace: invalid probe specifier sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }: "/usr/lib/dtrace/psinfo.d", line 46: syntax error near "u_int"
The workaround is to use '-x nolibs' on the dtrace command line.
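For reference, the workaround applied to the probe above would look roughly like the sketch below. The variable names PROBES, CMD and RUN_DTRACE are illustrative and not part of the report. The script only prints the command unless RUN_DTRACE=1 is set, since dtrace needs root and a kernel that provides these probes.

```sh
# Probe specification copied verbatim from the failing command above.
PROBES='sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'

# Same invocation with '-x nolibs' added; per this report that avoids the
# syntax error raised from /usr/lib/dtrace/psinfo.d.
CMD="dtrace -x nolibs -n '$PROBES'"

if [ "${RUN_DTRACE:-0}" = 1 ]; then
    # Run for real (as root, on the affected DOMU).
    dtrace -x nolibs -n "$PROBES"
else
    # Default: only show what would be run.
    printf '%s\n' "$CMD"
fi
```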
>How-To-Repeat:
Set up a -current DOMU in normal PV mode and try the probe mentioned
above. You may have to compile a kernel with KDTRACE_HOOKS.
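A custom DOMU configuration with the DTrace hooks enabled might look like the fragment below. This is only a sketch: the name SAMWISE matches the kernel named in this report, but the real contents of that configuration are not shown here, and the layout (including XEN3_DOMU and adding the option on top) is an assumption.

```
# sys/arch/amd64/conf/SAMWISE (hypothetical contents)
include 	"arch/amd64/conf/XEN3_DOMU"

options 	KDTRACE_HOOKS	# kernel DTrace hooks
```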
>Fix:
I know little about how dtrace works and cannot offer a solution.
However, I can probably test any proposed solution that might
come around.
>Release-Note:
>Audit-Trail:
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57535 CVS commit: src/share/mk
Date: Fri, 21 Jul 2023 20:03:13 +0000
Module Name: src
Committed By: riastradh
Date: Fri Jul 21 20:03:13 UTC 2023
Modified Files:
src/share/mk: bsd.own.mk
Log Message:
bsd.own.mk: Use MACHINE_ARCH for default MKCTF/MKDTRACE=yes x86.
The substantive impact of this is that it enables MKCTF=yes for Xen
kernels. This is a change because, when building a Xen kernel
(XEN3_DOM0, XEN3_DOMU), MACHINE is set to `xen', not to `i386' or
`amd64', so the conditional never took effect.
(The side effect of setting MKDTRACE=yes when building Xen kernels is
unlikely to matter; that affects module and userland builds.)
PR port-xen/57535
XXX pullup-10
To generate a diff of this commit:
cvs rdiff -u -r1.1342 -r1.1343 src/share/mk/bsd.own.mk
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Fri, 21 Jul 2023 20:13:06 +0000
State-Changed-Why:
Can you please try a clean XEN3_DOMU kernel build and report back?
Note: You must clean the entire XEN3_DOMU kernel build directory
first (and cvs up or equivalent); it is not enough to do an
incremental build in this case.
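A minimal sketch of what "clean the entire build directory" could mean in practice follows. The helper name clean_kernel_objdir is hypothetical, and the safety check assumes the traditional sys/arch/<arch>/compile/<CONFIG> layout; a build.sh object directory would be located elsewhere.

```sh
# clean_kernel_objdir DIR
# Remove a kernel compile directory so that the next build starts from
# scratch. Refuses any path that does not look like .../compile/<CONFIG>.
clean_kernel_objdir() {
    dir=$1
    case "$dir" in
        */compile/?*) ;;
        *) echo "refusing to remove $dir" >&2; return 1 ;;
    esac
    rm -rf -- "$dir"
}
```

For the kernel in this report that would be something like "clean_kernel_objdir /usr/src/sys/arch/amd64/compile/SAMWISE", followed by rerunning config(1) and a full make depend && make in the recreated directory.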
From: Brad Spencer <brad@anduin.eldar.org>
To: gnats-bugs@netbsd.org
Cc: port-xen-maintainer@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org, riastradh@NetBSD.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 06:42:27 -0400
riastradh@NetBSD.org writes:
> Synopsis: dtrace on Xen DOMU might need -x nolibs
>
> State-Changed-From-To: open->feedback
> State-Changed-By: riastradh@NetBSD.org
> State-Changed-When: Fri, 21 Jul 2023 20:13:06 +0000
> State-Changed-Why:
> Can you please try a clean XEN3_DOMU kernel build and report back?
>
> Note: You must clean the entire XEN3_DOMU kernel build directory
> first (and cvs up or equivalent); it is not enough to do an
> incremental build in this case.
I did a clean build of the world after a cvs update of the source tree
and the problem mentioned in this PR is still present:
# dtrace -n 'sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'
dtrace: invalid probe specifier sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }: "/usr/lib/dtrace/psinfo.d", line 46: syntax error near "u_int"
I noted that a difference between GENERIC and XEN3_DOM0 is that there is
a 'makeoptions DEBUG="-g"' present with a comment that it is needed for
CTF and this option isn't present in XEN3_DOMU. This is probably
another problem, but if you try to build a XEN3_DOMU kernel with
DEBUG="-g" you get this during link:
# link SAMWISE/netbsd
/lhome/CURRENT_20230720/amd64/TOOLS/bin/x86_64--netbsd-ld -Map netbsd.map --cref -T netbsd.ldscript -Ttext 0xffffffff80200000 -e start -X -o netbsd ${SYSTEM_OBJ:[@]:Nswapnetbsd.o} ${EXTRA_OBJ} vers.o swapnetbsd.o
/lhome/CURRENT_20230720/amd64/TOOLS/bin/x86_64--netbsd-ld: warning: netbsd has a LOAD segment with RWX permissions
NetBSD 10.99.6 (SAMWISE) #1: Sat Jul 22 06:28:38 EDT 2023
   text    data     bss     dec     hex filename
5650174  273832 1736704 7660710  74e4a6 netbsd
ERROR: nbctfmerge: Input file adiantum.o was partially built from C sources, but no CTF data was present
I am pretty sure that the cvs updated correctly:
# $NetBSD: bsd.own.mk,v 1.1343 2023/07/21 20:03:13 riastradh Exp $
From: Brad Spencer <brad@anduin.eldar.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 07:46:28 -0400
Taylor R Campbell <riastradh@NetBSD.org> writes:
>> Date: Sat, 22 Jul 2023 06:42:27 -0400
>> From: Brad Spencer <brad@anduin.eldar.org>
>>
>> I did a clean build of the world after a cvs update of the source tree
>> and the problem mentioned in this PR is still present:
>
> Can you please share the output of:
>
> $ ctfdump -t SAMWISE/adiantum.o
> $ ctfdump -t SAMWISE/netbsd
> $ config -x SAMWISE/netbsd
>
> in your clean build?
>
> If you delete the SAMWISE build directory and rebuild it, do you get
> the same output?
I should not build kernels before breakfast. No, the build artifact
directory for that kernel was NOT completely clean and I noticed that
later. The rest of the world was completely empty.
After really making sure that the build directory for the custom kernel
was gone, the build with "-g" worked fine. Sorry for the
noise... however, the original problem reported still is present, so
adding debugging symbols didn't seem to help.
# dtrace -n 'sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'
dtrace: invalid probe specifier sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }: "/usr/lib/dtrace/psinfo.d", line 46: syntax error near "u_int"
uname -a
NetBSD samwise.nat.eldar.org 10.99.6 NetBSD 10.99.6 (SAMWISE) #0: Sat Jul 22 07:14:36 EDT 2023 brad@samwise.nat.eldar.org:/usr/src/sys/arch/amd64/compile/SAMWISE amd64
I get this on the console of the DOMU when the dtrace is executed:
[ 272.554696] fbt: no CTF data for module netbsd
>> I noted that a difference between GENERIC and XEN3_DOM0 is that there is
>> a 'makeoptions DEBUG="-g"' present with a comment that it is needed for
>> CTF and this option isn't present in XEN3_DOMU.
>
> Evidently I never noticed this because I always build with MKDEBUG=yes
> (or MKDEBUGKERNEL=yes) which renders it unnecessary.
>
> I think maybe we should delete the makeoptions lines in the kernel
> configs, and set MKDEBUG=yes or at least MKDEBUGKERNEL=yes by default.
Probably, but I don't have a strong opinion. It would seem like
having magic makeoptions for something like this isn't all that
desirable. I know that I have personally broken the official build
because I forgot to test a build with MKDEBUG=yes and missed something
new that was part of the debug set. It does add to the size of the
artifacts and does take longer. Maybe it should be enabled for -current
and BETAs and disabled for the releases.
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 11:52:29 +0000
> Date: Sat, 22 Jul 2023 07:46:28 -0400
> From: Brad Spencer <brad@anduin.eldar.org>
>
> After really making sure that the build directory for the custom kernel
> was gone, the build with "-g" worked fine. Sorry for the
> noise... however, the original problem reported still is present, so
> adding debugging symbols didn't seem to help.
Can you please share the output of:
ctfdump -t SAMWISE/netbsd
ctfdump -t /netbsd
after you have rebuilt and reinstalled the kernel from a clean build
directory?
If you're not booting from /netbsd, run it on whatever is the kernel
that you are actually booting -- and double-check to make sure that
you are actually booting the kernel you think you are booting.
From: Brad Spencer <brad@anduin.eldar.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 08:21:25 -0400
Taylor R Campbell <riastradh@NetBSD.org> writes:
>> Date: Sat, 22 Jul 2023 07:46:28 -0400
>> From: Brad Spencer <brad@anduin.eldar.org>
>>
>> After really making sure that the build directory for the custom kernel
>> was gone, the build with "-g" worked fine. Sorry for the
>> noise... however, the original problem reported still is present, so
>> adding debugging symbols didn't seem to help.
>
> Can you please share the output of:
>
> ctfdump -t SAMWISE/netbsd
> ctfdump -t /netbsd
>
> after you have rebuilt and reinstalled the kernel from a clean build
> directory?
Those are here:
http://anduin.eldar.org/~brad/ctf/ctfdump_built_SAMWISE
http://anduin.eldar.org/~brad/ctf/ctfdump_booted_SAMWISE
Since this is a pure DOMU PV, the booted kernel lives on the DOM0 and
the ctfdump was taken from there using the path indicated in the DOMU
Xen config. The two outputs appear to be identical. I also placed the
kernel in /netbsd on the DOMU, although there really should not be
anything that cares much about that on a pure PV (and that didn't affect
dtrace).
> If you're not booting from /netbsd, run it on whatever is the kernel
> that you are actually booting -- and double-check to make sure that
> you are actually booting the kernel you think you are booting.
Yes, I did that double check. I am running the kernel built with "-g"
as far as I can tell. For completeness here is the "config -x" of the
booted kernel:
http://anduin.eldar.org/~brad/ctf/config_SAMWISE
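The comparison of the built and booted kernels described above could be scripted along these lines. This is a sketch: the function name compare_ctf and the CTFDUMP override variable are hypothetical, and it assumes that ctfdump exits non-zero on failure and that its -t output does not embed the file path.

```sh
# compare_ctf BUILT BOOTED
# Dump the CTF type data of both kernels and report whether the dumps match.
# CTFDUMP can be set to use a different ctfdump binary.
compare_ctf() {
    tool=${CTFDUMP:-ctfdump}
    built_dump=$(mktemp) || return 2
    booted_dump=$(mktemp) || { rm -f "$built_dump"; return 2; }
    if "$tool" -t "$1" >"$built_dump" && "$tool" -t "$2" >"$booted_dump"; then
        if cmp -s "$built_dump" "$booted_dump"; then
            echo identical; rc=0
        else
            echo different; rc=1
        fi
    else
        echo "dump failed" >&2; rc=2
    fi
    rm -f "$built_dump" "$booted_dump"
    return "$rc"
}
```

Usage would be "compare_ctf SAMWISE/netbsd /path/to/booted/kernel", where the second path is whatever the DOMU Xen config on the DOM0 points at.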
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 11:12:37 +0000
> Date: Sat, 22 Jul 2023 06:42:27 -0400
> From: Brad Spencer <brad@anduin.eldar.org>
>
> I did a clean build of the world after a cvs update of the source tree
> and the problem mentioned in this PR is still present:
Can you please share the output of:
$ ctfdump -t SAMWISE/adiantum.o
$ ctfdump -t SAMWISE/netbsd
$ config -x SAMWISE/netbsd
in your clean build?
If you delete the SAMWISE build directory and rebuild it, do you get
the same output?
> I noted that a difference between GENERIC and XEN3_DOM0 is that there is
> a 'makeoptions DEBUG="-g"' present with a comment that it is needed for
> CTF and this option isn't present in XEN3_DOMU.
Evidently I never noticed this because I always build with MKDEBUG=yes
(or MKDEBUGKERNEL=yes) which renders it unnecessary.
I think maybe we should delete the makeoptions lines in the kernel
configs, and set MKDEBUG=yes or at least MKDEBUGKERNEL=yes by default.
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 13:26:09 +0000
Can you also share the output of `ctfdump -t /dev/ksyms'?
From: Martin Husemann <martin@duskware.de>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: Brad Spencer <brad@anduin.eldar.org>, gnats-bugs@netbsd.org,
port-xen-maintainer@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 16:49:39 +0200
On Sat, Jul 22, 2023 at 11:12:37AM +0000, Taylor R Campbell wrote:
> I think maybe we should delete the makeoptions lines in the kernel
> configs, and set MKDEBUG=yes or at least MKDEBUGKERNEL=yes by default.
If you mean that globally:
Last time we tried that, the evbarm* builds exploded in size (due to millions
of kernels built there). This could be less drastic nowadays with many
more configs moved into GENERIC, but we need to be a bit careful here.
Martin
From: Brad Spencer <brad@anduin.eldar.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Sat, 22 Jul 2023 11:10:06 -0400
Taylor R Campbell <riastradh@NetBSD.org> writes:
> Can you also share the output of `ctfdump -t /dev/ksyms'?
# ctfdump -t /dev/ksyms
/dev/ksyms does not contain a CTF preamble
The pseudo device is working because the first little bit of a "strings
-a /dev/ksyms" produces this:
.note.netbsd.ident
.symtab
.strtab
.shstrtab
.bss
.SUNW_ctf
NetBSD
xpq_queue_array
xen_bootstrap_tables
__func__.10
__func__.11
__func__.9
__func__.8
__func__.7
__func__.6
__func__.5
__func__.4
__func__.3
__func__.2
__func__.1
__func__.0
xen_idt_page.0
initted.0
xen_ipi_ast
xen_ipi_hvcb
xen_ipi_synch_fpu
xen_ipi_halt
xen_ipi_handler
xen_ipifunc
xen_ipi_kpreempt
xen_ipi_generic
xen_ipi_xcall
.
.
.
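The same kind of check as the strings output above can be sketched as a small helper. The function name has_ctf_section_name is hypothetical. Note that finding the name only shows that ".SUNW_ctf" occurs in the file; as the ctfdump output above shows, the name can be present while ctfdump still reports no CTF preamble.

```sh
# has_ctf_section_name FILE
# Succeeds if the literal string ".SUNW_ctf" occurs anywhere in FILE.
# -a treats binary input as text, -F matches the string literally.
has_ctf_section_name() {
    grep -a -q -F '.SUNW_ctf' "$1"
}
```

For example, "has_ctf_section_name /dev/ksyms && echo present" on the affected DOMU.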
From: Brad Spencer <brad@anduin.eldar.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, port-xen-maintainer@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, riastradh@NetBSD.org
Subject: Re: port-xen/57535 (dtrace on Xen DOMU might need -x nolibs)
Date: Tue, 25 Jul 2023 20:26:24 -0400
Brad Spencer <brad@anduin.eldar.org> writes:
> riastradh@NetBSD.org writes:
>
>> Synopsis: dtrace on Xen DOMU might need -x nolibs
>>
>> State-Changed-From-To: open->feedback
>> State-Changed-By: riastradh@NetBSD.org
>> State-Changed-When: Fri, 21 Jul 2023 20:13:06 +0000
>> State-Changed-Why:
>> Can you please try a clean XEN3_DOMU kernel build and report back?
>>
>> Note: You must clean the entire XEN3_DOMU kernel build directory
>> first (and cvs up or equivalent); it is not enough to do an
>> incremental build in this case.
The system disappeared from "ruptime", but wasn't down. In the past
this has indicated that the clock went backwards (which confuses the
ruptime / rwho stuff). What I usually do is log into the console, go
into single user and umount the filesystems before destroying the DOMU
(or, if I am lucky, a reboot, but that doesn't always work). Upon doing
a "umount -a" the following was printed on the console:
[ 175409.0295886] WARNING: lwp 8442 (umount): negative runtime: (-1 + 0x857d0fb35b7fa2d0/2^64) sec
[ 175409.0295886] WARNING: pid 8442 (umount): negative runtime; monotonic clock has gone backwards
Before I did any of that I did a date and noticed that the clock was
about 1 hour behind. I personally don't think that this could have been
due to anything gradually changing with the clock.
--
Brad Spencer - brad@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
>Unformatted:
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.