NetBSD Problem Report #56561
From www@netbsd.org Mon Dec 20 15:36:54 2021
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 3F15C1A9239
for <gnats-bugs@gnats.NetBSD.org>; Mon, 20 Dec 2021 15:36:54 +0000 (UTC)
Message-Id: <20211220153653.4AB631A923B@mollari.NetBSD.org>
Date: Mon, 20 Dec 2021 15:36:53 +0000 (UTC)
From: prlw1@cam.ac.uk
Reply-To: prlw1@cam.ac.uk
To: gnats-bugs@NetBSD.org
Subject: cv_is_valid assertion failure in intel drm
X-Send-Pr-Version: www-1.0
>Number: 56561
>Category: kern
>Synopsis: cv_is_valid assertion failure in intel drm
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Dec 20 15:40:00 +0000 2021
>Closed-Date: Sun Aug 28 13:37:41 +0000 2022
>Last-Modified: Sun Aug 28 13:37:41 +0000 2022
>Originator: Patrick Welche
>Release: NetBSD-9.99.92/amd64 of 20 Dec 2021
>Organization:
>Environment:
>Description:
Trying out the new drm code on a
i915drmkms0 at pci0 dev 2 function 0: Intel UHD Graphics 620 (rev. 0x02)
which now has hardware acceleration(!) I happened across this panic:
(gdb) print panicstr
$1 = 0xffffffff810f8600 <scratchstr> "kernel diagnostic assertion \"cv_is_valid(cv)\" failed: file \"../../../../kern/kern_condvar.c\", line 511 "
(gdb) bt
#0 0xffffffff80222705 in cpu_reboot (howto=howto@entry=260,
bootstr=bootstr@entry=0x0) at ../../../../arch/amd64/amd64/machdep.c:713
#1 0xffffffff808c69c4 in kern_reboot (howto=howto@entry=260,
bootstr=bootstr@entry=0x0) at ../../../../kern/kern_reboot.c:73
#2 0xffffffff8090909a in vpanic (
fmt=0xffffffff80d8e280 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=ap@entry=0xffffb6813c48edc8) at ../../../../kern/subr_prf.c:290
#3 0xffffffff80a5bb47 in kern_assert (
fmt=fmt@entry=0xffffffff80d8e280 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ") at ../../../../../../lib/libkern/kern_assert.c:51
#4 0xffffffff80894d9a in cv_broadcast (cv=0xffffb6814e554b50)
at ../../../../kern/kern_condvar.c:511
#5 0xffffffff80a2be73 in linux___dma_fence_signal_wake (
fence=0xffffa1db7f364840, timestamp=<optimized out>)
at ../../../../external/bsd/drm2/linux/linux_dma_fence.c:1176
#6 0xffffffff80575b7d in signal_irq_work (work=0xffffb6801fac41e8)
at ../../../../external/bsd/drm2/dist/drm/i915/gt/intel_breadcrumbs.c:218
#7 0xffffffff80a30939 in irq_work_intr (cookie=<optimized out>)
at ../../../../external/bsd/drm2/linux/linux_irq_work.c:74
#8 0xffffffff808d4ff0 in softint_execute (s=5, l=0xffffa1debe5fe4c0)
at ../../../../kern/kern_softint.c:565
#9 softint_dispatch (pinned=<optimized out>, s=5)
at ../../../../kern/kern_softint.c:814
#10 0xffffffff8021d7ff in Xsoftintr ()
#11 0xa6658f2fddbb7892 in ?? ()
#12 0xa6658f2fddbb7892 in ?? ()
Backtrace stopped: Cannot access memory at address 0xffffb6813c48f000
(gdb) frame 4
#4 0xffffffff80894d9a in cv_broadcast (cv=0xffffb6814e554b50)
at ../../../../kern/kern_condvar.c:511
511 KASSERT(cv_is_valid(cv));
(gdb) list
506 */
507 void
508 cv_broadcast(kcondvar_t *cv)
509 {
510
511 KASSERT(cv_is_valid(cv));
512
513 if (__predict_false(!LIST_EMPTY(CV_SLEEPQ(cv))))
514 cv_wakeup_all(cv);
515 }
(gdb) print *cv
$1 = {cv_opaque = {0x0, 0xffffffff80d69100 <deadcv>}}
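Note on the output above: the second word of cv_opaque points at deadcv, which is what one would expect if the condvar had already been passed to cv_destroy() on a DIAGNOSTIC kernel before cv_broadcast() was called on it. The following is a minimal userland sketch of that poison-on-destroy pattern, not the code in kern_condvar.c. The toy_cv type and the toy_* functions are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for kcondvar_t: a waiter list head and a name. */
struct toy_cv {
	void *waiters;
	const char *wmesg;
};

/* Sentinel stored in wmesg once the condvar has been destroyed. */
static const char toy_deadcv[] = "deadcv";

static void
toy_cv_init(struct toy_cv *cv, const char *wmesg)
{
	cv->waiters = NULL;
	cv->wmesg = wmesg;
}

/* Destroying poisons the name so later use can be detected. */
static void
toy_cv_destroy(struct toy_cv *cv)
{
	cv->wmesg = toy_deadcv;
}

/* Valid means: initialised (non-NULL name) and not yet destroyed. */
static bool
toy_cv_is_valid(const struct toy_cv *cv)
{
	return cv->wmesg != NULL && cv->wmesg != toy_deadcv;
}
```

In this sketch a destroyed toy_cv has the same shape as the dump above: a NULL first word and a second word pointing at the sentinel.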
>How-To-Repeat:
Not entirely sure: things have been pretty stable...
>Fix:
>Release-Note:
>Audit-Trail:
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Tue, 21 Dec 2021 10:18:42 +0000
While "idle" (running X) same laptop/kernel just fell over with:
Crash version 9.99.92, image version 9.99.92.
crash: _kvm_kvatop(0)
Kernel compiled without options LOCKDEBUG.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NVGA_RASTERCONSOLE() at 0
_KERNEL_OPT_PMS_DISABLE_POWERHOOK() at ffffa6814e59a000
sys_reboot() at sys_reboot
vpanic() at vpanic+0x154
device_printf() at device_printf
startlwp() at startlwp
calltrap() at calltrap+0x19
intel_disable_ddi() at intel_disable_ddi+0xa3
intel_encoders_disable() at intel_encoders_disable+0x90
hsw_crtc_disable() at hsw_crtc_disable+0x13
intel_old_crtc_state_disables() at intel_old_crtc_state_disables+0x11c
intel_atomic_commit_tail() at intel_atomic_commit_tail+0xf1b
intel_atomic_commit() at intel_atomic_commit+0x29e
drm_atomic_connector_commit_dpms() at drm_atomic_connector_commit_dpms+0xe0
drm_mode_obj_set_property_ioctl() at drm_mode_obj_set_property_ioctl+0x162
drm_connector_property_set_ioctl() at drm_connector_property_set_ioctl+0x27
drm_ioctl() at drm_ioctl+0x2cb
drm_ioctl_shim() at drm_ioctl_shim+0x2f
sys_ioctl() at sys_ioctl+0x555
syscall() at syscall+0x18c
--- syscall (number 54) ---
syscall+0x18c:
(gdb couldn't make sense of the corefile(!))
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
1511 >1511 7 1 40 ffffed90b3166240 X
0 > 189 7 2 240 ffffed90adf20bc0 ioflush
0 > 163 7 0 200 ffffed90ada62580 nd6_timer
0 > 202 7 3 240 ffffed90ad7f12c0 iic0
0 > 124 1 7 201 ffffed90acf791c0 idle/7
0 > 118 1 6 201 ffffed90acf2c140 idle/6
0 > 112 1 5 201 ffffed90aceaf0c0 idle/5
0 > 106 1 4 201 ffffed90ace52040 idle/4
0 > 6 7 0 200 ffffed9408ffe4c0 softser/0
0 > 5 7 0 200 ffffed9408ffe080 softclk/0
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: prlw1@cam.ac.uk
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Fri, 24 Dec 2021 13:47:11 +0000
Do you have dmesg from the crash dump of the latest crash? Wondering
whether the original trap description is there.
From: Patrick Welche <prlw1@cam.ac.uk>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Sun, 26 Dec 2021 17:54:13 +0000
On Fri, Dec 24, 2021 at 01:47:11PM +0000, Taylor R Campbell wrote:
> Do you have dmesg from the crash dump of the latest crash? Wondering
> whether the original trap description is there.
Indeed it is! It turns out the trap happens more frequently, and
usually when idle. This is with a 22 Dec 9.99.93 kernel:
prevented execution of 0x0 (SMEP)
fatal page fault in supervisor mode
trap type 6 code 0x10 rip 0 cs 0x8 rflags 0x10246 cr2 0 ilevel 0 rsp 0xffff94014e33a898
curlwp 0xffffc78220fd6b00 pid 1521.1521 lowest kstack 0xffff94014e3362c0
panic: trap
cpu1: Begin traceback...
vpanic() at netbsd:vpanic+0x14a
panic() at netbsd:panic+0x3c
trap() at netbsd:trap+0xa7d
--- trap (number 6) ---
?() at 0
intel_disable_ddi() at netbsd:intel_disable_ddi+0xa3
intel_encoders_disable() at netbsd:intel_encoders_disable+0x90
hsw_crtc_disable() at netbsd:hsw_crtc_disable+0x13
intel_old_crtc_state_disables() at netbsd:intel_old_crtc_state_disables+0x11c
intel_atomic_commit_tail() at netbsd:intel_atomic_commit_tail+0xf1b
intel_atomic_commit() at netbsd:intel_atomic_commit+0x29e
drm_atomic_connector_commit_dpms() at netbsd:drm_atomic_connector_commit_dpms+0xe0
drm_mode_obj_set_property_ioctl() at netbsd:drm_mode_obj_set_property_ioctl+0x162
drm_connector_property_set_ioctl() at netbsd:drm_connector_property_set_ioctl+0x27
drm_ioctl() at netbsd:drm_ioctl+0x2cb
drm_ioctl_shim() at netbsd:drm_ioctl_shim+0x2f
sys_ioctl() at netbsd:sys_ioctl+0x555
syscall() at netbsd:syscall+0x18c
--- syscall (number 54) ---
netbsd:syscall+0x18c:
cpu1: End traceback...
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Patrick Welche <prlw1@cam.ac.uk>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Sun, 26 Dec 2021 17:57:50 +0000
> Date: Sun, 26 Dec 2021 17:54:13 +0000
> From: Patrick Welche <prlw1@cam.ac.uk>
>
> Indeed it is! It turns out the trap happens more frequently, and
> usually when idle. This is with a 22 Dec 9.99.93 kernel:
>
> prevented execution of 0x0 (SMEP)
> [...]
> intel_disable_ddi() at netbsd:intel_disable_ddi+0xa3
Got a line number for intel_disable_ddi+0xa3?
From: Patrick Welche <prlw1@cam.ac.uk>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Sun, 26 Dec 2021 18:11:16 +0000
On Sun, Dec 26, 2021 at 05:57:50PM +0000, Taylor R Campbell wrote:
> > Date: Sun, 26 Dec 2021 17:54:13 +0000
> > From: Patrick Welche <prlw1@cam.ac.uk>
> >
> > Indeed it is! It turns out the trap happens more frequently, and
> > usually when idle. This is with a 22 Dec 9.99.93 kernel:
> >
> > prevented execution of 0x0 (SMEP)
> > [...]
> > intel_disable_ddi() at netbsd:intel_disable_ddi+0xa3
>
> Got a line number for intel_disable_ddi+0xa3?
Sort of? A3=163
Dump of assembler code for function intel_disable_ddi:
0xffffffff804e9d6d <+0>: push %rbp
...
0xffffffff804e9e0b <+158>: call 0xffffffff80521597 <intel_edp_backlight_off>
0xffffffff804e9e10 <+163>: xor %edx,%edx <----
0xffffffff804e9e12 <+165>: mov %r12,%rsi
0xffffffff804e9e15 <+168>: mov %r15,%rdi
0xffffffff804e9e18 <+171>: pop %r12
0xffffffff804e9e1a <+173>: pop %r13
0xffffffff804e9e1c <+175>: pop %r14
0xffffffff804e9e1e <+177>: pop %r15
0xffffffff804e9e20 <+179>: pop %rbp
0xffffffff804e9e21 <+180>: jmp 0xffffffff805215fd <intel_dp_sink_set_decompression_state>
(I'm clearly not getting addr2line right...)
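On the offset question: +0xa3 in the traceback and +163 in the disassembly are the same offset, and adding it to the function's start address gives the absolute address one would hand to addr2line or to gdb's "info line *ADDR". Below is a small sketch of that arithmetic, using only the numbers quoted above; the helper name is made up. Since that address is the instruction after the call at +158, it is presumably the return address of that call rather than the faulting instruction itself.

```c
#include <stdint.h>

/* Hypothetical helper: absolute address of symbol+offset. */
static uint64_t
sym_plus_offset(uint64_t sym_start, uint64_t offset)
{
	return sym_start + offset;
}

/* Start of intel_disable_ddi as shown in the disassembly above. */
static const uint64_t intel_disable_ddi_start = 0xffffffff804e9d6dULL;

/* Offset reported by the traceback: intel_disable_ddi+0xa3. */
static const uint64_t trace_offset = 0xa3;
```

With these inputs, sym_plus_offset(intel_disable_ddi_start, trace_offset) is 0xffffffff804e9e10, the arrowed line in the disassembly.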
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Patrick Welche <prlw1@cam.ac.uk>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Sun, 26 Dec 2021 19:35:46 +0000
Can you try the attached patch?
You might be able to accelerate testing by explicitly asking to blank
the screen, maybe by screenblank(1), xset dpms, or xscreensaver.
Content-Disposition: attachment; filename="intelpanel.patch"
From 26fda8a04b487a3a59e6940cbd02faf0fcc68a8a Mon Sep 17 00:00:00 2001
From: Taylor R Campbell <riastradh@NetBSD.org>
Date: Sun, 26 Dec 2021 19:33:15 +0000
Subject: [PATCH] i915: Unifdef cnp_enable/disable_backlight.
Not sure why this was ifdef'd out in the first place! Appears to
have been a mistake in merge.
---
sys/external/bsd/drm2/dist/drm/i915/display/intel_panel.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/sys/external/bsd/drm2/dist/drm/i915/display/intel_panel.c b/sys/external/bsd/drm2/dist/drm/i915/display/intel_panel.c
index a5204f61f3ab..c0f192e7bc73 100644
--- a/sys/external/bsd/drm2/dist/drm/i915/display/intel_panel.c
+++ b/sys/external/bsd/drm2/dist/drm/i915/display/intel_panel.c
@@ -831,7 +831,6 @@ static void bxt_disable_backlight(const struct drm_connector_state *old_conn_sta
}
}
 
-#ifndef __NetBSD__ /* XXX mipi */
static void cnp_disable_backlight(const struct drm_connector_state *old_conn_state)
{
struct intel_connector *connector = to_intel_connector(old_conn_state->connector);
@@ -846,6 +845,7 @@ static void cnp_disable_backlight(const struct drm_connector_state *old_conn_sta
tmp & ~BXT_BLC_PWM_ENABLE);
}
 
+#ifndef __NetBSD__ /* XXX mipi */
static void pwm_disable_backlight(const struct drm_connector_state *old_conn_state)
{
struct intel_connector *connector = to_intel_connector(old_conn_state->connector);
@@ -1138,7 +1138,6 @@ static void bxt_enable_backlight(const struct intel_crtc_state *crtc_state,
pwm_ctl | BXT_BLC_PWM_ENABLE);
}
 
-#ifndef __NetBSD__ /* XXX mipi */
static void cnp_enable_backlight(const struct intel_crtc_state *crtc_state,
const struct drm_connector_state *conn_state)
{
@@ -1170,6 +1169,7 @@ static void cnp_enable_backlight(const struct intel_crtc_state *crtc_state,
pwm_ctl | BXT_BLC_PWM_ENABLE);
}
 
+#ifndef __NetBSD__ /* XXX mipi */
static void pwm_enable_backlight(const struct intel_crtc_state *crtc_state,
const struct drm_connector_state *conn_state)
{
@@ -2008,10 +2008,8 @@ intel_panel_init_backlight_funcs(struct intel_panel *panel)
panel->backlight.hz_to_pwm = bxt_hz_to_pwm;
} else if (INTEL_PCH_TYPE(dev_priv) >= PCH_CNP) {
panel->backlight.setup = cnp_setup_backlight;
-#ifndef __NetBSD__ /* XXX mipi */
panel->backlight.enable = cnp_enable_backlight;
panel->backlight.disable = cnp_disable_backlight;
-#endif
panel->backlight.set = bxt_set_backlight;
panel->backlight.get = bxt_get_backlight;
panel->backlight.hz_to_pwm = cnp_hz_to_pwm;
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Mon, 27 Dec 2021 19:41:26 +0000
On Sun, Dec 26, 2021 at 07:40:02PM +0000, Taylor R Campbell wrote:
> Can you try the attached patch?
It seems to have fixed the trap - I observed a successful uneventful
blanking of the screen with it!
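For the record, the "prevented execution of 0x0" trap looks consistent with an indirect call through a function pointer that was never assigned: with the two assignments inside the #ifndef, panel->backlight.enable and panel->backlight.disable would presumably stay NULL on PCH_CNP and later, and calling one jumps to address 0. Here is a minimal sketch of that failure mode and of a defensive check, with made-up toy_* names. It is not how the i915 code is structured, and the patch above fixes the problem by assigning the callbacks rather than by adding a check.

```c
#include <stddef.h>

/* Hypothetical ops table, loosely modelled on panel->backlight. */
struct toy_backlight {
	void (*enable)(struct toy_backlight *);
	void (*disable)(struct toy_backlight *);
	int level;
};

/* Example hook: turning the backlight off sets the level to 0. */
static void
toy_disable(struct toy_backlight *bl)
{
	bl->level = 0;
}

/*
 * Call the disable hook only if one was installed.  Returns 0 on
 * success, -1 if the hook is missing (where an unchecked call would
 * jump to address 0).
 */
static int
toy_backlight_off(struct toy_backlight *bl)
{
	if (bl->disable == NULL)
		return -1;
	bl->disable(bl);
	return 0;
}
```

A table whose disable member was left NULL makes toy_backlight_off() return -1 instead of trapping; once toy_disable is installed the same call returns 0.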
From: Patrick Welche <prlw1@talktalk.net>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Wed, 12 Jan 2022 11:11:18 +0000
On the same laptop, with 9.99.93 10 Jan 2022 code, I just observed:
panic() at device_printf
trap() at startlwp
--- trap (number 6) ---
i915_gem_evict_for_node() at i915_gem_evict_for_node+0xc9
i915_gem_gtt_reserve() at i915_gem_gtt_reserve+0xdc
i915_gem_gtt_insert() at i915_gem_gtt_insert+0x1f4
i915_vma_pin() at i915_vma_pin+0xa10
eb_lookup_vmas() at eb_lookup_vmas+0x5e5
i915_gem_do_execbuffer() at i915_gem_do_execbuffer+0x6c4
i915_gem_execbuffer2_ioctl() at i915_gem_execbuffer2_ioctl+0x1f9
drm_ioctl() at drm_ioctl+0x214
drm_ioctl_shim() at drm_ioctl_shim+0x2f
sys_ioctl() at sys_ioctl+0x555
syscall() at syscall+0x18c
--- syscall (number 54) ---
syscall+0x18c:
(note to self: /usr/obj/crash/netbsd.1.core.gz)
From: Patrick Welche <prlw1@talktalk.net>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Mon, 7 Mar 2022 09:50:50 +0000
I just observed the original cv_is_valid(cv) assertion as mentioned at
the top of this bug on 9.99.93 of 1st March 2022.
Just in case the line numbers changed:
(gdb) thread apply all bt
Thread 2.1 (<kvm>):
#0 0xffffffff80222765 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at ../../../../arch/amd64/amd64/machdep.c:720
#1 0xffffffff808af187 in kern_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at ../../../../kern/kern_reboot.c:73
#2 0xffffffff808f43ca in vpanic (fmt=0xffffffff80d8e4a8 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=ap@entry=0xffffde013c59edc8) at ../../../../kern/subr_prf.c:290
#3 0xffffffff80a701f7 in kern_assert (fmt=fmt@entry=0xffffffff80d8e4a8 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ") at ../../../../../../lib/libkern/kern_assert.c:51
#4 0xffffffff8087b1aa in cv_broadcast (cv=0xffffde014e40ab50) at ../../../../kern/kern_condvar.c:511
#5 0xffffffff80a1a553 in linux___dma_fence_signal_wake (fence=0xffff969807f14a40, timestamp=<optimized out>) at ../../../../external/bsd/drm2/linux/linux_dma_fence.c:1176
#6 0xffffffff8057e467 in signal_irq_work (work=0xffffde001faed1e8) at ../../../../external/bsd/drm2/dist/drm/i915/gt/intel_breadcrumbs.c:218
#7 0xffffffff80a1f089 in irq_work_intr (cookie=<optimized out>) at ../../../../external/bsd/drm2/linux/linux_irq_work.c:74
#8 0xffffffff808bdbd0 in softint_execute (s=5, l=0xffff969b483f24c0) at ../../../../kern/kern_softint.c:565
#9 softint_dispatch (pinned=<optimized out>, s=5) at ../../../../kern/kern_softint.c:814
#10 0xffffffff8021d7ff in Xsoftintr ()
#11 0x07b77c698bd464af in ?? ()
#12 0x07f77c298b9464ef in ?? ()
Backtrace stopped: Cannot access memory at address 0xffffde013c59f000
From: "John D. Baker" <jdbaker@consolidated.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Sat, 28 May 2022 11:19:15 -0500 (CDT)
I have just run into this panic as well on 9.99.97/amd64. The system
was idle with my usual X arrangement. When I returned to the machine,
it had rebooted.
The last bit of "dmesg" as recovered by 'crash':
[...]
[ 8348.0558266] heartbeat *
[ 8348.0558266] heartbeat Idle? no
[ 8348.0558266] heartbeat Signals:
[ 8348.0558266] heartbeat [2:2cac6*] @ 5990ms
[ 8348.0558266] heartbeat [2:2cac7] @ 4990ms
[ 8348.0558266] i915drmkms0: notice: Resetting chip for stopped heartbeat on rcs0
[ 8348.0558266] i915drmkms0: notice: xlock[1578] context reset due to GPU hang
[ 9078.8967489] nfs server yggdrasil:/r0/home/jdbaker: not responding
[ 9079.0665965] nfs server yggdrasil:/r0/home/jdbaker: is alive again
[ 10316.4686326] nfs server yggdrasil:/r0/home/jdbaker: not responding
[ 10316.5386338] nfs server yggdrasil:/r0/home/jdbaker: is alive again
[ 13680.6386480] panic: kernel diagnostic assertion "cv_is_valid(cv)" failed: file "/x/current/src/sys/kern/kern_condvar.c", line 511
[ 13680.6386480] cpu0: Begin traceback...
[ 13680.6386480] vpanic() at netbsd:vpanic+0x183
[ 13680.6386480] kern_assert() at netbsd:kern_assert+0x4b
[ 13680.6386480] cv_broadcast() at netbsd:cv_broadcast+0x56
[ 13680.6386480] linux___dma_fence_signal_wake() at netbsd:linux___dma_fence_signal_wake+0x13e
[ 13680.6386480] signal_irq_work() at netbsd:signal_irq_work+0x2ca
[ 13680.6386480] irq_work_intr() at netbsd:irq_work_intr+0x87
[ 13680.6386480] softint_dispatch() at netbsd:softint_dispatch+0xf9
[ 13680.6386480] DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xffffc580aed490f0
[ 13680.6386480] Xsoftintr() at netbsd:Xsoftintr+0x4f
[ 13680.6386480] --- interrupt ---
[ 13680.6386480] bfbd3f7fb7d79fb7:
[ 13680.6386480] cpu0: End traceback...
[ 13680.6386480] dumping to dev 0,1 (offset=16867127, size=2086023):
[ 13680.6386480] dump
--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]consolidated[flyspeck]net OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645
From: "John D. Baker" <jdbaker@consolidated.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Sun, 10 Jul 2022 16:15:30 -0500 (CDT)
Just tried running a 9.99.98/amd64 system with i915drmkms on an
82G41-based system and the panic still occurs:
[...]
[ 8794.379988] heartbeat rcs0 heartbeat {prio:-2147483645} not ticking
[ 8794.379988] heartbeat Awake? 6
[ 8794.379988] heartbeat Barriers?: no
[ 8794.379988] heartbeat Latency: 18us
[ 8794.379988] heartbeat Heartbeat: 3000 ms ago
[ 8794.379988] heartbeat Reset count: 0 (global 0)
[ 8794.379988] heartbeat Requests:
[ 8794.379988] heartbeat active 2:b16a9*- @ 6000ms: xlock[4402]
[ 8794.379988] heartbeat ring->start: 0x00004000
[ 8794.379988] heartbeat ring->head: 0x00001ca0
[ 8794.379988] heartbeat ring->tail: 0x00001e30
[ 8794.379988] heartbeat ring->emit: 0x00001e30
[ 8794.379988] heartbeat ring->space: 0x00003e30
[ 8794.379988] heartbeat ring->hwsp: 0x00002100
[ 8794.379988] heartbeat [head 1ca0, postfix 1d10, tail 1d28, batch 0x00000000_00d3c000]:
[ 8794.379988] warning: /x/current/src/sys/external/bsd/drm2/dist/drm/i915/gt/intel_engine_cs.c:1234: WARN_ON_ONCE(hex_dump_to_buffer(buf + pos, len - pos, rowsize, sizeof(u32), line, sizeof(line), 0) >= sizeof(line))
[ 8794.379988] heartbeat [0000] 22000002 0240007a 04e0ff1f 00000000 00000000 00000002 00000002 00000002
[ 8794.379988] 00000002 00000002 00000002 00000002 00000002 00000002 0
[ 8794.379988] heartbeat [0020] 00000002 00000002 00000002 00000002 00000002 00000002 00000002 00000002
[ 8794.379988] 00000002 0240007a 04e0ff1f 00000000 00000000 22000002 0
[ 8794.379988] heartbeat [0040] 00000002 0240007a 04e0ff1f 00000000 00000000 22000002 00000000 0000000c
[ 8794.379988] 0c31dc00 00000000 80018018 00c0d300 00000002 01008010 0
[ 8794.379988] heartbeat [0060] 0c31dc00 00000000 80018018 00c0d300 00000002 01008010 00010000 a9160b00
[ 8794.379988] 00000001 00000000
[ 8794.379988] heartbeat [0080] 00000001 00000000
[ 8794.379988] heartbeat On hold?: 0
[ 8794.379988] heartbeat MMIO base: 0x00002000
[ 8794.379988] heartbeat CCID: 0x00dc310d
[ 8794.379988] heartbeat RING_START: 0x00004000
[ 8794.379988] heartbeat RING_HEAD: 0x00001d10
[ 8794.379988] heartbeat RING_TAIL: 0x00001e30
[ 8794.379988] heartbeat RING_CTL: 0x00003001
[ 8794.379988] heartbeat RING_MODE: 0x00000040
[ 8794.379988] heartbeat ACTHD: 0x00000000_00d3c1bc
[ 8794.379988] heartbeat BBADDR: 0x00000000_00d3c1bb
[ 8794.379988] heartbeat DMA_FADDR: 0x00000000_00d3c380
[ 8794.379988] heartbeat IPEIR: 0x00000000
[ 8794.379988] heartbeat IPEHR: 0x60020100
[ 8794.379988] heartbeat E 2:b16a9*- @ 6000ms: xlock[4402]
[ 8794.379988] heartbeat E 2:b16aa- @ 5840ms: X[19491]
[ 8794.379988] heartbeat E 2:b16ab @ 3000ms: [i915]
[ 8794.379988] heartbeat HWSP:
[ 8794.379988] heartbeat [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 8794.379988] 00000000 00000000 00000000 00000000 00000000 00000000 0
[ 8794.379988] heartbeat *
[ 8794.379988] heartbeat [0100] a8160b00 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 8794.379988] 00000000 00000000 00000000 00000000 00000000 00000000 0
[ 8794.379988] heartbeat [0120] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 8794.379988] 00000000 00000000 00000000 00000000 00000000 00000000 0
[ 8794.379988] heartbeat *
[ 8794.379988] heartbeat Idle? no
[ 8794.379988] heartbeat Signals:
[ 8794.379988] heartbeat [2:b16a9*] @ 6000ms
[ 8794.379988] heartbeat [2:b16aa] @ 5840ms
[ 8794.379988] i915drmkms0: notice: Resetting chip for stopped heartbeat on rcs0
[ 8794.379988] i915drmkms0: notice: xlock[4402] context reset due to GPU hang
[...]
[ 15477.672336] panic: kernel diagnostic assertion "cv_is_valid(cv)" failed: file "/x/current/src/sys/kern/kern_condvar.c", line 511
[ 15477.672336] cpu0: Begin traceback...
[ 15477.672336] vpanic() at netbsd:vpanic+0x183
[ 15477.672336] kern_assert() at netbsd:kern_assert+0x4b
[ 15477.672336] cv_broadcast() at netbsd:cv_broadcast+0x56
[ 15477.672336] linux___dma_fence_signal_wake() at netbsd:linux___dma_fence_signal_wake+0x13e
[ 15477.672336] signal_irq_work() at netbsd:signal_irq_work+0x2ca
[ 15477.672336] irq_work_intr() at netbsd:irq_work_intr+0x87
[ 15477.672336] softint_dispatch() at netbsd:softint_dispatch+0xf9
[ 15477.672336] DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xffffb300aed490f0
[ 15477.672336] Xsoftintr() at netbsd:Xsoftintr+0x4f
[ 15477.672336] --- interrupt ---
[ 15477.672336] ff9dffe7fafbfeff:
[ 15477.672336] cpu0: End traceback...
[ 15477.672336] dumping to dev 0,1 (offset=16867127, size=2086023):
[ 15477.672336] dump {subsequent boot dmesg begins here}
Was playing a YouTube video through Firefox 101.0.1 (pkgsrc-2022Q2)
at the time, but I've seen it before while the system was essentially
idle (or as idle as it can be with Firefox running).
--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]consolidated[flyspeck]net OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56561 CVS commit: src/sys/external/bsd/drm2/dist/drm/i915
Date: Mon, 11 Jul 2022 18:56:00 +0000
Module Name: src
Committed By: riastradh
Date: Mon Jul 11 18:56:00 UTC 2022
Modified Files:
src/sys/external/bsd/drm2/dist/drm/i915: i915_request.c
Log Message:
i915: Defer destroying waitqueue until after callback is removed.
Candidate fix for PR kern/56561.
To generate a diff of this commit:
cvs rdiff -u -r1.16 -r1.17 \
src/sys/external/bsd/drm2/dist/drm/i915/i915_request.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
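The ordering described in the log message can be illustrated with a small userland sketch: if the wait queue is destroyed while a callback that wakes it is still registered, a late signal touches a dead queue, which would match the deadcv seen in the original report. All toy_* names are hypothetical, and this does not reproduce the i915_request.c change itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wait queue: alive until destroyed, counts wakeups. */
struct toy_waitq {
	bool alive;
	int wakeups;
};

/* Hypothetical fence with at most one registered wakeup callback. */
struct toy_fence {
	struct toy_waitq *cb_target;	/* NULL when no callback is registered */
};

/* Signal path: wake the registered queue; false means use after destroy. */
static bool
toy_fence_signal(struct toy_fence *f)
{
	if (f->cb_target == NULL)
		return true;		/* nothing to wake */
	if (!f->cb_target->alive)
		return false;		/* the kernel would assert here */
	f->cb_target->wakeups++;
	return true;
}

/* Teardown in the safe order: remove the callback, then destroy the queue. */
static void
toy_teardown(struct toy_fence *f, struct toy_waitq *wq)
{
	f->cb_target = NULL;
	wq->alive = false;
}
```

Marking the queue dead before clearing cb_target makes a subsequent toy_fence_signal() return false; going through toy_teardown() first makes the late signal a harmless no-op.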
From: "John D. Baker" <jdbaker@consolidated.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/56561: cv_is_valid assertion failure in intel drm
Date: Mon, 18 Jul 2022 12:42:28 -0500 (CDT)
After this commit:
https://mail-index.netbsd.org/source-changes/2022/07/11/msg139718.html
I've been running the resulting 9.99.98 (from sources around 13 July
2022) for over 2 days straight without this issue recurring, so I'd say
it's fixed. (It would usually occur within a few hours of booting.)
Now building 9.99.99 to see if the issues described here:
https://mail-index.netbsd.org/current-users/2022/07/17/msg042673.html
affect my i82G41 system.
--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]consolidated[flyspeck]net OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645
State-Changed-From-To: open->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sun, 28 Aug 2022 13:37:41 +0000
State-Changed-Why:
fixed in i915_request.c 1.17 on 2022-07-11
>Unformatted:
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.