NetBSD Problem Report #58078

From www@netbsd.org  Mon Mar 25 20:29:18 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id AEF561A9239
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 25 Mar 2024 20:29:18 +0000 (UTC)
Message-Id: <20240325202916.DBFF51A923A@mollari.NetBSD.org>
Date: Mon, 25 Mar 2024 20:29:16 +0000 (UTC)
From: naguam@ik.me
Reply-To: naguam@ik.me
To: gnats-bugs@NetBSD.org
Subject: Ironlake mobile not working nicely
X-Send-Pr-Version: www-1.0

>Number:         58078
>Category:       kern
>Synopsis:       Ironlake mobile not working nicely
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar 25 20:30:00 +0000 2024
>Originator:     Naguam
>Release:        NetBSD10.0_RC6
>Organization:
None
>Environment:
NetBSD i5-1gen.Home 10.0_RC6 NetBSD 10.0_RC6 (GENERIC) #0: Tue Mar 12 10:19:02 UTC 2024  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
Hello,

I own a Dell Latitude E4310 with an i5 560M and a gen5 Ironlake mobile iGPU.

I was not able to reproduce a complete crash again, as it seems somewhat random, but I still see odd behaviour, visual problems, and kernel log messages, so I am going to detail everything.

First, at boot without doing anything, on both RC5 and RC6 I get:

[     4.664050] intelfb0: framebuffer at 0xe0010000, size 1366x768, depth 32, stride 5504
[     6.194048] {drm:netbsd:intel_cpu_fifo_underrun_irq_handler+0x64} *ERROR* CPU pipe A FIFO underrun

Then glxgears behaves oddly with both the integrated MesaLib (19) and the more recent pkgsrc MesaLib (21.3).

A video is available at the following link:

https://www.swisstransfer.com/d/6e0ba072-2451-4008-97da-92a308c8d2ab

On both RC5 and RC6 I sometimes got crashes, but I was not able to reproduce them again and did not keep the generated .core files, sorry.

I used the default CTWM window manager.

Here is the dmesg output after the odd glxgears behaviour:

[   132.218539] heartbeat rcs0 heartbeat {prio:-2147483645} not ticking
[   132.218539] heartbeat 	Awake? 6
[   132.218539] heartbeat 	Barriers?: no
[   132.218539] heartbeat 	Latency: 28us
[   132.218539] heartbeat 	Heartbeat: 3000 ms ago
[   132.218539] heartbeat 	Reset count: 0 (global 0)
[   132.218539] heartbeat 	Requests:
[   132.218539] heartbeat 		active  2:41b*-  @ 5870ms: glxgears[1785]
[   132.218539] heartbeat 		ring->start:  0x00004000
[   132.218539] heartbeat 		ring->head:   0x000011f8
[   132.218539] heartbeat 		ring->tail:   0x000016b8
[   132.218539] heartbeat 		ring->emit:   0x000016b8
[   132.218539] heartbeat 		ring->space:  0x00003b00
[   132.218539] heartbeat 		ring->hwsp:   0x00002100
[   132.218539] heartbeat [head 1260, postfix 12d8, tail 1340, batch 0x00000000_00abd000]:
[   132.218539] warning: /usr/src/sys/external/bsd/drm2/dist/drm/i915/gt/intel_engine_cs.c:1234: WARN_ON_ONCE(hex_dump_to_buffer(buf + pos, len - pos, rowsize, sizeof(u32), line, sizeof(line), 0) >= sizeof(line))
[   132.218539] heartbeat [0000] 22000002 0240007a 04e0ff1f 00000000 00000000 00000002 00000002 00000002
[   132.218539] 00000002 00000002 00000002 00000002 00000002 00000002 0
[   132.218539] heartbeat [0020] 00000002 00000002 00000002 00000002 00000002 00000002 00000002 00000002
[   132.218539] 00000002 0240007a 04e0ff1f 00000000 00000000 22000002 0
[   132.218539] heartbeat [0040] 00000002 0240007a 04e0ff1f 00000000 00000000 22000002 01008005 00000000
[   132.218539] 0000000c 0cc1ab00 00000000 00008005 80018018 00d0ab00 0
[   132.218539] heartbeat [0060] 0000000c 0cc1ab00 00000000 00008005 80018018 00d0ab00 00000002 01008010
[   132.218539] 00010000 1b040000 01008010 00010000 1b040000 01008010 0
[   132.218539] heartbeat [0080] 00010000 1b040000 01008010 00010000 1b040000 01008010 00010000 1b040000
[   132.218539] 01008010 00010000 1b040000 01008010 00010000 1b040000 0
[   132.218539] heartbeat [00a0] 01008010 00010000 1b040000 01008010 00010000 1b040000 01008010 00010000
[   132.218539] 1b040000 01008010 00010000 1b040000 01008010 00010000 1
[   132.218539] heartbeat [00c0] 1b040000 01008010 00010000 1b040000 01008010 00010000 1b040000 00000001

[   132.218539] heartbeat 	On hold?: 0
[   132.218539] heartbeat 	MMIO base:  0x00002000
[   132.218539] heartbeat 	CCID: 0x00abc10d
[   132.218539] heartbeat 	RING_START: 0x00004000
[   132.218539] heartbeat 	RING_HEAD:  0x000012d8
[   132.218539] heartbeat 	RING_TAIL:  0x000016b8
[   132.218539] heartbeat 	RING_CTL:   0x00003001
[   132.218539] heartbeat 	RING_MODE:  0x00000040
[   132.218539] heartbeat 	ACTHD:  0x00000000_00abd068
[   132.218539] heartbeat 	BBADDR: 0x00000000_00abd06b
[   132.218539] heartbeat 	DMA_FADDR: 0x00000000_00abd068
[   132.218539] heartbeat 	IPEIR: 0x00000000
[   132.218539] heartbeat 	IPEHR: 0x09584ac0
[   132.218539] heartbeat 		E  2:41b*-  @ 5870ms: glxgears[1785]
[   132.218539] heartbeat 		E  2:41c  @ 5850ms: X[1590]
[   132.218539] heartbeat 		E  2:41d-  @ 5850ms: X[1590]
[   132.218539] heartbeat 		E  2:41e  @ 5850ms: glxgears[1785]
[   132.218539] heartbeat 		E  2:41f  @ 3000ms: [i915]
[   132.218539] heartbeat HWSP:
[   132.218539] heartbeat [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   132.218539] 00000000 00000000 00000000 00000000 00000000 00000000 0
[   132.218539] heartbeat *
[   132.218539] heartbeat [0100] 1a040000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   132.218539] 00000000 00000000 00000000 00000000 00000000 00000000 0
[   132.218539] heartbeat [0120] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   132.218539] 00000000 00000000 00000000 00000000 00000000 00000000 0
[   132.218539] heartbeat *
[   132.218539] heartbeat Idle? no
[   132.218539] heartbeat Signals:
[   132.218539] heartbeat 	[2:41b*] @ 5870ms
[   132.218539] heartbeat 	[2:41d] @ 5850ms
[   132.218539] i915drmkms0: notice: Resetting chip for stopped heartbeat on rcs0
[   132.218539] i915drmkms0: notice: glxgears[1785] context reset due to GPU hang

After this, glxgears seems to work fine when restarted, though.

I also got crashes with Firefox on websites with videos (not YouTube, but random websites with H.264 and probably H.265 videos).

I had enabled WebGL, at least WebGL 1 (my GPU does not support WebGL 2), and it worked, but I still got crashes.

I removed the core dumps last week and was not able to reproduce the crashes again for this report.

Also, crashes seemed more frequent with the more recent MesaLib.

Still, the dmesg output seems quite explicit.

However in my tests I think I had some

Here is the complete dmesg:
https://www.pastery.net/kcpzmq/

I now understand why WebGL is disabled by default, even though I am unsure whether it is what caused the crashes in Firefox.

However, I think it is sad that this "non-fancy" hardware has problems with NetBSD.

The computer also does not resume from suspend properly. I don't know whether I am doing something wrong there, but I only used the suspend Fn key, so I believe the failure to resume is a bug.

PS: maybe Crocus would be a good addition to MesaLib.

Cheers,
>How-To-Repeat:
Use a machine with an Ironlake mobile GPU.

Do a full installation of NetBSD 10 RC5 or RC6 with xdm enabled.

Run glxgears.

Try to use the desktop normally.
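
To check whether a run hit the same problem, saved dmesg output can be scanned for the messages quoted in the description. This is only a minimal Python sketch: the function name find_gpu_events and the labels are hypothetical (my own), and the patterns are just fragments copied from the log lines in this report, so other failure modes would not be detected.

```python
import re

# Message fragments copied from the kernel log excerpts in this report.
# The labels on the left are hypothetical names chosen for illustration.
PATTERNS = {
    "fifo_underrun": re.compile(r"FIFO underrun"),
    "heartbeat_stalled": re.compile(r"heartbeat .* not ticking"),
    "chip_reset": re.compile(r"Resetting chip for stopped heartbeat"),
    "context_reset": re.compile(r"context reset due to GPU hang"),
}


def find_gpu_events(dmesg_text):
    """Return (label, line) pairs for every line matching a known pattern."""
    events = []
    for line in dmesg_text.splitlines():
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                events.append((label, line.strip()))
                break  # report each line at most once
    return events
```

The function takes the full text of the kernel log (for example, the output of dmesg saved to a file after running glxgears) and returns the matching lines in the order they appear.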

>Fix:
No idea
