NetBSD Problem Report #56672
From manu@netbsd.org Wed Jan 26 10:26:24 2022
Return-Path: <manu@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 14DAF1A9239
for <gnats-bugs@gnats.NetBSD.org>; Wed, 26 Jan 2022 10:26:24 +0000 (UTC)
Message-Id: <20220126102623.E8C3184CFD@mail.netbsd.org>
Date: Wed, 26 Jan 2022 10:26:23 +0000 (UTC)
From: manu@netbsd.org
Reply-To: manu@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: i915drmkms hangs on boot
X-Send-Pr-Version: 3.95
>Number: 56672
>Category: kern
>Synopsis: i915drmkms hangs on boot
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: feedback
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 26 10:30:00 +0000 2022
>Closed-Date:
>Last-Modified: Sun Jul 09 20:10:01 +0000 2023
>Originator: Emmanuel Dreyfus
>Release: NetBSD 9.99.93
>Organization:
Emmanuel Dreyfus
manu@netbsd.org
>Environment:
NetBSD basalte 9.99.93 NetBSD 9.99.93 (GENERIC) #96: Fri Jan 7 05:54:04 CET 2022 manu@basalte:/home2/manu/netbsd-src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
Since the recent DRM upgrade, I randomly get a hang at boot time during i915drmkms initialization.
Initial investigation with ddb shows that the i915_flip kernel thread gets a corrupted backtrace. I suspect everyone is waiting on it to complete, but that will never happen because it crashed.
>How-To-Repeat:
Reboot; the machine sometimes hangs.
>Fix:
Not known yet
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback
State-Changed-By: maya@NetBSD.org
State-Changed-When: Sun, 22 May 2022 22:23:33 +0000
State-Changed-Why:
Does this still happen? Can you mention the latest dmesg messages, preferably with `boot -v -x`?
From: Emmanuel Dreyfus <manu@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56672 (i915drmkms hangs on boot)
Date: Tue, 24 May 2022 08:48:41 +0000
On Sun, May 22, 2022 at 10:23:34PM +0000, maya@NetBSD.org wrote:
> Does this still happen? Can you mention the latest dmesg messages, preferably with `boot -v -x`?
Yes, it still happens, but not with a serial console, hence I cannot
copy/paste the whole output.
Here is a case:
intelfb: framebuffer at 0xc0040000, size 1200x1920, depth 32, stride 4800
(hang with black screen until I hit power button...)
acpi0: power button pressed, shutting down!
(...)
{drm:netbsd:drm_atomic_helper_wait_for_flip_done+0x1ea} *ERROR* [CRTC:51:pipe A] flip_done timed out
Another case:
panic: kernel assertion "(i * BITMAP_SIZE) < pp->pr_itemperpage" failed: file (...) subr_pool.c, line 450
The backtrace shows we come from:
(...)
kmem_intr_alloc
drm_client_modeset_probe
drm_fb_helper_hotplug_event.part.0
It seems to crash less often with -v -x, so there must be a lot of hangs caused
by race conditions, because -v -x or a serial console causes a huge slowdown.
Without -v -x, I most often hang on:
i915drmkms0: interrupting at msi5 vec 0 (i915drmkms0)
i915drmkms0: notice: Failed to load DMC firmware i915/kbl_dmc_ver1_04.bin: Disabling runtime power management
--
Emmanuel Dreyfus
manu@netbsd.org
State-Changed-From-To: feedback->open
State-Changed-By: maya@NetBSD.org
State-Changed-When: Fri, 27 May 2022 01:13:55 +0000
State-Changed-Why:
State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Thu, 15 Sep 2022 21:18:01 +0000
State-Changed-Why:
Can you please try with a LOCKDEBUG kernel?
(DIAGNOSTIC+DEBUG+LOCKDEBUG)
From: Emmanuel Dreyfus <manu@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org, riastradh@NetBSD.org, manu@netbsd.org
Subject: Re: kern/56672 (i915drmkms hangs on boot)
Date: Sun, 9 Oct 2022 00:16:45 +0000
On Thu, Sep 15, 2022 at 09:18:02PM +0000, riastradh@NetBSD.org wrote:
> Can you please try with a LOCKDEBUG kernel?
> (DIAGNOSTIC+DEBUG+LOCKDEBUG)
After upgrading to -current from 20221002, I have seen no panic at boot
time for a while. The machine still crashes often at boot though. It
happens at framebuffer switch while the screen is black, and if there
is a panic, I cannot see it.
--
Emmanuel Dreyfus
manu@netbsd.org
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/56672: i915drmkms hangs on boot
Date: Wed, 31 May 2023 09:36:32 +0200
FWIW, I'm also seeing a hang on boot on a Dell laptop with i915.
It hangs after printing "i915drmkms0: notice: Failed to load DMC firmware".
I've not seen the other issues described in this PR.
I also have 2 other hosts (desktops) with i915 and have never seen any boot issue
with them, so it seems to be hardware-specific.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
From: Taylor R Campbell <riastradh@NetBSD.org>
To: manu@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/56672: i915drmkms hangs on boot
Date: Sun, 9 Jul 2023 20:09:00 +0000
Can you try enabling the new HEARTBEAT kernel option, and setting
DDB_COMMANDONENTER="bt;ps;show all tstiles/t;sync", and see if you get
more diagnostic information that way?
This should be considerably less costly than LOCKDEBUG, in case that
was a barrier to further feedback the last time around -- I intend to
turn HEARTBEAT on by default once it has had a little more testing.
options HEARTBEAT
options HEARTBEAT_MAX_PERIOD_DEFAULT=15
options DDB_COMMANDONENTER="bt;ps;show all tstiles/t;sync"
Without more diagnostic information, there's not much more I can do
about this.
>Unformatted:
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.