NetBSD Problem Report #49424
From he@smistad.uninett.no Thu Nov 27 14:57:23 2014
Return-Path: <he@smistad.uninett.no>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 81D89A5B2E
for <gnats-bugs@gnats.NetBSD.org>; Thu, 27 Nov 2014 14:57:23 +0000 (UTC)
Message-Id: <20141127145717.A11553D0B5@smistad.uninett.no>
Date: Thu, 27 Nov 2014 15:57:17 +0100 (CET)
From: he@NetBSD.org
Reply-To: he@NetBSD.org
To: gnats-bugs@gnats.NetBSD.org
Subject: DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
X-Send-Pr-Version: 3.95
>Number: 49424
>Category: kern
>Synopsis: DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: riastradh
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Nov 27 15:00:00 +0000 2014
>Closed-Date: Mon Mar 30 15:36:21 +0000 2015
>Last-Modified: Mon Mar 30 15:36:21 +0000 2015
>Originator: Havard Eidnes
>Release: NetBSD 7.0_BETA
>Organization:
None
>Environment:
System: NetBSD xxx 7.0_BETA NetBSD 7.0_BETA (DRMKMS) #1: ...
Architecture: amd64
Machine: amd64
>Description:
The DRMKMS kernel, built from netbsd-7 sources checked out
today, panics during autoconfig on this type of host. The
kernel messages near the failure point are:
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: Intel Iron Lake Host Bridge (rev. 0x02)
agp0 at pchb0: G4X-family chipset
agp0: detected 32252k stolen memory
agp0: aperture at 0xd0000000, size 0x10000000
i915drmkms0 at pci0 dev 2 function 0: Intel Iron Lake Integrated Graphics Device (rev. 0x02)
drm: Memory usable by graphics device = 512M
panic: lockdebug_lookup: uninitialized lock (lock=0xffffffff81158000, from=ffffffff8053c715)
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8028dfed cs 8 rflags 246 cr2 0 ilevel 8 rsp ffffffff81383708
curlwp 0xffffffff811132c0 pid 0.1 lowest kstack 0xffffffff813802c0
I have a backtrace as a picture on my cellphone; I will decode
that later (Sunday?).
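As background on the panic above, here is a minimal user-space
sketch of how a debug check can catch a lock that was never
initialized: locks are recorded in a registry when they are
initialized, and an acquire on an unrecorded address is reported.
This is only an illustration under that assumption, not NetBSD's
lockdebug code; every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; none of these names exist in NetBSD. */
#define TOY_MAX_LOCKS 8

struct toy_lock {
	int held;
};

/* Addresses of every lock that went through toy_lock_init(). */
static const struct toy_lock *toy_registry[TOY_MAX_LOCKS];
static size_t toy_nregistered;

/* Register the lock so later lookups can find it. */
static void
toy_lock_init(struct toy_lock *l)
{
	assert(toy_nregistered < TOY_MAX_LOCKS);
	l->held = 0;
	toy_registry[toy_nregistered++] = l;
}

/* Return 1 if the lock was registered by toy_lock_init(), else 0. */
static int
toy_lock_lookup(const struct toy_lock *l)
{
	size_t i;

	for (i = 0; i < toy_nregistered; i++) {
		if (toy_registry[i] == l)
			return 1;
	}
	return 0;
}

/*
 * Acquire the lock.  Returns 0 on success, or -1 where a debug
 * kernel would panic because the lock was never initialized.
 */
static int
toy_lock_enter(struct toy_lock *l)
{
	if (!toy_lock_lookup(l))
		return -1;
	l->held = 1;
	return 0;
}

/* Hypothetical softc with two locks. */
struct toy_softc {
	struct toy_lock a_lock;
	struct toy_lock b_lock;
};

/* Returns 0 if the initialized lock works and the forgotten one is caught. */
static int
toy_demo(void)
{
	static struct toy_softc sc;

	memset(&sc, 0, sizeof(sc));	/* zeroing is not initialization */
	toy_lock_init(&sc.a_lock);	/* b_lock is forgotten */

	if (toy_lock_enter(&sc.a_lock) != 0)
		return 1;
	if (toy_lock_enter(&sc.b_lock) != -1)
		return 2;
	return 0;
}
```

Calling toy_demo() from a small main() exercises both cases. The
point of the sketch is that a zero-filled structure is not enough:
the lock also has to go through its init routine before first use.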
A GENERIC kernel probes the graphics as
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x0044 (rev. 0x02)
agp0 at pchb0: G4X-family chipset
agp0: detected 32252k stolen memory
agp0: aperture at 0xd0000000, size 0x10000000
vga0 at pci0 dev 2 function 0: vendor 0x8086 product 0x0046 (rev. 0x02)
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation), using wskbd0
wsmux1: connecting to wsdisplay0
drm at vga0 not configured
vendor 0x8086 product 0x3b64 (miscellaneous communications, revision 0x06) at pci0 dev 22 function 0 not configured
(The vesa X11 driver doesn't appear to want to drive the
display at its native 1440x900; instead it does 1280x800, with
predictably suboptimal results.)
>How-To-Repeat:
Try to boot the DRMKMS kernel on a Lenovo T410s and watch it
panic during probing with the above message.
>Fix:
Sorry, I've not investigated further.
>Release-Note:
>Audit-Trail:
From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/49424: DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
Date: Tue, 02 Dec 2014 15:31:24 +0100 (CET)
Hi,
here's some more info, transcribed from cell phone photos of
the laptop screen:
...
agp0 at pchb0: G4X-family chipset
agp0: detected 32252k stolen memory
agp0: aperture at 0xd0000000, size 0x10000000
i915drmkms0 at pci0 dev 2 function 0: Intel Iron Lake Integrated Graphics Device (rev. 0x02)
drm: Memory usable by graphics device: 512M
panic: lockdebug_lookup: uninitialized lock (lock=0xffffffff81158000, from=ffffffff8053c715)
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8028dfed cs 8 rflags 246 cr2 0 ilevel 8 rsp ffffffff81383708
curlwp 0xffffffff811132c0 pid 0.1 lowest kstack 0xffffffff813802c0
Stopped in pid 0.1 (system) at netbsd:breakpoint+0x5: leave
db{0}>
db{0}> trace
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x13c
snprintf() at netbsd:snprintf
lockdebug_locked() at netbsd:lockdebug_locked
mutex_enter() at netbsd:mutex_enter+0x43f
intel_disable_gt_powersave() at netbsd:intel_disable_gt_powersave+0x104
intel_uncore_sanitize() at netbsd:intel_uncore_sanitize+0x13
i915_driver_load() at netbsd:i915_driver_load+0x8da
drm_dev_register() at netbsd:drm_dev_register+0x87
drm_pci_attach() at netbsd:drm_pci_attach+0x2b7
i915drmkms_attach() at netbsd:i915drmkms_attach+0x9b
config_attach_loc()
pci_probe_device()
pci_enumerate_bus()
pcirescan()
pciattach()
config_attach_loc()
mp_pci_scan()
mainbus_attach()
config_attach_loc()
cpu_configure()
main()
db{0}>
Best regards,
- Havard
From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/49424: DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
Date: Wed, 03 Dec 2014 09:15:35 +0100 (CET)
> agp0 at pchb0: G4X-family chipset
> agp0: detected 32252k stolen memory
> agp0: aperture at 0xd0000000, size 0x10000000
> i915drmkms0 at pci0 dev 2 function 0: Intel Iron Lake Integrated Graphics Device (rev. 0x02)
> drm: Memory usable by graphics device: 512M
> panic: lockdebug_lookup: uninitialized lock (lock=0xffffffff81158000, from=ffffffff8053c715)
> fatal breakpoint trap in supervisor mode
...
> db{0}> trace
> breakpoint() at netbsd:breakpoint+0x5
> vpanic() at netbsd:vpanic+0x13c
> snprintf() at netbsd:snprintf
> lockdebug_locked() at netbsd:lockdebug_locked
> mutex_enter() at netbsd:mutex_enter+0x43f
> intel_disable_gt_powersave() at netbsd:intel_disable_gt_powersave+0x104
> intel_uncore_sanitize() at netbsd:intel_uncore_sanitize+0x13
> i915_driver_load() at netbsd:i915_driver_load+0x8da
And ... it does indeed look like i915_driver_load() does not
initialize the lock for the power management sub-struct,
dev_priv->rps.hw_lock, after zero-allocating dev_priv. May I
suggest:
Index: i915_dma.c
===================================================================
RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c,v
retrieving revision 1.12
diff -u -r1.12 i915_dma.c
--- i915_dma.c 5 Nov 2014 23:46:09 -0000 1.12
+++ i915_dma.c 3 Dec 2014 08:05:21 -0000
@@ -1645,9 +1645,11 @@ int i915_driver_load(struct drm_device *
#ifdef __NetBSD__
linux_mutex_init(&dev_priv->dpio_lock);
linux_mutex_init(&dev_priv->modeset_restore_lock);
+ linux_mutex_init(&dev_priv->rps.hw_lock);
#else
mutex_init(&dev_priv->dpio_lock);
mutex_init(&dev_priv->modeset_restore_lock);
+ mutex_init(&dev_priv->rps.hw_lock);
#endif
intel_pm_setup(dev);
There's already a teardown (destroy) of this mutex, with an XXX
comment, near the end of this function:
free_priv:
/* XXX intel_pm_fini */
linux_mutex_destroy(&dev_priv->rps.hw_lock);
and another in i915_driver_unload(), but there's neither an
intel_pm_fini nor an intel_pm_init anywhere to be seen.
I'll test this change shortly.
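The "XXX intel_pm_fini" comment above hints at a teardown routine
paired with a setup routine. A minimal user-space sketch of that
pairing follows, assuming a toy mutex that only tracks whether it
has been initialized. All toy_* names are hypothetical and are not
part of the i915 driver.

```c
#include <assert.h>
#include <string.h>

/* Toy mutex that only tracks whether init/destroy were paired. */
struct toy_mutex {
	int initialized;
};

static void
toy_mutex_init(struct toy_mutex *m)
{
	assert(!m->initialized);
	m->initialized = 1;
}

static void
toy_mutex_destroy(struct toy_mutex *m)
{
	assert(m->initialized);	/* destroying an uninitialized lock is a bug */
	m->initialized = 0;
}

/* Hypothetical power-management sub-struct with its own lock. */
struct toy_pm {
	struct toy_mutex hw_lock;
};

struct toy_softc {
	struct toy_mutex dpio_lock;
	struct toy_pm pm;
};

/* Everything the PM sub-struct needs is set up in one place ... */
static void
toy_pm_init(struct toy_pm *pm)
{
	toy_mutex_init(&pm->hw_lock);
}

/* ... and torn down by the matching routine. */
static void
toy_pm_fini(struct toy_pm *pm)
{
	toy_mutex_destroy(&pm->hw_lock);
}

/*
 * Load routine; fail_late simulates an error after setup.  Returns 0 on
 * success and -1 on failure, leaving no lock initialized on failure.
 */
static int
toy_load(struct toy_softc *sc, int fail_late)
{
	memset(sc, 0, sizeof(*sc));
	toy_mutex_init(&sc->dpio_lock);
	toy_pm_init(&sc->pm);

	if (fail_late)
		goto fail;
	return 0;

fail:
	/* Unwind in reverse order of setup. */
	toy_pm_fini(&sc->pm);
	toy_mutex_destroy(&sc->dpio_lock);
	return -1;
}
```

With the setup and teardown kept in matching routines, it is
easier to see at a glance whether every destroy on the error path
has a corresponding init earlier in the load function.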
Regards,
- Havard
From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/49424: DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
Date: Wed, 03 Dec 2014 12:27:32 +0100 (CET)
> Index: i915_dma.c
> ===================================================================
> RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c,v
> retrieving revision 1.12
> diff -u -r1.12 i915_dma.c
> --- i915_dma.c 5 Nov 2014 23:46:09 -0000 1.12
> +++ i915_dma.c 3 Dec 2014 08:05:21 -0000
> @@ -1645,9 +1645,11 @@ int i915_driver_load(struct drm_device *
> #ifdef __NetBSD__
> linux_mutex_init(&dev_priv->dpio_lock);
> linux_mutex_init(&dev_priv->modeset_restore_lock);
> + linux_mutex_init(&dev_priv->rps.hw_lock);
> #else
> mutex_init(&dev_priv->dpio_lock);
> mutex_init(&dev_priv->modeset_restore_lock);
> + mutex_init(&dev_priv->rps.hw_lock);
> #endif
>
> intel_pm_setup(dev);
This is wrong, it turns out.
This is a so-called "Ironlake" variant, so it takes another path
through intel_disable_gt_powersave(), instead hitting the
uninitialized mchdev_lock in ironlake_disable_drps(). This
spinlock is initialized by intel_gpu_ips_init(), but that is
called much, much later in i915_driver_load(), i.e. too late for
the indirect use of the lock via intel_uncore_init().
So this is my new suggestion, which seems to provide a working
DRMKMS kernel and working X11 on this machine:
Index: i915_dma.c
===================================================================
RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c,v
retrieving revision 1.12
diff -u -p -r1.12 i915_dma.c
--- i915_dma.c 5 Nov 2014 23:46:09 -0000 1.12
+++ i915_dma.c 3 Dec 2014 11:05:28 -0000
@@ -1697,6 +1697,10 @@ int i915_driver_load(struct drm_device *
/* This must be called before any calls to HAS_PCH_* */
intel_detect_pch(dev);
+ /* Needed here to initialize mchdev_lock, before any PM calls */
+ if (IS_GEN5(dev))
+ intel_gpu_ips_init(dev_priv);
+
intel_uncore_init(dev);
ret = i915_gem_gtt_init(dev);
@@ -1819,9 +1823,6 @@ int i915_driver_load(struct drm_device *
acpi_video_register();
}
- if (IS_GEN5(dev))
- intel_gpu_ips_init(dev_priv);
-
intel_init_runtime_pm(dev_priv);
return 0;
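The ordering problem addressed by the diff above can be shown in
isolation. The sketch below is a user-space illustration only: the
toy_* functions are hypothetical stand-ins for "a late setup step
that initializes a lock" and "an early step that already uses it",
and a counter takes the place of the panic.

```c
#include <assert.h>
#include <string.h>

struct toy_lock {
	int initialized;
};

struct toy_dev {
	struct toy_lock mch_lock;
	int failures;	/* uses that hit an uninitialized lock */
};

/* Setup step that, among other things, initializes the lock. */
static void
toy_ips_init(struct toy_dev *d)
{
	assert(!d->mch_lock.initialized);	/* double init is a bug */
	d->mch_lock.initialized = 1;
}

/* Early setup step that already needs the lock. */
static void
toy_uncore_init(struct toy_dev *d)
{
	if (!d->mch_lock.initialized)
		d->failures++;	/* a debug kernel would panic here */
}

/* Original order: the user of the lock runs before its initializer. */
static int
toy_load_old(struct toy_dev *d)
{
	memset(d, 0, sizeof(*d));
	toy_uncore_init(d);
	toy_ips_init(d);
	return d->failures;
}

/* Reordered: the initializer is hoisted above the first user. */
static int
toy_load_new(struct toy_dev *d)
{
	memset(d, 0, sizeof(*d));
	toy_ips_init(d);
	toy_uncore_init(d);
	return d->failures;
}
```

In this sketch toy_load_old() reports one failed use and
toy_load_new() reports none, which is the effect the reordering in
the diff is meant to have.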
Regards,
- Havard
Responsible-Changed-From-To: kern-bug-people->riastradh
Responsible-Changed-By: riastradh@NetBSD.org
Responsible-Changed-When: Wed, 03 Dec 2014 23:17:44 +0000
Responsible-Changed-Why:
mine
State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 28 Feb 2015 15:13:08 +0000
State-Changed-Why:
Can you please try again from HEAD? I believe the following change
should have fixed this:
https://mail-index.netbsd.org/source-changes/2015/02/25/msg063399.html
State-Changed-From-To: feedback->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Mon, 30 Mar 2015 15:36:21 +0000
State-Changed-Why:
submitter reports fixed
>Unformatted: