NetBSD Problem Report #49424

From he@smistad.uninett.no  Thu Nov 27 14:57:23 2014
Return-Path: <he@smistad.uninett.no>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 81D89A5B2E
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 27 Nov 2014 14:57:23 +0000 (UTC)
Message-Id: <20141127145717.A11553D0B5@smistad.uninett.no>
Date: Thu, 27 Nov 2014 15:57:17 +0100 (CET)
From: he@NetBSD.org
Reply-To: he@NetBSD.org
To: gnats-bugs@gnats.NetBSD.org
Subject: DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
X-Send-Pr-Version: 3.95

>Number:         49424
>Category:       kern
>Synopsis:       DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    riastradh
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Nov 27 15:00:00 +0000 2014
>Closed-Date:    Mon Mar 30 15:36:21 +0000 2015
>Last-Modified:  Mon Mar 30 15:36:21 +0000 2015
>Originator:     Havard Eidnes
>Release:        NetBSD 7.0_BETA
>Organization:
	None
>Environment:
System: NetBSD xxx 7.0_BETA NetBSD 7.0_BETA (DRMKMS) #1: ...
Architecture: amd64
Machine: amd64
>Description:
	The DRMKMS kernel, built from netbsd-7 sources checked out
	today panics during autoconfig on this type of host.  The
	kernel messages near the failure point are:

pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: Intel Iron Lake Host Bridge (rev. 0x02)
agp0 at pchb0: G4X-family chipset
agp0: detected 32252k stolen memory
agp0: aperture at 0xd0000000, size 0x10000000
i915drmkms0 at pci0 dev 2 function 0: Intel Iron Lake Integrated Graphics Device (rev. 0x02)
drm: Memory usable by graphics device = 512M
panic: lockdebug_lookup: uninitialized lock (lock=0xffffffff81158000, from=ffffffff8053c715)
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8028dfed cs 8 rflags 246 cr2 0 ilevel 8 rsp ffffffff81383708
curlwp 0xffffffff811132c0 pid 0.1 lowest kstack 0xffffffff813802c0

	I have a photo of the backtrace on my cellphone and will
	transcribe it later (Sunday?).

	A GENERIC kernel probes the graphics as

pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x0044 (rev. 0x02)
agp0 at pchb0: G4X-family chipset
agp0: detected 32252k stolen memory
agp0: aperture at 0xd0000000, size 0x10000000
vga0 at pci0 dev 2 function 0: vendor 0x8086 product 0x0046 (rev. 0x02)
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation), using wskbd0
wsmux1: connecting to wsdisplay0
drm at vga0 not configured
vendor 0x8086 product 0x3b64 (miscellaneous communications, revision 0x06) at pci0 dev 22 function 0 not configured

	(The vesa X11 driver doesn't appear to want to drive the
	display at its native 1440x900; instead it does 1280x800, with
	predictably suboptimal results.)

>How-To-Repeat:
	Boot the DRMKMS kernel on a Lenovo T410s and watch it
	panic during probing with the above message.

>Fix:
	Sorry, I've not investigated further.

>Release-Note:

>Audit-Trail:
From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/49424: DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
Date: Tue, 02 Dec 2014 15:31:24 +0100 (CET)

 Hi,

 here's some more info, transcribed from cell phone photos of the
 laptop screen:

 ...
 agp0 at pchb0: G4X-family chipset
 agp0: detected 32252k stolen memory
 agp0: aperture at 0xd0000000, size 0x10000000
 i915drmkms0 at pci0 dev 2 function 0: Intel Iron Lake Integrated Graphics Device (rev. 0x02)
 drm: Memory usable by graphics device: 512M
 panic: lockdebug_lookup: uninitialized lock (lock=0xffffffff81158000, from=ffffffff8053c715)
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 rip ffffffff8028dfed cs 8 rflags 246 cr2 0 ilevel 8 rsp ffffffff81383708
 curlwp 0xffffffff811132c0 pid 0.1 lowest kstack 0xffffffff813802c0
 Stopped in pid 0.1 (system) at  netbsd:breakpoint+0x5:  leave
 db{0}> 

 db{0}> trace
 breakpoint() at netbsd:breakpoint+0x5
 vpanic() at netbsd:vpanic+0x13c
 snprintf() at netbsd:snprintf
 lockdebug_locked() at netbsd:lockdebug_locked
 mutex_enter() at netbsd:mutex_enter+0x43f
 intel_disable_gt_powersave() at netbsd:intel_disable_gt_powersave+0x104
 intel_uncore_sanitize() at netbsd:intel_uncore_sanitize+0x13
 i915_driver_load() at netbsd:i915_driver_load+0x8da
 drm_dev_register() at netbsd:drm_dev_register+0x87
 drm_pci_attach() at netbsd:drm_pci_attach+0x2b7
 i915drmkms_attach() at netbsd:i915drmkms_attach+0x9b
 config_attach_loc()
 pci_probe_device()
 pci_enumerate_bus()
 pcirescan()
 pciattach()
 config_attach_loc()
 mp_pci_scan()
 mainbus_attach()
 config_attach_loc()
 cpu_configure()
 main()
 db{0}>

 Best regards,

 - Havard

From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/49424: DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
Date: Wed, 03 Dec 2014 09:15:35 +0100 (CET)

 >  agp0 at pchb0: G4X-family chipset
 >  agp0: detected 32252k stolen memory
 >  agp0: aperture at 0xd0000000, size 0x10000000
 >  i915drmkms0 at pci0 dev 2 function 0: Intel Iron Lake Integrated Graphics Device (rev. 0x02)
 >  drm: Memory usable by graphics device: 512M
 >  panic: lockdebug_lookup: uninitialized lock (lock=0xffffffff81158000, from=ffffffff8053c715)
 >  fatal breakpoint trap in supervisor mode
 ...
 >  db{0}> trace
 >  breakpoint() at netbsd:breakpoint+0x5
 >  vpanic() at netbsd:vpanic+0x13c
 >  snprintf() at netbsd:snprintf
 >  lockdebug_locked() at netbsd:lockdebug_locked
 >  mutex_enter() at netbsd:mutex_enter+0x43f
 >  intel_disable_gt_powersave() at netbsd:intel_disable_gt_powersave+0x104
 >  intel_uncore_sanitize() at netbsd:intel_uncore_sanitize+0x13
 >  i915_driver_load() at netbsd:i915_driver_load+0x8da

 It does indeed look like i915_driver_load() does not initialize
 the lock for the power management sub-struct,
 dev_priv->rps.hw_lock, after zero-allocating dev_priv.  May I
 suggest:

 Index: i915_dma.c
 ===================================================================
 RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c,v
 retrieving revision 1.12
 diff -u -r1.12 i915_dma.c
 --- i915_dma.c  5 Nov 2014 23:46:09 -0000       1.12
 +++ i915_dma.c  3 Dec 2014 08:05:21 -0000
 @@ -1645,9 +1645,11 @@ int i915_driver_load(struct drm_device *
  #ifdef __NetBSD__
         linux_mutex_init(&dev_priv->dpio_lock);
         linux_mutex_init(&dev_priv->modeset_restore_lock);
 +       linux_mutex_init(&dev_priv->rps.hw_lock);
  #else
         mutex_init(&dev_priv->dpio_lock);
         mutex_init(&dev_priv->modeset_restore_lock);
 +       mutex_init(&dev_priv->rps.hw_lock);
  #endif

         intel_pm_setup(dev);

 This mutex is already destroyed, under an XXX comment, near the
 end of this function:

 free_priv:
         /* XXX intel_pm_fini */
         linux_mutex_destroy(&dev_priv->rps.hw_lock);

 and another in i915_driver_unload(), but there's neither an
 intel_pm_fini nor an intel_pm_init anywhere to be seen.

 I'll test this change shortly.

 Regards,

 - Havard

From: Havard Eidnes <he@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/49424: DRMKMS failure on Lenovo T410s with amd64/7.0_BETA
Date: Wed, 03 Dec 2014 12:27:32 +0100 (CET)

 > Index: i915_dma.c
 > ===================================================================
 > RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c,v
 > retrieving revision 1.12
 > diff -u -r1.12 i915_dma.c
 > --- i915_dma.c  5 Nov 2014 23:46:09 -0000       1.12
 > +++ i915_dma.c  3 Dec 2014 08:05:21 -0000
 > @@ -1645,9 +1645,11 @@ int i915_driver_load(struct drm_device *
 >  #ifdef __NetBSD__
 >         linux_mutex_init(&dev_priv->dpio_lock);
 >         linux_mutex_init(&dev_priv->modeset_restore_lock);
 > +       linux_mutex_init(&dev_priv->rps.hw_lock);
 >  #else
 >         mutex_init(&dev_priv->dpio_lock);
 >         mutex_init(&dev_priv->modeset_restore_lock);
 > +       mutex_init(&dev_priv->rps.hw_lock);
 >  #endif
 >  
 >         intel_pm_setup(dev);

 This turns out to be wrong.

 This machine is a so-called "Ironlake" variant, so it takes a
 different path through intel_disable_gt_powersave() and instead
 hits the uninitialized mchdev_lock in ironlake_disable_drps().
 That spinlock is initialized by intel_gpu_ips_init(), but that
 function is called much later in i915_driver_load(), i.e. too
 late for the indirect use of the lock via intel_uncore_init().

 So this is my new suggestion, which seems to provide a working
 DRMKMS kernel and working X11 on this machine:

 Index: i915_dma.c
 ===================================================================
 RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c,v
 retrieving revision 1.12
 diff -u -p -r1.12 i915_dma.c
 --- i915_dma.c  5 Nov 2014 23:46:09 -0000       1.12
 +++ i915_dma.c  3 Dec 2014 11:05:28 -0000
 @@ -1697,6 +1697,10 @@ int i915_driver_load(struct drm_device *
         /* This must be called before any calls to HAS_PCH_* */
         intel_detect_pch(dev);

 +       /* Needed here to initialize mchdev_lock, before any PM calls */
 +       if (IS_GEN5(dev))
 +               intel_gpu_ips_init(dev_priv);
 +
         intel_uncore_init(dev);

         ret = i915_gem_gtt_init(dev);
 @@ -1819,9 +1823,6 @@ int i915_driver_load(struct drm_device *
                 acpi_video_register();
         }

 -       if (IS_GEN5(dev))
 -               intel_gpu_ips_init(dev_priv);
 -
         intel_init_runtime_pm(dev_priv);

         return 0;

 Regards,

 - Havard
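
The ordering problem described in the message above can be illustrated
with a minimal user-space sketch.  This is not the driver code: struct
dbg_lock, struct fake_dev and all dbg_* and fake_* functions are
hypothetical stand-ins invented for illustration, the lock is placed in
a struct purely for convenience, and the check in dbg_lock_enter() only
approximates a LOCKDEBUG-style check (it returns false where the kernel
would panic).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a lock tracked by a LOCKDEBUG-like checker. */
struct dbg_lock {
	bool initialized;
	bool held;
};

static void
dbg_lock_init(struct dbg_lock *l)
{
	l->initialized = true;
	l->held = false;
}

/* Report an uninitialized lock and return false instead of panicking. */
static bool
dbg_lock_enter(struct dbg_lock *l)
{
	if (!l->initialized) {
		fprintf(stderr, "uninitialized lock (lock=%p)\n", (void *)l);
		return false;
	}
	l->held = true;
	return true;
}

static void
dbg_lock_exit(struct dbg_lock *l)
{
	l->held = false;
}

/* Hypothetical device state; the lock plays the role of mchdev_lock. */
struct fake_dev {
	struct dbg_lock lock;
};

/* Plays the role of intel_gpu_ips_init(): the only place the lock is set up. */
static void
fake_ips_init(struct fake_dev *d)
{
	dbg_lock_init(&d->lock);
}

/* Plays the role of intel_uncore_init(): takes the lock indirectly. */
static bool
fake_uncore_init(struct fake_dev *d)
{
	if (!dbg_lock_enter(&d->lock))
		return false;
	dbg_lock_exit(&d->lock);
	return true;
}

/* ips_first == true models the reordered call proposed in the patch. */
bool
fake_driver_load(bool ips_first)
{
	struct fake_dev d = { 0 };	/* zero-filled, like dev_priv */
	bool ok;

	if (ips_first)
		fake_ips_init(&d);
	ok = fake_uncore_init(&d);
	if (!ips_first)
		fake_ips_init(&d);	/* original order: too late */
	return ok;
}
```

In this sketch fake_driver_load(false) follows the original order and
returns false after reporting the uninitialized lock, while
fake_driver_load(true) initializes the lock first and returns true.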

Responsible-Changed-From-To: kern-bug-people->riastradh
Responsible-Changed-By: riastradh@NetBSD.org
Responsible-Changed-When: Wed, 03 Dec 2014 23:17:44 +0000
Responsible-Changed-Why:
mine


State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 28 Feb 2015 15:13:08 +0000
State-Changed-Why:
Can you please try again from HEAD?  I believe the following change
should have fixed this:

https://mail-index.netbsd.org/source-changes/2015/02/25/msg063399.html


State-Changed-From-To: feedback->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Mon, 30 Mar 2015 15:36:21 +0000
State-Changed-Why:
submitter reports fixed


>Unformatted:
