NetBSD Problem Report #50060

From www@NetBSD.org  Sat Jul 18 01:28:56 2015
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 4B625A65B9
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 18 Jul 2015 01:28:56 +0000 (UTC)
Message-Id: <20150718012855.0E220A65B9@mollari.NetBSD.org>
Date: Sat, 18 Jul 2015 01:28:55 +0000 (UTC)
From: nisimura@netbsd.org
Reply-To: nisimura@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: kernel crash with i915drmksm on Intel 965Q
X-Send-Pr-Version: www-1.0

>Number:         50060
>Notify-List:    gson@gson.org
>Category:       kern
>Synopsis:       kernel crash with i915drmksm on Intel 965Q
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    riastradh
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jul 18 01:30:01 +0000 2015
>Closed-Date:    Tue Nov 10 17:53:24 +0000 2015
>Last-Modified:  Thu Feb 11 23:25:01 +0000 2016
>Originator:     Toru Nishimura
>Release:        NetBSD 7.0_RC1
>Organization:
ALKYL Technology
>Environment:
NetBSD paq16 7.0_RC1 NetBSD 7.0_RC1 (GENERIC.201506190427Z) amd64
>Description:
Since the introduction of i915drmkms, the HP dc7700 has never booted successfully, because i915drmkms always crashes.

agp0 at pchb0: i965-family chipset
pci_mem_find: expected mem type 00000004, found 00000000
pci_mem_find: expected mem type 00000004, found 00000000
agp0: can't find MMIO registers
i915drmkms0 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
DRM error in i915_gmch_probe: failed to set up gmch
uvm_fault(0xffffffff8104b240, 0x7fc000231000, 2) -> e
fatal page fault in supervisor mode
trap type 6 code 2 rip ffffffff809cb323 cs 8 rflags 10246 cr2 7fc000231000 ilevel 8 rsp ffffffff812657e8
curlwp 0xffffffff81000540 pid 0.1 lowest kstack ffffffff812632c0
kernel: page fault trap, code=0
Stopped in pid 0.1 (system) at  netbsd:_atomic_swap_64+0x3: xchgq %rax,0(%rdi)

- 6.1.5, which uses i915drm, has had no trouble so far.
- "boot -c / disable i915drmkms" is the only way to avoid the crash.
- The HP dc7700 is a pure Intel-based product, supposedly one of the most standard desktop PCs.

Toru Nishimura / ALKYL Technology
>How-To-Repeat:
Boot an off-the-shelf NetBSD 7.0_RC1 kernel on a 965Q system. Mine happens to have an E6300/E6600 Core 2 Duo.
>Fix:

>Release-Note:

>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sat, 18 Jul 2015 19:11:25 +0000

 On Sat, Jul 18, 2015 at 01:30:01AM +0000, nisimura@netbsd.org wrote:
  > i915drmkms0 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
  > DRM error in i915_gmch_probe: failed to set up gmch

 Is this the real problem? (that is, the subsequent crash then reflects
 bad handling of the failure condition) ... or is it supposed to be
 harmless/noncritical?

  > uvm_fault(0xffffffff8104b240, 0x7fc000231000, 2) -> e
  > fatal page fault in supervisor mode
  > trap type 6 code 2 rip ffffffff809cb323 cs 8 rflags 10246 cr2 7fc000231000 ilevel 8 rsp ffffffff812657e8
  > curlwp 0xffffffff81000540 pid 0.1 lowest kstack ffffffff812632c0
  > kernel: page fault trap, code=0
  > Stopped in pid 0.1 (system) at  netbsd:_atomic_swap_64+0x3: xchgq %rax,0(%rdi)

 Can you send the stack trace?

 -- 
 David A. Holland
 dholland@netbsd.org

From: Collector Mail <locore64@48gou.jp>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, 
	nisimura@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 20 Jul 2015 13:09:34 +0900

 > > i915drmkms0 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
 > > DRM error in i915_gmch_probe: failed to set up gmch
 >
 > Is this the real problem?

 No clue.  Somehow an older NetBSD kernel says something about "i915drm" and
 has no trouble.

 > Can you send the stack trace?

 _atomic_swap_64() at netbsd:_atomic_swap_64+0x3
 bus_space_reservation_unmap1() at netbsd:bus_space_reservation_unmap1+0xc2
 bus_space_unmap() at netbsd:bus_space_unmap+0x38
 i915_driver_load() at netbsd:i915_driver_load+0x971
 drm_dev_register() at netbsd:drm_dev_register+0x87
 drm_pci_attach() at netbsd:drm_pci_attach+0x2d5
 i915drmkms_attach() at netbsd:i915drmkms_attach+0x92
 config_attach_loc() at netbsd:config_attach_loc+0x16e
 pci_probe_device() at netbsd:pci_probe_device+0x4ac
 pci_enumerate_bus() at netbsd:pci_enumerate_bus+0x168
 ...

 Toru Nishimura / ALKYL Technology

From: Collector Mail <locore64@48gou.jp>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, 
	nisimura@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 20 Jul 2015 13:18:27 +0900

 I found a dmesg from two years ago, taken on the very same machine
 running 6.99.23 on Aug 30, 2013.

 pchb0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
 agp0 at pchb0: detected 7676k stolen memory
 agp0: aperture at 0xe0000000, size 0x20000000
 vga0 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
 wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation), using
 wskbd0
 wsmux1: connecting to wsdisplay0
 i915drm0 at vga0: Intel i965Q
 i915drm0: AGP at 0xe0000000 256MB
 i915drm0: Initialized i915 1.6.0 20080730
 ...

 Toru Nishimura / ALKYL Technology

Responsible-Changed-From-To: kern-bug-people->riastradh
Responsible-Changed-By: riastradh@NetBSD.org
Responsible-Changed-When: Fri, 31 Jul 2015 18:45:25 +0000
Responsible-Changed-Why:
mine


From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sat, 10 Oct 2015 22:13:38 +0300

 I just tried to boot the NetBSD 7.0/amd64 install image on a PC with
 an ASUS P5B-V motherboard, and it panicked with the same backtrace as
 the one reported in PR 50060.  This is a regression - 6.1.5 works fine
 on this machine.
 -- 
 Andreas Gustafsson, gson@gson.org

From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50060 CVS commit: src/sys/external/bsd/drm2/dist/drm/i915
Date: Sat, 10 Oct 2015 15:29:44 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Sat Oct 10 19:29:44 UTC 2015

 Modified Files:
 	src/sys/external/bsd/drm2/dist/drm/i915: i915_dma.c

 Log Message:
 Zero out the guard for bus_space_unmap before calling i915_dma_cleanup() which
 calls i915_free_hws(), which then tries to unmap. Perhaps this fixes PR/50060.


 To generate a diff of this commit:
 cvs rdiff -u -r1.16 -r1.17 src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, christos@NetBSD.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sun, 11 Oct 2015 19:25:11 +0300

 I tried a -current kernel from source date 2015.10.10.19.35.15, which
 includes christos' recent commit of i915_dma.c 1.18, but it also
 panicked:

    pci0 at mainbus0 bus 0: configuration mode 1
    pchb0 at pci0 dev 0 function 0: vendor 8086 product 29a0 (rev. 0x02)
    agp0 at pchb0: i965-family chipset
    pci_mem_find: expected mem type 00000004, found 00000000
    pci_mem_find: expected mem type 00000004, found 00000000
    agp0: can't find MMIO registers
    ppb0 at pci0 dev 1 function 0: vendor 8086 product 29a1 (rev. 0x02)
    ppb0: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x16 @ 2.5GT/s
    pci1 at ppb0 bus 1
    i915drmkms0 at pci0 dev 2 function 0: vendor 8086 product 29a2 (rev. 0x02)
    DRM error in i915_gmch_probe: failed to set up gmch
    uvm_fault(0xffffffff811b27a0, 0x7fc000357000, 2) -> e
    fatal page fault in supervisor mode
    trap type 6 code 2 rip ffffffff80a80c43 cs 8 rflags 10246 cr2 7fc000357000 ilevel 8 rsp ffffffff8137e7c8
    curlwp 0xffffffff8110d740 pid 0.1 lowest kstack 0xffffffff8137b2c0
    kernel: page fault trap, code=0
    Stopped in pid 0.1 (system) at  netbsd:_atomic_swap_64+0x3:     xchgq   %rax,0(%rdi)
    db{0}>

 -- 
 Andreas Gustafsson, gson@gson.org

From: christos@zoulas.com (Christos Zoulas)
To: Andreas Gustafsson <gson@gson.org>, gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sun, 11 Oct 2015 12:33:16 -0400

 On Oct 11,  7:25pm, gson@gson.org (Andreas Gustafsson) wrote:
 -- Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q

 | I tried a -current kernel from source date 2015.10.10.19.35.15, which
 | includes christos' recent commit of i915_dma.c 1.18, but it also
 | panicked:
 | 
 |    pci0 at mainbus0 bus 0: configuration mode 1
 |    pchb0 at pci0 dev 0 function 0: vendor 8086 product 29a0 (rev. 0x02)
 |    agp0 at pchb0: i965-family chipset

 This seems to be a common problem:

 |    pci_mem_find: expected mem type 00000004, found 00000000
 |    pci_mem_find: expected mem type 00000004, found 00000000
 |    agp0: can't find MMIO registers

 |    ppb0 at pci0 dev 1 function 0: vendor 8086 product 29a1 (rev. 0x02)
 |    ppb0: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x16 @ 2.5GT/s
 |    pci1 at ppb0 bus 1
 |    i915drmkms0 at pci0 dev 2 function 0: vendor 8086 product 29a2 (rev. 0x02)
 |    DRM error in i915_gmch_probe: failed to set up gmch
 |    uvm_fault(0xffffffff811b27a0, 0x7fc000357000, 2) -> e
 |    fatal page fault in supervisor mode
 |    trap type 6 code 2 rip ffffffff80a80c43 cs 8 rflags 10246 cr2 7fc000357000 ilevel 8 rsp ffffffff8137e7c8
 |    curlwp 0xffffffff8110d740 pid 0.1 lowest kstack 0xffffffff8137b2c0
 |    kernel: page fault trap, code=0
 |    Stopped in pid 0.1 (system) at  netbsd:_atomic_swap_64+0x3:     xchgq   %rax,0(%rdi)
 |    db{0}>

 Backtrace? Same as before?

 christos

From: Andreas Gustafsson <gson@gson.org>
To: christos@zoulas.com (Christos Zoulas)
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sun, 11 Oct 2015 19:36:13 +0300

 Christos Zoulas wrote:
 > Backtrace? Same as before?

 Yes, same as before:

   db{0}> bt
   _atomic_swap_64() at netbsd:_atomic_swap_64+0x3
   bus_space_reservation_unmap1() at netbsd:bus_space_reservation_unmap1+0xc6
   bus_space_unmap() at netbsd:bus_space_unmap+0x38
   i915_driver_load() at netbsd:i915_driver_load+0xa67
   drm_dev_register() at netbsd:drm_dev_register+0x87
   drm_pci_attach() at netbsd:drm_pci_attach+0x2db
   i915drmkms_attach() at netbsd:i915drmkms_attach+0x9b
   config_attach_loc() at netbsd:config_attach_loc+0x17a
   pci_probe_device() at netbsd:pci_probe_device+0x4fa
   pci_enumerate_bus() at netbsd:pci_enumerate_bus+0x168
   pcirescan() at netbsd:pcirescan+0x43
   pciattach() at netbsd:pciattach+0x193
   config_attach_loc() at netbsd:config_attach_loc+0x17a
   mp_pci_scan() at netbsd:mp_pci_scan+0xa4
   mainbus_attach() at netbsd:mainbus_attach+0x2fa
   config_attach_loc() at netbsd:config_attach_loc+0x17a
   cpu_configure() at netbsd:cpu_configure+0x26
   main() at netbsd:main+0x299

 -- 
 Andreas Gustafsson, gson@gson.org

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, riastradh@NetBSD.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org, nisimura@netbsd.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sun, 11 Oct 2015 12:48:17 -0400

 On Oct 11,  4:40pm, gson@gson.org (Andreas Gustafsson) wrote:
 -- Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q

 | The following reply was made to PR kern/50060; it has been noted by GNATS.
 | 
 | From: Andreas Gustafsson <gson@gson.org>
 | To: christos@zoulas.com (Christos Zoulas)
 | Cc: gnats-bugs@NetBSD.org
 | Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
 | Date: Sun, 11 Oct 2015 19:36:13 +0300
 | 
 |  Christos Zoulas wrote:
 |  > Backtrace? Same as before?
 |  
 |  Yes, same as before:

 There is only one bus_space_unmap() in i915_dma.c; can you either put
 printfs before and after it, or comment it out, to see whether that is
 where it fails?

 christos

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, christos@NetBSD.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 09:51:39 +0300

 Christos wrote:
 > There is only one bus_space_unmap() in i915_dma.c can you either put
 > printfs before and after, or comment it out to see if it is there where
 > it fails?

 I added the printfs to the 2015.10.10.19.35.15 source, and a couple
 more for context:

 Index: i915_dma.c
 ===================================================================
 RCS file: /bracket/repo/src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c,v
 retrieving revision 1.18
 diff -u -r1.18 i915_dma.c
 --- i915_dma.c	10 Oct 2015 19:35:15 -0000	1.18
 +++ i915_dma.c	11 Oct 2015 18:13:43 -0000
 @@ -133,12 +133,14 @@

  	if (ring->status_page.gfx_addr) {
  		ring->status_page.gfx_addr = 0;
 +		printf("pre unmap\n");
  #ifdef __NetBSD__
  		bus_space_unmap(dev->pdev->pd_pa.pa_memt,
  		    dev_priv->dri1.gfx_hws_cpu_bsh, 4096);
  #else
  		iounmap(dev_priv->dri1.gfx_hws_cpu_addr);
  #endif
 +		printf("post unmap\n");		
  	}

  	/* Need to rewrite hardware status page */
 @@ -1612,6 +1614,8 @@
  	int ret = 0, mmio_bar, mmio_size;
  	uint32_t aperture_size;

 +	printf("i915_driver_load entry\n");
 +
  	info = (struct intel_device_info *) flags;

  	/* Refuse to load on gen6+ without kms enabled. */
 @@ -1885,6 +1889,9 @@
  	spin_lock_destroy(&dev_priv->irq_lock);
  #endif
  	kfree(dev_priv);
 +
 +	printf("i915_driver_load exit\n");
 +	
  	return ret;
  }


 Console output:

    agp0 at pchb0: i965-family chipset
    pci_mem_find: expected mem type 00000004, found 00000000
    pci_mem_find: expected mem type 00000004, found 00000000
    agp0: can't find MMIO registers
    ppb0 at pci0 dev 1 function 0: vendor 8086 product 29a1 (rev. 0x02)
    ppb0: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x16 @ 2.5GT/s
    pci1 at ppb0 bus 1
    i915drmkms0 at pci0 dev 2 function 0: vendor 8086 product 29a2 (rev. 0x02)
    i915_driver_load entry
    DRM error in i915_gmch_probe: failed to set up gmch
    uvm_fault(0xffffffff811b27a0, 0x7fc000357000, 2) -> e
    fatal page fault in supervisor mode
    trap type 6 code 2 rip ffffffff80a80c73 cs 8 rflags 10246 cr2 7fc000357000 ilevel 8 rsp ffffffff8137e7c8
    curlwp 0xffffffff8110d740 pid 0.1 lowest kstack 0xffffffff8137b2c0
    kernel: page fault trap, code=0
    Stopped in pid 0.1 (system) at  netbsd:_atomic_swap_64+0x3:     xchgq   %rax,0(%rdi)
    db{0}>

 Looks like that is not where it fails.
 -- 
 Andreas Gustafsson, gson@gson.org

From: Andreas Gustafsson <gson@gson.org>
To: christos@zoulas.com (Christos Zoulas), gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 15:24:30 +0300

 I added another bunch of printfs to track down where the
 bus_space_unmap() call happens.

 i915_driver_load() calls i915_gem_gtt_init(), which
 returns a nonzero value via this return statement:

         ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
                              &gtt->mappable_base, &gtt->mappable_end);
         if (ret)
                 return ret;

 i915_driver_load() then takes the goto:

         if (ret)
                 goto out_regs;

 And the panic happens in the inline function pci_iounmap(), which
 calls bus_space_unmap():

 out_regs:
         intel_uncore_fini(dev);
         intel_uncore_destroy(dev);
         pci_iounmap(dev->pdev, dev_priv->regs);
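
 The unwinding traced above follows the usual goto-cleanup shape.  As a
 minimal user-space sketch of that shape (not taken from the driver
 source: fake_dev, map_regs(), unmap_regs(), gtt_init() and
 fake_driver_load() are all hypothetical stand-ins, and malloc()/free()
 merely play the role of mapping and unmapping), each label releases only
 what was acquired before the failing step, and the unmap helper clears
 its own guard so that a repeated call is harmless.  It only illustrates
 the pattern; it does not explain why the real pci_iounmap() call faults.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device softc; not the real i915 private data. */
struct fake_dev {
	void *regs;      /* non-NULL only while the register window is "mapped" */
	int   uncore_up; /* nonzero only after the uncore step has succeeded */
	int   unmaps;    /* counts real unmap operations, for demonstration */
};

/* Pretend to map the register window; malloc() stands in for the mapping. */
static int map_regs(struct fake_dev *d)
{
	d->regs = malloc(64);
	return d->regs != NULL ? 0 : -1;
}

/* Unmap only if something is mapped, then clear the guard. */
static void unmap_regs(struct fake_dev *d)
{
	if (d->regs == NULL)   /* guard: nothing mapped, nothing to undo */
		return;
	free(d->regs);
	d->regs = NULL;        /* a second call now becomes a no-op */
	d->unmaps++;
}

/* Pretend GTT probe; returns a nonzero error when fail_gtt is set. */
static int gtt_init(int fail_gtt)
{
	return fail_gtt ? -5 : 0;
}

/* Load sequence with staged cleanup labels, in acquisition-reversed order. */
static int fake_driver_load(struct fake_dev *d, int fail_gtt)
{
	int ret;

	ret = map_regs(d);
	if (ret)
		goto out;          /* nothing acquired yet */

	d->uncore_up = 1;

	ret = gtt_init(fail_gtt);
	if (ret)
		goto out_regs;     /* undo uncore and the register mapping */

	return 0;

out_regs:
	d->uncore_up = 0;
	unmap_regs(d);
out:
	return ret;
}
```

 With fail_gtt set, fake_driver_load() returns the probe's error after a
 single unmap, and a later stray unmap_regs() on the same structure does
 nothing.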

 -- 
 Andreas Gustafsson, gson@gson.org

From: christos@zoulas.com (Christos Zoulas)
To: Andreas Gustafsson <gson@gson.org>, gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 09:17:22 -0400

 On Oct 12,  3:24pm, gson@gson.org (Andreas Gustafsson) wrote:
 -- Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q

 | I added another bunch of printfs to track down where the
 | bus_space_unmap() call happens.
 | 
 | i915_driver_load() calls i915_gem_gtt_init(), which
 | returns a nonzero value via this return statement:
 | 
 |         ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
 |                              &gtt->mappable_base, &gtt->mappable_end);
 |         if (ret)
 |                 return ret;
 | 
 | i915_driver_load() then takes the goto:
 | 
 |         if (ret)
 |                 goto out_regs;
 | 
 | And the panic happens in the inline function pci_iounmap(), which
 | calls bus_space_unmap():
 | 
 | out_regs:
 |         intel_uncore_fini(dev);
 |         intel_uncore_destroy(dev);
 |         pci_iounmap(dev->pdev, dev_priv->regs);

 Thanks, this is very helpful...

 christos

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, riastradh@NetBSD.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org, nisimura@netbsd.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 15:03:35 -0400

 On Oct 12, 12:25pm, gson@gson.org (Andreas Gustafsson) wrote:
 -- Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q

 | The following reply was made to PR kern/50060; it has been noted by GNATS.
 | 
 | From: Andreas Gustafsson <gson@gson.org>
 | To: christos@zoulas.com (Christos Zoulas), gnats-bugs@NetBSD.org
 | Cc: 
 | Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
 | Date: Mon, 12 Oct 2015 15:24:30 +0300
 | 
 |  I added another bunch of printfs to track down where the
 |  bus_space_unmap() call happens.
 |  
 |  i915_driver_load() calls i915_gem_gtt_init(), which
 |  returns a nonzero value via this return statement:
 |  
 |          ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
 |                               &gtt->mappable_base, &gtt->mappable_end);
 |          if (ret)
 |                  return ret;

 Unfortunately forcing an error there on my machine does not reproduce
 the problem... Boot completes but I end up with no console which is a
 different bug!

 christos

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, nisimura@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 19:54:49 +0000

    Date: Mon, 12 Oct 2015 09:51:39 +0300
    From: Andreas Gustafsson <gson@gson.org>

    agp0: can't find MMIO registers

 This is where the real problem is happening.  Fixing the error
 branches, as christos@ has been doing, is all well and good, but they
 won't help to make the thing work.  This is in agp_i810_attach in
 sys/dev/pci/agp_i810.c, which is a twisty maze of intermingled
 device-specific logic.  I have the data sheets if anyone wants to take
 a closer look.
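
 For context on the "pci_mem_find: expected mem type 00000004, found
 00000000" lines that precede that message in the console output: if
 those values are the memory-type bits of a PCI base address register
 (an assumption; the pci_mem_find source is not quoted in this PR), then
 0x4 denotes a 64-bit memory BAR and 0x0 a 32-bit one.  A small sketch of
 that decoding follows.  The macro and function names are hypothetical,
 chosen for illustration; only the bit layout comes from the PCI
 specification.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names; the bit layout is that of a PCI memory BAR. */
#define BAR_SPACE_IO        0x00000001u /* bit 0: 1 = I/O space, 0 = memory */
#define BAR_MEM_TYPE_MASK   0x00000006u /* bits 2:1: memory decoder type */
#define BAR_MEM_TYPE_32BIT  0x00000000u
#define BAR_MEM_TYPE_64BIT  0x00000004u
#define BAR_MEM_PREFETCH    0x00000008u /* bit 3: prefetchable */

/* Extract the memory-type field from a raw BAR value. */
static uint32_t bar_mem_type(uint32_t bar)
{
	return bar & BAR_MEM_TYPE_MASK;
}

/*
 * Compare the BAR's type field with the expected one.  Returns 0 on a
 * match; otherwise prints a message shaped like the one in the console
 * output and returns -1.
 */
static int check_bar_type(uint32_t bar, uint32_t expected)
{
	uint32_t found = bar_mem_type(bar);

	if (found == expected)
		return 0;
	printf("expected mem type %08x, found %08x\n",
	    (unsigned)expected, (unsigned)found);
	return -1;
}
```

 Under that reading, the message would mean that the code asked for a
 64-bit memory BAR and the register it looked at described a 32-bit one.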

 Can you show `pcictl pci0 dump -b 0 -d 0 -f 2'?

From: Andreas Gustafsson <gson@gson.org>
To: Taylor R Campbell <riastradh@NetBSD.org>, gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 23:47:38 +0300

 Taylor R Campbell wrote:
 > Can you show `pcictl pci0 dump -b 0 -d 0 -f 2'?

 Since NetBSD/amd64 7.0 doesn't boot, I booted NetBSD/i386 5.99.60 that
 I happened to have on a USB stick and ran "pcictl pci0 dump -b 0 -d 0
 -f 2" there, but it printed nothing.  It doesn't look like device 0
 has a function 2, neither in the 5.99.60 nor the 7.0 dmesg.  Did you
 perhaps mean "-f 0"?

 The full dmesg from 5.99.60 is below for your reference.

 Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
     2006, 2007, 2008, 2009, 2010, 2011, 2012
     The NetBSD Foundation, Inc.  All rights reserved.
 Copyright (c) 1982, 1986, 1989, 1991, 1993
     The Regents of the University of California.  All rights reserved.

 NetBSD 5.99.60 (GENERIC) #0: Sun Jan 22 00:24:44 EET 2012
 	gson@guru.araneus.fi:/bracket/work/2012.01.21.17.12.56/obj/sys/arch/i386/compile/GENERIC
 total memory = 2039 MB
 avail memory = 1992 MB
 timecounter: Timecounters tick every 10.000 msec
 timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
 System manufacturer System Product Name (System Version)
 mainbus0 (root)
 cpu0 at mainbus0 apid 0: Intel(R) Core(TM)2 Quad CPU           @ 2.40GHz, id 0x6f7
 cpu1 at mainbus0 apid 1: Intel(R) Core(TM)2 Quad CPU           @ 2.40GHz, id 0x6f7
 cpu2 at mainbus0 apid 2: Intel(R) Core(TM)2 Quad CPU           @ 2.40GHz, id 0x6f7
 cpu3 at mainbus0 apid 3: Intel(R) Core(TM)2 Quad CPU           @ 2.40GHz, id 0x6f7
 ioapic0 at mainbus0 apid 4: pa 0xfec00000, version 20, 24 pins
 acpi0 at mainbus0: Intel ACPICA 20110623
 acpi0: X/RSDT: OemId <MSTEST,TESTONLY,11000726>, AslId <MSFT,00000097>
 ACPI Warning: Incorrect checksum in table [OEMB] - 0x60, should be 0x57 (20110623/tbutils-282)
 acpi0: SCI interrupting at int 9
 timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
 hpet0 at acpi0: high precision event timer (mem 0xfed00000-0xfed00400)
 timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
 MCH (PNP0C01) at acpi0 not configured
 attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
 pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
 midi0 at pcppi1: PC speaker
 sysbeep0 at pcppi1
 npx1 at acpi0 (COPR, PNP0C04): io 0xf0-0xff irq 13
 npx1: reported by CPUID; using exception 16
 UAR1 (PNP0501) at acpi0 not configured
 FDC (PNP0700) at acpi0 not configured
 SIOR (PNP0C02) at acpi0 not configured
 RMSC (PNP0C02) at acpi0 not configured
 aibs0 at acpi0 (ASOC, ATK0110-16843024): ASUSTeK AI Booster
 OMSC (PNP0C02) at acpi0 not configured
 PCIE (PNP0C02) at acpi0 not configured
 RMEM (PNP0C01) at acpi0 not configured
 acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
 apm0 at acpi0: Power Management spec V1.2
 attimer1: attached to pcppi1
 pci0 at mainbus0 bus 0: configuration mode 1
 pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
 pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x29a0 (rev. 0x02)
 agp0 at pchb0: detected 7676k stolen memory
 agp0: aperture at 0xd0000000, size 0x20000000
 ppb0 at pci0 dev 1 function 0: vendor 0x8086 product 0x29a1 (rev. 0x02)
 ppb0: PCI Express 1.0 <Root Port of PCI-E Root Complex>
 pci1 at ppb0 bus 1
 pci1: i/o space, memory space enabled, rd/line, wr/inv ok
 vga1 at pci0 dev 2 function 0: vendor 0x8086 product 0x29a2 (rev. 0x02)
 vga1: WARNING: ignoring 64-bit BAR @ 0x18
 wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
 wsmux1: connecting to wsdisplay0
 i915drm0 at vga1: Intel i965G
 i915drm0: AGP at 0xd0000000 256MB
 i915drm0: Initialized i915 1.6.0 20080730
 vendor 0x8086 product 0x29a4 (miscellaneous communications, revision 0x02) at pci0 dev 3 function 0 not configured
 uhci0 at pci0 dev 26 function 0: vendor 0x8086 product 0x2834 (rev. 0x02)
 uhci0: interrupting at ioapic0 pin 16
 usb0 at uhci0: USB revision 1.0
 uhci1 at pci0 dev 26 function 1: vendor 0x8086 product 0x2835 (rev. 0x02)
 uhci1: interrupting at ioapic0 pin 17
 usb1 at uhci1: USB revision 1.0
 ehci0 at pci0 dev 26 function 7: vendor 0x8086 product 0x283a (rev. 0x02)
 ehci0: interrupting at ioapic0 pin 18
 ehci0: BIOS has given up ownership
 ehci0: EHCI version 1.0
 ehci0: companion controllers, 2 ports each: uhci0 uhci1
 usb2 at ehci0: USB revision 2.0
 hdaudio0 at pci0 dev 27 function 0: HD Audio Controller
 hdaudio0: interrupting at ioapic0 pin 22
 hdafg0 at hdaudio0: ADI AD1988A
 hdafg0: DAC00 8ch: Speaker [Jack]
 hdafg0: ADC01 2ch: CD [Built-In], Line In [Jack], Mic In [Jack]
 hdafg0: DAC02 2ch: HP Out [Jack]
 hdafg0: DIG03 2ch: SPDIF Out [Jack]
 hdafg0: 8ch/2ch 8000Hz 11025Hz 16000Hz 22050Hz 32000Hz 44100Hz 48000Hz 88200Hz 96000Hz 192000Hz PCM16 PCM20 PCM24 AC3
 audio0 at hdafg0: full duplex, playback, capture, independent
 ppb1 at pci0 dev 28 function 0: vendor 0x8086 product 0x283f (rev. 0x02)
 ppb1: PCI Express 1.0 <Root Port of PCI-E Root Complex>
 pci2 at ppb1 bus 3
 pci2: i/o space, memory space enabled, rd/line, wr/inv ok
 ppb2 at pci0 dev 28 function 4: vendor 0x8086 product 0x2847 (rev. 0x02)
 ppb2: PCI Express 1.0 <Root Port of PCI-E Root Complex>
 pci3 at ppb2 bus 2
 pci3: i/o space, memory space enabled, rd/line, wr/inv ok
 jmide0 at pci3 dev 0 function 0: vendor 0x197b product 0x2363
 jmide0: 1 PATA port, 2 SATA ports
 jmide0: interrupting at ioapic0 pin 16
 ahcisata0 at jmide0
 ahcisata0: AHCI revision 1.0, 2 ports, 32 slots, CAP 0xc722ff01<PSC,SSC,PMD,SPM,ISS=0x2=Gen2,SCLO,SAL,SALP,SNCQ,S64A>
 atabus0 at ahcisata0 channel 0
 atabus1 at ahcisata0 channel 1
 jmide0: PCI IDE interface used
 jmide0: bus-master DMA support present
 jmide0: primary channel wired to native-PCI mode
 jmide0: primary channel is unused
 jmide0: secondary channel wired to native-PCI mode
 jmide0: secondary channel is PATA
 atabus2 at jmide0 channel 1
 uhci2 at pci0 dev 29 function 0: vendor 0x8086 product 0x2830 (rev. 0x02)
 uhci2: interrupting at ioapic0 pin 23
 usb3 at uhci2: USB revision 1.0
 uhci3 at pci0 dev 29 function 1: vendor 0x8086 product 0x2831 (rev. 0x02)
 uhci3: interrupting at ioapic0 pin 19
 usb4 at uhci3: USB revision 1.0
 uhci4 at pci0 dev 29 function 2: vendor 0x8086 product 0x2832 (rev. 0x02)
 uhci4: interrupting at ioapic0 pin 18
 usb5 at uhci4: USB revision 1.0
 ehci1 at pci0 dev 29 function 7: vendor 0x8086 product 0x2836 (rev. 0x02)
 ehci1: interrupting at ioapic0 pin 23
 ehci1: EHCI version 1.0
 ehci1: companion controllers, 2 ports each: uhci2 uhci3 uhci4
 usb6 at ehci1: USB revision 2.0
 ppb3 at pci0 dev 30 function 0: vendor 0x8086 product 0x244e (rev. 0xf2)
 pci4 at ppb3 bus 4
 pci4: i/o space, memory space enabled
 fwohci0 at pci4 dev 3 function 0: vendor 0x1106 product 0x3044 (rev. 0xc0)
 fwohci0: interrupting at ioapic0 pin 21
 fwohci0: OHCI version 1.10 (ROM=1)
 fwohci0: No. of Isochronous channels is 4.
 fwohci0: EUI64 00:11:d8:00:01:55:27:ef
 fwohci0: Phy 1394a available S400, 2 ports.
 fwohci0: Link S400, max_rec 2048 bytes.
 ieee1394if0 at fwohci0: IEEE1394 bus
 fwip0 at ieee1394if0: IP over IEEE1394
 fwohci0: Initiate bus reset
 skc0 at pci4 dev 4 function 0: ioapic0 pin 19
 skc0: interrupt moderation is 0 us
 skc0: Marvell Yukon Lite Gigabit Ethernet rev. (0x9)
 sk0 at skc0 port A: Ethernet address 00:1b:fc:9e:0f:b4
 makphy0 at sk0 phy 0: Marvell 88E1011 Gigabit PHY, rev. 5
 makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
 ichlpcib0 at pci0 dev 31 function 0: vendor 0x8086 product 0x2810 (rev. 0x02)
 timecounter: Timecounter "ichlpcib0" frequency 3579545 Hz quality 1000
 ichlpcib0: 24-bit timer
 ichlpcib0: TCO (watchdog) timer configured.
 gpio0 at ichlpcib0: 64 pins
 piixide0 at pci0 dev 31 function 2: Intel 82801H Serial ATA Controller (ICH8) (rev. 0x02)
 piixide0: bus-master DMA support present
 piixide0: primary channel configured to native-PCI mode
 piixide0: using ioapic0 pin 19 for native-PCI interrupt
 atabus3 at piixide0 channel 0
 piixide0: secondary channel configured to native-PCI mode
 atabus4 at piixide0 channel 1
 ichsmb0 at pci0 dev 31 function 3: vendor 0x8086 product 0x283e (rev. 0x02)
 ichsmb0: interrupting at ioapic0 pin 18
 iic0 at ichsmb0: I2C bus
 piixide1 at pci0 dev 31 function 5: Intel 82801H Serial ATA Controller (ICH8) (rev. 0x02)
 piixide1: bus-master DMA support present
 piixide1: primary channel wired to native-PCI mode
 piixide1: using ioapic0 pin 19 for native-PCI interrupt
 atabus5 at piixide1 channel 0
 piixide1: secondary channel wired to native-PCI mode
 atabus6 at piixide1 channel 1
 isa0 at ichlpcib0
 com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
 pckbc0 at isa0 port 0x60-0x64
 fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
 acpicpu0 at cpu0: ACPI CPU
 acpicpu0: C1: HLT, lat   0 us, pow     0 mW
 acpicpu0: P0: FFH, lat  10 us, pow 88000 mW, 2394 MHz
 acpicpu0: P1: FFH, lat  10 us, pow 56320 mW, 1596 MHz
 coretemp0 at cpu0: thermal sensor, 1 C resolution
 acpicpu1 at cpu1: ACPI CPU
 coretemp1 at cpu1: thermal sensor, 1 C resolution
 acpicpu2 at cpu2: ACPI CPU
 coretemp2 at cpu2: thermal sensor, 1 C resolution
 acpicpu3 at cpu3: ACPI CPU
 coretemp3 at cpu3: thermal sensor, 1 C resolution
 fwohci0: BUS reset
 fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
 ieee1394if0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
 ieee1394if0: bus manager 0
 timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
 timecounter: Timecounter "TSC" frequency 2400087870 Hz quality 3000
 fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
 uhub0 at usb0: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
 uhub0: 2 ports with 2 removable, self powered
 uhub1 at usb1: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
 uhub1: 2 ports with 2 removable, self powered
 uhub2 at usb2: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
 uhub2: 4 ports with 4 removable, self powered
 uhub3 at usb3: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
 uhub3: 2 ports with 2 removable, self powered
 uhub4 at usb4: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
 uhub4: 2 ports with 2 removable, self powered
 uhub5 at usb5: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
 uhub5: 2 ports with 2 removable, self powered
 uhub6 at usb6: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
 uhub6: 6 ports with 6 removable, self powered
 ehci1: handing over low speed device on port 2 to uhci2
 atapibus0 at atabus2: 2 targets
 cd0 at atapibus0 drive 0: <TSSTcorpCD/DVDW SH-S182D, , SB04> cdrom removable
 cd0: 32-bit data port
 cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
 cd0(jmide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
 umass0 at uhub6 port 4 configuration 1 interface 0
 umass0: LEXAR JD FIREFLY, rev 2.00/11.00, addr 2
 umass0: using SCSI over Bulk-Only
 scsibus0 at umass0: 2 targets, 1 lun per target
 sd0 at scsibus0 target 0 lun 0: <LEXAR, JD FIREFLY, 1100> disk removable
 sd0: fabricating a geometry
 sd0: 1920 MB, 1920 cyl, 64 head, 32 sec, 512 bytes/sect x 3932160 sectors
 sd0: fabricating a geometry
 Kernelized RAIDframe activated
 boot device: sd0
 root on sd0a dumps on sd0b
 root file system type: ffs
 uhidev0 at uhub3 port 2 configuration 1 interface 0
 uhidev0: Holtek Semiconductor USB Keyboard, rev 1.10/3.10, addr 2, iclass 3/1
 ukbd0 at uhidev0: 8 modifier keys, 6 key codes
 wskbd0 at ukbd0: console keyboard, using wsdisplay0
 uhidev1 at uhub3 port 2 configuration 1 interface 1
 uhidev1: Holtek Semiconductor USB Keyboard, rev 1.10/3.10, addr 2, iclass 3/0
 uhidev1: 2 report ids
 uhid0 at uhidev1 reportid 1: input=1, output=0, feature=0
 uhid1 at uhidev1 reportid 2: input=3, output=0, feature=0
 skc0: interrupt moderation is 1000 us

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gson@netbsd.org, nisimura@netbsd.org
Cc: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 20:36:20 +0000

 This is a multi-part message in MIME format.
 --=_PEcRT7HpZC+rI8TOQS8CJoc9wLMBnnal

 Can you please try the attached patch?

 --=_PEcRT7HpZC+rI8TOQS8CJoc9wLMBnnal
 Content-Type: text/plain; charset="ISO-8859-1"; name="agp"
 Content-Disposition: attachment; filename="agp.patch"

 Index: agp_i810.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/agp_i810.c,v
 retrieving revision 1.118
 diff -p -u -r1.118 agp_i810.c
 --- agp_i810.c	5 Apr 2015 12:55:20 -0000	1.118
 +++ agp_i810.c	12 Oct 2015 20:35:42 -0000
 @@ -420,7 +420,9 @@ agp_i810_attach(device_t parent, device_
  	case CHIP_I965:
  		apbase = AGP_I965_GMADR;
  		mmadr_bar = AGP_I965_MMADR;
 -		mmadr_type |= PCI_MAPREG_MEM_TYPE_64BIT;
 +		mmadr_type |= PCI_MAPREG_MEM_TYPE_MASK &
 +		    pci_mapreg_type(isc->vga_pa.pa_pc,
 +			isc->vga_pa.pa_tag, AGP_I965_MMADR);
  		if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
  			AGP_I965_MMADR, mmadr_type, NULL, &isc->size, NULL))
  			isc->size = 512*1024; /* XXX */

 --=_PEcRT7HpZC+rI8TOQS8CJoc9wLMBnnal--

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, nisimura@netbsd.org
Cc: 
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 20:10:50 +0000

    Date: Mon, 12 Oct 2015 19:54:49 +0000
    From: Taylor R Campbell <riastradh@NetBSD.org>

    Can you show `pcictl pci0 dump -b 0 -d 0 -f 2'?

 Sorry, I meant: pcictl pci0 dump -b 0 -d 2 -f 0

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gson@netbsd.org, nisimura@netbsd.org
Cc: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 23:07:57 +0000

 This is a multi-part message in MIME format.
 --=_I2aphbwziqku9cDX0Sh0XLwUoEpS6aeG

 Here's a slightly better, if more ambitious, patch, which factors out
 the relevant logic and leaves a comment explaining what is going on
 here so it won't take me a whole day to figure this out next time it
 bites us.

 --=_I2aphbwziqku9cDX0Sh0XLwUoEpS6aeG
 Content-Type: text/plain; charset="ISO-8859-1"; name="agp1"
 Content-Disposition: attachment; filename="agp1.patch"

 Index: agp_i810.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/agp_i810.c,v
 retrieving revision 1.118
 diff -p -u -r1.118 agp_i810.c
 --- agp_i810.c	5 Apr 2015 12:55:20 -0000	1.118
 +++ agp_i810.c	12 Oct 2015 23:06:42 -0000
 @@ -407,47 +407,91 @@ agp_i810_attach(device_t parent, device_
  	}
  	aprint_naive("\n");
  
 -	mmadr_type = PCI_MAPREG_TYPE_MEM;
 +	/* Discriminate on the chipset to choose the relevant BARs.  */
  	switch (isc->chiptype) {
  	case CHIP_I915:
  	case CHIP_G33:
  		apbase = AGP_I915_GMADR;
  		mmadr_bar = AGP_I915_MMADR;
 -		isc->size = 512*1024;
  		gtt_bar = AGP_I915_GTTADR;
  		gtt_off = ~(bus_size_t)0; /* XXXGCC */
  		break;
  	case CHIP_I965:
  		apbase = AGP_I965_GMADR;
  		mmadr_bar = AGP_I965_MMADR;
 -		mmadr_type |= PCI_MAPREG_MEM_TYPE_64BIT;
 -		if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
 -			AGP_I965_MMADR, mmadr_type, NULL, &isc->size, NULL))
 -			isc->size = 512*1024; /* XXX */
  		gtt_bar = 0;
  		gtt_off = AGP_I965_GTT;
  		break;
  	case CHIP_G4X:
  		apbase = AGP_I965_GMADR;
  		mmadr_bar = AGP_I965_MMADR;
 -		mmadr_type |= PCI_MAPREG_MEM_TYPE_64BIT;
 -		if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
 -			AGP_I965_MMADR, mmadr_type, NULL, &isc->size, NULL))
 -			isc->size = 512*1024; /* XXX */
  		gtt_bar = 0;
  		gtt_off = AGP_G4X_GTT;
  		break;
  	default:
  		apbase = AGP_I810_GMADR;
  		mmadr_bar = AGP_I810_MMADR;
 -		if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
 -			AGP_I810_MMADR, mmadr_type, NULL, &isc->size, NULL))
 -			isc->size = 512*1024; /* XXX */
  		gtt_bar = 0;
  		gtt_off = AGP_I810_GTT;
  		break;
  	}
  
 +	/*
 +	 * Ensure the MMIO BAR is, in fact, a memory BAR.
 +	 *
 +	 * XXX This is required because we use pa_memt below.  It is
 +	 * not a priori clear to me there is any other reason to
 +	 * require this.
 +	 */
 +	mmadr_type = pci_mapreg_type(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
 +	    mmadr_bar);
 +	if ((mmadr_type & PCI_MAPREG_TYPE_MEM) != PCI_MAPREG_TYPE_MEM) {
 +		aprint_error_dev(self, "non-memory device MMIO registers\n");
 +		error = ENXIO;
 +		goto fail1;
 +	}
 +
 +	/*
 +	 * Determine the size of the MMIO registers.
 +	 *
 +	 * XXX The size of the MMIO registers we use is statically
 +	 * determined, as a function of the chipset, by the driver's
 +	 * implementation.
 +	 *
 +	 * On some chipsets, the GTT is part of the MMIO register BAR.
 +	 * We would like to map the GTT separately, so that we can map
 +	 * it prefetchable, which we can't do with the MMIO registers.
 +	 * Consequently, we would especially like to map a fixed size
 +	 * of MMIO registers, not just whatever size the BAR says.
 +	 *
 +	 * However, old drm assumes that the combined GTT/MMIO register
 +	 * space is a single bus space mapping, so mapping them
 +	 * separately breaks that.  Once we rip out old drm, we can
 +	 * replace the pci_mapreg_info call by the chipset switch.
 +	 */
 +#if notyet
 +	switch (isc->chiptype) {
 +	case CHIP_I810:
 +	case CHIP_I830:
 +	case CHIP_I855:
 +	case CHIP_I915:
 +	case CHIP_G33:
 +	case CHIP_I965:
 +	case CHIP_G4X:
 +		isc->size = 512*1024;
 +		break;
 +	case CHIP_SANDYBRIDGE:
 +	case CHIP_IVYBRIDGE:
 +	case CHIP_HASWELL:
 +		isc->size = 2*1024*1024;
 +		break;
 +	}
 +#else
 +	if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
 +		mmadr_bar, mmadr_type, NULL, &isc->size, NULL))
 +		isc->size = 512*1024;
 +#endif	/* notyet */
 +
  	/* Map (or, rather, find the address and size of) the aperture.  */
  	if (isc->chiptype == CHIP_I965 || isc->chiptype == CHIP_G4X)
  		error = agp_i965_map_aperture(&isc->vga_pa, sc, apbase);

 --=_I2aphbwziqku9cDX0Sh0XLwUoEpS6aeG--

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, riastradh@NetBSD.org
Cc: nisimura@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Tue, 13 Oct 2015 10:04:13 +0300

 Taylor R Campbell wrote:
 >  Sorry, I meant: pcictl pci0 dump -b 0 -d 2 -f 0

 PCI configuration registers:
   Common header:
     0x00: 0x29a28086 0x00900007 0x03000002 0x00000000

     Vendor Name: Intel (0x8086)
     Device Name: 82965G Integrated Graphics Device (0x29a2)
     Command register: 0x0007
       I/O space accesses: on
       Memory space accesses: on
       Bus mastering: on
       Special cycles: off
       MWI transactions: off
       Palette snooping: off
       Parity error checking: off
       Address/data stepping: off
       System error (SERR): off
       Fast back-to-back transactions: off
       Interrupt disable: off
     Status register: 0x0090
       Interrupt status: inactive
       Capability List support: on
       66 MHz capable: off
       User Definable Features (UDF) support: off
       Fast back-to-back capable: on
       Data parity error detected: off
       DEVSEL timing: fast (0x0)
       Slave signaled Target Abort: off
       Master received Target Abort: off
       Master received Master Abort: off
       Asserted System Error (SERR): off
       Parity error detected: off
     Class Name: display (0x03)
     Subclass Name: VGA (0x00)
     Interface: 0x00
     Revision ID: 0x02
     BIST: 0x00
     Header Type: 0x00 (0x00)
     Latency Timer: 0x00
     Cache Line Size: 0x00

   Type 0 ("normal" device) header:
     0x10: 0xffa00000 0x00000000 0xd000000c 0x00000000
     0x20: 0x0000ec01 0x00000000 0x00000000 0x820b1043
     0x30: 0x00000000 0x00000090 0x00000000 0x0000010b

     Base address register at 0x10
       type: 32-bit nonprefetchable memory
       base: 0xffa00000, not sized
     Base address register at 0x14
       not implemented(?)
     Base address register at 0x18
       type: 64-bit prefetchable memory
       base: 0x00000000d0000000, not sized
     Base address register at 0x20
       type: i/o
       base: 0x0000ec00, not sized
     Base address register at 0x24
       not implemented(?)
     Cardbus CIS Pointer: 0x00000000
     Subsystem vendor ID: 0x1043
     Subsystem ID: 0x820b
     Expansion ROM Base Address: 0x00000000
     Capability list pointer: 0x90
     Reserved @ 0x38: 0x00000000
     Maximum Latency: 0x00
     Minimum Grant: 0x00
     Interrupt pin: 0x01 (pin A)
     Interrupt line: 0x0b

   Capability register at 0x90
     type: 0x05 (MSI)
   Capability register at 0xd0
     type: 0x01 (Power Management, rev. 1.0)

   PCI Message Signaled Interrupt
     Message Control register: 0x0000
       MSI Enabled: no
       Multiple Message Capable: no (1 vector)
       Multiple Message Enabled: off (1 vector)
       64 Bit Address Capable: no
       Per-Vector Masking Capable: no
     Message Address register: 0x00000000
     Message Data register: 0x00000000

   PCI Power Management Capabilities Register
     Capabilities register: 0x0022
       Version: 1.1
       PME# clock: off
       Device specific initialization: on
       3.3V auxiliary current: self-powered
       D1 power management state support: off
       D2 power management state support: off
       PME# support: 0x00
     Control/status register: 0x0000
       Power state: D0
       PCI Express reserved: off
       No soft reset: off
       PME# assertion disabled
       PME# status: off

   Device-dependent header:
     0x40: 0x00000000 0x000000e0 0x51090009 0x8900036e
     0x50: 0x00300c86 0x0000004b 0x00000000 0x7f800000
     0x60: 0x00020000 0x00000000 0x00000000 0x00000000
     0x70: 0x00000000 0x00000000 0x00000000 0x00000000
     0x80: 0x00000000 0x00000000 0x00000000 0x00000000
     0x90: 0x0000d005 0x00000000 0x00000000 0x00000000
     0xa0: 0x00001111 0x00000000 0x00000000 0x00000000
     0xb0: 0x00000000 0x00000000 0x00000000 0x00000000
     0xc0: 0x00000000 0x3c071f1f 0x10200080 0x000001c9
     0xd0: 0x00220001 0x00000000 0x00000000 0x00020100
     0xe0: 0x00000000 0x00000000 0x00008000 0x00000000
     0xf0: 0x00030034 0x00000000 0x00040f90 0x7f79e0e4
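
 Editorial illustration, not part of the original mail: the dump above
 reports the BAR at 0x10 as 32-bit nonprefetchable memory, while the
 boot messages in the original report say "expected mem type 00000004,
 found 00000000". The sketch below decodes the low bits of a raw BAR
 value under the standard PCI BAR layout (bit 0 selects I/O vs. memory,
 bits 2:1 give the memory decoder width, bit 3 marks prefetchable).
 The macro and function names are hypothetical and are not the NetBSD
 pcireg.h definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the bit positions follow the PCI BAR encoding. */
#define BAR_SPACE_IO        0x1u  /* bit 0: 1 = I/O space, 0 = memory */
#define BAR_MEM_TYPE_MASK   0x6u  /* bits 2:1: memory decoder width */
#define BAR_MEM_TYPE_32BIT  0x0u
#define BAR_MEM_TYPE_64BIT  0x4u  /* the "mem type 00000004" case */
#define BAR_MEM_PREFETCH    0x8u  /* bit 3: prefetchable memory */

/* True if the BAR decodes memory space rather than I/O space. */
bool
bar_is_memory(uint32_t bar)
{
	return (bar & BAR_SPACE_IO) == 0;
}

/* True if the BAR is a memory BAR with a 64-bit decoder. */
bool
bar_is_64bit(uint32_t bar)
{
	return bar_is_memory(bar) &&
	    (bar & BAR_MEM_TYPE_MASK) == BAR_MEM_TYPE_64BIT;
}

/* True if the BAR is a memory BAR marked prefetchable. */
bool
bar_is_prefetchable(uint32_t bar)
{
	return bar_is_memory(bar) && (bar & BAR_MEM_PREFETCH) != 0;
}
```

 With the values from this dump, bar_is_64bit(0xffa00000) is false
 (the MMIO BAR at 0x10), bar_is_64bit(0xd000000c) is true (the
 prefetchable BAR at 0x18), and bar_is_memory(0x0000ec01) is false
 (the i/o BAR at 0x20). A driver that insists on the 64-bit type for
 the BAR at 0x10 would therefore see a mismatch on this machine.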

From: Andreas Gustafsson <gson@NetBSD.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: nisimura@netbsd.org,
    gnats-bugs@NetBSD.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Tue, 13 Oct 2015 13:04:11 +0300

 Taylor R Campbell wrote:
 > Here's a slightly better, if more ambitious, patch, which factors out
 > the relevant logic and leaves a comment explaining what is going on
 > here so it won't take me a whole day to figure this out next time it
 > bites us.

 I applied this patch, and my machine now boots.  Thank you!

 There is still a problem with the video mode, though: I have an LCD
 display connected to the VGA port, and it is showing the following
 message:

 		ATTENTION
 	       OUT OF RANGE
 	 H: 71.9 KHz    V: 160 Hz
        PLEASE CHANGE SIGNAL TIMING

 I'm planning to use this machine with a serial console, so this is not
 a big problem for me, but it probably will be for someone else.
 -- 
 Andreas Gustafsson, gson@NetBSD.org

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50060 CVS commit: src/sys/dev/pci
Date: Tue, 13 Oct 2015 12:17:04 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Tue Oct 13 12:17:04 UTC 2015

 Modified Files:
 	src/sys/dev/pci: agp_i810.c

 Log Message:
 Fix mapping Intel graphics device registers.

 - Accept either 32-bit or 64-bit mappings for all devices.
 - Let the device always dictate size of the mapping.
 - Explain why we don't have a statically fixed mapping size.

 Fixes the main part of PR kern/50060.  Still a display mode issue
 from one submitter, but it is almost certainly an unrelated issue.


 To generate a diff of this commit:
 cvs rdiff -u -r1.118 -r1.119 src/sys/dev/pci/agp_i810.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50060 CVS commit: [netbsd-7] src/sys/dev/pci
Date: Fri, 6 Nov 2015 22:55:10 +0000

 Module Name:	src
 Committed By:	riz
 Date:		Fri Nov  6 22:55:10 UTC 2015

 Modified Files:
 	src/sys/dev/pci [netbsd-7]: agp_i810.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #1000):
 	sys/dev/pci/agp_i810.c: revision 1.119
 Fix mapping Intel graphics device registers.
 - Accept either 32-bit or 64-bit mappings for all devices.
 - Let the device always dictate size of the mapping.
 - Explain why we don't have a statically fixed mapping size.
 Fixes the main part of PR kern/50060.  Still a display mode issue
 from one submitter, but it is almost certainly an unrelated issue.


 To generate a diff of this commit:
 cvs rdiff -u -r1.112.2.3 -r1.112.2.4 src/sys/dev/pci/agp_i810.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Fri, 06 Nov 2015 23:06:27 +0000
State-Changed-Why:
Fix committed, pullups applied.

nisimura@, can you confirm that this fixes the problem for you too?

gson@, can you file a separate PR for the LCD issue?


From: "nisimura@netbsd.org" <locore64@48gou.jp>
To: gnats-bugs@netbsd.org
Cc: Taylor R Campbell <riastradh@netbsd.org>, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, 
	nisimura@netbsd.org, gson@gson.org
Subject: Re: kern/50060 (kernel crash with i915drmksm on Intel 965Q)
Date: Wed, 11 Nov 2015 02:33:44 +0900

 It has been working on my NANAO.  You can close this PR.

 -nisimura

State-Changed-From-To: feedback->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Tue, 10 Nov 2015 17:53:24 +0000
State-Changed-Why:
submitter reports fixed


From: Andreas Gustafsson <gson@gson.org>
To: riastradh@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/50060 (kernel crash with i915drmksm on Intel 965Q)
Date: Fri, 20 Nov 2015 12:49:03 +0200

 Some weeks ago, riastradh@NetBSD.org wrote:
 > gson@, can you file a separate PR for the LCD issue?

 Done: kern/50452.
 -- 
 Andreas Gustafsson, gson@gson.org

From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50060 CVS commit: [netbsd-7] src/sys/external/bsd/drm2/dist/drm/i915
Date: Thu, 11 Feb 2016 23:23:11 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Thu Feb 11 23:23:11 UTC 2016

 Modified Files:
 	src/sys/external/bsd/drm2/dist/drm/i915 [netbsd-7]: i915_dma.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #1091):
 	sys/external/bsd/drm2/dist/drm/i915/i915_dma.c: revisions 1.17, 1.18
 Zero out the guard for bus_space_unmap before calling i915_dma_cleanup() which
 calls i915_free_hws(), which then tries to unmap. Perhaps this fixes PR/50060.
 --
 fix the same bug on the linux side, print the error, and return the -tive
 error to mimick linux.


 To generate a diff of this commit:
 cvs rdiff -u -r1.10.2.4 -r1.10.2.5 \
     src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
