NetBSD Problem Report #50060
From www@NetBSD.org Sat Jul 18 01:28:56 2015
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 4B625A65B9
for <gnats-bugs@gnats.NetBSD.org>; Sat, 18 Jul 2015 01:28:56 +0000 (UTC)
Message-Id: <20150718012855.0E220A65B9@mollari.NetBSD.org>
Date: Sat, 18 Jul 2015 01:28:55 +0000 (UTC)
From: nisimura@netbsd.org
Reply-To: nisimura@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: kernel crash with i915drmksm on Intel 965Q
X-Send-Pr-Version: www-1.0
>Number: 50060
>Notify-List: gson@gson.org
>Category: kern
>Synopsis: kernel crash with i915drmksm on Intel 965Q
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: riastradh
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jul 18 01:30:01 +0000 2015
>Closed-Date: Tue Nov 10 17:53:24 +0000 2015
>Last-Modified: Thu Feb 11 23:25:01 +0000 2016
>Originator: Toru Nishimura
>Release: NetBSD 7.0_RC1
>Organization:
ALKYL Technology
>Environment:
NetBSD paq16 7.0_RC1 NetBSD 7.0_RC1 (GENERIC.201506190427Z) amd64
>Description:
Since the introduction of i915drmkms, the hp dc7700 has never booted successfully, because i915drmkms always crashes.
agp0 at pchb0: i965-family chipset
pci_mem_find: expected mem type 00000004, found 00000000
pci_mem_find: expected mem type 00000004, found 00000000
agp0: can't find MMIO registers
i915drmksm0 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
DRM error in i915_gmch_probe: failed to set up gmch
uvm_fault(0xffffffff8104b240, 0x7fc000231000, 2) -> e
fatal page fault in supervisor mode
trap type 6 code 2 rip ffffffff809cb323 cs 8 rflags 10246 cr2 7fc000231000 ilevel 8 rsp ffffffff812657e8
curlwp 0xffffffff81000540 pid 0.1 lowest kstack ffffffff812632c0
kernel: page fault trap, code=0
Stopped in pid 0.1 (system) at netbsd:_atomic_swap_64+0x3: xchgq %rax,0(%rdi)
- 6.1.5, which uses i915drm, has had no trouble so far.
- "boot -c / disable i915drmkms" is the only way to avoid the crash.
- The hp dc7700 is a pure Intel-inside product, supposedly one of the most standard desktop PCs.
Toru Nishimura / ALKYL Technology
>How-To-Repeat:
Boot an off-the-shelf NetBSD 7.0_RC1 kernel on a 965Q system. Mine happens to have an E6300/E6600 C2D.
>Fix:
>Release-Note:
>Audit-Trail:
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sat, 18 Jul 2015 19:11:25 +0000
On Sat, Jul 18, 2015 at 01:30:01AM +0000, nisimura@netbsd.org wrote:
> i915drmksm0 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
> DRM error in i915_gmch_probe: failed to set up gmch
Is this the real problem? (that is, the subsequent crash then reflects
bad handling of the failure condition) ... or is it supposed to be
harmless/noncritical?
> uvm_fault(0xffffffff8104b240, 0x7fc000231000, 2) -> e
> fatal page fault in supervisor mode
> trap type 6 code 2 rip ffffffff809cb323 cs 8 rflags 10246 cr2 7fc000231000 ilevel 8 rsp ffffffff812657e8
> curlwp 0xffffffff81000540 pid 0.1 lowest kstack ffffffff812632c0
> kernel: page fault trap, code=0
> Stopped in pid 0.1 (system) at netbsd:_atomic_swap_64+0x3: xchgq %rax,0(%rdi)
Can you send the stack trace?
--
David A. Holland
dholland@netbsd.org
From: Collector Mail <locore64@48gou.jp>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
nisimura@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 20 Jul 2015 13:09:34 +0900
>> i915drmksm0 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
>> DRM error in i915_gmch_probe: failed to set up gmch
>
> Is this the real problem?
No clue. Somehow older NetBSD kernel says something about "i915drm" and has
no trouble.
> Can you send the stack trace?
_atomic_swap_64() at netbsd:_atomic_swap_64+0x3
bus_space_reservation_unmap1() at netbsd:bus_space_reservation_unmap1+0xc2
bus_space_unmap() at netbsd:bus_space_unmap+0x38
i915_driver_load() at netbsd:i915_driver_load+0x971
drm_dev_register() at netbsd:drm_dev_register+0x87
drm_pci_attach() at netbsd:drm_pci_attach+0x2d5
i915drmkms_attach() at netbsd:i915drmkms_attach+0x92
config_attach_loc() at netbsd:config_attach_loc+0x16e
pci_probe_device() at netbsd:pci_probe_device+0x4ac
pci_enumerate_bus() at netbsd:pci_enumerate_bus+0x168
...
Toru Nishimura / ALKYL Technology
From: Collector Mail <locore64@48gou.jp>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
nisimura@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 20 Jul 2015 13:18:27 +0900
I found a dmesg from the very same machine, taken two years ago under
6.99.23 on Aug 30, 2013.
pchb0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
agp0 at pchb0: detected 7676k stolen memory
agp0: aperture at 0xe0000000, size 0x20000000
vga0 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation), using wskbd0
wsmux1: connecting to wsdisplay0
i915drm0 at vga0: Intel i965Q
i915drm0: AGP at 0xe0000000 256MB
i915drm0: Initialized i915 1.6.0 20080730
...
Toru Nishimura / ALKYL Technology
Responsible-Changed-From-To: kern-bug-people->riastradh
Responsible-Changed-By: riastradh@NetBSD.org
Responsible-Changed-When: Fri, 31 Jul 2015 18:45:25 +0000
Responsible-Changed-Why:
mine
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sat, 10 Oct 2015 22:13:38 +0300
I just tried to boot the NetBSD 7.0/amd64 install image on a PC with
an ASUS P5B-V motherboard, and it panicked with the same backtrace as
the one reported in PR 50060. This is a regression - 6.1.5 works fine
on this machine.
--
Andreas Gustafsson, gson@gson.org
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/50060 CVS commit: src/sys/external/bsd/drm2/dist/drm/i915
Date: Sat, 10 Oct 2015 15:29:44 -0400
Module Name: src
Committed By: christos
Date: Sat Oct 10 19:29:44 UTC 2015
Modified Files:
src/sys/external/bsd/drm2/dist/drm/i915: i915_dma.c
Log Message:
Zero out the guard for bus_space_unmap before calling i915_dma_cleanup() which
calls i915_free_hws(), which then tries to unmap. Perhaps this fixes PR/50060.
To generate a diff of this commit:
cvs rdiff -u -r1.16 -r1.17 src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
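
The idea in the log message above, clearing the guard so that a later cleanup call does not unmap something that was never mapped, can be sketched in plain userland C. This is only an illustration and not the committed change: `struct status_page`, `fake_unmap()`, `free_status_page()` and `abort_setup()` are hypothetical names, and the real code deals with bus_space handles rather than a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device's status-page bookkeeping. */
struct status_page {
	uint32_t gfx_addr;	/* guard: nonzero only while a mapping exists */
	int	 unmap_calls;	/* how often the fake unmap ran, for inspection */
};

/* Stand-in for the real unmap; here it only records that it ran. */
static void
fake_unmap(struct status_page *sp)
{
	sp->unmap_calls++;
}

/* Unmap at most once: consult the guard, clear it, then unmap. */
static void
free_status_page(struct status_page *sp)
{
	assert(sp != NULL);
	if (sp->gfx_addr != 0) {
		sp->gfx_addr = 0;
		fake_unmap(sp);
	}
}

/* Error path for a setup that never mapped anything. */
static void
abort_setup(struct status_page *sp)
{
	assert(sp != NULL);
	sp->gfx_addr = 0;	/* nothing was mapped: make cleanup skip the unmap */
	free_status_page(sp);
}
```

Calling `free_status_page()` twice on a mapped page unmaps once, and `abort_setup()` on a page with a stale guard value unmaps nothing.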
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, christos@NetBSD.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sun, 11 Oct 2015 19:25:11 +0300
I tried a -current kernel from source date 2015.10.10.19.35.15, which
includes christos' recent commit of i915_dma.c 1.18, but it also
panicked:
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0: vendor 8086 product 29a0 (rev. 0x02)
agp0 at pchb0: i965-family chipset
pci_mem_find: expected mem type 00000004, found 00000000
pci_mem_find: expected mem type 00000004, found 00000000
agp0: can't find MMIO registers
ppb0 at pci0 dev 1 function 0: vendor 8086 product 29a1 (rev. 0x02)
ppb0: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x16 @ 2.5GT/s
pci1 at ppb0 bus 1
i915drmkms0 at pci0 dev 2 function 0: vendor 8086 product 29a2 (rev. 0x02)
DRM error in i915_gmch_probe: failed to set up gmch
uvm_fault(0xffffffff811b27a0, 0x7fc000357000, 2) -> e
fatal page fault in supervisor mode
trap type 6 code 2 rip ffffffff80a80c43 cs 8 rflags 10246 cr2 7fc000357000 ilevel 8 rsp ffffffff8137e7c8
curlwp 0xffffffff8110d740 pid 0.1 lowest kstack 0xffffffff8137b2c0
kernel: page fault trap, code=0
Stopped in pid 0.1 (system) at netbsd:_atomic_swap_64+0x3: xchgq %rax,0(%rdi)
db{0}>
--
Andreas Gustafsson, gson@gson.org
From: christos@zoulas.com (Christos Zoulas)
To: Andreas Gustafsson <gson@gson.org>, gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sun, 11 Oct 2015 12:33:16 -0400
On Oct 11, 7:25pm, gson@gson.org (Andreas Gustafsson) wrote:
-- Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
| I tried a -current kernel from source date 2015.10.10.19.35.15, which
| includes christos' recent commit of i915_dma.c 1.18, but it also
| paniced:
|
| pci0 at mainbus0 bus 0: configuration mode 1
| pchb0 at pci0 dev 0 function 0: vendor 8086 product 29a0 (rev. 0x02)
| agp0 at pchb0: i965-family chipset
This seems to be a common problem:
| pci_mem_find: expected mem type 00000004, found 00000000
| pci_mem_find: expected mem type 00000004, found 00000000
| agp0: can't find MMIO registers
| ppb0 at pci0 dev 1 function 0: vendor 8086 product 29a1 (rev. 0x02)
| ppb0: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x16 @ 2.5GT/s
| pci1 at ppb0 bus 1
| i915drmkms0 at pci0 dev 2 function 0: vendor 8086 product 29a2 (rev. 0x02)
| DRM error in i915_gmch_probe: failed to set up gmch
| uvm_fault(0xffffffff811b27a0, 0x7fc000357000, 2) -> e
| fatal page fault in supervisor mode
| trap type 6 code 2 rip ffffffff80a80c43 cs 8 rflags 10246 cr2 7fc000357000 ilevel 8 rsp ffffffff8137e7c8
| curlwp 0xffffffff8110d740 pid 0.1 lowest kstack 0xffffffff8137b2c0
| kernel: page fault trap, code=0
| Stopped in pid 0.1 (system) at netbsd:_atomic_swap_64+0x3: xchgq %rax,0(%rdi)
| db{0}>
Backtrace? Same as before?
christos
From: Andreas Gustafsson <gson@gson.org>
To: christos@zoulas.com (Christos Zoulas)
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sun, 11 Oct 2015 19:36:13 +0300
Christos Zoulas wrote:
> Backtrace? Same as before?
Yes, same as before:
db{0}> bt
_atomic_swap_64() at netbsd:_atomic_swap_64+0x3
bus_space_reservation_unmap1() at netbsd:bus_space_reservation_unmap1+0xc6
bus_space_unmap() at netbsd:bus_space_unmap+0x38
i915_driver_load() at netbsd:i915_driver_load+0xa67
drm_dev_register() at netbsd:drm_dev_register+0x87
drm_pci_attach() at netbsd:drm_pci_attach+0x2db
i915drmkms_attach() at netbsd:i915drmkms_attach+0x9b
config_attach_loc() at netbsd:config_attach_loc+0x17a
pci_probe_device() at netbsd:pci_probe_device+0x4fa
pci_enumerate_bus() at netbsd:pci_enumerate_bus+0x168
pcirescan() at netbsd:pcirescan+0x43
pciattach() at netbsd:pciattach+0x193
config_attach_loc() at netbsd:config_attach_loc+0x17a
mp_pci_scan() at netbsd:mp_pci_scan+0xa4
mainbus_attach() at netbsd:mainbus_attach+0x2fa
config_attach_loc() at netbsd:config_attach_loc+0x17a
cpu_configure() at netbsd:cpu_configure+0x26
main() at netbsd:main+0x299
--
Andreas Gustafsson, gson@gson.org
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, riastradh@NetBSD.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, nisimura@netbsd.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Sun, 11 Oct 2015 12:48:17 -0400
On Oct 11, 4:40pm, gson@gson.org (Andreas Gustafsson) wrote:
-- Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
| The following reply was made to PR kern/50060; it has been noted by GNATS.
|
| From: Andreas Gustafsson <gson@gson.org>
| To: christos@zoulas.com (Christos Zoulas)
| Cc: gnats-bugs@NetBSD.org
| Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
| Date: Sun, 11 Oct 2015 19:36:13 +0300
|
| Christos Zoulas wrote:
| > Backtrace? Same as before?
|
| Yes, same as before:
There is only one bus_space_unmap() in i915_dma.c. Can you either put
printfs before and after it, or comment it out, to see whether that is
where it fails?
christos
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, christos@NetBSD.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 09:51:39 +0300
Christos wrote:
> There is only one bus_space_unmap() in i915_dma.c can you either put
> printfs before and after, or comment it out to see if it is there where
> it fails?
I added the printfs to the 2015.10.10.19.35.15 source, and a couple
more for context:
Index: i915_dma.c
===================================================================
RCS file: /bracket/repo/src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c,v
retrieving revision 1.18
diff -u -r1.18 i915_dma.c
--- i915_dma.c 10 Oct 2015 19:35:15 -0000 1.18
+++ i915_dma.c 11 Oct 2015 18:13:43 -0000
@@ -133,12 +133,14 @@
if (ring->status_page.gfx_addr) {
ring->status_page.gfx_addr = 0;
+ printf("pre unmap\n");
#ifdef __NetBSD__
bus_space_unmap(dev->pdev->pd_pa.pa_memt,
dev_priv->dri1.gfx_hws_cpu_bsh, 4096);
#else
iounmap(dev_priv->dri1.gfx_hws_cpu_addr);
#endif
+ printf("post unmap\n");
}
/* Need to rewrite hardware status page */
@@ -1612,6 +1614,8 @@
int ret = 0, mmio_bar, mmio_size;
uint32_t aperture_size;
+ printf("i915_driver_load entry\n");
+
info = (struct intel_device_info *) flags;
/* Refuse to load on gen6+ without kms enabled. */
@@ -1885,6 +1889,9 @@
spin_lock_destroy(&dev_priv->irq_lock);
#endif
kfree(dev_priv);
+
+ printf("i915_driver_load exit\n");
+
return ret;
}
Console output:
agp0 at pchb0: i965-family chipset
pci_mem_find: expected mem type 00000004, found 00000000
pci_mem_find: expected mem type 00000004, found 00000000
agp0: can't find MMIO registers
ppb0 at pci0 dev 1 function 0: vendor 8086 product 29a1 (rev. 0x02)
ppb0: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x16 @ 2.5GT/s
pci1 at ppb0 bus 1
i915drmkms0 at pci0 dev 2 function 0: vendor 8086 product 29a2 (rev. 0x02)
i915_driver_load entry
DRM error in i915_gmch_probe: failed to set up gmch
uvm_fault(0xffffffff811b27a0, 0x7fc000357000, 2) -> e
fatal page fault in supervisor mode
trap type 6 code 2 rip ffffffff80a80c73 cs 8 rflags 10246 cr2 7fc000357000 ilevel 8 rsp ffffffff8137e7c8
curlwp 0xffffffff8110d740 pid 0.1 lowest kstack 0xffffffff8137b2c0
kernel: page fault trap, code=0
Stopped in pid 0.1 (system) at netbsd:_atomic_swap_64+0x3: xchgq %rax,0(%rdi)
db{0}>
Looks like that is not where it fails.
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@gson.org>
To: christos@zoulas.com (Christos Zoulas), gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 15:24:30 +0300
I added another bunch of printfs to track down where the
bus_space_unmap() call happens.
i915_driver_load() calls i915_gem_gtt_init(), which
returns a nonzero value via this return statement:
ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
&gtt->mappable_base, &gtt->mappable_end);
if (ret)
return ret;
i915_driver_load() then takes the goto:
if (ret)
goto out_regs;
And the panic happens in the inline function pci_iounmap(), which
calls bus_space_unmap():
out_regs:
intel_uncore_fini(dev);
intel_uncore_destroy(dev);
pci_iounmap(dev->pdev, dev_priv->regs);
--
Andreas Gustafsson, gson@gson.org
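
The control flow traced in the message above is the usual goto-based unwinding of a partially completed attach. A minimal userland sketch of that pattern follows. Every name in it (`struct softc`, `map_regs()`, `unmap_regs()`, `probe_gtt()`, `driver_load()`) is hypothetical, and the mapping is modeled by a flag. It shows one way to keep an error label safe, by recording whether the register window was actually mapped and checking that before unmapping. It makes no claim about the cause of, or the fix for, the panic in this PR.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct softc {
	bool regs_mapped;	/* true only while the register window is mapped */
	int  unmaps;		/* number of unmap operations performed */
};

static int
map_regs(struct softc *sc, bool succeed)
{
	if (!succeed)
		return ENXIO;
	sc->regs_mapped = true;
	return 0;
}

static void
unmap_regs(struct softc *sc)
{
	assert(sc->regs_mapped);	/* unmapping an unmapped window is a bug */
	sc->regs_mapped = false;
	sc->unmaps++;
}

static int
probe_gtt(bool succeed)
{
	return succeed ? 0 : ENODEV;
}

/* Acquire in order; on failure, release in reverse order via labels. */
static int
driver_load(struct softc *sc, bool map_ok, bool gtt_ok)
{
	int ret;

	ret = map_regs(sc, map_ok);
	if (ret)
		goto out;
	ret = probe_gtt(gtt_ok);
	if (ret)
		goto out_regs;
	return 0;

out_regs:
	if (sc->regs_mapped)
		unmap_regs(sc);
out:
	return ret;
}
```

With `map_ok` true and `gtt_ok` false, the sketch takes the `out_regs` branch and unmaps exactly once; with `map_ok` false it never reaches the unmap.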
From: christos@zoulas.com (Christos Zoulas)
To: Andreas Gustafsson <gson@gson.org>, gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 09:17:22 -0400
On Oct 12, 3:24pm, gson@gson.org (Andreas Gustafsson) wrote:
-- Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
| I added another bunch of printfs to track down where the
| bus_space_unmap() call happens.
|
| i915_driver_load() calls i915_gem_gtt_init(), which
| returns a nonzero value via this return statement:
|
| ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
| &gtt->mappable_base, &gtt->mappable_end);
| if (ret)
| return ret;
|
| i915_driver_load() then takes the goto:
|
| if (ret)
| goto out_regs;
|
| And the panic happens in the inline function pci_iounmap(), which
| calls bus_space_unmap():
|
| out_regs:
| intel_uncore_fini(dev);
| intel_uncore_destroy(dev);
| pci_iounmap(dev->pdev, dev_priv->regs);
Thanks, this is very helpful...
christos
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, riastradh@NetBSD.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, nisimura@netbsd.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 15:03:35 -0400
On Oct 12, 12:25pm, gson@gson.org (Andreas Gustafsson) wrote:
-- Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
| The following reply was made to PR kern/50060; it has been noted by GNATS.
|
| From: Andreas Gustafsson <gson@gson.org>
| To: christos@zoulas.com (Christos Zoulas), gnats-bugs@NetBSD.org
| Cc:
| Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
| Date: Mon, 12 Oct 2015 15:24:30 +0300
|
| I added another bunch of printfs to track down where the
| bus_space_unmap() call happens.
|
| i915_driver_load() calls i915_gem_gtt_init(), which
| returns a nonzero value via this return statement:
|
| ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
| &gtt->mappable_base, &gtt->mappable_end);
| if (ret)
| return ret;
Unfortunately forcing an error there on my machine does not reproduce
the problem... Boot completes but I end up with no console which is a
different bug!
christos
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, nisimura@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 19:54:49 +0000
> Date: Mon, 12 Oct 2015 09:51:39 +0300
> From: Andreas Gustafsson <gson@gson.org>
>
> agp0: can't find MMIO registers
This is where the real problem is happening. Fixing the error
branches, as christos@ has been doing, is all well and good, but they
won't help to make the thing work. This is in agp_i810_attach in
sys/dev/pci/agp_i810.c, which is a twisty maze of intermingled
device-specific logic. I have the data sheets if anyone wants to take
a closer look.
Can you show `pcictl pci0 dump -b 0 -d 0 -f 2'?
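
The `pci_mem_find: expected mem type 00000004, found 00000000` lines that appear just before `agp0: can't find MMIO registers` in the boot output print two memory-type values. Assuming they are the type field of a PCI memory base address register (bits 2:1 in the PCI specification's BAR layout), 00000004 denotes a 64-bit BAR and 00000000 a 32-bit one. The sketch below decodes that field. The macro, enum and function names are made up for the example and are not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of a PCI BAR (hypothetical macro names). */
#define BAR_SPACE_IO		0x00000001u	/* bit 0: 1 = I/O space, 0 = memory */
#define BAR_MEM_TYPE_MASK	0x00000006u	/* bits 2:1: memory decoder type */

enum bar_mem_type {
	BAR_NOT_MEMORY	 = -1,	/* I/O space BAR, no memory type field */
	BAR_MEM_32BIT	 = 0x0,
	BAR_MEM_32BIT_1M = 0x2,	/* legacy: must be placed below 1MB */
	BAR_MEM_64BIT	 = 0x4,
	BAR_MEM_RESERVED = 0x6
};

/* Extract the memory type from a raw BAR value. */
static enum bar_mem_type
bar_mem_type(uint32_t bar)
{
	if (bar & BAR_SPACE_IO)
		return BAR_NOT_MEMORY;
	return (enum bar_mem_type)(bar & BAR_MEM_TYPE_MASK);
}

/* Human-readable name for a decoded type. */
static const char *
bar_mem_type_name(enum bar_mem_type t)
{
	switch (t) {
	case BAR_MEM_32BIT:	return "32-bit";
	case BAR_MEM_32BIT_1M:	return "32-bit, below 1MB";
	case BAR_MEM_64BIT:	return "64-bit";
	case BAR_MEM_RESERVED:	return "reserved";
	case BAR_NOT_MEMORY:	return "not a memory BAR";
	}
	return "unknown";
}
```

Read this way, the message says that the code expected a 64-bit memory BAR and found a 32-bit one.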
From: Andreas Gustafsson <gson@gson.org>
To: Taylor R Campbell <riastradh@NetBSD.org>, gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 23:47:38 +0300
Taylor R Campbell wrote:
> Can you show `pcictl pci0 dump -b 0 -d 0 -f 2'?
Since NetBSD/amd64 7.0 doesn't boot, I booted NetBSD/i386 5.99.60 that
I happened to have on a USB stick and ran "pcictl pci0 dump -b 0 -d 0
-f 2" there, but it printed nothing. It doesn't look like device 0
has a function 2, neither in the 5.99.60 nor the 7.0 dmesg. Did you
perhaps mean "-f 0"?
The full dmesg from 5.99.60 is below for your reference.
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
NetBSD 5.99.60 (GENERIC) #0: Sun Jan 22 00:24:44 EET 2012
gson@guru.araneus.fi:/bracket/work/2012.01.21.17.12.56/obj/sys/arch/i386/compile/GENERIC
total memory = 2039 MB
avail memory = 1992 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
System manufacturer System Product Name (System Version)
mainbus0 (root)
cpu0 at mainbus0 apid 0: Intel(R) Core(TM)2 Quad CPU @ 2.40GHz, id 0x6f7
cpu1 at mainbus0 apid 1: Intel(R) Core(TM)2 Quad CPU @ 2.40GHz, id 0x6f7
cpu2 at mainbus0 apid 2: Intel(R) Core(TM)2 Quad CPU @ 2.40GHz, id 0x6f7
cpu3 at mainbus0 apid 3: Intel(R) Core(TM)2 Quad CPU @ 2.40GHz, id 0x6f7
ioapic0 at mainbus0 apid 4: pa 0xfec00000, version 20, 24 pins
acpi0 at mainbus0: Intel ACPICA 20110623
acpi0: X/RSDT: OemId <MSTEST,TESTONLY,11000726>, AslId <MSFT,00000097>
ACPI Warning: Incorrect checksum in table [OEMB] - 0x60, should be 0x57 (20110623/tbutils-282)
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
hpet0 at acpi0: high precision event timer (mem 0xfed00000-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
MCH (PNP0C01) at acpi0 not configured
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
npx1 at acpi0 (COPR, PNP0C04): io 0xf0-0xff irq 13
npx1: reported by CPUID; using exception 16
UAR1 (PNP0501) at acpi0 not configured
FDC (PNP0700) at acpi0 not configured
SIOR (PNP0C02) at acpi0 not configured
RMSC (PNP0C02) at acpi0 not configured
aibs0 at acpi0 (ASOC, ATK0110-16843024): ASUSTeK AI Booster
OMSC (PNP0C02) at acpi0 not configured
PCIE (PNP0C02) at acpi0 not configured
RMEM (PNP0C01) at acpi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
apm0 at acpi0: Power Management spec V1.2
attimer1: attached to pcppi1
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x29a0 (rev. 0x02)
agp0 at pchb0: detected 7676k stolen memory
agp0: aperture at 0xd0000000, size 0x20000000
ppb0 at pci0 dev 1 function 0: vendor 0x8086 product 0x29a1 (rev. 0x02)
ppb0: PCI Express 1.0 <Root Port of PCI-E Root Complex>
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
vga1 at pci0 dev 2 function 0: vendor 0x8086 product 0x29a2 (rev. 0x02)
vga1: WARNING: ignoring 64-bit BAR @ 0x18
wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
i915drm0 at vga1: Intel i965G
i915drm0: AGP at 0xd0000000 256MB
i915drm0: Initialized i915 1.6.0 20080730
vendor 0x8086 product 0x29a4 (miscellaneous communications, revision 0x02) at pci0 dev 3 function 0 not configured
uhci0 at pci0 dev 26 function 0: vendor 0x8086 product 0x2834 (rev. 0x02)
uhci0: interrupting at ioapic0 pin 16
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 26 function 1: vendor 0x8086 product 0x2835 (rev. 0x02)
uhci1: interrupting at ioapic0 pin 17
usb1 at uhci1: USB revision 1.0
ehci0 at pci0 dev 26 function 7: vendor 0x8086 product 0x283a (rev. 0x02)
ehci0: interrupting at ioapic0 pin 18
ehci0: BIOS has given up ownership
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1
usb2 at ehci0: USB revision 2.0
hdaudio0 at pci0 dev 27 function 0: HD Audio Controller
hdaudio0: interrupting at ioapic0 pin 22
hdafg0 at hdaudio0: ADI AD1988A
hdafg0: DAC00 8ch: Speaker [Jack]
hdafg0: ADC01 2ch: CD [Built-In], Line In [Jack], Mic In [Jack]
hdafg0: DAC02 2ch: HP Out [Jack]
hdafg0: DIG03 2ch: SPDIF Out [Jack]
hdafg0: 8ch/2ch 8000Hz 11025Hz 16000Hz 22050Hz 32000Hz 44100Hz 48000Hz 88200Hz 96000Hz 192000Hz PCM16 PCM20 PCM24 AC3
audio0 at hdafg0: full duplex, playback, capture, independent
ppb1 at pci0 dev 28 function 0: vendor 0x8086 product 0x283f (rev. 0x02)
ppb1: PCI Express 1.0 <Root Port of PCI-E Root Complex>
pci2 at ppb1 bus 3
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
ppb2 at pci0 dev 28 function 4: vendor 0x8086 product 0x2847 (rev. 0x02)
ppb2: PCI Express 1.0 <Root Port of PCI-E Root Complex>
pci3 at ppb2 bus 2
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
jmide0 at pci3 dev 0 function 0: vendor 0x197b product 0x2363
jmide0: 1 PATA port, 2 SATA ports
jmide0: interrupting at ioapic0 pin 16
ahcisata0 at jmide0
ahcisata0: AHCI revision 1.0, 2 ports, 32 slots, CAP 0xc722ff01<PSC,SSC,PMD,SPM,ISS=0x2=Gen2,SCLO,SAL,SALP,SNCQ,S64A>
atabus0 at ahcisata0 channel 0
atabus1 at ahcisata0 channel 1
jmide0: PCI IDE interface used
jmide0: bus-master DMA support present
jmide0: primary channel wired to native-PCI mode
jmide0: primary channel is unused
jmide0: secondary channel wired to native-PCI mode
jmide0: secondary channel is PATA
atabus2 at jmide0 channel 1
uhci2 at pci0 dev 29 function 0: vendor 0x8086 product 0x2830 (rev. 0x02)
uhci2: interrupting at ioapic0 pin 23
usb3 at uhci2: USB revision 1.0
uhci3 at pci0 dev 29 function 1: vendor 0x8086 product 0x2831 (rev. 0x02)
uhci3: interrupting at ioapic0 pin 19
usb4 at uhci3: USB revision 1.0
uhci4 at pci0 dev 29 function 2: vendor 0x8086 product 0x2832 (rev. 0x02)
uhci4: interrupting at ioapic0 pin 18
usb5 at uhci4: USB revision 1.0
ehci1 at pci0 dev 29 function 7: vendor 0x8086 product 0x2836 (rev. 0x02)
ehci1: interrupting at ioapic0 pin 23
ehci1: EHCI version 1.0
ehci1: companion controllers, 2 ports each: uhci2 uhci3 uhci4
usb6 at ehci1: USB revision 2.0
ppb3 at pci0 dev 30 function 0: vendor 0x8086 product 0x244e (rev. 0xf2)
pci4 at ppb3 bus 4
pci4: i/o space, memory space enabled
fwohci0 at pci4 dev 3 function 0: vendor 0x1106 product 0x3044 (rev. 0xc0)
fwohci0: interrupting at ioapic0 pin 21
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:11:d8:00:01:55:27:ef
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
ieee1394if0 at fwohci0: IEEE1394 bus
fwip0 at ieee1394if0: IP over IEEE1394
fwohci0: Initiate bus reset
skc0 at pci4 dev 4 function 0: ioapic0 pin 19
skc0: interrupt moderation is 0 us
skc0: Marvell Yukon Lite Gigabit Ethernet rev. (0x9)
sk0 at skc0 port A: Ethernet address 00:1b:fc:9e:0f:b4
makphy0 at sk0 phy 0: Marvell 88E1011 Gigabit PHY, rev. 5
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ichlpcib0 at pci0 dev 31 function 0: vendor 0x8086 product 0x2810 (rev. 0x02)
timecounter: Timecounter "ichlpcib0" frequency 3579545 Hz quality 1000
ichlpcib0: 24-bit timer
ichlpcib0: TCO (watchdog) timer configured.
gpio0 at ichlpcib0: 64 pins
piixide0 at pci0 dev 31 function 2: Intel 82801H Serial ATA Controller (ICH8) (rev. 0x02)
piixide0: bus-master DMA support present
piixide0: primary channel configured to native-PCI mode
piixide0: using ioapic0 pin 19 for native-PCI interrupt
atabus3 at piixide0 channel 0
piixide0: secondary channel configured to native-PCI mode
atabus4 at piixide0 channel 1
ichsmb0 at pci0 dev 31 function 3: vendor 0x8086 product 0x283e (rev. 0x02)
ichsmb0: interrupting at ioapic0 pin 18
iic0 at ichsmb0: I2C bus
piixide1 at pci0 dev 31 function 5: Intel 82801H Serial ATA Controller (ICH8) (rev. 0x02)
piixide1: bus-master DMA support present
piixide1: primary channel wired to native-PCI mode
piixide1: using ioapic0 pin 19 for native-PCI interrupt
atabus5 at piixide1 channel 0
piixide1: secondary channel wired to native-PCI mode
atabus6 at piixide1 channel 1
isa0 at ichlpcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
acpicpu0 at cpu0: ACPI CPU
acpicpu0: C1: HLT, lat 0 us, pow 0 mW
acpicpu0: P0: FFH, lat 10 us, pow 88000 mW, 2394 MHz
acpicpu0: P1: FFH, lat 10 us, pow 56320 mW, 1596 MHz
coretemp0 at cpu0: thermal sensor, 1 C resolution
acpicpu1 at cpu1: ACPI CPU
coretemp1 at cpu1: thermal sensor, 1 C resolution
acpicpu2 at cpu2: ACPI CPU
coretemp2 at cpu2: thermal sensor, 1 C resolution
acpicpu3 at cpu3: ACPI CPU
coretemp3 at cpu3: thermal sensor, 1 C resolution
fwohci0: BUS reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
ieee1394if0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
ieee1394if0: bus manager 0
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
timecounter: Timecounter "TSC" frequency 2400087870 Hz quality 3000
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
uhub0 at usb0: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb1: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub2 at usb2: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 4 ports with 4 removable, self powered
uhub3 at usb3: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
uhub4 at usb4: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub4: 2 ports with 2 removable, self powered
uhub5 at usb5: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub5: 2 ports with 2 removable, self powered
uhub6 at usb6: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub6: 6 ports with 6 removable, self powered
ehci1: handing over low speed device on port 2 to uhci2
atapibus0 at atabus2: 2 targets
cd0 at atapibus0 drive 0: <TSSTcorpCD/DVDW SH-S182D, , SB04> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd0(jmide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
umass0 at uhub6 port 4 configuration 1 interface 0
umass0: LEXAR JD FIREFLY, rev 2.00/11.00, addr 2
umass0: using SCSI over Bulk-Only
scsibus0 at umass0: 2 targets, 1 lun per target
sd0 at scsibus0 target 0 lun 0: <LEXAR, JD FIREFLY, 1100> disk removable
sd0: fabricating a geometry
sd0: 1920 MB, 1920 cyl, 64 head, 32 sec, 512 bytes/sect x 3932160 sectors
sd0: fabricating a geometry
Kernelized RAIDframe activated
boot device: sd0
root on sd0a dumps on sd0b
root file system type: ffs
uhidev0 at uhub3 port 2 configuration 1 interface 0
uhidev0: Holtek Semiconductor USB Keyboard, rev 1.10/3.10, addr 2, iclass 3/1
ukbd0 at uhidev0: 8 modifier keys, 6 key codes
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub3 port 2 configuration 1 interface 1
uhidev1: Holtek Semiconductor USB Keyboard, rev 1.10/3.10, addr 2, iclass 3/0
uhidev1: 2 report ids
uhid0 at uhidev1 reportid 1: input=1, output=0, feature=0
uhid1 at uhidev1 reportid 2: input=3, output=0, feature=0
skc0: interrupt moderation is 1000 us
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gson@netbsd.org, nisimura@netbsd.org
Cc: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 20:36:20 +0000
This is a multi-part message in MIME format.
--=_PEcRT7HpZC+rI8TOQS8CJoc9wLMBnnal
Can you please try the attached patch?
--=_PEcRT7HpZC+rI8TOQS8CJoc9wLMBnnal
Content-Type: text/plain; charset="ISO-8859-1"; name="agp"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="agp.patch"
Index: agp_i810.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/agp_i810.c,v
retrieving revision 1.118
diff -p -u -r1.118 agp_i810.c
--- agp_i810.c 5 Apr 2015 12:55:20 -0000 1.118
+++ agp_i810.c 12 Oct 2015 20:35:42 -0000
@@ -420,7 +420,9 @@ agp_i810_attach(device_t parent, device_
case CHIP_I965:
apbase = AGP_I965_GMADR;
mmadr_bar = AGP_I965_MMADR;
- mmadr_type |= PCI_MAPREG_MEM_TYPE_64BIT;
+ mmadr_type |= PCI_MAPREG_MEM_TYPE_MASK &
+ pci_mapreg_type(isc->vga_pa.pa_pc,
+ isc->vga_pa.pa_tag, AGP_I965_MMADR);
if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
AGP_I965_MMADR, mmadr_type, NULL, &isc->size, NULL))
isc->size = 512*1024; /* XXX */
--=_PEcRT7HpZC+rI8TOQS8CJoc9wLMBnnal--
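The one-line change above can be illustrated outside the kernel. The following is a minimal sketch, assuming the "expected mem type 00000004, found 00000000" message in the original report comes from comparing the requested and reported address-width bits of the BAR. The `MAPREG_*` constants are local stand-ins that follow the PCI BAR bit layout, and all three helper functions are hypothetical names for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins following the PCI BAR bit layout (not kernel headers). */
#define MAPREG_TYPE_MEM        0x0u  /* bit 0 clear: memory space */
#define MAPREG_TYPE_IO         0x1u  /* bit 0 set: I/O space */
#define MAPREG_MEM_TYPE_MASK   0x6u  /* bits 2:1: address width */
#define MAPREG_MEM_TYPE_64BIT  0x4u

/* Hypothetical: what the old code always asked for on CHIP_I965. */
static uint32_t old_requested_type(void)
{
    return MAPREG_TYPE_MEM | MAPREG_MEM_TYPE_64BIT;
}

/* Hypothetical: what the patched code asks for, taken from the BAR itself. */
static uint32_t patched_requested_type(uint32_t reported)
{
    assert((reported & MAPREG_TYPE_IO) == 0); /* memory BARs only */
    return MAPREG_TYPE_MEM | (MAPREG_MEM_TYPE_MASK & reported);
}

/* Assumed check: the width bits requested must equal the width bits found. */
static int width_matches(uint32_t requested, uint32_t reported)
{
    return (requested & MAPREG_MEM_TYPE_MASK) ==
        (reported & MAPREG_MEM_TYPE_MASK);
}
```

With a BAR that reports a 32-bit memory type (width bits 0), `width_matches(old_requested_type(), 0x0)` is false, which would correspond to the failure in the report, while `width_matches(patched_requested_type(0x0), 0x0)` is true. A 64-bit BAR matches under both versions.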
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, nisimura@netbsd.org
Cc:
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 20:10:50 +0000
> Date: Mon, 12 Oct 2015 19:54:49 +0000
> From: Taylor R Campbell <riastradh@NetBSD.org>
>
> Can you show `pcictl pci0 dump -b 0 -d 0 -f 2'?
Sorry, I meant: pcictl pci0 dump -b 0 -d 2 -f 0
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gson@netbsd.org, nisimura@netbsd.org
Cc: gnats-bugs@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Mon, 12 Oct 2015 23:07:57 +0000
This is a multi-part message in MIME format.
--=_I2aphbwziqku9cDX0Sh0XLwUoEpS6aeG
Here's a slightly better, if more ambitious, patch, which factors out
the relevant logic and leaves a comment explaining what is going on
here so it won't take me a whole day to figure this out next time it
bites us.
--=_I2aphbwziqku9cDX0Sh0XLwUoEpS6aeG
Content-Type: text/plain; charset="ISO-8859-1"; name="agp1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="agp1.patch"
Index: agp_i810.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/agp_i810.c,v
retrieving revision 1.118
diff -p -u -r1.118 agp_i810.c
--- agp_i810.c 5 Apr 2015 12:55:20 -0000 1.118
+++ agp_i810.c 12 Oct 2015 23:06:42 -0000
@@ -407,47 +407,91 @@ agp_i810_attach(device_t parent, device_
}
aprint_naive("\n");

- mmadr_type = PCI_MAPREG_TYPE_MEM;
+ /* Discriminate on the chipset to choose the relevant BARs. */
switch (isc->chiptype) {
case CHIP_I915:
case CHIP_G33:
apbase = AGP_I915_GMADR;
mmadr_bar = AGP_I915_MMADR;
- isc->size = 512*1024;
gtt_bar = AGP_I915_GTTADR;
gtt_off = ~(bus_size_t)0; /* XXXGCC */
break;
case CHIP_I965:
apbase = AGP_I965_GMADR;
mmadr_bar = AGP_I965_MMADR;
- mmadr_type |= PCI_MAPREG_MEM_TYPE_64BIT;
- if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
- AGP_I965_MMADR, mmadr_type, NULL, &isc->size, NULL))
- isc->size = 512*1024; /* XXX */
gtt_bar = 0;
gtt_off = AGP_I965_GTT;
break;
case CHIP_G4X:
apbase = AGP_I965_GMADR;
mmadr_bar = AGP_I965_MMADR;
- mmadr_type |= PCI_MAPREG_MEM_TYPE_64BIT;
- if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
- AGP_I965_MMADR, mmadr_type, NULL, &isc->size, NULL))
- isc->size = 512*1024; /* XXX */
gtt_bar = 0;
gtt_off = AGP_G4X_GTT;
break;
default:
apbase = AGP_I810_GMADR;
mmadr_bar = AGP_I810_MMADR;
- if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
- AGP_I810_MMADR, mmadr_type, NULL, &isc->size, NULL))
- isc->size = 512*1024; /* XXX */
gtt_bar = 0;
gtt_off = AGP_I810_GTT;
break;
}

+ /*
+ * Ensure the MMIO BAR is, in fact, a memory BAR.
+ *
+ * XXX This is required because we use pa_memt below. It is
+ * not a priori clear to me there is any other reason to
+ * require this.
+ */
+ mmadr_type = pci_mapreg_type(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
+ mmadr_bar);
+ if ((mmadr_type & PCI_MAPREG_TYPE_MEM) != PCI_MAPREG_TYPE_MEM) {
+ aprint_error_dev(self, "non-memory device MMIO registers\n");
+ error = ENXIO;
+ goto fail1;
+ }
+
+ /*
+ * Determine the size of the MMIO registers.
+ *
+ * XXX The size of the MMIO registers we use is statically
+ * determined, as a function of the chipset, by the driver's
+ * implementation.
+ *
+ * On some chipsets, the GTT is part of the MMIO register BAR.
+ * We would like to map the GTT separately, so that we can map
+ * it prefetchable, which we can't do with the MMIO registers.
+ * Consequently, we would especially like to map a fixed size
+ * of MMIO registers, not just whatever size the BAR says.
+ *
+ * However, old drm assumes that the combined GTT/MMIO register
+ * space is a single bus space mapping, so mapping them
+ * separately breaks that. Once we rip out old drm, we can
+ * replace the pci_mapreg_info call by the chipset switch.
+ */
+#if notyet
+ switch (isc->chiptype) {
+ case CHIP_I810:
+ case CHIP_I830:
+ case CHIP_I855:
+ case CHIP_I915:
+ case CHIP_G33:
+ case CHIP_I965:
+ case CHIP_G4X:
+ isc->size = 512*1024;
+ break;
+ case CHIP_SANDYBRIDGE:
+ case CHIP_IVYBRIDGE:
+ case CHIP_HASWELL:
+ isc->size = 2*1024*1024;
+ break;
+ }
+#else
+ if (pci_mapreg_info(isc->vga_pa.pa_pc, isc->vga_pa.pa_tag,
+ mmadr_bar, mmadr_type, NULL, &isc->size, NULL))
+ isc->size = 512*1024;
+#endif /* notyet */
+
/* Map (or, rather, find the address and size of) the aperture. */
if (isc->chiptype == CHIP_I965 || isc->chiptype == CHIP_G4X)
error = agp_i965_map_aperture(&isc->vga_pa, sc, apbase);
--=_I2aphbwziqku9cDX0Sh0XLwUoEpS6aeG--
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org, riastradh@NetBSD.org
Cc: nisimura@netbsd.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Tue, 13 Oct 2015 10:04:13 +0300
Taylor R Campbell wrote:
> Sorry, I meant: pcictl pci0 dump -b 0 -d 2 -f 0
PCI configuration registers:
Common header:
0x00: 0x29a28086 0x00900007 0x03000002 0x00000000
Vendor Name: Intel (0x8086)
Device Name: 82965G Integrated Graphics Device (0x29a2)
Command register: 0x0007
I/O space accesses: on
Memory space accesses: on
Bus mastering: on
Special cycles: off
MWI transactions: off
Palette snooping: off
Parity error checking: off
Address/data stepping: off
System error (SERR): off
Fast back-to-back transactions: off
Interrupt disable: off
Status register: 0x0090
Interrupt status: inactive
Capability List support: on
66 MHz capable: off
User Definable Features (UDF) support: off
Fast back-to-back capable: on
Data parity error detected: off
DEVSEL timing: fast (0x0)
Slave signaled Target Abort: off
Master received Target Abort: off
Master received Master Abort: off
Asserted System Error (SERR): off
Parity error detected: off
Class Name: display (0x03)
Subclass Name: VGA (0x00)
Interface: 0x00
Revision ID: 0x02
BIST: 0x00
Header Type: 0x00 (0x00)
Latency Timer: 0x00
Cache Line Size: 0x00
Type 0 ("normal" device) header:
0x10: 0xffa00000 0x00000000 0xd000000c 0x00000000
0x20: 0x0000ec01 0x00000000 0x00000000 0x820b1043
0x30: 0x00000000 0x00000090 0x00000000 0x0000010b
Base address register at 0x10
type: 32-bit nonprefetchable memory
base: 0xffa00000, not sized
Base address register at 0x14
not implemented(?)
Base address register at 0x18
type: 64-bit prefetchable memory
base: 0x00000000d0000000, not sized
Base address register at 0x20
type: i/o
base: 0x0000ec00, not sized
Base address register at 0x24
not implemented(?)
Cardbus CIS Pointer: 0x00000000
Subsystem vendor ID: 0x1043
Subsystem ID: 0x820b
Expansion ROM Base Address: 0x00000000
Capability list pointer: 0x90
Reserved @ 0x38: 0x00000000
Maximum Latency: 0x00
Minimum Grant: 0x00
Interrupt pin: 0x01 (pin A)
Interrupt line: 0x0b
Capability register at 0x90
type: 0x05 (MSI)
Capability register at 0xd0
type: 0x01 (Power Management, rev. 1.0)
PCI Message Signaled Interrupt
Message Control register: 0x0000
MSI Enabled: no
Multiple Message Capable: no (1 vector)
Multiple Message Enabled: off (1 vector)
64 Bit Address Capable: no
Per-Vector Masking Capable: no
Message Address register: 0x00000000
Message Data register: 0x00000000
PCI Power Management Capabilities Register
Capabilities register: 0x0022
Version: 1.1
PME# clock: off
Device specific initialization: on
3.3V auxiliary current: self-powered
D1 power management state support: off
D2 power management state support: off
PME# support: 0x00
Control/status register: 0x0000
Power state: D0
PCI Express reserved: off
No soft reset: off
PME# assertion disabled
PME# status: off
Device-dependent header:
0x40: 0x00000000 0x000000e0 0x51090009 0x8900036e
0x50: 0x00300c86 0x0000004b 0x00000000 0x7f800000
0x60: 0x00020000 0x00000000 0x00000000 0x00000000
0x70: 0x00000000 0x00000000 0x00000000 0x00000000
0x80: 0x00000000 0x00000000 0x00000000 0x00000000
0x90: 0x0000d005 0x00000000 0x00000000 0x00000000
0xa0: 0x00001111 0x00000000 0x00000000 0x00000000
0xb0: 0x00000000 0x00000000 0x00000000 0x00000000
0xc0: 0x00000000 0x3c071f1f 0x10200080 0x000001c9
0xd0: 0x00220001 0x00000000 0x00000000 0x00020100
0xe0: 0x00000000 0x00000000 0x00008000 0x00000000
0xf0: 0x00030034 0x00000000 0x00040f90 0x7f79e0e4
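The BAR types that pcictl prints in the dump above follow from the low bits of each raw register value. The following is a minimal sketch of that decoding, based on the standard PCI BAR layout (bit 0 selects I/O or memory, bits 2:1 give the memory address width, bit 3 marks prefetchable memory). The names `bar_kind`, `bar_classify` and `bar_prefetchable` are hypothetical and are not taken from pcictl or the kernel.

```c
#include <assert.h>
#include <stdint.h>

enum bar_kind { BAR_IO, BAR_MEM32, BAR_MEM64, BAR_MEM_OTHER };

/* Classify a raw BAR value by its low bits. */
static enum bar_kind bar_classify(uint32_t bar)
{
    if (bar & 0x1u)
        return BAR_IO;              /* bit 0 set: I/O space */
    switch ((bar >> 1) & 0x3u) {    /* bits 2:1: address width */
    case 0x0:
        return BAR_MEM32;
    case 0x2:
        return BAR_MEM64;
    default:
        return BAR_MEM_OTHER;       /* legacy below-1MB or reserved */
    }
}

/* Bit 3 marks a memory BAR as prefetchable. */
static int bar_prefetchable(uint32_t bar)
{
    assert((bar & 0x1u) == 0);      /* only meaningful for memory BARs */
    return (bar & 0x8u) != 0;
}
```

Applied to the values in the dump, 0xffa00000 at 0x10 classifies as 32-bit nonprefetchable memory, 0xd000000c at 0x18 as 64-bit prefetchable memory, and 0x0000ec01 at 0x20 as I/O. The register BAR at 0x10 is therefore 32-bit on this device, which is the case the hard-coded 64-bit type did not allow for.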
From: Andreas Gustafsson <gson@NetBSD.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: nisimura@netbsd.org,
gnats-bugs@NetBSD.org
Subject: Re: kern/50060: kernel crash with i915drmksm on Intel 965Q
Date: Tue, 13 Oct 2015 13:04:11 +0300
Taylor R Campbell wrote:
> Here's a slightly better, if more ambitious, patch, which factors out
> the relevant logic and leaves a comment explaining what is going on
> here so it won't take me a whole day to figure this out next time it
> bites us.
I applied this patch, and my machine now boots. Thank you!
There is still a problem with the video mode, though: I have an LCD
display connected to the VGA port, and it is showing the following
message:
ATTENTION
OUT OF RANGE
H: 71.9 KHz V: 160 Hz
PLEASE CHANGE SIGNAL TIMING
I'm planning to use this machine with a serial console, so this is not
a big problem for me, but it probably will be for someone else.
--
Andreas Gustafsson, gson@NetBSD.org
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/50060 CVS commit: src/sys/dev/pci
Date: Tue, 13 Oct 2015 12:17:04 +0000
Module Name: src
Committed By: riastradh
Date: Tue Oct 13 12:17:04 UTC 2015
Modified Files:
src/sys/dev/pci: agp_i810.c
Log Message:
Fix mapping Intel graphics device registers.
- Accept either 32-bit or 64-bit mappings for all devices.
- Let the device always dictate size of the mapping.
- Explain why we don't have a statically fixed mapping size.
Fixes the main part of PR kern/50060. Still a display mode issue
from one submitter, but it is almost certainly an unrelated issue.
To generate a diff of this commit:
cvs rdiff -u -r1.118 -r1.119 src/sys/dev/pci/agp_i810.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/50060 CVS commit: [netbsd-7] src/sys/dev/pci
Date: Fri, 6 Nov 2015 22:55:10 +0000
Module Name: src
Committed By: riz
Date: Fri Nov 6 22:55:10 UTC 2015
Modified Files:
src/sys/dev/pci [netbsd-7]: agp_i810.c
Log Message:
Pull up following revision(s) (requested by riastradh in ticket #1000):
sys/dev/pci/agp_i810.c: revision 1.119
Fix mapping Intel graphics device registers.
- Accept either 32-bit or 64-bit mappings for all devices.
- Let the device always dictate size of the mapping.
- Explain why we don't have a statically fixed mapping size.
Fixes the main part of PR kern/50060. Still a display mode issue
from one submitter, but it is almost certainly an unrelated issue.
To generate a diff of this commit:
cvs rdiff -u -r1.112.2.3 -r1.112.2.4 src/sys/dev/pci/agp_i810.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Fri, 06 Nov 2015 23:06:27 +0000
State-Changed-Why:
Fix committed, pullups applied.
nisimura@, can you confirm that this fixes the problem for you too?
gson@, can you file a separate PR for the LCD issue?
From: "nisimura@netbsd.org" <locore64@48gou.jp>
To: gnats-bugs@netbsd.org
Cc: Taylor R Campbell <riastradh@netbsd.org>, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org,
nisimura@netbsd.org, gson@gson.org
Subject: Re: kern/50060 (kernel crash with i915drmksm on Intel 965Q)
Date: Wed, 11 Nov 2015 02:33:44 +0900
--089e012952fe9000290524331c46
Content-Type: text/plain; charset=UTF-8
It has been working on my NANAO. You can close this PR.
-nisimura
--089e012952fe9000290524331c46--
State-Changed-From-To: feedback->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Tue, 10 Nov 2015 17:53:24 +0000
State-Changed-Why:
submitter reports fixed
From: Andreas Gustafsson <gson@gson.org>
To: riastradh@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/50060 (kernel crash with i915drmksm on Intel 965Q)
Date: Fri, 20 Nov 2015 12:49:03 +0200
Some weeks ago, riastradh@NetBSD.org wrote:
> gson@, can you file a separate PR for the LCD issue?
Done: kern/50452.
--
Andreas Gustafsson, gson@gson.org
From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/50060 CVS commit: [netbsd-7] src/sys/external/bsd/drm2/dist/drm/i915
Date: Thu, 11 Feb 2016 23:23:11 +0000
Module Name: src
Committed By: snj
Date: Thu Feb 11 23:23:11 UTC 2016
Modified Files:
src/sys/external/bsd/drm2/dist/drm/i915 [netbsd-7]: i915_dma.c
Log Message:
Pull up following revision(s) (requested by riastradh in ticket #1091):
sys/external/bsd/drm2/dist/drm/i915/i915_dma.c: revisions 1.17, 1.18
Zero out the guard for bus_space_unmap before calling i915_dma_cleanup() which
calls i915_free_hws(), which then tries to unmap. Perhaps this fixes PR/50060.
--
Fix the same bug on the Linux side, print the error, and return the
negative error to mimic Linux.
To generate a diff of this commit:
cvs rdiff -u -r1.10.2.4 -r1.10.2.5 \
src/sys/external/bsd/drm2/dist/drm/i915/i915_dma.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted:
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.