NetBSD Problem Report #46596

From gson@gson.org  Tue Jun 12 20:01:31 2012
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 1CF5F63B882
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 12 Jun 2012 20:01:31 +0000 (UTC)
Message-Id: <20120612195857.0E04675E8C@guava.gson.org>
Date: Tue, 12 Jun 2012 22:58:56 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@gnats.NetBSD.org
Subject: ehci interrupt storm
X-Send-Pr-Version: 3.95

>Number:         46596
>Category:       kern
>Synopsis:       ehci interrupt storm triggered by VGA interrupt
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    jdolecek
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jun 12 20:05:00 +0000 2012
>Closed-Date:    Wed Jun 17 14:08:57 +0000 2020
>Last-Modified:  Wed Jun 17 14:15:01 +0000 2020
>Originator:     Andreas Gustafsson
>Release:        NetBSD 5.1
>Organization:
>Environment:
System: NetBSD guido.araneus.fi 5.1 NetBSD 5.1 (GENERIC) #0: Sat Nov 6 13:19:33 UTC 2010 builds@b6.netbsd.org:/home/builds/ab/netbsd-5-1-RELEASE/amd64/201011061943Z-obj/home/builds/ab/netbsd-5-1-RELEASE/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

I have a quad-core Intel Core i5 machine running NetBSD/amd64 5.1.
Running "top", I noticed that one of the cores is spending 30% of its
time in the "interrupt" state even though the machine is otherwise
idle.  Except for this, the machine is operating normally.  The
abnormal interrupt load was not there when the machine was first
booted, but appeared at some later time.

Output of "top":

  load averages:  0.49,  0.35,  0.26;               up 48+20:10:21
  37 processes: 31 sleeping, 3 stopped, 3 on CPU
  CPU0 states:  0.0% user,  0.0% nice,  0.0% system, 30.3% interrupt, 69.7% idle
  CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
  CPU2 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
  CPU3 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
  Memory: 9510M Act, 1478M Inact, 2720K Wired, 27M Exec, 7574M File, 2861M Free
  Swap: 16G Total, 1935M Used, 14G Free

Output of "vmstat 1":

  procs    memory      page                       disks   faults      cpu
  r b w    avm    fre  flt  re  pi   po   fr   sr w0 c0   in   sy  cs us sy id
  1 0 0 11719232 2929396 -363 1  0   26  147  190 95  0 -111  -91 -315 41 5 54
  0 0 0 11719236 2929392 1   0   0    0    0    0  0  0 232564 2118 111 0 6 94
  0 0 0 11719236 2929392 0   0   0    0    0    0  0  0 233937 2028 114 0 7 93
  0 0 0 11719236 2929392 0   0   0    0    0    0  0  0 233731 2028 114 0 9 91
  1 0 0 11719236 2929392 0   0   0    0    0    0  0  0 234077 2072 114 0 8 92
  0 0 0 11719236 2929392 0   0   0    0    0    0  0  0 233330 2071 116 0 7 93

Output of "vmstat -i":

  Interrupt                                     total     rate
  global TLB IPI                          11088118152     2627
  ioapic0 pin 16                          12004189025     2844
  ioapic0 pin 22                                    6        0
  ioapic0 pin 17                            399449104       94
  ioapic0 pin 18                             20577942        4
  ioapic0 pin 23                                   22        0
  ioapic0 pin 19                                11907        0
  Total                                   23512346158     5571

Rerunning "vmstat -i" repeatedly, the totals for "ioapic0 pin 16" and
"Total" are rapidly increasing, but the one for "global TLB IPI" is
not.
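
The comparison described above can be made mechanical. The following is a minimal sketch, assuming the "vmstat -i" layout quoted in this report (source name, then total, then rate). The function name intr_delta and the snapshot file arguments are hypothetical and not part of the original report.

```shell
#!/bin/sh
# intr_delta BEFORE AFTER
# Print how much each interrupt source's total grew between two saved
# "vmstat -i" snapshots. Assumes the last two fields of every data line
# are the total and the rate, as in the output quoted in this report.
intr_delta() {
    awk '
        # Skip the header line and blank or malformed lines.
        $NF == "rate" || NF < 3 { next }
        {
            total = $(NF - 1)
            # Rebuild the source name from all fields before the total.
            name = $1
            for (i = 2; i <= NF - 2; i++) name = name " " $i
        }
        # First file: remember the totals.
        FNR == NR { before[name] = total; next }
        # Second file: print the growth for sources seen in both files.
        name in before { printf "%s %.0f\n", name, total - before[name] }
    ' "$1" "$2"
}
```

For example, saving "vmstat -i" to two files a few seconds apart and running "intr_delta before after" would list the growth per source, so a storming pin stands out from counters that are merely large.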

According to dmesg, "ioapic0 pin 16" belongs to ehci0:

  ehci0: interrupting at ioapic0 pin 16

Output from "usbdevs -v":

  Controller /dev/usb0:
  addr 1: high speed, self powered, config 1, EHCI root hub(0x0000), vendor 0x8086(0x8086), rev 1.00
   port 1 addr 2: high speed, self powered, config 1, product 0x0024(0x0024), vendor 0x8087(0x8087), rev 0.00
    port 1 powered
    port 2 powered
    port 3 powered
    port 4 powered
    port 5 powered
    port 6 addr 3: low speed, power 100 mA, config 1, product 0x0103(0x0103), vendor 0x1267(0x1267), rev 1.01
   port 2 powered
  Controller /dev/usb1:
  addr 1: high speed, self powered, config 1, EHCI root hub(0x0000), vendor 0x8086(0x8086), rev 1.00
   port 1 addr 2: high speed, self powered, config 1, product 0x0024(0x0024), vendor 0x8087(0x8087), rev 0.00
    port 1 powered
    port 2 powered
    port 3 powered
    port 4 powered
    port 5 powered
    port 6 powered
    port 7 powered
    port 8 powered
   port 2 powered

>How-To-Repeat:

I have no idea.

>Fix:

>Release-Note:

>Audit-Trail:
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/46596: ehci interrupt storm
Date: Tue, 18 Dec 2012 11:54:38 +0200

 The problem persists after upgrading to 6.0.  In about six days of
 uptime, my machine has racked up more than 127 billion (!) interrupts:

   $ vmstat -i
   interrupt                                     total     rate
   TLB shootdown                             165106920      322
   cpu0 timer                                 51120744       99
   ioapic0 pin 20                              4313163        8
   ioapic0 pin 16                         127086316826   248509
   ioapic0 pin 17                                    1        0
   ioapic0 pin 18                                86758        0
   ioapic0 pin 23                                  872        0
   ioapic0 pin 19                             20469816       40
   Total                                  127327415100   248981

   $ dmesg | grep 'pin 16'
   ehci0: interrupting at ioapic0 pin 16

 -- 
 Andreas Gustafsson, gson@gson.org

From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: tsutsui@ceres.dti.ne.jp
Subject: Re: kern/46596: ehci interrupt storm
Date: Wed, 19 Dec 2012 00:39:40 +0900

 Does your machine have com at isa?
 If so, what happens if you disable it and enable com at acpi?
 I had one machine (though it's AMD based) that hung at usb
 attach with com at isa but worked with com at acpi.
 (Changing BIOS interrupt settings didn't help)
 ---
 Izumi Tsutsui

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org,
    gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: Re: kern/46596: ehci interrupt storm
Date: Tue, 18 Dec 2012 19:43:30 +0200

 Izumi Tsutsui wrote:
 >  Does your machine have com at isa?

 It has no com ports on the motherboard, but two on a PCI card:

   puc0 at pci2 dev 1 function 0: NetMos NM9865 1 UART (com)
   com2 at puc0 port 0: interrupting at ioapic0 pin 17
   com2: ns16550a, working fifo
   puc1 at pci2 dev 1 function 1: NetMos NM9865 1 UART (com)
   com3 at puc1 port 0: interrupting at ioapic0 pin 18
   com3: ns16550a, working fifo

 That PCI card was installed only after the initial PR was filed
 against NetBSD 5.1, so I doubt it is the cause of the problem.

 The problem might be related to PR kern/46696, which also affects
 this particular machine.  The motherboard is an Intel DH67CLB3 H67.
 -- 
 Andreas Gustafsson, gson@gson.org

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: Nick Hudson <skrll@netbsd.org>
Subject: Re: kern/46596: ehci interrupt storm
Date: Tue, 17 Sep 2013 19:41:08 +0300

 One weird aspect of this bug is that the interrupt storm sometimes starts
 immediately at boot, and sometimes only after a seemingly random delay.

 I have set up the machine to run the command

   (date; vmstat 1 | head -n 10000) >>/root/vmstat-logs/vmstat.`date +%Y%m%d-%H%M%S`.log &

 from /etc/rc.local, and below are the first 100 lines from a typical
 log file specimen.  It shows the interrupt storm starting about 50
 seconds after rc.local was run; that's where the number in the "in"
 column jumps from close to zero to more than 240000 and then stays
 that way.

 I have no idea what triggers the transition.
 -- 
 Andreas Gustafsson, gson@gson.org

 Sat Aug  3 19:33:44 EEST 2013
  procs    memory      page                       disks   faults      cpu
  r b      avm    fre  flt  re  pi   po   fr   sr w0 w1   in   sy  cs us sy id
  1 1    10584 16071108 142  0   0    0    0    0 36  1   59 3727  68  0  2 98
  0 5    15644 16063272  0   0   0    0    0    0 174 0  193 7536   2  1  0 98
  2 0    19268 16055340 372  0   0    0    0    0 251 0  340 39028 26  9  1 90
  1 0    23824 16042956 2669 0   0    0    0    0 161 0  163 66916 128 12 2 86
  0 1    37168 16028664  0   0   0    0    0    0 297 0  300 9424   2  2  0 97
  0 0    41328 16024652  7   0   0    0    0    0 235 0  274 29752  6  7  0 93
  0 0    41328 16024760  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0  0  0    1   16   6  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0 85  0  174   16   2  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0  1  0    2   16   2  0 0 100
  0 0    41328 16024760  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    1   30   2  0 0 100
  procs    memory      page                       disks   faults      cpu
  r b      avm    fre  flt  re  pi   po   fr   sr w0 w1   in   sy  cs us sy id
  0 0    41444 16024644  0   0   0    0    0    0  0  0    1   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0 15  0   33   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    1   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  5  0   10   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  5  0   10   21   4  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  2  0    4   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0 10  0   21   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  9  0   20   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0 12  0   24   16   2  0 0 100
  procs    memory      page                       disks   faults      cpu
  r b      avm    fre  flt  re  pi   po   fr   sr w0 w1   in   sy  cs us sy id
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    1   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   9  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  0  0    0   16   2  0 0 100
  0 0    41444 16024644  0   0   0    0    0    0  9  0 173780 16   2  0  6 94
  0 0    41444 16024644  0   0   0    0    0    0  0  0 241291 16   2  0  6 94
  0 0    41448 16024640  0   0   0    0    0    0  0  0 240996 16   2  0  7 93
  0 0    41448 16024640  0   0   0    0    0    0  0  0 241695 16   2  0  7 93
  0 0    41448 16024640  0   0   0    0    0    0  0  0 241665 16   2  0  9 91
  0 0    41448 16024640  0   0   0    0    0    0  0  0 241343 16   2  0  9 91
  0 0    41448 16024640  0   0   0    0    0    0  0  0 241681 16   4  0  9 91
  0 0    41448 16024640  0   0   0    0    0    0  0  0 241777 16   2  0  8 92
  0 0    41448 16024640  0   0   0    0    0    0  0  0 241716 16   2  0 10 90
  procs    memory      page                       disks   faults      cpu
  r b      avm    fre  flt  re  pi   po   fr   sr w0 w1   in   sy  cs us sy id
  0 0    41448 16024640  0   0   0    0    0    0  0  0 240859 16   2  0 10 90
  0 0    41448 16024640  0   0   0    0    0    0  0  0 241229 16   2  0  7 93
  0 0    41448 16024640  0   0   0    0    0    0  9  0 241273 16   2  0  8 92
  0 0    41448 16024640  0   0   0    0    0    0  0  0 240923 21   2  0  8 92
  0 0    43232 16022160  0   0   0    0    0    0 32  0 242051 1479 2  0 10 90
  0 0    45788 16019188 968  0   0    0    0    0 74  0 242057 1688 159 0 7 93
  0 0    45792 16019184  0   0   0    0    0    0  0  0 241170 148 14  0  6 94
  0 0    45800 16019172 168  0   0    0    0    0  2  0 241135 389 22  0  9 91
  0 0    45800 16019164  0   0   0    0    0    0  0  0 240661 67   2  0  8 92
  0 0    45800 16019164  0   0   0    0    0    0  4  0 240996 16   2  0  7 93
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241335 16   2  0  8 92
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241060 16   2  0  8 92
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241682 16   2  0  8 92
  0 0    45800 16019164  0   0   0    0    0    0 77  0 241996 16   2  0  7 93
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241940 16   2  0  7 93
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241761 16   2  0  6 94
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241305 16   2  0  7 93
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241191 16  14  0  7 93
  procs    memory      page                       disks   faults      cpu
  r b      avm    fre  flt  re  pi   po   fr   sr w0 w1   in   sy  cs us sy id
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241310 16   2  0  8 92
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241560 16   2  0  8 92
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241718 40   4  0  9 91
  0 0    45800 16019164  0   0   0    0    0    0  0  0 241477 40   4  0  8 92
  0 0    45800 16019164 17   0   0    0    0    0  0  0 241381 209 10  0 10 90
  0 0    45800 16019120  0   0   0    0    0    0  9  0 241014 408  4  0  9 91
  0 1    46608 16006820 3203 0   0    0    0    0 31  0 241433 537 71  0  8 92
  0 1    46608 16006820  0   0   0    0    0    0  0  0 240942 16   2  0  8 92
  0 1    46608 16006820  0   0   0    0    0    0  0  0 241288 16   2  0  9 91
  0 1    46608 16006820  0   0   0    0    0    0  0  0 241236 16   4  0  9 91
  0 1    46608 16006820  0   0   0    0    0    0  0  0 241044 16   2  0  9 91
  0 1    46608 16006820  0   0   0    0    0    0  0  0 241506 16   6  0  9 91
  0 1    46608 16006820  0   0   0    0    0    0  0  0 241603 16   2  0  7 93
  0 1    46608 16006820  0   0   0    0    0    0  0  0 241620 16   2  0  7 93
  0 1    46608 16006820  0   0   0    0    0    0  0  0 241729 16   2  0 10 90
  0 1    46608 16006820  0   0   0    0    0    0  0  0 241363 21   2  0  7 93
  0 0    58228 15994924 854  0   0    0    0    0 121 0 240907 904 785 0  9 91
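
The onset visible in the log above, where the "in" column jumps from near zero to six figures, can also be located automatically. This is a minimal sketch, assuming log files in the "vmstat 1" format shown here; the function name storm_onset and the default threshold of 100000 interrupts per second are hypothetical choices, not something taken from the report.

```shell
#!/bin/sh
# storm_onset LOGFILE [THRESHOLD]
# Print the first "vmstat 1" sample whose interrupt rate exceeds
# THRESHOLD (default 100000), then stop.
storm_onset() {
    awk -v limit="${2:-100000}" '
        # Header row: find which field is the "in" (interrupts) column,
        # since the number of disk columns varies between machines.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) if ($i == "in") col = i
            next
        }
        # Data rows start with a number; the date line and the
        # "procs memory ..." banner do not.
        col && $1 ~ /^[0-9]+$/ {
            n++
            if ($col + 0 > limit + 0) {
                printf "sample %d: %d interrupts/s\n", n, $col
                exit
            }
        }
    ' "$1"
}
```

Because the samples are taken once per second, the reported sample number is roughly the number of seconds between the start of logging and the start of the storm.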

From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: Nick Hudson <skrll@netbsd.org>
Subject: Re: kern/46596: ehci interrupt storm
Date: Tue, 17 Sep 2013 21:19:59 +0300

 A short while ago, I wrote:
 > I have no idea what triggers the transition.

 I do now, after finding some discussion about a similar bug in
 FreeBSD:

   http://www.freebsd.org/cgi/query-pr.cgi?pr=156596
   http://forums.freebsd.org/showthread.php?t=24952

 The crucial hint came from the FreeBSD user "starslab" who noticed
 that the interrupt storm started when pulling the VGA connector from
 the back of the machine.

 My machine is connected to a KVM switch, and I have now determined
 that the interrupt storm consistently starts when that machine's VGA
 output is selected for display.  If the machine is already selected at
 boot time, the storm starts at boot, and if not, the storm starts when
 setting the KVM switch to display the machine's VGA output.  This
 happens even if the USB cable used for the K and M in KVM is not
 connected, so it's clearly triggered by some signal transition on the
 VGA port, not the USB port.

 I suppose this is kind of a reverse Heisenbug: it only appears when
 the machine's VGA output is observed.
 -- 
 Andreas Gustafsson, gson@gson.org

From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: Andreas Gustafsson <gson@gson.org>, "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: 
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Mon, 15 Jun 2020 22:12:59 +0200

 Hi Andreas,

 Do you by chance still have the machine to test a patch for VGA
 similar to what was committed for FreeBSD?

 Jaromir

From: Andreas Gustafsson <gson@gson.org>
To: Jaromír Doleček <jaromir.dolecek@gmail.com>
Cc: "gnats-bugs\@NetBSD.org" <gnats-bugs@netbsd.org>
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Tue, 16 Jun 2020 10:04:30 +0300

 Jaromír Doleček wrote:
 > Do you by chance still have the machine to test a patch for VGA
 > similar to what was committed for FreeBSD?

 Yes.  It's in storage, but I can dig it out.
 -- 
 Andreas Gustafsson, gson@gson.org

From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: Andreas Gustafsson <gson@gson.org>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Tue, 16 Jun 2020 12:38:45 +0200

 All right, please let me know if this patch fixes the behaviour:

 http://www.netbsd.org/~jdolecek/vga_intr_disable.diff

 Jaromir

From: Andreas Gustafsson <gson@gson.org>
To: Jaromír Doleček <jaromir.dolecek@gmail.com>
Cc: "gnats-bugs\@NetBSD.org" <gnats-bugs@netbsd.org>
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Wed, 17 Jun 2020 12:07:18 +0300

 Jaromir,

 I dug out the machine from storage and connected a VGA display using a
 passive DVI/VGA adapter plugged into the motherboard DVI port.  My
 test consists of starting a "vmstat 1" and unplugging/replugging the
 adapter from the motherboard DVI port.

 I was unable to reproduce the bug using a -current USB live image, so
 apparently it has already been fixed, and as a result, I can't
 meaningfully test your patch.

 To narrow down the timeframe of the fix, I tried booting a couple of
 old install CDs and breaking out of sysinst to a shell to run the
 test.  With a 6.1.5 CD, I could reproduce the bug (the vmstat
 interrupt column started showing 200000+ interrupts per second), but
 with a 7.1 install CD, I could not.  Instead, the following kernel
 message was printed:

   drm: HPD interrupt storm detected on connector HDMI-A-1: switching from hotplug detection to polling

 With the -current live image, there is neither an interrupt store nor
 a kernel message.
 -- 
 Andreas Gustafsson, gson@gson.org

From: Andreas Gustafsson <gson@gson.org>
To: Jaromír Doleček <jaromir.dolecek@gmail.com>
Cc: "gnats-bugs\@NetBSD.org" <gnats-bugs@netbsd.org>
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Wed, 17 Jun 2020 12:10:32 +0300

 > With the -current live image, there is neither an interrupt store nor

 s/store/storm/
 -- 
 Andreas Gustafsson, gson@gson.org

From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: Andreas Gustafsson <gson@gson.org>
Cc: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Wed, 17 Jun 2020 11:11:45 +0200

 Can you try to disable intelfb via userconf or boot and try with that,
 to see how it behaves without the drm driver?

 Jaromir

From: Andreas Gustafsson <gson@gson.org>
To: Jaromír Doleček <jaromir.dolecek@gmail.com>
Cc: "gnats-bugs\@NetBSD.org" <gnats-bugs@netbsd.org>
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Wed, 17 Jun 2020 13:10:00 +0300

 Jaromír Doleček wrote:
 > Can you try to disable intelfb via userconf or boot and try with that,
 > to see how it behaves without the drm driver?

 I tried inserting "userconf disable intelfb*;" after the first colon
 in the default menu= entry in /boot.cfg, and the machine panicked in
 cnopen() during boot.  This is with -current.
 -- 
 Andreas Gustafsson, gson@gson.org

From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: Andreas Gustafsson <gson@gson.org>
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Wed, 17 Jun 2020 12:35:41 +0200

 Right, I think you need to disable i915drmkms.

 Jaromir

From: Andreas Gustafsson <gson@gson.org>
To: Jaromír Doleček <jaromir.dolecek@gmail.com>
Cc: "gnats-bugs\@NetBSD.org" <gnats-bugs@netbsd.org>
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Wed, 17 Jun 2020 16:19:35 +0300

 Jaromír Doleček wrote:
 > Right, I think you need to disable i915drmkms.

 That works.  With i915drmkms* disabled, the system boots, and I can
 reproduce the interrupt storm.  And with your patch applied, I can
 no longer reproduce it.
 -- 
 Andreas Gustafsson, gson@gson.org

From: "Jaromir Dolecek" <jdolecek@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/46596 CVS commit: src/sys/dev/pci
Date: Wed, 17 Jun 2020 14:04:03 +0000

 Module Name:	src
 Committed By:	jdolecek
 Date:		Wed Jun 17 14:04:03 UTC 2020

 Modified Files:
 	src/sys/dev/pci: vga_pci.c

 Log Message:
 explicitly disable INTx interrupts to avoid interrupt storm triggered by
 unhandled adapter interrupts

 fixes PR kern/46596 by Andreas Gustafsson, fix adopted from FreeBSD
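
For readers unfamiliar with the mechanism named in the log message: the PCI command register has an Interrupt Disable bit, bit 10, which stops a device from asserting its legacy INTx line. The sketch below only illustrates that bit arithmetic. It is not the committed change to vga_pci.c, which is C kernel code not shown in this report, and the names PCI_CMD_INTX_DISABLE and intx_disable are hypothetical.

```shell
#!/bin/sh
# Bit 10 of the PCI command register is "Interrupt Disable".
PCI_CMD_INTX_DISABLE=$((1 << 10))   # 0x400

# intx_disable CMD
# Print, in hex, the command register value CMD with the
# Interrupt Disable bit set; all other bits are left unchanged.
intx_disable() {
    printf '0x%04x\n' $(( $1 | $PCI_CMD_INTX_DISABLE ))
}
```

Setting the bit is idempotent, so applying it to a value that already has INTx disabled returns the same value.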


 To generate a diff of this commit:
 cvs rdiff -u -r1.55 -r1.56 src/sys/dev/pci/vga_pci.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: kern-bug-people->jdolecek
Responsible-Changed-By: jdolecek@NetBSD.org
Responsible-Changed-When: Wed, 17 Jun 2020 14:08:57 +0000
Responsible-Changed-Why:
I've committed a fix for this.


State-Changed-From-To: open->closed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Wed, 17 Jun 2020 14:08:57 +0000
State-Changed-Why:
Fix for vga_pci.c to avoid the interrupt storm committed. I think it
is better to keep this on -current for some time to confirm it doesn't break
something else, so I am not planning to pull up the fix to netbsd-9 for now.


From: Andreas Gustafsson <gson@gson.org>
To: jdolecek@NetBSD.org
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/46596 (ehci interrupt storm triggered by VGA interrupt)
Date: Wed, 17 Jun 2020 17:10:50 +0300

 jdolecek@NetBSD.org wrote:
 > I've committed a fix for this.

 Thank you!
 -- 
 Andreas Gustafsson, gson@gson.org

>Unformatted:
