NetBSD Problem Report #46828

From root@DL320.i.ki.nu  Thu Aug 23 14:31:12 2012
Return-Path: <root@DL320.i.ki.nu>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 4798163B882
	for <gnats-bugs@gnats.NetBSD.org>; Thu, 23 Aug 2012 14:31:12 +0000 (UTC)
Message-Id: <201208231426.q7NEQtng000395@DL320.i.ki.nu>
Date: Thu, 23 Aug 2012 23:26:55 +0900 (JST)
From: makoto@ki.nu
Reply-To: makoto@ki.nu
To: gnats-bugs@gnats.NetBSD.org
Subject: 6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p
X-Send-Pr-Version: 3.95

>Number:         46828
>Category:       kern
>Synopsis:       6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    tsutsui
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Aug 23 14:35:10 +0000 2012
>Closed-Date:    Mon Oct 01 17:52:51 +0000 2012
>Last-Modified:  Mon Oct 01 17:52:51 +0000 2012
>Originator:     Makoto Fujiwara
>Release:        NetBSD 6.0_RC1
>Organization:
KINU Corporation

>Environment:


System: NetBSD DL320 6.0_RC1 NetBSD 6.0_RC1 (GENERIC) #1: Thu Aug 23 18:08:19 JST 2012 root@modena:/export/cvs-work/src/sys/arch/amd64/compile/obj/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
The DL320/G5p is an HP rack-mount 1U server. The machine
booted fine on NetBSD 5.1, but with 6.0 (BETA, BETA2 and
RC1) it hangs when ehci is probed. The hang occurs at
pci/ehci_pci.c
----------------------
391  pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGSUP,
392   legsup);
----------------------

I have two workarounds for when USB devices are NOT needed.
Either of them lets the machine boot.

1. Disable USB Legacy support in BIOS setup.
    ----------------------
    rbsu> SET CONFIG USB CONTROL 3
    USB Control
    1|USB Enabled
    2|USB Disabled
    3|Legacy USB Disabled <=
    4|External USB Ports Disabled
    ----------------------
    With this setting, the dmesg (call it "after") is at
      http://www.ki.nu/~makoto/dmesg/6.0_RC1/DL320_G5p-USB-Legacy-disabled
    while with USB Enabled (the default, call it "before") it is at
      http://www.ki.nu/~makoto/dmesg/6.0_RC1/DL320_G5p-USB-Enabled

    The major difference between before and after is:
    ------------
     ehci0: interrupting at ioapic0 pin 21
    +ehci0: EHCI version 1.0
    +....
    ------------
But this workaround has (naturally) two big problems:
  -  You cannot use a USB keyboard
  -  You cannot boot from a USB device

2. Disable ehci in the kernel configuration

With the second workaround, USB devices are not probed at all.

One thing that seems strange (to me): as long as ehci is enabled (as in
the GENERIC kernel), even when BIOS setup is set to
'USB Legacy support disabled',
a USB keyboard and USB mass storage can be used. The relevant part of
the dmesg in that situation is:
--------
ehci0 at pci0 dev 29 function 7: vendor 0x8086 product 0x293a (rev. 0x02)
ehci0: interrupting at ioapic0 pin 21
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2 uhci3
...
uhub4 at usb4: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
...
ehci0: handing over low speed device on port 5 to uhci2
...
umass0 at uhub4 port 6 configuration 1 interface 0
umass0: vendor 0x05e3 USB TO IDE, rev 2.00/0.33, addr 2
umass0: using SCSI over Bulk-Only
scsibus0 at umass0: 2 targets, 1 lun per target
sd0 at scsibus0 target 0 lun 0: <HTS54108, 0G9AT00, 0811> disk fixed
sd0: fabricating a geometry
sd0: 76319 MB, 76319 cyl, 64 head, 32 sec, 512 bytes/sect x 156301488 sectors
sd0: fabricating a geometry
--------

If we need to boot from USB mass storage, we can NOT use the above
workarounds. For that case, I have a really ad-hoc change as
follows.

Index: sys/dev/pci/ehci_pci.c
===================================================================
RCS file: /cvs/cvsroot/src/sys/dev/pci/ehci_pci.c,v
retrieving revision 1.55
diff -u -r1.55 ehci_pci.c
--- sys/dev/pci/ehci_pci.c	10 Jun 2012 06:15:53 -0000	1.55
+++ sys/dev/pci/ehci_pci.c	26 Jun 2012 10:07:17 -0000
@@ -390,6 +390,11 @@
 	addr = EHCI_HCC_EECP(cparams);
 	while (addr != 0) {
 		cap = pci_conf_read(pc, tag, addr);
+		aprint_normal( "%s: %s:%d _ %08x %08x\n", __FILE__,__func__,__LINE__,
+			       EHCI_CAP_GET_ID(cap),
+			       EHCI_CAP_ID_LEGACY);
+		ms = EHCI_MAX_BIOS_WAIT;
+		goto skip;
 		if (EHCI_CAP_GET_ID(cap) != EHCI_CAP_ID_LEGACY)
 			goto next;
 		legsup = pci_conf_read(pc, tag, addr + PCI_EHCI_USBLEGSUP);
@@ -406,6 +411,7 @@
 					break;
 				delay(10000);
 			}
+		skip:
 			if (ms == EHCI_MAX_BIOS_WAIT) {
 				aprint_normal("%s: BIOS refuses to give up "
 				    "ownership, using force\n", devname);

This will boot as follows.
----------------------
usb3 at uhci3: USB revision 1.0
ehci0 at pci0 dev 29 function 7: vendor 0x8086 product 0x293a (rev. 0x02)
ehci0: interrupting at ioapic0 pin 21
/export/src-a.j.n.o/src/sys/dev/pci/ehci_pci.c:
   ehci_get_ownership:393 _ 00000001 00000001
ehci0: BIOS refuses to give up ownership, using force
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2 uhci3
usb4 at ehci0: USB revision 2.0
ppb8 at pci0 dev 30 function 0: vendor 0x8086 product 0x244e (rev. 0x92)
----------------------

I've placed the full dmesg for four combinations.
  http://www.ki.nu/~makoto/dmesg/6.0_RC1/

  4487 Aug 23 12:12 DL320_G5p-USB-Enabled
  9312 Aug 23 12:25 DL320_G5p-USB-Legacy-disabled
  9750 Aug 23 19:01 patched-DL320_G5p-USB-Enabled
  9555 Aug 23 18:55 patched-DL320_G5p-USB-Legacy-disabled
>How-To-Repeat:
	(1) Have an HP DL320/G5p machine
	(2) Set up network boot
	(3) Use the netbsd-INSTALL.gz kernel from any one of
	    6.0_BETA, 6.0_BETA2, 6.0_RC1
>Fix:
	Not known

>Release-Note:

>Audit-Trail:
From: Ryo ONODERA <ryo_on@yk.rim.or.jp>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46828 6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p
Date: Sat, 22 Sep 2012 07:21:42 +0900 (JST)

 Hi,

 # Sorry for sending with wrong subject, I will send again.

 NetBSD/amd64 6.99.11 on an HP ProLiant ML110 G7 has a similar problem.
 It seems that reverting rev 1.53 of sys/dev/pci/ehci_pci.c solves the problem.

 Try the attached patch.

 Thank you.

 Index: ehci_pci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/ehci_pci.c,v
 retrieving revision 1.56
 diff -u -r1.56 ehci_pci.c
 --- ehci_pci.c	20 Jul 2012 01:26:19 -0000	1.56
 +++ ehci_pci.c	21 Sep 2012 22:14:05 -0000
 @@ -393,18 +393,16 @@
  		if (EHCI_CAP_GET_ID(cap) != EHCI_CAP_ID_LEGACY)
  			goto next;
  		legsup = pci_conf_read(pc, tag, addr + PCI_EHCI_USBLEGSUP);
 +		/* Ask BIOS to give up ownership */
 +		pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGSUP,
 +		    legsup | EHCI_LEG_HC_OS_OWNED);
  		if (legsup & EHCI_LEG_HC_BIOS_OWNED) {
 -			/* Ask BIOS to give up ownership */
 -			legsup &= ~EHCI_LEG_HC_BIOS_OWNED;
 -			legsup |= EHCI_LEG_HC_OS_OWNED;
 -			pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGSUP,
 -			    legsup);
  			for (ms = 0; ms < EHCI_MAX_BIOS_WAIT; ms++) {
  				legsup = pci_conf_read(pc, tag,
  				    addr + PCI_EHCI_USBLEGSUP);
  				if (!(legsup & EHCI_LEG_HC_BIOS_OWNED))
  					break;
 -				delay(10000);
 +				delay(1000);
  			}
  			if (ms == EHCI_MAX_BIOS_WAIT) {
  				aprint_normal("%s: BIOS refuses to give up "
 @@ -417,7 +415,9 @@
  		}

  		/* Disable SMIs */
 -		pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGCTLSTS, 0);
 +		pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGCTLSTS,
 +		    EHCI_LEG_EXT_SMI_BAR | EHCI_LEG_EXT_SMI_PCICMD |
 +		    EHCI_LEG_EXT_SMI_OS_CHANGE);

  next:
  		if (--maxcap < 0) {

 --
 Ryo ONODERA // ryo_on@yk.rim.or.jp
 PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3




From: Ryo ONODERA <ryo_on@yk.rim.or.jp>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46828 6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p
Date: Sat, 22 Sep 2012 09:29:36 +0900 (JST)

 Hi,

 I think the patch,
 http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dev/pci/ehci_pci.c.diff?r1=1.52&r2=1.53 ,
 has two parts,
 and the first part is the one that causes the problem.

 The following patch solves the problem on ML110 G7.

 Index: ehci_pci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/ehci_pci.c,v
 retrieving revision 1.56
 diff -u -r1.56 ehci_pci.c
 --- ehci_pci.c	20 Jul 2012 01:26:19 -0000	1.56
 +++ ehci_pci.c	22 Sep 2012 00:22:17 -0000
 @@ -91,7 +91,7 @@
  enum ehci_pci_quirk_flags ehci_pci_lookup_quirkdata(pci_vendor_id_t,
  	pci_product_id_t);

 -#define EHCI_MAX_BIOS_WAIT		100 /* ms*10 */
 +#define EHCI_MAX_BIOS_WAIT		1000 /* ms */
  #define EHCI_SBx00_WORKAROUND_REG	0x50
  #define EHCI_SBx00_WORKAROUND_ENABLE	__BIT(27)

 @@ -393,18 +393,16 @@
  		if (EHCI_CAP_GET_ID(cap) != EHCI_CAP_ID_LEGACY)
  			goto next;
  		legsup = pci_conf_read(pc, tag, addr + PCI_EHCI_USBLEGSUP);
 +		/* Ask BIOS to give up ownership */
 +		pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGSUP,
 +		    legsup | EHCI_LEG_HC_OS_OWNED);
  		if (legsup & EHCI_LEG_HC_BIOS_OWNED) {
 -			/* Ask BIOS to give up ownership */
 -			legsup &= ~EHCI_LEG_HC_BIOS_OWNED;
 -			legsup |= EHCI_LEG_HC_OS_OWNED;
 -			pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGSUP,
 -			    legsup);
  			for (ms = 0; ms < EHCI_MAX_BIOS_WAIT; ms++) {
  				legsup = pci_conf_read(pc, tag,
  				    addr + PCI_EHCI_USBLEGSUP);
  				if (!(legsup & EHCI_LEG_HC_BIOS_OWNED))
  					break;
 -				delay(10000);
 +				delay(1000);
  			}
  			if (ms == EHCI_MAX_BIOS_WAIT) {
  				aprint_normal("%s: BIOS refuses to give up "

 --
 Ryo ONODERA // ryo_on@yk.rim.or.jp
 PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3


From: Ryo ONODERA <ryo_on@yk.rim.or.jp>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46828 6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p
Date: Sat, 22 Sep 2012 17:43:16 +0900 (JST)

 Hi,

 The following patch works well.

 According to Intel's "Enhanced Host Controller Interface Specification
 for Universal Serial Bus",
 www.intel.com/technology/usb/download/ehci-r10.pdf .

 In p.131, 
 "One semaphore is for the operating system (OS) and one is for the
 BIOS. These semaphores are readable and writable. These fields are in
 adjacent bytes, which allows each agent (OS or BIOS) to update their
 respective semaphore without overwriting the other ownership semaphore."

 I feel that the BIOS part should not be overwritten by NetBSD.
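
The effect of that distinction on the value written to USBLEGSUP can be
sketched in isolation. The following is an illustrative sketch only, not
driver code: the two helper functions are hypothetical, and the bit
positions (BIOS semaphore in bit 16, OS semaphore in bit 24) are taken
from the register layout in the EHCI specification and written out here
as an assumption rather than copied from the NetBSD headers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed USBLEGSUP layout (per the EHCI specification): capability ID
 * in bits 7:0, BIOS semaphore in bit 16, OS semaphore in bit 24.  The
 * names mirror those used in the patches; the values are assumptions.
 */
#define EHCI_CAP_ID_LEGACY	0x01u
#define EHCI_LEG_HC_BIOS_OWNED	(UINT32_C(1) << 16)
#define EHCI_LEG_HC_OS_OWNED	(UINT32_C(1) << 24)

/* Hypothetical helper: request value that also clears the BIOS semaphore. */
uint32_t
legsup_request_clearing_bios(uint32_t legsup)
{
	/* Only meaningful for a legacy-support capability. */
	assert((legsup & 0xffu) == EHCI_CAP_ID_LEGACY);
	legsup &= ~EHCI_LEG_HC_BIOS_OWNED;	/* OS touches the BIOS semaphore */
	legsup |= EHCI_LEG_HC_OS_OWNED;
	return legsup;
}

/* Hypothetical helper: request value that leaves the BIOS semaphore alone. */
uint32_t
legsup_request_preserving_bios(uint32_t legsup)
{
	assert((legsup & 0xffu) == EHCI_CAP_ID_LEGACY);
	return legsup | EHCI_LEG_HC_OS_OWNED;	/* OS sets only its own bit */
}
```

For a register value of 0x00010001 (legacy capability, BIOS semaphore
set), the first helper yields 0x01000001 and the second yields
0x01010001, so only the second leaves the BIOS semaphore for the BIOS
itself to clear.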


 Index: ehci_pci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/ehci_pci.c,v
 retrieving revision 1.56
 diff -u -r1.56 ehci_pci.c
 --- ehci_pci.c	20 Jul 2012 01:26:19 -0000	1.56
 +++ ehci_pci.c	22 Sep 2012 08:32:57 -0000
 @@ -395,10 +395,8 @@
  		legsup = pci_conf_read(pc, tag, addr + PCI_EHCI_USBLEGSUP);
  		if (legsup & EHCI_LEG_HC_BIOS_OWNED) {
  			/* Ask BIOS to give up ownership */
 -			legsup &= ~EHCI_LEG_HC_BIOS_OWNED;
 -			legsup |= EHCI_LEG_HC_OS_OWNED;
  			pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGSUP,
 -			    legsup);
 +			    legsup | EHCI_LEG_HC_OS_OWNED);
  			for (ms = 0; ms < EHCI_MAX_BIOS_WAIT; ms++) {
  				legsup = pci_conf_read(pc, tag,
  				    addr + PCI_EHCI_USBLEGSUP);

 --
 Ryo ONODERA // ryo_on@yk.rim.or.jp
 PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3


From: Ryo ONODERA <ryo_on@yk.rim.or.jp>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46828 6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p
Date: Sat, 22 Sep 2012 18:14:35 +0900 (JST)

 Hi,

 According to Intel's document, the following patch is preferable, I think.

 Index: ehci_pci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/ehci_pci.c,v
 retrieving revision 1.56
 diff -u -r1.56 ehci_pci.c
 --- ehci_pci.c	20 Jul 2012 01:26:19 -0000	1.56
 +++ ehci_pci.c	22 Sep 2012 09:12:36 -0000
 @@ -393,12 +393,10 @@
  		if (EHCI_CAP_GET_ID(cap) != EHCI_CAP_ID_LEGACY)
  			goto next;
  		legsup = pci_conf_read(pc, tag, addr + PCI_EHCI_USBLEGSUP);
 +		/* Ask BIOS to give up ownership */
 +		pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGSUP,
 +		    legsup | EHCI_LEG_HC_OS_OWNED);
  		if (legsup & EHCI_LEG_HC_BIOS_OWNED) {
 -			/* Ask BIOS to give up ownership */
 -			legsup &= ~EHCI_LEG_HC_BIOS_OWNED;
 -			legsup |= EHCI_LEG_HC_OS_OWNED;
 -			pci_conf_write(pc, tag, addr + PCI_EHCI_USBLEGSUP,
 -			    legsup);
  			for (ms = 0; ms < EHCI_MAX_BIOS_WAIT; ms++) {
  				legsup = pci_conf_read(pc, tag,
  				    addr + PCI_EHCI_USBLEGSUP);

 --
 Ryo ONODERA // ryo_on@yk.rim.or.jp
 PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3


From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46828 6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p
Date: Sat, 22 Sep 2012 10:18:32 +0100

 On Sat, Sep 22, 2012 at 08:45:03AM +0000, Ryo ONODERA wrote:
 ...
 >  According to Intel's "Enhanced Host Controller Interface Specification
 >  for Universal Serial Bus",
 >  www.intel.com/technology/usb/download/ehci-r10.pdf .
 >  
 >  In p.131, 
 >  "One semaphore is for the operating system (OS) and one is for the
 >  BIOS. These semaphores are readable and writable. These fields are in
 >  adjacent bytes, which allows each agent (OS or BIOS) to update their
 >  respective semaphore without overwriting the other ownership semaphore."

 That description makes me think that the code should be doing byte
 writes.
 However I've a lurking feeling that not all systems support non-word
 config space writes.
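
A byte-wide update of the kind suggested here might look like the
following sketch. It runs against a simulated configuration space, so
the cfg_* helpers and claim_os_semaphore_bytewise() are all
hypothetical. It assumes the little-endian byte layout of PCI
configuration space, which puts the OS semaphore (bit 24 of USBLEGSUP)
in bit 0 of the byte at offset +3. Whether real hardware accepts such
sub-word writes is the open question raised above, and the sketch does
not answer it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_SIZE 256u

/* Simulated PCI configuration space (hypothetical, little-endian). */
struct cfg_space {
	uint8_t bytes[CFG_SIZE];
};

/* Read an aligned 32-bit register from the simulated space. */
uint32_t
cfg_read32(const struct cfg_space *c, size_t off)
{
	assert(off % 4 == 0 && off + 4 <= CFG_SIZE);
	return (uint32_t)c->bytes[off] |
	    (uint32_t)c->bytes[off + 1] << 8 |
	    (uint32_t)c->bytes[off + 2] << 16 |
	    (uint32_t)c->bytes[off + 3] << 24;
}

/* Write an aligned 32-bit register to the simulated space. */
void
cfg_write32(struct cfg_space *c, size_t off, uint32_t v)
{
	assert(off % 4 == 0 && off + 4 <= CFG_SIZE);
	c->bytes[off] = (uint8_t)(v & 0xff);
	c->bytes[off + 1] = (uint8_t)((v >> 8) & 0xff);
	c->bytes[off + 2] = (uint8_t)((v >> 16) & 0xff);
	c->bytes[off + 3] = (uint8_t)((v >> 24) & 0xff);
}

/* Write a single byte; no neighbouring byte is read or modified. */
void
cfg_write8(struct cfg_space *c, size_t off, uint8_t v)
{
	assert(off < CFG_SIZE);
	c->bytes[off] = v;
}

/*
 * Set the OS semaphore with a byte write to USBLEGSUP + 3.  The byte
 * holding the BIOS semaphore (offset +2) is never touched.  The other
 * bits of the byte at +3 are written as zero.
 */
void
claim_os_semaphore_bytewise(struct cfg_space *c, size_t legsup_off)
{
	cfg_write8(c, legsup_off + 3, 0x01);
}
```

Starting from a USBLEGSUP value of 0x00010001, the byte write leaves the
register reading 0x01010001: the OS semaphore is set and the BIOS
semaphore byte is unchanged.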

 	David

 -- 
 David Laight: david@l8s.co.uk

From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
To: gnats-bugs@NetBSD.org
Cc: tsutsui@ceres.dti.ne.jp
Subject: Re: kern/46828 6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p
Date: Sat, 22 Sep 2012 19:05:33 +0900

 >  >  In p.131, 
 >  >  "One semaphore is for the operating system (OS) and one is for the
 >  >  BIOS. These semaphores are readable and writable. These fields are in
 >  >  adjacent bytes, which allows each agent (OS or BIOS) to update their
 >  >  respective semaphore without overwriting the other ownership semaphore."
 >  
 >  That description makes me think that the code should be doing byte
 >  writes.

 For me "overwriting the other ownership semaphore"
 just means "changing the other semaphore bit."

 ---
 Izumi Tsutsui

From: "Izumi Tsutsui" <tsutsui@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/46828 CVS commit: src/sys/dev/pci
Date: Sat, 22 Sep 2012 14:27:24 +0000

 Module Name:	src
 Committed By:	tsutsui
 Date:		Sat Sep 22 14:27:24 UTC 2012

 Modified Files:
 	src/sys/dev/pci: ehci_pci.c

 Log Message:
 Fix PR kern/46828 (6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p):
  In ehci_get_ownership(), don't explicitly clear EHCI_LEG_HC_BIOS_OWNED
  semaphore bit in the driver before asking BIOS to give up ownership.
  The EHCI spec implies that the semaphore should not be changed by
  the other agent and actually the previous one (introduced in rev 1.53
  after 5.x) caused hangup during probe on at least two HP machines
  as mentioned in the PR.  Analyzed and patch provided by Ryo ONODERA.

 Should be pulled up to netbsd-6 (fatal hangup during boot).


 To generate a diff of this commit:
 cvs rdiff -u -r1.56 -r1.57 src/sys/dev/pci/ehci_pci.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
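
The handoff sequence described in the log message above can be modelled
in user space as follows. This is a minimal sketch under stated
assumptions, not the committed driver code: the register is a plain
variable, the "BIOS" is a toy model that drops its semaphore after a
configurable number of polls, every sim_* name is hypothetical, and the
bit positions are assumed from the EHCI specification. The loop loosely
follows the structure of the patches in this PR, without the delay
between polls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed USBLEGSUP semaphore bits (per the EHCI specification). */
#define EHCI_LEG_HC_BIOS_OWNED	(UINT32_C(1) << 16)
#define EHCI_LEG_HC_OS_OWNED	(UINT32_C(1) << 24)

/* Simulated USBLEGSUP register plus a toy BIOS model (hypothetical). */
struct sim_legsup {
	uint32_t reg;
	int polls_until_release;	/* negative: BIOS never releases */
};

/* Read the register; the toy BIOS reacts once the OS semaphore is set. */
static uint32_t
sim_read(struct sim_legsup *s)
{
	if ((s->reg & EHCI_LEG_HC_OS_OWNED) && s->polls_until_release >= 0) {
		if (s->polls_until_release == 0)
			s->reg &= ~EHCI_LEG_HC_BIOS_OWNED; /* BIOS clears its own bit */
		else
			s->polls_until_release--;
	}
	return s->reg;
}

static void
sim_write(struct sim_legsup *s, uint32_t v)
{
	s->reg = v;
}

/*
 * Ask the BIOS to give up ownership without touching its semaphore.
 * Returns true if the BIOS semaphore is clear within max_polls reads.
 */
bool
sim_get_ownership(struct sim_legsup *s, int max_polls)
{
	uint32_t legsup = sim_read(s);
	int i;

	if (!(legsup & EHCI_LEG_HC_BIOS_OWNED))
		return true;		/* BIOS does not own the controller */

	/* Set only the OS semaphore; leave the BIOS semaphore as read. */
	sim_write(s, legsup | EHCI_LEG_HC_OS_OWNED);

	for (i = 0; i < max_polls; i++) {
		legsup = sim_read(s);
		if (!(legsup & EHCI_LEG_HC_BIOS_OWNED))
			return true;
		/* A real driver would delay here between polls. */
	}
	return false;			/* BIOS refused to give up ownership */
}
```

With a toy BIOS that releases after three polls, sim_get_ownership()
succeeds and the register ends with only the OS semaphore set. With a
BIOS that never releases, it times out and both semaphores remain set.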

From: Makoto Fujiwara <makoto@ki.nu>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/46828 6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p
Date: Mon, 24 Sep 2012 00:04:51 +0900

 (Sorry for sending to wrong address, and duplication),

 I have confirmed that an RC2 kernel with the committed patch boots
 fine on my DL320/G5p.
 Thanks a lot, ryoon@ and tsutsui@.

 I would really like this to be pulled up to 6.0, thanks in advance.
 ---
 Makoto Fujiwara, 
 mef@NetBSD.org

Responsible-Changed-From-To: kern-bug-people->tsutsui
Responsible-Changed-By: tsutsui@NetBSD.org
Responsible-Changed-When: Mon, 24 Sep 2012 00:54:50 +0900
Responsible-Changed-Why:


State-Changed-From-To: open->pending-pullups
State-Changed-By: tsutsui@NetBSD.org
State-Changed-When: Mon, 24 Sep 2012 00:54:50 +0900
State-Changed-Why:
pullup-6 #569


From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/46828 CVS commit: [netbsd-6] src/sys/dev/pci
Date: Mon, 1 Oct 2012 17:37:28 +0000

 Module Name:	src
 Committed By:	riz
 Date:		Mon Oct  1 17:37:28 UTC 2012

 Modified Files:
 	src/sys/dev/pci [netbsd-6]: ehci_pci.c

 Log Message:
 Pull up following revision(s) (requested by tsutsui in ticket #569):
 	sys/dev/pci/ehci_pci.c: revision 1.57
 Fix PR kern/46828 (6.0_BETA2 and 6.0_RC1 won't start on DL320/G5p):
  In ehci_get_ownership(), don't explicitly clear EHCI_LEG_HC_BIOS_OWNED
  semaphore bit in the driver before asking BIOS to give up ownership.
  The EHCI spec implies that the semaphore should not be changed by
  the other agent and actually the previous one (introduced in rev 1.53
  after 5.x) caused hangup during probe on at least two HP machines
  as mentioned in the PR.  Analyzed and patch provided by Ryo ONODERA.
 Should be pulled up to netbsd-6 (fatal hangup during boot).


 To generate a diff of this commit:
 cvs rdiff -u -r1.54 -r1.54.2.1 src/sys/dev/pci/ehci_pci.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 01 Oct 2012 17:52:51 +0000
State-Changed-Why:
pulled up, thanks


>Unformatted:
