NetBSD Problem Report #44581

From campbell@mumble.net  Tue Feb 15 21:20:21 2011
Return-Path: <campbell@mumble.net>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 801CF63B100
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 15 Feb 2011 21:20:21 +0000 (UTC)
Message-Id: <20110215212019.A66A098298@pluto.mumble.net>
Date: Tue, 15 Feb 2011 21:20:19 +0000 (UTC)
From: Taylor R Campbell <campbell+netbsd@mumble.net>
Reply-To: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@gnats.NetBSD.org
Subject: MacBook1,1 won't resume after suspend
X-Send-Pr-Version: 3.95

>Number:         44581
>Category:       port-i386
>Synopsis:       MacBook1,1 won't resume after suspend
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-i386-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Feb 15 21:25:00 +0000 2011
>Closed-Date:    Mon Nov 21 01:13:03 +0000 2016
>Last-Modified:  Mon Nov 21 01:13:03 +0000 2016
>Originator:     Taylor R Campbell <campbell+netbsd@mumble.net>
>Release:        NetBSD 5.1
>Organization:
>Environment:
System: NetBSD oberon.local 5.1 NetBSD 5.1 (GENERIC) #0: Sun Nov  7 14:39:56 UTC 2010  builds@b6.netbsd.org:/home/builds/ab/netbsd-5-1-RELEASE/i386/201011061943Z-obj/home/builds/ab/netbsd-5-1-RELEASE/src/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:

	Under 5.1, I observe:

		# sysctl hw.acpi machdep | grep acpi
		hw.acpi.root = 1040416
		hw.acpi.supported_states = S0 S3 S4 S5
		machdep.acpi_vbios_reset = 1
		machdep.acpi_beep_on_reset = 0
		machdep.acpiapm.standby = 1
		machdep.acpiapm.suspend = 3

	Suspending and resuming mainbus0 with

		# drvctl -S mainbus0; sleep 10; drvctl -Q mainbus0

	seems to work.  The disk stops spinning, keyboard input stops
	working, the kernel stops replying to pings, and so on.  Ten
	seconds later, the disk spins up again, the kernel sprays a lot
	of messages to the console about device detachment and
	attachment, keyboard input starts working, the kernel starts
	replying to pings, and so on.

	With machdep.acpi_vbios_reset set to 0, 1, or 2,

		# sysctl -w machdep.sleep_state=3; sleep 60; drvctl -Q mainbus0

	prints a kernel message about acpi entering state 3 and about
	flushing disk caches, and suspends the machine.  Neither
	keyboard nor trackpad input wakes it.  Pressing the power
	button causes the optical and magnetic drives to spin up, but
	the display is still dark and the machine unresponsive: no
	replies to pings, pressing the power button again has no
	effect, &c., until I force a reboot by holding the power button
	down for several seconds.



	Under a current kernel as of a couple days ago, I observe:

		# sysctl hw.acpi
		hw.acpi.root = 1040416
		hw.acpi.sleep.state = 0
		hw.acpi.sleep.states = S0 S3 S4 S5
		hw.acpi.sleep.vbios = 1
		hw.acpi.stat.gpe = 17
		hw.acpi.stat.sci = 17
		hw.acpi.stat.fixed = 17
		hw.acpi.stat.method = 17

	Suspending and resuming mainbus0 with

		# drvctl -S mainbus0; sleep 10; drvctl -Q mainbus0

	sometimes works and sometimes hangs the machine.  If I suspend
	all of the children of mainbus0 (cpu0 cpu1 ioapic0 acpi0 pci0)
	and then resume them, it sometimes works and I sometimes get a
	garbled panic, at which point ddb ignores keyboard input and I
	have to forcibly reboot the machine.

	With hw.acpi.sleep.vbios set to 1,

		# sysctl -w hw.acpi.sleep.state=3; sleep 60; drvctl -Q mainbus0

	behaves as in 5.1.  With hw.acpi.sleep.vbios set to 0 or 2,
	after the machine suspends, the display turns on again to
	reveal a number of messages from the kernel, apparently about
	attaching and then detaching the USB keyboard and trackpad --
	yes, in that order: attaching, and then detaching.  At the end
	is a message about attaching fwohci.  I don't see a shell
	prompt, but it could have been pushed off screen by the kernel
	messages.  Keyboard input has no effect, the kernel does not
	reply to pings, and pushing the power button doesn't seem to do
	anything until, as usual, I hold it down for several seconds to
	forcibly reboot the machine.



	Let me know if you'd like to see full dmesg output.  This
	machine has two CPUs.

>How-To-Repeat:

	Try to suspend and resume a MacBook1,1.  Grumble in frustration
	at the failure.

>Fix:

	Yes, please!

>Release-Note:

>Audit-Trail:
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Tue, 15 Feb 2011 23:21:00 +0000

 A little more information:

 I backported support for CTLTYPE_BOOL to 5.1's sysctl(8) so that I
 could try seeing and setting hw.acpi.wake.*.  If I run a -current
 kernel, and set hw.acpi.wake.uhci0 &c. to 1, then keyboard input
 causes the machine to start to resume just like hitting the power
 button did before, but the display doesn't come on, even if
 hw.acpi.sleep.vbios is 0.

 I booted a -current kernel with `-xs1', set hw.acpi.sleep.vbios to 0,
 set hw.acpi.sleep.state to 3, and then hit the power button.  The
 display woke up, showing the following text (ten-fingered copy &
 paste; there are errors, no doubt).  I presume there was much more,
 but it has scrolled off the screen.

 uhidev4: at uhub3 port 1 (addr 2) disconnected
 wsmouse1: detached
 ums1: detached
 uhidev5: detached
 uhidev5: at uhub3 port 1 (addr 2) disconnected
 atabus2: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 87
  atabus3atabus3: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 9f
 atabus3: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 87
  iic0iic0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
 iic0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 7
  isa0isa0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
 isa0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 7
  audio0audio0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 9f
 audio0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 87
  mskc0mskc0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
 mskc0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 7
  ath0ath0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
 ath0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 1f
  fwohci0fwohci0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
 fwohci0: Phy 1394a available S400, 3 ports.
 fwohci0: Link S400, max_rec 2048 bytes.
 fwohci0: Initate bus reset

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 00:55:42 +0000

 At jmcneill's suggestion, I tried disabling fwohci before suspending,
 with `drvctl -d fwohci0', on a current kernel.  Now the machine
 resumes!

 Of course, now it also doesn't have Firewire.  When I rescanned pci3
 (where fwohci0 was attached before) with `drvctl -r pci3', the kernel
 complained `fwohci0: can't map OHCI register space'.  When I tried
 detaching fwohci0 again, the kernel panicked on an uninitialized lock.
 Will investigate further.

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 03:19:20 +0000

 Here's a patch to make the lockdebug panic go away when I reattach and
 redetach fwohci0.  This splits fwohci_init into two routines, one that
 won't fail and guarantees that the relevant kernel data structures are
 initialized well enough for fwohci_detach not to barf (fwohci_init),
 and one that is allowed to fail (fwohci_attach).

 Next I'll try to find what's wrong with the pmf handlers to make
 fwohci fail to resume and fail to reattach after the rest of the
 system has suspended and resumed while it has been detached.
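
 The init/attach split described above can be modelled in userland.
 This is only a sketch of the pattern, not the driver code: the
 toy_softc structure, the toy_* functions, and the hw_ok parameter are
 hypothetical names made up for illustration.  The point is that the
 first phase cannot fail and leaves the softc in a state that detach can
 always handle, while the second phase is allowed to fail.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver softc; not the real fwohci_softc. */
struct toy_softc {
	bool lock_ready;	/* models the mutexes that must exist at detach */
	void *dma_buf;		/* models a resource acquired during attach */
};

/* Phase 1: cannot fail.  Only makes the softc safe to hand to detach. */
static void
toy_init(struct toy_softc *sc)
{
	sc->lock_ready = true;
	sc->dma_buf = NULL;
}

/* Phase 2: may fail.  hw_ok models whether the hardware probe succeeds. */
static int
toy_attach(struct toy_softc *sc, bool hw_ok)
{
	if (!hw_ok)
		return ENXIO;
	sc->dma_buf = malloc(64);
	if (sc->dma_buf == NULL)
		return ENOMEM;
	return 0;
}

/* Safe after toy_init(), whether or not toy_attach() succeeded. */
static int
toy_detach(struct toy_softc *sc)
{
	if (!sc->lock_ready)
		return EINVAL;	/* models the uninitialized-lock panic */
	free(sc->dma_buf);	/* free(NULL) is a no-op */
	sc->dma_buf = NULL;
	sc->lock_ready = false;
	return 0;
}
```

 In this model a failed toy_attach() followed by toy_detach() returns 0
 instead of tripping over a lock that was never set up, which is the
 behaviour the patch is after.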

 Index: dev/cardbus/fwohci_cardbus.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/cardbus/fwohci_cardbus.c,v
 retrieving revision 1.33
 diff -p -u -r1.33 fwohci_cardbus.c
 --- dev/cardbus/fwohci_cardbus.c	19 Apr 2010 07:05:15 -0000	1.33
 +++ dev/cardbus/fwohci_cardbus.c	16 Feb 2011 03:16:19 -0000
 @@ -98,6 +98,8 @@ fwohci_cardbus_attach(device_t parent, d
  	       PCI_REVISION(ca->ca_class));
  	aprint_naive("\n");

 +	fwohci_init(&sc->sc_sc);
 +
  	/* Map I/O registers */
  	if (Cardbus_mapreg_map(ct, PCI_OHCI_MAP_REGISTER,
  	      PCI_MAPREG_TYPE_MEM, 0,
 @@ -129,7 +131,7 @@ fwohci_cardbus_attach(device_t parent, d
  	}

  	/* XXX NULL should be replaced by some call to Cardbus coed */
 -	if (fwohci_init(&sc->sc_sc) != 0) {
 +	if (fwohci_attach(&sc->sc_sc) != 0) {
  		Cardbus_intr_disestablish(ct, sc->sc_ih);
  		sc->sc_ih = NULL;
  	}
 Index: dev/ieee1394/firewire.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/ieee1394/firewire.c,v
 retrieving revision 1.38
 diff -p -u -r1.38 firewire.c
 --- dev/ieee1394/firewire.c	7 Sep 2010 07:26:54 -0000	1.38
 +++ dev/ieee1394/firewire.c	16 Feb 2011 03:16:23 -0000
 @@ -680,6 +680,15 @@ fw_init(struct firewire_comm *fc)
  	fc->crom_src_buf = NULL;
  }

 +void
 +fw_destroy(struct firewire_comm *fc)
 +{
 +	mutex_destroy(&fc->arq->q_mtx);
 +	mutex_destroy(&fc->ars->q_mtx);
 +	mutex_destroy(&fc->atq->q_mtx);
 +	mutex_destroy(&fc->ats->q_mtx);
 +}
 +
  #define BIND_CMP(addr, fwb) \
  	(((addr) < (fwb)->start) ? -1 : ((fwb)->end < (addr)) ? 1 : 0)

 Index: dev/ieee1394/firewirereg.h
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/ieee1394/firewirereg.h,v
 retrieving revision 1.15
 diff -p -u -r1.15 firewirereg.h
 --- dev/ieee1394/firewirereg.h	14 Nov 2010 15:47:20 -0000	1.15
 +++ dev/ieee1394/firewirereg.h	16 Feb 2011 03:16:23 -0000
 @@ -283,6 +283,7 @@ int fw_xferwait(struct fw_xfer *);
  void fw_drain_txq(struct firewire_comm *);
  void fw_busreset(struct firewire_comm *, uint32_t);
  void fw_init(struct firewire_comm *);
 +void fw_destroy(struct firewire_comm *);
  struct fw_bind *fw_bindlookup(struct firewire_comm *, uint16_t, uint32_t);
  int fw_bindadd(struct firewire_comm *, struct fw_bind *);
  int fw_bindremove(struct firewire_comm *, struct fw_bind *);
 Index: dev/ieee1394/fwohci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/ieee1394/fwohci.c,v
 retrieving revision 1.130
 diff -p -u -r1.130 fwohci.c
 --- dev/ieee1394/fwohci.c	7 Sep 2010 07:19:45 -0000	1.130
 +++ dev/ieee1394/fwohci.c	16 Feb 2011 03:16:23 -0000
 @@ -323,38 +323,15 @@ static void fwohci_arcv(struct fwohci_so
  #define IRX_CH	36


 -int
 +/*
 + * Call fwohci_init before fwohci_attach to initialize the kernel's
 + * data structures well enough that fwohci_detach won't crash, even if
 + * fwohci_attach fails.
 + */
 +
 +void
  fwohci_init(struct fwohci_softc *sc)
  {
 -	uint32_t reg;
 -	uint8_t ui[8];
 -	int i, mver;
 -
 -/* OHCI version */
 -	reg = OREAD(sc, OHCI_VERSION);
 -	mver = (reg >> 16) & 0xff;
 -	aprint_normal_dev(sc->fc.dev, "OHCI version %x.%x (ROM=%d)\n",
 -	    mver, reg & 0xff, (reg >> 24) & 1);
 -	if (mver < 1 || mver > 9) {
 -		aprint_error_dev(sc->fc.dev, "invalid OHCI version\n");
 -		return ENXIO;
 -	}
 -
 -/* Available Isochronous DMA channel probe */
 -	OWRITE(sc, OHCI_IT_MASK, 0xffffffff);
 -	OWRITE(sc, OHCI_IR_MASK, 0xffffffff);
 -	reg = OREAD(sc, OHCI_IT_MASK) & OREAD(sc, OHCI_IR_MASK);
 -	OWRITE(sc, OHCI_IT_MASKCLR, 0xffffffff);
 -	OWRITE(sc, OHCI_IR_MASKCLR, 0xffffffff);
 -	for (i = 0; i < 0x20; i++)
 -		if ((reg & (1 << i)) == 0)
 -			break;
 -	sc->fc.nisodma = i;
 -	aprint_normal_dev(sc->fc.dev, "No. of Isochronous channels is %d.\n",
 -	    i);
 -	if (i == 0)
 -		return ENXIO;
 -
  	sc->fc.arq = &sc->arrq.xferq;
  	sc->fc.ars = &sc->arrs.xferq;
  	sc->fc.atq = &sc->atrq.xferq;
 @@ -395,6 +372,68 @@ fwohci_init(struct fwohci_softc *sc)
  	sc->atrq.off = OHCI_ATQOFF;
  	sc->atrs.off = OHCI_ATSOFF;

 +	sc->fc.tcode = tinfo;
 +
 +	sc->fc.cyctimer = fwohci_cyctimer;
 +	sc->fc.ibr = fwohci_ibr;
 +	sc->fc.set_bmr = fwohci_set_bus_manager;
 +	sc->fc.ioctl = fwohci_ioctl;
 +	sc->fc.irx_enable = fwohci_irx_enable;
 +	sc->fc.irx_disable = fwohci_irx_disable;
 +
 +	sc->fc.itx_enable = fwohci_itxbuf_enable;
 +	sc->fc.itx_disable = fwohci_itx_disable;
 +	sc->fc.timeout = fwohci_timeout;
 +	sc->fc.set_intr = fwohci_set_intr;
 +#if BYTE_ORDER == BIG_ENDIAN
 +	sc->fc.irx_post = fwohci_irx_post;
 +#else
 +	sc->fc.irx_post = NULL;
 +#endif
 +	sc->fc.itx_post = NULL;
 +
 +	sc->intmask = sc->irstat = sc->itstat = 0;
 +
 +	fw_init(&sc->fc);
 +}
 +
 +/*
 + * Call fwohci_attach after fwohci_init to initialize the hardware and
 + * attach children.
 + */
 +
 +int
 +fwohci_attach(struct fwohci_softc *sc)
 +{
 +	uint32_t reg;
 +	uint8_t ui[8];
 +	int i, mver;
 +
 +/* OHCI version */
 +	reg = OREAD(sc, OHCI_VERSION);
 +	mver = (reg >> 16) & 0xff;
 +	aprint_normal_dev(sc->fc.dev, "OHCI version %x.%x (ROM=%d)\n",
 +	    mver, reg & 0xff, (reg >> 24) & 1);
 +	if (mver < 1 || mver > 9) {
 +		aprint_error_dev(sc->fc.dev, "invalid OHCI version\n");
 +		return ENXIO;
 +	}
 +
 +/* Available Isochronous DMA channel probe */
 +	OWRITE(sc, OHCI_IT_MASK, 0xffffffff);
 +	OWRITE(sc, OHCI_IR_MASK, 0xffffffff);
 +	reg = OREAD(sc, OHCI_IT_MASK) & OREAD(sc, OHCI_IR_MASK);
 +	OWRITE(sc, OHCI_IT_MASKCLR, 0xffffffff);
 +	OWRITE(sc, OHCI_IR_MASKCLR, 0xffffffff);
 +	for (i = 0; i < 0x20; i++)
 +		if ((reg & (1 << i)) == 0)
 +			break;
 +	sc->fc.nisodma = i;
 +	aprint_normal_dev(sc->fc.dev, "No. of Isochronous channels is %d.\n",
 +	    i);
 +	if (i == 0)
 +		return ENXIO;
 +
  	for (i = 0; i < sc->fc.nisodma; i++) {
  		sc->fc.it[i] = &sc->it[i].xferq;
  		sc->fc.ir[i] = &sc->ir[i].xferq;
 @@ -406,8 +445,6 @@ fwohci_init(struct fwohci_softc *sc)
  		sc->ir[i].off = OHCI_IROFF(i);
  	}

 -	sc->fc.tcode = tinfo;
 -
  	sc->fc.config_rom = fwdma_alloc_setup(sc->fc.dev, sc->fc.dmat,
  	    CROMSIZE, &sc->crom_dma, CROMSIZE, BUS_DMA_NOWAIT);
  	if (sc->fc.config_rom == NULL) {
 @@ -467,27 +504,6 @@ fwohci_init(struct fwohci_softc *sc)
  	    "EUI64 %02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x\n",
  	    ui[0], ui[1], ui[2], ui[3], ui[4], ui[5], ui[6], ui[7]);

 -	sc->fc.cyctimer = fwohci_cyctimer;
 -	sc->fc.ibr = fwohci_ibr;
 -	sc->fc.set_bmr = fwohci_set_bus_manager;
 -	sc->fc.ioctl = fwohci_ioctl;
 -	sc->fc.irx_enable = fwohci_irx_enable;
 -	sc->fc.irx_disable = fwohci_irx_disable;
 -
 -	sc->fc.itx_enable = fwohci_itxbuf_enable;
 -	sc->fc.itx_disable = fwohci_itx_disable;
 -	sc->fc.timeout = fwohci_timeout;
 -	sc->fc.set_intr = fwohci_set_intr;
 -#if BYTE_ORDER == BIG_ENDIAN
 -	sc->fc.irx_post = fwohci_irx_post;
 -#else
 -	sc->fc.irx_post = NULL;
 -#endif
 -	sc->fc.itx_post = NULL;
 -
 -	sc->intmask = sc->irstat = sc->itstat = 0;
 -
 -	fw_init(&sc->fc);
  	fwohci_reset(sc);

  	sc->fc.bdev =
 @@ -499,10 +515,13 @@ fwohci_init(struct fwohci_softc *sc)
  int
  fwohci_detach(struct fwohci_softc *sc, int flags)
  {
 -	int i;
 +	int i, rv;

 -	if (sc->fc.bdev != NULL)
 -		config_detach(sc->fc.bdev, flags);
 +	if (sc->fc.bdev != NULL) {
 +		rv = config_detach(sc->fc.bdev, flags);
 +		if (rv)
 +			return rv;
 +	}
  	if (sc->sid_buf != NULL)
  		fwdma_free(sc->sid_dma.dma_tag, sc->sid_dma.dma_map,
  		    sc->sid_dma.v_addr);
 @@ -519,10 +538,7 @@ fwohci_detach(struct fwohci_softc *sc, i
  		fwohci_db_free(sc, &sc->ir[i]);
  	}

 -	mutex_destroy(&sc->arrq.xferq.q_mtx);
 -	mutex_destroy(&sc->arrs.xferq.q_mtx);
 -	mutex_destroy(&sc->atrq.xferq.q_mtx);
 -	mutex_destroy(&sc->atrs.xferq.q_mtx);
 +	fw_destroy(&sc->fc);

  	return 0;
  }
 Index: dev/ieee1394/fwohcivar.h
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/ieee1394/fwohcivar.h,v
 retrieving revision 1.32
 diff -p -u -r1.32 fwohcivar.h
 --- dev/ieee1394/fwohcivar.h	23 May 2010 18:56:59 -0000	1.32
 +++ dev/ieee1394/fwohcivar.h	16 Feb 2011 03:16:23 -0000
 @@ -79,7 +79,8 @@ struct fwohci_softc {
  #define OWRITE(sc, r, x) bus_space_write_4((sc)->bst, (sc)->bsh, (r), (x))
  #define OREAD(sc, r)	bus_space_read_4((sc)->bst, (sc)->bsh, (r))

 -int fwohci_init(struct fwohci_softc *);
 +void fwohci_init(struct fwohci_softc *);
 +int fwohci_attach(struct fwohci_softc *);
  int fwohci_detach(struct fwohci_softc *, int);
  int fwohci_intr(void *arg);
  int fwohci_resume(struct fwohci_softc *);
 Index: dev/pci/fwohci_pci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/fwohci_pci.c,v
 retrieving revision 1.39
 diff -p -u -r1.39 fwohci_pci.c
 --- dev/pci/fwohci_pci.c	29 Apr 2010 06:41:27 -0000	1.39
 +++ dev/pci/fwohci_pci.c	16 Feb 2011 03:16:44 -0000
 @@ -107,6 +107,8 @@ fwohci_pci_attach(device_t parent, devic
  	aprint_normal(": %s (rev. 0x%02x)\n", devinfo,
  	    PCI_REVISION(pa->pa_class));

 +	fwohci_init(&psc->psc_sc);
 +
  	psc->psc_sc.fc.dev = self;
  	psc->psc_sc.fc.dmat = pa->pa_dmat;
  	psc->psc_pc = pa->pa_pc;
 @@ -154,15 +156,12 @@ fwohci_pci_attach(device_t parent, devic
  	}
  	aprint_normal_dev(self, "interrupting at %s\n", intrstr);

 +	if (fwohci_attach(&psc->psc_sc) != 0)
 +		goto fail;
 +
  	if (!pmf_device_register(self, fwohci_pci_suspend, fwohci_pci_resume))
  		aprint_error_dev(self, "couldn't establish power handler\n");

 -	if (fwohci_init(&psc->psc_sc) != 0) {
 -		pci_intr_disestablish(pa->pa_pc, psc->psc_ih);
 -		bus_space_unmap(psc->psc_sc.bst, psc->psc_sc.bsh,
 -		    psc->psc_sc.bssize);
 -	}
 -
  	return;

  fail:

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 16:35:39 +0000

 If I disable Firewire, the machine seems to reliably resume.  However,
 after it resumes, there is an interrupt storm on ioapic0 pin 9.  This
 happens whether I use `drvctl -d fwohci0' or whether I add `no fwohci*
 at pci?' to my kernel configuration, so it doesn't seem related to
 Firewire.  (fwohci0 takes ioapic0 pin 19.)

 Suggestions?  I'll continue trying to find what's wrong with fwohci in
 the meantime.

From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: port-i386-maintainer@NetBSD.org, netbsd-bugs@NetBSD.org,
	Taylor R Campbell <campbell+netbsd@mumble.net>
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 19:38:38 +0200

 On Wed, Feb 16, 2011 at 04:40:04PM +0000, Taylor R Campbell wrote:
 >  If I disable Firewire, the machine seems to reliably resume.  However,
 >  after it resumes, there is an interrupt storm on ioapic0 pin 9.  This
 >  happens whether I use `drvctl -d fwohci0' or whether I add `no fwohci*
 >  at pci?' to my kernel configuration, so it doesn't seem related to
 >  Firewire.  (fwohci0 takes ioapic0 pin 19.)
 >  
 >  Suggestions?

 This is kind of a shot in the dark, but can you try the following small patch?

 - Jukka.

 Index: acpi.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/acpi/acpi.c,v
 retrieving revision 1.235
 diff -u -p -r1.235 acpi.c
 --- acpi.c	15 Feb 2011 20:24:11 -0000	1.235
 +++ acpi.c	16 Feb 2011 17:36:22 -0000
 @@ -1370,9 +1370,9 @@ acpi_enter_sleep_state(int state)
  				AcpiEnable();

  			(void)pmf_system_bus_resume(PMF_Q_NONE);
 +			(void)pmf_system_resume(PMF_Q_NONE);
  			(void)AcpiLeaveSleepState(state);
  			(void)AcpiSetFirmwareWakingVector(0);
 -			(void)pmf_system_resume(PMF_Q_NONE);
  		}

  		break;

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 17:50:14 +0000

 Here's a diff between the outputs of `pcictl pci3 dump -d 3 -f 0'
 (where fwohci0 attaches), before and after suspending & resuming.
 This probably explains why fwohci0 fails after resumption.  Perhaps
 someone who is better versed in PCI than I am can interpret it and
 figure out what went wrong faster than I can.

 The relevant part of the configuration looks like

 root -> mainbus0 -> pci0 -> ppb2 -> pci3 -> fwohci0.

 --- pci3dump.0.boot	2011-02-16 17:08:02.000000000 +0000
 +++ pci3dump.2.resumed	2011-02-16 17:09:36.000000000 +0000
 @@ -1,20 +1,20 @@
  PCI configuration registers:
    Common header:
 -    0x00: 0x581111c1 0x02900216 0x0c001061 0x0000f810
 +    0x00: 0x581111c1 0x02900000 0x0c001061 0x00000000

      Vendor Name: Lucent Technologies (0x11c1)
      Device Name: FW322/323 IEEE 1394 Host Controller (0x5811)
 -    Command register: 0x0216
 +    Command register: 0x0000
        I/O space accesses: off
 -      Memory space accesses: on
 -      Bus mastering: on
 +      Memory space accesses: off
 +      Bus mastering: off
        Special cycles: off
 -      MWI transactions: on
 +      MWI transactions: off
        Palette snooping: off
        Parity error checking: off
        Address/data stepping: off
        System error (SERR): off
 -      Fast back-to-back transactions: on
 +      Fast back-to-back transactions: off
        Interrupt disable: off
      Status register: 0x0290
        Capability List support: on
 @@ -34,17 +34,16 @@
      Revision ID: 0x61
      BIST: 0x00
      Header Type: 0x00 (0x00)
 -    Latency Timer: 0xf8
 -    Cache Line Size: 0x10
 +    Latency Timer: 0x00
 +    Cache Line Size: 0x00

    Type 0 ("normal" device) header:
 -    0x10: 0x90000000 0x00000000 0x00000000 0x00000000
 +    0x10: 0x00000000 0x00000000 0x00000000 0x00000000
      0x20: 0x00000000 0x00000000 0x00000000 0x581111c1
 -    0x30: 0x00000000 0x00000044 0x00000000 0x180c010b
 +    0x30: 0x00000000 0x00000044 0x00000000 0x180c0100

      Base address register at 0x10
 -      type: 32-bit nonprefetchable memory
 -      base: 0x90000000, not sized
 +      not implemented(?)
      Base address register at 0x14
        not implemented(?)
      Base address register at 0x18
 @@ -64,7 +63,7 @@
      Maximum Latency: 0x18
      Minimum Grant: 0x0c
      Interrupt pin: 0x01 (pin A)
 -    Interrupt line: 0x0b
 +    Interrupt line: 0x00

    Capability register at 0x44
      type: 0x01 (Power Management, rev. 1.0)
 @@ -78,15 +77,15 @@
        D1 power management state support: on
        D2 power management state support: on
        PME# support: 0x0f
 -    Control/status register: 0x8000
 +    Control/status register: 0x0000
        Power state: D0
        PCI Express reserved: off
        No soft reset: off
        PME# assertion disabled
 -      PME# status: on
 +      PME# status: off

    Device-dependent header:
 -    0x40: 0x00000000 0x7e020001 0x00008000 0x00000000
 +    0x40: 0x00000000 0x7e020001 0x00000000 0x00000000
      0x50: 0x00000000 0x00000000 0x00000000 0x00000000
      0x60: 0x00000000 0x00000000 0x00000000 0x00000000
      0x70: 0x........ 0x........ 0x00000000 0x00000000
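
 For readers who would rather see the changed registers by offset, here
 is a small userland sketch that compares the standard header words from
 the two dumps above.  The array contents are copied from the dumps; the
 names hdr_boot, hdr_resumed, and hdr_diff are made up for this
 illustration.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define HDR_WORDS 16	/* offsets 0x00..0x3c, the standard type 0 header */

/* Header words as dumped at boot. */
static const uint32_t hdr_boot[HDR_WORDS] = {
	0x581111c1, 0x02900216, 0x0c001061, 0x0000f810,
	0x90000000, 0x00000000, 0x00000000, 0x00000000,
	0x00000000, 0x00000000, 0x00000000, 0x581111c1,
	0x00000000, 0x00000044, 0x00000000, 0x180c010b,
};

/* Header words as dumped after suspend and resume. */
static const uint32_t hdr_resumed[HDR_WORDS] = {
	0x581111c1, 0x02900000, 0x0c001061, 0x00000000,
	0x00000000, 0x00000000, 0x00000000, 0x00000000,
	0x00000000, 0x00000000, 0x00000000, 0x581111c1,
	0x00000000, 0x00000044, 0x00000000, 0x180c0100,
};

/* Print each changed dword with its byte offset; return how many changed. */
static size_t
hdr_diff(const uint32_t *a, const uint32_t *b, size_t n)
{
	size_t i, changed = 0;

	for (i = 0; i < n; i++) {
		if (a[i] != b[i]) {
			printf("0x%02zx: 0x%08" PRIx32 " -> 0x%08" PRIx32 "\n",
			    i * 4, a[i], b[i]);
			changed++;
		}
	}
	return changed;
}
```

 Calling hdr_diff(hdr_boot, hdr_resumed, HDR_WORDS) reports offsets
 0x04 (command), 0x0c (latency timer and cache line size), 0x10 (the
 memory BAR), and 0x3c (interrupt line).  The read-only ID words are
 unchanged, which is what one would expect if the device simply lost the
 configuration programmed at boot.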

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: jruohonen@iki.fi
Cc: gnats-bugs@NetBSD.org, port-i386-maintainer@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 17:59:00 +0000

    Date: Wed, 16 Feb 2011 19:38:38 +0200
    From: Jukka Ruohonen <jruohonen@iki.fi>

    This is kind of a shot in the dark, but can you try the following small patch?

 No dice -- same interrupt storm.

From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: port-i386-maintainer@NetBSD.org, netbsd-bugs@NetBSD.org,
	Taylor R Campbell <campbell+netbsd@mumble.net>
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 20:26:06 +0200

 On Wed, Feb 16, 2011 at 06:00:09PM +0000, Taylor R Campbell wrote:
 >  This is kind of a shot in the dark, but can you try the following small
 >  patch?
 >  
 >  No dice -- same interrupt storm.

 Another one attached.

 This is based on the following Linux bug report that sounds awfully similar:

 	https://bugzilla.kernel.org/show_bug.cgi?id=6670

 Len Brown from Intel concludes therein that this would be a BIOS bug.

 Index: acpi_wakeup.c
 ===================================================================
 RCS file: /cvsroot/src/sys/arch/x86/acpi/acpi_wakeup.c,v
 retrieving revision 1.27
 diff -u -p -r1.27 acpi_wakeup.c
 --- acpi_wakeup.c	13 Jan 2011 03:45:38 -0000	1.27
 +++ acpi_wakeup.c	16 Feb 2011 18:23:00 -0000
 @@ -342,6 +342,11 @@ acpi_md_sleep(int state)
  	inittodr(time_second);

  	/*
 +	 * A workaround for broken BIOS.
 +	 */
 +	(void)AcpiWriteBitRegister(ACPI_BITREG_SCI_ENABLE, 1);
 +
 +	/*
  	 * Clear fixed events (see e.g. ACPI 3.0, p. 62).
  	 * Also prevent GPEs from misfiring by disabling
  	 * all GPEs before interrupts are enabled. The

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: jruohonen@iki.fi
Cc: gnats-bugs@NetBSD.org, port-i386-maintainer@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 18:42:10 +0000

    Date: Wed, 16 Feb 2011 20:26:06 +0200
    From: Jukka Ruohonen <jruohonen@iki.fi>

    Another one attached.

    This is based on the following Linux bug report that sounds awfully
    similar:

            https://bugzilla.kernel.org/show_bug.cgi?id=6670

    Len Brown from Intel concludes therein that this would be a BIOS bug.

 Seems much happier now!  No more interrupt storm, and hw.acpi.stat.sci
 is only one greater than hw.acpi.stat.gpe.

From: "Jukka Ruohonen" <jruoho@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44581 CVS commit: src/sys/arch/x86/acpi
Date: Wed, 16 Feb 2011 18:55:51 +0000

 Module Name:	src
 Committed By:	jruoho
 Date:		Wed Feb 16 18:55:50 UTC 2011

 Modified Files:
 	src/sys/arch/x86/acpi: acpi_wakeup.c

 Log Message:
 Explicitly re-enable the SCI interrupt when the wakeup starts (and before
 interrupts are enabled). A workaround for a BIOS bug. Fixes the interrupt
 storm reported by Taylor R. Campbell in PR # 44581.


 To generate a diff of this commit:
 cvs rdiff -u -r1.27 -r1.28 src/sys/arch/x86/acpi/acpi_wakeup.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 20:19:10 +0000

 fwohci(4) is not happy to attach if its PCI command status register
 does not have bus mastering or memory mapping enabled.  I think this
 is the case for many drivers.  If I have fwohci0 detached, I observe
 that when pci3 is resumed, device 3 function 0 (where fwohci0
 attaches) has its command status register zeroed.  That makes
 fwohci_pci_attach fail, because pci_mapreg_map thinks the flags are
 inappropriate.

 I tried the following patch, and, lo and behold, the machine resumes!
 (I have lots of changes in this tree which may be necessary too.)

 But here's the weird part.  The output I see is

    fwohci0: on resume, csr = 42992150
    fwohci0: setting csr to 42992150

 I'm glad my machine is resuming now, but this leaves me unsatisfied
 inside...
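
 To spell out what that output says: the debugging printfs use %d, so
 42992150 is the decimal form of 0x02900216, i.e. status 0x0290 and
 command 0x0216, the same values as in the boot-time pcictl dump.  In
 other words the two bits the patch ORs in were already set when the
 resume handler ran.  A small userland sketch of the decoding follows;
 the CMD_* macro names and the csr_* helpers are local to this sketch
 (the bit values are those of the PCI command register).

```c
#include <stdbool.h>
#include <stdint.h>

/* PCI command-register bits; macro names are local to this sketch. */
#define CMD_MEM_ENABLE		0x0002
#define CMD_MASTER_ENABLE	0x0004
#define CMD_MWI_ENABLE		0x0010
#define CMD_BACKTOBACK_ENABLE	0x0200

/* The command register is the low half of the 32-bit word at 0x04. */
static uint16_t
csr_command(uint32_t csr)
{
	return (uint16_t)(csr & 0xffff);
}

/* The status register is the high half. */
static uint16_t
csr_status(uint32_t csr)
{
	return (uint16_t)(csr >> 16);
}

/* True if both memory-space access and bus mastering are enabled. */
static bool
csr_ready_for_dma(uint32_t csr)
{
	const uint16_t need = CMD_MEM_ENABLE | CMD_MASTER_ENABLE;

	return (csr_command(csr) & need) == need;
}
```

 With csr = 42992150, csr_command() gives 0x0216 (memory, bus
 mastering, MWI, and fast back-to-back all on) and csr_ready_for_dma()
 is true; with the zeroed command register seen when fwohci0 is
 detached across the suspend, 0x02900000, it is false.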

 Also, if I detach fwohci0 before suspending and rescan pci3 after
 resuming, then the csr has bus mastering and memory mapping disabled,
 so fwohci0 fails to attach.  I know this doesn't matter much, but it
 would be nice if suspending while fwohci0 is detached didn't render
 fwohci0 unusable until the next reboot.

 Index: fwohci_pci.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/pci/fwohci_pci.c,v
 retrieving revision 1.39
 diff -p -u -r1.39 fwohci_pci.c
 --- fwohci_pci.c	29 Apr 2010 06:41:27 -0000	1.39
 +++ fwohci_pci.c	16 Feb 2011 20:08:21 -0000
 @@ -214,10 +213,18 @@ fwohci_pci_resume(device_t dv, const pmf
  {
  	struct fwohci_pci_softc *psc = device_private(dv);
  	int s;
 +	uint32_t csr;

  	s = splbio();
 +	csr = pci_conf_read(psc->psc_pc, psc->psc_tag, PCI_COMMAND_STATUS_REG);
 +	aprint_normal_dev(dv, "on resume, csr = %d\n", csr);
 +	csr |= PCI_COMMAND_MASTER_ENABLE | PCI_COMMAND_MEM_ENABLE;
 +	aprint_normal_dev(dv, "setting csr to %d\n", csr);
 +	pci_conf_write(psc->psc_pc, psc->psc_tag, PCI_COMMAND_STATUS_REG, csr);
  	fwohci_resume(&psc->psc_sc);
 +	aprint_normal_dev(dv, "fwohci_resume returned, about to splx %d\n", s);
  	splx(s);
 +	aprint_normal_dev(dv, "fwohci_resume splx %d\n", s);

  	return true;
  }

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 20:27:58 +0000

 Also, in order to test rescanning pci3 and reattaching fwohci0, I had
 to apply the following patch to fix a config panic induced by a
 trivial race condition in firewire.c.  (fw_bus_probe_thread may call
 config_pending_decr before control has reached firewireattach's call
 to config_pending_incr, causing a panic.)

 Index: firewire.c
 ===================================================================
 RCS file: /cvsroot/src/sys/dev/ieee1394/firewire.c,v
 retrieving revision 1.38
 diff -c -r1.38 firewire.c
 --- firewire.c	7 Sep 2010 07:26:54 -0000	1.38
 +++ firewire.c	16 Feb 2011 20:26:36 -0000
 @@ -257,10 +257,10 @@
  	callout_schedule(&fc->timeout_callout, hz);

  	/* create thread */
 +	config_pending_incr();
  	if (kthread_create(PRI_NONE, KTHREAD_MPSAFE, NULL, fw_bus_probe_thread,
  	    fc, &fc->probe_thread, "fw%dprobe", device_unit(fc->bdev)))
  		aprint_error_dev(self, "kthread_create failed\n");
 -	config_pending_incr();

  	devlist = malloc(sizeof(struct firewire_dev_list), M_DEVBUF, M_NOWAIT);
  	if (devlist == NULL) {
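
 The ordering problem can be shown with a deterministic toy model.
 This is not the kernel's config_pending implementation: the counter,
 the toy_* functions, and the two attach_* functions are invented for
 illustration, and the "thread" is modelled by calling its body inline,
 which corresponds to the worst-case schedule where the new thread runs
 to completion before its creator executes another statement.

```c
#include <stdbool.h>

/* Toy model of a pending-configuration counter. */
static int pending;

static void
toy_pending_incr(void)
{
	pending++;
}

/* Returns false where the real counter would go negative. */
static bool
toy_pending_decr(void)
{
	if (pending == 0)
		return false;
	pending--;
	return true;
}

/* Models the probe thread, which drops the reference when it is done. */
static bool
toy_probe_thread(void)
{
	return toy_pending_decr();
}

/* Old order: create the thread first, then take the reference. */
static bool
attach_incr_after_create(void)
{
	bool ok;

	pending = 0;
	ok = toy_probe_thread();	/* thread runs immediately */
	toy_pending_incr();		/* too late: decrement already happened */
	return ok;
}

/* Patched order: take the reference before the thread can exist. */
static bool
attach_incr_before_create(void)
{
	pending = 0;
	toy_pending_incr();
	return toy_probe_thread();
}
```

 In this model attach_incr_after_create() returns false (the decrement
 found nothing to drop), while attach_incr_before_create() returns true
 and leaves the counter at zero.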

From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Thu, 17 Feb 2011 00:12:01 +0000

    Date: Wed, 16 Feb 2011 20:19:10 +0000
    From: Taylor R Campbell <campbell+netbsd@mumble.net>

    I tried the following patch, and, lo and behold, the machine resumes!
    (I have lots of changes in this tree which may be necessary too.)

 Oops -- this was a red herring.  In my testing of dozens of different
 configurations, I must have mixed some of them up.  If I undo this
 patch, it also works, so it must be the SCI business that let Firewire
 resumption work.  So I think this PR can be closed now, unless someone
 is interested in applying one of the earlier patches I sent to make
 reattaching fwohci0 merely fail and not panic after suspending and
 resuming.

 I also tested removing the calls in firewire.c to config_pending_incr
 and config_pending_decr -- I don't see why they're there, I observed a
 panic because of a race condition between them, and the code seems to
 work without them.  At the very least, I think the calls to
 kthread_create and config_pending_incr should be reversed, like they
 were when the code was imported.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Sun, 20 Feb 2011 00:37:07 +0000

 On Thu, Feb 17, 2011 at 12:15:04AM +0000, Taylor R Campbell wrote:
  >  [...] unless someone
  >  is interested in applying one of the earlier patches I sent to make
  >  reattaching fwohci0 merely fail and not panic after suspending and
  >  resuming.

 Those sound like good patches to have... I don't know suspend/resume
 stuff well enough to want to go commit them myself though.

 -- 
 David A. Holland
 dholland@netbsd.org

State-Changed-From-To: open->analyzed
State-Changed-By: jruoho@NetBSD.org
State-Changed-When: Sun, 27 Feb 2011 10:36:13 +0000
State-Changed-Why:
Problem has been analyzed.

Someone who understands fwohci(4) needs to evaluate the patches.



From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44581 CVS commit: src/sys/dev
Date: Sat, 4 Aug 2012 03:55:44 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Aug  4 03:55:44 UTC 2012

 Modified Files:
 	src/sys/dev/cardbus: fwohci_cardbus.c
 	src/sys/dev/ieee1394: firewire.c firewirereg.h fwohci.c fwohcivar.h
 	src/sys/dev/pci: fwohci_pci.c

 Log Message:
 Fix error branches and config pending races in firewire init.

 This way, if anything fails, it just fails; you don't panic.  This can
 happen if suspending and resuming of firewire is broken (e.g., as I
 encountered in PR kern/44581).


 To generate a diff of this commit:
 cvs rdiff -u -r1.34 -r1.35 src/sys/dev/cardbus/fwohci_cardbus.c
 cvs rdiff -u -r1.39 -r1.40 src/sys/dev/ieee1394/firewire.c
 cvs rdiff -u -r1.17 -r1.18 src/sys/dev/ieee1394/firewirereg.h
 cvs rdiff -u -r1.132 -r1.133 src/sys/dev/ieee1394/fwohci.c
 cvs rdiff -u -r1.33 -r1.34 src/sys/dev/ieee1394/fwohcivar.h
 cvs rdiff -u -r1.40 -r1.41 src/sys/dev/pci/fwohci_pci.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Mon, 21 Nov 2016 01:13:03 +0000
State-Changed-Why:
fixed years ago


>Unformatted:

Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.