NetBSD Problem Report #44581
From campbell@mumble.net Tue Feb 15 21:20:21 2011
Return-Path: <campbell@mumble.net>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 801CF63B100
for <gnats-bugs@gnats.NetBSD.org>; Tue, 15 Feb 2011 21:20:21 +0000 (UTC)
Message-Id: <20110215212019.A66A098298@pluto.mumble.net>
Date: Tue, 15 Feb 2011 21:20:19 +0000 (UTC)
From: Taylor R Campbell <campbell+netbsd@mumble.net>
Reply-To: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@gnats.NetBSD.org
Subject: MacBook1,1 won't resume after suspend
X-Send-Pr-Version: 3.95
>Number: 44581
>Category: port-i386
>Synopsis: MacBook1,1 won't resume after suspend
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-i386-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Feb 15 21:25:00 +0000 2011
>Closed-Date: Mon Nov 21 01:13:03 +0000 2016
>Last-Modified: Mon Nov 21 01:13:03 +0000 2016
>Originator: Taylor R Campbell <campbell+netbsd@mumble.net>
>Release: NetBSD 5.1
>Organization:
>Environment:
System: NetBSD oberon.local 5.1 NetBSD 5.1 (GENERIC) #0: Sun Nov 7 14:39:56 UTC 2010 builds@b6.netbsd.org:/home/builds/ab/netbsd-5-1-RELEASE/i386/201011061943Z-obj/home/builds/ab/netbsd-5-1-RELEASE/src/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
Under 5.1, I observe:
# sysctl hw.acpi machdep | grep acpi
hw.acpi.root = 1040416
hw.acpi.supported_states = S0 S3 S4 S5
machdep.acpi_vbios_reset = 1
machdep.acpi_beep_on_reset = 0
machdep.acpiapm.standby = 1
machdep.acpiapm.suspend = 3
Suspending and resuming mainbus0 with
# drvctl -S mainbus0; sleep 10; drvctl -Q mainbus0
seems to work. The disk stops spinning, keyboard input stops
working, the kernel stops replying to pings, and so on. Ten
seconds later, the disk spins up again, the kernel sprays a lot
of messages to the console about device detachment and
attachment, keyboard input starts working, the kernel starts
replying to pings, and so on.
With machdep.acpi_vbios_reset set to 0, 1, or 2,
# sysctl -w machdep.sleep_state=3; sleep 60; drvctl -Q mainbus0
prints a kernel message about acpi entering state 3 and about
flushing disk caches, and suspends the machine. Neither
keyboard nor trackpad input wakes it. Pressing the power
button causes the optical and magnetic drives to spin up, but
the display is still dark and the machine unresponsive: no
replies to pings, pressing the power button again has no
effect, &c., until I force a reboot by holding the power button
down for several seconds.
Under a current kernel as of a couple days ago, I observe:
# sysctl hw.acpi
hw.acpi.root = 1040416
hw.acpi.sleep.state = 0
hw.acpi.sleep.states = S0 S3 S4 S5
hw.acpi.sleep.vbios = 1
hw.acpi.stat.gpe = 17
hw.acpi.stat.sci = 17
hw.acpi.stat.fixed = 17
hw.acpi.stat.method = 17
Suspending and resuming mainbus0 with
# drvctl -S mainbus0; sleep 10; drvctl -Q mainbus0
sometimes works and sometimes hangs the machine. If I suspend
all of the children of mainbus0 (cpu0 cpu1 ioapic0 acpi0 pci0)
and then resume them, it sometimes works and I sometimes get a
garbled panic, at which point ddb ignores keyboard input and I
have to forcibly reboot the machine.
With hw.acpi.sleep.vbios set to 1,
# sysctl -w hw.acpi.sleep.state=3; sleep 60; drvctl -Q mainbus0
behaves as in 5.1. With hw.acpi.sleep.vbios set to 0 or 2,
after the machine suspends, the display turns on again to
reveal a number of messages from the kernel, apparently about
attaching and then detaching the USB keyboard and trackpad --
yes, in that order: attaching, and then detaching. At the end
is a message about attaching fwohci. I don't see a shell
prompt, but it could have been pushed off screen by the kernel
messages. Keyboard input has no effect, the kernel does not
reply to pings, and pushing the power button doesn't seem to do
anything until, as usual, I hold it down for several seconds to
forcibly reboot the machine.
Let me know if you'd like to see full dmesg output. This
machine has two CPUs.
>How-To-Repeat:
Try to suspend and resume a MacBook1,1. Grumble in frustration
at the failure.
>Fix:
Yes, please!
>Release-Note:
>Audit-Trail:
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Tue, 15 Feb 2011 23:21:00 +0000
A little more information:
I backported support for CTLTYPE_BOOL to 5.1's sysctl(8) so that I
could try seeing and setting hw.acpi.wake.*. If I run a -current
kernel, and set hw.acpi.wake.uhci0 &c. to 1, then keyboard input
causes the machine to start to resume just like hitting the power
button did before, but the display doesn't come on, even if
hw.acpi.sleep.vbios is 0.
I booted a -current kernel with `-xs1', set hw.acpi.sleep.vbios to 0,
set hw.acpi.sleep.state to 3, and then hit the power button. The
display woke up, showing the following text (ten-fingered copy &
paste; there are errors, no doubt). I presume there was much more,
but it has scrolled off the screen.
uhidev4: at uhub3 port 1 (addr 2) disconnected
wsmouse1: detached
ums1: detached
uhidev5: detached
uhidev5: at uhub3 port 1 (addr 2) disconnected
atabus2: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 87
atabus3atabus3: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 9f
atabus3: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 87
iic0iic0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
iic0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 7
isa0isa0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
isa0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 7
audio0audio0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 9f
audio0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 87
mskc0mskc0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
mskc0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 7
ath0ath0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
ath0: device_pmf_unlock.2350, sysctl dvl_nlock 0 dvl_nwait 0 dv_flags 1f
fwohci0fwohci0: device_pmf_lock1.2323, sysctl dvl_nlock 1 dvl_nwait 0 dv_flags 1f
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 2048 bytes.
fwohci0: Initate bus reset
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 00:55:42 +0000
At jmcneill's suggestion, I tried disabling fwohci before suspending,
with `drvctl -d fwohci0', on a current kernel. Now the machine
resumes!
Of course, now it also doesn't have Firewire. When I rescanned pci3
(where fwohci0 was attached before) with `drvctl -r pci3', the kernel
complained `fwohci0: can't map OHCI register space'. When I tried
detaching fwohci0 again, the kernel panicked on an uninitialized lock.
Will investigate further.
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 03:19:20 +0000
Here's a patch to make the lockdebug panic go away when I reattach and
redetach fwohci0. This splits fwohci_init into two routines, one that
won't fail and guarantees that the relevant kernel data structures are
initialized well enough for fwohci_detach not to barf (fwohci_init),
and one that is allowed to fail (fwohci_attach).
Next I'll try to find what's wrong with the pmf handlers to make
fwohci fail to resume and fail to reattach after the rest of the
system has suspended and resumed while it has been detached.
Index: dev/cardbus/fwohci_cardbus.c
===================================================================
RCS file: /cvsroot/src/sys/dev/cardbus/fwohci_cardbus.c,v
retrieving revision 1.33
diff -p -u -r1.33 fwohci_cardbus.c
--- dev/cardbus/fwohci_cardbus.c 19 Apr 2010 07:05:15 -0000 1.33
+++ dev/cardbus/fwohci_cardbus.c 16 Feb 2011 03:16:19 -0000
@@ -98,6 +98,8 @@ fwohci_cardbus_attach(device_t parent, d
PCI_REVISION(ca->ca_class));
aprint_naive("\n");
+ fwohci_init(&sc->sc_sc);
+
/* Map I/O registers */
if (Cardbus_mapreg_map(ct, PCI_OHCI_MAP_REGISTER,
PCI_MAPREG_TYPE_MEM, 0,
@@ -129,7 +131,7 @@ fwohci_cardbus_attach(device_t parent, d
}
/* XXX NULL should be replaced by some call to Cardbus coed */
- if (fwohci_init(&sc->sc_sc) != 0) {
+ if (fwohci_attach(&sc->sc_sc) != 0) {
Cardbus_intr_disestablish(ct, sc->sc_ih);
sc->sc_ih = NULL;
}
Index: dev/ieee1394/firewire.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ieee1394/firewire.c,v
retrieving revision 1.38
diff -p -u -r1.38 firewire.c
--- dev/ieee1394/firewire.c 7 Sep 2010 07:26:54 -0000 1.38
+++ dev/ieee1394/firewire.c 16 Feb 2011 03:16:23 -0000
@@ -680,6 +680,15 @@ fw_init(struct firewire_comm *fc)
fc->crom_src_buf = NULL;
}
+void
+fw_destroy(struct firewire_comm *fc)
+{
+ mutex_destroy(&fc->arq->q_mtx);
+ mutex_destroy(&fc->ars->q_mtx);
+ mutex_destroy(&fc->atq->q_mtx);
+ mutex_destroy(&fc->ats->q_mtx);
+}
+
#define BIND_CMP(addr, fwb) \
(((addr) < (fwb)->start) ? -1 : ((fwb)->end < (addr)) ? 1 : 0)
Index: dev/ieee1394/firewirereg.h
===================================================================
RCS file: /cvsroot/src/sys/dev/ieee1394/firewirereg.h,v
retrieving revision 1.15
diff -p -u -r1.15 firewirereg.h
--- dev/ieee1394/firewirereg.h 14 Nov 2010 15:47:20 -0000 1.15
+++ dev/ieee1394/firewirereg.h 16 Feb 2011 03:16:23 -0000
@@ -283,6 +283,7 @@ int fw_xferwait(struct fw_xfer *);
void fw_drain_txq(struct firewire_comm *);
void fw_busreset(struct firewire_comm *, uint32_t);
void fw_init(struct firewire_comm *);
+void fw_destroy(struct firewire_comm *);
struct fw_bind *fw_bindlookup(struct firewire_comm *, uint16_t, uint32_t);
int fw_bindadd(struct firewire_comm *, struct fw_bind *);
int fw_bindremove(struct firewire_comm *, struct fw_bind *);
Index: dev/ieee1394/fwohci.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ieee1394/fwohci.c,v
retrieving revision 1.130
diff -p -u -r1.130 fwohci.c
--- dev/ieee1394/fwohci.c 7 Sep 2010 07:19:45 -0000 1.130
+++ dev/ieee1394/fwohci.c 16 Feb 2011 03:16:23 -0000
@@ -323,38 +323,15 @@ static void fwohci_arcv(struct fwohci_so
#define IRX_CH 36
-int
+/*
+ * Call fwohci_init before fwohci_attach to initialize the kernel's
+ * data structures well enough that fwohci_detach won't crash, even if
+ * fwohci_attach fails.
+ */
+
+void
fwohci_init(struct fwohci_softc *sc)
{
- uint32_t reg;
- uint8_t ui[8];
- int i, mver;
-
-/* OHCI version */
- reg = OREAD(sc, OHCI_VERSION);
- mver = (reg >> 16) & 0xff;
- aprint_normal_dev(sc->fc.dev, "OHCI version %x.%x (ROM=%d)\n",
- mver, reg & 0xff, (reg >> 24) & 1);
- if (mver < 1 || mver > 9) {
- aprint_error_dev(sc->fc.dev, "invalid OHCI version\n");
- return ENXIO;
- }
-
-/* Available Isochronous DMA channel probe */
- OWRITE(sc, OHCI_IT_MASK, 0xffffffff);
- OWRITE(sc, OHCI_IR_MASK, 0xffffffff);
- reg = OREAD(sc, OHCI_IT_MASK) & OREAD(sc, OHCI_IR_MASK);
- OWRITE(sc, OHCI_IT_MASKCLR, 0xffffffff);
- OWRITE(sc, OHCI_IR_MASKCLR, 0xffffffff);
- for (i = 0; i < 0x20; i++)
- if ((reg & (1 << i)) == 0)
- break;
- sc->fc.nisodma = i;
- aprint_normal_dev(sc->fc.dev, "No. of Isochronous channels is %d.\n",
- i);
- if (i == 0)
- return ENXIO;
-
sc->fc.arq = &sc->arrq.xferq;
sc->fc.ars = &sc->arrs.xferq;
sc->fc.atq = &sc->atrq.xferq;
@@ -395,6 +372,68 @@ fwohci_init(struct fwohci_softc *sc)
sc->atrq.off = OHCI_ATQOFF;
sc->atrs.off = OHCI_ATSOFF;
+ sc->fc.tcode = tinfo;
+
+ sc->fc.cyctimer = fwohci_cyctimer;
+ sc->fc.ibr = fwohci_ibr;
+ sc->fc.set_bmr = fwohci_set_bus_manager;
+ sc->fc.ioctl = fwohci_ioctl;
+ sc->fc.irx_enable = fwohci_irx_enable;
+ sc->fc.irx_disable = fwohci_irx_disable;
+
+ sc->fc.itx_enable = fwohci_itxbuf_enable;
+ sc->fc.itx_disable = fwohci_itx_disable;
+ sc->fc.timeout = fwohci_timeout;
+ sc->fc.set_intr = fwohci_set_intr;
+#if BYTE_ORDER == BIG_ENDIAN
+ sc->fc.irx_post = fwohci_irx_post;
+#else
+ sc->fc.irx_post = NULL;
+#endif
+ sc->fc.itx_post = NULL;
+
+ sc->intmask = sc->irstat = sc->itstat = 0;
+
+ fw_init(&sc->fc);
+}
+
+/*
+ * Call fwohci_attach after fwohci_init to initialize the hardware and
+ * attach children.
+ */
+
+int
+fwohci_attach(struct fwohci_softc *sc)
+{
+ uint32_t reg;
+ uint8_t ui[8];
+ int i, mver;
+
+/* OHCI version */
+ reg = OREAD(sc, OHCI_VERSION);
+ mver = (reg >> 16) & 0xff;
+ aprint_normal_dev(sc->fc.dev, "OHCI version %x.%x (ROM=%d)\n",
+ mver, reg & 0xff, (reg >> 24) & 1);
+ if (mver < 1 || mver > 9) {
+ aprint_error_dev(sc->fc.dev, "invalid OHCI version\n");
+ return ENXIO;
+ }
+
+/* Available Isochronous DMA channel probe */
+ OWRITE(sc, OHCI_IT_MASK, 0xffffffff);
+ OWRITE(sc, OHCI_IR_MASK, 0xffffffff);
+ reg = OREAD(sc, OHCI_IT_MASK) & OREAD(sc, OHCI_IR_MASK);
+ OWRITE(sc, OHCI_IT_MASKCLR, 0xffffffff);
+ OWRITE(sc, OHCI_IR_MASKCLR, 0xffffffff);
+ for (i = 0; i < 0x20; i++)
+ if ((reg & (1 << i)) == 0)
+ break;
+ sc->fc.nisodma = i;
+ aprint_normal_dev(sc->fc.dev, "No. of Isochronous channels is %d.\n",
+ i);
+ if (i == 0)
+ return ENXIO;
+
for (i = 0; i < sc->fc.nisodma; i++) {
sc->fc.it[i] = &sc->it[i].xferq;
sc->fc.ir[i] = &sc->ir[i].xferq;
@@ -406,8 +445,6 @@ fwohci_init(struct fwohci_softc *sc)
sc->ir[i].off = OHCI_IROFF(i);
}
- sc->fc.tcode = tinfo;
-
sc->fc.config_rom = fwdma_alloc_setup(sc->fc.dev, sc->fc.dmat,
CROMSIZE, &sc->crom_dma, CROMSIZE, BUS_DMA_NOWAIT);
if (sc->fc.config_rom == NULL) {
@@ -467,27 +504,6 @@ fwohci_init(struct fwohci_softc *sc)
"EUI64 %02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x\n",
ui[0], ui[1], ui[2], ui[3], ui[4], ui[5], ui[6], ui[7]);
- sc->fc.cyctimer = fwohci_cyctimer;
- sc->fc.ibr = fwohci_ibr;
- sc->fc.set_bmr = fwohci_set_bus_manager;
- sc->fc.ioctl = fwohci_ioctl;
- sc->fc.irx_enable = fwohci_irx_enable;
- sc->fc.irx_disable = fwohci_irx_disable;
-
- sc->fc.itx_enable = fwohci_itxbuf_enable;
- sc->fc.itx_disable = fwohci_itx_disable;
- sc->fc.timeout = fwohci_timeout;
- sc->fc.set_intr = fwohci_set_intr;
-#if BYTE_ORDER == BIG_ENDIAN
- sc->fc.irx_post = fwohci_irx_post;
-#else
- sc->fc.irx_post = NULL;
-#endif
- sc->fc.itx_post = NULL;
-
- sc->intmask = sc->irstat = sc->itstat = 0;
-
- fw_init(&sc->fc);
fwohci_reset(sc);
sc->fc.bdev =
@@ -499,10 +515,13 @@ fwohci_init(struct fwohci_softc *sc)
int
fwohci_detach(struct fwohci_softc *sc, int flags)
{
- int i;
+ int i, rv;
- if (sc->fc.bdev != NULL)
- config_detach(sc->fc.bdev, flags);
+ if (sc->fc.bdev != NULL) {
+ rv = config_detach(sc->fc.bdev, flags);
+ if (rv)
+ return rv;
+ }
if (sc->sid_buf != NULL)
fwdma_free(sc->sid_dma.dma_tag, sc->sid_dma.dma_map,
sc->sid_dma.v_addr);
@@ -519,10 +538,7 @@ fwohci_detach(struct fwohci_softc *sc, i
fwohci_db_free(sc, &sc->ir[i]);
}
- mutex_destroy(&sc->arrq.xferq.q_mtx);
- mutex_destroy(&sc->arrs.xferq.q_mtx);
- mutex_destroy(&sc->atrq.xferq.q_mtx);
- mutex_destroy(&sc->atrs.xferq.q_mtx);
+ fw_destroy(&sc->fc);
return 0;
}
Index: dev/ieee1394/fwohcivar.h
===================================================================
RCS file: /cvsroot/src/sys/dev/ieee1394/fwohcivar.h,v
retrieving revision 1.32
diff -p -u -r1.32 fwohcivar.h
--- dev/ieee1394/fwohcivar.h 23 May 2010 18:56:59 -0000 1.32
+++ dev/ieee1394/fwohcivar.h 16 Feb 2011 03:16:23 -0000
@@ -79,7 +79,8 @@ struct fwohci_softc {
#define OWRITE(sc, r, x) bus_space_write_4((sc)->bst, (sc)->bsh, (r), (x))
#define OREAD(sc, r) bus_space_read_4((sc)->bst, (sc)->bsh, (r))
-int fwohci_init(struct fwohci_softc *);
+void fwohci_init(struct fwohci_softc *);
+int fwohci_attach(struct fwohci_softc *);
int fwohci_detach(struct fwohci_softc *, int);
int fwohci_intr(void *arg);
int fwohci_resume(struct fwohci_softc *);
Index: dev/pci/fwohci_pci.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/fwohci_pci.c,v
retrieving revision 1.39
diff -p -u -r1.39 fwohci_pci.c
--- dev/pci/fwohci_pci.c 29 Apr 2010 06:41:27 -0000 1.39
+++ dev/pci/fwohci_pci.c 16 Feb 2011 03:16:44 -0000
@@ -107,6 +107,8 @@ fwohci_pci_attach(device_t parent, devic
aprint_normal(": %s (rev. 0x%02x)\n", devinfo,
PCI_REVISION(pa->pa_class));
+ fwohci_init(&psc->psc_sc);
+
psc->psc_sc.fc.dev = self;
psc->psc_sc.fc.dmat = pa->pa_dmat;
psc->psc_pc = pa->pa_pc;
@@ -154,15 +156,12 @@ fwohci_pci_attach(device_t parent, devic
}
aprint_normal_dev(self, "interrupting at %s\n", intrstr);
+ if (fwohci_attach(&psc->psc_sc) != 0)
+ goto fail;
+
if (!pmf_device_register(self, fwohci_pci_suspend, fwohci_pci_resume))
aprint_error_dev(self, "couldn't establish power handler\n");
- if (fwohci_init(&psc->psc_sc) != 0) {
- pci_intr_disestablish(pa->pa_pc, psc->psc_ih);
- bus_space_unmap(psc->psc_sc.bst, psc->psc_sc.bsh,
- psc->psc_sc.bssize);
- }
-
return;
fail:
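The init/attach split described in this message can be sketched in userspace as follows. This is only an illustration under stated assumptions: `toy_softc`, `toy_init`, `toy_attach`, and `toy_detach` are hypothetical names, the boolean `lock_ready` stands in for an initialized kernel mutex, and `hw_ok` simulates whether the hardware probe succeeds. It is not the fwohci code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver softc; not the real fwohci_softc. */
struct toy_softc {
	bool lock_ready;	/* stands in for an initialized mutex */
	bool hw_ok;		/* simulated: does the hardware probe succeed? */
	void *child;		/* stands in for an attached child device */
};

/* Phase 1: cannot fail. Sets up everything that detach will touch. */
void
toy_init(struct toy_softc *sc, bool hw_ok)
{
	sc->lock_ready = true;
	sc->hw_ok = hw_ok;
	sc->child = NULL;
}

/* Phase 2: may fail. Only this phase touches the (simulated) hardware. */
int
toy_attach(struct toy_softc *sc)
{
	assert(sc->lock_ready);	/* contract: toy_init ran first */
	if (!sc->hw_ok)
		return ENXIO;
	sc->child = sc;		/* pretend a child device attached */
	return 0;
}

/* Safe after toy_init, whether or not toy_attach succeeded. */
int
toy_detach(struct toy_softc *sc)
{
	if (!sc->lock_ready)
		return EINVAL;	/* the kernel analogue was a lockdebug panic */
	sc->child = NULL;
	sc->lock_ready = false;
	return 0;
}
```

Calling `toy_init`, then a failing `toy_attach`, then `toy_detach` returns 0 from the detach, which is the property the patch aims for: a failed attach leaves nothing half-initialized for detach to trip over.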
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 16:35:39 +0000
If I disable Firewire, the machine seems to reliably resume. However,
after it resumes, there is an interrupt storm on ioapic0 pin 9. This
happens whether I use `drvctl -d fwohci0' or whether I add `no fwohci*
at pci?' to my kernel configuration, so it doesn't seem related to
Firewire. (fwohci0 takes ioapic0 pin 19.)
Suggestions? I'll continue trying to find what's wrong with fwohci in
the meantime.
From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: port-i386-maintainer@NetBSD.org, netbsd-bugs@NetBSD.org,
Taylor R Campbell <campbell+netbsd@mumble.net>
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 19:38:38 +0200
On Wed, Feb 16, 2011 at 04:40:04PM +0000, Taylor R Campbell wrote:
> If I disable Firewire, the machine seems to reliably resume. However,
> after it resumes, there is an interrupt storm on ioapic0 pin 9. This
> happens whether I use `drvctl -d fwohci0' or whether I add `no fwohci*
> at pci?' to my kernel configuration, so it doesn't seem related to
> Firewire. (fwohci0 takes ioapic0 pin 19.)
>
> Suggestions?
This is kind of shot in the dark, but can you try the following small patch?
- Jukka.
Index: acpi.c
===================================================================
RCS file: /cvsroot/src/sys/dev/acpi/acpi.c,v
retrieving revision 1.235
diff -u -p -r1.235 acpi.c
--- acpi.c 15 Feb 2011 20:24:11 -0000 1.235
+++ acpi.c 16 Feb 2011 17:36:22 -0000
@@ -1370,9 +1370,9 @@ acpi_enter_sleep_state(int state)
AcpiEnable();
(void)pmf_system_bus_resume(PMF_Q_NONE);
+ (void)pmf_system_resume(PMF_Q_NONE);
(void)AcpiLeaveSleepState(state);
(void)AcpiSetFirmwareWakingVector(0);
- (void)pmf_system_resume(PMF_Q_NONE);
}
break;
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 17:50:14 +0000
Here's a diff between the outputs of `pcictl pci3 dump -d 3 -f 0'
(where fwohci0 attaches), before and after suspending & resuming.
This probably explains why fwohci0 fails after resumption. Perhaps
someone who is better versed in PCI than I am can interpret it and
figure out what went wrong faster than I can.
The relevant part of the configuration looks like
root -> mainbus0 -> pci0 -> ppb2 -> pci3 -> fwohci0.
--- pci3dump.0.boot 2011-02-16 17:08:02.000000000 +0000
+++ pci3dump.2.resumed 2011-02-16 17:09:36.000000000 +0000
@@ -1,20 +1,20 @@
PCI configuration registers:
Common header:
- 0x00: 0x581111c1 0x02900216 0x0c001061 0x0000f810
+ 0x00: 0x581111c1 0x02900000 0x0c001061 0x00000000
Vendor Name: Lucent Technologies (0x11c1)
Device Name: FW322/323 IEEE 1394 Host Controller (0x5811)
- Command register: 0x0216
+ Command register: 0x0000
I/O space accesses: off
- Memory space accesses: on
- Bus mastering: on
+ Memory space accesses: off
+ Bus mastering: off
Special cycles: off
- MWI transactions: on
+ MWI transactions: off
Palette snooping: off
Parity error checking: off
Address/data stepping: off
System error (SERR): off
- Fast back-to-back transactions: on
+ Fast back-to-back transactions: off
Interrupt disable: off
Status register: 0x0290
Capability List support: on
@@ -34,17 +34,16 @@
Revision ID: 0x61
BIST: 0x00
Header Type: 0x00 (0x00)
- Latency Timer: 0xf8
- Cache Line Size: 0x10
+ Latency Timer: 0x00
+ Cache Line Size: 0x00
Type 0 ("normal" device) header:
- 0x10: 0x90000000 0x00000000 0x00000000 0x00000000
+ 0x10: 0x00000000 0x00000000 0x00000000 0x00000000
0x20: 0x00000000 0x00000000 0x00000000 0x581111c1
- 0x30: 0x00000000 0x00000044 0x00000000 0x180c010b
+ 0x30: 0x00000000 0x00000044 0x00000000 0x180c0100
Base address register at 0x10
- type: 32-bit nonprefetchable memory
- base: 0x90000000, not sized
+ not implemented(?)
Base address register at 0x14
not implemented(?)
Base address register at 0x18
@@ -64,7 +63,7 @@
Maximum Latency: 0x18
Minimum Grant: 0x0c
Interrupt pin: 0x01 (pin A)
- Interrupt line: 0x0b
+ Interrupt line: 0x00
Capability register at 0x44
type: 0x01 (Power Management, rev. 1.0)
@@ -78,15 +77,15 @@
D1 power management state support: on
D2 power management state support: on
PME# support: 0x0f
- Control/status register: 0x8000
+ Control/status register: 0x0000
Power state: D0
PCI Express reserved: off
No soft reset: off
PME# assertion disabled
- PME# status: on
+ PME# status: off
Device-dependent header:
- 0x40: 0x00000000 0x7e020001 0x00008000 0x00000000
+ 0x40: 0x00000000 0x7e020001 0x00000000 0x00000000
0x50: 0x00000000 0x00000000 0x00000000 0x00000000
0x60: 0x00000000 0x00000000 0x00000000 0x00000000
0x70: 0x........ 0x........ 0x00000000 0x00000000
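For readers less familiar with PCI, the command-register change in the dump above (0x0216 before suspend, 0x0000 after resume) can be decoded with a small userspace sketch. The bit positions follow the PCI Local Bus specification; the macro and function names are illustrative assumptions, not NetBSD's.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Command-register bit positions; the names here are illustrative. */
#define CMD_IO_ENABLE     0x0001
#define CMD_MEM_ENABLE    0x0002
#define CMD_MASTER_ENABLE 0x0004
#define CMD_MWI_ENABLE    0x0010
#define CMD_BACKTOBACK    0x0200

/* Return the bits that were set before suspend and are clear after resume. */
uint16_t
cmd_lost_bits(uint16_t before, uint16_t after)
{
	return before & (uint16_t)~after;
}

/* Print the register value and one line per named bit that is set. */
void
cmd_print(const char *label, uint16_t cmd)
{
	assert(label != NULL);
	printf("%s: 0x%04x\n", label, (unsigned)cmd);
	if (cmd & CMD_IO_ENABLE)
		printf("  I/O space accesses\n");
	if (cmd & CMD_MEM_ENABLE)
		printf("  memory space accesses\n");
	if (cmd & CMD_MASTER_ENABLE)
		printf("  bus mastering\n");
	if (cmd & CMD_MWI_ENABLE)
		printf("  MWI transactions\n");
	if (cmd & CMD_BACKTOBACK)
		printf("  fast back-to-back transactions\n");
}
```

`cmd_lost_bits(0x0216, 0x0000)` yields memory access, bus mastering, MWI, and fast back-to-back, which are the four "on" to "off" transitions in the dump.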
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: jruohonen@iki.fi
Cc: gnats-bugs@NetBSD.org, port-i386-maintainer@NetBSD.org,
netbsd-bugs@NetBSD.org
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 17:59:00 +0000
Date: Wed, 16 Feb 2011 19:38:38 +0200
From: Jukka Ruohonen <jruohonen@iki.fi>
This is kind of shot in the dark, but can you try the following small patch?
No dice -- same interrupt storm.
From: Jukka Ruohonen <jruohonen@iki.fi>
To: gnats-bugs@NetBSD.org
Cc: port-i386-maintainer@NetBSD.org, netbsd-bugs@NetBSD.org,
Taylor R Campbell <campbell+netbsd@mumble.net>
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 20:26:06 +0200
On Wed, Feb 16, 2011 at 06:00:09PM +0000, Taylor R Campbell wrote:
> This is kind of shot in the dark, but can you try the following small
> patch?
>
> No dice -- same interrupt storm.
Another one attached.
This is based on the following Linux bug report that sounds awfully similar:
https://bugzilla.kernel.org/show_bug.cgi?id=6670
Len Brown from Intel concludes therein that this would be a BIOS bug.
Index: acpi_wakeup.c
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/acpi/acpi_wakeup.c,v
retrieving revision 1.27
diff -u -p -r1.27 acpi_wakeup.c
--- acpi_wakeup.c 13 Jan 2011 03:45:38 -0000 1.27
+++ acpi_wakeup.c 16 Feb 2011 18:23:00 -0000
@@ -342,6 +342,11 @@ acpi_md_sleep(int state)
inittodr(time_second);
/*
+ * A workaround for broken BIOS.
+ */
+ (void)AcpiWriteBitRegister(ACPI_BITREG_SCI_ENABLE, 1);
+
+ /*
* Clear fixed events (see e.g. ACPI 3.0, p. 62).
* Also prevent GPEs from misfiring by disabling
* all GPEs before interrupts are enabled. The
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: jruohonen@iki.fi
Cc: gnats-bugs@NetBSD.org, port-i386-maintainer@NetBSD.org,
netbsd-bugs@NetBSD.org
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 18:42:10 +0000
Date: Wed, 16 Feb 2011 20:26:06 +0200
From: Jukka Ruohonen <jruohonen@iki.fi>
Another one attached.
This is based on the following Linux bug report that sounds awfully similar:
https://bugzilla.kernel.org/show_bug.cgi?id=6670
Len Brown from Intel concludes therein that this would be a BIOS bug.
Seems much happier now! No more interrupt storm, and hw.acpi.stat.sci
is only one greater than hw.acpi.stat.gpe.
From: "Jukka Ruohonen" <jruoho@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44581 CVS commit: src/sys/arch/x86/acpi
Date: Wed, 16 Feb 2011 18:55:51 +0000
Module Name: src
Committed By: jruoho
Date: Wed Feb 16 18:55:50 UTC 2011
Modified Files:
src/sys/arch/x86/acpi: acpi_wakeup.c
Log Message:
Explicitly re-enable the SCI interrupt when the wakeup starts (and before
interrupts are enabled). A workaround for a BIOS bug. Fixes the interrupt
storm reported by Taylor R. Campbell in PR # 44581.
To generate a diff of this commit:
cvs rdiff -u -r1.27 -r1.28 src/sys/arch/x86/acpi/acpi_wakeup.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 20:19:10 +0000
fwohci(4) is not happy to attach if its PCI command status register
does not have bus mastering or memory mapping enabled. I think this
is the case for many drivers. If I have fwohci0 detached, I observe
that when pci3 is resumed, device 3 function 0 (where fwohci0
attaches) has its command status register zeroed. That makes
fwohci_pci_attach fail, because pci_mapreg_map thinks the flags are
inappropriate.
I tried the following patch, and, lo and behold, the machine resumes!
(I have lots of changes in this tree which may be necessary too.)
But here's the weird part. The output I see is
fwohci0: on resume, csr = 42992150
fwohci0: setting csr to 42992150
I'm glad my machine is resuming now, but this leaves me unsatisfied
inside...
Also, if I detach fwohci0 before suspending and rescan pci3 after
resuming, then the csr has bus mastering and memory mapping disabled,
so fwohci0 fails to attach. I know this doesn't matter much, but it
would be nice if suspending while fwohci0 is detached didn't render
fwohci0 unusable until the next reboot.
Index: fwohci_pci.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/fwohci_pci.c,v
retrieving revision 1.39
diff -p -u -r1.39 fwohci_pci.c
--- fwohci_pci.c 29 Apr 2010 06:41:27 -0000 1.39
+++ fwohci_pci.c 16 Feb 2011 20:08:21 -0000
@@ -214,10 +213,18 @@ fwohci_pci_resume(device_t dv, const pmf
{
struct fwohci_pci_softc *psc = device_private(dv);
int s;
+ uint32_t csr;
s = splbio();
+ csr = pci_conf_read(psc->psc_pc, psc->psc_tag, PCI_COMMAND_STATUS_REG);
+ aprint_normal_dev(dv, "on resume, csr = %d\n", csr);
+ csr |= PCI_COMMAND_MASTER_ENABLE | PCI_COMMAND_MEM_ENABLE;
+ aprint_normal_dev(dv, "setting csr to %d\n", csr);
+ pci_conf_write(psc->psc_pc, psc->psc_tag, PCI_COMMAND_STATUS_REG, csr);
fwohci_resume(&psc->psc_sc);
+ aprint_normal_dev(dv, "fwohci_resume returned, about to splx %d\n", s);
splx(s);
+ aprint_normal_dev(dv, "fwohci_resume splx %d\n", s);
return true;
}
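A note on the "weird part" of the output above: the debugging patch prints the register with `%d`, so 42992150 is decimal. A minimal userspace sketch, assuming the usual layout of status in the high 16 bits and command in the low 16 bits, shows that this value is 0x02900216, i.e. the memory and bus-master bits were already set on resume and the OR changed nothing. The macro and function names are illustrative.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative names for the memory-enable and bus-master bits. */
#define CSR_MEM_ENABLE    UINT32_C(0x00000002)
#define CSR_MASTER_ENABLE UINT32_C(0x00000004)

/* Low 16 bits of the combined register: the command register. */
uint16_t
csr_command(uint32_t csr)
{
	return (uint16_t)(csr & 0xffff);
}

/* High 16 bits of the combined register: the status register. */
uint16_t
csr_status(uint32_t csr)
{
	return (uint16_t)(csr >> 16);
}

/* True if ORing in the two enable bits would change the register. */
bool
csr_needs_enable(uint32_t csr)
{
	return (csr | CSR_MEM_ENABLE | CSR_MASTER_ENABLE) != csr;
}

/* Print the same value the patch printed, in hex as well as decimal. */
void
csr_print(uint32_t csr)
{
	printf("csr = %" PRIu32 " = 0x%08" PRIx32 "\n", csr, csr);
	assert(((uint32_t)csr_status(csr) << 16 | csr_command(csr)) == csr);
}
```

With the reported value, `csr_command(42992150)` is 0x0216 and `csr_needs_enable(42992150)` is false, which matches the identical "on resume" and "setting csr to" lines in the output.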
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Wed, 16 Feb 2011 20:27:58 +0000
Also, in order to test rescanning pci3 and reattaching fwohci0, I had
to apply the following patch to fix a config panic induced by a
trivial race condition in firewire.c. (fw_bus_probe_thread may call
config_pending_decr before control has reached firewireattach's call
to config_pending_incr, causing a panic.)
Index: firewire.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ieee1394/firewire.c,v
retrieving revision 1.38
diff -c -r1.38 firewire.c
--- firewire.c 7 Sep 2010 07:26:54 -0000 1.38
+++ firewire.c 16 Feb 2011 20:26:36 -0000
@@ -257,10 +257,10 @@
callout_schedule(&fc->timeout_callout, hz);
/* create thread */
+ config_pending_incr();
if (kthread_create(PRI_NONE, KTHREAD_MPSAFE, NULL, fw_bus_probe_thread,
fc, &fc->probe_thread, "fw%dprobe", device_unit(fc->bdev)))
aprint_error_dev(self, "kthread_create failed\n");
- config_pending_incr();
devlist = malloc(sizeof(struct firewire_dev_list), M_DEVBUF, M_NOWAIT);
if (devlist == NULL) {
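The ordering problem fixed by this patch can be modelled without threads by replaying the two possible interleavings of the increment and the decrement. This is a deterministic sketch under assumptions: `PARENT_INCR` stands for the attach routine's `config_pending_incr` call, `THREAD_DECR` for the probe thread's `config_pending_decr` call, and `lowest_pending` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/* The two events whose relative order matters. */
enum step { PARENT_INCR, THREAD_DECR };

/*
 * Replay a sequence of steps against a counter that starts at zero and
 * return the lowest value it reaches.  A negative result corresponds to
 * the decrement running before the matching increment.
 */
int
lowest_pending(const enum step *steps, size_t nsteps)
{
	int pending = 0, lowest = 0;

	assert(steps != NULL || nsteps == 0);
	for (size_t i = 0; i < nsteps; i++) {
		pending += (steps[i] == PARENT_INCR) ? 1 : -1;
		if (pending < lowest)
			lowest = pending;
	}
	return lowest;
}
```

When the thread is created before the increment, both orders are possible, and the order `THREAD_DECR, PARENT_INCR` drives the counter to -1. When the increment comes first, as in the patch, the thread does not yet exist at that point, so only `PARENT_INCR, THREAD_DECR` can occur and the counter never drops below zero.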
From: Taylor R Campbell <campbell+netbsd@mumble.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Thu, 17 Feb 2011 00:12:01 +0000
Date: Wed, 16 Feb 2011 20:19:10 +0000
From: Taylor R Campbell <campbell+netbsd@mumble.net>
I tried the following patch, and, lo and behold, the machine resumes!
(I have lots of changes in this tree which may be necessary too.)
Oops -- this was a red herring. In my testing of dozens of different
configurations, I must have mixed some of them up. If I undo this
patch, it also works, so it must be the SCI business that let Firewire
resumption work. So I think this PR can be closed now, unless someone
is interested in applying one of the earlier patches I sent to make
reattaching fwohci0 merely fail and not panic after suspending and
resuming.
I also tested removing the calls in firewire.c to config_pending_incr
and config_pending_decr -- I don't see why they're there, I observed a
panic because of a race condition between them, and the code seems to
work without them. At the very least, I think the calls to
kthread_create and config_pending_incr should be reversed, like they
were when the code was imported.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-i386/44581: MacBook1,1 won't resume after suspend
Date: Sun, 20 Feb 2011 00:37:07 +0000
On Thu, Feb 17, 2011 at 12:15:04AM +0000, Taylor R Campbell wrote:
> [...] unless someone
> is interested in applying one of the earlier patches I sent to make
> reattaching fwohci0 merely fail and not panic after suspending and
> resuming.
Those sound like good patches to have... I don't know suspend/resume
stuff well enough to want to go commit them myself though.
--
David A. Holland
dholland@netbsd.org
State-Changed-From-To: open->analyzed
State-Changed-By: jruoho@NetBSD.org
State-Changed-When: Sun, 27 Feb 2011 10:36:13 +0000
State-Changed-Why:
Problem has been analyzed.
Someone who understands fwohci(4) needs to evaluate the patches.
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44581 CVS commit: src/sys/dev
Date: Sat, 4 Aug 2012 03:55:44 +0000
Module Name: src
Committed By: riastradh
Date: Sat Aug 4 03:55:44 UTC 2012
Modified Files:
src/sys/dev/cardbus: fwohci_cardbus.c
src/sys/dev/ieee1394: firewire.c firewirereg.h fwohci.c fwohcivar.h
src/sys/dev/pci: fwohci_pci.c
Log Message:
Fix error branches and config pending races in firewire init.
This way, if anything fails, it just fails; you don't panic. This can
happen if suspending and resuming of firewire is broken (e.g., as I
encountered in PR kern/44581).
To generate a diff of this commit:
cvs rdiff -u -r1.34 -r1.35 src/sys/dev/cardbus/fwohci_cardbus.c
cvs rdiff -u -r1.39 -r1.40 src/sys/dev/ieee1394/firewire.c
cvs rdiff -u -r1.17 -r1.18 src/sys/dev/ieee1394/firewirereg.h
cvs rdiff -u -r1.132 -r1.133 src/sys/dev/ieee1394/fwohci.c
cvs rdiff -u -r1.33 -r1.34 src/sys/dev/ieee1394/fwohcivar.h
cvs rdiff -u -r1.40 -r1.41 src/sys/dev/pci/fwohci_pci.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: analyzed->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Mon, 21 Nov 2016 01:13:03 +0000
State-Changed-Why:
fixed years ago
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.