NetBSD Problem Report #30977
From cube@eve-team.com Fri Aug 12 14:12:30 2005
Return-Path: <cube@eve-team.com>
Received: from relay01a.clb.oleane.net (relay01a.clb.oleane.net [213.56.31.145])
by narn.netbsd.org (Postfix) with ESMTP id 927F663B116
for <gnats-bugs@gnats.NetBSD.org>; Fri, 12 Aug 2005 14:12:24 +0000 (UTC)
Message-Id: <20050812141129.4C26014691@vidal.eve>
Date: Fri, 12 Aug 2005 16:11:29 +0200 (CEST)
From: cube@cubidou.net
Reply-To: cube@cubidou.net
To: gnats-bugs@netbsd.org
Subject: FPU troubles
X-Send-Pr-Version: 3.95
>Number: 30977
>Category: port-xen
>Synopsis: FPU troubles
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: bouyer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Aug 12 14:13:00 +0000 2005
>Closed-Date: Sun Mar 05 12:28:37 +0000 2006
>Last-Modified: Sun Mar 05 12:28:37 +0000 2006
>Originator: Quentin Garnier
>Release: NetBSD 3.99.7
>Organization:
>Environment:
System: NetBSD vidal.eve 3.99.7 NetBSD 3.99.7 (XEN_VIDAL) #0: Thu Aug 11 16:39:22 CEST 2005 cube@vidal.eve:/empty/NetBSD/vidal/obj/empty/NetBSD/current/src/sys/arch/i386/compile/XEN_VIDAL i386
Architecture: i386
Machine: i386
>Description:
Applications that use the FPU show weird behaviour: xpdf has
trouble displaying images (reloading a page several time shows
different results) as they get black regions on it.
Ogg decoders such as ogg123 or any player linked to libvorbis
insert beeps and cracks in the music.
I could prouve it is a FPU issue by listening to the difference
when using libvorbis.so (from audio/libvorbis) and
libvorbisidec.so (from audio/tremor).
The former uses the FPU, while the latter is a fixed-point
implementation. The former inserts beeps and cracks, sound is
just fine when using the latter.
I also had funny results one day when tried manipulating images
with ImageMagick one day. Black lines appeared on the result.
>How-To-Repeat:
Use mp3blaster to listen to your collection of Ogg/Vorbis files.
Cry in despair when KT Tunstall's music gets desecrated.
>Fix:
Unknown. Obviously updating Xen to 2.0.7 and NetBSD to a very
-current wasn't enough.
>Release-Note:
>Audit-Trail:
From: Quentin Garnier <cube@cubidou.net>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-xen/30977: FPU troubles
Date: Thu, 8 Sep 2005 17:01:21 +0200
--nWclrlgqnAApBvUd
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
On Fri, Aug 12, 2005 at 02:13:01PM +0000, cube@cubidou.net wrote:
> >Number: 30977
> >Category: port-xen
> >Synopsis: FPU troubles
[...]
> >Description:
> Applications that use the FPU show weird behaviour: xpdf has
> trouble displaying images (reloading a page several time shows
> different results) as they get black regions on it.
>=20
> Ogg decoders such as ogg123 or any player linked to libvorbis
> insert beeps and cracks in the music.
>=20
> I could prouve it is a FPU issue by listening to the difference
> when using libvorbis.so (from audio/libvorbis) and
> libvorbisidec.so (from audio/tremor).
>=20
> The former uses the FPU, while the latter is a fixed-point
> implementation. The former inserts beeps and cracks, sound is
> just fine when using the latter.
>=20
> I also had funny results one day when tried manipulating images
> with ImageMagick one day. Black lines appeared on the result.
[...]
> >Fix:
> Unknown. Obviously updating Xen to 2.0.7 and NetBSD to a very
> -current wasn't enough.
I've updated userland & kernel to latest -current, and I still see the
behaviour.
--=20
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"When I find the controls, I'll go where I like, I'll know where I want
to be, but maybe for now I'll stay right here on a silent sea."
KT Tunstall, Silent Sea, Eye to the Telescope, 2004.
--nWclrlgqnAApBvUd
Content-Type: application/pgp-signature
Content-Disposition: inline
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (NetBSD)
iQEVAwUBQyBSQdgoQloHrPnoAQJm5AgAglICs+jSDG+3XSK44sGT5XdOsEH3J2z3
5H0WN26UGUHW9U+gTNNIIOAmg+MgVu61kQU883//jd+FZnLg7Zz08B2KSLA+KWFg
GxI77cthoa6OF2xQjFPNY80UZ8nEP0Ij2rBlfOfIskAkwt9q0bOdFnUxQT2LccKz
GQpDCaOBcEEsFnk3kkBuaDr+fRFkw/FAsh3/TLEvMMrm1FK+jEFPd6ShVQEvnhar
TTFpvZUnj2sqXxv0jK4PHWa/dzeGVr103v/lgSWdCKrtCWfrMWXZEWA4BqFm4Rtv
lbRZa9FotfUY995nj/FPHzTkZ+RrOKfFoe7/3VXcZfd1BOtP7n8sHQ==
=lSoq
-----END PGP SIGNATURE-----
--nWclrlgqnAApBvUd--
From: Paul Ripke <stix@stix.id.au>
To: gnats-bugs@netbsd.org
Cc: cube@cubidou.net, port-xen@netbsd.org
Subject: Re: port-xen/30977: FPU troubles
Date: Wed, 23 Nov 2005 08:15:04 +1100
I've finally got around to trying xen (NetBSD-3.0BETA, xen 2.0.7) and
have run into this PR in DOM0.
First sign was xearth coredumping on login. Next was only "clicking"
from xmms, and I've had xpdf core, too.
A little debugging shows that the FPU state is not being correctly
saved or restored during "some" context switches. The following code
appears to exhibit the behaviour reliably:
marvin:ksh$ cat fp2.c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>
void *
runloop(void *p)
{
int i;
double fi;
i = 1;
while (1) {
fi = i;
if ((int)fi != i) {
printf("%9ld %9d %9d\n", clock(), i, (int)fi);
}
i++;
}
}
int
main(int argc, char **argv)
{
pthread_t a, b;
setlinebuf(stdout);
assert(pthread_create(&a, NULL, &runloop, NULL) == 0);
assert(pthread_create(&b, NULL, &runloop, NULL) == 0);
assert(pthread_join(a, NULL) == 0);
return 0;
}
marvin:ksh$ gcc -O -Wall -lpthread -o fp2 fp2.c
marvin:ksh$ ./fp2 | head
19 2810714 4887424
34 4782167 -2147483648
152 29109559 -2147483648
189 36046059 -2147483648
217 43217401 -2147483648
295 58190482 60223234
334 65918432 -2147483648
415 84428916 -2147483648
478 93211664 97457195
507 102074913 -2147483648
I've dug through NetBSD code (on the assumption xen is doing the
right thing), but I'm not 100% sure exactly what I'm looking for.
I guess this would be a nice bug to squash for 3.0 (assuming the
bug is in NetBSD).
Cheers,
--
Paul Ripke
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Paul Ripke <stix@stix.id.au>
Cc: gnats-bugs@NetBSD.org, cube@cubidou.net, port-xen@NetBSD.org
Subject: Re: port-xen/30977: FPU troubles
Date: Wed, 23 Nov 2005 13:47:04 +0100
--T4sUOijqQbZv57TR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Wed, Nov 23, 2005 at 08:15:04AM +1100, Paul Ripke wrote:
> I've finally got around to trying xen (NetBSD-3.0BETA, xen 2.0.7) and
> have run into this PR in DOM0.
>
> First sign was xearth coredumping on login. Next was only "clicking"
> from xmms, and I've had xpdf core, too.
>
> A little debugging shows that the FPU state is not being correctly
> saved or restored during "some" context switches. The following code
> appears to exhibit the behaviour reliably:
Interesting. I modified your test prog to also print a thread id (see attached
file). I get the same result as you with a stock current kernel:
1 33 873373 -2147483648
1 55 2001359 -2147483648
0 62 2770587 -2147483648
1 92 3483395 -2147483648
0 128 5914000 -2147483648
1 132 5228374 -2147483648
1 172 6972647 6098061
0 182 8008230 6098061
0 222 9753133 -2147483648
1 232 9592029 -2147483648
1 272 11335805 -2147483648
0 282 12371751 -2147483648
1 312 13079804 13241628
0 328 14641093 -2147483648
1 352 14824306 14985832
1 412 17443856 -2147483648
0 482 21107482 20059525
But with a kernel with the FPU patch I posted here some times ago (see second
attachement), I get:
1 72 2622678 3646769
0 81 3646769 2622678
1 152 6105605 7113173
0 162 7113173 6105605
0 281 12357157 12217903
1 291 12217903 12357157
0 321 14100004 13958543
1 331 13958543 14970612
0 341 14970612 14100004
1 751 32317194 33249945
0 761 33249945 32317194
Now the values are just swapped between threads. It seems that the fpu context
is properly restored, but sometimes to the wrong thread ...
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--T4sUOijqQbZv57TR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="fp2.c"
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>
void *
runloop(void *p)
{
int i;
double fi;
int *myid = p;
i = 1;
while (1) {
fi = i;
if ((int)fi != i) {
printf("%d %9ld %9d %9d\n", *myid, clock(), i, (int)fi);
}
i++;
}
}
int
main(int argc, char **argv)
{
pthread_t a, b;
int ia = 0, ib = 1;
setlinebuf(stdout);
assert(pthread_create(&a, NULL, &runloop, &ia) == 0);
assert(pthread_create(&b, NULL, &runloop, &ib) == 0);
assert(pthread_join(a, NULL) == 0);
return 0;
}
--T4sUOijqQbZv57TR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff
Index: i386/machdep.c
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/i386/machdep.c,v
retrieving revision 1.18
diff -u -r1.18 machdep.c
--- i386/machdep.c 19 Aug 2005 16:06:12 -0000 1.18
+++ i386/machdep.c 7 Nov 2005 15:55:22 -0000
@@ -2419,6 +2419,7 @@
if (l->l_addr->u_pcb.pcb_fpcpu != NULL)
npxsave_lwp(l, 0);
#endif
+ KASSERT((flags & _UC_FXSAVE) == 0);
if (flags & _UC_FXSAVE) {
if (i386_use_fxsave) {
memcpy(
@@ -2444,6 +2445,11 @@
}
/* If not set already. */
l->l_md.md_flags |= MDL_USEDFPU;
+#define frstor(addr) __asm("frstor %0" : : "m" (*addr))
+ frstor(&l->l_addr->u_pcb.pcb_savefpu.sv_87);
+#undef frstor
+ l->l_addr->u_pcb.pcb_fpcpu = curcpu();
+ curcpu()->ci_fpcurlwp = l;
#if 0
/* Apparently unused. */
l->l_addr->u_pcb.pcb_saveemc = mcp->mc_fp.fp_emcsts;
Index: i386/npx.c
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/i386/npx.c,v
retrieving revision 1.4
diff -u -r1.4 npx.c
--- i386/npx.c 20 Mar 2005 13:12:59 -0000 1.4
+++ i386/npx.c 7 Nov 2005 15:55:22 -0000
@@ -70,7 +70,7 @@
#include <sys/cdefs.h>
__KERNEL_RCSID(0, "$NetBSD: npx.c,v 1.4 2005/03/20 13:12:59 bouyer Exp $");
-#if 0
+#if 1
#define IPRINTF(x) printf x
#else
#define IPRINTF(x)
@@ -116,7 +116,7 @@
* 387 and 287 Numeric Coprocessor Extension (NPX) Driver.
*
* We do lazy initialization and switching using the TS bit in cr0 and the
- * MDL_USEDFPU bit in mdproc.
+ * MDL_USEDFPU bit in mdlwp.
*
* DNA exceptions are handled like this:
*
@@ -149,7 +149,6 @@
#define stts() HYPERVISOR_fpu_taskswitch()
#endif
-int npxdna(struct cpu_info *);
static int npxdna_notset(struct cpu_info *);
static int npxdna_s87(struct cpu_info *);
#ifdef I686_CPU
@@ -319,8 +318,7 @@
}
#endif
-void npxinit(ci)
- struct cpu_info *ci;
+void npxinit(struct cpu_info *ci)
{
lcr0(rcr0() & ~(CR0_EM|CR0_TS));
fninit();
@@ -534,7 +532,7 @@
{
struct lwp *l;
int s;
-#ifdef XXXXENDEBUG_LOW
+#if 1 /* def XXXXENDEBUG_LOW */
struct trapframe *f = (struct trapframe *)((uint32_t *)(void *)&ci)[1];
#endif
@@ -565,9 +563,11 @@
IPRINTF(("Save"));
npxsave_cpu(ci, 1);
} else {
+ clts();
IPRINTF(("Init"));
fninit();
fwait();
+ stts();
}
splx(s);
@@ -610,8 +610,11 @@
* fpu_save, if we're saving the FPU state not from interrupt
* context (f.i. during fork()). Just return in this case.
*/
- if (ci->ci_fpsaving)
+ if (ci->ci_fpsaving) {
+ printf("recursive npx trap; cr0=%x\n", rcr0());
+ Debugger();
return (1);
+ }
s = splipi(); /* lock out IPI's while we clean house.. */
#ifdef MULTIPROCESSOR
@@ -650,7 +653,7 @@
s = splipi();
ci->ci_fpcurlwp = l;
l->l_addr->u_pcb.pcb_fpcpu = ci;
- ci->ci_fpused = 1;
+ //ci->ci_fpused = 1;
splx(s);
@@ -689,8 +692,8 @@
if (l == NULL)
return;
- IPRINTF(("%s: fp CPU %s lwp %p\n", ci->ci_dev->dv_xname,
- save? "save" : "flush", l));
+ /* IPRINTF(("%s: fp CPU %s lwp %p\n", ci->ci_dev->dv_xname,
+ save? "save" : "flush", l)); */
if (save) {
#ifdef DIAGNOSTIC
@@ -720,12 +723,12 @@
* We set the TS bit in the saved CR0 for this process, so that it
* will get a DNA exception on any FPU instruction and force a reload.
*/
- l->l_addr->u_pcb.pcb_cr0 |= CR0_TS;
+ //l->l_addr->u_pcb.pcb_cr0 |= CR0_TS;
s = splipi();
l->l_addr->u_pcb.pcb_fpcpu = NULL;
ci->ci_fpcurlwp = NULL;
- ci->ci_fpused = 1;
+ //ci->ci_fpused = 1;
splx(s);
}
--T4sUOijqQbZv57TR--
From: Quentin Garnier <cube@cubidou.net>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: Paul Ripke <stix@stix.id.au>, gnats-bugs@NetBSD.org,
port-xen@NetBSD.org
Subject: Re: port-xen/30977: FPU troubles
Date: Fri, 9 Dec 2005 18:45:50 +0100
--GeDkoc8jIzHasOdk
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
On Wed, Nov 23, 2005 at 01:47:04PM +0100, Manuel Bouyer wrote:
> On Wed, Nov 23, 2005 at 08:15:04AM +1100, Paul Ripke wrote:
> > I've finally got around to trying xen (NetBSD-3.0BETA, xen 2.0.7) and
> > have run into this PR in DOM0.
> >=20
> > First sign was xearth coredumping on login. Next was only "clicking"
> > from xmms, and I've had xpdf core, too.
> >=20
> > A little debugging shows that the FPU state is not being correctly
> > saved or restored during "some" context switches. The following code
> > appears to exhibit the behaviour reliably:
[...]
> Index: i386/machdep.c
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
ogg123 is happier with this, but still not perfect.
gkrellm, OTOH, is really not happy with it.
--=20
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"When I find the controls, I'll go where I like, I'll know where I want
to be, but maybe for now I'll stay right here on a silent sea."
KT Tunstall, Silent Sea, Eye to the Telescope, 2004.
--GeDkoc8jIzHasOdk
Content-Type: application/pgp-signature
Content-Disposition: inline
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (NetBSD)
iQEVAwUBQ5nCztgoQloHrPnoAQIRcwgAgf4dyJO36mhuYK4cdur4fn6glk9sbvyC
nSimx59JQZw/5xgdZriEJrcqHzxg73WqHdL23+2qyF+yAYI3xOlU6J6lCiDuDtYN
NEL/8CFvAvEyc2aVM5dMmLNF+8SqXvYEfi+kEl1rWWm/nw1vwc4qmdkHcfqxUb5R
fD7Z/JR4PeqltW/pKxEZ9O0oJOqSLoCXUdV+phHnPQoeeipfHYxP95uk+GaVzkwj
DfTYY7vxyiQ4iD8Gs4axTSmX/3vHkP9CjG1HUuSj2k9nP712lOcAzoGwqysjuSf0
ds0sny9ECmnR0x+J7VTXxVRQjuu9VI4bHNmBSeT3OceMPMXcujfO0g==
=dF8G
-----END PGP SIGNATURE-----
--GeDkoc8jIzHasOdk--
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Quentin Garnier <cube@cubidou.net>
Cc: Paul Ripke <stix@stix.id.au>, gnats-bugs@NetBSD.org,
port-xen@NetBSD.org
Subject: Re: port-xen/30977: FPU troubles
Date: Fri, 9 Dec 2005 21:21:31 +0100
On Fri, Dec 09, 2005 at 06:45:50PM +0100, Quentin Garnier wrote:
> On Wed, Nov 23, 2005 at 01:47:04PM +0100, Manuel Bouyer wrote:
> > On Wed, Nov 23, 2005 at 08:15:04AM +1100, Paul Ripke wrote:
> > > I've finally got around to trying xen (NetBSD-3.0BETA, xen 2.0.7) and
> > > have run into this PR in DOM0.
> > >
> > > First sign was xearth coredumping on login. Next was only "clicking"
> > > from xmms, and I've had xpdf core, too.
> > >
> > > A little debugging shows that the FPU state is not being correctly
> > > saved or restored during "some" context switches. The following code
> > > appears to exhibit the behaviour reliably:
> [...]
> > Index: i386/machdep.c
> > ===================================================================
>
> ogg123 is happier with this, but still not perfect.
OK, this may be related to the flaw paranoia is still reporting.
> gkrellm, OTOH, is really not happy with it.
Is this a threaded program ?
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Paul Ripke <stix@stix.id.au>
To: Manuel Bouyer <bouyer@antioche.eu.org>,
Quentin Garnier <cube@cubidou.net>
Cc: gnats-bugs@NetBSD.org, port-xen@NetBSD.org
Subject: Re: port-xen/30977: FPU troubles
Date: Mon, 19 Dec 2005 09:34:01 +1100
OK, I've had a closer look at this, and come up with the included patch,
which appears to work fine for me, with threaded and non-threaded
programs.
I can't repeat the behaviour of the FPU state being not being restored,
restored to the wrong thread, or wiped.
Thoughts behind the patch:
- since clts() is a no-op under xen 2 (in xen 3,
HYPERVISOR_fpu_taskswitch
takes an int parameter allowing set/clear), it makes no sense to
stts()
inside the DNA handler. We can't clear it again.
- I don't get the npxsave_lwp() calls in the DNA handler. On SMP, it's
used to dump an LWP's FPU state from a different CPU. On a
uniprocessor,
the FPU state has already been dumped by npxsave_cpu(). SMP support
isn't
relevant for xen 2.
Anyway, please test the patch and see how it goes. I note that my sound
(SBlive) is still crook, looks like another bug.
Index: sys/arch/xen/i386/machdep.c
===================================================================
RCS file: /export/netbsd/cvsroot/src/sys/arch/xen/i386/machdep.c,v
retrieving revision 1.13.2.3
diff -u -d -b -w -r1.13.2.3 machdep.c
--- sys/arch/xen/i386/machdep.c 25 Aug 2005 20:16:21 -0000 1.13.2.3
+++ sys/arch/xen/i386/machdep.c 18 Dec 2005 08:00:53 -0000
@@ -1,4 +1,4 @@
-/* $NetBSD$ */
+/* $NetBSD: machdep.c,v 1.13.2.3 2005/08/25 20:16:21 tron Exp $ */
/* NetBSD: machdep.c,v 1.559 2004/07/22 15:12:46 mycroft Exp */
/*-
@@ -73,7 +73,7 @@
*/
#include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD$");
+__KERNEL_RCSID(0, "$NetBSD: machdep.c,v 1.13.2.3 2005/08/25 20:16:21
tron Exp $");
#include "opt_beep.h"
#include "opt_compat_ibcs2.h"
@@ -1174,8 +1174,10 @@
#if NNPX > 0
/* If we were using the FPU, forget about it. */
- if (l->l_addr->u_pcb.pcb_fpcpu != NULL)
+ if (l->l_addr->u_pcb.pcb_fpcpu != NULL) {
npxsave_lwp(l, 0);
+ HYPERVISOR_fpu_taskswitch();
+ }
#endif
#ifdef USER_LDT
@@ -2335,6 +2337,7 @@
*/
if (l->l_addr->u_pcb.pcb_fpcpu) {
npxsave_lwp(l, 1);
+ HYPERVISOR_fpu_taskswitch();
}
#endif
if (i386_use_fxsave) {
@@ -2418,8 +2421,10 @@
/*
* If we were using the FPU, forget that we were.
*/
- if (l->l_addr->u_pcb.pcb_fpcpu != NULL)
+ if (l->l_addr->u_pcb.pcb_fpcpu != NULL) {
npxsave_lwp(l, 0);
+ HYPERVISOR_fpu_taskswitch();
+ }
#endif
if (flags & _UC_FXSAVE) {
if (i386_use_fxsave) {
Index: sys/arch/xen/i386/npx.c
===================================================================
RCS file: /export/netbsd/cvsroot/src/sys/arch/xen/i386/npx.c,v
retrieving revision 1.3.14.1
diff -u -d -b -w -r1.3.14.1 npx.c
--- sys/arch/xen/i386/npx.c 20 Mar 2005 14:38:21 -0000 1.3.14.1
+++ sys/arch/xen/i386/npx.c 18 Dec 2005 21:40:44 -0000
@@ -1,4 +1,4 @@
-/* $NetBSD$ */
+/* $NetBSD: npx.c,v 1.3.14.1 2005/03/20 14:38:21 tron Exp $ */
/* NetBSD: npx.c,v 1.103 2004/03/21 10:56:24 simonb Exp */
/*-
@@ -68,10 +68,11 @@
*/
#include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD$");
+__KERNEL_RCSID(0, "$NetBSD: npx.c,v 1.3.14.1 2005/03/20 14:38:21 tron
Exp $");
#if 0
#define IPRINTF(x) printf x
+#define XXXXENDEBUG_LOW
#else
#define IPRINTF(x)
#endif
@@ -575,12 +576,8 @@
KDASSERT(ci->ci_fpcurlwp == NULL);
#ifndef MULTIPROCESSOR
KDASSERT(l->l_addr->u_pcb.pcb_fpcpu == NULL);
-#else
- if (l->l_addr->u_pcb.pcb_fpcpu != NULL)
- npxsave_lwp(l, 1);
#endif
l->l_addr->u_pcb.pcb_cr0 &= ~CR0_TS;
- clts();
s = splipi();
ci->ci_fpcurlwp = l;
l->l_addr->u_pcb.pcb_fpcpu = ci;
@@ -629,11 +626,9 @@
if (ci->ci_fpcurlwp != NULL)
npxsave_cpu(ci, 1);
else {
- clts();
IPRINTF(("%s: fp init\n", ci->ci_dev->dv_xname));
fninit();
fwait();
- stts();
}
splx(s);
@@ -641,12 +636,8 @@
KDASSERT(ci->ci_fpcurlwp == NULL);
#ifndef MULTIPROCESSOR
KDASSERT(l->l_addr->u_pcb.pcb_fpcpu == NULL);
-#else
- if (l->l_addr->u_pcb.pcb_fpcpu != NULL)
- npxsave_lwp(l, 1);
#endif
l->l_addr->u_pcb.pcb_cr0 &= ~CR0_TS;
- clts();
s = splipi();
ci->ci_fpcurlwp = l;
l->l_addr->u_pcb.pcb_fpcpu = ci;
@@ -710,7 +701,6 @@
* which report FP failures via traps rather than irq13).
* XXX punting for now..
*/
- clts();
ci->ci_fpsaving = 1;
fpu_save(&l->l_addr->u_pcb.pcb_savefpu);
ci->ci_fpsaving = 0;
--
Paul Ripke
Responsible-Changed-From-To: port-xen-maintainer->bouyer
Responsible-Changed-By: bouyer@netbsd.org
Responsible-Changed-When: Tue, 03 Jan 2006 20:23:52 +0000
Responsible-Changed-Why:
I commited a variant of Paul's patch.
State-Changed-From-To: open->feedback
State-Changed-By: bouyer@netbsd.org
State-Changed-When: Tue, 03 Jan 2006 20:23:52 +0000
State-Changed-Why:
Hi
can you please try a current kernel with xen/i386/npx.c 1.7 and see if it's
fixed ?
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Paul Ripke <stix@stix.id.au>
Cc: Quentin Garnier <cube@cubidou.net>, gnats-bugs@NetBSD.org,
port-xen@NetBSD.org
Subject: Re: port-xen/30977: FPU troubles
Date: Tue, 3 Jan 2006 21:21:52 +0100
On Mon, Dec 19, 2005 at 09:34:01AM +1100, Paul Ripke wrote:
> OK, I've had a closer look at this, and come up with the included patch,
> which appears to work fine for me, with threaded and non-threaded
> programs.
> I can't repeat the behaviour of the FPU state being not being restored,
> restored to the wrong thread, or wiped.
This is great, many thanks ! I commited the attached patch, which
calls HYPERVISOR_fpu_taskswitch() at the end of npxsave_lwp() instead of the
various places in which call npxsave_lwp() in the code.
>
> Thoughts behind the patch:
> - since clts() is a no-op under xen 2 (in xen 3,
> HYPERVISOR_fpu_taskswitch
> takes an int parameter allowing set/clear), it makes no sense to
> stts()
> inside the DNA handler. We can't clear it again.
> - I don't get the npxsave_lwp() calls in the DNA handler. On SMP, it's
> used to dump an LWP's FPU state from a different CPU. On a
> uniprocessor,
> the FPU state has already been dumped by npxsave_cpu(). SMP support
> isn't
> relevant for xen 2.
Sure, but as it's #ifdef MULTIPROCESSOR, there's no problem keeping
it here. The goal is to eventually be able to share more code with i386.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Paul Ripke <stix@stix.id.au>
Cc: Quentin Garnier <cube@cubidou.net>, gnats-bugs@NetBSD.org,
port-xen@NetBSD.org
Subject: Re: port-xen/30977: FPU troubles
Date: Tue, 3 Jan 2006 21:23:43 +0100
--+HP7ph2BbKc20aGI
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Tue, Jan 03, 2006 at 09:21:52PM +0100, Manuel Bouyer wrote:
> On Mon, Dec 19, 2005 at 09:34:01AM +1100, Paul Ripke wrote:
> > OK, I've had a closer look at this, and come up with the included patch,
> > which appears to work fine for me, with threaded and non-threaded
> > programs.
> > I can't repeat the behaviour of the FPU state being not being restored,
> > restored to the wrong thread, or wiped.
>
> This is great, many thanks ! I commited the attached patch, which
here is the patch
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--+HP7ph2BbKc20aGI
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="xen.diff"
Index: npx.c
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/i386/npx.c,v
retrieving revision 1.6
diff -u -r1.6 npx.c
--- i386/npx.c 24 Dec 2005 20:07:48 -0000 1.6
+++ i386/npx.c 3 Jan 2006 19:09:33 -0000
@@ -146,7 +146,7 @@
#define stts() lcr0(rcr0() | CR0_TS)
#else
#define clts()
-#define stts() HYPERVISOR_fpu_taskswitch()
+#define stts()
#endif
int npxdna(struct cpu_info *);
@@ -749,7 +749,7 @@
oci = l->l_addr->u_pcb.pcb_fpcpu;
if (oci == NULL)
- return;
+ goto end;
IPRINTF(("%s: fp %s lwp %p\n", ci->ci_dev->dv_xname,
save? "save" : "flush", l));
@@ -792,4 +792,6 @@
KASSERT(ci->ci_fpcurlwp == l);
npxsave_cpu(ci, save);
#endif
+end:
+ HYPERVISOR_fpu_taskswitch();
}
--+HP7ph2BbKc20aGI--
State-Changed-From-To: feedback->closed
State-Changed-By: cube@netbsd.org
State-Changed-When: Sun, 05 Mar 2006 12:28:37 +0000
State-Changed-Why:
Problem is fixed, now.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.