NetBSD Problem Report #38293
From martin@duskware.de Tue Mar 25 11:33:06 2008
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by narn.NetBSD.org (Postfix) with ESMTP id BCA5163B863
for <gnats-bugs@gnats.netbsd.org>; Tue, 25 Mar 2008 11:33:06 +0000 (UTC)
Message-Id: <20080325095556.2167863B8A5@narn.NetBSD.org>
Date: Tue, 25 Mar 2008 09:55:56 +0000 (UTC)
From: Christoph_Egger@gmx.de
Reply-To: Christoph_Egger@gmx.de
To: netbsd-bugs-owner@NetBSD.org
Subject: panic: fp_save ipi didn't
X-Send-Pr-Version: www-1.0
>Number: 38293
>Category: port-amd64
>Synopsis: panic: fp_save ipi didn't
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-amd64-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Mar 25 11:35:00 +0000 2008
>Closed-Date: Sun Feb 22 08:00:13 +0000 2009
>Last-Modified: Sun Feb 22 08:00:13 +0000 2009
>Originator: Christoph Egger
>Release: 4.99.48
>Organization:
>Environment:
NetBSD 4.99.48 (GENERIC) #0: Mon Jan 7 21:47:09 PST 2008 builds@wb28:/home/builds/ab/HEAD/amd64/200801070002Z-obj/home/builds/ab/HEAD/src/sys/arch/amd64/compile/GENERC amd64
>Description:
The kernel panics in src/sys/arch/amd64/amd64/fpu.c
with "panic: fp_save ipi didn't"
In all panics I have got so far, the backtrace is always the same:
fpusave_lwp() at netbsd:fpusave_lwp+0x7b
cpu_lwp_free() at netbsd:cpu_lwp_free+0x2c
exit1() at netbsd:exit1+0x5f1
sys_exit() at netbsd:sys_exit+0x67
syscall() at netbsd:syscall+0xa9
To get this panic, a simple command is enough:
dmesg | less
The important part is the pipe: the panic always happens when the
command _after_ the pipe ("less" in the above example) makes the
exit syscall.
The panic does not happen on every run, but it can be reproduced
non-interactively:
while true; do dmesg | fgrep "something"; done
Then wait for the panic. When it happens, it is always when
fgrep makes the exit syscall.
Joerg Sonnenberger told me that this panic (or similar ones)
has been seen on non-x86 hardware, so I am assigning this to
category "kern" instead of "port-amd64".
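The reproduction loop above runs until the machine panics. As a minimal sketch, assuming a POSIX shell, a bounded variant makes it easier to compare how many iterations different setups survive. The function name stress_pipe and its arguments are hypothetical and not part of the original report:

```shell
#!/bin/sh
# stress_pipe COUNT PATTERN (hypothetical helper, not from the report):
# run "dmesg | fgrep PATTERN" COUNT times and print the number of
# completed iterations.
stress_pipe() {
    count=$1
    pattern=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # The match result is irrelevant here. What matters is that the
        # process after the pipe exits, because that is where the panic
        # was observed. "|| true" keeps the loop going when nothing matches.
        dmesg 2>/dev/null | fgrep "$pattern" >/dev/null 2>&1 || true
        i=$((i + 1))
    done
    echo "$i"
}
```

For example, "stress_pipe 1000 something" runs the pipeline 1000 times and prints 1000 if the system survives all of them.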
>How-To-Repeat:
Set up a Xen environment with Xen 3.3-unstable,
changeset 17264. You can get the sources
from
hg clone http://xenbits.xensource.com/staging/xen-unstable.hg/
Older Xen versions don't have the bugfixes needed to run
NetBSD/amd64 as an HVM guest.
It doesn't matter if you run NetBSD or Linux as Dom0.
Then install NetBSD/amd64 in an HVM guest, so that you have
a NetBSD disk image.
It is very important that you assign more VCPUs to
the HVM guest than there are physical CPUs (I use twice as many
VCPUs as there are physical CPUs). You must have at least two
physical CPUs (=> at least 4 VCPUs for the guest).
Then boot NetBSD/amd64.
Log in and run
while true; do dmesg | fgrep "something"; done
Sometimes the panic even happens during boot, when the
boot shell scripts run.
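The VCPU sizing rule described above (at least two physical CPUs, and twice as many VCPUs as physical CPUs) can be sketched as a small helper. This is an illustration only; the function name guest_vcpus is hypothetical, and the factor of two is the submitter's own choice rather than a known threshold:

```shell
#!/bin/sh
# guest_vcpus NCPU (hypothetical helper, not from the report):
# print the VCPU count to give the HVM guest on a host with NCPU
# physical CPUs, following the submitter's "twice as many" rule.
guest_vcpus() {
    ncpu=$1
    if [ "$ncpu" -lt 2 ]; then
        # The report states that at least two physical CPUs are required.
        echo "need at least 2 physical CPUs, got $ncpu" >&2
        return 1
    fi
    echo $((ncpu * 2))
}
```

On a host with two physical CPUs, "guest_vcpus 2" prints 4, which is the minimum guest size named in the report. The result would then go into the guest's VCPU setting in the Xen domain configuration.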
>Fix:
>Release-Note:
>Audit-Trail:
From: "Jared D. McNeill" <jmcneill@invisible.ca>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/38293: panic: fp_save ipi didn't
Date: Tue, 25 Mar 2008 12:06:59 -0400
Christoph_Egger@gmx.de wrote:
> Joerg Sonnenberger told me, this (or similar) panics
> has been seen on non-x86 hardware so I assign this to
> category "kern" instead to "port-amd64".
I have seen the same panic (under different circumstances) on amd64, see
port-amd64/37748
Cheers,
Jared
Responsible-Changed-From-To: kern-bug-people->amd64-maintainer
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Wed, 26 Mar 2008 19:00:04 +0000
Responsible-Changed-Why:
Even if rumors say this has been seen on other archs, it is still all
machine-dependent code.
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/38293: panic: fp_save ipi didn't
Date: Sat, 29 Mar 2008 16:23:38 +0000
I've just seen this on NetBSD/i386. I suspect that the synchronization
around the IPIs isn't up to the job.
Andrew
Responsible-Changed-From-To: amd64-maintainer->port-amd64-maintainer
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Wed, 23 Apr 2008 06:39:46 +0000
Responsible-Changed-Why:
typo'd
From: Simon Burge <simonb@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/38293: panic: fp_save ipi didn't
Date: Fri, 20 Jun 2008 15:14:58 +1000
Christoph_Egger@gmx.de wrote:
> The kernel panics in src/sys/arch/amd64/amd64/fpu.c
> with "panic: fp_save ipi didn't"
>
> In all panics I have got so far, the backtrace is always the same:
>
> fpusave_lwp() at netbsd:fpusave_lwp+0x7b
> cpu_lwp_free() at netbsd:cpu_lwp_free+0x2c
> exit1() at netbsd:exit1+0x5f1
> sys_exit() at netbsd:sys_exit+0x67
> syscall() at netbsd:syscall+0xa9
I've just started seeing this too, on amd64 with the simonb-wapbl branch
from June 19th. I've seen three so far while running build.sh, and
they've all been of the form:
breakpoint() at netbsd:breakpoint+0x5
panic() at netbsd:panic+0x260
fpusave_lwp() at netbsd:fpusave_lwp+0x92
cpu_lwp_fork() at netbsd:cpu_lwp_fork+0x41
lwp_create() at netbsd:lwp_create+0x24f
fork1() at netbsd:fork1+0x45c
sys___vfork14() at netbsd:sys___vfork14+0x35
syscall() at netbsd:syscall+0x9a
Note that I've got a couple of printfs in some of the ufs filesystem
functions, so there's quite a lot of console output happening all the
time. I've not seen this problem without those printfs. Does this
mean the problem is timing related?
Cheers,
Simon.
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/38293 CVS commit: src/sys/arch
Date: Tue, 11 Nov 2008 13:45:10 +0000 (UTC)
Module Name: src
Committed By: ad
Date: Tue Nov 11 13:45:10 UTC 2008
Modified Files:
src/sys/arch/amd64/amd64: fpu.c ipifuncs.c
src/sys/arch/i386/i386: ipifuncs.c
src/sys/arch/i386/isa: npx.c
src/sys/arch/x86/include: intrdefs.h
Log Message:
PR port-amd64/38293 panic: fp_save ipi didn't
Kill the FP flush IPI and always save. The synchronization here isn't strong
and we could easily pull the chain on an innocent LWP's FP state.
Another fix to follow.
To generate a diff of this commit:
cvs rdiff -r1.26 -r1.27 src/sys/arch/amd64/amd64/fpu.c
cvs rdiff -r1.19 -r1.20 src/sys/arch/amd64/amd64/ipifuncs.c
cvs rdiff -r1.27 -r1.28 src/sys/arch/i386/i386/ipifuncs.c
cvs rdiff -r1.129 -r1.130 src/sys/arch/i386/isa/npx.c
cvs rdiff -r1.13 -r1.14 src/sys/arch/x86/include/intrdefs.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/38293 CVS commit: src/sys/arch
Date: Tue, 11 Nov 2008 14:40:19 +0000 (UTC)
Module Name: src
Committed By: ad
Date: Tue Nov 11 14:40:19 UTC 2008
Modified Files:
src/sys/arch/amd64/amd64: fpu.c genassym.cf locore.S machdep.c
src/sys/arch/i386/i386: autoconf.c genassym.cf locore.S machdep.c
src/sys/arch/i386/isa: npx.c
Log Message:
PR port-amd64/38293 panic: fp_save ipi didn't
Fix race conditions in FPU IPI handling.
To generate a diff of this commit:
cvs rdiff -r1.27 -r1.28 src/sys/arch/amd64/amd64/fpu.c
cvs rdiff -r1.37 -r1.38 src/sys/arch/amd64/amd64/genassym.cf
cvs rdiff -r1.47 -r1.48 src/sys/arch/amd64/amd64/locore.S
cvs rdiff -r1.103 -r1.104 src/sys/arch/amd64/amd64/machdep.c
cvs rdiff -r1.92 -r1.93 src/sys/arch/i386/i386/autoconf.c
cvs rdiff -r1.76 -r1.77 src/sys/arch/i386/i386/genassym.cf
cvs rdiff -r1.78 -r1.79 src/sys/arch/i386/i386/locore.S
cvs rdiff -r1.645 -r1.646 src/sys/arch/i386/i386/machdep.c
cvs rdiff -r1.130 -r1.131 src/sys/arch/i386/isa/npx.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/38293 CVS commit: [netbsd-5] src/sys/arch
Date: Mon, 17 Nov 2008 18:50:22 +0000 (UTC)
Module Name: src
Committed By: snj
Date: Mon Nov 17 18:50:22 UTC 2008
Modified Files:
src/sys/arch/amd64/amd64 [netbsd-5]: fpu.c ipifuncs.c
src/sys/arch/i386/i386 [netbsd-5]: ipifuncs.c
src/sys/arch/i386/isa [netbsd-5]: npx.c
src/sys/arch/x86/include [netbsd-5]: intrdefs.h
Log Message:
Pull up following revision(s) (requested by ad in ticket #73):
sys/arch/amd64/amd64/fpu.c: revision 1.27
sys/arch/amd64/amd64/ipifuncs.c: revision 1.20
sys/arch/i386/i386/ipifuncs.c: revision 1.28
sys/arch/i386/isa/npx.c: revision 1.130
sys/arch/x86/include/intrdefs.h: revision 1.14
PR port-amd64/38293 panic: fp_save ipi didn't
Kill the FP flush IPI and always save. The synchronization here isn't
strong and we could easily pull the chain on an innocent LWP's FP state.
Another fix to follow.
To generate a diff of this commit:
cvs rdiff -r1.26 -r1.26.6.1 src/sys/arch/amd64/amd64/fpu.c
cvs rdiff -r1.19 -r1.19.8.1 src/sys/arch/amd64/amd64/ipifuncs.c
cvs rdiff -r1.27 -r1.27.8.1 src/sys/arch/i386/i386/ipifuncs.c
cvs rdiff -r1.129 -r1.129.10.1 src/sys/arch/i386/isa/npx.c
cvs rdiff -r1.13 -r1.13.10.1 src/sys/arch/x86/include/intrdefs.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/38293 CVS commit: [netbsd-5] src/sys/arch
Date: Mon, 17 Nov 2008 18:53:54 +0000 (UTC)
Module Name: src
Committed By: snj
Date: Mon Nov 17 18:53:54 UTC 2008
Modified Files:
src/sys/arch/amd64/amd64 [netbsd-5]: fpu.c genassym.cf locore.S
machdep.c
src/sys/arch/i386/i386 [netbsd-5]: autoconf.c genassym.cf locore.S
machdep.c
src/sys/arch/i386/isa [netbsd-5]: npx.c
Log Message:
Pull up following revision(s) (requested by ad in ticket #74):
sys/arch/i386/isa/npx.c: revision 1.131
sys/arch/amd64/amd64/fpu.c: revision 1.28
sys/arch/i386/i386/genassym.cf: revision 1.77
sys/arch/i386/i386/autoconf.c: revision 1.93
sys/arch/amd64/amd64/locore.S: revision 1.48
sys/arch/amd64/amd64/machdep.c: revision 1.104
sys/arch/i386/i386/machdep.c: revision 1.646
sys/arch/amd64/amd64/genassym.cf: revision 1.38
sys/arch/i386/i386/locore.S: revision 1.79
PR port-amd64/38293 panic: fp_save ipi didn't
Fix race conditions in FPU IPI handling.
To generate a diff of this commit:
cvs rdiff -r1.26.6.1 -r1.26.6.2 src/sys/arch/amd64/amd64/fpu.c
cvs rdiff -r1.37 -r1.37.4.1 src/sys/arch/amd64/amd64/genassym.cf
cvs rdiff -r1.47 -r1.47.8.1 src/sys/arch/amd64/amd64/locore.S
cvs rdiff -r1.102 -r1.102.4.1 src/sys/arch/amd64/amd64/machdep.c
cvs rdiff -r1.92 -r1.92.8.1 src/sys/arch/i386/i386/autoconf.c
cvs rdiff -r1.76 -r1.76.4.1 src/sys/arch/i386/i386/genassym.cf
cvs rdiff -r1.78 -r1.78.4.1 src/sys/arch/i386/i386/locore.S
cvs rdiff -r1.644 -r1.644.4.1 src/sys/arch/i386/i386/machdep.c
cvs rdiff -r1.129.10.1 -r1.129.10.2 src/sys/arch/i386/isa/npx.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 22 Feb 2009 08:00:13 +0000
State-Changed-Why:
Fixed by ad in November.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.