NetBSD Problem Report #38293

From martin@duskware.de  Tue Mar 25 11:33:06 2008
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id BCA5163B863
	for <gnats-bugs@gnats.netbsd.org>; Tue, 25 Mar 2008 11:33:06 +0000 (UTC)
Message-Id: <20080325095556.2167863B8A5@narn.NetBSD.org>
Date: Tue, 25 Mar 2008 09:55:56 +0000 (UTC)
From: Christoph_Egger@gmx.de
Reply-To: Christoph_Egger@gmx.de
To: netbsd-bugs-owner@NetBSD.org
Subject: panic: fp_save ipi didn't
X-Send-Pr-Version: www-1.0

>Number:         38293
>Category:       port-amd64
>Synopsis:       panic: fp_save ipi didn't
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-amd64-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 25 11:35:00 +0000 2008
>Closed-Date:    Sun Feb 22 08:00:13 +0000 2009
>Last-Modified:  Sun Feb 22 08:00:13 +0000 2009
>Originator:     Christoph Egger
>Release:        4.99.48
>Organization:
>Environment:
NetBSD 4.99.48 (GENERIC) #0: Mon Jan 7 21:47:09 PST 2008 builds@wb28:/home/builds/ab/HEAD/amd64/200801070002Z-obj/home/builds/ab/HEAD/src/sys/arch/amd64/compile/GENERC amd64
>Description:

The kernel panics in src/sys/arch/amd64/amd64/fpu.c
with "panic: fp_save ipi didn't"

In all panics I have got so far, the backtrace is always the same:

fpusave_lwp() at netbsd:fpusave_lwp+0x7b
cpu_lwp_free() at netbsd:cpu_lwp_free+0x2c
exit1() at netbsd:exit1+0x5f1
sys_exit() at netbsd:sys_exit+0x67
syscall() at netbsd:syscall+0xa9


To get this panic, a simple command is enough:

dmesg | less

However, the important thing here is the pipe. And the panic
always happens when the command _after_ the pipe ("less" in the
above example) does the exit syscall.

The panic does not always happen, but it is reproducible in a
non-interactive way:

while true; do dmesg | fgrep "something"; done

Then wait for the panic. When it happens, it is always when
fgrep does the exit syscall.
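
For unattended runs, the loop above can be wrapped in a small helper
that stops after a fixed number of iterations. This is only a sketch:
the function name run_pipeline_loop and the PRODUCER variable are
made up for illustration and are not part of the original report.
"grep -F" is used as the equivalent of fgrep.

```sh
#!/bin/sh
# run_pipeline_loop COUNT PATTERN
# Hypothetical helper: runs "PRODUCER | grep -F PATTERN" COUNT times and
# prints the number of completed iterations. PRODUCER defaults to dmesg.
run_pipeline_loop() {
    count=$1
    pattern=$2
    producer=${PRODUCER:-dmesg}
    i=0
    while [ "$i" -lt "$count" ]; do
        # Only the exit of the process after the pipe matters here,
        # so both the matched output and grep's status are discarded.
        $producer | grep -F "$pattern" >/dev/null || true
        i=$((i + 1))
    done
    echo "$i"
}
```

Calling it as "run_pipeline_loop 100000 something" keeps the original
behaviour of exercising the exit path of the command after the pipe,
but ends on its own if no panic occurs.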


Joerg Sonnenberger told me that this (or a similar) panic
has been seen on non-x86 hardware, so I assign this to
category "kern" instead of "port-amd64".

>How-To-Repeat:

Set up a Xen environment with Xen 3.3-unstable,
changeset 17264. You can get the sources
from

hg clone http://xenbits.xensource.com/staging/xen-unstable.hg/

Older Xen versions don't have the necessary bugfixes to run
NetBSD/amd64 as HVM guest.

It doesn't matter if you run NetBSD or Linux as Dom0.

Then install NetBSD/amd64 in an HVM guest, so that you have
a NetBSD disk image.

It is very important that you assign more VCPUs to the HVM
guest than there are physical CPUs (I use twice as many VCPUs
as there are physical CPUs). You must also have at least two
physical CPUs (=> at least 4 VCPUs for the guest).
Then boot NetBSD/amd64.
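
Before starting the loop, it may help to confirm that the guest really
sees enough VCPUs. The sketch below encodes the conditions stated
above. The function name enough_vcpus is hypothetical, and the number
of physical CPUs has to be supplied by hand, since the guest cannot
see it.

```sh
#!/bin/sh
# enough_vcpus GUEST_VCPUS PHYS_CPUS
# Hypothetical check: succeeds only if there are at least two physical
# CPUs, at least 4 VCPUs, and more VCPUs than physical CPUs.
enough_vcpus() {
    guest=$1
    phys=$2
    [ "$phys" -ge 2 ] && [ "$guest" -ge 4 ] && [ "$guest" -gt "$phys" ]
}
```

Inside the guest, the VCPU count can be read with "sysctl -n hw.ncpu",
for example: enough_vcpus "$(sysctl -n hw.ncpu)" 2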

Login and run

while true; do dmesg | fgrep "something"; done

Sometimes the panic even happens during boot, when the
boot shell scripts run.


>Fix:

>Release-Note:

>Audit-Trail:
From: "Jared D. McNeill" <jmcneill@invisible.ca>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
 netbsd-bugs@netbsd.org
Subject: Re: kern/38293: panic: fp_save ipi didn't
Date: Tue, 25 Mar 2008 12:06:59 -0400

 Christoph_Egger@gmx.de wrote:
 > Joerg Sonnenberger told me, this (or similar) panics
 > has been seen on non-x86 hardware so I assign this to
 > category "kern" instead to "port-amd64".

 I have seen the same panic (under different circumstances) on amd64, see 
 port-amd64/37748

 Cheers,
 Jared

Responsible-Changed-From-To: kern-bug-people->amd64-maintainer
Responsible-Changed-By: martin@NetBSD.org
Responsible-Changed-When: Wed, 26 Mar 2008 19:00:04 +0000
Responsible-Changed-Why:
Even if rumors say this has been seen on other archs, it is still all
machine-dependent code.


From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/38293: panic: fp_save ipi didn't
Date: Sat, 29 Mar 2008 16:23:38 +0000

 I've just seen this on NetBSD/i386. I suspect that the synchronization
 around the IPIs isn't up to the job.

 Andrew

Responsible-Changed-From-To: amd64-maintainer->port-amd64-maintainer
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Wed, 23 Apr 2008 06:39:46 +0000
Responsible-Changed-Why:
typo'd


From: Simon Burge <simonb@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: Re: kern/38293: panic: fp_save ipi didn't 
Date: Fri, 20 Jun 2008 15:14:58 +1000

 Christoph_Egger@gmx.de wrote:

 > The kernel panics in src/sys/arch/amd64/amd64/fpu.c
 > with "panic: fp_save ipi didn't"
 > 
 > In all panics I have got so far, the backtrace is always the same:
 > 
 > fpusave_lwp() at netbsd:fpusave_lwp+0x7b
 > cpu_lwp_free() at netbsd:cpu_lwp_free+0x2c
 > exit1() at netbsd:exit1+0x5f1
 > sys_exit() at netbsd:sys_exit+0x67
 > syscall() at netbsd:syscall+0xa9

 I've just started seeing this too, on amd64 with the simonb-wapbl branch
 from June 19th.  I've seen three so far while running build.sh, and
 they've all been of the form:

    breakpoint() at netbsd:breakpoint+0x5 
    panic() at netbsd:panic+0x260 
    fpusave_lwp() at netbsd:fpusave_lwp+0x92 
    cpu_lwp_fork() at netbsd:cpu_lwp_fork+0x41 
    lwp_create() at netbsd:lwp_create+0x24f 
    fork1() at netbsd:fork1+0x45c 
    sys___vfork14() at netbsd:sys___vfork14+0x35 
    syscall() at netbsd:syscall+0x9a 

 Note that I've got a couple of printfs in some of the ufs filesystem
 functions, so there's quite a lot of console output happening all the
 time.  I've not seen this problem without those printfs.  Does this
 mean it is something timing related then?

 Cheers,
 Simon.

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/38293 CVS commit: src/sys/arch
Date: Tue, 11 Nov 2008 13:45:10 +0000 (UTC)

 Module Name:	src
 Committed By:	ad
 Date:		Tue Nov 11 13:45:10 UTC 2008

 Modified Files:
 	src/sys/arch/amd64/amd64: fpu.c ipifuncs.c
 	src/sys/arch/i386/i386: ipifuncs.c
 	src/sys/arch/i386/isa: npx.c
 	src/sys/arch/x86/include: intrdefs.h

 Log Message:
 PR port-amd64/38293 panic: fp_save ipi didn't

 Kill the FP flush IPI and always save. The synchronization here isn't strong
 and we could easily pull the chain on an innocent LWP's FP state.

 Another fix to follow.


 To generate a diff of this commit:
 cvs rdiff -r1.26 -r1.27 src/sys/arch/amd64/amd64/fpu.c
 cvs rdiff -r1.19 -r1.20 src/sys/arch/amd64/amd64/ipifuncs.c
 cvs rdiff -r1.27 -r1.28 src/sys/arch/i386/i386/ipifuncs.c
 cvs rdiff -r1.129 -r1.130 src/sys/arch/i386/isa/npx.c
 cvs rdiff -r1.13 -r1.14 src/sys/arch/x86/include/intrdefs.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/38293 CVS commit: src/sys/arch
Date: Tue, 11 Nov 2008 14:40:19 +0000 (UTC)

 Module Name:	src
 Committed By:	ad
 Date:		Tue Nov 11 14:40:19 UTC 2008

 Modified Files:
 	src/sys/arch/amd64/amd64: fpu.c genassym.cf locore.S machdep.c
 	src/sys/arch/i386/i386: autoconf.c genassym.cf locore.S machdep.c
 	src/sys/arch/i386/isa: npx.c

 Log Message:
 PR port-amd64/38293 panic: fp_save ipi didn't

 Fix race conditions in FPU IPI handling.


 To generate a diff of this commit:
 cvs rdiff -r1.27 -r1.28 src/sys/arch/amd64/amd64/fpu.c
 cvs rdiff -r1.37 -r1.38 src/sys/arch/amd64/amd64/genassym.cf
 cvs rdiff -r1.47 -r1.48 src/sys/arch/amd64/amd64/locore.S
 cvs rdiff -r1.103 -r1.104 src/sys/arch/amd64/amd64/machdep.c
 cvs rdiff -r1.92 -r1.93 src/sys/arch/i386/i386/autoconf.c
 cvs rdiff -r1.76 -r1.77 src/sys/arch/i386/i386/genassym.cf
 cvs rdiff -r1.78 -r1.79 src/sys/arch/i386/i386/locore.S
 cvs rdiff -r1.645 -r1.646 src/sys/arch/i386/i386/machdep.c
 cvs rdiff -r1.130 -r1.131 src/sys/arch/i386/isa/npx.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/38293 CVS commit: [netbsd-5] src/sys/arch
Date: Mon, 17 Nov 2008 18:50:22 +0000 (UTC)

 Module Name:	src
 Committed By:	snj
 Date:		Mon Nov 17 18:50:22 UTC 2008

 Modified Files:
 	src/sys/arch/amd64/amd64 [netbsd-5]: fpu.c ipifuncs.c
 	src/sys/arch/i386/i386 [netbsd-5]: ipifuncs.c
 	src/sys/arch/i386/isa [netbsd-5]: npx.c
 	src/sys/arch/x86/include [netbsd-5]: intrdefs.h

 Log Message:
 Pull up following revision(s) (requested by ad in ticket #73):
 	sys/arch/amd64/amd64/fpu.c: revision 1.27
 	sys/arch/amd64/amd64/ipifuncs.c: revision 1.20
 	sys/arch/i386/i386/ipifuncs.c: revision 1.28
 	sys/arch/i386/isa/npx.c: revision 1.130
 	sys/arch/x86/include/intrdefs.h: revision 1.14
 PR port-amd64/38293 panic: fp_save ipi didn't
 Kill the FP flush IPI and always save. The synchronization here isn't
 strong and we could easily pull the chain on an innocent LWP's FP state.
 Another fix to follow.


 To generate a diff of this commit:
 cvs rdiff -r1.26 -r1.26.6.1 src/sys/arch/amd64/amd64/fpu.c
 cvs rdiff -r1.19 -r1.19.8.1 src/sys/arch/amd64/amd64/ipifuncs.c
 cvs rdiff -r1.27 -r1.27.8.1 src/sys/arch/i386/i386/ipifuncs.c
 cvs rdiff -r1.129 -r1.129.10.1 src/sys/arch/i386/isa/npx.c
 cvs rdiff -r1.13 -r1.13.10.1 src/sys/arch/x86/include/intrdefs.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Soren Jacobsen <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/38293 CVS commit: [netbsd-5] src/sys/arch
Date: Mon, 17 Nov 2008 18:53:54 +0000 (UTC)

 Module Name:	src
 Committed By:	snj
 Date:		Mon Nov 17 18:53:54 UTC 2008

 Modified Files:
 	src/sys/arch/amd64/amd64 [netbsd-5]: fpu.c genassym.cf locore.S
 	    machdep.c
 	src/sys/arch/i386/i386 [netbsd-5]: autoconf.c genassym.cf locore.S
 	    machdep.c
 	src/sys/arch/i386/isa [netbsd-5]: npx.c

 Log Message:
 Pull up following revision(s) (requested by ad in ticket #74):
 	sys/arch/i386/isa/npx.c: revision 1.131
 	sys/arch/amd64/amd64/fpu.c: revision 1.28
 	sys/arch/i386/i386/genassym.cf: revision 1.77
 	sys/arch/i386/i386/autoconf.c: revision 1.93
 	sys/arch/amd64/amd64/locore.S: revision 1.48
 	sys/arch/amd64/amd64/machdep.c: revision 1.104
 	sys/arch/i386/i386/machdep.c: revision 1.646
 	sys/arch/amd64/amd64/genassym.cf: revision 1.38
 	sys/arch/i386/i386/locore.S: revision 1.79
 PR port-amd64/38293 panic: fp_save ipi didn't
 Fix race conditions in FPU IPI handling.


 To generate a diff of this commit:
 cvs rdiff -r1.26.6.1 -r1.26.6.2 src/sys/arch/amd64/amd64/fpu.c
 cvs rdiff -r1.37 -r1.37.4.1 src/sys/arch/amd64/amd64/genassym.cf
 cvs rdiff -r1.47 -r1.47.8.1 src/sys/arch/amd64/amd64/locore.S
 cvs rdiff -r1.102 -r1.102.4.1 src/sys/arch/amd64/amd64/machdep.c
 cvs rdiff -r1.92 -r1.92.8.1 src/sys/arch/i386/i386/autoconf.c
 cvs rdiff -r1.76 -r1.76.4.1 src/sys/arch/i386/i386/genassym.cf
 cvs rdiff -r1.78 -r1.78.4.1 src/sys/arch/i386/i386/locore.S
 cvs rdiff -r1.644 -r1.644.4.1 src/sys/arch/i386/i386/machdep.c
 cvs rdiff -r1.129.10.1 -r1.129.10.2 src/sys/arch/i386/isa/npx.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 22 Feb 2009 08:00:13 +0000
State-Changed-Why:
Fixed by ad in November.


>Unformatted:

Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.