NetBSD Problem Report #44995

From yamt@NetBSD.org  Thu May 26 03:17:07 2011
Return-Path: <yamt@NetBSD.org>
Received: by www.NetBSD.org (Postfix, from userid 1270)
	id E34AF63BA9C; Thu, 26 May 2011 03:17:07 +0000 (UTC)
Message-Id: <20110526031707.E34AF63BA9C@www.NetBSD.org>
Date: Thu, 26 May 2011 03:17:07 +0000 (UTC)
From: yamt@NetBSD.org
Reply-To: yamt@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: PAE cpu_load_pmap doesn't seem safe
X-Send-Pr-Version: 3.95

>Number:         44995
>Category:       port-i386
>Synopsis:       PAE cpu_load_pmap doesn't seem safe
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-i386-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu May 26 03:20:00 +0000 2011
>Closed-Date:    Tue Jun 12 17:14:52 +0000 2012
>Last-Modified:  Tue Dec 18 02:35:02 +0000 2012
>Originator:     YAMAMOTO Takashi
>Release:        NetBSD current
>Organization:

>Environment:


Architecture: i386
Machine: i386
>Description:
	in the case of PAE, cpu_load_pmap modifies L3 PDIR for
	the current cpu with the following code.

		l3_pd[i] = pmap->pm_pdirpa[i] | PG_V;

	this will likely be compiled into two 32-bit mov instructions,
	and nothing prevents a page table walk between them.

>How-To-Repeat:

>Fix:
	make cr3 simply point to the recursive mapping part of the second
	level PTP?  (i haven't confirmed if this is possible.  just an idea.)

>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->closed
State-Changed-By: yamt@NetBSD.org
State-Changed-When: Tue, 12 Jun 2012 17:14:52 +0000
State-Changed-Why:
fixed.


From: "YAMAMOTO Takashi" <yamt@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44995 CVS commit: src/sys/arch/x86/x86
Date: Tue, 12 Jun 2012 17:14:19 +0000

 Module Name:	src
 Committed By:	yamt
 Date:		Tue Jun 12 17:14:19 UTC 2012

 Modified Files:
 	src/sys/arch/x86/x86: cpu.c

 Log Message:
 cpu_load_pmap: disable interrupts.  add a comment to explain why.  PR/44995


 To generate a diff of this commit:
 cvs rdiff -u -r1.98 -r1.99 src/sys/arch/x86/x86/cpu.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Bernd Ernesti <netbsd@lists.veego.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-i386/44995 (PAE cpu_load_pmap doesn't seem safe)
Date: Tue, 12 Jun 2012 20:05:51 +0200

 On Tue, Jun 12, 2012 at 05:14:53PM +0000, yamt@NetBSD.org wrote:
 > Synopsis: PAE cpu_load_pmap doesn't seem safe
 > 
 > State-Changed-From-To: open->closed

 What about a pullup for netbsd-6?

 Bernd

From: yamt@mwd.biglobe.ne.jp (YAMAMOTO Takashi)
To: gnats-bugs@NetBSD.org
Cc: port-i386-maintainer@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, yamt@NetBSD.org
Subject: Re: port-i386/44995 (PAE cpu_load_pmap doesn't seem safe)
Date: Wed, 13 Jun 2012 02:20:04 +0000 (UTC)

 hi,

 > The following reply was made to PR port-i386/44995; it has been noted by GNATS.
 > 
 > From: Bernd Ernesti <netbsd@lists.veego.de>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: port-i386/44995 (PAE cpu_load_pmap doesn't seem safe)
 > Date: Tue, 12 Jun 2012 20:05:51 +0200
 > 
 >  On Tue, Jun 12, 2012 at 05:14:53PM +0000, yamt@NetBSD.org wrote:
 >  > Synopsis: PAE cpu_load_pmap doesn't seem safe
 >  > 
 >  > State-Changed-From-To: open->closed
 >  
 >  What about a pullup for netbsd-6?

 while i haven't checked the branch, i bet it's necessary.

 i'd appreciate it if someone would volunteer to test and request a pullup.
 running ./build.sh -j128 was enough to trigger the problem.
 (crash or silent reboot)

 YAMAMOTO Takashi

 >  
 >  Bernd

From: Nat Sloss <nathanialsloss@yahoo.com.au>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-i386/44995
Date: Tue, 18 Dec 2012 11:10:01 +1100

 Hi.

 I would like to request a pull up for NetBSD-6 as I have experienced problems 
 with PAE (non XEN) i386 NetBSD-6 kernels.

 While doing a

  ./build.sh -U -x -X ../xsrc -j 12 release

 on a NetBSD-6 i386 PAE kernel (I'm testing out my new computer and will 
 hopefully repeat this task for 100 hours or so), it would sometimes reset 
 after 10 minutes or panic in make.

 Here is the output from ddb:

 fatal double fault in supervisor mode
 trap type 12 code 80000000 eip c010cce0 cd 7 eflags 10086 cr2 cbb76ffc ilevel 6
 db {2}> bt
 Xtrap0e(c010cce0,8,10086,10,c010cce0,8,10086,c010cce0,8) at netbsd:Xtrap03
 ?(d42138bc,d4213950,ebb77da0,1,bb517000,1b,bb7a24f2,0,ebb70010,30) at 10
 pmap_load(b3,ab,bf7f001f,bb51001f,bb517000,80684b8,bf7d4b8,bb7d25bc,0,0) at netbsd:pmap_load+0x113
 *occurs just after or during cpu_load_pmap in pmap_load/pmap.c

 I applied the differences in 1.100 of x86/cpu.c and now the kernel won't boot 
 at all.  It crashes immediately and no backtrace can be obtained, as the stack 
 pointer is invalid (0xce), and it stops, displaying: system: invalid address.

 One of the registers points to kernel_pmap_store, eip equals 0x8, and cr2 
 is equal to cpu_load_pmap+0x3a, which is right where the function pops the 
 registers and returns.  However, the stack pointer is invalid, so it crashes.

 Are there other files that I need to alter to get the changes made in this PR 
 on NetBSD-6?  Although my new machine is amd64, I intend to run an i386 PAE 
 kernel, as all of my other machines are i386.


 Regards,

 Nat.

From: Nat Sloss <nathanialsloss@yahoo.com.au>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-i386/44995
Date: Tue, 18 Dec 2012 13:21:20 +1100

 Hi.

 I've found the reason as to why NetBSD-6 PAE i386 kernels crash after applying 
 the patch from this PR.

 The reason is that the patch enables interrupts on the way out, while I think 
 they are disabled at initial boot.  Checking the interrupt enable flag first 
 and only then deciding whether to enable interrupts allowed it to boot, but 
 alas I don't think I've done this properly, as it causes the system to lock up 
 within one minute when building.

 This is the modified patch that I applied:

 Index: src/sys/arch/x86/x86/cpu.c
 ===================================================================
 RCS file: /cvsroot/src/sys/arch/x86/x86/cpu.c,v
 retrieving revision 1.96.8.3
 diff -u -r1.96.8.3 cpu.c
 --- src/sys/arch/x86/x86/cpu.c	5 Jul 2012 17:52:54 -0000	1.96.8.3
 +++ src/sys/arch/x86/x86/cpu.c	18 Dec 2012 02:24:07 -0000
 @@ -1,4 +1,4 @@
 -/*	$NetBSD: cpu.c,v 1.96.8.3 2012/07/05 17:52:54 riz Exp $	*/
 +/*	$NetBSD: cpu.c,v 1.100 2012/07/02 01:05:48 chs Exp $	*/

  /*-
   * Copyright (c) 2000-2012 NetBSD Foundation, Inc.
 @@ -62,7 +62,7 @@
   */

  #include <sys/cdefs.h>
 -__KERNEL_RCSID(0, "$NetBSD: cpu.c,v 1.96.8.3 2012/07/05 17:52:54 riz Exp $");
 +__KERNEL_RCSID(0, "$NetBSD: cpu.c,v 1.100 2012/07/02 01:05:48 chs Exp $");

  #include "opt_ddb.h"
  #include "opt_mpbios.h"		/* for MPDEBUG */
 @@ -1235,16 +1235,23 @@
  cpu_load_pmap(struct pmap *pmap, struct pmap *oldpmap)
  {
  #ifdef PAE
 -	int i, s;
 -	struct cpu_info *ci;
 -
 -	s = splvm(); /* just to be safe */
 -	ci = curcpu();
 +	struct cpu_info *ci = curcpu();
  	pd_entry_t *l3_pd = ci->ci_pae_l3_pdir;
 +	int i, intrPrev;
 +
 +	intrPrev = x86_read_flags() & 0x40; 
 +
 +	/*
 +	 * disable interrupts to block TLB shootdowns, which can reload cr3.
 +	 * while this doesn't block NMIs, it's probably ok as NMIs unlikely
 +	 * reload cr3.
 +	 */
 +	x86_disable_intr();
  	for (i = 0 ; i < PDP_SIZE; i++) {
  		l3_pd[i] = pmap->pm_pdirpa[i] | PG_V;
  	}
 -	splx(s);
 +	if (intrPrev)
 +		x86_enable_intr();
  	tlbflush();
  #else /* PAE */
  	lcr3(pmap_pdirpa(pmap, 0));


 Is there another way apart from reading the IE bit of the flags register to 
 determine whether interrupts should be enabled?

 Regards,

 Nat.

>Unformatted:
