NetBSD Problem Report #41342

From www@NetBSD.org  Sun May  3 07:10:20 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 0AE9163C221
	for <gnats-bugs@gnats.netbsd.org>; Sun,  3 May 2009 07:10:20 +0000 (UTC)
Message-Id: <20090503071019.BE55F63C131@www.NetBSD.org>
Date: Sun,  3 May 2009 07:10:19 +0000 (UTC)
From: marcotte@panix.com
Reply-To: marcotte@panix.com
To: gnats-bugs@NetBSD.org
Subject: BSDi binaries cause panic
X-Send-Pr-Version: www-1.0

>Number:         41342
>Category:       kern
>Synopsis:       BSDi binaries cause panic
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    chs
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun May 03 07:15:00 +0000 2009
>Closed-Date:    
>Last-Modified:  Wed Oct 31 17:35:03 +0000 2012
>Originator:     Brian Marcotte
>Release:        NetBSD 5.0
>Organization:
Panix
>Environment:
NetBSD www.panix.com 5.0 NetBSD 5.0 (PANIX-XEN3U-WEB) #0: Tue Apr 28 20:29:19 EDT 2009  root@juggler.panix.com:/misc3/obj/misc2/devel/netbsd/5.0/src/sys/arch/i386/compile/PANIX-XEN3U-WEB i386
>Description:
BSDi binaries segfault or even panic the system on NetBSD 5.0.

Under Xen, the kernel will panic:
    fatal protection fault in supervisor mode
    trap type 4 code 0 eip c01001f0 cs 9 eflags 10286 cr2 7514 ilevel 0
    panic: trap
    Begin traceback...
    End traceback...
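
As a reading aid for the trap line above, here is a minimal Python sketch that splits such a line into fields and names the EFLAGS bits that are set. The helper names (`parse_trap_line`, `decode_eflags`) are invented for illustration and are not part of NetBSD. The sketch assumes the trap type is printed in decimal and every other value in hexadecimal; the bit positions are the standard x86 EFLAGS layout.

```python
import re

# Standard x86 EFLAGS bit positions (a subset, enough for this report).
EFLAGS_BITS = {
    0x00001: "CF",
    0x00004: "PF",
    0x00010: "AF",
    0x00040: "ZF",
    0x00080: "SF",
    0x00100: "TF",
    0x00200: "IF",   # interrupt enable
    0x00400: "DF",
    0x00800: "OF",
    0x10000: "RF",
}

FIELD_RE = re.compile(r"(type|code|eip|cs|eflags|cr2|ilevel) ([0-9a-f]+)")


def parse_trap_line(line):
    """Return the 'name value' pairs of a trap line as a dict of ints."""
    fields = {}
    for name, value in FIELD_RE.findall(line):
        # Assumption: only the trap type is decimal, the rest is hex.
        base = 10 if name == "type" else 16
        fields[name] = int(value, base)
    return fields


def decode_eflags(eflags):
    """List the names of the known EFLAGS bits set in eflags, low bit first."""
    return [name for bit, name in sorted(EFLAGS_BITS.items()) if eflags & bit]


line = "trap type 4 code 0 eip c01001f0 cs 9 eflags 10286 cr2 7514 ilevel 0"
trap = parse_trap_line(line)
print(hex(trap["eip"]), decode_eflags(trap["eflags"]))
```

Read this way, the saved eflags value has the interrupt-enable bit (IF) set, and eip is a kernel address, which fits the "supervisor mode" wording of the panic.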

When I told it to drop to DDB so I could try to get a trace, I got this:
  Xosyscall(17,12af,bf7ff9f0,1f,0,0,0,0,d104ad90,d104a4f4) at netbsd:Xosyscall
  ?(1,bf7ffa80,bf7ffa88,bf7ffff0,0,0,1,bf7ffb00,0,bf7ffb0a) at 0x18e5
  ?(bf7ffb00,0,bf7ffb0a,bf7ffb2c,bf7ffb44,bf7ffc1c,bf7ffc5c,bf7ffc7e,bf7ffcb5,bf7ffcd2) at 0x1097

I've tracked this down to the commit described in this message:

    http://mail-index.netbsd.org/source-changes/2009/04/04/msg219192.html

which was done to fix PR 40143 (Viewing an mpeg transport stream with mplayer causes crash).

>How-To-Repeat:
Grab a sample binary from http://www.panix.com/~marcotte/bsdi/architextIndex.

When NetBSD 5.0 is running on hardware, the binary will segfault. When
the system is a Xen domU, the kernel will panic. Under previous versions
of NetBSD, it works fine and will print "Must specify root index
filename".

>Fix:

>Release-Note:

>Audit-Trail:
From: Andrew Doran <ad@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/41342 CVS commit: src/sys/arch/i386/i386
Date: Mon, 4 May 2009 11:47:29 +0000

 Module Name:	src
 Committed By:	ad
 Date:		Mon May  4 11:47:29 UTC 2009

 Modified Files:
 	src/sys/arch/i386/i386: locore.S

 Log Message:
 PR kern/41342: BSDi binaries cause panic

 XXX Manuel, please have a look as I am not sure what to do for XEN here!


 To generate a diff of this commit:
 cvs rdiff -u -r1.86 -r1.87 src/sys/arch/i386/i386/locore.S

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Brian Marcotte <marcotte@panix.com>
To: Andrew Doran <ad@netbsd.org>
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, gnats-bugs@NetBSD.org
Subject: Re: PR/41342 CVS commit: src/sys/arch/i386/i386
Date: Thu, 7 May 2009 14:54:47 -0400

 >  Log Message:
 >  PR kern/41342: BSDi binaries cause panic
 >  
 >  XXX Manuel, please have a look as I am not sure what to do for XEN here!

 Thanks for looking at this.

 Any luck on getting BSDi binaries to work at all? On hardware it's better
 since it doesn't panic the machine, but the binaries segfault.

 --
 - Brian

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Brian Marcotte <marcotte@panix.com>
Cc: Andrew Doran <ad@NetBSD.org>, kern-bug-people@NetBSD.org,
        gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org, gnats-bugs@NetBSD.org
Subject: Re: PR/41342 CVS commit: src/sys/arch/i386/i386
Date: Sun, 17 May 2009 19:12:35 +0200

 On Thu, May 07, 2009 at 02:54:47PM -0400, Brian Marcotte wrote:
 > >  Log Message:
 > >  PR kern/41342: BSDi binaries cause panic
 > >  
 > >  XXX Manuel, please have a look as I am not sure what to do for XEN here!
 > 
 > Thanks for looking at this.
 > 
 > Any luck on getting BSDi binaries to work at all? On hardware it's better
 > since it doesn't panic the machine, but the binaries segfault.

 I looked a bit at this: I added instrumentation to i386 trap.c, and found
 that it gets:
 trap 4 code 8 eip ad6fe cs 17 eflags 10293 cr2 acb94 cpl 0
 curlwp 0xca293cc0 pid 6 lid
 db> x/x acb94
 0xacb94:        57e58955
 db> 
 0xacb98:        7d8b5356
 db> 
 0xacb9c:        a43d8308

 so it has no problems reading what cr2 points to.
 0xad6ed:        addb    %al,0(%eax)
 db> 
 0xad6ef:        addb    %ch,%cl
 db> 
 0xad6f1:        decl    %esi
 db> 
 0xad6f2:        addb    %al,0(%eax)
 db> 
 0xad6f4:        addb    %al,0(%eax)
 db> 
 0xad6f6:        addb    %al,0(%eax)
 db> 
 0xad6f8:        leal    0xca,%eax
 db> 
 0xad6fe:        lcall   $0,0x7
 db> 
 0xad705:        jb      0xad6f0
 db> 
 0xad707:        ret

 but:
 0xad6f0:        jmp     0xad743
 db> 
 0xad6f5:        addb    %al,0(%eax)
 db> 
 0xad6f7:        addb    %cl,0xca05(%ebp)
 db> 
 0xad6fd:        addb    %bl,0(%edx)
 db> 
 0xad703:        pop     %es
 db> 
 0xad704:        addb    %dh,0xffffffe9(%edx)
 db> 
 0xad707:        ret

 So the faulting instruction would be a 'lcall   $0,0x7' but I don't
 understand how this binary is constructed.

 Any idea what to look at next?

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 years of experience will always make the difference
 --
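
The two listings above differ only in where decoding starts. The first alignment (`leal 0xca,%eax` followed by the far call) is the one that lands on the faulting eip 0xad6fe. The following Python sketch decodes that far call and the segment selectors seen in this report. The byte values are reconstructed from the two disassemblies, not dumped from the binary, so treat them as an assumption; the function names are hypothetical.

```python
import struct


def decode_far_call(code):
    """Decode a 32-bit x86 'call ptr16:32' (opcode 0x9a).

    Layout: 1 opcode byte, 4-byte little-endian offset, 2-byte selector.
    Returns (offset, selector).
    """
    if len(code) < 7 or code[0] != 0x9A:
        raise ValueError("not a 32-bit far call")
    offset, selector = struct.unpack_from("<IH", code, 1)
    return offset, selector


def decode_selector(selector):
    """Split a segment selector into table index, table and privilege level."""
    return {
        "index": selector >> 3,
        "table": "LDT" if selector & 0x4 else "GDT",
        "rpl": selector & 0x3,
    }


# Assumed bytes at 0xad6fe, inferred from the listings above.
LCALL_ADDR = 0xAD6FE
LCALL = bytes([0x9A, 0x00, 0x00, 0x00, 0x00, 0x07, 0x00])

offset, selector = decode_far_call(LCALL)
next_insn = LCALL_ADDR + len(LCALL)

print(hex(offset), hex(selector), decode_selector(selector), hex(next_insn))
```

Under these assumptions the call targets offset 0 through selector 7, that is, entry 0 of the LDT at privilege level 3, and the next instruction sits at 0xad705, matching the `jb` in the first listing. Loading a value into %eax and then making a far call through a fixed selector is consistent with an old-style call-gate system call, which would also fit the Xosyscall frame in the original DDB trace.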

From: "Chuck Silvers" <chs@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/41342 CVS commit: src/sys/arch/i386/i386
Date: Fri, 26 Oct 2012 14:46:44 +0000

 Module Name:	src
 Committed By:	chs
 Date:		Fri Oct 26 14:46:44 UTC 2012

 Modified Files:
 	src/sys/arch/i386/i386: locore.S

 Log Message:
 in osyscall, set the PSL_I bit into the correct field of the trapframe.
 it was going into tf_eip instead of tf_eflags, which would sometimes
 corrupt %eip and always return to user mode with interrupts disabled.
 this was found with a netbsd 1.0 binary, and dsl@ points out that
 this should also fix PR 41342.


 To generate a diff of this commit:
 cvs rdiff -u -r1.102 -r1.103 src/sys/arch/i386/i386/locore.S

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
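
To make the log message above concrete, here is a small Python model of the described mistake. It is not the locore.S code: the trapframe is a plain dict, the function names are made up, and the starting eflags value is hypothetical. Only `PSL_I` (the x86 interrupt-enable flag, 0x200) and the field names `tf_eip` and `tf_eflags` come from the commit message; the two eip values are taken from addresses that appear earlier in this report.

```python
PSL_I = 0x00000200  # x86 EFLAGS interrupt-enable bit


def osyscall_buggy(tf):
    """Model of the bug as described: PSL_I is OR-ed into tf_eip."""
    out = dict(tf)
    out["tf_eip"] |= PSL_I
    return out


def osyscall_fixed(tf):
    """Model of the fix: PSL_I is OR-ed into tf_eflags."""
    out = dict(tf)
    out["tf_eflags"] |= PSL_I
    return out


def interrupts_enabled(tf):
    return bool(tf["tf_eflags"] & PSL_I)


# Hypothetical saved state: interrupt flag clear on entry.
frame_a = {"tf_eip": 0x18E5, "tf_eflags": 0x10086}   # bit 9 of eip is clear
frame_b = {"tf_eip": 0xAD705, "tf_eflags": 0x10086}  # bit 9 of eip already set

for frame in (frame_a, frame_b):
    bad, good = osyscall_buggy(frame), osyscall_fixed(frame)
    print(hex(frame["tf_eip"]),
          "buggy eip:", hex(bad["tf_eip"]), interrupts_enabled(bad),
          "fixed eip:", hex(good["tf_eip"]), interrupts_enabled(good))
```

The model shows why the corruption is only occasional: OR-ing 0x200 into an eip that already has bit 9 set leaves it unchanged, while any other eip is moved 512 bytes forward. In both cases the buggy path never sets the interrupt flag in the saved eflags.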

From: Buhrow

 The commit above (locore.S revision 1.103) should fix this issue once it's pulled into NetBSD-5 and NetBSD-6.
 For NetBSD-5, see ticket #1810.
 For NetBSD-6, see ticket #642.


State-Changed-From-To: open->pending-pullups
State-Changed-By: buhrow@NetBSD.org
State-Changed-When: Fri, 26 Oct 2012 18:29:55 +0000
State-Changed-Why:
There is a fix in hand and in the -current tree.


From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/41342 CVS commit: [netbsd-5] src/sys/arch/i386/i386
Date: Wed, 31 Oct 2012 15:34:59 +0000

 Module Name:	src
 Committed By:	riz
 Date:		Wed Oct 31 15:34:58 UTC 2012

 Modified Files:
 	src/sys/arch/i386/i386 [netbsd-5]: locore.S

 Log Message:
 Pull up following revision(s) (requested by chs in ticket #1810):
 	sys/arch/i386/i386/locore.S: revision 1.103
 in osyscall, set the PSL_I bit into the correct field of the trapframe.
 it was going into tf_eip instead of tf_eflags, which would sometimes
 corrupt %eip and always return to user mode with interrupts disabled.
 this was found with a netbsd 1.0 binary, and dsl@ points out that
 this should also fix PR 41342.


 To generate a diff of this commit:
 cvs rdiff -u -r1.78.4.3 -r1.78.4.4 src/sys/arch/i386/i386/locore.S

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Jeff Rizzo" <riz@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/41342 CVS commit: [netbsd-6] src/sys/arch/i386/i386
Date: Wed, 31 Oct 2012 17:19:50 +0000

 Module Name:	src
 Committed By:	riz
 Date:		Wed Oct 31 17:19:49 UTC 2012

 Modified Files:
 	src/sys/arch/i386/i386 [netbsd-6]: locore.S

 Log Message:
 Pull up following revision(s) (requested by chs in ticket #642):
 	sys/arch/i386/i386/locore.S: revision 1.103
 in osyscall, set the PSL_I bit into the correct field of the trapframe.
 it was going into tf_eip instead of tf_eflags, which would sometimes
 corrupt %eip and always return to user mode with interrupts disabled.
 this was found with a netbsd 1.0 binary, and dsl@ points out that
 this should also fix PR 41342.


 To generate a diff of this commit:
 cvs rdiff -u -r1.95.10.2 -r1.95.10.3 src/sys/arch/i386/i386/locore.S

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: kern-bug-people->chs
Responsible-Changed-By: chs@NetBSD.org
Responsible-Changed-When: Wed, 31 Oct 2012 17:35:03 +0000
Responsible-Changed-Why:
I fixed it.


State-Changed-From-To: pending-pullups->closed
State-Changed-By: chs@NetBSD.org
State-Changed-When: Wed, 31 Oct 2012 17:35:03 +0000
State-Changed-Why:
fixed


>Unformatted:

Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.