NetBSD Problem Report #43217
From www@NetBSD.org Wed Apr 28 03:47:24 2010
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 3E4E663BA59
for <gnats-bugs@gnats.NetBSD.org>; Wed, 28 Apr 2010 03:47:24 +0000 (UTC)
Message-Id: <20100428034723.CE82863B8FE@www.NetBSD.org>
Date: Wed, 28 Apr 2010 03:47:23 +0000 (UTC)
From: lacombar@gmail.com
Reply-To: lacombar@gmail.com
To: gnats-bugs@NetBSD.org
Subject: KASSERT(ci->ci_ilevel < IPL_HIGH) failed when running a linux binary
X-Send-Pr-Version: www-1.0
>Number: 43217
>Category: kern
>Synopsis: KASSERT(ci->ci_ilevel < IPL_HIGH) failed when running a linux binary
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: dholland
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Apr 28 03:50:00 +0000 2010
>Closed-Date: Sun Jun 06 19:50:45 +0000 2010
>Last-Modified: Sun Jun 06 19:50:45 +0000 2010
>Originator: Arnaud Lacombe
>Release:
>Organization:
n/a
>Environment:
2-way CPU, current 5.99.25 + DIAGNOSTIC + DEBUG
>Description:
On a -current kernel from CVS of April 2nd, my machine is crashing on the
following assertion:
KASSERT(ci->ci_ilevel < IPL_HIGH) in arch/x86/x86/pmap.c
backtrace is:
netbsd_elf_signature() ->
printf ->
strlen() ->
trap(): trap number 6 ->
uvm_fault_internal() ->
pmap_map_ptes() ...
The following printf() is triggering the bug:
#ifdef DIAGNOSTIC
		printf("%s: bad tag %d: "
		    "[%d %d, %d %d, %*.*s %*.*s]\n",
		    epp->ep_name,
		    np->n_type,
		    np->n_namesz, ELF_NOTE_PAX_NAMESZ,
		    np->n_descsz, ELF_NOTE_PAX_DESCSZ,
		    ELF_NOTE_PAX_NAMESZ,
		    ELF_NOTE_PAX_NAMESZ,
		    ndata,
		    ELF_NOTE_PAX_NAMESZ,
		    ELF_NOTE_PAX_NAMESZ,
		    ELF_NOTE_PAX_NAME);
#endif
Normal execution should display:
/path/to/bin/gcc: bad tag 1: [14 4, 16 4, GNU PaX]
Full discussion at http://mail-index.netbsd.org/current-users/2010/04/17/msg013120.html.
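The "%*.*s" conversions in the printf() above take both the field width and the precision from int arguments, which is why each string argument is preceded by ELF_NOTE_PAX_NAMESZ twice. A minimal userspace sketch of that behavior follows; format_fixed_field() is a hypothetical helper written for illustration and is not part of exec_elf.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a field of exactly `n` columns from `data`, reading at most
 * `n` bytes of it. In "%*.*s" the first int argument is the minimum
 * field width and the second is the precision (the maximum number of
 * bytes taken from the string), so `data` does not need a terminating
 * NUL within those `n` bytes.
 *
 * Returns the number of characters snprintf() would have written.
 * (Hypothetical helper, for illustration only.)
 */
int
format_fixed_field(char *out, size_t outsz, const char *data, int n)
{
	return snprintf(out, outsz, "%*.*s", n, n, data);
}
```

Calling it with n = 3 on a buffer holding the bytes G, N, U, X, Y yields "GNU". A string shorter than n is padded on the left up to the field width. The precision only bounds how many bytes are read; it does not make an invalid pointer safe to pass for a plain "%s".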
>How-To-Repeat:
Case 1:
0. reboot
1. login
2. grep -r iamarandomstring /large/directory
3. while true; do /path/to/linux/binary; done
Running (2) alone is fine. Running (3) alone is fine. Running (3) + pure CPU load is fine.
Case 2:
0. Day-to-day desktop usage
1. /path/to/linux/binary
>Fix:
>Release-Note:
>Audit-Trail:
From: Andrew Doran <ad@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/43217: KASSERT(ci->ci_ilevel < IPL_HIGH) failed when
running a linux binary
Date: Wed, 28 Apr 2010 10:40:07 +0000
There are at least two problems here.
On Wed, Apr 28, 2010 at 03:50:00AM +0000, lacombar@gmail.com wrote:
> KASSERT(ci->ci_ilevel < IPL_HIGH) in arch/x86/x86/pmap.c
The first is that we have gone to IPL_HIGH somewhere before entering the
pmap module. This isn't allowed because context switching at IPL_HIGH
is prohibited and the pmap module may context switch.
> backtrace is:
>
> netbsd_elf_signature() ->
The second problem is your trap here.
From: Arnaud Lacombe <lacombar@gmail.com>
To: Andrew Doran <ad@netbsd.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/43217: KASSERT(ci->ci_ilevel < IPL_HIGH) failed when running
a linux binary
Date: Wed, 28 Apr 2010 09:14:12 -0400
Hi,
On Wed, Apr 28, 2010 at 6:40 AM, Andrew Doran <ad@netbsd.org> wrote:
> There are at least two problems here.
>
> On Wed, Apr 28, 2010 at 03:50:00AM +0000, lacombar@gmail.com wrote:
>
>> KASSERT(ci->ci_ilevel < IPL_HIGH) in arch/x86/x86/pmap.c
>
> 1st is that we have gone to IPL_HIGH somewhere before entering the
> pmap module, this isn't allowed because: context switching at IPL_HIGH
> is prohibited and the pmap module may context switch.
>
As pointed out by Mindaugas, printf() takes a lock which raises the
priority level to IPL_HIGH. strlen() then triggers the fault, so we
enter the trap at IPL_HIGH.
- Arnaud
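The sequence described in the message above (the print path raises the priority level, a string argument then faults, and the fault handler refuses to run at that level) can be modeled in userspace. This is a toy sketch under stated assumptions: every toy_* name and the numeric levels are hypothetical, and nothing here is the kernel's actual locking or pmap code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical priority levels; only their ordering matters here. */
enum toy_ipl { TOY_IPL_NONE = 0, TOY_IPL_HIGH = 8 };

/* Current priority level of the single modeled CPU. */
static int toy_ilevel = TOY_IPL_NONE;

/* Raise the level if needed and return the previous one. */
int
toy_raise(int level)
{
	int old = toy_ilevel;

	if (level > toy_ilevel)
		toy_ilevel = level;
	return old;
}

/* Restore a level previously returned by toy_raise(). */
void
toy_restore(int old)
{
	toy_ilevel = old;
}

/* Stand-in for the failing check: may a page fault be serviced now? */
bool
toy_fault_allowed(void)
{
	return toy_ilevel < TOY_IPL_HIGH;
}

/*
 * Stand-in for a console print that formats its arguments while held
 * at high priority. `deref_faults` models a string argument whose
 * dereference page-faults. Returns true if the model would have
 * tripped the assertion.
 */
bool
toy_print(bool deref_faults)
{
	int old = toy_raise(TOY_IPL_HIGH);
	bool tripped = deref_faults && !toy_fault_allowed();

	toy_restore(old);
	return tripped;
}
```

In this model toy_print(false) returns false and toy_print(true) returns true. That matches the report: the print itself is harmless, and the problem only appears when an argument has to be faulted in while the level is raised.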
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/43217: KASSERT(ci->ci_ilevel < IPL_HIGH) failed when
running a linux binary
Date: Wed, 28 Apr 2010 22:20:20 +0000
On Wed, Apr 28, 2010 at 03:50:00AM +0000, lacombar@gmail.com wrote:
> The Following printf() is triggering the bug:
> 882 #ifdef DIAGNOSTIC
> 883 printf("%s: bad tag %d: "
> 884 "[%d %d, %d %d, %*.*s %*.*s]\n",
> 885 epp->ep_name,
ep_name points to userspace and should not be printed like this.
(It's not clear to me why we hang onto the userspace pointer instead
of the kernel copy that exec makes. One of the patches I'm sitting on
because I haven't had time to test it properly tidies this up as a
side effect of some other reorg.)
--
David A. Holland
dholland@netbsd.org
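The general pattern behind the remark above is to make a bounded private copy of an untrusted string first and to format only that copy. The userspace sketch below illustrates it under stated assumptions: toy_copy_string() and toy_report_bad_tag() are hypothetical names, the copy routine is only loosely modeled on what a kernel copy-in routine such as copyinstr(9) provides, and the committed fix simply printed the kernel copy that exec had already made.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a copy-in routine: copy a NUL-terminated
 * string of at most `len` bytes (including the NUL) from `src` to
 * `dst`. Returns 0 on success, or ENAMETOOLONG if no NUL was found
 * within `len` bytes. If `done` is not NULL it receives the number of
 * bytes copied. In a kernel the read from `src` would itself be
 * fault-checked; here it is ordinary memory.
 */
int
toy_copy_string(const char *src, char *dst, size_t len, size_t *done)
{
	size_t i;

	for (i = 0; i < len; i++) {
		dst[i] = src[i];
		if (src[i] == '\0') {
			if (done != NULL)
				*done = i + 1;
			return 0;
		}
	}
	if (done != NULL)
		*done = len;
	return ENAMETOOLONG;
}

/*
 * Build the diagnostic from a private copy of the name rather than
 * from the untrusted pointer itself. Returns 0 or an errno value.
 */
int
toy_report_bad_tag(const char *untrusted_name, int type, char *out,
    size_t outsz)
{
	char name[64];
	int error;

	error = toy_copy_string(untrusted_name, name, sizeof(name), NULL);
	if (error != 0)
		return error;
	snprintf(out, outsz, "%s: bad tag %d", name, type);
	return 0;
}
```

With this split, a failure to obtain the name surfaces as an error return before any formatting happens, rather than as a fault in the middle of the print path.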
Responsible-Changed-From-To: kern-bug-people->dholland
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Wed, 28 Apr 2010 22:26:30 +0000
Responsible-Changed-Why:
I'm sitting on a patch that makes this a one-character fix
From: "David A. Holland" <dholland@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/43217 CVS commit: src/sys/kern
Date: Sun, 2 May 2010 06:35:22 +0000
Module Name: src
Committed By: dholland
Date: Sun May 2 06:35:21 UTC 2010
Modified Files:
src/sys/kern: exec_elf.c
Log Message:
Don't printf a userspace pointer; print the copied-in kernel version
instead, now that it's readily available. Fixes PR 43217.
To generate a diff of this commit:
cvs rdiff -u -r1.20 -r1.21 src/sys/kern/exec_elf.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 02 May 2010 06:41:23 +0000
State-Changed-Why:
Fixed the printf - can you check that it no longer crashes?
From: Bernd Ernesti <netbsd@lists.veego.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/43217 (KASSERT(ci->ci_ilevel < IPL_HIGH) failed when running a linux binary)
Date: Sat, 5 Jun 2010 09:05:15 +0200
On Sun, May 02, 2010 at 06:41:24AM +0000, dholland@NetBSD.org wrote:
> Synopsis: KASSERT(ci->ci_ilevel < IPL_HIGH) failed when running a linux binary
>
> State-Changed-From-To: open->feedback
> State-Changed-By: dholland@NetBSD.org
> State-Changed-When: Sun, 02 May 2010 06:41:23 +0000
> State-Changed-Why:
> Fixed the printf - can you check that it no longer crashes?
It is still crashing for me with a current kernel from June the 3rd.
I tried to install acrobat, which requires some packages, and it always crashed
while installing suse_freetype2.
The lines with the unknown note type look bad. Some strange non-ASCII chars.
From /var/log/messages:
/usr/pkg/emul/linux/sbin/ldconfig: bad tag 1: [4 4, 16 4, GNU PaX]
?^CE^L9E?^O?^Y^B: unknown note type 1163097427
/usr/pkg/emul/linux/sbin/ldconfig: bad tag 1: [4 4, 16 4, GNU PaX]
^PR?E^HP?<U+5c3>?^P??: unknown note type 1163097427
/usr/pkg/emul/linux/sbin/ldconfig: bad tag 1: [4 4, 16 4, GNU PaX]
^PR?E^HP?<U+5c3>?^P??: unknown note type 1163097427
/usr/pkg/emul/linux/sbin/ldconfig: bad tag 1: [4 4, 16 4, GNU PaX]
^PR?E^HP?<U+5c3>?^P??: unknown note type 1163097427
And that was what I saw on the serial console:
9E....: unknown note type 1163097427
/usr/pkg/emul/linux/sbin/ldconfig: bad tag 1: [4 4, 16 4, GNU PaX]
.R.Pÿ..Ä..Ø: unknown note type 1163097427
/usr/pkg/emul/linux/sbin/ldconfig: bad tag 1: [4 4, 16 4, GNU PaX]
panic: kernel diagnostic assertion "ci->ci_ilevel < IPL_HIGH" failed: file "/src/sys/arch/x86/x86/pmap.c", line 2612
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c01f4694 cs 8 eflags 246 cr2 bb916454 ilevel 8
Stopped in pid 15505.1 (sh) at netbsd:breakpoint+0x4: popl %ebp
db{0}> bt
breakpoint(cfbc872c,8,0,cfbc85f0,c098fd40,2,cfbc8614,c071765e,c097c268,c08d8b48) at netbsd:breakpoint+0x4
panic(c097c268,c08d8b48,c093470e,c0934580,a34,7,cd958000,1,0,0) at netbsd:panic+0x1f2
kern_assert(c08d8b48,c0934580,a34,c093470e,5c5a,cd528458,1,cd6ea000,cfbc868c,cf313000) at netbsd:kern_assert+0x2e
pmap_load(c0a043e0,cfb8d984,cf313000,cf313000,bb913000,ce5e5e44,cfbc86c4,c04c679e,ce5e5e44,cfbc86a8) at netbsd:pmap_load+0x256
pmap_map_ptes(ce5e5e44,cfbc86a8,cfbc86b4,cfbc86ac,718,cfbc87c8,cfbc86c4,3,3,cfbc8754) at netbsd:pmap_map_ptes+0x20e
pmap_extract(ce5e5e44,bb913000,0,cfd0d900,30,cfbc8794,10,cfbc8754,3000,0) at netbsd:pmap_extract+0xbe
uvm_fault_internal(ce59742c,bb916000,1,0,1,f4,cde832c0,cfb8d984,cfbc8da0,6) at netbsd:uvm_fault_internal+0x2f2
trap() at netbsd:trap+0x70e
--- trap (number 6) ---
strlen(c08f5bfe,5,0,0,cfbc8988,ce4c7000,cfbc897c,c05b4fbb,4,0) at netbsd:strlen+0x10
printf(c08f5bfe,bb916454,45537553,0,34,45537553,2,d6448004,d5b13a10,cfbc89c8) at netbsd:printf+0x24
netbsd_elf32_signature(cf313000,cfbc8bcc,d5b14a04,0,cfbc8a38,1,d5b14a04,2,d3a763a8,d3a763a8) at netbsd:netbsd_elf32_signature+0x16a
exec_elf32_makecmds(cf313000,cfbc8bcc,0,1000,cc2c3c94,d5b14a00,cfbc8a9c,c05b1eb5,8,1) at netbsd:exec_elf32_makecmds+0x15b
check_exec(cf313000,cfbc8bcc,ce4c7000,cf313000,c0a3f4e4,bbaae000,7f92f000,cfb8d998,5,0) at netbsd:check_exec+0x1e4
execve1(cf313000,bb916454,bb9164b4,bb9164c4,c03d7220,bbaae000,1,0,c09bb370,0) at netbsd:execve1+0x192
sys_execve(cf313000,cfbc8d00,cfbc8d28,43,bbaae000,cfb8d984,3b,bb916454,bb9164b4,bb9164c4) at netbsd:sys_execve+0x22
syscall(cfbc8d48,bb9000b3,ab,bfbf001f,bbba001f,bb916454,bb9164b4,bfbfdfe8,bb916454,7d7b7cff) at netbsd:syscall+0xb9
Bernd
From: "David A. Holland" <dholland@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/43217 CVS commit: src/sys/kern
Date: Sun, 6 Jun 2010 06:20:16 +0000
Module Name: src
Committed By: dholland
Date: Sun Jun 6 06:20:16 UTC 2010
Modified Files:
src/sys/kern: exec_elf.c
Log Message:
Improve previous: there were two printfs and I'd only noticed and fixed
one of them. PR 43217.
To generate a diff of this commit:
cvs rdiff -u -r1.21 -r1.22 src/sys/kern/exec_elf.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org, netbsd@lists.veego.de
Cc:
Subject: Re: kern/43217 (KASSERT(ci->ci_ilevel < IPL_HIGH) failed when
running a linux binary)
Date: Sun, 6 Jun 2010 06:28:36 +0000
On Sat, Jun 05, 2010 at 07:10:06AM +0000, Bernd Ernesti wrote:
> It is still crashing for me with a current kernel from June the 3rd.
I figured it out later on after we talked... can you try exec_elf.c
-r1.22?
--
David A. Holland
dholland@netbsd.org
From: Bernd Ernesti <netbsd@lists.veego.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/43217 (KASSERT(ci->ci_ilevel < IPL_HIGH) failed when running a linux binary)
Date: Sun, 6 Jun 2010 12:05:43 +0200
On Sun, Jun 06, 2010 at 06:28:36AM +0000, David Holland wrote:
> On Sat, Jun 05, 2010 at 07:10:06AM +0000, Bernd Ernesti wrote:
> > It is still crashing for me with a current kernel from June the 3rd.
>
> I figured it out later on after we talked... can you try exec_elf.c
> -r1.22?
That fixed the panic.
Thank you,
Bernd
State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 06 Jun 2010 19:50:45 +0000
State-Changed-Why:
Confirmed fixed.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.