NetBSD Problem Report #47806
From wiz@yt.nih.at Wed May 8 08:29:46 2013
Return-Path: <wiz@yt.nih.at>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id 507B863F4EF
for <gnats-bugs@gnats.NetBSD.org>; Wed, 8 May 2013 08:29:46 +0000 (UTC)
Message-Id: <20130508082939.153592AC2AF@yt.nih.at>
Date: Wed, 8 May 2013 10:29:39 +0200 (CEST)
From: Thomas Klausner <wiz@NetBSD.org>
Reply-To: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Subject: "Abort trap" when running Linux lddconfig
X-Send-Pr-Version: 3.95
>Number: 47806
>Category: kern
>Synopsis: "Abort trap" when running Linux lddconfig
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: slp
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed May 08 08:30:00 +0000 2013
>Closed-Date: Sun Apr 13 02:30:37 +0000 2014
>Last-Modified: Sun Apr 13 02:30:37 +0000 2014
>Originator: Thomas Klausner
>Release: NetBSD 6.99.19
>Organization:
Curiosity is the very basis of education and if you tell me that
curiosity killed the cat, I say only that the cat died nobly.
- Arnold Edinborough
>Environment:
System: NetBSD yt.nih.at 6.99.19 NetBSD 6.99.19 (KVOTHE) #4: Wed May 1 16:58:18 CEST 2013 wiz@yt.nih.at:/archive/foreign/src/sys/arch/amd64/compile/obj/KVOTHE amd64
Architecture: x86_64
Machine: amd64
>Description:
When installing various SUSE Linux emulation packages from pkgsrc, ldconfig is
run automatically. For some time now (months?) this has failed with
[1] Abort trap (/usr/pkg/emul/l...
ktrace -di output in that area says:
9671 9671 ldconfig NAMI "/emul/linux/lib64/libnss_nis.so.2"
9671 9671 ldconfig RET open 4
9671 9671 ldconfig CALL fstat64(4,0x7f7fffffd9e0)
9671 9671 ldconfig RET fstat64 0
9671 9671 ldconfig CALL mmap(0,0xcd78,1,1,4,0)
9671 9671 ldconfig RET mmap 140187597983744/0x7f7ff7fad000
9671 9671 ldconfig CALL munmap(0x7f7ff7fad000,0xcd78)
9671 9671 ldconfig RET munmap 0
9671 9671 ldconfig CALL close(4)
9671 9671 ldconfig RET close 0
9671 9671 ldconfig CALL rt_sigprocmask(1,0x7f7fffffd9b0,0,8)
9671 9671 ldconfig RET rt_sigprocmask 0
9671 9671 ldconfig CALL gettid
9671 9671 ldconfig RET gettid 9671/0x25c7
9671 9671 ldconfig CALL tgkill(0x25c7,0x25c7,6)
9671 9671 ldconfig RET tgkill 0
9671 9671 ldconfig PSIG SIGABRT SIG_DFL: code=SI_LWP sent by pid=9671, uid=0)
15222 1 sh RET __wait450 9671/0x25c7
15222 1 sh CALL write(2,0x7f7ff7b36080,0x10)
15222 1 sh GIO fd 2 wrote 16 bytes
"[1] Abort trap"
15222 1 sh RET write 16/0x10
15222 1 sh CALL write(2,0x7f7ff7b36080,0x21)
15222 1 sh GIO fd 2 wrote 33 bytes
" (/usr/pkg/emul/l..."
>How-To-Repeat:
Example:
# pkg_add suse_krb5
suse_krb5-12.1nb1: rebuilding run-time library search paths database
[1] Abort trap (/usr/pkg/emul/l...
>Fix:
>Release-Note:
>Audit-Trail:
From: Sergio Lopez <slp@sinrega.org>
To: gnats-bugs@NetBSD.org
Cc: wiz@NetBSD.org
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Tue, 27 Aug 2013 22:14:39 +0000
--mP3DRpeJDSE+ciuQ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
ldconfig uses dirent's d_type to determine whether an entry is a
regular file or a link, and it's getting confused by bogus values on
said field.
The emulation code puts d_type just after the d_name string, but glibc
expects to find it at the end of the record, which, as the structure
is ALIGN'ed, could be at a different offset.
Something like the change in the attached diff should do the trick.
--mP3DRpeJDSE+ciuQ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="linux_dtype.diff"
Index: linux_misc.c
===================================================================
RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
retrieving revision 1.224
diff -u -r1.224 linux_misc.c
--- linux_misc.c 11 Aug 2013 09:07:15 -0000 1.224
+++ linux_misc.c 27 Aug 2013 21:01:06 -0000
@@ -786,7 +786,7 @@
 			idb.d_reclen = (u_short)linux_reclen;
 		}
 		strcpy(idb.d_name, bdp->d_name);
-		idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
+		*((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
 		if ((error = copyout((void *)&idb, outp, linux_reclen)))
 			goto out;
 		/* advance past this real entry */
--mP3DRpeJDSE+ciuQ--
From: Thomas Klausner <wiz@NetBSD.org>
To: Sergio Lopez <slp@sinrega.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Sun, 1 Sep 2013 19:27:05 +0200
Hi Sergio!
On Tue, Aug 27, 2013 at 10:14:39PM +0000, Sergio Lopez wrote:
> ldconfig uses dirent's d_type to determine whether an entry is a
> regular file or a link, and it's getting confused by bogus values on
> said field.
>
> The emulation code puts d_type just after the d_name string, but glibc
> expects to find it at the end of the record, which, as the structure
> is ALIGN'ed, could be at a different offset.
>
> Something like the change in the attached diff should do the trick.
Thanks for investigating this!
> Index: linux_misc.c
> ===================================================================
> RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
> retrieving revision 1.224
> diff -u -r1.224 linux_misc.c
> --- linux_misc.c 11 Aug 2013 09:07:15 -0000 1.224
> +++ linux_misc.c 27 Aug 2013 21:01:06 -0000
> @@ -786,7 +786,7 @@
> idb.d_reclen = (u_short)linux_reclen;
> }
> strcpy(idb.d_name, bdp->d_name);
> - idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
> + *((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
> if ((error = copyout((void *)&idb, outp, linux_reclen)))
> goto out;
> /* advance past this real entry */
I've built a kernel (from -current sources of Aug 29) with this change
and deleted and reinstalled every Linux emulation package -- there
wasn't a single ldconfig coredump. So this seems to fix the problem,
thank you! Can you please commit it (or a variant -- ISTR there are
macros for playing around with alignment.)?
Thomas
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Tue, 3 Sep 2013 03:34:09 +0000
On Tue, Aug 27, 2013 at 10:20:01PM +0000, Sergio Lopez wrote:
> ldconfig uses dirent's d_type to determine whether an entry is a
> regular file or a link, and it's getting confused by bogus values on
> said field.
>
> The emulation code puts d_type just after the d_name string, but glibc
> expects to find it at the end of the record, which, as the structure
> is ALIGN'ed, could be at a different offset.
It might be a good idea to insert the type in both places, in case
some older version of glibc does it the other way or the behavior
depends on what glibc thinks the kernel version is.
I think the tidy way to write it would be
idb.d_name[bdp->d_namlen + 1] = 0;
idb.d_name[ALIGN(bdp->d_namlen) + 1] = 0;
if I understand all the bits correctly.
--
David A. Holland
dholland@netbsd.org
From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Tue, 3 Sep 2013 08:41:34 +0100
On Tue, Aug 27, 2013 at 10:20:01PM +0000, Sergio Lopez wrote:
> Index: linux_misc.c
> ===================================================================
> RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
> retrieving revision 1.224
> diff -u -r1.224 linux_misc.c
> --- linux_misc.c 11 Aug 2013 09:07:15 -0000 1.224
> +++ linux_misc.c 27 Aug 2013 21:01:06 -0000
> @@ -786,7 +786,7 @@
> idb.d_reclen = (u_short)linux_reclen;
> }
> strcpy(idb.d_name, bdp->d_name);
> - idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
> + *((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
> if ((error = copyout((void *)&idb, outp, linux_reclen)))
> goto out;
> /* advance past this real entry */
FWIW I'd guess that the strcpy() can be replaced by a faster memcpy()
since the length must already be known.
Also (contradicting someone else) d_reclen has been calculated using
ALIGN() - probably best to only use ALIGN once.
If you write it into both places, add a comment to explain why.
David
--
David Laight: david@l8s.co.uk
From: Sergio Lopez <slp@sinrega.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Wed, 4 Sep 2013 20:59:27 +0000
On Tue, Sep 03, 2013 at 03:35:01AM +0000, David Holland wrote:
> It might be a good idea to insert the type in both places, in case
> some older version of glibc does it the other way or the behavior
> depends on what glibc thinks the kernel version is.
I've been digging into glibc's repo, and since its inclusion into the
32 bit version of dirent (circa 2004), d_type is expected to be at the
end of each record.
I suppose this has gone unnoticed all this time because d_type support
in our compat layer is relatively recent (2010) and it's not widely
used by applications (most still rely on fstat).
Sergio.
From: Sergio Lopez <slp@sinrega.org>
To: gnats-bugs@NetBSD.org
Cc: David Laight <david@l8s.co.uk>
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Wed, 4 Sep 2013 21:15:40 +0000
--k1lZvvs/B4yU6o8G
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Tue, Sep 03, 2013 at 08:41:34AM +0100, David Laight wrote:
> On Tue, Aug 27, 2013 at 10:20:01PM +0000, Sergio Lopez wrote:
> > Index: linux_misc.c
> > ===================================================================
> > RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
> > retrieving revision 1.224
> > diff -u -r1.224 linux_misc.c
> > --- linux_misc.c 11 Aug 2013 09:07:15 -0000 1.224
> > +++ linux_misc.c 27 Aug 2013 21:01:06 -0000
> > @@ -786,7 +786,7 @@
> > idb.d_reclen = (u_short)linux_reclen;
> > }
> > strcpy(idb.d_name, bdp->d_name);
> > - idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
> > + *((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
> > if ((error = copyout((void *)&idb, outp, linux_reclen)))
> > goto out;
> > /* advance past this real entry */
>
> FWIW I'd guess that the strcpy() can be replaced by a faster memcpy()
> since the length must already be known.
You're right about this, but I'd prefer to avoid making changes
unrelated to this PR.
I've attached a newer version of the patch which avoids writing d_type
if the application is using the old call convention. Linux doesn't
insert d_type in its old call implementation (still present nowadays),
and doing it as my previous patch does could potentially corrupt the
structure, as Enami Tsugutomo noticed.
Sergio.
--k1lZvvs/B4yU6o8G
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="linux_dtype.diff"
Index: linux_misc.c
===================================================================
RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
retrieving revision 1.224
diff -u -r1.224 linux_misc.c
--- linux_misc.c 11 Aug 2013 09:07:15 -0000 1.224
+++ linux_misc.c 4 Sep 2013 20:39:51 -0000
@@ -784,9 +784,10 @@
 			}
 			idb.d_off = (linux_off_t)off;
 			idb.d_reclen = (u_short)linux_reclen;
+			/* Linux puts d_type at the end of each record */
+			*((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
 		}
 		strcpy(idb.d_name, bdp->d_name);
-		idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
 		if ((error = copyout((void *)&idb, outp, linux_reclen)))
 			goto out;
 		/* advance past this real entry */
--k1lZvvs/B4yU6o8G--
Responsible-Changed-From-To: kern-bug-people->slp
Responsible-Changed-By: slp@NetBSD.org
Responsible-Changed-When: Wed, 04 Sep 2013 22:54:30 +0000
Responsible-Changed-Why:
Take
From: "Sergio Lopez" <slp@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47806 CVS commit: src/sys/compat/linux/common
Date: Sun, 10 Nov 2013 12:07:53 +0000
Module Name: src
Committed By: slp
Date: Sun Nov 10 12:07:52 UTC 2013
Modified Files:
src/sys/compat/linux/common: linux_misc.c
Log Message:
On linux_sys_getdents, insert d_type at the end of each record.
Fixes PR kern/47806.
To generate a diff of this commit:
cvs rdiff -u -r1.226 -r1.227 src/sys/compat/linux/common/linux_misc.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 24 Nov 2013 02:13:15 +0000
State-Changed-Why:
fixed, thanks
State-Changed-From-To: closed->pending-pullups
State-Changed-By: hauke@NetBSD.org
State-Changed-When: Mon, 09 Dec 2013 15:02:43 +0000
State-Changed-Why:
The patch needs pull-up to (at least) netbsd-6, as the PR discussion
clearly stated.
From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47806 CVS commit: [netbsd-6] src/sys/compat/linux/common
Date: Sat, 14 Dec 2013 19:31:17 +0000
Module Name: src
Committed By: bouyer
Date: Sat Dec 14 19:31:17 UTC 2013
Modified Files:
src/sys/compat/linux/common [netbsd-6]: linux_misc.c
Log Message:
Pull up following revision(s) (requested by hauke in ticket #993):
sys/compat/linux/common/linux_misc.c: revision 1.227
On linux_sys_getdents, insert d_type at the end of each record.
Fixes PR kern/47806.
To generate a diff of this commit:
cvs rdiff -u -r1.219 -r1.219.8.1 src/sys/compat/linux/common/linux_misc.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: pending-pullups->closed
State-Changed-By: snj@NetBSD.org
State-Changed-When: Sun, 13 Apr 2014 02:30:37 +0000
State-Changed-Why:
Pulled up.
>Unformatted: