NetBSD Problem Report #47806

From wiz@yt.nih.at  Wed May  8 08:29:46 2013
Return-Path: <wiz@yt.nih.at>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 507B863F4EF
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  8 May 2013 08:29:46 +0000 (UTC)
Message-Id: <20130508082939.153592AC2AF@yt.nih.at>
Date: Wed,  8 May 2013 10:29:39 +0200 (CEST)
From: Thomas Klausner <wiz@NetBSD.org>
Reply-To: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Subject: "Abort trap" when running Linux lddconfig
X-Send-Pr-Version: 3.95

>Number:         47806
>Category:       kern
>Synopsis:       "Abort trap" when running Linux lddconfig
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    slp
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed May 08 08:30:00 +0000 2013
>Closed-Date:    Sun Apr 13 02:30:37 +0000 2014
>Last-Modified:  Sun Apr 13 02:30:37 +0000 2014
>Originator:     Thomas Klausner
>Release:        NetBSD 6.99.19
>Organization:
Curiosity is the very basis of education and if you tell me that 
curiosity killed the cat, I say only that the cat died nobly.
- Arnold Edinborough
>Environment:
System: NetBSD yt.nih.at 6.99.19 NetBSD 6.99.19 (KVOTHE) #4: Wed May 1 16:58:18 CEST 2013 wiz@yt.nih.at:/archive/foreign/src/sys/arch/amd64/compile/obj/KVOTHE amd64
Architecture: x86_64
Machine: amd64
>Description:
When installing various SUSE Linux emulation packages from pkgsrc, ldconfig is
run automatically. For some time now (months?) this has failed with:
[1]   Abort trap              (/usr/pkg/emul/l...

ktrace -di output in that area says:
  9671   9671 ldconfig NAMI  "/emul/linux/lib64/libnss_nis.so.2"
  9671   9671 ldconfig RET   open 4
  9671   9671 ldconfig CALL  fstat64(4,0x7f7fffffd9e0)
  9671   9671 ldconfig RET   fstat64 0
  9671   9671 ldconfig CALL  mmap(0,0xcd78,1,1,4,0)
  9671   9671 ldconfig RET   mmap 140187597983744/0x7f7ff7fad000
  9671   9671 ldconfig CALL  munmap(0x7f7ff7fad000,0xcd78)
  9671   9671 ldconfig RET   munmap 0
  9671   9671 ldconfig CALL  close(4)
  9671   9671 ldconfig RET   close 0
  9671   9671 ldconfig CALL  rt_sigprocmask(1,0x7f7fffffd9b0,0,8)
  9671   9671 ldconfig RET   rt_sigprocmask 0
  9671   9671 ldconfig CALL  gettid
  9671   9671 ldconfig RET   gettid 9671/0x25c7
  9671   9671 ldconfig CALL  tgkill(0x25c7,0x25c7,6)
  9671   9671 ldconfig RET   tgkill 0
  9671   9671 ldconfig PSIG  SIGABRT SIG_DFL: code=SI_LWP sent by pid=9671, uid=0)
 15222      1 sh       RET   __wait450 9671/0x25c7
 15222      1 sh       CALL  write(2,0x7f7ff7b36080,0x10)
 15222      1 sh       GIO   fd 2 wrote 16 bytes
       "[1]   Abort trap"
 15222      1 sh       RET   write 16/0x10
 15222      1 sh       CALL  write(2,0x7f7ff7b36080,0x21)
 15222      1 sh       GIO   fd 2 wrote 33 bytes
       "              (/usr/pkg/emul/l..."



>How-To-Repeat:
Example:
# pkg_add suse_krb5
suse_krb5-12.1nb1: rebuilding run-time library search paths database
[1]   Abort trap              (/usr/pkg/emul/l...
>Fix:


>Release-Note:

>Audit-Trail:
From: Sergio Lopez <slp@sinrega.org>
To: gnats-bugs@NetBSD.org
Cc: wiz@NetBSD.org
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Tue, 27 Aug 2013 22:14:39 +0000

 --mP3DRpeJDSE+ciuQ
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 ldconfig uses dirent's d_type to determine whether an entry is a
 regular file or a link, and it's getting confused by bogus values on
 said field.

 The emulation code puts d_type just after the d_name string, but glibc
 expects to find it at the end of the record, which, as the structure
 is ALIGN'ed, could be at a different offset.

 Something like the change in the attached diff should do the trick.


 --mP3DRpeJDSE+ciuQ
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="linux_dtype.diff"

 Index: linux_misc.c
 ===================================================================
 RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
 retrieving revision 1.224
 diff -u -r1.224 linux_misc.c
 --- linux_misc.c	11 Aug 2013 09:07:15 -0000	1.224
 +++ linux_misc.c	27 Aug 2013 21:01:06 -0000
 @@ -786,7 +786,7 @@
  			idb.d_reclen = (u_short)linux_reclen;
  		}
  		strcpy(idb.d_name, bdp->d_name);
 -		idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
 +		*((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
  		if ((error = copyout((void *)&idb, outp, linux_reclen)))
  			goto out;
  		/* advance past this real entry */

 --mP3DRpeJDSE+ciuQ--
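
The offset mismatch described in the message above can be reproduced with a
little arithmetic. The following is a minimal sketch, not the compat code
itself: the struct, the DEMO_ALIGN macro and the demo_* function names are
hypothetical stand-ins, the layout assumes a 64-bit Linux ABI, and the
"+ 2" (one byte for the NUL, one for d_type) follows the Linux kernel
convention rather than anything stated in this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the record handed to Linux userland (64-bit). */
struct demo_linux_dirent {
	uint64_t d_ino;
	int64_t  d_off;
	uint16_t d_reclen;
	char     d_name[256];
};

/* Round up to a multiple of 8; stands in for the kernel's ALIGN(). */
#define DEMO_ALIGN(x) (((x) + 7u) & ~(size_t)7)

/* Record length: header + name + NUL + one byte for d_type, rounded up. */
size_t
demo_reclen(size_t namlen)
{
	assert(namlen < 254);
	return DEMO_ALIGN(offsetof(struct demo_linux_dirent, d_name) + namlen + 2);
}

/* Where the old code stored d_type: right after the terminating NUL. */
size_t
demo_old_type_offset(size_t namlen)
{
	return offsetof(struct demo_linux_dirent, d_name) + namlen + 1;
}

/* Where glibc looks for d_type: the last byte of the record. */
size_t
demo_new_type_offset(size_t namlen)
{
	return demo_reclen(namlen) - 1;
}
```

For the 15-character name "libnss_nis.so.2" seen in the ktrace output, this
sketch gives a 40-byte record with the old code writing at offset 34 and the
reader looking at offset 39. The two offsets only coincide when no padding is
added (for example a 4-character name in this layout), which would explain
why only some entries confuse ldconfig.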

From: Thomas Klausner <wiz@NetBSD.org>
To: Sergio Lopez <slp@sinrega.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Sun, 1 Sep 2013 19:27:05 +0200

 Hi Sergio!

 On Tue, Aug 27, 2013 at 10:14:39PM +0000, Sergio Lopez wrote:
 > ldconfig uses dirent's d_type to determine whether an entry is a
 > regular file or a link, and it's getting confused by bogus values on
 > said field.
 > 
 > The emulation code puts d_type just after the d_name string, but glibc
 > expects to find it at the end of the record, which, as the structure
 > is ALIGN'ed, could be at a different offset.
 > 
 > Something like the change in the attached diff should do the trick.

 Thanks for investigating this!

 > Index: linux_misc.c
 > ===================================================================
 > RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
 > retrieving revision 1.224
 > diff -u -r1.224 linux_misc.c
 > --- linux_misc.c	11 Aug 2013 09:07:15 -0000	1.224
 > +++ linux_misc.c	27 Aug 2013 21:01:06 -0000
 > @@ -786,7 +786,7 @@
 >  			idb.d_reclen = (u_short)linux_reclen;
 >  		}
 >  		strcpy(idb.d_name, bdp->d_name);
 > -		idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
 > +		*((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
 >  		if ((error = copyout((void *)&idb, outp, linux_reclen)))
 >  			goto out;
 >  		/* advance past this real entry */

 I've built a kernel (from -current sources of Aug 29) with this change
 and deleted and reinstalled every Linux emulation package -- there
 wasn't a single ldconfig coredump. So this seems to fix the problem,
 thank you! Can you please commit it (or a variant; ISTR there are
 macros for playing around with alignment)?
  Thomas

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Tue, 3 Sep 2013 03:34:09 +0000

 On Tue, Aug 27, 2013 at 10:20:01PM +0000, Sergio Lopez wrote:
  >  ldconfig uses dirent's d_type to determine whether an entry is a
  >  regular file or a link, and it's getting confused by bogus values on
  >  said field.
  >  
  >  The emulation code puts d_type just after the d_name string, but glibc
  >  expects to find it at the end of the record, which, as the structure
  >  is ALIGN'ed, could be at a different offset.

 It might be a good idea to insert the type in both places, in case
 some older version of glibc does it the other way or the behavior
 depends on what glibc thinks the kernel version is.

 I think the tidy way to write it would be

    idb.d_name[bdp->d_namlen + 1] = bdp->d_type;
    idb.d_name[ALIGN(bdp->d_namlen) + 1] = bdp->d_type;

 if I understand all the bits correctly.

 -- 
 David A. Holland
 dholland@netbsd.org

From: David Laight <david@l8s.co.uk>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Tue, 3 Sep 2013 08:41:34 +0100

 On Tue, Aug 27, 2013 at 10:20:01PM +0000, Sergio Lopez wrote:
 >  Index: linux_misc.c
 >  ===================================================================
 >  RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
 >  retrieving revision 1.224
 >  diff -u -r1.224 linux_misc.c
 >  --- linux_misc.c	11 Aug 2013 09:07:15 -0000	1.224
 >  +++ linux_misc.c	27 Aug 2013 21:01:06 -0000
 >  @@ -786,7 +786,7 @@
 >   			idb.d_reclen = (u_short)linux_reclen;
 >   		}
 >   		strcpy(idb.d_name, bdp->d_name);
 >  -		idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
 >  +		*((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
 >   		if ((error = copyout((void *)&idb, outp, linux_reclen)))
 >   			goto out;
 >   		/* advance past this real entry */

 FWIW I'd guess that the strcpy() can be replaced by a faster memcpy()
 since the length must already be known.

 Also (contradicting someone else) d_reclen has been calculated using
 ALIGN() - probably best to only use ALIGN once.

 If you write it into both places, add a comment to explain why.

 	David

 -- 
 David Laight: david@l8s.co.uk

From: Sergio Lopez <slp@sinrega.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Wed, 4 Sep 2013 20:59:27 +0000

 On Tue, Sep 03, 2013 at 03:35:01AM +0000, David Holland wrote:
 >  It might be a good idea to insert the type in both places, in case
 >  some older version of glibc does it the other way or the behavior
 >  depends on what glibc thinks the kernel version is.

 I've been digging into glibc's repo, and ever since d_type was added to
 the 32-bit version of dirent (circa 2004), it has been expected to be at
 the end of each record.

 I suppose this has gone unnoticed all this time because its
 implementation in our compat layer is relatively recent (2010) and
 it's not widely used by applications (most still rely on fstat).

 Sergio.
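
To illustrate the reader side of the convention described above, here is a
minimal sketch of how a userland consumer might pick d_type out of a
getdents-style buffer by looking at the last byte of each record. It is not
glibc's code: the fixed 18-byte header (8-byte inode, 8-byte offset, 2-byte
record length), the 8-byte rounding and the demo_* names are assumptions made
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed 64-bit layout: d_ino (8) + d_off (8) + d_reclen (2), then d_name. */
#define DEMO_RECLENOFF	16u
#define DEMO_NAMEOFF	18u

/* Append one record at buf + pos with d_type in its last byte; return new pos. */
size_t
demo_put_record(unsigned char *buf, size_t pos, const char *name,
    unsigned char type)
{
	size_t namlen = strlen(name);
	uint16_t reclen = (uint16_t)((DEMO_NAMEOFF + namlen + 2 + 7u) & ~(size_t)7);

	assert(namlen < 254);
	memset(buf + pos, 0, reclen);
	memcpy(buf + pos + DEMO_RECLENOFF, &reclen, sizeof(reclen));
	memcpy(buf + pos + DEMO_NAMEOFF, name, namlen + 1);
	buf[pos + reclen - 1] = type;
	return pos + reclen;
}

/* Return d_type of the index-th record (0-based), or -1 if there is none. */
int
demo_type_of(const unsigned char *buf, size_t len, size_t index)
{
	size_t pos = 0;

	while (pos + DEMO_NAMEOFF <= len) {
		uint16_t reclen;

		memcpy(&reclen, buf + pos + DEMO_RECLENOFF, sizeof(reclen));
		if (reclen < DEMO_NAMEOFF + 2 || pos + reclen > len)
			return -1;	/* malformed or truncated record */
		if (index-- == 0)
			return buf[pos + reclen - 1];
		pos += reclen;
	}
	return -1;
}
```

A reader written this way never consults the byte just after the name, so a
value stored there is invisible to it and whatever happens to sit in the last
byte is taken as the type.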

From: Sergio Lopez <slp@sinrega.org>
To: gnats-bugs@NetBSD.org
Cc: David Laight <david@l8s.co.uk>
Subject: Re: kern/47806 ("Abort trap" when running Linux lddconfig)
Date: Wed, 4 Sep 2013 21:15:40 +0000

 --k1lZvvs/B4yU6o8G
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 On Tue, Sep 03, 2013 at 08:41:34AM +0100, David Laight wrote:
 > On Tue, Aug 27, 2013 at 10:20:01PM +0000, Sergio Lopez wrote:
 > >  Index: linux_misc.c
 > >  ===================================================================
 > >  RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
 > >  retrieving revision 1.224
 > >  diff -u -r1.224 linux_misc.c
 > >  --- linux_misc.c	11 Aug 2013 09:07:15 -0000	1.224
 > >  +++ linux_misc.c	27 Aug 2013 21:01:06 -0000
 > >  @@ -786,7 +786,7 @@
 > >   			idb.d_reclen = (u_short)linux_reclen;
 > >   		}
 > >   		strcpy(idb.d_name, bdp->d_name);
 > >  -		idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
 > >  +		*((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
 > >   		if ((error = copyout((void *)&idb, outp, linux_reclen)))
 > >   			goto out;
 > >   		/* advance past this real entry */
 > 
 > FWIW I'd guess that the strcpy() can be replaced by a faster memcpy()
 > since the length must already be known.

 You're right about this, but I'd prefer to avoid making changes
 unrelated to this PR.

 I've attached a newer version of the patch which avoids writing d_type
 if the application is using the old call convention. Linux doesn't
 insert d_type in its old call implementation (still present nowadays),
 and doing so, as my previous patch does, could potentially corrupt the
 structure, as Enami Tsugutomo noticed.

 Sergio.

 --k1lZvvs/B4yU6o8G
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="linux_dtype.diff"

 Index: linux_misc.c
 ===================================================================
 RCS file: /cvsroot/src/sys/compat/linux/common/linux_misc.c,v
 retrieving revision 1.224
 diff -u -r1.224 linux_misc.c
 --- linux_misc.c        11 Aug 2013 09:07:15 -0000      1.224
 +++ linux_misc.c        4 Sep 2013 20:39:51 -0000
 @@ -784,9 +784,10 @@
                         }
                         idb.d_off = (linux_off_t)off;
                         idb.d_reclen = (u_short)linux_reclen;
 +                       /* Linux puts d_type at the end of each record */
 +                       *((char *)&idb + idb.d_reclen - 1) = bdp->d_type;
                 }
                 strcpy(idb.d_name, bdp->d_name);
 -               idb.d_name[strlen(idb.d_name) + 1] = bdp->d_type;
                 if ((error = copyout((void *)&idb, outp, linux_reclen)))
                         goto out;
                 /* advance past this real entry */

 --k1lZvvs/B4yU6o8G--
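
The reason for guarding the write can be sketched as follows. This is an
illustration under stated assumptions, not the committed code: the struct and
the demo_fill function are hypothetical, the layout is a 64-bit one, and the
old call convention is assumed to reuse the d_reclen slot for the name length,
as Linux's old readdir interface does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the record built before copyout(). */
struct demo_dirent {
	uint64_t d_ino;
	int64_t  d_off;
	uint16_t d_reclen;
	char     d_name[256];
};

/*
 * Fill one record and return the number of bytes that would be copied out.
 * With oldcall set, d_reclen carries the name length (assumed old
 * convention) and no d_type byte is written.
 */
size_t
demo_fill(struct demo_dirent *rec, const char *name, unsigned char type,
    int oldcall)
{
	size_t namlen = strlen(name);
	size_t reclen =
	    (offsetof(struct demo_dirent, d_name) + namlen + 2 + 7u) & ~(size_t)7;

	assert(reclen <= sizeof(*rec));
	memset(rec, 0, sizeof(*rec));
	memcpy(rec->d_name, name, namlen + 1);
	if (oldcall) {
		rec->d_reclen = (uint16_t)namlen;
	} else {
		rec->d_reclen = (uint16_t)reclen;
		/* d_type goes in the last byte of the record. */
		*((unsigned char *)rec + rec->d_reclen - 1) = type;
	}
	return reclen;
}
```

Under these assumptions, an unguarded write at d_reclen - 1 for the
15-character name "libnss_nis.so.2" would land at byte 14 of the record, which
is inside d_off in this layout. Keeping the write in the non-oldcall branch
avoids that.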

Responsible-Changed-From-To: kern-bug-people->slp
Responsible-Changed-By: slp@NetBSD.org
Responsible-Changed-When: Wed, 04 Sep 2013 22:54:30 +0000
Responsible-Changed-Why:
Take


From: "Sergio Lopez" <slp@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47806 CVS commit: src/sys/compat/linux/common
Date: Sun, 10 Nov 2013 12:07:53 +0000

 Module Name:	src
 Committed By:	slp
 Date:		Sun Nov 10 12:07:52 UTC 2013

 Modified Files:
 	src/sys/compat/linux/common: linux_misc.c

 Log Message:
 On linux_sys_getdents, insert d_type at the end of each record.
 Fixes PR kern/47806.


 To generate a diff of this commit:
 cvs rdiff -u -r1.226 -r1.227 src/sys/compat/linux/common/linux_misc.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 24 Nov 2013 02:13:15 +0000
State-Changed-Why:
fixed, thanks


State-Changed-From-To: closed->pending-pullups
State-Changed-By: hauke@NetBSD.org
State-Changed-When: Mon, 09 Dec 2013 15:02:43 +0000
State-Changed-Why:
The patch needs pull-up to (at least) netbsd-6, as the PR discussion 
clearly stated.


From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47806 CVS commit: [netbsd-6] src/sys/compat/linux/common
Date: Sat, 14 Dec 2013 19:31:17 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sat Dec 14 19:31:17 UTC 2013

 Modified Files:
 	src/sys/compat/linux/common [netbsd-6]: linux_misc.c

 Log Message:
 Pull up following revision(s) (requested by hauke in ticket #993):
 	sys/compat/linux/common/linux_misc.c: revision 1.227
 On linux_sys_getdents, insert d_type at the end of each record.
 Fixes PR kern/47806.


 To generate a diff of this commit:
 cvs rdiff -u -r1.219 -r1.219.8.1 src/sys/compat/linux/common/linux_misc.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->closed
State-Changed-By: snj@NetBSD.org
State-Changed-When: Sun, 13 Apr 2014 02:30:37 +0000
State-Changed-Why:
Pulled up.


>Unformatted:

Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.