NetBSD Problem Report #46263

From ignatios@cs.uni-bonn.de  Tue Mar 27 08:07:36 2012
Return-Path: <ignatios@cs.uni-bonn.de>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id B655D63BBEC
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 27 Mar 2012 08:07:36 +0000 (UTC)
Message-Id: <1332835662.640103.9391.nullmailer@tiger.cs.uni-bonn.de>
Date: Tue, 27 Mar 2012 10:07:42 +0200
From: ignatios@cs.uni-bonn.de
Reply-To: ignatios@cs.uni-bonn.de
To: gnats-bugs@gnats.NetBSD.org
Subject: some programs don't work in xen3pae-domU
X-Send-Pr-Version: 3.95

>Number:         46263
>Category:       port-xen
>Synopsis:       some programs don't work in xen3pae-domU (segv et al.)
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-xen-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 27 08:10:00 +0000 2012
>Last-Modified:  Tue Mar 27 14:05:03 +0000 2012
>Originator:     Ignatios Souvatzis
>Release:        NetBSD 5.1.0_PATCH
>Organization:
computer science department, university of Bonn, Germany
>Environment:
System: NetBSD jaguar-gamma 5.1.0_PATCH NetBSD 5.1.0_PATCH (XEN3PAE_DOMU) #1: Fri Sep 23 10:10:12 CEST 2011 ignatios@random84.cs.uni-bonn.de:/var/itch/obj/5i386/sys/arch/i386/compile/XEN3PAE_DOMU i386
Architecture: i386
Machine: i386

> ls -l *core*
-rw-r--r--  1 ignatios  25   2956320 Dec 23 18:23 j3dcore.jar
-rw-------  1 ignatios  25     95736 Mar 27 09:41 mailwrapper.core
-rw-------  1 ignatios  25  12519916 Mar 22 16:21 npviewer.bin.core
-rw-------  1 ignatios  25   1551648 Mar 22 15:53 xfig.core
% ldd `which mailwrapper`
/usr/sbin/mailwrapper:
        -lc.12 => /usr/lib/libc.so.12


>Description:

A) mailq (actually mailwrapper) crashes before executing nullmailer's 
   mailq.

% mailq
Segmentation fault(core dumped)
% gdb `which mailwrapper` mailwrapper.core
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386--netbsdelf"...(no debugging symbols found)

Reading symbols from /usr/lib/libc.so.12...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libc.so.12
Reading symbols from /usr/libexec/ld.elf_so...(no debugging symbols found)...done.
Loaded symbols for /usr/libexec/ld.elf_so

Core was generated by `mailwrapper'.
Program terminated with signal 11, Segmentation fault.
#0  0xbb71c929 in strspn () from /usr/lib/libc.so.12
(gdb) where
#0  0xbb71c929 in strspn () from /usr/lib/libc.so.12
#1  0x08048aea in main ()
(gdb) disas strspn
...
0xbb71c906 <strspn+142>:        lea    0xffff07ac(%ebx),%esi
0xbb71c90c <strspn+148>:        mov    %dl,%al
0xbb71c90e <strspn+150>:        shr    $0x3,%al
0xbb71c911 <strspn+153>:        movzbl %al,%eax
0xbb71c914 <strspn+156>:        and    $0x7,%edx
0xbb71c917 <strspn+159>:        mov    (%esi,%edx,4),%edx
---Type <return> to continue, or q <return> to quit---
0xbb71c91a <strspn+162>:        or     %dl,0xffffffd0(%ebp,%eax,1)
0xbb71c91e <strspn+166>:        mov    0x1(%ecx),%dl
0xbb71c921 <strspn+169>:        inc    %ecx
0xbb71c922 <strspn+170>:        test   %dl,%dl
0xbb71c924 <strspn+172>:        jne    0xbb71c90c <strspn+148>
0xbb71c926 <strspn+174>:        mov    0xffffffc0(%ebp),%eax
0xbb71c929 <strspn+177>:        mov    (%eax),%cl
0xbb71c92b <strspn+179>:        test   %cl,%cl
0xbb71c92d <strspn+181>:        je     0xbb71c971 <strspn+249>
0xbb71c92f <strspn+183>:        mov    %cl,%al
0xbb71c931 <strspn+185>:        shr    $0x3,%al
0xbb71c934 <strspn+188>:        movzbl %al,%eax
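
The disassembly above is consistent with a strspn() that first builds a
256-bit membership table from the character set and then scans the input
string. The following is a minimal sketch of that technique, not libc's
actual source; bitmap_strspn and its variable names are hypothetical. Under
this reading, the faulting load at strspn+177 would be the first read
through the input string pointer, which suggests the caller passed an
invalid pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of a bitmap-based strspn(); hypothetical name, not libc's code.
 * Returns the length of the initial run of s made only of bytes in charset.
 */
size_t bitmap_strspn(const char *s, const char *charset)
{
    /* One mask per value of (c & 7). The compiled libc table appears to
     * use 4-byte entries; the element width does not change the logic. */
    static const uint8_t bit[8] = { 1, 2, 4, 8, 16, 32, 64, 128 };
    uint8_t set[32];            /* 256 bits, one per possible byte value */
    const char *t;

    memset(set, 0, sizeof(set));

    /* Phase 1: mark every byte of charset in the table
     * (compare the shr $0x3 / and $0x7 / or loop in the listing). */
    for (; *charset != '\0'; ++charset) {
        unsigned int c = (unsigned char)*charset;
        set[c >> 3] |= bit[c & 7];
    }

    /* Phase 2: scan s until a byte outside the set is found.
     * The first iteration dereferences s itself, so an invalid s
     * faults here, after the table has already been built. */
    for (t = s; *t != '\0'; ++t) {
        unsigned int c = (unsigned char)*t;
        if ((set[c >> 3] & bit[c & 7]) == 0)
            break;
    }
    return (size_t)(t - s);
}
```

Because the table is built before the string is touched, a crash at this
point says nothing about the character set argument; only the string
argument is suspect.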

B) xdvi behaves nondeterministically on the same .dvi file: sometimes it
reports that t1lib could not render some glyphs and replaces them with
whitespace, sometimes it dies with a segmentation violation, and sometimes
it works fine. The affected glyphs differ from run to run.

% xdvi-original fxtbook.dvi
xdvi-xaw: Error: T1lib failed for character 0x69 `i': Rasterization Aborted. Replacing by whitespace.
xdvi-xaw: Error: T1lib failed for character 0x64 `d': Rasterization Aborted. Replacing by whitespace.
% pkg_info t1lib
Information for t1lib-5.1.2nb6: (...)
Required by:
xdvik-22.84.16nb2
% ldd /usr/pkg/bin/xdvi-xaw
/usr/pkg/bin/xdvi-xaw:
	-lkpathsea.6 => /usr/pkg/lib/libkpathsea.so.6
	-lc.12 => /usr/lib/libc.so.12
	-lt1.5 => /usr/pkg/lib/libt1.so.5
	-lm.0 => /usr/lib/libm.so.0
	-lXaw7.7 => /usr/pkg/lib/libXaw7.so.7
	-lXmu.6 => /usr/pkg/lib/libXmu.so.6
	-lXt.6 => /usr/pkg/lib/libXt.so.6
	-lSM.6 => /usr/pkg/lib/libSM.so.6
	-lICE.6 => /usr/pkg/lib/libICE.so.6
	-lX11.6 => /usr/pkg/lib/libX11.so.6
	-lxcb.1 => /usr/pkg/lib/libxcb.so.1
	-lXau.6 => /usr/pkg/lib/libXau.so.6
	-lXdmcp.6 => /usr/pkg/lib/libXdmcp.so.6
	-lXext.6 => /usr/pkg/lib/libXext.so.6
	-lXpm.4 => /usr/pkg/lib/libXpm.so.4

The same pkg binaries work fine on a standalone machine with GENERIC
kernel.

>How-To-Repeat:
	Send a mail to test the mail system, then check the queue with mailq.
	Run xdvi, which is mission critical on the freshly installed system.
>Fix:
	Workaround: use a standalone machine with the same binaries.

>Audit-Trail:
From: Ignatios Souvatzis <ignatios@cs.uni-bonn.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-xen/46263: some programs don't work in xen3pae-domU
Date: Tue, 27 Mar 2012 10:35:25 +0200

 It occurred to me that I should post the CPU details, just in case:

 domU:

 > cpuctl identify 0
 Cannot bind to target CPU.  Output may not accurately describe the target.
 Run as root to allow binding.

 cpu0: Intel Pentium Pro, II or III (686-class), id 0x106e5
 cpu0: features 0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
 cpu0: features 0xbfebfbff<PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX>
 cpu0: features 0xbfebfbff<FXSR,SSE,SSE2,SS,HTT,TM,SBF>
 cpu0: features2 0x98e3fd<SSE3,DTES64,MONITOR,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE41,SSE42,POPCNT>
 cpu0: features3 0x28100000<XD,EM64T>
 cpu0: "Intel(R) Xeon(R) CPU           L3426  @ 1.87GHz"
 cpu0: I-cache 32KB 64B/line 4-way, D-cache 32KB 64B/line 8-way
 cpu0: L2 cache 256KB 64B/line 8-way
 cpu0: ITLB 64 4KB entries 4-way
 cpu0: DTLB 64 4KB entries 4-way
 cpu0: L3 cache 8MB 64B/line 16-way
 cpu0: Initial APIC ID 1
 cpu0: Cluster/Package ID 0
 cpu0: Core ID 0
 cpu0: SMT ID 1
 cpu0: family 06 model 0e extfamily 00 extmodel 01

 dom0:

 jaguar-cage# cpuctl identify 0
 cpu0: Intel Pentium Pro, II or III (686-class), id 0x106e5
 cpu0: features  0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
 cpu0: features  0xbfebfbff<PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR>
 cpu0: features  0xbfebfbff<SSE,SSE2,SS,HTT,TM,SBF>
 cpu0: features2 0x98e3fd<SSE3,DTES64,MONITOR,DS-CPL,VMX,SMX,EST,TM2,SSSE3>
 cpu0: features2 0x98e3fd<CX16,xTPR,PDCM,SSE41,SSE42,POPCNT>
 cpu0: features3 0x28100800<SYSCALL/SYSRET,XD,EM64T>
 cpu0: features4 0x1<LAHF>
 cpu0: "Intel(R) Xeon(R) CPU           L3426  @ 1.87GHz"
 cpu0: I-cache 32KB 64B/line 4-way, D-cache 32KB 64B/line 8-way
 cpu0: L2 cache 256KB 64B/line 8-way
 cpu0: L3 cache 8MB 64B/line 16-way
 cpu0: ITLB 64 4KB entries 4-way
 cpu0: DTLB 64 4KB entries 4-way
 cpu0: Initial APIC ID 6
 cpu0: Cluster/Package ID 0
 cpu0: Core ID 3
 cpu0: SMT ID 0
 cpu0: family 06 model 0e extfamily 00 extmodel 01 stepping 05
 cpu0: UCode version: ?

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: port-xen-maintainer@NetBSD.org, gnats-admin@NetBSD.org,
        netbsd-bugs@NetBSD.org
Subject: Re: port-xen/46263: some programs don't work in xen3pae-domU
Date: Tue, 27 Mar 2012 14:03:14 +0200

 Can you try a recent netbsd-5_STABLE kernel, with ticket #1738 ?
 This kernel should have:
 ==> sys/arch/i386/i386/gdt.c <==
 /*      $NetBSD: gdt.c,v 1.45.10.2 2012/03/21 21:29:31 jdc Exp $        */

 ==> sys/arch/i386/i386/machdep.c <==
 /*      $NetBSD: machdep.c,v 1.644.4.13 2012/03/21 21:29:31 jdc Exp $   */

 ==> sys/arch/i386/include/segments.h <==
 /*      $NetBSD: segments.h,v 1.50.4.2 2012/03/21 21:29:31 jdc Exp $    */

 But I'm surprised this kind of issue didn't show up on pkgbuild ...

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Ignatios Souvatzis <ignatios@cs.uni-bonn.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-xen/46263: some programs don't work in xen3pae-domU
Date: Tue, 27 Mar 2012 16:02:22 +0200

 On Tue, Mar 27, 2012 at 12:05:03PM +0000, Manuel Bouyer wrote:

 >  Can you try a recent netbsd-5_STABLE kernel, with ticket #1738 ?
 >  This kernel should have:
 >  ==> sys/arch/i386/i386/gdt.c <==
 >  /*      $NetBSD: gdt.c,v 1.45.10.2 2012/03/21 21:29:31 jdc Exp $        */
 >  
 >  ==> sys/arch/i386/i386/machdep.c <==
 >  /*      $NetBSD: machdep.c,v 1.644.4.13 2012/03/21 21:29:31 jdc Exp $   */
 >  
 >  ==> sys/arch/i386/include/segments.h <==
 >  /*      $NetBSD: segments.h,v 1.50.4.2 2012/03/21 21:29:31 jdc Exp $    */
 >  
 >  But I'm surprised this kind of issue didn't show up on pkgbuild ...

 Actually, it turns out that the mailwrapper part of the PR is bogus
 (mailwrapper configuration error). Mailwrapper should have slightly
 better error handling, IMO.
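
 A minimal sketch of the kind of check meant here, assuming the crash came
 from a configuration line that names a program but gives no target, so
 that a missing field reached strspn(). split_mapping is a hypothetical
 helper written for illustration, not mailwrapper's actual code, and the
 "name target [args]" line layout is likewise an assumption.

```c
#include <stddef.h>
#include <string.h>

#define WS " \t\n"

/*
 * Split a "name target [args...]" line in place (hypothetical helper).
 * On success returns 0, sets *name to the first field and *target to the
 * rest of the line after the separating whitespace.
 * Returns -1 if the line is NULL, empty, or has a name but no target.
 */
int split_mapping(char *line, char **name, char **target)
{
    char *p;

    if (line == NULL)
        return -1;                  /* never hand NULL to strspn() */

    p = line + strspn(line, WS);    /* skip leading whitespace */
    if (*p == '\0')
        return -1;                  /* blank line: no name */
    *name = p;

    p += strcspn(p, WS);            /* advance to the end of the name */
    if (*p == '\0')
        return -1;                  /* name only, no target field */
    *p++ = '\0';                    /* terminate the name */

    p += strspn(p, WS);             /* skip whitespace before the target */
    if (*p == '\0')
        return -1;                  /* only whitespace after the name */
    *target = p;
    return 0;
}
```

 A caller could then report the offending line (for example with errx(3))
 when the helper returns -1, instead of dereferencing a missing field.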

 The xdvi/T1lib part, however, is unaffected by using the proposed kernel
 version.

 The xdvi/T1lib part is also unaffected by starting xen with mem=3G, thus
 (hopefully) forcing it to only use and give out low memory.

 	-is
