NetBSD Problem Report #55853

From mouse@Stone.Rodents-Montreal.ORG  Wed Dec  9 02:37:44 2020
Return-Path: <mouse@Stone.Rodents-Montreal.ORG>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 4443E1A9217
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  9 Dec 2020 02:37:44 +0000 (UTC)
Message-Id: <202012090237.VAA00478@Stone.Rodents-Montreal.ORG>
Date: Tue, 8 Dec 2020 21:37:40 -0500 (EST)
From: Mouse <mouse@Rodents-Montreal.ORG>
Reply-To: mouse@Rodents-Montreal.ORG
To: gnats-bugs@NetBSD.org
Subject: [dM] xdpyinfo sometimes loses output
X-Send-Pr-Version: 3.95

>Number:         55853
>Category:       xsrc
>Synopsis:       Under some circumstances, xdpyinfo prints nothing.
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    xsrc-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Dec 09 02:40:00 +0000 2020
>Last-Modified:  Wed Dec 09 14:30:01 +0000 2020
>Originator:     Mouse
>Release:        NetBSD 9.1
>Organization:
	Dis-
>Environment:
System: NetBSD Aaeon-9.Rodents-Montreal.ORG 9.1 NetBSD 9.1 (GEN91) #6: Tue Dec 8 10:08:26 EST 2020 mouse@Aaeon-9.Rodents-Montreal.ORG:/home/mouse/kbuild/GEN91 amd64
Architecture: x86_64
Machine: amd64
>Description:
	Under certain circumstances, xdpyinfo unexpectedly produces no
	output.  My impression of the circumstances is "stdout isn't a
	tty", but that's just a guess.
>How-To-Repeat:
	On a 9.1 system, start X.
	Run "/usr/X11R7/bin/xdpyinfo".
	I see normal-looking output.
	Run "/usr/X11R7/bin/xdpyinfo | cat".
	I see no output.
	So far, script(1), as in "script -c xdpyinfo", is the only way
	I've found to capture xdpyinfo output on 9.1.

	Given how many layers are between the display and xdpyinfo's
	stdout, I am inclined to doubt the video hardware is relevant.
	In case anyone cares, though, here is what looks to me like the
	relevant lines from /var/run/dmesg.boot.  I can supply more if
	someone tells me what to look for.

	acpivga0 at acpi0 (GFX0): ACPI Display Adapter
	acpiout0 at acpivga0 (DD01, 0x0100): ACPI Display Output Device
	acpiout1 at acpivga0 (DD02, 0x0002): ACPI Display Output Device
	acpiout2 at acpivga0 (DD03, 0x0300): ACPI Display Output Device
	acpiout3 at acpivga0 (DD04, 0x0301): ACPI Display Output Device
	acpiout4 at acpivga0 (DD05, 0x0302): ACPI Display Output Device
	acpiout5 at acpivga0 (DD06, 0x0303): ACPI Display Output Device
	acpiout6 at acpivga0 (DD07, 0x0304): ACPI Display Output Device
	acpiout7 at acpivga0 (DD08, 0x0305): ACPI Display Output Device
	acpiout8 at acpivga0 (DD09, 0x0009): ACPI Display Output Device
	acpiout9 at acpivga0 (DD0A, 0x000a): ACPI Display Output Device
	acpiout10 at acpivga0 (DD0B, 0x000b): ACPI Display Output Device
	acpiout11 at acpivga0 (DD0C, 0x000c): ACPI Display Output Device
	acpiout12 at acpivga0 (DD0D, 0x000d): ACPI Display Output Device
	acpiout13 at acpivga0 (DD0E, 0x000e): ACPI Display Output Device
	acpiout14 at acpivga0 (DD0F, 0x000f): ACPI Display Output Device
	acpiout15 at acpivga0 (DD1F, 0x0400): ACPI Display Output Device
	acpiout15: brightness levels: [0-100]
	acpiout16 at acpivga0 (SKC0, 0xca00): ACPI Display Output Device
	acpivga0: connected output devices:
	acpivga0:   0x0400 (acpiout15): Unknown Output Device, head 0

	i915drmkms0 at pci0 dev 2 function 0: vendor 8086 product 5912 (rev. 0x04)

	uhidev0 at uhub1 port 3 configuration 1 interface 0
	uhidev0: PixArt (0x93a) USB Optical Mouse (0x2510), rev 2.00/1.00, addr 1, iclass 3/1
	ums0 at uhidev0: 3 buttons and Z dir
	wsmouse0 at ums0 mux 2

	uhidev1 at uhub1 port 5 configuration 1 interface 0
	uhidev1: Gaming KB (0x258a) Gaming KB (0x1006), rev 1.10/1.05, addr 2, iclass 3/1
	ukbd0 at uhidev1
	wskbd0 at ukbd0: console keyboard
	uhidev2 at uhub1 port 5 configuration 1 interface 1
	uhidev2: Gaming KB (0x258a) Gaming KB (0x1006), rev 1.10/1.05, addr 2, iclass 3/1
	uhidev2: 8 report ids
	uhid0 at uhidev2 reportid 1: input=1, output=0, feature=0
	uhid1 at uhidev2 reportid 2: input=2, output=0, feature=0
	uhid2 at uhidev2 reportid 5: input=0, output=0, feature=5
	ukbd1 at uhidev2 reportid 6
	ukbd1: autoconfiguration error: attach failed, key code size != 8
	ukbd2 at uhidev2 reportid 7
	wskbd1 at ukbd2 mux 1
	ukbd3 at uhidev2 reportid 8
	wskbd2 at ukbd3 mux 1

	kern info: [drm] Memory usable by graphics device = 4096M
	kern info: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
	kern info: [drm] Driver supports precise vblank timestamp query.
	kern info: [drm] failed to find VBIOS tables
	i915drmkms0: interrupting at ioapic0 pin 16 (i915drmkms0)
	kern info: [drm] failed to retrieve link info, disabling eDP
	kern error: [drm:(/sys/external/bsd/drm2/dist/drm/i915/i915_drv.c:636)i915_firmware_load_error_print] *ERROR* failed to load firmware i915/kbl_dmc_ver1.bin (0)
	kern error: [drm:(/sys/external/bsd/drm2/dist/drm/i915/i915_drv.c:651)i915_firmware_load_error_print] *ERROR* kern error: [drm:(/sys/external/bsd/drm2/dist/drm/i915/intel_guc_loader.c:560)guc_fw_fetch] *ERROR* Failed to fetch GuC firmware from i915/kbl_guc_ver9_14.bin (error -2)
	The driver is built-in, so to load the firmware you need to
	include it either in the kernel (see CONFIG_EXTRA_FIRMWARE) or
	in your initrd/initramfs image.
	kern error: [drm:(/sys/external/bsd/drm2/dist/drm/i915/i915_gem.c:5417)i915_gem_init_hw] *ERROR* Failed to initialize GuC, error -5 (ignored)
	kern info: [drm] GMBUS [i915 gmbus dpd] timed out, falling back to bit banging on pin 6
	intelfb0 at i915drmkms0
	intelfb0: framebuffer at 0xffffab8067718000, size 1280x1024, depth 32, stride 5120
	warning: /sys/external/bsd/drm2/dist/drm/i915/intel_pm.c:3624: WARN_ON(!wm_changed)wsdisplay0 at intelfb0 kbdmux 1: console (default, vt100 emulation), using wskbd0
	wsmux1: connecting to wsdisplay0
	wskbd1: connecting to wsdisplay0
	wskbd2: connecting to wsdisplay0
	kern info: [drm] RC6 on
>Fix:
	I have no idea, beyond "don't try to use xdpyinfo on 9.1", which
	is less than tremendously helpful.

>Audit-Trail:
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: xsrc/55853: [dM] xdpyinfo sometimes loses output
Date: Wed, 9 Dec 2020 05:11:24 -0000 (UTC)

 mouse@Rodents-Montreal.ORG (Mouse) writes:

 >	Under certain circumstances, xdpyinfo unexpectedly produces no
 >	output.  My impression of the circumstances is "stdout isn't a
 >	tty", but that's just a guess.

 xdypinfo | wc
     1892    7775   65536

 gives a magic number of output bytes.

 A ktrace gives

   6636      1 xdpyinfo CALL  write(1,0x7b37a000,0x4000)
   6636      1 xdpyinfo RET   write 16384/0x4000
   6636      1 xdpyinfo CALL  write(1,0x7b37a000,0x4000)
   6636      1 xdpyinfo RET   write 16384/0x4000
   6636      1 xdpyinfo CALL  write(1,0x7b37a000,0x4000)
   6636      1 xdpyinfo RET   write 16384/0x4000
   6636      1 xdpyinfo CALL  write(1,0x7b37a000,0x4000)
   6636      1 xdpyinfo RET   write 16384/0x4000

 But with tty output, it is line buffered and all writes sum up to
 67874 bytes.

 So the last buffered output isn't flushed. Depending on your
 xdpyinfo generated output, this might be everything.

 You also see in the trace that xdpyinfo exits with code 3 in
 either case.

 Breakpoint 1, 0x7bb89394 in _exit () from /usr/lib/libc.so.12
 (gdb) where
 #0  0x7bb89394 in _exit () from /usr/lib/libc.so.12
 #1  0x7bde3af8 in XF86cleanup (sig=<optimized out>) at /home/netbsd9/xsrc/external/mit/libXxf86dga/dist/src/XF86DGA.c:660
 #2  0x7bed2574 in _rtld_call_function_void (obj=0x7befbc00, addr=2078161548) at /home/netbsd9/src/libexec/ld.elf_so/rtld.h:490
 #3  _rtld_call_initfini_function (obj=0x7befbc00, mask=<optimized out>, func=2078161548) at /home/netbsd9/src/libexec/ld.elf_so/rtld.c:143
 #4  _rtld_call_fini_function (cur_objgen=<optimized out>, mask=<optimized out>, obj=0x7befbc00) at /home/netbsd9/src/libexec/ld.elf_so/rtld.c:173
 #5  _rtld_call_fini_functions (mask=mask@entry=0x7fffd510, force=force@entry=1) at /home/netbsd9/src/libexec/ld.elf_so/rtld.c:204
 #6  0x7bed34bc in _rtld_exit () at /home/netbsd9/src/libexec/ld.elf_so/rtld.c:418
 #7  0x7bb5f778 in __cxa_finalize (dso=dso@entry=0x0) at /home/netbsd9/src/lib/libc/stdlib/atexit.c:222
 #8  0x7bb5f378 in exit (status=0) at /home/netbsd9/src/lib/libc/stdlib/exit.c:60
 #9  0x000149a8 in main (argc=<optimized out>, argv=<optimized out>) at /home/netbsd9/xsrc/external/mit/xdpyinfo/dist/xdpyinfo.c:1506


 static void
 #ifdef __NetBSD__
 __attribute__ ((__destructor__))
 #endif
 XF86cleanup(int sig)  
 {
     ScrPtr sp;
     int i;
     static char beenhere = 0;

     if (beenhere)
         _exit(3);
     beenhere = 1;

     for (i = 0; i < numScrs; i++) {
         sp = scrList[i];
         XF86DGADirectVideo(sp->display, sp->screen, 0);
         XSync(sp->display, False); 
     }
     _exit(3);
 }

 That code is usually called by an atexit() handler or by a fatal
 signal handler which are set up in XF86DGAGetVideo.

 But we cannot run an atexit handler in a possibly dlopen'ed
 shared library safely, so we have patched libXxf86dga (and others)
 to run atexit handlers unconditionally as a library destructor.

 In either case that code doesn't make sense as you get the _exit(3)
 also on other systems once XF86DGAGetVideo() has been called and
 the atexit handler is installed. But with our change to a destructor,
 the bug is always there.

 xdpyinfo links against libXxf86dga because it queries the XF86DGA
 extension and calls XF86DGAGetVideoLL for information. It doesn't
 call XF86DGAGetVideo. Other systems, using the upstream code,
 therefore do not see that issue.


 -- 
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: RVP <rvp@SDF.ORG>
To: Mouse <mouse@Rodents-Montreal.ORG>
Cc: gnats-bugs@NetBSD.org
Subject: Re: xsrc/55853: [dM] xdpyinfo sometimes loses output
Date: Wed, 9 Dec 2020 05:34:25 +0000 (UTC)

 This looks like a problem caused by fully-buffered writes:

 $ uname -a
 NetBSD x202e.localdomain 9.99.76 NetBSD 9.99.76 (GENERIC) #0: Wed Dec  2 
 02:21:43 UTC 2020 
 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64

 $ xdpyinfo | wc -c
     65536
 $ xdpyinfo > /tmp/x.txt
 $ wc -c /tmp/x.txt
     65536 /tmp/x.txt
 $

 Precisely 64K each time. Make writes to stdout line-buffered:

 $ env STDBUF1=L xdpyinfo | wc -c
     67792
 $

 Success.

 Someone'll have to look into the libc/stdio/... code to see why
 fully-buffered output streams are not being flushed before a
 program exits.

 In the meantime, the "fix" for this _specific_ problem is simple:

 ---START PATCH---
 diff -urN xdpyinfo-1.3.2.orig/xdpyinfo.c xdpyinfo-1.3.2/xdpyinfo.c
 --- xdpyinfo-1.3.2.orig/xdpyinfo.c      2015-04-17 06:00:20.000000000 
 +0000
 +++ xdpyinfo-1.3.2/xdpyinfo.c   2020-12-09 05:24:58.749772471 +0000
 @@ -1501,6 +1501,7 @@
       }

       print_marked_extensions(dpy);
 +    fflush(NULL);

       XCloseDisplay (dpy);
       exit (0);
 ---END PATCH---

 We flush all output streams explicitly.

 -RVP

From: RVP <rvp@SDF.ORG>
To: Michael van Elst <mlelstv@serpens.de>
Cc: gnats-bugs@netbsd.org, mouse@Rodents-Montreal.ORG
Subject: Re: xsrc/55853: [dM] xdpyinfo sometimes loses output
Date: Wed, 9 Dec 2020 06:03:51 +0000 (UTC)

 On Wed, 9 Dec 2020, Michael van Elst wrote:

 > So the last buffered output isn't flushed. Depending on your
 > xdpyinfo generated output, this might be everything.
 >
 > You also see in the trace that xdpyinfo exits with code 3 in
 > either case.
 >

 Ah! I missed that. I should've checked my ktrace for the exit
 status too.

 Yours looks like the correct explanation. Thanks.

 -RVP

From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: xsrc/55853: [dM] xdpyinfo sometimes loses output
Date: Wed, 9 Dec 2020 06:12:21 -0000 (UTC)

 mlelstv@serpens.de (Michael van Elst) writes:

 See also: https://bugs.freedesktop.org/show_bug.cgi?id=9815

 -- 
 -- 
                                 Michael van Elst
 Internet: mlelstv@serpens.de
                                 "A potential Snark may lurk in every tree."

From: Mouse <mouse@Rodents-Montreal.ORG>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: xsrc/55853: [dM] xdpyinfo sometimes loses output
Date: Wed, 9 Dec 2020 09:26:52 -0500 (EST)

 >  static void
 >  #ifdef __NetBSD__
 >  __attribute__ ((__destructor__))
 >  #endif
 >  XF86cleanup(int sig)  
 >  {
 [...]
 >      _exit(3);
 >  }

 >  That code is usually called by an atexit() handler or by a fatal
 >  signal handler which are set up in XF86DGAGetVideo.

 Should such a function be calling _exit at all?  It will prevent other
 cleanup (such as other atexit() handlers) from running, if naught else.

 /~\ The ASCII				  Mouse
 \ / Ribbon Campaign
  X  Against HTML		mouse@rodents-montreal.org
 / \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.