NetBSD Problem Report #57445

From martin@aprisoft.de  Mon May 29 12:25:05 2023
Return-Path: <martin@aprisoft.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id A6C541A9238
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 29 May 2023 12:25:05 +0000 (UTC)
Message-Id: <20230529122454.8E6CD5CC81D@emmas.aprisoft.de>
Date: Mon, 29 May 2023 14:24:54 +0200 (CEST)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: firefox crashes on startup
X-Send-Pr-Version: 3.95

>Number:         57445
>Category:       pkg
>Synopsis:       firefox crashes on startup
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    riastradh
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon May 29 12:30:00 +0000 2023
>Closed-Date:    Sat Apr 13 12:55:15 +0000 2024
>Last-Modified:  Sat Apr 13 12:55:15 +0000 2024
>Originator:     Martin Husemann
>Release:        NetBSD 10.99.4
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD martins.aprisoft.de 10.99.4 NetBSD 10.99.4 (GENERIC) #171: Sat May 27 10:59:49 CEST 2023 martin@martins.aprisoft.de:/usr/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:

This has been discussed on and off on the mailing list without a committable
solution, so it is time to file a PR.

The bug described below *might* be related to the TLS issue reported in
PR 50277, and disabling TLS use in libmesa does seem to help, but it is
not clear what this means; it could just be different timing helping by
chance.

So what happens when I start firefox: firefox is supposed to create a render
thread and use libmesa for that. It creates a GL context and sets it for
that thread. When something goes wrong, the error code is fetched from
that context.

However, at regular startup on my machine something goes wrong but the GL
context has not been set (is still NULL) and this NULL dereference (to get
the last error status) kills firefox.

Note: upstream (in libmesa) the context is never NULL; it is initialized to
the address of a static (empty) dummyContext.

The change to make it initially NULL is a NetBSD-specific hack to work around
some ld.elf_so restriction (that I don't fully understand).
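
To illustrate the difference between the two initializations, here is a
minimal sketch. It is not the Mesa code: every name prefixed with demo_ is
hypothetical, the struct is a stand-in with a single field, and __thread is
the GCC/Clang extension also used in the patch below. It only shows why a
dummy-initialized per-thread pointer can always be dereferenced, while a
NULL-initialized one needs a check first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a GL context; only the error field matters. */
struct demo_context {
    int error_value;
};

/* Static, empty fallback object, playing the role of a dummy context. */
static struct demo_context demo_dummy = { 0 };

/* Variant A: the per-thread pointer starts out pointing at the dummy. */
static __thread struct demo_context *demo_current = &demo_dummy;

/* Variant B: the per-thread pointer starts out as NULL. */
static __thread struct demo_context *demo_current_null = NULL;

/* Bind a context for this thread; NULL maps back to the dummy. */
static void demo_set_current(struct demo_context *c)
{
    demo_current = (c != NULL) ? c : &demo_dummy;
}

/* Variant A reader: safe even before any context was made current. */
static int demo_get_error(void)
{
    return demo_current->error_value;
}

/* Variant B reader: must check before dereferencing. */
static int demo_get_error_checked(void)
{
    struct demo_context *ctx = demo_current_null;

    if (ctx == NULL)
        return -1;  /* hypothetical sentinel: no context bound */
    return ctx->error_value;
}

/* Usage example exercising both variants in the calling thread. */
static void demo_self_check(void)
{
    struct demo_context real = { 42 };

    assert(demo_get_error() == 0);          /* dummy is read, no crash */
    demo_set_current(&real);
    assert(demo_get_error() == 42);         /* bound context is read */
    demo_set_current(NULL);
    assert(demo_get_error() == 0);          /* back to the dummy */
    assert(demo_get_error_checked() == -1); /* NULL variant reports unbound */
}
```

With variant A an unchecked reader behaves like the upstream case described
above; with variant B the same unchecked read would dereference NULL.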

The bug is obscured slightly by:
 - gdb/ptrace not working well together with -current ld.elf_so, TLS, and
   massively threaded apps like firefox. Running firefox from gdb often makes
   it work; when it doesn't, it changes the details of the crash completely
   and also crashes gdb
 - firefox helpfully installing a signal handler catching the crash

So what I did is:
 - build firefox with options debug-info
   (see https://wiki.NetBSD.org/tutorials/pkgsrc/debugging_firefox/)
 - modify mesalib.old (see patch below)
 - run firefox w/o gdb and check the core
 - run firefox from inside gdb with a breakpoint on __glXSetCurrentContext
   which is the only place where the context ever gets changed

To me it looks like something goes wrong in the NetBSD variant of Mesa's
GET_CURRENT_CONTEXT(ctx) macro. Theoretically it is impossible to get
ctx == NULL.

Within Firefox there should be a valid context in the render thread,
but it seems in the crash case it does not get that far (why it fails
to do so is still to be analyzed; it could be anything, including bad
luck/timing).
It could (silently) happen on other systems too, but probably goes unnoticed
there as the crashing call just retrieves the error code from the dummyContext.


So here are the gdb sessions, one showing the actual crash from a core
dump, and one showing a successful call to __glXSetCurrentContext
with a valid context.

[/usr/pkgobj/www/firefox/work/build/dist/bin] martin@martins > ./run-mozilla.sh ./firefox
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: glxtest: cannot access /sys/bus/pci (t=0.385433) [GFX1-]: glxtest: cannot access /sys/bus/pci
ATTENTION: default value of option mesa_glthread overridden by environment.
libEGL warning: DRI2: failed to authenticate
ATTENTION: default value of option mesa_glthread overridden by environment.
_mesa_GetError() in thread 0x73016b7dd400 lwp 27206 with NULL context
Segmentation fault (core dumped)

 > gdb ./firefox firefox.core 
[..]
Reading symbols from ./firefox...
[New process 27206]
[..]
[New process 11054]
[New process 21185]
Core was generated by `firefox'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000730165aadb97 in _mesa_GetError ()
   from /usr/X11R7/lib/modules/dri/swrast_dri.so
[Current thread is 1 (process 27206)]

(gdb) bt
#0  0x0000730165aadb97 in _mesa_GetError ()
   from /usr/X11R7/lib/modules/dri/swrast_dri.so
#1  0x0000730179027166 in mozilla::gl::GLContext::InitImpl (
    this=0x73015f23f800)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:526
#2  0x000073017902859a in mozilla::gl::GLContext::InitImpl (
    this=0x73015f23f800)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:372
#3  mozilla::gl::GLContext::Init (this=0x73015f23f800)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:324
#4  0x0000730179028612 in mozilla::gl::GLContextEGL::Init (this=0x73015f23f800)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderEGL.cpp:412
#5  0x0000730179028a34 in mozilla::gl::GLContextEGL::CreateGLContext (egl=..., 
    desc=..., surfaceConfig=0x73015f29db40, surface=surface@entry=0x0, 
    useGles=useGles@entry=false, contextConfig=<optimized out>, 
    out_failureId=out_failureId@entry=0x73016b749700)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderEGL.cpp:759
#6  0x00007301790291d9 in mozilla::gl::GLContextEGLFactory::CreateImpl (
    aWindow=aWindow@entry=0x0, 
    aHardwareWebRender=aHardwareWebRender@entry=true, 
    aUseGles=aUseGles@entry=false)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderEGL.cpp:294
#7  0x0000730179029647 in mozilla::gl::GLContextEGLFactory::Create (
    aWindow=0x0, aHardwareWebRender=<optimized out>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderEGL.cpp:328
#8  0x000073017902968b in mozilla::gl::GLContextProviderEGL::CreateForCompositorWidget (aCompositorWidget=aCompositorWidget@entry=0x0, 
    aHardwareWebRender=aHardwareWebRender@entry=true)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderEGL.cpp:1003
#9  0x00007301792a9212 in CreateGLContextEGL ()
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderThread.cpp:1363
#10 0x00007301792a9487 in CreateGLContext (aError=...)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderThread.cpp:1396
#11 mozilla::wr::RenderThread::CreateSingletonGL (
    this=this@entry=0x73016ed67e00, aError=...)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderThread.cpp:1142
#12 0x00007301792a9a7d in mozilla::wr::RenderThread::InitDeviceTask (
    this=0x73016ed67e00)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderThread.cpp:973
#13 mozilla::wr::RenderThread::InitDeviceTask (this=0x73016ed67e00)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderThread.cpp:962
#14 0x0000730178c97dbe in mozilla::detail::runnable_args_base<(mozilla::detail::RunnableResult)0>::Run (this=<optimized out>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/dom/media/webrtc/transport/runnable_utils.h:41
#15 0x000073017880d552 in nsThread::ProcessNextEvent (this=0x73016edf9080, 
    aMayWait=<optimized out>, aResult=0x73016b749cd7)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThread.cpp:1233
#16 0x0000730178801db1 in NS_ProcessNextEvent (aThread=<optimized out>, 
    aThread@entry=0x73016edf9080, aMayWait=aMayWait@entry=true)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThreadUtils.cpp:477
#17 0x0000730178ce8d58 in mozilla::ipc::MessagePumpForNonMainThreads::Run (
    this=0x73016b4b50c0, aDelegate=0x73016b749d90)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/glue/MessagePump.cpp:330
#18 0x0000730178cad506 in MessageLoop::RunInternal (
    this=0x730182b46380 <__stack_chk_guard>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:381
#19 MessageLoop::RunHandler (this=0x730182b46380 <__stack_chk_guard>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:374
#20 MessageLoop::Run (this=this@entry=0x73016b749d90)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:356
#21 0x000073017880cd68 in nsThread::ThreadFunc (aArg=<optimized out>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThread.cpp:391
#22 0x00007301748297a6 in _pt_root () from /usr/pkg/lib/nspr/libnspr4.so
#23 0x0000730182d7c2df in pthread__create_tramp (cookie=0x73016b7dd400)
    at /usr/src/lib/libpthread/pthread.c:592


[/usr/pkgobj/www/firefox/work/build/dist/bin] martin@martins > ./run-mozilla.sh -g ./firefox
Reading symbols from ./firefox...
(gdb) break __glXSetCurrentContext
Function "__glXSetCurrentContext" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (__glXSetCurrentContext) pending.
(gdb) run
[..]
[New LWP 15641 of process 12660]
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: No GPUs detected via PCI (t=37.547) [GFX1-]: No GPUs detected via PCI
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: No GPUs detected via PCI (t=37.547) |[1][GFX1-]: glxtest: process failed (received signal 5) (t=37.5471) [GFX1-]: glxtest: process failed (received signal 5)
[New LWP 4400 of process 12660]
[..]
[New LWP 20536 of process 12660]
[New LWP 22439 of process 12660]
[Switching to LWP 25340 of process 12660]

Thread 61 "Renderer" hit Breakpoint 1, __glXSetCurrentContext (
    c=0x7f1d1395c000)
    at /usr/xsrc/external/mit/MesaLib.old/dist/src/glx/glxcurrent.c:102
102	fprintf(stderr, "__glXSetCurrentContext(%p) in thread %p lwp %ld\n",

(gdb) p *c
$1 = {buf = 0x0, pc = 0x0, limit = 0x0, bufEnd = 0x0, bufSize = 0, 
  vtable = 0x7f1d26b24340, xid = 35651646, share_xid = 0, screen = 0, 
  psc = 0x7f1d2c8d3f00, imported = 0 '\000', currentContextTag = 4294967295, 
  renderMode = 0, feedbackBuf = 0x0, selectBuf = 0x0, fillImage = 0x0, 
  attributes = {stack = {0x0 <repeats 16 times>}, stackPointer = 0x0}, 
  error = 0, isDirect = 1, currentDpy = 0x7f1d39b31000, 
  currentDrawable = 35651644, vendor = 0x0, renderer = 0x0, version = 0x0, 
  extensions = 0x0, maxSmallRenderCommandSize = 0, majorOpcode = 151, 
  config = 0x7f1d268b5c00, currentReadable = 35651644, 
  client_state_private = 0x0, renderType = 32788, server_major = 0, 
  server_minor = 0, thread_refcount = 1, noError = 0, 
  gl_extension_bits = '\000' <repeats 16 times>}
(gdb) bt
#0  __glXSetCurrentContext (c=0x7f1d1395c000)
    at /usr/xsrc/external/mit/MesaLib.old/dist/src/glx/glxcurrent.c:102
#1  MakeContextCurrent (dpy=0x7f1d39b31000, draw=35651644, read=35651644, 
    gc_user=0x7f1d1395c000)
    at /usr/xsrc/external/mit/MesaLib.old/dist/src/glx/glxcurrent.c:253
#2  0x00007f1d32df0f75 in mozilla::gl::GLXLibrary::fMakeCurrent (
    context=<optimized out>, drawable=<optimized out>, 
    display=<optimized out>, this=<optimized out>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLXLibrary.h:68
#3  mozilla::gl::GLContextGLX::MakeCurrentImpl (this=0x7f1d138d4800)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:473
#4  0x00007f1d32dff586 in mozilla::gl::GLContext::MakeCurrent (
    this=0x7f1d138d4800, aForce=<optimized out>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:2440
#5  0x00007f1d32e2858e in mozilla::gl::GLContext::InitImpl (
    this=0x7f1d138d4800)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:372
#6  mozilla::gl::GLContext::Init (this=this@entry=0x7f1d138d4800)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:324
#7  0x00007f1d32df328f in mozilla::gl::GLContextGLX::Init (this=0x7f1d138d4800)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:453
#8  operator() (__closure=__closure@entry=0x7f1d1c6c84b0, attribs=...)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:351
#9  0x00007f1d32df36f6 in mozilla::gl::GLContextGLX::CreateGLContext (
    desc=..., display=..., drawable=<optimized out>, drawable@entry=35651644, 
    cfg=<optimized out>, cfg@entry=0x7f1d268b5c00, 
    ownedPixmap=<optimized out>, ownedPixmap@entry=0)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:417
#10 0x00007f1d32df3a76 in mozilla::gl::CreateForWidget (
    aXDisplay=aXDisplay@entry=0x7f1d39b31000, 
    aXWindow=aXWindow@entry=35651644, 
    aHardwareWebRender=aHardwareWebRender@entry=true, 
    aForceAccelerated=<optimized out>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:620
#11 0x00007f1d32df3c62 in mozilla::gl::CreateForWidget (
    aForceAccelerated=<optimized out>, aHardwareWebRender=<optimized out>, 
    aXWindow=35651644, aXDisplay=0x7f1d39b31000)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:634
#12 mozilla::gl::GLContextProviderGLX::CreateForCompositorWidget (
    aCompositorWidget=<optimized out>, aHardwareWebRender=<optimized out>, 
    aForceAccelerated=<optimized out>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:634
#13 0x00007f1d32df42b7 in mozilla::gl::GLContextProviderLinux::CreateForCompositorWidget (aCompositorWidget=<optimized out>, 
    aHardwareWebRender=aHardwareWebRender@entry=true, 
    aForceAccelerated=aForceAccelerated@entry=true)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderLinux.cpp:29
#14 0x00007f1d330a9d10 in mozilla::wr::RenderCompositorOGL::Create (
    aWidget=..., aError=...)
    at /usr/pkgobj/www/firefox/work/build/dist/include/mozilla/RefPtr.h:280
#15 0x00007f1d330baa17 in mozilla::wr::RenderCompositor::Create (aWidget=..., 
    aError=...)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderCompositor.cpp:223
#16 0x00007f1d330bfd50 in mozilla::wr::NewRenderer::Run (this=0x7f1d13972100, 
    aRenderThread=..., aWindowId=...)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/WebRenderAPI.cpp:71
#17 0x00007f1d330a604c in mozilla::wr::RenderThread::RunEvent (
    this=0x7f1d2c8d2c00, aWindowId=..., aEvent=...)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderThread.cpp:526
#18 0x00007f1d330a122e in mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >&&>::applyImpl<mozilla::wr::RenderThread, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >), StoreCopyPassByConstLRef<mozilla::wr::WrWindowId>, StoreCopyPassByRRef<mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> > >, 0ul, 1ul> (args=..., m=<optimized out>, o=<optimized out>)
    at /usr/pkgobj/www/firefox/work/build/dist/include/nsThreadUtils.h:902
#19 mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >&&>::apply<mozilla::wr::RenderThread, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >)> (m=<optimized out>, 
    o=<optimized out>, this=<optimized out>)
    at /usr/pkgobj/www/firefox/work/build/dist/include/nsThreadUtils.h:1169
#20 mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >), true, (mozilla::RunnableKind)0, mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >&&>::Run (
    this=<optimized out>)
    at /usr/pkgobj/www/firefox/work/build/dist/include/nsThreadUtils.h:1216
#21 0x00007f1d3260d552 in nsThread::ProcessNextEvent (this=0x7f1d29ff7a00, 
    aMayWait=<optimized out>, aResult=0x7f1d1c6c8cd7)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThread.cpp:1233
#22 0x00007f1d32601db1 in NS_ProcessNextEvent (aThread=<optimized out>, 
    aThread@entry=0x7f1d29ff7a00, aMayWait=aMayWait@entry=true)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThreadUtils.cpp:477
#23 0x00007f1d32ae8d58 in mozilla::ipc::MessagePumpForNonMainThreads::Run (
    this=0x7f1d2683d0c0, aDelegate=0x7f1d1c6c8d90)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/glue/MessagePump.cpp:330
#24 0x00007f1d32aad506 in MessageLoop::RunInternal (
    this=0x7f1d3c9c8380 <__stack_chk_guard>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:381
#25 MessageLoop::RunHandler (this=0x7f1d3c9c8380 <__stack_chk_guard>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:374
#26 MessageLoop::Run (this=this@entry=0x7f1d1c6c8d90)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:356
#27 0x00007f1d3260cd68 in nsThread::ThreadFunc (aArg=<optimized out>)
    at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThread.cpp:391
#28 0x00007f1d2e6297a6 in _pt_root () from /usr/pkg/lib/nspr/libnspr4.so
#29 0x00007f1d3cbfe2df in pthread__create_tramp (cookie=0x7f1d268dc400)
    at /usr/src/lib/libpthread/pthread.c:592




>How-To-Repeat:
On my machine: just try to start firefox.

>Fix:
n/a

Patch used against mesalib.old:

Index: dist/src/glx/glxcurrent.c
===================================================================
RCS file: /cvsroot/xsrc/external/mit/MesaLib.old/dist/src/glx/glxcurrent.c,v
retrieving revision 1.1.1.2
diff -u -p -r1.1.1.2 glxcurrent.c
--- dist/src/glx/glxcurrent.c	11 Jul 2021 20:36:29 -0000	1.1.1.2
+++ dist/src/glx/glxcurrent.c	29 May 2023 11:29:27 -0000
@@ -34,6 +34,7 @@
  */

 #include <pthread.h>
+#include <lwp.h>

 #include "glxclient.h"
 #include "glapi.h"
@@ -98,6 +99,8 @@ __thread void *__glX_tls_Context __attri
 _X_HIDDEN void
 __glXSetCurrentContext(struct glx_context * c)
 {
+fprintf(stderr, "__glXSetCurrentContext(%p) in thread %p lwp %ld\n",
+  c, pthread_self(), (long)_lwp_self());
    __glX_tls_Context = (c != NULL) ? c : &dummyContext;
 }

Index: dist/src/mesa/main/getstring.c
===================================================================
RCS file: /cvsroot/xsrc/external/mit/MesaLib.old/dist/src/mesa/main/getstring.c,v
retrieving revision 1.1.1.2
diff -u -p -r1.1.1.2 getstring.c
--- dist/src/mesa/main/getstring.c	11 Jul 2021 20:36:32 -0000	1.1.1.2
+++ dist/src/mesa/main/getstring.c	29 May 2023 11:29:27 -0000
@@ -24,6 +24,7 @@


 #include <stdbool.h>
+#include <lwp.h>
 #include "glheader.h"
 #include "context.h"
 #include "debug_output.h"
@@ -325,6 +326,11 @@ GLenum GLAPIENTRY
 _mesa_GetError( void )
 {
    GET_CURRENT_CONTEXT(ctx);
+
+if (ctx == NULL)
+  fprintf(stderr, "_mesa_GetError() in thread %p lwp %ld with NULL context\n",
+    pthread_self(), (long)_lwp_self());
+
    GLenum e = ctx->ErrorValue;
    ASSERT_OUTSIDE_BEGIN_END_WITH_RETVAL(ctx, 0);


>Release-Note:

>Audit-Trail:
From: "David Brownlee" <abs@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57445 CVS commit: pkgsrc/www/firefox
Date: Tue, 30 May 2023 08:37:37 +0000

 Module Name:	pkgsrc
 Committed By:	abs
 Date:		Tue May 30 08:37:37 UTC 2023

 Modified Files:
 	pkgsrc/www/firefox: Makefile
 Added Files:
 	pkgsrc/www/firefox/files: firefox.sh

 Log Message:
 Add temporary workaround for PR#57445 for native X11 NetBSD

 Calling "export LD_PRELOAD=/usr/X11R7/lib/libEGL.so" before starting firefox
 avoids the crash on startup in many cases

 To be removed once PR#57445 is resolved (or restricted to non fixed installs)


 To generate a diff of this commit:
 cvs rdiff -u -r1.554 -r1.555 pkgsrc/www/firefox/Makefile
 cvs rdiff -u -r0 -r1.1 pkgsrc/www/firefox/files/firefox.sh

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: pkg/57445: firefox crashes on startup
Date: Tue, 30 May 2023 20:06:19 +0200

 This was a bit of a fool's errand - I looked at the wrong context variable
 (too many TLS things in there).

 Adding a similar fprintf to the correct context makes it obvious that
 it is a toolchain issue on the NetBSD side (or something really stupid that
 I am overlooking):


 [/usr/pkgobj/www/firefox/work/build/dist/bin] martin@martins > ./run-mozilla.sh ./firefox
 Crash Annotation GraphicsCriticalError: |[0][GFX1-]: glxtest: cannot access /sys/bus/pci (t=0.370644) [GFX1-]: glxtest: cannot access /sys/bus/pci
 ATTENTION: default value of option mesa_glthread overridden by environment.
 libEGL warning: DRI2: failed to authenticate
 ATTENTION: default value of option mesa_glthread overridden by environment.
 u_current_set_context(0x7516f1a404c0) thread 0x7516fe11d800 lwp 2811
 _mesa_GetError() in thread 0x7516fe11d800 lwp 2811 with NULL context
 Segmentation fault (core dumped)


 There is no fork() or anything dubious in between, here is the full
 excerpt of ktrace -i for the section between setting and not-getting
 the value:

   1833   2938 firefox  CALL  write(2,0x71401b9659d0,0x45)
   1833   1833 firefox  CALL  access(0x71401e744dd8,4)
   1833   1833 firefox  NAMI  "/usr/X11R7/lib/X11/fonts/100dpi/timB08.bdf"
   1833   2938 firefox  GIO   fd 2 wrote 69 bytes
        "u_current_set_context(0x7140101a0140) thread 0x71401b9f3400 lwp 2938\n"
   1833   2938 firefox  RET   write 69/0x45
   1833   1833 firefox  RET   access 0
   1833   1833 firefox  CALL  access(0x71401e7453b0,4)
   1833   1833 firefox  NAMI  "/usr/X11R7/lib/X11/fonts/100dpi/timB10-ISO8859-1.pcf.gz"
   1833   1833 firefox  RET   access 0
   1833   1833 firefox  CALL  access(0x71401e745990,4)
   1833   1833 firefox  NAMI  "/usr/X11R7/lib/X11/fonts/100dpi/timB10.bdf"
   1833   1833 firefox  RET   access 0
   1833   1833 firefox  CALL  access(0x71401e745f68,4)
   1833   1833 firefox  NAMI  "/usr/X11R7/lib/X11/fonts/100dpi/timB12-ISO8859-1.pcf.gz"
   1833   1833 firefox  RET   access 0
   1833   1833 firefox  CALL  access(0x71401e746548,4)
   1833   1833 firefox  NAMI  "/usr/X11R7/lib/X11/fonts/100dpi/timB12.bdf"
   1833   1833 firefox  RET   access 0
   1833   1833 firefox  CALL  access(0x71401e746b20,4)
   1833   1833 firefox  NAMI  "/usr/X11R7/lib/X11/fonts/100dpi/timB14-ISO8859-1.pcf.gz"
   1833   1833 firefox  RET   access 0
   1833   1833 firefox  CALL  access(0x71401e747100,4)
   1833   1833 firefox  NAMI  "/usr/X11R7/lib/X11/fonts/100dpi/timB14.bdf"
   1833   1833 firefox  RET   access 0
   1833   2938 firefox  CALL  _lwp_self
   1833   2938 firefox  RET   _lwp_self 2938/0xb7a
   1833   2938 firefox  CALL  write(2,0x71401b963c50,0x45)
   1833   2938 firefox  GIO   fd 2 wrote 69 bytes
        "_mesa_GetError() in thread 0x71401b9f3400 lwp 2938 with NULL context\n"
   1833   2938 firefox  RET   write 69/0x45
   1833   2938 firefox  PSIG  SIGSEGV caught handler=0x71402d511870 mask=(): code=SEGV_MAPERR, addr=0x578, trap=6)
   1833   1833 firefox  CALL  access(0x71401e7476d8,4)
   1833   2938 firefox  CALL  __sigaction_sigtramp(SIGSEGV,0x7140305f84c0,0,0x7140335906e0,2)
   1833   1833 firefox  NAMI  "/usr/X11R7/lib/X11/fonts/100dpi/timB18-ISO8859-1.pcf.gz"
   1833   2938 firefox  RET   __sigaction_sigtramp 0
   1833   2938 firefox  CALL  setcontext(0x71401b963e10)
   1833   1833 firefox  RET   access 0
   1833   2938 firefox  RET   setcontext JUSTRETURN
   1833   2938 firefox  PSIG  SIGSEGV SIG_DFL: code=SEGV_MAPERR, addr=0x578, trap=6)


 Here is some more debug output with address and value of the underlying TLS
 variable added, plus backtraces:

 Output:

 [/usr/pkgobj/www/firefox/work/build/dist/bin] martin@martins > ./run-mozilla.sh ./firefox 
 Crash Annotation GraphicsCriticalError: |[0][GFX1-]: glxtest: cannot access /sys/bus/pci (t=0.362207) [GFX1-]: glxtest: cannot access /sys/bus/pci
 ATTENTION: default value of option mesa_glthread overridden by environment.
 libEGL warning: DRI2: failed to authenticate
 ATTENTION: default value of option mesa_glthread overridden by environment.
 u_current_set_context(0x7933f3c08480) thread 0x7933ffa9f400 lwp 11177, &_glapi_tls_Context: 0x7933ff8f1030 (value: 0x7933f3c08480)
 _mesa_GetError() in thread 0x7933ffa9f400 lwp 11177 with NULL context
 &_glapi_tls_Context: 0x7933ff8f1018 (value: 0x0)
 Segmentation fault (core dumped)
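
In the output above, the setter and the reader run in the same thread and lwp
but print different addresses for the TLS variable (0x7933ff8f1030 versus
0x7933ff8f1018). The following is a rough sketch of this kind of probe, not
the instrumentation actually used: all demo_ names are hypothetical and the
slot is a plain __thread pointer in a single translation unit. It records the
address and value seen at the setter and at the reader so the two can be
compared.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-thread "current context" slot. */
static __thread void *demo_tls_context;

/* What one call site observed: where the slot lives and what it holds. */
struct demo_probe {
    void *addr;
    void *value;
};

/* Setter side: store the pointer, then log and return address and value. */
static struct demo_probe demo_set_context(void *ptr)
{
    struct demo_probe p;

    demo_tls_context = ptr;
    p.addr = (void *)&demo_tls_context;
    p.value = demo_tls_context;
    fprintf(stderr, "set: &demo_tls_context: %p (value: %p)\n",
        p.addr, p.value);
    return p;
}

/* Reader side: log and return the same two facts without changing the slot. */
static struct demo_probe demo_get_context(void)
{
    struct demo_probe p;

    p.addr = (void *)&demo_tls_context;
    p.value = demo_tls_context;
    fprintf(stderr, "get: &demo_tls_context: %p (value: %p)\n",
        p.addr, p.value);
    return p;
}

/* Within one thread, both call sites are expected to agree. */
static int demo_probes_agree(struct demo_probe a, struct demo_probe b)
{
    return a.addr == b.addr && a.value == b.value;
}

/* Usage example: set, then read back in the same thread and compare. */
static void demo_probe_self_check(void)
{
    int object = 0;
    struct demo_probe set = demo_set_context(&object);
    struct demo_probe get = demo_get_context();

    assert(set.value == (void *)&object);
    assert(demo_probes_agree(set, get));
}
```

In this self-contained sketch the two probes agree. A mismatch like the one
logged above would mean that the setter and the reader did not resolve the
variable to the same per-thread storage.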



 When run from gdb with breakpoint on u_current_set_context:

 [Switching to LWP 12645 of process 12088]

 Thread 60 "Renderer" hit Breakpoint 1, u_current_set_context (
     ptr=0x7a44b8201c00)
     at /usr/xsrc/external/mit/MesaLib.old/dist/src/mapi/u_current.c:236
 236	{
 (gdb) bt
 #0  u_current_set_context (ptr=0x7a44b8201c00)
     at /usr/xsrc/external/mit/MesaLib.old/dist/src/mapi/u_current.c:236
 #1  0x00007a44c5684966 in _mesa_make_current ()
    from /usr/X11R7/lib/modules/dri/swrast_dri.so
 #2  0x00007a44c551a727 in ?? () from /usr/X11R7/lib/modules/dri/swrast_dri.so
 #3  0x00007a44c53df9fc in dri_make_current ()
    from /usr/X11R7/lib/modules/dri/swrast_dri.so
 #4  0x00007a44c53e5a1e in ?? () from /usr/X11R7/lib/modules/dri/swrast_dri.so
 #5  0x00007a44cbfa49a4 in drisw_bind_context (context=0x7a44b839a000, 
     old=<optimized out>, draw=<optimized out>, read=<optimized out>)
     at /usr/xsrc/external/mit/MesaLib.old/dist/src/glx/drisw_glx.c:424
 #6  0x00007a44cbfc650b in MakeContextCurrent (dpy=0x7a44dee42000, 
     draw=35651644, read=35651644, gc_user=0x7a44b839a000)
     at /usr/xsrc/external/mit/MesaLib.old/dist/src/glx/glxcurrent.c:239
 #7  0x00007a44d79f0f75 in mozilla::gl::GLXLibrary::fMakeCurrent (
     context=<optimized out>, drawable=<optimized out>, 
     display=<optimized out>, this=<optimized out>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLXLibrary.h:68
 #8  mozilla::gl::GLContextGLX::MakeCurrentImpl (this=0x7a44b8312800)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:473
 #9  0x00007a44d79ff586 in mozilla::gl::GLContext::MakeCurrent (
     this=0x7a44b8312800, aForce=<optimized out>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:2440
 #10 0x00007a44d7a2858e in mozilla::gl::GLContext::InitImpl (
     this=0x7a44b8312800)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:372
 #11 mozilla::gl::GLContext::Init (this=this@entry=0x7a44b8312800)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:324
 #12 0x00007a44d79f328f in mozilla::gl::GLContextGLX::Init (this=0x7a44b8312800)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:453
 #13 operator() (__closure=__closure@entry=0x7a44c0eb84b0, attribs=...)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:351
 #14 0x00007a44d79f36f6 in mozilla::gl::GLContextGLX::CreateGLContext (
     desc=..., display=..., drawable=<optimized out>, drawable@entry=35651644, 
     cfg=<optimized out>, cfg@entry=0x7a44cb7b3c00, 
     ownedPixmap=<optimized out>, ownedPixmap@entry=0)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:417
 #15 0x00007a44d79f3a76 in mozilla::gl::CreateForWidget (
     aXDisplay=aXDisplay@entry=0x7a44dee42000, 
     aXWindow=aXWindow@entry=35651644, 
     aHardwareWebRender=aHardwareWebRender@entry=true, 
     aForceAccelerated=<optimized out>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:620
 #16 0x00007a44d79f3c62 in mozilla::gl::CreateForWidget (
     aForceAccelerated=<optimized out>, aHardwareWebRender=<optimized out>, 
     aXWindow=35651644, aXDisplay=0x7a44dee42000)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:634
 #17 mozilla::gl::GLContextProviderGLX::CreateForCompositorWidget (
     aCompositorWidget=<optimized out>, aHardwareWebRender=<optimized out>, 
     aForceAccelerated=<optimized out>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:634
 #18 0x00007a44d79f42b7 in mozilla::gl::GLContextProviderLinux::CreateForCompositorWidget (aCompositorWidget=<optimized out>, 
     aHardwareWebRender=aHardwareWebRender@entry=true, 
     aForceAccelerated=aForceAccelerated@entry=true)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderLinux.cpp:29
 #19 0x00007a44d7ca9d10 in mozilla::wr::RenderCompositorOGL::Create (
     aWidget=..., aError=...)
     at /usr/pkgobj/www/firefox/work/build/dist/include/mozilla/RefPtr.h:280
 #20 0x00007a44d7cbaa17 in mozilla::wr::RenderCompositor::Create (aWidget=..., 
     aError=...)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderCompositor.cpp:223
 #21 0x00007a44d7cbfd50 in mozilla::wr::NewRenderer::Run (this=0x7a44b83b0100, 
     aRenderThread=..., aWindowId=...)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/WebRenderAPI.cpp:71
 #22 0x00007a44d7ca604c in mozilla::wr::RenderThread::RunEvent (
     this=0x7a44cf1d3c00, aWindowId=..., aEvent=...)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderThread.cpp:526
 #23 0x00007a44d7ca122e in mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >&&>::applyImpl<mozilla::wr::RenderThread, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >), StoreCopyPassByConstLRef<mozilla::wr::WrWindowId>, StoreCopyPassByRRef<mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> > >, 0ul, 1ul> (args=..., m=<optimized out>, o=<optimized out>)
     at /usr/pkgobj/www/firefox/work/build/dist/include/nsThreadUtils.h:902
 #24 mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >&&>::apply<mozilla::wr::RenderThread, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >)> (m=<optimized out>, 
     o=<optimized out>, this=<optimized out>)
     at /usr/pkgobj/www/firefox/work/build/dist/include/nsThreadUtils.h:1169
 #25 mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >), true, (mozilla::RunnableKind)0, mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >&&>::Run (
     this=<optimized out>)
     at /usr/pkgobj/www/firefox/work/build/dist/include/nsThreadUtils.h:1216
 #26 0x00007a44d720d552 in nsThread::ProcessNextEvent (this=0x7a44cf12e880, 
     aMayWait=<optimized out>, aResult=0x7a44c0eb8cd7)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThread.cpp:1233
 #27 0x00007a44d7201db1 in NS_ProcessNextEvent (aThread=<optimized out>, 
     aThread@entry=0x7a44cf12e880, aMayWait=aMayWait@entry=true)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThreadUtils.cpp:477
 #28 0x00007a44d76e8d58 in mozilla::ipc::MessagePumpForNonMainThreads::Run (
     this=0x7a44cb5320c0, aDelegate=0x7a44c0eb8d90)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/glue/MessagePump.cpp:330
 #29 0x00007a44d76ad506 in MessageLoop::RunInternal (
     this=0x7a44e1658380 <__stack_chk_guard>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:381
 #30 MessageLoop::RunHandler (this=0x7a44e1658380 <__stack_chk_guard>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:374
 #31 MessageLoop::Run (this=this@entry=0x7a44c0eb8d90)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:356
 #32 0x00007a44d720cd68 in nsThread::ThreadFunc (aArg=<optimized out>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThread.cpp:391
 #33 0x00007a44d32297a6 in _pt_root () from /usr/pkg/lib/nspr/libnspr4.so
 #34 0x00007a44e188e2df in pthread__create_tramp (cookie=0x7a44cb7da400)
     at /usr/src/lib/libpthread/pthread.c:592

 When run from gdb with breakpoint on _mesa_GetError:


 [New LWP 12910 of process 12321]
 u_current_set_context(0x77d0dcbd0080) thread 0x77d0f18b2800 lwp 4678, &_glapi_tls_Context: 0x77d0f04d9028 (value: 0x77d0dcbd0080)
 __glXSetCurrentContext(0x77d0dd38b000) in thread 0x77d0f18b2800 lwp 4678
 [Switching to LWP 4678 of process 12321]

 Thread 60 "Renderer" hit Breakpoint 1, 0x000077d0e9eadb7b in _mesa_GetError ()
    from /usr/X11R7/lib/modules/dri/swrast_dri.so
 (gdb) bt
 #0  0x000077d0e9eadb7b in _mesa_GetError ()
    from /usr/X11R7/lib/modules/dri/swrast_dri.so
 #1  0x000077d0fc827166 in mozilla::gl::GLContext::InitImpl (
     this=0x77d0dd303800)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:526
 #2  0x000077d0fc82859a in mozilla::gl::GLContext::InitImpl (
     this=0x77d0dd303800)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:372
 #3  mozilla::gl::GLContext::Init (this=this@entry=0x77d0dd303800)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContext.cpp:324
 #4  0x000077d0fc7f328f in mozilla::gl::GLContextGLX::Init (this=0x77d0dd303800)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:453
 #5  operator() (__closure=__closure@entry=0x77d0e56c84b0, attribs=...)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:351
 #6  0x000077d0fc7f36f6 in mozilla::gl::GLContextGLX::CreateGLContext (
     desc=..., display=..., drawable=<optimized out>, drawable@entry=35651644, 
     cfg=<optimized out>, cfg@entry=0x77d0f0544c00, 
     ownedPixmap=<optimized out>, ownedPixmap@entry=0)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:417
 #7  0x000077d0fc7f3a76 in mozilla::gl::CreateForWidget (
     aXDisplay=aXDisplay@entry=0x77d10353a000, 
     aXWindow=aXWindow@entry=35651644, 
     aHardwareWebRender=aHardwareWebRender@entry=true, 
     aForceAccelerated=<optimized out>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:620
 #8  0x000077d0fc7f3c62 in mozilla::gl::CreateForWidget (
     aForceAccelerated=<optimized out>, aHardwareWebRender=<optimized out>, 
     aXWindow=35651644, aXDisplay=0x77d10353a000)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:634
 #9  mozilla::gl::GLContextProviderGLX::CreateForCompositorWidget (
     aCompositorWidget=<optimized out>, aHardwareWebRender=<optimized out>, 
     aForceAccelerated=<optimized out>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderGLX.cpp:634
 #10 0x000077d0fc7f42b7 in mozilla::gl::GLContextProviderLinux::CreateForCompositorWidget (aCompositorWidget=<optimized out>, 
     aHardwareWebRender=aHardwareWebRender@entry=true, 
     aForceAccelerated=aForceAccelerated@entry=true)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/gl/GLContextProviderLinux.cpp:29
 #11 0x000077d0fcaa9d10 in mozilla::wr::RenderCompositorOGL::Create (
     aWidget=..., aError=...)
     at /usr/pkgobj/www/firefox/work/build/dist/include/mozilla/RefPtr.h:280
 #12 0x000077d0fcabaa17 in mozilla::wr::RenderCompositor::Create (aWidget=..., 
     aError=...)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderCompositor.cpp:223
 #13 0x000077d0fcabfd50 in mozilla::wr::NewRenderer::Run (this=0x77d0dd3a1100, 
     aRenderThread=..., aWindowId=...)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/WebRenderAPI.cpp:71
 #14 0x000077d0fcaa604c in mozilla::wr::RenderThread::RunEvent (
     this=0x77d0f585ac00, aWindowId=..., aEvent=...)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/gfx/webrender_bindings/RenderThread.cpp:526
 #15 0x000077d0fcaa122e in mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >&&>::applyImpl<mozilla::wr::RenderThread, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >), StoreCopyPassByConstLRef<mozilla::wr::WrWindowId>, StoreCopyPassByRRef<mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> > >, 0ul, 1ul> (args=..., m=<optimized out>, o=<optimized out>)
     at /usr/pkgobj/www/firefox/work/build/dist/include/nsThreadUtils.h:902
 #16 mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >&&>::apply<mozilla::wr::RenderThread, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >)> (m=<optimized out>, 
     o=<optimized out>, this=<optimized out>)
     at /usr/pkgobj/www/firefox/work/build/dist/include/nsThreadUtils.h:1169
 #17 mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >), true, (mozilla::RunnableKind)0, mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent, mozilla::DefaultDelete<mozilla::wr::RendererEvent> >&&>::Run (
     this=<optimized out>)
     at /usr/pkgobj/www/firefox/work/build/dist/include/nsThreadUtils.h:1216
 #18 0x000077d0fc00d552 in nsThread::ProcessNextEvent (this=0x77d0f36a7880, 
     aMayWait=<optimized out>, aResult=0x77d0e56c8cd7)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThread.cpp:1233
 #19 0x000077d0fc001db1 in NS_ProcessNextEvent (aThread=<optimized out>, 
     aThread@entry=0x77d0f36a7880, aMayWait=aMayWait@entry=true)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThreadUtils.cpp:477
 #20 0x000077d0fc4e8d58 in mozilla::ipc::MessagePumpForNonMainThreads::Run (
     this=0x77d0f02b70c0, aDelegate=0x77d0e56c8d90)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/glue/MessagePump.cpp:330
 #21 0x000077d0fc4ad506 in MessageLoop::RunInternal (
     this=0x77d1063d2380 <__stack_chk_guard>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:381
 #22 MessageLoop::RunHandler (this=0x77d1063d2380 <__stack_chk_guard>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:374
 #23 MessageLoop::Run (this=this@entry=0x77d0e56c8d90)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/ipc/chromium/src/base/message_loop.cc:356
 #24 0x000077d0fc00cd68 in nsThread::ThreadFunc (aArg=<optimized out>)
     at /usr/pkgobj/www/firefox/work/firefox-112.0.1/xpcom/threads/nsThread.cpp:391
 #25 0x000077d0f80297a6 in _pt_root () from /usr/pkg/lib/nspr/libnspr4.so
 #26 0x000077d1066082df in pthread__create_tramp (cookie=0x77d0f18b2800)
     at /usr/src/lib/libpthread/pthread.c:592


From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: pkg/57445: firefox crashes on startup
Date: Thu, 1 Jun 2023 00:28:00 +0000

 I pushed some changes to add rtld debug messages about tls:

 https://mail-index.netbsd.org/source-changes/2023/05/31/msg145024.html

 Can you:

 1. recompile ld.elf_so with -DDEBUG -DRTLD_DEBUG_RELOC (not
    -DRTLD_DEBUG) and the attached patch (same as at
    https://mail-index.netbsd.org/pkgsrc-users/2023/05/31/msg037402.html);

 2. save output of `LD_DEBUG=1 firefox' with your patches (it will be
    verrrrrry verbose);

 3. share the result of grepping that output for the combination of
    (a) your debug messages, and
    (b) the string ` tls '; and

 4. share the output of `readelf -ld' and `readelf -r | grep tls' on:
    (a) firefox (the actual executable),
    (b) libxul.so or whatever it is,
    (c) libGL.so,
    (d) libEGL.so, and
    (e) libglapi.so?

 Attachment: rtldtlshack.patch

 diff --git a/libexec/ld.elf_so/arch/x86_64/mdreloc.c b/libexec/ld.elf_so/arch/x86_64/mdreloc.c
 index a04c05ea0aa7..adbab16003a7 100644
 --- a/libexec/ld.elf_so/arch/x86_64/mdreloc.c
 +++ b/libexec/ld.elf_so/arch/x86_64/mdreloc.c
 @@ -227,7 +227,7 @@ _rtld_relocate_nonplt_objects(Obj_Entry *obj)
  
  		case R_TYPE(TPOFF64):
  			if (!defobj->tls_done &&
 -			    _rtld_tls_offset_allocate(obj))
 +			    _rtld_tls_offset_allocate(__UNCONST(defobj)))
  				return -1;
  
  			*where64 = (Elf64_Addr)(def->st_value -

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: pkg/57445: firefox crashes on startup
Date: Thu, 1 Jun 2023 07:14:04 +0200

 The requested output is quite huge, so I put it here:

 	https://www.netbsd.org/~martin/rtld_tls_debug.tar.bz2

 Martin

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Martin Husemann <martin@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, joerg@NetBSD.org
Subject: Re: pkg/57445: firefox crashes on startup
Date: Thu, 1 Jun 2023 11:54:00 +0000

 Here's the relevant excerpts from the log:

 search object 0x70ee11e74400 (/usr/X11R7/lib/libglapi.so.1) for _glapi_tls_Context
 check "_glapi_tls_Context" vs "_glapi_tls_Context" in /usr/X11R7/lib/libglapi.so.1
 /usr/X11R7/lib/libglapi.so.1: static tls offset 0x1840 size 16
 TPOFF64 _glapi_tls_Context in /usr/X11R7/lib/libglapi.so.1 --> 0xffffffffffffe7c0
 ...
 search object 0x70ee11e75000 (/usr/X11R7/lib/libGL.so) for _glapi_tls_Context
 check "_glapi_tls_Context" vs "glMultiTexCoord3ivARB" in /usr/X11R7/lib/libGL.so
 check "_glapi_tls_Context" vs "_glapi_tls_Context" in /usr/X11R7/lib/libGL.so
 TPOFF64 _glapi_tls_Context in /usr/X11R7/lib/modules/dri/swrast_dri.so --> 0xffffffffffffe7a8
 ...
 obj 0x70ee11e74400 dtv 0x70ee1135e030 tlsoffset 6208
 obj 0x70ee11e75000 dtv 0x70ee03fae018 tlsoffset 6232
 ...
 u_current_set_context(0x70ee03e57dc0) thread 0x70ee11411400 lwp 1752, &_glapi_tls_Context: 0x70ee1135e030 (value: 0x70ee03e57dc0)
 &_glapi_tls_Context: 0x70ee1135e018 (value: 0x0)

 So there are two definitions of the symbol _glapi_tls_Context, one in
 libglapi and one in libGL.

 - On initial process execution, ld.elf_so resolves a reference to
   _glapi_tls_Context in libglapi to the definition in libglapi, at tls
   offset 6208 = 0x1840, which in lwp 1752 is at 0x70ee1135e030.

   This is what is used by u_current_set_context in libglapi, in
   xsrc/external/mit/MesaLib.old/dist/src/mapi/u_current.c, defined in
   the same file.

 - On dlopen of swrast_dri.so, however, ld.elf_so resolves a reference
   to _glapi_tls_Context in swrast_dri.so to the definition in libGL,
   at tls offset 6232 = 0x1858, which in lwp 1752 is at 0x70ee1135e018.

   This is what is used by _mesa_GetError in swrast_dri.so, in
   xsrc/external/mit/MesaLib.old/dist/src/mesa/main/getstring.c,
   defined in xsrc/external/mit/MesaLib.old/dist/src/glx/glxcurrent.c.

 (Note: static tls offsets are negative, which is why a higher tls
 offset leads to a lower virtual address in the same thread's tcb.)

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Martin Husemann <martin@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, joerg@NetBSD.org, tnn@NetBSD.org
Subject: Re: pkg/57445: firefox crashes on startup
Date: Thu, 1 Jun 2023 12:53:08 +0000

 Martin: Next thing to try:

 1. apply the ld.elf_so patch I gave you, AND
 2. remove the MASSIVE KLUDGE from
    xsrc/external/mit/MesaLib.old/dist/src/glx/glxcurrent.c and rebuild
    libGL.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: 
Cc: Martin Husemann <martin@NetBSD.org>, gnats-bugs@NetBSD.org,
	joerg@NetBSD.org, tnn@NetBSD.org
Subject: Re: pkg/57445: firefox crashes on startup
Date: Thu, 1 Jun 2023 14:31:30 +0000

 > Date: Thu, 1 Jun 2023 12:46:25 +0000
 > From: Taylor R Campbell <riastradh@NetBSD.org>
 > 
 > Here's a reproducer.  I think I got everything relevant here, but the
 > excerpts I asked martin for don't tell me the exact order or arguments
 > of dlopens that firefox issues, so I guessed.
 > 
 > So, I think the massive kludge of `just define the symbol in multiple
 > libraries without extern' probably never actually worked, but maybe
 > did paper over the symptoms in the past.
 > 
 > $ head -40 glapi.c GL.c swrast_dri.c firefox.c
 > ==> glapi.c <==
 > __thread int _glapi_tls_Context __attribute__((tls_model("initial-exec"))) = 0;
 > [...]
 > ==> GL.c <==
 > __thread int _glapi_tls_Context __attribute__((tls_model("initial-exec"))) = 0;
 > [...]
 > $ ./firefox
 > GL 0x794c549be044
 > GL glapi 0x794c549be044
 > dri 0x794c549bf850
 > dri glapi 0x794c549be044

 If I change GL.c to use extern, this is what I get:

 $ ./firefox
 GL 0x7ca9ba272850
 GL glapi 0x7ca9ba271044
 dri 0x7ca9ba271044
 dri glapi 0x7ca9ba271044

 So, this would explain why the duplicate definition -- instead of
 using extern -- seemed to solve the problem in the past:

 1. When libglapi defines _glapi_tls_Context and libGL declares it
    extern, they disagree on it.

 2. When libglapi and libGL both define _glapi_tls_Context, they agree
    on it.

 But the problem is that in case (2), swrast_dri.so disagrees with both
 of them -- and I observe this whether swrast_dri.so has a third
 definition of _glapi_tls_Context or another extern declaration.

 If I put:

 - definition in libglapi
 - extern in libGL
 - extern in swrast_dri.so

 and run it with the patched ld.elf_so, it works:

 $ ./firefox
 GL 0x7d6bb813103c
 GL glapi 0x7d6bb813103c
 dri 0x7d6bb813103c
 dri glapi 0x7d6bb813103c

 So I think that fixing ld.elf_so to resolve the tls offset in the
 defining object rather than the referencing object (which will take a
 bit of work to disentangle all the const, or to resolve the tls offset
 earlier), and having a single definition with only extern declarations
 (`removing the massive kludge'), may fix the issue once and for all.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Martin Husemann <martin@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, joerg@NetBSD.org, tnn@NetBSD.org
Subject: Re: pkg/57445: firefox crashes on startup
Date: Thu, 1 Jun 2023 12:46:25 +0000

 Here's a reproducer.  I think I got everything relevant here, but the
 excerpts I asked martin for don't tell me the exact order or arguments
 of dlopens that firefox issues, so I guessed.

 So, I think the massive kludge of `just define the symbol in multiple
 libraries without extern' probably never actually worked, but maybe
 did paper over the symptoms in the past.


 $ head -40 glapi.c GL.c swrast_dri.c firefox.c
 ==> glapi.c <==
 __thread int _glapi_tls_Context __attribute__((tls_model("initial-exec"))) = 0;

 int *glapi_context(void);
 int *
 glapi_context(void)
 {
 	return &_glapi_tls_Context;
 }

 ==> GL.c <==
 __thread int _glapi_tls_Context __attribute__((tls_model("initial-exec"))) = 0;

 int *GL_context(void);
 int *
 GL_context(void)
 {
 	return &_glapi_tls_Context;
 }

 ==> swrast_dri.c <==
 extern __thread int _glapi_tls_Context __attribute__((tls_model("initial-exec")));

 int *dri_context(void);
 int *
 dri_context(void)
 {
 	return &_glapi_tls_Context;
 }

 ==> firefox.c <==
 #include <dlfcn.h>
 #include <err.h>
 #include <stdio.h>

 int *GL_context(void);
 int *glapi_context(void);
 int *dri_context(void);

 #define	DL(x)	do							      \
 {									      \
 	if ((x) == NULL)						      \
 		errx(1, "%s: %s", #x, dlerror());			      \
 } while (0)

 int
 main(void)
 {
 	void *GL = NULL;
 	void *dri = NULL;
 	int *(*GL_context)(void);
 	int *(*GL_glapi_context)(void);
 	int *(*dri_context)(void);
 	int *(*dri_glapi_context)(void);

 	DL(GL = dlopen("./libGL.so", 0));
 	DL(dri = dlopen("./swrast_dri.so", 0));

 	DL(GL_context = dlsym(GL, "GL_context"));
 	DL(GL_glapi_context = dlsym(GL, "glapi_context"));
 	DL(dri_context = dlsym(dri, "dri_context"));
 	DL(dri_glapi_context = dlsym(dri, "glapi_context"));

 	printf("GL %p\n", GL_context());
 	printf("GL glapi %p\n", GL_glapi_context());
 	printf("dri %p\n", dri_context());
 	printf("dri glapi %p\n", dri_glapi_context());
 	fflush(stdout);
 	return ferror(stdout);
 }
 $ make
 cc -fPIC   -c firefox.c
 cc -o firefox firefox.o -L. -Wl,-R$(pwd)
 cc -fPIC   -c GL.c
 cc -fPIC   -c glapi.c
 cc -o libglapi.so -shared glapi.o -L. -Wl,-R$(pwd)
 cc -o libGL.so -shared GL.o -L. -Wl,-R$(pwd) -lglapi
 cc -fPIC   -c swrast_dri.c
 cc -o swrast_dri.so -shared swrast_dri.o -L. -Wl,-R$(pwd) -lglapi
 $ ./firefox
 GL 0x794c549be044
 GL glapi 0x794c549be044
 dri 0x794c549bf850
 dri glapi 0x794c549be044
 $ LD_PRELOAD=$(pwd)/libGL.so ./firefox
 GL 0x74f852ef885c
 GL glapi 0x74f852ef885c
 dri 0x74f852ef885c
 dri glapi 0x74f852ef885c
 $

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Martin Husemann <martin@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, joerg@NetBSD.org, tnn@NetBSD.org
Subject: Re: pkg/57445: firefox crashes on startup
Date: Thu, 1 Jun 2023 12:21:35 +0000

 martin: Can you get the ld.elf_so debug output _without_ the ld.elf_so
 patch that tries to fix the underlying extern __thread problem?
 Curious to see how it compares.


 So here are the relevant definitions and declarations:


 - libglapi:

 /* xsrc/external/mit/MesaLib.old/dist/src/mapi/u_current.h */
 #define u_current_context _glapi_tls_Context

 /* xsrc/external/mit/MesaLib.old/dist/src/mapi/u_current.c */
 __thread void *u_current_context
     __attribute__((tls_model("initial-exec")));


 - libGL (which is built with -DGLX_USE_TLS):

 /* xsrc/external/mit/MesaLib.old/dist/src/glx/glxcurrent.c */
 /*
  * MASSIVE KLUDGE!
  * We need these to not be extern in libGL.so because of
  * PR toolchain/50277
  */
 #if defined(GLX_USE_TLS) && defined(__NetBSD__)
 _X_EXPORT __thread struct _glapi_table * _glapi_tls_Dispatch
     __attribute__((tls_model("initial-exec"))) = NULL;
 _X_EXPORT __thread void * _glapi_tls_Context
     __attribute__((tls_model("initial-exec")));
 #endif


 - swrast_dri.so:

 /* xsrc/external/mit/MesaLib.old/dist/src/mapi/glapi/glapi.h */

 _GLAPI_EXPORT extern __thread void * _glapi_tls_Context
     __attribute__((tls_model("initial-exec")));


 This MASSIVE KLUDGE -- did it ever actually work or did it only paper
 over a symptom, or did it try to work around two problems (static TLS
 symbol resolution is broken, _and_ static TLS initialization must be
 zero) and only work around one of them while papering over the other?
 Do we have a test case showing that it actually worked?  Was there
 some precondition in the original massive kludge that made it
 applicable at the time, but no longer?

 Note: The original massive kludge was applied to pkgsrc in 2015 in the
 update to mesa-11.0.0, then merged into xsrc in 2019, then updated in
 pkgsrc in early 2020; then TLS disabled altogether in pkgsrc (except
 on linux/glibc) shortly after in 2020 as it has remained since.

 https://mail-index.netbsd.org/pkgsrc-users/2015/09/11/msg022180.html
 https://mail-index.netbsd.org/pkgsrc-changes/2015/09/26/msg130441.html
 https://mail-index.netbsd.org/source-changes/2019/04/09/msg104932.html
 https://mail-index.netbsd.org/pkgsrc-changes/2020/02/19/msg206832.html
 https://mail-index.netbsd.org/pkgsrc-changes/2020/04/09/msg210367.html

From: "Joerg Sonnenberger" <joerg@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57445 CVS commit: src
Date: Sun, 4 Jun 2023 01:24:59 +0000

 Module Name:	src
 Committed By:	joerg
 Date:		Sun Jun  4 01:24:58 UTC 2023

 Modified Files:
 	src/libexec/ld.elf_so: README.TLS map_object.c rtld.h tls.c
 	src/libexec/ld.elf_so/arch/aarch64: mdreloc.c
 	src/libexec/ld.elf_so/arch/alpha: alpha_reloc.c
 	src/libexec/ld.elf_so/arch/arm: mdreloc.c
 	src/libexec/ld.elf_so/arch/hppa: hppa_reloc.c
 	src/libexec/ld.elf_so/arch/i386: mdreloc.c
 	src/libexec/ld.elf_so/arch/m68k: mdreloc.c
 	src/libexec/ld.elf_so/arch/mips: mips_reloc.c
 	src/libexec/ld.elf_so/arch/or1k: mdreloc.c
 	src/libexec/ld.elf_so/arch/powerpc: ppc_reloc.c
 	src/libexec/ld.elf_so/arch/riscv: mdreloc.c
 	src/libexec/ld.elf_so/arch/sh3: mdreloc.c
 	src/libexec/ld.elf_so/arch/sparc: mdreloc.c
 	src/libexec/ld.elf_so/arch/sparc64: mdreloc.c
 	src/libexec/ld.elf_so/arch/x86_64: mdreloc.c
 	src/tests/libexec/ld.elf_so: t_tls_extern.c

 Log Message:
 Fix interactions of initial-exec TLS model and dlopen

 (1) If an initial-exec relocation was used for a non-local symbol
 (i.e. the definition of the symbol is in a different DSO), the
 computation of the static TLS offset used the wrong DSO.
 This would effectively mean the wrong address was computed
 (PR toolchain/50277, PR pkg/57445).

 Fix this by forcing the computation of the correct DSO (the one defining
 the symbol).

 This code uses __UNCONST to avoid the vast interface changes for this
 special case.

 (2) If symbols from a DSO loaded via dlopen are used with both
 global-dynamic/local-dynamic and initial-exec relocations AND
 an initial-exec relocation was resolved first in a thread, a split brain
 situation could exist where the dynamic relocations would use one memory
 block (separate allocation) and the initial-exec relocations the static
 per-thread TLS space.

 (3) If the initial-exec relocation in (2) is seen after any thread has
 already used a GD/LD allocation, bail out. Since IE relocations are used
 only in the GOT, this will prevent the dlopen. This is a bit more
 aggressive than necessary, but full-blown reference counting doesn't
 seem to be justified.


 To generate a diff of this commit:
 cvs rdiff -u -r1.5 -r1.6 src/libexec/ld.elf_so/README.TLS
 cvs rdiff -u -r1.66 -r1.67 src/libexec/ld.elf_so/map_object.c
 cvs rdiff -u -r1.145 -r1.146 src/libexec/ld.elf_so/rtld.h
 cvs rdiff -u -r1.17 -r1.18 src/libexec/ld.elf_so/tls.c
 cvs rdiff -u -r1.17 -r1.18 src/libexec/ld.elf_so/arch/aarch64/mdreloc.c
 cvs rdiff -u -r1.43 -r1.44 src/libexec/ld.elf_so/arch/alpha/alpha_reloc.c
 cvs rdiff -u -r1.45 -r1.46 src/libexec/ld.elf_so/arch/arm/mdreloc.c
 cvs rdiff -u -r1.49 -r1.50 src/libexec/ld.elf_so/arch/hppa/hppa_reloc.c
 cvs rdiff -u -r1.41 -r1.42 src/libexec/ld.elf_so/arch/i386/mdreloc.c
 cvs rdiff -u -r1.33 -r1.34 src/libexec/ld.elf_so/arch/m68k/mdreloc.c
 cvs rdiff -u -r1.74 -r1.75 src/libexec/ld.elf_so/arch/mips/mips_reloc.c
 cvs rdiff -u -r1.3 -r1.4 src/libexec/ld.elf_so/arch/or1k/mdreloc.c
 cvs rdiff -u -r1.62 -r1.63 src/libexec/ld.elf_so/arch/powerpc/ppc_reloc.c
 cvs rdiff -u -r1.8 -r1.9 src/libexec/ld.elf_so/arch/riscv/mdreloc.c
 cvs rdiff -u -r1.35 -r1.36 src/libexec/ld.elf_so/arch/sh3/mdreloc.c
 cvs rdiff -u -r1.56 -r1.57 src/libexec/ld.elf_so/arch/sparc/mdreloc.c
 cvs rdiff -u -r1.69 -r1.70 src/libexec/ld.elf_so/arch/sparc64/mdreloc.c
 cvs rdiff -u -r1.47 -r1.48 src/libexec/ld.elf_so/arch/x86_64/mdreloc.c
 cvs rdiff -u -r1.11 -r1.12 src/tests/libexec/ld.elf_so/t_tls_extern.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sun, 04 Jun 2023 11:56:02 +0000
State-Changed-Why:
candidate fix committed; test, please!


From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57445 CVS commit: src/libexec/ld.elf_so
Date: Sun, 4 Jun 2023 23:42:38 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sun Jun  4 23:42:38 UTC 2023

 Modified Files:
 	src/libexec/ld.elf_so: rtld.c

 Log Message:
 ld.elf_so: Sprinkle more debug messages on dlopen and error.

 PR pkg/57445


 To generate a diff of this commit:
 cvs rdiff -u -r1.213 -r1.214 src/libexec/ld.elf_so/rtld.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: PR/57445 CVS commit: src/libexec/ld.elf_so
Date: Mon, 5 Jun 2023 09:28:12 +0200

 On Sun, Jun 04, 2023 at 11:45:02PM +0000, Taylor R Campbell wrote:
 >  ld.elf_so: Sprinkle more debug messages on dlopen and error.

 Not sure what happened, but after updating ld.elf_so to that version
 I cannot reproduce the problem anymore.

 Martin

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: PR/57445 CVS commit: src/libexec/ld.elf_so
Date: Mon, 5 Jun 2023 09:30:43 +0200

 However, webgl does not work with this version, it complains:

 JavaScript warning: , line 0: DispatchCommand(id: 52) failed. Please file a bug!
 JavaScript warning: , line 0: WebGL context was lost.

 Martin

Responsible-Changed-From-To: pkg-manager->riastradh
Responsible-Changed-By: riastradh@NetBSD.org
Responsible-Changed-When: Sat, 13 Apr 2024 12:55:15 +0000
Responsible-Changed-Why:


State-Changed-From-To: feedback->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sat, 13 Apr 2024 12:55:15 +0000
State-Changed-Why:
crash fixed by static tls fixes
webgl issues can be another PR


>Unformatted:
