NetBSD Problem Report #42158
From www@NetBSD.org Tue Oct 6 18:12:03 2009
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id E3A1B63C37A
for <gnats-bugs@gnats.netbsd.org>; Tue, 6 Oct 2009 18:12:03 +0000 (UTC)
Message-Id: <20091006181203.B448D63B8B6@www.NetBSD.org>
Date: Tue, 6 Oct 2009 18:12:03 +0000 (UTC)
From: pooka@iki.fi
Reply-To: pooka@iki.fi
To: gnats-bugs@NetBSD.org
Subject: qemu: pthread + fork = hang
X-Send-Pr-Version: www-1.0
>Number: 42158
>Category: pkg
>Synopsis: qemu: pthread + fork = hang
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: bin-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Oct 06 18:15:00 +0000 2009
>Closed-Date: Mon Apr 26 06:49:15 +0000 2010
>Last-Modified: Mon Apr 26 06:49:15 +0000 2010
>Originator: Antti Kantee
>Release: 5.99.19
>Organization:
>Environment:
>Description:
A program which executes fork() and is linked against libpthread
hangs when run in qemu. This does not happen on normal hardware.
It is unclear if this is a libpthread bug, kernel bug or qemu bug
(please reclassify if it becomes apparent).
>How-To-Repeat:
#include <unistd.h>
int
main()
{
	fork();
}
cc test.c -lpthread
./a.out
>Fix:
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->analyzed
State-Changed-By: pooka@NetBSD.org
State-Changed-When: Fri, 09 Oct 2009 00:08:18 +0300
State-Changed-Why:
I analyzed this a bit:
(gdb) bt
#0 0xbbbdfe00 in _atomic_cas_ptr (ptr=0xbbbd4060, old=0x0, new=0xbfa00000)
at /root/lib/libpthread/arch/i386/pthread_md.h:104
#1 0xbbbdfd7b in pthread_mutex_lock (ptm=0xbbbd4054) at pthread_mutex.c:156
#2 0xbbb7efb5 in __cxa_finalize () from /usr/lib/libc.so.12
#3 0xbbb7ef76 in exit () from /usr/lib/libc.so.12
#4 0x080484f6 in ___start ()
#5 0x08048447 in _start ()
(gdb) x/i $eip
0xbbbdfdfc <_atomic_cas_ptr+28>: lock cmpxchg %ecx,(%esi)
(gdb) info register
eax 0x0 0
ecx 0xbfa00000 -1080033280
edx 0xbbbd4060 -1145225120
ebx 0xbbbe7144 -1145147068
esp 0xbfbfeaa8 0xbfbfeaa8
ebp 0xbfbfeb18 0xbfbfeb18
esi 0xbbbd4060 -1145225120
edi 0xbfbfeb6c -1077941396
eip 0xbbbdfdfc 0xbbbdfdfc <_atomic_cas_ptr+28>
eflags 0x382 [ SF TF IF ]
cs 0x17 23
ss 0x1f 31
ds 0x1f 31
es 0x1f 31
fs 0x0 0
gs 0x0 0
(gdb) print *(int *)0xbbbd4060
$2 = 2
(gdb) stepi
0xbbbdfe00 104 __asm __volatile ("lock; cmpxchgl %2, %1"
(gdb) info register
eax 0x2 2
ecx 0xbfa00000 -1080033280
edx 0xbbbd4060 -1145225120
ebx 0xbbbe7144 -1145147068
esp 0xbfbfeaa8 0xbfbfeaa8
ebp 0xbfbfeb18 0xbfbfeb18
esi 0xbbbd4060 -1145225120
edi 0xbfbfeb6c -1077941396
eip 0xbbbdfe00 0xbbbdfe00 <_atomic_cas_ptr+32>
eflags 0x346 [ PF ZF TF IF ]
cs 0x17 23
ss 0x1f 31
ds 0x1f 31
es 0x1f 31
fs 0x0 0
gs 0x0 0
(gdb)
mutex_enter(&atexit_mutex) in __cxa_finalize() goes totally bonkers.
The atomic cas in the pthread mutex enter fastpath is successful in
that it actually swaps the values. Still, it returns the old value
of the mutex owner (which for a free recursive mutex is 0x2), so the
caller treats the cas as failed. This leads to an eventual spin in
the slow path, since the owner is now us and we are running.
Seems like cmpxchg fails to work properly in qemu in some circumstances.
Dunno if this is a qemu bug or NetBSD bug. Note that ZF is set
after the instruction, so chompchomp really thinks it was doing
the right thing.
Also, it seems that usually this happens in the parent. However,
if the child exits before the parent, both processes fail in this
way.
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: pkg/42158: qemu: pthread + fork = hang
Date: Sun, 25 Apr 2010 15:23:41 +0300
I have reclassified this as category pkg because it's definitely a
qemu bug.
In a physical i386 CPU, the cmpxchg instruction performs a comparison
and read-modify-write memory cycle. In the case where the comparison
outcome is "unequal", the read-modify-write cycle is an effective
no-op, writing back the same value that was read, and the value of the
source operand is loaded into the accumulator. Qemu attempts to
emulate this behavior including the redundant memory write.
To be precise, qemu first loads the accumulator and then does the
redundant memory write. If a page fault occurs during the write, the
cmpxchg instruction will be restarted after handling the page fault,
but because the accumulator has already been changed, the comparison
will now incorrectly yield a result of "equal", causing the memory
write to write the value from the source operand instead of re-writing
the original memory contents.
I assume fork() triggers the bug because it write protects pages to
implement copy-on-write, thereby producing a situation where the read
part of the cmpxchg read-modify-write cycle succeeds but the write
part causes a page fault.
Patching qemu to only change the accumulator after performing the
redundant write fixes the problem for me. I will commit my patch to
pkgsrc and report the problem upstream.
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/42158 CVS commit: pkgsrc/emulators/qemu
Date: Sun, 25 Apr 2010 12:55:41 +0000
Module Name: pkgsrc
Committed By: gson
Date: Sun Apr 25 12:55:41 UTC 2010
Modified Files:
pkgsrc/emulators/qemu: Makefile distinfo
Added Files:
pkgsrc/emulators/qemu/patches: patch-ed
Log Message:
Correct emulation of i386 cmpxchg instruction in the case where the
comparison outcome is unequal and the memory write causes a page
fault. Fixes PR pkg/42158.
To generate a diff of this commit:
cvs rdiff -u -r1.65 -r1.66 pkgsrc/emulators/qemu/Makefile
cvs rdiff -u -r1.53 -r1.54 pkgsrc/emulators/qemu/distinfo
cvs rdiff -u -r0 -r1.1 pkgsrc/emulators/qemu/patches/patch-ed
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: analyzed->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Mon, 26 Apr 2010 06:49:15 +0000
State-Changed-Why:
Fix committed yesterday.
>Unformatted: