NetBSD Problem Report #44756
From www@NetBSD.org Tue Mar 22 20:59:09 2011
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id A123263B8E3
for <gnats-bugs@gnats.NetBSD.org>; Tue, 22 Mar 2011 20:59:09 +0000 (UTC)
Message-Id: <20110322205908.721A763B873@www.NetBSD.org>
Date: Tue, 22 Mar 2011 20:59:08 +0000 (UTC)
From: cryintothebluesky@googlemail.com
Reply-To: cryintothebluesky@googlemail.com
To: gnats-bugs@NetBSD.org
Subject: pthread_cond_timedwait() sometimes returns error code 3 (ESRCH)
X-Send-Pr-Version: www-1.0
>Number: 44756
>Category: lib
>Synopsis: pthread_cond_timedwait() sometimes returns error code 3 (ESRCH)
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: lib-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Mar 22 21:00:01 +0000 2011
>Closed-Date: Fri Feb 13 22:37:57 +0000 2015
>Last-Modified: Fri Feb 13 22:37:57 +0000 2015
>Originator: Sad Clouds
>Release: 5.1
>Organization:
>Environment:
NetBSD atom 5.1_STABLE NetBSD 5.1_STABLE (GENERIC) #6: Tue Mar 15 12:39:43 GMT 2011 roman@atom:/opt/obj.amd64/sys/arch/amd64/compile/GENERIC amd64
>Description:
pthread_cond_timedwait() sometimes returns error code 3 (ESRCH), which should never occur. The problem is most likely to show up when threads that call pthread_cond_timedwait() are created and exit in a loop.
>How-To-Repeat:
The following test program demonstrates the problem. The error may take a while to appear, but sooner or later it occurs reproducibly, e.g.:
atom$ gcc -O tmp.c -lpthread
atom$ ./a.out
pthread_cond_timedwait returned = 3
/* Full include list, as given in the submitter's follow-up below. */
#include <pthread.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

void *thread_func(void *arg)
{
    struct timespec ts;
    int int_ret;
    int i = 0;

    while (1)
    {
        /* Each thread exits after 10000 timed waits. */
        if (i++ >= 10000)
            pthread_exit(NULL);

        if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
            abort();

        /*
        Set to 1 second in the past, this should make pthread_cond_timedwait()
        return immediately with ETIMEDOUT.
        */
        ts.tv_sec -= 1;

        if (pthread_mutex_lock(&mutex) != 0)
            abort();

        int_ret = pthread_cond_timedwait(&cond, &mutex, &ts);
        if (int_ret != 0 && int_ret != ETIMEDOUT)
        {
            /*
            Sometimes pthread_cond_timedwait() returns 3.
            This should never happen.
            */
            printf("pthread_cond_timedwait returned = %d\n", int_ret);
            exit(1);
        }

        if (pthread_mutex_unlock(&mutex) != 0)
            abort();
    }
}

int main(void)
{
    pthread_t tid[64];
    int i;

    /* Create and join 64 threads at a time, forever. */
    while (1)
    {
        for (i = 0; i < 64; i++)
        {
            if (pthread_create(&tid[i], NULL, thread_func, NULL) != 0)
            {
                abort();
            }
        }
        for (i = 0; i < 64; i++)
        {
            if (pthread_join(tid[i], NULL) != 0)
                abort();
        }
    }
    pthread_exit(NULL);
}
>Fix:
>Release-Note:
>Audit-Trail:
From: Sad Clouds <cryintothebluesky@googlemail.com>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org
Subject: Re: lib/44756: pthread_cond_timedwait() sometimes returns error
code 3 (ESRCH)
Date: Tue, 22 Mar 2011 21:11:26 +0000
It didn't paste the first few #include lines properly. All the #include
lines should be:
#include <pthread.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
From: "Jukka Ruohonen" <jruoho@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44756 CVS commit: src/tests/lib/libpthread
Date: Sun, 27 Mar 2011 16:45:16 +0000
Module Name: src
Committed By: jruoho
Date: Sun Mar 27 16:45:16 UTC 2011
Modified Files:
src/tests/lib/libpthread: t_cond.c
Log Message:
Add a test case for pthread_cond_timedwait(3) failures reported by
Sad Clouds in PR lib/44756. This was discussed also in:
http://mail-index.netbsd.org/tech-userlevel/2011/03/17/msg004689.html
To generate a diff of this commit:
cvs rdiff -u -r1.2 -r1.3 src/tests/lib/libpthread/t_cond.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44756 CVS commit: src/lib/libpthread
Date: Fri, 31 Jan 2014 14:22:00 -0500
Module Name: src
Committed By: christos
Date: Fri Jan 31 19:22:00 UTC 2014
Modified Files:
src/lib/libpthread: pthread_cond.c pthread_mutex.c
Log Message:
PR/44756: Sad Clouds: Prevent leakage of errno = ESRCH from _lwp_park. This
has two parts:
- in pthread_cond_timedwait() if the thread we are trying to unpark
exited, retry the _lwp_park call without it.
- pthread_mutex() was affecting errno since it is calling _lwp_park()
from pthread_mutex_lock_slow(). Preserve the original errno.
Note that the example problem still causes an occasional deadlock on machines
with many CPUs and it is the same deadlock we observe with named.
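The two parts of this change can be illustrated with a small sketch. This is not the code from pthread_cond.c or pthread_mutex.c, which the report does not show. fake_park() is a hypothetical stand-in whose signature does not match the real _lwp_park(), and the other function names are made up as well. The sketch assumes only what the log message states: the park call can fail with ESRCH when the thread to be unparked has exited, and a slow-path lock helper can overwrite errno.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the park primitive (NOT the real
 * _lwp_park() signature). It fails with ESRCH when asked to unpark a
 * thread that no longer exists, and otherwise "times out".
 */
static int fake_park(bool have_unpark_target, bool target_alive)
{
    if (have_unpark_target && !target_alive)
    {
        errno = ESRCH;
        return -1;
    }
    errno = ETIMEDOUT;
    return -1;
}

/* Part 1: if the unpark target has exited, retry the park without it. */
static int park_with_retry(bool target_alive)
{
    bool have_target = true;
    int error;

    for (;;)
    {
        if (fake_park(have_target, target_alive) == 0)
            return 0;
        error = errno;
        if (error == ESRCH && have_target)
        {
            have_target = false;    /* drop the stale target and retry */
            continue;
        }
        return error;    /* ETIMEDOUT in this sketch, never ESRCH */
    }
}

/* Part 2: a slow-path helper that parks must leave errno untouched. */
static void lock_slow_preserving_errno(void)
{
    int saved = errno;

    (void)fake_park(true, false);    /* sets errno to ESRCH */
    errno = saved;                   /* caller sees its original errno */
}
```

In this model the caller of park_with_retry() only ever sees the timeout, and code running after lock_slow_preserving_errno() cannot pick up a stale ESRCH from errno.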
To generate a diff of this commit:
cvs rdiff -u -r1.61 -r1.62 src/lib/libpthread/pthread_cond.c
cvs rdiff -u -r1.56 -r1.57 src/lib/libpthread/pthread_mutex.c
From: "Mindaugas Rasiukevicius" <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44756 CVS commit: src/lib/libpthread
Date: Mon, 3 Feb 2014 15:51:01 +0000
Module Name: src
Committed By: rmind
Date: Mon Feb 3 15:51:01 UTC 2014
Modified Files:
src/lib/libpthread: pthread_mutex.c
Log Message:
pthread__mutex_lock_slow: fix the handling of a potential race with the
non-interlocked CAS in the fast unlock path -- it is unsafe to test for
the waiters-bit while the owner thread is running, we have to spin for
the owner or its state change to be sure about the presence of the bit.
Split off the logic into the pthread__mutex_setwaiters() routine.
This is a partial fix to the named lockup problem (also see PR/44756).
It seems there is another race which can be reproduced on faster CPUs.
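As background for the log message above, here is a rough sketch of the basic operation involved: marking a mutex owner word as "has waiters" with a compare-and-swap loop. It is not the pthread__mutex_setwaiters() routine from the commit. The word layout (waiters flag in the low bit), the name set_waiters() and the use of C11 atomics are all assumptions made for illustration. The step the commit describes, spinning until the owner stops running or its state changes, needs scheduler information and is not modeled here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: owner identity in the upper bits, waiters flag in bit 0. */
#define WAITERS_BIT ((uintptr_t)1)

/*
 * Try to set the waiters bit on a locked mutex word.
 * Returns true if the bit is set on return, false if the mutex was
 * found unlocked, in which case the caller should retry the lock
 * instead of going to sleep.
 */
static bool set_waiters(_Atomic uintptr_t *owner_word)
{
    uintptr_t cur = atomic_load(owner_word);

    for (;;)
    {
        if ((cur & ~WAITERS_BIT) == 0)
            return false;    /* no owner: do not sleep on it */
        if ((cur & WAITERS_BIT) != 0)
            return true;     /* another waiter already set the bit */
        if (atomic_compare_exchange_weak(owner_word, &cur,
            cur | WAITERS_BIT))
            return true;
        /* CAS failed: cur now holds the current value, so re-check. */
    }
}
```

The race in the log message arises because the unlock fast path uses a non-interlocked CAS, so a waiter that only tests the bit once, as this sketch does, may miss a concurrent release by a running owner.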
To generate a diff of this commit:
cvs rdiff -u -r1.58 -r1.59 src/lib/libpthread/pthread_mutex.c
From: "Jeremy C. Reed" <reed@reedmedia.net>
To: gnats-bugs@NetBSD.org
Cc: rmind@netbsd.org
Subject: Re: PR/44756 CVS commit: src/lib/libpthread
Date: Tue, 11 Feb 2014 13:01:18 -0600 (CST)
On Mon, 3 Feb 2014, Mindaugas Rasiukevicius wrote:
> This is a partial fix to the named lockup problem (also see PR/44756).
> It seems there is another race which can be reproduced on faster CPUs.
Is there a different PR for tracking the other race?
> cvs rdiff -u -r1.58 -r1.59 src/lib/libpthread/pthread_mutex.c
From: "Stephen Borrill" <sborrill@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44756 CVS commit: [netbsd-6] src/lib/libpthread
Date: Thu, 20 Feb 2014 13:00:40 +0000
Module Name: src
Committed By: sborrill
Date: Thu Feb 20 13:00:40 UTC 2014
Modified Files:
src/lib/libpthread [netbsd-6]: pthread_cond.c pthread_mutex.c
Log Message:
Pull up the following revision(s) (requested by prlw1 in ticket #1029):
lib/libpthread/pthread_cond.c: revision 1.62
lib/libpthread/pthread_mutex.c: revision 1.57,1.59
Partial fix for thread deadlock commonly observed with named.
Also address PR/44756.
To generate a diff of this commit:
cvs rdiff -u -r1.56.8.3 -r1.56.8.4 src/lib/libpthread/pthread_cond.c
cvs rdiff -u -r1.51.22.1 -r1.51.22.2 src/lib/libpthread/pthread_mutex.c
From: "Stephen Borrill" <sborrill@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44756 CVS commit: [netbsd-5] src/lib/libpthread
Date: Thu, 20 Feb 2014 13:53:26 +0000
Module Name: src
Committed By: sborrill
Date: Thu Feb 20 13:53:26 UTC 2014
Modified Files:
src/lib/libpthread [netbsd-5]: pthread_cond.c pthread_mutex.c
Log Message:
Pull up the following revision(s) (requested by prlw1 in ticket #1898):
lib/libpthread/pthread_cond.c: revision 1.62
lib/libpthread/pthread_mutex.c: revision 1.57,1.59
Partial fix for thread deadlock commonly observed with named.
Also address PR/44756.
To generate a diff of this commit:
cvs rdiff -u -r1.53 -r1.53.2.1 src/lib/libpthread/pthread_cond.c
cvs rdiff -u -r1.51 -r1.51.4.1 src/lib/libpthread/pthread_mutex.c
From: "Jeremy C. Reed" <reed@reedmedia.net>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: PR/44756
Date: Thu, 13 Mar 2014 16:35:29 -0500 (CDT)
Any way to identify/detect what version of libpthread has the issue or
fix?
(In particular, to extend BIND's ./configure to check for this.)
From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44756 CVS commit: src/tests/lib/libpthread
Date: Wed, 3 Sep 2014 16:23:25 +0000
Module Name: src
Committed By: gson
Date: Wed Sep 3 16:23:25 UTC 2014
Modified Files:
src/tests/lib/libpthread: t_cond.c
Log Message:
The cond_timedwait_race test case is no longer expected to fail; it
has been consistently passing since CVS date 2014.01.31.19.22.00.
See also PR lib/44756.
To generate a diff of this commit:
cvs rdiff -u -r1.5 -r1.6 src/tests/lib/libpthread/t_cond.c
From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/44756 CVS commit: [netbsd-7] src/tests/lib/libpthread
Date: Mon, 22 Dec 2014 02:06:10 +0000
Module Name: src
Committed By: msaitoh
Date: Mon Dec 22 02:06:10 UTC 2014
Modified Files:
src/tests/lib/libpthread [netbsd-7]: t_cond.c
Log Message:
Pull up following revision(s) (requested by gson in ticket #346):
tests/lib/libpthread/t_cond.c: revision 1.6
The cond_timedwait_race test case is no longer expected to fail; it
has been consistently passing since CVS date 2014.01.31.19.22.00.
See also PR lib/44756.
To generate a diff of this commit:
cvs rdiff -u -r1.5 -r1.5.4.1 src/tests/lib/libpthread/t_cond.c
State-Changed-From-To: open->closed
State-Changed-By: prlw1@NetBSD.org
State-Changed-When: Fri, 13 Feb 2015 22:37:57 +0000
State-Changed-Why:
Fixed, tested and pulled-up
>Unformatted:
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.