NetBSD Problem Report #44756

From www@NetBSD.org  Tue Mar 22 20:59:09 2011
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id A123263B8E3
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 22 Mar 2011 20:59:09 +0000 (UTC)
Message-Id: <20110322205908.721A763B873@www.NetBSD.org>
Date: Tue, 22 Mar 2011 20:59:08 +0000 (UTC)
From: cryintothebluesky@googlemail.com
Reply-To: cryintothebluesky@googlemail.com
To: gnats-bugs@NetBSD.org
Subject: pthread_cond_timedwait() sometimes returns error code 3 (ESRCH)
X-Send-Pr-Version: www-1.0

>Number:         44756
>Category:       lib
>Synopsis:       pthread_cond_timedwait() sometimes returns error code 3 (ESRCH)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 22 21:00:01 +0000 2011
>Closed-Date:    Fri Feb 13 22:37:57 +0000 2015
>Last-Modified:  Fri Feb 13 22:37:57 +0000 2015
>Originator:     Sad Clouds
>Release:        5.1
>Organization:
>Environment:
NetBSD atom 5.1_STABLE NetBSD 5.1_STABLE (GENERIC) #6: Tue Mar 15 12:39:43 GMT 2011  roman@atom:/opt/obj.amd64/sys/arch/amd64/compile/GENERIC amd64
>Description:
pthread_cond_timedwait() sometimes returns error code 3 (ESRCH), which should never occur. The problem is most likely to be seen when threads that call pthread_cond_timedwait() are created and exit in a loop.
>How-To-Repeat:
The following test program demonstrates the problem. The error may take a while to appear, but sooner or later it occurs, e.g.:

atom$ gcc -O tmp.c -lpthread
atom$ ./a.out
pthread_cond_timedwait returned = 3


#include <pthread.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

void *thread_func(void *arg)
{
	struct timespec ts;
	int int_ret;
	int i = 0;

	while (1)
	{
		if (i++ >= 10000)
			pthread_exit(NULL);

		if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
			abort();

		/*
		Set to 1 second in the past, this should make pthread_cond_timedwait()
		return immediately with ETIMEDOUT.
		*/
		ts.tv_sec -= 1;

		if (pthread_mutex_lock(&mutex) != 0)
			abort();

		int_ret = pthread_cond_timedwait(&cond, &mutex, &ts);
		if (int_ret != 0 && int_ret != ETIMEDOUT)
		{
			/*
			Sometimes pthread_cond_timedwait() returns 3.
			This should never happen.
			*/
			printf("pthread_cond_timedwait returned = %d\n", int_ret);
			exit(1);
		}

		if (pthread_mutex_unlock(&mutex) != 0)
			abort();
	}
}

int main(void)
{
	pthread_t tid[64];
	int i;

	while (1)
	{
		for (i = 0; i < 64; i++)
		{
			if (pthread_create(&tid[i], NULL, thread_func, NULL) != 0)
			{
				abort();
			}

		}

		for (i = 0; i < 64; i++)
		{
			if (pthread_join(tid[i], NULL) != 0)
				abort();
		}
	}

	pthread_exit(NULL);
}
>Fix:

>Release-Note:

>Audit-Trail:
From: Sad Clouds <cryintothebluesky@googlemail.com>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@netbsd.org
Subject: Re: lib/44756: pthread_cond_timedwait() sometimes returns error
 code 3 (ESRCH)
Date: Tue, 22 Mar 2011 21:11:26 +0000

 It didn't paste the first few #include lines properly. All the #include
 lines should be:

 #include <pthread.h>
 #include <time.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <errno.h>

From: "Jukka Ruohonen" <jruoho@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44756 CVS commit: src/tests/lib/libpthread
Date: Sun, 27 Mar 2011 16:45:16 +0000

 Module Name:	src
 Committed By:	jruoho
 Date:		Sun Mar 27 16:45:16 UTC 2011

 Modified Files:
 	src/tests/lib/libpthread: t_cond.c

 Log Message:
 Add a test case for pthread_cond_timedwait(3) failures reported by
 Sad Clouds in PR lib/44756. This was discussed also in:

 	http://mail-index.netbsd.org/tech-userlevel/2011/03/17/msg004689.html


 To generate a diff of this commit:
 cvs rdiff -u -r1.2 -r1.3 src/tests/lib/libpthread/t_cond.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44756 CVS commit: src/lib/libpthread
Date: Fri, 31 Jan 2014 14:22:00 -0500

 Module Name:	src
 Committed By:	christos
 Date:		Fri Jan 31 19:22:00 UTC 2014

 Modified Files:
 	src/lib/libpthread: pthread_cond.c pthread_mutex.c

 Log Message:
 PR/44756: Sad Clouds: Prevent leakage of errno = ESRCH from _lwp_park. This
 has two parts:
 	- in pthread_cond_timedwait() if the thread we are trying to unpark
 	  exited, retry the _lwp_park call without it.
 	- pthread_mutex() was affecting errno since it is calling _lwp_park()
 	  from pthread_mutex_lock_slow(). preserve the original errno.
 Note that the example problem still causes an occasional deadlock on machines
 with many CPUs and it is the same deadlock we observe with named.
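 The two parts of the log message above can be illustrated with a minimal
 sketch. This is not the actual libpthread change: fake_park(),
 park_with_retry() and lock_slow() are hypothetical names, and fake_park()
 is only a stand-in that mimics a park call failing with ESRCH when asked
 to unpark a thread that has already exited.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a park primitive (assumption, not _lwp_park()).
 * Any unpark target >= 100 is treated as "already exited" and fails with
 * ESRCH; otherwise the wait times out immediately.
 */
int fake_park(int unpark_target)
{
	if (unpark_target >= 100) {
		errno = ESRCH;
		return -1;
	}
	errno = ETIMEDOUT;
	return -1;
}

/*
 * Part 1: if the thread we tried to unpark is gone, retry the park call
 * without an unpark target instead of reporting ESRCH to the caller.
 */
int park_with_retry(int unpark_target)
{
	int rv = fake_park(unpark_target);

	if (rv != 0 && errno == ESRCH && unpark_target != 0)
		rv = fake_park(0);
	return rv == 0 ? 0 : errno;
}

/*
 * Part 2: a slow-path helper that parks internally saves the caller's
 * errno first and restores it afterwards, so nothing leaks out.
 */
void lock_slow(void)
{
	int saved_errno = errno;

	(void)fake_park(100);	/* sets errno = ESRCH internally */
	errno = saved_errno;
}
```

 With this stand-in, park_with_retry(100) reports ETIMEDOUT rather than
 ESRCH, and errno is unchanged across a call to lock_slow().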


 To generate a diff of this commit:
 cvs rdiff -u -r1.61 -r1.62 src/lib/libpthread/pthread_cond.c
 cvs rdiff -u -r1.56 -r1.57 src/lib/libpthread/pthread_mutex.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Mindaugas Rasiukevicius" <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44756 CVS commit: src/lib/libpthread
Date: Mon, 3 Feb 2014 15:51:01 +0000

 Module Name:	src
 Committed By:	rmind
 Date:		Mon Feb  3 15:51:01 UTC 2014

 Modified Files:
 	src/lib/libpthread: pthread_mutex.c

 Log Message:
 pthread__mutex_lock_slow: fix the handling of a potential race with the
 non-interlocked CAS in the fast unlock path -- it is unsafe to test for
 the waiters-bit while the owner thread is running, we have to spin for
 the owner or its state change to be sure about the presence of the bit.
 Split off the logic into the pthread__mutex_setwaiters() routine.

 This is a partial fix to the named lockup problem (also see PR/44756).
 It seems there is another race which can be reproduced on faster CPUs.


 To generate a diff of this commit:
 cvs rdiff -u -r1.58 -r1.59 src/lib/libpthread/pthread_mutex.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Jeremy C. Reed" <reed@reedmedia.net>
To: gnats-bugs@NetBSD.org
Cc: rmind@netbsd.org
Subject: Re: PR/44756 CVS commit: src/lib/libpthread
Date: Tue, 11 Feb 2014 13:01:18 -0600 (CST)

 On Mon, 3 Feb 2014, Mindaugas Rasiukevicius wrote:

 >  This is a partial fix to the named lockup problem (also see PR/44756).
 >  It seems there is another race which can be reproduced on faster CPUs.


 Is there a different PR for tracking the other race?


 >  cvs rdiff -u -r1.58 -r1.59 src/lib/libpthread/pthread_mutex.c

From: "Stephen Borrill" <sborrill@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44756 CVS commit: [netbsd-6] src/lib/libpthread
Date: Thu, 20 Feb 2014 13:00:40 +0000

 Module Name:	src
 Committed By:	sborrill
 Date:		Thu Feb 20 13:00:40 UTC 2014

 Modified Files:
 	src/lib/libpthread [netbsd-6]: pthread_cond.c pthread_mutex.c

 Log Message:
 Pull up the following revision(s) (requested by prlw1 in ticket #1029):
 	lib/libpthread/pthread_cond.c:	revision 1.62
 	lib/libpthread/pthread_mutex.c:	revision 1.57,1.59

 Partial fix for thread deadlock commonly observed with named.
 Also address PR/44756.


 To generate a diff of this commit:
 cvs rdiff -u -r1.56.8.3 -r1.56.8.4 src/lib/libpthread/pthread_cond.c
 cvs rdiff -u -r1.51.22.1 -r1.51.22.2 src/lib/libpthread/pthread_mutex.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Stephen Borrill" <sborrill@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44756 CVS commit: [netbsd-5] src/lib/libpthread
Date: Thu, 20 Feb 2014 13:53:26 +0000

 Module Name:	src
 Committed By:	sborrill
 Date:		Thu Feb 20 13:53:26 UTC 2014

 Modified Files:
 	src/lib/libpthread [netbsd-5]: pthread_cond.c pthread_mutex.c

 Log Message:
 Pull up the following revision(s) (requested by prlw1 in ticket #1898):
 	lib/libpthread/pthread_cond.c:	revision 1.62
 	lib/libpthread/pthread_mutex.c:	revision 1.57,1.59

 Partial fix for thread deadlock commonly observed with named.
 Also address PR/44756.


 To generate a diff of this commit:
 cvs rdiff -u -r1.53 -r1.53.2.1 src/lib/libpthread/pthread_cond.c
 cvs rdiff -u -r1.51 -r1.51.4.1 src/lib/libpthread/pthread_mutex.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Jeremy C. Reed" <reed@reedmedia.net>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: PR/44756
Date: Thu, 13 Mar 2014 16:35:29 -0500 (CDT)

 Any way to identify/detect what version of libpthread has the issue or 
 fix?  

 (In particular, to extend BIND's ./configure to check for this.)

From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44756 CVS commit: src/tests/lib/libpthread
Date: Wed, 3 Sep 2014 16:23:25 +0000

 Module Name:	src
 Committed By:	gson
 Date:		Wed Sep  3 16:23:25 UTC 2014

 Modified Files:
 	src/tests/lib/libpthread: t_cond.c

 Log Message:
 The cond_timedwait_race test case is no longer expected to fail; it
 has been consistently passing since CVS date 2014.01.31.19.22.00.
 See also PR lib/44756.


 To generate a diff of this commit:
 cvs rdiff -u -r1.5 -r1.6 src/tests/lib/libpthread/t_cond.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "SAITOH Masanobu" <msaitoh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/44756 CVS commit: [netbsd-7] src/tests/lib/libpthread
Date: Mon, 22 Dec 2014 02:06:10 +0000

 Module Name:	src
 Committed By:	msaitoh
 Date:		Mon Dec 22 02:06:10 UTC 2014

 Modified Files:
 	src/tests/lib/libpthread [netbsd-7]: t_cond.c

 Log Message:
 Pull up following revision(s) (requested by gson in ticket #346):
 	tests/lib/libpthread/t_cond.c: revision 1.6
 The cond_timedwait_race test case is no longer expected to fail; it
 has been consistently passing since CVS date 2014.01.31.19.22.00.
 See also PR lib/44756.


 To generate a diff of this commit:
 cvs rdiff -u -r1.5 -r1.5.4.1 src/tests/lib/libpthread/t_cond.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: prlw1@NetBSD.org
State-Changed-When: Fri, 13 Feb 2015 22:37:57 +0000
State-Changed-Why:
Fixed, tested and pulled-up


>Unformatted:
