NetBSD Problem Report #43625

From mark@ecs.vuw.ac.nz  Fri Jul 16 01:18:23 2010
Return-Path: <mark@ecs.vuw.ac.nz>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 5D7A563BAE7
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 16 Jul 2010 01:18:23 +0000 (UTC)
Message-Id: <201007160118.o6G1IJqP009532@city-art.ecs.vuw.ac.nz>
Date: Fri, 16 Jul 2010 13:18:19 +1200 (NZST)
From: mark@ecs.vuw.ac.nz
Reply-To: mark@ecs.vuw.ac.nz
To: gnats-bugs@gnats.NetBSD.org
Subject: signal posting problem
X-Send-Pr-Version: 3.95

>Number:         43625
>Category:       kern
>Synopsis:       signal posting problem
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jul 16 01:20:00 +0000 2010
>Closed-Date:    Thu Jul 21 03:32:54 +0000 2011
>Last-Modified:  Thu Jul 21 03:32:54 +0000 2011
>Originator:     Mark Davies
>Release:        NetBSD 5.0_STABLE
>Organization:
ECS, Victoria Uni. of Wellington, New Zealand.
>Environment:


System: NetBSD city-art.ecs.vuw.ac.nz 5.0_STABLE NetBSD 5.0_STABLE (ECS_WORKSTATION) #7: Sun Feb 28 09:13:18 NZDT 2010 mark@turakirae.ecs.vuw.ac.nz:/local/SAVE/build.obj/src/work/5/src/sys/arch/i386/compile/ECS_WORKSTATION i386
Architecture: i386
Machine: i386
>Description:
	On 5.0_STABLE/i386 and 5.1_RC3/i386 systems the cnid_metad process from netatalk is failing to reap its
	children.  The relevant code is below (trimming out the functional bits)

static volatile sig_atomic_t sigchild = 0;

struct server {
    char  *name;
    pid_t pid;
    int control_fd;               /* file descriptor to child cnid_dbd 
process */
};

static struct server srv[MAXVOLS];


/* -------------------- */
static int maybe_start_dbd(char *dbdpn, char *dbdir, char *usockfn)
{
    pid_t pid;
    struct server *up;

    up = test_usockfn(dbdir);
    if (up && up->pid) {
        /* we already have a process, send our fd */
        if (send_cred(up->control_fd, rqstfd) < 0) {
            /* FIXME */
            return -1;
        }
        return 0;
    }


    if ((pid = fork()) < 0) {
        return -1;
    }
    if (pid == 0) {
         ret = execlp(dbdpn, dbdpn, dbdir, buf1, buf2, logconfig, NULL);
         exit(0);
     }
    /*
     *  Parent.
     */
    up->pid = pid;
    return 0;
}

/* ------------------ */
static void catch_child(int sig _U_) 
{
    sigchild = 1;
}

/* ----------------------- */
static void set_signal(void)
{
    struct sigaction sv;
    sigset_t set;

    signal(SIGPIPE, SIG_IGN);

    sv.sa_handler = catch_child;
    sv.sa_flags = SA_NOCLDSTOP;
    sigemptyset(&sv.sa_mask);
    if (sigaction(SIGCHLD, &sv, NULL) < 0) {
        LOG(log_error, logtype_cnid, "cnid_metad: sigaction: %s", 
strerror(errno));
        exit(1);
    }
    /* block everywhere but in pselect */
    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    sigprocmask(SIG_BLOCK, &set, NULL);
}

int usockfd_check(int sockfd, const sigset_t *sigset)
{
    int fd;
    socklen_t size;
    fd_set readfds;
    int ret;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(sockfd, &readfds);

    if ((ret = pselect(sockfd + 1, &readfds, NULL, NULL, NULL, sigset)) < 
0) {
        if (errno == EINTR)
            return 0;
        return -1;
    }
}

/* ------------------ */
int main(int argc, char *argv[])
{
    pid_t pid;
    int   status;
    int    ret;
    sigset_t set;


    set_signal();

    sigemptyset(&set);
    sigprocmask(SIG_SETMASK, NULL, &set);
    sigdelset(&set, SIGCHLD);

    while (1) {
        rqstfd = usockfd_check(srvfd, &set);
        /* Collect zombie processes and log what happened to them */
        if (sigchild) while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
            for (i = 0; i < MAXVOLS; i++) {
                if (srv[i].pid == pid) {
                    srv[i].pid = 0;
                    break;
                }
            }
            if (WIFEXITED(status)) {
                LOG(log_info, logtype_cnid, "cnid_dbd pid %i exited with 
exit code %i",
                    pid, WEXITSTATUS(status));
            }
            else if (WIFSIGNALED(status)) {
                LOG(log_info, logtype_cnid, "cnid_dbd pid %i exited with 
signal %i",
                    pid, WTERMSIG(status));
            }
            sigchild = 0;
        }
        if (rqstfd <= 0)
            continue;

        maybe_start_dbd(dbdpn, dbdir, dbp->usock_file);

    }
}

	If you remove the  "if (sigchild)" prior to the "while ((pid = waitpid..." it does reap the
	children so the pselect() is being interrupted but the sigchld signal is not being posted to
	the process. as seen by this ktrace fragment

	>               [...]
	> 15830      1 cnid_dbd CALL  fcntl(2,8,0xbfbfedf4)
	> 15830      1 cnid_dbd RET   fcntl 0
	> 15830      1 cnid_dbd CALL  close(2)
	> 15830      1 cnid_dbd RET   close 0
	> 15830      1 cnid_dbd CALL  exit(0)
	>   460      1 cnid_metad RET   pselect -1 errno 4 Interrupted system 
	>call
	>   460      1 cnid_metad CALL  wait4(0xffffffff,0xbfbfee6c,1,0)
	>   460      1 cnid_metad RET   wait4 15830/0x3dd6
	>   460      1 cnid_metad CALL  close(4)
	>   460      1 cnid_metad RET   close 0
	>   460      1 cnid_metad CALL  wait4(0xffffffff,0xbfbfee6c,1,0)
	>   460      1 cnid_metad RET   wait4 0
	>   460      1 cnid_metad CALL  pselect(1,0xbfbfe8f4,0,0,0,0xbfbfee38)


>How-To-Repeat:
	Try to run cnid_metad

>Fix:


>Release-Note:

>Audit-Trail:

From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/43625 CVS commit: src/sys
Date: Tue, 17 May 2011 23:51:41 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Wed May 18 03:51:41 UTC 2011

 Modified Files:
 	src/sys/kern: sys_select.c sys_sig.c
 	src/sys/sys: signalvar.h

 Log Message:
 PR/43625: Mark Davies: Fix pselect(2) to honor the temporary mask. pselect(2)
 (and pollts(2)) are similar to sigsuspend(2) in that they temporarily change
 the process signal mask and wait for signal delivery. Factor out and share the
 code that does this.


 To generate a diff of this commit:
 cvs rdiff -u -r1.30 -r1.31 src/sys/kern/sys_select.c
 cvs rdiff -u -r1.32 -r1.33 src/sys/kern/sys_sig.c
 cvs rdiff -u -r1.79 -r1.80 src/sys/sys/signalvar.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: Christos Zoulas <christos@netbsd.org>
Subject: Re: PR/43625 CVS commit: src/sys
Date: Tue, 19 Jul 2011 02:50:57 +0000

 On Wed, May 18, 2011 at 03:55:02AM +0000, Christos Zoulas wrote:
  >  Modified Files:
  >  	src/sys/kern: sys_select.c sys_sig.c
  >  	src/sys/sys: signalvar.h
  >  
  >  Log Message:
  >  PR/43625: Mark Davies: Fix pselect(2) to honor the temporary
  >  mask. pselect(2) (and pollts(2)) are similar to sigsuspend(2) in
  >  that they temporarily change the process signal mask and wait for
  >  signal delivery. Factor out and share the code that does this.

 Christos, were you intending for this to be pulled up to -5, or should
 the PR be closed?

 -- 
 David A. Holland
 dholland@netbsd.org

From: christos@zoulas.com (Christos Zoulas)
To: David Holland <dholland-bugs@netbsd.org>, gnats-bugs@NetBSD.org
Cc: 
Subject: Re: PR/43625 CVS commit: src/sys
Date: Tue, 19 Jul 2011 09:03:47 -0400

 On Jul 19,  2:50am, dholland-bugs@netbsd.org (David Holland) wrote:
 -- Subject: Re: PR/43625 CVS commit: src/sys

 | On Wed, May 18, 2011 at 03:55:02AM +0000, Christos Zoulas wrote:
 |  >  Modified Files:
 |  >  	src/sys/kern: sys_select.c sys_sig.c
 |  >  	src/sys/sys: signalvar.h
 |  >  
 |  >  Log Message:
 |  >  PR/43625: Mark Davies: Fix pselect(2) to honor the temporary
 |  >  mask. pselect(2) (and pollts(2)) are similar to sigsuspend(2) in
 |  >  that they temporarily change the process signal mask and wait for
 |  >  signal delivery. Factor out and share the code that does this.
 | 
 | Christos, were you intending for this to be pulled up to -5, or should
 | the PR be closed?

 I have not seen any requests for it, so let's close it for now.

 christos

State-Changed-From-To: open->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Thu, 21 Jul 2011 03:32:54 +0000
State-Changed-Why:
Fixed in -current; if you'd like this to be in 5.1_STABLE let us know.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.