NetBSD Problem Report #50439

From dholland@netbsd.org  Tue Nov 17 09:39:38 2015
Return-Path: <dholland@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C8CB4A6552
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 17 Nov 2015 09:39:37 +0000 (UTC)
Message-Id: <20151117093937.5ADBD14A16B@mail.netbsd.org>
Date: Tue, 17 Nov 2015 09:39:37 +0000 (UTC)
From: dholland@netbsd.org
Reply-To: dholland@netbsd.org
To: gnats-bugs@gnats.NetBSD.org
Subject: rpcbind follies with nis down
X-Send-Pr-Version: 3.95

>Number:         50439
>Category:       bin
>Synopsis:       rpcbind follies with nis down
>Confidential:   no
>Severity:       critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Nov 17 09:40:00 +0000 2015
>Originator:     David A. Holland
>Release:        NetBSD 7.99.20 (20150727)
>Organization:
>Environment:
System: NetBSD macaran 7.99.20 NetBSD 7.99.20 (MACARAN) #30: Mon Jul 27 20:25:15 EDT 2015  dholland@macaran:/usr/src/sys/arch/amd64/compile/MACARAN amd64
Architecture: x86_64
Machine: amd64
>Description:

	Now that ypbind has been fixed to not explode the world when
	the network goes down, it seems that rpcbind has taken over
	that responsibility.

	When the NIS server goes down, the libc NIS code contacts
	rpcbind, producing this message:

Nov 13 19:00:00 macaran rpcbind: connect from 127.0.0.1 to getport/addr(ypbind)

	Each time this happens it seems to produce another fork of
	rpcbind. In the course of a ~1h30 network downtime a couple of
	days ago, process accounting logged 1449403 rpcbind processes
	exiting, which works out to roughly 270 per second. This
	(and/or possibly related phenomena occurring in the libc NIS
	code) was sufficient to run through 12G of RAM and swap and
	then OOM. This took out the X server, of course, and thus I
	don't have as much information as I'd like about what actually
	happened.

>How-To-Repeat:

	Use NIS; with a lot of stuff running, disconnect the network
	from the NIS server.

>Fix:

	rpcbind apparently forks every time it wants to log a message.
	This is silly; it shouldn't need to fork more than once
	overall.
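
	To illustrate the "fork once" idea, here is a minimal sketch,
	not rpcbind's actual code. It assumes logging is handed to a
	single long-lived child over a pipe. The names logger_start,
	logger_msg, logger_stop and logger_fork_count are hypothetical,
	and the child copies messages to a file descriptor where a real
	daemon would presumably call syslog(3).

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>
#include <unistd.h>

static int log_fd = -1;		/* write end of the pipe to the logger child */
static pid_t log_pid = -1;	/* pid of the logger child */
unsigned logger_fork_count;	/* number of fork() calls made for logging */

/*
 * Start the logger child if it is not already running. The child copies
 * whatever arrives on the pipe to out_fd (an assumption made for the
 * sketch; a real daemon would call syslog(3) there instead).
 */
int
logger_start(int out_fd)
{
	int p[2];

	if (log_fd != -1)
		return 0;		/* already running: no new fork */
	if (pipe(p) == -1)
		return -1;
	log_pid = fork();
	if (log_pid == -1) {
		close(p[0]);
		close(p[1]);
		return -1;
	}
	if (log_pid == 0) {
		char buf[512];
		ssize_t n;

		close(p[1]);
		while ((n = read(p[0], buf, sizeof(buf))) > 0)
			if (write(out_fd, buf, (size_t)n) != n)
				_exit(1);
		_exit(0);		/* EOF: the parent closed its end */
	}
	close(p[0]);
	log_fd = p[1];
	logger_fork_count++;
	return 0;
}

/* Queue one message line for the logger child; never forks. */
int
logger_msg(const char *msg)
{
	size_t len;

	assert(msg != NULL);
	if (log_fd == -1)
		return -1;
	len = strlen(msg);
	if (write(log_fd, msg, len) != (ssize_t)len)
		return -1;
	return write(log_fd, "\n", 1) == 1 ? 0 : -1;
}

/* Close the pipe and wait for the logger child to drain it and exit. */
int
logger_stop(void)
{
	int status;

	if (log_fd == -1)
		return -1;
	close(log_fd);
	log_fd = -1;
	if (waitpid(log_pid, &status, 0) == -1)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

	A daemon built this way would call logger_start() once at
	startup and logger_msg() per event, so the number of forks
	stays at one no matter how many messages are logged.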

	However, I think the real problem lies in the libc NIS code; I
	think it is probably doing something stupid that leads it to
	blast rpcbind unnecessarily. I had a fair amount of stuff
	running when the network went plop, but not 1.4 million
	processes or even 14,000.
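
	If the libc side does turn out to be retrying in a tight loop
	(which is only a guess at this point), one conventional
	mitigation is capped exponential backoff between attempts.
	The following is a sketch only, not the existing libc NIS
	code; backoff_delay_ms and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the delay in milliseconds to wait before retry number
 * `attempt` (0-based): base_ms doubled once per earlier attempt,
 * clamped to max_ms. Hypothetical helper, not part of libc.
 */
uint32_t
backoff_delay_ms(unsigned attempt, uint32_t base_ms, uint32_t max_ms)
{
	uint32_t delay = base_ms;

	assert(base_ms > 0 && base_ms <= max_ms);
	while (attempt-- > 0) {
		/* Doubling would exceed the cap (and could overflow). */
		if (delay > max_ms / 2)
			return max_ms;
		delay *= 2;
	}
	return delay;
}
```

	With base_ms = 500 and max_ms = 30000, for example, a client
	that cannot reach its server settles at one attempt every 30
	seconds instead of querying rpcbind continuously.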

	Unfortunately, nuking NIS from orbit isn't an option.
