NetBSD Problem Report #10686

Received: (qmail 14889 invoked from network); 26 Jul 2000 12:46:21 -0000
Message-Id: <200007261246.e6QCkJU09485@hera.lip6.fr>
Date: Wed, 26 Jul 2000 14:46:19 +0200 (MEST)
From: Manuel Bouyer <bouyer@hera.lip6.fr>
Reply-To: bouyer@netbsd.org
To: gnats-bugs@gnats.netbsd.org
Subject: rpcbind doesn't always DTRT with non-local networks
X-Send-Pr-Version: 3.95

>Number:         10686
>Category:       bin
>Synopsis:       rpcbind doesn't always DTRT with non-local networks
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    bouyer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jul 26 12:47:00 +0000 2000
>Closed-Date:    
>Last-Modified:  Wed Jun 30 09:02:00 +0000 2004
>Originator:     Manuel Bouyer
>Release:        NetBSD 1.5_ALPHA as of 2 days ago
>Organization:

LIP6/RP, Universite Paris VI.

>Environment:

System: NetBSD hera 1.5_ALPHA NetBSD 1.5_ALPHA (HERA) #2: Tue Jul 25 13:24:33 MEST 2000 root@civry:/usr/sup_src/src/sys/arch/i386/compile/HERA i386

>Description:
	This machine is acting as a master NIS server, as a replacement for a
	SunOS 4.1.4 box. The box has only one IP address and is the NIS server
	for clients on both the local net (which use broadcast) and
	remote ones on different IP segments (Linux, various NetBSD releases
	and Solaris).
	With the stock rpcbind, rpcbind dumps core soon after ypserv is started.
	This seems to happen as soon as some client tries to access ypserv
	(but not with just any client: tests with a test domain and a reduced
	set of clients didn't show this). rpcbind was dumping core because
	cap->rmt_uaddr passed to sscanf() in xdr_rmtcall_result() was NULL.
	This seems to have been reported already in bin/10487. I tried the
	patch proposed in that PR:
RCS file: /pub/NetBSD-CVS/basesrc/usr.sbin/rpcbind/rpcb_svc_com.c,v
retrieving revision 1.1.2.1
diff -u -r1.1.2.1 rpcb_svc_com.c
--- rpcb_svc_com.c      2000/06/23 08:16:03     1.1.2.1
+++ rpcb_svc_com.c      2000/07/26 10:23:40
@@ -448,7 +448,8 @@
                u_long port;

		 /* interpret the universal address for TCP/IP */
-               if (sscanf(cap->rmt_uaddr, "%d.%d.%d.%d.%d.%d",
+               if ((cap->rmt_uaddr == 0) ||
+                   sscanf(cap->rmt_uaddr, "%d.%d.%d.%d.%d.%d",
			 &h1, &h2, &h3, &h4, &p1, &p2) != 6)
			return (FALSE);
		 port = ((p1 & 0xff) << 8) + (p2 & 0xff);
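
	For illustration, the guarded parse in the hunk above can be sketched
	as a standalone function. This is a minimal sketch, not the rpcbind
	code: the helper name uaddr_to_port() and its return convention are
	hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of rpcbind: extract the port from a
 * TCP/IP universal address of the form "h1.h2.h3.h4.p1.p2".
 * Returns 0 on success, -1 if uaddr is NULL or malformed.
 */
static int
uaddr_to_port(const char *uaddr, unsigned long *port)
{
	int h1, h2, h3, h4, p1, p2;

	/* Guard against a NULL address before handing it to sscanf(). */
	if (uaddr == NULL)
		return (-1);
	if (sscanf(uaddr, "%d.%d.%d.%d.%d.%d",
	    &h1, &h2, &h3, &h4, &p1, &p2) != 6)
		return (-1);
	/* The last two fields are the high and low bytes of the port. */
	*port = ((unsigned long)(p1 & 0xff) << 8) +
	    (unsigned long)(p2 & 0xff);
	return (0);
}
```

	With this helper, a NULL or truncated address is rejected instead of
	crashing, which matches the behaviour the patch aims for.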

	With this patch, rpcbind no longer dumps core, but some remote
	clients can't access the YP server (clients on the local network
	didn't have any trouble). I debugged this with a NetBSD 1.3.2 client;
	I didn't look closely at what the others did, but obviously some of
	them worked.
	The problem was very strange: rpcinfo on the client had no
	problem talking to the server (both '-p' and '-u 100004'), but
	ypbind did. A tcpdump showed that the client sent a UDP packet
	to the server's rpcbind but the server never answered.
	Running with '-d' showed a lot of "rpcbproc_callit_com:  duplicate
	request".
	After some debugging, I found that the addrmerge() call in
	rpcbproc_callit_com() always returned NULL for the remote client
	(apparently because it didn't find any interface for this one, which
	is OK as the client is not on a local net). It seems that because of
	this, calls from different clients were handled as duplicate requests
	from the same client and ignored. The very first request may have
	been handled with the bin/10487 patch; I didn't check this.
	Based on other uses of addrmerge() or mergeaddr() in the code,
	I made this change:
diff -u -r1.1.2.1 rpcb_svc_com.c
--- rpcb_svc_com.c      2000/06/23 08:16:03     1.1.2.1
+++ rpcb_svc_com.c      2000/07/26 10:23:40
@@ -753,6 +754,8 @@
            addrmerge(&tbuf, rbl->rpcb_map.r_addr, NULL, nconf->nc_netid);
	m_uaddr = addrmerge(caller, rbl->rpcb_map.r_addr, NULL,
	    nconf->nc_netid);
+       if (m_uaddr == NULL)
+               m_uaddr = strdup(rbl->rpcb_map.r_addr);
#ifdef RPCBIND_DEBUG
	if (debugging)
	fprintf(stderr, "merged uaddr %s\n", m_uaddr);

I'm not sure what this is supposed to do (it looks like it uses the caller's
addr instead of the merged one), but now m_uaddr is never NULL, and rpcbind is
working properly with both local and remote clients (for ypserv, NFS, rstatd,
rquotad). However, I didn't try to understand the depths of rpcbind, and I
don't know what m_uaddr is really used for. So this change may not be the
right one. I leave this to someone who really understands the code :)
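
The fallback in the change above follows a simple pattern: if the merge fails,
hand back a private copy of the registered address. A minimal sketch, assuming
a hypothetical callback in place of addrmerge(); the names merge_fn,
never_merges, copy_string and merged_or_registered are illustrative and do not
come from rpcbind:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for addrmerge(): malloc'ed string or NULL. */
typedef char *(*merge_fn)(const char *caller, const char *registered);

/* Demo callback that always fails, like the non-local case in this PR. */
static char *
never_merges(const char *caller, const char *registered)
{
	(void)caller;
	(void)registered;
	return (NULL);
}

/* Portable equivalent of strdup(), which the patch itself uses. */
static char *
copy_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p != NULL)
		memcpy(p, s, n);
	return (p);
}

/*
 * Return the merged address if the merge succeeds, otherwise a copy of
 * the registered address.  The caller frees the result.  This can still
 * return NULL if the allocation for the copy fails.
 */
static char *
merged_or_registered(merge_fn merge, const char *caller,
    const char *registered)
{
	char *m_uaddr = merge(caller, registered);

	if (m_uaddr == NULL)
		m_uaddr = copy_string(registered);
	return (m_uaddr);
}
```

Note that the result can still be NULL when memory is exhausted, so code that
prints or compares it should check for that case.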

>How-To-Repeat:
	Set up a YP server, then set up several clients on a different subnet.
	It may be necessary to reboot the server once all clients are
	running to trigger the bug.
>Fix:
	See above. The proposed patch may only be a workaround for my problem,
	and not a real fix.
	Fixing this may also fix bin/10487 the right way.
>Release-Note:
>Audit-Trail:

From: "Dr. Rene Hexel" <rh@vip.at>
To: bouyer@hera.lip6.fr
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: bin/10686: rpcbind doesn't always DTRT with non-local networks
Date: Wed, 26 Jul 2000 15:52:59 +0200

 Manuel Bouyer wrote:

 > >Number:         10686
 > >Category:       bin
 > >Synopsis:       rpcbind doesn't always DTRT with non-local networks

   Just as a reference point: this might be the same as bin/10683 I
 submitted earlier today.

   Cheers,
    Rene

Responsible-Changed-From-To: bin-bug-people->bin-bug-people,bouyer@netbsd.org 
Responsible-Changed-By: tls 
Responsible-Changed-When: Wed Mar 31 21:43:05 UTC 2004 
Responsible-Changed-Why:  
Manuel's a developer; he gets to "own" his own bug. 
Responsible-Changed-From-To: bin-bug-people,bouyer@netbsd.org->bouyer 
Responsible-Changed-By: fair 
Responsible-Changed-When: Thu Apr 1 09:59:51 UTC 2004 
Responsible-Changed-Why:  

Let's try that responsibility reassignment again... 


From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@gnats.netbsd.org
Cc: tls@netbsd.org, gnats-admin@netbsd.org, bin-bug-people@netbsd.org
Subject: Re: bin/10686
Date: Thu, 1 Apr 2004 23:16:02 +0200

 On Wed, Mar 31, 2004 at 09:43:33PM -0000, tls@netbsd.org wrote:
 > Synopsis: rpcbind doesn't always DTRT with non-local networks
 > 
 > Responsible-Changed-From-To: bin-bug-people->bin-bug-people,bouyer@netbsd.org
 > Responsible-Changed-By: tls
 > Responsible-Changed-When: Wed Mar 31 21:43:05 UTC 2004
 > Responsible-Changed-Why: 
 > Manuel's a developer; he gets to "own" his own bug.

 Well, if I send a bug report, it's because I don't know how to fix it
 myself, or I'm not sure the fix I propose is right. In other words: I'm
 asking for help. So I'm not sure assigning my PRs to me will really be
 productive.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: Emmanuel Dreyfus <manu@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: bouyer@netbsd.org, fvdl@netbsd.org
Subject: bin/10686: rpcbind doesn't always DTRT with non-local networks
Date: Thu, 27 May 2004 15:33:32 +0000

 The problem is not specific to remote networks. My rpcbind stops answering
 queries from ypbind on the local network. I have exactly the same symptoms:
 ypbind gets no reply while rpcinfo -p works. 

 The patch proposed in this PR does not solve the problem. The only way to 
 fix the problem is to restart rpcbind.

 -- 
 Emmanuel Dreyfus
 manu@netbsd.org

From: Emmanuel Dreyfus <manu@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: bouyer@netbsd.org
Subject: bin/10686: rpcbind doesn't always DTRT with non-local networks
Date: Wed, 30 Jun 2004 09:01:13 +0000

 --5mCyUwZo2JvN/JJP
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 Attached is a patch that fixes everything for me. I don't understand
 what exactly I have ifdef'ed out, but it seems to work properly now.

 -- 
 Emmanuel Dreyfus
 manu@netbsd.org

 --5mCyUwZo2JvN/JJP
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename=patch

 --- /tmp/src/usr.sbin/rpcbind/rpcb_svc_com.c	Tue Oct 21 04:53:02 2003
 +++ rpcb_svc_com.c	Thu May 27 17:49:13 2004
 @@ -1,5 +1,5 @@
 -/*	$NetBSD: rpcb_svc_com.c,v 1.10 2003/10/21 02:53:02 fvdl Exp $	*/
 +/*	$NetBSD: rpcb_svc_com.c,v 1.6 2000/08/03 00:07:22 fvdl Exp $	*/

  /*
   * Sun RPC is a product of Sun Microsystems, Inc. and is provided for
   * unrestricted use provided that this legend is included on all tape
 @@ -41,17 +41,17 @@

  #include <sys/types.h>
  #include <sys/stat.h>
  #include <sys/param.h>
 +#include <sys/poll.h>
  #include <sys/socket.h>
  #include <rpc/rpc.h>
  #include <rpc/rpcb_prot.h>
  #include <netconfig.h>
  #include <errno.h>
  #include <syslog.h>
  #include <unistd.h>
  #include <stdio.h>
 -#include <poll.h>
  #ifdef PORTMAP
  #include <netinet/in.h>
  #include <rpc/pmap_prot.h>
  #endif /* PORTMAP */
 @@ -101,9 +101,11 @@
  				    rpcproc_t, rpcvers_t));
  static struct finfo *forward_find __P((u_int32_t));
  static int free_slot_by_xid __P((u_int32_t));
  static int free_slot_by_index __P((int));
 +#if 0
  static int netbufcmp __P((struct netbuf *, struct netbuf *));
 +#endif
  static struct netbuf *netbufdup __P((struct netbuf *));
  static void netbuffree __P((struct netbuf *));
  static int check_rmtcalls __P((struct pollfd *, int));
  static void xprt_set_caller __P((SVCXPRT *, struct finfo *));
 @@ -752,8 +754,10 @@
  	local_uaddr =
  	    addrmerge(&tbuf, rbl->rpcb_map.r_addr, NULL, nconf->nc_netid);
  	m_uaddr = addrmerge(caller, rbl->rpcb_map.r_addr, NULL,
  			nconf->nc_netid);
 +	if (m_uaddr == NULL)
 +		m_uaddr = strdup(rbl->rpcb_map.r_addr);
  #ifdef RPCBIND_DEBUG
  	if (debugging)
  		fprintf(stderr, "merged uaddr %s\n", m_uaddr);
  #endif
 @@ -943,13 +947,17 @@
  	 * use the slot with the earliest time.
  	 */
  	for (i = 0; i < NFORWARD; i++) {
  		if (FINFO[i].flag & FINFO_ACTIVE) {
 +#if 1
 +			if (0) {
 +#else
  			if ((FINFO[i].caller_xid == caller_xid) &&
  			    (FINFO[i].reply_type == reply_type) &&
  			    (FINFO[i].versnum == versnum) &&
  			    (!netbufcmp(FINFO[i].caller_addr,
  					    caller_addr))) {
 +#endif
  				FINFO[i].time = time((time_t *)0);
  				return (0);	/* Duplicate entry */
  			} else {
  				/* Should we wait any longer */
 @@ -1036,13 +1044,15 @@
  	}
  	return (0);
  }

 +#if 0
  static int
  netbufcmp(struct netbuf *n1, struct netbuf *n2)
  {
  	return ((n1->len != n2->len) || memcmp(n1->buf, n2->buf, n1->len));
  }
 +#endif

  static struct netbuf *
  netbufdup(struct netbuf *ap)
  {
 @@ -1064,9 +1074,8 @@
  }


  #define	MASKVAL	(POLLIN | POLLPRI | POLLRDNORM | POLLRDBAND)
 -extern bool_t __svc_clean_idle(fd_set *, int, bool_t);

  void
  my_svc_run()
  {
 @@ -1077,9 +1086,8 @@
  #ifdef SVC_RUN_DEBUG
  	int i;
  #endif
  	register struct pollfd	*p;
 -	fd_set cleanfds;

  	for (;;) {
  		p = pollfds;
  		for (n = 0; n <= svc_maxfd; n++) {
 @@ -1099,18 +1107,16 @@
  					fprintf(stderr, "%d ", p->fd);
  			fprintf(stderr, ">\n");
  		}
  #endif
 -		switch (poll_ret = poll(pollfds, nfds, 30 * 1000)) {
 +		switch (poll_ret = poll(pollfds, nfds, INFTIM)) {
  		case -1:
  			/*
  			 * We ignore all errors, continuing with the assumption
  			 * that it was set by the signal handlers (or any
  			 * other outside event) and not caused by poll().
  			 */
  		case 0:
 -			cleanfds = svc_fdset;
 -			__svc_clean_idle(&cleanfds, 30, FALSE);
  			continue;
  		default:
  #ifdef SVC_RUN_DEBUG
  			if (debugging) {
 @@ -1430,9 +1436,9 @@
  		prot = IPPROTO_UDP;
  	} else if (strcmp(arg->r_netid, tcptrans) == 0) {
  		/* It is TCP */
  		prot = IPPROTO_TCP;
 -	} else if (arg->r_netid[0] == 0) {
 +	} else if (arg->r_netid[0] == NULL) {
  		prot = 0;	/* Remove all occurrences */
  	} else {
  		/* Not a IP protocol */
  		return (0);

 --5mCyUwZo2JvN/JJP--
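
For reference, the check that the attached patch compiles out treats a request
as a duplicate when an active forwarding slot matches it on transaction id and
caller address (the code in the patch also compares reply type and version).
A simplified sketch; the types nbuf and fwd_slot and the function names are
hypothetical stand-ins, not the rpcbind definitions:

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the rpcbind types; names are illustrative. */
struct nbuf {
	size_t len;
	const void *buf;
};

struct fwd_slot {
	int active;
	unsigned int caller_xid;
	struct nbuf caller_addr;
};

/* Same convention as netbufcmp() in the patch: 0 means "equal". */
static int
nbuf_cmp(const struct nbuf *n1, const struct nbuf *n2)
{
	return ((n1->len != n2->len) || memcmp(n1->buf, n2->buf, n1->len));
}

/* Return 1 if an active slot already holds this (xid, caller) pair. */
static int
is_duplicate(const struct fwd_slot *slots, size_t nslots,
    unsigned int xid, const struct nbuf *caller)
{
	size_t i;

	for (i = 0; i < nslots; i++) {
		if (slots[i].active && slots[i].caller_xid == xid &&
		    !nbuf_cmp(&slots[i].caller_addr, caller))
			return (1);
	}
	return (0);
}
```

In this sketch, two requests with the same transaction id are only collapsed
when their caller addresses are byte-for-byte identical.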
>Unformatted:
