NetBSD Problem Report #10686
Received: (qmail 14889 invoked from network); 26 Jul 2000 12:46:21 -0000
Message-Id: <200007261246.e6QCkJU09485@hera.lip6.fr>
Date: Wed, 26 Jul 2000 14:46:19 +0200 (MEST)
From: Manuel Bouyer <bouyer@hera.lip6.fr>
Reply-To: bouyer@netbsd.org
To: gnats-bugs@gnats.netbsd.org
Subject: rpcbind doesn't always DTRT with non-local networks
X-Send-Pr-Version: 3.95
>Number: 10686
>Category: bin
>Synopsis: rpcbind doesn't always DTRT with non-local networks
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: bouyer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jul 26 12:47:00 +0000 2000
>Closed-Date:
>Last-Modified: Wed Jun 30 09:02:00 +0000 2004
>Originator: Manuel Bouyer
>Release: NetBSD 1.5_ALPHA as of 2 days ago
>Organization:
LIP6/RP, Universite Paris VI.
>Environment:
System: NetBSD hera 1.5_ALPHA NetBSD 1.5_ALPHA (HERA) #2: Tue Jul 25 13:24:33 MEST 2000 root@civry:/usr/sup_src/src/sys/arch/i386/compile/HERA i386
>Description:
This machine is acting as a master NIS server, as a replacement for a
SunOS 4.1.4 box. The box has only one IP address, and is the NIS server
for clients on both the local net (which use broadcast) and
remote ones on different IP segments (Linux, various NetBSD releases and
Solaris).
With the stock rpcbind, rpcbind dumps core soon after ypserv is started.
This seems to happen as soon as some client tries to access ypserv
(but not with just any client: tests with a test domain and a reduced
set of clients didn't show this). rpcbind was dumping core because
cap->rmt_uaddr, passed to sscanf() in xdr_rmtcall_result(), was NULL.
This seems to have been reported already in bin/10487. I tried the patch
proposed in that PR:
RCS file: /pub/NetBSD-CVS/basesrc/usr.sbin/rpcbind/rpcb_svc_com.c,v
retrieving revision 1.1.2.1
diff -u -r1.1.2.1 rpcb_svc_com.c
--- rpcb_svc_com.c 2000/06/23 08:16:03 1.1.2.1
+++ rpcb_svc_com.c 2000/07/26 10:23:40
@@ -448,7 +448,8 @@
u_long port;
/* interpret the universal address for TCP/IP */
- if (sscanf(cap->rmt_uaddr, "%d.%d.%d.%d.%d.%d",
+ if ((cap->rmt_uaddr == 0) ||
+ sscanf(cap->rmt_uaddr, "%d.%d.%d.%d.%d.%d",
&h1, &h2, &h3, &h4, &p1, &p2) != 6)
return (FALSE);
port = ((p1 & 0xff) << 8) + (p2 & 0xff);
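The guard in the patch above can be sketched in isolation as follows. This is a minimal illustration, not rpcbind code: the helper name uaddr_to_port() is hypothetical, and it only assumes the "h1.h2.h3.h4.p1.p2" universal address layout that the sscanf() format in the patch already implies.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of rpcbind): parse a TCP/IP universal
 * address of the form "h1.h2.h3.h4.p1.p2" and store the port number.
 * Returns 1 on success, 0 if uaddr is NULL or malformed.
 */
static int
uaddr_to_port(const char *uaddr, unsigned long *port)
{
	int h1, h2, h3, h4, p1, p2;

	/* Check first: handing a NULL string to sscanf() is undefined. */
	if (uaddr == NULL || port == NULL)
		return 0;
	if (sscanf(uaddr, "%d.%d.%d.%d.%d.%d",
	    &h1, &h2, &h3, &h4, &p1, &p2) != 6)
		return 0;
	/* The last two fields are the high and low bytes of the port. */
	*port = ((unsigned long)(p1 & 0xff) << 8) + (unsigned long)(p2 & 0xff);
	return 1;
}
```

The short-circuit order matters: the NULL test has to come before the sscanf() call, which is what the extra condition in the patch does.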
With this patch, rpcbind no longer dumps core, but some remote
clients can't access the yp server (clients on the local network didn't
have any trouble). I debugged this with a NetBSD 1.3.2 client; I
didn't look closely at what the others did, but obviously some of them worked.
The problem was very strange: rpcinfo on the client had no
problems talking to the server (both '-p' and '-u 100004'), but
ypbind did. A tcpdump showed that the client sent a UDP packet
to the server's rpcbind but the server never answered.
Running with '-d' showed a lot of "rpcbproc_callit_com: duplicate
request" messages.
After some debugging, I found that the addrmerge() call in
rpcbproc_callit_com() always returned NULL for the remote client
(apparently because it didn't find any interface for it, which is
OK as the client is not on a local net). It seems that because of this,
calls from different clients were handled as duplicate requests
from the same client and ignored. The very first request may
have been handled with the bin/10487 patch; I didn't check this.
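To illustrate the kind of comparison a duplicate check depends on, here is a minimal sketch of an address comparison that never treats a missing address as a match. It is an assumption-laden illustration, not the rpcbind code: struct nbuf is a reduced stand-in for the TI-RPC struct netbuf, nbuf_differs() is a hypothetical name, and whether a missing address is really what made distinct clients look identical was not verified.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for struct netbuf, reduced to the two fields used here. */
struct nbuf {
	unsigned int len;
	void *buf;
};

/*
 * Hypothetical NULL-aware comparison. Returns 0 only when both
 * addresses are present and byte-for-byte identical, so a caller whose
 * address could not be determined never matches another caller.
 */
static int
nbuf_differs(const struct nbuf *n1, const struct nbuf *n2)
{
	if (n1 == NULL || n2 == NULL || n1->buf == NULL || n2->buf == NULL)
		return 1;
	if (n1->len != n2->len)
		return 1;
	return memcmp(n1->buf, n2->buf, n1->len) != 0;
}
```

With a comparison like this, two requests are only considered to come from the same caller when both addresses are known and equal; anything undetermined is treated as a different caller.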
Based on other uses of addrmerge() or mergeaddr() in the code,
I made this change:
diff -u -r1.1.2.1 rpcb_svc_com.c
--- rpcb_svc_com.c 2000/06/23 08:16:03 1.1.2.1
+++ rpcb_svc_com.c 2000/07/26 10:23:40
@@ -753,6 +754,8 @@
addrmerge(&tbuf, rbl->rpcb_map.r_addr, NULL, nconf->nc_netid);
m_uaddr = addrmerge(caller, rbl->rpcb_map.r_addr, NULL,
nconf->nc_netid);
+ if (m_uaddr == NULL)
+ m_uaddr = strdup(rbl->rpcb_map.r_addr);
#ifdef RPCBIND_DEBUG
if (debugging)
fprintf(stderr, "merged uaddr %s\n", m_uaddr);
I'm not sure what this is supposed to do (it looks like it uses the caller's addr
instead of the merged one), but now m_uaddr is never NULL, and rpcbind is
working properly with both local and remote clients (for ypserv, NFS, rstatd,
rquotad). However, I didn't try to understand the depths of rpcbind, and I don't
know what m_uaddr is really used for, so this change may not be the right one.
I leave this to someone who really understands the code :)
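The fallback used in the change above can be sketched on its own as follows. This is only an illustration of the pattern, under the assumption that the merged address is heap-allocated and later freed by the caller; uaddr_or_fallback() is a hypothetical name and does not exist in rpcbind.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical wrapper: return a heap-allocated copy of the merged
 * address if there is one, otherwise a copy of the registered address.
 * The caller frees the result. NULL means no input or out of memory.
 */
static char *
uaddr_or_fallback(const char *merged, const char *registered)
{
	const char *src = (merged != NULL) ? merged : registered;
	size_t n;
	char *copy;

	if (src == NULL)
		return NULL;
	n = strlen(src) + 1;		/* include the terminating NUL */
	copy = malloc(n);
	if (copy != NULL)
		memcpy(copy, src, n);
	return copy;
}
```

Returning a copy in both cases keeps ownership uniform, so the caller can free the result unconditionally. Note that the result can still be NULL when allocation fails, so a later fprintf() with "%s" would still need its own check.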
>How-To-Repeat:
Set up a yp server, then set up several clients on a different subnet.
It may be necessary to reboot the server once all clients are
running to trigger the bug.
>Fix:
See above. The proposed patch may only be a workaround for my problem,
and not a real fix.
Fixing this may also fix bin/10487 the right way.
>Release-Note:
>Audit-Trail:
From: "Dr. Rene Hexel" <rh@vip.at>
To: bouyer@hera.lip6.fr
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: bin/10686: rpcbind doesn't always DTRT with non-local networks
Date: Wed, 26 Jul 2000 15:52:59 +0200
Manuel Bouyer wrote:
> >Number: 10686
> >Category: bin
> >Synopsis: rpcbind doesn't always DTRT with non-local networks
Just as a reference point: this might be the same as bin/10683 I
submitted earlier today.
Cheers,
Rene
Responsible-Changed-From-To: bin-bug-people->bin-bug-people,bouyer@netbsd.org
Responsible-Changed-By: tls
Responsible-Changed-When: Wed Mar 31 21:43:05 UTC 2004
Responsible-Changed-Why:
Manuel's a developer; he gets to "own" his own bug.
Responsible-Changed-From-To: bin-bug-people,bouyer@netbsd.org->bouyer
Responsible-Changed-By: fair
Responsible-Changed-When: Thu Apr 1 09:59:51 UTC 2004
Responsible-Changed-Why:
Let's try that responsibility reassignment again...
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@gnats.netbsd.org
Cc: tls@netbsd.org, gnats-admin@netbsd.org, bin-bug-people@netbsd.org
Subject: Re: bin/10686
Date: Thu, 1 Apr 2004 23:16:02 +0200
On Wed, Mar 31, 2004 at 09:43:33PM -0000, tls@netbsd.org wrote:
> Synopsis: rpcbind doesn't always DTRT with non-local networks
>
> Responsible-Changed-From-To: bin-bug-people->bin-bug-people,bouyer@netbsd.org
> Responsible-Changed-By: tls
> Responsible-Changed-When: Wed Mar 31 21:43:05 UTC 2004
> Responsible-Changed-Why:
> Manuel's a developer; he gets to "own" his own bug.
Well, if I send a bug report, it's because I don't know how to fix it
myself, or I'm not sure the fix I propose is right. In other words: I'm
asking for help. So I'm not sure assigning my PRs to me will really be
productive.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
From: Emmanuel Dreyfus <manu@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: bouyer@netbsd.org, fvdl@netbsd.org
Subject: bin/10686: rpcbind doesn't always DTRT with non-local networks
Date: Thu, 27 May 2004 15:33:32 +0000
The problem is not specific to remote networks. My rpcbind stops answering
queries from ypbind on the local network. I have exactly the same symptoms:
ypbind gets no reply while rpcinfo -p works.
The patch proposed in this PR does not solve the problem. The only way to
fix the problem is to restart rpcbind.
--
Emmanuel Dreyfus
manu@netbsd.org
From: Emmanuel Dreyfus <manu@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: bouyer@netbsd.org
Subject: bin/10686: rpcbind doesn't always DTRT with non-local networks
Date: Wed, 30 Jun 2004 09:01:13 +0000
--5mCyUwZo2JvN/JJP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Attached is a patch that fixes everything for me. I don't understand
what exactly I have ifdef'ed out, but it seems to work properly now.
--
Emmanuel Dreyfus
manu@netbsd.org
--5mCyUwZo2JvN/JJP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=patch
--- /tmp/src/usr.sbin/rpcbind/rpcb_svc_com.c Tue Oct 21 04:53:02 2003
+++ rpcb_svc_com.c Thu May 27 17:49:13 2004
@@ -1,5 +1,5 @@
-/* $NetBSD: rpcb_svc_com.c,v 1.10 2003/10/21 02:53:02 fvdl Exp $ */
+/* $NetBSD: rpcb_svc_com.c,v 1.6 2000/08/03 00:07:22 fvdl Exp $ */
/*
* Sun RPC is a product of Sun Microsystems, Inc. and is provided for
* unrestricted use provided that this legend is included on all tape
@@ -41,17 +41,17 @@
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/param.h>
+#include <sys/poll.h>
#include <sys/socket.h>
#include <rpc/rpc.h>
#include <rpc/rpcb_prot.h>
#include <netconfig.h>
#include <errno.h>
#include <syslog.h>
#include <unistd.h>
#include <stdio.h>
-#include <poll.h>
#ifdef PORTMAP
#include <netinet/in.h>
#include <rpc/pmap_prot.h>
#endif /* PORTMAP */
@@ -101,9 +101,11 @@
rpcproc_t, rpcvers_t));
static struct finfo *forward_find __P((u_int32_t));
static int free_slot_by_xid __P((u_int32_t));
static int free_slot_by_index __P((int));
+#if 0
static int netbufcmp __P((struct netbuf *, struct netbuf *));
+#endif
static struct netbuf *netbufdup __P((struct netbuf *));
static void netbuffree __P((struct netbuf *));
static int check_rmtcalls __P((struct pollfd *, int));
static void xprt_set_caller __P((SVCXPRT *, struct finfo *));
@@ -752,8 +754,10 @@
local_uaddr =
addrmerge(&tbuf, rbl->rpcb_map.r_addr, NULL, nconf->nc_netid);
m_uaddr = addrmerge(caller, rbl->rpcb_map.r_addr, NULL,
nconf->nc_netid);
+ if (m_uaddr == NULL)
+ m_uaddr = strdup(rbl->rpcb_map.r_addr);
#ifdef RPCBIND_DEBUG
if (debugging)
fprintf(stderr, "merged uaddr %s\n", m_uaddr);
#endif
@@ -943,13 +947,17 @@
* use the slot with the earliest time.
*/
for (i = 0; i < NFORWARD; i++) {
if (FINFO[i].flag & FINFO_ACTIVE) {
+#if 1
+ if (0) {
+#else
if ((FINFO[i].caller_xid == caller_xid) &&
(FINFO[i].reply_type == reply_type) &&
(FINFO[i].versnum == versnum) &&
(!netbufcmp(FINFO[i].caller_addr,
caller_addr))) {
+#endif
FINFO[i].time = time((time_t *)0);
return (0); /* Duplicate entry */
} else {
/* Should we wait any longer */
@@ -1036,13 +1044,15 @@
}
return (0);
}
+#if 0
static int
netbufcmp(struct netbuf *n1, struct netbuf *n2)
{
return ((n1->len != n2->len) || memcmp(n1->buf, n2->buf, n1->len));
}
+#endif
static struct netbuf *
netbufdup(struct netbuf *ap)
{
@@ -1064,9 +1074,8 @@
}
#define MASKVAL (POLLIN | POLLPRI | POLLRDNORM | POLLRDBAND)
-extern bool_t __svc_clean_idle(fd_set *, int, bool_t);
void
my_svc_run()
{
@@ -1077,9 +1086,8 @@
#ifdef SVC_RUN_DEBUG
int i;
#endif
register struct pollfd *p;
- fd_set cleanfds;
for (;;) {
p = pollfds;
for (n = 0; n <= svc_maxfd; n++) {
@@ -1099,18 +1107,16 @@
fprintf(stderr, "%d ", p->fd);
fprintf(stderr, ">\n");
}
#endif
- switch (poll_ret = poll(pollfds, nfds, 30 * 1000)) {
+ switch (poll_ret = poll(pollfds, nfds, INFTIM)) {
case -1:
/*
* We ignore all errors, continuing with the assumption
* that it was set by the signal handlers (or any
* other outside event) and not caused by poll().
*/
case 0:
- cleanfds = svc_fdset;
- __svc_clean_idle(&cleanfds, 30, FALSE);
continue;
default:
#ifdef SVC_RUN_DEBUG
if (debugging) {
@@ -1430,9 +1436,9 @@
prot = IPPROTO_UDP;
} else if (strcmp(arg->r_netid, tcptrans) == 0) {
/* It is TCP */
prot = IPPROTO_TCP;
- } else if (arg->r_netid[0] == 0) {
+ } else if (arg->r_netid[0] == NULL) {
prot = 0; /* Remove all occurrences */
} else {
/* Not a IP protocol */
return (0);
--5mCyUwZo2JvN/JJP--
>Unformatted: