NetBSD Problem Report #41955

From woods@once.weird.com  Fri Aug 28 23:17:43 2009
Return-Path: <woods@once.weird.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 97AF063B121
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 28 Aug 2009 23:17:43 +0000 (UTC)
Message-Id: <m1MhAhQ-002YliC@once.weird.com>
Date: Fri, 28 Aug 2009 19:17:40 -0400 (EDT)
From: "Greg A. Woods" <woods@planix.com>
Sender: "Greg A. Woods" <woods@once.weird.com>
Reply-To: "Greg A. Woods" <woods@planix.com>
To: gnats-bugs@gnats.NetBSD.org
Subject: poor retry/next-ns logic in the DNS resolver hides certain error conditions
X-Send-Pr-Version: 3.95

>Number:         41955
>Category:       lib
>Synopsis:       poor retry/next-ns logic in the DNS resolver hides certain error conditions
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Aug 28 23:20:01 +0000 2009
>Last-Modified:  Sat Aug 29 03:30:02 +0000 2009
>Originator:     Greg A. Woods
>Release:        all
>Organization:
Planix, Inc.; Toronto, Ontario; Canada
>Environment:
System: NetBSD
>Description:

	While trying to decipher some DNS responses I discovered that
	queries done via TCP returned the proper response code to the
	caller, while queries done via UDP hid this response code from
	the caller.
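
	For reference, the response code in question is the 4-bit RCODE
	field of the DNS message header (RFC 1035, section 4.1.1).  The
	following is only a minimal sketch of reading it from a raw
	answer buffer such as the one res_send() fills in.  The helper
	name dns_rcode() and the RCODE_* constants are invented for this
	illustration; the libc code itself goes through the HEADER
	structure (as in anhp->rcode) instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RCODE values from RFC 1035, section 4.1.1 (names are illustrative). */
enum {
	RCODE_NOERROR  = 0,
	RCODE_FORMERR  = 1,
	RCODE_SERVFAIL = 2,
	RCODE_NXDOMAIN = 3,
	RCODE_NOTIMP   = 4,
	RCODE_REFUSED  = 5
};

/* A DNS message always starts with a fixed 12-byte header. */
#define DNS_HEADER_LEN 12

/*
 * Hypothetical helper (not part of libc): return the RCODE of a raw
 * DNS message, or -1 if the buffer is too short to hold a header.
 */
int
dns_rcode(const uint8_t *msg, size_t len)
{
	assert(msg != NULL);
	if (len < DNS_HEADER_LEN)
		return -1;
	/* Byte 3 holds RA (1 bit), Z (3 bits) and RCODE (low 4 bits). */
	return msg[3] & 0x0f;
}
```

	A caller that gets a positive length back from res_send() could
	pass the answer buffer and that length to such a helper and
	compare the result against RCODE_REFUSED.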

>How-To-Repeat:

	Make a query using res_send() to a nameserver using a question
	which will result in a "REFUSED" response and note the
	difference between using a UDP query and a TCP query.

	An easy way to do this is to request an A RR for a name whose
	answer contains a character that is not valid in hostnames,
	from a nameserver running BIND-8 (or equivalent) with the
	various "check-names" options enabled.  For example, a "/" in a
	CIDR-style PTR delegation CNAME record.  Note that even though
	the answer is a CNAME, when an A RR is requested BIND-8 assumes
	the right-hand side points to an A RR and applies the
	check-names logic to the result before sending it.  It ends up
	refusing the query even when the result is a CNAME that points
	to a PTR.

	(Note that the version of "host" used here has not yet been
	released.  The released version still contains another bug
	that exacerbates this problem; fixes are pending.  Also note
	that libc has been compiled with -DRESOLVDEBUG, something that
	should normally be the default.)

	    $ host -d -t a 10.161.29.204.in-addr.arpa 204.92.254.5   
	    ;; res_nmkquery(QUERY, 10.161.29.204.in-addr.arpa, IN, A)
	    ;; res_send()
	    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16691
	    ;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
	    ;;      10.161.29.204.in-addr.arpa, type = A, class = IN
	    ;; Querying server (# 1) address = 204.92.254.5
	    ;; new DG socket
	    server rejected query:
	    ;; ns_initparse: Message too long
	    ;; Querying server (# 1) address = 204.92.254.5
	    ;; new DG socket
	    server rejected query:
	    ;; ns_initparse: Message too long
	    ;; res_send failed
	     !!! Nameserver ns.weird.com not responding
	     !!! 10.161.29.204.in-addr.arpa A record not found at ns.weird.com, try again

	Notice that the ultimate result is "not responding", even though
	the response came back almost immediately.

	Try again, but this time force TCP:

	    $ host -u -d -t a 10.161.29.204.in-addr.arpa 204.92.254.5
	    ;; res_nmkquery(QUERY, 10.161.29.204.in-addr.arpa, IN, A)
	    ;; res_send()
	    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49
	    ;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
	    ;;      10.161.29.204.in-addr.arpa, type = A, class = IN
	    ;; Querying server (# 1) address = 204.92.254.5
	    ;; got answer:
	    ;; ns_initparse: Message too long
	    ;; Query for A records failed, 1 answer, authoritative, status: query refused
	    get_info(10.161.29.204.in-addr.arpa): qdcount = 1, ancount = 1, nscount = 0, arcount = 0
	    ;; res_send returned with REFUSED
	     !!! The server at ns.weird.com does not allow recursion.
	     !!! 10.161.29.204.in-addr.arpa A record query refused by ns.weird.com

	With the fix below the UDP result now matches the TCP result:

	    $ ./host -d -t a 10.161.29.204.in-addr.arpa 204.92.254.5    
	    ;; res_nmkquery(QUERY, 10.161.29.204.in-addr.arpa, IN, A)
	    ;; res_send()
	    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14698
	    ;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
	    ;;      10.161.29.204.in-addr.arpa, type = A, class = IN
	    ;; Querying server (# 1) address = 204.92.254.5
	    ;; new DG socket
	    server rejected query returning REFUSED:
	    ;; ns_initparse: Message too long
	    ;; got answer:
	    ;; ns_initparse: Message too long
	    ;; Query for A records failed, 1 answer, authoritative, status: query refused
	    get_info(10.161.29.204.in-addr.arpa): qdcount = 1, ancount = 1, nscount = 0, arcount = 0
	    ;; res_send returned with REFUSED
	     !!! The server at ns.weird.com does not allow recursion.
	     !!! 10.161.29.204.in-addr.arpa A record query refused by ns.weird.com


	(The curious are invited to try the proper "PTR" query to see
	the expected result.)

	(I was also going to try to fix the bogus debug claim from
	ns_initparse() about "Message too long", but that's probably not
	worthwhile doing.)

>Fix:

	The following fix, shown for both -current and netbsd-4, seems
	to be sufficient; at least, it works for my test case.

	I'm not comfortable with the "ns+1 <" expression, but that's
	what worked and so I didn't try to explore any other options.

	I will probably try poking this fix into the BIND libresolv
	sources too, but I'm not sure they will bother updating this old
	resolver code.

Index: lib/libc/resolv/res_send.c
===================================================================
RCS file: /cvs/master/m-NetBSD/main/src/lib/libc/resolv/res_send.c,v
retrieving revision 1.18
diff -u -r1.18 res_send.c
--- lib/libc/resolv/res_send.c	12 Apr 2009 17:07:17 -0000	1.18
+++ lib/libc/resolv/res_send.c	28 Aug 2009 23:05:56 -0000
@@ -1033,12 +1033,12 @@
 	    anhp->rcode == NOTIMP ||
 	    anhp->rcode == REFUSED) {
 		DprintQ(statp->options & RES_DEBUG,
-			(stdout, "server rejected query:\n"),
+			(stdout, "server rejected query returning %s:\n", p_rcode(anhp->rcode)),
 			ans, (resplen > anssiz) ? anssiz : resplen);
 		res_nclose(statp);
-		/* don't retry if called from dig */
-		if (!statp->pfcode)
-			return (0);
+		/* don't retry if called from dig, or no more NS's to try */
+		if (!statp->pfcode && ns+1 < statp->nscount)
+			return (0);	/* otherwise try next NS */
 	}
 	if (!(statp->options & RES_IGNTC) && anhp->tc) {
 		/*

Index: lib/libc/resolv/res_send.c
===================================================================
RCS file: /cvs/master/m-NetBSD/main/src/lib/libc/resolv/res_send.c,v
retrieving revision 1.9.4.2
diff -u -r1.9.4.2 res_send.c
--- lib/libc/resolv/res_send.c	17 May 2007 21:25:19 -0000	1.9.4.2
+++ lib/libc/resolv/res_send.c	25 Aug 2009 22:59:13 -0000
@@ -1000,12 +1000,12 @@
 	    anhp->rcode == NOTIMP ||
 	    anhp->rcode == REFUSED) {
 		DprintQ(statp->options & RES_DEBUG,
-			(stdout, "server rejected query:\n"),
+			(stdout, "server rejected query returning %s:\n", p_rcode(anhp->rcode)),
 			ans, (resplen > anssiz) ? anssiz : resplen);
 		res_nclose(statp);
-		/* don't retry if called from dig */
-		if (!statp->pfcode)
-			return (0);
+		/* don't retry if called from dig, or no more NS's to try */
+		if (!statp->pfcode && ns+1 < statp->nscount)
+			return (0);	/* otherwise try next NS */
 	}
 	if (!(statp->options & RES_IGNTC) && anhp->tc) {
 		/*
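
	To make the effect of the changed condition easier to see, here
	is a standalone model of it.  This is only a sketch and not code
	from res_send.c: the function name and the RC_* constants are
	invented, and it assumes that "ns" is the zero-based index of the
	server currently being queried and that returning 0 from the
	patched function means "move on to the next nameserver".

```c
#include <assert.h>
#include <stdbool.h>

/* RCODE values from RFC 1035 (names are illustrative). */
enum { RC_SERVFAIL = 2, RC_NOTIMP = 4, RC_REFUSED = 5 };

/*
 * Hypothetical model of the patched check: true means "discard this
 * answer and try the next nameserver" (the return (0) in the patch);
 * false means "hand this answer back to the caller".
 */
bool
skip_rejected_answer(int rcode, unsigned long pfcode, int ns, int nscount)
{
	bool rejected;

	/* Assumption: ns is a valid zero-based server index. */
	assert(ns >= 0 && ns < nscount);

	rejected = rcode == RC_SERVFAIL || rcode == RC_NOTIMP ||
	    rcode == RC_REFUSED;
	if (!rejected)
		return false;

	/*
	 * Skip only when not called from dig (pfcode unset) and at
	 * least one more server remains after this one.
	 */
	return pfcode == 0 && ns + 1 < nscount;
}
```

	With a single configured server, skip_rejected_answer(RC_REFUSED,
	0, 0, 1) is false, so the REFUSED answer reaches the caller as in
	the UDP test case above.  With two servers the first one is
	skipped and the answer from the last one is returned.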

>Audit-Trail:
From: "Jeremy C. Reed" <reed@reedmedia.net>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: lib/41955: poor retry/next-ns logic in the DNS resolver hides
 certain error conditions
Date: Fri, 28 Aug 2009 22:25:56 -0500 (CDT)

 On Fri, 28 Aug 2009, Greg A. Woods wrote:

 > 	I will probably try poking this fix into the BIND libresolv
 > 	sources too, but I'm not sure they will bother updating this old
 > 	resolver code.

 Hi Greg!

 Yes, ISC still maintains the code. The new project for maintaining the old 
 code is at https://www.isc.org/software/libbind
