NetBSD Problem Report #41955
From woods@once.weird.com Fri Aug 28 23:17:43 2009
Return-Path: <woods@once.weird.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 97AF063B121
for <gnats-bugs@gnats.NetBSD.org>; Fri, 28 Aug 2009 23:17:43 +0000 (UTC)
Message-Id: <m1MhAhQ-002YliC@once.weird.com>
Date: Fri, 28 Aug 2009 19:17:40 -0400 (EDT)
From: "Greg A. Woods" <woods@planix.com>
Sender: "Greg A. Woods" <woods@once.weird.com>
Reply-To: "Greg A. Woods" <woods@planix.com>
To: gnats-bugs@gnats.NetBSD.org
Subject: poor retry/next-ns logic in the DNS resolver hides certain error conditions
X-Send-Pr-Version: 3.95
>Number: 41955
>Category: lib
>Synopsis: poor retry/next-ns logic in the DNS resolver hides certain error conditions
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: lib-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Aug 28 23:20:01 +0000 2009
>Last-Modified: Sat Aug 29 03:30:02 +0000 2009
>Originator: Greg A. Woods
>Release: all
>Organization:
Planix, Inc.; Toronto, Ontario; Canada
>Environment:
System: NetBSD
>Description:
While trying to decipher some DNS responses I discovered that
queries done via TCP returned the proper response code to the
caller, while queries done via UDP hid this response code from
the caller.
>How-To-Repeat:
Make a query using res_send() to a nameserver using a question
which will result in a "REFUSED" response and note the
difference between using a UDP query and a TCP query.
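To compare the two transports, the caller only needs the RCODE field of whatever reply res_send() hands back. The following is a minimal sketch, assuming the reply is available as a raw byte buffer. The helper name reply_rcode() and the two macros are hypothetical and not part of the resolver library. Per RFC 1035 the DNS header is 12 bytes long, RCODE is the low four bits of the fourth byte, and REFUSED has the value 5.

```c
#include <assert.h>
#include <stddef.h>

#define DNS_HEADER_LEN 12	/* fixed DNS header size (RFC 1035) */
#define RCODE_REFUSED  5	/* RFC 1035 value for REFUSED */

/*
 * Hypothetical helper: return the RCODE of a raw DNS reply,
 * or -1 if the buffer is too short to hold a full header.
 */
int
reply_rcode(const unsigned char *msg, size_t len)
{
	assert(msg != NULL);
	if (len < DNS_HEADER_LEN)
		return (-1);
	/* RCODE is the low four bits of the second flags byte. */
	return (msg[3] & 0x0f);
}
```

In the UDP transcript in this report the caller never gets this far: res_send() fails instead of returning the reply, so there is no RCODE to inspect.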
An easy way to do this is to request an A RR for a zone where
the answer will contain a character which is not valid for
hostnames and where the queried nameserver is BIND-8 (or
equivalent) with the various "check-names" options enabled. One
example is a "/" in a CIDR-style PTR delegation CNAME record. Note
that despite the fact the answer is a CNAME, if an A RR is
requested then BIND-8 will assume the right-hand side points to
an A RR and will apply the check-names logic to the result
before sending it, and will end up refusing the query even when
the result is a CNAME that points to a PTR.
(Note that the version of "host" I'm using here has not yet been
released. The released version still contains yet another bug
which exacerbates this problem; fixes are pending. Also note
that libc has been compiled with -DRESOLVDEBUG, something that
should normally be the default.)
$ host -d -t a 10.161.29.204.in-addr.arpa 204.92.254.5
;; res_nmkquery(QUERY, 10.161.29.204.in-addr.arpa, IN, A)
;; res_send()
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16691
;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; 10.161.29.204.in-addr.arpa, type = A, class = IN
;; Querying server (# 1) address = 204.92.254.5
;; new DG socket
server rejected query:
;; ns_initparse: Message too long
;; Querying server (# 1) address = 204.92.254.5
;; new DG socket
server rejected query:
;; ns_initparse: Message too long
;; res_send failed
!!! Nameserver ns.weird.com not responding
!!! 10.161.29.204.in-addr.arpa A record not found at ns.weird.com, try again
Notice that the ultimate result is "not responding", even though
the response came back nearly immediately.
Try again, but this time force TCP:
$ host -u -d -t a 10.161.29.204.in-addr.arpa 204.92.254.5
;; res_nmkquery(QUERY, 10.161.29.204.in-addr.arpa, IN, A)
;; res_send()
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49
;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; 10.161.29.204.in-addr.arpa, type = A, class = IN
;; Querying server (# 1) address = 204.92.254.5
;; got answer:
;; ns_initparse: Message too long
;; Query for A records failed, 1 answer, authoritative, status: query refused
get_info(10.161.29.204.in-addr.arpa): qdcount = 1, ancount = 1, nscount = 0, arcount = 0
;; res_send returned with REFUSED
!!! The server at ns.weird.com does not allow recursion.
!!! 10.161.29.204.in-addr.arpa A record query refused by ns.weird.com
With the fix below the UDP result now matches the TCP result:
$ ./host -d -t a 10.161.29.204.in-addr.arpa 204.92.254.5
;; res_nmkquery(QUERY, 10.161.29.204.in-addr.arpa, IN, A)
;; res_send()
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14698
;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; 10.161.29.204.in-addr.arpa, type = A, class = IN
;; Querying server (# 1) address = 204.92.254.5
;; new DG socket
server rejected query returning REFUSED:
;; ns_initparse: Message too long
;; got answer:
;; ns_initparse: Message too long
;; Query for A records failed, 1 answer, authoritative, status: query refused
get_info(10.161.29.204.in-addr.arpa): qdcount = 1, ancount = 1, nscount = 0, arcount = 0
;; res_send returned with REFUSED
!!! The server at ns.weird.com does not allow recursion.
!!! 10.161.29.204.in-addr.arpa A record query refused by ns.weird.com
(The curious are invited to try the proper "PTR" query to see
the expected result.)
(I was also going to try to fix the bogus debug claim from
ns_initparse() about "Message too long", but that's probably not
worthwhile doing.)
>Fix:
The following fix, shown for both -current and netbsd-4, seems
to be sufficient -- i.e. it works for my test case.
I'm not comfortable with the "ns+1 <" expression, but that's
what worked and so I didn't try to explore any other options.
I will probably try poking this fix into the BIND libresolv
sources too, but I'm not sure they will bother updating this old
resolver code.
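The condition changed by the patch can be isolated as a small predicate. This is only a sketch of my reading of the patched test, not code from res_send.c. The function name skip_to_next_ns() is hypothetical, and the parameters stand in for statp->pfcode, the current server index ns, and statp->nscount. It assumes that returning 0 at that point in the UDP path makes the caller move on to the next nameserver, and that falling through hands the rejecting reply back to the caller.

```c
#include <assert.h>

/*
 * Hypothetical predicate mirroring the patched test:
 *
 *	if (!statp->pfcode && ns+1 < statp->nscount)
 *		return (0);
 *
 * Returns 1 when the rejecting reply is dropped and the next
 * nameserver is tried, 0 when the reply is kept for the caller.
 */
int
skip_to_next_ns(int pfcode, int ns, int nscount)
{
	assert(nscount > 0 && ns >= 0 && ns < nscount);
	/* pfcode is set when called from dig: never skip. */
	if (pfcode != 0)
		return (0);
	/* Skip only while another server is left in the list. */
	return (ns + 1 < nscount);
}
```

With a single configured server, as in the test case above, the predicate is false on the first reply, so the REFUSED response is returned instead of being discarded.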
Index: lib/libc/resolv/res_send.c
===================================================================
RCS file: /cvs/master/m-NetBSD/main/src/lib/libc/resolv/res_send.c,v
retrieving revision 1.18
diff -u -r1.18 res_send.c
--- lib/libc/resolv/res_send.c 12 Apr 2009 17:07:17 -0000 1.18
+++ lib/libc/resolv/res_send.c 28 Aug 2009 23:05:56 -0000
@@ -1033,12 +1033,12 @@
anhp->rcode == NOTIMP ||
anhp->rcode == REFUSED) {
DprintQ(statp->options & RES_DEBUG,
- (stdout, "server rejected query:\n"),
+ (stdout, "server rejected query returning %s:\n", p_rcode(anhp->rcode)),
ans, (resplen > anssiz) ? anssiz : resplen);
res_nclose(statp);
- /* don't retry if called from dig */
- if (!statp->pfcode)
- return (0);
+ /* don't retry if called from dig, or no more NS's to try */
+ if (!statp->pfcode && ns+1 < statp->nscount)
+ return (0); /* otherwise try next NS */
}
if (!(statp->options & RES_IGNTC) && anhp->tc) {
/*
Index: lib/libc/resolv/res_send.c
===================================================================
RCS file: /cvs/master/m-NetBSD/main/src/lib/libc/resolv/res_send.c,v
retrieving revision 1.9.4.2
diff -u -r1.9.4.2 res_send.c
--- lib/libc/resolv/res_send.c 17 May 2007 21:25:19 -0000 1.9.4.2
+++ lib/libc/resolv/res_send.c 25 Aug 2009 22:59:13 -0000
@@ -1000,12 +1000,12 @@
anhp->rcode == NOTIMP ||
anhp->rcode == REFUSED) {
DprintQ(statp->options & RES_DEBUG,
- (stdout, "server rejected query:\n"),
+ (stdout, "server rejected query returning %s:\n", p_rcode(anhp->rcode)),
ans, (resplen > anssiz) ? anssiz : resplen);
res_nclose(statp);
- /* don't retry if called from dig */
- if (!statp->pfcode)
- return (0);
+ /* don't retry if called from dig, or no more NS's to try */
+ if (!statp->pfcode && ns+1 < statp->nscount)
+ return (0); /* otherwise try next NS */
}
if (!(statp->options & RES_IGNTC) && anhp->tc) {
/*
>Audit-Trail:
From: "Jeremy C. Reed" <reed@reedmedia.net>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: lib/41955: poor retry/next-ns logic in the DNS resolver hides
certain error conditions
Date: Fri, 28 Aug 2009 22:25:56 -0500 (CDT)
On Fri, 28 Aug 2009, Greg A. Woods wrote:
> I will probably try poking this fix into the BIND libresolv
> sources too, but I'm not sure they will bother updating this old
> resolver code.
Hi Greg!
Yes, ISC still maintains the code. The new project for maintaining the old
code is at https://www.isc.org/software/libbind