public inbox for glibc-bugs@sourceware.org
From: "law at redhat dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sources.redhat.com
Subject: [Bug network/14308] New: getaddrinfo DNS referral response returns host not found when A and AAAA questions are sent and one response is a referral
Date: Fri, 29 Jun 2012 05:28:00 -0000
Message-ID: <bug-14308-131@http.sourceware.org/bugzilla/>

http://sourceware.org/bugzilla/show_bug.cgi?id=14308

             Bug #: 14308
           Summary: getaddrinfo DNS referral response returns host not found
                    when A and AAAA questions are sent and one response is a
                    referral
           Product: glibc
           Version: 2.15
            Status: NEW
          Severity: normal
          Priority: P2
         Component: network
        AssignedTo: unassigned@sourceware.org
        ReportedBy: law@redhat.com
    Classification: Unclassified

Created attachment 6494
  --> http://sourceware.org/bugzilla/attachment.cgi?id=6494
Wireshark Packet Capture

[ This is a cut-n-paste from a bug report in Red Hat's bugzilla database.
Refiling it here on Gino's behalf as this is really the right place. ]

I'm seeing a very edge-case issue with Linux hosts complaining about "host not found" name resolution errors when using the getaddrinfo() API call as provided by glibc.

When getaddrinfo() is called on hosts with the IPv6 module loaded, it sends parallel queries for A and AAAA records. If one of the responses is a referral response and the other is not (e.g. A is not a referral but AAAA is), the function terminates and returns -2 with a "No address associated with hostname" error.

Version-Release number of selected component (if applicable):
Any Linux host running glibc 2.9 or later

Affected distributions:
- Fedora 10-16, rawhide
- EL 6.0 and later

How reproducible:
100% under specific conditions:
- Linux running glibc 2.9 or later with IPv6 enabled
- Citrix Netscaler as a DNS server

Steps to Reproduce:
1. Install Fedora 16 (e.g. Live ISO)
2. Set up and install any DNS server (e.g. ISC BIND) and configure it to be an authoritative name server for a zone (e.g.
example.com)
3. Set up and install a Citrix Netscaler load-balancer (e.g. VPX appliance) and configure a DNS proxy VIP
4. Enable DNS caching on the Citrix Netscaler load-balancer (set dns parameter -cache YES)
5. Configure the Linux host to use the DNS VIP as its resolver
6. Disable nscd on the Linux host to bypass name caching (alternatively, run nscd -i hosts between calls)
7. Run a program that uses getaddrinfo() (e.g. curl, wget, telnet) to resolve a hostname

Actual results:
curl -v hostname.example.com
* getaddrinfo(3) failed for hostname.example.com:80
* Couldn't resolve host 'hostname.example.com
* Closing connection #0
curl: (6) Couldn't resolve host 'hostname.example.com'

Expected results:
The name should resolve and the connection should succeed.

Additional Info:
A buggy DNS server is exposing an edge-case bug in glibc.

Related Bugs:
Citrix SR 60783234

Comment 1, Gino LV. Ledesma, 2012-05-22 21:14:19 EDT

Created attachment 586223
Wireshark Packet Capture

Comment 2, Gino LV. Ledesma, 2012-05-22 21:14:34 EDT

I took a packet capture of the DNS request-response scenario (dns.pcap) and found the following:

Case 1: Response is served fresh (not from cache)
Frame 01 Client: A? active-mrepo.me.com
Frame 02 Client: AAAA? active-mrepo.me.com
Frame 03 VIP: A 1/1/1 active-mrepo.me.com. 17.172.194.16
Frame 04 VIP: AAAA 0/1/0 hostmaster...

Case 2: Response is served from cache
Frame 05 Client: A? active-mrepo.me.com
Frame 06 Client: AAAA? active-mrepo.me.com
Frame 07 VIP: A 1/0/0 active-mrepo.me.com. 17.172.194.16
Frame 08 VIP: AAAA 0/0/0 hostmaster...
# Try 2 (this is done automatically by glibc)
Frame 09 Client: A? active-mrepo.me.com
Frame 10 Client: AAAA? active-mrepo.me.com
Frame 11 VIP: A 1/0/0 active-mrepo.me.com. 17.172.194.16
Frame 12 VIP: AAAA 0/0/0 hostmaster...

As best as I can tell, there are two things happening here:
1) This DNS server is serving a DNS response that falls under NODATA type 3 (RFC 2308 section 2.2)
2) glibc's (mis-?)
interpretation of a referral response and short-circuiting its logic

glibc interprets a DNS response as a "referral" if all of the following conditions are met (see glibc 2.15, resolv/res_send.c lines 1301-1303):
a) rcode == NOERROR
b) ancount == 0
c) aa == 0
d) ra == 0
e) arcount == 0

I see that this change was introduced in glibc 2.9 and is still present in 2.15. In the above situation, when two responses come in (A response = authoritative, AAAA = referral), send_dg() immediately returns 0, causing __libc_res_nsend to try the next nameserver and repeat the query, ignoring any valid responses that may have come in.

--

Here is a debug call-trace of glibc with RES_DEBUG enabled:

looking up: active-mrepo
;; res_setoptions(" timeout:600 debug ", "conf")..
;; debug dots=0, statp->ndots=1, trailing_dot=0, name=active-mrepo
;; res_nquerydomain(active-mrepo, me.com, 1, 62321)
;; res_query(active-mrepo.me.com, 1, 62321)
;; res_nmkquery(QUERY, active-mrepo.me.com, IN, A)
;; res_nmkquery(QUERY, active-mrepo.me.com, IN, AAAA)
;; res_send()
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49477
;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; active-mrepo.me.com, type = A, class = IN
;; Querying server (# 1) address = 17.230.128.24
referred query:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8813
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; active-mrepo.me.com, type = AAAA, class = IN
;; got answer:
;; ns_initparse: Message too long
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8813
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; active-mrepo.me.com, type = AAAA, class = IN

The call trace doesn't show the A response because send_dg() returns 0 immediately when it sees the AAAA referral response.

Moving the next_ns: goto label above the if(buf2 != null) check under the SERVFAIL/NOTIMP/REFUSED code checks seems to work around the problem, but is most likely incorrect (due to the subsequent goto wait call).
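
The five-condition referral test described above can be sketched in isolation. This is only an illustration under stated assumptions, not the glibc source: struct dns_hdr_fields, parse_dns_header() and looks_like_referral() are hypothetical names, and the bit positions come from the standard 12-byte DNS message header layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type (not from glibc): the header fields the
 * referral heuristic looks at, plus nscount for reference. */
struct dns_hdr_fields {
    unsigned aa;       /* authoritative answer bit */
    unsigned ra;       /* recursion available bit */
    unsigned rcode;    /* response code, 0 == NOERROR */
    uint16_t ancount;  /* answer records */
    uint16_t nscount;  /* authority records */
    uint16_t arcount;  /* additional records */
};

/* Extract the fields from the fixed 12-byte DNS header.
 * Returns false if the buffer is too short to hold a header. */
bool parse_dns_header(const uint8_t *msg, size_t len,
                      struct dns_hdr_fields *out)
{
    if (msg == NULL || out == NULL || len < 12)
        return false;
    out->aa = (msg[2] >> 2) & 1u;
    out->ra = (msg[3] >> 7) & 1u;
    out->rcode = msg[3] & 0x0fu;
    out->ancount = (uint16_t)((msg[6] << 8) | msg[7]);
    out->nscount = (uint16_t)((msg[8] << 8) | msg[9]);
    out->arcount = (uint16_t)((msg[10] << 8) | msg[11]);
    return true;
}

/* The five conditions a) through e) from the report.
 * nscount is deliberately not part of the test. */
bool looks_like_referral(const struct dns_hdr_fields *h)
{
    return h->rcode == 0 && h->ancount == 0 && h->aa == 0
        && h->ra == 0 && h->arcount == 0;
}
```

Fed the header of the cached AAAA response from the trace (flags "qr rd", all record counts zero), looks_like_referral() returns true. Setting either the aa or the ra bit, or a non-zero ancount as in the A response, makes it return false.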
--

On a side note, I can only make this happen with the Citrix Netscaler acting as a DNS cache. Other caching resolvers (bind, dnsmasq, dnscache, pdns-recursor, etc.) do not expose this behavior in glibc because they set either the "aa" or the "ra" bit to 1. The Netscaler seems to be unique in serving the following combination of flags for type=AAAA:

rcode == NOERROR
ancount == 0
ra == 0
arcount == 0
nscount == 0

I've filed a bug with Citrix for this problem: Citrix SR 60783234

--

Possibly related bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=459756
https://bugs.launchpad.net/ubuntu/+source/apt/+bug/326718

--

Work-arounds:
1. Disable Netscaler caching (set dns param -cache NO)
2. Switch service type from DNS/DNS_TCP to UDP/TCP
3. Make the Netscaler authoritative for all zones that it is DNS-proxying for (add ns soaRecord ...)
4. Disable IPv6 on the client side

Some notes / tidbits:
1. This behavior does NOT affect older glibc hosts (e.g. EL 5.x), presumably because gethostbyname3_r does two separate calls for A/AAAA
2. The order of the responses (whether A or AAAA comes first) doesn't seem to matter

--

I just installed Fedora 17 i386 on a VM to test this. The problem still persists there (glibc-2.15-37.fc17.i686). I've also confirmed that the problem exists with upstream, stock glibc 2.15.
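
For step 7 of the reproduction steps, a small test program can stand in for curl or wget by calling getaddrinfo() directly. The following is a minimal sketch, not part of the original report: try_resolve() is a hypothetical helper, and it assumes that requesting AF_UNSPEC is what makes the resolver ask for both IPv4 and IPv6 addresses.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve `name` for both address families and
 * return getaddrinfo()'s result code (0 on success). `flags` is passed
 * through as ai_flags, e.g. 0 or AI_NUMERICHOST. */
int try_resolve(const char *name, int flags)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* ask for IPv4 and IPv6 results */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    int rc = getaddrinfo(name, NULL, &hints, &res);
    if (rc != 0) {
        /* Report both the numeric code and its text form. */
        fprintf(stderr, "getaddrinfo(%s) failed: %d (%s)\n",
                name, rc, gai_strerror(rc));
        return rc;
    }

    /* List the address family of each result that came back. */
    for (const struct addrinfo *p = res; p != NULL; p = p->ai_next)
        printf("%s: %s address\n", name,
               p->ai_family == AF_INET6 ? "IPv6" : "IPv4");

    freeaddrinfo(res);
    return 0;
}
```

On a host set up as described, calling try_resolve("hostname.example.com", 0) would be expected to take the failure path and print the error code. On an unaffected host, it should list at least the IPv4 result.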
Thread overview: 10+ messages
2012-06-29  5:28 law at redhat dot com [this message]
2012-08-13  1:04 ` [Bug network/14308] " phattanon at nettree dot co.th
2013-02-09  8:50 ` gauryogesh.nsit at gmail dot com
2013-05-29  2:03 ` atsushi at onoe dot org
2013-05-29  2:04 ` atsushi at onoe dot org
2014-04-15 16:08 ` siddhesh at redhat dot com
2014-04-30  6:23 ` cvs-commit at gcc dot gnu.org
2014-04-30  6:37 ` siddhesh at redhat dot com
2014-04-30  6:38 ` siddhesh at redhat dot com
2014-06-18  4:28 ` fweimer at redhat dot com