public inbox for glibc-bugs@sourceware.org
* [Bug libc/11709] New: glibc domain resolution does not obtain IP addresses from truncated UDP DNS responses.
@ 2010-06-16 13:28 khanipov at gmail dot com
2010-06-16 21:11 ` [Bug libc/11709] " pasky at suse dot cz
2010-06-17 4:18 ` khanipov at gmail dot com
0 siblings, 2 replies; 3+ messages in thread
From: khanipov at gmail dot com @ 2010-06-16 13:28 UTC (permalink / raw)
To: glibc-bugs
Contents:
1. Problem
2. Investigation
3. Conclusions
1. Problem.
At home I use a D-Link Dir-320 router in DHCP mode to access the Internet. It
has an option to relay incoming DNS queries to the Internet Service Provider's
(ISP) DNS server, so that the router itself acts as a DNS server for my PC.
Otherwise the PC must use the ISP's DNS server directly. (The DNS server is
configured via DHCP, so I don't need to change anything on my PC: if I want the
router to act as a DNS server, I just tick a box in its settings.) By default
the 'relay DNS' option was turned on, and everything had been going well until
I migrated from Windows to Linux...
I noticed that some webpages didn't load properly (pictures and some other
parts were missing). At first I thought that the problem was with the webpage,
not with my PC. However, I also noticed that many other webpages which I had
visited often before (xyz.livejournal.com, youtube) sometimes took a very long
time to load. In the browser I could see something like 'resolving host...' or
'waiting for pagead2.googlesyndication.com'. I was curious what could cause
that.
2. Investigation.
First of all I tried to open these web pages using Windows. Everything went
fine: all pages opened quickly without any delay. This told me that the problem
was connected with Linux. My next idea was that it was caused by faulty domain
name resolution, so I tried turning off the router's DNS relay mode. After that
everything went fine (on Linux). One may think that it was the router, not
Linux, that caused the problems, but that does not explain why Windows worked
fine even with DNS relay.
I took some domain names which Linux could not resolve through the router's DNS
relay and which came up often during my web surfing
(pagead2.googlesyndication.com, w.sharethis.com) and tried to ping them. For
example, ping pagead2.googlesyndication.com froze for about 15 seconds and then
informed me that the host was unknown. After I turned off DNS relay, ping
worked fine. On Windows, however, ping worked even with DNS relay: there were
still several seconds of delay, but the name got resolved.
At this point I realised that I needed to dig deeper into the DNS protocol. I
found out that both TCP and UDP queries are specified in the RFC. A UDP reply
may be truncated if the full response data does not fit within a single UDP
datagram, and if an application needs all the information contained in the
response it can set up a TCP connection and repeat its query, thus avoiding the
UDP size limitation.
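
To make the truncation mechanism concrete, here is a minimal sketch of how a
client can detect a truncated reply. It assumes only the header layout from
RFC 1035 (a 12-byte header whose second 16-bit word holds the flags, with TC at
mask 0x0200). The function name and the sample headers are made up for
illustration; this is not how glibc implements the check.

```python
import struct

# TC (truncation) bit inside the 16-bit flags word of the DNS header.
TC_MASK = 0x0200


def is_truncated(message: bytes) -> bool:
    """Return True if the DNS message header has the TC bit set."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes; message is too short")
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_MASK)


# Hand-built headers: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
# 0x8180 = response + recursion desired + recursion available.
# 0x8380 = the same flags with TC added.
complete_header = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 4, 0, 0)
truncated_header = struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 4, 0, 0)

print(is_truncated(complete_header), is_truncated(truncated_header))
```

A resolver that sees the TC bit set is expected to repeat the query over TCP,
which is the step the router in this report could not handle.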
I became familiar with the DNS query and response protocol details and fired up
tcpdump:
sudo tcpdump -i eth1 -X udp port domain
It showed me that when I was pinging pagead2.googlesyndication.com, responses
from the router arrived consistently and contained all four IP addresses of the
host. What made them different from the DNS responses I observed when pinging
'resolvable' hosts was that the truncation (TC) flag was set, meaning that the
UDP packet was too small for the whole response. The theory was born then: the
Linux domain name resolution system discards truncated UDP replies and sets up
a TCP connection to get the full response, while my router fails to accept TCP
DNS queries. The theory was confirmed when I tried to run
nslookup w.sharethis.com
which showed the message ";; Truncated, retrying in TCP mode ;; connection timed
out; no servers could be reached", meaning that the router failed to process the
TCP query. Yet the initial UDP reply from the router contained all the necessary
IP addresses! I guess that Windows did not discard this initial reply despite
its truncation and used its data after the TCP connection attempt failed (this
would explain the delay in name resolution which I observed on Windows).
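
The point that a truncated reply can still carry usable addresses can be shown
with a small parser sketch. It assumes the RFC 1035 wire format and reads A
records from the answer section whether or not TC is set. The sample message is
hand-built: the addresses are documentation placeholders (192.0.2.x), not the
ones the router returned, and the helper names are hypothetical.

```python
import socket
import struct


def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[offset]
        if length == 0:                # root label terminates the name
            return offset + 1
        if length & 0xC0 == 0xC0:      # two-byte compression pointer ends it
            return offset + 2
        offset += 1 + length           # ordinary label: length byte + text


def a_records(msg: bytes) -> list[str]:
    """Collect IPv4 addresses from the answer section of a DNS response."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", msg[:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4    # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        try:
            offset = skip_name(msg, offset)
            rtype, rclass, _ttl, rdlength = struct.unpack_from("!HHIH", msg, offset)
        except (IndexError, struct.error):
            break                              # message ends mid-record
        offset += 10
        rdata = msg[offset:offset + rdlength]
        if len(rdata) < rdlength:
            break                              # record data cut short
        offset += rdlength
        if rtype == 1 and rclass == 1 and rdlength == 4:   # A record, class IN
            addresses.append(socket.inet_ntoa(rdata))
    return addresses


def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed labels."""
    labels = (bytes([len(p)]) + p.encode("ascii") for p in name.split("."))
    return b"".join(labels) + b"\x00"


def answer(ip: str) -> bytes:
    """Build one A record whose name is a pointer to the question (offset 12)."""
    return b"\xc0\x0c" + struct.pack("!HHIH", 1, 1, 60, 4) + socket.inet_aton(ip)


# Response with TC set (flags 0x8380), one question and two A records.
sample = (
    struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 2, 0, 0)
    + encode_name("w.sharethis.com") + struct.pack("!HH", 1, 1)
    + answer("192.0.2.1") + answer("192.0.2.2")
)

print(a_records(sample))
```

The sketch only shows that the data is present in the packet. It does not show
whether the record set is complete, which a client cannot tell from a truncated
reply alone.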
3. Conclusions.
1. The D-Link Dir-320 router does not comply with DNS server standards (which
is outside the scope of glibc).
2. The Linux domain name resolution system does not obtain IP addresses from
truncated UDP DNS responses. I think this is a bug, because most applications
using the Internet just need a way to translate a domain name into an IP
address and are not interested in the additional information present in a full
response, so there is no need for them to set up a TCP connection to get the
full DNS record data. If the UDP response contains IP addresses, even if it has
the truncation flag set, it should be used without any further queries.
I would also like to note that the described problem can make many people (at
least those using a D-Link Dir-320) stop using Linux systems. The problem had
been irritating me for a long time until I finally took the time to find the
cause. I guess that many would just go back to Windows, thinking that Linux is
to blame for not loading web pages.
As far as I know, name resolution is performed via the getaddrinfo function
which, if I am correct, resides inside glibc; that is why I am posting a report
here.
--
Summary: glibc domain resolution does not obtain IP addresses
from truncated UDP DNS responses.
Product: glibc
Version: 2.10
Status: NEW
Severity: normal
Priority: P2
Component: libc
AssignedTo: drepper at redhat dot com
ReportedBy: khanipov at gmail dot com
CC: glibc-bugs at sources dot redhat dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=11709
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
* [Bug libc/11709] glibc domain resolution does not obtain IP addresses from truncated UDP DNS responses.
2010-06-16 13:28 [Bug libc/11709] New: " khanipov at gmail dot com
@ 2010-06-16 21:11 ` pasky at suse dot cz
2010-06-17 4:18 ` khanipov at gmail dot com
1 sibling, 0 replies; 3+ messages in thread
From: pasky at suse dot cz @ 2010-06-16 21:11 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From pasky at suse dot cz 2010-06-16 21:10 -------
Thanks for your report - though its form is more of a blog post, and a more
to-the-point summary would be easier to process.
As you note, the router behavior is completely non-standard. However, we cannot
just decide we do not need anything else from the DNS record, since it may be
crucial to get all the records e.g. to properly sort and choose the appropriate
IP address based on the preferred family and scope; getaddrinfo() supports
complex ordering mechanisms for this (see gai.conf(5)). Ignoring the rest of a
truncated reply would cause invalid behavior.
Thus, even if we introduced a special option to process truncated UDP replies,
that behavior would actually be harmful, and it is better to use a different
(e.g. locally running) caching nameserver instead. Since this is the first bug
report about this router I have ever seen, I don't think your problem is that
widespread; but even if it were, it would be much better for the distributions
to test for and handle such broken DNS servers specially (e.g. as part of the
DHCP negotiation), falling back to a local caching nameserver instead of forcing
glibc to process broken DNS replies at all costs.
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX
http://sourceware.org/bugzilla/show_bug.cgi?id=11709
* [Bug libc/11709] glibc domain resolution does not obtain IP addresses from truncated UDP DNS responses.
2010-06-16 13:28 [Bug libc/11709] New: " khanipov at gmail dot com
2010-06-16 21:11 ` [Bug libc/11709] " pasky at suse dot cz
@ 2010-06-17 4:18 ` khanipov at gmail dot com
1 sibling, 0 replies; 3+ messages in thread
From: khanipov at gmail dot com @ 2010-06-17 4:18 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From khanipov at gmail dot com 2010-06-17 04:18 -------
(In reply to comment #1)
Thank you for the quick reply and thorough explanation! I am sorry to have made
you read such a long story; next time I will try to write a briefer report.
Now I understand that using only a truncated UDP reply to perform getaddrinfo()
would do no good and that the system must try to set up a TCP connection to get
all the information needed (for instance, to sort IP addresses according to
their priorities, as I understood from your reply). However, I am curious
whether it makes sense to change the behavior so that the initial UDP reply is
not discarded completely:
1. Try UDP query.
2. If the UDP response has a TC flag (truncation), try TCP query.
3. If TCP fails, return the information obtained from the initial UDP
response.
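
The three steps above can be sketched as follows. This is only an illustration
of the proposed fallback order, not glibc code and not behavior that was
accepted in this thread. The function `resolve_with_fallback` and the stand-in
transports are hypothetical, and no network traffic is sent.

```python
import struct

TC_MASK = 0x0200  # TC (truncation) bit in the DNS header flags word


def is_truncated(reply: bytes) -> bool:
    """True if the DNS header of `reply` has the TC bit set."""
    return bool(struct.unpack_from("!H", reply, 2)[0] & TC_MASK)


def resolve_with_fallback(query, send_udp, send_tcp):
    """Hypothetical helper: send_udp and send_tcp are caller-supplied
    callables that take the query bytes and return the reply bytes."""
    udp_reply = send_udp(query)        # 1. Try UDP query.
    if not is_truncated(udp_reply):
        return udp_reply
    try:
        return send_tcp(query)         # 2. TC flag set: try TCP query.
    except OSError:                    # covers timeouts and refused connections
        return udp_reply               # 3. TCP failed: keep the UDP data.


# Stand-in replies (headers only) and transports, for illustration.
TRUNCATED_REPLY = struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 4, 0, 0)
FULL_REPLY = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 6, 0, 0)


def fake_udp(query):
    return TRUNCATED_REPLY


def working_tcp(query):
    return FULL_REPLY


def broken_tcp(query):
    raise TimeoutError("router does not answer DNS over TCP")


# With a working TCP path the full reply wins; with a broken one the
# truncated UDP reply is returned instead of an error.
print(resolve_with_fallback(b"query", fake_udp, working_tcp) == FULL_REPLY)
print(resolve_with_fallback(b"query", fake_udp, broken_tcp) == TRUNCATED_REPLY)
```

Note that in step 3 the caller still cannot tell whether the truncated reply
held the complete set of addresses, which is the concern raised in comment #1.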
Of course I understand that this is far from a high-priority modification, if
it is sensible at all. But if it is sensible, I could try to dig into the
glibc/resolv/ code and propose some changes. So I am interested in your opinion
about the idea.
Thanks.
--
http://sourceware.org/bugzilla/show_bug.cgi?id=11709