public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Test status before h_errno in gaih_inet
@ 2016-07-25 18:31 Stan Shebs
  2016-08-19 16:01 ` Florian Weimer
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Stan Shebs @ 2016-07-25 18:31 UTC (permalink / raw)
  To: libc-alpha

[-- Attachment #1: Type: text/plain, Size: 727 bytes --]

Per feedback on my previous attempt to fix getaddrinfo failures
after recovering from a failure:

https://sourceware.org/ml/libc-alpha/2016-07/msg00341.html

here is a simpler patch that checks status before checking
h_errno.

The patch basically adds back a test that was previously
present.  I am not 100% convinced that it catches all cases
of NSS status being set in a way that ought to be reported
as an error coming from getaddrinfo, but I can't find
any actual examples.

2016-07-25  Stan Shebs  <stanshebs@google.com>

        * sysdeps/posix/getaddrinfo.c (gaih_inet): Test status before
        looking at h_errno.
        * posix/tst-getaddrinfo6.c: New test.
        * posix/Makefile (tests): Add tst-getaddrinfo6.

[-- Attachment #2: gai-patch-2 --]
[-- Type: application/octet-stream, Size: 4125 bytes --]

diff --git a/posix/Makefile b/posix/Makefile
index 5b0e298..7211393 100644
--- a/posix/Makefile
+++ b/posix/Makefile
@@ -90,7 +90,7 @@ tests		:= tstgetopt testfnm runtests runptests	     \
 		   bug-getopt5 tst-getopt_long1 bug-regex34 bug-regex35 \
 		   tst-pathconf tst-getaddrinfo4 tst-rxspencer-no-utf8 \
 		   tst-fnmatch3 bug-regex36 tst-getaddrinfo5 \
-		   tst-posix_spawn-fd
+		   tst-posix_spawn-fd tst-getaddrinfo6
 xtests		:= bug-ga2
 ifeq (yes,$(build-shared))
 test-srcs	:= globtest
diff --git a/posix/tst-getaddrinfo6.c b/posix/tst-getaddrinfo6.c
new file mode 100644
index 0000000..b515544
--- /dev/null
+++ b/posix/tst-getaddrinfo6.c
@@ -0,0 +1,115 @@
+/* Check that getaddrinfo still works after recovery from failure.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* Thanks to Dave Clausen for the test concept.  */
+
+#include <stdio.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <netdb.h>
+#include <errno.h>
+#include <sys/resource.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#define NUMFDS 10
+int fds[NUMFDS];
+
+static int
+do_test (void)
+{
+  char hostname[1024];
+  struct rlimit rlim;
+  int gaierr;
+  struct addrinfo hints, *ai;
+
+  gethostname (hostname, sizeof (hostname));
+
+  /* Bring the limit of open files down to a small predictable value.  */
+
+  if (getrlimit (RLIMIT_NOFILE, &rlim) < 0)
+    {
+      printf ("error: getrlimit: %m\n");
+      return 1;
+    }
+  rlim.rlim_cur = NUMFDS;
+  if (setrlimit (RLIMIT_NOFILE, &rlim) < 0)
+    {
+      printf ("error: setrlimit: %m\n");
+      return 1;
+    }
+
+  memset (&hints, '\0', sizeof (hints));
+  hints.ai_family = AF_UNSPEC;
+  hints.ai_socktype = SOCK_STREAM;
+  hints.ai_flags = AI_CANONNAME;
+
+  gaierr = getaddrinfo (hostname, NULL, &hints, &ai);
+  if (gaierr != 0)
+    {
+      printf ("error: getaddrinfo: %s\n", gai_strerror (gaierr));
+      return 1;
+    }
+
+  // Create file descriptors until we completely run out.
+
+  memset (fds, '\0', sizeof (fds));
+  for (int i = 0; i < NUMFDS; ++i)
+    {
+      int fd = dup (STDOUT_FILENO);
+
+      if (fd == -1 && errno == EMFILE)
+	break;
+
+      fds[i] = fd;
+    }
+
+  gaierr = getaddrinfo (hostname, NULL, &hints, &ai);
+  if (gaierr != 0)
+    {
+      printf ("error (expected): getaddrinfo: %s\n", gai_strerror (gaierr));
+      /* Don't give up, we expected this error.  */
+    }
+
+  // Now close the fds, freeing them up for use.
+
+  for (int i = 0; i < NUMFDS; ++i)
+    {
+      if (fds[i] == 0)
+	break;
+
+      close (fds[i]);
+    }
+
+  /* This one should succeed again.  */
+
+  gaierr = getaddrinfo (hostname, NULL, &hints, &ai);
+  if (gaierr != 0)
+    {
+      printf ("error: getaddrinfo: %s\n", gai_strerror (gaierr));
+      return 1;
+    }
+
+  return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+
+#include "../test-skeleton.c"
diff --git a/sysdeps/posix/getaddrinfo.c b/sysdeps/posix/getaddrinfo.c
index 574ce08..c1479b6 100644
--- a/sysdeps/posix/getaddrinfo.c
+++ b/sysdeps/posix/getaddrinfo.c
@@ -1044,7 +1044,7 @@ gaih_inet (const char *name, const struct gaih_service *service,
 
 	  _res.options |= old_res_options & RES_USE_INET6;
 
-	  if (h_errno == NETDB_INTERNAL)
+	  if (status == NSS_STATUS_UNAVAIL && h_errno == NETDB_INTERNAL)
 	    {
 	      result = -EAI_SYSTEM;
 	      goto free_and_return;

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Test status before h_errno in gaih_inet
  2016-07-25 18:31 [PATCH] Test status before h_errno in gaih_inet Stan Shebs
@ 2016-08-19 16:01 ` Florian Weimer
  2017-08-03  9:48   ` Florian Weimer
  2016-08-29 10:35 ` Siddhesh Poyarekar
  2017-08-07 13:07 ` Florian Weimer
  2 siblings, 1 reply; 6+ messages in thread
From: Florian Weimer @ 2016-08-19 16:01 UTC (permalink / raw)
  To: Stan Shebs; +Cc: libc-alpha

On 07/25/2016 06:50 PM, Stan Shebs wrote:
> Per feedback on my previous attempt to fix getaddrinfo failures
> after recovering from a failure:
>
> https://sourceware.org/ml/libc-alpha/2016-07/msg00341.html
>
> here is a simpler patch that checks status before checking
> h_errno.
>
> The patch basically adds back a test that was previously
> present.  I am not 100% convinced that it catches all cases
> of NSS status being set in a way that ought to be reported
> as an error coming from getaddrinfo, but I can't find
> any actual examples.
>
> 2016-07-25  Stan Shebs  <stanshebs@google.com>
>
>         * sysdeps/posix/getaddrinfo.c (gaih_inet): Test status before
>         looking at h_errno.
>         * posix/tst-getaddrinfo6.c: New test.
>         * posix/Makefile (tests): Add tst-getaddrinfo6.

It's unclear whether the test case is intended to run against nss_files, 
nss_dns, or both.  You should call __nss_configure_lookup to make your 
selection explicit.  This also avoids accidentally running against a 
system nscd daemon.

I'm still trying to figure out what the expected error reporting 
behavior for functions returning enum nss_status is.  There seems to be 
some expectation that enum nss_status != NSS_STATUS_SUCCESS implies that 
h_errno is valid, but this code in getaddrinfo.c itself contradicts that:

       status = NSS_STATUS_UNAVAIL;
       /* Could not load any of the lookup functions.  Indicate
          an internal error if the failure was due to a system
	 error other than the file not being found.  We use the
	 errno from the last failed callback.  */
       if (errno != 0 && errno != ENOENT)
	__set_h_errno (NETDB_INTERNAL);

It would have to set h_errno unconditionally in order to preserve the 
invariant.

The _nss_files_gethostbyname3_r implementation in nss_files calls 
internal_setent, but does not update *herrnop for status != 
NSS_STATUS_SUCCESS.  This is in contrast to _nss_files_gethostbyname4_r, 
which does.

I need to dig further and write up what I find, but I suspect that we 
may have to set h_errno to 0 temporarily to obtain maximum compatibility 
with existing NSS modules, and base the error check on that.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Test status before h_errno in gaih_inet
  2016-07-25 18:31 [PATCH] Test status before h_errno in gaih_inet Stan Shebs
  2016-08-19 16:01 ` Florian Weimer
@ 2016-08-29 10:35 ` Siddhesh Poyarekar
  2016-08-29 10:37   ` Siddhesh Poyarekar
  2017-08-07 13:07 ` Florian Weimer
  2 siblings, 1 reply; 6+ messages in thread
From: Siddhesh Poyarekar @ 2016-08-29 10:35 UTC (permalink / raw)
  To: Stan Shebs, libc-alpha



On Monday 25 July 2016 10:20 PM, Stan Shebs wrote:
> The patch basically adds back a test that was previously
> present.  I am not 100% convinced that it catches all cases
> of NSS status being set in a way that ought to be reported
> as an error coming from getaddrinfo, but I can't find
> any actual examples.
When I had changed this condition some time ago, my understanding was
that there are only two cases where h_errno is set to NETDB_INTERNAL:
when status is NSS_STATUS_UNAVAILABLE and when NSS_STATUS_TRYAGAIN.  I
had dropped the status check as unnecessary (in multiple places IIRC)
because the NSS_STATUS_TRYAGAIN checks earlier was exhaustive.  However,
I did not take into consideration the fact that h_errno will persist
across calls and hence removing that check was wrong.

Based on that, I agree with your fix is correct but it is incomplete. 
There need to be similar checks for every place that test for h_errno ==
NETDB_INTERNAL, like in the gethosts macro and elsewhere in gaih_inet. 
IIRC gethosts is exercised for AF_INET, so you might be able to verify
that with your test case.

Siddhesh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Test status before h_errno in gaih_inet
  2016-08-29 10:35 ` Siddhesh Poyarekar
@ 2016-08-29 10:37   ` Siddhesh Poyarekar
  0 siblings, 0 replies; 6+ messages in thread
From: Siddhesh Poyarekar @ 2016-08-29 10:37 UTC (permalink / raw)
  To: Stan Shebs, libc-alpha

On Monday 29 August 2016 04:05 PM, Siddhesh Poyarekar wrote:
> Based on that, I agree with your fix is correct but it is incomplete. 

Ugh, let me try that one again:

    Based on that, I agree with your fix but it is incomplete.

Siddhesh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Test status before h_errno in gaih_inet
  2016-08-19 16:01 ` Florian Weimer
@ 2017-08-03  9:48   ` Florian Weimer
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Weimer @ 2017-08-03  9:48 UTC (permalink / raw)
  To: Stan Shebs; +Cc: libc-alpha

On 08/19/2016 06:01 PM, Florian Weimer wrote:
> I need to dig further and write up what I find, but I suspect that we
> may have to set h_errno to 0 temporarily to obtain maximum compatibility
> with existing NSS modules, and base the error check on that.

_nss_files_gethostbyname3_r sets *herrnop on success to a non-zero
value, so that does not work.

Florian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Test status before h_errno in gaih_inet
  2016-07-25 18:31 [PATCH] Test status before h_errno in gaih_inet Stan Shebs
  2016-08-19 16:01 ` Florian Weimer
  2016-08-29 10:35 ` Siddhesh Poyarekar
@ 2017-08-07 13:07 ` Florian Weimer
  2 siblings, 0 replies; 6+ messages in thread
From: Florian Weimer @ 2017-08-07 13:07 UTC (permalink / raw)
  To: Stan Shebs, libc-alpha

On 07/25/2016 06:50 PM, Stan Shebs wrote:
>         * posix/tst-getaddrinfo6.c: New test.

The test does not reproduce the issue for me, with Fedora 24 (glibc
2.23-based).

How confident are you about your root cause analysis?

I see a completely different bug in the NSS framework: If a call into
NSS triggers dlopen, and that dlopen fails with a system error, we cache
the failure permanently and never attempt to open the service module again.

This issue could well lead to persistent getaddrinfo failures because
glibc assumes none of the configured NSS modules are available.

Could this match what you experienced?

Thanks,
Florian

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-08-07 13:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-25 18:31 [PATCH] Test status before h_errno in gaih_inet Stan Shebs
2016-08-19 16:01 ` Florian Weimer
2017-08-03  9:48   ` Florian Weimer
2016-08-29 10:35 ` Siddhesh Poyarekar
2016-08-29 10:37   ` Siddhesh Poyarekar
2017-08-07 13:07 ` Florian Weimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).