public inbox for libc-alpha@sourceware.org
* [PATCH] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650]
@ 2024-06-07 12:13 Maciej W. Rozycki
  2024-06-07 16:11 ` Florian Weimer
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Maciej W. Rozycki @ 2024-06-07 12:13 UTC (permalink / raw)
  To: libc-alpha; +Cc: Carlos O'Donell

Complement commit b03e4d7bd25b ("stdio: fix vfscanf with matches longer 
than INT_MAX (bug 27650)") and add a test case for the issue, inspired 
by the reproducer provided with the bug report.

This has been verified to succeed as of the commit referred to and to 
fail before it.

As the test requires 2GiB of data to be passed around, its performance 
has been evaluated on a choice of systems and the execution time has 
been determined to be about 10s for POWER9@2.166GHz, 28s for 
FU740@1.2GHz, and 48s for 74Kf@950MHz.  As this is on the verge of and 
beyond the default timeout, the timeout has been increased by a factor 
of 8.  Regardless, following recent practice, the test has been added to 
the standard rather than the extended set.
---
Hi,

 This has been verified with the `powerpc64le-linux-gnu' (IBM POWER9) 
native target and then, using the same host, with the `riscv64-linux-gnu' 
(SiFive FU740) and `mips-linux-gnu' (o32 ABI) (MIPS 74Kf) targets, so as 
to assess performance requirements for the test case.  For the choice 
between the standard and the extended set of tests I have been referred 
off list to: 
<https://inbox.sourceware.org/libc-alpha/b52d05a2-f4c1-e385-39bd-cd4a6f4f232f@redhat.com/>.
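
 For reference, the 2GiB figure follows from the loop bound in `do_write'.
A minimal sketch of the arithmetic, assuming a 32-bit `int'; the helper
names below are illustrative only and are not part of the patch:

```c
#include <limits.h>

/* Block size used by do_write in the patch.  */
enum { BLOCK_SIZE = 4096 };

/* Number of iterations of `for (i = 0; i <= INT_MAX / BLOCK_SIZE; i++)'.
   The bound is inclusive, hence the extra one.  */
static unsigned long long
blocks_written (void)
{
  return (unsigned long long) INT_MAX / BLOCK_SIZE + 1;
}

/* Total number of characters produced: one block per iteration.  */
static unsigned long long
bytes_written (void)
{
  return blocks_written () * BLOCK_SIZE;
}
```

 With a 32-bit `int' this comes to 524288 blocks, that is 2147483648
characters, which is exactly one more than INT_MAX.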

 Any questions, comments or concerns?  Otherwise OK to apply?

  Maciej
---
 stdio-common/Makefile            |    2 
 stdio-common/tst-scanf-bz27650.c |  216 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 218 insertions(+)

glibc-tst-scanf-bz27650.diff
Index: glibc/stdio-common/Makefile
===================================================================
--- glibc.orig/stdio-common/Makefile
+++ glibc/stdio-common/Makefile
@@ -244,6 +244,7 @@ tests := \
   tst-scanf-binary-c23 \
   tst-scanf-binary-gnu11 \
   tst-scanf-binary-gnu89 \
+  tst-scanf-bz27650 \
   tst-scanf-intn \
   tst-scanf-round \
   tst-scanf-to_inpunct \
@@ -314,6 +315,7 @@ generated += \
   tst-printf-fp-free.mtrace \
   tst-printf-fp-leak-mem.out \
   tst-printf-fp-leak.mtrace \
+  tst-scanf-bz27650.mtrace \
   tst-vfprintf-width-prec-mem.out \
   tst-vfprintf-width-prec.mtrace \
   # generated
Index: glibc/stdio-common/tst-scanf-bz27650.c
===================================================================
--- /dev/null
+++ glibc/stdio-common/tst-scanf-bz27650.c
@@ -0,0 +1,216 @@
+/* Test for BZ #27650, formatted input matching beyond INT_MAX.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <array_length.h>
+#include <error.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <limits.h>
+#include <mcheck.h>
+#include <poll.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <support/subprocess.h>
+#include <support/test-driver.h>
+
+/* Produce a stream of more than INT_MAX characters to stdout of which
+   none is the new line character.  This is executed as a subprocess
+   and the caller wants a void callee, upon the return from which the
+   process will terminate successfully, so in the case of a failure we
+   need to explicitly call exit with the failure status.  */
+
+static void
+do_write (void *arg)
+{
+  static const char s[] = { [0 ... 4095] = 'a' };
+  size_t i;
+
+  for (i = 0; i <= INT_MAX / sizeof (s); i++)
+    if (fwrite (s, 1, sizeof (s), stdout) != sizeof (s))
+      {
+	int err = errno;
+
+	/* Close our stdout so that there's no risk for us to block
+	   while `fscanf' is waiting on our stdout in `do_read' and
+	   nothing checking our stderr.  If closing has failed, then
+	   refrain from reporting anything, for the same reason.  */
+	if (fclose (stdout) == 0)
+	  error (0, err, "%s: fwrite: output error", __func__);
+	exit (EXIT_FAILURE);
+      }
+}
+
+/* Consume a stream of more than INT_MAX characters from IN of which
+   none is the new line character.  The call to fscanf is supposed
+   to complete upon the EOF condition on IN, however in the presence
+   of BZ #27650 it will terminate prematurely with characters still
+   outstanding in IN.  Diagnose the condition and return status
+   accordingly.  */
+
+static int
+do_read (FILE *in)
+{
+  int v;
+
+  v = fscanf (in, "%*[^\n]");
+  if (v == EOF || errno != 0)
+    {
+      error (0, errno, "%s: fscanf: input failure", __func__);
+      return EXIT_FAILURE;
+    }
+
+  if (!feof (in))
+    {
+      v = fgetc (in);
+      if (v == EOF)
+	error (0, errno, "%s: fgetc: input failure", __func__);
+      else if (v == '\n')
+	error (0, 0, "%s: unexpected new line character received", __func__);
+      else
+	error (0, 0,
+	       "%s: character received after end of file expected: \\x%02x",
+	       __func__, v);
+      return EXIT_FAILURE;
+    }
+
+  return EXIT_SUCCESS;
+}
+
+/* Run do_write in a subprocess and communicate its output produced to
+   stdout via a pipe to do_read.  Upon completion of do_read consume
+   any outstanding input from do_write and report any issues.  Return
+   success or failure based on the status of the subprocess and ours.  */
+
+int
+do_test (void)
+{
+  struct support_subprocess target;
+  FILE *chdout, *chderr;
+  int chdstatus;
+  int status;
+
+  mtrace ();
+
+  target = support_subprocess (do_write, NULL);
+  chdout = fdopen (target.stdout_pipe[0], "r");
+  if (chdout == NULL)
+    {
+      error (0, errno, "fdopen");
+      status = EXIT_FAILURE;
+    }
+  else
+    status = do_read (chdout);
+
+  /* Switch the pipes to non-blocking mode so that do_write does not
+     lock up waiting for its output to be read, then consume any
+     outstanding input.  Discard any output from do_write's stdout and
+     pass any output from do_write's stderr along to our stderr.  */
+  if (fcntl (target.stdout_pipe[0], F_SETFL, O_NONBLOCK) == -1
+      || fcntl (target.stderr_pipe[0], F_SETFL, O_NONBLOCK) == -1)
+    {
+      error (0, errno, "fcntl (F_SETFL)");
+      status = EXIT_FAILURE;
+    }
+  else if (chdout != NULL)
+    {
+      chderr = fdopen (target.stderr_pipe[0], "r");
+      if (chderr != NULL)
+	{
+	  struct pollfd fds[] =
+	    { { .fd = target.stderr_pipe[0], .events = POLLIN },
+	      { .fd = target.stdout_pipe[0], .events = POLLIN } };
+	  FILE *ss[array_length (fds)][2] =
+	    { { chderr, stderr }, { chdout } };
+	  int pollstatus;
+	  size_t i;
+
+	  while ((pollstatus = poll (fds, array_length (fds), -1)) >= 0)
+	    {
+	      bool stop;
+
+	      stop = false;
+	      for (i = 0; i < array_length (fds); i++)
+		{
+		  char buf[1024];
+		  char *s;
+
+		  if (fds[i].revents & POLLERR)
+		    fds[i].fd = -1;
+		  else if (fds[i].revents & POLLIN)
+		    do
+		      {
+			s = fgets (buf, sizeof (buf), ss[i][0]);
+			if (s != NULL)
+			  {
+			    if (ss[i][1] != NULL
+				&& fputs (buf, ss[i][1]) == EOF)
+			      {
+				error (0, errno, "fputs");
+				status = EXIT_FAILURE;
+				stop = true;
+			      }
+			  }
+			else if (errno == EAGAIN)
+			  clearerr (ss[i][0]);
+			else
+			  {
+			    error (0, errno, "fgets");
+			    status = EXIT_FAILURE;
+			    stop = true;
+			  }
+		      }
+		    while (s != NULL);
+		  else if (fds[i].revents & POLLHUP)
+		    fds[i].fd = -1;
+		}
+	      if (stop)
+		break;
+
+	      stop = true;
+	      for (i = 0; i < array_length (fds); i++)
+		if (fds[i].fd >= 0)
+		  {
+		    stop = false;
+		    break;
+		  }
+	      if (stop)
+		break;
+	    }
+	  if (pollstatus < 0)
+	    {
+	      error (0, errno, "poll");
+	      status = EXIT_FAILURE;
+	    }
+	}
+    }
+
+  /* Combine our subprocess's status and intended ours.  Only succeed
+     if both are good.  */
+  chdstatus = support_process_wait (&target);
+  if (status == EXIT_SUCCESS && WIFEXITED (chdstatus))
+    return WEXITSTATUS (chdstatus);
+  else if (status != EXIT_SUCCESS)
+    return status;
+  else
+    return EXIT_FAILURE;
+}
+
+#define TIMEOUT (DEFAULT_TIMEOUT * 8)
+#include <support/test-driver.c>



* Re: [PATCH] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650]
  2024-06-07 12:13 [PATCH] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650] Maciej W. Rozycki
@ 2024-06-07 16:11 ` Florian Weimer
  2024-06-07 17:24   ` Maciej W. Rozycki
  2024-06-21 15:32 ` [PING][PATCH] " Maciej W. Rozycki
  2024-07-01 10:25 ` [PING^2][PATCH] " Maciej W. Rozycki
  2 siblings, 1 reply; 5+ messages in thread
From: Florian Weimer @ 2024-06-07 16:11 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: libc-alpha, Carlos O'Donell

* Maciej W. Rozycki:

> +/* Produce a stream of more than INT_MAX characters to stdout of which
> +   none is the new line character.  This is executed as a subprocess
> +   and the caller wants a void callee, upon the return from which the
> +   process will terminate successfully, so in the case of a failure we
> +   need to explicitly call exit with the failure status.  */
> +
> +static void
> +do_write (void *arg)
> +{
> +  static const char s[] = { [0 ... 4095] = 'a' };
> +  size_t i;
> +
> +  for (i = 0; i <= INT_MAX / sizeof (s); i++)
> +    if (fwrite (s, 1, sizeof (s), stdout) != sizeof (s))
> +      {
> +	int err = errno;
> +
> +	/* Close our stdout so that there's no risk for us to block
> +	   while `fscanf' is waiting on our stdout in `do_read' and
> +	   nothing checking our stderr.  If closing has failed, then
> +	   refrain from reporting anything, for the same reason.  */
> +	if (fclose (stdout) == 0)
> +	  error (0, err, "%s: fwrite: output error", __func__);
> +	exit (EXIT_FAILURE);
> +      }

Is there a reason for using stdout or stream I/O for writing the file?
You could use create_temp_file from <support/temp_file.h> and write
directly to the file descriptor.

It may be faster to use <support/blob_repeat.h> to create a string in
memory because it uses alias mappings, and parse that string using
sscanf or fmemopen+vfscanf.
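
The in-memory variant could be sketched as below.  This is only a
small-scale illustration: it uses plain malloc rather than
<support/blob_repeat.h>, and the function name and the SIZE parameter are
hypothetical.  The real test would need a buffer of more than INT_MAX
bytes.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Scan SIZE bytes of 'a' held in memory with the same "%*[^\n]" format
   the test uses.  Return 0 if the scan consumed everything up to the
   end of the buffer, 1 otherwise.  SIZE must be nonzero.  */
static int
scan_memory_blob (size_t size)
{
  char *buf = malloc (size);
  if (buf == NULL)
    return 1;
  memset (buf, 'a', size);

  FILE *in = fmemopen (buf, size, "r");
  if (in == NULL)
    {
      free (buf);
      return 1;
    }

  /* No conversion is assigned, so a successful match returns 0.  */
  int v = fscanf (in, "%*[^\n]");
  int result = !(v == 0 && fgetc (in) == EOF && feof (in));

  fclose (in);
  free (buf);
  return result;
}
```

A call such as scan_memory_blob (1 << 20) exercises the same format
string without any pipe or subprocess.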

Thanks,
Florian



* Re: [PATCH] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650]
  2024-06-07 16:11 ` Florian Weimer
@ 2024-06-07 17:24   ` Maciej W. Rozycki
  0 siblings, 0 replies; 5+ messages in thread
From: Maciej W. Rozycki @ 2024-06-07 17:24 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha, Carlos O'Donell

On Fri, 7 Jun 2024, Florian Weimer wrote:

> * Maciej W. Rozycki:
> 
> > +/* Produce a stream of more than INT_MAX characters to stdout of which
> > +   none is the new line character.  This is executed as a subprocess
> > +   and the caller wants a void callee, upon the return from which the
> > +   process will terminate successfully, so in the case of a failure we
> > +   need to explicitly call exit with the failure status.  */
> > +
> > +static void
> > +do_write (void *arg)
> > +{
> > +  static const char s[] = { [0 ... 4095] = 'a' };
> > +  size_t i;
> > +
> > +  for (i = 0; i <= INT_MAX / sizeof (s); i++)
> > +    if (fwrite (s, 1, sizeof (s), stdout) != sizeof (s))
> > +      {
> > +	int err = errno;
> > +
> > +	/* Close our stdout so that there's no risk for us to block
> > +	   while `fscanf' is waiting on our stdout in `do_read' and
> > +	   nothing checking our stderr.  If closing has failed, then
> > +	   refrain from reporting anything, for the same reason.  */
> > +	if (fclose (stdout) == 0)
> > +	  error (0, err, "%s: fwrite: output error", __func__);
> > +	exit (EXIT_FAILURE);
> > +      }
> 
> Is there a reason for using stdout or stream I/O for writing the file?
> You could use create_temp_file from <support/temp_file.h> and write
> directly to the file descriptor.

 A pipe is used because over 2GiB of data has to be transferred.  It could 
be a bit of a strain on the target board if such a large amount were 
actually to be stored in a filesystem (even in the presence of LFS).  There 
may be limited storage available too.
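
 The pipe arrangement can be illustrated at a small scale as follows.  This
is a simplified sketch using plain fork and pipe rather than
support_subprocess; the function names and the COUNT parameter are
hypothetical, and none of the stderr handling of the actual test is
included.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: write COUNT bytes of 'a' to FD through stdio, one block
   at a time, then terminate without returning.  */
static void
produce (int fd, size_t count)
{
  char block[4096];
  memset (block, 'a', sizeof (block));

  FILE *out = fdopen (fd, "w");
  if (out == NULL)
    _exit (1);
  while (count > 0)
    {
      size_t n = count < sizeof (block) ? count : sizeof (block);
      if (fwrite (block, 1, n, out) != n)
	_exit (1);
      count -= n;
    }
  _exit (fclose (out) == 0 ? 0 : 1);
}

/* Parent side: scan the read end of the pipe with "%*[^\n]".  Return 0
   if the scan ran to end of file and the child exited successfully,
   1 otherwise.  COUNT must be nonzero.  */
static int
scan_through_pipe (size_t count)
{
  int fds[2];
  if (pipe (fds) != 0)
    return 1;

  pid_t pid = fork ();
  if (pid < 0)
    {
      close (fds[0]);
      close (fds[1]);
      return 1;
    }
  if (pid == 0)
    {
      close (fds[0]);
      produce (fds[1], count);
    }

  close (fds[1]);
  int result = 1;
  FILE *in = fdopen (fds[0], "r");
  if (in == NULL)
    close (fds[0]);
  else
    {
      /* No conversion is assigned, so a successful match returns 0.  */
      if (fscanf (in, "%*[^\n]") == 0 && fgetc (in) == EOF && feof (in))
	result = 0;
      fclose (in);
    }

  int status;
  if (waitpid (pid, &status, 0) != pid
      || !WIFEXITED (status) || WEXITSTATUS (status) != 0)
    result = 1;
  return result;
}
```

 No more than the pipe buffer's worth of data is ever outstanding at a
time, so nothing has to be stored on the target regardless of COUNT.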

 My choice of the `fwrite' interface over raw `write' was mostly for 
symmetry with the rest of the code and also for less hassle in handling, as 
with `fwrite' you don't have to take care of partial writes.  Besides, 
`stdout' is readily available, so there's no need for an extra library call 
(and an error to handle) to extract the underlying file descriptor.  Not a 
big deal overall, just a matter of style.
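
 To illustrate the partial write handling that raw `write' would call for,
here is a minimal sketch of the usual retry loop.  The `write_all' helper
is hypothetical and is not part of the patch or of the support library.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all LEN bytes at BUF to FD, retrying after partial writes and
   after interruption by a signal.  Return 0 on success, -1 on error
   with errno set by write.  */
static int
write_all (int fd, const void *buf, size_t len)
{
  const char *p = buf;

  while (len > 0)
    {
      ssize_t n = write (fd, p, len);
      if (n < 0)
	{
	  if (errno == EINTR)
	    continue;
	  return -1;
	}
      /* A short count is not an error; carry on with the remainder.  */
      p += n;
      len -= (size_t) n;
    }
  return 0;
}
```

 With `fwrite' the equivalent loop lives inside the library, which is
the hassle referred to above.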

> It may be faster to use <support/blob_repeat.h> to create a string in
> memory because it uses alias mappings, and parse that string using
> sscanf or fmemopen+vfscanf.

 Neat!  However with 32-bit targets the size of the allocation required 
may exceed the limit of the user VM supported by hardware or the OS (some 
targets give room for manoeuvre to the OS as to the user/kernel VM split 
while other ones have it hardwired).  As a quick check I have run 
stdlib/tst-strtod-overflow with my 74Kf target and, lo and behold:

UNSUPPORTED: stdlib/tst-strtod-overflow
original exit status 77
warning: memory allocation failed, cannot test for overflow

And I think we do want to have coverage for scanning INT_MAX+ characters 
especially with 32-bit targets, where it may be hitting more than just the 
limit of the `int' type and the specific issue reported with BZ #27650.

 Do my answers address your concerns?

  Maciej



* [PING][PATCH] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650]
  2024-06-07 12:13 [PATCH] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650] Maciej W. Rozycki
  2024-06-07 16:11 ` Florian Weimer
@ 2024-06-21 15:32 ` Maciej W. Rozycki
  2024-07-01 10:25 ` [PING^2][PATCH] " Maciej W. Rozycki
  2 siblings, 0 replies; 5+ messages in thread
From: Maciej W. Rozycki @ 2024-06-21 15:32 UTC (permalink / raw)
  To: libc-alpha; +Cc: Florian Weimer

On Fri, 7 Jun 2024, Maciej W. Rozycki wrote:

> Complement commit b03e4d7bd25b ("stdio: fix vfscanf with matches longer 
> than INT_MAX (bug 27650)") and add a test case for the issue, inspired 
> by the reproducer provided with the bug report.

 Ping for: 
<https://sourceware.org/pipermail/libc-alpha/2024-June/157283.html>,
<https://patchwork.sourceware.org/project/glibc/patch/7bd09b7b-bbd4-ac32-02d5-687ec5e01986@redhat.com/>.

  Maciej



* [PING^2][PATCH] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650]
  2024-06-07 12:13 [PATCH] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650] Maciej W. Rozycki
  2024-06-07 16:11 ` Florian Weimer
  2024-06-21 15:32 ` [PING][PATCH] " Maciej W. Rozycki
@ 2024-07-01 10:25 ` Maciej W. Rozycki
  2 siblings, 0 replies; 5+ messages in thread
From: Maciej W. Rozycki @ 2024-07-01 10:25 UTC (permalink / raw)
  To: libc-alpha; +Cc: Florian Weimer

On Fri, 7 Jun 2024, Maciej W. Rozycki wrote:

> Complement commit b03e4d7bd25b ("stdio: fix vfscanf with matches longer 
> than INT_MAX (bug 27650)") and add a test case for the issue, inspired 
> by the reproducer provided with the bug report.

 Ping for:
<https://sourceware.org/pipermail/libc-alpha/2024-June/157283.html>,
<https://patchwork.sourceware.org/project/glibc/patch/7bd09b7b-bbd4-ac32-02d5-687ec5e01986@redhat.com/>.

  Maciej


