public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc r14-8487] tree-ssa-strlen: Fix pdata->maxlen computation [PR110603]
@ 2024-01-29  9:21 Jakub Jelinek
  0 siblings, 0 replies; only message in thread
From: Jakub Jelinek @ 2024-01-29  9:21 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:b338fdbc2b74f25c07da263a1f5983421fac1a53

commit r14-8487-gb338fdbc2b74f25c07da263a1f5983421fac1a53
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jan 29 10:20:32 2024 +0100

    tree-ssa-strlen: Fix pdata->maxlen computation [PR110603]
    
    On the following testcase we emit an invalid range of [2, 1] due to
    UB in the source.  Older VRP code silently swapped the boundaries and
    made [1, 2] range out of it, but newer code just ICEs on it.
    
    The reason for pdata->minlen 2 is that we see a memcpy in this case
    setting both elements of the array to non-zero value, so strlen (a)
    can't be smaller than 2.  The reason for pdata->maxlen 1 is that in
    char a[2] array without UB there can be at most 1 non-zero character
    because there needs to be '\0' termination in the buffer too.
    
    IMHO we shouldn't create invalid ranges like that and even creating
    for that case a range [1, 2] looks wrong to me, so the following patch
    just doesn't set maxlen in that case to the array size - 1, matching
    what will really happen at runtime when triggering such UB (strlen will
    be at least 2, perhaps more or will crash).
    This is what the second hunk of the patch does.
    
    The first hunk fixes a fortunately harmless thinko.
    If the strlen pass knows the string length (i.e. get_string_length
    function returns non-NULL), we take a different path, we get to this
    only if all we know is that there are certain number of non-zero
    characters but we don't know what it is followed with, whether further
    non-zero characters or zero termination or either of that.
    If we know exactly how many non-zero characters it is, such as
    char a[42];
    ...
      memcpy (a, "01234567890123456789", 20);
    then we take an earlier if for the INTEGER_CST case and set correctly
    just pdata->minlen to 20 in that case, but if we have something like
      int len;
      ...
      if (len < 15 || len > 32) return;
      memcpy (a, "0123456789012345678901234567890123456789", len);
    then we have [15, 32] range for the nonzero_chars and we set pdata->minlen
    correctly to 15, but incorrectly set also pdata->maxlen to 32.  That is
    not what the above implies, it just means that in some cases we know that
    there are at least 32 non-zero characters, followed by something we don't
    know.  There is no guarantee that there is '\0' right after it, so it
    means nothing.
    The reason this is harmless, just confusing, is that the code a few lines
    later fortunately overwrites this incorrect pdata->maxlen value with
    something different (either array length - 1 or all ones etc.).
    
    2024-01-29  Jakub Jelinek  <jakub@redhat.com>
    
            PR tree-optimization/110603
            * tree-ssa-strlen.cc (get_range_strlen_dynamic): Remove incorrect
            setting of pdata->maxlen to vr.upper_bound (which is unconditionally
            overwritten anyway).  Avoid creating invalid range with minlen
            larger than maxlen.  Formatting fix.
    
            * gcc.c-torture/compile/pr110603.c: New test.

Diff:
---
 gcc/testsuite/gcc.c-torture/compile/pr110603.c | 16 ++++++++++++++++
 gcc/tree-ssa-strlen.cc                         | 19 +++++++++++++++----
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr110603.c b/gcc/testsuite/gcc.c-torture/compile/pr110603.c
new file mode 100644
index 000000000000..afcc1a522c23
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr110603.c
@@ -0,0 +1,16 @@
+/* PR tree-optimization/110603 */
+
+typedef __SIZE_TYPE__ size_t;
+void *memcpy (void *, const void *, size_t);
+int snprintf (char *restrict, size_t, const char *restrict, ...);
+extern char a[2];
+void bar (void);
+
+void
+foo (void)
+{
+  memcpy (a, "12", sizeof (a));
+  int b = snprintf (0, 0, "%s", a);
+  if (b <= 3)
+    bar ();
+}
diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc
index c20db7a0cd95..f2fd47d77e89 100644
--- a/gcc/tree-ssa-strlen.cc
+++ b/gcc/tree-ssa-strlen.cc
@@ -1228,7 +1228,6 @@ get_range_strlen_dynamic (tree src, gimple *stmt,
 		{
 		  tree type = vr.type ();
 		  pdata->minlen = wide_int_to_tree (type, vr.lower_bound ());
-		  pdata->maxlen = wide_int_to_tree (type, vr.upper_bound ());
 		}
 	    }
 	  else
@@ -1253,9 +1252,21 @@ get_range_strlen_dynamic (tree src, gimple *stmt,
 		{
 		  ++off;   /* Increment for the terminating nul.  */
 		  tree toffset = build_int_cst (size_type_node, off);
-		  pdata->maxlen = fold_build2 (MINUS_EXPR, size_type_node, size,
-					       toffset);
-		  pdata->maxbound = pdata->maxlen;
+		  pdata->maxlen = fold_build2 (MINUS_EXPR, size_type_node,
+					       size, toffset);
+		  if (tree_int_cst_lt (pdata->maxlen, pdata->minlen))
+		    /* This can happen when triggering UB, when base is an
+		       array which is known to be filled with at least size
+		       non-zero bytes.  E.g. for
+		       char a[2]; memcpy (a, "12", sizeof a);
+		       We don't want to create an invalid range [2, 1]
+		       where 2 comes from the number of non-zero bytes and
+		       1 from longest valid zero-terminated string that can
+		       be stored in such an array, so pick just one of
+		       those, pdata->minlen.  See PR110603.  */
+		    pdata->maxlen = build_all_ones_cst (size_type_node);
+		  else
+		    pdata->maxbound = pdata->maxlen;
 		}
 	      else	
 		pdata->maxlen = build_all_ones_cst (size_type_node);

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2024-01-29  9:21 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-29  9:21 [gcc r14-8487] tree-ssa-strlen: Fix pdata->maxlen computation [PR110603] Jakub Jelinek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).