public inbox for gcc-patches@gcc.gnu.org
* [PATCH] fold more string comparison with known result (PR 90879)
@ 2019-08-09 16:42 Martin Sebor
  2019-08-09 16:51 ` Jakub Jelinek
  2019-08-12 22:22 ` Jeff Law
  0 siblings, 2 replies; 21+ messages in thread
From: Martin Sebor @ 2019-08-09 16:42 UTC
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1295 bytes --]

GCC 9 optimizes a subset of expressions of the form
(0 == strcmp(a, b)) based on the length and/or size of
the arguments, but it doesn't take advantage of all
the opportunities there.  For example, in the following,
although it folds the first test to false, it doesn't fold
the second one:

   char a[4];

   void f (void)
   {
     if (strlen (a) > 3)   // folded to false by GCC 8+
       abort ();

     if (strcmp (a, "1234") == 0)   // folded by patched GCC
       abort ();
   }
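
The reasoning behind the second fold in the example above can be sketched as follows.  This is only an illustration of the size argument, not the code in the patch: the names equality_possible and check_array_example are hypothetical, and the sketch assumes the array holds a nul-terminated string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC.  A nul-terminated string stored
   in an array of ARRSIZE bytes has at most ARRSIZE - 1 characters, so
   it can compare equal to a string of length LEN only if LEN < ARRSIZE.  */
static bool
equality_possible (size_t arrsize, size_t len)
{
  return len < arrsize;
}

/* Mirrors the example above: char a[4] compared against "1234".  */
static void
check_array_example (void)
{
  char a[4];

  /* "1234" has length 4 and needs 5 bytes, so the test
     strcmp (a, "1234") == 0 can never be true.  */
  assert (!equality_possible (sizeof a, strlen ("1234")));

  /* "123" fits, so nothing can be concluded without the contents.  */
  assert (equality_possible (sizeof a, strlen ("123")));
}
```

When the helper returns false the equality test has a known result regardless of the array's contents, which is the situation the new warning is meant to point out.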

The attached patch extends the strcmp optimization added in
GCC 9 to also handle the latter case (among others).  Testing
the enhancement with several other sizable code bases besides
GCC (Binutils/GDB, the Linux kernel, and LLVM) shows that code
like this is rare.  After thinking about it I decided it's more
likely a bug than a significant optimization opportunity, so
I introduced a new warning to point it out: -Wstring-compare
(enabled in -Wextra).

Besides this enhancement, the patch also improves the current
optimization to fold strcmp calls with conditional arguments
such as in:

   void f (char *s, int i)
   {
     strcpy (s, "12");
     if (strcmp (s, i ? "123" : "1234") == 0)   // folded
       abort ();
   }
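
One way to picture why the conditional case above can be folded is to compare ranges of possible string lengths.  The sketch below is an assumption about the general idea rather than the patch's implementation; struct len_range, ranges_overlap, and check_conditional_example are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type: inclusive range of lengths a string may have.  */
struct len_range
{
  size_t min;
  size_t max;
};

/* Two strings can compare equal only if they have the same length,
   which requires their length ranges to overlap.  */
static bool
ranges_overlap (struct len_range a, struct len_range b)
{
  return a.min <= b.max && b.min <= a.max;
}

/* Mirrors the example above.  */
static void
check_conditional_example (void)
{
  struct len_range s = { 2, 2 };     /* after strcpy (s, "12") */
  struct len_range lit = { 3, 4 };   /* i ? "123" : "1234" */

  /* No length is common to both ranges, so the equality is false.  */
  assert (!ranges_overlap (s, lit));
}
```

Overlapping ranges prove nothing by themselves; only disjoint ranges let the comparison be decided without looking at the characters.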

Martin

PS The diff looks like the changes are more extensive than they
actually are.

[-- Attachment #2: gcc-90879.diff --]
[-- Type: text/x-patch, Size: 65782 bytes --]

PR tree-optimization/90879 - fold zero-equality of strcmp between a longer string and a smaller array

gcc/c-family/ChangeLog:

	PR tree-optimization/90879
	* c.opt (-Wstring-compare): New option.

gcc/testsuite/ChangeLog:

	PR tree-optimization/90879
	* gcc.dg/Wstring-compare-2.c: New test.
	* gcc.dg/Wstring-compare.c: New test.
	* gcc.dg/strcmpopt_3.c: Scan the optimized dump instead of strlen.
	* gcc.dg/strcmpopt_6.c: New test.
	* gcc.dg/strlenopt-65.c: Remove unnecessary declarations, add
	test cases.
	* gcc.dg/strlenopt-66.c: Run it.
	* gcc.dg/strlenopt-67.c: New test.
	* gcc.dg/strlenopt-68.c: New test.

gcc/ChangeLog:

	PR tree-optimization/90879
	* builtins.c (check_access): Avoid using maxbound when null.
	* calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen change.
	* doc/invoke.texi (-Wstring-compare): Document new warning option.
	* gengtype-state.c (state_ident_st): Use a zero-length array instead.
	(state_token_st): Same.  Make last.
	(state_ident_by_name): Allocate enough space for terminating nul.
	* gimple-fold.c (get_range_strlen_tree): Make setting maxbound
	conditional.
	(get_range_strlen): Overwrite initial maxbound when non-null.
	* gimple-ssa-sprintf.c (get_string_length): Adjust to get_range_strlen
	change.
	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
	(used_only_for_zero_equality): New function.
	(handle_builtin_memcmp): Call it.
	(determine_min_objsize): Return an integer instead of tree.
	(get_len_or_size, strxcmp_eqz_result): New functions.
	(maybe_warn_pointless_strcmp): New function.
	(handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
	between a longer string and a smaller array.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 695a9d191af..eca710942dc 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3326,7 +3326,7 @@ check_access (tree exp, tree, tree, tree dstwrite,
 	  c_strlen_data lendata = { };
 	  get_range_strlen (srcstr, &lendata, /* eltsize = */ 1);
 	  range[0] = lendata.minlen;
-	  range[1] = lendata.maxbound;
+	  range[1] = lendata.maxbound ? lendata.maxbound : lendata.maxlen;
 	  if (range[0] && (!maxread || TREE_CODE (maxread) == INTEGER_CST))
 	    {
 	      if (maxread && tree_int_cst_le (maxread, range[0]))
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 257cadfa5f1..2fe6cc4ee08 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -784,6 +784,12 @@ Wsizeof-array-argument
 C ObjC C++ ObjC++ Var(warn_sizeof_array_argument) Warning Init(1)
 Warn when sizeof is applied on a parameter declared as an array.
 
+Wstring-compare
+C ObjC C++ LTO ObjC++ Warning Var(warn_string_compare) Warning LangEnabledBy(C ObjC C++ ObjC++, Wextra)
+Warn about calls to strcmp and strncmp used in equality expressions that
+are necessarily true or false due to the length of one and size of the other
+argument.
+
 Wstringop-overflow
 C ObjC C++ LTO ObjC++ Warning Alias(Wstringop-overflow=, 2, 0)
 Warn about buffer overflow in string manipulation functions like memcpy
diff --git a/gcc/calls.c b/gcc/calls.c
index 7507b698e27..dcebf67b5cc 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1593,6 +1593,10 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp)
 	    if (!get_attr_nonstring_decl (arg))
 	      {
 		c_strlen_data lendata = { };
+		/* Set MAXBOUND to an arbitrary non-null non-integer
+		   node as a request to have it set to the length of
+		   the longest string in a PHI.  */
+		lendata.maxbound = arg;
 		get_range_strlen (arg, &lendata, /* eltsize = */ 1);
 		maxlen = lendata.maxbound;
 	      }
@@ -1618,6 +1622,10 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp)
 	if (!get_attr_nonstring_decl (arg))
 	  {
 	    c_strlen_data lendata = { };
+	    /* Set MAXBOUND to an arbitrary non-null non-integer
+	       node as a request to have it set to the length of
+	       the longest string in a PHI.  */
+	    lendata.maxbound = arg;
 	    get_range_strlen (arg, &lendata, /* eltsize = */ 1);
 	    maxlen = lendata.maxbound;
 	  }
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 01aab60f895..f9efdc2e140 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -347,6 +347,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wsizeof-pointer-memaccess  -Wsizeof-array-argument @gol
 -Wstack-protector  -Wstack-usage=@var{byte-size}  -Wstrict-aliasing @gol
 -Wstrict-aliasing=n  -Wstrict-overflow  -Wstrict-overflow=@var{n} @gol
+-Wstring-compare @gol
 -Wstringop-overflow=@var{n}  -Wstringop-truncation  -Wsubobject-linkage @gol
 -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}malloc@r{]} @gol
 -Wsuggest-final-types @gol  -Wsuggest-final-methods  -Wsuggest-override @gol
@@ -5815,6 +5816,30 @@ comparisons, so this warning level gives a very large number of
 false positives.
 @end table
 
+@item -Wstring-compare
+@opindex Wstring-compare
+@opindex Wno-string-compare
+Warn for calls to @code{strcmp} and @code{strncmp} whose result can
+be determined to be either zero or non-zero in tests for such equality
+owing to the length of one argument being greater than the size of
+the array the other argument is stored in (or the bound in the case
+of @code{strncmp}).  Such calls could be mistakes.  For example, the call
+to @code{strcmp} below is diagnosed because its result is necessarily
+non-zero irrespective of the contents of the array @code{a}.
+
+@smallexample
+extern char a[4];
+void f (char *d)
+@{
+  strcpy (d, "string");
+  @dots{}
+  if (0 == strcmp (a, d))   // cannot be true
+    puts ("a and d are the same");
+@}
+@end smallexample
+
+@option{-Wstring-compare} is enabled by @option{-Wextra}.
+
 @item -Wstringop-overflow
 @itemx -Wstringop-overflow=@var{type}
 @opindex Wstringop-overflow
diff --git a/gcc/gengtype-state.c b/gcc/gengtype-state.c
index 03f40694ec6..80a8b57e9a2 100644
--- a/gcc/gengtype-state.c
+++ b/gcc/gengtype-state.c
@@ -79,6 +79,14 @@ enum state_token_en
   STOK_NAME                     /* hash-consed name or identifier.  */
 };
 
+/* Suppress warning: ISO C forbids zero-size array for stok_string
+   below.  The arrays are treated as flexible array members but, in
+   an otherwise empty struct or as a member of a union, cannot be
+   declared as such.  They must have zero size to keep GCC from
+   assuming their bound reflects their size.  */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wpedantic"
+
 
 /* Structure and hash-table used to share identifiers or names.  */
 struct state_ident_st
@@ -86,11 +94,10 @@ struct state_ident_st
   /* TODO: We could improve the parser by reserving identifiers for
      state keywords and adding a keyword number for them.  That would
      mean adding another field in this state_ident_st struct.  */
-  char stid_name[1];		/* actually bigger & null terminated */
+  char stid_name[0];		/* actually bigger & null terminated */
 };
 static htab_t state_ident_tab;
 
-
 /* The state_token_st structure is for lexical tokens in the read
    state file.  The stok_kind field discriminates the union.  Tokens
    are allocated by peek_state_token which calls read_a_state_token
@@ -110,14 +117,15 @@ struct state_token_st
   union		                        /* discriminated by stok_kind! */
   {
     int stok_num;			/* when STOK_INTEGER */
-    char stok_string[1];		/* when STOK_STRING, actual size is
-					   bigger and null terminated */
     struct state_ident_st *stok_ident;	/* when STOK_IDENT */
     void *stok_ptr;		        /* null otherwise */
+    char stok_string[0];		/* when STOK_STRING, actual size is
+					   bigger and null terminated */
   }
   stok_un;
 };
 
+#pragma GCC diagnostic pop
 
 
 
@@ -325,7 +333,7 @@ state_ident_by_name (const char *name, enum insert_option optins)
   namlen = strlen (name);
   stid =
     (struct state_ident_st *) xmalloc (sizeof (struct state_ident_st) +
-				       namlen);
+				       namlen + 1);
   memset (stid, 0, sizeof (struct state_ident_st) + namlen);
   strcpy (stid->stid_name, name);
   *slot = stid;
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index fc57fb45e3a..582768090ae 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1346,6 +1346,10 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	}
     }
 
+  /* Set if VAL represents the maximum length based on array size (set
+     when exact length cannot be determined).  */
+  bool maxbound = false;
+
   if (!val && rkind == SRK_LENRANGE)
     {
       if (TREE_CODE (arg) == ADDR_EXPR)
@@ -1441,6 +1445,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	      pdata->minlen = ssize_int (0);
 	    }
 	}
+      maxbound = true;
     }
 
   if (!val)
@@ -1454,7 +1459,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	  && tree_int_cst_lt (val, pdata->minlen)))
     pdata->minlen = val;
 
-  if (pdata->maxbound)
+  if (pdata->maxbound && TREE_CODE (pdata->maxbound) == INTEGER_CST)
     {
       /* Adjust the tighter (more optimistic) string length bound
 	 if necessary and proceed to adjust the more conservative
@@ -1472,7 +1477,10 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
       else
 	pdata->maxbound = val;
     }
-  else
+  else if (pdata->maxbound || maxbound)
+    /* Set PDATA->MAXBOUND only if it either isn't INTEGER_CST or
+       if VAL corresponds to the maximum length determined based
+       on the type of the object.  */
     pdata->maxbound = val;
 
   if (tight_bound)
@@ -1653,8 +1661,11 @@ get_range_strlen (tree arg, bitmap *visited,
 
 /* Try to obtain the range of the lengths of the string(s) referenced
    by ARG, or the size of the largest array ARG refers to if the range
-   of lengths cannot be determined, and store all in *PDATA.  ELTSIZE
-   is the expected size of the string element in bytes: 1 for char and
+   of lengths cannot be determined, and store all in *PDATA which must
+   be zero-initialized on input except PDATA->MAXBOUND may be set to
+   a non-null tree node other than INTEGER_CST to request to have it
+   set to the length of the longest string in a PHI.  ELTSIZE is
+   the expected size of the string element in bytes: 1 for char and
    some power of 2 for wide characters.
    Return true if the range [PDATA->MINLEN, PDATA->MAXLEN] is suitable
    for optimization.  Returning false means that a nonzero PDATA->MINLEN
@@ -1666,6 +1677,7 @@ bool
 get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)
 {
   bitmap visited = NULL;
+  tree maxbound = pdata->maxbound;
 
   if (!get_range_strlen (arg, &visited, SRK_LENRANGE, pdata, eltsize))
     {
@@ -1678,8 +1690,9 @@ get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)
   else if (!pdata->minlen)
     pdata->minlen = ssize_int (0);
 
-  /* Unless its null, leave the more conservative MAXBOUND unchanged.  */
-  if (!pdata->maxbound)
+  /* If it's unchanged from its initial non-null value, set the conservative
+     MAXBOUND to MAXLEN.  Otherwise leave it null (if it is null).  */
+  if (maxbound && pdata->maxbound == maxbound)
     pdata->maxbound = pdata->maxlen;
 
   if (visited)
diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 88ba1f2cac1..279338b1577 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -2041,6 +2041,9 @@ get_string_length (tree str, unsigned eltsize)
      aren't known to point any such arrays result in LENDATA.MAXLEN
      set to SIZE_MAX.  */
   c_strlen_data lendata = { };
+  /* Set MAXBOUND to an arbitrary non-null non-integer node as a request
+     to have it set to the length of the longest string in a PHI.  */
+  lendata.maxbound = str;
   get_range_strlen (str, &lendata, eltsize);
 
   /* Return the default result when nothing is known about the string. */
diff --git a/gcc/testsuite/gcc.dg/Wstring-compare-2.c b/gcc/testsuite/gcc.dg/Wstring-compare-2.c
new file mode 100644
index 00000000000..7116be6896e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstring-compare-2.c
@@ -0,0 +1,127 @@
+/* PR tree-optimization/90879 - fold zero-equality of strcmp between
+   a longer string and a smaller array
+   Test for a warning for strcmp of a longer string against smaller
+   array.
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wstring-compare -Wno-stringop-truncation -ftrack-macro-expansion=0" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* memcpy (void*, const void*, size_t);
+
+extern int strcmp (const char*, const char*);
+extern size_t strlen (const char*);
+extern char* strcpy (char*, const char*);
+extern char* strncpy (char*, const char*, size_t);
+extern int strncmp (const char*, const char*, size_t);
+
+void sink (int, ...);
+#define sink(...) sink (__LINE__, __VA_ARGS__)
+
+
+extern char a1[1], a2[2], a3[3], a4[4], a5[5], a6[6], a7[7], a8[8], a9[9];
+
+#define T(a, b) sink (0 == strcmp (a, b))
+
+
+void test_string_cst (void)
+{
+  const char *s1 = "1", *s2 = "12";
+
+  T (s1, a1);                 // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to non-zero" }
+  T (s1, a2);
+  T (s1, a3);
+
+  T (a1, s1);                 // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to non-zero" }
+  T (a2, s1);
+  T (a3, s1);
+
+  T (s2, a1);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 1 evaluates to non-zero" }
+  T (s2, a2);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to non-zero" }
+  T (s2, a3);
+
+  T (a1, s2);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 1 evaluates to non-zero" }
+  T (a2, s2);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to non-zero" }
+  T (a3, s2);
+}
+
+
+void test_string_cst_off_cst (void)
+{
+  const char *s1 = "1", *s2 = "12", *s3 = "123", *s4 = "1234";
+
+  T (s1, a2 + 1);              // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to non-zero" }
+  T (a2 + 1, s1);              // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to non-zero" }
+
+
+  T (s3 + 1, a2);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to non-zero" }
+  T (s3 + 1, a3);
+
+  T (s2, a4 + 1);
+  T (s2, a4 + 2);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to non-zero" }
+
+  T (s4, a4 + 1);             // { dg-warning ".strcmp. of a string of length 4 and an array of size 3 evaluates to non-zero" }
+  T (s3, a5 + 1);
+}
+
+
+/* Use strncpy below rather than memcpy until PR 91183 is resolved.  */
+
+#undef T
+#define T(s, n, a)					\
+  do {							\
+    char arr[32];					\
+    sink (arr);						\
+    strncpy (arr, s, n < 0 ? strlen (s) + 1: n);	\
+    sink (0 == strcmp (arr, a));			\
+  } while (0)
+
+void test_string_exact_length (void)
+{
+  const char *s1 = "1", *s2 = "12";
+
+  T (s1, -1, a1);             // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to non-zero" }
+  T (s1, -1, a2);
+  T (s1, -1, a3);
+
+  T (s2, -1, a1);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 1 evaluates to non-zero" }
+  T (s2, -1, a2);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to non-zero" }
+  T (s2, -1, a3);
+}
+
+
+void test_string_min_length (void)
+{
+  const char *s1 = "1", *s2 = "12";
+
+  T (s1,  1, a1);             // { dg-warning ".strcmp. of a string of length 1 or more and an array of size 1 evaluates to non-zero" }
+  T (s1,  1, a2);
+  T (s1,  1, a3);
+
+  T (s2,  2, a1);             // { dg-warning ".strcmp. of a string of length 2 or more and an array of size 1 evaluates to non-zero" }
+  T (s2,  2, a2);             // { dg-warning ".strcmp. of a string of length 2 or more and an array of size 2 evaluates to non-zero" }
+  T (s2,  2, a3);
+}
+
+
+int test_strncmp_str_lit_var (const char *s, long n)
+{
+  if (strncmp (s, "123456", n) == 0)    // { dg-bogus "\\\[-Wstring-compare" }
+    return 1;
+
+  return 0;
+}
+
+int test_strlen_strncmp_str_lit_var (const char *s, long n)
+{
+  if (__builtin_strlen (s) < n)
+    return -1;
+
+  if (n == 6)
+    if (strncmp (s, "123456", n) == 0)  // { dg-bogus "\\\[-Wstring-compare" }
+      return 1;
+
+  return 0;
+}
+
+
diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c b/gcc/testsuite/gcc.dg/Wstring-compare.c
new file mode 100644
index 00000000000..702abeb96be
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstring-compare.c
@@ -0,0 +1,181 @@
+/* PR tree-optimization/90879 - fold zero-equality of strcmp between
+   a longer string and a smaller array
+   { dg-do compile }
+   { dg-options "-O2 -Wall -ftrack-macro-expansion=0" } */
+
+#include "strlenopt.h"
+
+#define T(a, b) sink (0 == strcmp (a, b), a, b)
+
+void sink (int, ...);
+
+struct S { char a4[4], c; };
+
+extern char a4[4];
+extern char a5[5];
+extern char b4[4];
+
+/* Verify that comparison of string literals with arrays with unknown
+   content but size that prevents them from comparing equal is diagnosed.  */
+
+void strcmp_array_lit (void)
+{
+  if (strcmp (a4, "1234"))  // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero" }
+                            // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+    sink (0, a4);
+
+  int cmp;
+  cmp = strcmp (a4, "1234");  // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero" }
+  if (cmp)                  // { dg-message "in this expression" }
+    sink (0, a4);
+
+  T (a4, "4321");           // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero " }
+  T (a4, "12345");          // { dg-warning "length 5 and an array of size 4 " }
+  T (a4, "123456");         // { dg-warning "length 6 and an array of size 4 " }
+  T ("1234", a4);           // { dg-warning "length 4 and an array of size 4 " }
+  T ("12345", a4);          // { dg-warning "length 5 and an array of size 4 " }
+  T ("123456", a4);         // { dg-warning "length 6 and an array of size 4 " }
+}
+
+
+void strcmp_array_pstr (void)
+{
+  const char *s4 = "1234";
+
+  {
+    if (strcmp (a4, s4))    // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero" }
+                            // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  {
+    int c;
+    c = strcmp (a4, s4);    // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero" }
+    if (c)                  // { dg-message "in this expression" }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  const char *t4 = "4321";
+  const char *s5 = "12345";
+  const char *s6 = "123456";
+
+  T (a4, t4);               // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero " }
+  T (a4, s5);               // { dg-warning "length 5 and an array of size 4 " }
+  T (a4, s6);               // { dg-warning "length 6 and an array of size 4 " }
+  T (s4, a4);               // { dg-warning "length 4 and an array of size 4 " }
+  T (s5, a4);               // { dg-warning "length 5 and an array of size 4 " }
+  T (s6, a4);               // { dg-warning "length 6 and an array of size 4 " }
+}
+
+
+void strcmp_array_cond_pstr (int i)
+{
+  const char *s4 = i ? "1234" : "4321";
+  T (a4, s4);               // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero " }
+  T (a5, s4);
+}
+
+void strcmp_array_copy (void)
+{
+  char s[8];
+
+  {
+    strcpy (s, "1234");
+    if (strcmp (a4, s))     // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero" }
+                            // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  {
+    strcpy (s, "1234");
+
+    int c;
+    c = strcmp (a4, s);     // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero" }
+    if (c)                  // { dg-message "in this expression" }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  strcpy (s, "4321");
+  T (a4, s);                // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to non-zero " }
+  strcpy (s, "12345");
+  T (a4, s);                // { dg-warning "length 5 and an array of size 4 " }
+  strcpy (s, "123456");
+  T (a4, s);                // { dg-warning "length 6 and an array of size 4 " }
+  strcpy (s, "4321");
+  T (s, a4);                // { dg-warning "length 4 and an array of size 4 " }
+  strcpy (s, "54321");
+  T (s, a4);                // { dg-warning "length 5 and an array of size 4 " }
+  strcpy (s, "654321");
+  T (s, a4);                // { dg-warning "length 6 and an array of size 4 " }
+}
+
+
+void strcmp_member_array_lit (const struct S *p)
+{
+  T (p->a4, "1234");        // { dg-warning "length 4 and an array of size 4 " }
+}
+
+
+#undef T
+#define T(a, b, n) sink (0 == strncmp (a, b, n), a, b)
+
+void strncmp_array_lit (void)
+{
+  if (strncmp (a4, "12345", 5))   // { dg-warning "'strncmp' of a string of length 5, an array of size 4 and bound of 5 evaluates to non-zero" }
+                                  // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+    sink (0, a4);
+
+  int cmp;
+  cmp = strncmp (a4, "54321", 5);   // { dg-warning "'strncmp' of a string of length 5, an array of size 4 and bound of 5 evaluates to non-zero" }
+  if (cmp)                          // { dg-message "in this expression" }
+    sink (0, a4);
+
+  // Verify no warning when the bound is the same as the array size.
+  T (a4, "4321", 4);
+  T (a4, "654321", 4);
+
+  T (a4, "12345", 5);       // { dg-warning "length 5, an array of size 4 and bound of 5 " }
+  T (a4, "123456", 6);      // { dg-warning "length 6, an array of size 4 and bound of 6" }
+
+  T ("1234", a4, 4);
+  T ("12345", a4, 4);
+
+  T ("12345", a4, 5);       // { dg-warning "length 5, an array of size 4 and bound of 5 " }
+  T ("123456", a4, 6);      // { dg-warning "length 6, an array of size 4 and bound of 6 " }
+}
+
+
+void strncmp_strarray_copy (void)
+{
+  {
+    char a[] = "1234";
+    char b[6];
+    strcpy (b, "12345");
+    if (strncmp (a, b, 5))  // { dg-warning "'strncmp' of strings of length 4 and 5 and bound of 5 evaluates to non-zero" }
+                            // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+      sink (0, a, b);
+  }
+
+  {
+    char a[] = "4321";
+    char b[6];
+    strcpy (b, "54321");
+    int cmp;
+    cmp = strncmp (a, b, 5);  // { dg-warning "'strncmp' of strings of length 4 and 5 and bound of 5 evaluates to non-zero" }
+    if (cmp)                  // { dg-message "in this expression" }
+      sink (0, a, b);
+  }
+
+  strcpy (a4, "abc");
+  T (a4, "54321", 5);       // { dg-warning "'strncmp' of strings of length 3 and 5 and bound of 5 evaluates to non-zero " }
+}
+
+
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
index 86a0d7a08b3..35941bee575 100644
--- a/gcc/testsuite/gcc.dg/strcmpopt_3.c
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -1,31 +1,31 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-strlen" } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
 
-__attribute__ ((noinline)) int 
-f1 (void) 
-{ 
+__attribute__ ((noinline)) int
+f1 (void)
+{
   char *s0= "abcd";
   char s[8];
   __builtin_strcpy (s, s0);
-  return __builtin_strcmp(s, "abc") != 0; 
+  return __builtin_strcmp (s, "abc") != 0;
 }
 
 __attribute__ ((noinline)) int
-f2 (void) 
-{ 
+f2 (void)
+{
   char *s0 = "ab";
   char s[8];
   __builtin_strcpy (s, s0);
-  return __builtin_strcmp("abc", s) != 0; 
+  return __builtin_strcmp ("abc", s) != 0;
 }
 
 int main (void)
 {
-  if (f1 () != 1 
+  if (f1 () != 1
       || f2 () != 1)
     __builtin_abort ();
 
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_6.c b/gcc/testsuite/gcc.dg/strcmpopt_6.c
new file mode 100644
index 00000000000..cb99294e5fa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_6.c
@@ -0,0 +1,207 @@
+/* Verify that strcmp and strncmp calls with mixed constant and
+   non-constant strings are evaluated correctly.
+   { dg-do run }
+   { dg-options "-O2" } */
+
+#include "strlenopt.h"
+
+#define A(expr)                                                 \
+  ((expr)                                                       \
+   ? (void)0                                                    \
+   : (__builtin_printf ("assertion failed on line %i: %s\n",    \
+                        __LINE__, #expr),                       \
+      __builtin_abort ()))
+
+__attribute__ ((noclone, noinline)) int
+test_strlen_gt2_strcmp_abcd (const char *s)
+{
+  if (strlen (s) < 3)
+    return -1;
+
+  return strcmp (s, "abcd") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strlen_lt6_strcmp_abcd (const char *s)
+{
+  if (strlen (s) > 5)
+    return -1;
+
+  return strcmp (s, "abcd") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_strcmp_abc (const char *s)
+{
+  char a[4];
+  strcpy (a, s);
+  return strcmp (a, "abc") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_abc_strcmp (const char *s)
+{
+  char a[4], b[6];
+  strcpy (a, "abc");
+  strcpy (b, s);
+  return strcmp (a, b) == 0;
+}
+
+/* Exercise strcmp of two strings between 1 and 3 characters long
+   stored in arrays of the same known size.  */
+char ga4[4], gb4[4];
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strcmp_same_size_arrays (void)
+{
+  ga4[0] = gb4[0] = 'x';
+  ga4[3] = gb4[3] = '\0';
+  return strcmp (ga4, gb4) == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strncmp_bound_2_same_size_arrays (void)
+{
+  ga4[0] = gb4[0] = 'x';
+  ga4[3] = gb4[3] = '\0';
+  return strncmp (ga4, gb4, 2) == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strncmp_bound_equal_same_size_arrays (void)
+{
+  ga4[0] = gb4[0] = 'x';
+  ga4[3] = gb4[3] = '\0';
+  return strncmp (ga4, gb4, 4) == 0;
+}
+
+/* Exercise strcmp of two strings between 0 and 3 characters long
+   stored in arrays of the same known size.  */
+
+__attribute__ ((noclone, noinline)) int
+test_nulterm_strcmp_same_size_arrays (void)
+{
+  ga4[3] = gb4[3] = '\0';
+  return strcmp (ga4, gb4) == 0;
+}
+
+/* Exercise strcmp of two strings between 1 and 3 and 1 and 4 characters
+   long, respectively, stored in arrays of known but different sizes.  */
+char gc5[5];
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strcmp_arrays (void)
+{
+  ga4[0] = gc5[0] = 'x';
+  ga4[3] = gc5[4] = '\0';
+  return strcmp (ga4, gc5) == 0;
+}
+
+/* Exercise strcmp of two strings between 0 and 3 and 1 and 4 characters
+   long, respectively, stored in arrays of known but different sizes.  */
+
+__attribute__ ((noclone, noinline)) int
+test_nulterm_strcmp_arrays (void)
+{
+  ga4[3] = gc5[4] = '\0';
+  return strcmp (ga4, gc5) == 0;
+}
+
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_strncmp_abcd (const char *s)
+{
+  char a[6];
+  strcpy (a, s);
+  return strcmp (a, "abcd") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_abcd_strncmp_3 (const char *s)
+{
+  char a[6], b[8];
+  strcpy (a, "abcd");
+  strcpy (b, s);
+  return strncmp (a, b, 3) == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_abcd_strncmp_4 (const char *s)
+{
+  char a[6], b[8];
+  strcpy (a, "abcd");
+  strcpy (b, s);
+  return strncmp (a, b, 4) == 0;
+}
+
+
+int main (void)
+{
+  test_strlen_gt2_strcmp_abcd ("abcd");
+  test_strlen_lt6_strcmp_abcd ("abcd");
+
+  A (0 == test_strcpy_strcmp_abc ("ab"));
+  A (0 != test_strcpy_strcmp_abc ("abc"));
+  A (0 == test_strcpy_strcmp_abc ("abcd"));
+
+  A (0 == test_strcpy_abc_strcmp ("ab"));
+  A (0 != test_strcpy_abc_strcmp ("abc"));
+  A (0 == test_strcpy_abc_strcmp ("abcd"));
+
+  strcpy (ga4, "abc"); strcpy (gb4, "abd");
+  A (0 == test_store_0_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abd"); strcpy (gb4, "abc");
+  A (0 == test_store_0_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_store_0_nulterm_strcmp_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gb4, "acd");
+  A (0 == test_store_0_nulterm_strncmp_bound_2_same_size_arrays ());
+  strcpy (ga4, "acd"); strcpy (gb4, "abc");
+  A (0 == test_store_0_nulterm_strncmp_bound_2_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_store_0_nulterm_strncmp_bound_2_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gb4, "abd");
+  A (0 == test_store_0_nulterm_strncmp_bound_equal_same_size_arrays ());
+  strcpy (ga4, "abd"); strcpy (gb4, "abc");
+  A (0 == test_store_0_nulterm_strncmp_bound_equal_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_store_0_nulterm_strncmp_bound_equal_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gb4, "abd");
+  A (0 == test_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abd"); strcpy (gb4, "abc");
+  A (0 == test_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_nulterm_strcmp_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gc5, "abcd");
+  A (0 == test_store_0_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abd"); strcpy (gc5, "abcd");
+  A (0 == test_store_0_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abc"); strcpy (gc5, "abc");
+  A (0 != test_store_0_nulterm_strcmp_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gc5, "abcd");
+  A (0 == test_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abd"); strcpy (gc5, "abc");
+  A (0 == test_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abc"); strcpy (gc5, "abc");
+  A (0 != test_nulterm_strcmp_arrays ());
+
+  A (0 == test_strcpy_strncmp_abcd ("ab"));
+  A (0 == test_strcpy_strncmp_abcd ("abc"));
+  A (0 != test_strcpy_strncmp_abcd ("abcd"));
+  A (0 == test_strcpy_strncmp_abcd ("abcde"));
+
+  A (0 == test_strcpy_abcd_strncmp_3 ("ab"));
+  A (0 != test_strcpy_abcd_strncmp_3 ("abc"));
+  A (0 != test_strcpy_abcd_strncmp_3 ("abcd"));
+  A (0 != test_strcpy_abcd_strncmp_3 ("abcde"));
+
+  A (0 == test_strcpy_abcd_strncmp_4 ("ab"));
+  A (0 == test_strcpy_abcd_strncmp_4 ("abc"));
+  A (0 != test_strcpy_abcd_strncmp_4 ("abcd"));
+  A (0 != test_strcpy_abcd_strncmp_4 ("abcde"));
+}
diff --git a/gcc/testsuite/gcc.dg/strlenopt-65.c b/gcc/testsuite/gcc.dg/strlenopt-65.c
index a34d178faa1..521d7ac2b42 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-65.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-65.c
@@ -1,17 +1,10 @@
 /* PRE tree-optimization/90626 - fold strcmp(a, b) == 0 to zero when
    one string length is exact and the other is unequal
    { dg-do compile }
-   { dg-options "-O2 -Wall -fdump-tree-optimized" } */
+   { dg-options "-O2 -Wall -Wno-string-compare -fdump-tree-optimized -ftrack-macro-expansion=0" } */
 
 #include "strlenopt.h"
 
-typedef __SIZE_TYPE__ size_t;
-
-extern void abort (void);
-extern void* memcpy (void *, const void *, size_t);
-extern int strcmp (const char *, const char *);
-extern int strncmp (const char *, const char *, size_t);
-
 #define CAT(x, y) x ## y
 #define CONCAT(x, y) CAT (x, y)
 #define FAILNAME(name) CONCAT (call_ ## name ##_on_line_, __LINE__)
@@ -142,21 +135,45 @@ void test_strcmp_keep (const char *s, const char *t)
 #undef CMPFUNC
 #define CMPFUNC(a, b, dummy) strcmp (a, b)
 
-  KEEP ("1", "1", a, b, -1);
+  KEEP ("123", "123\0", a, b, /* bnd = */ -1);
+  KEEP ("123\0", "123", a, b, -1);
+
+  {
+    char a[8], b[8];
+    sink (a, b);
+    strcpy (a, s);
+    strcpy (b, t);
+    TEST_KEEP (0 == strcmp (a, b));
+  }
+}
+
+
+void test_strncmp_keep (const char *s, const char *t)
+{
+#undef CMPFUNC
+#define CMPFUNC(a, b, n) strncmp (a, b, n)
+
+  KEEP ("1", "1", a, b, 2);
 
-  KEEP ("1\0", "1", a, b, -1);
-  KEEP ("1",   "1\0", a, b, -1);
+  KEEP ("1\0", "1", a, b, 2);
+  KEEP ("1",   "1\0", a, b, 2);
 
-  KEEP ("12\0", "12", a, b, -1);
-  KEEP ("12",   "12\0", a, b, -1);
+  KEEP ("12\0", "12", a, b, 2);
+  KEEP ("12",   "12\0", a, b, 2);
 
-  KEEP ("111\0", "111", a, b, -1);
-  KEEP ("112", "112\0", a, b, -1);
+  KEEP ("111\0", "111", a, b, 3);
+  KEEP ("112", "112\0", a, b, 3);
 
-  KEEP (s, t, a, b, -1);
+  {
+    char a[8], b[8];
+    sink (a, b);
+    strcpy (a, s);
+    strcpy (b, t);
+    TEST_KEEP (0 == strncmp (a, b, sizeof a));
+  }
 }
 
 /* { dg-final { scan-tree-dump-times "call_in_true_branch_not_eliminated_" 0 "optimized" } }
 
-   { dg-final { scan-tree-dump-times "call_made_in_true_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 8 "optimized" } }
-   { dg-final { scan-tree-dump-times "call_made_in_false_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 8 "optimized" } } */
+   { dg-final { scan-tree-dump-times "call_made_in_true_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 11 "optimized" } }
+   { dg-final { scan-tree-dump-times "call_made_in_false_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 11 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-66.c b/gcc/testsuite/gcc.dg/strlenopt-66.c
index 5dc10a07d3d..4ba31a845b0 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-66.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-66.c
@@ -1,6 +1,6 @@
 /* PRE tree-optimization/90626 - fold strcmp(a, b) == 0 to zero when
    one string length is exact and the other is unequal
-   { dg-do compile }
+   { dg-do run }
    { dg-options "-O2 -Wall -fdump-tree-optimized" } */
 
 #include "strlenopt.h"
@@ -65,8 +65,44 @@ test_strncmp (void)
   A (0 <  strncmp (b, a, 5));
 }
 
+
+__attribute__ ((noclone, noinline, noipa)) void
+test_strncmp_a4_cond_s5_s2_2 (const char *s, int i)
+{
+  char a4[4];
+  strcpy (a4, s);
+  A (0 == strncmp (a4, i ? "12345" : "12", 2));
+}
+
+
+__attribute__ ((noclone, noinline, noipa)) void
+test_strncmp_a4_cond_a5_s2_5 (const char *s, const char *t, int i)
+{
+  char a4[4], a5[5];
+  strcpy (a4, s);
+  strcpy (a5, t);
+  A (0 == strncmp (a4, i ? a5 : "12", 5));
+}
+
+__attribute__ ((noclone, noinline, noipa)) void
+test_strncmp_a4_cond_a5_a3_n (const char *s1, const char *s2, const char *s3,
+			      int i, unsigned n)
+{
+  char a3[3], a4[4], a5[5];
+  strcpy (a3, s1);
+  strcpy (a4, s2);
+  strcpy (a5, s3);
+  A (0 == strncmp (a4, i ? a5 : a3, n));
+}
+
+
 int main (void)
 {
   test_strcmp ();
   test_strncmp ();
+  test_strncmp_a4_cond_s5_s2_2 ("12", 0);
+  test_strncmp_a4_cond_a5_s2_5 ("12", "1234", 0);
+
+  test_strncmp_a4_cond_a5_a3_n ("12", "123", "1234", 0, 2);
+  test_strncmp_a4_cond_a5_a3_n ("123", "12", "12", 1, 3);
 }
diff --git a/gcc/testsuite/gcc.dg/strlenopt-68.c b/gcc/testsuite/gcc.dg/strlenopt-68.c
new file mode 100644
index 00000000000..46ceb9ddb05
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-68.c
@@ -0,0 +1,126 @@
+/* PR tree-optimization/90879 - fold zero-equality of strcmp between
+   a longer string and a smaller array
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wno-string-compare -fdump-tree-optimized -ftrack-macro-expansion=0" } */
+
+#include "strlenopt.h"
+
+#define A(expr)                                                 \
+  ((expr)                                                       \
+   ? (void)0                                                    \
+   : (__builtin_printf ("assertion failed on line %i: %s\n",    \
+                        __LINE__, #expr),                       \
+      __builtin_abort ()))
+
+void clobber (void*, ...);
+
+struct S { char a4[4], c; };
+
+extern char a4[4];
+extern char b4[4];
+
+/* Verify that comparison of string literals with arrays with unknown
+   content but size that prevents them from comparing equal is folded
+   to a constant.  */
+
+void test_array_lit (void)
+{
+  A (strcmp (a4, "1234")); clobber (a4);
+  A (strcmp (a4, "12345")); clobber (a4);
+  A (strcmp (a4, "123456")); clobber (a4);
+  A (strcmp ("1234", a4)); clobber (a4);
+  A (strcmp ("12345", a4)); clobber (a4);
+  A (strcmp ("123456", a4)); clobber (a4);
+}
+
+void test_memarray_lit (struct S *p)
+{
+  A (strcmp (p->a4, "1234"));
+  A (strcmp (p->a4, "12345"));
+  A (strcmp (p->a4, "123456"));
+
+  A (strcmp ("1234", p->a4));
+  A (strcmp ("12345", p->a4));
+  A (strcmp ("123456", p->a4));
+}
+
+/* Verify that the equality of empty strings is folded.  */
+
+void test_empty_string (void)
+{
+  A (0 == strcmp ("", ""));
+
+  *a4 = '\0';
+  A (0 == strcmp (a4, ""));
+  A (0 == strcmp ("", a4));
+  A (0 == strcmp (a4, a4));
+
+  char s[8] = "";
+  A (0 == strcmp (a4, s));
+
+  a4[1] = '\0';
+  b4[1] = '\0';
+  A (0 == strcmp (a4 + 1, b4 + 1));
+
+  a4[2] = '\0';
+  b4[2] = '\0';
+  A (0 == strcmp (&a4[2], &b4[2]));
+
+  clobber (a4, b4);
+
+  memset (a4, 0, sizeof a4);
+  memset (b4, 0, sizeof b4);
+  A (0 == strcmp (a4, b4));
+}
+
+/* Verify that comparison of dynamically created strings with unknown
+   arrays is folded.  */
+
+void test_array_copy (void)
+{
+  char s[8];
+  strcpy (s, "1234");
+  A (strcmp (a4, s));
+
+  strcpy (s, "12345");
+  A (strlen (s) == 5);
+  A (strcmp (a4, s)); clobber (a4);
+
+  strcpy (s, "123456");
+  A (strcmp (a4, s)); clobber (a4);
+
+  strcpy (s, "1234");
+  A (strcmp (s, a4)); clobber (a4);
+
+  strcpy (s, "12345");
+  A (strcmp (s, a4)); clobber (a4);
+
+  strcpy (s, "123456");
+  A (strcmp (s, a4)); clobber (a4);
+}
+
+
+void test_array_bounded (void)
+{
+  A (strncmp (a4, "12345", 5)); clobber (a4);
+  A (strncmp ("54321", a4, 5)); clobber (a4);
+
+  A (strncmp (a4, "123456", 5)); clobber (a4);
+  A (strncmp ("654321", a4, 5)); clobber (a4);
+}
+
+void test_array_copy_bounded (void)
+{
+  char s[8];
+  strcpy (s, "12345");
+  A (strncmp (a4, s, 5)); clobber (a4);
+  strcpy (s, "54321");
+  A (strncmp (s, a4, 5)); clobber (a4);
+
+  strcpy (s, "123456");
+  A (strncmp (a4, s, 5)); clobber (a4);
+  strcpy (s, "654321");
+  A (strncmp (s, a4, 5)); clobber (a4);
+}
+
+/* { dg-final { scan-tree-dump-not "abort|strcmp|strncmp" "optimized" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 4af47855e7c..29d1ae80abf 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -2091,6 +2091,9 @@ maybe_diag_stxncpy_trunc (gimple_stmt_iterator gsi, tree src, tree cnt)
   else
     {
       c_strlen_data lendata = { };
+      /* Set MAXBOUND to an arbitrary non-null non-integer node as a request
+	 to have it set to the length of the longest string in a PHI.  */
+      lendata.maxbound = src;
       get_range_strlen (src, &lendata, /* eltsize = */1);
       if (TREE_CODE (lendata.minlen) == INTEGER_CST
 	  && TREE_CODE (lendata.maxbound) == INTEGER_CST)
@@ -2862,51 +2865,78 @@ handle_builtin_memset (gimple_stmt_iterator *gsi)
   return true;
 }
 
-/* Handle a call to memcmp.  We try to handle small comparisons by
-   converting them to load and compare, and replacing the call to memcmp
-   with a __builtin_memcmp_eq call where possible.
-   return true when call is transformed, return false otherwise.  */
+/* Return a pointer to the first such equality expression if RES is used
+   only in expressions testing its equality to zero, and null otherwise.  */
 
-static bool
-handle_builtin_memcmp (gimple_stmt_iterator *gsi)
+static gimple*
+used_only_for_zero_equality (tree res)
 {
-  gcall *stmt2 = as_a <gcall *> (gsi_stmt (*gsi));
-  tree res = gimple_call_lhs (stmt2);
-  tree arg1 = gimple_call_arg (stmt2, 0);
-  tree arg2 = gimple_call_arg (stmt2, 1);
-  tree len = gimple_call_arg (stmt2, 2);
-  unsigned HOST_WIDE_INT leni;
+  gimple *first_use = NULL;
+
   use_operand_p use_p;
   imm_use_iterator iter;
 
-  if (!res)
-    return false;
-
   FOR_EACH_IMM_USE_FAST (use_p, iter, res)
     {
-      gimple *ustmt = USE_STMT (use_p);
+      gimple *use_stmt = USE_STMT (use_p);
 
-      if (is_gimple_debug (ustmt))
-	continue;
-      if (gimple_code (ustmt) == GIMPLE_ASSIGN)
+      if (is_gimple_debug (use_stmt))
+        continue;
+      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
 	{
-	  gassign *asgn = as_a <gassign *> (ustmt);
-	  tree_code code = gimple_assign_rhs_code (asgn);
-	  if ((code != EQ_EXPR && code != NE_EXPR)
-	      || !integer_zerop (gimple_assign_rhs2 (asgn)))
-	    return false;
+	  tree_code code = gimple_assign_rhs_code (use_stmt);
+	  if (code == COND_EXPR)
+	    {
+	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
+	      if ((TREE_CODE (cond_expr) != EQ_EXPR
+		   && (TREE_CODE (cond_expr) != NE_EXPR))
+		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
+		return NULL;
+	    }
+	  else if (code == EQ_EXPR || code == NE_EXPR)
+	    {
+	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
+		return NULL;
+            }
+	  else
+	    return NULL;
 	}
-      else if (gimple_code (ustmt) == GIMPLE_COND)
+      else if (gimple_code (use_stmt) == GIMPLE_COND)
 	{
-	  tree_code code = gimple_cond_code (ustmt);
+	  tree_code code = gimple_cond_code (use_stmt);
 	  if ((code != EQ_EXPR && code != NE_EXPR)
-	      || !integer_zerop (gimple_cond_rhs (ustmt)))
-	    return false;
+	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
+	    return NULL;
 	}
       else
-	return false;
+        return NULL;
+
+      if (!first_use)
+	first_use = use_stmt;
     }
 
+  return first_use;
+}
+
+/* Handle a call to memcmp.  We try to handle small comparisons by
+   converting them to load and compare, and replacing the call to memcmp
+   with a __builtin_memcmp_eq call where possible.
+   return true when call is transformed, return false otherwise.  */
+
+static bool
+handle_builtin_memcmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree res = gimple_call_lhs (stmt);
+
+  if (!res || !used_only_for_zero_equality (res))
+    return false;
+
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  tree len = gimple_call_arg (stmt, 2);
+  unsigned HOST_WIDE_INT leni;
+
   if (tree_fits_uhwi_p (len)
       && (leni = tree_to_uhwi (len)) <= GET_MODE_SIZE (word_mode)
       && pow2p_hwi (leni))
@@ -2919,7 +2949,7 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
       if (int_mode_for_size (leni, 1).exists (&mode)
 	  && (align >= leni || !targetm.slow_unaligned_access (mode, align)))
 	{
-	  location_t loc = gimple_location (stmt2);
+	  location_t loc = gimple_location (stmt);
 	  tree type, off;
 	  type = build_nonstandard_integer_type (leni, 1);
 	  gcc_assert (known_eq (GET_MODE_BITSIZE (TYPE_MODE (type)), leni));
@@ -2943,78 +2973,10 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
 	}
     }
 
-  gimple_call_set_fndecl (stmt2, builtin_decl_explicit (BUILT_IN_MEMCMP_EQ));
+  gimple_call_set_fndecl (stmt, builtin_decl_explicit (BUILT_IN_MEMCMP_EQ));
   return true;
 }
 
-/* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
-   the result of 0 == strncmp (A, B, N) (which is the same as strcmp for
-   sufficiently large N).  Otherwise return false.  */
-
-static bool
-strxcmp_unequal (int idx1, int idx2, unsigned HOST_WIDE_INT n)
-{
-  unsigned HOST_WIDE_INT len1;
-  unsigned HOST_WIDE_INT len2;
-
-  bool nulterm1;
-  bool nulterm2;
-
-  if (idx1 < 0)
-    {
-      len1 = ~idx1;
-      nulterm1 = true;
-    }
-  else if (strinfo *si = get_strinfo (idx1))
-    {
-      if (tree_fits_uhwi_p (si->nonzero_chars))
-	{
-	  len1 = tree_to_uhwi (si->nonzero_chars);
-	  nulterm1 = si->full_string_p;
-	}
-      else
-	return false;
-    }
-  else
-    return false;
-
-  if (idx2 < 0)
-    {
-      len2 = ~idx2;
-      nulterm2 = true;
-    }
-  else if (strinfo *si = get_strinfo (idx2))
-    {
-      if (tree_fits_uhwi_p (si->nonzero_chars))
-	{
-	  len2 = tree_to_uhwi (si->nonzero_chars);
-	  nulterm2 = si->full_string_p;
-	}
-      else
-	return false;
-    }
-  else
-    return false;
-
-  /* N is set to UHWI_MAX for strcmp and less to strncmp.  Adjust
-     the length of each string to consider to be no more than N.  */
-  if (len1 > n)
-    len1 = n;
-  if (len2 > n)
-    len2 = n;
-
-  if ((len1 < len2 && nulterm1)
-      || (len2 < len1 && nulterm2))
-    /* The string lengths are definitely unequal and the result can
-       be folded to one (since it's used for comparison with zero).  */
-    return true;
-
-  /* The string lengths may be equal or unequal.  Even when equal and
-     both strings nul-terminated, without the string contents there's
-     no way to determine whether they are equal.  */
-  return false;
-}
-
 /* Given an index to the strinfo vector, compute the string length
    for the corresponding string. Return -1 when unknown.  */
 
@@ -3043,15 +3005,16 @@ compute_string_length (int idx)
 
 /* Determine the minimum size of the object referenced by DEST expression
    which must have a pointer type.
-   Return the minimum size of the object if successful or NULL when the size
-   cannot be determined.  */
-static tree
+   Return the minimum size of the object if successful or HWI_M1U when
+   the size cannot be determined.  */
+
+static unsigned HOST_WIDE_INT
 determine_min_objsize (tree dest)
 {
   unsigned HOST_WIDE_INT size = 0;
 
   if (compute_builtin_object_size (dest, 2, &size))
-    return build_int_cst (sizetype, size);
+    return size;
 
   /* Try to determine the size of the object through the RHS
      of the assign statement.  */
@@ -3059,11 +3022,11 @@ determine_min_objsize (tree dest)
     {
       gimple *stmt = SSA_NAME_DEF_STMT (dest);
       if (!is_gimple_assign (stmt))
-	return NULL_TREE;
+	return HOST_WIDE_INT_M1U;
 
       if (!gimple_assign_single_p (stmt)
 	  && !gimple_assign_unary_nop_p (stmt))
-	return NULL_TREE;
+	return HOST_WIDE_INT_M1U;
 
       dest = gimple_assign_rhs1 (stmt);
       return determine_min_objsize (dest);
@@ -3071,7 +3034,7 @@ determine_min_objsize (tree dest)
 
   /* Try to determine the size of the object from its type.  */
   if (TREE_CODE (dest) != ADDR_EXPR)
-    return NULL_TREE;
+    return HOST_WIDE_INT_M1U;
 
   tree type = TREE_TYPE (dest);
   if (TREE_CODE (type) == POINTER_TYPE)
@@ -3079,196 +3042,413 @@ determine_min_objsize (tree dest)
 
   type = TYPE_MAIN_VARIANT (type);
 
-  /* We cannot determine the size of the array if it's a flexible array,
-     which is declared at the end of a structure.  */
-  if (TREE_CODE (type) == ARRAY_TYPE
-      && !array_at_struct_end_p (dest))
+  /* The size of a flexible array cannot be determined.  Otherwise,
+     for arrays with more than one element, return the size of its
+     type.  GCC itself misuses arrays of both zero and one elements
+     as flexible array members so they are excluded as well.  */
+  if (TREE_CODE (type) != ARRAY_TYPE
+      || !array_at_struct_end_p (dest))
     {
-      tree size_t = TYPE_SIZE_UNIT (type);
-      if (size_t && TREE_CODE (size_t) == INTEGER_CST
-	  && !integer_zerop (size_t))
-        return size_t;
+      tree type_size = TYPE_SIZE_UNIT (type);
+      if (type_size && TREE_CODE (type_size) == INTEGER_CST
+	  && !integer_onep (type_size)
+	  && !integer_zerop (type_size))
+        return tree_to_uhwi (type_size);
     }
 
-  return NULL_TREE;
+  return HOST_WIDE_INT_M1U;
 }
 
-/* Handle a call to strcmp or strncmp. When the result is ONLY used to do
-   equality test against zero:
-
-   A. When the lengths of both arguments are constant and it's a strcmp:
-      * if the lengths are NOT equal, we can safely fold the call
-        to a non-zero value.
-      * otherwise, do nothing now.
-
-   B. When the length of one argument is constant, try to replace the call
-   with a __builtin_str(n)cmp_eq call where possible, i.e:
-
-   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR
-   is a string with constant length , C is a constant.
-     if (C <= strlen(STR) && sizeof_array(s) > C)
-       {
-         replace this call with
-         strncmp_eq (s, STR, C) (!)= 0
-       }
-     if (C > strlen(STR)
-       {
-         it can be safely treated as a call to strcmp (s, STR) (!)= 0
-         can handled by the following strcmp.
-       }
-
-   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR
-   is a string with constant length.
-     if  (sizeof_array(s) > strlen(STR))
-       {
-         replace this call with
-         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
-       }
-
-   Return true when the call is transformed, return false otherwise.
- */
+/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths
+   of the string(s) referenced by ARG if it can be determined.
+   If the length cannot be determined, set *SIZE to the size of
+   the array the string is stored in, if any.  If no such array is
+   known, set *SIZE to -1.  When the strings are nul-terminated set
+   *NULTERM to true, otherwise to false.  Return true on success.  */
 
 static bool
-handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
+		 unsigned HOST_WIDE_INT *size, bool *nulterm)
 {
-  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
-  tree res = gimple_call_lhs (stmt);
-  use_operand_p use_p;
-  imm_use_iterator iter;
-  tree arg1 = gimple_call_arg (stmt, 0);
-  tree arg2 = gimple_call_arg (stmt, 1);
-  int idx1 = get_stridx (arg1);
-  int idx2 = get_stridx (arg2);
-  HOST_WIDE_INT length = -1;
-  bool is_ncmp = false;
-
-  if (!res)
-    return false;
+  /* Set so that both LEN and ~LEN are invalid lengths, i.e.,
+     maximum possible length + 1.  */
+  lenrng[0] = lenrng[1] = HOST_WIDE_INT_MAX;
 
-  /* When both arguments are unknown, do nothing.  */
-  if (idx1 == 0 && idx2 == 0)
-    return false;
+  *size = HOST_WIDE_INT_M1U;
 
-  /* Handle strncmp function.  */
-  if (gimple_call_num_args (stmt) == 3)
+  if (idx < 0)
     {
-      tree len = gimple_call_arg (stmt, 2);
-      if (tree_fits_shwi_p (len))
-        length = tree_to_shwi (len);
-
-      is_ncmp = true;
+      /* IDX is the inverted constant string length.  */
+      lenrng[0] = ~idx;
+      lenrng[1] = lenrng[0];
+      *nulterm = true;
     }
-
-  /* For strncmp, if the length argument is NOT known, do nothing.  */
-  if (is_ncmp && length < 0)
-    return false;
-
-  /* When the result is ONLY used to do equality test against zero.  */
-  FOR_EACH_IMM_USE_FAST (use_p, iter, res)
+  else if (idx == 0)
+    ; /* Handled below.  */
+  else if (strinfo *si = get_strinfo (idx))
     {
-      gimple *use_stmt = USE_STMT (use_p);
+      if (!si->nonzero_chars)
+	arg = si->ptr;
+      else if (tree_fits_uhwi_p (si->nonzero_chars))
+	{
+	  lenrng[0] = tree_to_uhwi (si->nonzero_chars);
+	  *nulterm = si->full_string_p;
+	  /* Set the upper bound only if the string is known to be
+	     nul-terminated, otherwise leave it at maximum + 1.  */
+	  if (*nulterm)
+	    lenrng[1] = lenrng[0];
+	}
+      else if (TREE_CODE (si->nonzero_chars) == SSA_NAME)
+	{
+	  wide_int min, max;
+	  value_range_kind rng = get_range_info (si->nonzero_chars, &min, &max);
+	  if (rng == VR_RANGE)
+	    {
+	      lenrng[0] = min.to_uhwi ();
+	      lenrng[1] = max.to_uhwi ();
+	      *nulterm = si->full_string_p;
+	    }
+	}
+      else if (si->ptr)
+	arg = si->ptr;
+    }
 
-      if (is_gimple_debug (use_stmt))
-        continue;
-      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
+  if (lenrng[0] == HOST_WIDE_INT_MAX)
+    {
+      /* Compute the minimum and maximum real or possible lengths.  */
+      c_strlen_data lendata = { };
+      if (get_range_strlen (arg, &lendata, /* eltsize = */1))
 	{
-	  tree_code code = gimple_assign_rhs_code (use_stmt);
-	  if (code == COND_EXPR)
+	  if (tree_fits_shwi_p (lendata.maxlen) && !lendata.maxbound)
 	    {
-	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
-	      if ((TREE_CODE (cond_expr) != EQ_EXPR
-		   && (TREE_CODE (cond_expr) != NE_EXPR))
-		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
-		return false;
+	      lenrng[0] = tree_to_shwi (lendata.minlen);
+	      lenrng[1] = tree_to_shwi (lendata.maxlen);
+	      *nulterm = true;
 	    }
-	  else if (code == EQ_EXPR || code == NE_EXPR)
+	  else if (lendata.maxbound && tree_fits_shwi_p (lendata.maxbound))
 	    {
-	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
-		return false;
-            }
-	  else
-	    return false;
+	      /* Set *SIZE to LENDATA.MAXBOUND plus one, where MAXBOUND
+		 is a conservative estimate of the longest string based
+		 on the sizes of the arrays referenced by ARG.  */
+	      *size = tree_to_uhwi (lendata.maxbound) + 1;
+	      *nulterm = false;
+	    }
 	}
-      else if (gimple_code (use_stmt) == GIMPLE_COND)
+      else
 	{
-	  tree_code code = gimple_cond_code (use_stmt);
-	  if ((code != EQ_EXPR && code != NE_EXPR)
-	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
-	    return false;
+	  /* Set *SIZE to the size of the smallest object referenced
+	     by ARG if ARG denotes a single object, or to HWI_M1U
+	     otherwise.  */
+	  *size = determine_min_objsize (arg);
+	  *nulterm = false;
 	}
+    }
+
+  return lenrng[0] != HOST_WIDE_INT_MAX || *size != HOST_WIDE_INT_M1U;
+}
+
+/* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
+   the result of 0 == strncmp (A, B, BOUND) (which is the same as strcmp
+   for a sufficiently large BOUND).  If the result is based on the length
+   of one string being greater than the longest string that would fit in
+   the array pointed to by the other argument, set LEN[] and *PSIZE to
+   the corresponding length (or its complement when the string is known
+   to be at least as long and need not be nul-terminated) and size.
+   Otherwise return null.  */
+
+static tree
+strxcmp_eqz_result (tree arg1, int idx1, tree arg2, int idx2,
+		    unsigned HOST_WIDE_INT bound, unsigned HOST_WIDE_INT len[2],
+		    unsigned HOST_WIDE_INT *psize)
+{
+  /* Determine the range the length of each string is in and whether it's
+     known to be nul-terminated, or the size of the array it's stored in.  */
+  bool nul1, nul2;
+  unsigned HOST_WIDE_INT siz1, siz2;
+  unsigned HOST_WIDE_INT len1rng[2], len2rng[2];
+  if (!get_len_or_size (arg1, idx1, len1rng, &siz1, &nul1)
+      || !get_len_or_size (arg2, idx2, len2rng, &siz2, &nul2))
+    return NULL_TREE;
+
+  /* BOUND is set to HWI_M1U for strcmp and to a smaller value for strncmp,
+     and LENiRNG to HWI_MAX when invalid.  Clamp the length of each string
+     to consider to no more than BOUND.  */
+  if (len1rng[0] < HOST_WIDE_INT_MAX && len1rng[0] > bound)
+    len1rng[0] = bound;
+  if (len1rng[1] < HOST_WIDE_INT_MAX && len1rng[1] > bound)
+    len1rng[1] = bound;
+  if (len2rng[0] < HOST_WIDE_INT_MAX && len2rng[0] > bound)
+    len2rng[0] = bound;
+  if (len2rng[1] < HOST_WIDE_INT_MAX && len2rng[1] > bound)
+    len2rng[1] = bound;
+
+  /* Two empty strings are equal.  */
+  if (len1rng[1] == 0 && len2rng[1] == 0)
+    return integer_one_node;
+
+  /* The strings are definitely unequal when the lower bound of the length
+     of one of them is greater than the length of the longest string that
+     would fit into the other array.  */
+  if (len1rng[0] == HOST_WIDE_INT_MAX
+      && len2rng[0] != HOST_WIDE_INT_MAX
+      && ((len2rng[0] < bound && len2rng[0] >= siz1)
+	  || len2rng[0] > siz1))
+    {
+      *psize = siz1;
+      len[0] = len1rng[0];
+      /* Set LEN[1] to the lower bound of ARG2's length when it's
+	 nul-terminated or to the complement of its minimum length
+	 otherwise.  */
+      len[1] = nul2 ? len2rng[0] : ~len2rng[0];
+      return integer_zero_node;
+    }
+
+  if (len2rng[0] == HOST_WIDE_INT_MAX
+      && len1rng[0] != HOST_WIDE_INT_MAX
+      && ((len1rng[0] < bound && len1rng[0] >= siz2)
+	  || len1rng[0] > siz2))
+    {
+      *psize = siz2;
+      len[0] = nul1 ? len1rng[0] : ~len1rng[0];
+      len[1] = len2rng[0];
+      return integer_zero_node;
+    }
+
+  /* The strings are also definitely unequal when their lengths are unequal
+     and at least one is nul-terminated.  */
+  if (len1rng[0] != HOST_WIDE_INT_MAX
+      && len2rng[0] != HOST_WIDE_INT_MAX
+      && ((len1rng[1] < len2rng[0] && nul1)
+	  || (len2rng[1] < len1rng[0] && nul2)))
+    {
+      if (bound <= len1rng[0] || bound <= len2rng[0])
+	*psize = bound;
       else
-        return false;
+	*psize = HOST_WIDE_INT_M1U;
+
+      len[0] = len1rng[0];
+      len[1] = len2rng[0];
+      return integer_zero_node;
+    }
+
+  /* The string lengths may be equal or unequal.  Even when equal and
+     both strings nul-terminated, without the string contents there's
+     no way to determine whether they are equal.  */
+  return NULL_TREE;
+}
+
+/* Diagnose pointless calls to strcmp whose result is used in equality
+   expressions that evaluate to a constant due to one argument being
+   longer than the size of the other.  */
+
+static void
+maybe_warn_pointless_strcmp (gimple *stmt, HOST_WIDE_INT bound,
+			     unsigned HOST_WIDE_INT len[2],
+			     unsigned HOST_WIDE_INT siz)
+{
+  bool at_least = false;
+
+  if (len[0] > HOST_WIDE_INT_MAX)
+    {
+      at_least = true;
+      len[0] = ~len[0];
+    }
+
+  if (len[1] > HOST_WIDE_INT_MAX)
+    {
+      at_least = true;
+      len[1] = ~len[1];
     }
 
-  /* When the lengths of the arguments are known to be unequal
-     we can safely fold the call to a non-zero value for strcmp;
-     otherwise, do nothing now.  */
-  if (idx1 != 0 && idx2 != 0)
+  unsigned HOST_WIDE_INT minlen = MIN (len[0], len[1]);
+
+  tree lhs = gimple_call_lhs (stmt);
+
+  /* FIXME: Include a note pointing to the declaration
+     of the smaller array.  */
+  if (gimple *use = used_only_for_zero_equality (lhs))
     {
-      if (strxcmp_unequal (idx1, idx2, length))
+      location_t stmt_loc = gimple_location (stmt);
+      tree callee = gimple_call_fndecl (stmt);
+      bool warned = false;
+      if (siz <= minlen && bound == -1)
+	warned = warning_at (stmt_loc, OPT_Wstring_compare,
+			     (at_least
+			      ? G_("%G%qD of a string of length %wu "
+				   "or more and an array of size %wu "
+				   "evaluates to nonzero")
+			      : G_("%G%qD of a string of length %wu "
+				   "and an array of size %wu "
+				   "evaluates to nonzero")),
+			     stmt, callee, minlen, siz);
+      else if (!at_least && siz <= HOST_WIDE_INT_MAX)
 	{
-	  replace_call_with_value (gsi, integer_one_node);
-	  return true;
+	  if (len[0] != HOST_WIDE_INT_MAX
+	      && len[1] != HOST_WIDE_INT_MAX)
+	    warned = warning_at (stmt_loc, OPT_Wstring_compare,
+				 "%G%qD of strings of length %wu "
+				 "and %wu and bound of %wu evaluates "
+				 "to nonzero",
+				 stmt, callee, len[0], len[1], bound);
+	  else
+	    warned = warning_at (stmt_loc, OPT_Wstring_compare,
+				 "%G%qD of a string of length %wu, "
+				 "an array of size %wu and bound "
+				 "of %wu evaluates to nonzero",
+				 stmt, callee, minlen, siz, bound);
+	}
+
+      if (warned)
+	{
+	  location_t use_loc = gimple_location (use);
+	  if (LOCATION_LINE (stmt_loc) != LOCATION_LINE (use_loc))
+	    inform (use_loc, "in this expression");
 	}
-      return false;
     }
+}
 
-  /* When the length of one argument is constant.  */
-  tree var_string = NULL_TREE;
-  HOST_WIDE_INT const_string_leni = -1;
 
-  if (idx1)
+/* Optimize a call to strcmp or strncmp either by folding it to a constant
+   when possible or by transforming the latter to the former.  Warn about
+   calls where the length of one argument is greater than the size of
+   the array to which the other argument points if the latter's length
+   is not known.  Return true when the call has been transformed into
+   another and false otherwise.  */
+
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree lhs = gimple_call_lhs (stmt);
+
+  if (!lhs)
+    return false;
+
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  int idx1 = get_stridx (arg1);
+  int idx2 = get_stridx (arg2);
+
+  /* For strncmp, set to the value of the third argument if known.  */
+  HOST_WIDE_INT bound = -1;
+
+  /* Extract the strncmp bound.  */
+  if (gimple_call_num_args (stmt) == 3)
     {
-      const_string_leni = compute_string_length (idx1);
-      var_string = arg2;
+      tree len = gimple_call_arg (stmt, 2);
+      if (tree_fits_shwi_p (len))
+        bound = tree_to_shwi (len);
+
+      /* If the bound argument is NOT known, do nothing.  */
+      if (bound < 0)
+	return false;
     }
-  else
+
+  /* Set to the length of one argument (or its complement if it's
+     the lower bound of a range) and the size of the array storing
+     the other if the result is based on the former being equal to
+     or greater than the latter.  */
+  unsigned HOST_WIDE_INT len[2] = { HOST_WIDE_INT_MAX, HOST_WIDE_INT_MAX };
+  unsigned HOST_WIDE_INT siz = HOST_WIDE_INT_M1U;
+
+  /* Try to determine if the two strings are either definitely equal
+     or definitely unequal and if so, either fold the result to zero
+     (when equal) or set the range of the result to ~[0, 0] otherwise.  */
+  if (tree eqz = strxcmp_eqz_result (arg1, idx1, arg2, idx2, bound,
+				     len, &siz))
     {
-      gcc_checking_assert (idx2);
-      const_string_leni = compute_string_length (idx2);
-      var_string = arg1;
+      if (integer_zerop (eqz))
+	{
+	  maybe_warn_pointless_strcmp (stmt, bound, len, siz);
+
+	  if (bound < 0)
+	    inform (gimple_location (stmt),
+		    "%G%qD (%qE, %qE) set result to ~[0, 0]",
+		    (gimple *)stmt, gimple_call_fndecl (stmt), arg1, arg2);
+	  else
+	    inform (gimple_location (stmt),
+		    "%G%qD (%qE, %qE, %wu) set result to ~[0, 0]",
+		    (gimple *)stmt, gimple_call_fndecl (stmt), arg1, arg2, bound);
+	  /* When the lengths of the first two string arguments are
+	     known to be unequal set the range of the result to non-zero.
+	     This allows the call to be eliminated if its result is only
+	     used in tests for equality to zero.  */
+	  wide_int zero = wi::zero (TYPE_PRECISION (TREE_TYPE (lhs)));
+	  set_range_info (lhs, VR_ANTI_RANGE, zero, zero);
+	  return false;
+	}
+      /* When the two strings are definitely equal (such as when they
+	 are both empty) fold the call to the constant result.  */
+      replace_call_with_value (gsi, integer_zero_node);
+      if (bound < 0)
+	inform (gimple_location (stmt),
+		"%G%qD (%qE, %qE) folded to zero",
+		(gimple *)stmt, gimple_call_fndecl (stmt), arg1, arg2);
+      else
+	inform (gimple_location (stmt),
+		"%G%qD (%qE, %qE, %wu) folded to zero",
+		(gimple *)stmt, gimple_call_fndecl (stmt), arg1, arg2, bound);
+      return true;
     }
 
-  if (const_string_leni < 0)
+  if (idx1 == 0 && idx2 == 0)
     return false;
 
-  unsigned HOST_WIDE_INT var_sizei = 0;
-  /* try to determine the minimum size of the object pointed by var_string.  */
-  tree size = determine_min_objsize (var_string);
+  /* Determine either the length or the size of each of the string
+     arguments, whichever is available.  */
+  HOST_WIDE_INT cstlen1 = -1, cstlen2 = -1;
+  HOST_WIDE_INT arysiz1 = -1, arysiz2 = -1;
+
+  if (idx1)
+    cstlen1 = compute_string_length (idx1) + 1;
+  else
+    arysiz1 = determine_min_objsize (arg1);
 
-  if (!size)
+  /* Bail if neither the string length nor the size of the array
+     it is stored in can be determined.  */
+  if (cstlen1 < 0 && arysiz1 < 0)
     return false;
 
-  if (tree_fits_uhwi_p (size))
-    var_sizei = tree_to_uhwi (size);
+  /* Repeat for the second argument.  */
+  if (idx2)
+    cstlen2 = compute_string_length (idx2) + 1;
+  else
+    arysiz2 = determine_min_objsize (arg2);
 
-  if (var_sizei == 0)
+  if (cstlen2 < 0 && arysiz2 < 0)
     return false;
 
-  /* For strncmp, if length > const_string_leni , this call can be safely
-     transformed to a strcmp.  */
-  if (is_ncmp && length > const_string_leni)
-    is_ncmp = false;
-
-  unsigned HOST_WIDE_INT final_length
-    = is_ncmp ? length : const_string_leni + 1;
+  /* The exact number of characters to compare.  */
+  HOST_WIDE_INT cmpsiz = bound < 0 ? cstlen1 < 0 ? cstlen2 : cstlen1 : bound;
+  /* The size of the array in which the unknown string is stored.  */
+  HOST_WIDE_INT varsiz = arysiz1 < 0 ? arysiz2 : arysiz1;
 
-  /* Replace strcmp or strncmp with the corresponding str(n)cmp_eq.  */
-  if (var_sizei > final_length)
+  if (cmpsiz < varsiz && used_only_for_zero_equality (lhs))
     {
-      tree fn
-	= (is_ncmp
-	   ? builtin_decl_implicit (BUILT_IN_STRNCMP_EQ)
-	   : builtin_decl_implicit (BUILT_IN_STRCMP_EQ));
-      if (!fn)
-	return false;
-      tree const_string_len = build_int_cst (size_type_node, final_length);
-      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+      /* If the known length is less than the size of the other array
+	 and the strcmp result is only used to test equality to zero,
+	 transform the call to the equivalent _eq call.  */
+      if (tree fn = builtin_decl_implicit (bound < 0 ? BUILT_IN_STRCMP_EQ
+					   : BUILT_IN_STRNCMP_EQ))
+	{
+	  if (bound < 0)
+	    inform (gimple_location (stmt),
+		    "%G%qD (%qE, %qE) transformed to %qD (..., %wi)",
+		    (gimple *)stmt, gimple_call_fndecl (stmt), arg1, arg2,
+		    fn, cmpsiz);
+	  else
+	    inform (gimple_location (stmt),
+		    "%G%qD (%qE, %qE, %wu) transformed to %qD (..., %wi)",
+		    (gimple *)stmt,
+		    gimple_call_fndecl (stmt), arg1, arg2, bound,
+		    fn, cmpsiz);
+	  tree n = build_int_cst (size_type_node, cmpsiz);
+	  update_gimple_call (gsi, fn, 3, arg1, arg2, n);
+	  return true;
+	}
     }
-  else
-    return false;
 
-  return true;
+  return false;
 }
 
 /* Handle a POINTER_PLUS_EXPR statement.



* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-09 16:42 [PATCH] fold more string comparison with known result (PR 90879) Martin Sebor
@ 2019-08-09 16:51 ` Jakub Jelinek
  2019-08-09 17:07   ` Martin Sebor
  2019-08-12 22:22 ` Jeff Law
  1 sibling, 1 reply; 21+ messages in thread
From: Jakub Jelinek @ 2019-08-09 16:51 UTC (permalink / raw)
  To: Martin Sebor; +Cc: gcc-patches

On Fri, Aug 09, 2019 at 10:17:12AM -0600, Martin Sebor wrote:
> --- a/gcc/gengtype-state.c
> +++ b/gcc/gengtype-state.c
> @@ -79,6 +79,14 @@ enum state_token_en
>    STOK_NAME                     /* hash-consed name or identifier.  */
>  };
>  
> +/* Suppress warning: ISO C forbids zero-size array for stok_string
> +   below.  The arrays are treated as flexible array members but in
> +   otherwise an empty struct or as a member of a union cannot be
> +   declared as such.  They must have zero size to keep GCC from
> +   assuming their bound reflect their size.  */
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wpedantic"
> +
>  
>  /* Structure and hash-table used to share identifiers or names.  */
>  struct state_ident_st
> @@ -86,11 +94,10 @@ struct state_ident_st
>    /* TODO: We could improve the parser by reserving identifiers for
>       state keywords and adding a keyword number for them.  That would
>       mean adding another field in this state_ident_st struct.  */
> -  char stid_name[1];		/* actually bigger & null terminated */
> +  char stid_name[0];		/* actually bigger & null terminated */

No, please don't do this.  The part of the GCC that is built by system
compiler shouldn't use GNU extensions, unless guarded only for compilation
with compilers that do support that.

	Jakub


* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-09 16:51 ` Jakub Jelinek
@ 2019-08-09 17:07   ` Martin Sebor
  2019-08-09 17:07     ` Jakub Jelinek
  2019-08-13 20:08     ` Jeff Law
  0 siblings, 2 replies; 21+ messages in thread
From: Martin Sebor @ 2019-08-09 17:07 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1926 bytes --]

On 8/9/19 10:22 AM, Jakub Jelinek wrote:
> On Fri, Aug 09, 2019 at 10:17:12AM -0600, Martin Sebor wrote:
>> --- a/gcc/gengtype-state.c
>> +++ b/gcc/gengtype-state.c
>> @@ -79,6 +79,14 @@ enum state_token_en
>>     STOK_NAME                     /* hash-consed name or identifier.  */
>>   };
>>   
>> +/* Suppress warning: ISO C forbids zero-size array for stok_string
>> +   below.  The arrays are treated as flexible array members but in
>> +   otherwise an empty struct or as a member of a union cannot be
>> +   declared as such.  They must have zero size to keep GCC from
>> +   assuming their bound reflect their size.  */
>> +#pragma GCC diagnostic push
>> +#pragma GCC diagnostic ignored "-Wpedantic"
>> +
>>   
>>   /* Structure and hash-table used to share identifiers or names.  */
>>   struct state_ident_st
>> @@ -86,11 +94,10 @@ struct state_ident_st
>>     /* TODO: We could improve the parser by reserving identifiers for
>>        state keywords and adding a keyword number for them.  That would
>>        mean adding another field in this state_ident_st struct.  */
>> -  char stid_name[1];		/* actually bigger & null terminated */
>> +  char stid_name[0];		/* actually bigger & null terminated */
> 
> No, please don't do this.  The part of the GCC that is built by system
> compiler shouldn't use GNU extensions, unless guarded only for compilation
> with compilers that do support that.

Hmm, this wasn't supposed to be in the diff anymore (the patch
handles the code without these changes).  I removed it after
verifying it just before sending the patch so my mailer must
have sent a cached copy.  Attached is the latest tested patch
without this change.

That said, we should change this code one way or the other.
There is even less of a guarantee that other compilers support
writing past the end of arrays that have non-zero size than
that they recognize the documented zero-length extension.
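
To illustrate the alternative, here is a minimal sketch of the same
idiom written with a C99 flexible array member, which standard C
defines for exactly this purpose.  It is not code from the patch:
struct ident and make_ident are made-up names standing in for
state_ident_st and its allocation.  It also assumes a C compiler;
flexible array members are not part of ISO C++, so this would not
by itself settle what to do in gengtype-state.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for state_ident_st.  */
struct ident
{
  int kind;
  char name[];		/* C99 flexible array member, nul-terminated */
};

/* Allocate an ident with room for a copy of S, including its
   terminating nul.  Return null if the allocation fails.  */
static struct ident *
make_ident (int kind, const char *s)
{
  assert (s != NULL);

  size_t len = strlen (s);
  struct ident *p = malloc (offsetof (struct ident, name) + len + 1);
  if (!p)
    return NULL;

  p->kind = kind;
  memcpy (p->name, s, len + 1);
  return p;
}
```

Because the member has no declared bound, the compiler has no array
size to base assumptions on, unlike with a trailing array of size 1.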

Martin

[-- Attachment #2: gcc-90879.diff --]
[-- Type: text/x-patch, Size: 62463 bytes --]

PR tree-optimization/90879 - fold zero-equality of strcmp between a longer string and a smaller array

gcc/c-family/ChangeLog:

	PR tree-optimization/90879
	* c.opt (-Wstring-compare): New option.

gcc/testsuite/ChangeLog:

	PR tree-optimization/90879
	* gcc.dg/Wstring-compare-2.c: New test.
	* gcc.dg/Wstring-compare.c: New test.
	* gcc.dg/strcmpopt_3.c: Scan the optimized dump instead of strlen.
	* gcc.dg/strcmpopt_6.c: New test.
	* gcc.dg/strlenopt-65.c: Remove unnecessary declarations, add
	test cases.
	* gcc.dg/strlenopt-66.c: Run it.
	* gcc.dg/strlenopt-68.c: New test.

gcc/ChangeLog:

	PR tree-optimization/90879
	* builtins.c (check_access): Avoid using maxbound when null.
	* calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen change.
	* doc/invoke.texi (-Wstring-compare): Document new warning option.
	* gimple-fold.c (get_range_strlen_tree): Make setting maxbound
	conditional.
	(get_range_strlen): Overwrite initial maxbound when non-null.
	* gimple-ssa-sprintf.c (get_string_length): Adjust to get_range_strlen
	change.
	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
	(used_only_for_zero_equality): New function.
	(handle_builtin_memcmp): Call it.
	(determine_min_objsize): Return an integer instead of tree.
	(get_len_or_size, strxcmp_eqz_result): New functions.
	(maybe_warn_pointless_strcmp): New function.
	(handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
	between a longer string and a smaller array.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 695a9d191af..eca710942dc 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3326,7 +3326,7 @@ check_access (tree exp, tree, tree, tree dstwrite,
 	  c_strlen_data lendata = { };
 	  get_range_strlen (srcstr, &lendata, /* eltsize = */ 1);
 	  range[0] = lendata.minlen;
-	  range[1] = lendata.maxbound;
+	  range[1] = lendata.maxbound ? lendata.maxbound : lendata.maxlen;
 	  if (range[0] && (!maxread || TREE_CODE (maxread) == INTEGER_CST))
 	    {
 	      if (maxread && tree_int_cst_le (maxread, range[0]))
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 257cadfa5f1..2fe6cc4ee08 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -784,6 +784,12 @@ Wsizeof-array-argument
 C ObjC C++ ObjC++ Var(warn_sizeof_array_argument) Warning Init(1)
 Warn when sizeof is applied on a parameter declared as an array.
 
+Wstring-compare
+C ObjC C++ LTO ObjC++ Warning Var(warn_string_compare) Warning LangEnabledBy(C ObjC C++ ObjC++, Wextra)
+Warn about calls to strcmp and strncmp used in equality expressions that
+are necessarily true or false due to the length of one and size of the other
+argument.
+
 Wstringop-overflow
 C ObjC C++ LTO ObjC++ Warning Alias(Wstringop-overflow=, 2, 0)
 Warn about buffer overflow in string manipulation functions like memcpy
diff --git a/gcc/calls.c b/gcc/calls.c
index 7507b698e27..dcebf67b5cc 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1593,6 +1593,10 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp)
 	    if (!get_attr_nonstring_decl (arg))
 	      {
 		c_strlen_data lendata = { };
+		/* Set MAXBOUND to an arbitrary non-null non-integer
+		   node as a request to have it set to the length of
+		   the longest string in a PHI.  */
+		lendata.maxbound = arg;
 		get_range_strlen (arg, &lendata, /* eltsize = */ 1);
 		maxlen = lendata.maxbound;
 	      }
@@ -1618,6 +1622,10 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp)
 	if (!get_attr_nonstring_decl (arg))
 	  {
 	    c_strlen_data lendata = { };
+	    /* Set MAXBOUND to an arbitrary non-null non-integer
+	       node as a request to have it set to the length of
+	       the longest string in a PHI.  */
+	    lendata.maxbound = arg;
 	    get_range_strlen (arg, &lendata, /* eltsize = */ 1);
 	    maxlen = lendata.maxbound;
 	  }
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 01aab60f895..5f712dda8bb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -347,6 +347,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wsizeof-pointer-memaccess  -Wsizeof-array-argument @gol
 -Wstack-protector  -Wstack-usage=@var{byte-size}  -Wstrict-aliasing @gol
 -Wstrict-aliasing=n  -Wstrict-overflow  -Wstrict-overflow=@var{n} @gol
+-Wstring-compare @gol
 -Wstringop-overflow=@var{n}  -Wstringop-truncation  -Wsubobject-linkage @gol
 -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}malloc@r{]} @gol
 -Wsuggest-final-types @gol  -Wsuggest-final-methods  -Wsuggest-override @gol
@@ -5815,6 +5816,30 @@ comparisons, so this warning level gives a very large number of
 false positives.
 @end table
 
+@item -Wstring-compare
+@opindex Wstring-compare
+@opindex Wno-string-compare
+Warn for calls to @code{strcmp} and @code{strncmp} whose result is
+determined to be either zero or non-zero in tests for such equality
+owing to the length of one argument being greater than the size of
+the array the other argument is stored in (or the bound in the case
+of @code{strncmp}).  Such calls could be mistakes.  For example,
+the call to @code{strcmp} below is diagnosed because its result is
+necessarily non-zero irrespective of the contents of the array @code{a}.
+
+@smallexample
+extern char a[4];
+void f (char *d)
+@{
+  strcpy (d, "string");
+  @dots{}
+  if (0 == strcmp (a, d))   // cannot be true
+    puts ("a and d are the same");
+@}
+@end smallexample
+
+@option{-Wstring-compare} is enabled by @option{-Wextra}.
+
 @item -Wstringop-overflow
 @itemx -Wstringop-overflow=@var{type}
 @opindex Wstringop-overflow
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index fc57fb45e3a..582768090ae 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1346,6 +1346,10 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	}
     }
 
+  /* Set if VAL represents the maximum length based on array size (set
+     when exact length cannot be determined).  */
+  bool maxbound = false;
+
   if (!val && rkind == SRK_LENRANGE)
     {
       if (TREE_CODE (arg) == ADDR_EXPR)
@@ -1441,6 +1445,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	      pdata->minlen = ssize_int (0);
 	    }
 	}
+      maxbound = true;
     }
 
   if (!val)
@@ -1454,7 +1459,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	  && tree_int_cst_lt (val, pdata->minlen)))
     pdata->minlen = val;
 
-  if (pdata->maxbound)
+  if (pdata->maxbound && TREE_CODE (pdata->maxbound) == INTEGER_CST)
     {
       /* Adjust the tighter (more optimistic) string length bound
 	 if necessary and proceed to adjust the more conservative
@@ -1472,7 +1477,10 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
       else
 	pdata->maxbound = val;
     }
-  else
+  else if (pdata->maxbound || maxbound)
+    /* Set PDATA->MAXBOUND only if it either isn't INTEGER_CST or
+       if VAL corresponds to the maximum length determined based
+       on the type of the object.  */
     pdata->maxbound = val;
 
   if (tight_bound)
@@ -1653,8 +1661,11 @@ get_range_strlen (tree arg, bitmap *visited,
 
 /* Try to obtain the range of the lengths of the string(s) referenced
    by ARG, or the size of the largest array ARG refers to if the range
-   of lengths cannot be determined, and store all in *PDATA.  ELTSIZE
-   is the expected size of the string element in bytes: 1 for char and
+   of lengths cannot be determined, and store all in *PDATA which must
+   be zero-initialized on input except PDATA->MAXBOUND may be set to
+   a non-null tree node other than INTEGER_CST to request to have it
+   set to the length of the longest string in a PHI.  ELTSIZE is
+   the expected size of the string element in bytes: 1 for char and
    some power of 2 for wide characters.
    Return true if the range [PDATA->MINLEN, PDATA->MAXLEN] is suitable
    for optimization.  Returning false means that a nonzero PDATA->MINLEN
@@ -1666,6 +1677,7 @@ bool
 get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)
 {
   bitmap visited = NULL;
+  tree maxbound = pdata->maxbound;
 
   if (!get_range_strlen (arg, &visited, SRK_LENRANGE, pdata, eltsize))
     {
@@ -1678,8 +1690,9 @@ get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)
   else if (!pdata->minlen)
     pdata->minlen = ssize_int (0);
 
-  /* Unless its null, leave the more conservative MAXBOUND unchanged.  */
-  if (!pdata->maxbound)
+  /* If it's unchanged from its initial non-null value, set the conservative
+     MAXBOUND to MAXLEN.  Otherwise leave it null (if it is null).  */
+  if (maxbound && pdata->maxbound == maxbound)
     pdata->maxbound = pdata->maxlen;
 
   if (visited)
diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 88ba1f2cac1..279338b1577 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -2041,6 +2041,9 @@ get_string_length (tree str, unsigned eltsize)
      aren't known to point any such arrays result in LENDATA.MAXLEN
      set to SIZE_MAX.  */
   c_strlen_data lendata = { };
+  /* Set MAXBOUND to an arbitrary non-null non-integer node as a request
+     to have it set to the length of the longest string in a PHI.  */
+  lendata.maxbound = str;
   get_range_strlen (str, &lendata, eltsize);
 
   /* Return the default result when nothing is known about the string. */
diff --git a/gcc/testsuite/gcc.dg/Wstring-compare-2.c b/gcc/testsuite/gcc.dg/Wstring-compare-2.c
new file mode 100644
index 00000000000..e6ca2a69999
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstring-compare-2.c
@@ -0,0 +1,127 @@
+/* PR tree-optimization/90879 - fold zero-equality of strcmp between
+   a longer string and a smaller array
+   Test for a warning for strcmp of a longer string against smaller
+   array.
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wstring-compare -Wno-stringop-truncation -ftrack-macro-expansion=0" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* memcpy (void*, const void*, size_t);
+
+extern int strcmp (const char*, const char*);
+extern size_t strlen (const char*);
+extern char* strcpy (char*, const char*);
+extern char* strncpy (char*, const char*, size_t);
+extern int strncmp (const char*, const char*, size_t);
+
+void sink (int, ...);
+#define sink(...) sink (__LINE__, __VA_ARGS__)
+
+
+extern char a1[1], a2[2], a3[3], a4[4], a5[5], a6[6], a7[7], a8[8], a9[9];
+
+#define T(a, b) sink (0 == strcmp (a, b))
+
+
+void test_string_cst (void)
+{
+  const char *s1 = "1", *s2 = "12";
+
+  T (s1, a1);                 // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+  T (s1, a2);
+  T (s1, a3);
+
+  T (a1, s1);                 // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+  T (a2, s1);
+  T (a3, s1);
+
+  T (s2, a1);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 1 evaluates to nonzero" }
+  T (s2, a2);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+  T (s2, a3);
+
+  T (a1, s2);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 1 evaluates to nonzero" }
+  T (a2, s2);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+  T (a3, s2);
+}
+
+
+void test_string_cst_off_cst (void)
+{
+  const char *s1 = "1", *s2 = "12", *s3 = "123", *s4 = "1234";
+
+  T (s1, a2 + 1);              // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+  T (a2 + 1, s1);              // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+
+
+  T (s3 + 1, a2);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+  T (s3 + 1, a3);
+
+  T (s2, a4 + 1);
+  T (s2, a4 + 2);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+
+  T (s4, a4 + 1);             // { dg-warning ".strcmp. of a string of length 4 and an array of size 3 evaluates to nonzero" }
+  T (s3, a5 + 1);
+}
+
+
+/* Use strncpy below rather than memcpy until PR 91183 is resolved.  */
+
+#undef T
+#define T(s, n, a)					\
+  do {							\
+    char arr[32];					\
+    sink (arr);						\
+    strncpy (arr, s, n < 0 ? strlen (s) + 1: n);	\
+    sink (0 == strcmp (arr, a));			\
+  } while (0)
+
+void test_string_exact_length (void)
+{
+  const char *s1 = "1", *s2 = "12";
+
+  T (s1, -1, a1);             // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+  T (s1, -1, a2);
+  T (s1, -1, a3);
+
+  T (s2, -1, a1);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 1 evaluates to nonzero" }
+  T (s2, -1, a2);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+  T (s2, -1, a3);
+}
+
+
+void test_string_min_length (void)
+{
+  const char *s1 = "1", *s2 = "12";
+
+  T (s1,  1, a1);             // { dg-warning ".strcmp. of a string of length 1 or more and an array of size 1 evaluates to nonzero" }
+  T (s1,  1, a2);
+  T (s1,  1, a3);
+
+  T (s2,  2, a1);             // { dg-warning ".strcmp. of a string of length 2 or more and an array of size 1 evaluates to nonzero" }
+  T (s2,  2, a2);             // { dg-warning ".strcmp. of a string of length 2 or more and an array of size 2 evaluates to nonzero" }
+  T (s2,  2, a3);
+}
+
+
+int test_strncmp_str_lit_var (const char *s, long n)
+{
+  if (strncmp (s, "123456", n) == 0)    // { dg-bogus "\\\[-Wstring-compare" }
+    return 1;
+
+  return 0;
+}
+
+int test_strlen_strncmp_str_lit_var (const char *s, long n)
+{
+  if (__builtin_strlen (s) < n)
+    return -1;
+
+  if (n == 6)
+    if (strncmp (s, "123456", n) == 0)  // { dg-bogus "\\\[-Wstring-compare" }
+      return 1;
+
+  return 0;
+}
+
+
diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c b/gcc/testsuite/gcc.dg/Wstring-compare.c
new file mode 100644
index 00000000000..0ca492db0ab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstring-compare.c
@@ -0,0 +1,181 @@
+/* PR tree-optimization/90879 - fold zero-equality of strcmp between
+   a longer string and a smaller array
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wextra -ftrack-macro-expansion=0" } */
+
+#include "strlenopt.h"
+
+#define T(a, b) sink (0 == strcmp (a, b), a, b)
+
+void sink (int, ...);
+
+struct S { char a4[4], c; };
+
+extern char a4[4];
+extern char a5[5];
+extern char b4[4];
+
+/* Verify that comparison of string literals with arrays with unknown
+   content but size that prevents them from comparing equal is diagnosed.  */
+
+void strcmp_array_lit (void)
+{
+  if (strcmp (a4, "1234"))  // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+                            // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+    sink (0, a4);
+
+  int cmp;
+  cmp = strcmp (a4, "1234");  // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+  if (cmp)                  // { dg-message "in this expression" }
+    sink (0, a4);
+
+  T (a4, "4321");           // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero " }
+  T (a4, "12345");          // { dg-warning "length 5 and an array of size 4 " }
+  T (a4, "123456");         // { dg-warning "length 6 and an array of size 4 " }
+  T ("1234", a4);           // { dg-warning "length 4 and an array of size 4 " }
+  T ("12345", a4);          // { dg-warning "length 5 and an array of size 4 " }
+  T ("123456", a4);         // { dg-warning "length 6 and an array of size 4 " }
+}
+
+
+void strcmp_array_pstr (void)
+{
+  const char *s4 = "1234";
+
+  {
+    if (strcmp (a4, s4))    // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+                            // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  {
+    int c;
+    c = strcmp (a4, s4);    // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+    if (c)                  // { dg-message "in this expression" }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  const char *t4 = "4321";
+  const char *s5 = "12345";
+  const char *s6 = "123456";
+
+  T (a4, t4);               // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero " }
+  T (a4, s5);               // { dg-warning "length 5 and an array of size 4 " }
+  T (a4, s6);               // { dg-warning "length 6 and an array of size 4 " }
+  T (s4, a4);               // { dg-warning "length 4 and an array of size 4 " }
+  T (s5, a4);               // { dg-warning "length 5 and an array of size 4 " }
+  T (s6, a4);               // { dg-warning "length 6 and an array of size 4 " }
+}
+
+
+void strcmp_array_cond_pstr (int i)
+{
+  const char *s4 = i ? "1234" : "4321";
+  T (a4, s4);               // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero " }
+  T (a5, s4);
+}
+
+void strcmp_array_copy (void)
+{
+  char s[8];
+
+  {
+    strcpy (s, "1234");
+    if (strcmp (a4, s))     // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+                            // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  {
+    strcpy (s, "1234");
+
+    int c;
+    c = strcmp (a4, s);     // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+    if (c)                  // { dg-message "in this expression" }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  strcpy (s, "4321");
+  T (a4, s);                // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero " }
+  strcpy (s, "12345");
+  T (a4, s);                // { dg-warning "length 5 and an array of size 4 " }
+  strcpy (s, "123456");
+  T (a4, s);                // { dg-warning "length 6 and an array of size 4 " }
+  strcpy (s, "4321");
+  T (s, a4);                // { dg-warning "length 4 and an array of size 4 " }
+  strcpy (s, "54321");
+  T (s, a4);                // { dg-warning "length 5 and an array of size 4 " }
+  strcpy (s, "654321");
+  T (s, a4);                // { dg-warning "length 6 and an array of size 4 " }
+}
+
+
+void strcmp_member_array_lit (const struct S *p)
+{
+  T (p->a4, "1234");        // { dg-warning "length 4 and an array of size 4 " }
+}
+
+
+#undef T
+#define T(a, b, n) sink (0 == strncmp (a, b, n), a, b)
+
+void strncmp_array_lit (void)
+{
+  if (strncmp (a4, "12345", 5))   // { dg-warning "'strncmp' of a string of length 5, an array of size 4 and bound of 5 evaluates to nonzero" }
+                                  // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+    sink (0, a4);
+
+  int cmp;
+  cmp = strncmp (a4, "54321", 5);   // { dg-warning "'strncmp' of a string of length 5, an array of size 4 and bound of 5 evaluates to nonzero" }
+  if (cmp)                          // { dg-message "in this expression" }
+    sink (0, a4);
+
+  // Verify no warning when the bound is the same as the array size.
+  T (a4, "4321", 4);
+  T (a4, "654321", 4);
+
+  T (a4, "12345", 5);       // { dg-warning "length 5, an array of size 4 and bound of 5 " }
+  T (a4, "123456", 6);      // { dg-warning "length 6, an array of size 4 and bound of 6" }
+
+  T ("1234", a4, 4);
+  T ("12345", a4, 4);
+
+  T ("12345", a4, 5);       // { dg-warning "length 5, an array of size 4 and bound of 5 " }
+  T ("123456", a4, 6);      // { dg-warning "length 6, an array of size 4 and bound of 6 " }
+}
+
+
+void strncmp_strarray_copy (void)
+{
+  {
+    char a[] = "1234";
+    char b[6];
+    strcpy (b, "12345");
+    if (strncmp (a, b, 5))  // { dg-warning "'strncmp' of strings of length 4 and 5 and bound of 5 evaluates to nonzero" }
+                            // { dg-bogus "in this expreession" "unwanted note" { target *-*-* } .-1 }
+      sink (0, a, b);
+  }
+
+  {
+    char a[] = "4321";
+    char b[6];
+    strcpy (b, "54321");
+    int cmp;
+    cmp = strncmp (a, b, 5);  // { dg-warning "'strncmp' of strings of length 4 and 5 and bound of 5 evaluates to nonzero" }
+    if (cmp)                  // { dg-message "in this expression" }
+      sink (0, a, b);
+  }
+
+  strcpy (a4, "abc");
+  T (a4, "54321", 5);       // { dg-warning "'strncmp' of strings of length 3 and 5 and bound of 5 evaluates to nonzero " }
+}
+
+
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
index 86a0d7a08b3..35941bee575 100644
--- a/gcc/testsuite/gcc.dg/strcmpopt_3.c
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -1,31 +1,31 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-strlen" } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
 
-__attribute__ ((noinline)) int 
-f1 (void) 
-{ 
+__attribute__ ((noinline)) int
+f1 (void)
+{
   char *s0= "abcd";
   char s[8];
   __builtin_strcpy (s, s0);
-  return __builtin_strcmp(s, "abc") != 0; 
+  return __builtin_strcmp (s, "abc") != 0;
 }
 
 __attribute__ ((noinline)) int
-f2 (void) 
-{ 
+f2 (void)
+{
   char *s0 = "ab";
   char s[8];
   __builtin_strcpy (s, s0);
-  return __builtin_strcmp("abc", s) != 0; 
+  return __builtin_strcmp ("abc", s) != 0;
 }
 
 int main (void)
 {
-  if (f1 () != 1 
+  if (f1 () != 1
       || f2 () != 1)
     __builtin_abort ();
 
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_6.c b/gcc/testsuite/gcc.dg/strcmpopt_6.c
new file mode 100644
index 00000000000..cb99294e5fa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_6.c
@@ -0,0 +1,207 @@
+/* Verify that strcmp and strncmp calls with mixed constant and
+   non-constant strings are evaluated correctly.
+   { dg-do run }
+   { dg-options "-O2" } */
+
+#include "strlenopt.h"
+
+#define A(expr)                                                 \
+  ((expr)                                                       \
+   ? (void)0                                                    \
+   : (__builtin_printf ("assertion failed on line %i: %s\n",    \
+                        __LINE__, #expr),                       \
+      __builtin_abort ()))
+
+__attribute__ ((noclone, noinline)) int
+test_strlen_gt2_strcmp_abcd (const char *s)
+{
+  if (strlen (s) < 3)
+    return -1;
+
+  return strcmp (s, "abcd") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strlen_lt6_strcmp_abcd (const char *s)
+{
+  if (strlen (s) > 5)
+    return -1;
+
+  return strcmp (s, "abcd") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_strcmp_abc (const char *s)
+{
+  char a[4];
+  strcpy (a, s);
+  return strcmp (a, "abc") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_abc_strcmp (const char *s)
+{
+  char a[4], b[6];
+  strcpy (a, "abc");
+  strcpy (b, s);
+  return strcmp (a, b) == 0;
+}
+
+/* Exercise strcmp of two strings between 1 and 3 characters long
+   stored in arrays of the same known size.  */
+char ga4[4], gb4[4];
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strcmp_same_size_arrays (void)
+{
+  ga4[0] = gb4[0] = 'x';
+  ga4[3] = gb4[3] = '\0';
+  return strcmp (ga4, gb4) == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strncmp_bound_2_same_size_arrays (void)
+{
+  ga4[0] = gb4[0] = 'x';
+  ga4[3] = gb4[3] = '\0';
+  return strncmp (ga4, gb4, 2) == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strncmp_bound_equal_same_size_arrays (void)
+{
+  ga4[0] = gb4[0] = 'x';
+  ga4[3] = gb4[3] = '\0';
+  return strncmp (ga4, gb4, 4) == 0;
+}
+
+/* Exercise strcmp of two strings between 0 and 3 characters long
+   stored in arrays of the same known size.  */
+
+__attribute__ ((noclone, noinline)) int
+test_nulterm_strcmp_same_size_arrays (void)
+{
+  ga4[3] = gb4[3] = '\0';
+  return strcmp (ga4, gb4) == 0;
+}
+
+/* Exercise strcmp of two strings between 1 and 3 and 1 and 4 characters
+   long, respectively, stored in arrays of known but different sizes.  */
+char gc5[5];
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strcmp_arrays (void)
+{
+  ga4[0] = gc5[0] = 'x';
+  ga4[3] = gc5[4] = '\0';
+  return strcmp (ga4, gc5) == 0;
+}
+
+/* Exercise strcmp of two strings between 0 and 3 and 0 and 4 characters
+   long, respectively, stored in arrays of known but different sizes.  */
+
+__attribute__ ((noclone, noinline)) int
+test_nulterm_strcmp_arrays (void)
+{
+  ga4[3] = gc5[4] = '\0';
+  return strcmp (ga4, gc5) == 0;
+}
+
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_strncmp_abcd (const char *s)
+{
+  char a[6];
+  strcpy (a, s);
+  return strcmp (a, "abcd") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_abcd_strncmp_3 (const char *s)
+{
+  char a[6], b[8];
+  strcpy (a, "abcd");
+  strcpy (b, s);
+  return strncmp (a, b, 3) == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_abcd_strncmp_4 (const char *s)
+{
+  char a[6], b[8];
+  strcpy (a, "abcd");
+  strcpy (b, s);
+  return strncmp (a, b, 4) == 0;
+}
+
+
+int main (void)
+{
+  test_strlen_gt2_strcmp_abcd ("abcd");
+  test_strlen_lt6_strcmp_abcd ("abcd");
+
+  A (0 == test_strcpy_strcmp_abc ("ab"));
+  A (0 != test_strcpy_strcmp_abc ("abc"));
+  A (0 == test_strcpy_strcmp_abc ("abcd"));
+
+  A (0 == test_strcpy_abc_strcmp ("ab"));
+  A (0 != test_strcpy_abc_strcmp ("abc"));
+  A (0 == test_strcpy_abc_strcmp ("abcd"));
+
+  strcpy (ga4, "abc"); strcpy (gb4, "abd");
+  A (0 == test_store_0_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abd"); strcpy (gb4, "abc");
+  A (0 == test_store_0_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_store_0_nulterm_strcmp_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gb4, "acd");
+  A (0 == test_store_0_nulterm_strncmp_bound_2_same_size_arrays ());
+  strcpy (ga4, "acd"); strcpy (gb4, "abc");
+  A (0 == test_store_0_nulterm_strncmp_bound_2_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_store_0_nulterm_strncmp_bound_2_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gb4, "abd");
+  A (0 == test_store_0_nulterm_strncmp_bound_equal_same_size_arrays ());
+  strcpy (ga4, "abd"); strcpy (gb4, "abc");
+  A (0 == test_store_0_nulterm_strncmp_bound_equal_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_store_0_nulterm_strncmp_bound_equal_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gb4, "abd");
+  A (0 == test_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abd"); strcpy (gb4, "abc");
+  A (0 == test_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_nulterm_strcmp_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gc5, "abcd");
+  A (0 == test_store_0_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abd"); strcpy (gc5, "abcd");
+  A (0 == test_store_0_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abc"); strcpy (gc5, "abc");
+  A (0 != test_store_0_nulterm_strcmp_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gc5, "abcd");
+  A (0 == test_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abd"); strcpy (gc5, "abc");
+  A (0 == test_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abc"); strcpy (gc5, "abc");
+  A (0 != test_nulterm_strcmp_arrays ());
+
+  A (0 == test_strcpy_strncmp_abcd ("ab"));
+  A (0 == test_strcpy_strncmp_abcd ("abc"));
+  A (0 != test_strcpy_strncmp_abcd ("abcd"));
+  A (0 == test_strcpy_strncmp_abcd ("abcde"));
+
+  A (0 == test_strcpy_abcd_strncmp_3 ("ab"));
+  A (0 != test_strcpy_abcd_strncmp_3 ("abc"));
+  A (0 != test_strcpy_abcd_strncmp_3 ("abcd"));
+  A (0 != test_strcpy_abcd_strncmp_3 ("abcde"));
+
+  A (0 == test_strcpy_abcd_strncmp_4 ("ab"));
+  A (0 == test_strcpy_abcd_strncmp_4 ("abc"));
+  A (0 != test_strcpy_abcd_strncmp_4 ("abcd"));
+  A (0 != test_strcpy_abcd_strncmp_4 ("abcde"));
+}
diff --git a/gcc/testsuite/gcc.dg/strlenopt-65.c b/gcc/testsuite/gcc.dg/strlenopt-65.c
index a34d178faa1..521d7ac2b42 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-65.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-65.c
@@ -1,17 +1,10 @@
 /* PRE tree-optimization/90626 - fold strcmp(a, b) == 0 to zero when
    one string length is exact and the other is unequal
    { dg-do compile }
-   { dg-options "-O2 -Wall -fdump-tree-optimized" } */
+   { dg-options "-O2 -Wall -Wno-string-compare -fdump-tree-optimized -ftrack-macro-expansion=0" } */
 
 #include "strlenopt.h"
 
-typedef __SIZE_TYPE__ size_t;
-
-extern void abort (void);
-extern void* memcpy (void *, const void *, size_t);
-extern int strcmp (const char *, const char *);
-extern int strncmp (const char *, const char *, size_t);
-
 #define CAT(x, y) x ## y
 #define CONCAT(x, y) CAT (x, y)
 #define FAILNAME(name) CONCAT (call_ ## name ##_on_line_, __LINE__)
@@ -142,21 +135,45 @@ void test_strcmp_keep (const char *s, const char *t)
 #undef CMPFUNC
 #define CMPFUNC(a, b, dummy) strcmp (a, b)
 
-  KEEP ("1", "1", a, b, -1);
+  KEEP ("123", "123\0", a, b, /* bnd = */ -1);
+  KEEP ("123\0", "123", a, b, -1);
+
+  {
+    char a[8], b[8];
+    sink (a, b);
+    strcpy (a, s);
+    strcpy (b, t);
+    TEST_KEEP (0 == strcmp (a, b));
+  }
+}
+
+
+void test_strncmp_keep (const char *s, const char *t)
+{
+#undef CMPFUNC
+#define CMPFUNC(a, b, n) strncmp (a, b, n)
+
+  KEEP ("1", "1", a, b, 2);
 
-  KEEP ("1\0", "1", a, b, -1);
-  KEEP ("1",   "1\0", a, b, -1);
+  KEEP ("1\0", "1", a, b, 2);
+  KEEP ("1",   "1\0", a, b, 2);
 
-  KEEP ("12\0", "12", a, b, -1);
-  KEEP ("12",   "12\0", a, b, -1);
+  KEEP ("12\0", "12", a, b, 2);
+  KEEP ("12",   "12\0", a, b, 2);
 
-  KEEP ("111\0", "111", a, b, -1);
-  KEEP ("112", "112\0", a, b, -1);
+  KEEP ("111\0", "111", a, b, 3);
+  KEEP ("112", "112\0", a, b, 3);
 
-  KEEP (s, t, a, b, -1);
+  {
+    char a[8], b[8];
+    sink (a, b);
+    strcpy (a, s);
+    strcpy (b, t);
+    TEST_KEEP (0 == strncmp (a, b, sizeof a));
+  }
 }
 
 /* { dg-final { scan-tree-dump-times "call_in_true_branch_not_eliminated_" 0 "optimized" } }
 
-   { dg-final { scan-tree-dump-times "call_made_in_true_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 8 "optimized" } }
-   { dg-final { scan-tree-dump-times "call_made_in_false_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 8 "optimized" } } */
+   { dg-final { scan-tree-dump-times "call_made_in_true_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 11 "optimized" } }
+   { dg-final { scan-tree-dump-times "call_made_in_false_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 11 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-66.c b/gcc/testsuite/gcc.dg/strlenopt-66.c
index 5dc10a07d3d..4ba31a845b0 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-66.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-66.c
@@ -1,6 +1,6 @@
 /* PRE tree-optimization/90626 - fold strcmp(a, b) == 0 to zero when
    one string length is exact and the other is unequal
-   { dg-do compile }
+   { dg-do run }
    { dg-options "-O2 -Wall -fdump-tree-optimized" } */
 
 #include "strlenopt.h"
@@ -65,8 +65,44 @@ test_strncmp (void)
   A (0 <  strncmp (b, a, 5));
 }
 
+
+__attribute__ ((noclone, noinline, noipa)) void
+test_strncmp_a4_cond_s5_s2_2 (const char *s, int i)
+{
+  char a4[4];
+  strcpy (a4, s);
+  A (0 == strncmp (a4, i ? "12345" : "12", 2));
+}
+
+
+__attribute__ ((noclone, noinline, noipa)) void
+test_strncmp_a4_cond_a5_s2_5 (const char *s, const char *t, int i)
+{
+  char a4[4], a5[5];
+  strcpy (a4, s);
+  strcpy (a5, t);
+  A (0 == strncmp (a4, i ? a5 : "12", 5));
+}
+
+__attribute__ ((noclone, noinline, noipa)) void
+test_strncmp_a4_cond_a5_a3_n (const char *s1, const char *s2, const char *s3,
+			      int i, unsigned n)
+{
+  char a3[3], a4[4], a5[5];
+  strcpy (a3, s1);
+  strcpy (a4, s2);
+  strcpy (a5, s3);
+  A (0 == strncmp (a4, i ? a5 : a3, n));
+}
+
+
 int main (void)
 {
   test_strcmp ();
   test_strncmp ();
+  test_strncmp_a4_cond_s5_s2_2 ("12", 0);
+  test_strncmp_a4_cond_a5_s2_5 ("12", "1234", 0);
+
+  test_strncmp_a4_cond_a5_a3_n ("12", "123", "1234", 0, 2);
+  test_strncmp_a4_cond_a5_a3_n ("123", "12", "12", 1, 3);
 }
diff --git a/gcc/testsuite/gcc.dg/strlenopt-68.c b/gcc/testsuite/gcc.dg/strlenopt-68.c
new file mode 100644
index 00000000000..46ceb9ddb05
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-68.c
@@ -0,0 +1,126 @@
+/* PR tree-optimization/90879 - fold zero-equality of strcmp between
+   a longer string and a smaller array
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wno-string-compare -fdump-tree-optimized -ftrack-macro-expansion=0" } */
+
+#include "strlenopt.h"
+
+#define A(expr)                                                 \
+  ((expr)                                                       \
+   ? (void)0                                                    \
+   : (__builtin_printf ("assertion failed on line %i: %s\n",    \
+                        __LINE__, #expr),                       \
+      __builtin_abort ()))
+
+void clobber (void*, ...);
+
+struct S { char a4[4], c; };
+
+extern char a4[4];
+extern char b4[4];
+
+/* Verify that comparison of string literals with arrays with unknown
+   content but size that prevents them from comparing equal is folded
+   to a constant.  */
+
+void test_array_lit (void)
+{
+  A (strcmp (a4, "1234")); clobber (a4);
+  A (strcmp (a4, "12345")); clobber (a4);
+  A (strcmp (a4, "123456")); clobber (a4);
+  A (strcmp ("1234", a4)); clobber (a4);
+  A (strcmp ("12345", a4)); clobber (a4);
+  A (strcmp ("123456", a4)); clobber (a4);
+}
+
+void test_memarray_lit (struct S *p)
+{
+  A (strcmp (p->a4, "1234"));
+  A (strcmp (p->a4, "12345"));
+  A (strcmp (p->a4, "123456"));
+
+  A (strcmp ("1234", p->a4));
+  A (strcmp ("12345", p->a4));
+  A (strcmp ("123456", p->a4));
+}
+
+/* Verify that the equality of empty strings is folded.  */
+
+void test_empty_string (void)
+{
+  A (0 == strcmp ("", ""));
+
+  *a4 = '\0';
+  A (0 == strcmp (a4, ""));
+  A (0 == strcmp ("", a4));
+  A (0 == strcmp (a4, a4));
+
+  char s[8] = "";
+  A (0 == strcmp (a4, s));
+
+  a4[1] = '\0';
+  b4[1] = '\0';
+  A (0 == strcmp (a4 + 1, b4 + 1));
+
+  a4[2] = '\0';
+  b4[2] = '\0';
+  A (0 == strcmp (&a4[2], &b4[2]));
+
+  clobber (a4, b4);
+
+  memset (a4, 0, sizeof a4);
+  memset (b4, 0, sizeof b4);
+  A (0 == strcmp (a4, b4));
+}
+
+/* Verify that comparison of dynamically created strings with unknown
+   arrays is folded.  */
+
+void test_array_copy (void)
+{
+  char s[8];
+  strcpy (s, "1234");
+  A (strcmp (a4, s));
+
+  strcpy (s, "12345");
+  A (strlen (s) == 5);
+  A (strcmp (a4, s)); clobber (a4);
+
+  strcpy (s, "123456");
+  A (strcmp (a4, s)); clobber (a4);
+
+  strcpy (s, "1234");
+  A (strcmp (s, a4)); clobber (a4);
+
+  strcpy (s, "12345");
+  A (strcmp (s, a4)); clobber (a4);
+
+  strcpy (s, "123456");
+  A (strcmp (s, a4)); clobber (a4);
+}
+
+
+void test_array_bounded (void)
+{
+  A (strncmp (a4, "12345", 5)); clobber (a4);
+  A (strncmp ("54321", a4, 5)); clobber (a4);
+
+  A (strncmp (a4, "123456", 5)); clobber (a4);
+  A (strncmp ("654321", a4, 5)); clobber (a4);
+}
+
+void test_array_copy_bounded (void)
+{
+  char s[8];
+  strcpy (s, "12345");
+  A (strncmp (a4, s, 5)); clobber (a4);
+  strcpy (s, "54321");
+  A (strncmp (s, a4, 5)); clobber (a4);
+
+  strcpy (s, "123456");
+  A (strncmp (a4, s, 5)); clobber (a4);
+  strcpy (s, "654321");
+  A (strncmp (s, a4, 5)); clobber (a4);
+}
+
+/* { dg-final { scan-tree-dump-not "abort|strcmp|strncmp" "optimized" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 4af47855e7c..31e012b741b 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -2091,6 +2091,9 @@ maybe_diag_stxncpy_trunc (gimple_stmt_iterator gsi, tree src, tree cnt)
   else
     {
       c_strlen_data lendata = { };
+      /* Set MAXBOUND to an arbitrary non-null non-integer node as a request
+	 to have it set to the length of the longest string in a PHI.  */
+      lendata.maxbound = src;
       get_range_strlen (src, &lendata, /* eltsize = */1);
       if (TREE_CODE (lendata.minlen) == INTEGER_CST
 	  && TREE_CODE (lendata.maxbound) == INTEGER_CST)
@@ -2862,51 +2865,78 @@ handle_builtin_memset (gimple_stmt_iterator *gsi)
   return true;
 }
 
-/* Handle a call to memcmp.  We try to handle small comparisons by
-   converting them to load and compare, and replacing the call to memcmp
-   with a __builtin_memcmp_eq call where possible.
-   return true when call is transformed, return false otherwise.  */
+/* Return a pointer to the first such equality expression if RES is used
+   only in expressions testing its equality to zero, and null otherwise.  */
 
-static bool
-handle_builtin_memcmp (gimple_stmt_iterator *gsi)
+static gimple*
+used_only_for_zero_equality (tree res)
 {
-  gcall *stmt2 = as_a <gcall *> (gsi_stmt (*gsi));
-  tree res = gimple_call_lhs (stmt2);
-  tree arg1 = gimple_call_arg (stmt2, 0);
-  tree arg2 = gimple_call_arg (stmt2, 1);
-  tree len = gimple_call_arg (stmt2, 2);
-  unsigned HOST_WIDE_INT leni;
+  gimple *first_use = NULL;
+
   use_operand_p use_p;
   imm_use_iterator iter;
 
-  if (!res)
-    return false;
-
   FOR_EACH_IMM_USE_FAST (use_p, iter, res)
     {
-      gimple *ustmt = USE_STMT (use_p);
+      gimple *use_stmt = USE_STMT (use_p);
 
-      if (is_gimple_debug (ustmt))
-	continue;
-      if (gimple_code (ustmt) == GIMPLE_ASSIGN)
+      if (is_gimple_debug (use_stmt))
+        continue;
+      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
 	{
-	  gassign *asgn = as_a <gassign *> (ustmt);
-	  tree_code code = gimple_assign_rhs_code (asgn);
-	  if ((code != EQ_EXPR && code != NE_EXPR)
-	      || !integer_zerop (gimple_assign_rhs2 (asgn)))
-	    return false;
+	  tree_code code = gimple_assign_rhs_code (use_stmt);
+	  if (code == COND_EXPR)
+	    {
+	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
+	      if ((TREE_CODE (cond_expr) != EQ_EXPR
+		   && (TREE_CODE (cond_expr) != NE_EXPR))
+		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
+		return NULL;
+	    }
+	  else if (code == EQ_EXPR || code == NE_EXPR)
+	    {
+	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
+		return NULL;
+            }
+	  else
+	    return NULL;
 	}
-      else if (gimple_code (ustmt) == GIMPLE_COND)
+      else if (gimple_code (use_stmt) == GIMPLE_COND)
 	{
-	  tree_code code = gimple_cond_code (ustmt);
+	  tree_code code = gimple_cond_code (use_stmt);
 	  if ((code != EQ_EXPR && code != NE_EXPR)
-	      || !integer_zerop (gimple_cond_rhs (ustmt)))
-	    return false;
+	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
+	    return NULL;
 	}
       else
-	return false;
+        return NULL;
+
+      if (!first_use)
+	first_use = use_stmt;
     }
 
+  return first_use;
+}
+
+/* Handle a call to memcmp.  We try to handle small comparisons by
+   converting them to load and compare, and replacing the call to memcmp
+   with a __builtin_memcmp_eq call where possible.
+   Return true when the call is transformed and false otherwise.  */
+
+static bool
+handle_builtin_memcmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree res = gimple_call_lhs (stmt);
+
+  if (!res || !used_only_for_zero_equality (res))
+    return false;
+
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  tree len = gimple_call_arg (stmt, 2);
+  unsigned HOST_WIDE_INT leni;
+
   if (tree_fits_uhwi_p (len)
       && (leni = tree_to_uhwi (len)) <= GET_MODE_SIZE (word_mode)
       && pow2p_hwi (leni))
@@ -2919,7 +2949,7 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
       if (int_mode_for_size (leni, 1).exists (&mode)
 	  && (align >= leni || !targetm.slow_unaligned_access (mode, align)))
 	{
-	  location_t loc = gimple_location (stmt2);
+	  location_t loc = gimple_location (stmt);
 	  tree type, off;
 	  type = build_nonstandard_integer_type (leni, 1);
 	  gcc_assert (known_eq (GET_MODE_BITSIZE (TYPE_MODE (type)), leni));
@@ -2943,78 +2973,10 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
 	}
     }
 
-  gimple_call_set_fndecl (stmt2, builtin_decl_explicit (BUILT_IN_MEMCMP_EQ));
+  gimple_call_set_fndecl (stmt, builtin_decl_explicit (BUILT_IN_MEMCMP_EQ));
   return true;
 }
 
-/* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
-   the result of 0 == strncmp (A, B, N) (which is the same as strcmp for
-   sufficiently large N).  Otherwise return false.  */
-
-static bool
-strxcmp_unequal (int idx1, int idx2, unsigned HOST_WIDE_INT n)
-{
-  unsigned HOST_WIDE_INT len1;
-  unsigned HOST_WIDE_INT len2;
-
-  bool nulterm1;
-  bool nulterm2;
-
-  if (idx1 < 0)
-    {
-      len1 = ~idx1;
-      nulterm1 = true;
-    }
-  else if (strinfo *si = get_strinfo (idx1))
-    {
-      if (tree_fits_uhwi_p (si->nonzero_chars))
-	{
-	  len1 = tree_to_uhwi (si->nonzero_chars);
-	  nulterm1 = si->full_string_p;
-	}
-      else
-	return false;
-    }
-  else
-    return false;
-
-  if (idx2 < 0)
-    {
-      len2 = ~idx2;
-      nulterm2 = true;
-    }
-  else if (strinfo *si = get_strinfo (idx2))
-    {
-      if (tree_fits_uhwi_p (si->nonzero_chars))
-	{
-	  len2 = tree_to_uhwi (si->nonzero_chars);
-	  nulterm2 = si->full_string_p;
-	}
-      else
-	return false;
-    }
-  else
-    return false;
-
-  /* N is set to UHWI_MAX for strcmp and less to strncmp.  Adjust
-     the length of each string to consider to be no more than N.  */
-  if (len1 > n)
-    len1 = n;
-  if (len2 > n)
-    len2 = n;
-
-  if ((len1 < len2 && nulterm1)
-      || (len2 < len1 && nulterm2))
-    /* The string lengths are definitely unequal and the result can
-       be folded to one (since it's used for comparison with zero).  */
-    return true;
-
-  /* The string lengths may be equal or unequal.  Even when equal and
-     both strings nul-terminated, without the string contents there's
-     no way to determine whether they are equal.  */
-  return false;
-}
-
 /* Given an index to the strinfo vector, compute the string length
    for the corresponding string. Return -1 when unknown.  */
 
@@ -3043,15 +3005,16 @@ compute_string_length (int idx)
 
 /* Determine the minimum size of the object referenced by DEST expression
    which must have a pointer type.
-   Return the minimum size of the object if successful or NULL when the size
-   cannot be determined.  */
-static tree
+   Return the minimum size of the object if successful or HWI_M1U when
+   the size cannot be determined.  */
+
+static unsigned HOST_WIDE_INT
 determine_min_objsize (tree dest)
 {
   unsigned HOST_WIDE_INT size = 0;
 
   if (compute_builtin_object_size (dest, 2, &size))
-    return build_int_cst (sizetype, size);
+    return size;
 
   /* Try to determine the size of the object through the RHS
      of the assign statement.  */
@@ -3059,11 +3022,11 @@ determine_min_objsize (tree dest)
     {
       gimple *stmt = SSA_NAME_DEF_STMT (dest);
       if (!is_gimple_assign (stmt))
-	return NULL_TREE;
+	return HOST_WIDE_INT_M1U;
 
       if (!gimple_assign_single_p (stmt)
 	  && !gimple_assign_unary_nop_p (stmt))
-	return NULL_TREE;
+	return HOST_WIDE_INT_M1U;
 
       dest = gimple_assign_rhs1 (stmt);
       return determine_min_objsize (dest);
@@ -3071,7 +3034,7 @@ determine_min_objsize (tree dest)
 
   /* Try to determine the size of the object from its type.  */
   if (TREE_CODE (dest) != ADDR_EXPR)
-    return NULL_TREE;
+    return HOST_WIDE_INT_M1U;
 
   tree type = TREE_TYPE (dest);
   if (TREE_CODE (type) == POINTER_TYPE)
@@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
 
   type = TYPE_MAIN_VARIANT (type);
 
-  /* We cannot determine the size of the array if it's a flexible array,
-     which is declared at the end of a structure.  */
-  if (TREE_CODE (type) == ARRAY_TYPE
-      && !array_at_struct_end_p (dest))
+  /* The size of a flexible array cannot be determined.  Otherwise,
+     for arrays with more than one element, return the size of its
+     type.  GCC itself misuses arrays of both zero and one elements
+     as flexible array members so they are excluded as well.  */
+  if (TREE_CODE (type) != ARRAY_TYPE
+      || !array_at_struct_end_p (dest))
     {
-      tree size_t = TYPE_SIZE_UNIT (type);
-      if (size_t && TREE_CODE (size_t) == INTEGER_CST
-	  && !integer_zerop (size_t))
-        return size_t;
+      tree type_size = TYPE_SIZE_UNIT (type);
+      if (type_size && TREE_CODE (type_size) == INTEGER_CST
+	  && !integer_onep (type_size)
+	  && !integer_zerop (type_size))
+        return tree_to_uhwi (type_size);
     }
 
-  return NULL_TREE;
+  return HOST_WIDE_INT_M1U;
 }
 
-/* Handle a call to strcmp or strncmp. When the result is ONLY used to do
-   equality test against zero:
-
-   A. When the lengths of both arguments are constant and it's a strcmp:
-      * if the lengths are NOT equal, we can safely fold the call
-        to a non-zero value.
-      * otherwise, do nothing now.
-
-   B. When the length of one argument is constant, try to replace the call
-   with a __builtin_str(n)cmp_eq call where possible, i.e:
-
-   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR
-   is a string with constant length , C is a constant.
-     if (C <= strlen(STR) && sizeof_array(s) > C)
-       {
-         replace this call with
-         strncmp_eq (s, STR, C) (!)= 0
-       }
-     if (C > strlen(STR)
-       {
-         it can be safely treated as a call to strcmp (s, STR) (!)= 0
-         can handled by the following strcmp.
-       }
-
-   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR
-   is a string with constant length.
-     if  (sizeof_array(s) > strlen(STR))
-       {
-         replace this call with
-         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
-       }
-
-   Return true when the call is transformed, return false otherwise.
- */
+/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths
+   of the string(s) referenced by ARG if it can be determined.
+   If the length cannot be determined, set *SIZE to the size of
+   the array the string is stored in, if any.  If no such array is
+   known, set *SIZE to -1.  When the strings are nul-terminated set
+   *NULTERM to true, otherwise to false.  Return true on success.  */
 
 static bool
-handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
+		 unsigned HOST_WIDE_INT *size, bool *nulterm)
 {
-  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
-  tree res = gimple_call_lhs (stmt);
-  use_operand_p use_p;
-  imm_use_iterator iter;
-  tree arg1 = gimple_call_arg (stmt, 0);
-  tree arg2 = gimple_call_arg (stmt, 1);
-  int idx1 = get_stridx (arg1);
-  int idx2 = get_stridx (arg2);
-  HOST_WIDE_INT length = -1;
-  bool is_ncmp = false;
+  /* Set so that both LEN and ~LEN are invalid lengths, i.e.,
+     maximum possible length + 1.  */
+  lenrng[0] = lenrng[1] = HOST_WIDE_INT_MAX;
 
-  if (!res)
-    return false;
+  *size = HOST_WIDE_INT_M1U;
 
-  /* When both arguments are unknown, do nothing.  */
-  if (idx1 == 0 && idx2 == 0)
-    return false;
-
-  /* Handle strncmp function.  */
-  if (gimple_call_num_args (stmt) == 3)
+  if (idx < 0)
     {
-      tree len = gimple_call_arg (stmt, 2);
-      if (tree_fits_shwi_p (len))
-        length = tree_to_shwi (len);
-
-      is_ncmp = true;
+      /* IDX is the inverted constant string length.  */
+      lenrng[0] = ~idx;
+      lenrng[1] = lenrng[0];
+      *nulterm = true;
     }
-
-  /* For strncmp, if the length argument is NOT known, do nothing.  */
-  if (is_ncmp && length < 0)
-    return false;
-
-  /* When the result is ONLY used to do equality test against zero.  */
-  FOR_EACH_IMM_USE_FAST (use_p, iter, res)
+  else if (idx == 0)
+    ; /* Handled below.  */
+  else if (strinfo *si = get_strinfo (idx))
     {
-      gimple *use_stmt = USE_STMT (use_p);
+      if (!si->nonzero_chars)
+	arg = si->ptr;
+      else if (tree_fits_uhwi_p (si->nonzero_chars))
+	{
+	  lenrng[0] = tree_to_uhwi (si->nonzero_chars);
+	  *nulterm = si->full_string_p;
+	  /* Set the upper bound only if the string is known to be
+	     nul-terminated, otherwise leave it at maximum + 1.  */
+	  if (*nulterm)
+	    lenrng[1] = lenrng[0];
+	}
+      else if (TREE_CODE (si->nonzero_chars) == SSA_NAME)
+	{
+	  wide_int min, max;
+	  value_range_kind rng = get_range_info (si->nonzero_chars, &min, &max);
+	  if (rng == VR_RANGE)
+	    {
+	      lenrng[0] = min.to_uhwi ();
+	      lenrng[1] = max.to_uhwi ();
+	      *nulterm = si->full_string_p;
+	    }
+	}
+      else if (si->ptr)
+	arg = si->ptr;
+    }
 
-      if (is_gimple_debug (use_stmt))
-        continue;
-      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
+  if (lenrng[0] == HOST_WIDE_INT_MAX)
+    {
+      /* Compute the minimum and maximum real or possible lengths.  */
+      c_strlen_data lendata = { };
+      if (get_range_strlen (arg, &lendata, /* eltsize = */1))
 	{
-	  tree_code code = gimple_assign_rhs_code (use_stmt);
-	  if (code == COND_EXPR)
+	  if (tree_fits_shwi_p (lendata.maxlen) && !lendata.maxbound)
 	    {
-	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
-	      if ((TREE_CODE (cond_expr) != EQ_EXPR
-		   && (TREE_CODE (cond_expr) != NE_EXPR))
-		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
-		return false;
+	      lenrng[0] = tree_to_shwi (lendata.minlen);
+	      lenrng[1] = tree_to_shwi (lendata.maxlen);
+	      *nulterm = true;
 	    }
-	  else if (code == EQ_EXPR || code == NE_EXPR)
+	  else if (lendata.maxbound && tree_fits_shwi_p (lendata.maxbound))
 	    {
-	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
-		return false;
-            }
-	  else
-	    return false;
+	      /* Set *SIZE to LENDATA.MAXBOUND + 1.  MAXBOUND is
+		 a conservative estimate of the longest string based
+		 on the sizes of the arrays referenced by ARG.  */
+	      *size = tree_to_uhwi (lendata.maxbound) + 1;
+	      *nulterm = false;
+	    }
 	}
-      else if (gimple_code (use_stmt) == GIMPLE_COND)
+      else
 	{
-	  tree_code code = gimple_cond_code (use_stmt);
-	  if ((code != EQ_EXPR && code != NE_EXPR)
-	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
-	    return false;
+	  /* Set *SIZE to the size of the smallest object referenced
+	     by ARG if ARG denotes a single object, or to HWI_M1U
+	     otherwise.  */
+	  *size = determine_min_objsize (arg);
+	  *nulterm = false;
 	}
-      else
-        return false;
     }
 
-  /* When the lengths of the arguments are known to be unequal
-     we can safely fold the call to a non-zero value for strcmp;
-     otherwise, do nothing now.  */
-  if (idx1 != 0 && idx2 != 0)
+  return lenrng[0] != HOST_WIDE_INT_MAX || *size != HOST_WIDE_INT_M1U;
+}
+
+/* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
+   the result of 0 == strncmp (A, B, BOUND) (which is the same as strcmp
+   for a sufficiently large BOUND).  If the result is based on the length
+   of one string being greater than the longest string that would fit in
+   the array pointed to by the other argument, set LEN[] and *PSIZE to
+   the corresponding length (or its complement when the string is known
+   to be at least as long and need not be nul-terminated) and size.
+   Otherwise return null.  */
+
+static tree
+strxcmp_eqz_result (tree arg1, int idx1, tree arg2, int idx2,
+		    unsigned HOST_WIDE_INT bound, unsigned HOST_WIDE_INT len[2],
+		    unsigned HOST_WIDE_INT *psize)
+{
+  /* Determine the range the length of each string is in and whether it's
+     known to be nul-terminated, or the size of the array it's stored in.  */
+  bool nul1, nul2;
+  unsigned HOST_WIDE_INT siz1, siz2;
+  unsigned HOST_WIDE_INT len1rng[2], len2rng[2];
+  if (!get_len_or_size (arg1, idx1, len1rng, &siz1, &nul1)
+      || !get_len_or_size (arg2, idx2, len2rng, &siz2, &nul2))
+    return NULL_TREE;
+
+  /* BOUND is set to HWI_M1U for strcmp and to a smaller value for strncmp,
+     and LENiRNG to HWI_MAX when invalid.  Cap the length of each string
+     to consider at BOUND.  */
+  if (len1rng[0] < HOST_WIDE_INT_MAX && len1rng[0] > bound)
+    len1rng[0] = bound;
+  if (len1rng[1] < HOST_WIDE_INT_MAX && len1rng[1] > bound)
+    len1rng[1] = bound;
+  if (len2rng[0] < HOST_WIDE_INT_MAX && len2rng[0] > bound)
+    len2rng[0] = bound;
+  if (len2rng[1] < HOST_WIDE_INT_MAX && len2rng[1] > bound)
+    len2rng[1] = bound;
+
+  /* Two empty strings are equal.  */
+  if (len1rng[1] == 0 && len2rng[1] == 0)
+    return integer_one_node;
+
+  /* The strings are definitely unequal when the lower bound of the length
+     of one of them is greater than the length of the longest string that
+     would fit into the other array.  */
+  if (len1rng[0] == HOST_WIDE_INT_MAX
+      && len2rng[0] != HOST_WIDE_INT_MAX
+      && ((len2rng[0] < bound && len2rng[0] >= siz1)
+	  || len2rng[0] > siz1))
     {
-      if (strxcmp_unequal (idx1, idx2, length))
-	{
-	  replace_call_with_value (gsi, integer_one_node);
-	  return true;
-	}
-      return false;
+      *psize = siz1;
+      len[0] = len1rng[0];
+      /* Set LEN[1] to the lower bound of ARG2's length when it's
+	 nul-terminated or to the complement of its minimum length
+	 otherwise.  */
+      len[1] = nul2 ? len2rng[0] : ~len2rng[0];
+      return integer_zero_node;
     }
 
-  /* When the length of one argument is constant.  */
-  tree var_string = NULL_TREE;
-  HOST_WIDE_INT const_string_leni = -1;
+  if (len2rng[0] == HOST_WIDE_INT_MAX
+      && len1rng[0] != HOST_WIDE_INT_MAX
+      && ((len1rng[0] < bound && len1rng[0] >= siz2)
+	  || len1rng[0] > siz2))
+    {
+      *psize = siz2;
+      len[0] = nul1 ? len1rng[0] : ~len1rng[0];
+      len[1] = len2rng[0];
+      return integer_zero_node;
+    }
 
-  if (idx1)
+  /* The strings are also definitely unequal when their lengths are unequal
+     and at least one is nul-terminated.  */
+  if (len1rng[0] != HOST_WIDE_INT_MAX
+      && len2rng[0] != HOST_WIDE_INT_MAX
+      && ((len1rng[1] < len2rng[0] && nul1)
+	  || (len2rng[1] < len1rng[0] && nul2)))
     {
-      const_string_leni = compute_string_length (idx1);
-      var_string = arg2;
+      if (bound <= len1rng[0] || bound <= len2rng[0])
+	*psize = bound;
+      else
+	*psize = HOST_WIDE_INT_M1U;
+
+      len[0] = len1rng[0];
+      len[1] = len2rng[0];
+      return integer_zero_node;
     }
-  else
+
+  /* The string lengths may be equal or unequal.  Even when equal and
+     both strings nul-terminated, without the string contents there's
+     no way to determine whether they are equal.  */
+  return NULL_TREE;
+}
+
+/* Diagnose pointless calls to strcmp or strncmp STMT with string
+   arguments of lengths LEN or size SIZ and (for strncmp) BOUND,
+   whose result is used in equality expressions that evaluate to
+   a constant due to one argument being longer than the size of
+   the other.  */
+
+static void
+maybe_warn_pointless_strcmp (gimple *stmt, HOST_WIDE_INT bound,
+			     unsigned HOST_WIDE_INT len[2],
+			     unsigned HOST_WIDE_INT siz)
+{
+  gimple *use = used_only_for_zero_equality (gimple_call_lhs (stmt));
+  if (!use)
+    return;
+
+  bool at_least = false;
+
+  /* Excessive LEN[i] indicates a lower bound.  */
+  if (len[0] > HOST_WIDE_INT_MAX)
     {
-      gcc_checking_assert (idx2);
-      const_string_leni = compute_string_length (idx2);
-      var_string = arg1;
+      at_least = true;
+      len[0] = ~len[0];
     }
 
-  if (const_string_leni < 0)
-    return false;
+  if (len[1] > HOST_WIDE_INT_MAX)
+    {
+      at_least = true;
+      len[1] = ~len[1];
+    }
 
-  unsigned HOST_WIDE_INT var_sizei = 0;
-  /* try to determine the minimum size of the object pointed by var_string.  */
-  tree size = determine_min_objsize (var_string);
+  unsigned HOST_WIDE_INT minlen = MIN (len[0], len[1]);
+
+  /* FIXME: Include a note pointing to the declaration of the smaller
+     array.  */
+  location_t stmt_loc = gimple_location (stmt);
+  tree callee = gimple_call_fndecl (stmt);
+  bool warned = false;
+  if (siz <= minlen && bound == -1)
+    warned = warning_at (stmt_loc, OPT_Wstring_compare,
+			 (at_least
+			  ? G_("%G%qD of a string of length %wu or more and "
+			       "an array of size %wu evaluates to nonzero")
+			  : G_("%G%qD of a string of length %wu and an array "
+			       "of size %wu evaluates to nonzero")),
+			 stmt, callee, minlen, siz);
+  else if (!at_least && siz <= HOST_WIDE_INT_MAX)
+    {
+      if (len[0] != HOST_WIDE_INT_MAX && len[1] != HOST_WIDE_INT_MAX)
+	warned = warning_at (stmt_loc, OPT_Wstring_compare,
+			     "%G%qD of strings of length %wu and %wu "
+			     "and bound of %wu evaluates to nonzero",
+			     stmt, callee, len[0], len[1], bound);
+      else
+	warned = warning_at (stmt_loc, OPT_Wstring_compare,
+			     "%G%qD of a string of length %wu, an array "
+			     "of size %wu and bound of %wu evaluates to "
+			     "nonzero",
+			     stmt, callee, minlen, siz, bound);
+    }
+
+  if (warned)
+    {
+      location_t use_loc = gimple_location (use);
+      if (LOCATION_LINE (stmt_loc) != LOCATION_LINE (use_loc))
+	inform (use_loc, "in this expression");
+    }
+}
 
-  if (!size)
-    return false;
 
-  if (tree_fits_uhwi_p (size))
-    var_sizei = tree_to_uhwi (size);
+/* Optimize a call to strcmp or strncmp either by folding it to a constant
+   when possible or by transforming the latter to the former.  Warn about
+   calls where the length of one argument is greater than the size of
+   the array to which the other argument points if the latter's length
+   is not known.  Return true when the call has been transformed into
+   another and false otherwise.  */
 
-  if (var_sizei == 0)
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree lhs = gimple_call_lhs (stmt);
+
+  if (!lhs)
     return false;
 
-  /* For strncmp, if length > const_string_leni , this call can be safely
-     transformed to a strcmp.  */
-  if (is_ncmp && length > const_string_leni)
-    is_ncmp = false;
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  int idx1 = get_stridx (arg1);
+  int idx2 = get_stridx (arg2);
 
-  unsigned HOST_WIDE_INT final_length
-    = is_ncmp ? length : const_string_leni + 1;
+  /* For strncmp, set to the value of the third argument if known.  */
+  HOST_WIDE_INT bound = -1;
 
-  /* Replace strcmp or strncmp with the corresponding str(n)cmp_eq.  */
-  if (var_sizei > final_length)
+  /* Extract the strncmp bound.  */
+  if (gimple_call_num_args (stmt) == 3)
     {
-      tree fn
-	= (is_ncmp
-	   ? builtin_decl_implicit (BUILT_IN_STRNCMP_EQ)
-	   : builtin_decl_implicit (BUILT_IN_STRCMP_EQ));
-      if (!fn)
+      tree len = gimple_call_arg (stmt, 2);
+      if (tree_fits_shwi_p (len))
+        bound = tree_to_shwi (len);
+
+      /* If the bound argument is NOT known, do nothing.  */
+      if (bound < 0)
 	return false;
-      tree const_string_len = build_int_cst (size_type_node, final_length);
-      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
     }
+
+  {
+    /* Set to the length of one argument (or its complement if it's
+       the lower bound of a range) and the size of the array storing
+       the other if the result is based on the former being equal to
+       or greater than the latter.  */
+    unsigned HOST_WIDE_INT len[2] = { HOST_WIDE_INT_MAX, HOST_WIDE_INT_MAX };
+    unsigned HOST_WIDE_INT siz = HOST_WIDE_INT_M1U;
+
+    /* Try to determine if the two strings are either definitely equal
+       or definitely unequal and if so, either fold the result to zero
+       (when equal) or set the range of the result to ~[0, 0] otherwise.  */
+    if (tree eqz = strxcmp_eqz_result (arg1, idx1, arg2, idx2, bound,
+				       len, &siz))
+      {
+	if (integer_zerop (eqz))
+	  {
+	    maybe_warn_pointless_strcmp (stmt, bound, len, siz);
+
+	    /* When the lengths of the first two string arguments are
+	       known to be unequal set the range of the result to non-zero.
+	       This allows the call to be eliminated if its result is only
+	       used in tests for equality to zero.  */
+	    wide_int zero = wi::zero (TYPE_PRECISION (TREE_TYPE (lhs)));
+	    set_range_info (lhs, VR_ANTI_RANGE, zero, zero);
+	    return false;
+	  }
+	/* When the two strings are definitely equal (such as when they
+	   are both empty) fold the call to the constant result.  */
+	replace_call_with_value (gsi, integer_zero_node);
+	return true;
+      }
+  }
+
+  /* Return if nothing is known about the strings pointed to by ARG1
+     and ARG2.  */
+  if (idx1 == 0 && idx2 == 0)
+    return false;
+
+  /* Determine either the length or the size of each of the strings,
+     whichever is available.  */
+  HOST_WIDE_INT cstlen1 = -1, cstlen2 = -1;
+  HOST_WIDE_INT arysiz1 = -1, arysiz2 = -1;
+
+  if (idx1)
+    cstlen1 = compute_string_length (idx1) + 1;
   else
+    arysiz1 = determine_min_objsize (arg1);
+
+  /* Bail if neither the string length nor the size of the array
+     it is stored in can be determined.  */
+  if (cstlen1 < 0 && arysiz1 < 0)
     return false;
 
-  return true;
+  /* Repeat for the second argument.  */
+  if (idx2)
+    cstlen2 = compute_string_length (idx2) + 1;
+  else
+    arysiz2 = determine_min_objsize (arg2);
+
+  if (cstlen2 < 0 && arysiz2 < 0)
+    return false;
+
+  /* The exact number of characters to compare.  */
+  HOST_WIDE_INT cmpsiz = bound < 0 ? cstlen1 < 0 ? cstlen2 : cstlen1 : bound;
+  /* The size of the array in which the unknown string is stored.  */
+  HOST_WIDE_INT varsiz = arysiz1 < 0 ? arysiz2 : arysiz1;
+
+  if (cmpsiz < varsiz && used_only_for_zero_equality (lhs))
+    {
+      /* If the known length is less than the size of the other array
+	 and the strcmp result is only used to test equality to zero,
+	 transform the call to the equivalent _eq call.  */
+      if (tree fn = builtin_decl_implicit (bound < 0 ? BUILT_IN_STRCMP_EQ
+					   : BUILT_IN_STRNCMP_EQ))
+	{
+	  tree n = build_int_cst (size_type_node, cmpsiz);
+	  update_gimple_call (gsi, fn, 3, arg1, arg2, n);
+	  return true;
+	}
+    }
+
+  return false;
 }
 
 /* Handle a POINTER_PLUS_EXPR statement.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-09 17:07   ` Martin Sebor
@ 2019-08-09 17:07     ` Jakub Jelinek
  2019-08-09 22:45       ` Martin Sebor
  2019-08-13 20:08     ` Jeff Law
  1 sibling, 1 reply; 21+ messages in thread
From: Jakub Jelinek @ 2019-08-09 17:07 UTC (permalink / raw)
  To: Martin Sebor; +Cc: gcc-patches

On Fri, Aug 09, 2019 at 10:51:09AM -0600, Martin Sebor wrote:
> That said, we should change this code one way or the other.
> There is even less of a guarantee that other compilers support
> writing past the end of arrays that have non-zero size than
> that they recognize the documented zero-length extension.

We use that everywhere forever, so no.
See e.g. rtx u.fld and u.hwint arrays, tree_exp operands array,
gimple_statement_with_ops op array just to name a few that are
everywhere.  Coverity is indeed unhappy about
that, but it would be with [0] certainly too.  Another option is
to use the maximum possible size where we know it (which is the case for
rtxes and most tree expressions and gimple stmts, but not e.g.
CALL_EXPR or GIMPLE_CALL, where there is no easy upper bound).

	Jakub

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-09 17:07     ` Jakub Jelinek
@ 2019-08-09 22:45       ` Martin Sebor
  2019-08-12 13:56         ` Michael Matz
  2019-08-12 20:15         ` Jeff Law
  0 siblings, 2 replies; 21+ messages in thread
From: Martin Sebor @ 2019-08-09 22:45 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On 8/9/19 10:58 AM, Jakub Jelinek wrote:
> On Fri, Aug 09, 2019 at 10:51:09AM -0600, Martin Sebor wrote:
>> That said, we should change this code one way or the other.
>> There is even less of a guarantee that other compilers support
>> writing past the end of arrays that have non-zero size than
>> that they recognize the documented zero-length extension.
> 
> We use that everywhere forever, so no.

Just because some invalid code has been in place "forever" doesn't
mean it cannot be changed.  Relying on undocumented "extensions"
because they just happen to work with the compilers they have been
exposed to is exactly how naive users get in trouble.  Our answer
to reports of "bugs" when the behavior changes is typically: fix
your code.  There's little reason to expect other compiler writers
to be any more accommodating.

> See e.g. rtx u.fld and u.hwint arrays, tree_exp operands array,
> gimple_statement_with_ops op array just to name a few that are
> everywhere.  Coverity is indeed unhappy about
> that, but it would be with [0] certainly too.  Another option is
> to use maximum possible size where we know it (which is the case of
> rtxes and most tree expressions and gimple stmts, but not e.g.
> CALL_EXPR or GIMPLE_CALL where there is no easy upper bound.

The solution introduced in C99 is a flexible array.  C++
compilers usually support it as well.  Those that don't are
likely to support the zero-length array (even Visual C++ does).
If there's a chance that some don't support either do you really
think it's safe to assume they will do something sane with
the [1] hack?  If you're concerned that the flexible array syntax
or the zero length array won't compile, add a configure test to
see if it does and use whatever alternative is most appropriate.
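
To make the alternative concrete, here is a minimal sketch of the C99
flexible array member form.  The names ident and ident_new are
hypothetical and not taken from GCC, and the sketch is plain C99; GCC
itself is built as C++, where flexible array members are accepted by
most compilers only as an extension.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct for illustration only; not a GCC type.  */
struct ident
{
  size_t len;    /* length of NAME, excluding the terminating nul */
  char name[];   /* C99 flexible array member; must be the last member */
};

/* Allocate an ident holding a copy of NAME.  sizeof (struct ident)
   does not count the flexible member, so the terminating nul needs
   its own byte.  Returns NULL if the allocation fails.  */
static struct ident *
ident_new (const char *name)
{
  assert (name != NULL);
  size_t len = strlen (name);
  struct ident *id = malloc (sizeof *id + len + 1);
  if (!id)
    return NULL;
  id->len = len;
  memcpy (id->name, name, len + 1);   /* copies the nul as well */
  return id;
}
```

A caller would write ident_new ("abc") and release the result with
free.  Because the array has no declared bound, the compiler has no
basis for assuming accesses past the first element are out of bounds.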

Martin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-09 22:45       ` Martin Sebor
@ 2019-08-12 13:56         ` Michael Matz
  2019-08-14 16:30           ` Martin Sebor
  2019-08-12 20:15         ` Jeff Law
  1 sibling, 1 reply; 21+ messages in thread
From: Michael Matz @ 2019-08-12 13:56 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jakub Jelinek, gcc-patches

Hi,

On Fri, 9 Aug 2019, Martin Sebor wrote:

> The solution introduced in C99 is a flexible array.  C++
> compilers usually support it as well.  Those that don't are
> likely to support the zero-length array (even Visual C++ does).
> If there's a chance that some don't support either do you really
> think it's safe to assume they will do something sane with
> the [1] hack?

As the [1] "hack" is the traditional pre-C99 (and C++) idiom to 
implement flexible trailing char arrays, yes, I do expect all existing 
(and no longer existing) compilers to do the obvious and sane thing
with it.  IOW: it's more portable in practice than our documented 
zero-length extension.  And that's what matters for the things compiled by 
the host compiler.

Without requiring C99 (which would be a different discussion) and a 
non-existing C++ standard we can't write this code (in this form) in a 
standard conforming way, no matter what we wish for.  Hence it seems 
prudent to use the most portable variant of all the non-standard ways, the 
trailing [1] array.
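
For comparison, a minimal sketch of how the trailing [1] idiom is
conventionally written.  The names token and token_new are
hypothetical, not GCC code.  Strictly speaking, indexing past text[0]
is not sanctioned by the C or C++ standards, which is the point under
dispute in this thread; the sketch only shows the usual shape of the
idiom, not a claim that it is conforming.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct for illustration only; not a GCC type.  */
struct token
{
  int kind;
  char text[1];   /* actually bigger and nul-terminated */
};

/* Allocate a token holding a copy of TEXT.  The size is computed from
   the offset of the trailing array rather than from sizeof, so that
   padding after TEXT does not affect the arithmetic.  Returns NULL if
   the allocation fails.  */
static struct token *
token_new (int kind, const char *text)
{
  assert (text != NULL);
  size_t len = strlen (text);
  size_t size = offsetof (struct token, text) + len + 1;
  /* Never allocate less than the declared struct.  */
  if (size < sizeof (struct token))
    size = sizeof (struct token);
  struct token *tok = malloc (size);
  if (!tok)
    return NULL;
  tok->kind = kind;
  memcpy (tok->text, text, len + 1);   /* copies the nul as well */
  return tok;
}
```

The declared bound of 1 is what lets a compiler or static analyzer
conclude that the copy overflows text, even though the allocation is
large enough.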


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-09 22:45       ` Martin Sebor
  2019-08-12 13:56         ` Michael Matz
@ 2019-08-12 20:15         ` Jeff Law
  2019-08-12 22:32           ` Martin Sebor
  1 sibling, 1 reply; 21+ messages in thread
From: Jeff Law @ 2019-08-12 20:15 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek; +Cc: gcc-patches

On 8/9/19 4:14 PM, Martin Sebor wrote:
> On 8/9/19 10:58 AM, Jakub Jelinek wrote:
>> On Fri, Aug 09, 2019 at 10:51:09AM -0600, Martin Sebor wrote:
>>> That said, we should change this code one way or the other.
>>> There is even less of a guarantee that other compilers support
>>> writing past the end of arrays that have non-zero size than
>>> that they recognize the documented zero-length extension.
>>
>> We use that everywhere forever, so no.
> 
> Just because some invalid code has been in place "forever" doesn't
> mean it cannot be changed.  Relying on undocumented "extensions"
> because they just happen to work with the compilers they have been
> exposed to is exactly how naive users get in trouble.  Our answer
> to reports of "bugs" when the behavior changes is typically: fix
> your code.  There's little reason to expect other compiler writers
> to be any more accommodating.
> 
>> See e.g. rtx u.fld and u.hwint arrays, tree_exp operands array,
>> gimple_statement_with_ops op array just to name a few that are
>> everywhere.  Coverity is indeed unhappy about
>> that, but it would be with [0] certainly too.  Another option is
>> to use maximum possible size where we know it (which is the case of
>> rtxes and most tree expressions and gimple stmts, but not e.g.
>> CALL_EXPR or GIMPLE_CALL where there is no easy upper bound.
> 
> The solution introduced in C99 is a flexible array.  C++
> compilers usually support it as well.  Those that don't are
> likely to support the zero-length array (even Visual C++ does).
> If there's a chance that some don't support either do you really
> think it's safe to assume they will do something sane with
> the [1] hack?  If you're concerned that the flexible array syntax
> or the zero length array won't compile, add a configure test to
> see if it does and use whatever alternative is most appropriate.
Given that we require a C++03 compiler to build GCC, I think we can
revisit how we represent the trailing array.  But that seems independent
of the bulk of this patch.

Can we separate this issue from the rest of the patch?

jeff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-09 16:42 [PATCH] fold more string comparison with known result (PR 90879) Martin Sebor
  2019-08-09 16:51 ` Jakub Jelinek
@ 2019-08-12 22:22 ` Jeff Law
  1 sibling, 0 replies; 21+ messages in thread
From: Jeff Law @ 2019-08-12 22:22 UTC (permalink / raw)
  To: Martin Sebor, gcc-patches

On 8/9/19 10:17 AM, Martin Sebor wrote:
> GCC 9 optimizes a subset of expression of the form
> (0 == strcmp(a, b)) based on the length and/or size of
> the arguments but it doesn't take advantage of all
> the opportunities there.  For example in the following,
> although it folds the first test to false it doesn't fold
> the second one:
> 
>   char a[4];
> 
>   void f (void)
>   {
>     if (strlen (a) > 3)   // folded to false by GCC 8+
>       abort ();
> 
>     if (strcmp (a, "1234") == 0)   // folded by patched GCC
>       abort ();
> }
> 
> The attached patch extends the strcmp optimization added in
> GCC 9 to also handle the latter cases (among others).  Testing
> the enhancement with several other sizable code bases besides
> GCC (Binutils/GDB, the Linux kernel, and LLVM) shows that code
> like this is rare.  After thinking about it I decided it's more
> likely a bug than a significant optimization opportunity, so
> I introduced a new warning to point it out: -Wstring-compare
> (enabled in -Wextra).
> 
> Besides this enhancement, the patch also improves the current
> optimization to fold strcmp calls with conditional arguments
> such as in:
> 
>   void f (char *s, int i)
>   {
>     strcpy (s, "12");
>     if (strcmp (s, i ? "123" : "1234") == 0)   // folded
>       abort ();
>   }
> 
> Martin
> 
> PS The diff looks like the changes are more extensive than they
> actually are.
> 
> gcc-90879.diff
> 
> PR tree-optimization/90879 - fold zero-equality of strcmp between a longer string and a smaller array
> 
> gcc/c-family/ChangeLog:
> 
> 	PR tree-optimization/90879
> 	* c.opt (-Wstring-compare): New option.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR tree-optimization/90879
> 	* gcc.dg/Wstring-compare-2.c: New test.
> 	* gcc.dg/Wstring-compare.c: New test.
> 	* gcc.dg/strcmpopt_3.c: Scan the optimized dump instead of strlen.
> 	* gcc.dg/strcmpopt_6.c: New test.
> 	* gcc.dg/strlenopt-65.c: Remove unnecessary declarations, add
> 	test cases.
> 	* gcc.dg/strlenopt-66.c: Run it.
> 	* gcc.dg/strlenopt-67.c: New test.
> 	* gcc.dg/strlenopt-68.c: New test.
> 
> gcc/ChangeLog:
> 
> 	PR tree-optimization/90879
> 	* builtins.c (check_access): Avoid using maxbound when null.
> 	* calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen change.
> 	* doc/invoke.texi (-Wstring-compare): Document new warning option.
> 	* gengtype-state.c (state_ident_st): Use a zero-length array instead.
> 	(state_token_st): Same.  Make last.
> 	(state_ident_by_name): Allocate enough space for terminating nul.
> 	* gimple-fold.c (get_range_strlen_tree): Make setting maxbound
> 	conditional.
> 	(get_range_strlen): Overwrite initial maxbound when non-null.
> 	* gimple-ssa-sprintf.c (get_string_length): Adjust to get_range_strlen
> 	change.
> 	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
> 	(used_only_for_zero_equality): New function.
> 	(handle_builtin_memcmp): Call it.
> 	(determine_min_objsize): Return an integer instead of tree.
> 	(get_len_or_size, strxcmp_eqz_result): New functions.
> 	(maybe_warn_pointless_strcmp): New function.
> 	(handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
> 	between a longer string and a smaller array.
> 
> diff --git a/gcc/gengtype-state.c b/gcc/gengtype-state.c
> index 03f40694ec6..80a8b57e9a2 100644
> --- a/gcc/gengtype-state.c
> +++ b/gcc/gengtype-state.c
> @@ -79,6 +79,14 @@ enum state_token_en
>    STOK_NAME                     /* hash-consed name or identifier.  */
>  };
>  
> +/* Suppress warning: ISO C forbids zero-size array for stok_string
> +   below.  The arrays are treated as flexible array members but in
> +   otherwise an empty struct or as a member of a union cannot be
> +   declared as such.  They must have zero size to keep GCC from
> +   assuming their bound reflect their size.  */
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wpedantic"
> +
>  
>  /* Structure and hash-table used to share identifiers or names.  */
>  struct state_ident_st
> @@ -86,11 +94,10 @@ struct state_ident_st
>    /* TODO: We could improve the parser by reserving identifiers for
>       state keywords and adding a keyword number for them.  That would
>       mean adding another field in this state_ident_st struct.  */
> -  char stid_name[1];		/* actually bigger & null terminated */
> +  char stid_name[0];		/* actually bigger & null terminated */
>  };
>  static htab_t state_ident_tab;
>  
> -
>  /* The state_token_st structure is for lexical tokens in the read
>     state file.  The stok_kind field discriminates the union.  Tokens
>     are allocated by peek_state_token which calls read_a_state_token
> @@ -110,14 +117,15 @@ struct state_token_st
>    union		                        /* discriminated by stok_kind! */
>    {
>      int stok_num;			/* when STOK_INTEGER */
> -    char stok_string[1];		/* when STOK_STRING, actual size is
> -					   bigger and null terminated */
>      struct state_ident_st *stok_ident;	/* when STOK_IDENT */
>      void *stok_ptr;		        /* null otherwise */
> +    char stok_string[0];		/* when STOK_STRING, actual size is
> +					   bigger and null terminated */
>    }
>    stok_un;
>  };
>  
> +#pragma GCC diagnostic pop
I think these three hunks should be a separate discussion about the code
construct we use to represent a trailing array and should be left out
of this patch.


>  
>  
>  
> @@ -325,7 +333,7 @@ state_ident_by_name (const char *name, enum insert_option optins)
>    namlen = strlen (name);
>    stid =
>      (struct state_ident_st *) xmalloc (sizeof (struct state_ident_st) +
> -				       namlen);
> +				       namlen + 1);
>    memset (stid, 0, sizeof (struct state_ident_st) + namlen);
>    strcpy (stid->stid_name, name);
>    *slot = stid;
How did you find this goof?
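
The size arithmetic behind the "+ 1" in the hunk above can be sketched
as follows, assuming a typical layout.  All names here (ident_one,
ident_flex, room_one, room_flex, name_fits) are hypothetical.  A C99
flexible array member stands in for the zero-length array used in the
patch, and it needs an extra member because a flexible array cannot be
the only member of a struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The [1] idiom: sizeof already includes one byte of NAME.  */
struct ident_one
{
  char name[1];
};

/* Stand-in for the zero-length form: sizeof includes no bytes of NAME.  */
struct ident_flex
{
  int unused;
  char name[];
};

/* Bytes available to NAME when sizeof (struct) + EXTRA are allocated.  */
static size_t
room_one (size_t extra)
{
  return sizeof (struct ident_one) - offsetof (struct ident_one, name) + extra;
}

static size_t
room_flex (size_t extra)
{
  return sizeof (struct ident_flex) - offsetof (struct ident_flex, name) + extra;
}

/* True if NAME and its terminating nul fit in ROOM bytes.  */
static bool
name_fits (size_t room, const char *name)
{
  assert (name != NULL);
  return strlen (name) + 1 <= room;
}
```

With the [1] declaration, sizeof plus namlen leaves room for the nul;
once the array contributes nothing to sizeof, the same expression is
one byte short, so the allocation has to ask for namlen + 1.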



> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index fc57fb45e3a..582768090ae 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -1346,6 +1346,10 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
>  	}
>      }
>  
> +  /* Set if VAL represents the maximum length based on array size (set
> +     when exact length cannot be determined).  */
> +  bool maxbound = false;
> +
>    if (!val && rkind == SRK_LENRANGE)
>      {
>        if (TREE_CODE (arg) == ADDR_EXPR)
> @@ -1441,6 +1445,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
>  	      pdata->minlen = ssize_int (0);
>  	    }
>  	}
> +      maxbound = true;
>      }
>  
>    if (!val)
> @@ -1454,7 +1459,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
>  	  && tree_int_cst_lt (val, pdata->minlen)))
>      pdata->minlen = val;
>  
> -  if (pdata->maxbound)
> +  if (pdata->maxbound && TREE_CODE (pdata->maxbound) == INTEGER_CST)
>      {
>        /* Adjust the tighter (more optimistic) string length bound
>  	 if necessary and proceed to adjust the more conservative
So inside the conditional guarded by the test you're changing above we have:

     if (TREE_CODE (val) == INTEGER_CST)
        {
          if (TREE_CODE (pdata->maxbound) == INTEGER_CST)
            {
              if (tree_int_cst_lt (pdata->maxbound, val))
                pdata->maxbound = val;
            }
          else
            pdata->maxbound = build_all_ones_cst (size_type_node);
        }

Isn't the inner test (TREE_CODE (pdata->maxbound) == INTEGER_CST) now
always true, so that we should remove the test and the else clause?
Does the else clause need to be handled elsewhere?  (I don't see that
it would be handled after your changes.)  Or perhaps it just doesn't
matter...




> @@ -1653,8 +1661,11 @@ get_range_strlen (tree arg, bitmap *visited,
>  
>  /* Try to obtain the range of the lengths of the string(s) referenced
>     by ARG, or the size of the largest array ARG refers to if the range
> -   of lengths cannot be determined, and store all in *PDATA.  ELTSIZE
> -   is the expected size of the string element in bytes: 1 for char and
> +   of lengths cannot be determined, and store all in *PDATA which must
> +   be zero-initialized on input except PDATA->MAXBOUND may be set to
> +   a non-null tree node other than INTEGER_CST to request to have it
> +   set to the length of the longest string in a PHI.  ELTSIZE is
> +   the expected size of the string element in bytes: 1 for char and
Is there any reason we can't just make a clean distinction between input
and output objects in this routine?  As an API this seems awkward at best.



> @@ -2862,51 +2865,78 @@ handle_builtin_memset (gimple_stmt_iterator *gsi)
>    return true;
>  }
>  
> -/* Handle a call to memcmp.  We try to handle small comparisons by
> -   converting them to load and compare, and replacing the call to memcmp
> -   with a __builtin_memcmp_eq call where possible.
> -   return true when call is transformed, return false otherwise.  */
> +/* Return a pointer to the first such equality expression if RES is used
> +   only in experessions testing its equality to zero, and null otherwise.  */
s/experessions/expressions/


>  
> -static bool
> -handle_builtin_memcmp (gimple_stmt_iterator *gsi)
> +static gimple*
> +used_only_for_zero_equality (tree res)
Nit.  A space between "gimple" and "*".




> +
> +/* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
> +   the result of 0 == strncmp (A, B, BOUND) (which is the same as strcmp
> +   for s sufficiently large BOUND).  If the result is based on the length
> +   of one string being greater than the longest string that would fit in
> +   the array pointer to by the argument, set *PLEN and *PSIZE to
> +   the corresponding length (or its complement when the string is known
> +   to be at least as long and need not be nul-terminated) and size.
> +   Otherwise return null.  */
s/null/NULL/


> +/* Diagnose pointless calls to strcmp whose result is used in equality
> +   epxpressions that evaluate to a constant due to one argument being
> +   longer than the size of the other.  */
s/epxressions/expressions/



> +/* Optimize a call to strcmp or strncmp either by folding it to a constant
> +   when possible or by transforming the latter to the former.  Warn about
> +   calls where the length of one argument is greater than the size of
> +   the array to which the other aargument points if the latter's length
> +   is not known.  Return true when the call has been transformed into
> +   another and false otherwise.  */
s/aargument/argument/


>  
> -  unsigned HOST_WIDE_INT var_sizei = 0;
> -  /* try to determine the minimum size of the object pointed by var_string.  */
> -  tree size = determine_min_objsize (var_string);
> +  /* Determine either the length or the size of each of the string
> +     orguments, whichever is available.  */
s/orguments/arguments/


Generally looks reasonable.  Just a couple of things: whether we can
clean up the API changes to get_range_strlen, the cleanups to
get_range_strlen_tree, and whether the dropped case from
get_range_strlen_tree needs to be handled and, if so, where it should go.

jeff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-12 20:15         ` Jeff Law
@ 2019-08-12 22:32           ` Martin Sebor
  2019-08-13  2:22             ` Jeff Law
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Sebor @ 2019-08-12 22:32 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek; +Cc: gcc-patches

On 8/12/19 2:04 PM, Jeff Law wrote:
> On 8/9/19 4:14 PM, Martin Sebor wrote:
>> On 8/9/19 10:58 AM, Jakub Jelinek wrote:
>>> On Fri, Aug 09, 2019 at 10:51:09AM -0600, Martin Sebor wrote:
>>>> That said, we should change this code one way or the other.
>>>> There is even less of a guarantee that other compilers support
>>>> writing past the end of arrays that have non-zero size than
>>>> that they recognize the documented zero-length extension.
>>>
>>> We use that everywhere forever, so no.
>>
>> Just because some invalid code has been in place "forever" doesn't
>> mean it cannot be changed.  Relying on undocumented "extensions"
>> because they just happen to work with the compilers they have been
>> exposed to is exactly how naive users get in trouble.  Our answer
>> to reports of "bugs" when the behavior changes is typically: fix
>> your code.  There's little reason to expect other compiler writers
>> to be any more accommodating.
>>
>>> See e.g. rtx u.fld and u.hwint arrays, tree_exp operands array,
>>> gimple_statement_with_ops op array just to name a few that are
>>> everywhere.  Coverity is indeed unhappy about
>>> that, but it would be with [0] certainly too.  Another option is
>>> to use maximum possible size where we know it (which is the case of
>>> rtxes and most tree expressions and gimple stmts, but not e.g.
>>> CALL_EXPR or GIMPLE_CALL where there is no easy upper bound.
>>
>> The solution introduced in C99 is a flexible array.  C++
>> compilers usually support it as well.  Those that don't are
>> likely to support the zero-length array (even Visual C++ does).
>> If there's a chance that some don't support either do you really
>> think it's safe to assume they will do something sane with
>> the [1] hack?  If you're concerned that the flexible array syntax
>> or the zero length array won't compile, add a configure test to
>> see if it does and use whatever alternative is most appropriate.
> Given that we require a C++03 compiler to build GCC, I think we can
> revisit how we represent the trailing array.  But that seems independent
> of the bulk of this patch.
> 
> Can we separate this issue from the rest of the patch?

The updated patch I posted is independent of the trailing
[1] array hack:

   https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html

Martin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-12 22:32           ` Martin Sebor
@ 2019-08-13  2:22             ` Jeff Law
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff Law @ 2019-08-13  2:22 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek; +Cc: gcc-patches

On 8/12/19 4:17 PM, Martin Sebor wrote:
> On 8/12/19 2:04 PM, Jeff Law wrote:
>> On 8/9/19 4:14 PM, Martin Sebor wrote:
>>> On 8/9/19 10:58 AM, Jakub Jelinek wrote:
>>>> On Fri, Aug 09, 2019 at 10:51:09AM -0600, Martin Sebor wrote:
>>>>> That said, we should change this code one way or the other.
>>>>> There is even less of a guarantee that other compilers support
>>>>> writing past the end of arrays that have non-zero size than
>>>>> that they recognize the documented zero-length extension.
>>>>
>>>> We use that everywhere forever, so no.
>>>
>>> Just because some invalid code has been in place "forever" doesn't
>>> mean it cannot be changed.  Relying on undocumented "extensions"
>>> because they just happen to work with the compilers they have been
>>> exposed to is exactly how naive users get in trouble.  Our answer
>>> to reports of "bugs" when the behavior changes is typically: fix
>>> your code.  There's little reason to expect other compiler writers
>>> to be any more accommodating.
>>>
>>>> See e.g. rtx u.fld and u.hwint arrays, tree_exp operands array,
>>>> gimple_statement_with_ops op array just to name a few that are
>>>> everywhere.  Coverity is indeed unhappy about
>>>> that, but it would be with [0] certainly too.  Another option is
>>>> to use maximum possible size where we know it (which is the case of
>>>> rtxes and most tree expressions and gimple stmts, but not e.g.
>>>> CALL_EXPR or GIMPLE_CALL where there is no easy upper bound.
>>>
>>> The solution introduced in C99 is a flexible array.  C++
>>> compilers usually support it as well.  Those that don't are
>>> likely to support the zero-length array (even Visual C++ does).
>>> If there's a chance that some don't support either do you really
>>> think it's safe to assume they will do something sane with
>>> the [1] hack?  If you're concerned that the flexible array syntax
>>> or the zero length array won't compile, add a configure test to
>>> see if it does and use whatever alternative is most appropriate.
>> Given that we require a C++03 compiler to build GCC, I think we can
>> revisit how we represent the trailing array.  But that seems independent
>> of the bulk of this patch.
>>
>> Can we separate this issue from the rest of the patch?
> 
> The updated patch I posted is independent of the trailing
> [1] array hack:
> 
>   https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html
I must have dropped this from my queue by accident.  I'll go find it and
give it a looksie as well.

jeff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-09 17:07   ` Martin Sebor
  2019-08-09 17:07     ` Jakub Jelinek
@ 2019-08-13 20:08     ` Jeff Law
  2019-08-13 23:26       ` Martin Sebor
  1 sibling, 1 reply; 21+ messages in thread
From: Jeff Law @ 2019-08-13 20:08 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek; +Cc: gcc-patches

On 8/9/19 10:51 AM, Martin Sebor wrote:
> 
> PR tree-optimization/90879 - fold zero-equality of strcmp between a longer string and a smaller array
> 
> gcc/c-family/ChangeLog:
> 
> 	PR tree-optimization/90879
> 	* c.opt (-Wstring-compare): New option.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR tree-optimization/90879
> 	* gcc.dg/Wstring-compare-2.c: New test.
> 	* gcc.dg/Wstring-compare.c: New test.
> 	* gcc.dg/strcmpopt_3.c: Scan the optimized dump instead of strlen.
> 	* gcc.dg/strcmpopt_6.c: New test.
> 	* gcc.dg/strlenopt-65.c: Remove unnecessary declarations, add
> 	test cases.
> 	* gcc.dg/strlenopt-66.c: Run it.
> 	* gcc.dg/strlenopt-68.c: New test.
> 
> gcc/ChangeLog:
> 
> 	PR tree-optimization/90879
> 	* builtins.c (check_access): Avoid using maxbound when null.
> 	* calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen change.
> 	* doc/invoke.texi (-Wstring-compare): Document new warning option.
> 	* gimple-fold.c (get_range_strlen_tree): Make setting maxbound
> 	conditional.
> 	(get_range_strlen): Overwrite initial maxbound when non-null.
> 	* gimple-ssa-sprintf.c (get_string_length): Adjust to get_range_strlen
> 	change.
> 	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
> 	(used_only_for_zero_equality): New function.
> 	(handle_builtin_memcmp): Call it.
> 	(determine_min_objsize): Return an integer instead of tree.
> 	(get_len_or_size, strxcmp_eqz_result): New functions.
> 	(maybe_warn_pointless_strcmp): New function.
> 	(handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
> 	between a longer string and a smaller array.
> 

> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> index 4af47855e7c..31e012b741b 100644
> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c

> @@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
>  
>    type = TYPE_MAIN_VARIANT (type);
>  
> -  /* We cannot determine the size of the array if it's a flexible array,
> -     which is declared at the end of a structure.  */
> -  if (TREE_CODE (type) == ARRAY_TYPE
> -      && !array_at_struct_end_p (dest))
> +  /* The size of a flexible array cannot be determined.  Otherwise,
> +     for arrays with more than one element, return the size of its
> +     type.  GCC itself misuses arrays of both zero and one elements
> +     as flexible array members so they are excluded as well.  */
> +  if (TREE_CODE (type) != ARRAY_TYPE
> +      || !array_at_struct_end_p (dest))
>      {
> -      tree size_t = TYPE_SIZE_UNIT (type);
> -      if (size_t && TREE_CODE (size_t) == INTEGER_CST
> -	  && !integer_zerop (size_t))
> -        return size_t;
> +      tree type_size = TYPE_SIZE_UNIT (type);
> +      if (type_size && TREE_CODE (type_size) == INTEGER_CST
> +	  && !integer_onep (type_size)
> +	  && !integer_zerop (type_size))
> +        return tree_to_uhwi (type_size);
So I nearly commented on this when looking at the original patch.  Can
we really depend on the size when we've got an array at the end of a
struct with a declared size other than 0/1?   While 0/1 are by far the
most common way to declare them, couldn't someone have used other sizes?
 I think we pondered doing that at one time to cut down on the noise
from Coverity for RTL and TREE operand accessors.

Your code makes us safer, so I'm not saying you've done anything wrong,
just trying to decide if we need to tighten this up even further.

No additional comments beyond what I pointed out yesterday against the
original patch.

Jeff

>

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-13 20:08     ` Jeff Law
@ 2019-08-13 23:26       ` Martin Sebor
  2019-08-14  0:39         ` Jeff Law
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Sebor @ 2019-08-13 23:26 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek; +Cc: gcc-patches

On 8/13/19 2:07 PM, Jeff Law wrote:
> On 8/9/19 10:51 AM, Martin Sebor wrote:
>>
>> PR tree-optimization/90879 - fold zero-equality of strcmp between a longer string and a smaller array
>>
>> gcc/c-family/ChangeLog:
>>
>> 	PR tree-optimization/90879
>> 	* c.opt (-Wstring-compare): New option.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	PR tree-optimization/90879
>> 	* gcc.dg/Wstring-compare-2.c: New test.
>> 	* gcc.dg/Wstring-compare.c: New test.
>> 	* gcc.dg/strcmpopt_3.c: Scan the optmized dump instead of strlen.
>> 	* gcc.dg/strcmpopt_6.c: New test.
>> 	* gcc.dg/strlenopt-65.c: Remove uinnecessary declarations, add
>> 	test cases.
>> 	* gcc.dg/strlenopt-66.c: Run it.
>> 	* gcc.dg/strlenopt-68.c: New test.
>>
>> gcc/ChangeLog:
>>
>> 	PR tree-optimization/90879
>> 	* builtins.c (check_access): Avoid using maxbound when null.
>> 	* calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen change.
>> 	* doc/invoke.texi (-Wstring-compare): Document new warning option.
>> 	* gimple-fold.c (get_range_strlen_tree): Make setting maxbound
>> 	conditional.
>> 	(get_range_strlen): Overwrite initial maxbound when non-null.
>> 	* gimple-ssa-sprintf.c (get_string_length): Adjust to get_range_strlen
>> 	change.
>> 	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
>> 	(used_only_for_zero_equality): New function.
>> 	(handle_builtin_memcmp): Call it.
>> 	(determine_min_objsize): Return an integer instead of tree.
>> 	(get_len_or_size, strxcmp_eqz_result): New functions.
>> 	(maybe_warn_pointless_strcmp): New function.
>> 	(handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
>> 	between a longer string and a smaller array.
>>
> 
>> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
>> index 4af47855e7c..31e012b741b 100644
>> --- a/gcc/tree-ssa-strlen.c
>> +++ b/gcc/tree-ssa-strlen.c
> 
>> @@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
>>   
>>     type = TYPE_MAIN_VARIANT (type);
>>   
>> -  /* We cannot determine the size of the array if it's a flexible array,
>> -     which is declared at the end of a structure.  */
>> -  if (TREE_CODE (type) == ARRAY_TYPE
>> -      && !array_at_struct_end_p (dest))
>> +  /* The size of a flexible array cannot be determined.  Otherwise,
>> +     for arrays with more than one element, return the size of its
>> +     type.  GCC itself misuses arrays of both zero and one elements
>> +     as flexible array members so they are excluded as well.  */
>> +  if (TREE_CODE (type) != ARRAY_TYPE
>> +      || !array_at_struct_end_p (dest))
>>       {
>> -      tree size_t = TYPE_SIZE_UNIT (type);
>> -      if (size_t && TREE_CODE (size_t) == INTEGER_CST
>> -	  && !integer_zerop (size_t))
>> -        return size_t;
>> +      tree type_size = TYPE_SIZE_UNIT (type);
>> +      if (type_size && TREE_CODE (type_size) == INTEGER_CST
>> +	  && !integer_onep (type_size)
>> +	  && !integer_zerop (type_size))
>> +        return tree_to_uhwi (type_size);
> So I nearly commented on this when looking at the original patch.  Can
> we really depend on the size when we've got an array at the end of a
> struct with a declared size other than 0/1?   While 0/1 are by far the
> most common way to declare them, couldn't someone have used other sizes?
>   I think we pondered doing that at one time to cut down on the noise
> from Coverity for RTL and TREE operand accessors.
> 
> Your code makes us safer, so I'm not saying you've done anything wrong,
> just trying to decide if we need to tighten this up even further.

This patch issues a warning in these cases, i.e., when it sees
a call like, say, strcmp("foobar", A) with an A that's smaller
than the string, because it seems they are likely (rare) bugs.
I haven't seen the warning in any of the projects I tested it
with (Binutils/GDB, GCC, Glibc, the Linux kernel, and LLVM).
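
To make the case concrete, here is a minimal sketch of the kind of
call in question.  The names struct T and is_foobar are made up for
illustration and are not taken from the patch or its tests.  Whether
GCC folds or warns for a trailing member array like this one depends
on how such arrays are treated, which is the question raised above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type for illustration; A is the last member.  */
struct T
{
  int n;
  char a[4];
};

/* If P->a holds a nul-terminated string that stays within its
   declared bound, it has at most 3 characters and so can never
   compare equal to the 6-character "foobar".  */
static int
is_foobar (const struct T *p)
{
  assert (p != NULL);
  return strcmp ("foobar", p->a) == 0;
}
```

The equality can only hold if the string runs past the four declared
elements, that is, if the array is being used as if it were flexible.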

The warning uses strcmp to detect these mistakes (or misuses),
but I'd like to add similar warnings for other string functions
as well, and to have code that does this on purpose switch to
true flexible array members (or the zero-length extension)
instead.  That makes the intent clear.
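
For reference, a minimal sketch of what the flexible array member
form looks like.  struct msg and msg_new are hypothetical names used
only for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type; TEXT is a C99 flexible array member, so the
   declaration itself says the object extends past the struct.  */
struct msg
{
  size_t len;
  char text[];
};

/* Allocate a struct msg with room for a copy of S, including
   the terminating nul.  Returns NULL on allocation failure.  */
static struct msg *
msg_new (const char *s)
{
  assert (s != NULL);
  size_t n = strlen (s);
  struct msg *m = malloc (sizeof *m + n + 1);
  if (m == NULL)
    return NULL;
  m->len = n;
  memcpy (m->text, s, n + 1);
  return m;
}
```

With this form there is no declared bound that a compiler or a reader
could mistake for the real size of the array.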

It's a judgment call whether to also fold (or do something else
like insert a trap) in addition to issuing a warning.  In this
case (reading) I don't think it matters as much as it does for
writes.  Either way, it would be nice to set a policy and
document it in the manual so users know what to expect and
so we don't have to revisit this question for each patch that
touches on this subject.

Martin

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-13 23:26       ` Martin Sebor
@ 2019-08-14  0:39         ` Jeff Law
  2019-08-14 20:57           ` Martin Sebor
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff Law @ 2019-08-14  0:39 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek; +Cc: gcc-patches

On 8/13/19 3:43 PM, Martin Sebor wrote:
> On 8/13/19 2:07 PM, Jeff Law wrote:
>> On 8/9/19 10:51 AM, Martin Sebor wrote:
>>>
>>> PR tree-optimization/90879 - fold zero-equality of strcmp between a
>>> longer string and a smaller array
>>>
>>> gcc/c-family/ChangeLog:
>>>
>>>     PR tree-optimization/90879
>>>     * c.opt (-Wstring-compare): New option.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     PR tree-optimization/90879
>>>     * gcc.dg/Wstring-compare-2.c: New test.
>>>     * gcc.dg/Wstring-compare.c: New test.
>>>     * gcc.dg/strcmpopt_3.c: Scan the optmized dump instead of strlen.
>>>     * gcc.dg/strcmpopt_6.c: New test.
>>>     * gcc.dg/strlenopt-65.c: Remove uinnecessary declarations, add
>>>     test cases.
>>>     * gcc.dg/strlenopt-66.c: Run it.
>>>     * gcc.dg/strlenopt-68.c: New test.
>>>
>>> gcc/ChangeLog:
>>>
>>>     PR tree-optimization/90879
>>>     * builtins.c (check_access): Avoid using maxbound when null.
>>>     * calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen
>>> change.
>>>     * doc/invoke.texi (-Wstring-compare): Document new warning option.
>>>     * gimple-fold.c (get_range_strlen_tree): Make setting maxbound
>>>     conditional.
>>>     (get_range_strlen): Overwrite initial maxbound when non-null.
>>>     * gimple-ssa-sprintf.c (get_string_length): Adjust to
>>> get_range_strlen
>>>     change.
>>>     * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
>>>     (used_only_for_zero_equality): New function.
>>>     (handle_builtin_memcmp): Call it.
>>>     (determine_min_objsize): Return an integer instead of tree.
>>>     (get_len_or_size, strxcmp_eqz_result): New functions.
>>>     (maybe_warn_pointless_strcmp): New function.
>>>     (handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
>>>     between a longer string and a smaller array.
>>>
>>
>>> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
>>> index 4af47855e7c..31e012b741b 100644
>>> --- a/gcc/tree-ssa-strlen.c
>>> +++ b/gcc/tree-ssa-strlen.c
>>
>>> @@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
>>>       type = TYPE_MAIN_VARIANT (type);
>>>   -  /* We cannot determine the size of the array if it's a flexible
>>> array,
>>> -     which is declared at the end of a structure.  */
>>> -  if (TREE_CODE (type) == ARRAY_TYPE
>>> -      && !array_at_struct_end_p (dest))
>>> +  /* The size of a flexible array cannot be determined.  Otherwise,
>>> +     for arrays with more than one element, return the size of its
>>> +     type.  GCC itself misuses arrays of both zero and one elements
>>> +     as flexible array members so they are excluded as well.  */
>>> +  if (TREE_CODE (type) != ARRAY_TYPE
>>> +      || !array_at_struct_end_p (dest))
>>>       {
>>> -      tree size_t = TYPE_SIZE_UNIT (type);
>>> -      if (size_t && TREE_CODE (size_t) == INTEGER_CST
>>> -      && !integer_zerop (size_t))
>>> -        return size_t;
>>> +      tree type_size = TYPE_SIZE_UNIT (type);
>>> +      if (type_size && TREE_CODE (type_size) == INTEGER_CST
>>> +      && !integer_onep (type_size)
>>> +      && !integer_zerop (type_size))
>>> +        return tree_to_uhwi (type_size);
>> So I nearly commented on this when looking at the original patch.  Can
>> we really depend on the size when we've got an array at the end of a
>> struct with a declared size other than 0/1?   While 0/1 are by far the
>> most common way to declare them, couldn't someone have used other sizes?
>>   I think we pondered doing that at one time to cut down on the noise
>> from Coverity for RTL and TREE operand accessors.
>>
>> Your code makes us safer, so I'm not saying you've done anything wrong,
>> just trying to decide if we need to tighten this up even further.
> 
> This patch issues a warning in these cases, i.e., when it sees
> a call like, say, strcmp("foobar", A) with an A that's smaller
> than the string, because it seems they are likely (rare) bugs.
> I haven't seen the warning in any of the projects I tested it
> with (Binutils/GDB, GCC, Glibc, the Linux kernel, and LLVM).
> 
> The warning uses strcmp to detect these mistakes (or misuses)
> but I'd like to add similar warnings for other string functions
> as well and have code out there that does this on purpose use
> true flexible array members (or the zero-length extension)
> instead.  That makes the intent clear.
> 
> It's a judgment call whether to also fold (or do something else
> like insert a trap) in addition to issuing a warning.  In this
> case (reading) I don't think it matters as much as it does for
> writes.  Either way, it would be nice to set a policy and
> document it in the manual so users know what to expect and
> so we don't have to revisit this question for each patch that
> touches on this subject.
The GCC manual documents zero length arrays at the end of an aggregate
as a GNU extension for variable length objects.  The manual also
documents that the same can be done with single element arrays; doing
so contributes to the base size of the aggregate, but the array is
otherwise handled like a zero length array.
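
The size difference can be sketched as follows.  This is only an
illustration: struct flex, struct one, flex_size and one_size are
hypothetical names, and a C99 flexible array member stands in for
the zero length GNU extension to keep the example portable.

```c
#include <assert.h>
#include <stddef.h>

struct flex { int n; char data[]; };    /* C99 flexible array member */
struct one  { int n; char data[1]; };   /* single element trailing array */

/* The declared element of struct one counts toward its size.  */
static_assert (sizeof (struct one) >= offsetof (struct one, data) + 1,
               "single element contributes to the base size");

/* Bytes needed for a struct flex with COUNT trailing chars.  */
static size_t
flex_size (size_t count)
{
  return sizeof (struct flex) + count;
}

/* Same for struct one: start from the member's offset so the
   declared element isn't counted twice, but never return less
   than the full struct.  */
static size_t
one_size (size_t count)
{
  size_t need = offsetof (struct one, data) + count;
  return need < sizeof (struct one) ? sizeof (struct one) : need;
}
```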

So both zero and one element arrays are documented as supported for this
use case.  However, I could easily see someone making the case that any
size should work here and I could easily think of cases where that would
be a reasonable thing to do.  We do not handle these cases in a
consistent way -- we'll treat sizes other than 0/1 as being a variable
length object in some cases, but not in others.

I'm tempted to bring consistency here.  We're likely not losing
significant diagnostic opportunities or optimizations if we treat all
trailing arrays as creating potentially variable sized objects.

jeff

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-12 13:56         ` Michael Matz
@ 2019-08-14 16:30           ` Martin Sebor
  0 siblings, 0 replies; 21+ messages in thread
From: Martin Sebor @ 2019-08-14 16:30 UTC (permalink / raw)
  To: Michael Matz; +Cc: Jakub Jelinek, gcc-patches

On 8/12/19 7:40 AM, Michael Matz wrote:
> Hi,
> 
> On Fri, 9 Aug 2019, Martin Sebor wrote:
> 
>> The solution introduced in C99 is a flexible array.  C++
>> compilers usually support it as well.  Those that don't are
>> likely to support the zero-length array (even Visual C++ does).
>> If there's a chance that some don't support either do you really
>> think it's safe to assume they will do something sane with
>> the [1] hack?
> 
> As the [1] "hack" is the traditional pre-C99 (and C++) idiom to
> implement flexible trailing char arrays, yes, I do expect all existing
> (and not any more existing) compilers to do the obvious and sane thing
> with it.  IOW: it's more portable in practice than our documented
> zero-length extension.  And that's what matters for the things compiled by
> the host compiler.
> 
> Without requiring C99 (which would be a different discussion) and a
> non-existing C++ standard we can't write this code (in this form) in a
> standard conforming way, no matter what we wish for.  Hence it seems
> prudent to use the most portable variant of all the non-standard ways, the
> trailing [1] array.

There are a few reasons why these legacy C idioms should be
replaced with better/newer/safer alternatives.

First, with two C revisions since C99 and with support for
superior alternatives widely available, pre-C99 idioms have less
and less relevance.

Second, since most of GCC requires a C++98 compiler to compile,
ancient C code needs to adjust to the more strict C++ requirements.
As C++ evolves, dependencies on legacy extensions like this one
make it increasingly difficult to upgrade to newer revisions of
the standard.  C++ 11 already requires compilers to reject
undefined behavior in constexpr contexts, including accesses
to arrays outside of their bounds.  Once GCC adopts C++ 11 it
won't be able to make use of constexpr with code that relies
on the hack.

Third, the safest and most secure approach to dealing with past-
the-end accesses is to diagnose and prevent them.  Accommodating
code that disregards the array bounds compromises this goal.  This
is evident from the gaps in _FORTIFY_SOURCE and -Wstringop-overflow
that other compilers like Clang and ICC don't suffer from(*).  It's
in everyone's best interest to proactively drive these idioms to
extinction and replace them with safer alternatives that let compilers
distinguish intentional accesses from accidental ones.  Doing so makes
it easier not only to find bugs but also to emit more efficient object
code.

Martin

PS Unlike GCC, both Clang and ICC diagnose past-the-end accesses
to trailing arrays with more than one element.  They do recognize
the struct hack even in C++ and, outside constexpr contexts, avoid
diagnosing past-the-end accesses to trailing one-element arrays.
This isn't so much an issue today because neither allows statically
initializing struct objects with such arrays to more elements than
the bound specifies.  But it will likely change when the C++
proposal for constexpr functions to use new expressions is adopted 
(P0784R1).

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-14  0:39         ` Jeff Law
@ 2019-08-14 20:57           ` Martin Sebor
  2019-08-21  7:40             ` Martin Sebor
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Sebor @ 2019-08-14 20:57 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek; +Cc: gcc-patches

On 8/13/19 4:46 PM, Jeff Law wrote:
> On 8/13/19 3:43 PM, Martin Sebor wrote:
>> On 8/13/19 2:07 PM, Jeff Law wrote:
>>> On 8/9/19 10:51 AM, Martin Sebor wrote:
>>>>
>>>> PR tree-optimization/90879 - fold zero-equality of strcmp between a
>>>> longer string and a smaller array
>>>>
>>>> gcc/c-family/ChangeLog:
>>>>
>>>>      PR tree-optimization/90879
>>>>      * c.opt (-Wstring-compare): New option.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>      PR tree-optimization/90879
>>>>      * gcc.dg/Wstring-compare-2.c: New test.
>>>>      * gcc.dg/Wstring-compare.c: New test.
>>>>      * gcc.dg/strcmpopt_3.c: Scan the optmized dump instead of strlen.
>>>>      * gcc.dg/strcmpopt_6.c: New test.
>>>>      * gcc.dg/strlenopt-65.c: Remove uinnecessary declarations, add
>>>>      test cases.
>>>>      * gcc.dg/strlenopt-66.c: Run it.
>>>>      * gcc.dg/strlenopt-68.c: New test.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>      PR tree-optimization/90879
>>>>      * builtins.c (check_access): Avoid using maxbound when null.
>>>>      * calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen
>>>> change.
>>>>      * doc/invoke.texi (-Wstring-compare): Document new warning option.
>>>>      * gimple-fold.c (get_range_strlen_tree): Make setting maxbound
>>>>      conditional.
>>>>      (get_range_strlen): Overwrite initial maxbound when non-null.
>>>>      * gimple-ssa-sprintf.c (get_string_length): Adjust to
>>>> get_range_strlen
>>>>      change.
>>>>      * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
>>>>      (used_only_for_zero_equality): New function.
>>>>      (handle_builtin_memcmp): Call it.
>>>>      (determine_min_objsize): Return an integer instead of tree.
>>>>      (get_len_or_size, strxcmp_eqz_result): New functions.
>>>>      (maybe_warn_pointless_strcmp): New function.
>>>>      (handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
>>>>      between a longer string and a smaller array.
>>>>
>>>
>>>> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
>>>> index 4af47855e7c..31e012b741b 100644
>>>> --- a/gcc/tree-ssa-strlen.c
>>>> +++ b/gcc/tree-ssa-strlen.c
>>>
>>>> @@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
>>>>        type = TYPE_MAIN_VARIANT (type);
>>>>    -  /* We cannot determine the size of the array if it's a flexible
>>>> array,
>>>> -     which is declared at the end of a structure.  */
>>>> -  if (TREE_CODE (type) == ARRAY_TYPE
>>>> -      && !array_at_struct_end_p (dest))
>>>> +  /* The size of a flexible array cannot be determined.  Otherwise,
>>>> +     for arrays with more than one element, return the size of its
>>>> +     type.  GCC itself misuses arrays of both zero and one elements
>>>> +     as flexible array members so they are excluded as well.  */
>>>> +  if (TREE_CODE (type) != ARRAY_TYPE
>>>> +      || !array_at_struct_end_p (dest))
>>>>        {
>>>> -      tree size_t = TYPE_SIZE_UNIT (type);
>>>> -      if (size_t && TREE_CODE (size_t) == INTEGER_CST
>>>> -      && !integer_zerop (size_t))
>>>> -        return size_t;
>>>> +      tree type_size = TYPE_SIZE_UNIT (type);
>>>> +      if (type_size && TREE_CODE (type_size) == INTEGER_CST
>>>> +      && !integer_onep (type_size)
>>>> +      && !integer_zerop (type_size))
>>>> +        return tree_to_uhwi (type_size);
>>> So I nearly commented on this when looking at the original patch.  Can
>>> we really depend on the size when we've got an array at the end of a
>>> struct with a declared size other than 0/1?   While 0/1 are by far the
>>> most common way to declare them, couldn't someone have used other sizes?
>>>    I think we pondered doing that at one time to cut down on the noise
>>> from Coverity for RTL and TREE operand accessors.
>>>
>>> Your code makes us safer, so I'm not saying you've done anything wrong,
>>> just trying to decide if we need to tighten this up even further.
>>
>> This patch issues a warning in these cases, i.e., when it sees
>> a call like, say, strcmp("foobar", A) with an A that's smaller
>> than the string, because it seems they are likely (rare) bugs.
>> I haven't seen the warning in any of the projects I tested it
>> with (Binutils/GDB, GCC, Glibc, the Linux kernel, and LLVM).
>>
>> The warning uses strcmp to detect these mistakes (or misuses)
>> but I'd like to add similar warnings for other string functions
>> as well and have code out there that does this on purpose use
>> true flexible array members (or the zero-length extension)
>> instead.  That makes the intent clear.
>>
>> It's a judgment call whether to also fold (or do something else
>> like insert a trap) in addition to issuing a warning.  In this
>> case (reading) I don't think it matters as much as it does for
>> writes.  Either way, it would be nice to set a policy and
>> document it in the manual so users know what to expect and
>> so we don't have to revisit this question for each patch that
>> touches on this subject.
> The GCC manual documents zero length arrays at the end of an aggregate
> as a GNU extension for variable length objects.  The manual also
> documents that it could be done with single element arrays, but that
> doing so does contribute to the base size of the aggregate, but
> otherwise it's handled like a zero length array.
> 
> So both zero and one element arrays are documented as supported for this
> use case.  However, I could easily see someone making the case that any
> size should work here and I could easily think of cases where that would
> be a reasonable thing to do.  We do not handle these cases in a
> consistent way -- we'll treat sizes other than 0/1 as being a variable
> length object in some cases, but not in others.
> 
> I'm tempted to bring consistency here.  We're likely not losing
> significant diagnostic opportunities or optimizations if we treat all
> trailing arrays as creating potentially variable sized objects.

It's not terribly important in this case but it is very much so
in general.

I would tend to agree that treating all trailing arrays as flexible
array members might not have as dramatic an impact on efficiency.
But when it comes to detecting and diagnosing buffer overflow, it
most certainly does.  For example:

   struct S
   {
     void (*pf)(void);
     char a[8];
   };

   void f (struct S *p)
   {
     strcpy (p->a, "0123456789");
   }

GCC doesn't diagnose the overflow even with _FORTIFY_SOURCE=2,
despite the high likelihood of memory corruption.  How common
are structs with trailing arrays?

In a GCC build on x86_64, there are 726 distinct structs with
an array as the last member.  Of those, 268 have more than one
element.  In Binutils/GDB, it's 638 and 537, respectively.  In
the kernel it's 8283 and 7584 (and 3795 have 10 or more).

Some of these serve the purpose of flexible array members but
it seems highly unlikely it's more than a small subset.  By
assuming the opposite we are leaving all the code that accesses
those arrays with no protection against buffer overflow.
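
Until the compiler diagnoses such cases, code that treats the
declared bound as real has to check it by hand.  A minimal sketch,
reusing struct S from above; set_a is a hypothetical helper, not
something from GCC or the patch:

```c
#include <assert.h>
#include <string.h>

struct S
{
  void (*pf)(void);
  char a[8];
};

/* Hypothetical helper: copy SRC into P->a only if it fits within
   the declared bound, including the terminating nul.  Returns 0
   on success and -1 if SRC is too long.  */
static int
set_a (struct S *p, const char *src)
{
  assert (p != NULL && src != NULL);
  size_t n = strlen (src);
  if (n >= sizeof p->a)
    return -1;
  memcpy (p->a, src, n + 1);
  return 0;
}
```

The check relies on sizeof p->a being the true capacity, which is
exactly the assumption that does not hold for arrays used as
flexible array members.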

I believe object and subobject boundaries must be respected.
There is room for exemptions but those need to be made explicit
in the code.  In this case, the mechanism is the flexible array
member syntax (or the zero-length array extension(*)).  The GCC
manual explicitly discourages the one-element array case and
so I'd say it would be appropriate to warn about it if we can
determine it's being misused (otherwise, why discourage it if
we don't really mean it?)  The manual doesn't mention support
for larger arrays and writing past the end of those is almost
certainly a bug.  Those should all be diagnosed.  Clang and ICC
already do diagnose these(**), and both, especially Clang, are
being used to compile an increasing proportion of the Linux
ecosystem, so there is less and less reason for GCC to be
as permissive as it has been and not issue comparable
diagnostics.

Martin

[*] To minimize the transition effort we could introduce a new
"flexarray" attribute to annotate larger trailing arrays
to indicate they are intended to be used as flexible array members.
But I don't have the impression the pattern is widespread enough
to justify adding yet another extension.

[**] Both Clang and ICC have issued a warning for the out-of-
bounds access below for many releases:

struct S
{
   void (*pf)(void);
   char a[8];
};

void f (struct S *p)
{
   p->a[8] = 0;
}

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-14 20:57           ` Martin Sebor
@ 2019-08-21  7:40             ` Martin Sebor
  2019-08-22 22:23               ` Jeff Law
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Sebor @ 2019-08-21  7:40 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek; +Cc: gcc-patches

Jeff,

Please let me know if you agree/disagree and what I need to
do to advance this work:

   https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html

Thanks
Martin

On 8/14/19 1:59 PM, Martin Sebor wrote:
> On 8/13/19 4:46 PM, Jeff Law wrote:
>> On 8/13/19 3:43 PM, Martin Sebor wrote:
>>> On 8/13/19 2:07 PM, Jeff Law wrote:
>>>> On 8/9/19 10:51 AM, Martin Sebor wrote:
>>>>>
>>>>> PR tree-optimization/90879 - fold zero-equality of strcmp between a
>>>>> longer string and a smaller array
>>>>>
>>>>> gcc/c-family/ChangeLog:
>>>>>
>>>>>      PR tree-optimization/90879
>>>>>      * c.opt (-Wstring-compare): New option.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>>      PR tree-optimization/90879
>>>>>      * gcc.dg/Wstring-compare-2.c: New test.
>>>>>      * gcc.dg/Wstring-compare.c: New test.
>>>>>      * gcc.dg/strcmpopt_3.c: Scan the optmized dump instead of strlen.
>>>>>      * gcc.dg/strcmpopt_6.c: New test.
>>>>>      * gcc.dg/strlenopt-65.c: Remove uinnecessary declarations, add
>>>>>      test cases.
>>>>>      * gcc.dg/strlenopt-66.c: Run it.
>>>>>      * gcc.dg/strlenopt-68.c: New test.
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>>      PR tree-optimization/90879
>>>>>      * builtins.c (check_access): Avoid using maxbound when null.
>>>>>      * calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen
>>>>> change.
>>>>>      * doc/invoke.texi (-Wstring-compare): Document new warning 
>>>>> option.
>>>>>      * gimple-fold.c (get_range_strlen_tree): Make setting maxbound
>>>>>      conditional.
>>>>>      (get_range_strlen): Overwrite initial maxbound when non-null.
>>>>>      * gimple-ssa-sprintf.c (get_string_length): Adjust to
>>>>> get_range_strlen
>>>>>      change.
>>>>>      * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
>>>>>      (used_only_for_zero_equality): New function.
>>>>>      (handle_builtin_memcmp): Call it.
>>>>>      (determine_min_objsize): Return an integer instead of tree.
>>>>>      (get_len_or_size, strxcmp_eqz_result): New functions.
>>>>>      (maybe_warn_pointless_strcmp): New function.
>>>>>      (handle_builtin_string_cmp): Call it.  Fold zero-equality of 
>>>>> strcmp
>>>>>      between a longer string and a smaller array.
>>>>>
>>>>
>>>>> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
>>>>> index 4af47855e7c..31e012b741b 100644
>>>>> --- a/gcc/tree-ssa-strlen.c
>>>>> +++ b/gcc/tree-ssa-strlen.c
>>>>
>>>>> @@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
>>>>>        type = TYPE_MAIN_VARIANT (type);
>>>>>    -  /* We cannot determine the size of the array if it's a flexible
>>>>> array,
>>>>> -     which is declared at the end of a structure.  */
>>>>> -  if (TREE_CODE (type) == ARRAY_TYPE
>>>>> -      && !array_at_struct_end_p (dest))
>>>>> +  /* The size of a flexible array cannot be determined.  Otherwise,
>>>>> +     for arrays with more than one element, return the size of its
>>>>> +     type.  GCC itself misuses arrays of both zero and one elements
>>>>> +     as flexible array members so they are excluded as well.  */
>>>>> +  if (TREE_CODE (type) != ARRAY_TYPE
>>>>> +      || !array_at_struct_end_p (dest))
>>>>>        {
>>>>> -      tree size_t = TYPE_SIZE_UNIT (type);
>>>>> -      if (size_t && TREE_CODE (size_t) == INTEGER_CST
>>>>> -      && !integer_zerop (size_t))
>>>>> -        return size_t;
>>>>> +      tree type_size = TYPE_SIZE_UNIT (type);
>>>>> +      if (type_size && TREE_CODE (type_size) == INTEGER_CST
>>>>> +      && !integer_onep (type_size)
>>>>> +      && !integer_zerop (type_size))
>>>>> +        return tree_to_uhwi (type_size);
>>>> So I nearly commented on this when looking at the original patch.  Can
>>>> we really depend on the size when we've got an array at the end of a
>>>> struct with a declared size other than 0/1?   While 0/1 are by far the
>>>> most common way to declare them, couldn't someone have used other 
>>>> sizes?
>>>>    I think we pondered doing that at one time to cut down on the noise
>>>> from Coverity for RTL and TREE operand accessors.
>>>>
>>>> Your code makes us safer, so I'm not saying you've done anything wrong,
>>>> just trying to decide if we need to tighten this up even further.
>>>
>>> This patch issues a warning in these cases, i.e., when it sees
>>> a call like, say, strcmp("foobar", A) with an A that's smaller
>>> than the string, because it seems they are likely (rare) bugs.
>>> I haven't seen the warning in any of the projects I tested it
>>> with (Binutils/GDB, GCC, Glibc, the Linux kernel, and LLVM).
>>>
>>> The warning uses strcmp to detect these mistakes (or misuses)
>>> but I'd like to add similar warnings for other string functions
>>> as well and have code out there that does this on purpose use
>>> true flexible array members (or the zero-length extension)
>>> instead.  That makes the intent clear.
>>>
>>> It's a judgment call whether to also fold (or do something else
>>> like insert a trap) in addition to issuing a warning.  In this
>>> case (reading) I don't think it matters as much as it does for
>>> writes.  Either way, it would be nice to set a policy and
>>> document it in the manual so users know what to expect and
>>> so we don't have to revisit this question for each patch that
>>> touches on this subject.
>> The GCC manual documents zero length arrays at the end of an aggregate
>> as a GNU extension for variable length objects.  The manual also
>> documents that it could be done with single element arrays, but that
>> doing so does contribute to the base size of the aggregate, but
>> otherwise it's handled like a zero length array.
>>
>> So both zero and one element arrays are documented as supported for this
>> use case.  However, I could easily see someone making the case that any
>> size should work here and I could easily think of cases where that would
>> be a reasonable thing to do.  We do not handle these cases in a
>> consistent way -- we'll treat sizes other than 0/1 as being a variable
>> length object in some cases, but not in others.
>>
>> I'm tempted to bring consistency here.  We're likely not losing
>> significant diagnostic opportunities or optimizations if we treat all
>> trailing arrays as creating potentially variable sized objects.
> 
> It's not terribly important in this case but it is very much so
> in general.
> 
> I would tend to agree that treating all trailing arrays as flexible
> array members might not have as dramatic an impact on efficiency.
> But when it comes to detecting and diagnosing buffer overflow, it
> most certainly does.  For example:
> 
>    struct S
>    {
>      void (*pf)(void);
>      char a[8];
>    };
> 
>    void f (struct S *p)
>    {
>      strcpy (p->a, "0123456789");
>    }
> 
> GCC doesn't diagnose the overflow even with _FORTIFY_SOURCE=2,
> despite the high likelihood of memory corruption.  How common
> are structs with trailing arrays?
> 
> In a GCC build on x86_64, there are 726 distinct structs with
> an array as the last member.  Of those, 268 have more than one
> element.  In Binutils/GDB, it's 638 and 537, respectively.  In
> the kernel it's 8283 and 7584 (and 3795 have 10 or more).
> 
> Some of these serve the purpose of flexible array members but
> it seems highly unlikely it's more than a small subset.  By
> assuming the opposite we are leaving all the code that accesses
> those arrays with no protection against buffer overflow.
> 
> I believe object and subobject boundaries must be respected.
> There is room for exemptions but those need to be made explicit
> in the code.  In this case, the mechanism is the flexible array
> member syntax (or the zero-length array extension(*)).  The GCC
> manual explicitly discourages the one-element array case and
> so I'd say it would be appropriate to warn about it if we can
> determine it's being misused (otherwise, why discourage it if
> we don't really mean it?)  The manual doesn't mention support
> for larger arrays and writing past the end of those is almost
> certainly a bug.  Those should all be diagnosed.  Clang and ICC
> already do diagnose these(**), and both, especially Clang, are
> being used to compile an increasing proportion of the Linux
> ecosystem, so there is less and less reason for GCC to be
> as permissive as it has been and not issue comparable
> diagnostics.
> 
> Martin
> 
> [*] To minimize the transition effort we could introduce a new
> "flexarray" attribute to annotate larger trailing arrays with
> to indicate they are intended to be used as flexible array members.
> But I don't have the impression the pattern is widespread enough
> to justify adding yet another extension.
> 
> [**] Both Clang and ICC have issued a warning for the out-of-
> bounds access below for many releases:
> 
> struct S
> {
>    void (*pf)(void);
>    char a[8];
> };
> 
> void f (struct S *p)
> {
>    p->a[8] = 0;
> }

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-21  7:40             ` Martin Sebor
@ 2019-08-22 22:23               ` Jeff Law
  2019-08-28 21:36                 ` Martin Sebor
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff Law @ 2019-08-22 22:23 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek; +Cc: gcc-patches

On 8/20/19 8:10 PM, Martin Sebor wrote:
> Jeff,
> 
> Please let me know if you agree/disagree and what I need to
> do to advance this work:
> 
>   https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html
For the official record, I agree :-)

jeff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-22 22:23               ` Jeff Law
@ 2019-08-28 21:36                 ` Martin Sebor
  2019-09-03 20:01                   ` Jeff Law
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Sebor @ 2019-08-28 21:36 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek; +Cc: gcc-patches

On 8/22/19 3:31 PM, Jeff Law wrote:
> On 8/20/19 8:10 PM, Martin Sebor wrote:
>> Jeff,
>>
>> Please let me know if you agree/disagree and what I need to
>> do to advance this work:
>>
>>    https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html
> For the official record, I agree :-)

Great! :)

Any comments/suggestions on the patch?

   https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html

Martin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-08-28 21:36                 ` Martin Sebor
@ 2019-09-03 20:01                   ` Jeff Law
  2019-09-23 22:14                     ` Martin Sebor
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff Law @ 2019-09-03 20:01 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek; +Cc: gcc-patches

On 8/28/19 3:12 PM, Martin Sebor wrote:
> On 8/22/19 3:31 PM, Jeff Law wrote:
>> On 8/20/19 8:10 PM, Martin Sebor wrote:
>>> Jeff,
>>>
>>> Please let me know if you agree/disagree and what I need to
>>> do to advance this work:
>>>
>>>    https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html
>> For the official record, I agree :-)
> 
> Great! :)
> 
> Any comments/suggestions on the patch?
> 
>   https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html
> 
> Martin
Yea, they were in an earlier message.  I'll extract the relevant
comments since some we addressed independently:


>>  
>>  
>>  
>> @@ -325,7 +333,7 @@ state_ident_by_name (const char *name, enum insert_option optins)
>>    namlen = strlen (name);
>>    stid =
>>      (struct state_ident_st *) xmalloc (sizeof (struct state_ident_st) +
>> -				       namlen);
>> +				       namlen + 1);
>>    memset (stid, 0, sizeof (struct state_ident_st) + namlen);
>>    strcpy (stid->stid_name, name);
>>    *slot = stid;
> How did you find this goof?
> 
> 
> 
This was more a curiosity than anything.  Nothing we need to change here.

>> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
>> index fc57fb45e3a..582768090ae 100644
>> --- a/gcc/gimple-fold.c
>> +++ b/gcc/gimple-fold.c
>> @@ -1346,6 +1346,10 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
>>  	}
>>      }
>>  
>> +  /* Set if VAL represents the maximum length based on array size (set
>> +     when exact length cannot be determined).  */
>> +  bool maxbound = false;
>> +
>>    if (!val && rkind == SRK_LENRANGE)
>>      {
>>        if (TREE_CODE (arg) == ADDR_EXPR)
>> @@ -1441,6 +1445,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
>>  	      pdata->minlen = ssize_int (0);
>>  	    }
>>  	}
>> +      maxbound = true;
>>      }
>>  
>>    if (!val)
>> @@ -1454,7 +1459,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
>>  	  && tree_int_cst_lt (val, pdata->minlen)))
>>      pdata->minlen = val;
>>  
>> -  if (pdata->maxbound)
>> +  if (pdata->maxbound && TREE_CODE (pdata->maxbound) == INTEGER_CST)
>>      {
>>        /* Adjust the tighter (more optimistic) string length bound
>>  	 if necessary and proceed to adjust the more conservative
> So inside the conditional guarded by the test you're changing above we have:
> 
>      if (TREE_CODE (val) == INTEGER_CST)
>         {
>           if (TREE_CODE (pdata->maxbound) == INTEGER_CST)
>             {
>               if (tree_int_cst_lt (pdata->maxbound, val))
>                 pdata->maxbound = val;
>             }
>           else
>             pdata->maxbound = build_all_ones_cst (size_type_node);
>         }
> 
> Isn't the inner test that pdata->maxbound == INTEGER_CST always true and
> we should remove the test and the else clause?  Does the else clause
> need to be handled elsewhere (I don't see that it would be handled after
> your changes).  Or perhaps it just doesn't matter...
The redundant test of TREE_CODE (pdata->maxbound) == INTEGER_CST is a
bit of a nit, but we might as well clean that up.

I couldn't convince myself whether losing the else clause handling
was correct.

> 
> 
> > @@ -1653,8 +1661,11 @@ get_range_strlen (tree arg, bitmap *visited,
>  
>  /* Try to obtain the range of the lengths of the string(s) referenced
>     by ARG, or the size of the largest array ARG refers to if the range
> -   of lengths cannot be determined, and store all in *PDATA.  ELTSIZE
> -   is the expected size of the string element in bytes: 1 for char and
> +   of lengths cannot be determined, and store all in *PDATA which must
> +   be zero-initialized on input except PDATA->MAXBOUND may be set to
> +   a non-null tree node other than INTEGER_CST to request to have it
> +   set to the length of the longest string in a PHI.  ELTSIZE is
> +   the expected size of the string element in bytes: 1 for char and
Is there any reason we can't just make a clean distinction between input
and output objects in this routine?  As an API this seems awkward at best.
Any thoughts on the API question raised?


The rest are just nits/typos:


> @@ -2862,51 +2865,78 @@ handle_builtin_memset (gimple_stmt_iterator *gsi)
>    return true;
>  }
>  
> -/* Handle a call to memcmp.  We try to handle small comparisons by
> -   converting them to load and compare, and replacing the call to memcmp
> -   with a __builtin_memcmp_eq call where possible.
> -   return true when call is transformed, return false otherwise.  */
> +/* Return a pointer to the first such equality expression if RES is used
> +   only in experessions testing its equality to zero, and null otherwise.  */
s/experessions/expressions/


>  
> -static bool
> -handle_builtin_memcmp (gimple_stmt_iterator *gsi)
> +static gimple*
> +used_only_for_zero_equality (tree res)
Nit.  A space between "gimple" and "*".




> +
> +/* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
> +   the result of 0 == strncmp (A, B, BOUND) (which is the same as strcmp
> +   for s sufficiently large BOUND).  If the result is based on the length
> +   of one string being greater than the longest string that would fit in
> +   the array pointer to by the argument, set *PLEN and *PSIZE to
> +   the corresponding length (or its complement when the string is known
> +   to be at least as long and need not be nul-terminated) and size.
> +   Otherwise return null.  */
s/null/NULL/


> +/* Diagnose pointless calls to strcmp whose result is used in equality
> +   epxpressions that evaluate to a constant due to one argument being
> +   longer than the size of the other.  */
s/epxpressions/expressions/



> +/* Optimize a call to strcmp or strncmp either by folding it to a constant
> +   when possible or by transforming the latter to the former.  Warn about
> +   calls where the length of one argument is greater than the size of
> +   the array to which the other aargument points if the latter's length
> +   is not known.  Return true when the call has been transformed into
> +   another and false otherwise.  */
s/aargument/argument/


>  
> -  unsigned HOST_WIDE_INT var_sizei = 0;
> -  /* try to determine the minimum size of the object pointed by var_string.  */
> -  tree size = determine_min_objsize (var_string);
> +  /* Determine either the length or the size of each of the string
> +     orguments, whichever is available.  */
s/orguments/arguments/

> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-09-03 20:01                   ` Jeff Law
@ 2019-09-23 22:14                     ` Martin Sebor
  2019-10-04 21:15                       ` Jeff Law
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Sebor @ 2019-09-23 22:14 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 8891 bytes --]

On 9/3/19 2:00 PM, Jeff Law wrote:
> On 8/28/19 3:12 PM, Martin Sebor wrote:
>> On 8/22/19 3:31 PM, Jeff Law wrote:
>>> On 8/20/19 8:10 PM, Martin Sebor wrote:
>>>> Jeff,
>>>>
>>>> Please let me know if you agree/disagree and what I need to
>>>> do to advance this work:
>>>>
>>>>     https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html
>>> For the official record, I agree :-)
>>
>> Great! :)
>>
>> Any comments/suggestions on the patch?
>>
>>    https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00643.html
>>
>> Martin
> Yea, they were in an earlier message.  I'll extract the relevant
> comments since some we addressed independently:
> 
> 
>>>   
>>>   
>>>   
>>> @@ -325,7 +333,7 @@ state_ident_by_name (const char *name, enum insert_option optins)
>>>     namlen = strlen (name);
>>>     stid =
>>>       (struct state_ident_st *) xmalloc (sizeof (struct state_ident_st) +
>>> -				       namlen);
>>> +				       namlen + 1);
>>>     memset (stid, 0, sizeof (struct state_ident_st) + namlen);
>>>     strcpy (stid->stid_name, name);
>>>     *slot = stid;
>> How did you find this goof?
>>
>>
>>
> This was more a curiosity than anything.  Nothing we need to change here.

The code is correct as is, I just adjusted the allocated amount to
account for the change to use the zero length trailing array instead
of the [1] kind.  But neither was part of the updated patch (I had
initially posted an outdated version of my patch).
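
For illustration only (this is not the gengtype code), here is
a minimal sketch of why the allocation grows by one byte once
a [1] trailing array becomes a flexible (or zero-length) one.
The names struct ident and ident_new are hypothetical stand-ins
for state_ident_st and state_ident_by_name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for state_ident_st; only the trailing
   member matters for this sketch.  */
struct ident
{
  int kind;
  char name[];   /* Flexible array member: adds nothing to sizeof.  */
};

/* Allocate an identifier holding a copy of NAME.  Returns NULL
   if the allocation fails.  */
struct ident *
ident_new (const char *name)
{
  size_t namlen = strlen (name);

  /* The + 1 makes room for the terminating nul.  With a name[1]
     member that byte would already be counted by sizeof.  */
  struct ident *id = calloc (1, sizeof *id + namlen + 1);
  if (!id)
    return NULL;

  strcpy (id->name, name);
  return id;
}
```

With the [1] form, sizeof (struct ident) + namlen is already
enough for the string and its nul, which is why the original
allocation needed no + 1.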

> 
>>> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
>>> index fc57fb45e3a..582768090ae 100644
>>> --- a/gcc/gimple-fold.c
>>> +++ b/gcc/gimple-fold.c
>>> @@ -1346,6 +1346,10 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
>>>   	}
>>>       }
>>>   
>>> +  /* Set if VAL represents the maximum length based on array size (set
>>> +     when exact length cannot be determined).  */
>>> +  bool maxbound = false;
>>> +
>>>     if (!val && rkind == SRK_LENRANGE)
>>>       {
>>>         if (TREE_CODE (arg) == ADDR_EXPR)
>>> @@ -1441,6 +1445,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
>>>   	      pdata->minlen = ssize_int (0);
>>>   	    }
>>>   	}
>>> +      maxbound = true;
>>>       }
>>>   
>>>     if (!val)
>>> @@ -1454,7 +1459,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
>>>   	  && tree_int_cst_lt (val, pdata->minlen)))
>>>       pdata->minlen = val;
>>>   
>>> -  if (pdata->maxbound)
>>> +  if (pdata->maxbound && TREE_CODE (pdata->maxbound) == INTEGER_CST)
>>>       {
>>>         /* Adjust the tighter (more optimistic) string length bound
>>>   	 if necessary and proceed to adjust the more conservative
>> So inside the conditional guarded by the test you're changing above we have:
>>
>>       if (TREE_CODE (val) == INTEGER_CST)
>>          {
>>            if (TREE_CODE (pdata->maxbound) == INTEGER_CST)
>>              {
>>                if (tree_int_cst_lt (pdata->maxbound, val))
>>                  pdata->maxbound = val;
>>              }
>>            else
>>              pdata->maxbound = build_all_ones_cst (size_type_node);
>>          }
>>
>> Isn't the inner test that pdata->maxbound == INTEGER_CST always true and
>> we should remove the test and the else clause?

Yes, it looks redundant.  I never remember which of these functions
ICE when their argument is not a constant (e.g., tree_int_cst_lt)
and which ones handle it gracefully (e.g., tree_int_cst_equal) so
I often check even when it isn't necessary.  It would be nice if
these closely related APIs had consistent preconditions.

>> Does the else clause
>> need to be handled elsewhere (I don't see that it would be handled after
>> your changes).  Or perhaps it just doesn't matter...

It's handled in the else block, except differently than before.

> The redundant test of TREE_CODE (pdata->maxbound) == INTEGER_CST is a
> bit of a nit, but we might as well clean that up.
> 
> I couldn't convince myself whether losing the else clause handling
> was correct.

MAXBOUND is only non-constant when set that way by client code to
have the function set it to the longest PHI argument, otherwise
it's either an INTEGER_CST or null.  The inner test may be dead
code, a leftover from something earlier.  Either way, MAXBOUND
is only used for diagnostics so it probably doesn't matter.

>>> @@ -1653,8 +1661,11 @@ get_range_strlen (tree arg, bitmap *visited,
>>   
>>   /* Try to obtain the range of the lengths of the string(s) referenced
>>      by ARG, or the size of the largest array ARG refers to if the range
>> -   of lengths cannot be determined, and store all in *PDATA.  ELTSIZE
>> -   is the expected size of the string element in bytes: 1 for char and
>> +   of lengths cannot be determined, and store all in *PDATA which must
>> +   be zero-initialized on input except PDATA->MAXBOUND may be set to
>> +   a non-null tree node other than INTEGER_CST to request to have it
>> +   set to the length of the longest string in a PHI.  ELTSIZE is
>> +   the expected size of the string element in bytes: 1 for char and
> Is there any reason we can't just make a clean distinction between input
> and output objects in this routine?  As an API this seems awkward at best.
> Any thoughts on the API question raised?

I didn't add a new argument because in GCC 9 we got rid of a bunch
of them to make the function less confusing.  The final signature
(before the simplification) had 8 arguments:

    get_range_strlen (tree arg, tree length[2], bitmap *visited,
                      int type, int fuzzy, bool *flexp,
                      unsigned eltsize, tree *nonstr)

Some of them were being tested inconsistently and their effects
were pretty subtle (especially TYPE and FUZZY).  The MAXBOUND
setting is also subtle and used only for warnings so I'd rather
not expose it as an argument that every caller has to worry about
if it isn't necessary.

Longer term, I think a better design than directly accessing
the data members is for c_strlen_data to become a proper C++ class
with accessor functions to hide this stuff behind so these kinds
of "warts" could be hidden out of sight.  Since it will touch all
callers it should be made in a change independent of this one.
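
To make the idea concrete, here is a rough sketch written in C
rather than as the C++ class mentioned above.  It is not part
of the patch.  The struct strlen_data and its three functions
are hypothetical, and plain size_t values stand in for the tree
nodes that c_strlen_data really holds.  The point is only that
an explicit request flag can replace the "non-null node other
than INTEGER_CST" convention.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of the length data.  */
struct strlen_data
{
  size_t minlen;
  size_t maxlen;
  size_t maxbound;
  bool maxbound_requested;  /* Replaces the sentinel tree node.  */
  bool maxbound_known;      /* Set once a bound has been recorded.  */
};

/* Ask for the length of the longest string in a PHI to be recorded.  */
static void
strlen_data_request_maxbound (struct strlen_data *d)
{
  d->maxbound_requested = true;
}

/* Record a candidate bound, keeping the largest one seen so far.
   Ignored unless the caller asked for it.  */
static void
strlen_data_note_bound (struct strlen_data *d, size_t val)
{
  if (!d->maxbound_requested)
    return;
  if (!d->maxbound_known || d->maxbound < val)
    d->maxbound = val;
  d->maxbound_known = true;
}

/* Return the recorded bound, or SIZE_MAX when none was recorded
   (loosely mirroring the SIZE_MAX fallback in get_range_strlen).  */
static size_t
strlen_data_maxbound (const struct strlen_data *d)
{
  return d->maxbound_known ? d->maxbound : SIZE_MAX;
}
```

Callers that don't care about the bound never touch the flag,
so the subtlety stays out of their way.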

So for now I've removed the redundant test and fixed the typos below
(clearly, I need a spell check for code comments).  I also had to
make a few other minor tweaks to adjust to the recent changes on
trunk.  Attached is an updated patch.

Martin

> The rest are just nits/typos:
> 
> 
>> @@ -2862,51 +2865,78 @@ handle_builtin_memset (gimple_stmt_iterator *gsi)
>>     return true;
>>   }
>>   
>> -/* Handle a call to memcmp.  We try to handle small comparisons by
>> -   converting them to load and compare, and replacing the call to memcmp
>> -   with a __builtin_memcmp_eq call where possible.
>> -   return true when call is transformed, return false otherwise.  */
>> +/* Return a pointer to the first such equality expression if RES is used
>> +   only in experessions testing its equality to zero, and null otherwise.  */
> s/experessions/expressions/
> 
> 
>>   
>> -static bool
>> -handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>> +static gimple*
>> +used_only_for_zero_equality (tree res)
> Nit.  A space between "gimple" and "*".
> 
> 
> 
> 
>> +
>> +/* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
>> +   the result of 0 == strncmp (A, B, BOUND) (which is the same as strcmp
>> +   for s sufficiently large BOUND).  If the result is based on the length
>> +   of one string being greater than the longest string that would fit in
>> +   the array pointer to by the argument, set *PLEN and *PSIZE to
>> +   the corresponding length (or its complement when the string is known
>> +   to be at least as long and need not be nul-terminated) and size.
>> +   Otherwise return null.  */
> s/null/NULL/
> 
> 
>> +/* Diagnose pointless calls to strcmp whose result is used in equality
>> +   epxpressions that evaluate to a constant due to one argument being
>> +   longer than the size of the other.  */
> s/epxpressions/expressions/
> 
> 
> 
>> +/* Optimize a call to strcmp or strncmp either by folding it to a constant
>> +   when possible or by transforming the latter to the former.  Warn about
>> +   calls where the length of one argument is greater than the size of
>> +   the array to which the other aargument points if the latter's length
>> +   is not known.  Return true when the call has been transformed into
>> +   another and false otherwise.  */
> s/aargument/argument/
> 
> 
>>   
>> -  unsigned HOST_WIDE_INT var_sizei = 0;
>> -  /* try to determine the minimum size of the object pointed by var_string.  */
>> -  tree size = determine_min_objsize (var_string);
>> +  /* Determine either the length or the size of each of the string
>> +     orguments, whichever is available.  */
> s/orguments/arguments/
> 
>>



[-- Attachment #2: gcc-90879.diff --]
[-- Type: text/x-patch, Size: 68081 bytes --]

PR tree-optimization/90879 - fold zero-equality of strcmp between a longer string and a smaller array

gcc/c-family/ChangeLog:

	PR tree-optimization/90879
	* c.opt (-Wstring-compare): New option.

gcc/testsuite/ChangeLog:

	PR tree-optimization/90879
	* gcc.dg/Wstring-compare-2.c: New test.
	* gcc.dg/Wstring-compare.c: New test.
	* gcc.dg/strcmpopt_3.c: Scan the optimized dump instead of strlen.
	* gcc.dg/strcmpopt_6.c: New test.
	* gcc.dg/strlenopt-65.c: Remove unnecessary declarations, add
	test cases.
	* gcc.dg/strlenopt-66.c: Run it.
	* gcc.dg/strlenopt-68.c: New test.

gcc/ChangeLog:

	PR tree-optimization/90879
	* builtins.c (check_access): Avoid using maxbound when null.
	* calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen change.
	* doc/invoke.texi (-Wstring-compare): Document new warning option.
	* gimple-fold.c (get_range_strlen_tree): Make setting maxbound
	conditional.
	(get_range_strlen): Overwrite initial maxbound when non-null.
	* gimple-ssa-sprintf.c (get_string_length): Adjust to get_range_strlen
	changes.
	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
	(used_only_for_zero_equality): New function.
	(handle_builtin_memcmp): Call it.
	(determine_min_objsize): Return an integer instead of tree.
	(get_len_or_size, strxcmp_eqz_result): New functions.
	(maybe_warn_pointless_strcmp): New function.
	(handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
	between a longer string and a smaller array.
	(get_range_strlen_dynamic): Overwrite initial maxbound when non-null.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 1fd4b88bcac..ff03d425577 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3333,7 +3333,7 @@ check_access (tree exp, tree, tree, tree dstwrite,
 	  c_strlen_data lendata = { };
 	  get_range_strlen (srcstr, &lendata, /* eltsize = */ 1);
 	  range[0] = lendata.minlen;
-	  range[1] = lendata.maxbound;
+	  range[1] = lendata.maxbound ? lendata.maxbound : lendata.maxlen;
 	  if (range[0] && (!maxread || TREE_CODE (maxread) == INTEGER_CST))
 	    {
 	      if (maxread && tree_int_cst_le (maxread, range[0]))
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 88bbe2e2085..a5377384637 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -795,6 +795,12 @@ Wsizeof-array-argument
 C ObjC C++ ObjC++ Var(warn_sizeof_array_argument) Warning Init(1)
 Warn when sizeof is applied on a parameter declared as an array.
 
+Wstring-compare
+C ObjC C++ LTO ObjC++ Warning Var(warn_string_compare) Warning LangEnabledBy(C ObjC C++ ObjC++, Wextra)
+Warn about calls to strcmp and strncmp used in equality expressions that
+are necessarily true or false due to the length of one and size of the other
+argument.
+
 Wstringop-overflow
 C ObjC C++ LTO ObjC++ Warning Alias(Wstringop-overflow=, 2, 0)
 Warn about buffer overflow in string manipulation functions like memcpy
diff --git a/gcc/calls.c b/gcc/calls.c
index 51ad55f15a9..ae904473d0d 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1614,6 +1614,10 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp)
 	    if (!get_attr_nonstring_decl (arg))
 	      {
 		c_strlen_data lendata = { };
+		/* Set MAXBOUND to an arbitrary non-null non-integer
+		   node as a request to have it set to the length of
+		   the longest string in a PHI.  */
+		lendata.maxbound = arg;
 		get_range_strlen (arg, &lendata, /* eltsize = */ 1);
 		maxlen = lendata.maxbound;
 	      }
@@ -1639,6 +1643,10 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp)
 	if (!get_attr_nonstring_decl (arg))
 	  {
 	    c_strlen_data lendata = { };
+	    /* Set MAXBOUND to an arbitrary non-null non-integer
+	       node as a request to have it set to the length of
+	       the longest string in a PHI.  */
+	    lendata.maxbound = arg;
 	    get_range_strlen (arg, &lendata, /* eltsize = */ 1);
 	    maxlen = lendata.maxbound;
 	  }
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 83016a5a8ee..07dffc255f1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -349,6 +349,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wsizeof-pointer-memaccess  -Wsizeof-array-argument @gol
 -Wstack-protector  -Wstack-usage=@var{byte-size}  -Wstrict-aliasing @gol
 -Wstrict-aliasing=n  -Wstrict-overflow  -Wstrict-overflow=@var{n} @gol
+-Wstring-compare @gol
 -Wstringop-overflow=@var{n}  -Wstringop-truncation  -Wsubobject-linkage @gol
 -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}malloc@r{]} @gol
 -Wsuggest-final-types @gol  -Wsuggest-final-methods  -Wsuggest-override @gol
@@ -4482,6 +4483,7 @@ name is still supported, but the newer name is more descriptive.)
 -Wold-style-declaration @r{(C only)}  @gol
 -Woverride-init  @gol
 -Wsign-compare @r{(C only)} @gol
+-Wstring-compare @gol
 -Wredundant-move @r{(only for C++)}  @gol
 -Wtype-limits  @gol
 -Wuninitialized  @gol
@@ -5798,6 +5800,30 @@ comparisons, so this warning level gives a very large number of
 false positives.
 @end table
 
+@item -Wstring-compare
+@opindex Wstring-compare
+@opindex Wno-string-compare
+Warn for calls to @code{strcmp} and @code{strncmp} whose result is
+determined to be either zero or non-zero in tests for such equality
+owing to the length of one argument being greater than the size of
+the array the other argument is stored in (or the bound in the case
+of @code{strncmp}).  Such calls could be mistakes.  For example,
+the call to @code{strcmp} below is diagnosed because its result is
+necessarily non-zero irrespective of the contents of the array @code{a}.
+
+@smallexample
+extern char a[4];
+void f (char *d)
+@{
+  strcpy (d, "string");
+  @dots{}
+  if (0 == strcmp (a, d))   // cannot be true
+    puts ("a and d are the same");
+@}
+@end smallexample
+
+@option{-Wstring-compare} is enabled by @option{-Wextra}.
+
 @item -Wstringop-overflow
 @itemx -Wstringop-overflow=@var{type}
 @opindex Wstringop-overflow
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 8d642de2f67..a085ab2beaf 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1348,6 +1348,10 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	}
     }
 
+  /* Set if VAL represents the maximum length based on array size (set
+     when exact length cannot be determined).  */
+  bool maxbound = false;
+
   if (!val && rkind == SRK_LENRANGE)
     {
       if (TREE_CODE (arg) == ADDR_EXPR)
@@ -1443,6 +1447,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	      pdata->minlen = ssize_int (0);
 	    }
 	}
+      maxbound = true;
     }
 
   if (!val)
@@ -1456,25 +1461,23 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	  && tree_int_cst_lt (val, pdata->minlen)))
     pdata->minlen = val;
 
-  if (pdata->maxbound)
+  if (pdata->maxbound && TREE_CODE (pdata->maxbound) == INTEGER_CST)
     {
       /* Adjust the tighter (more optimistic) string length bound
 	 if necessary and proceed to adjust the more conservative
 	 bound.  */
       if (TREE_CODE (val) == INTEGER_CST)
 	{
-	  if (TREE_CODE (pdata->maxbound) == INTEGER_CST)
-	    {
-	      if (tree_int_cst_lt (pdata->maxbound, val))
-		pdata->maxbound = val;
-	    }
-	  else
-	    pdata->maxbound = build_all_ones_cst (size_type_node);
+	  if (tree_int_cst_lt (pdata->maxbound, val))
+	    pdata->maxbound = val;
 	}
       else
 	pdata->maxbound = val;
     }
-  else
+  else if (pdata->maxbound || maxbound)
+    /* Set PDATA->MAXBOUND only if it either isn't INTEGER_CST or
+       if VAL corresponds to the maximum length determined based
+       on the type of the object.  */
     pdata->maxbound = val;
 
   if (tight_bound)
@@ -1655,8 +1658,11 @@ get_range_strlen (tree arg, bitmap *visited,
 
 /* Try to obtain the range of the lengths of the string(s) referenced
    by ARG, or the size of the largest array ARG refers to if the range
-   of lengths cannot be determined, and store all in *PDATA.  ELTSIZE
-   is the expected size of the string element in bytes: 1 for char and
+   of lengths cannot be determined, and store all in *PDATA which must
+   be zero-initialized on input except PDATA->MAXBOUND may be set to
+   a non-null tree node other than INTEGER_CST to request to have it
+   set to the length of the longest string in a PHI.  ELTSIZE is
+   the expected size of the string element in bytes: 1 for char and
    some power of 2 for wide characters.
    Return true if the range [PDATA->MINLEN, PDATA->MAXLEN] is suitable
    for optimization.  Returning false means that a nonzero PDATA->MINLEN
@@ -1668,6 +1674,7 @@ bool
 get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)
 {
   bitmap visited = NULL;
+  tree maxbound = pdata->maxbound;
 
   if (!get_range_strlen (arg, &visited, SRK_LENRANGE, pdata, eltsize))
     {
@@ -1680,9 +1687,10 @@ get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)
   else if (!pdata->minlen)
     pdata->minlen = ssize_int (0);
 
-  /* Unless its null, leave the more conservative MAXBOUND unchanged.  */
-  if (!pdata->maxbound)
-    pdata->maxbound = pdata->maxlen;
+  /* If it's unchanged from its initial non-null value, set the conservative
+     MAXBOUND to SIZE_MAX.  Otherwise leave it null (if it is null).  */
+  if (maxbound && pdata->maxbound == maxbound)
+    pdata->maxbound = build_all_ones_cst (size_type_node);
 
   if (visited)
     BITMAP_FREE (visited);
diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index b11d7989d5e..b548bbd95e3 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -1974,8 +1974,11 @@ get_string_length (tree str, unsigned eltsize, const vr_values *vr)
   if (!str)
     return fmtresult ();
 
-  /* Try to determine the dynamic string length first.  */
+  /* Try to determine the dynamic string length first.
+     Set MAXBOUND to an arbitrary non-null non-integer node as a request
+     to have it set to the length of the longest string in a PHI.  */
   c_strlen_data lendata = { };
+  lendata.maxbound = str;
   if (eltsize == 1)
     get_range_strlen_dynamic (str, &lendata, vr);
   else
@@ -1988,26 +1991,27 @@ get_string_length (tree str, unsigned eltsize, const vr_values *vr)
       get_range_strlen (str, &lendata, eltsize);
     }
 
-  /* LENDATA.MAXBOUND is null when LENDATA.MIN corresponds to the shortest
-     string referenced by STR.  Otherwise, if it's not equal to .MINLEN it
-     corresponds to the bound of the largest array STR refers to, if known,
-     or it's SIZE_MAX otherwise.  */
+  /* If LENDATA.MAXBOUND is not equal to .MINLEN it corresponds to the bound
+     of the largest array STR refers to, if known, or it's set to SIZE_MAX
+     otherwise.  */
 
   /* Return the default result when nothing is known about the string.  */
-  if (lendata.maxbound)
+  if ((lendata.maxbound && !tree_fits_uhwi_p (lendata.maxbound))
+      || !tree_fits_uhwi_p (lendata.maxlen))
     {
-      if (integer_all_onesp (lendata.maxbound)
-      	  && integer_all_onesp (lendata.maxlen))
-      	return fmtresult ();
-
-      if (!tree_fits_uhwi_p (lendata.maxbound)
-	  || !tree_fits_uhwi_p (lendata.maxlen))
-      	return fmtresult ();
-
-      unsigned HOST_WIDE_INT lenmax = tree_to_uhwi (max_object_size ()) - 2;
-      if (lenmax <= tree_to_uhwi (lendata.maxbound)
-	  && lenmax <= tree_to_uhwi (lendata.maxlen))
-	return fmtresult ();
+      fmtresult res;
+      res.nonstr = lendata.decl;
+      return res;
+    }
+
+  unsigned HOST_WIDE_INT lenmax = tree_to_uhwi (max_object_size ()) - 2;
+  if (integer_zerop (lendata.minlen)
+      && (!lendata.maxbound || lenmax <= tree_to_uhwi (lendata.maxbound))
+      && lenmax <= tree_to_uhwi (lendata.maxlen))
+    {
+      fmtresult res;
+      res.nonstr = lendata.decl;
+      return res;
     }
 
   HOST_WIDE_INT min
@@ -2056,9 +2060,9 @@ get_string_length (tree str, unsigned eltsize, const vr_values *vr)
     {
       /* When the upper bound is unknown (it can be zero or excessive)
 	 set the likely length to the greater of 1.  If MAXBOUND is
-	 set, also reset the length of the lower bound to zero.  */
+	 known, also reset the length of the lower bound to zero.  */
       res.range.likely = res.range.min ? res.range.min : warn_level > 1;
-      if (lendata.maxbound)
+      if (lendata.maxbound && !integer_all_onesp (lendata.maxbound))
 	res.range.min = 0;
     }
 
diff --git a/gcc/testsuite/gcc.dg/Wstring-compare-2.c b/gcc/testsuite/gcc.dg/Wstring-compare-2.c
new file mode 100644
index 00000000000..e6ca2a69999
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstring-compare-2.c
@@ -0,0 +1,127 @@
+/* PR tree-optimization/90879 - fold zero-equality of strcmp between
+   a longer string and a smaller array
+   Test for a warning for strcmp of a longer string against smaller
+   array.
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wstring-compare -Wno-stringop-truncation -ftrack-macro-expansion=0" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* memcpy (void*, const void*, size_t);
+
+extern int strcmp (const char*, const char*);
+extern size_t strlen (const char*);
+extern char* strcpy (char*, const char*);
+extern char* strncpy (char*, const char*, size_t);
+extern int strncmp (const char*, const char*, size_t);
+
+void sink (int, ...);
+#define sink(...) sink (__LINE__, __VA_ARGS__)
+
+
+extern char a1[1], a2[2], a3[3], a4[4], a5[5], a6[6], a7[7], a8[8], a9[9];
+
+#define T(a, b) sink (0 == strcmp (a, b))
+
+
+void test_string_cst (void)
+{
+  const char *s1 = "1", *s2 = "12";
+
+  T (s1, a1);                 // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+  T (s1, a2);
+  T (s1, a3);
+
+  T (a1, s1);                 // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+  T (a2, s1);
+  T (a3, s1);
+
+  T (s2, a1);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 1 evaluates to nonzero" }
+  T (s2, a2);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+  T (s2, a3);
+
+  T (a1, s2);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 1 evaluates to nonzero" }
+  T (a2, s2);                 // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+  T (a3, s2);
+}
+
+
+void test_string_cst_off_cst (void)
+{
+  const char *s1 = "1", *s2 = "12", *s3 = "123", *s4 = "1234";
+
+  T (s1, a2 + 1);              // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+  T (a2 + 1, s1);              // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+
+
+  T (s3 + 1, a2);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+  T (s3 + 1, a3);
+
+  T (s2, a4 + 1);
+  T (s2, a4 + 2);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+
+  T (s4, a4 + 1);             // { dg-warning ".strcmp. of a string of length 4 and an array of size 3 evaluates to nonzero" }
+  T (s3, a5 + 1);
+}
+
+
+/* Use strncpy below rather than memcpy until PR 91183 is resolved.  */
+
+#undef T
+#define T(s, n, a)					\
+  do {							\
+    char arr[32];					\
+    sink (arr);						\
+    strncpy (arr, s, n < 0 ? strlen (s) + 1 : n);	\
+    sink (0 == strcmp (arr, a));			\
+  } while (0)
+
+void test_string_exact_length (void)
+{
+  const char *s1 = "1", *s2 = "12";
+
+  T (s1, -1, a1);             // { dg-warning ".strcmp. of a string of length 1 and an array of size 1 evaluates to nonzero" }
+  T (s1, -1, a2);
+  T (s1, -1, a3);
+
+  T (s2, -1, a1);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 1 evaluates to nonzero" }
+  T (s2, -1, a2);             // { dg-warning ".strcmp. of a string of length 2 and an array of size 2 evaluates to nonzero" }
+  T (s2, -1, a3);
+}
+
+
+void test_string_min_length (void)
+{
+  const char *s1 = "1", *s2 = "12";
+
+  T (s1,  1, a1);             // { dg-warning ".strcmp. of a string of length 1 or more and an array of size 1 evaluates to nonzero" }
+  T (s1,  1, a2);
+  T (s1,  1, a3);
+
+  T (s2,  2, a1);             // { dg-warning ".strcmp. of a string of length 2 or more and an array of size 1 evaluates to nonzero" }
+  T (s2,  2, a2);             // { dg-warning ".strcmp. of a string of length 2 or more and an array of size 2 evaluates to nonzero" }
+  T (s2,  2, a3);
+}
+
+
+int test_strncmp_str_lit_var (const char *s, long n)
+{
+  if (strncmp (s, "123456", n) == 0)    // { dg-bogus "\\\[-Wstring-compare" }
+    return 1;
+
+  return 0;
+}
+
+int test_strlen_strncmp_str_lit_var (const char *s, long n)
+{
+  if (__builtin_strlen (s) < n)
+    return -1;
+
+  if (n == 6)
+    if (strncmp (s, "123456", n) == 0)  // { dg-bogus "\\\[-Wstring-compare" }
+      return 1;
+
+  return 0;
+}
+
+
diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c b/gcc/testsuite/gcc.dg/Wstring-compare.c
new file mode 100644
index 00000000000..0ca492db0ab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstring-compare.c
@@ -0,0 +1,181 @@
+/* PR tree-optimization/90879 - fold zero-equality of strcmp between
+   a longer string and a smaller array
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wextra -ftrack-macro-expansion=0" } */
+
+#include "strlenopt.h"
+
+#define T(a, b) sink (0 == strcmp (a, b), a, b)
+
+void sink (int, ...);
+
+struct S { char a4[4], c; };
+
+extern char a4[4];
+extern char a5[5];
+extern char b4[4];
+
+/* Verify that comparison of string literals with arrays with unknown
+   content but size that prevents them from comparing equal is diagnosed.  */
+
+void strcmp_array_lit (void)
+{
+  if (strcmp (a4, "1234"))  // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+                            // { dg-bogus "in this expression" "unwanted note" { target *-*-* } .-1 }
+    sink (0, a4);
+
+  int cmp;
+  cmp = strcmp (a4, "1234");  // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+  if (cmp)                  // { dg-message "in this expression" }
+    sink (0, a4);
+
+  T (a4, "4321");           // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero " }
+  T (a4, "12345");          // { dg-warning "length 5 and an array of size 4 " }
+  T (a4, "123456");         // { dg-warning "length 6 and an array of size 4 " }
+  T ("1234", a4);           // { dg-warning "length 4 and an array of size 4 " }
+  T ("12345", a4);          // { dg-warning "length 5 and an array of size 4 " }
+  T ("123456", a4);         // { dg-warning "length 6 and an array of size 4 " }
+}
+
+
+void strcmp_array_pstr (void)
+{
+  const char *s4 = "1234";
+
+  {
+    if (strcmp (a4, s4))    // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+                            // { dg-bogus "in this expression" "unwanted note" { target *-*-* } .-1 }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  {
+    int c;
+    c = strcmp (a4, s4);    // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+    if (c)                  // { dg-message "in this expression" }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  const char *t4 = "4321";
+  const char *s5 = "12345";
+  const char *s6 = "123456";
+
+  T (a4, t4);               // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero " }
+  T (a4, s5);               // { dg-warning "length 5 and an array of size 4 " }
+  T (a4, s6);               // { dg-warning "length 6 and an array of size 4 " }
+  T (s4, a4);               // { dg-warning "length 4 and an array of size 4 " }
+  T (s5, a4);               // { dg-warning "length 5 and an array of size 4 " }
+  T (s6, a4);               // { dg-warning "length 6 and an array of size 4 " }
+}
+
+
+void strcmp_array_cond_pstr (int i)
+{
+  const char *s4 = i ? "1234" : "4321";
+  T (a4, s4);               // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero " }
+  T (a5, s4);
+}
+
+void strcmp_array_copy (void)
+{
+  char s[8];
+
+  {
+    strcpy (s, "1234");
+    if (strcmp (a4, s))     // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+                            // { dg-bogus "in this expression" "unwanted note" { target *-*-* } .-1 }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  {
+    strcpy (s, "1234");
+
+    int c;
+    c = strcmp (a4, s);     // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero" }
+    if (c)                  // { dg-message "in this expression" }
+      sink (1, a4);
+    else
+      sink (0, a4);
+  }
+
+  strcpy (s, "4321");
+  T (a4, s);                // { dg-warning "'strcmp' of a string of length 4 and an array of size 4 evaluates to nonzero " }
+  strcpy (s, "12345");
+  T (a4, s);                // { dg-warning "length 5 and an array of size 4 " }
+  strcpy (s, "123456");
+  T (a4, s);                // { dg-warning "length 6 and an array of size 4 " }
+  strcpy (s, "4321");
+  T (s, a4);                // { dg-warning "length 4 and an array of size 4 " }
+  strcpy (s, "54321");
+  T (s, a4);                // { dg-warning "length 5 and an array of size 4 " }
+  strcpy (s, "654321");
+  T (s, a4);                // { dg-warning "length 6 and an array of size 4 " }
+}
+
+
+void strcmp_member_array_lit (const struct S *p)
+{
+  T (p->a4, "1234");        // { dg-warning "length 4 and an array of size 4 " }
+}
+
+
+#undef T
+#define T(a, b, n) sink (0 == strncmp (a, b, n), a, b)
+
+void strncmp_array_lit (void)
+{
+  if (strncmp (a4, "12345", 5))   // { dg-warning "'strncmp' of a string of length 5, an array of size 4 and bound of 5 evaluates to nonzero" }
+                                  // { dg-bogus "in this expression" "unwanted note" { target *-*-* } .-1 }
+    sink (0, a4);
+
+  int cmp;
+  cmp = strncmp (a4, "54321", 5);   // { dg-warning "'strncmp' of a string of length 5, an array of size 4 and bound of 5 evaluates to nonzero" }
+  if (cmp)                          // { dg-message "in this expression" }
+    sink (0, a4);
+
+  // Verify no warning when the bound is the same as the array size.
+  T (a4, "4321", 4);
+  T (a4, "654321", 4);
+
+  T (a4, "12345", 5);       // { dg-warning "length 5, an array of size 4 and bound of 5 " }
+  T (a4, "123456", 6);      // { dg-warning "length 6, an array of size 4 and bound of 6" }
+
+  T ("1234", a4, 4);
+  T ("12345", a4, 4);
+
+  T ("12345", a4, 5);       // { dg-warning "length 5, an array of size 4 and bound of 5 " }
+  T ("123456", a4, 6);      // { dg-warning "length 6, an array of size 4 and bound of 6 " }
+}
+
+
+void strncmp_strarray_copy (void)
+{
+  {
+    char a[] = "1234";
+    char b[6];
+    strcpy (b, "12345");
+    if (strncmp (a, b, 5))  // { dg-warning "'strncmp' of strings of length 4 and 5 and bound of 5 evaluates to nonzero" }
+                            // { dg-bogus "in this expression" "unwanted note" { target *-*-* } .-1 }
+      sink (0, a, b);
+  }
+
+  {
+    char a[] = "4321";
+    char b[6];
+    strcpy (b, "54321");
+    int cmp;
+    cmp = strncmp (a, b, 5);  // { dg-warning "'strncmp' of strings of length 4 and 5 and bound of 5 evaluates to nonzero" }
+    if (cmp)                  // { dg-message "in this expression" }
+      sink (0, a, b);
+  }
+
+  strcpy (a4, "abc");
+  T (a4, "54321", 5);       // { dg-warning "'strncmp' of strings of length 3 and 5 and bound of 5 evaluates to nonzero " }
+}
+
+
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
index 571646ce001..35941bee575 100644
--- a/gcc/testsuite/gcc.dg/strcmpopt_3.c
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -1,31 +1,31 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -fdump-tree-strlen" } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
 
-__attribute__ ((noinline)) int 
-f1 (void) 
-{ 
+__attribute__ ((noinline)) int
+f1 (void)
+{
   char *s0= "abcd";
   char s[8];
   __builtin_strcpy (s, s0);
-  return __builtin_strcmp(s, "abc") != 0; 
+  return __builtin_strcmp (s, "abc") != 0;
 }
 
 __attribute__ ((noinline)) int
-f2 (void) 
-{ 
+f2 (void)
+{
   char *s0 = "ab";
   char s[8];
   __builtin_strcpy (s, s0);
-  return __builtin_strcmp("abc", s) != 0; 
+  return __builtin_strcmp ("abc", s) != 0;
 }
 
 int main (void)
 {
-  if (f1 () != 1 
+  if (f1 () != 1
       || f2 () != 1)
     __builtin_abort ();
 
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen1" } } */
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_6.c b/gcc/testsuite/gcc.dg/strcmpopt_6.c
new file mode 100644
index 00000000000..cb99294e5fa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_6.c
@@ -0,0 +1,207 @@
+/* Verify that strcmp and strncmp calls with mixed constant and
+   non-constant strings are evaluated correctly.
+   { dg-do run }
+   { dg-options "-O2" } */
+
+#include "strlenopt.h"
+
+#define A(expr)                                                 \
+  ((expr)                                                       \
+   ? (void)0                                                    \
+   : (__builtin_printf ("assertion failed on line %i: %s\n",    \
+                        __LINE__, #expr),                       \
+      __builtin_abort ()))
+
+__attribute__ ((noclone, noinline)) int
+test_strlen_gt2_strcmp_abcd (const char *s)
+{
+  if (strlen (s) < 3)
+    return -1;
+
+  return strcmp (s, "abcd") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strlen_lt6_strcmp_abcd (const char *s)
+{
+  if (strlen (s) > 5)
+    return -1;
+
+  return strcmp (s, "abcd") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_strcmp_abc (const char *s)
+{
+  char a[4];
+  strcpy (a, s);
+  return strcmp (a, "abc") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_abc_strcmp (const char *s)
+{
+  char a[4], b[6];
+  strcpy (a, "abc");
+  strcpy (b, s);
+  return strcmp (a, b) == 0;
+}
+
+/* Exercise strcmp of two strings between 1 and 3 characters long
+   stored in arrays of the same known size.  */
+char ga4[4], gb4[4];
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strcmp_same_size_arrays (void)
+{
+  ga4[0] = gb4[0] = 'x';
+  ga4[3] = gb4[3] = '\0';
+  return strcmp (ga4, gb4) == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strncmp_bound_2_same_size_arrays (void)
+{
+  ga4[0] = gb4[0] = 'x';
+  ga4[3] = gb4[3] = '\0';
+  return strncmp (ga4, gb4, 2) == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strncmp_bound_equal_same_size_arrays (void)
+{
+  ga4[0] = gb4[0] = 'x';
+  ga4[3] = gb4[3] = '\0';
+  return strncmp (ga4, gb4, 4) == 0;
+}
+
+/* Exercise strcmp of two strings between 0 and 3 characters long
+   stored in arrays of the same known size.  */
+
+__attribute__ ((noclone, noinline)) int
+test_nulterm_strcmp_same_size_arrays (void)
+{
+  ga4[3] = gb4[3] = '\0';
+  return strcmp (ga4, gb4) == 0;
+}
+
+/* Exercise strcmp of two strings between 1 and 3 and 1 and 4 characters
+   long, respectively, stored in arrays of known but different sizes.  */
+char gc5[5];
+
+__attribute__ ((noclone, noinline)) int
+test_store_0_nulterm_strcmp_arrays (void)
+{
+  ga4[0] = gc5[0] = 'x';
+  ga4[3] = gc5[4] = '\0';
+  return strcmp (ga4, gc5) == 0;
+}
+
+/* Exercise strcmp of two strings between 0 and 3 and 1 and 4 characters
+   long, respectively, stored in arrays of known but different sizes.  */
+
+__attribute__ ((noclone, noinline)) int
+test_nulterm_strcmp_arrays (void)
+{
+  ga4[3] = gc5[4] = '\0';
+  return strcmp (ga4, gc5) == 0;
+}
+
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_strncmp_abcd (const char *s)
+{
+  char a[6];
+  strcpy (a, s);
+  return strcmp (a, "abcd") == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_abcd_strncmp_3 (const char *s)
+{
+  char a[6], b[8];
+  strcpy (a, "abcd");
+  strcpy (b, s);
+  return strncmp (a, b, 3) == 0;
+}
+
+__attribute__ ((noclone, noinline)) int
+test_strcpy_abcd_strncmp_4 (const char *s)
+{
+  char a[6], b[8];
+  strcpy (a, "abcd");
+  strcpy (b, s);
+  return strncmp (a, b, 4) == 0;
+}
+
+
+int main (void)
+{
+  test_strlen_gt2_strcmp_abcd ("abcd");
+  test_strlen_lt6_strcmp_abcd ("abcd");
+
+  A (0 == test_strcpy_strcmp_abc ("ab"));
+  A (0 != test_strcpy_strcmp_abc ("abc"));
+  A (0 == test_strcpy_strcmp_abc ("abcd"));
+
+  A (0 == test_strcpy_abc_strcmp ("ab"));
+  A (0 != test_strcpy_abc_strcmp ("abc"));
+  A (0 == test_strcpy_abc_strcmp ("abcd"));
+
+  strcpy (ga4, "abc"); strcpy (gb4, "abd");
+  A (0 == test_store_0_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abd"); strcpy (gb4, "abc");
+  A (0 == test_store_0_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_store_0_nulterm_strcmp_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gb4, "acd");
+  A (0 == test_store_0_nulterm_strncmp_bound_2_same_size_arrays ());
+  strcpy (ga4, "acd"); strcpy (gb4, "abc");
+  A (0 == test_store_0_nulterm_strncmp_bound_2_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_store_0_nulterm_strncmp_bound_2_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gb4, "abd");
+  A (0 == test_store_0_nulterm_strncmp_bound_equal_same_size_arrays ());
+  strcpy (ga4, "abd"); strcpy (gb4, "abc");
+  A (0 == test_store_0_nulterm_strncmp_bound_equal_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_store_0_nulterm_strncmp_bound_equal_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gb4, "abd");
+  A (0 == test_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abd"); strcpy (gb4, "abc");
+  A (0 == test_nulterm_strcmp_same_size_arrays ());
+  strcpy (ga4, "abc"); strcpy (gb4, "abc");
+  A (0 != test_nulterm_strcmp_same_size_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gc5, "abcd");
+  A (0 == test_store_0_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abd"); strcpy (gc5, "abcd");
+  A (0 == test_store_0_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abc"); strcpy (gc5, "abc");
+  A (0 != test_store_0_nulterm_strcmp_arrays ());
+
+  strcpy (ga4, "abc"); strcpy (gc5, "abcd");
+  A (0 == test_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abd"); strcpy (gc5, "abc");
+  A (0 == test_nulterm_strcmp_arrays ());
+  strcpy (ga4, "abc"); strcpy (gc5, "abc");
+  A (0 != test_nulterm_strcmp_arrays ());
+
+  A (0 == test_strcpy_strncmp_abcd ("ab"));
+  A (0 == test_strcpy_strncmp_abcd ("abc"));
+  A (0 != test_strcpy_strncmp_abcd ("abcd"));
+  A (0 == test_strcpy_strncmp_abcd ("abcde"));
+
+  A (0 == test_strcpy_abcd_strncmp_3 ("ab"));
+  A (0 != test_strcpy_abcd_strncmp_3 ("abc"));
+  A (0 != test_strcpy_abcd_strncmp_3 ("abcd"));
+  A (0 != test_strcpy_abcd_strncmp_3 ("abcde"));
+
+  A (0 == test_strcpy_abcd_strncmp_4 ("ab"));
+  A (0 == test_strcpy_abcd_strncmp_4 ("abc"));
+  A (0 != test_strcpy_abcd_strncmp_4 ("abcd"));
+  A (0 != test_strcpy_abcd_strncmp_4 ("abcde"));
+}
diff --git a/gcc/testsuite/gcc.dg/strlenopt-65.c b/gcc/testsuite/gcc.dg/strlenopt-65.c
index a34d178faa1..521d7ac2b42 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-65.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-65.c
@@ -1,17 +1,10 @@
 /* PRE tree-optimization/90626 - fold strcmp(a, b) == 0 to zero when
    one string length is exact and the other is unequal
    { dg-do compile }
-   { dg-options "-O2 -Wall -fdump-tree-optimized" } */
+   { dg-options "-O2 -Wall -Wno-string-compare -fdump-tree-optimized -ftrack-macro-expansion=0" } */
 
 #include "strlenopt.h"
 
-typedef __SIZE_TYPE__ size_t;
-
-extern void abort (void);
-extern void* memcpy (void *, const void *, size_t);
-extern int strcmp (const char *, const char *);
-extern int strncmp (const char *, const char *, size_t);
-
 #define CAT(x, y) x ## y
 #define CONCAT(x, y) CAT (x, y)
 #define FAILNAME(name) CONCAT (call_ ## name ##_on_line_, __LINE__)
@@ -142,21 +135,45 @@ void test_strcmp_keep (const char *s, const char *t)
 #undef CMPFUNC
 #define CMPFUNC(a, b, dummy) strcmp (a, b)
 
-  KEEP ("1", "1", a, b, -1);
+  KEEP ("123", "123\0", a, b, /* bnd = */ -1);
+  KEEP ("123\0", "123", a, b, -1);
+
+  {
+    char a[8], b[8];
+    sink (a, b);
+    strcpy (a, s);
+    strcpy (b, t);
+    TEST_KEEP (0 == strcmp (a, b));
+  }
+}
+
+
+void test_strncmp_keep (const char *s, const char *t)
+{
+#undef CMPFUNC
+#define CMPFUNC(a, b, n) strncmp (a, b, n)
+
+  KEEP ("1", "1", a, b, 2);
 
-  KEEP ("1\0", "1", a, b, -1);
-  KEEP ("1",   "1\0", a, b, -1);
+  KEEP ("1\0", "1", a, b, 2);
+  KEEP ("1",   "1\0", a, b, 2);
 
-  KEEP ("12\0", "12", a, b, -1);
-  KEEP ("12",   "12\0", a, b, -1);
+  KEEP ("12\0", "12", a, b, 2);
+  KEEP ("12",   "12\0", a, b, 2);
 
-  KEEP ("111\0", "111", a, b, -1);
-  KEEP ("112", "112\0", a, b, -1);
+  KEEP ("111\0", "111", a, b, 3);
+  KEEP ("112", "112\0", a, b, 3);
 
-  KEEP (s, t, a, b, -1);
+  {
+    char a[8], b[8];
+    sink (a, b);
+    strcpy (a, s);
+    strcpy (b, t);
+    TEST_KEEP (0 == strncmp (a, b, sizeof a));
+  }
 }
 
 /* { dg-final { scan-tree-dump-times "call_in_true_branch_not_eliminated_" 0 "optimized" } }
 
-   { dg-final { scan-tree-dump-times "call_made_in_true_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 8 "optimized" } }
-   { dg-final { scan-tree-dump-times "call_made_in_false_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 8 "optimized" } } */
+   { dg-final { scan-tree-dump-times "call_made_in_true_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 11 "optimized" } }
+   { dg-final { scan-tree-dump-times "call_made_in_false_branch_on_line_1\[0-9\]\[0-9\]\[0-9\]" 11 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-66.c b/gcc/testsuite/gcc.dg/strlenopt-66.c
index 5dc10a07d3d..4ba31a845b0 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-66.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-66.c
@@ -1,6 +1,6 @@
 /* PRE tree-optimization/90626 - fold strcmp(a, b) == 0 to zero when
    one string length is exact and the other is unequal
-   { dg-do compile }
+   { dg-do run }
    { dg-options "-O2 -Wall -fdump-tree-optimized" } */
 
 #include "strlenopt.h"
@@ -65,8 +65,44 @@ test_strncmp (void)
   A (0 <  strncmp (b, a, 5));
 }
 
+
+__attribute__ ((noclone, noinline, noipa)) void
+test_strncmp_a4_cond_s5_s2_2 (const char *s, int i)
+{
+  char a4[4];
+  strcpy (a4, s);
+  A (0 == strncmp (a4, i ? "12345" : "12", 2));
+}
+
+
+__attribute__ ((noclone, noinline, noipa)) void
+test_strncmp_a4_cond_a5_s2_5 (const char *s, const char *t, int i)
+{
+  char a4[4], a5[5];
+  strcpy (a4, s);
+  strcpy (a5, t);
+  A (0 == strncmp (a4, i ? a5 : "12", 5));
+}
+
+__attribute__ ((noclone, noinline, noipa)) void
+test_strncmp_a4_cond_a5_a3_n (const char *s1, const char *s2, const char *s3,
+			      int i, unsigned n)
+{
+  char a3[3], a4[4], a5[5];
+  strcpy (a3, s1);
+  strcpy (a4, s2);
+  strcpy (a5, s3);
+  A (0 == strncmp (a4, i ? a5 : a3, n));
+}
+
+
 int main (void)
 {
   test_strcmp ();
   test_strncmp ();
+  test_strncmp_a4_cond_s5_s2_2 ("12", 0);
+  test_strncmp_a4_cond_a5_s2_5 ("12", "1234", 0);
+
+  test_strncmp_a4_cond_a5_a3_n ("12", "123", "1234", 0, 2);
+  test_strncmp_a4_cond_a5_a3_n ("123", "12", "12", 1, 3);
 }
diff --git a/gcc/testsuite/gcc.dg/strlenopt-69.c b/gcc/testsuite/gcc.dg/strlenopt-69.c
new file mode 100644
index 00000000000..46ceb9ddb05
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-69.c
@@ -0,0 +1,126 @@
+/* PR tree-optimization/90879 - fold zero-equality of strcmp between
+   a longer string and a smaller array
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wno-string-compare -fdump-tree-optimized -ftrack-macro-expansion=0" } */
+
+#include "strlenopt.h"
+
+#define A(expr)                                                 \
+  ((expr)                                                       \
+   ? (void)0                                                    \
+   : (__builtin_printf ("assertion failed on line %i: %s\n",    \
+                        __LINE__, #expr),                       \
+      __builtin_abort ()))
+
+void clobber (void*, ...);
+
+struct S { char a4[4], c; };
+
+extern char a4[4];
+extern char b4[4];
+
+/* Verify that comparison of string literals with arrays with unknown
+   content but size that prevents them from comparing equal is folded
+   to a constant.  */
+
+void test_array_lit (void)
+{
+  A (strcmp (a4, "1234")); clobber (a4);
+  A (strcmp (a4, "12345")); clobber (a4);
+  A (strcmp (a4, "123456")); clobber (a4);
+  A (strcmp ("1234", a4)); clobber (a4);
+  A (strcmp ("12345", a4)); clobber (a4);
+  A (strcmp ("123456", a4)); clobber (a4);
+}
+
+void test_memarray_lit (struct S *p)
+{
+  A (strcmp (p->a4, "1234"));
+  A (strcmp (p->a4, "12345"));
+  A (strcmp (p->a4, "123456"));
+
+  A (strcmp ("1234", p->a4));
+  A (strcmp ("12345", p->a4));
+  A (strcmp ("123456", p->a4));
+}
+
+/* Verify that the equality of empty strings is folded.  */
+
+void test_empty_string (void)
+{
+  A (0 == strcmp ("", ""));
+
+  *a4 = '\0';
+  A (0 == strcmp (a4, ""));
+  A (0 == strcmp ("", a4));
+  A (0 == strcmp (a4, a4));
+
+  char s[8] = "";
+  A (0 == strcmp (a4, s));
+
+  a4[1] = '\0';
+  b4[1] = '\0';
+  A (0 == strcmp (a4 + 1, b4 + 1));
+
+  a4[2] = '\0';
+  b4[2] = '\0';
+  A (0 == strcmp (&a4[2], &b4[2]));
+
+  clobber (a4, b4);
+
+  memset (a4, 0, sizeof a4);
+  memset (b4, 0, sizeof b4);
+  A (0 == strcmp (a4, b4));
+}
+
+/* Verify that comparison of dynamically created strings with unknown
+   arrays is folded.  */
+
+void test_array_copy (void)
+{
+  char s[8];
+  strcpy (s, "1234");
+  A (strcmp (a4, s));
+
+  strcpy (s, "12345");
+  A (strlen (s) == 5);
+  A (strcmp (a4, s)); clobber (a4);
+
+  strcpy (s, "123456");
+  A (strcmp (a4, s)); clobber (a4);
+
+  strcpy (s, "1234");
+  A (strcmp (s, a4)); clobber (a4);
+
+  strcpy (s, "12345");
+  A (strcmp (s, a4)); clobber (a4);
+
+  strcpy (s, "123456");
+  A (strcmp (s, a4)); clobber (a4);
+}
+
+
+void test_array_bounded (void)
+{
+  A (strncmp (a4, "12345", 5)); clobber (a4);
+  A (strncmp ("54321", a4, 5)); clobber (a4);
+
+  A (strncmp (a4, "123456", 5)); clobber (a4);
+  A (strncmp ("654321", a4, 5)); clobber (a4);
+}
+
+void test_array_copy_bounded (void)
+{
+  char s[8];
+  strcpy (s, "12345");
+  A (strncmp (a4, s, 5)); clobber (a4);
+  strcpy (s, "54321");
+  A (strncmp (s, a4, 5)); clobber (a4);
+
+  strcpy (s, "123456");
+  A (strncmp (a4, s, 5)); clobber (a4);
+  strcpy (s, "654321");
+  A (strncmp (s, a4, 5)); clobber (a4);
+}
+
+/* { dg-final { scan-tree-dump-not "abort|strcmp|strncmp" "optimized" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 5e1054be48e..0fba8cd436a 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -896,7 +896,8 @@ get_range_strlen_dynamic (tree src, c_strlen_data *pdata, bitmap *visited,
 
 		      if (!argdata.minlen
 			  || (integer_zerop (argdata.minlen)
-			      && integer_all_onesp (argdata.maxbound)
+			      && (!argdata.maxbound
+				  || integer_all_onesp (argdata.maxbound))
 			      && integer_all_onesp (argdata.maxlen)))
 			{
 			  /* Set the upper bound of the length to unbounded.  */
@@ -910,11 +911,14 @@ get_range_strlen_dynamic (tree src, c_strlen_data *pdata, bitmap *visited,
 			  || tree_int_cst_lt (argdata.minlen, pdata->minlen))
 			pdata->minlen = argdata.minlen;
 		      if (!pdata->maxlen
-			  || tree_int_cst_lt (pdata->maxlen, argdata.maxlen))
+			  || (argdata.maxlen
+			      && tree_int_cst_lt (pdata->maxlen, argdata.maxlen)))
 			pdata->maxlen = argdata.maxlen;
 		      if (!pdata->maxbound
-			  || (tree_int_cst_lt (pdata->maxbound,
-					       argdata.maxbound)
+			  || TREE_CODE (pdata->maxbound) != INTEGER_CST
+			  || (argdata.maxbound
+			      && tree_int_cst_lt (pdata->maxbound,
+						  argdata.maxbound)
 			      && !integer_all_onesp (argdata.maxbound)))
 			pdata->maxbound = argdata.maxbound;
 		    }
@@ -1007,11 +1011,19 @@ get_range_strlen_dynamic (tree src, c_strlen_data *pdata, bitmap *visited,
 	      pdata->maxlen = build_all_ones_cst (size_type_node);
 	    }
 	}
-      else
+      else if (TREE_CODE (pdata->minlen) == INTEGER_CST)
 	{
 	  pdata->maxlen = pdata->minlen;
 	  pdata->maxbound = pdata->minlen;
 	}
+      else
+	{
+	  /* For PDATA->MINLEN that's a non-constant expression such
+	     as PLUS_EXPR whose value range is unknown, set the bounds
+	     to zero and SIZE_MAX.  */
+	  pdata->minlen = build_zero_cst (size_type_node);
+	  pdata->maxlen = build_all_ones_cst (size_type_node);
+	}
 
       return true;
     }
@@ -1032,6 +1044,7 @@ get_range_strlen_dynamic (tree src, c_strlen_data *pdata,
 			  const vr_values *rvals)
 {
   bitmap visited = NULL;
+  tree maxbound = pdata->maxbound;
 
   unsigned limit = PARAM_VALUE (PARAM_SSA_NAME_DEF_CHAIN_LIMIT);
   if (!get_range_strlen_dynamic (src, pdata, &visited, rvals, &limit))
@@ -1045,6 +1058,11 @@ get_range_strlen_dynamic (tree src, c_strlen_data *pdata,
   else if (!pdata->minlen)
     pdata->minlen = ssize_int (0);
 
+  /* If it's unchanged from its initial non-null value, set the conservative
+     MAXBOUND to SIZE_MAX.  Otherwise leave it null (if it is null).  */
+  if (maxbound && pdata->maxbound == maxbound)
+    pdata->maxbound = build_all_ones_cst (size_type_node);
+
   if (visited)
     BITMAP_FREE (visited);
 }
@@ -2444,6 +2462,9 @@ maybe_diag_stxncpy_trunc (gimple_stmt_iterator gsi, tree src, tree cnt)
   else
     {
       c_strlen_data lendata = { };
+      /* Set MAXBOUND to an arbitrary non-null non-integer node as a request
+	 to have it set to the length of the longest string in a PHI.  */
+      lendata.maxbound = src;
       get_range_strlen (src, &lendata, /* eltsize = */1);
       if (TREE_CODE (lendata.minlen) == INTEGER_CST
 	  && TREE_CODE (lendata.maxbound) == INTEGER_CST)
@@ -3216,51 +3237,78 @@ handle_builtin_memset (gimple_stmt_iterator *gsi)
   return true;
 }
 
-/* Handle a call to memcmp.  We try to handle small comparisons by
-   converting them to load and compare, and replacing the call to memcmp
-   with a __builtin_memcmp_eq call where possible.
-   return true when call is transformed, return false otherwise.  */
+/* Return a pointer to the first such equality expression if RES is used
+   only in expressions testing its equality to zero, and null otherwise.  */
 
-static bool
-handle_builtin_memcmp (gimple_stmt_iterator *gsi)
+static gimple *
+used_only_for_zero_equality (tree res)
 {
-  gcall *stmt2 = as_a <gcall *> (gsi_stmt (*gsi));
-  tree res = gimple_call_lhs (stmt2);
-  tree arg1 = gimple_call_arg (stmt2, 0);
-  tree arg2 = gimple_call_arg (stmt2, 1);
-  tree len = gimple_call_arg (stmt2, 2);
-  unsigned HOST_WIDE_INT leni;
+  gimple *first_use = NULL;
+
   use_operand_p use_p;
   imm_use_iterator iter;
 
-  if (!res)
-    return false;
-
   FOR_EACH_IMM_USE_FAST (use_p, iter, res)
     {
-      gimple *ustmt = USE_STMT (use_p);
+      gimple *use_stmt = USE_STMT (use_p);
 
-      if (is_gimple_debug (ustmt))
-	continue;
-      if (gimple_code (ustmt) == GIMPLE_ASSIGN)
+      if (is_gimple_debug (use_stmt))
+        continue;
+      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
 	{
-	  gassign *asgn = as_a <gassign *> (ustmt);
-	  tree_code code = gimple_assign_rhs_code (asgn);
-	  if ((code != EQ_EXPR && code != NE_EXPR)
-	      || !integer_zerop (gimple_assign_rhs2 (asgn)))
-	    return false;
+	  tree_code code = gimple_assign_rhs_code (use_stmt);
+	  if (code == COND_EXPR)
+	    {
+	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
+	      if ((TREE_CODE (cond_expr) != EQ_EXPR
+		   && (TREE_CODE (cond_expr) != NE_EXPR))
+		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
+		return NULL;
+	    }
+	  else if (code == EQ_EXPR || code == NE_EXPR)
+	    {
+	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
+		return NULL;
+            }
+	  else
+	    return NULL;
 	}
-      else if (gimple_code (ustmt) == GIMPLE_COND)
+      else if (gimple_code (use_stmt) == GIMPLE_COND)
 	{
-	  tree_code code = gimple_cond_code (ustmt);
+	  tree_code code = gimple_cond_code (use_stmt);
 	  if ((code != EQ_EXPR && code != NE_EXPR)
-	      || !integer_zerop (gimple_cond_rhs (ustmt)))
-	    return false;
+	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
+	    return NULL;
 	}
       else
-	return false;
+        return NULL;
+
+      if (!first_use)
+	first_use = use_stmt;
     }
 
+  return first_use;
+}
+
+/* Handle a call to memcmp.  We try to handle small comparisons by
+   converting them to load and compare, and replacing the call to memcmp
+   with a __builtin_memcmp_eq call where possible.
+   return true when call is transformed, return false otherwise.  */
+
+static bool
+handle_builtin_memcmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree res = gimple_call_lhs (stmt);
+
+  if (!res || !used_only_for_zero_equality (res))
+    return false;
+
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  tree len = gimple_call_arg (stmt, 2);
+  unsigned HOST_WIDE_INT leni;
+
   if (tree_fits_uhwi_p (len)
       && (leni = tree_to_uhwi (len)) <= GET_MODE_SIZE (word_mode)
       && pow2p_hwi (leni))
@@ -3273,7 +3321,7 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
       if (int_mode_for_size (leni, 1).exists (&mode)
 	  && (align >= leni || !targetm.slow_unaligned_access (mode, align)))
 	{
-	  location_t loc = gimple_location (stmt2);
+	  location_t loc = gimple_location (stmt);
 	  tree type, off;
 	  type = build_nonstandard_integer_type (leni, 1);
 	  gcc_assert (known_eq (GET_MODE_BITSIZE (TYPE_MODE (type)), leni));
@@ -3297,78 +3345,10 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
 	}
     }
 
-  gimple_call_set_fndecl (stmt2, builtin_decl_explicit (BUILT_IN_MEMCMP_EQ));
+  gimple_call_set_fndecl (stmt, builtin_decl_explicit (BUILT_IN_MEMCMP_EQ));
   return true;
 }
 
-/* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
-   the result of 0 == strncmp (A, B, N) (which is the same as strcmp for
-   sufficiently large N).  Otherwise return false.  */
-
-static bool
-strxcmp_unequal (int idx1, int idx2, unsigned HOST_WIDE_INT n)
-{
-  unsigned HOST_WIDE_INT len1;
-  unsigned HOST_WIDE_INT len2;
-
-  bool nulterm1;
-  bool nulterm2;
-
-  if (idx1 < 0)
-    {
-      len1 = ~idx1;
-      nulterm1 = true;
-    }
-  else if (strinfo *si = get_strinfo (idx1))
-    {
-      if (tree_fits_uhwi_p (si->nonzero_chars))
-	{
-	  len1 = tree_to_uhwi (si->nonzero_chars);
-	  nulterm1 = si->full_string_p;
-	}
-      else
-	return false;
-    }
-  else
-    return false;
-
-  if (idx2 < 0)
-    {
-      len2 = ~idx2;
-      nulterm2 = true;
-    }
-  else if (strinfo *si = get_strinfo (idx2))
-    {
-      if (tree_fits_uhwi_p (si->nonzero_chars))
-	{
-	  len2 = tree_to_uhwi (si->nonzero_chars);
-	  nulterm2 = si->full_string_p;
-	}
-      else
-	return false;
-    }
-  else
-    return false;
-
-  /* N is set to UHWI_MAX for strcmp and less to strncmp.  Adjust
-     the length of each string to consider to be no more than N.  */
-  if (len1 > n)
-    len1 = n;
-  if (len2 > n)
-    len2 = n;
-
-  if ((len1 < len2 && nulterm1)
-      || (len2 < len1 && nulterm2))
-    /* The string lengths are definitely unequal and the result can
-       be folded to one (since it's used for comparison with zero).  */
-    return true;
-
-  /* The string lengths may be equal or unequal.  Even when equal and
-     both strings nul-terminated, without the string contents there's
-     no way to determine whether they are equal.  */
-  return false;
-}
-
 /* Given an index to the strinfo vector, compute the string length
    for the corresponding string. Return -1 when unknown.  */
 
@@ -3397,15 +3377,16 @@ compute_string_length (int idx)
 
 /* Determine the minimum size of the object referenced by DEST expression
    which must have a pointer type.
-   Return the minimum size of the object if successful or NULL when the size
-   cannot be determined.  */
-static tree
+   Return the minimum size of the object if successful or HWI_M1U when
+   the size cannot be determined.  */
+
+static unsigned HOST_WIDE_INT
 determine_min_objsize (tree dest)
 {
   unsigned HOST_WIDE_INT size = 0;
 
   if (compute_builtin_object_size (dest, 2, &size))
-    return build_int_cst (sizetype, size);
+    return size;
 
   /* Try to determine the size of the object through the RHS
      of the assign statement.  */
@@ -3413,11 +3394,11 @@ determine_min_objsize (tree dest)
     {
       gimple *stmt = SSA_NAME_DEF_STMT (dest);
       if (!is_gimple_assign (stmt))
-	return NULL_TREE;
+	return HOST_WIDE_INT_M1U;
 
       if (!gimple_assign_single_p (stmt)
 	  && !gimple_assign_unary_nop_p (stmt))
-	return NULL_TREE;
+	return HOST_WIDE_INT_M1U;
 
       dest = gimple_assign_rhs1 (stmt);
       return determine_min_objsize (dest);
@@ -3425,7 +3406,7 @@ determine_min_objsize (tree dest)
 
   /* Try to determine the size of the object from its type.  */
   if (TREE_CODE (dest) != ADDR_EXPR)
-    return NULL_TREE;
+    return HOST_WIDE_INT_M1U;
 
   tree type = TREE_TYPE (dest);
   if (TREE_CODE (type) == POINTER_TYPE)
@@ -3433,196 +3414,388 @@ determine_min_objsize (tree dest)
 
   type = TYPE_MAIN_VARIANT (type);
 
-  /* We cannot determine the size of the array if it's a flexible array,
-     which is declared at the end of a structure.  */
-  if (TREE_CODE (type) == ARRAY_TYPE
-      && !array_at_struct_end_p (dest))
+  /* The size of a flexible array cannot be determined.  Otherwise,
+     for arrays with more than one element, return the size of its
+     type.  GCC itself misuses arrays of both zero and one elements
+     as flexible array members so they are excluded as well.  */
+  if (TREE_CODE (type) != ARRAY_TYPE
+      || !array_at_struct_end_p (dest))
     {
-      tree size_t = TYPE_SIZE_UNIT (type);
-      if (size_t && TREE_CODE (size_t) == INTEGER_CST
-	  && !integer_zerop (size_t))
-        return size_t;
+      tree type_size = TYPE_SIZE_UNIT (type);
+      if (type_size && TREE_CODE (type_size) == INTEGER_CST
+	  && !integer_onep (type_size)
+	  && !integer_zerop (type_size))
+        return tree_to_uhwi (type_size);
     }
 
-  return NULL_TREE;
+  return HOST_WIDE_INT_M1U;
 }
 
-/* Handle a call to strcmp or strncmp. When the result is ONLY used to do
-   equality test against zero:
-
-   A. When the lengths of both arguments are constant and it's a strcmp:
-      * if the lengths are NOT equal, we can safely fold the call
-        to a non-zero value.
-      * otherwise, do nothing now.
-
-   B. When the length of one argument is constant, try to replace the call
-   with a __builtin_str(n)cmp_eq call where possible, i.e:
-
-   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR
-   is a string with constant length , C is a constant.
-     if (C <= strlen(STR) && sizeof_array(s) > C)
-       {
-         replace this call with
-         strncmp_eq (s, STR, C) (!)= 0
-       }
-     if (C > strlen(STR)
-       {
-         it can be safely treated as a call to strcmp (s, STR) (!)= 0
-         can handled by the following strcmp.
-       }
-
-   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR
-   is a string with constant length.
-     if  (sizeof_array(s) > strlen(STR))
-       {
-         replace this call with
-         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
-       }
-
-   Return true when the call is transformed, return false otherwise.
- */
+/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths
+   of the string(s) referenced by ARG if it can be determined.
+   If the length cannot be determined, set *SIZE to the size of
+   the array the string is stored in, if any.  If no such array is
+   known, set *SIZE to -1.  When the strings are nul-terminated set
+   *NULTERM to true, otherwise to false.  Return true on success.  */
 
 static bool
-handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
+		 unsigned HOST_WIDE_INT *size, bool *nulterm)
 {
-  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
-  tree res = gimple_call_lhs (stmt);
-  use_operand_p use_p;
-  imm_use_iterator iter;
-  tree arg1 = gimple_call_arg (stmt, 0);
-  tree arg2 = gimple_call_arg (stmt, 1);
-  int idx1 = get_stridx (arg1);
-  int idx2 = get_stridx (arg2);
-  HOST_WIDE_INT length = -1;
-  bool is_ncmp = false;
+  /* Set so that both LEN and ~LEN are invalid lengths, i.e.,
+     maximum possible length + 1.  */
+  lenrng[0] = lenrng[1] = HOST_WIDE_INT_MAX;
 
-  if (!res)
-    return false;
+  *size = HOST_WIDE_INT_M1U;
 
-  /* When both arguments are unknown, do nothing.  */
-  if (idx1 == 0 && idx2 == 0)
-    return false;
-
-  /* Handle strncmp function.  */
-  if (gimple_call_num_args (stmt) == 3)
+  if (idx < 0)
     {
-      tree len = gimple_call_arg (stmt, 2);
-      if (tree_fits_shwi_p (len))
-        length = tree_to_shwi (len);
-
-      is_ncmp = true;
+      /* IDX is the inverted constant string length.  */
+      lenrng[0] = ~idx;
+      lenrng[1] = lenrng[0];
+      *nulterm = true;
     }
-
-  /* For strncmp, if the length argument is NOT known, do nothing.  */
-  if (is_ncmp && length < 0)
-    return false;
-
-  /* When the result is ONLY used to do equality test against zero.  */
-  FOR_EACH_IMM_USE_FAST (use_p, iter, res)
+  else if (idx == 0)
+    ; /* Handled below.  */
+  else if (strinfo *si = get_strinfo (idx))
     {
-      gimple *use_stmt = USE_STMT (use_p);
+      if (!si->nonzero_chars)
+	arg = si->ptr;
+      else if (tree_fits_uhwi_p (si->nonzero_chars))
+	{
+	  lenrng[0] = tree_to_uhwi (si->nonzero_chars);
+	  *nulterm = si->full_string_p;
+	  /* Set the upper bound only if the string is known to be
+	     nul-terminated, otherwise leave it at maximum + 1.  */
+	  if (*nulterm)
+	    lenrng[1] = lenrng[0];
+	}
+      else if (TREE_CODE (si->nonzero_chars) == SSA_NAME)
+	{
+	  wide_int min, max;
+	  value_range_kind rng = get_range_info (si->nonzero_chars, &min, &max);
+	  if (rng == VR_RANGE)
+	    {
+	      lenrng[0] = min.to_uhwi ();
+	      lenrng[1] = max.to_uhwi ();
+	      *nulterm = si->full_string_p;
+	    }
+	}
+      else if (si->ptr)
+	arg = si->ptr;
+    }
 
-      if (is_gimple_debug (use_stmt))
-        continue;
-      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
+  if (lenrng[0] == HOST_WIDE_INT_MAX)
+    {
+      /* Compute the minimum and maximum real or possible lengths.  */
+      c_strlen_data lendata = { };
+      if (get_range_strlen (arg, &lendata, /* eltsize = */1))
 	{
-	  tree_code code = gimple_assign_rhs_code (use_stmt);
-	  if (code == COND_EXPR)
+	  if (tree_fits_shwi_p (lendata.maxlen) && !lendata.maxbound)
 	    {
-	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
-	      if ((TREE_CODE (cond_expr) != EQ_EXPR
-		   && (TREE_CODE (cond_expr) != NE_EXPR))
-		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
-		return false;
+	      lenrng[0] = tree_to_shwi (lendata.minlen);
+	      lenrng[1] = tree_to_shwi (lendata.maxlen);
+	      *nulterm = true;
 	    }
-	  else if (code == EQ_EXPR || code == NE_EXPR)
+	  else if (lendata.maxbound && tree_fits_shwi_p (lendata.maxbound))
 	    {
-	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
-		return false;
-            }
-	  else
-	    return false;
+	      /* Set *SIZE to one more than LENDATA.MAXBOUND, which is
+		 a conservative estimate of the longest string based
+		 on the sizes of the arrays referenced by ARG.  */
+	      *size = tree_to_uhwi (lendata.maxbound) + 1;
+	      *nulterm = false;
+	    }
 	}
-      else if (gimple_code (use_stmt) == GIMPLE_COND)
+      else
 	{
-	  tree_code code = gimple_cond_code (use_stmt);
-	  if ((code != EQ_EXPR && code != NE_EXPR)
-	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
-	    return false;
+	  /* Set *SIZE to the size of the smallest object referenced
+	     by ARG if ARG denotes a single object, or to HWI_M1U
+	     otherwise.  */
+	  *size = determine_min_objsize (arg);
+	  *nulterm = false;
 	}
-      else
-        return false;
     }
 
-  /* When the lengths of the arguments are known to be unequal
-     we can safely fold the call to a non-zero value for strcmp;
-     otherwise, do nothing now.  */
-  if (idx1 != 0 && idx2 != 0)
-    {
-      if (strxcmp_unequal (idx1, idx2, length))
-	{
-	  replace_call_with_value (gsi, integer_one_node);
-	  return true;
-	}
-      return false;
+  return lenrng[0] != HOST_WIDE_INT_MAX || *size != HOST_WIDE_INT_M1U;
+}
+
+/* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
+   the result of 0 == strncmp (A, B, BOUND) (which is the same as strcmp
+   for a sufficiently large BOUND).  If the result is based on the length
+   of one string being greater than the longest string that would fit in
+   the array pointed to by the argument, set *PLEN and *PSIZE to
+   the corresponding length (or its complement when the string is known
+   to be at least as long and need not be nul-terminated) and size.
+   Otherwise return null.  */
+
+static tree
+strxcmp_eqz_result (tree arg1, int idx1, tree arg2, int idx2,
+		    unsigned HOST_WIDE_INT bound, unsigned HOST_WIDE_INT len[2],
+		    unsigned HOST_WIDE_INT *psize)
+{
+  /* Determine the range the length of each string is in and whether it's
+     known to be nul-terminated, or the size of the array it's stored in.  */
+  bool nul1, nul2;
+  unsigned HOST_WIDE_INT siz1, siz2;
+  unsigned HOST_WIDE_INT len1rng[2], len2rng[2];
+  if (!get_len_or_size (arg1, idx1, len1rng, &siz1, &nul1)
+      || !get_len_or_size (arg2, idx2, len2rng, &siz2, &nul2))
+    return NULL_TREE;
+
+  /* BOUND is set to HWI_M1U for strcmp and to a smaller value for strncmp,
+     and LENiRNG to HWI_MAX when invalid.  Cap the length of each string
+     to consider at BOUND.  */
+  if (len1rng[0] < HOST_WIDE_INT_MAX && len1rng[0] > bound)
+    len1rng[0] = bound;
+  if (len1rng[1] < HOST_WIDE_INT_MAX && len1rng[1] > bound)
+    len1rng[1] = bound;
+  if (len2rng[0] < HOST_WIDE_INT_MAX && len2rng[0] > bound)
+    len2rng[0] = bound;
+  if (len2rng[1] < HOST_WIDE_INT_MAX && len2rng[1] > bound)
+    len2rng[1] = bound;
+
+  /* Two empty strings are equal.  */
+  if (len1rng[1] == 0 && len2rng[1] == 0)
+    return integer_one_node;
+
+  /* The strings are definitely unequal when the lower bound of the length
+     of one of them is greater than the length of the longest string that
+     would fit into the other array.  */
+  if (len1rng[0] == HOST_WIDE_INT_MAX
+      && len2rng[0] != HOST_WIDE_INT_MAX
+      && ((len2rng[0] < bound && len2rng[0] >= siz1)
+	  || len2rng[0] > siz1))
+    {
+      *psize = siz1;
+      len[0] = len1rng[0];
+      /* Set LEN[1] to the lower bound of ARG2's length when it's
+	 nul-terminated or to the complement of its minimum length
+	 otherwise.  */
+      len[1] = nul2 ? len2rng[0] : ~len2rng[0];
+      return integer_zero_node;
+    }
+
+  if (len2rng[0] == HOST_WIDE_INT_MAX
+      && len1rng[0] != HOST_WIDE_INT_MAX
+      && ((len1rng[0] < bound && len1rng[0] >= siz2)
+	  || len1rng[0] > siz2))
+    {
+      *psize = siz2;
+      len[0] = nul1 ? len1rng[0] : ~len1rng[0];
+      len[1] = len2rng[0];
+      return integer_zero_node;
+    }
+
+  /* The strings are also definitely unequal when their lengths are unequal
+     and at least one is nul-terminated.  */
+  if (len1rng[0] != HOST_WIDE_INT_MAX
+      && len2rng[0] != HOST_WIDE_INT_MAX
+      && ((len1rng[1] < len2rng[0] && nul1)
+	  || (len2rng[1] < len1rng[0] && nul2)))
+    {
+      if (bound <= len1rng[0] || bound <= len2rng[0])
+	*psize = bound;
+      else
+	*psize = HOST_WIDE_INT_M1U;
+
+      len[0] = len1rng[0];
+      len[1] = len2rng[0];
+      return integer_zero_node;
     }
 
-  /* When the length of one argument is constant.  */
-  tree var_string = NULL_TREE;
-  HOST_WIDE_INT const_string_leni = -1;
+  /* The string lengths may be equal or unequal.  Even when equal and
+     both strings nul-terminated, without the string contents there's
+     no way to determine whether they are equal.  */
+  return NULL_TREE;
+}
 
-  if (idx1)
+/* Diagnose pointless calls to strcmp or strncmp STMT with string
+   arguments of lengths LEN or size SIZ and (for strncmp) BOUND,
+   whose result is used in equality expressions that evaluate to
+   a constant due to one argument being longer than the size of
+   the other.  */
+
+static void
+maybe_warn_pointless_strcmp (gimple *stmt, HOST_WIDE_INT bound,
+			     unsigned HOST_WIDE_INT len[2],
+			     unsigned HOST_WIDE_INT siz)
+{
+  gimple *use = used_only_for_zero_equality (gimple_call_lhs (stmt));
+  if (!use)
+    return;
+
+  bool at_least = false;
+
+  /* Excessive LEN[i] indicates a lower bound.  */
+  if (len[0] > HOST_WIDE_INT_MAX)
     {
-      const_string_leni = compute_string_length (idx1);
-      var_string = arg2;
+      at_least = true;
+      len[0] = ~len[0];
     }
-  else
+
+  if (len[1] > HOST_WIDE_INT_MAX)
     {
-      gcc_checking_assert (idx2);
-      const_string_leni = compute_string_length (idx2);
-      var_string = arg1;
+      at_least = true;
+      len[1] = ~len[1];
     }
 
-  if (const_string_leni < 0)
-    return false;
+  unsigned HOST_WIDE_INT minlen = MIN (len[0], len[1]);
 
-  unsigned HOST_WIDE_INT var_sizei = 0;
-  /* try to determine the minimum size of the object pointed by var_string.  */
-  tree size = determine_min_objsize (var_string);
+  /* FIXME: Include a note pointing to the declaration of the smaller
+     array.  */
+  location_t stmt_loc = gimple_location (stmt);
+  tree callee = gimple_call_fndecl (stmt);
+  bool warned = false;
+  if (siz <= minlen && bound == -1)
+    warned = warning_at (stmt_loc, OPT_Wstring_compare,
+			 (at_least
+			  ? G_("%G%qD of a string of length %wu or more and "
+			       "an array of size %wu evaluates to nonzero")
+			  : G_("%G%qD of a string of length %wu and an array "
+			       "of size %wu evaluates to nonzero")),
+			 stmt, callee, minlen, siz);
+  else if (!at_least && siz <= HOST_WIDE_INT_MAX)
+    {
+      if (len[0] != HOST_WIDE_INT_MAX && len[1] != HOST_WIDE_INT_MAX)
+	warned = warning_at (stmt_loc, OPT_Wstring_compare,
+			     "%G%qD of strings of length %wu and %wu "
+			     "and bound of %wu evaluates to nonzero",
+			     stmt, callee, len[0], len[1], bound);
+      else
+	warned = warning_at (stmt_loc, OPT_Wstring_compare,
+			     "%G%qD of a string of length %wu, an array "
+			     "of size %wu and bound of %wu evaluates to "
+			     "nonzero",
+			     stmt, callee, minlen, siz, bound);
+    }
 
-  if (!size)
-    return false;
+  if (warned)
+    {
+      location_t use_loc = gimple_location (use);
+      if (LOCATION_LINE (stmt_loc) != LOCATION_LINE (use_loc))
+	inform (use_loc, "in this expression");
+    }
+}
 
-  if (tree_fits_uhwi_p (size))
-    var_sizei = tree_to_uhwi (size);
 
-  if (var_sizei == 0)
+/* Optimize a call to strcmp or strncmp either by folding it to a constant
+   when possible or by transforming the latter to the former.  Warn about
+   calls where the length of one argument is greater than the size of
+   the array to which the other argument points if the latter's length
+   is not known.  Return true when the call has been transformed into
+   another and false otherwise.  */
+
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree lhs = gimple_call_lhs (stmt);
+
+  if (!lhs)
     return false;
 
-  /* For strncmp, if length > const_string_leni , this call can be safely
-     transformed to a strcmp.  */
-  if (is_ncmp && length > const_string_leni)
-    is_ncmp = false;
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  int idx1 = get_stridx (arg1);
+  int idx2 = get_stridx (arg2);
 
-  unsigned HOST_WIDE_INT final_length
-    = is_ncmp ? length : const_string_leni + 1;
+  /* For strncmp set to the value of the third argument if known.  */
+  HOST_WIDE_INT bound = -1;
 
-  /* Replace strcmp or strncmp with the corresponding str(n)cmp_eq.  */
-  if (var_sizei > final_length)
+  /* Extract the strncmp bound.  */
+  if (gimple_call_num_args (stmt) == 3)
     {
-      tree fn
-	= (is_ncmp
-	   ? builtin_decl_implicit (BUILT_IN_STRNCMP_EQ)
-	   : builtin_decl_implicit (BUILT_IN_STRCMP_EQ));
-      if (!fn)
+      tree len = gimple_call_arg (stmt, 2);
+      if (tree_fits_shwi_p (len))
+        bound = tree_to_shwi (len);
+
+      /* If the bound argument is NOT known, do nothing.  */
+      if (bound < 0)
 	return false;
-      tree const_string_len = build_int_cst (size_type_node, final_length);
-      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
     }
+
+  {
+    /* Set to the length of one argument (or its complement if it's
+       the lower bound of a range) and the size of the array storing
+       the other if the result is based on the former being equal to
+       or greater than the latter.  */
+    unsigned HOST_WIDE_INT len[2] = { HOST_WIDE_INT_MAX, HOST_WIDE_INT_MAX };
+    unsigned HOST_WIDE_INT siz = HOST_WIDE_INT_M1U;
+
+    /* Try to determine if the two strings are either definitely equal
+       or definitely unequal and if so, either fold the result to zero
+       (when equal) or set the range of the result to ~[0, 0] otherwise.  */
+    if (tree eqz = strxcmp_eqz_result (arg1, idx1, arg2, idx2, bound,
+				       len, &siz))
+      {
+	if (integer_zerop (eqz))
+	  {
+	    maybe_warn_pointless_strcmp (stmt, bound, len, siz);
+
+	    /* When the lengths of the first two string arguments are
+	       known to be unequal set the range of the result to non-zero.
+	       This allows the call to be eliminated if its result is only
+	       used in tests for equality to zero.  */
+	    wide_int zero = wi::zero (TYPE_PRECISION (TREE_TYPE (lhs)));
+	    set_range_info (lhs, VR_ANTI_RANGE, zero, zero);
+	    return false;
+	  }
+	/* When the two strings are definitely equal (such as when they
+	   are both empty) fold the call to the constant result.  */
+	replace_call_with_value (gsi, integer_zero_node);
+	return true;
+      }
+  }
+
+  /* Return if nothing is known about the strings pointed to by ARG1
+     and ARG2.  */
+  if (idx1 == 0 && idx2 == 0)
+    return false;
+
+  /* Determine either the length or the size of each of the strings,
+     whichever is available.  */
+  HOST_WIDE_INT cstlen1 = -1, cstlen2 = -1;
+  HOST_WIDE_INT arysiz1 = -1, arysiz2 = -1;
+
+  if (idx1)
+    cstlen1 = compute_string_length (idx1) + 1;
   else
+    arysiz1 = determine_min_objsize (arg1);
+
+  /* Bail if neither the string length nor the size of the array
+     it is stored in can be determined.  */
+  if (cstlen1 < 0 && arysiz1 < 0)
     return false;
 
-  return true;
+  /* Repeat for the second argument.  */
+  if (idx2)
+    cstlen2 = compute_string_length (idx2) + 1;
+  else
+    arysiz2 = determine_min_objsize (arg2);
+
+  if (cstlen2 < 0 && arysiz2 < 0)
+    return false;
+
+  /* The exact number of characters to compare.  */
+  HOST_WIDE_INT cmpsiz = bound < 0 ? cstlen1 < 0 ? cstlen2 : cstlen1 : bound;
+  /* The size of the array in which the unknown string is stored.  */
+  HOST_WIDE_INT varsiz = arysiz1 < 0 ? arysiz2 : arysiz1;
+
+  if (cmpsiz < varsiz && used_only_for_zero_equality (lhs))
+    {
+      /* If the known length is less than the size of the other array
+	 and the strcmp result is only used to test equality to zero,
+	 transform the call to the equivalent _eq call.  */
+      if (tree fn = builtin_decl_implicit (bound < 0 ? BUILT_IN_STRCMP_EQ
+					   : BUILT_IN_STRNCMP_EQ))
+	{
+	  tree n = build_int_cst (size_type_node, cmpsiz);
+	  update_gimple_call (gsi, fn, 3, arg1, arg2, n);
+	  return true;
+	}
+    }
+
+  return false;
 }
 
 /* Handle a POINTER_PLUS_EXPR statement.
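
The decision made by strxcmp_eqz_result for plain strcmp (no bound) can
be sketched outside of GCC roughly as follows.  This is only an
illustration under simplifying assumptions, not the code in the patch:
str_info, eqz, known_length, array_of and strcmp_eqz are hypothetical
names, lengths are plain integers rather than trees, and the strncmp
bound is left out entirely.  It needs C++17 for std::optional.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical summary of what is known about one strcmp argument.
struct str_info
{
  // Bounds on the string length; empty when not known.
  std::optional<uint64_t> min_len, max_len;
  // Size in bytes of the array holding the string; empty when not known.
  std::optional<uint64_t> array_size;
  // True when the string is known to be nul-terminated.
  bool nul_terminated = false;
};

// Result of evaluating 0 == strcmp (A, B).
enum class eqz { unknown, equal, unequal };

// A nul-terminated string of exactly N characters (e.g., a literal).
str_info known_length (uint64_t n)
{
  return { n, n, std::nullopt, true };
}

// A string of unknown length stored in an array of SIZE bytes.
str_info array_of (uint64_t size)
{
  return { std::nullopt, std::nullopt, size, false };
}

eqz strcmp_eqz (const str_info &a, const str_info &b)
{
  // Sanity check: a known range must not be inverted.
  assert (!a.min_len || !a.max_len || *a.min_len <= *a.max_len);
  assert (!b.min_len || !b.max_len || *b.min_len <= *b.max_len);

  // Two strings known to be empty are equal.
  if (a.max_len && b.max_len && *a.max_len == 0 && *b.max_len == 0)
    return eqz::equal;

  // A string in an array of N bytes is at most N - 1 characters long,
  // so it cannot equal a string of N or more characters.
  if (!a.min_len && a.array_size && b.min_len
      && *b.min_len >= *a.array_size)
    return eqz::unequal;
  if (!b.min_len && b.array_size && a.min_len
      && *a.min_len >= *b.array_size)
    return eqz::unequal;

  // Disjoint length ranges imply inequality provided the shorter
  // string is known to be nul-terminated.
  if (a.min_len && b.min_len)
    {
      if (a.max_len && a.nul_terminated && *a.max_len < *b.min_len)
	return eqz::unequal;
      if (b.max_len && b.nul_terminated && *b.max_len < *a.min_len)
	return eqz::unequal;
    }

  // Equal lengths say nothing about the contents.
  return eqz::unknown;
}
```

With these definitions strcmp_eqz (array_of (4), known_length (4))
corresponds to the strcmp (a, "1234") test on char a[4] from the cover
letter and yields eqz::unequal, while comparing the same array with
a three-character string yields eqz::unknown.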


* Re: [PATCH] fold more string comparison with known result (PR 90879)
  2019-09-23 22:14                     ` Martin Sebor
@ 2019-10-04 21:15                       ` Jeff Law
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff Law @ 2019-10-04 21:15 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek; +Cc: gcc-patches

On 9/23/19 4:14 PM, Martin Sebor wrote:

> 
> Yes, it looks redundant.  I never remember which of these functions
> ICE when their argument is not a constant (e.g., tree_int_cst_lt)
> and which ones handle it gracefully (e.g., tree_int_cst_equal) so
> I often check even when it isn't necessary.  It would be nice if
> these closely related APIs had consistent preconditions.
Can't argue with that!

> 
> MAXBOUND is only non-constant when set that way by client code to
> have the function set it to the longest PHI argument, otherwise
> it's either an INTEGER_CST or null.  The inner test may be dead
> code, a leftover from something earlier.  Either way, MAXBOUND
> is only used for diagnostics so it probably doesn't matter.
> 
>>>> @@ -1653,8 +1661,11 @@ get_range_strlen (tree arg, bitmap *visited,
>>>  
>>>   /* Try to obtain the range of the lengths of the string(s) referenced
>>>      by ARG, or the size of the largest array ARG refers to if the range
>>> -   of lengths cannot be determined, and store all in *PDATA.  ELTSIZE
>>> -   is the expected size of the string element in bytes: 1 for char and
>>> +   of lengths cannot be determined, and store all in *PDATA which must
>>> +   be zero-initialized on input except PDATA->MAXBOUND may be set to
>>> +   a non-null tree node other than INTEGER_CST to request to have it
>>> +   set to the length of the longest string in a PHI.  ELTSIZE is
>>> +   the expected size of the string element in bytes: 1 for char and
>> Is there any reason we can't just make a clean distinction between input
>> and output objects in this routine?  As an API this seems awkward at best.
>>
>> Any thoughts on the API question raised?
> 
> I didn't add a new argument because in GCC 9 we got rid of a bunch
> of them to make the function less confusing.  The final signature
> (before the simplification) had 8 arguments:
> 
>    get_range_strlen (tree arg, tree length[2], bitmap *visited,
>                      int type, int fuzzy, bool *flexp,
>                      unsigned eltsize, tree *nonstr)
> 
> Some of them were being tested inconsistently and their effects
> were pretty subtle (especially TYPE and FUZZY).  The MAXBOUND
> setting is also subtle and used only for warnings so I'd rather
> not expose it as an argument that every caller has to worry about
> if it isn't necessary.
> 
> Longer term, I think a better design than directly accessing
> the data members is for c_strlen_data to become a proper C++ class
> with accessor functions to hide this stuff behind so these kinds
> of "warts" could be hidden out of sight.  Since it will touch all
> callers it should be made in a change independent of this one.
> 
> So for now I've removed the redundant test and fixed the typos below
> (clearly, I need a spell check for code comments).  I also had to
> make a few other minor tweaks to adjust to the recent changes on
> trunk.  Attached is an updated patch.
Sounds reasonable.  And yes, I'm certainly a fan of moving towards
proper classes.
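
For illustration only, the accessor-based direction discussed above might
look something like the sketch below.  It is not part of the patch and
not an agreed design: strlen_data and every member name in it are
hypothetical, plain integers stand in for the tree nodes the real
c_strlen_data holds, and it assumes C++17 for std::optional.  The idea
is that an explicit request replaces the "non-null, non-INTEGER_CST"
MAXBOUND sentinel, and results are read through accessors.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for c_strlen_data.  The real structure stores
// GCC tree nodes; integers keep this sketch self-contained.
class strlen_data
{
public:
  // Input side: ask the analysis to set MAXBOUND to the length of
  // the longest string in a PHI.  This takes the place of storing
  // an arbitrary non-constant node in the output member.
  void request_phi_maxbound () { m_want_phi_maxbound = true; }
  bool phi_maxbound_requested () const { return m_want_phi_maxbound; }

  // Output side: setters intended for the analysis only.
  void set_length_range (uint64_t minlen, uint64_t maxlen)
  {
    assert (minlen <= maxlen);
    m_minlen = minlen;
    m_maxlen = maxlen;
  }
  void set_maxbound (uint64_t bound) { m_maxbound = bound; }

  // Output side: read-only accessors for callers.  An empty optional
  // means the value was not determined.
  std::optional<uint64_t> minlen () const { return m_minlen; }
  std::optional<uint64_t> maxlen () const { return m_maxlen; }
  std::optional<uint64_t> maxbound () const { return m_maxbound; }

private:
  bool m_want_phi_maxbound = false;
  std::optional<uint64_t> m_minlen;
  std::optional<uint64_t> m_maxlen;
  std::optional<uint64_t> m_maxbound;
};
```

A caller that only needs the length range would never touch the
MAXBOUND request, and one that does would call request_phi_maxbound ()
before running the analysis instead of assigning to the member directly.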


> gcc-90879.diff
> 
> PR tree-optimization/90879 - fold zero-equality of strcmp between a longer string and a smaller array
> 
> gcc/c-family/ChangeLog:
> 
> 	PR tree-optimization/90879
> 	* c.opt (-Wstring-compare): New option.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR tree-optimization/90879
> 	* gcc.dg/Wstring-compare-2.c: New test.
> 	* gcc.dg/Wstring-compare.c: New test.
> 	* gcc.dg/strcmpopt_3.c: Scan the optimized dump instead of strlen.
> 	* gcc.dg/strcmpopt_6.c: New test.
> 	* gcc.dg/strlenopt-65.c: Remove unnecessary declarations, add
> 	test cases.
> 	* gcc.dg/strlenopt-66.c: Run it.
> 	* gcc.dg/strlenopt-68.c: New test.
> 
> gcc/ChangeLog:
> 
> 	PR tree-optimization/90879
> 	* builtins.c (check_access): Avoid using maxbound when null.
> 	* calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen change.
> 	* doc/invoke.texi (-Wstring-compare): Document new warning option.
> 	* gimple-fold.c (get_range_strlen_tree): Make setting maxbound
> 	conditional.
> 	(get_range_strlen): Overwrite initial maxbound when non-null.
> 	* gimple-ssa-sprintf.c (get_string_length): Adjust to get_range_strlen
> 	changes.
> 	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
> 	(used_only_for_zero_equality): New function.
> 	(handle_builtin_memcmp): Call it.
> 	(determine_min_objsize): Return an integer instead of tree.
> 	(get_len_or_size, strxcmp_eqz_result): New functions.
> 	(maybe_warn_pointless_strcmp): New function.
> 	(handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
> 	between a longer string and a smaller array.
> 	(get_range_strlen_dynamic): Overwrite initial maxbound when non-null.
Parts of this look like bits which I've already approved (some of the
get_range_strlen_dynamic bits).  But regardless, this is OK.


Jeff


end of thread, other threads:[~2019-10-04 21:15 UTC | newest]

Thread overview: 21+ messages
2019-08-09 16:42 [PATCH] fold more string comparison with known result (PR 90879) Martin Sebor
2019-08-09 16:51 ` Jakub Jelinek
2019-08-09 17:07   ` Martin Sebor
2019-08-09 17:07     ` Jakub Jelinek
2019-08-09 22:45       ` Martin Sebor
2019-08-12 13:56         ` Michael Matz
2019-08-14 16:30           ` Martin Sebor
2019-08-12 20:15         ` Jeff Law
2019-08-12 22:32           ` Martin Sebor
2019-08-13  2:22             ` Jeff Law
2019-08-13 20:08     ` Jeff Law
2019-08-13 23:26       ` Martin Sebor
2019-08-14  0:39         ` Jeff Law
2019-08-14 20:57           ` Martin Sebor
2019-08-21  7:40             ` Martin Sebor
2019-08-22 22:23               ` Jeff Law
2019-08-28 21:36                 ` Martin Sebor
2019-09-03 20:01                   ` Jeff Law
2019-09-23 22:14                     ` Martin Sebor
2019-10-04 21:15                       ` Jeff Law
2019-08-12 22:22 ` Jeff Law
