public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][Middle-end]2nd patch of PR78809 and PR83026
@ 2017-12-14 19:48 Qing Zhao
  2017-12-14 20:46 ` Jakub Jelinek
  0 siblings, 1 reply; 19+ messages in thread
From: Qing Zhao @ 2017-12-14 19:48 UTC (permalink / raw)
  To: gcc-patches; +Cc: jeff Law, richard Biener

Hi,

I am not sure whether it’s proper to send this patch during this late stage.
however, since the patch itself is quite straightforward, I decided to send it now. 

=================

2nd Patch for PR78009
Patch for PR83026

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809>
Inline strcmp with small constant strings

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026>
missing strlen optimization for strcmp of unequal strings

The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html <https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html>

this patch is for the second part of change of PR78809 and PR83026:

B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0

  B.1. (PR83026) When both arguments are constant strings and it's a strcmp:
     * if the length of the strings are NOT equal, we can safely fold the call
       to a non-zero value.
     * otherwise, do nothing now.

  B.2. (PR78809) When one of the arguments is a constant string, try to replace
  the call with a __builtin_memcmp_eq call where possible, i.e:

  strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
  constant string, C is a constant.
    if (C <= strlen(STR) && sizeof_array(s) > C)
      {
        replace this call with
        memcmp (s, STR, C) (!)= 0
      }
    if (C > strlen(STR)
      {
        it can be safely treated as a call to strcmp (s, STR) (!)= 0
        can handled by the following strcmp.
      }

  strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
  constant string.
    if  (sizeof_array(s) > strlen(STR))
      {
        replace this call with
        memcmp (s, STR, strlen(STR)+1) (!)= 0
      }

  (NOTE, currently, memcmp(s1, s2, N) (!)=0 has been optimized to a simple
   sequence to access all bytes and accumulate the overall result in GCC by
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171>

   as a result, such str(n)cmp call would like to be replaced by simple
   sequences to access all types and accumulate the overall results.

adding test case strcmpopt_2.c into gcc.dg for part B of PR78809
adding test case strcmpopt_3.c into gcc.dg for PR83026

bootstraped and tested on both X86 and Aarch64. no regression.

Okay for trunk?

thanks.

Qing

================
gcc/ChangeLog:

2017-12-11  Qing Zhao  <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>

	PR middle-end/78809
	PR middle-end/83026
	* tree-ssa-strlen.c (compute_string_length): New function.
	(handle_builtin_string_cmp): New function to handle calls to
	string compare functions.
	(strlen_optimize_stmt): Add handling to builtin string compare
	calls. 

gcc/testsuite/ChangeLog:

2017-12-11  Qing Zhao  <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>

	PR middle-end/78809
	* gcc.dg/strcmpopt_2.c: New testcase.

	PR middle-end/83026
	* gcc.dg/strcmpopt_3.c: New testcase.
	
============
---
 gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 +++++++++++++
 gcc/testsuite/gcc.dg/strcmpopt_3.c |  27 +++++
 gcc/tree-ssa-strlen.c              | 197 +++++++++++++++++++++++++++++++++++++
 3 files changed, 291 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c

diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
new file mode 100644
index 0000000..ac8ff39
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+char s[100] = {'a','b','c','d'};
+typedef struct { int x; char s[8]; } S;
+
+__attribute__ ((noinline)) int 
+f1 (S *s) 
+{ 
+  return __builtin_strcmp (s->s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f2 (void) 
+{ 
+  return __builtin_strcmp (s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f3 (S *s) 
+{ 
+  return __builtin_strcmp ("abc", s->s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f4 (void) 
+{ 
+  return __builtin_strcmp ("abc", s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f5 (S *s) 
+{ 
+  return __builtin_strncmp (s->s, "abc", 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f6 (void) 
+{ 
+  return __builtin_strncmp (s, "abc", 2) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f7 (S *s) 
+{ 
+  return __builtin_strncmp ("abc", s->s, 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f8 (void) 
+{ 
+  return __builtin_strncmp ("abc", s, 2) != 0; 
+}
+
+int main (void)
+{
+  S ss = {2, {'a','b','c'}};
+
+  if (f1 (&ss) != 0 || f2 () != 1 || f3 (&ss) != 0 ||
+      f4 () != 1 || f5 (&ss) != 0 || f6 () != 0 ||
+      f7 (&ss) != 0 || f8 () != 0)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "memcmp_eq \\(" 8 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
new file mode 100644
index 0000000..a3be431
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__ ((noinline)) int 
+f1 (void) 
+{ 
+  char *s= "abcd";
+  return __builtin_strcmp(s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int
+f2 (void) 
+{ 
+  char *s = "ab";
+  return __builtin_strcmp("abc", s) != 0; 
+}
+
+int main (void)
+{
+  if (f1 () != 1 
+      || f2 () != 1)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 48b9241..aafa574 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -2541,6 +2541,198 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/* Given an index to the strinfo vector, compute the string length for the
+   corresponding string. Return -1 when unknown.  */
+ 
+static HOST_WIDE_INT 
+compute_string_length (int idx)
+{
+  HOST_WIDE_INT string_leni = -1; 
+  gcc_assert (idx != 0);
+
+  if (idx < 0)
+    string_leni = ~idx;
+  else
+    {
+      strinfo *si = get_strinfo (idx);
+      if (si)
+	{
+	  tree const_string_len = get_string_length (si);
+	  string_leni
+	    = (const_string_len && tree_fits_uhwi_p (const_string_len)
+	       ? tree_to_uhwi(const_string_len) : -1); 
+	}
+    }
+  return string_leni;
+}
+
+/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
+   equality test against zero:
+
+   A. When both arguments are constant strings and it's a strcmp:
+      * if the length of the strings are NOT equal, we can safely fold the call
+        to a non-zero value.
+      * otherwise, do nothing now.
+  
+   B. When one of the arguments is constant string, try to replace the call with
+   a __builtin_memcmp_eq call where possible, i.e:
+
+   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
+   constant string, C is a constant.
+     if (C <= strlen(STR) && sizeof_array(s) > C)
+       {
+         replace this call with
+         memcmp (s, STR, C) (!)= 0
+       }
+     if (C > strlen(STR)
+       {
+         it can be safely treated as a call to strcmp (s, STR) (!)= 0
+         can handled by the following strcmp.
+       }
+
+   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
+   constant string.
+     if  (sizeof_array(s) > strlen(STR))
+       {
+         replace this call with
+         memcmp (s, STR, strlen(STR)+1) (!)= 0
+       }
+ */ 
+
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree res = gimple_call_lhs (stmt);
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  int idx1 = get_stridx (arg1);
+  int idx2 = get_stridx (arg2);
+  HOST_WIDE_INT length = -1;
+  bool is_ncmp = false;
+
+  if (!res)
+    return true;
+
+  /* When both arguments are unknown, do nothing.  */
+  if (idx1 == 0 && idx2 == 0)
+    return true;
+
+  /* Handle strncmp function.  */
+  if (gimple_call_num_args (stmt) == 3)
+    {
+      tree len = gimple_call_arg (stmt, 2);
+      if (tree_fits_uhwi_p (len))
+        length = tree_to_uhwi (len);
+
+      is_ncmp = true;
+    }
+
+  /* For strncmp, if the length argument is NOT known, do nothing.  */
+  if (is_ncmp && length < 0)
+    return true;
+
+  /* When the result is ONLY used to do equality test against zero.  */
+  FOR_EACH_IMM_USE_FAST (use_p, iter, res) 
+    {    
+      gimple *ustmt = USE_STMT (use_p);
+
+      if (is_gimple_debug (ustmt))
+        continue;
+      if (gimple_code (ustmt) == GIMPLE_ASSIGN)
+	{    
+	  gassign *asgn = as_a <gassign *> (ustmt);
+	  tree_code code = gimple_assign_rhs_code (asgn);
+	  if ((code != EQ_EXPR && code != NE_EXPR)
+	      || !integer_zerop (gimple_assign_rhs2 (asgn)))
+	    return true;
+	}
+      else if (gimple_code (ustmt) == GIMPLE_COND)
+	{
+	  tree_code code = gimple_cond_code (ustmt);
+	  if ((code != EQ_EXPR && code != NE_EXPR)
+	      || !integer_zerop (gimple_cond_rhs (ustmt)))
+	    return true;
+	}
+      else
+        return true;
+    }
+  
+  /* When both arguments are known, and their strlens are unequal, we can 
+     safely fold the call to a non-zero value for strcmp;
+     othewise, do nothing now.  */
+  if (idx1 != 0 && idx2 != 0)
+    {
+      HOST_WIDE_INT const_string_leni1 = -1;
+      HOST_WIDE_INT const_string_leni2 = -1;
+      const_string_leni1 = compute_string_length (idx1);
+      const_string_leni2 = compute_string_length (idx2);
+      if (!is_ncmp 
+	  && const_string_leni1 != -1
+	  && const_string_leni2 != -1
+	  && const_string_leni1 != const_string_leni2) 
+	{
+	  replace_call_with_value (gsi, integer_one_node); 
+	  return false;
+	}
+      return true;
+    }
+
+  /* When one of args is constant string.  */
+  tree var_string;
+  HOST_WIDE_INT const_string_leni = -1;
+  
+  if (idx1)
+    {
+      const_string_leni = compute_string_length (idx1);
+      var_string = arg2;
+    } 
+  else if (idx2)
+    {
+      const_string_leni = compute_string_length (idx2);
+      var_string = arg1;
+    } 
+
+  if (const_string_leni < 0) 
+    return true;
+ 
+  tree size[2];
+  unsigned HOST_WIDE_INT var_sizei = 0;
+
+  /* Try to get the min and max string length for var_string, the max length is
+     the size of the array - 1, recorded in size[1].  */ 
+  get_range_strlen (var_string, size);
+  if (size[1] && tree_fits_uhwi_p (size[1]))
+    var_sizei = tree_to_uhwi (size[1]) + 1;
+
+  if (var_sizei == 0)
+    return true;
+
+  /* For strncmp, if length > const_string_leni , this call can be safely 
+     transformed to a strcmp.  */
+  if (is_ncmp && length > const_string_leni)
+    is_ncmp = false;
+
+  unsigned HOST_WIDE_INT final_length 
+    = is_ncmp ? length : (const_string_leni + 1);
+
+  /* Replace strcmp or strncmp with the corresponding memcmp.  */
+  if (var_sizei > final_length) 
+    {
+      tree fn = builtin_decl_implicit (BUILT_IN_MEMCMP_EQ);
+      if (!fn)
+	return true;
+      tree const_string_len = build_int_cst (size_type_node, final_length); 
+      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+    }
+  else 
+    return true;
+
+  return false;
+}
+
 /* Handle a POINTER_PLUS_EXPR statement.
    For p = "abcd" + 2; compute associated length, or if
    p = q + off is pointing to a '\0' character of a string, call
@@ -2940,6 +3132,11 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
 	    if (!handle_builtin_memcmp (gsi))
 	      return false;
 	    break;
+	  case BUILT_IN_STRCMP:
+	  case BUILT_IN_STRNCMP:
+	    if (!handle_builtin_string_cmp (gsi))
+	      return false;
+	    break;
 	  default:
 	    break;
 	  }
-- 
1.9.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  2017-12-14 19:48 [PATCH][Middle-end]2nd patch of PR78809 and PR83026 Qing Zhao
@ 2017-12-14 20:46 ` Jakub Jelinek
  2017-12-15 16:14   ` Qing Zhao
  2017-12-21 21:34   ` [PATCH][Middle-end][version 2]2nd " Qing Zhao
  0 siblings, 2 replies; 19+ messages in thread
From: Jakub Jelinek @ 2017-12-14 20:46 UTC (permalink / raw)
  To: Qing Zhao; +Cc: gcc-patches, jeff Law, richard Biener

On Thu, Dec 14, 2017 at 01:45:21PM -0600, Qing Zhao wrote:
> 2017-12-11  Qing Zhao  <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>

No " <mailto:qing.zhao@oracle.com>" in ChangeLog entries please.

> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c
> @@ -2541,6 +2541,198 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>    return false;
>  }
>  
> +/* Given an index to the strinfo vector, compute the string length for the
> +   corresponding string. Return -1 when unknown.  */
> + 
> +static HOST_WIDE_INT 
> +compute_string_length (int idx)
> +{
> +  HOST_WIDE_INT string_leni = -1; 
> +  gcc_assert (idx != 0);
> +
> +  if (idx < 0)
> +    string_leni = ~idx;
> +  else
> +    {
> +      strinfo *si = get_strinfo (idx);
> +      if (si)
> +	{
> +	  tree const_string_len = get_string_length (si);
> +	  string_leni
> +	    = (const_string_len && tree_fits_uhwi_p (const_string_len)
> +	       ? tree_to_uhwi(const_string_len) : -1); 

So, you are returning a signed HWI, then clearly tree_fits_uhwi_p and
tree_to_uhwi are inappropriate, you should have used tree_fits_shwi_p
and tree_to_shwi.  Space after function name is missing too.
And, as you start by initializing string_leni to -1, there is no
point to write it this way rather than
	  if (const_string_len && tree_fits_shwi_p (const_string_len))
	    string_leni = tree_to_shwi (const_string_len);

> +	}
> +    }

Maybe also do
  if (string_leni < 0)
    return -1;

> +  return string_leni;

unless the callers just look for negative value as unusable.

> +      tree len = gimple_call_arg (stmt, 2);
> +      if (tree_fits_uhwi_p (len))
> +        length = tree_to_uhwi (len);

Similarly to above, you are mixing signed and unsigned HWIs too much.

> +      if (gimple_code (ustmt) == GIMPLE_ASSIGN)

      if (is_gimple_assign (ustmt))

Usually we use use_stmt instead of ustmt.

> +	{    
> +	  gassign *asgn = as_a <gassign *> (ustmt);

No need for the gassign and ugly as_a, gimple_assign_rhs_code
as well as gimple_assign_rhs2 can be called on gimple * too.

> +	  tree_code code = gimple_assign_rhs_code (asgn);
> +	  if ((code != EQ_EXPR && code != NE_EXPR)
> +	      || !integer_zerop (gimple_assign_rhs2 (asgn)))
> +	    return true;
> +	}
> +      else if (gimple_code (ustmt) == GIMPLE_COND)
> +	{
> +	  tree_code code = gimple_cond_code (ustmt);
> +	  if ((code != EQ_EXPR && code != NE_EXPR)
> +	      || !integer_zerop (gimple_cond_rhs (ustmt)))
> +	    return true;

There is another case you are missing, assign stmt with
gimple_assign_rhs_code COND_EXPR, where gimple_assign_rhs1 is
tree with TREE_CODE EQ_EXPR or NE_EXPR with TREE_OPERAND (rhs1, 1)
integer_zerop.

> +  /* When both arguments are known, and their strlens are unequal, we can 
> +     safely fold the call to a non-zero value for strcmp;
> +     othewise, do nothing now.  */
> +  if (idx1 != 0 && idx2 != 0)
> +    {
> +      HOST_WIDE_INT const_string_leni1 = -1;
> +      HOST_WIDE_INT const_string_leni2 = -1;
> +      const_string_leni1 = compute_string_length (idx1);
> +      const_string_leni2 = compute_string_length (idx2);

Why do you initialize the vars when you immediately overwrite it?
Just do
      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
etc.

> +  /* When one of args is constant string.  */
> +  tree var_string;
> +  HOST_WIDE_INT const_string_leni = -1;
> +  
> +  if (idx1)
> +    {
> +      const_string_leni = compute_string_length (idx1);
> +      var_string = arg2;
> +    } 
> +  else if (idx2)
> +    {
> +      const_string_leni = compute_string_length (idx2);
> +      var_string = arg1;
> +    } 

Haven't you checked earlier that one of idx1 and idx2 is non-zero?
If so, then the else if (idx2) will just might confuse -Wuninitialized,
if you just use else, you don't need to initialize const_string_leni
either.

> +  /* Try to get the min and max string length for var_string, the max length is
> +     the size of the array - 1, recorded in size[1].  */ 
> +  get_range_strlen (var_string, size);
> +  if (size[1] && tree_fits_uhwi_p (size[1]))
> +    var_sizei = tree_to_uhwi (size[1]) + 1;

This is something that looks problematic to me.  get_range_strlen returns
some conservative upper bound on the string length, which is fine if
var_string points to say a TREE_STATIC variable where you know the allocated
size, or automatic variable.  But if somebody passes you a pointer to a
structure and the source doesn't contain aggregate copying for it, not sure
if you can take for granted that all the bytes are readable after the '\0'
in the string.  Hopefully at least for flexible array members and arrays in
such positions get_range_strlen will not provide the upper bound, but even
in other cases it doesn't feel safe to me.

Furthermore, in the comments you say that you do it only for small strings,
but in the patch I can't see any upper bound, so you could transform strlen
that would happen to return say just 1 or 2 with a function call that
possibly reads megabytes of data (memcmp may read all bytes, not just stop
at the first difference).
> +  unsigned HOST_WIDE_INT final_length 
> +    = is_ncmp ? length : (const_string_leni + 1);

Why the ()s?

	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  2017-12-14 20:46 ` Jakub Jelinek
@ 2017-12-15 16:14   ` Qing Zhao
  2017-12-15 16:42     ` Jakub Jelinek
  2017-12-21 21:34   ` [PATCH][Middle-end][version 2]2nd " Qing Zhao
  1 sibling, 1 reply; 19+ messages in thread
From: Qing Zhao @ 2017-12-15 16:14 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, jeff Law, richard Biener

Hi, Jakub,

thanks a lot for your detailed review.

> On Dec 14, 2017, at 2:45 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Thu, Dec 14, 2017 at 01:45:21PM -0600, Qing Zhao wrote:
>> 2017-12-11  Qing Zhao  <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>
> 
> No " <mailto:qing.zhao@oracle.com>" in ChangeLog entries please.

this is an error when I pasted this part from my terminal to mail editor, not in my real code.
will double check next time when sending out email.
> 
>> --- a/gcc/tree-ssa-strlen.c
>> +++ b/gcc/tree-ssa-strlen.c
>> @@ -2541,6 +2541,198 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>>   return false;
>> }
>> 
>> +/* Given an index to the strinfo vector, compute the string length for the
>> +   corresponding string. Return -1 when unknown.  */
>> + 
>> +static HOST_WIDE_INT 
>> +compute_string_length (int idx)
>> +{
>> +  HOST_WIDE_INT string_leni = -1; 
>> +  gcc_assert (idx != 0);
>> +
>> +  if (idx < 0)
>> +    string_leni = ~idx;
>> +  else
>> +    {
>> +      strinfo *si = get_strinfo (idx);
>> +      if (si)
>> +	{
>> +	  tree const_string_len = get_string_length (si);
>> +	  string_leni
>> +	    = (const_string_len && tree_fits_uhwi_p (const_string_len)
>> +	       ? tree_to_uhwi(const_string_len) : -1); 
> 
> So, you are returning a signed HWI, then clearly tree_fits_uhwi_p and
> tree_to_uhwi are inappropriate, you should have used tree_fits_shwi_p
> and tree_to_shwi.  Space after function name is missing too.
> And, as you start by initializing string_leni to -1, there is no
> point to write it this way rather than
> 	  if (const_string_len && tree_fits_shwi_p (const_string_len))
> 	    string_leni = tree_to_shwi (const_string_len);

originally it returned an unsigned HWI.   but later I changed it to return a signed one since I
need a negative value to represent the UNKNOWN state. 

I will fix this.
> 
>> +	}
>> +    }
> 
> Maybe also do
>  if (string_leni < 0)
>    return -1;

Yes, this might be safer.
> 
>> +  return string_leni;
> 
> unless the callers just look for negative value as unusable.
> 
>> +      tree len = gimple_call_arg (stmt, 2);
>> +      if (tree_fits_uhwi_p (len))
>> +        length = tree_to_uhwi (len);
> 
> Similarly to above, you are mixing signed and unsigned HWIs too much.

same reason as above :-),  I will fix this.

> 
>> +      if (gimple_code (ustmt) == GIMPLE_ASSIGN)
> 
>      if (is_gimple_assign (ustmt))
> 
> Usually we use use_stmt instead of ustmt.
Okay.
> 
>> +	{    
>> +	  gassign *asgn = as_a <gassign *> (ustmt);
> 
> No need for the gassign and ugly as_a, gimple_assign_rhs_code
> as well as gimple_assign_rhs2 can be called on gimple * too.

this part of the code I just copied from the routine “handle_builtin_memcpy” and no change.

I will change it as you suggested.


>> +	  tree_code code = gimple_assign_rhs_code (asgn);
>> +	  if ((code != EQ_EXPR && code != NE_EXPR)
>> +	      || !integer_zerop (gimple_assign_rhs2 (asgn)))
>> +	    return true;
>> +	}
>> +      else if (gimple_code (ustmt) == GIMPLE_COND)
>> +	{
>> +	  tree_code code = gimple_cond_code (ustmt);
>> +	  if ((code != EQ_EXPR && code != NE_EXPR)
>> +	      || !integer_zerop (gimple_cond_rhs (ustmt)))
>> +	    return true;
> 
> There is another case you are missing, assign stmt with
> gimple_assign_rhs_code COND_EXPR, where gimple_assign_rhs1 is
> tree with TREE_CODE EQ_EXPR or NE_EXPR with TREE_OPERAND (rhs1, 1)
> integer_zerop.

a little confused here:

in the current code:
	. the first case is:          result = strcmp() != 0
	. the second case is:    if (strcmp() != 0)

so, the missing case you mentioned above is:

        result = if (strcmp() != 0) 

or something else?
> 
>> +  /* When both arguments are known, and their strlens are unequal, we can 
>> +     safely fold the call to a non-zero value for strcmp;
>> +     othewise, do nothing now.  */
>> +  if (idx1 != 0 && idx2 != 0)
>> +    {
>> +      HOST_WIDE_INT const_string_leni1 = -1;
>> +      HOST_WIDE_INT const_string_leni2 = -1;
>> +      const_string_leni1 = compute_string_length (idx1);
>> +      const_string_leni2 = compute_string_length (idx2);
> 
> Why do you initialize the vars when you immediately overwrite it?

just a habit to declare a variable with initialization :-).

> Just do
>      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);

I can change it like this.
> etc.
> 
>> +  /* When one of args is constant string.  */
>> +  tree var_string;
>> +  HOST_WIDE_INT const_string_leni = -1;
>> +  
>> +  if (idx1)
>> +    {
>> +      const_string_leni = compute_string_length (idx1);
>> +      var_string = arg2;
>> +    } 
>> +  else if (idx2)
>> +    {
>> +      const_string_leni = compute_string_length (idx2);
>> +      var_string = arg1;
>> +    } 
> 
> Haven't you checked earlier that one of idx1 and idx2 is non-zero?

Yes.  

it’s guaranteed that  there is one and ONLY one of idx1 and idx2 is non-zero when getting here. 

> If so, then the else if (idx2) will just might confuse -Wuninitialized,

Okay.

> if you just use else, you don't need to initialize const_string_leni
> either.

I think that const_string_leni still need to be initialized in this case, because when idx2 is non-zero,  
const_string_leni is initialized to compute_string_length (idx2). 

> 
>> +  /* Try to get the min and max string length for var_string, the max length is
>> +     the size of the array - 1, recorded in size[1].  */ 
>> +  get_range_strlen (var_string, size);
>> +  if (size[1] && tree_fits_uhwi_p (size[1]))
>> +    var_sizei = tree_to_uhwi (size[1]) + 1;
> 
> This is something that looks problematic to me.  get_range_strlen returns
> some conservative upper bound on the string length, which is fine if
> var_string points to say a TREE_STATIC variable where you know the allocated
> size, or automatic variable.  But if somebody passes you a pointer to a
> structure and the source doesn't contain aggregate copying for it, not sure
> if you can take for granted that all the bytes are readable after the '\0'
> in the string.  Hopefully at least for flexible array members and arrays in
> such positions get_range_strlen will not provide the upper bound, but even
> in other cases it doesn't feel safe to me.

this is the part that took me most of the time during the implementation. 

I have considered the following 3 approaches to decide the size of the variable array:

	A. use “compute_builtin_object_size” in tree-object-size.h to decide the size of the
object.   However, even with the simplest case, it cannot provide the information. 
	B. use “get_range_strlen” in gimple-fold.h to decide the size of the object.  however, 
it cannot provide valid info for simple array, either. 
	C. write my own routine to decide the size of the object.


I gave up C first because I think that this new routine is very likely a partially redundant routine as
A or B, adding a partially redundant functionality into the code base might not be good.

Then I gave up A because I think that updating A will have more impact than updating B.

when reading B’s comments as following:

/* Determine the minimum and maximum value or string length that ARG
   refers to and store each in the first two elements of MINMAXLEN.
   For expressions that point to strings of unknown lengths that are
   character arrays, use the upper bound of the array as the maximum
   length.  For example, given an expression like 'x ? array : "xyz"'
   and array declared as 'char array[8]', MINMAXLEN[0] will be set
   to 3 and MINMAXLEN[1] to 7, the longest string that could be
   stored in array.
   Return true if the range of the string lengths has been obtained
   from the upper bound of an array at the end of a struct.  Such
   an array may hold a string that's longer than its upper bound
   due to it being used as a poor-man's flexible array member.  */

bool
get_range_strlen (tree arg, tree minmaxlen[2])
{
}

In the above, it clearly said:

"For expressions that point to strings of unknown lengths that are
   character arrays, use the upper bound of the array as the maximum
   length.”

So, I feel that this serves my purpose well. 
then I updated the routine get_range_strlen to make it handles simple array case. 
(this change has just been checked into upstream yesterday:

 https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=255654 <https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=255654>)

Let me know your suggestions.

I will be happy to update the code with your suggestion. 

> 
> Furthermore, in the comments you say that you do it only for small strings,
> but in the patch I can't see any upper bound, so you could transform strlen
> that would happen to return say just 1 or 2 with a function call that
> possibly reads megabytes of data (memcmp may read all bytes, not just stop
> at the first difference).

do you mean for very short constant string, we should NOT change it to a. call to memcmp?  instead we should just 
inline it with byte comparison sequence?


>> +  unsigned HOST_WIDE_INT final_length 
>> +    = is_ncmp ? length : (const_string_leni + 1);
> 
> Why the ()s?

just want to make the “const_string_leni + 1” together :-).

I can delete the () if the coding style prefers.

thanks.

Qing
> 
> 	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  2017-12-15 16:14   ` Qing Zhao
@ 2017-12-15 16:42     ` Jakub Jelinek
  2017-12-15 17:19       ` Qing Zhao
  0 siblings, 1 reply; 19+ messages in thread
From: Jakub Jelinek @ 2017-12-15 16:42 UTC (permalink / raw)
  To: Qing Zhao; +Cc: gcc-patches, jeff Law, richard Biener

On Fri, Dec 15, 2017 at 10:08:03AM -0600, Qing Zhao wrote:
> a little confused here:
> 
> in the current code:
> 	. the first case is:          result = strcmp() != 0
> 	. the second case is:    if (strcmp() != 0)
> 
> so, the missing case you mentioned above is:
> 
>         result = if (strcmp() != 0) 
> 
> or something else?

result = (strcmp () != 0 ? 15 : 37);
or similar.  Though, usually COND_EXPRs are added by the tree-if-conversion
pass, so you might need -ftree-loop-if-convert option and it probably needs
to be within some loop which will have just a single bb after the
if-conversion.
> > 
> >> +  /* When both arguments are known, and their strlens are unequal, we can 
> >> +     safely fold the call to a non-zero value for strcmp;
> >> +     othewise, do nothing now.  */
> >> +  if (idx1 != 0 && idx2 != 0)
> >> +    {
> >> +      HOST_WIDE_INT const_string_leni1 = -1;
> >> +      HOST_WIDE_INT const_string_leni2 = -1;
> >> +      const_string_leni1 = compute_string_length (idx1);
> >> +      const_string_leni2 = compute_string_length (idx2);
> > 
> > Why do you initialize the vars when you immediately overwrite it?
> 
> just a habit to declare a variable with initialization :-).
> 
> > Just do
> >      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
> 
> I can change it like this.
> > etc.
> > 
> >> +  /* When one of args is constant string.  */
> >> +  tree var_string;
> >> +  HOST_WIDE_INT const_string_leni = -1;
> >> +  
> >> +  if (idx1)
> >> +    {
> >> +      const_string_leni = compute_string_length (idx1);
> >> +      var_string = arg2;
> >> +    } 
> >> +  else if (idx2)
> >> +    {
> >> +      const_string_leni = compute_string_length (idx2);
> >> +      var_string = arg1;
> >> +    } 
> > 
> > Haven't you checked earlier that one of idx1 and idx2 is non-zero?
> 
> Yes.  
> 
> it’s guaranteed that  there is one and ONLY one of idx1 and idx2 is non-zero when getting here. 
> 
> > If so, then the else if (idx2) will just might confuse -Wuninitialized,
> 
> Okay.
> 
> > if you just use else, you don't need to initialize const_string_leni
> > either.
> 
> I think that const_string_leni still need to be initialized in this case, because when idx2 is non-zero,  
> const_string_leni is initialized to compute_string_length (idx2). 

Sure.  But
  type uninitialized_var;
  if (cond1)
    uninitialized_var = foo;
  else if (cond2)
    uninitialized_var = bar;
  use (uninitialized_var);
is a coding style which asks for -Wmaybe-uninitialized warnings, in order
not to warn, the compiler has to prove that cond1 || cond2 is always true,
which might not be always easy for the compiler.

> > This is something that looks problematic to me.  get_range_strlen returns
> > some conservative upper bound on the string length, which is fine if
> > var_string points to say a TREE_STATIC variable where you know the allocated
> > size, or automatic variable.  But if somebody passes you a pointer to a
> > structure and the source doesn't contain aggregate copying for it, not sure
> > if you can take for granted that all the bytes are readable after the '\0'
> > in the string.  Hopefully at least for flexible array members and arrays in
> > such positions get_range_strlen will not provide the upper bound, but even
> > in other cases it doesn't feel safe to me.
> 
> this is the part that took me most of the time during the implementation. 
> 
> I have considered the following 3 approaches to decide the size of the variable array:
> 
> 	A. use “compute_builtin_object_size” in tree-object-size.h to decide the size of the
> object.   However, even with the simplest case, it cannot provide the information. 

compute_builtin_object_size with modes 0 or 1 computes upper bound, what you
are really looking for is lower bound, so that would be mode 2, though that
mode isn't actually used in real-world code and thus might be not fully
tested.

> 	B. use “get_range_strlen” in gimple-fold.h to decide the size of the object.  however, 
> it cannot provide valid info for simple array, either. 

get_range_strlen returns you a range, the minval is not what you're looking
for, that is the minimum string length, so might be too short for your
purposes.  And maxval is an upper bound, but you are looking for lower
bound, you need guarantees this amount of memory can be accessed, even if
there is 0 in the first byte.

> > Furthermore, in the comments you say that you do it only for small strings,
> > but in the patch I can't see any upper bound, so you could transform strlen
> > that would happen to return say just 1 or 2 with a function call that
> > possibly reads megabytes of data (memcmp may read all bytes, not just stop
> > at the first difference).
> 
> do you mean for very short constant string, we should NOT change it to a. call to memcmp?  instead we should just 
> inline it with byte comparison sequence?

I mean we should never ever replace strcmp or strncmp call with library call to
memcmp, you don't know how it is implemented and it could access any bytes
in the new range, while previously strcmp/strncmp wouldn't.  If you have
sufficient guarantees that the memory must be all mapped and thus read-only
accessible (see above, I don't think get_range_strlen provides that), it
might be fine to inline expand the memcmp some way, because there you have
full control over what the compiler will emit.  But if we find out during
expansion we don't want to expand it inline, we should fall back to calling
strcmp or strncmp.  Which means, instead of changing the call to the
memcmp_eq builtin, change it to some STRCMP or STRNCMP internal function,
ensure it is expanded inline the way you prefer and if it can't, it will
call strcmp or strncmp.

	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  2017-12-15 16:42     ` Jakub Jelinek
@ 2017-12-15 17:19       ` Qing Zhao
  2017-12-15 17:47         ` Jakub Jelinek
  0 siblings, 1 reply; 19+ messages in thread
From: Qing Zhao @ 2017-12-15 17:19 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, jeff Law, richard Biener


> On Dec 15, 2017, at 10:42 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Fri, Dec 15, 2017 at 10:08:03AM -0600, Qing Zhao wrote:
>> a little confused here:
>> 
>> in the current code:
>> 	. the first case is:          result = strcmp() != 0
>> 	. the second case is:    if (strcmp() != 0)
>> 
>> so, the missing case you mentioned above is:
>> 
>>        result = if (strcmp() != 0) 
>> 
>> or something else?
> 
> result = (strcmp () != 0 ? 15 : 37);
> or similar.  Though, usually COND_EXPRs are added by the tree-if-conversion
> pass, so you might need -ftree-loop-if-convert option and it probably needs
> to be within some loop which will have just a single bb after the
> if-conversion.

I see. thanks.

>> 
>>> if you just use else, you don't need to initialize const_string_leni
>>> either.
>> 
>> I think that const_string_leni still need to be initialized in this case, because when idx2 is non-zero,  
>> const_string_leni is initialized to compute_string_length (idx2). 
> 
> Sure.  But
>  type uninitialized_var;
>  if (cond1)
>    uninitialized_var = foo;
>  else if (cond2)
>    uninitialized_var = bar;
>  use (uninitialized_var);
> is a coding style which asks for -Wmaybe-uninitialized warnings, in order
> not to warn, the compiler has to prove that cond1 || cond2 is always true,
> which might not be always easy for the compiler.

in my case, I already initialize the “uninitialized_var” when declared it:

  HOST_WIDE_INT const_string_leni = -1;

  if (idx1)
    {
      const_string_leni = compute_string_length (idx1);
      var_string = arg2;
    }
  else if (idx2)
    {
      const_string_leni = compute_string_length (idx2);
      var_string = arg1;
    }

so, the -Wmaybe-uninitialized should NOT issue warning, right?

but anyway, I can change the above as following:

 HOST_WIDE_INT const_string_leni = -1;

  if (idx1)
    {
      const_string_leni = compute_string_length (idx1);
      var_string = arg2;
    }
  else
    {
      gcc_assert (idx2);
      const_string_leni = compute_string_length (idx2);
      var_string = arg1;
    }

is this better?

> 
>>> This is something that looks problematic to me.  get_range_strlen returns
>>> some conservative upper bound on the string length, which is fine if
>>> var_string points to say a TREE_STATIC variable where you know the allocated
>>> size, or automatic variable.  But if somebody passes you a pointer to a
>>> structure and the source doesn't contain aggregate copying for it, not sure
>>> if you can take for granted that all the bytes are readable after the '\0'
>>> in the string.  Hopefully at least for flexible array members and arrays in
>>> such positions get_range_strlen will not provide the upper bound, but even
>>> in other cases it doesn't feel safe to me.
>> 
>> this is the part that took me most of the time during the implementation. 
>> 
>> I have considered the following 3 approaches to decide the size of the variable array:
>> 
>> 	A. use “compute_builtin_object_size” in tree-object-size.h to decide the size of the
>> object.   However, even with the simplest case, it cannot provide the information. 
> 
> compute_builtin_object_size with modes 0 or 1 computes upper bound, what you
> are really looking for is lower bound,

you mean: 0, 1 is for maximum object size, and 2 is for minimum object size?

yes, I am looking for minimum object size for this optimization. 

> so that would be mode 2, though that
> mode isn't actually used in real-world code and thus might be not fully
> tested.

so, using this routine with mode 2 should be the right approach to go? and we need fully testing on this too?

> 
>> 	B. use “get_range_strlen” in gimple-fold.h to decide the size of the object.  however, 
>> it cannot provide valid info for simple array, either. 
> 
> get_range_strlen returns you a range, the minval is not what you're looking
> for, that is the minimum string length, so might be too short for your
> purposes.  And maxval is an upper bound, but you are looking for lower
> bound, you need guarantees this amount of memory can be accessed, even if
> there is 0 in the first byte.

my understanding is that: get_range_strlen returns the minimum and maximum length of the string pointed by the 
pointer, and the maximum length of the string is determined by the size of the allocated memory pointed by the
pointer, so, it should serve my purpose,   did I misunderstand it?

> 
>>> Furthermore, in the comments you say that you do it only for small strings,
>>> but in the patch I can't see any upper bound, so you could transform strlen
>>> that would happen to return say just 1 or 2 with a function call that
>>> possibly reads megabytes of data (memcmp may read all bytes, not just stop
>>> at the first difference).
>> 
>> do you mean for very short constant string, we should NOT change it to a. call to memcmp?  instead we should just 
>> inline it with byte comparison sequence?
> 
> I mean we should never ever replace strcmp or strncmp call with library call to
> memcmp, you don't know how it is implemented and it could access any bytes
> in the new range, while previously strcmp/strncmp wouldn't.  

> If you have
> sufficient guarantees that the memory must be all mapped and thus read-only
> accessible (see above, I don't think get_range_strlen provides that), it
> might be fine to inline expand the memcmp some way, because there you have
> full control over what the compiler will emit.

agreed. and the size check in the implementation is try to guarantee the transformation
is safe.

so, the major issue here is whether “get_range_strlen” can provide the good info on this. whether we should use
compute_builtin_object_size mode 2 for this purpose.

I can change to use “compute_builtin_object_size” and mode 2, and try to see whether any improvement is needed 
to compute_builtin_object_size for simple cases.

what’s your opinion?


>  But if we find out during
> expansion we don't want to expand it inline, we should fall back to calling
> strcmp or strncmp.

under what situation we will NOT expand the memcpy_eq call inline?

Qing

>  Which means, instead of changing the call to the
> memcmp_eq builtin, change it to some STRCMP or STRNCMP internal function,
> ensure it is expanded inline the way you prefer and if it can't, it will
> call strcmp or strncmp.
> 
> 	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  2017-12-15 17:19       ` Qing Zhao
@ 2017-12-15 17:47         ` Jakub Jelinek
  2017-12-15 20:49           ` Qing Zhao
  0 siblings, 1 reply; 19+ messages in thread
From: Jakub Jelinek @ 2017-12-15 17:47 UTC (permalink / raw)
  To: Qing Zhao; +Cc: gcc-patches, jeff Law, richard Biener

On Fri, Dec 15, 2017 at 11:17:37AM -0600, Qing Zhao wrote:
>   HOST_WIDE_INT const_string_leni = -1;
> 
>   if (idx1)
>     {
>       const_string_leni = compute_string_length (idx1);
>       var_string = arg2;
>     }
>   else if (idx2)
>     {
>       const_string_leni = compute_string_length (idx2);
>       var_string = arg1;
>     }
> 
> so, the -Wmaybe-uninitialized should NOT issue warning, right?

Well, you had the var_string var uninitialized, so that is what I was
talking about.

> but anyway, I can change the above as following:
> 
>  HOST_WIDE_INT const_string_leni = -1;

And here you don't need to initialize it.

>   if (idx1)
>     {
>       const_string_leni = compute_string_length (idx1);
>       var_string = arg2;
>     }
>   else
>     {
>       gcc_assert (idx2);
>       const_string_leni = compute_string_length (idx2);
>       var_string = arg1;
>     }
> 
> is this better?

Yes, though the gcc_assert could be just gcc_checking_assert (idx2);

> > so that would be mode 2, though that
> > mode isn't actually used in real-world code and thus might be not fully
> > tested.
> 
> so, using this routine with mode 2 should be the right approach to go? 
> and we need fully testing on this too?

It has been a while since I wrote it, so it would need careful analysis.

> >> 	B. use “get_range_strlen” in gimple-fold.h to decide the size of the object.  however, 
> >> it cannot provide valid info for simple array, either. 
> > 
> > get_range_strlen returns you a range, the minval is not what you're looking
> > for, that is the minimum string length, so might be too short for your
> > purposes.  And maxval is an upper bound, but you are looking for lower
> > bound, you need guarantees this amount of memory can be accessed, even if
> > there is 0 in the first byte.
> 
> my understanding is that: get_range_strlen returns the minimum and maximum length of the string pointed by the 
> pointer, and the maximum length of the string is determined by the size of the allocated memory pointed by the
> pointer, so, it should serve my purpose,   did I misunderstand it?

What I'm worried about is:
struct S { int a; char b[64]; };
struct T { struct S c; char d; };
int
foo (struct T *x)
{
  return strcmp (x->c.b, "01234567890123456789012345678901234567890123456789") == 0;
}
int
bar (void)
{
  struct S *p = malloc (offsetof (struct S, b) + 8);
  p->a = 123;
  strcpy (p->b, "0123456");
  return foo ((struct T *) p);
}
etc.  where if you transform that into memcmp (x->c.b, "01234567890123456789012345678901234567890123456789", 51) == 0
it will segfault, whereas strcmp would not.

> >  But if we find out during
> > expansion we don't want to expand it inline, we should fall back to calling
> > strcmp or strncmp.
> 
> under what situation we will NOT expand the memcpy_eq call inline?

      target = expand_builtin_memcmp (exp, target, fcode == BUILT_IN_MEMCMP_EQ);
      if (target)
        return target;
      if (fcode == BUILT_IN_MEMCMP_EQ)
        {
          tree newdecl = builtin_decl_explicit (BUILT_IN_MEMCMP);
          TREE_OPERAND (exp, 1) = build_fold_addr_expr (newdecl);
        }
is what builtins.c has, so it certainly counts with the possibility.
Now, both expand_builtin_memcmp, and emit_block_cmp_hints has several cases
when it fails.  E.g. can_do_by_pieces decides it is too expensive to do it
inline, and emit_block_cmp_via_cmpmem fails because the target doesn't have
cmpmemsi expander.  Various other cases.

Also, note that some target might have cmpstr*si expanders implemented, but
not cmpmemsi, in which case trying to optimize strcmp as memcmp_eq might be a
severe pessimization.

	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
  2017-12-15 17:47         ` Jakub Jelinek
@ 2017-12-15 20:49           ` Qing Zhao
  0 siblings, 0 replies; 19+ messages in thread
From: Qing Zhao @ 2017-12-15 20:49 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, jeff Law, richard Biener


> On Dec 15, 2017, at 11:47 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Fri, Dec 15, 2017 at 11:17:37AM -0600, Qing Zhao wrote:
>>  HOST_WIDE_INT const_string_leni = -1;
>> 
>>  if (idx1)
>>    {
>>      const_string_leni = compute_string_length (idx1);
>>      var_string = arg2;
>>    }
>>  else if (idx2)
>>    {
>>      const_string_leni = compute_string_length (idx2);
>>      var_string = arg1;
>>    }
>> 
>> so, the -Wmaybe-uninitialized should NOT issue warning, right?
> 
> Well, you had the var_string var uninitialized, so that is what I was
> talking about.

oh, yeah, forgot to initialize var_string.
> 
>> but anyway, I can change the above as following:
>> 
>> HOST_WIDE_INT const_string_leni = -1;
> 
> And here you don't need to initialize it.
> 
>>  if (idx1)
>>    {
>>      const_string_leni = compute_string_length (idx1);
>>      var_string = arg2;
>>    }
>>  else
>>    {
>>      gcc_assert (idx2);
>>      const_string_leni = compute_string_length (idx2);
>>      var_string = arg1;
>>    }
>> 
>> is this better?
> 
> Yes, though the gcc_assert could be just gcc_checking_assert (idx2);

what’s the major difference between these two assertion calls?

> 
>>> so that would be mode 2, though that
>>> mode isn't actually used in real-world code and thus might be not fully
>>> tested.
>> 
>> so, using this routine with mode 2 should be the right approach to go? 
>> and we need fully testing on this too?
> 
> It has been a while since I wrote it, so it would need careful analysis.
> 
>>>> 	B. use “get_range_strlen” in gimple-fold.h to decide the size of the object.  however, 
>>>> it cannot provide valid info for simple array, either. 
>>> 
>>> get_range_strlen returns you a range, the minval is not what you're looking
>>> for, that is the minimum string length, so might be too short for your
>>> purposes.  And maxval is an upper bound, but you are looking for lower
>>> bound, you need guarantees this amount of memory can be accessed, even if
>>> there is 0 in the first byte.
>> 
>> my understanding is that: get_range_strlen returns the minimum and maximum length of the string pointed by the 
>> pointer, and the maximum length of the string is determined by the size of the allocated memory pointed by the
>> pointer, so, it should serve my purpose,   did I misunderstand it?
> 
> What I'm worried about is:
> struct S { int a; char b[64]; };
> struct T { struct S c; char d; };
> int
> foo (struct T *x)
> {
>  return strcmp (x->c.b, "01234567890123456789012345678901234567890123456789") == 0;
> }
> int
> bar (void)
> {
>  struct S *p = malloc (offsetof (struct S, b) + 8);
>  p->a = 123;
>  strcpy (p->b, "0123456");
>  return foo ((struct T *) p);
> }
> etc.  where if you transform that into memcmp (x->c.b, "01234567890123456789012345678901234567890123456789", 51) == 0
> it will segfault, whereas strcmp would not.

thanks for the example.

is this issue only for the flexible array member? 
if so, the current get_range_strlen will distinguish whether this is for flexible array member or not, then we can disable such transformation
for flexible array member.


> 
>>> But if we find out during
>>> expansion we don't want to expand it inline, we should fall back to calling
>>> strcmp or strncmp.
>> 
>> under what situation we will NOT expand the memcpy_eq call inline?
> 
>      target = expand_builtin_memcmp (exp, target, fcode == BUILT_IN_MEMCMP_EQ);
>      if (target)
>        return target;
>      if (fcode == BUILT_IN_MEMCMP_EQ)
>        {
>          tree newdecl = builtin_decl_explicit (BUILT_IN_MEMCMP);
>          TREE_OPERAND (exp, 1) = build_fold_addr_expr (newdecl);
>        }
> is what builtins.c has, so it certainly counts with the possibility.
> Now, both expand_builtin_memcmp, and emit_block_cmp_hints has several cases
> when it fails.  E.g. can_do_by_pieces decides it is too expensive to do it
> inline, and emit_block_cmp_via_cmpmem fails because the target doesn't have
> cmpmemsi expander.  Various other cases.
> 
> Also, note that some target might have cmpstr*si expanders implemented, but
> not cmpmemsi, in which case trying to optimize strcmp as memcmp_eq might be a
> severe pessimization.

Okay, I see.  this is reasonable. 

if the following better:  (some details still need more work, just basic idea)

in handle_builtin_string_cmp of tree-ssa-strlen.c:

+  /* Replace strcmp or strncmp with the BUILT_IN_STRCMP_EQ.  */
+  if (var_sizei > final_length) 
+    {
+      tree fn = builtin_decl_implicit (BUILT_IN_STRCMP_EQ/BUILT_IN_STRNCMP_EQ;
+      if (!fn)
+	return true;
+      tree const_string_len = build_int_cst (size_type_node, final_length); 
+      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+    }

then in builtins.c, add the following:
    case BUILT_IN_STRCMP_EQ:
    case BUILT_IN_STRNCMP_EQ:
      target = expand_builtin_memcmp (exp, target, fcode == BUILT_IN_MEMCMP_EQ);
      if (target)
        return target;
      else 
        {
         tree newdecl = builtin_decl_explicit (BUILT_IN_STRCMP/BUILT_IN_STRNCMP);
          TREE_OPERAND (exp, 1) = build_fold_addr_expr (newdecl);
        }
      
      break;


thanks.

Qing

> 
> 	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH][Middle-end][version 2]2nd patch of PR78809 and PR83026
  2017-12-14 20:46 ` Jakub Jelinek
  2017-12-15 16:14   ` Qing Zhao
@ 2017-12-21 21:34   ` Qing Zhao
  2018-01-12 22:51     ` Jeff Law
  1 sibling, 1 reply; 19+ messages in thread
From: Qing Zhao @ 2017-12-21 21:34 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, jeff Law, richard Biener

Hi, 

I updated my patch based on all your comments. 

the major changes are the following:

	1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq;
            in builtins.c,  when expanding the new __builtin_str(n)cmp_eq call, expand them first as
            __builtin_memcmp_eq, if Not succeed,  change the call back to __builtin_str(n)cmp.
	2. change the call to “get_range_strlen” with “compute_objsize”.
	3. add the missing case for equality checking with zero;
	4. adjust the new testing case for PR83026; add a new testing case for the missing case added in 3.
	5. update “uhwi” to “shwi” for where it needs;
	6. some other minor format changes.

the changes are retested on x86 and aarch64, bootstrapped and regression tested. no issue.

Okay for trunk?

thanks.

Qing

Please see the updated patch:

gcc/ChangeLog:

+2017-12-21  Qing Zhao  <qing.zhao@oracle.com>
+
+	PR middle-end/78809
+	PR middle-end/83026
+	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
+	and BUILT_IN_STRNCMP_EQ.
+	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
+	BUILT_IN_STRNCMP_EQ.
+	* tree-ssa-strlen.c (compute_string_length): New function.
+	(handle_builtin_string_cmp): New function to handle calls to
+	string compare functions.
+	(strlen_optimize_stmt): Add handling to builtin string compare
+	calls. 
+	* tree.c (build_common_builtin_nodes): Add new defines of
+	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
+
gcc/testsuite/ChangeLog

+2017-12-21  Qing Zhao  <qing.zhao@oracle.com> 
+
+	PR middle-end/78809
+	* gcc.dg/strcmpopt_2.c: New testcase.
+	* gcc.dg/strcmpopt_3.c: New testcase.
+
+	PR middle-end/83026
+	* gcc.dg/strcmpopt_3.c: New testcase.
+

---
 gcc/builtins.c                     |  33 ++++++
 gcc/builtins.def                   |   5 +
 gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 ++++++++++++
 gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 ++++++
 gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 +++
 gcc/tree-ssa-strlen.c              | 215 +++++++++++++++++++++++++++++++++++++
 gcc/tree.c                         |   8 ++
 7 files changed, 375 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 6b25253..a5f6885 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6953,12 +6953,45 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 	return target;
       break;
 
+    /* expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it 
+       back to a BUILT_IN_STRCMP. Remember to delete the 3rd paramater
+       when changing it to a strcmp call.  */
+    case BUILT_IN_STRCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* change this call back to a BUILT_IN_STRCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRCMP));
+
+      /* delete the last parameter.  */
+      unsigned int i;
+      vec<tree, va_gc> *arg_vec;
+      vec_alloc (arg_vec, 2);
+      for (i = 0; i < 2; i++)
+	arg_vec->quick_push (CALL_EXPR_ARG (exp, i));
+      exp = build_call_vec (TREE_TYPE (exp), CALL_EXPR_FN (exp), arg_vec);
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRCMP:
       target = expand_builtin_strcmp (exp, target);
       if (target)
 	return target;
       break;
 
+    /* expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it
+       back to a BUILT_IN_STRNCMP.  */
+    case BUILT_IN_STRNCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* change it back to a BUILT_IN_STRNCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRNCMP));
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRNCMP:
       target = expand_builtin_strncmp (exp, target, mode);
       if (target)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 2281021..9eb79fd 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -949,6 +949,11 @@ DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX, "__builtin_alloca_with_ali
    equality with zero.  */
 DEF_BUILTIN_STUB (BUILT_IN_MEMCMP_EQ, "__builtin_memcmp_eq")
 
+/* An internal version of strcmp/strncmp, used when the result is only 
+   tested for equality with zero.  */
+DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq")
+DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq")
+
 /* Object size checking builtins.  */
 DEF_GCC_BUILTIN	       (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_PURE_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_MEMCPY_CHK, "__memcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
new file mode 100644
index 0000000..0131b8f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+char s[100] = {'a','b','c','d'};
+typedef struct { char s[8]; int x; } S;
+
+__attribute__ ((noinline)) int 
+f1 (S *s) 
+{ 
+  return __builtin_strcmp (s->s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f2 (void) 
+{ 
+  return __builtin_strcmp (s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f3 (S *s) 
+{ 
+  return __builtin_strcmp ("abc", s->s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f4 (void) 
+{ 
+  return __builtin_strcmp ("abc", s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f5 (S *s) 
+{ 
+  return __builtin_strncmp (s->s, "abc", 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f6 (void) 
+{ 
+  return __builtin_strncmp (s, "abc", 2) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f7 (S *s) 
+{ 
+  return __builtin_strncmp ("abc", s->s, 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f8 (void) 
+{ 
+  return __builtin_strncmp ("abc", s, 2) != 0; 
+}
+
+int main (void)
+{
+  S ss = {{'a','b','c'}, 2};
+
+  if (f1 (&ss) != 0 || f2 () != 1 || f3 (&ss) != 0 ||
+      f4 () != 1 || f5 (&ss) != 0 || f6 () != 0 ||
+      f7 (&ss) != 0 || f8 () != 0)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 8 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
new file mode 100644
index 0000000..86a0d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__ ((noinline)) int 
+f1 (void) 
+{ 
+  char *s0= "abcd";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp(s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int
+f2 (void) 
+{ 
+  char *s0 = "ab";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp("abc", s) != 0; 
+}
+
+int main (void)
+{
+  if (f1 () != 1 
+      || f2 () != 1)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c
new file mode 100644
index 0000000..d727bc3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+typedef struct { char s[8]; int x; } S;
+extern int max_i;
+
+int
+f1 (S * s)
+{ 
+  int result, i;
+  for (i = 0; i < max_i; i++)
+    result += __builtin_strcmp (s->s, "abc") != 0 ? 2 : 1;
+  return result;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 1 "strlen" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 94f20ef..57563ef 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/* Given an index to the strinfo vector, compute the string length for the
+   corresponding string. Return -1 when unknown.  */
+ 
+static HOST_WIDE_INT 
+compute_string_length (int idx)
+{
+  HOST_WIDE_INT string_leni = -1; 
+  gcc_assert (idx != 0);
+
+  if (idx < 0)
+    string_leni = ~idx;
+  else
+    {
+      strinfo *si = get_strinfo (idx);
+      if (si)
+	{
+	  tree const_string_len = get_string_length (si);
+	  if (const_string_len && tree_fits_shwi_p (const_string_len))
+	    string_leni = tree_to_shwi (const_string_len);
+	}
+    }
+
+  if (string_leni < 0)
+    return -1;
+
+  return string_leni;
+}
+
+/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
+   equality test against zero:
+
+   A. When both arguments are constant strings and it's a strcmp:
+      * if the length of the strings are NOT equal, we can safely fold the call
+        to a non-zero value.
+      * otherwise, do nothing now.
+  
+   B. When one of the arguments is constant string, try to replace the call with
+   a __builtin_str(n)cmp_eq call where possible, i.e:
+
+   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
+   constant string, C is a constant.
+     if (C <= strlen(STR) && sizeof_array(s) > C)
+       {
+         replace this call with
+         strncmp_eq (s, STR, C) (!)= 0
+       }
+     if (C > strlen(STR)
+       {
+         it can be safely treated as a call to strcmp (s, STR) (!)= 0
+         can handled by the following strcmp.
+       }
+
+   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
+   constant string.
+     if  (sizeof_array(s) > strlen(STR))
+       {
+         replace this call with
+         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
+       }
+ */ 
+
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree res = gimple_call_lhs (stmt);
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  int idx1 = get_stridx (arg1);
+  int idx2 = get_stridx (arg2);
+  HOST_WIDE_INT length = -1;
+  bool is_ncmp = false;
+
+  if (!res)
+    return true;
+
+  /* When both arguments are unknown, do nothing.  */
+  if (idx1 == 0 && idx2 == 0)
+    return true;
+
+  /* Handle strncmp function.  */
+  if (gimple_call_num_args (stmt) == 3)
+    {
+      tree len = gimple_call_arg (stmt, 2);
+      if (tree_fits_shwi_p (len))
+        length = tree_to_shwi (len);
+
+      is_ncmp = true;
+    }
+
+  /* For strncmp, if the length argument is NOT known, do nothing.  */
+  if (is_ncmp && length < 0)
+    return true;
+
+  /* When the result is ONLY used to do equality test against zero.  */
+  FOR_EACH_IMM_USE_FAST (use_p, iter, res) 
+    {    
+      gimple *use_stmt = USE_STMT (use_p);
+
+      if (is_gimple_debug (use_stmt))
+        continue;
+      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
+	{    
+	  tree_code code = gimple_assign_rhs_code (use_stmt);
+	  if (code == COND_EXPR) 
+	    {
+	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
+	      if ((TREE_CODE (cond_expr) != EQ_EXPR 
+		   && (TREE_CODE (cond_expr) != NE_EXPR))
+		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
+		return true;
+	    }
+	  else if (code == EQ_EXPR || code == NE_EXPR)
+	    {
+	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
+		return true;
+            }
+	  else 
+	    return true;
+	}
+      else if (gimple_code (use_stmt) == GIMPLE_COND)
+	{
+	  tree_code code = gimple_cond_code (use_stmt);
+	  if ((code != EQ_EXPR && code != NE_EXPR)
+	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
+	    return true;
+	}
+      else
+        return true;
+    }
+  
+  /* When both arguments are known, and their strlens are unequal, we can 
+     safely fold the call to a non-zero value for strcmp;
+     othewise, do nothing now.  */
+  if (idx1 != 0 && idx2 != 0)
+    {
+      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
+      HOST_WIDE_INT const_string_leni2 = compute_string_length (idx2);
+
+      if (!is_ncmp 
+	  && const_string_leni1 != -1
+	  && const_string_leni2 != -1
+	  && const_string_leni1 != const_string_leni2) 
+	{
+	  replace_call_with_value (gsi, integer_one_node); 
+	  return false;
+	}
+      return true;
+    }
+
+  /* When one of args is constant string.  */
+  tree var_string = NULL_TREE;
+  HOST_WIDE_INT const_string_leni = -1;
+  
+  if (idx1)
+    {
+      const_string_leni = compute_string_length (idx1);
+      var_string = arg2;
+    } 
+  else 
+    {
+      gcc_checking_assert (idx2);
+      const_string_leni = compute_string_length (idx2);
+      var_string = arg1;
+    } 
+
+  if (const_string_leni < 0) 
+    return true;
+ 
+  unsigned HOST_WIDE_INT var_sizei = 0;
+  /* try to determine the minimum size of the object pointed by var_string.  */
+  tree size = compute_objsize (var_string, 2);
+
+  if (!size)
+    return true;
+ 
+  if (tree_fits_uhwi_p (size))
+    var_sizei = tree_to_uhwi (size);
+
+  if (var_sizei == 0)
+    return true;
+
+  /* For strncmp, if length > const_string_leni , this call can be safely 
+     transformed to a strcmp.  */
+  if (is_ncmp && length > const_string_leni)
+    is_ncmp = false;
+
+  unsigned HOST_WIDE_INT final_length 
+    = is_ncmp ? length : const_string_leni + 1;
+
+  /* Replace strcmp or strncmp with the corresponding str(n)cmp_eq.  */
+  if (var_sizei > final_length) 
+    {
+      tree fn 
+	= (is_ncmp 
+	   ? builtin_decl_implicit (BUILT_IN_STRNCMP_EQ) 
+	   : builtin_decl_implicit (BUILT_IN_STRCMP_EQ));
+      if (!fn)
+	return true;
+      tree const_string_len = build_int_cst (size_type_node, final_length); 
+      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+    }
+  else 
+    return true;
+
+  return false;
+}
+
 /* Handle a POINTER_PLUS_EXPR statement.
    For p = "abcd" + 2; compute associated length, or if
    p = q + off is pointing to a '\0' character of a string, call
@@ -2939,6 +3149,11 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
 	    if (!handle_builtin_memcmp (gsi))
 	      return false;
 	    break;
+	  case BUILT_IN_STRCMP:
+	  case BUILT_IN_STRNCMP:
+	    if (!handle_builtin_string_cmp (gsi))
+	      return false;
+	    break;
 	  default:
 	    break;
 	  }
diff --git a/gcc/tree.c b/gcc/tree.c
index ed1852b..e23b3cd 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10047,6 +10047,14 @@ build_common_builtin_nodes (void)
 			"__builtin_memcmp_eq",
 			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
 
+  local_define_builtin ("__builtin_strncmp_eq", ftype, BUILT_IN_STRNCMP_EQ,
+			"__builtin_strncmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
+  local_define_builtin ("__builtin_strcmp_eq", ftype, BUILT_IN_STRCMP_EQ,
+			"__builtin_strcmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
   /* If there's a possibility that we might use the ARM EABI, build the
     alternate __cxa_end_cleanup node used to resume from C++.  */
   if (targetm.arm_eabi_unwinder)
-- 
1.9.1


> On Dec 14, 2017, at 2:45 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Thu, Dec 14, 2017 at 01:45:21PM -0600, Qing Zhao wrote:
>> 2017-12-11  Qing Zhao  <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>
> 
> No " <mailto:qing.zhao@oracle.com>" in ChangeLog entries please.
> 
>> --- a/gcc/tree-ssa-strlen.c
>> +++ b/gcc/tree-ssa-strlen.c
>> @@ -2541,6 +2541,198 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>>   return false;
>> }
>> 
>> +/* Given an index to the strinfo vector, compute the string length for the
>> +   corresponding string. Return -1 when unknown.  */
>> + 
>> +static HOST_WIDE_INT 
>> +compute_string_length (int idx)
>> +{
>> +  HOST_WIDE_INT string_leni = -1; 
>> +  gcc_assert (idx != 0);
>> +
>> +  if (idx < 0)
>> +    string_leni = ~idx;
>> +  else
>> +    {
>> +      strinfo *si = get_strinfo (idx);
>> +      if (si)
>> +	{
>> +	  tree const_string_len = get_string_length (si);
>> +	  string_leni
>> +	    = (const_string_len && tree_fits_uhwi_p (const_string_len)
>> +	       ? tree_to_uhwi(const_string_len) : -1); 
> 
> So, you are returning a signed HWI, then clearly tree_fits_uhwi_p and
> tree_to_uhwi are inappropriate, you should have used tree_fits_shwi_p
> and tree_to_shwi.  Space after function name is missing too.
> And, as you start by initializing string_leni to -1, there is no
> point to write it this way rather than
> 	  if (const_string_len && tree_fits_shwi_p (const_string_len))
> 	    string_leni = tree_to_shwi (const_string_len);
> 
>> +	}
>> +    }
> 
> Maybe also do
>  if (string_leni < 0)
>    return -1;
> 
>> +  return string_leni;
> 
> unless the callers just look for negative value as unusable.
> 
>> +      tree len = gimple_call_arg (stmt, 2);
>> +      if (tree_fits_uhwi_p (len))
>> +        length = tree_to_uhwi (len);
> 
> Similarly to above, you are mixing signed and unsigned HWIs too much.
> 
>> +      if (gimple_code (ustmt) == GIMPLE_ASSIGN)
> 
>      if (is_gimple_assign (ustmt))
> 
> Usually we use use_stmt instead of ustmt.
> 
>> +	{    
>> +	  gassign *asgn = as_a <gassign *> (ustmt);
> 
> No need for the gassign and ugly as_a, gimple_assign_rhs_code
> as well as gimple_assign_rhs2 can be called on gimple * too.
> 
>> +	  tree_code code = gimple_assign_rhs_code (asgn);
>> +	  if ((code != EQ_EXPR && code != NE_EXPR)
>> +	      || !integer_zerop (gimple_assign_rhs2 (asgn)))
>> +	    return true;
>> +	}
>> +      else if (gimple_code (ustmt) == GIMPLE_COND)
>> +	{
>> +	  tree_code code = gimple_cond_code (ustmt);
>> +	  if ((code != EQ_EXPR && code != NE_EXPR)
>> +	      || !integer_zerop (gimple_cond_rhs (ustmt)))
>> +	    return true;
> 
> There is another case you are missing, assign stmt with
> gimple_assign_rhs_code COND_EXPR, where gimple_assign_rhs1 is
> tree with TREE_CODE EQ_EXPR or NE_EXPR with TREE_OPERAND (rhs1, 1)
> integer_zerop.
> 
>> +  /* When both arguments are known, and their strlens are unequal, we can 
>> +     safely fold the call to a non-zero value for strcmp;
>> +     othewise, do nothing now.  */
>> +  if (idx1 != 0 && idx2 != 0)
>> +    {
>> +      HOST_WIDE_INT const_string_leni1 = -1;
>> +      HOST_WIDE_INT const_string_leni2 = -1;
>> +      const_string_leni1 = compute_string_length (idx1);
>> +      const_string_leni2 = compute_string_length (idx2);
> 
> Why do you initialize the vars when you immediately overwrite it?
> Just do
>      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
> etc.
> 
>> +  /* When one of args is constant string.  */
>> +  tree var_string;
>> +  HOST_WIDE_INT const_string_leni = -1;
>> +  
>> +  if (idx1)
>> +    {
>> +      const_string_leni = compute_string_length (idx1);
>> +      var_string = arg2;
>> +    } 
>> +  else if (idx2)
>> +    {
>> +      const_string_leni = compute_string_length (idx2);
>> +      var_string = arg1;
>> +    } 
> 
> Haven't you checked earlier that one of idx1 and idx2 is non-zero?
> If so, then the else if (idx2) will just might confuse -Wuninitialized,
> if you just use else, you don't need to initialize const_string_leni
> either.
> 
>> +  /* Try to get the min and max string length for var_string, the max length is
>> +     the size of the array - 1, recorded in size[1].  */ 
>> +  get_range_strlen (var_string, size);
>> +  if (size[1] && tree_fits_uhwi_p (size[1]))
>> +    var_sizei = tree_to_uhwi (size[1]) + 1;
> 
> This is something that looks problematic to me.  get_range_strlen returns
> some conservative upper bound on the string length, which is fine if
> var_string points to say a TREE_STATIC variable where you know the allocated
> size, or automatic variable.  But if somebody passes you a pointer to a
> structure and the source doesn't contain aggregate copying for it, not sure
> if you can take for granted that all the bytes are readable after the '\0'
> in the string.  Hopefully at least for flexible array members and arrays in
> such positions get_range_strlen will not provide the upper bound, but even
> in other cases it doesn't feel safe to me.
> 
> Furthermore, in the comments you say that you do it only for small strings,
> but in the patch I can't see any upper bound, so you could transform strlen
> that would happen to return say just 1 or 2 with a function call that
> possibly reads megabytes of data (memcmp may read all bytes, not just stop
> at the first difference).
>> +  unsigned HOST_WIDE_INT final_length 
>> +    = is_ncmp ? length : (const_string_leni + 1);
> 
> Why the ()s?
> 
> 	Jakub

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end][version 2]2nd patch of PR78809 and PR83026
  2017-12-21 21:34   ` [PATCH][Middle-end][version 2]2nd " Qing Zhao
@ 2018-01-12 22:51     ` Jeff Law
  2018-01-15  8:56       ` Richard Biener
  2018-01-25 20:34       ` [PATCH][Middle-end][version 2]2nd " Qing Zhao
  0 siblings, 2 replies; 19+ messages in thread
From: Jeff Law @ 2018-01-12 22:51 UTC (permalink / raw)
  To: Qing Zhao, Jakub Jelinek; +Cc: gcc-patches, richard Biener

On 12/21/2017 02:25 PM, Qing Zhao wrote:
> Hi, 
> 
> I updated my patch based on all your comments. 
> 
> the major changes are the following:
> 
> 	1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq;
>             in builtins.c,  when expanding the new __builtin_str(n)cmp_eq call, expand them first as
>             __builtin_memcmp_eq, if Not succeed,  change the call back to __builtin_str(n)cmp.
> 	2. change the call to “get_range_strlen” with “compute_objsize”.
Please read the big comment before compute_objsize.  If you are going to
use it to influence code generation or optimization, then you're most
likely doing something wrong.

compute_objsize can return an estimate in some circumstances.


> 	3. add the missing case for equality checking with zero;
> 	4. adjust the new testing case for PR83026; add a new testing case for the missing case added in 3.
> 	5. update “uhwi” to “shwi” for where it needs;
> 	6. some other minor format changes.
> 
> the changes are retested on x86 and aarch64, bootstrapped and regression tested. no issue.
> 
> Okay for trunk?
> 
> thanks.
> 
> Qing
> 
> Please see the updated patch:
> 
> gcc/ChangeLog:
> 
> +2017-12-21  Qing Zhao  <qing.zhao@oracle.com>
> +
> +	PR middle-end/78809
> +	PR middle-end/83026
> +	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
> +	and BUILT_IN_STRNCMP_EQ.
> +	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
> +	BUILT_IN_STRNCMP_EQ.
> +	* tree-ssa-strlen.c (compute_string_length): New function.
> +	(handle_builtin_string_cmp): New function to handle calls to
> +	string compare functions.
> +	(strlen_optimize_stmt): Add handling to builtin string compare
> +	calls. 
> +	* tree.c (build_common_builtin_nodes): Add new defines of
> +	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
> +
> gcc/testsuite/ChangeLog
> 
> +2017-12-21  Qing Zhao  <qing.zhao@oracle.com> 
> +
> +	PR middle-end/78809
> +	* gcc.dg/strcmpopt_2.c: New testcase.
> +	* gcc.dg/strcmpopt_3.c: New testcase.
> +
> +	PR middle-end/83026
> +	* gcc.dg/strcmpopt_3.c: New testcase.
What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ.
ISTM that if you're going to introduce those new builtins, then you have
to audit all the code that runs between their introduction into the IL
and when you expand them to ensure they're handled properly.

All you're really doing is carrying along a status bit about what
tree-ssa-strlen did.  So you could possibly store that status bit somewhere.

The concern with both is that something later invalidates the analysis
you've done.  I'm having a hard time coming up with a case where this
could happen, but I'm definitely concerned about this possibility.
Though I feel it's more likely to happen if we store a status bit vs
using STRCMP_EQ STRNCMP_EQ.

[ For example, we have two calls with the same arguments, but one feeds
an equality test, the other does not.  If we store a status bit that one
could be transformed, but then we end up CSE-ing the two calls, the
status bit would be incorrect because one of the calls did not feed an
equality test.  Hmmm. ]




> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> index 94f20ef..57563ef 100644
> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c
> @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>    return false;
>  }
>  
> +/* Given an index to the strinfo vector, compute the string length for the
> +   corresponding string. Return -1 when unknown.  */
> + 
> +static HOST_WIDE_INT 
> +compute_string_length (int idx)
> +{
> +  HOST_WIDE_INT string_leni = -1; 
> +  gcc_assert (idx != 0);
> +
> +  if (idx < 0)
> +    string_leni = ~idx;
So it seems to me you should just
  return ~idx;

Then appropriately simplify the rest of the code.

> +
> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
> +   equality test against zero:
> +
> +   A. When both arguments are constant strings and it's a strcmp:
> +      * if the length of the strings are NOT equal, we can safely fold the call
> +        to a non-zero value.
> +      * otherwise, do nothing now.
I'm guessing your comment needs a bit of work.  If both arguments are
constant strings, then we can just use the host str[n]cmp to fold the
str[n]cmp to a constant.  Presumably this is handled earlier :-)

So what I'm guessing is you're really referring to the case where the
lengths are known constants, even if the contents of the strings
themselves are not.  In that case if its an equality comparison, then
you can fold to a constant.  Otherwise we do nothing.  So I think the
comment needs updating here.






> +  
> +   B. When one of the arguments is constant string, try to replace the call with
> +   a __builtin_str(n)cmp_eq call where possible, i.e:
> +
> +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   constant string, C is a constant.
> +     if (C <= strlen(STR) && sizeof_array(s) > C)
> +       {
> +         replace this call with
> +         strncmp_eq (s, STR, C) (!)= 0
> +       }
> +     if (C > strlen(STR)
> +       {
> +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
> +         can handled by the following strcmp.
> +       }
> +
> +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   constant string.
> +     if  (sizeof_array(s) > strlen(STR))
> +       {
> +         replace this call with
> +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
So I'd hazard a guess that this comment has the same problem.  It's
mixing up the concept of a constant string and the concept of a
nonconstant string, but which has a known constant length.

First, if one of the arguments is a string constant, then it should be
easy to get the constant at expansion time with minimal effort.  It's
also the case that knowing if we're comparing the result against zero is
trivial to figure out at expansion time.  This would probably be a
reasonble thing to add to the expanders.



Your function comment for handle_builtin_string_cmp does not indicate
what the return value means.  From reading the code is appears to return
TRUE when it does nothing and FALSE when it optimizes the call in some
way.  Is that correct?  That seems inverted from the way we'd normally
write this stuff.



> +
> +  /* When one of args is constant string.  */
> +  tree var_string = NULL_TREE;
> +  HOST_WIDE_INT const_string_leni = -1;
> +  
> +  if (idx1)
> +    {
> +      const_string_leni = compute_string_length (idx1);
> +      var_string = arg2;
> +    } 
> +  else 
> +    {
> +      gcc_checking_assert (idx2);
> +      const_string_leni = compute_string_length (idx2);
> +      var_string = arg1;
> +    } 
> +
> +  if (const_string_leni < 0) 
> +    return true;
> + 
> +  unsigned HOST_WIDE_INT var_sizei = 0;
> +  /* try to determine the minimum size of the object pointed by var_string.  */
> +  tree size = compute_objsize (var_string, 2);
So this worries me as well.  compute_objsize is not supposed to be used
to influence code generation or optimization.  It is not guaranteed to
return an exact size, but instead may return an estimate!




I'd really like to hear Jakub or Richi chime in with their thoughts on
how this transforms one builtin into another and whether or not they
think that is wise.

jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end][version 2]2nd patch of PR78809 and PR83026
  2018-01-12 22:51     ` Jeff Law
@ 2018-01-15  8:56       ` Richard Biener
  2018-01-25 20:46         ` Qing Zhao
  2018-01-25 20:34       ` [PATCH][Middle-end][version 2]2nd " Qing Zhao
  1 sibling, 1 reply; 19+ messages in thread
From: Richard Biener @ 2018-01-15  8:56 UTC (permalink / raw)
  To: Jeff Law; +Cc: Qing Zhao, Jakub Jelinek, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 8203 bytes --]

On Fri, 12 Jan 2018, Jeff Law wrote:

> On 12/21/2017 02:25 PM, Qing Zhao wrote:
> > Hi, 
> > 
> > I updated my patch based on all your comments. 
> > 
> > the major changes are the following:
> > 
> > 	1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq;
> >             in builtins.c,  when expanding the new __builtin_str(n)cmp_eq call, expand them first as
> >             __builtin_memcmp_eq, if Not succeed,  change the call back to __builtin_str(n)cmp.
> > 	2. change the call to “get_range_strlen” with “compute_objsize”.
> Please read the big comment before compute_objsize.  If you are going to
> use it to influence code generation or optimization, then you're most
> likely doing something wrong.
> 
> compute_objsize can return an estimate in some circumstances.
> 
> 
> > 	3. add the missing case for equality checking with zero;
> > 	4. adjust the new testing case for PR83026; add a new testing case for the missing case added in 3.
> > 	5. update “uhwi” to “shwi” for where it needs;
> > 	6. some other minor format changes.
> > 
> > the changes are retested on x86 and aarch64, bootstrapped and regression tested. no issue.
> > 
> > Okay for trunk?
> > 
> > thanks.
> > 
> > Qing
> > 
> > Please see the updated patch:
> > 
> > gcc/ChangeLog:
> > 
> > +2017-12-21  Qing Zhao  <qing.zhao@oracle.com>
> > +
> > +	PR middle-end/78809
> > +	PR middle-end/83026
> > +	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
> > +	and BUILT_IN_STRNCMP_EQ.
> > +	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
> > +	BUILT_IN_STRNCMP_EQ.
> > +	* tree-ssa-strlen.c (compute_string_length): New function.
> > +	(handle_builtin_string_cmp): New function to handle calls to
> > +	string compare functions.
> > +	(strlen_optimize_stmt): Add handling to builtin string compare
> > +	calls. 
> > +	* tree.c (build_common_builtin_nodes): Add new defines of
> > +	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
> > +
> > gcc/testsuite/ChangeLog
> > 
> > +2017-12-21  Qing Zhao  <qing.zhao@oracle.com> 
> > +
> > +	PR middle-end/78809
> > +	* gcc.dg/strcmpopt_2.c: New testcase.
> > +	* gcc.dg/strcmpopt_3.c: New testcase.
> > +
> > +	PR middle-end/83026
> > +	* gcc.dg/strcmpopt_3.c: New testcase.
> What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ.
> ISTM that if you're going to introduce those new builtins, then you have
> to audit all the code that runs between their introduction into the IL
> and when you expand them to ensure they're handled properly.
> 
> All you're really doing is carrying along a status bit about what
> tree-ssa-strlen did.  So you could possibly store that status bit somewhere.
> 
> The concern with both is that something later invalidates the analysis
> you've done.  I'm having a hard time coming up with a case where this
> could happen, but I'm definitely concerned about this possibility.
> Though I feel it's more likely to happen if we store a status bit vs
> using STRCMP_EQ STRNCMP_EQ.
> 
> [ For example, we have two calls with the same arguments, but one feeds
> an equality test, the other does not.  If we store a status bit that one
> could be transformed, but then we end up CSE-ing the two calls, the
> status bit would be incorrect because one of the calls did not feed an
> equality test.  Hmmm. ]
> 
> 
> 
> 
> > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> > index 94f20ef..57563ef 100644
> > --- a/gcc/tree-ssa-strlen.c
> > +++ b/gcc/tree-ssa-strlen.c
> > @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
> >    return false;
> >  }
> >  
> > +/* Given an index to the strinfo vector, compute the string length for the
> > +   corresponding string. Return -1 when unknown.  */
> > + 
> > +static HOST_WIDE_INT 
> > +compute_string_length (int idx)
> > +{
> > +  HOST_WIDE_INT string_leni = -1; 
> > +  gcc_assert (idx != 0);
> > +
> > +  if (idx < 0)
> > +    string_leni = ~idx;
> So it seems to me you should just
>   return ~idx;
> 
> Then appropriately simplify the rest of the code.
> 
> > +
> > +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
> > +   equality test against zero:
> > +
> > +   A. When both arguments are constant strings and it's a strcmp:
> > +      * if the length of the strings are NOT equal, we can safely fold the call
> > +        to a non-zero value.
> > +      * otherwise, do nothing now.
> I'm guessing your comment needs a bit of work.  If both arguments are
> constant strings, then we can just use the host str[n]cmp to fold the
> str[n]cmp to a constant.  Presumably this is handled earlier :-)
> 
> So what I'm guessing is you're really referring to the case where the
> lengths are known constants, even if the contents of the strings
> themselves are not.  In that case if its an equality comparison, then
> you can fold to a constant.  Otherwise we do nothing.  So I think the
> comment needs updating here.
> 
> 
> 
> 
> 
> 
> > +  
> > +   B. When one of the arguments is constant string, try to replace the call with
> > +   a __builtin_str(n)cmp_eq call where possible, i.e:
> > +
> > +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
> > +   constant string, C is a constant.
> > +     if (C <= strlen(STR) && sizeof_array(s) > C)
> > +       {
> > +         replace this call with
> > +         strncmp_eq (s, STR, C) (!)= 0
> > +       }
> > +     if (C > strlen(STR)
> > +       {
> > +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
> > +         can handled by the following strcmp.
> > +       }
> > +
> > +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
> > +   constant string.
> > +     if  (sizeof_array(s) > strlen(STR))
> > +       {
> > +         replace this call with
> > +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
> So I'd hazard a guess that this comment has the same problem.  It's
> mixing up the concept of a constant string and the concept of a
> nonconstant string, but which has a known constant length.
> 
> First, if one of the arguments is a string constant, then it should be
> easy to get the constant at expansion time with minimal effort.  It's
> also the case that knowing if we're comparing the result against zero is
> trivial to figure out at expansion time.  This would probably be a
> reasonble thing to add to the expanders.
> 
> 
> 
> Your function comment for handle_builtin_string_cmp does not indicate
> what the return value means.  From reading the code is appears to return
> TRUE when it does nothing and FALSE when it optimizes the call in some
> way.  Is that correct?  That seems inverted from the way we'd normally
> write this stuff.
> 
> 
> 
> > +
> > +  /* When one of args is constant string.  */
> > +  tree var_string = NULL_TREE;
> > +  HOST_WIDE_INT const_string_leni = -1;
> > +  
> > +  if (idx1)
> > +    {
> > +      const_string_leni = compute_string_length (idx1);
> > +      var_string = arg2;
> > +    } 
> > +  else 
> > +    {
> > +      gcc_checking_assert (idx2);
> > +      const_string_leni = compute_string_length (idx2);
> > +      var_string = arg1;
> > +    } 
> > +
> > +  if (const_string_leni < 0) 
> > +    return true;
> > + 
> > +  unsigned HOST_WIDE_INT var_sizei = 0;
> > +  /* try to determine the minimum size of the object pointed by var_string.  */
> > +  tree size = compute_objsize (var_string, 2);
> So this worries me as well.  compute_objsize is not supposed to be used
> to influence code generation or optimization.  It is not guaranteed to
> return an exact size, but instead may return an estimate!

Yes, compute_objsize cannot be used for optimizations.

> I'd really like to hear Jakub or Richi chime in with their thoughts on
> how this transforms one builtin into another and whether or not they
> think that is wise.

Given it follows how we handle memcmp () != 0 it's ok to trigger
inline expansion.  Didn't look into the patch whether it does this.
And yes, places that look at STR[N]CMP now also have to look at
STR[N]CMP_EQ.

We're now in stage4 so please queue this patch and ping it during
next stage1.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end][version 2]2nd patch of PR78809 and PR83026
  2018-01-12 22:51     ` Jeff Law
  2018-01-15  8:56       ` Richard Biener
@ 2018-01-25 20:34       ` Qing Zhao
  1 sibling, 0 replies; 19+ messages in thread
From: Qing Zhao @ 2018-01-25 20:34 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: Jakub Jelinek, gcc-patches

Hi, Jeff,

Really sorry for my  delay. 

Your email on 1/12/2018 and Richard’s email on 1/15/2018,   were completely lost in my mailboxes until yesterday my colleague, Paolo Carlini, forwarded it to me. 

I really apologize for the late reply.

Please see my reply below:

> On Jan 12, 2018, at 4:47 PM, Jeff Law <law@redhat.com <mailto:law@redhat.com>> wrote:
>> 
>> 
>> 	1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq;
>>            in builtins.c,  when expanding the new __builtin_str(n)cmp_eq call, expand them first as
>>            __builtin_memcmp_eq, if Not succeed,  change the call back to __builtin_str(n)cmp.
>> 	2. change the call to “get_range_strlen” with “compute_objsize”.
> Please read the big comment before compute_objsize.  If you are going to
> use it to influence code generation or optimization, then you're most
> likely doing something wrong.
> 
> compute_objsize can return an estimate in some circumstances.

> On Jan 15, 2018, at 2:48 AM, Richard Biener <rguenther@suse.de <mailto:rguenther@suse.de>> wrote:
> 
> Yes, compute_objsize cannot be used for optimizations.
Yes, I just checked the compute_objsize again, you are right, it might return an estimated code size.


> On Jan 12, 2018, at 4:47 PM, Jeff Law <law@redhat.com <mailto:law@redhat.com>> wrote:

> What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ.
> ISTM that if you're going to introduce those new builtins, then you have
> to audit all the code that runs between their introduction into the IL
> and when you expand them to ensure they're handled properly.
> 
> All you're really doing is carrying along a status bit about what
> tree-ssa-strlen did.  So you could possibly store that status bit somewhere.
> 
> The concern with both is that something later invalidates the analysis
> you've done.  I'm having a hard time coming up with a case where this
> could happen, but I'm definitely concerned about this possibility.
> Though I feel it's more likely to happen if we store a status bit vs
> using STRCMP_EQ STRNCMP_EQ.
> 
> [ For example, we have two calls with the same arguments, but one feeds
> an equality test, the other does not.  If we store a status bit that one
> could be transformed, but then we end up CSE-ing the two calls, the
> status bit would be incorrect because one of the calls did not feed an
> equality test.  Hmmm. ]

See below.

>> +static HOST_WIDE_INT 
>> +compute_string_length (int idx)
>> +{
>> +  HOST_WIDE_INT string_leni = -1; 
>> +  gcc_assert (idx != 0);
>> +
>> +  if (idx < 0)
>> +    string_leni = ~idx;
> So it seems to me you should just
>  return ~idx;
> 
> Then appropriately simplify the rest of the code.

Okay, I will update my code per your suggestion. 
> 
>> +
>> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
>> +   equality test against zero:
>> +
>> +   A. When both arguments are constant strings and it's a strcmp:
>> +      * if the length of the strings are NOT equal, we can safely fold the call
>> +        to a non-zero value.
>> +      * otherwise, do nothing now.
> I'm guessing your comment needs a bit of work.  If both arguments are
> constant strings, then we can just use the host str[n]cmp to fold the
> str[n]cmp to a constant.  Presumably this is handled earlier :-)
> 
> So what I'm guessing is you're really referring to the case where the
> lengths are known constants, even if the contents of the strings
> themselves are not.  In that case if its an equality comparison, then
> you can fold to a constant.  Otherwise we do nothing.  So I think the
> comment needs updating here.

your understanding is correct, I will update the comments to make it more accurate.

>> +  
>> +   B. When one of the arguments is constant string, try to replace the call with
>> +   a __builtin_str(n)cmp_eq call where possible, i.e:
>> +
>> +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
>> +   constant string, C is a constant.
>> +     if (C <= strlen(STR) && sizeof_array(s) > C)
>> +       {
>> +         replace this call with
>> +         strncmp_eq (s, STR, C) (!)= 0
>> +       }
>> +     if (C > strlen(STR)
>> +       {
>> +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
>> +         can handled by the following strcmp.
>> +       }
>> +
>> +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
>> +   constant string.
>> +     if  (sizeof_array(s) > strlen(STR))
>> +       {
>> +         replace this call with
>> +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
> So I'd hazard a guess that this comment has the same problem.  It's
> mixing up the concept of a constant string and the concept of a
> nonconstant string, but which has a known constant length.
> 
> First, if one of the arguments is a string constant, then it should be
> easy to get the constant at expansion time with minimal effort.  It's
> also the case that knowing if we're comparing the result against zero is
> trivial to figure out at expansion time.  This would probably be a
> reasonble thing to add to the expanders.

Yes, should be a string with constant length.
I will update my comments to reflect this.

> Your function comment for handle_builtin_string_cmp does not indicate
> what the return value means.  From reading the code is appears to return
> TRUE when it does nothing and FALSE when it optimizes the call in some
> way.  Is that correct?

Yes.
>  That seems inverted from the way we'd normally
> write this stuff.

This is just following the handling of memcmp (please see the routine handle_builtin_memcmp),
try to be consistent with them. 

>> +
>> +  /* When one of args is constant string.  */
>> +  tree var_string = NULL_TREE;
>> +  HOST_WIDE_INT const_string_leni = -1;
>> +  
>> +  if (idx1)
>> +    {
>> +      const_string_leni = compute_string_length (idx1);
>> +      var_string = arg2;
>> +    } 
>> +  else 
>> +    {
>> +      gcc_checking_assert (idx2);
>> +      const_string_leni = compute_string_length (idx2);
>> +      var_string = arg1;
>> +    } 
>> +
>> +  if (const_string_leni < 0) 
>> +    return true;
>> + 
>> +  unsigned HOST_WIDE_INT var_sizei = 0;
>> +  /* try to determine the minimum size of the object pointed by var_string.  */
>> +  tree size = compute_objsize (var_string, 2);
> So this worries me as well.  compute_objsize is not supposed to be used
> to influence code generation or optimization.  It is not guaranteed to
> return an exact size, but instead may return an estimate!

As I mentioned in above:

Yes, I just checked the compute_objsize again, you are right, it might return an estimated code size.

there are 3 parts in  compute_objsize:
	A. call “compute_builtin_object_size” to get the code size;
	B.  If A failed, use value_RANGE info when the referenced object involves
   a non-constant offset in some range. 
	C. if both A and B failed, use the TYPE info to determine the code size.

So, From my understanding:
both A and C should provide accurate code size info, but B will NOT.
for my purpose, if I do NOT use B, then it will be return accurate code size, right?

> 
> 
> I'd really like to hear Jakub or Richi chime in with their thoughts on
> how this transforms one builtin into another and whether or not they think that is wise

On Jan 15, 2018, at 2:48 AM, Richard Biener <rguenther@suse.de <mailto:rguenther@suse.de>> wrote:
> Given it follows how we handle memcmp () != 0 it's ok to trigger inline expansion.
>  Didn't look into the patch whether it does this.
> And yes, places that look at STR[N]CMP now also have to look atSTR[N]CMP_EQ.

Based on both you and Richard’s above comments, my understanding is:

introducing STRCMP_EQ/STRNCMP_EQ follows the current handling of memcmp ()!=0 in 
GCC, should be fine. 
but I should check all the places that look at STRCMP/STRNCMP to include STRCMP_EQ/STRNCMP_EQ now.

Thanks.

Qing



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end][version 2]2nd patch of PR78809 and PR83026
  2018-01-15  8:56       ` Richard Biener
@ 2018-01-25 20:46         ` Qing Zhao
  2018-02-07 15:36           ` [PATCH][Middle-end][version 3]2nd " Qing Zhao
  0 siblings, 1 reply; 19+ messages in thread
From: Qing Zhao @ 2018-01-25 20:46 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, Jakub Jelinek, gcc-patches


> 
> We're now in stage4 so please queue this patch and ping it during
> next stage1.
> 

I will update my patch based on Jeff and your comments, and send it during next stage 1.

thanks.

Qing

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026
  2018-01-25 20:46         ` Qing Zhao
@ 2018-02-07 15:36           ` Qing Zhao
  2018-05-22 17:56             ` Ping: " Qing Zhao
  2018-05-25 20:56             ` Jeff Law
  0 siblings, 2 replies; 19+ messages in thread
From: Qing Zhao @ 2018-02-07 15:36 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jeff Law, Jakub Jelinek, richard Biener

[-- Attachment #1: Type: text/plain, Size: 4182 bytes --]

Hi, this is the 3rd version for this patch.

the main change compared with 2nd version are:
	1. do not use “compute_objsize” to get the minimum object size per Jeff and Richard’s
comment. Instead, add a new function “determine_min_objsize” for this purpose. This new
function calls “compute_builtin_object_size” to get the minimum objsize, however, due to 
the fact that “compute_builtin_object_size” does nothing for SSA_NAME when optimize > 0 (don’t
know the exactly reason for this), inside “determine_min_objsize”, I have to add  more code
to catch up some simple SSA_NAME cases.

	2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the new 
BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where 
BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked.

	3. some  format change and comments change per Jeff’s comment. 

let me know if you have any comments.

thanks a lot.

Qing

*********************************

2nd Patch for PR78009 
Patch for PR83026

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Inline strcmp with small constant strings

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
missing strlen optimization for strcmp of unequal strings

The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

this patch is for the second part of change of PR78809 and PR83026:

B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0

 B.1. (PR83026) When the lengths of both arguments are constant and
      it's a strcmp:
    * if the lengths are NOT equal, we can safely fold the call
      to a non-zero value.
    * otherwise, do nothing now.

 B.2. (PR78809) When the length of one argument is constant, try to replace
 the call with a __builtin_str(n)cmp_eq call where possible, i.e:

 strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
 string with constant length, C is a constant.
   if (C <= strlen(STR) && sizeof_array(s) > C)
     {
       replace this call with
       __builtin_strncmp_eq (s, STR, C) (!)= 0
     }
   if (C > strlen(STR)
     {
       it can be safely treated as a call to strcmp (s, STR) (!)= 0
       can handled by the following strcmp.
     }

 strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
 string with constant length.
   if  (sizeof_array(s) > strlen(STR))
     {
       replace this call with
       __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
     }

 later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
 as __builtin_memcmp_eq, if the expansion does not succeed, change them back
 to call to __builtin_str(n)cmp.

adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026

bootstraped and tested on both X86 and Aarch64. no regression.

************************************************
gcc/ChangeLog

+2018-02-02  <qing.zhao@oracle.com>
+	
+	PR middle-end/78809
+	PR middle-end/83026
+	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
+	and BUILT_IN_STRNCMP_EQ.
+	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
+	BUILT_IN_STRNCMP_EQ.
+	* gimple-fold.c (gimple_fold_builtin_string_compare): Add the 
+	handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+	(gimple_fold_builtin): Likewise.
+	* tree-ssa-strlen.c (compute_string_length): New function.
+	(determine_min_obsize): New function.
+	(handle_builtin_string_cmp): New function to handle calls to
+	string compare functions.
+	(strlen_optimize_stmt): Add handling to builtin string compare
+	calls. 
+	* tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
+	Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+	* tree.c (build_common_builtin_nodes): Add new defines of
+	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
+

gcc/testsuite/ChangeLog

+2018-02-02  <qing.zhao@oracle.com>
+
+	PR middle-end/78809
+	* gcc.dg/strcmpopt_2.c: New testcase.
+	* gcc.dg/strcmpopt_3.c: New testcase.
+
+	PR middle-end/83026
+	* gcc.dg/strcmpopt_3.c: New testcase.


[-- Attachment #2: PR78809_B.patch --]
[-- Type: application/octet-stream, Size: 16893 bytes --]

 gcc/builtins.c                     |  33 +++++
 gcc/builtins.def                   |   5 +
 gcc/gimple-fold.c                  |   5 +
 gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 ++++++++++
 gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 +++++
 gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 +++
 gcc/tree-ssa-strlen.c              | 268 +++++++++++++++++++++++++++++++++++++
 gcc/tree-ssa-structalias.c         |   2 +
 gcc/tree.c                         |   8 ++
 9 files changed, 435 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 683c6ec..bb9e235 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7032,12 +7032,45 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 	return target;
       break;
 
+    /* Expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it 
+       back to a BUILT_IN_STRCMP. Remember to delete the 3rd paramater
+       when changing it to a strcmp call.  */
+    case BUILT_IN_STRCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* Change this call back to a BUILT_IN_STRCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRCMP));
+
+      /* Delete the last parameter.  */
+      unsigned int i;
+      vec<tree, va_gc> *arg_vec;
+      vec_alloc (arg_vec, 2);
+      for (i = 0; i < 2; i++)
+	arg_vec->quick_push (CALL_EXPR_ARG (exp, i));
+      exp = build_call_vec (TREE_TYPE (exp), CALL_EXPR_FN (exp), arg_vec);
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRCMP:
       target = expand_builtin_strcmp (exp, target);
       if (target)
 	return target;
       break;
 
+    /* Expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it
+       back to a BUILT_IN_STRNCMP.  */
+    case BUILT_IN_STRNCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* Change it back to a BUILT_IN_STRNCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRNCMP));
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRNCMP:
       target = expand_builtin_strncmp (exp, target, mode);
       if (target)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 17f825d..2d381ef 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -967,6 +967,11 @@ DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX, "__builtin_alloca_with_ali
    equality with zero.  */
 DEF_BUILTIN_STUB (BUILT_IN_MEMCMP_EQ, "__builtin_memcmp_eq")
 
+/* An internal version of strcmp/strncmp, used when the result is only 
+   tested for equality with zero.  */
+DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq")
+DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq")
+
 /* Object size checking builtins.  */
 DEF_GCC_BUILTIN	       (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_PURE_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_MEMCPY_CHK, "__memcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 3861692..d5584e5 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2223,12 +2223,14 @@ gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
       switch (fcode)
 	{
 	case BUILT_IN_STRCMP:
+	case BUILT_IN_STRCMP_EQ:
 	  {
 	    r = strcmp (p1, p2);
 	    known_result = true;
 	    break;
 	  }
 	case BUILT_IN_STRNCMP:
+	case BUILT_IN_STRNCMP_EQ:
 	  {
 	    if (length == -1)
 	      break;
@@ -2262,6 +2264,7 @@ gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
 
   bool nonzero_length = length >= 1
     || fcode == BUILT_IN_STRCMP
+    || fcode == BUILT_IN_STRCMP_EQ
     || fcode == BUILT_IN_STRCASECMP;
 
   location_t loc = gimple_location (stmt);
@@ -3697,8 +3700,10 @@ gimple_fold_builtin (gimple_stmt_iterator *gsi)
     case BUILT_IN_STRSTR:
       return gimple_fold_builtin_strstr (gsi);
     case BUILT_IN_STRCMP:
+    case BUILT_IN_STRCMP_EQ:
     case BUILT_IN_STRCASECMP:
     case BUILT_IN_STRNCMP:
+    case BUILT_IN_STRNCMP_EQ:
     case BUILT_IN_STRNCASECMP:
       return gimple_fold_builtin_string_compare (gsi);
     case BUILT_IN_MEMCHR:
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
new file mode 100644
index 0000000..0131b8f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+char s[100] = {'a','b','c','d'};
+typedef struct { char s[8]; int x; } S;
+
+__attribute__ ((noinline)) int 
+f1 (S *s) 
+{ 
+  return __builtin_strcmp (s->s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f2 (void) 
+{ 
+  return __builtin_strcmp (s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f3 (S *s) 
+{ 
+  return __builtin_strcmp ("abc", s->s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f4 (void) 
+{ 
+  return __builtin_strcmp ("abc", s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f5 (S *s) 
+{ 
+  return __builtin_strncmp (s->s, "abc", 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f6 (void) 
+{ 
+  return __builtin_strncmp (s, "abc", 2) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f7 (S *s) 
+{ 
+  return __builtin_strncmp ("abc", s->s, 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f8 (void) 
+{ 
+  return __builtin_strncmp ("abc", s, 2) != 0; 
+}
+
+int main (void)
+{
+  S ss = {{'a','b','c'}, 2};
+
+  if (f1 (&ss) != 0 || f2 () != 1 || f3 (&ss) != 0 ||
+      f4 () != 1 || f5 (&ss) != 0 || f6 () != 0 ||
+      f7 (&ss) != 0 || f8 () != 0)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 8 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
new file mode 100644
index 0000000..86a0d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__ ((noinline)) int 
+f1 (void) 
+{ 
+  char *s0= "abcd";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp(s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int
+f2 (void) 
+{ 
+  char *s0 = "ab";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp("abc", s) != 0; 
+}
+
+int main (void)
+{
+  if (f1 () != 1 
+      || f2 () != 1)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c
new file mode 100644
index 0000000..d727bc3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+typedef struct { char s[8]; int x; } S;
+extern int max_i;
+
+int
+f1 (S * s)
+{ 
+  int result, i;
+  for (i = 0; i < max_i; i++)
+    result += __builtin_strcmp (s->s, "abc") != 0 ? 2 : 1;
+  return result;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 1 "strlen" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index c3cf432..9553740 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-chkp.h"
 #include "tree-hash-traits.h"
+#include "tree-object-size.h"
 #include "builtins.h"
 #include "target.h"
 #include "diagnostic-core.h"
@@ -2715,6 +2716,268 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/* Given an index to the strinfo vector, compute the string length for the
+   corresponding string. Return -1 when unknown.  */
+ 
+static HOST_WIDE_INT 
+compute_string_length (int idx)
+{
+  HOST_WIDE_INT string_leni = -1; 
+  gcc_assert (idx != 0);
+
+  if (idx < 0)
+   return ~idx;
+
+  strinfo *si = get_strinfo (idx);
+  if (si)
+    {
+      tree const_string_len = get_string_length (si);
+      if (const_string_len && tree_fits_shwi_p (const_string_len))
+	string_leni = tree_to_shwi (const_string_len);
+    }
+
+  if (string_leni < 0)
+    return -1;
+
+  return string_leni;
+}
+
+/* Determine the minimum size of the object referenced by DEST expression which
+   must have a pointer type. 
+   Return the minimum size of the object if successful or NULL when the size 
+   cannot be determined.  */
+static tree
+determine_min_objsize (tree dest)
+{
+  unsigned HOST_WIDE_INT size = 0;
+
+  if (compute_builtin_object_size (dest, 2, &size))
+    return build_int_cst (sizetype, size);
+
+  /* Try to determine the size of the object through the RHS of the 
+     assign statement.  */
+  if (TREE_CODE (dest) == SSA_NAME)
+    {
+      gimple *stmt = SSA_NAME_DEF_STMT (dest);
+      if (!is_gimple_assign (stmt))
+	return NULL_TREE;
+
+      if (!gimple_assign_single_p (stmt)
+	  && !gimple_assign_unary_nop_p (stmt))
+	return NULL_TREE;
+
+      dest = gimple_assign_rhs1 (stmt);
+      return determine_min_objsize (dest);
+    }
+
+  /* Try to determine the size of the object from its type.  */
+  if (TREE_CODE (dest) != ADDR_EXPR)
+    return NULL_TREE;
+
+  tree type = TREE_TYPE (dest);
+  if (TREE_CODE (type) == POINTER_TYPE)
+    type = TREE_TYPE (type);
+
+  type = TYPE_MAIN_VARIANT (type);
+
+  /* We cannot determine the size of the array if it's a flexible array, 
+     which is declared at the end of a structure.  */
+  if (TREE_CODE (type) == ARRAY_TYPE
+      && !array_at_struct_end_p (dest))
+    {
+      tree size_t = TYPE_SIZE_UNIT (type);
+      if (size_t && TREE_CODE (size_t) == INTEGER_CST 
+	  && !integer_zerop (size_t))
+        return size_t;
+    }
+
+  return NULL_TREE;
+}
+
+/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
+   equality test against zero:
+
+   A. When the lengths of both arguments are constant and it's a strcmp:
+      * if the lengths are NOT equal, we can safely fold the call
+        to a non-zero value.
+      * otherwise, do nothing now.
+  
+   B. When the length of one argument is constant, try to replace the call with
+   a __builtin_str(n)cmp_eq call where possible, i.e:
+
+   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
+   string with constant length , C is a constant.
+     if (C <= strlen(STR) && sizeof_array(s) > C)
+       {
+         replace this call with
+         strncmp_eq (s, STR, C) (!)= 0
+       }
+     if (C > strlen(STR)
+       {
+         it can be safely treated as a call to strcmp (s, STR) (!)= 0
+         can handled by the following strcmp.
+       }
+
+   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
+   string with constant length.
+     if  (sizeof_array(s) > strlen(STR))
+       {
+         replace this call with
+         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
+       }
+
+   Return false when the call is transformed, return true otherwise. 
+ */ 
+
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree res = gimple_call_lhs (stmt);
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  int idx1 = get_stridx (arg1);
+  int idx2 = get_stridx (arg2);
+  HOST_WIDE_INT length = -1;
+  bool is_ncmp = false;
+
+  if (!res)
+    return true;
+
+  /* When both arguments are unknown, do nothing.  */
+  if (idx1 == 0 && idx2 == 0)
+    return true;
+
+  /* Handle strncmp function.  */
+  if (gimple_call_num_args (stmt) == 3)
+    {
+      tree len = gimple_call_arg (stmt, 2);
+      if (tree_fits_shwi_p (len))
+        length = tree_to_shwi (len);
+
+      is_ncmp = true;
+    }
+
+  /* For strncmp, if the length argument is NOT known, do nothing.  */
+  if (is_ncmp && length < 0)
+    return true;
+
+  /* When the result is ONLY used to do equality test against zero.  */
+  FOR_EACH_IMM_USE_FAST (use_p, iter, res) 
+    {    
+      gimple *use_stmt = USE_STMT (use_p);
+
+      if (is_gimple_debug (use_stmt))
+        continue;
+      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
+	{    
+	  tree_code code = gimple_assign_rhs_code (use_stmt);
+	  if (code == COND_EXPR) 
+	    {
+	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
+	      if ((TREE_CODE (cond_expr) != EQ_EXPR 
+		   && (TREE_CODE (cond_expr) != NE_EXPR))
+		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
+		return true;
+	    }
+	  else if (code == EQ_EXPR || code == NE_EXPR)
+	    {
+	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
+		return true;
+            }
+	  else 
+	    return true;
+	}
+      else if (gimple_code (use_stmt) == GIMPLE_COND)
+	{
+	  tree_code code = gimple_cond_code (use_stmt);
+	  if ((code != EQ_EXPR && code != NE_EXPR)
+	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
+	    return true;
+	}
+      else
+        return true;
+    }
+  
+  /* When the lengths of both arguments are known, and they are unequal, we can 
+     safely fold the call to a non-zero value for strcmp;
+     othewise, do nothing now.  */
+  if (idx1 != 0 && idx2 != 0)
+    {
+      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
+      HOST_WIDE_INT const_string_leni2 = compute_string_length (idx2);
+
+      if (!is_ncmp 
+	  && const_string_leni1 != -1
+	  && const_string_leni2 != -1
+	  && const_string_leni1 != const_string_leni2) 
+	{
+	  replace_call_with_value (gsi, integer_one_node); 
+	  return false;
+	}
+      return true;
+    }
+
+  /* When the length of one argument is constant.  */
+  tree var_string = NULL_TREE;
+  HOST_WIDE_INT const_string_leni = -1;
+  
+  if (idx1)
+    {
+      const_string_leni = compute_string_length (idx1);
+      var_string = arg2;
+    } 
+  else 
+    {
+      gcc_checking_assert (idx2);
+      const_string_leni = compute_string_length (idx2);
+      var_string = arg1;
+    } 
+
+  if (const_string_leni < 0) 
+    return true;
+ 
+  unsigned HOST_WIDE_INT var_sizei = 0;
+  /* try to determine the minimum size of the object pointed by var_string.  */
+  tree size = determine_min_objsize (var_string);
+
+  if (!size)
+    return true;
+ 
+  if (tree_fits_uhwi_p (size))
+    var_sizei = tree_to_uhwi (size);
+
+  if (var_sizei == 0)
+    return true;
+
+  /* For strncmp, if length > const_string_leni , this call can be safely 
+     transformed to a strcmp.  */
+  if (is_ncmp && length > const_string_leni)
+    is_ncmp = false;
+
+  unsigned HOST_WIDE_INT final_length 
+    = is_ncmp ? length : const_string_leni + 1;
+
+  /* Replace strcmp or strncmp with the corresponding str(n)cmp_eq.  */
+  if (var_sizei > final_length) 
+    {
+      tree fn 
+	= (is_ncmp 
+	   ? builtin_decl_implicit (BUILT_IN_STRNCMP_EQ) 
+	   : builtin_decl_implicit (BUILT_IN_STRCMP_EQ));
+      if (!fn)
+	return true;
+      tree const_string_len = build_int_cst (size_type_node, final_length); 
+      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+    }
+  else 
+    return true;
+
+  return false;
+}
+
 /* Handle a POINTER_PLUS_EXPR statement.
    For p = "abcd" + 2; compute associated length, or if
    p = q + off is pointing to a '\0' character of a string, call
@@ -3167,6 +3430,11 @@ strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh)
 	    if (!handle_builtin_memcmp (gsi))
 	      return false;
 	    break;
+	  case BUILT_IN_STRCMP:
+	  case BUILT_IN_STRNCMP:
+	    if (!handle_builtin_string_cmp (gsi))
+	      return false;
+	    break;
 	  default:
 	    break;
 	  }
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index d3b38c3..18cd986 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -4498,7 +4498,9 @@ find_func_aliases_for_builtin_call (struct function *fn, gcall *t)
          that use the memory pointed to by their arguments (but not
 	 transitively).  */
       case BUILT_IN_STRCMP:
+      case BUILT_IN_STRCMP_EQ:
       case BUILT_IN_STRNCMP:
+      case BUILT_IN_STRNCMP_EQ:
       case BUILT_IN_STRCASECMP:
       case BUILT_IN_STRNCASECMP:
       case BUILT_IN_MEMCMP:
diff --git a/gcc/tree.c b/gcc/tree.c
index cd68fb5..ff69088 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10269,6 +10269,14 @@ build_common_builtin_nodes (void)
 			"__builtin_memcmp_eq",
 			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
 
+  local_define_builtin ("__builtin_strncmp_eq", ftype, BUILT_IN_STRNCMP_EQ,
+			"__builtin_strncmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
+  local_define_builtin ("__builtin_strcmp_eq", ftype, BUILT_IN_STRCMP_EQ,
+			"__builtin_strcmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
   /* If there's a possibility that we might use the ARM EABI, build the
     alternate __cxa_end_cleanup node used to resume from C++.  */
   if (targetm.arm_eabi_unwinder)
-- 
1.9.1

[-- Attachment #3: Type: text/plain, Size: 234 bytes --]



> 
>> 
>> We're now in stage4 so please queue this patch and ping it during
>> next stage1.
>> 
> 
> I will update my patch based on Jeff and your comments, and send it during next stage 1.
> 
> thanks.
> 
> Qing
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Ping: [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026
  2018-02-07 15:36           ` [PATCH][Middle-end][version 3]2nd " Qing Zhao
@ 2018-05-22 17:56             ` Qing Zhao
  2018-05-25 20:56             ` Jeff Law
  1 sibling, 0 replies; 19+ messages in thread
From: Qing Zhao @ 2018-05-22 17:56 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jeff Law, Jakub Jelinek, richard Biener

[-- Attachment #1: Type: text/plain, Size: 4465 bytes --]

Ping for the following patch sent in 3 months ago in the end of GCC8:

https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg184075.html

I have rebased the patch on the latest GCC9 thunk. 

bootstraped and tested on both X86 and Aarch64. no regression.

the following are more details:

========

Hi, this is the 3rd version for this patch.

the main change compared with 2nd version are:
	1. do not use “compute_objsize” to get the minimum object size per Jeff and Richard’s
comment. Instead, add a new function “determine_min_objsize” for this purpose. This new
function calls “compute_builtin_object_size” to get the minimum objsize, however, due to 
the fact that “compute_builtin_object_size” does nothing for SSA_NAME when optimize > 0 (don’t
know the exactly reason for this), inside “determine_min_objsize”, I have to add  more code
to catch up some simple SSA_NAME cases.

	2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the new 
BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where 
BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked.

	3. some  format change and comments change per Jeff’s comment. 

let me know if you have any comments.

thanks a lot.

Qing

*********************************

2nd Patch for PR78009 
Patch for PR83026

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Inline strcmp with small constant strings

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
missing strlen optimization for strcmp of unequal strings

The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

this patch is for the second part of change of PR78809 and PR83026:

B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0

B.1. (PR83026) When the lengths of both arguments are constant and
     it's a strcmp:
   * if the lengths are NOT equal, we can safely fold the call
     to a non-zero value.
   * otherwise, do nothing now.

B.2. (PR78809) When the length of one argument is constant, try to replace
the call with a __builtin_str(n)cmp_eq call where possible, i.e:

strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
string with constant length, C is a constant.
  if (C <= strlen(STR) && sizeof_array(s) > C)
    {
      replace this call with
      __builtin_strncmp_eq (s, STR, C) (!)= 0
    }
  if (C > strlen(STR)
    {
      it can be safely treated as a call to strcmp (s, STR) (!)= 0
      can handled by the following strcmp.
    }

strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
string with constant length.
  if  (sizeof_array(s) > strlen(STR))
    {
      replace this call with
      __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
    }

later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
as __builtin_memcmp_eq, if the expansion does not succeed, change them back
to call to __builtin_str(n)cmp.

adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026

bootstraped and tested on both X86 and Aarch64. no regression.

************************************************
gcc/ChangeLog

+2018-05-21  <qing.zhao@oracle.com>
+	
+	PR middle-end/78809
+	PR middle-end/83026
+	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
+	and BUILT_IN_STRNCMP_EQ.
+	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
+	BUILT_IN_STRNCMP_EQ.
+	* gimple-fold.c (gimple_fold_builtin_string_compare): Add the 
+	handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+	(gimple_fold_builtin): Likewise.
+	* tree-ssa-strlen.c (compute_string_length): New function.
+	(determine_min_obsize): New function.
+	(handle_builtin_string_cmp): New function to handle calls to
+	string compare functions.
+	(strlen_optimize_stmt): Add handling to builtin string compare
+	calls. 
+	* tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
+	Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+	* tree.c (build_common_builtin_nodes): Add new defines of
+	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
+

gcc/testsuite/ChangeLog

+2018-05-21  <qing.zhao@oracle.com>
+
+	PR middle-end/78809
+	* gcc.dg/strcmpopt_2.c: New testcase.
+	* gcc.dg/strcmpopt_3.c: New testcase.
+
+	PR middle-end/83026
+	* gcc.dg/strcmpopt_3.c: New testcase.


[-- Attachment #2: PR78809_B.patch --]
[-- Type: application/octet-stream, Size: 19630 bytes --]

From cb30112672512c01d5c87e8c6d8ecdf118cf661b Mon Sep 17 00:00:00 2001
From: qing zhao <qing.zhao@oracle.com>
Date: Tue, 22 May 2018 13:07:32 -0400
Subject: [PATCH] 2nd Patch for PR78009 Patch for PR83026

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Inline strcmp with small constant strings

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
missing strlen optimization for strcmp of unequal strings

The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

this patch is for the second part of change of PR78809 and PR83026:

B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0

   B.1. (PR83026) When the lengths of both arguments are constant and
        it's a strcmp:
      * if the lengths are NOT equal, we can safely fold the call
        to a non-zero value.
      * otherwise, do nothing now.

   B.2. (PR78809) When the length of one argument is constant, try to replace
   the call with a __builtin_str(n)cmp_eq call where possible, i.e:

   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
   string with constant length, C is a constant.
     if (C <= strlen(STR) && sizeof_array(s) > C)
       {
         replace this call with
         __builtin_strncmp_eq (s, STR, C) (!)= 0
       }
     if (C > strlen(STR)
       {
         it can be safely treated as a call to strcmp (s, STR) (!)= 0
         can handled by the following strcmp.
       }

   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
   string with constant length.
     if  (sizeof_array(s) > strlen(STR))
       {
         replace this call with
         __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
       }

   later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
   as __builtin_memcmp_eq, if the expansion does not succeed, change them back
   to call to __builtin_str(n)cmp.

adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026

bootstraped and tested on both X86 and Aarch64. no regression.
---
 gcc/builtins.c                     |  33 +++++
 gcc/builtins.def                   |   5 +
 gcc/gimple-fold.c                  |   5 +
 gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 ++++++++++
 gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 +++++
 gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 +++
 gcc/tree-ssa-strlen.c              | 268 +++++++++++++++++++++++++++++++++++++
 gcc/tree-ssa-structalias.c         |   2 +
 gcc/tree.c                         |   8 ++
 9 files changed, 435 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9a2bf8c..aed7513 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7110,12 +7110,45 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 	return target;
       break;
 
+    /* Expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it 
+       back to a BUILT_IN_STRCMP. Remember to delete the 3rd paramater
+       when changing it to a strcmp call.  */
+    case BUILT_IN_STRCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* Change this call back to a BUILT_IN_STRCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRCMP));
+
+      /* Delete the last parameter.  */
+      unsigned int i;
+      vec<tree, va_gc> *arg_vec;
+      vec_alloc (arg_vec, 2);
+      for (i = 0; i < 2; i++)
+	arg_vec->quick_push (CALL_EXPR_ARG (exp, i));
+      exp = build_call_vec (TREE_TYPE (exp), CALL_EXPR_FN (exp), arg_vec);
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRCMP:
       target = expand_builtin_strcmp (exp, target);
       if (target)
 	return target;
       break;
 
+    /* Expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it
+       back to a BUILT_IN_STRNCMP.  */
+    case BUILT_IN_STRNCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* Change it back to a BUILT_IN_STRNCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRNCMP));
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRNCMP:
       target = expand_builtin_strncmp (exp, target, mode);
       if (target)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 449d08d..58b4698 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -971,6 +971,11 @@ DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX, "__builtin_alloca_with_ali
    equality with zero.  */
 DEF_BUILTIN_STUB (BUILT_IN_MEMCMP_EQ, "__builtin_memcmp_eq")
 
+/* An internal version of strcmp/strncmp, used when the result is only 
+   tested for equality with zero.  */
+DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq")
+DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq")
+
 /* Object size checking builtins.  */
 DEF_GCC_BUILTIN	       (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_PURE_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_MEMCPY_CHK, "__memcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index b45798c..53bfcbb 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2215,12 +2215,14 @@ gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
       switch (fcode)
 	{
 	case BUILT_IN_STRCMP:
+	case BUILT_IN_STRCMP_EQ:
 	  {
 	    r = strcmp (p1, p2);
 	    known_result = true;
 	    break;
 	  }
 	case BUILT_IN_STRNCMP:
+	case BUILT_IN_STRNCMP_EQ:
 	  {
 	    if (length == -1)
 	      break;
@@ -2254,6 +2256,7 @@ gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
 
   bool nonzero_length = length >= 1
     || fcode == BUILT_IN_STRCMP
+    || fcode == BUILT_IN_STRCMP_EQ
     || fcode == BUILT_IN_STRCASECMP;
 
   location_t loc = gimple_location (stmt);
@@ -3687,8 +3690,10 @@ gimple_fold_builtin (gimple_stmt_iterator *gsi)
     case BUILT_IN_STRSTR:
       return gimple_fold_builtin_strstr (gsi);
     case BUILT_IN_STRCMP:
+    case BUILT_IN_STRCMP_EQ:
     case BUILT_IN_STRCASECMP:
     case BUILT_IN_STRNCMP:
+    case BUILT_IN_STRNCMP_EQ:
     case BUILT_IN_STRNCASECMP:
       return gimple_fold_builtin_string_compare (gsi);
     case BUILT_IN_MEMCHR:
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
new file mode 100644
index 0000000..0131b8f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+char s[100] = {'a','b','c','d'};
+typedef struct { char s[8]; int x; } S;
+
+__attribute__ ((noinline)) int 
+f1 (S *s) 
+{ 
+  return __builtin_strcmp (s->s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f2 (void) 
+{ 
+  return __builtin_strcmp (s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f3 (S *s) 
+{ 
+  return __builtin_strcmp ("abc", s->s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f4 (void) 
+{ 
+  return __builtin_strcmp ("abc", s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f5 (S *s) 
+{ 
+  return __builtin_strncmp (s->s, "abc", 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f6 (void) 
+{ 
+  return __builtin_strncmp (s, "abc", 2) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f7 (S *s) 
+{ 
+  return __builtin_strncmp ("abc", s->s, 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f8 (void) 
+{ 
+  return __builtin_strncmp ("abc", s, 2) != 0; 
+}
+
+int main (void)
+{
+  S ss = {{'a','b','c'}, 2};
+
+  if (f1 (&ss) != 0 || f2 () != 1 || f3 (&ss) != 0 ||
+      f4 () != 1 || f5 (&ss) != 0 || f6 () != 0 ||
+      f7 (&ss) != 0 || f8 () != 0)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 8 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
new file mode 100644
index 0000000..86a0d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__ ((noinline)) int 
+f1 (void) 
+{ 
+  char *s0= "abcd";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp(s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int
+f2 (void) 
+{ 
+  char *s0 = "ab";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp("abc", s) != 0; 
+}
+
+int main (void)
+{
+  if (f1 () != 1 
+      || f2 () != 1)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c
new file mode 100644
index 0000000..d727bc3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+typedef struct { char s[8]; int x; } S;
+extern int max_i;
+
+int
+f1 (S * s)
+{ 
+  int result, i;
+  for (i = 0; i < max_i; i++)
+    result += __builtin_strcmp (s->s, "abc") != 0 ? 2 : 1;
+  return result;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 1 "strlen" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 33004b6..f733bf4 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-chkp.h"
 #include "tree-hash-traits.h"
+#include "tree-object-size.h"
 #include "builtins.h"
 #include "target.h"
 #include "diagnostic-core.h"
@@ -2764,6 +2765,268 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/* Given an index to the strinfo vector, compute the string length for the
+   corresponding string. Return -1 when unknown.  */
+ 
+static HOST_WIDE_INT 
+compute_string_length (int idx)
+{
+  HOST_WIDE_INT string_leni = -1; 
+  gcc_assert (idx != 0);
+
+  if (idx < 0)
+   return ~idx;
+
+  strinfo *si = get_strinfo (idx);
+  if (si)
+    {
+      tree const_string_len = get_string_length (si);
+      if (const_string_len && tree_fits_shwi_p (const_string_len))
+	string_leni = tree_to_shwi (const_string_len);
+    }
+
+  if (string_leni < 0)
+    return -1;
+
+  return string_leni;
+}
+
+/* Determine the minimum size of the object referenced by DEST expression which
+   must have a pointer type. 
+   Return the minimum size of the object if successful or NULL when the size 
+   cannot be determined.  */
+static tree
+determine_min_objsize (tree dest)
+{
+  unsigned HOST_WIDE_INT size = 0;
+
+  if (compute_builtin_object_size (dest, 2, &size))
+    return build_int_cst (sizetype, size);
+
+  /* Try to determine the size of the object through the RHS of the 
+     assign statement.  */
+  if (TREE_CODE (dest) == SSA_NAME)
+    {
+      gimple *stmt = SSA_NAME_DEF_STMT (dest);
+      if (!is_gimple_assign (stmt))
+	return NULL_TREE;
+
+      if (!gimple_assign_single_p (stmt)
+	  && !gimple_assign_unary_nop_p (stmt))
+	return NULL_TREE;
+
+      dest = gimple_assign_rhs1 (stmt);
+      return determine_min_objsize (dest);
+    }
+
+  /* Try to determine the size of the object from its type.  */
+  if (TREE_CODE (dest) != ADDR_EXPR)
+    return NULL_TREE;
+
+  tree type = TREE_TYPE (dest);
+  if (TREE_CODE (type) == POINTER_TYPE)
+    type = TREE_TYPE (type);
+
+  type = TYPE_MAIN_VARIANT (type);
+
+  /* We cannot determine the size of the array if it's a flexible array, 
+     which is declared at the end of a structure.  */
+  if (TREE_CODE (type) == ARRAY_TYPE
+      && !array_at_struct_end_p (dest))
+    {
+      tree size_t = TYPE_SIZE_UNIT (type);
+      if (size_t && TREE_CODE (size_t) == INTEGER_CST 
+	  && !integer_zerop (size_t))
+        return size_t;
+    }
+
+  return NULL_TREE;
+}
+
+/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
+   equality test against zero:
+
+   A. When the lengths of both arguments are constant and it's a strcmp:
+      * if the lengths are NOT equal, we can safely fold the call
+        to a non-zero value.
+      * otherwise, do nothing now.
+  
+   B. When the length of one argument is constant, try to replace the call with
+   a __builtin_str(n)cmp_eq call where possible, i.e:
+
+   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
+   string with constant length , C is a constant.
+     if (C <= strlen(STR) && sizeof_array(s) > C)
+       {
+         replace this call with
+         strncmp_eq (s, STR, C) (!)= 0
+       }
+     if (C > strlen(STR)
+       {
+         it can be safely treated as a call to strcmp (s, STR) (!)= 0
+         can handled by the following strcmp.
+       }
+
+   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
+   string with constant length.
+     if  (sizeof_array(s) > strlen(STR))
+       {
+         replace this call with
+         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
+       }
+
+   Return false when the call is transformed, return true otherwise. 
+ */ 
+
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree res = gimple_call_lhs (stmt);
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  int idx1 = get_stridx (arg1);
+  int idx2 = get_stridx (arg2);
+  HOST_WIDE_INT length = -1;
+  bool is_ncmp = false;
+
+  if (!res)
+    return true;
+
+  /* When both arguments are unknown, do nothing.  */
+  if (idx1 == 0 && idx2 == 0)
+    return true;
+
+  /* Handle strncmp function.  */
+  if (gimple_call_num_args (stmt) == 3)
+    {
+      tree len = gimple_call_arg (stmt, 2);
+      if (tree_fits_shwi_p (len))
+        length = tree_to_shwi (len);
+
+      is_ncmp = true;
+    }
+
+  /* For strncmp, if the length argument is NOT known, do nothing.  */
+  if (is_ncmp && length < 0)
+    return true;
+
+  /* When the result is ONLY used to do equality test against zero.  */
+  FOR_EACH_IMM_USE_FAST (use_p, iter, res) 
+    {    
+      gimple *use_stmt = USE_STMT (use_p);
+
+      if (is_gimple_debug (use_stmt))
+        continue;
+      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
+	{    
+	  tree_code code = gimple_assign_rhs_code (use_stmt);
+	  if (code == COND_EXPR) 
+	    {
+	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
+	      if ((TREE_CODE (cond_expr) != EQ_EXPR 
+		   && (TREE_CODE (cond_expr) != NE_EXPR))
+		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
+		return true;
+	    }
+	  else if (code == EQ_EXPR || code == NE_EXPR)
+	    {
+	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
+		return true;
+            }
+	  else 
+	    return true;
+	}
+      else if (gimple_code (use_stmt) == GIMPLE_COND)
+	{
+	  tree_code code = gimple_cond_code (use_stmt);
+	  if ((code != EQ_EXPR && code != NE_EXPR)
+	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
+	    return true;
+	}
+      else
+        return true;
+    }
+  
+  /* When the lengths of both arguments are known, and they are unequal, we can 
+     safely fold the call to a non-zero value for strcmp;
+     othewise, do nothing now.  */
+  if (idx1 != 0 && idx2 != 0)
+    {
+      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
+      HOST_WIDE_INT const_string_leni2 = compute_string_length (idx2);
+
+      if (!is_ncmp 
+	  && const_string_leni1 != -1
+	  && const_string_leni2 != -1
+	  && const_string_leni1 != const_string_leni2) 
+	{
+	  replace_call_with_value (gsi, integer_one_node); 
+	  return false;
+	}
+      return true;
+    }
+
+  /* When the length of one argument is constant.  */
+  tree var_string = NULL_TREE;
+  HOST_WIDE_INT const_string_leni = -1;
+  
+  if (idx1)
+    {
+      const_string_leni = compute_string_length (idx1);
+      var_string = arg2;
+    } 
+  else 
+    {
+      gcc_checking_assert (idx2);
+      const_string_leni = compute_string_length (idx2);
+      var_string = arg1;
+    } 
+
+  if (const_string_leni < 0) 
+    return true;
+ 
+  unsigned HOST_WIDE_INT var_sizei = 0;
+  /* try to determine the minimum size of the object pointed by var_string.  */
+  tree size = determine_min_objsize (var_string);
+
+  if (!size)
+    return true;
+ 
+  if (tree_fits_uhwi_p (size))
+    var_sizei = tree_to_uhwi (size);
+
+  if (var_sizei == 0)
+    return true;
+
+  /* For strncmp, if length > const_string_leni , this call can be safely 
+     transformed to a strcmp.  */
+  if (is_ncmp && length > const_string_leni)
+    is_ncmp = false;
+
+  unsigned HOST_WIDE_INT final_length 
+    = is_ncmp ? length : const_string_leni + 1;
+
+  /* Replace strcmp or strncmp with the corresponding str(n)cmp_eq.  */
+  if (var_sizei > final_length) 
+    {
+      tree fn 
+	= (is_ncmp 
+	   ? builtin_decl_implicit (BUILT_IN_STRNCMP_EQ) 
+	   : builtin_decl_implicit (BUILT_IN_STRCMP_EQ));
+      if (!fn)
+	return true;
+      tree const_string_len = build_int_cst (size_type_node, final_length); 
+      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+    }
+  else 
+    return true;
+
+  return false;
+}
+
 /* Handle a POINTER_PLUS_EXPR statement.
    For p = "abcd" + 2; compute associated length, or if
    p = q + off is pointing to a '\0' character of a string, call
@@ -3216,6 +3479,11 @@ strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh)
 	    if (!handle_builtin_memcmp (gsi))
 	      return false;
 	    break;
+	  case BUILT_IN_STRCMP:
+	  case BUILT_IN_STRNCMP:
+	    if (!handle_builtin_string_cmp (gsi))
+	      return false;
+	    break;
 	  default:
 	    break;
 	  }
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index 8fb13a0..14ab83a 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -4498,7 +4498,9 @@ find_func_aliases_for_builtin_call (struct function *fn, gcall *t)
          that use the memory pointed to by their arguments (but not
 	 transitively).  */
       case BUILT_IN_STRCMP:
+      case BUILT_IN_STRCMP_EQ:
       case BUILT_IN_STRNCMP:
+      case BUILT_IN_STRNCMP_EQ:
       case BUILT_IN_STRCASECMP:
       case BUILT_IN_STRNCASECMP:
       case BUILT_IN_MEMCMP:
diff --git a/gcc/tree.c b/gcc/tree.c
index ff982fe..5f65001 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10279,6 +10279,14 @@ build_common_builtin_nodes (void)
 			"__builtin_memcmp_eq",
 			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
 
+  local_define_builtin ("__builtin_strncmp_eq", ftype, BUILT_IN_STRNCMP_EQ,
+			"__builtin_strncmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
+  local_define_builtin ("__builtin_strcmp_eq", ftype, BUILT_IN_STRCMP_EQ,
+			"__builtin_strcmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
   /* If there's a possibility that we might use the ARM EABI, build the
     alternate __cxa_end_cleanup node used to resume from C++.  */
   if (targetm.arm_eabi_unwinder)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026
  2018-02-07 15:36           ` [PATCH][Middle-end][version 3]2nd " Qing Zhao
  2018-05-22 17:56             ` Ping: " Qing Zhao
@ 2018-05-25 20:56             ` Jeff Law
  2018-05-30  0:16               ` Qing Zhao
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff Law @ 2018-05-25 20:56 UTC (permalink / raw)
  To: Qing Zhao, GCC Patches; +Cc: Jakub Jelinek, richard Biener

On 02/07/2018 08:36 AM, Qing Zhao wrote:
> Hi, this is the 3rd version for this patch.
> 
> the main change compared with 2nd version are:
> 	1. do not use “compute_objsize” to get the minimum object size per Jeff and Richard’s
> comment. Instead, add a new function “determine_min_objsize” for this purpose. This new
> function calls “compute_builtin_object_size” to get the minimum objsize, however, due to 
> the fact that “compute_builtin_object_size” does nothing for SSA_NAME when optimize > 0 (don’t
> know the exactly reason for this), inside “determine_min_objsize”, I have to add  more code
> to catch up some simple SSA_NAME cases.
> 
> 	2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the new 
> BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where 
> BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked.
> 
> 	3. some  format change and comments change per Jeff’s comment. 
> 
> let me know if you have any comments.
> 
> thanks a lot.
> 
> Qing
> 
> *********************************
> 
> 2nd Patch for PR78009 
> Patch for PR83026
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
> Inline strcmp with small constant strings
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
> missing strlen optimization for strcmp of unequal strings
> 
> The design doc for PR78809 is at:
> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
> 
> this patch is for the second part of change of PR78809 and PR83026:
> 
> B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
> 
>  B.1. (PR83026) When the lengths of both arguments are constant and
>       it's a strcmp:
>     * if the lengths are NOT equal, we can safely fold the call
>       to a non-zero value.
>     * otherwise, do nothing now.
> 
>  B.2. (PR78809) When the length of one argument is constant, try to replace
>  the call with a __builtin_str(n)cmp_eq call where possible, i.e:
> 
>  strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
>  string with constant length, C is a constant.
>    if (C <= strlen(STR) && sizeof_array(s) > C)
>      {
>        replace this call with
>        __builtin_strncmp_eq (s, STR, C) (!)= 0
>      }
>    if (C > strlen(STR)
>      {
>        it can be safely treated as a call to strcmp (s, STR) (!)= 0
>        can handled by the following strcmp.
>      }
> 
>  strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
>  string with constant length.
>    if  (sizeof_array(s) > strlen(STR))
>      {
>        replace this call with
>        __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
>      }
> 
>  later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
>  as __builtin_memcmp_eq, if the expansion does not succeed, change them back
>  to call to __builtin_str(n)cmp.
> 
> adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
> PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026
> 
> bootstraped and tested on both X86 and Aarch64. no regression.
> 
> ************************************************
> gcc/ChangeLog
> 
> +2018-02-02  <qing.zhao@oracle.com>
> +	
> +	PR middle-end/78809
> +	PR middle-end/83026
> +	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
> +	and BUILT_IN_STRNCMP_EQ.
> +	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
> +	BUILT_IN_STRNCMP_EQ.
> +	* gimple-fold.c (gimple_fold_builtin_string_compare): Add the 
> +	handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
> +	(gimple_fold_builtin): Likewise.
> +	* tree-ssa-strlen.c (compute_string_length): New function.
> +	(determine_min_obsize): New function.
> +	(handle_builtin_string_cmp): New function to handle calls to
> +	string compare functions.
> +	(strlen_optimize_stmt): Add handling to builtin string compare
> +	calls. 
> +	* tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
> +	Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
> +	* tree.c (build_common_builtin_nodes): Add new defines of
> +	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
> +
> 
> gcc/testsuite/ChangeLog
> 
> +2018-02-02  <qing.zhao@oracle.com>
> +
> +	PR middle-end/78809
> +	* gcc.dg/strcmpopt_2.c: New testcase.
> +	* gcc.dg/strcmpopt_3.c: New testcase.
> +
> +	PR middle-end/83026
> +	* gcc.dg/strcmpopt_3.c: New testcase.
> 
> 
>
> +
> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
> +   equality test against zero:
> +
> +   A. When the lengths of both arguments are constant and it's a strcmp:
> +      * if the lengths are NOT equal, we can safely fold the call
> +        to a non-zero value.
> +      * otherwise, do nothing now.
> +  
> +   B. When the length of one argument is constant, try to replace the call with
> +   a __builtin_str(n)cmp_eq call where possible, i.e:
> +
> +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   string with constant length , C is a constant.
> +     if (C <= strlen(STR) && sizeof_array(s) > C)
> +       {
> +         replace this call with
> +         strncmp_eq (s, STR, C) (!)= 0
> +       }
> +     if (C > strlen(STR)
> +       {
> +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
> +         can handled by the following strcmp.
> +       }
> +
> +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   string with constant length.
> +     if  (sizeof_array(s) > strlen(STR))
> +       {
> +         replace this call with
> +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
> +       }
> +
> +   Return false when the call is transformed, return true otherwise. 
> + */ 
> +
> +static bool
> +handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
So I originally thought you had the core logic wrong in the immediate
uses loop.  But it's actually the case that the return value is the
exact opposite of what I expected.

ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
was not transformed.

Can you fix that so it's not so confusing?

I think with that change we'll be good to go, but please repost for a
final looksie.

THanks,
Jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026
  2018-05-25 20:56             ` Jeff Law
@ 2018-05-30  0:16               ` Qing Zhao
  2018-05-31 21:10                 ` committed: " Qing Zhao
  2018-06-23  4:49                 ` Jeff Law
  0 siblings, 2 replies; 19+ messages in thread
From: Qing Zhao @ 2018-05-30  0:16 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches, Jakub Jelinek, richard Biener

[-- Attachment #1: Type: text/plain, Size: 748 bytes --]

Hi, Jeff,

Thanks a lot for your review and comments.

I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64.

please take a look at the patch again.

thanks.

Qing

> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote:

> So I originally thought you had the core logic wrong in the immediate
> uses loop.  But it's actually the case that the return value is the
> exact opposite of what I expected.
> 
> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
> was not transformed.
> 
> Can you fix that so it's not so confusing?
> 
> I think with that change we'll be good to go, but please repost for a
> final looksie.
> 
> THanks,
> Jeff


[-- Attachment #2: 0001-2nd-Patch-for-PR78009.patch --]
[-- Type: application/octet-stream, Size: 22847 bytes --]

From 750f44ee0777d55b568f07e263babdedd532d315 Mon Sep 17 00:00:00 2001
From: qing zhao <qing.zhao@oracle.com>
Date: Tue, 29 May 2018 16:15:21 -0400
Subject: [PATCH] 2nd Patch for PR78009 Patch for PR83026

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Inline strcmp with small constant strings

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
missing strlen optimization for strcmp of unequal strings

The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

this patch is for the second part of change of PR78809 and PR83026:

B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0

   B.1. (PR83026) When the lengths of both arguments are constant and
        it's a strcmp:
      * if the lengths are NOT equal, we can safely fold the call
        to a non-zero value.
      * otherwise, do nothing now.

   B.2. (PR78809) When the length of one argument is constant, try to replace
   the call with a __builtin_str(n)cmp_eq call where possible, i.e:

   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
   string with constant length, C is a constant.
     if (C <= strlen(STR) && sizeof_array(s) > C)
       {
         replace this call with
         __builtin_strncmp_eq (s, STR, C) (!)= 0
       }
     if (C > strlen(STR)
       {
         it can be safely treated as a call to strcmp (s, STR) (!)= 0
         can handled by the following strcmp.
       }

   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
   string with constant length.
     if  (sizeof_array(s) > strlen(STR))
       {
         replace this call with
         __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
       }

   later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
   as __builtin_memcmp_eq, if the expansion does not succeed, change them back
   to call to __builtin_str(n)cmp.

adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026

bootstraped and tested on both X86 and Aarch64. no regression.
---
 gcc/builtins.c                     |  33 ++++
 gcc/builtins.def                   |   5 +
 gcc/gimple-fold.c                  |   5 +
 gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 ++++++++
 gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 ++++
 gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 ++
 gcc/tree-ssa-strlen.c              | 304 ++++++++++++++++++++++++++++++++++---
 gcc/tree-ssa-structalias.c         |   2 +
 gcc/tree.c                         |   8 +
 9 files changed, 454 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 841c1ef..6b21723 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7120,12 +7120,45 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 	return target;
       break;
 
+    /* Expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it 
+       back to a BUILT_IN_STRCMP. Remember to delete the 3rd paramater
+       when changing it to a strcmp call.  */
+    case BUILT_IN_STRCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* Change this call back to a BUILT_IN_STRCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRCMP));
+
+      /* Delete the last parameter.  */
+      unsigned int i;
+      vec<tree, va_gc> *arg_vec;
+      vec_alloc (arg_vec, 2);
+      for (i = 0; i < 2; i++)
+	arg_vec->quick_push (CALL_EXPR_ARG (exp, i));
+      exp = build_call_vec (TREE_TYPE (exp), CALL_EXPR_FN (exp), arg_vec);
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRCMP:
       target = expand_builtin_strcmp (exp, target);
       if (target)
 	return target;
       break;
 
+    /* Expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it
+       back to a BUILT_IN_STRNCMP.  */
+    case BUILT_IN_STRNCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* Change it back to a BUILT_IN_STRNCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRNCMP));
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRNCMP:
       target = expand_builtin_strncmp (exp, target, mode);
       if (target)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 449d08d..58b4698 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -971,6 +971,11 @@ DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX, "__builtin_alloca_with_ali
    equality with zero.  */
 DEF_BUILTIN_STUB (BUILT_IN_MEMCMP_EQ, "__builtin_memcmp_eq")
 
+/* An internal version of strcmp/strncmp, used when the result is only 
+   tested for equality with zero.  */
+DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq")
+DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq")
+
 /* Object size checking builtins.  */
 DEF_GCC_BUILTIN	       (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_PURE_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_MEMCPY_CHK, "__memcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index cd1ab80..9bbe7f1 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2215,12 +2215,14 @@ gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
       switch (fcode)
 	{
 	case BUILT_IN_STRCMP:
+	case BUILT_IN_STRCMP_EQ:
 	  {
 	    r = strcmp (p1, p2);
 	    known_result = true;
 	    break;
 	  }
 	case BUILT_IN_STRNCMP:
+	case BUILT_IN_STRNCMP_EQ:
 	  {
 	    if (length == -1)
 	      break;
@@ -2254,6 +2256,7 @@ gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
 
   bool nonzero_length = length >= 1
     || fcode == BUILT_IN_STRCMP
+    || fcode == BUILT_IN_STRCMP_EQ
     || fcode == BUILT_IN_STRCASECMP;
 
   location_t loc = gimple_location (stmt);
@@ -3687,8 +3690,10 @@ gimple_fold_builtin (gimple_stmt_iterator *gsi)
     case BUILT_IN_STRSTR:
       return gimple_fold_builtin_strstr (gsi);
     case BUILT_IN_STRCMP:
+    case BUILT_IN_STRCMP_EQ:
     case BUILT_IN_STRCASECMP:
     case BUILT_IN_STRNCMP:
+    case BUILT_IN_STRNCMP_EQ:
     case BUILT_IN_STRNCASECMP:
       return gimple_fold_builtin_string_compare (gsi);
     case BUILT_IN_MEMCHR:
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
new file mode 100644
index 0000000..0131b8f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+char s[100] = {'a','b','c','d'};
+typedef struct { char s[8]; int x; } S;
+
+__attribute__ ((noinline)) int 
+f1 (S *s) 
+{ 
+  return __builtin_strcmp (s->s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f2 (void) 
+{ 
+  return __builtin_strcmp (s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f3 (S *s) 
+{ 
+  return __builtin_strcmp ("abc", s->s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f4 (void) 
+{ 
+  return __builtin_strcmp ("abc", s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f5 (S *s) 
+{ 
+  return __builtin_strncmp (s->s, "abc", 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f6 (void) 
+{ 
+  return __builtin_strncmp (s, "abc", 2) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f7 (S *s) 
+{ 
+  return __builtin_strncmp ("abc", s->s, 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f8 (void) 
+{ 
+  return __builtin_strncmp ("abc", s, 2) != 0; 
+}
+
+int main (void)
+{
+  S ss = {{'a','b','c'}, 2};
+
+  if (f1 (&ss) != 0 || f2 () != 1 || f3 (&ss) != 0 ||
+      f4 () != 1 || f5 (&ss) != 0 || f6 () != 0 ||
+      f7 (&ss) != 0 || f8 () != 0)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 8 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
new file mode 100644
index 0000000..86a0d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__ ((noinline)) int 
+f1 (void) 
+{ 
+  char *s0= "abcd";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp(s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int
+f2 (void) 
+{ 
+  char *s0 = "ab";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp("abc", s) != 0; 
+}
+
+int main (void)
+{
+  if (f1 () != 1 
+      || f2 () != 1)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c
new file mode 100644
index 0000000..d727bc3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+typedef struct { char s[8]; int x; } S;
+extern int max_i;
+
+int
+f1 (S * s)
+{ 
+  int result, i;
+  for (i = 0; i < max_i; i++)
+    result += __builtin_strcmp (s->s, "abc") != 0 ? 2 : 1;
+  return result;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 1 "strlen" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 556c5bc..7a89174 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-chkp.h"
 #include "tree-hash-traits.h"
+#include "tree-object-size.h"
 #include "builtins.h"
 #include "target.h"
 #include "diagnostic-core.h"
@@ -2627,27 +2628,28 @@ handle_builtin_malloc (enum built_in_function bcode, gimple_stmt_iterator *gsi)
 
 /* Handle a call to memset.
    After a call to calloc, memset(,0,) is unnecessary.
-   memset(malloc(n),0,n) is calloc(n,1).  */
+   memset(malloc(n),0,n) is calloc(n,1). 
+   return true when the call is transfomred, false otherwise.  */
 
 static bool
 handle_builtin_memset (gimple_stmt_iterator *gsi)
 {
   gimple *stmt2 = gsi_stmt (*gsi);
   if (!integer_zerop (gimple_call_arg (stmt2, 1)))
-    return true;
+    return false;
   tree ptr = gimple_call_arg (stmt2, 0);
   int idx1 = get_stridx (ptr);
   if (idx1 <= 0)
-    return true;
+    return false;
   strinfo *si1 = get_strinfo (idx1);
   if (!si1)
-    return true;
+    return false;
   gimple *stmt1 = si1->stmt;
   if (!stmt1 || !is_gimple_call (stmt1))
-    return true;
+    return false;
   tree callee1 = gimple_call_fndecl (stmt1);
   if (!valid_builtin_call (stmt1))
-    return true;
+    return false;
   enum built_in_function code1 = DECL_FUNCTION_CODE (callee1);
   tree size = gimple_call_arg (stmt2, 2);
   if (code1 == BUILT_IN_CALLOC)
@@ -2663,7 +2665,7 @@ handle_builtin_memset (gimple_stmt_iterator *gsi)
       si1->stmt = gsi_stmt (gsi1);
     }
   else
-    return true;
+    return false;
   tree lhs = gimple_call_lhs (stmt2);
   unlink_stmt_vdef (stmt2);
   if (lhs)
@@ -2677,12 +2679,13 @@ handle_builtin_memset (gimple_stmt_iterator *gsi)
       release_defs (stmt2);
     }
 
-  return false;
+  return true;
 }
 
 /* Handle a call to memcmp.  We try to handle small comparisons by
    converting them to load and compare, and replacing the call to memcmp
-   with a __builtin_memcmp_eq call where possible.  */
+   with a __builtin_memcmp_eq call where possible. 
+   return true when call is transformed, return false otherwise.  */
 
 static bool
 handle_builtin_memcmp (gimple_stmt_iterator *gsi)
@@ -2697,7 +2700,7 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
   imm_use_iterator iter;
 
   if (!res)
-    return true;
+    return false;
 
   FOR_EACH_IMM_USE_FAST (use_p, iter, res)
     {
@@ -2711,17 +2714,17 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
 	  tree_code code = gimple_assign_rhs_code (asgn);
 	  if ((code != EQ_EXPR && code != NE_EXPR)
 	      || !integer_zerop (gimple_assign_rhs2 (asgn)))
-	    return true;
+	    return false;
 	}
       else if (gimple_code (ustmt) == GIMPLE_COND)
 	{
 	  tree_code code = gimple_cond_code (ustmt);
 	  if ((code != EQ_EXPR && code != NE_EXPR)
 	      || !integer_zerop (gimple_cond_rhs (ustmt)))
-	    return true;
+	    return false;
 	}
       else
-	return true;
+	return false;
     }
 
   if (tree_fits_uhwi_p (len)
@@ -2756,12 +2759,274 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
 						   boolean_type_node,
 						   arg1, arg2));
 	  gimplify_and_update_call_from_tree (gsi, res);
-	  return false;
+	  return true;
 	}
     }
 
   gimple_call_set_fndecl (stmt2, builtin_decl_explicit (BUILT_IN_MEMCMP_EQ));
-  return false;
+  return true;
+}
+
+/* Given an index to the strinfo vector, compute the string length for the
+   corresponding string. Return -1 when unknown.  */
+ 
+static HOST_WIDE_INT 
+compute_string_length (int idx)
+{
+  HOST_WIDE_INT string_leni = -1; 
+  gcc_assert (idx != 0);
+
+  if (idx < 0)
+   return ~idx;
+
+  strinfo *si = get_strinfo (idx);
+  if (si)
+    {
+      tree const_string_len = get_string_length (si);
+      if (const_string_len && tree_fits_shwi_p (const_string_len))
+	string_leni = tree_to_shwi (const_string_len);
+    }
+
+  if (string_leni < 0)
+    return -1;
+
+  return string_leni;
+}
+
+/* Determine the minimum size of the object referenced by DEST expression which
+   must have a pointer type. 
+   Return the minimum size of the object if successful or NULL when the size 
+   cannot be determined.  */
+static tree
+determine_min_objsize (tree dest)
+{
+  unsigned HOST_WIDE_INT size = 0;
+
+  if (compute_builtin_object_size (dest, 2, &size))
+    return build_int_cst (sizetype, size);
+
+  /* Try to determine the size of the object through the RHS of the 
+     assign statement.  */
+  if (TREE_CODE (dest) == SSA_NAME)
+    {
+      gimple *stmt = SSA_NAME_DEF_STMT (dest);
+      if (!is_gimple_assign (stmt))
+	return NULL_TREE;
+
+      if (!gimple_assign_single_p (stmt)
+	  && !gimple_assign_unary_nop_p (stmt))
+	return NULL_TREE;
+
+      dest = gimple_assign_rhs1 (stmt);
+      return determine_min_objsize (dest);
+    }
+
+  /* Try to determine the size of the object from its type.  */
+  if (TREE_CODE (dest) != ADDR_EXPR)
+    return NULL_TREE;
+
+  tree type = TREE_TYPE (dest);
+  if (TREE_CODE (type) == POINTER_TYPE)
+    type = TREE_TYPE (type);
+
+  type = TYPE_MAIN_VARIANT (type);
+
+  /* We cannot determine the size of the array if it's a flexible array, 
+     which is declared at the end of a structure.  */
+  if (TREE_CODE (type) == ARRAY_TYPE
+      && !array_at_struct_end_p (dest))
+    {
+      tree size_t = TYPE_SIZE_UNIT (type);
+      if (size_t && TREE_CODE (size_t) == INTEGER_CST 
+	  && !integer_zerop (size_t))
+        return size_t;
+    }
+
+  return NULL_TREE;
+}
+
+/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
+   equality test against zero:
+
+   A. When the lengths of both arguments are constant and it's a strcmp:
+      * if the lengths are NOT equal, we can safely fold the call
+        to a non-zero value.
+      * otherwise, do nothing now.
+  
+   B. When the length of one argument is constant, try to replace the call with
+   a __builtin_str(n)cmp_eq call where possible, i.e:
+
+   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
+   string with constant length , C is a constant.
+     if (C <= strlen(STR) && sizeof_array(s) > C)
+       {
+         replace this call with
+         strncmp_eq (s, STR, C) (!)= 0
+       }
+     if (C > strlen(STR)
+       {
+         it can be safely treated as a call to strcmp (s, STR) (!)= 0
+         can handled by the following strcmp.
+       }
+
+   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
+   string with constant length.
+     if  (sizeof_array(s) > strlen(STR))
+       {
+         replace this call with
+         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
+       }
+
+   Return true when the call is transformed, return false otherwise. 
+ */ 
+
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree res = gimple_call_lhs (stmt);
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  int idx1 = get_stridx (arg1);
+  int idx2 = get_stridx (arg2);
+  HOST_WIDE_INT length = -1;
+  bool is_ncmp = false;
+
+  if (!res)
+    return false;
+
+  /* When both arguments are unknown, do nothing.  */
+  if (idx1 == 0 && idx2 == 0)
+    return false;
+
+  /* Handle strncmp function.  */
+  if (gimple_call_num_args (stmt) == 3)
+    {
+      tree len = gimple_call_arg (stmt, 2);
+      if (tree_fits_shwi_p (len))
+        length = tree_to_shwi (len);
+
+      is_ncmp = true;
+    }
+
+  /* For strncmp, if the length argument is NOT known, do nothing.  */
+  if (is_ncmp && length < 0)
+    return false;
+
+  /* When the result is ONLY used to do equality test against zero.  */
+  FOR_EACH_IMM_USE_FAST (use_p, iter, res) 
+    {    
+      gimple *use_stmt = USE_STMT (use_p);
+
+      if (is_gimple_debug (use_stmt))
+        continue;
+      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
+	{    
+	  tree_code code = gimple_assign_rhs_code (use_stmt);
+	  if (code == COND_EXPR) 
+	    {
+	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
+	      if ((TREE_CODE (cond_expr) != EQ_EXPR 
+		   && (TREE_CODE (cond_expr) != NE_EXPR))
+		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
+		return false;
+	    }
+	  else if (code == EQ_EXPR || code == NE_EXPR)
+	    {
+	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
+		return false;
+            }
+	  else 
+	    return false;
+	}
+      else if (gimple_code (use_stmt) == GIMPLE_COND)
+	{
+	  tree_code code = gimple_cond_code (use_stmt);
+	  if ((code != EQ_EXPR && code != NE_EXPR)
+	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
+	    return false;
+	}
+      else
+        return false;
+    }
+  
+  /* When the lengths of both arguments are known, and they are unequal, we can 
+     safely fold the call to a non-zero value for strcmp;
+     othewise, do nothing now.  */
+  if (idx1 != 0 && idx2 != 0)
+    {
+      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
+      HOST_WIDE_INT const_string_leni2 = compute_string_length (idx2);
+
+      if (!is_ncmp 
+	  && const_string_leni1 != -1
+	  && const_string_leni2 != -1
+	  && const_string_leni1 != const_string_leni2) 
+	{
+	  replace_call_with_value (gsi, integer_one_node); 
+	  return true;
+	}
+      return false;
+    }
+
+  /* When the length of one argument is constant.  */
+  tree var_string = NULL_TREE;
+  HOST_WIDE_INT const_string_leni = -1;
+  
+  if (idx1)
+    {
+      const_string_leni = compute_string_length (idx1);
+      var_string = arg2;
+    } 
+  else 
+    {
+      gcc_checking_assert (idx2);
+      const_string_leni = compute_string_length (idx2);
+      var_string = arg1;
+    } 
+
+  if (const_string_leni < 0) 
+    return false;
+ 
+  unsigned HOST_WIDE_INT var_sizei = 0;
+  /* try to determine the minimum size of the object pointed by var_string.  */
+  tree size = determine_min_objsize (var_string);
+
+  if (!size)
+    return false;
+ 
+  if (tree_fits_uhwi_p (size))
+    var_sizei = tree_to_uhwi (size);
+
+  if (var_sizei == 0)
+    return false;
+
+  /* For strncmp, if length > const_string_leni , this call can be safely 
+     transformed to a strcmp.  */
+  if (is_ncmp && length > const_string_leni)
+    is_ncmp = false;
+
+  unsigned HOST_WIDE_INT final_length 
+    = is_ncmp ? length : const_string_leni + 1;
+
+  /* Replace strcmp or strncmp with the corresponding str(n)cmp_eq.  */
+  if (var_sizei > final_length) 
+    {
+      tree fn 
+	= (is_ncmp 
+	   ? builtin_decl_implicit (BUILT_IN_STRNCMP_EQ) 
+	   : builtin_decl_implicit (BUILT_IN_STRCMP_EQ));
+      if (!fn)
+	return false;
+      tree const_string_len = build_int_cst (size_type_node, final_length); 
+      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+    }
+  else 
+    return false;
+
+  return true;
 }
 
 /* Handle a POINTER_PLUS_EXPR statement.
@@ -3209,11 +3474,16 @@ strlen_check_and_optimize_stmt (gimple_stmt_iterator *gsi, bool *cleanup_eh)
 	    handle_builtin_malloc (DECL_FUNCTION_CODE (callee), gsi);
 	    break;
 	  case BUILT_IN_MEMSET:
-	    if (!handle_builtin_memset (gsi))
+	    if (handle_builtin_memset (gsi))
 	      return false;
 	    break;
 	  case BUILT_IN_MEMCMP:
-	    if (!handle_builtin_memcmp (gsi))
+	    if (handle_builtin_memcmp (gsi))
+	      return false;
+	    break;
+	  case BUILT_IN_STRCMP:
+	  case BUILT_IN_STRNCMP:
+	    if (handle_builtin_string_cmp (gsi))
 	      return false;
 	    break;
 	  default:
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index 8fb13a0..14ab83a 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -4498,7 +4498,9 @@ find_func_aliases_for_builtin_call (struct function *fn, gcall *t)
          that use the memory pointed to by their arguments (but not
 	 transitively).  */
       case BUILT_IN_STRCMP:
+      case BUILT_IN_STRCMP_EQ:
       case BUILT_IN_STRNCMP:
+      case BUILT_IN_STRNCMP_EQ:
       case BUILT_IN_STRCASECMP:
       case BUILT_IN_STRNCASECMP:
       case BUILT_IN_MEMCMP:
diff --git a/gcc/tree.c b/gcc/tree.c
index e8dc425..85ce993 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10277,6 +10277,14 @@ build_common_builtin_nodes (void)
 			"__builtin_memcmp_eq",
 			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
 
+  local_define_builtin ("__builtin_strncmp_eq", ftype, BUILT_IN_STRNCMP_EQ,
+			"__builtin_strncmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
+  local_define_builtin ("__builtin_strcmp_eq", ftype, BUILT_IN_STRCMP_EQ,
+			"__builtin_strcmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
   /* If there's a possibility that we might use the ARM EABI, build the
     alternate __cxa_end_cleanup node used to resume from C++.  */
   if (targetm.arm_eabi_unwinder)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* committed: [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026
  2018-05-30  0:16               ` Qing Zhao
@ 2018-05-31 21:10                 ` Qing Zhao
  2018-06-23  4:49                 ` Jeff Law
  1 sibling, 0 replies; 19+ messages in thread
From: Qing Zhao @ 2018-05-31 21:10 UTC (permalink / raw)
  To: gcc Patches; +Cc: Jakub Jelinek, richard Biener, jeff Law

Hi, 

I have committed the patch as revision 261039.

thanks.

Qing

> On May 29, 2018, at 7:08 PM, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> Hi, Jeff,
> 
> Thanks a lot for your review and comments.
> 
> I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64.
> 
> please take a look at the patch again.
> 
> thanks.
> 
> Qing
> 
>> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote:
> 
>> So I originally thought you had the core logic wrong in the immediate
>> uses loop.  But it's actually the case that the return value is the
>> exact opposite of what I expected.
>> 
>> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
>> was not transformed.
>> 
>> Can you fix that so it's not so confusing?
>> 
>> I think with that change we'll be good to go, but please repost for a
>> final looksie.
>> 
>> THanks,
>> Jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026
  2018-05-30  0:16               ` Qing Zhao
  2018-05-31 21:10                 ` committed: " Qing Zhao
@ 2018-06-23  4:49                 ` Jeff Law
  2018-06-25 15:28                   ` Qing Zhao
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff Law @ 2018-06-23  4:49 UTC (permalink / raw)
  To: Qing Zhao; +Cc: GCC Patches, Jakub Jelinek, richard Biener

On 05/29/2018 06:08 PM, Qing Zhao wrote:
> Hi, Jeff,
> 
> Thanks a lot for your review and comments.
> 
> I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64.
> 
> please take a look at the patch again.
> 
> thanks.
> 
> Qing
> 
>> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote:
>> So I originally thought you had the core logic wrong in the immediate
>> uses loop.  But it's actually the case that the return value is the
>> exact opposite of what I expected.
>>
>> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
>> was not transformed.
>>
>> Can you fix that so it's not so confusing?
>>
>> I think with that change we'll be good to go, but please repost for a
>> final looksie.
>>
>> THanks,
>> Jeff
> 
> 
> 0001-2nd-Patch-for-PR78009.patch
> 
> 
> From 750f44ee0777d55b568f07e263babdedd532d315 Mon Sep 17 00:00:00 2001
> From: qing zhao <qing.zhao@oracle.com>
> Date: Tue, 29 May 2018 16:15:21 -0400
> Subject: [PATCH] 2nd Patch for PR78009 Patch for PR83026
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
> Inline strcmp with small constant strings
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
> missing strlen optimization for strcmp of unequal strings
> 
> The design doc for PR78809 is at:
> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
> 
> this patch is for the second part of change of PR78809 and PR83026:
> 
> B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
> 
>    B.1. (PR83026) When the lengths of both arguments are constant and
>         it's a strcmp:
>       * if the lengths are NOT equal, we can safely fold the call
>         to a non-zero value.
>       * otherwise, do nothing now.
> 
>    B.2. (PR78809) When the length of one argument is constant, try to replace
>    the call with a __builtin_str(n)cmp_eq call where possible, i.e:
> 
>    strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
>    string with constant length, C is a constant.
>      if (C <= strlen(STR) && sizeof_array(s) > C)
>        {
>          replace this call with
>          __builtin_strncmp_eq (s, STR, C) (!)= 0
>        }
>      if (C > strlen(STR)
>        {
>          it can be safely treated as a call to strcmp (s, STR) (!)= 0
>          can handled by the following strcmp.
>        }
> 
>    strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
>    string with constant length.
>      if  (sizeof_array(s) > strlen(STR))
>        {
>          replace this call with
>          __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
>        }
> 
>    later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
>    as __builtin_memcmp_eq, if the expansion does not succeed, change them back
>    to call to __builtin_str(n)cmp.
> 
> adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
> PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026
> 
> bootstraped and tested on both X86 and Aarch64. no regression.
> ---
>  gcc/builtins.c                     |  33 ++++
>  gcc/builtins.def                   |   5 +
>  gcc/gimple-fold.c                  |   5 +
>  gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 ++++++++
>  gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 ++++
>  gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 ++
>  gcc/tree-ssa-strlen.c              | 304 ++++++++++++++++++++++++++++++++++---
>  gcc/tree-ssa-structalias.c         |   2 +
>  gcc/tree.c                         |   8 +
>  9 files changed, 454 insertions(+), 17 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
>  create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
>  create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c
Sorry for the long delay.  This needs a ChangeLog.  With the ChangeLog
it is OK for the trunk.

Jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026
  2018-06-23  4:49                 ` Jeff Law
@ 2018-06-25 15:28                   ` Qing Zhao
  0 siblings, 0 replies; 19+ messages in thread
From: Qing Zhao @ 2018-06-25 15:28 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches, Jakub Jelinek, richard Biener


> On Jun 22, 2018, at 11:49 PM, Jeff Law <law@redhat.com> wrote:
> 
> On 05/29/2018 06:08 PM, Qing Zhao wrote:
>> Hi, Jeff,
>> 
>> Thanks a lot for your review and comments.
>> 
>> I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64.
>> 
>> please take a look at the patch again.
>> 
>> thanks.
>> 
>> Qing
>> 
>>> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote:
>>> So I originally thought you had the core logic wrong in the immediate
>>> uses loop.  But it's actually the case that the return value is the
>>> exact opposite of what I expected.
>>> 
>>> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
>>> was not transformed.
>>> 
>>> Can you fix that so it's not so confusing?
>>> 
>>> I think with that change we'll be good to go, but please repost for a
>>> final looksie.
>>> 
>>> THanks,
>>> Jeff
>> 
>> 
>> 0001-2nd-Patch-for-PR78009.patch
>> 
>> 
>> From 750f44ee0777d55b568f07e263babdedd532d315 Mon Sep 17 00:00:00 2001
>> From: qing zhao <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>
>> Date: Tue, 29 May 2018 16:15:21 -0400
>> Subject: [PATCH] 2nd Patch for PR78009 Patch for PR83026
>> 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809>
>> Inline strcmp with small constant strings
>> 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026>
>> missing strlen optimization for strcmp of unequal strings
>> 
>> The design doc for PR78809 is at:
>> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html <https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html>
>> 
>> this patch is for the second part of change of PR78809 and PR83026:
>> 
>> B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
>> 
>>   B.1. (PR83026) When the lengths of both arguments are constant and
>>        it's a strcmp:
>>      * if the lengths are NOT equal, we can safely fold the call
>>        to a non-zero value.
>>      * otherwise, do nothing now.
>> 
>>   B.2. (PR78809) When the length of one argument is constant, try to replace
>>   the call with a __builtin_str(n)cmp_eq call where possible, i.e:
>> 
>>   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
>>   string with constant length, C is a constant.
>>     if (C <= strlen(STR) && sizeof_array(s) > C)
>>       {
>>         replace this call with
>>         __builtin_strncmp_eq (s, STR, C) (!)= 0
>>       }
>>     if (C > strlen(STR)
>>       {
>>         it can be safely treated as a call to strcmp (s, STR) (!)= 0
>>         can handled by the following strcmp.
>>       }
>> 
>>   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
>>   string with constant length.
>>     if  (sizeof_array(s) > strlen(STR))
>>       {
>>         replace this call with
>>         __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
>>       }
>> 
>>   later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
>>   as __builtin_memcmp_eq, if the expansion does not succeed, change them back
>>   to call to __builtin_str(n)cmp.
>> 
>> adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
>> PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026
>> 
>> bootstraped and tested on both X86 and Aarch64. no regression.
>> ---
>> gcc/builtins.c                     |  33 ++++
>> gcc/builtins.def                   |   5 +
>> gcc/gimple-fold.c                  |   5 +
>> gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 ++++++++
>> gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 ++++
>> gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 ++
>> gcc/tree-ssa-strlen.c              | 304 ++++++++++++++++++++++++++++++++++---
>> gcc/tree-ssa-structalias.c         |   2 +
>> gcc/tree.c                         |   8 +
>> 9 files changed, 454 insertions(+), 17 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
>> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
>> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c
> Sorry for the long delay.  This needs a ChangeLog.  With the ChangeLog
> it is OK for the trunk.

Hi, Jeff,

the patch had been committed with ChangeLogs as following:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261039 <https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261039>

thanks.

Qing

> 
> Jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-06-25 15:28 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-14 19:48 [PATCH][Middle-end]2nd patch of PR78809 and PR83026 Qing Zhao
2017-12-14 20:46 ` Jakub Jelinek
2017-12-15 16:14   ` Qing Zhao
2017-12-15 16:42     ` Jakub Jelinek
2017-12-15 17:19       ` Qing Zhao
2017-12-15 17:47         ` Jakub Jelinek
2017-12-15 20:49           ` Qing Zhao
2017-12-21 21:34   ` [PATCH][Middle-end][version 2]2nd " Qing Zhao
2018-01-12 22:51     ` Jeff Law
2018-01-15  8:56       ` Richard Biener
2018-01-25 20:46         ` Qing Zhao
2018-02-07 15:36           ` [PATCH][Middle-end][version 3]2nd " Qing Zhao
2018-05-22 17:56             ` Ping: " Qing Zhao
2018-05-25 20:56             ` Jeff Law
2018-05-30  0:16               ` Qing Zhao
2018-05-31 21:10                 ` committed: " Qing Zhao
2018-06-23  4:49                 ` Jeff Law
2018-06-25 15:28                   ` Qing Zhao
2018-01-25 20:34       ` [PATCH][Middle-end][version 2]2nd " Qing Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).