public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <law@redhat.com>
To: Qing Zhao <qing.zhao@oracle.com>, Jakub Jelinek <jakub@redhat.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	richard Biener <rguenther@suse.de>
Subject: Re: [PATCH][Middle-end][version 2]2nd patch of PR78809 and PR83026
Date: Fri, 12 Jan 2018 22:51:00 -0000	[thread overview]
Message-ID: <dbc784db-938a-5962-7b37-3659ef7263f3@redhat.com> (raw)
In-Reply-To: <A4FB5DB3-1280-4155-A20F-B3D03E09D252@oracle.com>

On 12/21/2017 02:25 PM, Qing Zhao wrote:
> Hi, 
> 
> I updated my patch based on all your comments. 
> 
> the major changes are the following:
> 
> 	1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq;
>             in builtins.c,  when expanding the new __builtin_str(n)cmp_eq call, expand them first as
>             __builtin_memcmp_eq, if Not succeed,  change the call back to __builtin_str(n)cmp.
> 	2. change the call to “get_range_strlen” with “compute_objsize”.
Please read the big comment before compute_objsize.  If you are going to
use it to influence code generation or optimization, then you're most
likely doing something wrong.

compute_objsize can return an estimate in some circumstances.


> 	3. add the missing case for equality checking with zero;
> 	4. adjust the new testing case for PR83026; add a new testing case for the missing case added in 3.
> 	5. update “uhwi” to “shwi” for where it needs;
> 	6. some other minor format changes.
> 
> the changes are retested on x86 and aarch64, bootstrapped and regression tested. no issue.
> 
> Okay for trunk?
> 
> thanks.
> 
> Qing
> 
> Please see the updated patch:
> 
> gcc/ChangeLog:
> 
> +2017-12-21  Qing Zhao  <qing.zhao@oracle.com>
> +
> +	PR middle-end/78809
> +	PR middle-end/83026
> +	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
> +	and BUILT_IN_STRNCMP_EQ.
> +	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
> +	BUILT_IN_STRNCMP_EQ.
> +	* tree-ssa-strlen.c (compute_string_length): New function.
> +	(handle_builtin_string_cmp): New function to handle calls to
> +	string compare functions.
> +	(strlen_optimize_stmt): Add handling to builtin string compare
> +	calls. 
> +	* tree.c (build_common_builtin_nodes): Add new defines of
> +	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
> +
> gcc/testsuite/ChangeLog
> 
> +2017-12-21  Qing Zhao  <qing.zhao@oracle.com> 
> +
> +	PR middle-end/78809
> +	* gcc.dg/strcmpopt_2.c: New testcase.
> +	* gcc.dg/strcmpopt_3.c: New testcase.
> +
> +	PR middle-end/83026
> +	* gcc.dg/strcmpopt_3.c: New testcase.
What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ.
ISTM that if you're going to introduce those new builtins, then you have
to audit all the code that runs between their introduction into the IL
and when you expand them to ensure they're handled properly.

All you're really doing is carrying along a status bit about what
tree-ssa-strlen did.  So you could possibly store that status bit somewhere.

The concern with both is that something later invalidates the analysis
you've done.  I'm having a hard time coming up with a case where this
could happen, but I'm definitely concerned about this possibility.
Though I feel it's more likely to happen if we store a status bit vs
using STRCMP_EQ STRNCMP_EQ.

[ For example, we have two calls with the same arguments, but one feeds
an equality test, the other does not.  If we store a status bit that one
could be transformed, but then we end up CSE-ing the two calls, the
status bit would be incorrect because one of the calls did not feed an
equality test.  Hmmm. ]




> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> index 94f20ef..57563ef 100644
> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c
> @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>    return false;
>  }
>  
> +/* Given an index to the strinfo vector, compute the string length for the
> +   corresponding string. Return -1 when unknown.  */
> + 
> +static HOST_WIDE_INT 
> +compute_string_length (int idx)
> +{
> +  HOST_WIDE_INT string_leni = -1; 
> +  gcc_assert (idx != 0);
> +
> +  if (idx < 0)
> +    string_leni = ~idx;
So it seems to me you should just
  return ~idx;

Then appropriately simplify the rest of the code.

> +
> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
> +   equality test against zero:
> +
> +   A. When both arguments are constant strings and it's a strcmp:
> +      * if the length of the strings are NOT equal, we can safely fold the call
> +        to a non-zero value.
> +      * otherwise, do nothing now.
I'm guessing your comment needs a bit of work.  If both arguments are
constant strings, then we can just use the host str[n]cmp to fold the
str[n]cmp to a constant.  Presumably this is handled earlier :-)

So what I'm guessing is you're really referring to the case where the
lengths are known constants, even if the contents of the strings
themselves are not.  In that case if its an equality comparison, then
you can fold to a constant.  Otherwise we do nothing.  So I think the
comment needs updating here.






> +  
> +   B. When one of the arguments is constant string, try to replace the call with
> +   a __builtin_str(n)cmp_eq call where possible, i.e:
> +
> +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   constant string, C is a constant.
> +     if (C <= strlen(STR) && sizeof_array(s) > C)
> +       {
> +         replace this call with
> +         strncmp_eq (s, STR, C) (!)= 0
> +       }
> +     if (C > strlen(STR)
> +       {
> +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
> +         can handled by the following strcmp.
> +       }
> +
> +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   constant string.
> +     if  (sizeof_array(s) > strlen(STR))
> +       {
> +         replace this call with
> +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
So I'd hazard a guess that this comment has the same problem.  It's
mixing up the concept of a constant string and the concept of a
nonconstant string, but which has a known constant length.

First, if one of the arguments is a string constant, then it should be
easy to get the constant at expansion time with minimal effort.  It's
also the case that knowing if we're comparing the result against zero is
trivial to figure out at expansion time.  This would probably be a
reasonble thing to add to the expanders.



Your function comment for handle_builtin_string_cmp does not indicate
what the return value means.  From reading the code is appears to return
TRUE when it does nothing and FALSE when it optimizes the call in some
way.  Is that correct?  That seems inverted from the way we'd normally
write this stuff.



> +
> +  /* When one of args is constant string.  */
> +  tree var_string = NULL_TREE;
> +  HOST_WIDE_INT const_string_leni = -1;
> +  
> +  if (idx1)
> +    {
> +      const_string_leni = compute_string_length (idx1);
> +      var_string = arg2;
> +    } 
> +  else 
> +    {
> +      gcc_checking_assert (idx2);
> +      const_string_leni = compute_string_length (idx2);
> +      var_string = arg1;
> +    } 
> +
> +  if (const_string_leni < 0) 
> +    return true;
> + 
> +  unsigned HOST_WIDE_INT var_sizei = 0;
> +  /* try to determine the minimum size of the object pointed by var_string.  */
> +  tree size = compute_objsize (var_string, 2);
So this worries me as well.  compute_objsize is not supposed to be used
to influence code generation or optimization.  It is not guaranteed to
return an exact size, but instead may return an estimate!




I'd really like to hear Jakub or Richi chime in with their thoughts on
how this transforms one builtin into another and whether or not they
think that is wise.

jeff

  reply	other threads:[~2018-01-12 22:48 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-14 19:48 [PATCH][Middle-end]2nd " Qing Zhao
2017-12-14 20:46 ` Jakub Jelinek
2017-12-15 16:14   ` Qing Zhao
2017-12-15 16:42     ` Jakub Jelinek
2017-12-15 17:19       ` Qing Zhao
2017-12-15 17:47         ` Jakub Jelinek
2017-12-15 20:49           ` Qing Zhao
2017-12-21 21:34   ` [PATCH][Middle-end][version 2]2nd " Qing Zhao
2018-01-12 22:51     ` Jeff Law [this message]
2018-01-15  8:56       ` Richard Biener
2018-01-25 20:46         ` Qing Zhao
2018-02-07 15:36           ` [PATCH][Middle-end][version 3]2nd " Qing Zhao
2018-05-22 17:56             ` Ping: " Qing Zhao
2018-05-25 20:56             ` Jeff Law
2018-05-30  0:16               ` Qing Zhao
2018-05-31 21:10                 ` committed: " Qing Zhao
2018-06-23  4:49                 ` Jeff Law
2018-06-25 15:28                   ` Qing Zhao
2018-01-25 20:34       ` [PATCH][Middle-end][version 2]2nd " Qing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dbc784db-938a-5962-7b37-3659ef7263f3@redhat.com \
    --to=law@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=qing.zhao@oracle.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).