public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jakub Jelinek <jakub@redhat.com>
To: Qing Zhao <qing.zhao@oracle.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>, jeff Law <law@redhat.com>,
	       richard Biener <rguenther@suse.de>
Subject: Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
Date: Thu, 14 Dec 2017 20:46:00 -0000	[thread overview]
Message-ID: <20171214204542.GB2353@tucnak> (raw)
In-Reply-To: <C5493A43-682F-4977-BDC8-5A21BB2CE7CA@oracle.com>

On Thu, Dec 14, 2017 at 01:45:21PM -0600, Qing Zhao wrote:
> 2017-12-11  Qing Zhao  <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>

No " <mailto:qing.zhao@oracle.com>" in ChangeLog entries please.

> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c
> @@ -2541,6 +2541,198 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>    return false;
>  }
>  
> +/* Given an index to the strinfo vector, compute the string length for the
> +   corresponding string. Return -1 when unknown.  */
> + 
> +static HOST_WIDE_INT 
> +compute_string_length (int idx)
> +{
> +  HOST_WIDE_INT string_leni = -1; 
> +  gcc_assert (idx != 0);
> +
> +  if (idx < 0)
> +    string_leni = ~idx;
> +  else
> +    {
> +      strinfo *si = get_strinfo (idx);
> +      if (si)
> +	{
> +	  tree const_string_len = get_string_length (si);
> +	  string_leni
> +	    = (const_string_len && tree_fits_uhwi_p (const_string_len)
> +	       ? tree_to_uhwi(const_string_len) : -1); 

So, you are returning a signed HWI, then clearly tree_fits_uhwi_p and
tree_to_uhwi are inappropriate, you should have used tree_fits_shwi_p
and tree_to_shwi.  Space after function name is missing too.
And, as you start by initializing string_leni to -1, there is no
point to write it this way rather than
	  if (const_string_len && tree_fits_shwi_p (const_string_len))
	    string_leni = tree_to_shwi (const_string_len);

> +	}
> +    }

Maybe also do
  if (string_leni < 0)
    return -1;

> +  return string_leni;

unless the callers just look for negative value as unusable.

> +      tree len = gimple_call_arg (stmt, 2);
> +      if (tree_fits_uhwi_p (len))
> +        length = tree_to_uhwi (len);

Similarly to above, you are mixing signed and unsigned HWIs too much.

> +      if (gimple_code (ustmt) == GIMPLE_ASSIGN)

      if (is_gimple_assign (ustmt))

Usually we use use_stmt instead of ustmt.

> +	{    
> +	  gassign *asgn = as_a <gassign *> (ustmt);

No need for the gassign and ugly as_a, gimple_assign_rhs_code
as well as gimple_assign_rhs2 can be called on gimple * too.

> +	  tree_code code = gimple_assign_rhs_code (asgn);
> +	  if ((code != EQ_EXPR && code != NE_EXPR)
> +	      || !integer_zerop (gimple_assign_rhs2 (asgn)))
> +	    return true;
> +	}
> +      else if (gimple_code (ustmt) == GIMPLE_COND)
> +	{
> +	  tree_code code = gimple_cond_code (ustmt);
> +	  if ((code != EQ_EXPR && code != NE_EXPR)
> +	      || !integer_zerop (gimple_cond_rhs (ustmt)))
> +	    return true;

There is another case you are missing, assign stmt with
gimple_assign_rhs_code COND_EXPR, where gimple_assign_rhs1 is
tree with TREE_CODE EQ_EXPR or NE_EXPR with TREE_OPERAND (rhs1, 1)
integer_zerop.

> +  /* When both arguments are known, and their strlens are unequal, we can 
> +     safely fold the call to a non-zero value for strcmp;
> +     othewise, do nothing now.  */
> +  if (idx1 != 0 && idx2 != 0)
> +    {
> +      HOST_WIDE_INT const_string_leni1 = -1;
> +      HOST_WIDE_INT const_string_leni2 = -1;
> +      const_string_leni1 = compute_string_length (idx1);
> +      const_string_leni2 = compute_string_length (idx2);

Why do you initialize the vars when you immediately overwrite it?
Just do
      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
etc.

> +  /* When one of args is constant string.  */
> +  tree var_string;
> +  HOST_WIDE_INT const_string_leni = -1;
> +  
> +  if (idx1)
> +    {
> +      const_string_leni = compute_string_length (idx1);
> +      var_string = arg2;
> +    } 
> +  else if (idx2)
> +    {
> +      const_string_leni = compute_string_length (idx2);
> +      var_string = arg1;
> +    } 

Haven't you checked earlier that one of idx1 and idx2 is non-zero?
If so, then the else if (idx2) will just might confuse -Wuninitialized,
if you just use else, you don't need to initialize const_string_leni
either.

> +  /* Try to get the min and max string length for var_string, the max length is
> +     the size of the array - 1, recorded in size[1].  */ 
> +  get_range_strlen (var_string, size);
> +  if (size[1] && tree_fits_uhwi_p (size[1]))
> +    var_sizei = tree_to_uhwi (size[1]) + 1;

This is something that looks problematic to me.  get_range_strlen returns
some conservative upper bound on the string length, which is fine if
var_string points to say a TREE_STATIC variable where you know the allocated
size, or automatic variable.  But if somebody passes you a pointer to a
structure and the source doesn't contain aggregate copying for it, not sure
if you can take for granted that all the bytes are readable after the '\0'
in the string.  Hopefully at least for flexible array members and arrays in
such positions get_range_strlen will not provide the upper bound, but even
in other cases it doesn't feel safe to me.

Furthermore, in the comments you say that you do it only for small strings,
but in the patch I can't see any upper bound, so you could transform strlen
that would happen to return say just 1 or 2 with a function call that
possibly reads megabytes of data (memcmp may read all bytes, not just stop
at the first difference).
> +  unsigned HOST_WIDE_INT final_length 
> +    = is_ncmp ? length : (const_string_leni + 1);

Why the ()s?

	Jakub

  reply	other threads:[~2017-12-14 20:46 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-14 19:48 Qing Zhao
2017-12-14 20:46 ` Jakub Jelinek [this message]
2017-12-15 16:14   ` Qing Zhao
2017-12-15 16:42     ` Jakub Jelinek
2017-12-15 17:19       ` Qing Zhao
2017-12-15 17:47         ` Jakub Jelinek
2017-12-15 20:49           ` Qing Zhao
2017-12-21 21:34   ` [PATCH][Middle-end][version 2]2nd " Qing Zhao
2018-01-12 22:51     ` Jeff Law
2018-01-15  8:56       ` Richard Biener
2018-01-25 20:46         ` Qing Zhao
2018-02-07 15:36           ` [PATCH][Middle-end][version 3]2nd " Qing Zhao
2018-05-22 17:56             ` Ping: " Qing Zhao
2018-05-25 20:56             ` Jeff Law
2018-05-30  0:16               ` Qing Zhao
2018-05-31 21:10                 ` committed: " Qing Zhao
2018-06-23  4:49                 ` Jeff Law
2018-06-25 15:28                   ` Qing Zhao
2018-01-25 20:34       ` [PATCH][Middle-end][version 2]2nd " Qing Zhao
2017-12-15 12:41 [PATCH][Middle-end]2nd " Wilco Dijkstra
2017-12-15 22:09 ` Qing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171214204542.GB2353@tucnak \
    --to=jakub@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=qing.zhao@oracle.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).