* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
@ 2017-12-15 12:41 Wilco Dijkstra
2017-12-15 22:09 ` Qing Zhao
0 siblings, 1 reply; 9+ messages in thread
From: Wilco Dijkstra @ 2017-12-15 12:41 UTC (permalink / raw)
To: Qing Zhao; +Cc: GCC Patches, nd, Jeff Law, Richard Biener
Hi Qing,
Just looking at a very high level, I have a few comments:
1. Constant folding str(n)cmp - folding is done separately in fold-const-call.c
and gimple-fold.c. There is already code for folding strcmp and strncmp,
so we shouldn't need to add new foldings. Or do you have an example that
isn't folded as expected? If so, a fix should be added to the existing code.
2. Why check for str(n)cmp == 0 / != 0? There is no need to explicitly check
for equality comparisons since folding into memcmp is always good.
3. Why handle strncmp? There is already code to convert strncmp into strcmp,
so why repeat that again in a different way? It just seems to make the
code significantly more complex for no benefit.
You can achieve the same effect by just optimizing strcmp into memcmp when
legal without checking for equality comparison. As a result you can considerably
reduce the size of this patch while handling more useful cases.
Wilco
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
2017-12-15 12:41 [PATCH][Middle-end]2nd patch of PR78809 and PR83026 Wilco Dijkstra
@ 2017-12-15 22:09 ` Qing Zhao
0 siblings, 0 replies; 9+ messages in thread
From: Qing Zhao @ 2017-12-15 22:09 UTC (permalink / raw)
To: Wilco Dijkstra; +Cc: GCC Patches, nd, Jeff Law, Richard Biener
Hi, Wilco,
thanks a lot for your review and comments.
> On Dec 15, 2017, at 6:41 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>
> Hi Qing,
>
> Just looking at a very high level, I have a few comments:
>
> 1. Constant folding str(n)cmp - folding is done separately in fold-const-call.c
> and gimple-fold.c. There is already code for folding strcmp and strncmp,
> so we shouldn't need to add new foldings. Or do you have an example that
> isn't folded as expected? If so, a fix should be added to the existing code.
are you referring the following:
B.1. (PR83026) When both arguments are constant strings and it's a strcmp:
* if the length of the strings are NOT equal, we can safely fold the call
to a non-zero value.
* otherwise, do nothing now.
the following testing case cannot be folded by the current gimple-fold or fold-const-call:
int f1 (void)
{
char *s0= "abcd";
char s[8];
strcpy (s, s0);
return __builtin_strcmp(s, "abc") != 0;
}
in order to fold the above call, we need the strlen information that provided by the tree-ssa-strlen.c.
and I don’t think that improving the gimple-fold or fold-const-call can fold the above call to 1.
with the strlen info in tree-ssa-strlen.c, the strlen of s could be determined as 4, which is not equal
to the strlen of the constant string “abc”, then we can safely fold
__builtin_strcmp(s, "abc") != 0
to 1.
(actually, your above comments remind me that one of my testcases added has some issue, I will fix the testcase,
strcmpopt_3.c).
>
> 2. Why check for str(n)cmp == 0 / != 0? There is no need to explicitly check
> for equality comparisons since folding into memcmp is always good.
If the result is Not used to compare with 0, then we need the exact value of strcmp of two strings. negative value, zero, or positive value.
I am not sure changing str(n)cmp to memcmp under such situation will still return the correct negative or positive value?
>
> 3. Why handle strncmp? There is already code to convert strncmp into strcmp,
> so why repeat that again in a different way? It just seems to make the
> code significantly more complex for no benefit.
if strncmp cannot be convert to strcmp, there is still call to strncmp that we should handle, right?
Qing
>
> You can achieve the same effect by just optimizing strcmp into memcmp when
> legal without checking for equality comparison. As a result you can considerably
> reduce the size of this patch while handling more useful cases.
>
> Wilco
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH][Middle-end]2nd patch of PR78809 and PR83026
@ 2017-12-14 19:48 Qing Zhao
2017-12-14 20:46 ` Jakub Jelinek
0 siblings, 1 reply; 9+ messages in thread
From: Qing Zhao @ 2017-12-14 19:48 UTC (permalink / raw)
To: gcc-patches; +Cc: jeff Law, richard Biener
Hi,
I am not sure whether it’s proper to send this patch during this late stage.
however, since the patch itself is quite straightforward, I decided to send it now.
=================
2nd Patch for PR78009
Patch for PR83026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809>
Inline strcmp with small constant strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026>
missing strlen optimization for strcmp of unequal strings
The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html <https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html>
this patch is for the second part of change of PR78809 and PR83026:
B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
B.1. (PR83026) When both arguments are constant strings and it's a strcmp:
* if the length of the strings are NOT equal, we can safely fold the call
to a non-zero value.
* otherwise, do nothing now.
B.2. (PR78809) When one of the arguments is a constant string, try to replace
the call with a __builtin_memcmp_eq call where possible, i.e:
strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
constant string, C is a constant.
if (C <= strlen(STR) && sizeof_array(s) > C)
{
replace this call with
memcmp (s, STR, C) (!)= 0
}
if (C > strlen(STR)
{
it can be safely treated as a call to strcmp (s, STR) (!)= 0
can handled by the following strcmp.
}
strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
constant string.
if (sizeof_array(s) > strlen(STR))
{
replace this call with
memcmp (s, STR, strlen(STR)+1) (!)= 0
}
(NOTE, currently, memcmp(s1, s2, N) (!)=0 has been optimized to a simple
sequence to access all bytes and accumulate the overall result in GCC by
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171>
as a result, such str(n)cmp call would like to be replaced by simple
sequences to access all types and accumulate the overall results.
adding test case strcmpopt_2.c into gcc.dg for part B of PR78809
adding test case strcmpopt_3.c into gcc.dg for PR83026
bootstraped and tested on both X86 and Aarch64. no regression.
Okay for trunk?
thanks.
Qing
================
gcc/ChangeLog:
2017-12-11 Qing Zhao <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>
PR middle-end/78809
PR middle-end/83026
* tree-ssa-strlen.c (compute_string_length): New function.
(handle_builtin_string_cmp): New function to handle calls to
string compare functions.
(strlen_optimize_stmt): Add handling to builtin string compare
calls.
gcc/testsuite/ChangeLog:
2017-12-11 Qing Zhao <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>
PR middle-end/78809
* gcc.dg/strcmpopt_2.c: New testcase.
PR middle-end/83026
* gcc.dg/strcmpopt_3.c: New testcase.
============
---
gcc/testsuite/gcc.dg/strcmpopt_2.c | 67 +++++++++++++
gcc/testsuite/gcc.dg/strcmpopt_3.c | 27 +++++
gcc/tree-ssa-strlen.c | 197 +++++++++++++++++++++++++++++++++++++
3 files changed, 291 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
new file mode 100644
index 0000000..ac8ff39
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+char s[100] = {'a','b','c','d'};
+typedef struct { int x; char s[8]; } S;
+
+__attribute__ ((noinline)) int
+f1 (S *s)
+{
+ return __builtin_strcmp (s->s, "abc") != 0;
+}
+
+__attribute__ ((noinline)) int
+f2 (void)
+{
+ return __builtin_strcmp (s, "abc") != 0;
+}
+
+__attribute__ ((noinline)) int
+f3 (S *s)
+{
+ return __builtin_strcmp ("abc", s->s) != 0;
+}
+
+__attribute__ ((noinline)) int
+f4 (void)
+{
+ return __builtin_strcmp ("abc", s) != 0;
+}
+
+__attribute__ ((noinline)) int
+f5 (S *s)
+{
+ return __builtin_strncmp (s->s, "abc", 3) != 0;
+}
+
+__attribute__ ((noinline)) int
+f6 (void)
+{
+ return __builtin_strncmp (s, "abc", 2) != 0;
+}
+
+__attribute__ ((noinline)) int
+f7 (S *s)
+{
+ return __builtin_strncmp ("abc", s->s, 3) != 0;
+}
+
+__attribute__ ((noinline)) int
+f8 (void)
+{
+ return __builtin_strncmp ("abc", s, 2) != 0;
+}
+
+int main (void)
+{
+ S ss = {2, {'a','b','c'}};
+
+ if (f1 (&ss) != 0 || f2 () != 1 || f3 (&ss) != 0 ||
+ f4 () != 1 || f5 (&ss) != 0 || f6 () != 0 ||
+ f7 (&ss) != 0 || f8 () != 0)
+ __builtin_abort ();
+
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "memcmp_eq \\(" 8 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
new file mode 100644
index 0000000..a3be431
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__ ((noinline)) int
+f1 (void)
+{
+ char *s= "abcd";
+ return __builtin_strcmp(s, "abc") != 0;
+}
+
+__attribute__ ((noinline)) int
+f2 (void)
+{
+ char *s = "ab";
+ return __builtin_strcmp("abc", s) != 0;
+}
+
+int main (void)
+{
+ if (f1 () != 1
+ || f2 () != 1)
+ __builtin_abort ();
+
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 48b9241..aafa574 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -2541,6 +2541,198 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
return false;
}
+/* Given an index to the strinfo vector, compute the string length for the
+ corresponding string. Return -1 when unknown. */
+
+static HOST_WIDE_INT
+compute_string_length (int idx)
+{
+ HOST_WIDE_INT string_leni = -1;
+ gcc_assert (idx != 0);
+
+ if (idx < 0)
+ string_leni = ~idx;
+ else
+ {
+ strinfo *si = get_strinfo (idx);
+ if (si)
+ {
+ tree const_string_len = get_string_length (si);
+ string_leni
+ = (const_string_len && tree_fits_uhwi_p (const_string_len)
+ ? tree_to_uhwi(const_string_len) : -1);
+ }
+ }
+ return string_leni;
+}
+
+/* Handle a call to strcmp or strncmp. When the result is ONLY used to do
+ equality test against zero:
+
+ A. When both arguments are constant strings and it's a strcmp:
+ * if the length of the strings are NOT equal, we can safely fold the call
+ to a non-zero value.
+ * otherwise, do nothing now.
+
+ B. When one of the arguments is constant string, try to replace the call with
+ a __builtin_memcmp_eq call where possible, i.e:
+
+ strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
+ constant string, C is a constant.
+ if (C <= strlen(STR) && sizeof_array(s) > C)
+ {
+ replace this call with
+ memcmp (s, STR, C) (!)= 0
+ }
+ if (C > strlen(STR)
+ {
+ it can be safely treated as a call to strcmp (s, STR) (!)= 0
+ can handled by the following strcmp.
+ }
+
+ strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
+ constant string.
+ if (sizeof_array(s) > strlen(STR))
+ {
+ replace this call with
+ memcmp (s, STR, strlen(STR)+1) (!)= 0
+ }
+ */
+
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+ gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+ tree res = gimple_call_lhs (stmt);
+ use_operand_p use_p;
+ imm_use_iterator iter;
+ tree arg1 = gimple_call_arg (stmt, 0);
+ tree arg2 = gimple_call_arg (stmt, 1);
+ int idx1 = get_stridx (arg1);
+ int idx2 = get_stridx (arg2);
+ HOST_WIDE_INT length = -1;
+ bool is_ncmp = false;
+
+ if (!res)
+ return true;
+
+ /* When both arguments are unknown, do nothing. */
+ if (idx1 == 0 && idx2 == 0)
+ return true;
+
+ /* Handle strncmp function. */
+ if (gimple_call_num_args (stmt) == 3)
+ {
+ tree len = gimple_call_arg (stmt, 2);
+ if (tree_fits_uhwi_p (len))
+ length = tree_to_uhwi (len);
+
+ is_ncmp = true;
+ }
+
+ /* For strncmp, if the length argument is NOT known, do nothing. */
+ if (is_ncmp && length < 0)
+ return true;
+
+ /* When the result is ONLY used to do equality test against zero. */
+ FOR_EACH_IMM_USE_FAST (use_p, iter, res)
+ {
+ gimple *ustmt = USE_STMT (use_p);
+
+ if (is_gimple_debug (ustmt))
+ continue;
+ if (gimple_code (ustmt) == GIMPLE_ASSIGN)
+ {
+ gassign *asgn = as_a <gassign *> (ustmt);
+ tree_code code = gimple_assign_rhs_code (asgn);
+ if ((code != EQ_EXPR && code != NE_EXPR)
+ || !integer_zerop (gimple_assign_rhs2 (asgn)))
+ return true;
+ }
+ else if (gimple_code (ustmt) == GIMPLE_COND)
+ {
+ tree_code code = gimple_cond_code (ustmt);
+ if ((code != EQ_EXPR && code != NE_EXPR)
+ || !integer_zerop (gimple_cond_rhs (ustmt)))
+ return true;
+ }
+ else
+ return true;
+ }
+
+ /* When both arguments are known, and their strlens are unequal, we can
+ safely fold the call to a non-zero value for strcmp;
+ othewise, do nothing now. */
+ if (idx1 != 0 && idx2 != 0)
+ {
+ HOST_WIDE_INT const_string_leni1 = -1;
+ HOST_WIDE_INT const_string_leni2 = -1;
+ const_string_leni1 = compute_string_length (idx1);
+ const_string_leni2 = compute_string_length (idx2);
+ if (!is_ncmp
+ && const_string_leni1 != -1
+ && const_string_leni2 != -1
+ && const_string_leni1 != const_string_leni2)
+ {
+ replace_call_with_value (gsi, integer_one_node);
+ return false;
+ }
+ return true;
+ }
+
+ /* When one of args is constant string. */
+ tree var_string;
+ HOST_WIDE_INT const_string_leni = -1;
+
+ if (idx1)
+ {
+ const_string_leni = compute_string_length (idx1);
+ var_string = arg2;
+ }
+ else if (idx2)
+ {
+ const_string_leni = compute_string_length (idx2);
+ var_string = arg1;
+ }
+
+ if (const_string_leni < 0)
+ return true;
+
+ tree size[2];
+ unsigned HOST_WIDE_INT var_sizei = 0;
+
+ /* Try to get the min and max string length for var_string, the max length is
+ the size of the array - 1, recorded in size[1]. */
+ get_range_strlen (var_string, size);
+ if (size[1] && tree_fits_uhwi_p (size[1]))
+ var_sizei = tree_to_uhwi (size[1]) + 1;
+
+ if (var_sizei == 0)
+ return true;
+
+ /* For strncmp, if length > const_string_leni , this call can be safely
+ transformed to a strcmp. */
+ if (is_ncmp && length > const_string_leni)
+ is_ncmp = false;
+
+ unsigned HOST_WIDE_INT final_length
+ = is_ncmp ? length : (const_string_leni + 1);
+
+ /* Replace strcmp or strncmp with the corresponding memcmp. */
+ if (var_sizei > final_length)
+ {
+ tree fn = builtin_decl_implicit (BUILT_IN_MEMCMP_EQ);
+ if (!fn)
+ return true;
+ tree const_string_len = build_int_cst (size_type_node, final_length);
+ update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+ }
+ else
+ return true;
+
+ return false;
+}
+
/* Handle a POINTER_PLUS_EXPR statement.
For p = "abcd" + 2; compute associated length, or if
p = q + off is pointing to a '\0' character of a string, call
@@ -2940,6 +3132,11 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
if (!handle_builtin_memcmp (gsi))
return false;
break;
+ case BUILT_IN_STRCMP:
+ case BUILT_IN_STRNCMP:
+ if (!handle_builtin_string_cmp (gsi))
+ return false;
+ break;
default:
break;
}
--
1.9.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
2017-12-14 19:48 Qing Zhao
@ 2017-12-14 20:46 ` Jakub Jelinek
2017-12-15 16:14 ` Qing Zhao
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2017-12-14 20:46 UTC (permalink / raw)
To: Qing Zhao; +Cc: gcc-patches, jeff Law, richard Biener
On Thu, Dec 14, 2017 at 01:45:21PM -0600, Qing Zhao wrote:
> 2017-12-11 Qing Zhao <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>
No " <mailto:qing.zhao@oracle.com>" in ChangeLog entries please.
> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c
> @@ -2541,6 +2541,198 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
> return false;
> }
>
> +/* Given an index to the strinfo vector, compute the string length for the
> + corresponding string. Return -1 when unknown. */
> +
> +static HOST_WIDE_INT
> +compute_string_length (int idx)
> +{
> + HOST_WIDE_INT string_leni = -1;
> + gcc_assert (idx != 0);
> +
> + if (idx < 0)
> + string_leni = ~idx;
> + else
> + {
> + strinfo *si = get_strinfo (idx);
> + if (si)
> + {
> + tree const_string_len = get_string_length (si);
> + string_leni
> + = (const_string_len && tree_fits_uhwi_p (const_string_len)
> + ? tree_to_uhwi(const_string_len) : -1);
So, you are returning a signed HWI, then clearly tree_fits_uhwi_p and
tree_to_uhwi are inappropriate, you should have used tree_fits_shwi_p
and tree_to_shwi. Space after function name is missing too.
And, as you start by initializing string_leni to -1, there is no
point to write it this way rather than
if (const_string_len && tree_fits_shwi_p (const_string_len))
string_leni = tree_to_shwi (const_string_len);
> + }
> + }
Maybe also do
if (string_leni < 0)
return -1;
> + return string_leni;
unless the callers just look for negative value as unusable.
> + tree len = gimple_call_arg (stmt, 2);
> + if (tree_fits_uhwi_p (len))
> + length = tree_to_uhwi (len);
Similarly to above, you are mixing signed and unsigned HWIs too much.
> + if (gimple_code (ustmt) == GIMPLE_ASSIGN)
if (is_gimple_assign (ustmt))
Usually we use use_stmt instead of ustmt.
> + {
> + gassign *asgn = as_a <gassign *> (ustmt);
No need for the gassign and ugly as_a, gimple_assign_rhs_code
as well as gimple_assign_rhs2 can be called on gimple * too.
> + tree_code code = gimple_assign_rhs_code (asgn);
> + if ((code != EQ_EXPR && code != NE_EXPR)
> + || !integer_zerop (gimple_assign_rhs2 (asgn)))
> + return true;
> + }
> + else if (gimple_code (ustmt) == GIMPLE_COND)
> + {
> + tree_code code = gimple_cond_code (ustmt);
> + if ((code != EQ_EXPR && code != NE_EXPR)
> + || !integer_zerop (gimple_cond_rhs (ustmt)))
> + return true;
There is another case you are missing, assign stmt with
gimple_assign_rhs_code COND_EXPR, where gimple_assign_rhs1 is
tree with TREE_CODE EQ_EXPR or NE_EXPR with TREE_OPERAND (rhs1, 1)
integer_zerop.
> + /* When both arguments are known, and their strlens are unequal, we can
> + safely fold the call to a non-zero value for strcmp;
> + othewise, do nothing now. */
> + if (idx1 != 0 && idx2 != 0)
> + {
> + HOST_WIDE_INT const_string_leni1 = -1;
> + HOST_WIDE_INT const_string_leni2 = -1;
> + const_string_leni1 = compute_string_length (idx1);
> + const_string_leni2 = compute_string_length (idx2);
Why do you initialize the vars when you immediately overwrite it?
Just do
HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
etc.
> + /* When one of args is constant string. */
> + tree var_string;
> + HOST_WIDE_INT const_string_leni = -1;
> +
> + if (idx1)
> + {
> + const_string_leni = compute_string_length (idx1);
> + var_string = arg2;
> + }
> + else if (idx2)
> + {
> + const_string_leni = compute_string_length (idx2);
> + var_string = arg1;
> + }
Haven't you checked earlier that one of idx1 and idx2 is non-zero?
If so, then the else if (idx2) will just might confuse -Wuninitialized,
if you just use else, you don't need to initialize const_string_leni
either.
> + /* Try to get the min and max string length for var_string, the max length is
> + the size of the array - 1, recorded in size[1]. */
> + get_range_strlen (var_string, size);
> + if (size[1] && tree_fits_uhwi_p (size[1]))
> + var_sizei = tree_to_uhwi (size[1]) + 1;
This is something that looks problematic to me. get_range_strlen returns
some conservative upper bound on the string length, which is fine if
var_string points to say a TREE_STATIC variable where you know the allocated
size, or automatic variable. But if somebody passes you a pointer to a
structure and the source doesn't contain aggregate copying for it, not sure
if you can take for granted that all the bytes are readable after the '\0'
in the string. Hopefully at least for flexible array members and arrays in
such positions get_range_strlen will not provide the upper bound, but even
in other cases it doesn't feel safe to me.
Furthermore, in the comments you say that you do it only for small strings,
but in the patch I can't see any upper bound, so you could transform strlen
that would happen to return say just 1 or 2 with a function call that
possibly reads megabytes of data (memcmp may read all bytes, not just stop
at the first difference).
> + unsigned HOST_WIDE_INT final_length
> + = is_ncmp ? length : (const_string_leni + 1);
Why the ()s?
Jakub
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
2017-12-14 20:46 ` Jakub Jelinek
@ 2017-12-15 16:14 ` Qing Zhao
2017-12-15 16:42 ` Jakub Jelinek
0 siblings, 1 reply; 9+ messages in thread
From: Qing Zhao @ 2017-12-15 16:14 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: gcc-patches, jeff Law, richard Biener
Hi, Jakub,
thanks a lot for your detailed review.
> On Dec 14, 2017, at 2:45 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Thu, Dec 14, 2017 at 01:45:21PM -0600, Qing Zhao wrote:
>> 2017-12-11 Qing Zhao <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>
>
> No " <mailto:qing.zhao@oracle.com>" in ChangeLog entries please.
this is an error when I pasted this part from my terminal to mail editor, not in my real code.
will double check next time when sending out email.
>
>> --- a/gcc/tree-ssa-strlen.c
>> +++ b/gcc/tree-ssa-strlen.c
>> @@ -2541,6 +2541,198 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>> return false;
>> }
>>
>> +/* Given an index to the strinfo vector, compute the string length for the
>> + corresponding string. Return -1 when unknown. */
>> +
>> +static HOST_WIDE_INT
>> +compute_string_length (int idx)
>> +{
>> + HOST_WIDE_INT string_leni = -1;
>> + gcc_assert (idx != 0);
>> +
>> + if (idx < 0)
>> + string_leni = ~idx;
>> + else
>> + {
>> + strinfo *si = get_strinfo (idx);
>> + if (si)
>> + {
>> + tree const_string_len = get_string_length (si);
>> + string_leni
>> + = (const_string_len && tree_fits_uhwi_p (const_string_len)
>> + ? tree_to_uhwi(const_string_len) : -1);
>
> So, you are returning a signed HWI, then clearly tree_fits_uhwi_p and
> tree_to_uhwi are inappropriate, you should have used tree_fits_shwi_p
> and tree_to_shwi. Space after function name is missing too.
> And, as you start by initializing string_leni to -1, there is no
> point to write it this way rather than
> if (const_string_len && tree_fits_shwi_p (const_string_len))
> string_leni = tree_to_shwi (const_string_len);
originally it returned an unsigned HWI. but later I changed it to return a signed one since I
need a negative value to represent the UNKNOWN state.
I will fix this.
>
>> + }
>> + }
>
> Maybe also do
> if (string_leni < 0)
> return -1;
Yes, this might be safer.
>
>> + return string_leni;
>
> unless the callers just look for negative value as unusable.
>
>> + tree len = gimple_call_arg (stmt, 2);
>> + if (tree_fits_uhwi_p (len))
>> + length = tree_to_uhwi (len);
>
> Similarly to above, you are mixing signed and unsigned HWIs too much.
same reason as above :-), I will fix this.
>
>> + if (gimple_code (ustmt) == GIMPLE_ASSIGN)
>
> if (is_gimple_assign (ustmt))
>
> Usually we use use_stmt instead of ustmt.
Okay.
>
>> + {
>> + gassign *asgn = as_a <gassign *> (ustmt);
>
> No need for the gassign and ugly as_a, gimple_assign_rhs_code
> as well as gimple_assign_rhs2 can be called on gimple * too.
this part of the code I just copied from the routine “handle_builtin_memcpy” and no change.
I will change it as you suggested.
>> + tree_code code = gimple_assign_rhs_code (asgn);
>> + if ((code != EQ_EXPR && code != NE_EXPR)
>> + || !integer_zerop (gimple_assign_rhs2 (asgn)))
>> + return true;
>> + }
>> + else if (gimple_code (ustmt) == GIMPLE_COND)
>> + {
>> + tree_code code = gimple_cond_code (ustmt);
>> + if ((code != EQ_EXPR && code != NE_EXPR)
>> + || !integer_zerop (gimple_cond_rhs (ustmt)))
>> + return true;
>
> There is another case you are missing, assign stmt with
> gimple_assign_rhs_code COND_EXPR, where gimple_assign_rhs1 is
> tree with TREE_CODE EQ_EXPR or NE_EXPR with TREE_OPERAND (rhs1, 1)
> integer_zerop.
a little confused here:
in the current code:
. the first case is: result = strcmp() != 0
. the second case is: if (strcmp() != 0)
so, the missing case you mentioned above is:
result = if (strcmp() != 0)
or something else?
>
>> + /* When both arguments are known, and their strlens are unequal, we can
>> + safely fold the call to a non-zero value for strcmp;
>> + othewise, do nothing now. */
>> + if (idx1 != 0 && idx2 != 0)
>> + {
>> + HOST_WIDE_INT const_string_leni1 = -1;
>> + HOST_WIDE_INT const_string_leni2 = -1;
>> + const_string_leni1 = compute_string_length (idx1);
>> + const_string_leni2 = compute_string_length (idx2);
>
> Why do you initialize the vars when you immediately overwrite it?
just a habit to declare a variable with initialization :-).
> Just do
> HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
I can change it like this.
> etc.
>
>> + /* When one of args is constant string. */
>> + tree var_string;
>> + HOST_WIDE_INT const_string_leni = -1;
>> +
>> + if (idx1)
>> + {
>> + const_string_leni = compute_string_length (idx1);
>> + var_string = arg2;
>> + }
>> + else if (idx2)
>> + {
>> + const_string_leni = compute_string_length (idx2);
>> + var_string = arg1;
>> + }
>
> Haven't you checked earlier that one of idx1 and idx2 is non-zero?
Yes.
it’s guaranteed that there is one and ONLY one of idx1 and idx2 is non-zero when getting here.
> If so, then the else if (idx2) will just might confuse -Wuninitialized,
Okay.
> if you just use else, you don't need to initialize const_string_leni
> either.
I think that const_string_leni still need to be initialized in this case, because when idx2 is non-zero,
const_string_leni is initialized to compute_string_length (idx2).
>
>> + /* Try to get the min and max string length for var_string, the max length is
>> + the size of the array - 1, recorded in size[1]. */
>> + get_range_strlen (var_string, size);
>> + if (size[1] && tree_fits_uhwi_p (size[1]))
>> + var_sizei = tree_to_uhwi (size[1]) + 1;
>
> This is something that looks problematic to me. get_range_strlen returns
> some conservative upper bound on the string length, which is fine if
> var_string points to say a TREE_STATIC variable where you know the allocated
> size, or automatic variable. But if somebody passes you a pointer to a
> structure and the source doesn't contain aggregate copying for it, not sure
> if you can take for granted that all the bytes are readable after the '\0'
> in the string. Hopefully at least for flexible array members and arrays in
> such positions get_range_strlen will not provide the upper bound, but even
> in other cases it doesn't feel safe to me.
this is the part that took me most of the time during the implementation.
I have considered the following 3 approaches to decide the size of the variable array:
A. use “compute_builtin_object_size” in tree-object-size.h to decide the size of the
object. However, even with the simplest case, it cannot provide the information.
B. use “get_range_strlen” in gimple-fold.h to decide the size of the object. however,
it cannot provide valid info for simple array, either.
C. write my own routine to decide the size of the object.
I gave up C first because I think that this new routine is very likely a partially redundant routine as
A or B, adding a partially redundant functionality into the code base might not be good.
Then I gave up A because I think that updating A will have more impact than updating B.
when reading B’s comments as following:
/* Determine the minimum and maximum value or string length that ARG
refers to and store each in the first two elements of MINMAXLEN.
For expressions that point to strings of unknown lengths that are
character arrays, use the upper bound of the array as the maximum
length. For example, given an expression like 'x ? array : "xyz"'
and array declared as 'char array[8]', MINMAXLEN[0] will be set
to 3 and MINMAXLEN[1] to 7, the longest string that could be
stored in array.
Return true if the range of the string lengths has been obtained
from the upper bound of an array at the end of a struct. Such
an array may hold a string that's longer than its upper bound
due to it being used as a poor-man's flexible array member. */
bool
get_range_strlen (tree arg, tree minmaxlen[2])
{
}
In the above, it clearly said:
"For expressions that point to strings of unknown lengths that are
character arrays, use the upper bound of the array as the maximum
length.”
So, I feel that this serves my purpose well.
then I updated the routine get_range_strlen to make it handles simple array case.
(this change has just been checked into upstream yesterday:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=255654 <https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=255654>)
Let me know your suggestions.
I will be happy to update the code with your suggestion.
>
> Furthermore, in the comments you say that you do it only for small strings,
> but in the patch I can't see any upper bound, so you could transform strlen
> that would happen to return say just 1 or 2 with a function call that
> possibly reads megabytes of data (memcmp may read all bytes, not just stop
> at the first difference).
do you mean for very short constant string, we should NOT change it to a. call to memcmp? instead we should just
inline it with byte comparison sequence?
>> + unsigned HOST_WIDE_INT final_length
>> + = is_ncmp ? length : (const_string_leni + 1);
>
> Why the ()s?
just want to make the “const_string_leni + 1” together :-).
I can delete the () if the coding style prefers.
thanks.
Qing
>
> Jakub
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
2017-12-15 16:14 ` Qing Zhao
@ 2017-12-15 16:42 ` Jakub Jelinek
2017-12-15 17:19 ` Qing Zhao
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2017-12-15 16:42 UTC (permalink / raw)
To: Qing Zhao; +Cc: gcc-patches, jeff Law, richard Biener
On Fri, Dec 15, 2017 at 10:08:03AM -0600, Qing Zhao wrote:
> a little confused here:
>
> in the current code:
> . the first case is: result = strcmp() != 0
> . the second case is: if (strcmp() != 0)
>
> so, the missing case you mentioned above is:
>
> result = if (strcmp() != 0)
>
> or something else?
result = (strcmp () != 0 ? 15 : 37);
or similar. Though, usually COND_EXPRs are added by the tree-if-conversion
pass, so you might need -ftree-loop-if-convert option and it probably needs
to be within some loop which will have just a single bb after the
if-conversion.
> >
> >> + /* When both arguments are known, and their strlens are unequal, we can
> >> + safely fold the call to a non-zero value for strcmp;
> >> + othewise, do nothing now. */
> >> + if (idx1 != 0 && idx2 != 0)
> >> + {
> >> + HOST_WIDE_INT const_string_leni1 = -1;
> >> + HOST_WIDE_INT const_string_leni2 = -1;
> >> + const_string_leni1 = compute_string_length (idx1);
> >> + const_string_leni2 = compute_string_length (idx2);
> >
> > Why do you initialize the vars when you immediately overwrite it?
>
> just a habit to declare a variable with initialization :-).
>
> > Just do
> > HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
>
> I can change it like this.
> > etc.
> >
> >> + /* When one of args is constant string. */
> >> + tree var_string;
> >> + HOST_WIDE_INT const_string_leni = -1;
> >> +
> >> + if (idx1)
> >> + {
> >> + const_string_leni = compute_string_length (idx1);
> >> + var_string = arg2;
> >> + }
> >> + else if (idx2)
> >> + {
> >> + const_string_leni = compute_string_length (idx2);
> >> + var_string = arg1;
> >> + }
> >
> > Haven't you checked earlier that one of idx1 and idx2 is non-zero?
>
> Yes.
>
> itâs guaranteed that there is one and ONLY one of idx1 and idx2 is non-zero when getting here.
>
> > If so, then the else if (idx2) will just might confuse -Wuninitialized,
>
> Okay.
>
> > if you just use else, you don't need to initialize const_string_leni
> > either.
>
> I think that const_string_leni still need to be initialized in this case, because when idx2 is non-zero,
> const_string_leni is initialized to compute_string_length (idx2).
Sure. But
type uninitialized_var;
if (cond1)
uninitialized_var = foo;
else if (cond2)
uninitialized_var = bar;
use (uninitialized_var);
is a coding style which asks for -Wmaybe-uninitialized warnings, in order
not to warn, the compiler has to prove that cond1 || cond2 is always true,
which might not be always easy for the compiler.
> > This is something that looks problematic to me. get_range_strlen returns
> > some conservative upper bound on the string length, which is fine if
> > var_string points to say a TREE_STATIC variable where you know the allocated
> > size, or automatic variable. But if somebody passes you a pointer to a
> > structure and the source doesn't contain aggregate copying for it, not sure
> > if you can take for granted that all the bytes are readable after the '\0'
> > in the string. Hopefully at least for flexible array members and arrays in
> > such positions get_range_strlen will not provide the upper bound, but even
> > in other cases it doesn't feel safe to me.
>
> this is the part that took me most of the time during the implementation.
>
> I have considered the following 3 approaches to decide the size of the variable array:
>
> A. use âcompute_builtin_object_sizeâ in tree-object-size.h to decide the size of the
> object. However, even with the simplest case, it cannot provide the information.
compute_builtin_object_size with modes 0 or 1 computes upper bound, what you
are really looking for is lower bound, so that would be mode 2, though that
mode isn't actually used in real-world code and thus might be not fully
tested.
> B. use âget_range_strlenâ in gimple-fold.h to decide the size of the object. however,
> it cannot provide valid info for simple array, either.
get_range_strlen returns you a range, the minval is not what you're looking
for, that is the minimum string length, so might be too short for your
purposes. And maxval is an upper bound, but you are looking for lower
bound, you need guarantees this amount of memory can be accessed, even if
there is 0 in the first byte.
> > Furthermore, in the comments you say that you do it only for small strings,
> > but in the patch I can't see any upper bound, so you could transform strlen
> > that would happen to return say just 1 or 2 with a function call that
> > possibly reads megabytes of data (memcmp may read all bytes, not just stop
> > at the first difference).
>
> do you mean for very short constant string, we should NOT change it to a. call to memcmp? instead we should just
> inline it with byte comparison sequence?
I mean we should never ever replace strcmp or strncmp call with library call to
memcmp, you don't know how it is implemented and it could access any bytes
in the new range, while previously strcmp/strncmp wouldn't. If you have
sufficient guarantees that the memory must be all mapped and thus read-only
accessible (see above, I don't think get_range_strlen provides that), it
might be fine to inline expand the memcmp some way, because there you have
full control over what the compiler will emit. But if we find out during
expansion we don't want to expand it inline, we should fall back to calling
strcmp or strncmp. Which means, instead of changing the call to the
memcmp_eq builtin, change it to some STRCMP or STRNCMP internal function,
ensure it is expanded inline the way you prefer and if it can't, it will
call strcmp or strncmp.
Jakub
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
2017-12-15 16:42 ` Jakub Jelinek
@ 2017-12-15 17:19 ` Qing Zhao
2017-12-15 17:47 ` Jakub Jelinek
0 siblings, 1 reply; 9+ messages in thread
From: Qing Zhao @ 2017-12-15 17:19 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: gcc-patches, jeff Law, richard Biener
> On Dec 15, 2017, at 10:42 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Dec 15, 2017 at 10:08:03AM -0600, Qing Zhao wrote:
>> a little confused here:
>>
>> in the current code:
>> . the first case is: result = strcmp() != 0
>> . the second case is: if (strcmp() != 0)
>>
>> so, the missing case you mentioned above is:
>>
>> result = if (strcmp() != 0)
>>
>> or something else?
>
> result = (strcmp () != 0 ? 15 : 37);
> or similar. Though, usually COND_EXPRs are added by the tree-if-conversion
> pass, so you might need -ftree-loop-if-convert option and it probably needs
> to be within some loop which will have just a single bb after the
> if-conversion.
I see. thanks.
>>
>>> if you just use else, you don't need to initialize const_string_leni
>>> either.
>>
>> I think that const_string_leni still need to be initialized in this case, because when idx2 is non-zero,
>> const_string_leni is initialized to compute_string_length (idx2).
>
> Sure. But
> type uninitialized_var;
> if (cond1)
> uninitialized_var = foo;
> else if (cond2)
> uninitialized_var = bar;
> use (uninitialized_var);
> is a coding style which asks for -Wmaybe-uninitialized warnings, in order
> not to warn, the compiler has to prove that cond1 || cond2 is always true,
> which might not be always easy for the compiler.
in my case, I already initialize the “uninitialized_var” when declared it:
HOST_WIDE_INT const_string_leni = -1;
if (idx1)
{
const_string_leni = compute_string_length (idx1);
var_string = arg2;
}
else if (idx2)
{
const_string_leni = compute_string_length (idx2);
var_string = arg1;
}
so, the -Wmaybe-uninitialized should NOT issue warning, right?
but anyway, I can change the above as following:
HOST_WIDE_INT const_string_leni = -1;
if (idx1)
{
const_string_leni = compute_string_length (idx1);
var_string = arg2;
}
else
{
gcc_assert (idx2);
const_string_leni = compute_string_length (idx2);
var_string = arg1;
}
is this better?
>
>>> This is something that looks problematic to me. get_range_strlen returns
>>> some conservative upper bound on the string length, which is fine if
>>> var_string points to say a TREE_STATIC variable where you know the allocated
>>> size, or automatic variable. But if somebody passes you a pointer to a
>>> structure and the source doesn't contain aggregate copying for it, not sure
>>> if you can take for granted that all the bytes are readable after the '\0'
>>> in the string. Hopefully at least for flexible array members and arrays in
>>> such positions get_range_strlen will not provide the upper bound, but even
>>> in other cases it doesn't feel safe to me.
>>
>> this is the part that took me most of the time during the implementation.
>>
>> I have considered the following 3 approaches to decide the size of the variable array:
>>
>> A. use “compute_builtin_object_size” in tree-object-size.h to decide the size of the
>> object. However, even with the simplest case, it cannot provide the information.
>
> compute_builtin_object_size with modes 0 or 1 computes upper bound, what you
> are really looking for is lower bound,
you mean: 0, 1 is for maximum object size, and 2 is for minimum object size?
yes, I am looking for minimum object size for this optimization.
> so that would be mode 2, though that
> mode isn't actually used in real-world code and thus might be not fully
> tested.
so, using this routine with mode 2 should be the right approach to go? and we need fully testing on this too?
>
>> B. use “get_range_strlen” in gimple-fold.h to decide the size of the object. however,
>> it cannot provide valid info for simple array, either.
>
> get_range_strlen returns you a range, the minval is not what you're looking
> for, that is the minimum string length, so might be too short for your
> purposes. And maxval is an upper bound, but you are looking for lower
> bound, you need guarantees this amount of memory can be accessed, even if
> there is 0 in the first byte.
my understanding is that: get_range_strlen returns the minimum and maximum length of the string pointed by the
pointer, and the maximum length of the string is determined by the size of the allocated memory pointed by the
pointer, so, it should serve my purpose, did I misunderstand it?
>
>>> Furthermore, in the comments you say that you do it only for small strings,
>>> but in the patch I can't see any upper bound, so you could transform strlen
>>> that would happen to return say just 1 or 2 with a function call that
>>> possibly reads megabytes of data (memcmp may read all bytes, not just stop
>>> at the first difference).
>>
>> do you mean for very short constant string, we should NOT change it to a. call to memcmp? instead we should just
>> inline it with byte comparison sequence?
>
> I mean we should never ever replace strcmp or strncmp call with library call to
> memcmp, you don't know how it is implemented and it could access any bytes
> in the new range, while previously strcmp/strncmp wouldn't.
> If you have
> sufficient guarantees that the memory must be all mapped and thus read-only
> accessible (see above, I don't think get_range_strlen provides that), it
> might be fine to inline expand the memcmp some way, because there you have
> full control over what the compiler will emit.
agreed. and the size check in the implementation is try to guarantee the transformation
is safe.
so, the major issue here is whether “get_range_strlen” can provide the good info on this. whether we should use
compute_builtin_object_size mode 2 for this purpose.
I can change to use “compute_builtin_object_size” and mode 2, and try to see whether any improvement is needed
to compute_builtin_object_size for simple cases.
what’s your opinion?
> But if we find out during
> expansion we don't want to expand it inline, we should fall back to calling
> strcmp or strncmp.
under what situation we will NOT expand the memcpy_eq call inline?
Qing
> Which means, instead of changing the call to the
> memcmp_eq builtin, change it to some STRCMP or STRNCMP internal function,
> ensure it is expanded inline the way you prefer and if it can't, it will
> call strcmp or strncmp.
>
> Jakub
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
2017-12-15 17:19 ` Qing Zhao
@ 2017-12-15 17:47 ` Jakub Jelinek
2017-12-15 20:49 ` Qing Zhao
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2017-12-15 17:47 UTC (permalink / raw)
To: Qing Zhao; +Cc: gcc-patches, jeff Law, richard Biener
On Fri, Dec 15, 2017 at 11:17:37AM -0600, Qing Zhao wrote:
> HOST_WIDE_INT const_string_leni = -1;
>
> if (idx1)
> {
> const_string_leni = compute_string_length (idx1);
> var_string = arg2;
> }
> else if (idx2)
> {
> const_string_leni = compute_string_length (idx2);
> var_string = arg1;
> }
>
> so, the -Wmaybe-uninitialized should NOT issue warning, right?
Well, you had the var_string var uninitialized, so that is what I was
talking about.
> but anyway, I can change the above as following:
>
> HOST_WIDE_INT const_string_leni = -1;
And here you don't need to initialize it.
> if (idx1)
> {
> const_string_leni = compute_string_length (idx1);
> var_string = arg2;
> }
> else
> {
> gcc_assert (idx2);
> const_string_leni = compute_string_length (idx2);
> var_string = arg1;
> }
>
> is this better?
Yes, though the gcc_assert could be just gcc_checking_assert (idx2);
> > so that would be mode 2, though that
> > mode isn't actually used in real-world code and thus might be not fully
> > tested.
>
> so, using this routine with mode 2 should be the right approach to go?
> and we need fully testing on this too?
It has been a while since I wrote it, so it would need careful analysis.
> >> B. use âget_range_strlenâ in gimple-fold.h to decide the size of the object. however,
> >> it cannot provide valid info for simple array, either.
> >
> > get_range_strlen returns you a range, the minval is not what you're looking
> > for, that is the minimum string length, so might be too short for your
> > purposes. And maxval is an upper bound, but you are looking for lower
> > bound, you need guarantees this amount of memory can be accessed, even if
> > there is 0 in the first byte.
>
> my understanding is that: get_range_strlen returns the minimum and maximum length of the string pointed by the
> pointer, and the maximum length of the string is determined by the size of the allocated memory pointed by the
> pointer, so, it should serve my purpose, did I misunderstand it?
What I'm worried about is:
struct S { int a; char b[64]; };
struct T { struct S c; char d; };
int
foo (struct T *x)
{
return strcmp (x->c.b, "01234567890123456789012345678901234567890123456789") == 0;
}
int
bar (void)
{
struct S *p = malloc (offsetof (struct S, b) + 8);
p->a = 123;
strcpy (p->b, "0123456");
return foo ((struct T *) p);
}
etc. where if you transform that into memcmp (x->c.b, "01234567890123456789012345678901234567890123456789", 51) == 0
it will segfault, whereas strcmp would not.
> > But if we find out during
> > expansion we don't want to expand it inline, we should fall back to calling
> > strcmp or strncmp.
>
> under what situation we will NOT expand the memcpy_eq call inline?
target = expand_builtin_memcmp (exp, target, fcode == BUILT_IN_MEMCMP_EQ);
if (target)
return target;
if (fcode == BUILT_IN_MEMCMP_EQ)
{
tree newdecl = builtin_decl_explicit (BUILT_IN_MEMCMP);
TREE_OPERAND (exp, 1) = build_fold_addr_expr (newdecl);
}
is what builtins.c has, so it certainly counts with the possibility.
Now, both expand_builtin_memcmp, and emit_block_cmp_hints has several cases
when it fails. E.g. can_do_by_pieces decides it is too expensive to do it
inline, and emit_block_cmp_via_cmpmem fails because the target doesn't have
cmpmemsi expander. Various other cases.
Also, note that some target might have cmpstr*si expanders implemented, but
not cmpmemsi, in which case trying to optimize strcmp as memcmp_eq might be a
severe pessimization.
Jakub
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026
2017-12-15 17:47 ` Jakub Jelinek
@ 2017-12-15 20:49 ` Qing Zhao
0 siblings, 0 replies; 9+ messages in thread
From: Qing Zhao @ 2017-12-15 20:49 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: gcc-patches, jeff Law, richard Biener
> On Dec 15, 2017, at 11:47 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Dec 15, 2017 at 11:17:37AM -0600, Qing Zhao wrote:
>> HOST_WIDE_INT const_string_leni = -1;
>>
>> if (idx1)
>> {
>> const_string_leni = compute_string_length (idx1);
>> var_string = arg2;
>> }
>> else if (idx2)
>> {
>> const_string_leni = compute_string_length (idx2);
>> var_string = arg1;
>> }
>>
>> so, the -Wmaybe-uninitialized should NOT issue warning, right?
>
> Well, you had the var_string var uninitialized, so that is what I was
> talking about.
oh, yeah, forgot to initialize var_string.
>
>> but anyway, I can change the above as following:
>>
>> HOST_WIDE_INT const_string_leni = -1;
>
> And here you don't need to initialize it.
>
>> if (idx1)
>> {
>> const_string_leni = compute_string_length (idx1);
>> var_string = arg2;
>> }
>> else
>> {
>> gcc_assert (idx2);
>> const_string_leni = compute_string_length (idx2);
>> var_string = arg1;
>> }
>>
>> is this better?
>
> Yes, though the gcc_assert could be just gcc_checking_assert (idx2);
what’s the major difference between these two assertion calls?
>
>>> so that would be mode 2, though that
>>> mode isn't actually used in real-world code and thus might be not fully
>>> tested.
>>
>> so, using this routine with mode 2 should be the right approach to go?
>> and we need fully testing on this too?
>
> It has been a while since I wrote it, so it would need careful analysis.
>
>>>> B. use “get_range_strlen” in gimple-fold.h to decide the size of the object. however,
>>>> it cannot provide valid info for simple array, either.
>>>
>>> get_range_strlen returns you a range, the minval is not what you're looking
>>> for, that is the minimum string length, so might be too short for your
>>> purposes. And maxval is an upper bound, but you are looking for lower
>>> bound, you need guarantees this amount of memory can be accessed, even if
>>> there is 0 in the first byte.
>>
>> my understanding is that: get_range_strlen returns the minimum and maximum length of the string pointed by the
>> pointer, and the maximum length of the string is determined by the size of the allocated memory pointed by the
>> pointer, so, it should serve my purpose, did I misunderstand it?
>
> What I'm worried about is:
> struct S { int a; char b[64]; };
> struct T { struct S c; char d; };
> int
> foo (struct T *x)
> {
> return strcmp (x->c.b, "01234567890123456789012345678901234567890123456789") == 0;
> }
> int
> bar (void)
> {
> struct S *p = malloc (offsetof (struct S, b) + 8);
> p->a = 123;
> strcpy (p->b, "0123456");
> return foo ((struct T *) p);
> }
> etc. where if you transform that into memcmp (x->c.b, "01234567890123456789012345678901234567890123456789", 51) == 0
> it will segfault, whereas strcmp would not.
thanks for the example.
is this issue only for the flexible array member?
if so, the current get_range_strlen will distinguish whether this is for flexible array member or not, then we can disable such transformation
for flexible array member.
>
>>> But if we find out during
>>> expansion we don't want to expand it inline, we should fall back to calling
>>> strcmp or strncmp.
>>
>> under what situation we will NOT expand the memcpy_eq call inline?
>
> target = expand_builtin_memcmp (exp, target, fcode == BUILT_IN_MEMCMP_EQ);
> if (target)
> return target;
> if (fcode == BUILT_IN_MEMCMP_EQ)
> {
> tree newdecl = builtin_decl_explicit (BUILT_IN_MEMCMP);
> TREE_OPERAND (exp, 1) = build_fold_addr_expr (newdecl);
> }
> is what builtins.c has, so it certainly counts with the possibility.
> Now, both expand_builtin_memcmp, and emit_block_cmp_hints has several cases
> when it fails. E.g. can_do_by_pieces decides it is too expensive to do it
> inline, and emit_block_cmp_via_cmpmem fails because the target doesn't have
> cmpmemsi expander. Various other cases.
>
> Also, note that some target might have cmpstr*si expanders implemented, but
> not cmpmemsi, in which case trying to optimize strcmp as memcmp_eq might be a
> severe pessimization.
Okay, I see. this is reasonable.
if the following better: (some details still need more work, just basic idea)
in handle_builtin_string_cmp of tree-ssa-strlen.c:
+ /* Replace strcmp or strncmp with the BUILT_IN_STRCMP_EQ. */
+ if (var_sizei > final_length)
+ {
+ tree fn = builtin_decl_implicit (BUILT_IN_STRCMP_EQ/BUILT_IN_STRNCMP_EQ;
+ if (!fn)
+ return true;
+ tree const_string_len = build_int_cst (size_type_node, final_length);
+ update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+ }
then in builtins.c, add the following:
case BUILT_IN_STRCMP_EQ:
case BUILT_IN_STRNCMP_EQ:
target = expand_builtin_memcmp (exp, target, fcode == BUILT_IN_MEMCMP_EQ);
if (target)
return target;
else
{
tree newdecl = builtin_decl_explicit (BUILT_IN_STRCMP/BUILT_IN_STRNCMP);
TREE_OPERAND (exp, 1) = build_fold_addr_expr (newdecl);
}
break;
thanks.
Qing
>
> Jakub
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-12-15 22:09 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-15 12:41 [PATCH][Middle-end]2nd patch of PR78809 and PR83026 Wilco Dijkstra
2017-12-15 22:09 ` Qing Zhao
-- strict thread matches above, loose matches on Subject: below --
2017-12-14 19:48 Qing Zhao
2017-12-14 20:46 ` Jakub Jelinek
2017-12-15 16:14 ` Qing Zhao
2017-12-15 16:42 ` Jakub Jelinek
2017-12-15 17:19 ` Qing Zhao
2017-12-15 17:47 ` Jakub Jelinek
2017-12-15 20:49 ` Qing Zhao
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).