* [PATCH v2] cselib: add function to check if SET is redundant [PR106187]
@ 2022-07-29 13:52 Richard Earnshaw
2022-07-30 19:57 ` Jeff Law
0 siblings, 1 reply; 6+ messages in thread
From: Richard Earnshaw @ 2022-07-29 13:52 UTC (permalink / raw)
To: gcc-patches, Richard Biener
[-- Attachment #1: Type: text/plain, Size: 1056 bytes --]
A SET operation that writes memory may have the same value as an earlier
store but if the alias sets of the new and earlier store do not conflict
then the set is not truly redundant. This can happen, for example, if
objects of different types share a stack slot.
To fix this we define a new function in cselib that first checks for
equality and if that is successful then finds the earlier store in the
value history and checks the alias sets.
The routine is used in two places elsewhere in the compiler. Firstly
in cfgcleanup and secondly in postreload.
gcc/ChangeLog:
* alias.h (mems_same_for_tbaa_p): Declare.
* alias.cc (mems_same_for_tbaa_p): New function.
* dse.cc (record_store): Use it instead of open-coding
alias check.
* cselib.h (cselib_redundant_set_p): Declare.
* cselib.cc: Include alias.h
(cselib_redundant_set_p): New function.
* cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
of rtx_equal_for_cselib_p.
* postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
(reload_cse_noop_set_p): Delete.
[-- Attachment #2: pr106187-master.patch --]
[-- Type: text/x-patch, Size: 7321 bytes --]
diff --git a/gcc/alias.cc b/gcc/alias.cc
index 8c08452e0ac..d54feb15268 100644
--- a/gcc/alias.cc
+++ b/gcc/alias.cc
@@ -389,6 +389,20 @@ refs_same_for_tbaa_p (tree earlier, tree later)
|| alias_set_subset_of (later_base_set, earlier_base_set));
}
+/* Similar to refs_same_for_tbaa_p() but for use on MEM rtxs. */
+bool
+mems_same_for_tbaa_p (rtx earlier, rtx later)
+{
+ gcc_assert (MEM_P (earlier));
+ gcc_assert (MEM_P (later));
+
+ return ((MEM_ALIAS_SET (earlier) == MEM_ALIAS_SET (later)
+ || alias_set_subset_of (MEM_ALIAS_SET (later),
+ MEM_ALIAS_SET (earlier)))
+ && (!MEM_EXPR (earlier)
+ || refs_same_for_tbaa_p (MEM_EXPR (earlier), MEM_EXPR (later))));
+}
+
/* Returns a pointer to the alias set entry for ALIAS_SET, if there is
such an entry, or NULL otherwise. */
diff --git a/gcc/alias.h b/gcc/alias.h
index b2596518ac9..ee3db466763 100644
--- a/gcc/alias.h
+++ b/gcc/alias.h
@@ -40,6 +40,7 @@ tree reference_alias_ptr_type_1 (tree *);
bool alias_ptr_types_compatible_p (tree, tree);
int compare_base_decls (tree, tree);
bool refs_same_for_tbaa_p (tree, tree);
+bool mems_same_for_tbaa_p (rtx, rtx);
/* This alias set can be used to force a memory to conflict with all
other memories, creating a barrier across which no memory reference
diff --git a/gcc/cfgcleanup.cc b/gcc/cfgcleanup.cc
index 18047da7b24..a8b0139bb4d 100644
--- a/gcc/cfgcleanup.cc
+++ b/gcc/cfgcleanup.cc
@@ -208,7 +208,7 @@ mark_effect (rtx exp, regset nonequal)
return false;
case SET:
- if (rtx_equal_for_cselib_p (SET_DEST (exp), SET_SRC (exp)))
+ if (cselib_redundant_set_p (exp))
return false;
dest = SET_DEST (exp);
if (dest == pc_rtx)
diff --git a/gcc/cselib.cc b/gcc/cselib.cc
index 6769beeeaf8..6a5609786fa 100644
--- a/gcc/cselib.cc
+++ b/gcc/cselib.cc
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3. If not see
#include "dumpfile.h"
#include "cselib.h"
#include "function-abi.h"
+#include "alias.h"
/* A list of cselib_val structures. */
struct elt_list
@@ -1157,6 +1158,75 @@ rtx_equal_for_cselib_1 (rtx x, rtx y, machine_mode memmode, int depth)
return 1;
}
+/* Wrapper for rtx_equal_for_cselib_p to determine whether a SET is
+ truly redundant, taking into account aliasing information. */
+bool
+cselib_redundant_set_p (rtx set)
+{
+ gcc_assert (GET_CODE (set) == SET);
+ rtx dest = SET_DEST (set);
+ if (cselib_reg_set_mode (dest) != GET_MODE (dest))
+ return false;
+
+ if (!rtx_equal_for_cselib_p (dest, SET_SRC (set)))
+ return false;
+
+ while (GET_CODE (dest) == SUBREG
+ || GET_CODE (dest) == ZERO_EXTRACT
+ || GET_CODE (dest) == STRICT_LOW_PART)
+ dest = XEXP (dest, 0);
+
+ if (!flag_strict_aliasing || !MEM_P (dest))
+ return true;
+
+ /* For a store we need to check that suppressing it will not change
+ the effective alias set. */
+ rtx dest_addr = XEXP (dest, 0);
+
+ /* Lookup the equivalents to the original dest (rather than just the
+ MEM). */
+ cselib_val *src_val = cselib_lookup (SET_DEST (set),
+ GET_MODE (SET_DEST (set)),
+ 0, VOIDmode);
+
+ if (src_val)
+ {
+ /* Walk the list of source equivalents to find the MEM accessing
+ the same location. */
+ for (elt_loc_list *l = src_val->locs; l; l = l->next)
+ {
+ rtx src_equiv = l->loc;
+ while (GET_CODE (src_equiv) == SUBREG
+ || GET_CODE (src_equiv) == ZERO_EXTRACT
+ || GET_CODE (src_equiv) == STRICT_LOW_PART)
+ src_equiv = XEXP (src_equiv, 0);
+
+ if (MEM_P (src_equiv))
+ {
+ /* Match the MEMs by comparing the addresses. We can
+ only remove the later store if the earlier aliases at
+ least all the accesses of the later one. */
+ if (rtx_equal_for_cselib_1 (dest_addr, XEXP (src_equiv, 0),
+ GET_MODE (dest), 0))
+ return mems_same_for_tbaa_p (src_equiv, dest);
+ }
+ }
+ }
+
+ /* We failed to find a recorded value in the cselib history, so try
+ the source of this set; this catches cases such as *p = *q when p
+ and q have the same value. */
+ rtx src = SET_SRC (set);
+ while (GET_CODE (src) == SUBREG)
+ src = XEXP (src, 0);
+
+ if (MEM_P (src)
+ && rtx_equal_for_cselib_1 (dest_addr, XEXP (src, 0), GET_MODE (dest), 0))
+ return mems_same_for_tbaa_p (src, dest);
+
+ return false;
+}
+
/* Helper function for cselib_hash_rtx. Arguments like for cselib_hash_rtx,
except that it hashes (plus:P x c). */
diff --git a/gcc/cselib.h b/gcc/cselib.h
index 9ae65e6459e..b0905053ea5 100644
--- a/gcc/cselib.h
+++ b/gcc/cselib.h
@@ -83,6 +83,7 @@ extern void cselib_process_insn (rtx_insn *);
extern bool fp_setter_insn (rtx_insn *);
extern machine_mode cselib_reg_set_mode (const_rtx);
extern int rtx_equal_for_cselib_1 (rtx, rtx, machine_mode, int);
+extern bool cselib_redundant_set_p (rtx);
extern int references_value_p (const_rtx, int);
extern rtx cselib_expand_value_rtx (rtx, bitmap, int);
typedef rtx (*cselib_expand_callback)(rtx, bitmap, int, void *);
diff --git a/gcc/dse.cc b/gcc/dse.cc
index 90a4c1f22db..0f7b0fb9796 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1570,12 +1570,7 @@ record_store (rtx body, bb_info_t bb_info)
width)
/* We can only remove the later store if the earlier aliases
at least all accesses the later one. */
- && ((MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)
- || alias_set_subset_of (MEM_ALIAS_SET (mem),
- MEM_ALIAS_SET (s_info->mem)))
- && (!MEM_EXPR (s_info->mem)
- || refs_same_for_tbaa_p (MEM_EXPR (s_info->mem),
- MEM_EXPR (mem)))))
+ && mems_same_for_tbaa_p (s_info->mem, mem))
{
if (GET_MODE (mem) == BLKmode)
{
diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index d1c99fe6dc9..41f61d32648 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -43,7 +43,6 @@ along with GCC; see the file COPYING3. If not see
#include "function-abi.h"
#include "rtl-iter.h"
-static int reload_cse_noop_set_p (rtx);
static bool reload_cse_simplify (rtx_insn *, rtx);
static void reload_cse_regs_1 (void);
static int reload_cse_simplify_set (rtx, rtx_insn *);
@@ -74,16 +73,6 @@ reload_cse_regs (rtx_insn *first ATTRIBUTE_UNUSED)
}
}
-/* See whether a single set SET is a noop. */
-static int
-reload_cse_noop_set_p (rtx set)
-{
- if (cselib_reg_set_mode (SET_DEST (set)) != GET_MODE (SET_DEST (set)))
- return 0;
-
- return rtx_equal_for_cselib_p (SET_DEST (set), SET_SRC (set));
-}
-
/* Try to simplify INSN. Return true if the CFG may have changed. */
static bool
reload_cse_simplify (rtx_insn *insn, rtx testreg)
@@ -118,7 +107,7 @@ reload_cse_simplify (rtx_insn *insn, rtx testreg)
this out, so it's safer to simplify before we delete. */
count += reload_cse_simplify_set (body, insn);
- if (!count && reload_cse_noop_set_p (body))
+ if (!count && cselib_redundant_set_p (body))
{
if (check_for_inc_dec (insn))
delete_insn_and_edges (insn);
@@ -157,7 +146,7 @@ reload_cse_simplify (rtx_insn *insn, rtx testreg)
rtx part = XVECEXP (body, 0, i);
if (GET_CODE (part) == SET)
{
- if (! reload_cse_noop_set_p (part))
+ if (! cselib_redundant_set_p (part))
break;
if (REG_P (SET_DEST (part))
&& REG_FUNCTION_VALUE_P (SET_DEST (part)))
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] cselib: add function to check if SET is redundant [PR106187]
2022-07-29 13:52 [PATCH v2] cselib: add function to check if SET is redundant [PR106187] Richard Earnshaw
@ 2022-07-30 19:57 ` Jeff Law
2022-08-01 10:38 ` Richard Earnshaw
0 siblings, 1 reply; 6+ messages in thread
From: Jeff Law @ 2022-07-30 19:57 UTC (permalink / raw)
To: gcc-patches
On 7/29/2022 7:52 AM, Richard Earnshaw via Gcc-patches wrote:
> A SET operation that writes memory may have the same value as an
> earlier store but if the alias sets of the new and earlier store do
> not conflict then the set is not truly redundant. This can happen,
> for example, if objects of different types share a stack slot.
>
> To fix this we define a new function in cselib that first checks for
> equality and if that is successful then finds the earlier store in the
> value history and checks the alias sets.
>
> The routine is used in two places elsewhere in the compiler. Firstly
> in cfgcleanup and secondly in postreload.
>
> gcc/ChangeLog:
> * alias.h (mems_same_for_tbaa_p): Declare.
> * alias.cc (mems_same_for_tbaa_p): New function.
> * dse.cc (record_store): Use it instead of open-coding
> alias check.
> * cselib.h (cselib_redundant_set_p): Declare.
> * cselib.cc: Include alias.h
> (cselib_redundant_set_p): New function.
> * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
> of rtx_equal_for_cselib_p.
> * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
> (reload_cse_noop_set_p): Delete.
Seems quite reasonable. The only question I would have would be
whether or not you considered including the aliasing info into the
hashing used by cselib. You'd probably still need the bulk of this
patch as well since we could presumably still get a hash conflict with
two stores of the same value to the same location, but with different
alias sets (it's just much less likely), so perhaps it doesn't really
buy us anything.
Ideally this would include a testcase. You might be able to turn that
non-executawble reduced case into something useful by scanning the
post-reload dumps.
Jeff
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] cselib: add function to check if SET is redundant [PR106187]
2022-07-30 19:57 ` Jeff Law
@ 2022-08-01 10:38 ` Richard Earnshaw
2022-08-02 16:06 ` Richard Earnshaw
0 siblings, 1 reply; 6+ messages in thread
From: Richard Earnshaw @ 2022-08-01 10:38 UTC (permalink / raw)
To: Jeff Law, gcc-patches
On 30/07/2022 20:57, Jeff Law via Gcc-patches wrote:
>
>
> On 7/29/2022 7:52 AM, Richard Earnshaw via Gcc-patches wrote:
>> A SET operation that writes memory may have the same value as an
>> earlier store but if the alias sets of the new and earlier store do
>> not conflict then the set is not truly redundant. This can happen,
>> for example, if objects of different types share a stack slot.
>>
>> To fix this we define a new function in cselib that first checks for
>> equality and if that is successful then finds the earlier store in the
>> value history and checks the alias sets.
>>
>> The routine is used in two places elsewhere in the compiler. Firstly
>> in cfgcleanup and secondly in postreload.
>>
>> gcc/ChangeLog:
>> * alias.h (mems_same_for_tbaa_p): Declare.
>> * alias.cc (mems_same_for_tbaa_p): New function.
>> * dse.cc (record_store): Use it instead of open-coding
>> alias check.
>> * cselib.h (cselib_redundant_set_p): Declare.
>> * cselib.cc: Include alias.h
>> (cselib_redundant_set_p): New function.
>> * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
>> of rtx_equal_for_cselib_p.
>> * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
>> (reload_cse_noop_set_p): Delete.
> Seems quite reasonable. The only question I would have would be
> whether or not you considered including the aliasing info into the
> hashing used by cselib. You'd probably still need the bulk of this
> patch as well since we could presumably still get a hash conflict with
> two stores of the same value to the same location, but with different
> alias sets (it's just much less likely), so perhaps it doesn't really
> buy us anything.
I thought about this, but if the alias set were included in the hash,
then surely you'd get every alias set in a different value. Then you'd
miss the cases where the alias sets do conflict even though they are not
the same. Anyway, the values /are/ the same so in some circumstances
you might want to know that.
>
> Ideally this would include a testcase. You might be able to turn that
> non-executawble reduced case into something useful by scanning the
> post-reload dumps.
I considered this as well, but the testcase I have is far too fragile, I
think. The existing test only fails on Arm, only fails on 11.2 (not
11.3 or gcc-12 onwards), relies on two objects with the same value being
in distinct alias sets but still assigned to the same stack slot and for
some copy dance to end up trying to write back the original value to the
same slot but with a non-conflicting set. And finally, the scheduler
has to then try to move a load past the non-aliasing store.
To get anywhere close to this I think we'd need something akin to the
gimple reader but for RTL so that we could set up all the conditions for
the failure without the risk of an earlier transform blowing the test
away.
I even considered whether we could start with a gimple dump and
bypassing all the tree/gimple transformations, but even that would be
still at the mercy of the stack-slot allocation algorithm.
>
> Jeff
R.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] cselib: add function to check if SET is redundant [PR106187]
2022-08-01 10:38 ` Richard Earnshaw
@ 2022-08-02 16:06 ` Richard Earnshaw
2022-08-02 23:36 ` Jeff Law
0 siblings, 1 reply; 6+ messages in thread
From: Richard Earnshaw @ 2022-08-02 16:06 UTC (permalink / raw)
To: Jeff Law, gcc-patches
On 01/08/2022 11:38, Richard Earnshaw via Gcc-patches wrote:
>
>
> On 30/07/2022 20:57, Jeff Law via Gcc-patches wrote:
>>
>>
>> On 7/29/2022 7:52 AM, Richard Earnshaw via Gcc-patches wrote:
>>> A SET operation that writes memory may have the same value as an
>>> earlier store but if the alias sets of the new and earlier store do
>>> not conflict then the set is not truly redundant. This can happen,
>>> for example, if objects of different types share a stack slot.
>>>
>>> To fix this we define a new function in cselib that first checks for
>>> equality and if that is successful then finds the earlier store in the
>>> value history and checks the alias sets.
>>>
>>> The routine is used in two places elsewhere in the compiler. Firstly
>>> in cfgcleanup and secondly in postreload.
>>>
>>> gcc/ChangeLog:
>>> * alias.h (mems_same_for_tbaa_p): Declare.
>>> * alias.cc (mems_same_for_tbaa_p): New function.
>>> * dse.cc (record_store): Use it instead of open-coding
>>> alias check.
>>> * cselib.h (cselib_redundant_set_p): Declare.
>>> * cselib.cc: Include alias.h
>>> (cselib_redundant_set_p): New function.
>>> * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
>>> of rtx_equal_for_cselib_p.
>>> * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
>>> (reload_cse_noop_set_p): Delete.
>> Seems quite reasonable. The only question I would have would be
>> whether or not you considered including the aliasing info into the
>> hashing used by cselib. You'd probably still need the bulk of this
>> patch as well since we could presumably still get a hash conflict with
>> two stores of the same value to the same location, but with different
>> alias sets (it's just much less likely), so perhaps it doesn't really
>> buy us anything.
>
> I thought about this, but if the alias set were included in the hash,
> then surely you'd get every alias set in a different value. Then you'd
> miss the cases where the alias sets do conflict even though they are not
> the same. Anyway, the values /are/ the same so in some circumstances
> you might want to know that.
>
>>
>> Ideally this would include a testcase. You might be able to turn that
>> non-executawble reduced case into something useful by scanning the
>> post-reload dumps.
>
> I considered this as well, but the testcase I have is far too fragile, I
> think. The existing test only fails on Arm, only fails on 11.2 (not
> 11.3 or gcc-12 onwards), relies on two objects with the same value being
> in distinct alias sets but still assigned to the same stack slot and for
> some copy dance to end up trying to write back the original value to the
> same slot but with a non-conflicting set. And finally, the scheduler
> has to then try to move a load past the non-aliasing store.
>
> To get anywhere close to this I think we'd need something akin to the
> gimple reader but for RTL so that we could set up all the conditions for
> the failure without the risk of an earlier transform blowing the test away.
I wasn't aware of the rtl reader already in the compiler. But it
doesn't really get me any closer as it is lacking in so many regards:
- It can't handle (const_double:SF ...) - it tries to handle the
argument as an int. This is a consequence, I think, of the reader being
based on that for reading machine descriptions where FP const_double is
simply never encountered.
- It doesn't seem to handle anything much more than very basic types,
and in particular appears to have no way of ensuring that alias sets
match up with the type system.
>
> I even considered whether we could start with a gimple dump and
> bypassing all the tree/gimple transformations, but even that would be
> still at the mercy of the stack-slot allocation algorithm.
I spent a while trying to get some gimple out of the dumpers in a form
that was usable, but that's pretty much a non-starter. To make it work
we'd need to add support for gimple clobbers on objects - without that
there's no way to get the stack-slot sharing code to work. Furthermore,
even feeding fully-optimized gimple directly into expand is such a long
way from the postreload pass, that I can't believe the testcase would
remain stable for long.
And the other major issue is that the original testcase is heavily
templated C++ and neither of the parsers gimple or rtl is supported in
cc1plus: converting the boilerplate to be C-friendly is probably going
to be hard.
I can't afford to spend much more time on this, especially given the
low-quality test we're going to get out of the end of the process.
>
>>
>> Jeff
>
> R.
R.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] cselib: add function to check if SET is redundant [PR106187]
2022-08-02 16:06 ` Richard Earnshaw
@ 2022-08-02 23:36 ` Jeff Law
2022-08-03 9:09 ` Richard Earnshaw
0 siblings, 1 reply; 6+ messages in thread
From: Jeff Law @ 2022-08-02 23:36 UTC (permalink / raw)
To: Richard Earnshaw, gcc-patches
On 8/2/2022 10:06 AM, Richard Earnshaw wrote:
>
>
> On 01/08/2022 11:38, Richard Earnshaw via Gcc-patches wrote:
>>
>>
>> On 30/07/2022 20:57, Jeff Law via Gcc-patches wrote:
>>>
>>>
>>> On 7/29/2022 7:52 AM, Richard Earnshaw via Gcc-patches wrote:
>>>> A SET operation that writes memory may have the same value as an
>>>> earlier store but if the alias sets of the new and earlier store do
>>>> not conflict then the set is not truly redundant. This can happen,
>>>> for example, if objects of different types share a stack slot.
>>>>
>>>> To fix this we define a new function in cselib that first checks for
>>>> equality and if that is successful then finds the earlier store in the
>>>> value history and checks the alias sets.
>>>>
>>>> The routine is used in two places elsewhere in the compiler. Firstly
>>>> in cfgcleanup and secondly in postreload.
>>>>
>>>> gcc/ChangeLog:
>>>> * alias.h (mems_same_for_tbaa_p): Declare.
>>>> * alias.cc (mems_same_for_tbaa_p): New function.
>>>> * dse.cc (record_store): Use it instead of open-coding
>>>> alias check.
>>>> * cselib.h (cselib_redundant_set_p): Declare.
>>>> * cselib.cc: Include alias.h
>>>> (cselib_redundant_set_p): New function.
>>>> * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
>>>> of rtx_equal_for_cselib_p.
>>>> * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
>>>> (reload_cse_noop_set_p): Delete.
>>> Seems quite reasonable. The only question I would have would be
>>> whether or not you considered including the aliasing info into the
>>> hashing used by cselib. You'd probably still need the bulk of this
>>> patch as well since we could presumably still get a hash conflict
>>> with two stores of the same value to the same location, but with
>>> different alias sets (it's just much less likely), so perhaps it
>>> doesn't really buy us anything.
>>
>> I thought about this, but if the alias set were included in the hash,
>> then surely you'd get every alias set in a different value. Then
>> you'd miss the cases where the alias sets do conflict even though
>> they are not the same. Anyway, the values /are/ the same so in some
>> circumstances you might want to know that.
>>
>>>
>>> Ideally this would include a testcase. You might be able to turn
>>> that non-executawble reduced case into something useful by scanning
>>> the post-reload dumps.
>>
>> I considered this as well, but the testcase I have is far too
>> fragile, I think. The existing test only fails on Arm, only fails on
>> 11.2 (not 11.3 or gcc-12 onwards), relies on two objects with the
>> same value being in distinct alias sets but still assigned to the
>> same stack slot and for some copy dance to end up trying to write
>> back the original value to the same slot but with a non-conflicting
>> set. And finally, the scheduler has to then try to move a load past
>> the non-aliasing store.
>
>
>>
>> To get anywhere close to this I think we'd need something akin to the
>> gimple reader but for RTL so that we could set up all the conditions
>> for the failure without the risk of an earlier transform blowing the
>> test away.
>
> I wasn't aware of the rtl reader already in the compiler. But it
> doesn't really get me any closer as it is lacking in so many regards:
>
> - It can't handle (const_double:SF ...) - it tries to handle the
> argument as an int. This is a consequence, I think, of the reader
> being based on that for reading machine descriptions where FP
> const_double is simply never encountered.
>
> - It doesn't seem to handle anything much more than very basic types,
> and in particular appears to have no way of ensuring that alias sets
> match up with the type system.
>
>>
>> I even considered whether we could start with a gimple dump and
>> bypassing all the tree/gimple transformations, but even that would be
>> still at the mercy of the stack-slot allocation algorithm.
>
> I spent a while trying to get some gimple out of the dumpers in a form
> that was usable, but that's pretty much a non-starter. To make it
> work we'd need to add support for gimple clobbers on objects - without
> that there's no way to get the stack-slot sharing code to work.
> Furthermore, even feeding fully-optimized gimple directly into expand
> is such a long way from the postreload pass, that I can't believe the
> testcase would remain stable for long.
>
> And the other major issue is that the original testcase is heavily
> templated C++ and neither of the parsers gimple or rtl is supported in
> cc1plus: converting the boilerplate to be C-friendly is probably going
> to be hard.
>
> I can't afford to spend much more time on this, especially given the
> low-quality test we're going to get out of the end of the process.
Understood. Let's just go with the patch as-is. That's normal for
cases where we can't produce a reasonable test.
jeff
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] cselib: add function to check if SET is redundant [PR106187]
2022-08-02 23:36 ` Jeff Law
@ 2022-08-03 9:09 ` Richard Earnshaw
0 siblings, 0 replies; 6+ messages in thread
From: Richard Earnshaw @ 2022-08-03 9:09 UTC (permalink / raw)
To: Jeff Law, gcc-patches
On 03/08/2022 00:36, Jeff Law wrote:
>
>
> On 8/2/2022 10:06 AM, Richard Earnshaw wrote:
>>
>>
>> On 01/08/2022 11:38, Richard Earnshaw via Gcc-patches wrote:
>>>
>>>
>>> On 30/07/2022 20:57, Jeff Law via Gcc-patches wrote:
>>>>
>>>>
>>>> On 7/29/2022 7:52 AM, Richard Earnshaw via Gcc-patches wrote:
>>>>> A SET operation that writes memory may have the same value as an
>>>>> earlier store but if the alias sets of the new and earlier store do
>>>>> not conflict then the set is not truly redundant. This can happen,
>>>>> for example, if objects of different types share a stack slot.
>>>>>
>>>>> To fix this we define a new function in cselib that first checks for
>>>>> equality and if that is successful then finds the earlier store in the
>>>>> value history and checks the alias sets.
>>>>>
>>>>> The routine is used in two places elsewhere in the compiler. Firstly
>>>>> in cfgcleanup and secondly in postreload.
>>>>>
>>>>> gcc/ChangeLog:
>>>>> * alias.h (mems_same_for_tbaa_p): Declare.
>>>>> * alias.cc (mems_same_for_tbaa_p): New function.
>>>>> * dse.cc (record_store): Use it instead of open-coding
>>>>> alias check.
>>>>> * cselib.h (cselib_redundant_set_p): Declare.
>>>>> * cselib.cc: Include alias.h
>>>>> (cselib_redundant_set_p): New function.
>>>>> * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
>>>>> of rtx_equal_for_cselib_p.
>>>>> * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
>>>>> (reload_cse_noop_set_p): Delete.
>>>> Seems quite reasonable. The only question I would have would be
>>>> whether or not you considered including the aliasing info into the
>>>> hashing used by cselib. You'd probably still need the bulk of this
>>>> patch as well since we could presumably still get a hash conflict
>>>> with two stores of the same value to the same location, but with
>>>> different alias sets (it's just much less likely), so perhaps it
>>>> doesn't really buy us anything.
>>>
>>> I thought about this, but if the alias set were included in the hash,
>>> then surely you'd get every alias set in a different value. Then
>>> you'd miss the cases where the alias sets do conflict even though
>>> they are not the same. Anyway, the values /are/ the same so in some
>>> circumstances you might want to know that.
>>>
>>>>
>>>> Ideally this would include a testcase. You might be able to turn
>>>> that non-executawble reduced case into something useful by scanning
>>>> the post-reload dumps.
>>>
>>> I considered this as well, but the testcase I have is far too
>>> fragile, I think. The existing test only fails on Arm, only fails on
>>> 11.2 (not 11.3 or gcc-12 onwards), relies on two objects with the
>>> same value being in distinct alias sets but still assigned to the
>>> same stack slot and for some copy dance to end up trying to write
>>> back the original value to the same slot but with a non-conflicting
>>> set. And finally, the scheduler has to then try to move a load past
>>> the non-aliasing store.
>>
>>
>>>
>>> To get anywhere close to this I think we'd need something akin to the
>>> gimple reader but for RTL so that we could set up all the conditions
>>> for the failure without the risk of an earlier transform blowing the
>>> test away.
>>
>> I wasn't aware of the rtl reader already in the compiler. But it
>> doesn't really get me any closer as it is lacking in so many regards:
>>
>> - It can't handle (const_double:SF ...) - it tries to handle the
>> argument as an int. This is a consequence, I think, of the reader
>> being based on that for reading machine descriptions where FP
>> const_double is simply never encountered.
>>
>> - It doesn't seem to handle anything much more than very basic types,
>> and in particular appears to have no way of ensuring that alias sets
>> match up with the type system.
>>
>>>
>>> I even considered whether we could start with a gimple dump and
>>> bypassing all the tree/gimple transformations, but even that would be
>>> still at the mercy of the stack-slot allocation algorithm.
>>
>> I spent a while trying to get some gimple out of the dumpers in a form
>> that was usable, but that's pretty much a non-starter. To make it
>> work we'd need to add support for gimple clobbers on objects - without
>> that there's no way to get the stack-slot sharing code to work.
>> Furthermore, even feeding fully-optimized gimple directly into expand
>> is such a long way from the postreload pass, that I can't believe the
>> testcase would remain stable for long.
>>
>> And the other major issue is that the original testcase is heavily
>> templated C++ and neither of the parsers gimple or rtl is supported in
>> cc1plus: converting the boilerplate to be C-friendly is probably going
>> to be hard.
>>
>> I can't afford to spend much more time on this, especially given the
>> low-quality test we're going to get out of the end of the process.
> Understood. Let's just go with the patch as-is. That's normal for
> cases where we can't produce a reasonable test.
>
Thanks, committed to trunk. Will work on backports if it doesn't throw
up any issues in the next few days.
R.
> jeff
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-08-03 9:09 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-29 13:52 [PATCH v2] cselib: add function to check if SET is redundant [PR106187] Richard Earnshaw
2022-07-30 19:57 ` Jeff Law
2022-08-01 10:38 ` Richard Earnshaw
2022-08-02 16:06 ` Richard Earnshaw
2022-08-02 23:36 ` Jeff Law
2022-08-03 9:09 ` Richard Earnshaw
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).