* [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation.
@ 2022-02-22 16:39 Andrew MacLeod
2022-02-22 16:56 ` Jakub Jelinek
0 siblings, 1 reply; 9+ messages in thread
From: Andrew MacLeod @ 2022-02-22 16:39 UTC (permalink / raw)
To: gcc-patches; +Cc: Aldy Hernandez, Jakub Jelinek
I'd like to get clarification on some subtle terminology. I find I am
conflating calls that don't return with calls that may throw, and I
think they have different considerations.
My experiments with calls that can throw indicate that they always end a
basic block. This makes sense to me as there is the outgoing fall-thru
edge and an outgoing EH edge. Are there any conditions under which this
is not the case? (other than non-call exceptions)
If that supposition is true, that leaves us with calls in the middle of
the block which may not return. This prevents us from allowing later
calculations from impacting anything which happens before the call.
I believe the following 2 small patches could then resolve this.
1 - Export global names to SSA_NAME_RANGE_INFO during the statement
walk instead of at the end of the pass
2 - Use the existing lazy recomputation machinery to recompute any
globals which are defined in the block where a dependent value becomes
non-null.
More details in each patch. Neither is very large. We could add this
to this release or wait for stage 1.
Andrew
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation.
2022-02-22 16:39 [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation Andrew MacLeod
@ 2022-02-22 16:56 ` Jakub Jelinek
2022-02-22 17:39 ` Andrew MacLeod
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2022-02-22 16:56 UTC (permalink / raw)
To: Andrew MacLeod; +Cc: gcc-patches, Aldy Hernandez
On Tue, Feb 22, 2022 at 11:39:41AM -0500, Andrew MacLeod wrote:
> I'd like to get clarification on some subtle terminology. I find I am
> conflating calls that don't return with calls that may throw, and I think
> they have different considerations.
>
> My experiments with calls that can throw indicate that they always end a
> basic block. This makes sense to me as there is the outgoing fall-thru edge
> and an outgoing EH edge. Are there any conditions under which this is not
> the case? (other than non-call exceptions)
Generally, there are two kinds of calls that can throw: those that can throw
internally and those that can throw externally (e.g. there are
stmt_can_throw_{in,ex}ternal predicates).
Consider e.g.
void foo ();
struct S { S (); ~S (); };
void bar () { foo (); foo (); }
void baz () { S s; foo (); foo (); }
void qux () { try { foo (); } catch (...) {} }
the calls to foo in bar throw externally: if they throw, execution doesn't
continue anywhere in bar but in some caller of bar, or the program could just
terminate if nothing catches the exception at all. Such calls don't terminate a bb.
In baz, the s variable needs destruction if either of the foo calls throw,
so those calls do terminate bb and there are normal fallthru edges from
those bbs and eh edges to an EH pad which will destruct s and continue
propagating the exception.
In qux, there is explicit try/catch, so again, foo throws internally, ends
bb, has an EH edge to EH landing pad which will do what catch does.
That is EH, then there are calls that might not return because they leave
in some other way (e.g. longjmp), or might loop forever, might exit, might
abort, trap etc.
I must say I don't know if we have any call flags that would guarantee
the function will always return (exactly once) if called.
Perhaps ECF_CONST/ECF_PURE without ECF_LOOPING_CONST_OR_PURE do?
Jakub
* Re: [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation.
2022-02-22 16:56 ` Jakub Jelinek
@ 2022-02-22 17:39 ` Andrew MacLeod
2022-02-22 17:57 ` Jakub Jelinek
0 siblings, 1 reply; 9+ messages in thread
From: Andrew MacLeod @ 2022-02-22 17:39 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: gcc-patches, Aldy Hernandez
On 2/22/22 11:56, Jakub Jelinek wrote:
> On Tue, Feb 22, 2022 at 11:39:41AM -0500, Andrew MacLeod wrote:
>> I'd like to get clarification on some subtle terminology. I find I am
>> conflating calls that don't return with calls that may throw, and I think
>> they have different considerations.
>>
>> My experiments with calls that can throw indicate that they always end a
>> basic block. This makes sense to me as there is the outgoing fall-thru edge
>> and an outgoing EH edge. Are there any conditions under which this is not
>> the case? (other than non-call exceptions)
> Generally, there are two kinds of calls that can throw: those that can throw
> internally and those that can throw externally (e.g. there are
> stmt_can_throw_{in,ex}ternal predicates).
>
> Consider e.g.
>
> void foo ();
> struct S { S (); ~S (); };
> void bar () { foo (); foo (); }
> void baz () { S s; foo (); foo (); }
> void qux () { try { foo (); } catch (...) {} }
>
> the calls to foo in bar throw externally, if they throw, execution doesn't
> continue anywhere in bar but in some bar's caller, or could just terminate
> if nothing catches it at all. Such calls don't terminate a bb.
This is not a problem.
> In baz, the s variable needs destruction if either of the foo calls throw,
> so those calls do terminate bb and there are normal fallthru edges from
> those bbs and eh edges to an EH pad which will destruct s and continue
> propagating the exception.
> In qux, there is explicit try/catch, so again, foo throws internally, ends
> bb, has an EH edge to EH landing pad which will do what catch does.
Those also are not a problem; everything should flow fine in these
situations as well, now that we make non-null adjustments on outgoing edges
and don't make them on EH edges.
As far as these patches go, any block which has a call at the exit point
will not have any imports or exports, as there is no range stmt at the end
of the block, so we will not be marking anything in those blocks as stale.
>
> That is EH, then there are calls that might not return because they leave
> in some other way (e.g. longjmp), or might loop forever, might exit, might
> abort, trap etc.
Generally speaking, calls which do not return should no longer be a
problem... as long as they do not transfer control to somewhere else in
the current function.
> I must say I don't know if we have any call flags that would guarantee
> the function will always return (exactly once) if called.
> Perhaps ECF_CONST/ECF_PURE without ECF_LOOPING_CONST_OR_PURE do?
>
I don't think I actually need that.
Andrew
* Re: [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation.
2022-02-22 17:39 ` Andrew MacLeod
@ 2022-02-22 17:57 ` Jakub Jelinek
2022-02-22 18:07 ` Jeff Law
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2022-02-22 17:57 UTC (permalink / raw)
To: Andrew MacLeod; +Cc: gcc-patches, Aldy Hernandez
On Tue, Feb 22, 2022 at 12:39:28PM -0500, Andrew MacLeod wrote:
> > That is EH, then there are calls that might not return because they leave
> > in some other way (e.g. longjmp), or might loop forever, might exit, might
> > abort, trap etc.
> Generally speaking, calls which do not return should not now be a problem...
> as long as they do not transfer control to somewhere else in the current
> function.
I thought all of those cases are very relevant to PR104530.
If we have:
_1 = ptr_2(D) == 0;
// unrelated code in the same bb
_3 = *ptr_2(D);
then in light of PR104288, we can optimize ptr_2(D) == 0 into true only if
there are no calls inside of "// unrelated code in the same bb"
or if all calls in "// unrelated code in the same bb" are guaranteed to
return exactly once. Because, if there is a call in there which could
exit (that is the PR104288 testcase), or abort, or trap, or loop forever,
or throw externally, or longjmp, or in any other non-UB way
cause the _1 = ptr_2(D) == 0; stmt to be executed at runtime but
_3 = *ptr_2(D) not to be executed, then we can't optimize the earlier
comparison because ptr_2(D) could be NULL in a valid program.
While if there are no calls (and no problematic inline asms) and no trapping
insns in between, we can and PR104530 is asking that we continue to optimize
that.
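The situation above can be sketched in C (a minimal illustration; may_exit and
read_through are hypothetical names, not the actual PR104288 testcase):

```c
#include <stdlib.h>

/* May leave without returning: if cond is nonzero the program exits here,
   so any statement after the call site is never reached.  */
static void may_exit (int cond)
{
  if (cond)
    exit (0);
}

int read_through (int *p)
{
  int cmp = (p == 0);   /* must not be folded to false using the deref below */
  may_exit (cmp);       /* call between the compare and the dereference */
  return *p;            /* only reached when p != 0 */
}
```

With a null argument this is a valid program that exits before the dereference,
which is exactly why the earlier comparison cannot be folded.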
Jakub
* Re: [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation.
2022-02-22 17:57 ` Jakub Jelinek
@ 2022-02-22 18:07 ` Jeff Law
2022-02-22 19:18 ` Andrew MacLeod
0 siblings, 1 reply; 9+ messages in thread
From: Jeff Law @ 2022-02-22 18:07 UTC (permalink / raw)
To: Jakub Jelinek, Andrew MacLeod; +Cc: gcc-patches
On 2/22/2022 10:57 AM, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Feb 22, 2022 at 12:39:28PM -0500, Andrew MacLeod wrote:
>>> That is EH, then there are calls that might not return because they leave
>>> in some other way (e.g. longjmp), or might loop forever, might exit, might
>>> abort, trap etc.
>> Generally speaking, calls which do not return should not now be a problem...
>> as long as they do not transfer control to somewhere else in the current
>> function.
> I thought all of those cases are very relevant to PR104530.
> If we have:
> _1 = ptr_2(D) == 0;
> // unrelated code in the same bb
> _3 = *ptr_2(D);
> then in light of PR104288, we can optimize ptr_2(D) == 0 into true only if
> there are no calls inside of "// unrelated code in the same bb"
> or if all calls in "// unrelated code in the same bb" are guaranteed to
> return exactly once. Because, if there is a call in there which could
> exit (that is the PR104288 testcase), or abort, or trap, or loop forever,
> or throw externally, or longjmp or in any other non-UB way
> cause the _1 = ptr_2(D) == 0; stmt to be invoked at runtime but
> _3 = *ptr_2(D) not being invoked, then we can't optimize the earlier
> comparison because ptr_2(D) could be NULL in a valid program.
> While if there are no calls (and no problematic inline asms) and no trapping
> insns in between, we can and PR104530 is asking that we continue to optimize
> that.
Right. This is similar to some of the restrictions we deal with in the
path isolation pass. Essentially we have a path that, when traversed, would
result in a *0. We would like to be able to find the edge upon which
the *0 is control dependent and optimize the test so that it always goes
to the valid path rather than the *0 path.
The problem is there may be observable side effects on the *0 path
between the test and the actual *0 -- including calls to nonreturning
functions, setjmp/longjmp, things that could trap, etc. This case is
similar. We can't back-propagate the non-null status through any
statements with observable side effects.
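A sketch of that constraint (hypothetical code, not from the pass itself): the
side effect on the *0 path must still execute, so the test cannot simply be
rewritten to always take the valid path.

```c
#include <stdio.h>

int isolate (int *p)
{
  if (p == 0)
    fprintf (stderr, "null input\n");  /* observable side effect on the *0 path */
  return *p;                           /* the *0 happens after the side effect */
}
```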
Jeff
* Re: [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation.
2022-02-22 18:07 ` Jeff Law
@ 2022-02-22 19:18 ` Andrew MacLeod
2022-02-23 7:48 ` Richard Biener
0 siblings, 1 reply; 9+ messages in thread
From: Andrew MacLeod @ 2022-02-22 19:18 UTC (permalink / raw)
To: Jeff Law, Jakub Jelinek; +Cc: gcc-patches
On 2/22/22 13:07, Jeff Law wrote:
>
>
> On 2/22/2022 10:57 AM, Jakub Jelinek via Gcc-patches wrote:
>> On Tue, Feb 22, 2022 at 12:39:28PM -0500, Andrew MacLeod wrote:
>>>> That is EH, then there are calls that might not return because they
>>>> leave
>>>> in some other way (e.g. longjmp), or might loop forever, might
>>>> exit, might
>>>> abort, trap etc.
>>> Generally speaking, calls which do not return should not now be a
>>> problem...
>>> as long as they do not transfer control to somewhere else in the
>>> current
>>> function.
>> I thought all of those cases are very relevant to PR104530.
>> If we have:
>> _1 = ptr_2(D) == 0;
>> // unrelated code in the same bb
>> _3 = *ptr_2(D);
>> then in light of PR104288, we can optimize ptr_2(D) == 0 into true
>> only if
>> there are no calls inside of "// unrelated code in the same bb"
>> or if all calls in "// unrelated code in the same bb" are guaranteed to
>> return exactly once. Because, if there is a call in there which could
>> exit (that is the PR104288 testcase), or abort, or trap, or loop
>> forever,
>> or throw externally, or longjmp or in any other non-UB way
>> cause the _1 = ptr_2(D) == 0; stmt to be invoked at runtime but
>> _3 = *ptr_2(D) not being invoked, then we can't optimize the earlier
>> comparison because ptr_2(D) could be NULL in a valid program.
>> While if there are no calls (and no problematic inline asms) and no
>> trapping
>> insns in between, we can and PR104530 is asking that we continue to
>> optimize
>> that.
> Right. This is similar to some of the restrictions we deal with in
> the path isolation pass. Essentially we have a path, when traversed,
> would result in a *0. We would like to be able to find the edge
> upon-which the *0 is control dependent and optimize the test so that
> it always went to the valid path rather than the *0 path.
>
> The problem is there may be observable side effects on the *0 path
> between the test and the actual *0 -- including calls to nonreturning
> functions, setjmp/longjmp, things that could trap, etc. This case is
> similar. We can't back-propagate the non-null status through any
> statements with observable side effects.
>
> Jeff
>
We can't back-propagate, but we can alter our forward view. Any
ssa-name defined before the observable side effect can be recalculated
using the updated values, and all uses of those names after the
side effect would then appear to be "up-to-date".
This does not actually change anything before the side-effect statement,
but the lazy re-evaluation ranger employs makes it appear as if we do a
new computation when _1 is used afterwards, i.e.:
_1 = ptr_2(D) == 0;
// unrelated code in the same bb
_3 = *ptr_2(D);
_4 = ptr_2(D) == 0; // ptr_2 is known to be [+1, +INF] now.
And we use _4 everywhere _1 was used. This is the effect.
So we do not actually change anything in the unrelated code, just the
observable effects afterwards. We already do these recalculations on
outgoing edges in other blocks, just not within the definition block,
because non-null wasn't visible within the def block.
Additionally, in the testcase there is a store to c before the side
effects. These patches get rid of the branch and thus the call in the
testcase as requested, but we still have to compute _3 in order to store
it into the global c, since that store occurs before the side effect.
b.0_1 = b;
_2 = b.0_1 == 0B;
_3 = (int) _2;
c = _3;
_5 = *b.0_1;
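For reference, GIMPLE like the above could come from C source roughly like this
(a reconstruction for illustration, not the actual testcase):

```c
int *b;
int c;

int f (void)
{
  c = (b == 0);   /* b.0_1 = b; _2 = b.0_1 == 0B; _3 = (int) _2; c = _3; */
  return *b;      /* _5 = *b.0_1; */
}
```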
No matter how you look at it, you are going to need to process a block
twice in order to handle any code pre-side-effect. Whether it be
assigning stmt uids, or what have you.
VRP could pre-process the block, and if it gets to the end of the block
having seen at least one statement with a side effect and no calls which
may not return, it could process the block again with all the side effects
already active. I'm not sure that buys as much as it costs, but it
would change the value written to c to be 1, and it would change the
global values exported for _2 and _3.
Another option would be to flag the ssa-names instead of/as well as marking
them as stale. If we get to the end of the block and there were no
non-returning functions or EH edges, then re-calculate and export those
ssa-names using the latest values. That would export [0,0] for _2 and _3.
This would have no tangible impact during the first VRP pass, but the
*next* VRP pass, (or any other ranger pass) would pick up the new global
ranges, and do all the right things... so we basically let a subsequent
pass pick up the info and do the dirty work.
Andrew
* Re: [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation.
2022-02-22 19:18 ` Andrew MacLeod
@ 2022-02-23 7:48 ` Richard Biener
2022-02-23 16:30 ` Andrew MacLeod
0 siblings, 1 reply; 9+ messages in thread
From: Richard Biener @ 2022-02-23 7:48 UTC (permalink / raw)
To: Andrew MacLeod; +Cc: Jeff Law, Jakub Jelinek, gcc-patches
On Tue, Feb 22, 2022 at 8:19 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On 2/22/22 13:07, Jeff Law wrote:
> >
> >
> > On 2/22/2022 10:57 AM, Jakub Jelinek via Gcc-patches wrote:
> >> On Tue, Feb 22, 2022 at 12:39:28PM -0500, Andrew MacLeod wrote:
> >>>> That is EH, then there are calls that might not return because they
> >>>> leave
> >>>> in some other way (e.g. longjmp), or might loop forever, might
> >>>> exit, might
> >>>> abort, trap etc.
> >>> Generally speaking, calls which do not return should not now be a
> >>> problem...
> >>> as long as they do not transfer control to somewhere else in the
> >>> current
> >>> function.
> >> I thought all of those cases are very relevant to PR104530.
> >> If we have:
> >> _1 = ptr_2(D) == 0;
> >> // unrelated code in the same bb
> >> _3 = *ptr_2(D);
> >> then in light of PR104288, we can optimize ptr_2(D) == 0 into true
> >> only if
> >> there are no calls inside of "// unrelated code in the same bb"
> >> or if all calls in "// unrelated code in the same bb" are guaranteed to
> >> return exactly once. Because, if there is a call in there which could
> >> exit (that is the PR104288 testcase), or abort, or trap, or loop
> >> forever,
> >> or throw externally, or longjmp or in any other non-UB way
> >> cause the _1 = ptr_2(D) == 0; stmt to be invoked at runtime but
> >> _3 = *ptr_2(D) not being invoked, then we can't optimize the earlier
> >> comparison because ptr_2(D) could be NULL in a valid program.
> >> While if there are no calls (and no problematic inline asms) and no
> >> trapping
> >> insns in between, we can and PR104530 is asking that we continue to
> >> optimize
> >> that.
> > Right. This is similar to some of the restrictions we deal with in
> > the path isolation pass. Essentially we have a path, when traversed,
> > would result in a *0. We would like to be able to find the edge
> > upon-which the *0 is control dependent and optimize the test so that
> > it always went to the valid path rather than the *0 path.
> >
> > The problem is there may be observable side effects on the *0 path
> > between the test and the actual *0 -- including calls to nonreturning
> > functions, setjmp/longjmp, things that could trap, etc. This case is
> > similar. We can't back-propagate the non-null status through any
> > statements with observable side effects.
> >
> > Jeff
> >
> We can't back propagate, but we can alter our forward view. Any
> ssa-name defined before the observable side effect can be recalculated
> using the updated values, and all uses of those names after the
> side-effect would then appear to be "up-to-date"
>
> This does not actually change anything before the side-effect statement,
> but the lazy re-evaluation ranger employs makes it appear as if we do a
> new computation when _1 is used afterwards. ie:
>
> _1 = ptr_2(D) == 0;
> // unrelated code in the same bb
> _3 = *ptr_2(D);
> _4 = ptr_2(D) == 0; // ptr_2 is known to be [+1, +INF] now.
> And we use _4 everywhere _1 was used. This is the effect.
>
> so we do not actually change anything in the unrelated code, just
> observable effects afterwards. We already do these recalculations on
> outgoing edges in other blocks, just not within the definition block
> because non-null wasn't visible within the def block.
>
> Additionally, In the testcase, there is a store to C before the side
> effects.
> these patches get rid of the branch and thus the call in the testcase as
> requested, but we still have to compute _3 in order to store it into
> global C since it occurs pre side-effect.
>
> b.0_1 = b;
> _2 = b.0_1 == 0B;
> _3 = (int) _2;
> c = _3;
> _5 = *b.0_1;
>
> No matter how you look at it, you are going to need to process a block
> twice in order to handle any code pre-side-effect. Whether it be
> assigning stmt uids, or what have you.
Yes. I thought that is what ranger already does when it discovers new
ranges from edges. Say we have
_1 = 10 / _2;
if (_2 == 1)
{
_3 = _1 + 1;
then when evaluating _1 + 1 we re-evaluate 10 / _2 using _2 == 1 and
can compute _3 to [11, 11]?
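In source terms the recomputation might look like this sketch (hypothetical
function, mirroring the GIMPLE above):

```c
int g (int d)
{
  int q = 10 / d;   /* _1 = 10 / _2 */
  if (d == 1)
    return q + 1;   /* on this edge _2 is [1,1], so 10 / _2 recomputes
                       to [10,10] and q + 1 has range [11,11] */
  return 0;
}
```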
That obviously extends to any stmt-level ranges we discover for uses
(not defs because defs are never used upthread). And doing that is
_not_ affected by any function/BB terminating calls or EH or whatnot
as long as the updated ranges are only affecting stmts dominating the
current one.
What complicates all this reasoning is that it is straight-forward when
you work with a traditional IL walking pass but it gets hard (and possibly
easy to get wrong) with on-demand processing and caching because
everything you cache will now be context dependent (valid only
starting after stmt X and for stmts dominated by it).
> VRP could pre-process the block, and if it gets to the end of the block,
> and it had at least one statement with a side effect and no calls which
> may not return you could process the block with all the side effects
> already active. I'm not sure if that buys as much as the cost, but it
> would change the value written to C to be 1, and it would change the
> global values exported for _2 and _3.
>
> Another option would be flag the ssa-names instead of/as well as marking
> them as stale. If we get to the end of the block and there were no
> non-returning functions or EH edges, then re-calculate and export those
> ssa_names using the latest values.. That would export [0,0] for _2 and _3.
>
> This would have no tangible impact during the first VRP pass, but the
> *next* VRP pass, (or any other ranger pass) would pick up the new global
> ranges, and do all the right things... so we basically let a subsequent
> pass pick up the info and do the dirty work.
>
> Andrew
* Re: [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation.
2022-02-23 7:48 ` Richard Biener
@ 2022-02-23 16:30 ` Andrew MacLeod
0 siblings, 0 replies; 9+ messages in thread
From: Andrew MacLeod @ 2022-02-23 16:30 UTC (permalink / raw)
To: Richard Biener; +Cc: Jeff Law, Jakub Jelinek, gcc-patches
On 2/23/22 02:48, Richard Biener wrote:
> On Tue, Feb 22, 2022 at 8:19 PM Andrew MacLeod via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> On 2/22/22 13:07, Jeff Law wrote:
>>>
>>> On 2/22/2022 10:57 AM, Jakub Jelinek via Gcc-patches wrote:
>>>> On Tue, Feb 22, 2022 at 12:39:28PM -0500, Andrew MacLeod wrote:
>>>>>> That is EH, then there are calls that might not return because they
>>>>>> leave
>>>>>> in some other way (e.g. longjmp), or might loop forever, might
>>>>>> exit, might
>>>>>> abort, trap etc.
>>>>> Generally speaking, calls which do not return should not now be a
>>>>> problem...
>>>>> as long as they do not transfer control to somewhere else in the
>>>>> current
>>>>> function.
>>>> I thought all of those cases are very relevant to PR104530.
>>>> If we have:
>>>> _1 = ptr_2(D) == 0;
>>>> // unrelated code in the same bb
>>>> _3 = *ptr_2(D);
>>>> then in light of PR104288, we can optimize ptr_2(D) == 0 into true
>>>> only if
>>>> there are no calls inside of "// unrelated code in the same bb"
>>>> or if all calls in "// unrelated code in the same bb" are guaranteed to
>>>> return exactly once. Because, if there is a call in there which could
>>>> exit (that is the PR104288 testcase), or abort, or trap, or loop
>>>> forever,
>>>> or throw externally, or longjmp or in any other non-UB way
>>>> cause the _1 = ptr_2(D) == 0; stmt to be invoked at runtime but
>>>> _3 = *ptr_2(D) not being invoked, then we can't optimize the earlier
>>>> comparison because ptr_2(D) could be NULL in a valid program.
>>>> While if there are no calls (and no problematic inline asms) and no
>>>> trapping
>>>> insns in between, we can and PR104530 is asking that we continue to
>>>> optimize
>>>> that.
>>> Right. This is similar to some of the restrictions we deal with in
>>> the path isolation pass. Essentially we have a path, when traversed,
>>> would result in a *0. We would like to be able to find the edge
>>> upon-which the *0 is control dependent and optimize the test so that
>>> it always went to the valid path rather than the *0 path.
>>>
>>> The problem is there may be observable side effects on the *0 path
>>> between the test and the actual *0 -- including calls to nonreturning
>>> functions, setjmp/longjmp, things that could trap, etc. This case is
>>> similar. We can't back-propagate the non-null status through any
>>> statements with observable side effects.
>>>
>>> Jeff
>>>
>> We can't back propagate, but we can alter our forward view. Any
>> ssa-name defined before the observable side effect can be recalculated
>> using the updated values, and all uses of those names after the
>> side-effect would then appear to be "up-to-date"
>>
>> This does not actually change anything before the side-effect statement,
>> but the lazy re-evaluation ranger employs makes it appear as if we do a
>> new computation when _1 is used afterwards. ie:
>>
>> _1 = ptr_2(D) == 0;
>> // unrelated code in the same bb
>> _3 = *ptr_2(D);
>> _4 = ptr_2(D) == 0; // ptr_2 is known to be [+1, +INF] now.
>> And we use _4 everywhere _1 was used. This is the effect.
>>
>> so we do not actually change anything in the unrelated code, just
>> observable effects afterwards. We already do these recalculations on
>> outgoing edges in other blocks, just not within the definition block
>> because non-null wasn't visible within the def block.
>>
>> Additionally, In the testcase, there is a store to C before the side
>> effects.
>> these patches get rid of the branch and thus the call in the testcase as
>> requested, but we still have to compute _3 in order to store it into
>> global C since it occurs pre side-effect.
>>
>> b.0_1 = b;
>> _2 = b.0_1 == 0B;
>> _3 = (int) _2;
>> c = _3;
>> _5 = *b.0_1;
>>
>> No matter how you look at it, you are going to need to process a block
>> twice in order to handle any code pre-side-effect. Whether it be
>> assigning stmt uids, or what have you.
> Yes. I thought that is what ranger already does when it discovers new
> ranges from edges. Say we have
>
> _1 = 10 / _2;
> if (_2 == 1)
> {
> _3 = _1 + 1;
>
> then when evaluating _1 + 1 we re-evaluate 10 / _2 using _2 == 1 and
> can compute _3 to [11, 11]?
Correct, we get most of these first order effects via edges.
>
> That obviously extends to any stmt-level ranges we discover for uses
> (not defs because defs are never used upthread). And doing that is
> _not_ affected by any function/BB terminating calls or EH or whatnot
> as long as the updated ranges are only affecting stmts dominating the
> current one.
>
> What complicates all this reasoning is that it is straight-forward when
> you work with a traditional IL walking pass but it gets hard (and possibly
> easy to get wrong) with on-demand processing and caching because
> everything you cache will now be context dependent (valid only
> starting after stmt X and for stmts dominated by it).
Yeah, which is why this particular side-effect code only applies to
definitions during a dom walk; we know we will not return to a def.
The non-null list (and, next release, the generalized side effects) are
only applied to on-exit ranges via non-EH edges, so they can't really
get us into trouble, as we are sure of those values only affecting
dominated blocks. Pure on-demand clients will not get any of this
intra-block fine tuning.
* Re: [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation.
@ 2022-02-23 21:23 Martin Uecker
0 siblings, 0 replies; 9+ messages in thread
From: Martin Uecker @ 2022-02-23 21:23 UTC (permalink / raw)
To: jeffreyalaw; +Cc: gcc-patches
>
> On 2/22/2022 10:57 AM, Jakub Jelinek via Gcc-patches wrote:
> > On Tue, Feb 22, 2022 at 12:39:28PM -0500, Andrew MacLeod wrote:
> >>> That is EH, then there are calls that might not return because they leave
> >>> in some other way (e.g. longjmp), or might loop forever, might exit, might
> >>> abort, trap etc.
> >> Generally speaking, calls which do not return should not now be a problem...
> >> as long as they do not transfer control to somewhere else in the current
> >> function.
> > I thought all of those cases are very relevant to PR104530.
> > If we have:
> > _1 = ptr_2(D) == 0;
> > // unrelated code in the same bb
> > _3 = *ptr_2(D);
> > then in light of PR104288, we can optimize ptr_2(D) == 0 into true only if
> > there are no calls inside of "// unrelated code in the same bb"
> > or if all calls in "// unrelated code in the same bb" are guaranteed to
> > return exactly once. Because, if there is a call in there which could
> > exit (that is the PR104288 testcase), or abort, or trap, or loop forever,
> > or throw externally, or longjmp or in any other non-UB way
> > cause the _1 = ptr_2(D) == 0; stmt to be invoked at runtime but
> > _3 = *ptr_2(D) not being invoked, then we can't optimize the earlier
> > comparison because ptr_2(D) could be NULL in a valid program.
> > While if there are no calls (and no problematic inline asms) and no trapping
> > insns in between, we can and PR104530 is asking that we continue to optimize
> > that.
> Right. This is similar to some of the restrictions we deal with in the
> path isolation pass. Essentially we have a path, when traversed, would
> result in a *0. We would like to be able to find the edge upon-which
> the *0 is control dependent and optimize the test so that it always went
> to the valid path rather than the *0 path.
>
> The problem is there may be observable side effects on the *0 path
> between the test and the actual *0 -- including calls to nonreturning
> functions, setjmp/longjmp, things that could trap, etc. This case is
> similar. We can't back-propagate the non-null status through any
> statements with observable side effects.
There are cases with volatile accesses where
this is currently not the case.
Also there is a proposal for C++ to require
an explicit std::observable fence to make sure
observable side effects are not undone by
later UB (see the discussion about "reordering
of trapping operations and volatile" in
January on the gcc list).
In my opinion it is much better if a compiler
makes sure to preserve observable side effects
even in code path with later UB (because the
behavior may otherwise be surprising and existing
code may be broken in a subtle way). If I
understand you correctly, GCC intends to do
this.
Can we then also agree that the remaining volatile
cases should be fixed?
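One of the volatile cases in question looks roughly like this (a hypothetical
sketch of the pattern discussed on the list):

```c
volatile int progress;

int observe_then_deref (int *p)
{
  progress = 1;   /* volatile store: an observable side effect that must
                     not be removed or reordered past, even though the
                     next statement may invoke UB when p is null */
  return *p;
}
```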
Martin
end of thread, other threads:[~2022-02-23 21:23 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-22 16:39 [PATCH 0/2] tree-optimization/104530 - proposed re-evaluation Andrew MacLeod
2022-02-22 16:56 ` Jakub Jelinek
2022-02-22 17:39 ` Andrew MacLeod
2022-02-22 17:57 ` Jakub Jelinek
2022-02-22 18:07 ` Jeff Law
2022-02-22 19:18 ` Andrew MacLeod
2022-02-23 7:48 ` Richard Biener
2022-02-23 16:30 ` Andrew MacLeod
2022-02-23 21:23 Martin Uecker
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).