public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Some PINGs
@ 2021-11-06 22:20 Roger Sayle
  2021-11-07  0:36 ` Jeff Law
  2021-11-08  8:50 ` Richard Biener
  0 siblings, 2 replies; 7+ messages in thread
From: Roger Sayle @ 2021-11-06 22:20 UTC (permalink / raw)
  To: 'GCC Patches'


I wonder if reviewers could take a look (or a second look) at some of my
outstanding patches.

Four nvptx backend patches:

nvptx: Use cvt to perform sign-extension of truncation.
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578256.html

nvptx: Add (experimental) support for HFmode with -misa=sm_53
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579623.html

nvptx: Adds uses of -misa=sm_75 and -misa=sm_80
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579691.html

Transition nvptx backend to STORE_FLAG_VALUE = 1
https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581004.html

Two middle-end patches:

Simplify paradoxical subreg extensions of TRUNCATE
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578848.html

PR middle-end/100810: Penalize IV candidates with undefined value bases
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578441.html

And a gfortran patch:

gfortran: Improve translation of POPPAR intrinsic
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548055.html


Many thanks in advance,
Roger
--



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Some PINGs
  2021-11-06 22:20 Some PINGs Roger Sayle
@ 2021-11-07  0:36 ` Jeff Law
  2021-11-07 14:07   ` Roger Sayle
  2021-11-08  8:50 ` Richard Biener
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff Law @ 2021-11-07  0:36 UTC (permalink / raw)
  To: Roger Sayle, 'GCC Patches'



On 11/6/2021 4:20 PM, Roger Sayle wrote:
>
> Simplify paradoxical subreg extensions of TRUNCATE
> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578848.html
So the discussion seemed to end with a recommendation to try and address 
this earlier in the call chain rather than in simplify_rtx.

No state on the others...

jeff


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: Some PINGs
  2021-11-07  0:36 ` Jeff Law
@ 2021-11-07 14:07   ` Roger Sayle
  0 siblings, 0 replies; 7+ messages in thread
From: Roger Sayle @ 2021-11-07 14:07 UTC (permalink / raw)
  To: 'Jeff Law', 'GCC Patches'


>On 11/6/2021 4:20 PM, Roger Sayle wrote:
>> Simplify paradoxical subreg extensions of TRUNCATE 
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578848.html
> So the discussion seemed to end with a recommendation to try and address this earlier in the call chain rather than in simplify_rtx.

I take full responsibility that my poor choice of wording in describing this patch
may be confusing to those not familiar with this style of describing middle-end
transformations in GCC.  Take the opening sentence "This patch simplifies the
RTX (subreg:HI (truncate:QI (reg:SI))) as (truncate:HI (reg:SI))", taken literally
someone may look-up the RTL documentation and observe that SUBREG
may never be applied to truncate hence the above is invalid, and therefore
this transformation could never fire.  The wording is perhaps unfortunate; the
abstract RTL above is never actually created/allocated, instead we're describing the
transformation/context, i.e. the behaviour of the call to gen_lowpart when passed
a TRUCATE rtx as an operand.  Hence, this patch is already "earlier in the call chain"
and where it should be.  This is no different to how GCC's match describes tree
expressions to be folded, but that never physically exist in GIMPLE, where the
single-static assignment form guarantees no operator may have another as an
operand.

If someone could review the code change itself, and not misinterpret the patch 
description that would be much appreciated.

Thanks in advance,
Roger
--



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Some PINGs
  2021-11-06 22:20 Some PINGs Roger Sayle
  2021-11-07  0:36 ` Jeff Law
@ 2021-11-08  8:50 ` Richard Biener
  2021-11-08 14:02   ` Roger Sayle
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Biener @ 2021-11-08  8:50 UTC (permalink / raw)
  To: Roger Sayle; +Cc: GCC Patches

On Sat, Nov 6, 2021 at 11:21 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> I wonder if reviewers could take a look (or a second look) at some of my
> outstanding patches.
>
> Four nvptx backend patches:
>
> nvptx: Use cvt to perform sign-extension of truncation.
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578256.html
>
> nvptx: Add (experimental) support for HFmode with -misa=sm_53
> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579623.html
>
> nvptx: Adds uses of -misa=sm_75 and -misa=sm_80
> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579691.html
>
> Transition nvptx backend to STORE_FLAG_VALUE = 1
> https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581004.html
>
> Two middle-end patches:
>
> Simplify paradoxical subreg extensions of TRUNCATE
> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578848.html
>
> PR middle-end/100810: Penalize IV candidates with undefined value bases
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578441.html

I did comment on this one, noting the more general issue.  My opinion is still
that doing heavy lifting in IVOPTs is misplaced.

> And a gfortran patch:
>
> gfortran: Improve translation of POPPAR intrinsic
> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548055.html
>
>
> Many thanks in advance,
> Roger
> --
>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: Some PINGs
  2021-11-08  8:50 ` Richard Biener
@ 2021-11-08 14:02   ` Roger Sayle
  2021-11-08 15:40     ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Roger Sayle @ 2021-11-08 14:02 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: 'GCC Patches'


Hi Richard,

>> I wonder if reviewers could take a look (or a second look) at some of 
>> my outstanding patches.
>> PR middle-end/100810: Penalize IV candidates with undefined value 
>> bases 
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578441.html
>
> I did comment on this one, noting the more general issue.
> My opinion is still that doing heavy lifting in IVOPTs is misplaced.

I wasn't sure whether you'd had the opportunity to give this bug some more
thought.

You're completely right, that it is theoretically possible for GCC to upgrade
its data flow (CCP/VRP) passes to use a finer grained definition of undefined/
uninitialized/indeterminate values; an indeterminate-value numbering pass
if you will.  Under the constraints that the automatic variables are not volatile,
and the types don't supporting a trapping values, the compiler could determine
that "undef1 - undef1", or "undef2 ^ undef2" have defined values, but that 
"undef1 - undef2" and "undef3 ^ undef4" remain indeterminate.  Like
traditional value numbering, it may be possible to track "t1 = undef3 ^ undef4",
"t2 = t1 ^ undef4", "t3 = t2 - undef3".   Conceptually, the same applies to 
(floating point) mathematics and its numerous infinities, sometimes "+Inf - +Inf"
is known to be zero, provided that it is the exact same infinity (omega) that is
being subtracted.

The two counter arguments for this solution as a fix for PR 100810, is that such
a pass doesn't yet exist, and it still (probably) falls foul of C/C++'s undefined
behaviour from use of an uninitialized automatic variable.  From an engineering
perspective, it's a lot of effort to support poorly written code.

Quick question to the language lawyers:

int foo()
{
  int undef;
  return undef ^ undef;
}

int bar()
{
  volatile int undef;
  return undef ^ undef;
}

Do the above functions invoke undefined behaviour?

The approach taken by the proposed patch is that it's the pass/transformation that
introduces more undefined behaviour than was present in the original code, that is
at fault.  Even if later passes, decided not to take advantage of UB, is there any benefit
for replacing an induction variable with a well-defined value (evolution) and substituting
it with one that depends upon indefinite values.  I'd argue the correct fix is to go the other
way, and attempt to reduce the instances of undefined behaviour.

So transform

	int undef;
	while (cond())
	  undef++;
	...

which invokes UB on each iteration with:

	int undef;
	unsigned int count = 0;
	while (cond())
	  count++;
	undef += count;
	...

which now only invokes UB after the loop.

Consider:
	int undef;
	int result = 0;
	while (cond())
	  result ^= undef;
	return result;

where the final value of result may be well-defined if the loop iterates
an even number of times.

	int undef;
	int result = 0;
	while (cond())
	  result ^= 1;
	return result ? undef : 0;


Has anyone proposed an alternate fix to PR middle-end/100810?
We can always revert my proposed fix (for this P1 regression), once IV opts
is able to confirm that it is safe to make the substitution that it is proposing.


I'm curious to hear your (latest) thoughts.

Thanks again for thinking about this.

Best regards,
Roger
--



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Some PINGs
  2021-11-08 14:02   ` Roger Sayle
@ 2021-11-08 15:40     ` Richard Biener
  2021-11-09  9:18       ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2021-11-08 15:40 UTC (permalink / raw)
  To: Roger Sayle; +Cc: GCC Patches

On Mon, Nov 8, 2021 at 3:02 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> Hi Richard,
>
> >> I wonder if reviewers could take a look (or a second look) at some of
> >> my outstanding patches.
> >> PR middle-end/100810: Penalize IV candidates with undefined value
> >> bases
> >> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578441.html
> >
> > I did comment on this one, noting the more general issue.
> > My opinion is still that doing heavy lifting in IVOPTs is misplaced.
>
> I wasn't sure whether you'd had the opportunity to give this bug some more
> thought.
>
> You're completely right, that it is theoretically possible for GCC to upgrade
> its data flow (CCP/VRP) passes to use a finer grained definition of undefined/
> uninitialized/indeterminate values; an indeterminate-value numbering pass
> if you will.  Under the constraints that the automatic variables are not volatile,
> and the types don't supporting a trapping values, the compiler could determine
> that "undef1 - undef1", or "undef2 ^ undef2" have defined values, but that
> "undef1 - undef2" and "undef3 ^ undef4" remain indeterminate.  Like
> traditional value numbering, it may be possible to track "t1 = undef3 ^ undef4",
> "t2 = t1 ^ undef4", "t3 = t2 - undef3".   Conceptually, the same applies to
> (floating point) mathematics and its numerous infinities, sometimes "+Inf - +Inf"
> is known to be zero, provided that it is the exact same infinity (omega) that is
> being subtracted.
>
> The two counter arguments for this solution as a fix for PR 100810, is that such
> a pass doesn't yet exist, and it still (probably) falls foul of C/C++'s undefined
> behaviour from use of an uninitialized automatic variable.  From an engineering
> perspective, it's a lot of effort to support poorly written code.
>
> Quick question to the language lawyers:
>
> int foo()
> {
>   int undef;
>   return undef ^ undef;
> }
>
> int bar()
> {
>   volatile int undef;
>   return undef ^ undef;
> }
>
> Do the above functions invoke undefined behaviour?

Yes.  C17 6.3.2.1 says so.  Note that it has a strange restriction
on being 'register' qualifiable, so taking the address of 'undef'
in an unrelated stmt would make it not undefined?  Whenever the
language spec makes the use not undefined the argument is
that the abstract machine would for

  int bar()
  {
    int undef;
    int tem = undef;
    return tem ^ tem;
  }

cause 'undef' to have a single evaluation and thus 'tem' have
a single consistent value.  In GCC we'd have propagated those out,
resulting in two evaluations (not sure if a SSA use can be considered
an "evaluation" or whether the non-existing(!) single definition is the sole
evaluation)

> The approach taken by the proposed patch is that it's the pass/transformation that
> introduces more undefined behaviour than was present in the original code, that is
> at fault.  Even if later passes, decided not to take advantage of UB, is there any benefit
> for replacing an induction variable with a well-defined value (evolution) and substituting
> it with one that depends upon indefinite values.  I'd argue the correct fix is to go the other
> way, and attempt to reduce the instances of undefined behaviour.
>
> So transform
>
>         int undef;
>         while (cond())
>           undef++;
>         ...
>
> which invokes UB on each iteration with:
>
>         int undef;
>         unsigned int count = 0;
>         while (cond())
>           count++;
>         undef += count;
>         ...
>
> which now only invokes UB after the loop.
>
> Consider:
>         int undef;
>         int result = 0;
>         while (cond())
>           result ^= undef;
>         return result;
>
> where the final value of result may be well-defined if the loop iterates
> an even number of times.
>
>         int undef;
>         int result = 0;
>         while (cond())
>           result ^= 1;
>         return result ? undef : 0;
>
>
> Has anyone proposed an alternate fix to PR middle-end/100810?
> We can always revert my proposed fix (for this P1 regression), once IV opts
> is able to confirm that it is safe to make the substitution that it is proposing.

I do agree with the idea to not increase the number of UNDEF uses.  But the
issue is that undefinedness as in what your patch would consider is brittle,
it can be easily hidden in a function parameter and thus the bug will likely
re-surface if you just hide the undefinedness from the pass in some way.

So any place where we try to do sth when we see an UNDEF for _correctness_
reasons should really be testing for must-definedness instead which I think
is similarly infeasible and broken.

So for the PR at hand it's actually ifcombine that turns the testcase from
one not invoking undefined behavior first (all uses of 'i' are never executed)
into one that does by turning

    if (!b)
      i &&d;

into

   _Bool tem = b == 0 & i != 0;
   if (tem)
     d;

the loop body is indeed mangled by IVOPTs by replacing h with a
computation based on the undefined i and that elimination is
invalid exactly because for

  # i_16 = PHI <i_24(D)(20), i_17(7)>
  _41 = (unsigned int) i_24(D);
  _51 = _28 + _41;
  _22 = (unsigned int) i_16;
  _48 = -_22;

i_24(D) and i_16 do not cancel out on the edge from (20)
where i_16 is defined as a copy from i_24(D) ... I do think that
CCP is what really breaks things here - CCP assumes that
each SSA use of an UNDEF value can yield a different value.

"stabilizing" 'i's value by making it a parameter to the function
(but still undefined) will make it work.

That said, when we treat undefined as undefined we indeed
have to be very careful about transforms we do (see signed
overflow).  For undefs the important part is to handle them
optimistically in PHIs, anything else is likely going to cause
more problems than it helps.

But - I'm not fully set on any option to take.  Btw, the bug
is surely latent and it should be possible to craft a simpler
testcase that also breaks on older GCC, thus P1 isn't
appropriate here.

I'll think a bit more tomorrow.

Thanks,
Richard.

> I'm curious to hear your (latest) thoughts.
>
> Thanks again for thinking about this.
>
> Best regards,
> Roger
> --
>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Some PINGs
  2021-11-08 15:40     ` Richard Biener
@ 2021-11-09  9:18       ` Richard Biener
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2021-11-09  9:18 UTC (permalink / raw)
  To: Roger Sayle; +Cc: GCC Patches, bin.cheng

[-- Attachment #1: Type: text/plain, Size: 7916 bytes --]

On Mon, Nov 8, 2021 at 4:40 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Mon, Nov 8, 2021 at 3:02 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
> >
> >
> > Hi Richard,
> >
> > >> I wonder if reviewers could take a look (or a second look) at some of
> > >> my outstanding patches.
> > >> PR middle-end/100810: Penalize IV candidates with undefined value
> > >> bases
> > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578441.html
> > >
> > > I did comment on this one, noting the more general issue.
> > > My opinion is still that doing heavy lifting in IVOPTs is misplaced.
> >
> > I wasn't sure whether you'd had the opportunity to give this bug some more
> > thought.
> >
> > You're completely right, that it is theoretically possible for GCC to upgrade
> > its data flow (CCP/VRP) passes to use a finer grained definition of undefined/
> > uninitialized/indeterminate values; an indeterminate-value numbering pass
> > if you will.  Under the constraints that the automatic variables are not volatile,
> > and the types don't supporting a trapping values, the compiler could determine
> > that "undef1 - undef1", or "undef2 ^ undef2" have defined values, but that
> > "undef1 - undef2" and "undef3 ^ undef4" remain indeterminate.  Like
> > traditional value numbering, it may be possible to track "t1 = undef3 ^ undef4",
> > "t2 = t1 ^ undef4", "t3 = t2 - undef3".   Conceptually, the same applies to
> > (floating point) mathematics and its numerous infinities, sometimes "+Inf - +Inf"
> > is known to be zero, provided that it is the exact same infinity (omega) that is
> > being subtracted.
> >
> > The two counter arguments for this solution as a fix for PR 100810, is that such
> > a pass doesn't yet exist, and it still (probably) falls foul of C/C++'s undefined
> > behaviour from use of an uninitialized automatic variable.  From an engineering
> > perspective, it's a lot of effort to support poorly written code.
> >
> > Quick question to the language lawyers:
> >
> > int foo()
> > {
> >   int undef;
> >   return undef ^ undef;
> > }
> >
> > int bar()
> > {
> >   volatile int undef;
> >   return undef ^ undef;
> > }
> >
> > Do the above functions invoke undefined behaviour?
>
> Yes.  C17 6.3.2.1 says so.  Note that it has a strange restriction
> on being 'register' qualifiable, so taking the address of 'undef'
> in an unrelated stmt would make it not undefined?  Whenever the
> language spec makes the use not undefined the argument is
> that the abstract machine would for
>
>   int bar()
>   {
>     int undef;
>     int tem = undef;
>     return tem ^ tem;
>   }
>
> cause 'undef' to have a single evaluation and thus 'tem' have
> a single consistent value.  In GCC we'd have propagated those out,
> resulting in two evaluations (not sure if a SSA use can be considered
> an "evaluation" or whether the non-existing(!) single definition is the sole
> evaluation)
>
> > The approach taken by the proposed patch is that it's the pass/transformation that
> > introduces more undefined behaviour than was present in the original code, that is
> > at fault.  Even if later passes, decided not to take advantage of UB, is there any benefit
> > for replacing an induction variable with a well-defined value (evolution) and substituting
> > it with one that depends upon indefinite values.  I'd argue the correct fix is to go the other
> > way, and attempt to reduce the instances of undefined behaviour.
> >
> > So transform
> >
> >         int undef;
> >         while (cond())
> >           undef++;
> >         ...
> >
> > which invokes UB on each iteration with:
> >
> >         int undef;
> >         unsigned int count = 0;
> >         while (cond())
> >           count++;
> >         undef += count;
> >         ...
> >
> > which now only invokes UB after the loop.
> >
> > Consider:
> >         int undef;
> >         int result = 0;
> >         while (cond())
> >           result ^= undef;
> >         return result;
> >
> > where the final value of result may be well-defined if the loop iterates
> > an even number of times.
> >
> >         int undef;
> >         int result = 0;
> >         while (cond())
> >           result ^= 1;
> >         return result ? undef : 0;
> >
> >
> > Has anyone proposed an alternate fix to PR middle-end/100810?
> > We can always revert my proposed fix (for this P1 regression), once IV opts
> > is able to confirm that it is safe to make the substitution that it is proposing.
>
> I do agree with the idea to not increase the number of UNDEF uses.  But the
> issue is that undefinedness as in what your patch would consider is brittle,
> it can be easily hidden in a function parameter and thus the bug will likely
> re-surface if you just hide the undefinedness from the pass in some way.
>
> So any place where we try to do sth when we see an UNDEF for _correctness_
> reasons should really be testing for must-definedness instead which I think
> is similarly infeasible and broken.
>
> So for the PR at hand it's actually ifcombine that turns the testcase from
> one not invoking undefined behavior first (all uses of 'i' are never executed)
> into one that does by turning
>
>     if (!b)
>       i &&d;
>
> into
>
>    _Bool tem = b == 0 & i != 0;
>    if (tem)
>      d;
>
> the loop body is indeed mangled by IVOPTs by replacing h with a
> computation based on the undefined i and that elimination is
> invalid exactly because for
>
>   # i_16 = PHI <i_24(D)(20), i_17(7)>
>   _41 = (unsigned int) i_24(D);
>   _51 = _28 + _41;
>   _22 = (unsigned int) i_16;
>   _48 = -_22;
>
> i_24(D) and i_16 do not cancel out on the edge from (20)
> where i_16 is defined as a copy from i_24(D) ... I do think that
> CCP is what really breaks things here - CCP assumes that
> each SSA use of an UNDEF value can yield a different value.
>
> "stabilizing" 'i's value by making it a parameter to the function
> (but still undefined) will make it work.
>
> That said, when we treat undefined as undefined we indeed
> have to be very careful about transforms we do (see signed
> overflow).  For undefs the important part is to handle them
> optimistically in PHIs, anything else is likely going to cause
> more problems than it helps.
>
> But - I'm not fully set on any option to take.  Btw, the bug
> is surely latent and it should be possible to craft a simpler
> testcase that also breaks on older GCC, thus P1 isn't
> appropriate here.
>
> I'll think a bit more tomorrow.

OK, so apart from trying to find a better balance of where
we exploit UNDEF I had a look at the IVOPTs patch again.

I think instead of attacking this at the cost side we should
avoid making IVs with SSA undefs candidates for USEs
that are not the original one.

One can do this - almost - by disregarding !IP_ORIGINAL
candidates with undef bases and making the IP_ORIGINAL
one not important (so it's not considered for other uses).
Almost - since we have the 'consider_all_candidates'
override.  So the attached patch works with --param
iv-consider-all-candidates-bound=1
only.

I suppose the iv_cand->important two-state flag could be
made an enum to be able to single out some from
all-candidates processing as well.

Interestingly for the testcase it also works to not allow any
candidate with the SSA UNDEF (the IV then gets expressed
with the h_lsm candidate), but in general the IV might not
be eliminatable and thus IVOPTs likely will crash if we tie
its hand that way.

So the idea is to tie IVOPTs hands so it will simply keep
the original IVs when they have UNDEF bases and not
use them to eliminate other IVs.  I think that should be
doable with the patch (apart from the all_candidates issue).

I've CCed Bin who knows the IVOPTs code best for an
opinion.

Thanks,
Richard.


> Thanks,
> Richard.
>
> > I'm curious to hear your (latest) thoughts.
> >
> > Thanks again for thinking about this.
> >
> > Best regards,
> > Roger
> > --
> >
> >

[-- Attachment #2: p --]
[-- Type: application/octet-stream, Size: 5359 bytes --]

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 4a498abe3b0..164d221a20c 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -3072,6 +3072,17 @@ get_loop_invariant_expr (struct ivopts_data *data, tree inv_expr)
   return *slot;
 }
 
+static tree
+find_ssa_undef (tree *tp, int *walk_subtrees, void *)
+{
+  if (TREE_CODE (*tp) == SSA_NAME
+      && ssa_undefined_value_p (*tp, false))
+    return *tp;
+  if (!EXPR_P (*tp))
+    *walk_subtrees = 0;
+  return NULL;
+}
+
 /* Adds a candidate BASE + STEP * i.  Important field is set to IMPORTANT and
    position to POS.  If USE is not NULL, the candidate is set as related to
    it.  If both BASE and STEP are NULL, we add a pseudocandidate for the
@@ -3099,6 +3110,13 @@ add_candidate_1 (struct ivopts_data *data, tree base, tree step, bool important,
   if (flag_keep_gc_roots_live && POINTER_TYPE_P (TREE_TYPE (base)))
     return NULL;
 
+  if (walk_tree (&base, find_ssa_undef, NULL, NULL))
+    {
+      if (pos != IP_ORIGINAL)
+	return NULL;
+      important = false;
+    }
+
   /* For non-original variables, make sure their values are computed in a type
      that does not invoke undefined behavior on overflows (since in general,
      we cannot prove that these induction variables are non-wrapping).  */
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 8aac733ac25..82b6c3fcdab 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2759,6 +2759,18 @@ jt_path_registry::cancel_invalid_paths (vec<jump_thread_edge *> &path)
   gcc_checking_assert (!path.is_empty ());
   edge entry = path[0]->e;
   edge exit = path[path.length () - 1]->e;
+
+  // The path should either start and end in the same loop or exit the
+  // loop it starts in but never enter a loop.  This also catches
+  // creating irreducible loops, not only rotation.
+  if (entry->src->loop_father != exit->dest->loop_father
+      && !flow_loop_nested_p (exit->src->loop_father,
+			      entry->dest->loop_father))
+    {
+      cancel_thread (&path, "Path rotates loop");
+      return true;
+    }
+
   bool seen_latch = false;
   int loops_crossed = 0;
   bool crossed_latch = false;
@@ -2810,7 +2822,7 @@ jt_path_registry::cancel_invalid_paths (vec<jump_thread_edge *> &path)
 
   // If we crossed a loop into an outer loop without crossing the
   // latch, this is just an early exit from the loop.
-  if (loops_crossed == 1
+  if (loops_crossed <= 2
       && !crossed_latch
       && flow_loop_nested_p (exit->dest->loop_father, exit->src->loop_father))
     return false;
@@ -2829,16 +2841,6 @@ jt_path_registry::cancel_invalid_paths (vec<jump_thread_edge *> &path)
       cancel_thread (&path, "Path crosses loops");
       return true;
     }
-  // The path should either start and end in the same loop or exit the
-  // loop it starts in but never enter a loop.  This also catches
-  // creating irreducible loops, not only rotation.
-  if (entry->src->loop_father != exit->dest->loop_father
-      && !flow_loop_nested_p (exit->src->loop_father,
-			      entry->dest->loop_father))
-    {
-      cancel_thread (&path, "Path rotates loop");
-      return true;
-    }
   if (crossed_loop_header)
     {
       cancel_thread (&path, "Path crosses loop header but does not exit it");
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index e6c5bcdad36..9b54589e6b4 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -4344,32 +4344,30 @@ vect_recog_bool_pattern (vec_info *vinfo,
       /* Build a scalar type for the boolean result that when
          vectorized matches the vector type of the result in
 	 size and number of elements.  */
-      unsigned prec
-	= vector_element_size (tree_to_poly_uint64 (TYPE_SIZE (vectype)),
-			       TYPE_VECTOR_SUBPARTS (vectype));
+      tree truth_type = truth_type_for (vectype);
+      unsigned prec = TYPE_PRECISION (TREE_TYPE (truth_type));
 
       tree type
 	= build_nonstandard_integer_type (prec,
 					  TYPE_UNSIGNED (TREE_TYPE (var)));
-      if (get_vectype_for_scalar_type (vinfo, type) == NULL_TREE)
-	return NULL;
-
-      if (!check_bool_pattern (var, vinfo, bool_stmts))
-	return NULL;
 
-      rhs = adjust_bool_stmts (vinfo, bool_stmts, type, stmt_vinfo);
+      if (check_bool_pattern (var, vinfo, bool_stmts))
+	{
+	  rhs = adjust_bool_stmts (vinfo, bool_stmts, type, stmt_vinfo);
 
-      lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
-      pattern_stmt 
-	  = gimple_build_assign (lhs, COND_EXPR,
-				 build2 (NE_EXPR, boolean_type_node,
-					 rhs, build_int_cst (type, 0)),
-				 gimple_assign_rhs2 (last_stmt),
-				 gimple_assign_rhs3 (last_stmt));
-      *type_out = vectype;
-      vect_pattern_detected ("vect_recog_bool_pattern", last_stmt);
+	  lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
+	  pattern_stmt 
+	      = gimple_build_assign (lhs, COND_EXPR,
+				     build2 (NE_EXPR, boolean_type_node,
+					     rhs, build_int_cst (type, 0)),
+				     gimple_assign_rhs2 (last_stmt),
+				     gimple_assign_rhs3 (last_stmt));
+	  *type_out = vectype;
+	  vect_pattern_detected ("vect_recog_bool_pattern", last_stmt);
 
-      return pattern_stmt;
+	  return pattern_stmt;
+	}
+      return NULL;
     }
   else if (rhs_code == SSA_NAME
 	   && STMT_VINFO_DATA_REF (stmt_vinfo))

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-11-09  9:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-06 22:20 Some PINGs Roger Sayle
2021-11-07  0:36 ` Jeff Law
2021-11-07 14:07   ` Roger Sayle
2021-11-08  8:50 ` Richard Biener
2021-11-08 14:02   ` Roger Sayle
2021-11-08 15:40     ` Richard Biener
2021-11-09  9:18       ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).