public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
       [not found] <33fb3bb3-9548-4867-95f6-f7319bde2270@AMS1EPF00000043.eurprd04.prod.outlook.com>
@ 2023-11-09  9:09 ` Tamar Christina
  2023-11-09  9:23   ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Tamar Christina @ 2023-11-09  9:09 UTC (permalink / raw)
  To: Richard Biener, gcc-patches

>  	  guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
>  	  edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> -	  guard_to = split_edge (epilog_e);
> +	  guard_to = epilog_e->dest;
>  	  guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
>  					   skip_vector ? anchor : guard_bb,
>  					   prob_epilog.invert (),
> @@ -3443,8 +3229,30 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
> niters, tree nitersm1,
>  	  if (vect_epilogues)
>  	    epilogue_vinfo->skip_this_loop_edge = guard_e;
>  	  edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo);
> -	  slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv,
> guard_e,
> -					      epilog_e);
> +	  gphi_iterator gsi2 = gsi_start_phis (main_iv->dest);
> +	  for (gphi_iterator gsi = gsi_start_phis (guard_to);
> +	       !gsi_end_p (gsi); gsi_next (&gsi))
> +	    {
> +	      /* We are expecting all of the PHIs we have on epilog_e
> +		 to be also on the main loop exit.  But sometimes
> +		 a stray virtual definition can appear at epilog_e
> +		 which we can then take as the same on all exits,
> +		 we've removed the LC SSA PHI on the main exit before
> +		 so we wouldn't need to create a loop PHI for it.  */
> +	      if (virtual_operand_p (gimple_phi_result (*gsi))
> +		  && (gsi_end_p (gsi2)
> +		      || !virtual_operand_p (gimple_phi_result (*gsi2))))
> +		add_phi_arg (*gsi,
> +			     gimple_phi_arg_def_from_edge (*gsi, epilog_e),
> +			     guard_e, UNKNOWN_LOCATION);
> +	      else
> +		{
> +		  add_phi_arg (*gsi, gimple_phi_result (*gsi2), guard_e,
> +			       UNKNOWN_LOCATION);
> +		  gsi_next (&gsi2);
> +		}
> +	    }
> +

I've been having some trouble incorporating this change into the early break work.
My understanding is that here you've removed the lookup that find_guard did
and are assuming that the order between the PHI nodes between loop->exit
and epilog->exit are the same - sporadic virtual operands.

But the loop->exit for early break has to materialize all PHI nodes from the main
loop into the epilog loop since we need them to restart the scalar loop iteration.

This means that the number of PHI nodes between the first loop and the second
Loop are not the same, so we end up mis-linking phi nodes.  i.e. consider this loop

https://gist.github.com/Mistuke/65d476b18f991772fdec159a09b81869

which now goes wrong (throw that in a dotgraph viewer).

I'm struggling to figure out how to handle this without doing a lookup.

Any advice?

Thanks,
Tamar


>  	  /* Only need to handle basic block before epilog loop if it's not
>  	     the guard_bb, which is the case when skip_vector is true.  */
>  	  if (guard_bb != bb_before_epilog)
> --
> 2.35.3

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
  2023-11-09  9:09 ` [PATCH] tree-optimization/111950 - vectorizer loop copying Tamar Christina
@ 2023-11-09  9:23   ` Richard Biener
  2023-11-09 10:26     ` Tamar Christina
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2023-11-09  9:23 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches

On Thu, 9 Nov 2023, Tamar Christina wrote:

> >  	  guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> >  	  edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> > -	  guard_to = split_edge (epilog_e);
> > +	  guard_to = epilog_e->dest;
> >  	  guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
> >  					   skip_vector ? anchor : guard_bb,
> >  					   prob_epilog.invert (),
> > @@ -3443,8 +3229,30 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
> > niters, tree nitersm1,
> >  	  if (vect_epilogues)
> >  	    epilogue_vinfo->skip_this_loop_edge = guard_e;
> >  	  edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > -	  slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv,
> > guard_e,
> > -					      epilog_e);
> > +	  gphi_iterator gsi2 = gsi_start_phis (main_iv->dest);
> > +	  for (gphi_iterator gsi = gsi_start_phis (guard_to);
> > +	       !gsi_end_p (gsi); gsi_next (&gsi))
> > +	    {
> > +	      /* We are expecting all of the PHIs we have on epilog_e
> > +		 to be also on the main loop exit.  But sometimes
> > +		 a stray virtual definition can appear at epilog_e
> > +		 which we can then take as the same on all exits,
> > +		 we've removed the LC SSA PHI on the main exit before
> > +		 so we wouldn't need to create a loop PHI for it.  */
> > +	      if (virtual_operand_p (gimple_phi_result (*gsi))
> > +		  && (gsi_end_p (gsi2)
> > +		      || !virtual_operand_p (gimple_phi_result (*gsi2))))
> > +		add_phi_arg (*gsi,
> > +			     gimple_phi_arg_def_from_edge (*gsi, epilog_e),
> > +			     guard_e, UNKNOWN_LOCATION);
> > +	      else
> > +		{
> > +		  add_phi_arg (*gsi, gimple_phi_result (*gsi2), guard_e,
> > +			       UNKNOWN_LOCATION);
> > +		  gsi_next (&gsi2);
> > +		}
> > +	    }
> > +
> 
> I've been having some trouble incorporating this change into the early break work.
> My understanding is that here you've removed the lookup that find_guard did
> and are assuming that the order between the PHI nodes between loop->exit
> and epilog->exit are the same - sporadic virtual operands.
> 
> But the loop->exit for early break has to materialize all PHI nodes from the main
> loop into the epilog loop since we need them to restart the scalar loop iteration.
> 
> This means that the number of PHI nodes between the first loop and the second
> Loop are not the same, so we end up mis-linking phi nodes.  i.e. consider this loop
> 
> https://gist.github.com/Mistuke/65d476b18f991772fdec159a09b81869

I don't see any multi-exits here?  I think you need exactly the same
PHIs you need for the branch to the epilogue, no?

If you can point me to a testcase that fails on your branch I can
try to have a look.

> which now goes wrong (throw that in a dotgraph viewer).
> 
> I'm struggling to figure out how to handle this without doing a lookup.
> 
> Any advice?
> 
> Thanks,
> Tamar
> 
> 
> >  	  /* Only need to handle basic block before epilog loop if it's not
> >  	     the guard_bb, which is the case when skip_vector is true.  */
> >  	  if (guard_bb != bb_before_epilog)
> > --
> > 2.35.3
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
  2023-11-09  9:23   ` Richard Biener
@ 2023-11-09 10:26     ` Tamar Christina
  2023-11-09 11:54       ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Tamar Christina @ 2023-11-09 10:26 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Thursday, November 9, 2023 9:24 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
> 
> On Thu, 9 Nov 2023, Tamar Christina wrote:
> 
> > >  	  guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > >  	  edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> > > -	  guard_to = split_edge (epilog_e);
> > > +	  guard_to = epilog_e->dest;
> > >  	  guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
> > >  					   skip_vector ? anchor : guard_bb,
> > >  					   prob_epilog.invert (),
> > > @@ -3443,8 +3229,30 @@ vect_do_peeling (loop_vec_info loop_vinfo,
> > > tree niters, tree nitersm1,
> > >  	  if (vect_epilogues)
> > >  	    epilogue_vinfo->skip_this_loop_edge = guard_e;
> > >  	  edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > -	  slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv,
> > > guard_e,
> > > -					      epilog_e);
> > > +	  gphi_iterator gsi2 = gsi_start_phis (main_iv->dest);
> > > +	  for (gphi_iterator gsi = gsi_start_phis (guard_to);
> > > +	       !gsi_end_p (gsi); gsi_next (&gsi))
> > > +	    {
> > > +	      /* We are expecting all of the PHIs we have on epilog_e
> > > +		 to be also on the main loop exit.  But sometimes
> > > +		 a stray virtual definition can appear at epilog_e
> > > +		 which we can then take as the same on all exits,
> > > +		 we've removed the LC SSA PHI on the main exit before
> > > +		 so we wouldn't need to create a loop PHI for it.  */
> > > +	      if (virtual_operand_p (gimple_phi_result (*gsi))
> > > +		  && (gsi_end_p (gsi2)
> > > +		      || !virtual_operand_p (gimple_phi_result (*gsi2))))
> > > +		add_phi_arg (*gsi,
> > > +			     gimple_phi_arg_def_from_edge (*gsi, epilog_e),
> > > +			     guard_e, UNKNOWN_LOCATION);
> > > +	      else
> > > +		{
> > > +		  add_phi_arg (*gsi, gimple_phi_result (*gsi2), guard_e,
> > > +			       UNKNOWN_LOCATION);
> > > +		  gsi_next (&gsi2);
> > > +		}
> > > +	    }
> > > +
> >
> > I've been having some trouble incorporating this change into the early break
> work.
> > My understanding is that here you've removed the lookup that
> > find_guard did and are assuming that the order between the PHI nodes
> > between loop->exit and epilog->exit are the same - sporadic virtual
> operands.
> >
> > But the loop->exit for early break has to materialize all PHI nodes
> > from the main loop into the epilog loop since we need them to restart the
> scalar loop iteration.
> >
> > This means that the number of PHI nodes between the first loop and the
> > second Loop are not the same, so we end up mis-linking phi nodes.
> > i.e. consider this loop
> >
> > https://gist.github.com/Mistuke/65d476b18f991772fdec159a09b81869
> 
> I don't see any multi-exits here?  I think you need exactly the same PHIs you
> need for the branch to the epilogue, no?
> 

Ah it's a failing testcase but not one with an early break,

> If you can point me to a testcase that fails on your branch I can try to have a
> look.

I've updated the branch refs/users/tnfchris/heads/gcc-14-early-break

Quite a few tests fail, a simple one is vect-early-break_5.c and vect-early-break_20.c

But what you just said above makes me wonder.. at the moment before we have
differening amount because we require to have the loop counters and IVs as PHI nodes
such that vect_update_ivs_after_vectorizer can thread them through correctly as it
searches for PHI nodes.  However for the epilog exit, those that are not live are not
needed.  This is why we get different counts.

Maybe.. the solution is that I need to do the same thing as vectorizable_live_operations
In that when vect_update_ivs_after_vectorizer is done I should either remove the PHI
nodes or turn them into simple assignments.  Since they're always single value.

Looking at a few examples that seems like it would fix the issue.. Does that sound right to you?

Regards,
Tamar
> 
> > which now goes wrong (throw that in a dotgraph viewer).
> >
> > I'm struggling to figure out how to handle this without doing a lookup.
> >
> > Any advice?
> >
> > Thanks,
> > Tamar
> >
> >
> > >  	  /* Only need to handle basic block before epilog loop if it's not
> > >  	     the guard_bb, which is the case when skip_vector is true.  */
> > >  	  if (guard_bb != bb_before_epilog)
> > > --
> > > 2.35.3
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
  2023-11-09 10:26     ` Tamar Christina
@ 2023-11-09 11:54       ` Richard Biener
  2023-11-09 16:22         ` Tamar Christina
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2023-11-09 11:54 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches

On Thu, 9 Nov 2023, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener <rguenther@suse.de>
> > Sent: Thursday, November 9, 2023 9:24 AM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
> > 
> > On Thu, 9 Nov 2023, Tamar Christina wrote:
> > 
> > > >  	  guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > > >  	  edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> > > > -	  guard_to = split_edge (epilog_e);
> > > > +	  guard_to = epilog_e->dest;
> > > >  	  guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
> > > >  					   skip_vector ? anchor : guard_bb,
> > > >  					   prob_epilog.invert (),
> > > > @@ -3443,8 +3229,30 @@ vect_do_peeling (loop_vec_info loop_vinfo,
> > > > tree niters, tree nitersm1,
> > > >  	  if (vect_epilogues)
> > > >  	    epilogue_vinfo->skip_this_loop_edge = guard_e;
> > > >  	  edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > > -	  slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv,
> > > > guard_e,
> > > > -					      epilog_e);
> > > > +	  gphi_iterator gsi2 = gsi_start_phis (main_iv->dest);
> > > > +	  for (gphi_iterator gsi = gsi_start_phis (guard_to);
> > > > +	       !gsi_end_p (gsi); gsi_next (&gsi))
> > > > +	    {
> > > > +	      /* We are expecting all of the PHIs we have on epilog_e
> > > > +		 to be also on the main loop exit.  But sometimes
> > > > +		 a stray virtual definition can appear at epilog_e
> > > > +		 which we can then take as the same on all exits,
> > > > +		 we've removed the LC SSA PHI on the main exit before
> > > > +		 so we wouldn't need to create a loop PHI for it.  */
> > > > +	      if (virtual_operand_p (gimple_phi_result (*gsi))
> > > > +		  && (gsi_end_p (gsi2)
> > > > +		      || !virtual_operand_p (gimple_phi_result (*gsi2))))
> > > > +		add_phi_arg (*gsi,
> > > > +			     gimple_phi_arg_def_from_edge (*gsi, epilog_e),
> > > > +			     guard_e, UNKNOWN_LOCATION);
> > > > +	      else
> > > > +		{
> > > > +		  add_phi_arg (*gsi, gimple_phi_result (*gsi2), guard_e,
> > > > +			       UNKNOWN_LOCATION);
> > > > +		  gsi_next (&gsi2);
> > > > +		}
> > > > +	    }
> > > > +
> > >
> > > I've been having some trouble incorporating this change into the early break
> > work.
> > > My understanding is that here you've removed the lookup that
> > > find_guard did and are assuming that the order between the PHI nodes
> > > between loop->exit and epilog->exit are the same - sporadic virtual
> > operands.
> > >
> > > But the loop->exit for early break has to materialize all PHI nodes
> > > from the main loop into the epilog loop since we need them to restart the
> > scalar loop iteration.
> > >
> > > This means that the number of PHI nodes between the first loop and the
> > > second Loop are not the same, so we end up mis-linking phi nodes.
> > > i.e. consider this loop
> > >
> > > https://gist.github.com/Mistuke/65d476b18f991772fdec159a09b81869
> > 
> > I don't see any multi-exits here?  I think you need exactly the same PHIs you
> > need for the branch to the epilogue, no?
> > 
> 
> Ah it's a failing testcase but not one with an early break,
> 
> > If you can point me to a testcase that fails on your branch I can try to have a
> > look.
> 
> I've updated the branch refs/users/tnfchris/heads/gcc-14-early-break
> 
> Quite a few tests fail, a simple one is vect-early-break_5.c and vect-early-break_20.c
> 
> But what you just said above makes me wonder.. at the moment before we have
> differening amount because we require to have the loop counters and IVs as PHI nodes
> such that vect_update_ivs_after_vectorizer can thread them through correctly as it
> searches for PHI nodes.  However for the epilog exit, those that are not live are not
> needed.  This is why we get different counts.
> 
> Maybe.. the solution is that I need to do the same thing as vectorizable_live_operations
> In that when vect_update_ivs_after_vectorizer is done I should either remove the PHI
> nodes or turn them into simple assignments.  Since they're always single value.
> 
> Looking at a few examples that seems like it would fix the issue.. Does that sound right to you?

Without diving deep into it I can't say for sure.  I'd say any
complex mangling of things should happen after we've set up the basic
loop structure with prologue, epilogue end exit edges (I've hesitated
with that virtual PHIs, thinking we should eventually delay removing it
until we're ready with all the copying for example).

Maybe vect_update_ivs_after_vectorizer is a similar thing.

Richard.

> Regards,
> Tamar
> > 
> > > which now goes wrong (throw that in a dotgraph viewer).
> > >
> > > I'm struggling to figure out how to handle this without doing a lookup.
> > >
> > > Any advice?
> > >
> > > Thanks,
> > > Tamar
> > >
> > >
> > > >  	  /* Only need to handle basic block before epilog loop if it's not
> > > >  	     the guard_bb, which is the case when skip_vector is true.  */
> > > >  	  if (guard_bb != bb_before_epilog)
> > > > --
> > > > 2.35.3
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
  2023-11-09 11:54       ` Richard Biener
@ 2023-11-09 16:22         ` Tamar Christina
  2023-11-10 10:21           ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Tamar Christina @ 2023-11-09 16:22 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Thursday, November 9, 2023 11:54 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
> 
> On Thu, 9 Nov 2023, Tamar Christina wrote:
> 
> > > -----Original Message-----
> > > From: Richard Biener <rguenther@suse.de>
> > > Sent: Thursday, November 9, 2023 9:24 AM
> > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org
> > > Subject: RE: [PATCH] tree-optimization/111950 - vectorizer loop
> > > copying
> > >
> > > On Thu, 9 Nov 2023, Tamar Christina wrote:
> > >
> > > > >  	  guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > > > >  	  edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> > > > > -	  guard_to = split_edge (epilog_e);
> > > > > +	  guard_to = epilog_e->dest;
> > > > >  	  guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
> > > > >  					   skip_vector ? anchor : guard_bb,
> > > > >  					   prob_epilog.invert (),
> > > > > @@ -3443,8 +3229,30 @@ vect_do_peeling (loop_vec_info
> > > > > loop_vinfo, tree niters, tree nitersm1,
> > > > >  	  if (vect_epilogues)
> > > > >  	    epilogue_vinfo->skip_this_loop_edge = guard_e;
> > > > >  	  edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > > > -	  slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv,
> > > > > guard_e,
> > > > > -					      epilog_e);
> > > > > +	  gphi_iterator gsi2 = gsi_start_phis (main_iv->dest);
> > > > > +	  for (gphi_iterator gsi = gsi_start_phis (guard_to);
> > > > > +	       !gsi_end_p (gsi); gsi_next (&gsi))
> > > > > +	    {
> > > > > +	      /* We are expecting all of the PHIs we have on epilog_e
> > > > > +		 to be also on the main loop exit.  But sometimes
> > > > > +		 a stray virtual definition can appear at epilog_e
> > > > > +		 which we can then take as the same on all exits,
> > > > > +		 we've removed the LC SSA PHI on the main exit before
> > > > > +		 so we wouldn't need to create a loop PHI for it.  */
> > > > > +	      if (virtual_operand_p (gimple_phi_result (*gsi))
> > > > > +		  && (gsi_end_p (gsi2)
> > > > > +		      || !virtual_operand_p (gimple_phi_result (*gsi2))))
> > > > > +		add_phi_arg (*gsi,
> > > > > +			     gimple_phi_arg_def_from_edge (*gsi,
> epilog_e),
> > > > > +			     guard_e, UNKNOWN_LOCATION);
> > > > > +	      else
> > > > > +		{
> > > > > +		  add_phi_arg (*gsi, gimple_phi_result (*gsi2),
> guard_e,
> > > > > +			       UNKNOWN_LOCATION);
> > > > > +		  gsi_next (&gsi2);
> > > > > +		}
> > > > > +	    }
> > > > > +
> > > >
> > > > I've been having some trouble incorporating this change into the
> > > > early break
> > > work.
> > > > My understanding is that here you've removed the lookup that
> > > > find_guard did and are assuming that the order between the PHI
> > > > nodes between loop->exit and epilog->exit are the same - sporadic
> > > > virtual
> > > operands.
> > > >
> > > > But the loop->exit for early break has to materialize all PHI
> > > > nodes from the main loop into the epilog loop since we need them
> > > > to restart the
> > > scalar loop iteration.
> > > >
> > > > This means that the number of PHI nodes between the first loop and
> > > > the second Loop are not the same, so we end up mis-linking phi nodes.
> > > > i.e. consider this loop
> > > >
> > > >
> https://gist.github.com/Mistuke/65d476b18f991772fdec159a09b81869
> > >
> > > I don't see any multi-exits here?  I think you need exactly the same
> > > PHIs you need for the branch to the epilogue, no?
> > >
> >
> > Ah it's a failing testcase but not one with an early break,
> >
> > > If you can point me to a testcase that fails on your branch I can
> > > try to have a look.
> >
> > I've updated the branch refs/users/tnfchris/heads/gcc-14-early-break
> >
> > Quite a few tests fail, a simple one is vect-early-break_5.c and
> > vect-early-break_20.c
> >
> > But what you just said above makes me wonder.. at the moment before we
> > have differening amount because we require to have the loop counters
> > and IVs as PHI nodes such that vect_update_ivs_after_vectorizer can
> > thread them through correctly as it searches for PHI nodes.  However
> > for the epilog exit, those that are not live are not needed.  This is why we get
> different counts.
> >
> > Maybe.. the solution is that I need to do the same thing as
> > vectorizable_live_operations In that when
> > vect_update_ivs_after_vectorizer is done I should either remove the PHI
> nodes or turn them into simple assignments.  Since they're always single value.
> >
> > Looking at a few examples that seems like it would fix the issue.. Does that
> sound right to you?
> 
> Without diving deep into it I can't say for sure.  I'd say any complex mangling
> of things should happen after we've set up the basic loop structure with
> prologue, epilogue end exit edges (I've hesitated with that virtual PHIs,
> thinking we should eventually delay removing it until we're ready with all the
> copying for example).
> 

Ok, thank you, the hint you gave:

> I think you need exactly the same
> > > PHIs you need for the branch to the epilogue, no?

Made me realize what the issue is.  I realize now that your change Is essentially
assuming that the main loop exit and the epilog exit contain the same number
of PHI nodes.  And indeed for the main exit there's no reason why they can't
since all the phi nodes are singular.  We don't need them.  I was creating them
because vect_update_ivs_after_vectorizer normally searches for PHI nodes,

but it searches for PHI nodes on the merge block.  In the case of a single exit
that means that there's nothing to update for the missing PHI nodes since
peeling has already done the right thing.

There's no need to update the IVs because in the guard block they'll be update
based on niters instead.   So the PHIs are not needed.

For multiple exits the connection was already done by peeling when it maintained
LCSSA.  So the correct fix is just to for the main exit not create the unneeded
singular PHI nodes to begin with.  The order is then maintained by redirect_branch
and flush. Then it all works and no change is needed elsewhere.

This also allows me to simplify the code for peeling a bit.   You'll get the updated patch
series with all the comments made so far addressed on Tuesday.

Thanks for the hint!

Regards,
Tamar
> Maybe vect_update_ivs_after_vectorizer is a similar thing.
> 
> Richard.
> 
> > Regards,
> > Tamar
> > >
> > > > which now goes wrong (throw that in a dotgraph viewer).
> > > >
> > > > I'm struggling to figure out how to handle this without doing a lookup.
> > > >
> > > > Any advice?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > >
> > > > >  	  /* Only need to handle basic block before epilog loop if it's not
> > > > >  	     the guard_bb, which is the case when skip_vector is true.  */
> > > > >  	  if (guard_bb != bb_before_epilog)
> > > > > --
> > > > > 2.35.3
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> > > Nuernberg, Germany;
> > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > > Nuernberg)
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
  2023-11-09 16:22         ` Tamar Christina
@ 2023-11-10 10:21           ` Richard Biener
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2023-11-10 10:21 UTC (permalink / raw)
  To: Tamar Christina; +Cc: gcc-patches

On Thu, 9 Nov 2023, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener <rguenther@suse.de>
> > Sent: Thursday, November 9, 2023 11:54 AM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: RE: [PATCH] tree-optimization/111950 - vectorizer loop copying
> > 
> > On Thu, 9 Nov 2023, Tamar Christina wrote:
> > 
> > > > -----Original Message-----
> > > > From: Richard Biener <rguenther@suse.de>
> > > > Sent: Thursday, November 9, 2023 9:24 AM
> > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > > Cc: gcc-patches@gcc.gnu.org
> > > > Subject: RE: [PATCH] tree-optimization/111950 - vectorizer loop
> > > > copying
> > > >
> > > > On Thu, 9 Nov 2023, Tamar Christina wrote:
> > > >
> > > > > >  	  guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > > > > >  	  edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> > > > > > -	  guard_to = split_edge (epilog_e);
> > > > > > +	  guard_to = epilog_e->dest;
> > > > > >  	  guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
> > > > > >  					   skip_vector ? anchor : guard_bb,
> > > > > >  					   prob_epilog.invert (),
> > > > > > @@ -3443,8 +3229,30 @@ vect_do_peeling (loop_vec_info
> > > > > > loop_vinfo, tree niters, tree nitersm1,
> > > > > >  	  if (vect_epilogues)
> > > > > >  	    epilogue_vinfo->skip_this_loop_edge = guard_e;
> > > > > >  	  edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > > > > -	  slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv,
> > > > > > guard_e,
> > > > > > -					      epilog_e);
> > > > > > +	  gphi_iterator gsi2 = gsi_start_phis (main_iv->dest);
> > > > > > +	  for (gphi_iterator gsi = gsi_start_phis (guard_to);
> > > > > > +	       !gsi_end_p (gsi); gsi_next (&gsi))
> > > > > > +	    {
> > > > > > +	      /* We are expecting all of the PHIs we have on epilog_e
> > > > > > +		 to be also on the main loop exit.  But sometimes
> > > > > > +		 a stray virtual definition can appear at epilog_e
> > > > > > +		 which we can then take as the same on all exits,
> > > > > > +		 we've removed the LC SSA PHI on the main exit before
> > > > > > +		 so we wouldn't need to create a loop PHI for it.  */
> > > > > > +	      if (virtual_operand_p (gimple_phi_result (*gsi))
> > > > > > +		  && (gsi_end_p (gsi2)
> > > > > > +		      || !virtual_operand_p (gimple_phi_result (*gsi2))))
> > > > > > +		add_phi_arg (*gsi,
> > > > > > +			     gimple_phi_arg_def_from_edge (*gsi,
> > epilog_e),
> > > > > > +			     guard_e, UNKNOWN_LOCATION);
> > > > > > +	      else
> > > > > > +		{
> > > > > > +		  add_phi_arg (*gsi, gimple_phi_result (*gsi2),
> > guard_e,
> > > > > > +			       UNKNOWN_LOCATION);
> > > > > > +		  gsi_next (&gsi2);
> > > > > > +		}
> > > > > > +	    }
> > > > > > +
> > > > >
> > > > > I've been having some trouble incorporating this change into the
> > > > > early break
> > > > work.
> > > > > My understanding is that here you've removed the lookup that
> > > > > find_guard did and are assuming that the order between the PHI
> > > > > nodes between loop->exit and epilog->exit are the same - sporadic
> > > > > virtual
> > > > operands.
> > > > >
> > > > > But the loop->exit for early break has to materialize all PHI
> > > > > nodes from the main loop into the epilog loop since we need them
> > > > > to restart the
> > > > scalar loop iteration.
> > > > >
> > > > > This means that the number of PHI nodes between the first loop and
> > > > > the second Loop are not the same, so we end up mis-linking phi nodes.
> > > > > i.e. consider this loop
> > > > >
> > > > >
> > https://gist.github.com/Mistuke/65d476b18f991772fdec159a09b81869
> > > >
> > > > I don't see any multi-exits here?  I think you need exactly the same
> > > > PHIs you need for the branch to the epilogue, no?
> > > >
> > >
> > > Ah it's a failing testcase but not one with an early break,
> > >
> > > > If you can point me to a testcase that fails on your branch I can
> > > > try to have a look.
> > >
> > > I've updated the branch refs/users/tnfchris/heads/gcc-14-early-break
> > >
> > > Quite a few tests fail, a simple one is vect-early-break_5.c and
> > > vect-early-break_20.c
> > >
> > > But what you just said above makes me wonder.. at the moment before we
> > > have differening amount because we require to have the loop counters
> > > and IVs as PHI nodes such that vect_update_ivs_after_vectorizer can
> > > thread them through correctly as it searches for PHI nodes.  However
> > > for the epilog exit, those that are not live are not needed.  This is why we get
> > different counts.
> > >
> > > Maybe.. the solution is that I need to do the same thing as
> > > vectorizable_live_operations In that when
> > > vect_update_ivs_after_vectorizer is done I should either remove the PHI
> > nodes or turn them into simple assignments.  Since they're always single value.
> > >
> > > Looking at a few examples that seems like it would fix the issue.. Does that
> > sound right to you?
> > 
> > Without diving deep into it I can't say for sure.  I'd say any complex mangling
> > of things should happen after we've set up the basic loop structure with
> > prologue, epilogue end exit edges (I've hesitated with that virtual PHIs,
> > thinking we should eventually delay removing it until we're ready with all the
> > copying for example).
> > 
> 
> Ok, thank you, the hint you gave:
> 
> > I think you need exactly the same
> > > > PHIs you need for the branch to the epilogue, no?
> 
> Made me realize what the issue is.  I realize now that your change Is essentially
> assuming that the main loop exit and the epilog exit contain the same number
> of PHI nodes.  And indeed for the main exit there's no reason why they can't
> since all the phi nodes are singular.  We don't need them.  I was creating them
> because vect_update_ivs_after_vectorizer normally searches for PHI nodes,
> 
> but it searches for PHI nodes on the merge block.  In the case of a single exit
> that means that there's nothing to update for the missing PHI nodes since
> peeling has already done the right thing.
> 
> There's no need to update the IVs because in the guard block they'll be update
> based on niters instead.   So the PHIs are not needed.
> 
> For multiple exits the connection was already done by peeling when it maintained
> LCSSA.  So the correct fix is just to for the main exit not create the unneeded
> singular PHI nodes to begin with.  The order is then maintained by redirect_branch
> and flush. Then it all works and no change is needed elsewhere.
> 
> This also allows me to simplify the code for peeling a bit.   You'll get the updated patch
> series with all the comments made so far addressed on Tuesday.
> 
> Thanks for the hint!

You're welcome ;)  And yes, the motivation of my change was to simplify
this ugly piece of code ...

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] tree-optimization/111950 - vectorizer loop copying
@ 2023-11-06 13:14 Richard Biener
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2023-11-06 13:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: tamar.christina

The following simplifies LC-PHI arg population during epilog peeling,
thereby fixing the testcase in this PR.

Bootstrapped and tested on x86_64-unknown-linux-gnu, also built
SPEC CPU 2017 with and without LTO, pushed.

	PR tree-optimization/111950
	* tre-vect-loop-manip.cc (slpeel_duplicate_current_defs_from_edges):
	Remove.
	(find_guard_arg): Likewise.
	(slpeel_update_phi_nodes_for_guard2): Likewise.
	(slpeel_tree_duplicate_loop_to_edge_cfg): Remove calls to
	slpeel_duplicate_current_defs_from_edges, do not elide
	LC-PHIs for invariant values.
	(vect_do_peeling): Materialize PHI arguments for the edge
	around the epilog from the PHI defs of the main loop exit.

	* gcc.dg/torture/pr111950.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr111950.c |  16 ++
 gcc/tree-vect-loop-manip.cc             | 242 +++---------------------
 2 files changed, 41 insertions(+), 217 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr111950.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr111950.c b/gcc/testsuite/gcc.dg/torture/pr111950.c
new file mode 100644
index 00000000000..4eeffeb6827
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr111950.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-vectorize -fno-vect-cost-model" } */
+
+int a, b, d;
+int c[4];
+unsigned e;
+void f() {
+  char g;
+  for (; d; d++) {
+    g = 1;
+    for (; g >= 0; g--) {
+      e = b >= 2 || a >> b ?: a;
+      c[g] = e;
+    }
+  }
+}
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 43ca985c53c..b9161274ce4 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1392,58 +1392,6 @@ vect_set_loop_condition (class loop *loop, edge loop_e, loop_vec_info loop_vinfo
 		     (gimple *) cond_stmt);
 }
 
-/* Helper routine of slpeel_tree_duplicate_loop_to_edge_cfg.
-   For all PHI arguments in FROM->dest and TO->dest from those
-   edges ensure that TO->dest PHI arguments have current_def
-   to that in from.  */
-
-static void
-slpeel_duplicate_current_defs_from_edges (edge from, edge to)
-{
-  gimple_stmt_iterator gsi_from, gsi_to;
-
-  for (gsi_from = gsi_start_phis (from->dest),
-       gsi_to = gsi_start_phis (to->dest);
-       !gsi_end_p (gsi_from) && !gsi_end_p (gsi_to);)
-    {
-      gimple *from_phi = gsi_stmt (gsi_from);
-      gimple *to_phi = gsi_stmt (gsi_to);
-      tree from_arg = PHI_ARG_DEF_FROM_EDGE (from_phi, from);
-      tree to_arg = PHI_ARG_DEF_FROM_EDGE (to_phi, to);
-      if (virtual_operand_p (from_arg))
-	{
-	  gsi_next (&gsi_from);
-	  continue;
-	}
-      if (virtual_operand_p (to_arg))
-	{
-	  gsi_next (&gsi_to);
-	  continue;
-	}
-      if (TREE_CODE (from_arg) != SSA_NAME)
-	gcc_assert (operand_equal_p (from_arg, to_arg, 0));
-      else if (TREE_CODE (to_arg) == SSA_NAME
-	       && from_arg != to_arg)
-	{
-	  if (get_current_def (to_arg) == NULL_TREE)
-	    {
-	      gcc_assert (types_compatible_p (TREE_TYPE (to_arg),
-					      TREE_TYPE (get_current_def
-							   (from_arg))));
-	      set_current_def (to_arg, get_current_def (from_arg));
-	    }
-	}
-      gsi_next (&gsi_from);
-      gsi_next (&gsi_to);
-    }
-
-  gphi *from_phi = get_virtual_phi (from->dest);
-  gphi *to_phi = get_virtual_phi (to->dest);
-  if (from_phi)
-    set_current_def (PHI_ARG_DEF_FROM_EDGE (to_phi, to),
-		     get_current_def (PHI_ARG_DEF_FROM_EDGE (from_phi, from)));
-}
-
 /* Given LOOP this function generates a new copy of it and puts it
    on E which is either the entry or exit of LOOP.  If SCALAR_LOOP is
    non-NULL, assume LOOP and SCALAR_LOOP are equivalent and copy the
@@ -1577,21 +1525,6 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
 	  adjust_debug_stmts (orig_def, PHI_RESULT (gsi.phi ()), exit->dest);
       }
 
-  /* This condition happens when the loop has been versioned. e.g. due to ifcvt
-     versioning the loop.  */
-  if (scalar_loop != loop)
-    {
-      /* If we copied from SCALAR_LOOP rather than LOOP, SSA_NAMEs from
-	 SCALAR_LOOP will have current_def set to SSA_NAMEs in the new_loop,
-	 but LOOP will not.  slpeel_update_phi_nodes_for_guard{1,2} expects
-	 the LOOP SSA_NAMEs (on the exit edge and edge from latch to
-	 header) to have current_def set, so copy them over.  */
-      slpeel_duplicate_current_defs_from_edges (scalar_exit, exit);
-      slpeel_duplicate_current_defs_from_edges (EDGE_SUCC (scalar_loop->latch,
-							   0),
-						EDGE_SUCC (loop->latch, 0));
-    }
-
   auto loop_exits = get_loop_exit_edges (loop);
   auto_vec<basic_block> doms;
 
@@ -1655,18 +1588,6 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
 
 	  if (TREE_CODE (new_arg) != SSA_NAME)
 	    continue;
-	  /* If the PHI node dominates the loop then we shouldn't create
-	      a new LC-SSSA PHI for it in the intermediate block.  Unless the
-	      the loop has been versioned.  If it has then we need the PHI
-	      node such that later when the loop guard is added the original
-	      dominating PHI can be found.  */
-	  basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (new_arg));
-	  if (loop == scalar_loop
-	      && (!def_bb || !flow_bb_inside_loop_p (loop, def_bb)))
-	    {
-	      auto gsi = gsi_for_stmt (phi);
-	      remove_phi_node (&gsi, true);
-	    }
 	}
 
       /* Copy the current loop LC PHI nodes between the original loop exit
@@ -2711,40 +2632,6 @@ vect_gen_vector_loop_niters_mult_vf (loop_vec_info loop_vinfo,
   *niters_vector_mult_vf_ptr = niters_vector_mult_vf;
 }
 
-/* LCSSA_PHI is a lcssa phi of EPILOG loop which is copied from LOOP,
-   this function searches for the corresponding lcssa phi node in exit
-   bb of LOOP following the LCSSA_EDGE to the exit node.  If it is found,
-   return the phi result; otherwise return NULL.  */
-
-static tree
-find_guard_arg (class loop *loop ATTRIBUTE_UNUSED, const_edge loop_e,
-		class loop *epilog ATTRIBUTE_UNUSED, gphi *lcssa_phi,
-		int lcssa_edge = 0)
-{
-  gphi_iterator gsi;
-
-  for (gsi = gsi_start_phis (loop_e->dest); !gsi_end_p (gsi); gsi_next (&gsi))
-    {
-      gphi *phi = gsi.phi ();
-      /* Nested loops with multiple exits can have different no# phi node
-	arguments between the main loop and epilog as epilog falls to the
-	second loop.  */
-      if (gimple_phi_num_args (phi) > loop_e->dest_idx)
-	{
-	 tree var = PHI_ARG_DEF (phi, loop_e->dest_idx);
-	 if (TREE_CODE (var) != SSA_NAME)
-	    continue;
-	 tree def = get_current_def (var);
-	 if (!def)
-	   continue;
-	 if (operand_equal_p (def,
-			      PHI_ARG_DEF (lcssa_phi, lcssa_edge), 0))
-	   return PHI_RESULT (phi);
-	}
-    }
-  return NULL_TREE;
-}
-
 /* Function slpeel_add_loop_guard adds guard skipping from the beginning
    of SKIP_LOOP to the beginning of UPDATE_LOOP.  GUARD_EDGE and MERGE_EDGE
    are two pred edges of the merge point before UPDATE_LOOP.  The two loops
@@ -2827,107 +2714,6 @@ slpeel_update_phi_nodes_for_guard1 (class loop *skip_loop,
     }
 }
 
-/* LOOP and EPILOG are two consecutive loops in CFG connected by LOOP_EXIT edge
-   and EPILOG is copied from LOOP.  Function slpeel_add_loop_guard adds guard
-   skipping from a point between the two loops to the end of EPILOG.  Edges
-   GUARD_EDGE and MERGE_EDGE are the two pred edges of merge_bb at the end of
-   EPILOG.  The CFG looks like:
-
-     loop:
-       header_a:
-	 i_1 = PHI<i_0, i_2>;
-	 ...
-	 i_2 = i_1 + 1;
-	 if (cond_a)
-	   goto latch_a;
-	 else
-	   goto exit_a;
-       latch_a:
-	 goto header_a;
-
-       exit_a:
-
-       guard_bb:
-	 if (cond)
-	   goto merge_bb;
-	 else
-	   goto epilog_loop;
-
-       ;; fall_through_bb
-
-     epilog_loop:
-       header_b:
-	 i_3 = PHI<i_2, i_4>;
-	 ...
-	 i_4 = i_3 + 1;
-	 if (cond_b)
-	   goto latch_b;
-	 else
-	   goto merge_bb;
-       latch_b:
-	 goto header_b;
-
-       merge_bb:
-	 ; PHI node (i_y = PHI<i_2, i_4>) to be created at merge point.
-
-       exit_bb:
-	 i_x = PHI<i_4>;  ;Use of i_4 to be replaced with i_y in merge_bb.
-
-   For each name used out side EPILOG (i.e - for each name that has a lcssa
-   phi in exit_bb) we create a new PHI in merge_bb.  The new PHI has two
-   args corresponding to GUARD_EDGE and MERGE_EDGE.  Arg for MERGE_EDGE is
-   the arg of the original PHI in exit_bb, arg for GUARD_EDGE is defined
-   by LOOP and is found in the exit bb of LOOP.  Arg of the original PHI
-   in exit_bb will also be updated.  */
-
-static void
-slpeel_update_phi_nodes_for_guard2 (class loop *loop, class loop *epilog,
-				    const_edge loop_exit,
-				    edge guard_edge, edge merge_edge)
-{
-  gphi_iterator gsi;
-  basic_block merge_bb = guard_edge->dest;
-
-  gcc_assert (single_succ_p (merge_bb));
-  edge e = single_succ_edge (merge_bb);
-  basic_block exit_bb = e->dest;
-
-  for (gsi = gsi_start_phis (exit_bb); !gsi_end_p (gsi); gsi_next (&gsi))
-    {
-      gphi *update_phi = gsi.phi ();
-      tree old_arg = PHI_ARG_DEF (update_phi, e->dest_idx);
-
-      tree merge_arg = NULL_TREE;
-
-      /* If the old argument is a SSA_NAME use its current_def.  */
-      if (TREE_CODE (old_arg) == SSA_NAME)
-	merge_arg = get_current_def (old_arg);
-      /* If it's a constant or doesn't have a current_def, just use the old
-	 argument.  */
-      if (!merge_arg)
-	merge_arg = old_arg;
-
-      tree guard_arg = find_guard_arg (loop, loop_exit, epilog,
-				       update_phi, e->dest_idx);
-      /* If the var is live after loop but not a reduction, we simply
-	 use the old arg.  */
-      if (!guard_arg)
-	guard_arg = old_arg;
-
-      /* Create new phi node in MERGE_BB:  */
-      tree new_res = copy_ssa_name (PHI_RESULT (update_phi));
-      gphi *merge_phi = create_phi_node (new_res, merge_bb);
-
-      /* MERGE_BB has two incoming edges: GUARD_EDGE and MERGE_EDGE, Set
-	 the two PHI args in merge_phi for these edges.  */
-      add_phi_arg (merge_phi, merge_arg, merge_edge, UNKNOWN_LOCATION);
-      add_phi_arg (merge_phi, guard_arg, guard_edge, UNKNOWN_LOCATION);
-
-      /* Update the original phi in exit_bb.  */
-      adjust_phi_and_debug_stmts (update_phi, e, new_res);
-    }
-}
-
 /* LOOP_VINFO is an epilogue loop whose corresponding main loop can be skipped.
    Return a value that equals:
 
@@ -3435,7 +3221,7 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
 				    niters, niters_vector_mult_vf);
 	  guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
 	  edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
-	  guard_to = split_edge (epilog_e);
+	  guard_to = epilog_e->dest;
 	  guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
 					   skip_vector ? anchor : guard_bb,
 					   prob_epilog.invert (),
@@ -3443,8 +3229,30 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
 	  if (vect_epilogues)
 	    epilogue_vinfo->skip_this_loop_edge = guard_e;
 	  edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo);
-	  slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv, guard_e,
-					      epilog_e);
+	  gphi_iterator gsi2 = gsi_start_phis (main_iv->dest);
+	  for (gphi_iterator gsi = gsi_start_phis (guard_to);
+	       !gsi_end_p (gsi); gsi_next (&gsi))
+	    {
+	      /* We are expecting all of the PHIs we have on epilog_e
+		 to be also on the main loop exit.  But sometimes
+		 a stray virtual definition can appear at epilog_e
+		 which we can then take as the same on all exits,
+		 we've removed the LC SSA PHI on the main exit before
+		 so we wouldn't need to create a loop PHI for it.  */
+	      if (virtual_operand_p (gimple_phi_result (*gsi))
+		  && (gsi_end_p (gsi2)
+		      || !virtual_operand_p (gimple_phi_result (*gsi2))))
+		add_phi_arg (*gsi,
+			     gimple_phi_arg_def_from_edge (*gsi, epilog_e),
+			     guard_e, UNKNOWN_LOCATION);
+	      else
+		{
+		  add_phi_arg (*gsi, gimple_phi_result (*gsi2), guard_e,
+			       UNKNOWN_LOCATION);
+		  gsi_next (&gsi2);
+		}
+	    }
+
 	  /* Only need to handle basic block before epilog loop if it's not
 	     the guard_bb, which is the case when skip_vector is true.  */
 	  if (guard_bb != bb_before_epilog)
-- 
2.35.3

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-11-10 10:21 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <33fb3bb3-9548-4867-95f6-f7319bde2270@AMS1EPF00000043.eurprd04.prod.outlook.com>
2023-11-09  9:09 ` [PATCH] tree-optimization/111950 - vectorizer loop copying Tamar Christina
2023-11-09  9:23   ` Richard Biener
2023-11-09 10:26     ` Tamar Christina
2023-11-09 11:54       ` Richard Biener
2023-11-09 16:22         ` Tamar Christina
2023-11-10 10:21           ` Richard Biener
2023-11-06 13:14 Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).