public inbox for gcc-patches@gcc.gnu.org
* decremnt IV patch create fails on PowerPC
@ 2023-05-25 22:45 钟居哲
  2023-05-26  6:46 ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: 钟居哲 @ 2023-05-25 22:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford, rguenther, linkw

Yesterday's patch (decrementing IV support) has been approved:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619663.html

However, it causes failures on PowerPC:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971

I am really sorry for the inconvenience.

I wonder about the condition we discussed:
+  /* If we're vectorizing a loop that uses length "controls" and
+     can iterate more than once, we apply decrementing IV approach
+     in loop control.  */
+  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
+      && !LOOP_VINFO_LENS (loop_vinfo).is_empty ()
+      && LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) == 0
+      && !(LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+	   && known_le (LOOP_VINFO_INT_NITERS (loop_vinfo),
+			LOOP_VINFO_VECT_FACTOR (loop_vinfo))))
+    LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) = true;

This condition cannot disable the decrementing IV on PowerPC.
Should I add a target hook for it?
I could only bootstrap and regression-test the patch on x86.
I don't have an environment for testing PowerPC. I am really sorry.

Thanks.


juzhe.zhong@rivai.ai

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: decremnt IV patch create fails on PowerPC
  2023-05-25 22:45 decremnt IV patch create fails on PowerPC 钟居哲
@ 2023-05-26  6:46 ` Richard Biener
  2023-05-26  7:46   ` juzhe.zhong
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Biener @ 2023-05-26  6:46 UTC (permalink / raw)
  To: 钟居哲; +Cc: gcc-patches, richard.sandiford, linkw

On Fri, 26 May 2023, 钟居哲 wrote:

> Yesterday's patch (decrementing IV support) has been approved:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619663.html
>
> However, it causes failures on PowerPC:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971
>
> I am really sorry for the inconvenience.
>
> I wonder about the condition we discussed:
> +  /* If we're vectorizing a loop that uses length "controls" and
> +     can iterate more than once, we apply decrementing IV approach
> +     in loop control.  */
> +  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> +      && !LOOP_VINFO_LENS (loop_vinfo).is_empty ()
> +      && LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) == 0
> +      && !(LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> +	   && known_le (LOOP_VINFO_INT_NITERS (loop_vinfo),
> +			LOOP_VINFO_VECT_FACTOR (loop_vinfo))))
> +    LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) = true;
> 
> This condition cannot disable the decrementing IV on PowerPC.
> Should I add a target hook for it?

No.  I've put some analysis in the PR.  To me the question is
why (without that SELECT_VL case) we need a decrementing IV
_for the loop control_?  We could simply retain the original
incrementing IV for loop control and add the decrementing
IV for computing LEN in addition to that and leave IVOPTs
sorting out to eventually merge them (or not).
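
To illustrate the two-IV idea above, here is a minimal sketch in plain C
rather than vectorizer code. The names copy_chunk and copy_two_ivs are
hypothetical; copy_chunk merely stands in for a length-controlled
load/store pair. The incrementing IV i has a constant step and alone
controls the loop, while the decrementing IV remain only feeds LEN.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-controlled vector load/store pair:
   copies LEN elements starting at element offset OFF.  */
static void
copy_chunk (int *dst, const int *src, size_t off, size_t len)
{
  memcpy (dst + off, src + off, len * sizeof (int));
}

/* Copies N elements in chunks of at most VF and returns the number of
   loop iterations.  Assumes N + VF does not overflow size_t.  */
static size_t
copy_two_ivs (int *dst, const int *src, size_t n, size_t vf)
{
  assert (vf > 0);
  size_t remain = n;  /* Decrementing IV: only used to compute LEN.  */
  size_t iters = 0;

  /* Incrementing IV with a constant step: the only loop control.  */
  for (size_t i = 0; i < n; i += vf)
    {
      size_t len = remain < vf ? remain : vf;  /* LEN = MIN (vf, remain) */
      copy_chunk (dst, src, i, len);
      remain -= len;
      iters++;
    }
  return iters;
}
```

With n = 10 and vf = 4 the loop runs three times, with lengths 4, 4
and 2; the exit test never looks at remain.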

Alternatively avoid the variable decrement as I wrote in the
PR and do the exit test based on the previous IV value.

But as said, none of this will work for the SELECT_VL case; but
then its availability is something to key off, rather than a
new target hook?

> I could only bootstrap and regression-test the patch on x86.
> I don't have an environment for testing PowerPC. I am really sorry.

You can do some testing with a cross compiler, alternatively
there are powerpc machines in the GCC compile farm.

Richard.


* Re: Re: decremnt IV patch create fails on PowerPC
  2023-05-26  6:46 ` Richard Biener
@ 2023-05-26  7:46   ` juzhe.zhong
  2023-05-30  9:22     ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: juzhe.zhong @ 2023-05-26  7:46 UTC (permalink / raw)
  To: rguenther; +Cc: gcc-patches, richard.sandiford, linkw

Hi, Richi. Thanks for your analysis and help.

>> We could simply retain the original
>> incrementing IV for loop control and add the decrementing
>> IV for computing LEN in addition to that and leave IVOPTs
>> sorting out to eventually merge them (or not).

I am not sure how to do that. Could you give me more information?

I think I understand your concern: an IV with a variable step will make
IVOPTs fail.

I have seen a similar situation in LLVM (when applying a variable IV,
they failed to interleave the vectorized code). I am not sure whether the
cause is the same.

For RVV, we not only want the decrementing IV style in vectorization, but
also want to apply SELECT_VL in the single-rgroup case, which is the most
common one (LLVM also applies get_vector_length only for a single vector
length).

>>You can do some testing with a cross compiler, alternatively
>>there are powerpc machines in the GCC compile farm.

It seems that Power is OK with the decrementing IV, since most cases are improved.

I think Richard can help explain the decrementing IV more clearly.

Thanks


juzhe.zhong@rivai.ai
 

* Re: Re: decremnt IV patch create fails on PowerPC
  2023-05-26  7:46   ` juzhe.zhong
@ 2023-05-30  9:22     ` Richard Biener
  2023-05-30  9:26       ` juzhe.zhong
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Biener @ 2023-05-30  9:22 UTC (permalink / raw)
  To: juzhe.zhong; +Cc: gcc-patches, richard.sandiford, linkw

On Fri, 26 May 2023, juzhe.zhong@rivai.ai wrote:

> Hi, Richi. Thanks for your analysis and help.
> 
> >> We could simply retain the original
> >> incrementing IV for loop control and add the decrementing
> >> IV for computing LEN in addition to that and leave IVOPTs
> >> sorting out to eventually merge them (or not).
> 
> I am not sure how to do that. Could you give me more information?
>
> I think I understand your concern: an IV with a variable step will make
> IVOPTs fail.
>
> I have seen a similar situation in LLVM (when applying a variable IV,
> they failed to interleave the vectorized code). I am not sure whether the
> cause is the same.
>
> For RVV, we not only want the decrementing IV style in vectorization, but
> also want to apply SELECT_VL in the single-rgroup case, which is the most
> common one (LLVM also applies get_vector_length only for a single vector
> length).
>
> >>You can do some testing with a cross compiler, alternatively
> >>there are powerpc machines in the GCC compile farm.
> 
> It seems that Power is OK with the decrementing IV, since most cases are improved.

Well, but Power never will have SELECT_VL so at least for !SELECT_VL
targets you should avoid having an IV with variable decrement.  As
I said, it should be easy to rewrite the decrementing IV to use a constant
increment (when not using SELECT_VL) and to test the pre-decrement
value in the exit test.
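
As a rough illustration of why SELECT_VL implies a variable step, here
is a minimal sketch in plain C. select_vl_model is a hypothetical model,
not GCC or target code: it assumes a target that may balance the last
two iterations, which (as far as I understand the RVV vsetvli rules) is
a permitted choice when the remaining count lies between vf and 2 * vf.
The function names and the lens buffer are likewise made up for the
example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a SELECT_VL-style operation: the target picks
   the length, which need not be MIN (vf, remain).  */
static size_t
select_vl_model (size_t remain, size_t vf)
{
  if (remain >= 2 * vf)
    return vf;
  if (remain > vf)
    return (remain + 1) / 2;  /* Balance the last two iterations.  */
  return remain;
}

/* Runs a length-controlled loop over N elements, records each chosen
   length in LENS (up to MAX_LENS entries) and returns the iteration
   count.  The IV step is whatever the model returned.  */
static size_t
run_select_vl_loop (size_t n, size_t vf, size_t *lens, size_t max_lens)
{
  assert (n > 0 && vf > 0);
  size_t remain = n, iters = 0;
  do
    {
      size_t len = select_vl_model (remain, vf);
      if (iters < max_lens)
        lens[iters] = len;
      remain -= len;  /* Variable decrement: not a constant step.  */
      iters++;
    }
  while (remain != 0);
  return iters;
}
```

Under this model, n = 10 and vf = 4 give the lengths 4, 3, 3, so
subtracting a constant vf per iteration would no longer track the
number of elements actually processed.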

Richard.
 
> I think Richard can help explain the decrementing IV more clearly.
> 
> Thanks
> 
> 
> juzhe.zhong@rivai.ai
>  

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


* Re: Re: decremnt IV patch create fails on PowerPC
  2023-05-30  9:22     ` Richard Biener
@ 2023-05-30  9:26       ` juzhe.zhong
  2023-05-30  9:50         ` Richard Biener
  2023-05-30  9:51         ` Kewen.Lin
  0 siblings, 2 replies; 15+ messages in thread
From: juzhe.zhong @ 2023-05-30  9:26 UTC (permalink / raw)
  To: rguenther; +Cc: gcc-patches, richard.sandiford, linkw

Ok.

It seems that for this condition:

+  /* If we're vectorizing a loop that uses length "controls" and
+     can iterate more than once, we apply decrementing IV approach
+     in loop control.  */
+  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
+      && !LOOP_VINFO_LENS (loop_vinfo).is_empty ()
+      && LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) == 0
+      && !(LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+	   && known_le (LOOP_VINFO_INT_NITERS (loop_vinfo),
+			LOOP_VINFO_VECT_FACTOR (loop_vinfo))))
+    LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) = true;

Should I add direct_supportted_p (SELECT_VL...) to this condition, is that right?

I have sent the SELECT_VL patch. I will add this in the next SELECT_VL patch.

Let's wait for more comments from Richard.

Thanks.


juzhe.zhong@rivai.ai
 

* Re: Re: decremnt IV patch create fails on PowerPC
  2023-05-30  9:26       ` juzhe.zhong
@ 2023-05-30  9:50         ` Richard Biener
  2023-05-30  9:55           ` juzhe.zhong
  2023-05-30 11:30           ` juzhe.zhong
  2023-05-30  9:51         ` Kewen.Lin
  1 sibling, 2 replies; 15+ messages in thread
From: Richard Biener @ 2023-05-30  9:50 UTC (permalink / raw)
  To: juzhe.zhong; +Cc: gcc-patches, richard.sandiford, linkw

On Tue, 30 May 2023, juzhe.zhong@rivai.ai wrote:

> Ok.
> 
> It seems that for this condition:
> 
> +  /* If we're vectorizing a loop that uses length "controls" and
> +     can iterate more than once, we apply decrementing IV approach
> +     in loop control.  */
> +  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> +      && !LOOP_VINFO_LENS (loop_vinfo).is_empty ()
> +      && LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) == 0
> +      && !(LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> +	   && known_le (LOOP_VINFO_INT_NITERS (loop_vinfo),
> +			LOOP_VINFO_VECT_FACTOR (loop_vinfo))))
> +    LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) = true;
> 
> Should I add direct_supportted_p (SELECT_VL...) to this condition, is that right?

No, since powerpc is fine with decrementing VL it should also use it.
Instead you should make sure to produce SCEV analyzable IVs when
possible (when SELECT_VL is not or cannot be used).

Richard.

> I have sent the SELECT_VL patch. I will add this in the next SELECT_VL patch.
>
> Let's wait for more comments from Richard.
> 
> Thanks.
> 
> 
> juzhe.zhong@rivai.ai
>  

* Re: decremnt IV patch create fails on PowerPC
  2023-05-30  9:26       ` juzhe.zhong
  2023-05-30  9:50         ` Richard Biener
@ 2023-05-30  9:51         ` Kewen.Lin
  2023-05-30 10:00           ` Richard Biener
  1 sibling, 1 reply; 15+ messages in thread
From: Kewen.Lin @ 2023-05-30  9:51 UTC (permalink / raw)
  To: juzhe.zhong, rguenther; +Cc: gcc-patches, richard.sandiford

on 2023/5/30 17:26, juzhe.zhong@rivai.ai wrote:
> Ok.
> 
> It seems that for this condition:
> 
> +  /* If we're vectorizing a loop that uses length "controls" and
> +     can iterate more than once, we apply decrementing IV approach
> +     in loop control.  */
> +  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> +      && !LOOP_VINFO_LENS (loop_vinfo).is_empty ()
> +      && LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) == 0
> +      && !(LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> +	   && known_le (LOOP_VINFO_INT_NITERS (loop_vinfo),
> +			LOOP_VINFO_VECT_FACTOR (loop_vinfo))))
> +    LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) = true;
> 
> 
> Should I add direct_supportted_p (SELECT_VL...) to this condition, is that right?

I guess not: with this condition, targets without SELECT_VL would be unable
to leverage the new decrement scheme for lengths, and judging from your reply
in PR109971 you didn't mean to disable it.  IIUC, what Richi suggested is to
introduce one new IV, just like the previous one, with a non-variable step;
it can then be analyzed by SCEV, and analyses based on it can do a good job.

Since this is mainly for targets without SELECT_VL capability, I can follow
up on this if you don't mind.

BR,
Kewen


* Re: Re: decremnt IV patch create fails on PowerPC
  2023-05-30  9:50         ` Richard Biener
@ 2023-05-30  9:55           ` juzhe.zhong
  2023-05-30 11:30           ` juzhe.zhong
  1 sibling, 0 replies; 15+ messages in thread
From: juzhe.zhong @ 2023-05-30  9:55 UTC (permalink / raw)
  To: rguenther; +Cc: gcc-patches, richard.sandiford, linkw

>> No, since powerpc is fine with decrementing VL it should also use it.
>> Instead you should make sure to produce SCEV analyzable IVs when
>> possible (when SELECT_VL is not or cannot be used).

OK. Would you mind giving me some guidance on how to rewrite the decrementing IV?
I am not familiar with SCEV, and I am not sure how to make the decrementing IV
analyzable by SCEV.



juzhe.zhong@rivai.ai
 

* Re: decremnt IV patch create fails on PowerPC
  2023-05-30  9:51         ` Kewen.Lin
@ 2023-05-30 10:00           ` Richard Biener
  2023-05-30 10:05             ` juzhe.zhong
  2023-05-30 10:12             ` Richard Sandiford
  0 siblings, 2 replies; 15+ messages in thread
From: Richard Biener @ 2023-05-30 10:00 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: juzhe.zhong, gcc-patches, richard.sandiford

On Tue, 30 May 2023, Kewen.Lin wrote:

> on 2023/5/30 17:26, juzhe.zhong@rivai.ai wrote:
> > Ok.
> > 
> > It seems that for this condition:
> > 
> > +  /* If we're vectorizing a loop that uses length "controls" and
> > +     can iterate more than once, we apply decrementing IV approach
> > +     in loop control.  */
> > +  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> > +      && !LOOP_VINFO_LENS (loop_vinfo).is_empty ()
> > +      && LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) == 0
> > +      && !(LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> > +	   && known_le (LOOP_VINFO_INT_NITERS (loop_vinfo),
> > +			LOOP_VINFO_VECT_FACTOR (loop_vinfo))))
> > +    LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) = true;
> > 
> > 
> > Should I add direct_supportted_p (SELECT_VL...) to this condition, is that right?
> 
> I guess not: with this condition, targets without SELECT_VL would be unable
> to leverage the new decrement scheme for lengths, and judging from your reply
> in PR109971 you didn't mean to disable it.  IIUC, what Richi suggested is to
> introduce one new IV, just like the previous one, with a non-variable step;
> it can then be analyzed by SCEV, and analyses based on it can do a good job.

No, I said the current scheme does sth along

 do {
   remain -= MIN (vf, remain);
 } while (remain != 0);

and I suggest to instead do

 do {
   old_remain = remain;
   len = MIN (vf, remain);
   remain -= vf;
 } while (old_remain >= vf);

basically since only the last iteration will have len < vf we can
ignore that remain -= vf will underflow there if we appropriately
rewrite the exit test to use the pre-decrement value.
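
To make the difference concrete, here is a minimal C sketch of the two
schemes.  It is not vectorizer code: the function names and the use of
size_t are assumptions made for illustration only.  It also assumes the
exit test uses '>' on the pre-decrement value rather than the '>=' in
the pseudocode above, so that both loops run the same number of times.

```c
#include <stddef.h>

/* Hypothetical helper, stands in for MIN (vf, remain).  */
static size_t
min_sz (size_t a, size_t b)
{
  return a < b ? a : b;
}

/* Current scheme: the step MIN (vf, remain) is not loop invariant.
   Returns the iteration count and stores the summed lengths in *TOTAL.  */
static size_t
count_variable_step (size_t remain, size_t vf, size_t *total)
{
  size_t iters = 0;
  *total = 0;
  do
    {
      size_t len = min_sz (vf, remain);
      *total += len;
      remain -= len;
      iters++;
    }
  while (remain != 0);
  return iters;
}

/* Suggested scheme: constant step vf, exit test on the pre-decrement
   value.  remain -= vf may wrap on the final iteration; that is well
   defined for unsigned types and the wrapped value is never used.  */
static size_t
count_constant_step (size_t remain, size_t vf, size_t *total)
{
  size_t iters = 0;
  size_t old_remain;
  *total = 0;
  do
    {
      old_remain = remain;
      size_t len = min_sz (vf, remain);
      *total += len;
      remain -= vf;
      iters++;
    }
  while (old_remain > vf);  /* Assumption: '>' rather than '>='.  */
  return iters;
}
```

With remain = 10 and vf = 4 both functions run three iterations and
process 10 elements in total.  With '>=' in the exit test, a count that
is an exact multiple of vf (say remain = 8, vf = 4) would run one extra
iteration with len == 0.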

> Since this is mainly for targets without SELECT_VL capability, I can follow
> up this if you don't mind.
> 
> BR,
> Kewen
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Re: decremnt IV patch create fails on PowerPC
  2023-05-30 10:00           ` Richard Biener
@ 2023-05-30 10:05             ` juzhe.zhong
  2023-05-30 10:12             ` Richard Sandiford
  1 sibling, 0 replies; 15+ messages in thread
From: juzhe.zhong @ 2023-05-30 10:05 UTC (permalink / raw)
  To: rguenther, linkw; +Cc: gcc-patches, richard.sandiford

[-- Attachment #1: Type: text/plain, Size: 2971 bytes --]

>> No, I said the current scheme does sth along

>> do {
>>    remain -= MIN (vf, remain);
>> } while (remain != 0);

>> and I suggest to instead do

>> do {
>>    old_remain = remain;
>>    len = MIN (vf, remain);
>>    remain -= vf;
>> } while (old_remain >= vf);

>> basically since only the last iteration will have len < vf we can
>> ignore that remain -= vf will underflow there if we appropriately
>> rewrite the exit test to use the pre-decrement value.

Oh, I understand you now. I will definitely give it a try and send a patch.

Thank you so much.

By the way, could you take a look at the SELECT_VL patch?
I guess you want to defer it to Richard, and I will wait, but I still think your comments are very important.

Thanks.


juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: decremnt IV patch create fails on PowerPC
  2023-05-30 10:00           ` Richard Biener
  2023-05-30 10:05             ` juzhe.zhong
@ 2023-05-30 10:12             ` Richard Sandiford
  2023-05-30 10:43               ` Richard Biener
  1 sibling, 1 reply; 15+ messages in thread
From: Richard Sandiford @ 2023-05-30 10:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: Kewen.Lin, juzhe.zhong, gcc-patches

My understanding was that we went into this knowing that the IVs
would defeat SCEV analysis.  Apparently that wasn't a problem for RVV,
but it's not surprising that it is a problem in general.

This isn't just about SELECT_VL though.  We use the same type of IV
for cases that aren't going to use SELECT_VL.

Richard Biener <rguenther@suse.de> writes:
> On Tue, 30 May 2023, Kewen.Lin wrote:
>
>> on 2023/5/30 17:26, juzhe.zhong@rivai.ai wrote:
>> > Ok.
>> > 
>> > It seems that for this conditions:
>> > 
>> > +  /* If we're vectorizing a loop that uses length "controls" and
>> > +     can iterate more than once, we apply decrementing IV approach
>> > +     in loop control.  */
>> > +  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
>> > +      && !LOOP_VINFO_LENS (loop_vinfo).is_empty ()
>> > +      && LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) == 0
>> > +      && !(LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
>> > +	   && known_le (LOOP_VINFO_INT_NITERS (loop_vinfo),
>> > +			LOOP_VINFO_VECT_FACTOR (loop_vinfo))))
>> > +    LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) = true;
>> > 
>> > 
>> > I should add direct_supportted_p (SELECT_VL...) to this is that right?
>> 
>> I guess no, with this condition any targets without SELECT_VL are unable
>> to leverage the new decrement scheme for lengths, as your reply in PR109971
>> you didn't meant to disable it.  IIUC, what Richi suggested is to introduce
>> one new IV just like the previous one which has non-variable step, then it's
>> SCEV-ed and some analysis based on it can do a good job.
>
> No, I said the current scheme does sth along
>
>  do {
>    remain -= MIN (vf, remain);
>  } while (remain != 0);
>
> and I suggest to instead do
>
>  do {
>    old_remain = remain;
>    len = MIN (vf, remain);
>    remain -= vf;
>  } while (old_remain >= vf);
>
> basically since only the last iteration will have len < vf we can
> ignore that remain -= vf will underflow there if we appropriately
> rewrite the exit test to use the pre-decrement value.

Yeah, agree that should work.

But how easy would it be to extend SCEV analysis, via a pattern match?
The evolution of the IV phi wrt the inner loop is still a normal SCEV.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: decremnt IV patch create fails on PowerPC
  2023-05-30 10:12             ` Richard Sandiford
@ 2023-05-30 10:43               ` Richard Biener
  2023-05-30 11:29                 ` Richard Sandiford
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Biener @ 2023-05-30 10:43 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Kewen.Lin, juzhe.zhong, gcc-patches

On Tue, 30 May 2023, Richard Sandiford wrote:

> My understanding was that we went into this knowing that the IVs
> would defeat SCEV analysis.  Apparently that wasn't a problem for RVV,
> but it's not surprising that it is a problem in general.
> 
> This isn't just about SELECT_VL though.  We use the same type of IV
> for cases that aren't going to use SELECT_VL.
> 
> Richard Biener <rguenther@suse.de> writes:
> > On Tue, 30 May 2023, Kewen.Lin wrote:
> >
> >> on 2023/5/30 17:26, juzhe.zhong@rivai.ai wrote:
> >> > Ok.
> >> > 
> >> > It seems that for this conditions:
> >> > 
> >> > +  /* If we're vectorizing a loop that uses length "controls" and
> >> > +     can iterate more than once, we apply decrementing IV approach
> >> > +     in loop control.  */
> >> > +  if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> >> > +      && !LOOP_VINFO_LENS (loop_vinfo).is_empty ()
> >> > +      && LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) == 0
> >> > +      && !(LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> >> > +	   && known_le (LOOP_VINFO_INT_NITERS (loop_vinfo),
> >> > +			LOOP_VINFO_VECT_FACTOR (loop_vinfo))))
> >> > +    LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) = true;
> >> > 
> >> > 
> >> > I should add direct_supportted_p (SELECT_VL...) to this is that right?
> >> 
> >> I guess no, with this condition any targets without SELECT_VL are unable
> >> to leverage the new decrement scheme for lengths, as your reply in PR109971
> >> you didn't meant to disable it.  IIUC, what Richi suggested is to introduce
> >> one new IV just like the previous one which has non-variable step, then it's
> >> SCEV-ed and some analysis based on it can do a good job.
> >
> > No, I said the current scheme does sth along
> >
> >  do {
> >    remain -= MIN (vf, remain);
> >  } while (remain != 0);
> >
> > and I suggest to instead do
> >
> >  do {
> >    old_remain = remain;
> >    len = MIN (vf, remain);
> >    remain -= vf;
> >  } while (old_remain >= vf);
> >
> > basically since only the last iteration will have len < vf we can
> > ignore that remain -= vf will underflow there if we appropriately
> > rewrite the exit test to use the pre-decrement value.
> 
> Yeah, agree that should work.

Btw, it's still on my TODO list (unless somebody beats me...) to
rewrite the vectorizer code gen to do all loop control and conditions
on a decrementing "remaining scalar iters" IV.

> But how easy would it be to extend SCEV analysis, via a pattern match?
> The evolution of the IV phi wrt the inner loop is still a normal SCEV.

No, the IV isn't a normal SCEV, the final value is different.
I think pattern matching this in niter analysis could work though.

Richard.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: decremnt IV patch create fails on PowerPC
  2023-05-30 10:43               ` Richard Biener
@ 2023-05-30 11:29                 ` Richard Sandiford
  2023-05-30 11:37                   ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Sandiford @ 2023-05-30 11:29 UTC (permalink / raw)
  To: Richard Biener; +Cc: Kewen.Lin, juzhe.zhong, gcc-patches

Richard Biener <rguenther@suse.de> writes:
>> But how easy would it be to extend SCEV analysis, via a pattern match?
>> The evolution of the IV phi wrt the inner loop is still a normal SCEV.
>
> No, the IV isn't a normal SCEV, the final value is different.

Which part of the IV though?  Won't all executions of the latch edge
decrement the IV phi (and specifically the phi) by VF (and only VF)?
So if we analyse the IV phi wrt the inner loop, the IV phi is simply
{ initial, -, VF }.

I agree "IV phi - step" isn't a SCEV, but that doesn't seem fatal.
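
As a small illustration of the distinction, the C sketch below compares
an IV phi that is decremented by an invariant VF on every latch edge
with one decremented by MIN (VF, phi).  The function names and the
32-bit unsigned IV type are assumptions for the example, not anything
taken from the vectorizer.

```c
#include <stdint.h>

/* Value of the IV phi at the start of iteration I when every latch
   execution subtracts the invariant VF, i.e. the affine form
   { initial, -, vf }.  Unsigned arithmetic wraps like the IV would.  */
static uint32_t
phi_closed_form (uint32_t initial, uint32_t vf, uint32_t i)
{
  return initial - i * vf;
}

/* The same value obtained by actually stepping the IV I times.  */
static uint32_t
phi_by_stepping (uint32_t initial, uint32_t vf, uint32_t i)
{
  uint32_t phi = initial;
  for (uint32_t k = 0; k < i; k++)
    phi -= vf;
  return phi;
}

/* Variable-step IV: each step subtracts MIN (vf, phi), so the value
   saturates at zero instead of following the affine form.  */
static uint32_t
phi_variable_step (uint32_t initial, uint32_t vf, uint32_t i)
{
  uint32_t phi = initial;
  for (uint32_t k = 0; k < i; k++)
    phi -= (phi < vf ? phi : vf);
  return phi;
}
```

For initial = 10 and vf = 4 the first two functions agree at every
iteration, including the wrapped value after the third step, while the
variable-step version reaches 0 there and so departs from
initial - i * vf.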

Richard

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Re: decremnt IV patch create fails on PowerPC
  2023-05-30  9:50         ` Richard Biener
  2023-05-30  9:55           ` juzhe.zhong
@ 2023-05-30 11:30           ` juzhe.zhong
  1 sibling, 0 replies; 15+ messages in thread
From: juzhe.zhong @ 2023-05-30 11:30 UTC (permalink / raw)
  To: rguenther; +Cc: gcc-patches, richard.sandiford, linkw

[-- Attachment #1: Type: text/plain, Size: 5936 bytes --]

Hi, Richi.
I have sent a patch that changes the decrement IV following your suggestion:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620086.html 

It works well on RVV.

Could you take a look at it?
If it's OK, I will send the SELECT_VL patch based on this.

Thanks.


juzhe.zhong@rivai.ai
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: decremnt IV patch create fails on PowerPC
  2023-05-30 11:29                 ` Richard Sandiford
@ 2023-05-30 11:37                   ` Richard Biener
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Biener @ 2023-05-30 11:37 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Kewen.Lin, juzhe.zhong, gcc-patches

On Tue, 30 May 2023, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> >> But how easy would it be to extend SCEV analysis, via a pattern match?
> >> The evolution of the IV phi wrt the inner loop is still a normal SCEV.
> >
> > No, the IV isn't a normal SCEV, the final value is different.
> 
> Which part of the IV though?

The relevant IV (for niter analysis) is the one used in the loop
exit test and that currently isn't a SCEV.  The IV used in the
*_len operations isn't either (and that's not going to change,
obviously).

>  Won't all executions of the latch edge
> decrement the IV phi (and specifically the phi) by VF (and only VF)?

But currently there is no decrement by an invariant VF, only
by MIN (VF, remain).  That is what I suggested addressing to
make the loop exit condition analyzable (as said, in theory
we could also try pattern matching the exit test in
niter analysis).

> So if we analyse the IV phi wrt the inner loop, the IV phi is simply
> { initial, -, VF }.
> 
> I agree "IV phi - step" isn't a SCEV, but that doesn't seem fatal.

Right.  What is fatal is the non-SCEV in the exit test, which makes
most follow-up loop optimizations skip the loop because the
number of iterations cannot be determined.
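
To illustrate why the constant step matters here: once the exit test
compares a value that steps by an invariant VF, the trip count has a
simple closed form.  The C sketch below checks that against a direct
simulation.  The names are hypothetical, the exit test is assumed to be
"pre-decrement value > vf", and n >= 1 and vf >= 1 are assumed.

```c
#include <stddef.h>

/* Trip count found by running the constant-step loop.  The decrement
   may wrap on the last iteration, which is harmless for unsigned
   types because the wrapped value is never tested.  */
static size_t
trip_count_simulated (size_t n, size_t vf)
{
  size_t iters = 0;
  size_t old_remain;
  size_t remain = n;
  do
    {
      old_remain = remain;
      remain -= vf;
      iters++;
    }
  while (old_remain > vf);
  return iters;
}

/* Closed form for the same loop: ceil (n / vf), written so that it
   cannot overflow.  Valid for n >= 1 and vf >= 1.  */
static size_t
trip_count_closed_form (size_t n, size_t vf)
{
  return (n - 1) / vf + 1;
}
```

For example n = 10, vf = 4 gives 3 iterations either way.  With a step
of MIN (vf, remain) no such affine evolution is available to derive the
count from.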

Richard.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread

Thread overview: 15+ messages
2023-05-25 22:45 decremnt IV patch create fails on PowerPC 钟居哲
2023-05-26  6:46 ` Richard Biener
2023-05-26  7:46   ` juzhe.zhong
2023-05-30  9:22     ` Richard Biener
2023-05-30  9:26       ` juzhe.zhong
2023-05-30  9:50         ` Richard Biener
2023-05-30  9:55           ` juzhe.zhong
2023-05-30 11:30           ` juzhe.zhong
2023-05-30  9:51         ` Kewen.Lin
2023-05-30 10:00           ` Richard Biener
2023-05-30 10:05             ` juzhe.zhong
2023-05-30 10:12             ` Richard Sandiford
2023-05-30 10:43               ` Richard Biener
2023-05-30 11:29                 ` Richard Sandiford
2023-05-30 11:37                   ` Richard Biener
