Loop vectorizer optimization questions

public inbox for gcc@gcc.gnu.org

* Loop vectorizer optimization questions
@ 2024-01-08 13:57 钟居哲
  2024-01-08 17:50 ` Tamar Christina
  2024-01-09 13:10 ` Richard Biener
  0 siblings, 2 replies; 6+ messages in thread
From: 钟居哲 @ 2024-01-08 13:57 UTC (permalink / raw)
  To: gcc; +Cc: rdapp.gcc, richard.guenther


Hi, Richard.

I saw the following code:

      if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
        {
          if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
                                              OPTIMIZE_FOR_SPEED))
            return false;
          else
            vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype, NULL);
        }

This is in the early-break code. The current early-break support is not sufficient for targets with length-based partial vectors, so we are not able to enable early break for RVV.
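
To make the distinction concrete, here is a minimal scalar sketch in plain C (not GCC internals) of the two partial-vector styles being contrasted: a per-lane loop mask, which the quoted check records through vect_record_loop_mask, and a single active-length count, which is what length-based targets such as RVV use. The names VF, add_one_masked and add_one_with_len are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4 /* hypothetical fixed vectorization factor, illustration only */

/* Mask-style partial vectors: every lane carries its own "active" bit. */
void add_one_masked(int *a, size_t n)
{
  assert(a != NULL || n == 0);
  for (size_t i = 0; i < n; i += VF)
    {
      int mask[VF];
      for (size_t lane = 0; lane < VF; ++lane)
        mask[lane] = (i + lane < n);   /* lane is active while in bounds */
      for (size_t lane = 0; lane < VF; ++lane)
        if (mask[lane])
          a[i + lane] += 1;
    }
}

/* Length-style partial vectors: one count gives the number of leading
   active lanes; there is no per-lane mask at all. */
void add_one_with_len(int *a, size_t n)
{
  assert(a != NULL || n == 0);
  for (size_t i = 0; i < n; i += VF)
    {
      size_t len = (n - i < VF) ? n - i : VF;
      for (size_t lane = 0; lane < len; ++lane)
        a[i + lane] += 1;
    }
}
```

Both functions touch exactly the first n elements. They differ only in how the tail iteration is described, which is presumably why code that only knows how to record a loop mask does not carry over directly to a length-controlled target.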

If I want to support this in the middle-end, is that allowed in GCC 14, or should I defer it to GCC 15?

Another question: I am working on min/max reduction with index. I believe it belongs in GCC 15, but I wonder
whether I can post it early for review during stage 4, or whether I should post the patch (min/max reduction with index) once GCC 15 is open.

Thanks.

juzhe.zhong@rivai.ai


* RE: Loop vectorizer optimization questions
  2024-01-08 13:57 Loop vectorizer optimization questions 钟居哲
@ 2024-01-08 17:50 ` Tamar Christina
  2024-01-08 22:46   ` 钟居哲
  2024-01-09 13:10 ` Richard Biener
  1 sibling, 1 reply; 6+ messages in thread
From: Tamar Christina @ 2024-01-08 17:50 UTC (permalink / raw)
  To: 钟居哲, gcc; +Cc: rdapp.gcc, richard.guenther

> 
> Another question: I am working on min/max reduction with index. I
> believe it belongs in GCC 15, but I wonder
> whether I can post it early for review during stage 4, or whether I should post the patch (min/max
> reduction with index) once GCC 15 is open.
> 

FWIW, we tried to implement this 5 years ago: https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534518.html
and you'll likely get the same feedback if you aren't already doing so.

I think Richard would prefer to have a general framework for these kinds of operations.  We never got around to doing so
and it's still on my list but if you're taking care of it 😊

Just thought I'd point out the previous feedback.

Cheers,
Tamar

> Thanks.
> 
> 
> juzhe.zhong@rivai.ai


* Re: RE: Loop vectorizer optimization questions
  2024-01-08 17:50 ` Tamar Christina
@ 2024-01-08 22:46   ` 钟居哲
  2024-01-09  8:59     ` Tamar Christina
  0 siblings, 1 reply; 6+ messages in thread
From: 钟居哲 @ 2024-01-08 22:46 UTC (permalink / raw)
  To: Tamar Christina, gcc; +Cc: rdapp.gcc, richard.guenther


Oh, it's nice to see that you already have support for min/max index reduction.

I know your patch can handle the following:

int idx = ii;
int max = mm;
for (int i = 0; i < n; ++i) {
  int x = a[i];
  if (max < x) {
    max = x;
    idx = i;
  }
}

But I wonder whether your patch can handle this:
int idx = ii;
int max = mm;
for (int i = 0; i < n; ++i) {
  int x = a[i];
  if (max <= x) {
    max = x;
    idx = i;
  }
}
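
For readers following along, the only difference between the two loops is which occurrence of the maximum wins: with `<` the index of the first maximum is kept, with `<=` the index of the last one. A minimal sketch wrapping both loops in functions (the function names and the parameter order are hypothetical, chosen for illustration):

```c
#include <assert.h>

/* Strict compare: idx stays at the FIRST position holding the maximum. */
int argmax_first(const int *a, int n, int mm, int ii)
{
  assert(n >= 0);
  int idx = ii;
  int max = mm;
  for (int i = 0; i < n; ++i) {
    int x = a[i];
    if (max < x) {
      max = x;
      idx = i;
    }
  }
  return idx;
}

/* Non-strict compare: idx moves on to the LAST position holding the maximum. */
int argmax_last(const int *a, int n, int mm, int ii)
{
  assert(n >= 0);
  int idx = ii;
  int max = mm;
  for (int i = 0; i < n; ++i) {
    int x = a[i];
    if (max <= x) {
      max = x;
      idx = i;
    }
  }
  return idx;
}
```

With a = {3, 7, 2, 7, 1}, an initial max below every element and ii = -1, the first function yields 1 and the second yields 3. The initial value matters as well: starting with mm = 7, the strict version never updates and returns ii, while the non-strict version still returns 3.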

Will you continue to work on min/max with index?
Or do you want me to continue this work based on your patch?

I have an initial patch which roughly implements LLVM's approach, but it turns out Richi doesn't want me to apply LLVM's approach, so your patch may be the more reasonable one.

Thanks.

juzhe.zhong@rivai.ai


* Re: RE: Loop vectorizer optimization questions
  2024-01-08 22:46   ` 钟居哲
@ 2024-01-09  8:59     ` Tamar Christina
  2024-01-09  9:21       ` juzhe.zhong
  0 siblings, 1 reply; 6+ messages in thread
From: Tamar Christina @ 2024-01-09  8:59 UTC (permalink / raw)
  To: 钟居哲; +Cc: richard.guenther, rdapp.gcc, gcc

Hi,

The 01/08/2024 22:46, 钟居哲 wrote:
> Oh, it's nice to see that you already have support for min/max index reduction.
> 
> I know your patch can handle the following:
> 
> 
> int idx = ii;
> int max = mm;
> for (int i = 0; i < n; ++i) {
>   int x = a[i];
>   if (max < x) {
>     max = x;
>     idx = i;
>   }
> }
> 
> But I wonder whether your patch can handle this:
> 
> int idx = ii;
> int max = mm;
> for (int i = 0; i < n; ++i) {
>   int x = a[i];
>   if (max <= x) {
>     max = x;
>     idx = i;
>   }
> }
> 

The last version of the patch we sent handled all conditionals:

https://inbox.sourceware.org/gcc-patches/DB9PR08MB6603DCCB35007D83C6736167F5599@DB9PR08MB6603.eurprd08.prod.outlook.com/

There are some additional testcases in the patch for all these as well.

> Will you continue to work on min/max with index?

I don't know if I'll have the free time to do so; that's the reason I haven't resent the new one.
The engineer who started it no longer works for Arm.

> Or do you want me to continue this work based on your patch?
> 
> I have an initial patch which roughly implements LLVM's approach, but it turns out Richi doesn't want me to apply LLVM's approach, so your patch may be the more reasonable one.
> 

When Richi reviewed it, he wasn't against the approach in the patch: https://inbox.sourceware.org/gcc-patches/nycvar.YFH.7.76.2105071320170.9200@zhemvz.fhfr.qr/
but he wanted the concept of a dependent reduction to be handled more generically, so we could extend it in the future.

From looking at Richi's feedback, I think he wants vect_recog_minmax_index_pattern to be more general. We've basically hardcoded the reduction type,
but it could just be a property on STMT_VINFO.

Unless I'm mistaken, the patch already relies on first finding both reductions, but we immediately try to resolve the relationship using vect_recog_minmax_index_pattern.
Instead, I think what Richi wanted was for us to keep track of reductions that operate on the same induction variable and, after we finish analysing all reductions,
try to see if any of the reductions we kept track of can be combined.

Basically, just separate the discovery of the reductions from the tying of them together.
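
The two-phase idea described above can be sketched roughly as follows. This is a standalone illustration in plain C, not vectorizer code: the struct reduction record, the red_kind enum and tie_reductions are all hypothetical names, and a real implementation would also have to check that the two reductions share the same controlling condition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification recorded during discovery (phase 1). */
enum red_kind { RED_MINMAX, RED_INDEX, RED_OTHER };

struct reduction {
  enum red_kind kind;
  int iv_id;    /* identifies the induction variable the reduction runs over */
  int partner;  /* index of the tied reduction, or -1 if it stands alone */
};

/* Phase 2: runs only after every reduction has been recorded.  Ties each
   min/max reduction to one still-untied index reduction on the same
   induction variable.  Returns the number of pairs formed. */
size_t tie_reductions(struct reduction *reds, size_t n)
{
  assert(reds != NULL || n == 0);
  size_t pairs = 0;

  for (size_t i = 0; i < n; ++i)
    reds[i].partner = -1;

  for (size_t i = 0; i < n; ++i) {
    if (reds[i].kind != RED_MINMAX)
      continue;
    for (size_t j = 0; j < n; ++j) {
      if (reds[j].kind == RED_INDEX
          && reds[j].partner < 0
          && reds[j].iv_id == reds[i].iv_id) {
        reds[i].partner = (int) j;
        reds[j].partner = (int) i;
        ++pairs;
        break;
      }
    }
  }
  return pairs;
}
```

Keeping the tying step separate means discovery does not need to know about min/max-with-index at all, and other kinds of dependent reductions could later be tied by the same pass.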

Am I right here Richi?

I think the codegen part can mostly be used as is, though we might be able to do better for VLA.
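
For context, one common way to lower a strict-compare max-with-index reduction is to keep a running maximum and index per lane and resolve ties in an epilogue. The scalar emulation below is a sketch of that general technique only; it is not taken from the patch, and LANES, the INT_MAX sentinel and argmax_first_lanes are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>

#define LANES 4 /* hypothetical number of vector lanes */

/* Emulates a vectorized version of:
     if (max < a[i]) { max = a[i]; idx = i; }
   Lane l handles the elements whose position satisfies i % LANES == l. */
int argmax_first_lanes(const int *a, int n, int mm, int ii)
{
  assert(n >= 0);
  int vmax[LANES];
  int vidx[LANES];

  for (int l = 0; l < LANES; ++l) {
    vmax[l] = mm;
    vidx[l] = INT_MAX;  /* sentinel: this lane has not updated yet */
  }

  /* Main loop: each lane runs the scalar recurrence on its own elements. */
  for (int i = 0; i < n; ++i) {
    int l = i % LANES;
    if (vmax[l] < a[i]) {
      vmax[l] = a[i];
      vidx[l] = i;
    }
  }

  /* Epilogue, step 1: reduce the per-lane maxima to a single maximum. */
  int max = mm;
  for (int l = 0; l < LANES; ++l)
    if (max < vmax[l])
      max = vmax[l];

  /* Epilogue, step 2: among lanes holding that maximum, take the smallest
     index, which is the first occurrence in the original order. */
  int idx = INT_MAX;
  for (int l = 0; l < LANES; ++l)
    if (vmax[l] == max && vidx[l] < idx)
      idx = vidx[l];

  /* If no lane ever updated, the initial index is the answer. */
  return idx == INT_MAX ? ii : idx;
}
```

For the `<=` variant the epilogue would pick the largest index instead of the smallest, and each lane would use the non-strict compare.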

So it should be fairly straightforward to go from that final patch to what Richi wants, but... I just lack time.

If you want to tackle it that would be great :)

Thanks,
Tamar


* Re: Re: Loop vectorizer optimization questions
  2024-01-09  8:59     ` Tamar Christina
@ 2024-01-09  9:21       ` juzhe.zhong
  0 siblings, 0 replies; 6+ messages in thread
From: juzhe.zhong @ 2024-01-09  9:21 UTC (permalink / raw)
  To: tamar.christina; +Cc: Richard Biener, Robin Dapp, gcc


I see. Thanks Tamar.

I am willing to investigate Arm's initial patch to see what else we need in that patch.

Since min/max reduction with index can improve SPEC performance, I will take a look at it in GCC 15.

Thanks a lot!

juzhe.zhong@rivai.ai


* Re: Loop vectorizer optimization questions
  2024-01-08 13:57 Loop vectorizer optimization questions 钟居哲
  2024-01-08 17:50 ` Tamar Christina
@ 2024-01-09 13:10 ` Richard Biener
  1 sibling, 0 replies; 6+ messages in thread
From: Richard Biener @ 2024-01-09 13:10 UTC (permalink / raw)
  To: 钟居哲; +Cc: gcc, rdapp.gcc

On Mon, Jan 8, 2024 at 2:57 PM 钟居哲 <juzhe.zhong@rivai.ai> wrote:
>
> Hi, Richard.
>
> I saw the following code:
>
>       if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
>         {
>           if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
>                                               OPTIMIZE_FOR_SPEED))
>             return false;
>           else
>             vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype, NULL);
>         }
>
> This is in the early-break code. The current early-break support is not sufficient for targets with length-based partial vectors, so we are not able to enable early break for RVV.
>
> If I want to support this in the middle-end, is that allowed in GCC 14, or should I defer it to GCC 15?

Defer to GCC 15.

> Another question: I am working on min/max reduction with index. I believe it belongs in GCC 15, but I wonder
> whether I can post it early for review during stage 4, or whether I should post the patch (min/max reduction with index) once GCC 15 is open.

You can always post patches for review, just don't expect timely reviews ;)

> Thanks.
> juzhe.zhong@rivai.ai
