public inbox for gcc@gcc.gnu.org
* Enable the vectorizer at -O2 for GCC 12
@ 2021-08-30 13:04 Florian Weimer
  2021-08-30 14:11 ` Bill Schmidt
  2021-09-01 11:23 ` Tamar Christina
  0 siblings, 2 replies; 8+ messages in thread
From: Florian Weimer @ 2021-08-30 13:04 UTC
  To: gcc
  Cc: Jeff Law, wschmidt, H.J. Lu, Hongtao Liu, Segher Boessenkool,
	jakub, rearnsha, richard.sandiford, Premachandra.Mallappa

There has been a discussion, both off-list and on the gcc-help mailing
list (“Why vectorization didn't turn on by -O2”, spread across several
months), about enabling the auto-vectorizer at -O2, similar to what
Clang does.

I think the review concluded that the very cheap cost model should be
used for that.

Are there any remaining blockers?

Thanks,
Florian
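
For readers unfamiliar with the cost model under discussion, here is a minimal sketch of what the "very cheap" model is documented to accept. The function names and the array size are hypothetical and not taken from the thread. The flags (-ftree-vectorize, -fvect-cost-model=very-cheap, -fopt-info-vec) are existing GCC options, but whether a given loop is actually vectorized depends on the target and the GCC version.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the thread.  One possible way to
   try it (file name is an assumption):
     gcc -O2 -ftree-vectorize -fvect-cost-model=very-cheap \
         -fopt-info-vec -c add.c
   -fopt-info-vec asks GCC to report the loops it vectorized.  */

enum { N = 1024 };  /* constant trip count, a multiple of common vector widths */

/* restrict promises that the arrays do not overlap, so no runtime alias
   check is needed, and the constant trip count means no scalar epilogue
   is needed either.  This is the kind of loop where vector code can
   entirely replace the scalar code.  */
void add_fixed(float *restrict dst, const float *restrict a,
               const float *restrict b)
{
    for (size_t i = 0; i < N; i++)
        dst[i] = a[i] + b[i];
}

/* Unknown trip count and possibly overlapping pointers: vectorizing this
   typically requires a runtime alias check and a scalar remainder loop,
   which the very-cheap model is documented to avoid.  */
void add_any(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Both functions compute the same result; the difference is only in how much extra code the vectorizer would have to emit, which is what the cost model restricts.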



* Re: Enable the vectorizer at -O2 for GCC 12
  2021-08-30 13:04 Enable the vectorizer at -O2 for GCC 12 Florian Weimer
@ 2021-08-30 14:11 ` Bill Schmidt
  2021-08-31  3:10   ` Kewen.Lin
  2021-09-01 11:23 ` Tamar Christina
  1 sibling, 1 reply; 8+ messages in thread
From: Bill Schmidt @ 2021-08-30 14:11 UTC
  To: Florian Weimer, gcc
  Cc: Jeff Law, H.J. Lu, Hongtao Liu, Segher Boessenkool, jakub,
	rearnsha, richard.sandiford, Premachandra.Mallappa, Kewen.Lin

On 8/30/21 8:04 AM, Florian Weimer wrote:
> There has been a discussion, both off-list and on the gcc-help mailing
> list (“Why vectorization didn't turn on by -O2”, spread across several
> months), about enabling the auto-vectorizer at -O2, similar to what
> Clang does.
>
> I think the review concluded that the very cheap cost model should be
> used for that.
>
> Are there any remaining blockers?

Hi Florian,

I don't think I'd characterize it as having blockers, but we are 
continuing to investigate small performance issues that arise with 
very-cheap, including some things that regressed in GCC 12.  Kewen Lin 
is leading that effort.  Kewen, do you feel we have any major remaining 
concerns with this plan?

Thanks,
Bill



* Re: Enable the vectorizer at -O2 for GCC 12
  2021-08-30 14:11 ` Bill Schmidt
@ 2021-08-31  3:10   ` Kewen.Lin
  2021-08-31  3:30     ` Hongtao Liu
  0 siblings, 1 reply; 8+ messages in thread
From: Kewen.Lin @ 2021-08-31  3:10 UTC
  To: wschmidt, Florian Weimer
  Cc: Jeff Law, H.J. Lu, Hongtao Liu, Segher Boessenkool, jakub,
	rearnsha, richard.sandiford, Premachandra.Mallappa, gcc

On 2021/8/30 10:11 PM, Bill Schmidt wrote:
> Hi Florian,
> 
> I don't think I'd characterize it as having blockers, but we are continuing to investigate small performance issues that arise with very-cheap, including some things that regressed in GCC 12.  Kewen Lin is leading that effort.  Kewen, do you feel we have any major remaining concerns with this plan?
> 

Hi Florian & Bill,

There are some small performance issues, such as PR101944 and PR102054, and
two degraded benchmarks (P9 520.omnetpp_r at -2.41% and P8 526.blender_r at
-1.31%) still to be investigated.  However, their performance numbers look
neutral when loop and SLP vectorization are enabled separately, so the
degradations are very likely noise.  I don't think they are or will be
blockers.

So I think it's good to turn this on by default for Power.

BR,
Kewen
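
As a rough illustration of measuring loop and SLP vectorization separately: the message does not list the exact options used, so the flags below are an assumption. -ftree-loop-vectorize and -ftree-slp-vectorize are the standard GCC options that enable each vectorizer on its own (-ftree-vectorize enables both). The two functions are hypothetical candidates for each pass, not code from the benchmarks mentioned.

```c
/* Hypothetical sketch; possible invocations (file name is an assumption):
     gcc -O2 -ftree-loop-vectorize -fvect-cost-model=very-cheap -c vec.c
     gcc -O2 -ftree-slp-vectorize  -fvect-cost-model=very-cheap -c vec.c  */

/* Candidate for the loop vectorizer: a counted loop over arrays that are
   declared not to overlap.  */
void scale(int *restrict out, const int *restrict in)
{
    for (int i = 0; i < 256; i++)
        out[i] = in[i] * 3;
}

/* Candidate for the SLP (basic-block) vectorizer: four independent,
   isomorphic statements on adjacent elements, with no loop involved.  */
void add4(int *restrict out, const int *restrict x, const int *restrict y)
{
    out[0] = x[0] + y[0];
    out[1] = x[1] + y[1];
    out[2] = x[2] + y[2];
    out[3] = x[3] + y[3];
}
```

Building a benchmark once with each option and comparing against the plain -O2 baseline is one way to tell whether a regression comes from either vectorizer or is just run-to-run variation.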


* Re: Enable the vectorizer at -O2 for GCC 12
  2021-08-31  3:10   ` Kewen.Lin
@ 2021-08-31  3:30     ` Hongtao Liu
  2021-08-31  4:13       ` Jeff Law
  0 siblings, 1 reply; 8+ messages in thread
From: Hongtao Liu @ 2021-08-31  3:30 UTC
  To: Kewen.Lin
  Cc: Bill Schmidt, Florian Weimer, Jakub Jelinek, rearnsha,
	Segher Boessenkool, GCC Development, Richard Sandiford,
	Premachandra.Mallappa, Hongtao Liu

On Tue, Aug 31, 2021 at 11:11 AM Kewen.Lin via Gcc <gcc@gcc.gnu.org> wrote:
> So I think it's good to turn this on by default for Power.
The Intel side is also willing to enable O2 vectorization after
measuring the performance impact on SPEC2017 and EEMBC.
Meanwhile, we are investigating PR101908/PR101909/PR101910/PR92740,
which report that O2 vectorization regresses additional benchmarks on
znver and kabylake.



-- 
BR,
Hongtao


* Re: Enable the vectorizer at -O2 for GCC 12
  2021-08-31  3:30     ` Hongtao Liu
@ 2021-08-31  4:13       ` Jeff Law
  2021-09-01  9:10         ` Andrew Stubbs
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Law @ 2021-08-31  4:13 UTC
  To: Hongtao Liu, Kewen.Lin
  Cc: Florian Weimer, Jakub Jelinek, rearnsha, Segher Boessenkool,
	GCC Development, Richard Sandiford, Premachandra.Mallappa,
	Hongtao Liu



On 8/30/2021 9:30 PM, Hongtao Liu via Gcc wrote:
> The intel side is also willing to enable O2 vectorization after
> measuring performance impact for SPEC2017 and eembc.
> Meanwhile we are investigating PR101908/PR101909/PR101910/PR92740
> which are reported O2 vectorization regresses extra benchmarks on
> znver and kabylake.
We'd like to see it on for our processor as well, though I don't have
numbers I can share at this time.

jeff


* Re: Enable the vectorizer at -O2 for GCC 12
  2021-08-31  4:13       ` Jeff Law
@ 2021-09-01  9:10         ` Andrew Stubbs
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Stubbs @ 2021-09-01  9:10 UTC
  To: Jeff Law, Hongtao Liu, Kewen.Lin
  Cc: Florian Weimer, Jakub Jelinek, rearnsha, Segher Boessenkool,
	GCC Development, Richard Sandiford, Premachandra.Mallappa,
	Hongtao Liu

On 31/08/2021 05:13, Jeff Law wrote:
> We'd like to see it on for our processor as well.  Though I don't have 
> numbers I can share at this time.

AMD GCN probably ought to have it on too, possibly set to maximum ... a 
GPU without vectors is pretty terrible.

Andrew


* RE: Enable the vectorizer at -O2 for GCC 12
  2021-08-30 13:04 Enable the vectorizer at -O2 for GCC 12 Florian Weimer
  2021-08-30 14:11 ` Bill Schmidt
@ 2021-09-01 11:23 ` Tamar Christina
  2021-09-06  9:04   ` Hongtao Liu
  1 sibling, 1 reply; 8+ messages in thread
From: Tamar Christina @ 2021-09-01 11:23 UTC
  To: Florian Weimer
  Cc: jakub, Richard Earnshaw, Segher Boessenkool, Richard Sandiford,
	Premachandra.Mallappa, Hongtao Liu, gcc

-- edit, added list back in --

Just to add some AArch64 numbers: for SPEC2017, using the very cheap cost model, we see a 2.1% overall geomean improvement (all from x264, as expected) with no real regressions (everything is within variance) and only a 0.06% overall binary size increase (x264 grew 0.15%).

So we'd be quite keen on this as well.

Cheers,
Tamar




* Re: Enable the vectorizer at -O2 for GCC 12
  2021-09-01 11:23 ` Tamar Christina
@ 2021-09-06  9:04   ` Hongtao Liu
  0 siblings, 0 replies; 8+ messages in thread
From: Hongtao Liu @ 2021-09-06  9:04 UTC
  To: Tamar Christina
  Cc: Florian Weimer, jakub, Richard Earnshaw, Segher Boessenkool, gcc,
	Richard Sandiford, Premachandra.Mallappa, Hongtao Liu

On Wed, Sep 1, 2021 at 7:24 PM Tamar Christina via Gcc <gcc@gcc.gnu.org> wrote:
> So we'd be quite keen on this as well.

A patch has been posted at [1] to enable auto-vectorization at -O2 with
the very-cheap cost model.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578877.html

-- 
BR,
Hongtao


Thread overview: 8+ messages
2021-08-30 13:04 Enable the vectorizer at -O2 for GCC 12 Florian Weimer
2021-08-30 14:11 ` Bill Schmidt
2021-08-31  3:10   ` Kewen.Lin
2021-08-31  3:30     ` Hongtao Liu
2021-08-31  4:13       ` Jeff Law
2021-09-01  9:10         ` Andrew Stubbs
2021-09-01 11:23 ` Tamar Christina
2021-09-06  9:04   ` Hongtao Liu
