public inbox for gcc-bugs@sourceware.org
* [Bug target/100021] New: [9/10/11 Regression] std::clamp unprofitable vectorization on -march=nehalem/.../broadwell
@ 2021-04-10 17:14 nok.raven at gmail dot com
From: nok.raven at gmail dot com @ 2021-04-10 17:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100021

            Bug ID: 100021
           Summary: [9/10/11 Regression] std::clamp unprofitable
                    vectorization on -march=nehalem/.../broadwell
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nok.raven at gmail dot com
  Target Milestone: ---

//#include <algorithm>
namespace std
{
    template<typename _Tp>
    constexpr const _Tp&
    clamp(const _Tp& __val, const _Tp& __lo, const _Tp& __hi)
    {
      return (__val < __lo) ? __lo : (__hi < __val) ? __hi : __val;
    }
}

int foo(int x, int y) {
    return std::clamp(x - y, -1, 1);
}

https://godbolt.org/z/6534c1ff1: either GCC performs an unprofitable
vectorization or the LLVM MCA calculation is wrong.

Affected targets are
-march=nehalem/westmere/sandybridge/ivybridge/haswell/broadwell.

I do not know why it is vectorized at -O2 in the first place; I could not find
a switch that triggers it, and using -O1 plus every option that -O2 is supposed
to enable does not reproduce the -O2 behavior.


* [Bug target/100021] [9/10/11 Regression] std::clamp unprofitable vectorization on -march=nehalem/.../broadwell
From: ubizjak at gmail dot com @ 2021-04-11  8:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100021

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
This is not vectorization; the compiler uses vector registers to perform
scalar operations. This is the STV (scalar-to-vector) pass in action; you can
use -mno-stv to avoid the transformation.

The transformation is used to avoid the CMOV instruction.
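
A minimal sketch of why the transformation can apply to the reported code: the
comparison chain in std::clamp computes the same value as a max followed by a
min whenever lo <= hi. It is an assumption here, not something stated in this
thread, that the STV pass then carries out such integer min/max operations in
SSE registers instead of using compare + CMOV. The names clamp_select,
clamp_minmax, foo_select and foo_minmax are hypothetical and exist only for
this illustration.

```cpp
#include <algorithm>
#include <cassert>

// Comparison chain, written the same way as the clamp in the report.
constexpr int clamp_select(int v, int lo, int hi) {
    return (v < lo) ? lo : (hi < v) ? hi : v;
}

// Equivalent min/max formulation; only valid when lo <= hi.
constexpr int clamp_minmax(int v, int lo, int hi) {
    return std::min(std::max(v, lo), hi);
}

// Both variants of the report's foo(): clamp the difference to [-1, 1].
constexpr int foo_select(int x, int y) { return clamp_select(x - y, -1, 1); }
constexpr int foo_minmax(int x, int y) { return clamp_minmax(x - y, -1, 1); }

// Compile-time checks that the two forms agree on a few inputs.
static_assert(foo_select(5, 2) == foo_minmax(5, 2), "difference above range");
static_assert(foo_select(2, 5) == foo_minmax(2, 5), "difference below range");
static_assert(foo_select(3, 3) == foo_minmax(3, 3), "difference inside range");
```

Compiling either variant with and without -mno-stv at -O2 -march=sandybridge
should show whether vector registers are used for the scalar computation.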


* [Bug target/100021] [9/10/11 Regression] std::clamp unprofitable vectorization on -march=nehalem/.../broadwell
From: ubizjak at gmail dot com @ 2021-04-11  9:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100021

--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
Also, you are passing -march=sandybridge, but the profiler seems to show
Skylake (SKX) target. The STV pass heavily depends on target costs, and when
-march=skylake is passed, the conversion is avoided.


* [Bug target/100021] [9/10/11 Regression] std::clamp unprofitable vectorization on -march=nehalem/.../broadwell
From: rguenth at gcc dot gnu.org @ 2021-04-12  8:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100021

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
             Target|                            |x86_64-*-* i?86-*-*
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-04-12

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I'm quite sure the LLVM MCA calculation is "wrong".  The transform speeds up a
SPEC benchmark by >10% on a Haswell CPU.

So please show an actual runtime testcase that regresses.
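
A runtime testcase of the kind requested could look roughly like the sketch
below. It is only an outline under stated assumptions: clamp_ref mirrors the
clamp definition from the report, and run_once and time_ns are hypothetical
helpers, not part of GCC or of the original report. The idea would be to build
it twice at -O2 -march=sandybridge, once with -mno-stv and once without, and
compare the timings on the matching hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same comparison chain as the clamp in the report.
inline int clamp_ref(int v, int lo, int hi) {
    return (v < lo) ? lo : (hi < v) ? hi : v;
}

// The function under test, as in the report.
int foo(int x, int y) { return clamp_ref(x - y, -1, 1); }

// Sums foo() over all adjacent pairs. Returning the sum keeps the
// compiler from discarding the work as dead code.
std::int64_t run_once(const std::vector<int>& data) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i + 1 < data.size(); ++i)
        sum += foo(data[i], data[i + 1]);
    return sum;
}

// Runs `reps` passes over `data`, accumulates the results into `checksum`,
// and returns the elapsed wall-clock time in nanoseconds.
std::int64_t time_ns(const std::vector<int>& data, int reps,
                     std::int64_t& checksum) {
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r)
        checksum += run_once(data);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}
```

A single call site is small enough that measurement noise can dominate, so a
large input and many repetitions are advisable before drawing conclusions.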


* [Bug target/100021] [9/10/11 Regression] std::clamp unprofitable vectorization on -march=nehalem/.../broadwell
From: nok.raven at gmail dot com @ 2021-04-12 14:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100021

Nikita Kniazev <nok.raven at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|WAITING                     |RESOLVED

--- Comment #4 from Nikita Kniazev <nok.raven at gmail dot com> ---
> Also, you are passing -march=sandybridge, but the profiler seems to show Skylake (SKX) target.

I indeed missed that Compiler Explorer does not pass the -march flag to MCA
automatically.
