[Bug libstdc++/108856] New: Increment and decrement on std::experimental::where

public inbox for gcc-bugs@sourceware.org

* [Bug libstdc++/108856] New: Increment and decrement on std::experimental::where_expression should optimize better
@ 2023-02-20 10:23 mkretz at gcc dot gnu.org
  2023-02-20 14:09 ` [Bug libstdc++/108856] " mkretz at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: mkretz at gcc dot gnu.org @ 2023-02-20 10:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108856

            Bug ID: 108856
           Summary: Increment and decrement on
                    std::experimental::where_expression should optimize
                    better
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mkretz at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-*-*, i?86-*-*

#include <experimental/simd>

namespace stdx = std::experimental;

auto f(stdx::native_simd<int> a, stdx::native_simd_mask<int> k)
{
  ++where(k, a);
  return a;
}

With AVX512 this should compile to a bitmask-to-vector-mask conversion followed
by a subtraction:
        kmovw   k0, edi
        vpbroadcastmw2d zmm1, k0
        vpsubd  zmm0, zmm0, zmm1

Instead we get:
  vmovdqa32 zmm1, zmm0
  mov eax, 1
  kmovw k1, edi
  vpbroadcastd zmm0, eax
  vmovdqa32 zmm2, zmm1
  vpaddd zmm2{k1}, zmm1, zmm0
  vmovdqa32 zmm0, zmm2

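Without AVX512 the mask is already a vector holding -1 (all bits set) in every
selected lane and 0 elsewhere, so subtracting the mask adds 1 exactly where k
is true. This should therefore compile to a single subtraction: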
        vpsubd  ymm0, ymm0, ymm1

Instead we get:
  mov eax, 1
  vmovd xmm2, eax
  vpbroadcastd ymm2, xmm2
  vpaddd ymm2, ymm0, ymm2
  vpblendvb ymm0, ymm0, ymm2, ymm1
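
The single-subtraction form relies on the identity a - mask == a + 1 in lanes where the mask is all-ones. The following is a minimal scalar sketch of that identity, not the libstdc++ implementation. The names Lanes, all_ones, masked_increment_reference and masked_increment_by_subtraction are made up for illustration, and unsigned lanes are assumed so that wraparound is well defined in plain C++.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of eight 32-bit lanes (the lane count of one ymm register).
// Hypothetical type for illustration only.
using Lanes = std::array<std::uint32_t, 8>;

// A "true" vector-mask lane has all bits set, i.e. -1 as a signed value.
constexpr std::uint32_t all_ones = 0xFFFFFFFFu;

// Reference semantics of ++where(k, a): add 1 only in selected lanes.
Lanes masked_increment_reference(Lanes a, const Lanes& mask)
{
  for (std::size_t i = 0; i < a.size(); ++i)
    {
      assert(mask[i] == 0 || mask[i] == all_ones);
      if (mask[i] != 0)
        a[i] += 1;
    }
  return a;
}

// Same result with one subtraction per lane and no blend:
// a - 0 == a, and a - 0xFFFFFFFF == a + 1 (mod 2^32).
Lanes masked_increment_by_subtraction(Lanes a, const Lanes& mask)
{
  for (std::size_t i = 0; i < a.size(); ++i)
    a[i] -= mask[i];
  return a;
}
```

Both functions return the same lanes for any mask whose lanes are 0 or all-ones, which is why neither the broadcast of the constant 1 nor the blend is needed.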


* [Bug libstdc++/108856] Increment and decrement on std::experimental::where_expression should optimize better
  2023-02-20 10:23 [Bug libstdc++/108856] New: Increment and decrement on std::experimental::where_expression should optimize better mkretz at gcc dot gnu.org
@ 2023-02-20 14:09 ` mkretz at gcc dot gnu.org
  2023-02-24 18:40 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: mkretz at gcc dot gnu.org @ 2023-02-20 14:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108856

Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-02-20
             Status|UNCONFIRMED                 |ASSIGNED

--- Comment #1 from Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> ---
The optimized AVX512 part was wrong. It should be:
        vpternlogd      zmm1, zmm1, zmm1, 0xFF
        kmovw   k1, edi
        vpsubd  zmm0{k1}, zmm0, zmm1
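
To illustrate what the corrected sequence computes, here is a rough scalar model of the three instructions, under the assumption of merge masking (unselected lanes keep the old destination value, which is a itself because zmm0 is both source and destination). The names Zmm, masked_sub_model and masked_increment_model are hypothetical, and this is a sketch of the semantics rather than the code libstdc++ uses.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of sixteen 32-bit lanes (one zmm register).
// Hypothetical type for illustration only.
using Zmm = std::array<std::uint32_t, 16>;

// Models `vpsubd dst{k}, a, b` where dst and a are the same register:
// lanes whose bit in k is set receive a - b, all other lanes keep a.
Zmm masked_sub_model(const Zmm& a, const Zmm& b, std::uint16_t k)
{
  Zmm dst = a;
  for (std::size_t i = 0; i < dst.size(); ++i)
    if ((k >> i) & 1u)
      dst[i] = a[i] - b[i];  // unsigned arithmetic wraps mod 2^32
  return dst;
}

// Models the whole sequence for ++where(k, a).
Zmm masked_increment_model(const Zmm& a, std::uint16_t k)
{
  Zmm ones;
  ones.fill(0xFFFFFFFFu);  // vpternlogd zmm1, zmm1, zmm1, 0xFF sets every bit
  return masked_sub_model(a, ones, k);  // a - (-1) == a + 1 in selected lanes
}
```

The bitmask is used directly as the write mask, so no bitmask-to-vector-mask conversion is required.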


* [Bug libstdc++/108856] Increment and decrement on std::experimental::where_expression should optimize better
  2023-02-20 10:23 [Bug libstdc++/108856] New: Increment and decrement on std::experimental::where_expression should optimize better mkretz at gcc dot gnu.org
  2023-02-20 14:09 ` [Bug libstdc++/108856] " mkretz at gcc dot gnu.org
@ 2023-02-24 18:40 ` cvs-commit at gcc dot gnu.org
  2023-05-23 10:02 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-02-24 18:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108856

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Matthias Kretz <mkretz@gcc.gnu.org>:

https://gcc.gnu.org/g:6ce55180d494b616e2e3e68ffedfe9007e42ca06

commit r13-6333-g6ce55180d494b616e2e3e68ffedfe9007e42ca06
Author: Matthias Kretz <m.kretz@gsi.de>
Date:   Mon Feb 20 16:33:31 2023 +0100

    libstdc++: More efficient masked inc-/decrement implementation

    Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

    libstdc++-v3/ChangeLog:

            PR libstdc++/108856
            * include/experimental/bits/simd_builtin.h
            (_SimdImplBuiltin::_S_masked_unary): More efficient
            implementation of masked inc-/decrement for integers and floats
            without AVX2.
            * include/experimental/bits/simd_x86.h
            (_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
            builtins for masked inc-/decrement.


* [Bug libstdc++/108856] Increment and decrement on std::experimental::where_expression should optimize better
  2023-02-20 10:23 [Bug libstdc++/108856] New: Increment and decrement on std::experimental::where_expression should optimize better mkretz at gcc dot gnu.org
  2023-02-20 14:09 ` [Bug libstdc++/108856] " mkretz at gcc dot gnu.org
  2023-02-24 18:40 ` cvs-commit at gcc dot gnu.org
@ 2023-05-23 10:02 ` cvs-commit at gcc dot gnu.org
  2023-05-25  7:04 ` cvs-commit at gcc dot gnu.org
  2023-05-25  7:07 ` mkretz at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-23 10:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108856

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Matthias Kretz
<mkretz@gcc.gnu.org>:

https://gcc.gnu.org/g:4452077962d0c327dcb08670ab73f7197be53e91

commit r12-9640-g4452077962d0c327dcb08670ab73f7197be53e91
Author: Matthias Kretz <m.kretz@gsi.de>
Date:   Mon Feb 20 16:33:31 2023 +0100

    libstdc++: More efficient masked inc-/decrement implementation

    Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

    libstdc++-v3/ChangeLog:

            PR libstdc++/108856
            * include/experimental/bits/simd_builtin.h
            (_SimdImplBuiltin::_S_masked_unary): More efficient
            implementation of masked inc-/decrement for integers and floats
            without AVX2.
            * include/experimental/bits/simd_x86.h
            (_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
            builtins for masked inc-/decrement.

    (cherry picked from commit 6ce55180d494b616e2e3e68ffedfe9007e42ca06)


* [Bug libstdc++/108856] Increment and decrement on std::experimental::where_expression should optimize better
  2023-02-20 10:23 [Bug libstdc++/108856] New: Increment and decrement on std::experimental::where_expression should optimize better mkretz at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-05-23 10:02 ` cvs-commit at gcc dot gnu.org
@ 2023-05-25  7:04 ` cvs-commit at gcc dot gnu.org
  2023-05-25  7:07 ` mkretz at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-25  7:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108856

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Matthias Kretz
<mkretz@gcc.gnu.org>:

https://gcc.gnu.org/g:7408248888717405a30d9ee01c65aac8839926d2

commit r11-10814-g7408248888717405a30d9ee01c65aac8839926d2
Author: Matthias Kretz <m.kretz@gsi.de>
Date:   Mon Feb 20 16:33:31 2023 +0100

    libstdc++: More efficient masked inc-/decrement implementation

    Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

    libstdc++-v3/ChangeLog:

            PR libstdc++/108856
            * include/experimental/bits/simd_builtin.h
            (_SimdImplBuiltin::_S_masked_unary): More efficient
            implementation of masked inc-/decrement for integers and floats
            without AVX2.
            * include/experimental/bits/simd_x86.h
            (_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
            builtins for masked inc-/decrement.

    (cherry picked from commit 6ce55180d494b616e2e3e68ffedfe9007e42ca06)


* [Bug libstdc++/108856] Increment and decrement on std::experimental::where_expression should optimize better
  2023-02-20 10:23 [Bug libstdc++/108856] New: Increment and decrement on std::experimental::where_expression should optimize better mkretz at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-05-25  7:04 ` cvs-commit at gcc dot gnu.org
@ 2023-05-25  7:07 ` mkretz at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: mkretz at gcc dot gnu.org @ 2023-05-25  7:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108856

Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> ---
Resolved on all branches.

