* [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13
@ 2023-01-30 13:18 balder at yahooinc dot com
  2023-01-30 13:20 ` [Bug c++/108599] " balder at yahooinc dot com
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: balder at yahooinc dot com @ 2023-01-30 13:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

            Bug ID: 108599
           Summary: Incorrect code generation newer intel architectures
                    for gcc 12 and 13
           Product: gcc
           Version: 12.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: balder at yahooinc dot com
  Target Milestone: ---

The code fragment below generates incorrect code for some architectures.

It works fine when compiled with
c++ -Wall -Wextra -O2 -march=haswell -mtune=skylake test.cpp && ./a.out

Changing -mtune to skylake-avx512 makes it fail.
It also fails with -mtune set to cascadelake, icelake-client, or icelake-server.
It fails with both -O2 and -O3, but works fine with -O1 and -Og.

c++ -Wall -Wextra -O2 -march=haswell -mtune=skylake-avx512 test.cpp && ./a.out
a.out: test.cpp:23: void assert_stats(size_t, size_t, size_t, size_t, Stats):
Assertion `exp_dead == stats._dead' failed.

Compiler version:
c++ -v
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/opt/rh/gcc-toolset-12/root/usr/libexec/gcc/x86_64-redhat-linux/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,fortran,lto --prefix=/opt/rh/gcc-toolset-12/root/usr
--mandir=/opt/rh/gcc-toolset-12/root/usr/share/man
--infodir=/opt/rh/gcc-toolset-12/root/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-gcc-major-version-only --enable-libstdcxx-backtrace
--with-linker-hash-style=gnu --enable-plugin --enable-initfini-array
--with-isl=/builddir/build/BUILD/gcc-12.1.1-20220628/obj-x86_64-redhat-linux/isl-install
--enable-offload-targets=nvptx-none --without-cuda-driver
--enable-offload-defaulted --enable-gnu-indirect-function --enable-cet
--with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.1.1 20220628 (Red Hat 12.1.1-3) (GCC) 

------------------------ Code --------------------
#include <cstddef>
#include <cassert>

struct Stats
{
    size_t _used;
    size_t _hold;
    size_t _dead;
    size_t _extra_used;
    Stats() : _used(0), _hold(0), _dead(0), _extra_used(0) {}
    Stats used(size_t val) { _used += val; return *this; }
    Stats hold(size_t val) { _hold += val; return *this; }
};

void
assert_stats(size_t exp_used, size_t exp_hold, size_t exp_dead,
             size_t exp_extra_used, const Stats stats)
{
    assert(exp_used == stats._used);
    assert(exp_hold == stats._hold);
    assert(exp_dead == stats._dead);                // <===== Assert fails
    assert(exp_extra_used == stats._extra_used);
}

int main(int , char* [])
{
    assert_stats(16, 0, 0, 0, Stats().used(16).hold(0));
    assert_stats(16, 16, 0, 0, Stats().used(16).hold(16));    // <========= Causes assert to fail
    return 0;
}


* [Bug c++/108599] Incorrect code generation newer intel architectures for gcc 12 and 13
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
@ 2023-01-30 13:20 ` balder at yahooinc dot com
  2023-01-30 13:32 ` [Bug target/108599] " balder at yahooinc dot com
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: balder at yahooinc dot com @ 2023-01-30 13:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

--- Comment #1 from Henning Baldersheim <balder at yahooinc dot com> ---
This is spun out from https://github.com/vespa-engine/vespa/pull/25786


* [Bug target/108599] Incorrect code generation newer intel architectures for gcc 12 and 13
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
  2023-01-30 13:20 ` [Bug c++/108599] " balder at yahooinc dot com
@ 2023-01-30 13:32 ` balder at yahooinc dot com
  2023-01-30 13:49 ` [Bug target/108599] [12/13 Regression] Incorrect code generation newer intel architectures jakub at gcc dot gnu.org
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: balder at yahooinc dot com @ 2023-01-30 13:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

--- Comment #2 from Henning Baldersheim <balder at yahooinc dot com> ---
This is an even bigger issue with gcc 13, where -march=haswell alone is enough
to trigger it.

Works:
c++ -Wall -Wextra -O3 -march=ivybridge test.cpp && ./a.out

Fails:
c++ -Wall -Wextra -O3 -march=haswell test.cpp && ./a.out
a.out: test.cpp:23: void assert_stats(size_t, size_t, size_t, size_t, Stats):
Assertion `exp_dead == stats._dead' failed.

gcc version:
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/13/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto
--prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-gcc-major-version-only --enable-libstdcxx-backtrace
--with-libstdcxx-zoneinfo=/usr/share/zoneinfo
--with-linker-hash-style=gnu --enable-plugin --enable-initfini-array
--with-isl=/builddir/build/BUILD/gcc-13.0.1-20230117/obj-x86_64-redhat-linux/isl-install
--enable-offload-targets=nvptx-none --without-cuda-driver
--enable-offload-defaulted --enable-gnu-indirect-function --enable-cet
--with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
--with-build-config=bootstrap-lto --enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.1 20230117 (Red Hat 13.0.1-0) (GCC)


* [Bug target/108599] [12/13 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
  2023-01-30 13:20 ` [Bug c++/108599] " balder at yahooinc dot com
  2023-01-30 13:32 ` [Bug target/108599] " balder at yahooinc dot com
@ 2023-01-30 13:49 ` jakub at gcc dot gnu.org
  2023-01-30 14:00 ` jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-01-30 13:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.3
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-01-30
            Summary|Incorrect code generation   |[12/13 Regression]
                   |newer intel architectures   |Incorrect code generation
                   |for gcc 12 and 13           |newer intel architectures
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
With just -O3 -march=haswell, the failure on the following test started with
r13-4124-g156f523f9582f1e6bcce27ece03f2776960408c8
With -O2 -march=haswell -mtune=skylake-avx512, it started with
r12-4240-g2b8453c401b699ed93c085d0413ab4b5030bcdb8
Finally, with -O3 -march=haswell -mtune=skylake-avx512, it started with
r12-2666-g29f0e955c97da002b5adb4e8c9dfd2ea9709e207
__attribute__((noipa, noreturn)) void
bar (const char *, const char *, unsigned int, const char *) noexcept
{
  __builtin_abort ();
}

#  define assert(expr)                                                  \
     (static_cast <bool> (expr)                                         \
      ? void (0)                                                        \
      : bar (#expr, __FILE__, __LINE__, __PRETTY_FUNCTION__))

typedef decltype (sizeof 0) size_t;

struct Stats
{
  size_t _used;
  size_t _hold;
  size_t _dead;
  size_t _extra_used;
  Stats () : _used(0), _hold(0), _dead(0), _extra_used(0) {}
  Stats used (size_t val) { _used += val; return *this; }
  Stats hold (size_t val) { _hold += val; return *this; }
};

void
foo (size_t exp_used, size_t exp_hold, size_t exp_dead,
     size_t exp_extra_used, const Stats stats)
{
  assert (exp_used == stats._used);
  assert (exp_hold == stats._hold);
  assert (exp_dead == stats._dead);
  assert (exp_extra_used == stats._extra_used);
}

int
main ()
{
  foo (16, 0, 0, 0, Stats ().used (16).hold (0));
  foo (16, 16, 0, 0, Stats ().used (16).hold (16));
}


* [Bug target/108599] [12/13 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (2 preceding siblings ...)
  2023-01-30 13:49 ` [Bug target/108599] [12/13 Regression] Incorrect code generation newer intel architectures jakub at gcc dot gnu.org
@ 2023-01-30 14:00 ` jakub at gcc dot gnu.org
  2023-01-30 14:10 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-01-30 14:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
More simplified:
struct S
{
  unsigned long a, b, c, d;
  S () : a(0), b(0), c(0), d(0) {}
  S bar (unsigned long val) { a += val; return *this; }
  S baz (unsigned long val) { b += val; return *this; }
};

__attribute__((noipa)) void
foo (unsigned long x, unsigned long y, unsigned long z, unsigned long w,
     const S s)
{
  if (s.a != x || s.b != y || s.c != z || s.d != w)
    __builtin_abort ();
}

int
main ()
{
  foo (16, 0, 0, 0, S ().bar (16).baz (0));
  foo (16, 16, 0, 0, S ().bar (16).baz (16));
}

It is main that is being miscompiled during RTL passes, in *.optimized we have
correct:
  MEM <vector(4) long unsigned int> [(void *)&D.3337] = { 16, 0, 0, 0 };
  foo (16, 0, 0, 0, D.3337);
  MEM <vector(4) long unsigned int> [(void *)&D.3338] = { 16, 16, 0, 0 };
  foo (16, 16, 0, 0, D.3338);
but what is actually passed in the second case is { 16, 16, 16, 16 }.
I'll have a look.


* [Bug target/108599] [12/13 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (3 preceding siblings ...)
  2023-01-30 14:00 ` jakub at gcc dot gnu.org
@ 2023-01-30 14:10 ` jakub at gcc dot gnu.org
  2023-01-30 14:31 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-01-30 14:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
cse2 still has correct:
(insn 28 27 29 2 (set (reg:V4DI 86)
        (mem/u/c:V4DI (symbol_ref/u:DI ("*.LC2") [flags 0x2]) [0  S32 A256]))
"pr108599.C":6:49 1811 {movv4di_internal}
     (expr_list:REG_EQUAL (const_vector:V4DI [
                (const_int 16 [0x10]) repeated x2
                (const_int 0 [0]) repeated x2
            ])
        (nil)))
(insn 29 28 30 2 (set (mem/c:V4DI (plus:DI (reg/f:DI 19 frame)
                (const_int -32 [0xffffffffffffffe0])) [0 MEM <vector(4) long
unsigned int> [(void *)&D.3338]+0 S32 A256])
        (reg:V4DI 86)) "pr108599.C":6:49 1811 {movv4di_internal}
     (expr_list:REG_DEAD (reg:V4DI 86)
        (nil)))
...
(insn 35 33 36 2 (set (reg:OI 88 [ D.3338 ])
        (mem/c:OI (plus:DI (reg/f:DI 19 frame)
                (const_int -32 [0xffffffffffffffe0])) [2 D.3338+0 S32 A256]))
"pr108599.C":20:7 discrim 3 80 {*movoi_internal_avx}
     (nil))
(insn 36 35 37 2 (set (mem:OI (reg/f:DI 7 sp) [0  S32 A64])
        (reg:OI 88 [ D.3338 ])) "pr108599.C":20:7 discrim 3 80
{*movoi_internal_avx}
     (expr_list:REG_DEAD (reg:OI 88 [ D.3338 ])
        (nil)))
But dse1 turns it into incorrect:
(insn 28 27 53 2 (set (reg:V4DI 86)
        (mem/u/c:V4DI (symbol_ref/u:DI ("*.LC2") [flags 0x2]) [0  S32 A256]))
"pr108599.C":6:49 1811 {movv4di_internal}
     (expr_list:REG_EQUAL (const_vector:V4DI [
                (const_int 16 [0x10]) repeated x2
                (const_int 0 [0]) repeated x2
            ])
        (nil)))
(insn 53 28 52 2 (set (reg:DI 94)
        (const_int 16 [0x10])) "pr108599.C":6:49 82 {*movdi_internal}
     (nil))
(insn 52 53 54 2 (set (reg:V4DI 93)
        (vec_duplicate:V4DI (reg:DI 94))) "pr108599.C":6:49 8004 {vec_dupv4di}
     (expr_list:REG_DEAD (reg:DI 94)
        (nil)))
(insn 54 52 30 2 (set (reg:OI 92)
        (subreg:OI (reg:V4DI 93) 0)) "pr108599.C":6:49 80 {*movoi_internal_avx}
     (expr_list:REG_DEAD (reg:V4DI 93)
        (expr_list:REG_EQUAL (const_wide_int 0x100000000000000010)
            (nil))))
...
(insn 35 33 36 2 (set (reg:OI 88 [ D.3338 ])
        (reg:OI 92)) "pr108599.C":20:7 discrim 3 80 {*movoi_internal_avx}
     (expr_list:REG_DEAD (reg:OI 92)
        (nil)))
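
The effect of the dse1 rewrite above can be modeled outside the compiler. The following is only a minimal sketch: `Lanes`, `vec_duplicate` and `expected_constant` are hypothetical names chosen for illustration, not GCC internals. It shows that duplicating the DImode value 16 into four lanes cannot reproduce the { 16, 16, 0, 0 } constant recorded in the REG_EQUAL note of insn 28.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a V4DI value: four 64-bit lanes, lane 0 first.
using Lanes = std::array<std::uint64_t, 4>;

// Models what vec_duplicate:V4DI does: copy one 64-bit scalar into every lane.
inline Lanes vec_duplicate(std::uint64_t scalar) {
  return {scalar, scalar, scalar, scalar};
}

// The constant the original store was supposed to write, { 16, 16, 0, 0 }.
inline Lanes expected_constant() {
  return {16, 16, 0, 0};
}
```

Comparing `vec_duplicate(16)` with `expected_constant()` shows the two upper lanes differ (16 instead of 0), which matches the { 16, 16, 16, 16 } argument observed in the second call to foo.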


* [Bug target/108599] [12/13 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (4 preceding siblings ...)
  2023-01-30 14:10 ` jakub at gcc dot gnu.org
@ 2023-01-30 14:31 ` jakub at gcc dot gnu.org
  2023-01-30 15:17 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-01-30 14:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The bug is in ix86_convert_const_wide_int_to_broadcast.  It is called with
OImode and (const_wide_int 0x100000000000000010).  That CONST_WIDE_INT is
indeed usable as a broadcast from DImode 0x10, but only to TImode, not to
OImode or XImode.
  /* Check if OP can be broadcasted from VAL.  */
  for (int i = 1; i < CONST_WIDE_INT_NUNITS (op); i++)
    if (val != CONST_WIDE_INT_ELT (op, i))
      return nullptr;
checks all elements of the CONST_WIDE_INT, but nothing checks that it has the
expected number of elements...
Note, 0 and -1 shouldn't happen here, those would be CONST_INT rather than
CONST_WIDE_INT and all the others have to use the right number of
CONST_WIDE_INT_NUNITS in order to be broadcastable.
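
To make the element-count problem concrete, here is a small standalone model. It is a sketch under stated assumptions, not GCC code: `WideConst`, `expand` and `all_stored_elts_equal` are hypothetical names, elements are assumed to be 64 bits wide, and only the stored elements of the constant are kept, with the missing upper ones filled with 0 or all-ones according to the sign of the top stored element.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a CONST_WIDE_INT: only the stored 64-bit
// elements, least significant first.
using WideConst = std::vector<std::uint64_t>;

// Expand the constant to the full element count of the target mode.
// Elements that are not stored are 0 or all-ones, depending on the sign
// bit of the most significant stored element.
std::vector<std::uint64_t> expand(const WideConst &op, std::size_t mode_elts) {
  std::vector<std::uint64_t> full(op.begin(), op.end());
  const std::uint64_t fill =
      (!op.empty() && (op.back() >> 63)) ? ~std::uint64_t{0} : 0;
  full.resize(mode_elts, fill);
  return full;
}

// Same shape as the loop quoted above: it only looks at stored elements.
bool all_stored_elts_equal(const WideConst &op) {
  for (std::size_t i = 1; i < op.size(); i++)
    if (op[0] != op[i])
      return false;
  return true;
}
```

For the two-element constant { 0x10, 0x10 }, `all_stored_elts_equal` returns true, yet `expand` to four elements (OImode) gives { 0x10, 0x10, 0, 0 }, so a four-lane broadcast of 0x10 would be wrong. Expanding to two elements (TImode) leaves the value unchanged, which is the one case where the broadcast is valid.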


* [Bug target/108599] [12/13 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (5 preceding siblings ...)
  2023-01-30 14:31 ` jakub at gcc dot gnu.org
@ 2023-01-30 15:17 ` rguenth at gcc dot gnu.org
  2023-01-30 15:19 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-30 15:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-* i?86-*-*
           Priority|P3                          |P2


* [Bug target/108599] [12/13 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (6 preceding siblings ...)
  2023-01-30 15:17 ` rguenth at gcc dot gnu.org
@ 2023-01-30 15:19 ` jakub at gcc dot gnu.org
  2023-01-31  9:12 ` cvs-commit at gcc dot gnu.org
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-01-30 15:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54372
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54372&action=edit
gcc13-pr108599.patch

Untested fix.


* [Bug target/108599] [12/13 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (7 preceding siblings ...)
  2023-01-30 15:19 ` jakub at gcc dot gnu.org
@ 2023-01-31  9:12 ` cvs-commit at gcc dot gnu.org
  2023-01-31  9:14 ` [Bug target/108599] [12 " jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-01-31  9:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:963315a922e228c4f6853826666151fc540f111a

commit r13-5529-g963315a922e228c4f6853826666151fc540f111a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Jan 31 10:12:19 2023 +0100

    i386: Fix up ix86_convert_const_wide_int_to_broadcast [PR108599]

    The following testcase is miscompiled.  The problem is that during
    RTL DSE we see a V4DI register is being loaded { 16, 16, 0, 0 }
    value and DSE mostly works in terms of scalar modes, so it calls
    movoi to set an OImode REG to (const_wide_int 0x100000000000000010)
    and ix86_convert_const_wide_int_to_broadcast thinks it can compute
    that value by broadcasting DImode 0x10.  While it is true that
    for TImode result the broadcast could be used, for OImode/XImode
    it can't be, because all but the lowest 2 HOST_WIDE_INTs aren't
    present (so are 0 or -1 depending on sign), not 0x10 in this case.
    The function checks if the least significant HOST_WIDE_INT elt
    of the CONST_WIDE_INT is broadcastable from QI/HI/SI/DImode and then
      /* Check if OP can be broadcasted from VAL.  */
      for (int i = 1; i < CONST_WIDE_INT_NUNITS (op); i++)
        if (val != CONST_WIDE_INT_ELT (op, i))
          return nullptr;
    That is needed of course, but nothing checks that
    CONST_WIDE_INT_NUNITS (op) isn't too small for the mode in question.
    I think if op would be 0 or -1, it ought to be never CONST_WIDE_INT,
    but CONST_INT and so we can just punt whenever the number of
    CONST_WIDE_INT elts is not the expected one.

    2023-01-31  Jakub Jelinek  <jakub@redhat.com>

            PR target/108599
            * config/i386/i386-expand.cc
            (ix86_convert_const_wide_int_to_broadcast): Return nullptr if
            CONST_WIDE_INT_NUNITS (op) times HOST_BITS_PER_WIDE_INT isn't
            equal to bitsize of mode.

            * gcc.target/i386/avx2-pr108599.c: New test.
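
    The ChangeLog entry above describes the added condition in words.  As a
    rough illustration only (this is not the committed patch), the check can
    be modeled with a hypothetical helper `broadcast_source` that takes the
    stored elements and the bit size of the mode.  `kBitsPerElt` stands in
    for HOST_BITS_PER_WIDE_INT and is assumed to be 64; the model covers
    only a broadcast from a full 64-bit element.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Assumption: stands in for HOST_BITS_PER_WIDE_INT, taken to be 64 here.
constexpr unsigned kBitsPerElt = 64;

// Hypothetical model of the fixed check. Returns the 64-bit value to
// broadcast, or std::nullopt when the constant must not be treated as a
// broadcast (the counterpart of returning nullptr).
std::optional<std::uint64_t>
broadcast_source(const std::vector<std::uint64_t> &elts, unsigned mode_bits) {
  // Added condition: the stored elements must cover the mode exactly.
  if (elts.empty() || elts.size() * kBitsPerElt != mode_bits)
    return std::nullopt;
  // Existing condition: every stored element equals the first one.
  for (std::size_t i = 1; i < elts.size(); i++)
    if (elts[0] != elts[i])
      return std::nullopt;
  return elts[0];
}
```

    With this model, `broadcast_source({0x10, 0x10}, 256)` is rejected,
    while `broadcast_source({0x10, 0x10}, 128)` still yields 0x10.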


* [Bug target/108599] [12 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (8 preceding siblings ...)
  2023-01-31  9:12 ` cvs-commit at gcc dot gnu.org
@ 2023-01-31  9:14 ` jakub at gcc dot gnu.org
  2023-02-09 11:54 ` balder at yahooinc dot com
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-01-31  9:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[12/13 Regression]          |[12 Regression] Incorrect
                   |Incorrect code generation   |code generation newer intel
                   |newer intel architectures   |architectures

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk so far.


* [Bug target/108599] [12 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (9 preceding siblings ...)
  2023-01-31  9:14 ` [Bug target/108599] [12 " jakub at gcc dot gnu.org
@ 2023-02-09 11:54 ` balder at yahooinc dot com
  2023-02-09 12:02 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: balder at yahooinc dot com @ 2023-02-09 11:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

--- Comment #10 from Henning Baldersheim <balder at yahooinc dot com> ---
Will this be backported to gcc-12, or do we need to wait for gcc-13?


* [Bug target/108599] [12 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (10 preceding siblings ...)
  2023-02-09 11:54 ` balder at yahooinc dot com
@ 2023-02-09 12:02 ` jakub at gcc dot gnu.org
  2023-02-10 17:46 ` cvs-commit at gcc dot gnu.org
  2023-02-10 18:01 ` jakub at gcc dot gnu.org
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-02-09 12:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
At some point yes, don't know when exactly.  Will need to collect several
dozens of backports and test them.


* [Bug target/108599] [12 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (11 preceding siblings ...)
  2023-02-09 12:02 ` jakub at gcc dot gnu.org
@ 2023-02-10 17:46 ` cvs-commit at gcc dot gnu.org
  2023-02-10 18:01 ` jakub at gcc dot gnu.org
  13 siblings, 0 replies; 15+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-02-10 17:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:7d7f275ebe7295264a0406876c0670e25a50169a

commit r12-9147-g7d7f275ebe7295264a0406876c0670e25a50169a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Jan 31 10:12:19 2023 +0100

    i386: Fix up ix86_convert_const_wide_int_to_broadcast [PR108599]

    The following testcase is miscompiled.  The problem is that during
    RTL DSE we see a V4DI register is being loaded { 16, 16, 0, 0 }
    value and DSE mostly works in terms of scalar modes, so it calls
    movoi to set an OImode REG to (const_wide_int 0x100000000000000010)
    and ix86_convert_const_wide_int_to_broadcast thinks it can compute
    that value by broadcasting DImode 0x10.  While it is true that
    for TImode result the broadcast could be used, for OImode/XImode
    it can't be, because all but the lowest 2 HOST_WIDE_INTs aren't
    present (so are 0 or -1 depending on sign), not 0x10 in this case.
    The function checks if the least significant HOST_WIDE_INT elt
    of the CONST_WIDE_INT is broadcastable from QI/HI/SI/DImode and then
      /* Check if OP can be broadcasted from VAL.  */
      for (int i = 1; i < CONST_WIDE_INT_NUNITS (op); i++)
        if (val != CONST_WIDE_INT_ELT (op, i))
          return nullptr;
    That is needed of course, but nothing checks that
    CONST_WIDE_INT_NUNITS (op) isn't too small for the mode in question.
    I think if op would be 0 or -1, it ought to be never CONST_WIDE_INT,
    but CONST_INT and so we can just punt whenever the number of
    CONST_WIDE_INT elts is not the expected one.

    2023-01-31  Jakub Jelinek  <jakub@redhat.com>

            PR target/108599
            * config/i386/i386-expand.cc
            (ix86_convert_const_wide_int_to_broadcast): Return nullptr if
            CONST_WIDE_INT_NUNITS (op) times HOST_BITS_PER_WIDE_INT isn't
            equal to bitsize of mode.

            * gcc.target/i386/avx2-pr108599.c: New test.

    (cherry picked from commit 963315a922e228c4f6853826666151fc540f111a)


* [Bug target/108599] [12 Regression] Incorrect code generation newer intel architectures
  2023-01-30 13:18 [Bug c++/108599] New: Incorrect code generation newer intel architectures for gcc 12 and 13 balder at yahooinc dot com
                   ` (12 preceding siblings ...)
  2023-02-10 17:46 ` cvs-commit at gcc dot gnu.org
@ 2023-02-10 18:01 ` jakub at gcc dot gnu.org
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-02-10 18:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed for gcc 12.3 too.
