public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
@ 2021-01-15  1:42 vsevolod.livinskij at frtk dot ru
  2021-01-15  8:11 ` [Bug tree-optimization/98694] " crazylht at gmail dot com
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: vsevolod.livinskij at frtk dot ru @ 2021-01-15  1:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

            Bug ID: 98694
           Summary: GCC produces incorrect code for loops with -O3 for
                    skylake-avx512 and icelake-server
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vsevolod.livinskij at frtk dot ru
  Target Milestone: ---

The reproducer is a bit big, but I was not able to reduce it further.
Reproducer:

// func.cpp
#include <algorithm>

extern short var_1, var_29, var_89;
extern unsigned var_2, var_11;
extern bool var_4;
extern long var_6;
extern char var_7;
extern int var_8, var_10;
extern short arr_206[10][14][13][21][14] __attribute__((aligned));
extern int arr_257[];

long f(long l) { return 0 > l ? 0 : l; }

void test() {
  var_11 = var_6;
  for (char a = 0; a < (char)var_2; a = 6)
    for (int b = 0; b < var_2; b = ~0)
      for (int c = 0; c < 2; c = var_1)
        for (bool d = 0; d < var_4; d = 1)
          var_29 = f(~var_6);
  for (short e = 0; e < short(var_6); e = var_6) {
    for (; 0 < (int)var_6;)
      ;
    for (char g = 0; g < 4; g++)
      for (; std::min(var_7 / 405077347810ULL, (unsigned long long)9);
           var_7 += 2)
        for (char h = 0; h < (char)var_8; h += 4)
          for (short i = 0; i < (var_4 && var_6) + 13; i++) {
            arr_206[0][g][0][h][i] = var_6;
            var_89 = std::min(var_4 ?: 709U, (unsigned)var_4);
          }
    for (short j = 0; j < var_2; j += 4)
      for (int k = 0; k < 5U; k = var_10)
        arr_257[k] = var_6;
  }
}

// driver.cpp
#include <stdio.h>

short var_1 = (short)7531;
unsigned int var_2 = 187158918U;
bool var_4 = (bool)1;
unsigned long long int var_6 = 10263287916162477044ULL;
signed char var_7 = 0;
long long int var_8 = 21;
unsigned int var_10 = 3309705747U;
unsigned int var_11 = 222967114U;
short var_29 = (short)-22723;
short var_89 = (short)-19017;
short arr_206 [10] [14] [13] [21] [14] __attribute__((aligned));
int arr_257 [5];

void test();

int main() {
    test();
    for (size_t i_0 = 0; i_0 < 5; ++i_0)
        printf("%d ", arr_257 [i_0]);
    printf("\n");
}

Error:

>$ g++ -march=skylake-avx512 func.cpp driver.cpp -O2 && sde -skx -- ./a.out 
-2039714828 0 0 0 0 
>$ g++ -march=skylake-avx512 func.cpp driver.cpp -O3 && sde -skx -- ./a.out 
27636 0 0 0 0

gcc version 11.0.0 20210113 (8fc183ccd0628465205b8a88c29ab69bfe74a08a)
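
A quick arithmetic check on the two outputs may help narrow this down. 27636 is what one gets by keeping only the low 16 bits of var_6 and zero-extending them, while -2039714828 is the low 32 bits of var_6 read as a signed int. The sketch below is only an illustration of that observation; the names `low32_as_int`, `low16_zero_extended` and `print_readings` are made up here and are not part of the reproducer.

```c
#include <stdint.h>
#include <stdio.h>

/* Initial value of var_6, copied from driver.cpp. */
static const uint64_t VAR_6 = 10263287916162477044ULL;

/* Low 32 bits reinterpreted as a signed 32-bit value (two's complement),
   computed without relying on implementation-defined narrowing. */
int32_t low32_as_int(uint64_t v) {
    uint32_t u = (uint32_t)v;
    if (u > (uint32_t)INT32_MAX)
        return (int32_t)((int64_t)u - 4294967296LL);
    return (int32_t)u;
}

/* Low 16 bits, zero-extended to 32 bits. */
int32_t low16_zero_extended(uint64_t v) {
    return (int32_t)(uint16_t)v;
}

/* Hypothetical helper: print both readings of var_6 side by side. */
void print_readings(void) {
    printf("%d %d\n", (int)low32_as_int(VAR_6),
           (int)low16_zero_extended(VAR_6));
}
```

So the -O3 binary appears to store a 16-bit truncation of var_6 into arr_257[0] where the -O2 binary stores the 32-bit one.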


* [Bug tree-optimization/98694] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
@ 2021-01-15  8:11 ` crazylht at gmail dot com
  2021-01-15  9:43 ` [Bug target/98694] " marxin at gcc dot gnu.org
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-15  8:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
cprop_hardreg changes

(insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
        (reg:SI 37 r9 [orig:86 _11 ] [86])) "test.c":29:36 75 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
        (nil)))

to

(insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
        (reg:SI 22 xmm2 [orig:86 _11 ] [86])) "test.c":29:36 75 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
        (nil)))

presumably because it thinks the low 32 bits of r9 and xmm2 are the same?

But xmm2 is defined as

        kmovw   %k0, %edi       # 69    [c=4 l=4]  *movhi_internal/6
        kmovd   %k0, %edx       # 487   [c=4 l=3]  *movsi_internal/16
        vmovd   %edi, %xmm2     # 489

so bits 16-31 of xmm2 are cleared by kmovw (note that k0 is equal to r9 in
SImode; it is var_6 in the source code).

(insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
        (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76 {*movhi_internal}
     (nil))

(insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
        (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
     (nil))
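
To make the question above concrete, here is a toy C model of the moves involved. It assumes the usual x86 behaviour that kmovw zero-extends the 16-bit mask value into the 32-bit destination and that kmovd copies all 32 bits. The struct and function names are invented for illustration and do not correspond to anything in GCC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy register file: only the registers named in the dump above. */
struct regs {
    uint32_t k0;    /* low 32 bits of the mask register */
    uint32_t edi;
    uint32_t r9d;
    uint32_t xmm2;  /* low 32 bits of xmm2 */
    uint32_t r11d;
};

/* Run the sequence. If use_xmm2 is true, insn 457 reads xmm2 (as after
   cprop_hardreg); otherwise it reads r9 (as before the pass). */
uint32_t run_sequence(uint32_t k0_value, bool use_xmm2) {
    struct regs r = {0};
    r.k0 = k0_value;
    r.edi = r.k0 & 0xFFFFu;              /* kmovw %k0, %edi: bits 16-31 zero */
    r.xmm2 = r.edi;                      /* vmovd %edi, %xmm2 */
    r.r9d = r.k0;                        /* kmovd %k0, %r9d: all 32 bits */
    r.r11d = use_xmm2 ? r.xmm2 : r.r9d;  /* insn 457 */
    return r.r11d;
}
```

In this model the two variants agree only when bits 16-31 of k0 happen to be zero, for example run_sequence(0x1234, true) equals run_sequence(0x1234, false), while for 0x12345678 they differ.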


* [Bug target/98694] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
  2021-01-15  8:11 ` [Bug tree-optimization/98694] " crazylht at gmail dot com
@ 2021-01-15  9:43 ` marxin at gcc dot gnu.org
  2021-01-15 10:03 ` [Bug target/98694] [11 Regression] " rguenth at gcc dot gnu.org
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-01-15  9:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-01-15
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Martin Liška <marxin at gcc dot gnu.org> ---
Just for the record, started with r11-4428-g4a369d199bf2f34e but it only made
it visible I think.


* [Bug target/98694] [11 Regression] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
  2021-01-15  8:11 ` [Bug tree-optimization/98694] " crazylht at gmail dot com
  2021-01-15  9:43 ` [Bug target/98694] " marxin at gcc dot gnu.org
@ 2021-01-15 10:03 ` rguenth at gcc dot gnu.org
  2021-01-15 16:52 ` crazylht at gmail dot com
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-15 10:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
   Target Milestone|---                         |11.0
                 CC|                            |rguenth at gcc dot gnu.org
            Summary|GCC produces incorrect code |[11 Regression] GCC
                   |for loops with -O3 for      |produces incorrect code for
                   |skylake-avx512 and          |loops with -O3 for
                   |icelake-server              |skylake-avx512 and
                   |                            |icelake-server


* [Bug target/98694] [11 Regression] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (2 preceding siblings ...)
  2021-01-15 10:03 ` [Bug target/98694] [11 Regression] " rguenth at gcc dot gnu.org
@ 2021-01-15 16:52 ` crazylht at gmail dot com
  2021-01-15 16:56 ` crazylht at gmail dot com
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-15 16:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #1)
> cprop_hardreg changes
> 
> (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
>         (reg:SI 37 r9 [orig:86 _11 ] [86])) "test.c":29:36 75 {*movsi_internal}
>      (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
>         (nil)))
> 
> to
> 
> (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
>         (reg:SI 22 xmm2 [orig:86 _11 ] [86])) "test.c":29:36 75 {*movsi_internal}
>      (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
>         (nil)))
> 
> presumably because it thinks the low 32 bits of r9 and xmm2 are the same?
> 
> But xmm2 is defined as
> 
> 	kmovw	%k0, %edi	# 69	[c=4 l=4]  *movhi_internal/6
> 	kmovd	%k0, %edx	# 487	[c=4 l=3]  *movsi_internal/16
> 	vmovd	%edi, %xmm2	# 489
> 
> so bits 16-31 of xmm2 are cleared by kmovw (note that k0 is equal to r9 in
> SImode; it is var_6 in the source code).
> 
> (insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
>         (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76 {*movhi_internal}
>      (nil))
> 
> (insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
>         (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
>      (nil))

It seems this case is meant to be handled here.

cut from copy_value in regcprop.c:
----
  /* If SRC had been assigned a mode narrower than the copy, we can't
     link DEST into the chain, because not all of the pieces of the
     copy came from oldest_regno.  */
  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode))
    return;
----

Here %edi is set in HImode but then used in SImode and copied to %xmm2. The
condition fails to catch this because both SImode and HImode have nregs equal
to 1. Since the upper part could be garbage, DEST must not be linked into the
chain.

        kmovw   %k0, %edi       # 69    [c=4 l=4]  *movhi_internal/6  <----HI
        kmovd   %k0, %edx       # 487   [c=4 l=3]  *movsi_internal/16 
        vmovd   %edi, %xmm2     # 489   [c=4 l=6]  *movsi_internal/13 <----SI
        sall    $16, %edx       # 73    [c=4 l=3]  *ashlsi3_1/0
        kmovw   %k0, %r8d       # 74    [c=4 l=5]  *zero_extendhisi2/1
        vpshuflw        $0, %xmm2, %xmm0        # 78    [c=4 l=5]  *vec_dupv4hi/1
        orl     %edx, %r8d      # 75    [c=4 l=3]  *iorsi_1/0
        testw   %di, %di        # 82    [c=4 l=3]  *cmphi_ccno_1/0
        jle     .L52    # 83    [c=12 l=6]  *jcc
        kmovd   %k0, %r9d       # 85    [c=4 l=4]  *movsi_internal/16 <----SI
        testl   %r9d, %r9d      # 88    [c=4 l=3]  *cmpsi_ccno_1/0
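
A toy model of why the quoted check lets this through may be useful. It is a sketch under stated assumptions, not GCC code: modes are reduced to their byte sizes, a general register is assumed to be 8 bytes wide, and `toy_nregs`, `rejected_by_nregs_check` and `copy_is_wider` are hypothetical names. The size-based test at the end only illustrates the kind of condition that would catch the HImode/SImode case; it is not claimed to be the committed fix.

```c
#include <stdbool.h>

/* Toy stand-ins for machine modes: just their size in bytes. */
enum toy_mode {
    TOY_HImode = 2,
    TOY_SImode = 4,
    TOY_DImode = 8,
    TOY_TImode = 16
};

/* Assumed width of one general-purpose hard register in this model. */
#define TOY_WORD_BYTES 8

/* Number of hard registers needed for a value of the given mode;
   a simplified stand-in for hard_regno_nregs on general registers. */
int toy_nregs(enum toy_mode m) {
    return ((int)m + TOY_WORD_BYTES - 1) / TOY_WORD_BYTES;
}

/* Shape of the existing test: give up only if the copy spans more
   registers than the mode in which SRC was last set. */
bool rejected_by_nregs_check(enum toy_mode set_mode, enum toy_mode copy_mode) {
    return toy_nregs(copy_mode) > toy_nregs(set_mode);
}

/* Illustrative stricter test: the copy is wider than the mode in which
   SRC was set, so its upper bits may be garbage. */
bool copy_is_wider(enum toy_mode set_mode, enum toy_mode copy_mode) {
    return (int)copy_mode > (int)set_mode;
}
```

With these definitions, rejected_by_nregs_check(TOY_HImode, TOY_SImode) is false because both modes need one register, whereas copy_is_wider(TOY_HImode, TOY_SImode) is true.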


* [Bug target/98694] [11 Regression] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (3 preceding siblings ...)
  2021-01-15 16:52 ` crazylht at gmail dot com
@ 2021-01-15 16:56 ` crazylht at gmail dot com
  2021-01-15 17:22 ` crazylht at gmail dot com
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-15 16:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---

> It seems this case is meant to be handled here.
>  
> cut from copy_value in regcprop.c:
> ----
>   /* If SRC had been assigned a mode narrower than the copy, we can't
>      link DEST into the chain, because not all of the pieces of the
>      copy came from oldest_regno.  */
>   else if (sn > hard_regno_nregs (sr, vd->e[sr].mode))
>     return;
> ----
> 
> Here %edi is set in HImode but then used in SImode and copied to %xmm2. The
> condition fails to catch this because both SImode and HImode have nregs
> equal to 1. Since the upper part could be garbage, DEST must not be linked
> into the chain.
> 
>         kmovw   %k0, %edi       # 69    [c=4 l=4]  *movhi_internal/6  <----HI
>         kmovd   %k0, %edx       # 487   [c=4 l=3]  *movsi_internal/16 
>         vmovd   %edi, %xmm2     # 489   [c=4 l=6]  *movsi_internal/13 <----SI
>         sall    $16, %edx       # 73    [c=4 l=3]  *ashlsi3_1/0
>         kmovw   %k0, %r8d       # 74    [c=4 l=5]  *zero_extendhisi2/1
>         vpshuflw        $0, %xmm2, %xmm0        # 78    [c=4 l=5]  *vec_dupv4hi/1
>         orl     %edx, %r8d      # 75    [c=4 l=3]  *iorsi_1/0
>         testw   %di, %di        # 82    [c=4 l=3]  *cmphi_ccno_1/0
>         jle     .L52    # 83    [c=12 l=6]  *jcc
>         kmovd   %k0, %r9d       # 85    [c=4 l=4]  *movsi_internal/16 <----SI
>         testl   %r9d, %r9d      # 88    [c=4 l=3]  *cmpsi_ccno_1/0

And it looks like a bug in generic code.


* [Bug target/98694] [11 Regression] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (4 preceding siblings ...)
  2021-01-15 16:56 ` crazylht at gmail dot com
@ 2021-01-15 17:22 ` crazylht at gmail dot com
  2021-01-15 17:33 ` crazylht at gmail dot com
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-15 17:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
And rewriting the pattern
(define_insn "*vec_dupv4hi"
  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
        (vec_duplicate:V4HI
          (truncate:HI
            (match_operand:SI 1 "register_operand" "0,xYw"))))]
to 

(define_insn "*vec_dupv4hi"
  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
        (vec_duplicate:V4HI
          (vec_select:HI
            (match_operand:V4HI 1 "register_operand" "0,xYw")
            (parallel [(const_int 0)]))))]

could avoid this issue.
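
For readers less familiar with the RTL: the first pattern roughly means "take the low 16 bits of a 32-bit operand and copy them into all four lanes". Below is a small C sketch of that reading, using a plain struct rather than a real vector type. `v4hi_model` and `vec_dup_truncate` are made-up names for illustration only.

```c
#include <stdint.h>

/* Four 16-bit lanes, standing in for a V4HI value. */
struct v4hi_model {
    uint16_t lane[4];
};

/* (vec_duplicate:V4HI (truncate:HI x)): only bits 0-15 of x are used,
   so bits 16-31 of the source do not affect the result. */
struct v4hi_model vec_dup_truncate(uint32_t x) {
    struct v4hi_model v;
    uint16_t low = (uint16_t)x;  /* truncate:HI */
    for (int i = 0; i < 4; i++)
        v.lane[i] = low;         /* vec_duplicate */
    return v;
}
```

The pattern itself therefore does not care what the upper half of its SImode input holds; trouble starts only when some other instruction reads that same register as a full SImode value.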


* [Bug target/98694] [11 Regression] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (5 preceding siblings ...)
  2021-01-15 17:22 ` crazylht at gmail dot com
@ 2021-01-15 17:33 ` crazylht at gmail dot com
  2021-01-18  9:41 ` crazylht at gmail dot com
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-15 17:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #5)
> And rewriting the pattern
> (define_insn "*vec_dupv4hi"
>   [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
>         (vec_duplicate:V4HI
>           (truncate:HI
>             (match_operand:SI 1 "register_operand" "0,xYw"))))]
> to 
> 
> (define_insn "*vec_dupv4hi"
>   [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
>         (vec_duplicate:V4HI
>           (vec_select:HI
>             (match_operand:V4HI 1 "register_operand" "0,xYw")
>             (parallel [(const_int 0)]))))]
> 
> could avoid this issue.

Oh, not workable.
---
  if (MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
    {
      /* Vector registers do not support QI or HImode loads.  If we don't
         disallow a change to these modes, reload will assume it's ok to
         drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
         the vec_dupv4hi pattern.  */
      if (GET_MODE_SIZE (from) < 4)
        return
---


* [Bug target/98694] [11 Regression] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (6 preceding siblings ...)
  2021-01-15 17:33 ` crazylht at gmail dot com
@ 2021-01-18  9:41 ` crazylht at gmail dot com
  2021-01-21  5:30 ` cvs-commit at gcc dot gnu.org
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-18  9:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
Another testcase that reproduces the same issue:

#include <immintrin.h>
typedef short v4hi __attribute__ ((vector_size (8)));
typedef int v2si __attribute__ ((vector_size (8)));
v4hi b;

__attribute__ ((noipa))
v2si
foo (__m512i src1, __m512i src2)
{
  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
  short s = (short) m;
  int i = (int)m;
  b = __extension__ (v4hi) {s, s, s, s};
  return __extension__ (v2si) {i, i};
}

int main ()
{
  __m512i src1 = _mm512_setzero_si512 ();
  __m512i src2 = _mm512_set_epi8 (0, 1, 0, 1, 0, 1, 0, 1,
                                  0, 1, 0, 1, 0, 1, 0, 1,
                                  0, 1, 0, 1, 0, 1, 0, 1,
                                  0, 1, 0, 1, 0, 1, 0, 1,
                                  0, 1, 0, 1, 0, 1, 0, 1,
                                  0, 1, 0, 1, 0, 1, 0, 1,
                                  0, 1, 0, 1, 0, 1, 0, 1,
                                  0, 1, 0, 1, 0, 1, 0, 1);
  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
  v2si a = foo (src1, src2);
  if (a[0] != (int)m)
    __builtin_abort ();
  return 0;
}
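
A scalar sketch of what this testcase checks, for anyone without AVX-512 hardware at hand. It assumes that _mm512_set_epi8 takes its arguments from element 63 down to element 0, and that bit i of the resulting mask is set when byte i of the two inputs compares equal. All function names below are hypothetical.

```c
#include <stdint.h>

/* Byte i of src2 as built by the _mm512_set_epi8 call above: the
   arguments alternate 0, 1 from element 63 down to element 0, so even
   elements hold 1 and odd elements hold 0 (assumed argument order). */
uint8_t src2_byte(int i) {
    return (i % 2 == 0) ? 1 : 0;
}

/* Model of _mm512_cmpeq_epu8_mask (zero, src2): bit i is set when
   byte i of src2 equals 0. */
uint64_t model_mask(void) {
    uint64_t m = 0;
    for (int i = 0; i < 64; i++)
        if (src2_byte(i) == 0)
            m |= (uint64_t)1 << i;
    return m;
}

/* What foo should return in a[0]: the low 32 bits of the mask. */
uint32_t expected_low32(uint64_t m) {
    return (uint32_t)m;
}

/* What a zero-extended 16-bit copy would give instead, which is the
   failure mode described earlier in this thread. */
uint32_t zero_extended_low16(uint64_t m) {
    return (uint32_t)(uint16_t)m;
}
```

Under those assumptions the mask has set bits above bit 15, so a result built from a zero-extended HImode copy cannot match (int)m and the abort fires.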


* [Bug target/98694] [11 Regression] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (7 preceding siblings ...)
  2021-01-18  9:41 ` crazylht at gmail dot com
@ 2021-01-21  5:30 ` cvs-commit at gcc dot gnu.org
  2021-01-21  5:33 ` crazylht at gmail dot com
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-01-21  5:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:e711b67a9081ae84c66174a50705dc98ba993a43

commit r11-6828-ge711b67a9081ae84c66174a50705dc98ba993a43
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Jan 18 16:55:32 2021 +0800

    Fix incorrect optimization by cprop_hardreg.

    If SRC had been assigned a mode narrower than the copy, we can't
    always link DEST into the chain even they have same
    hard_regno_nregs(i.e. HImode/SImode in i386 backend).

    i.e
            kmovw   %k0, %edi
            vmovd   %edi, %xmm2
            vpshuflw        $0, %xmm2, %xmm0
            kmovw   %k0, %r8d
            kmovd   %k0, %r9d
    ...
    -        movl %r9d, %r11d
    +        vmovd %xmm2, %r11d

    gcc/ChangeLog:

            PR rtl-optimization/98694
            * regcprop.c (copy_value): If SRC had been assigned a mode
            narrower than the copy, we can't link DEST into the chain even
            they have same hard_regno_nregs(i.e. HImode/SImode in i386
            backend).

    gcc/testsuite/ChangeLog:

            PR rtl-optimization/98694
            * gcc.target/i386/pr98694.c: New test.


* [Bug target/98694] [11 Regression] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (8 preceding siblings ...)
  2021-01-21  5:30 ` cvs-commit at gcc dot gnu.org
@ 2021-01-21  5:33 ` crazylht at gmail dot com
  2021-01-21  9:41 ` [Bug target/98694] " rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-21  5:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed on trunk so far.


* [Bug target/98694] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (9 preceding siblings ...)
  2021-01-21  5:33 ` crazylht at gmail dot com
@ 2021-01-21  9:41 ` rguenth at gcc dot gnu.org
  2021-01-21 11:12 ` crazylht at gmail dot com
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-21  9:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11 Regression] GCC         |GCC produces incorrect code
                   |produces incorrect code for |for loops with -O3 for
                   |loops with -O3 for          |skylake-avx512 and
                   |skylake-avx512 and          |icelake-server
                   |icelake-server              |
   Target Milestone|11.0                        |---
      Known to work|                            |11.0

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk, latent on the branch(es) where we don't have a testcase(?)


* [Bug target/98694] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (10 preceding siblings ...)
  2021-01-21  9:41 ` [Bug target/98694] " rguenth at gcc dot gnu.org
@ 2021-01-21 11:12 ` crazylht at gmail dot com
  2021-05-05 17:48 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-21 11:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #11 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #10)
> Fixed on trunk, latent on the branch(es) where we don't have a testcase(?)

Yes, not sure about the backport.


* [Bug target/98694] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (11 preceding siblings ...)
  2021-01-21 11:12 ` crazylht at gmail dot com
@ 2021-05-05 17:48 ` jakub at gcc dot gnu.org
  2022-05-10  8:17 ` cvs-commit at gcc dot gnu.org
  2022-10-28 23:30 ` pinskia at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-05 17:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Related to PR100342.


* [Bug target/98694] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (12 preceding siblings ...)
  2021-05-05 17:48 ` jakub at gcc dot gnu.org
@ 2022-05-10  8:17 ` cvs-commit at gcc dot gnu.org
  2022-10-28 23:30 ` pinskia at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-10  8:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:0372a414e7500dccab1eb423a2a620645c820a52

commit r10-10611-g0372a414e7500dccab1eb423a2a620645c820a52
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Jan 18 16:55:32 2021 +0800

    Fix incorrect optimization by cprop_hardreg.

    If SRC had been assigned a mode narrower than the copy, we can't
    always link DEST into the chain even they have same
    hard_regno_nregs(i.e. HImode/SImode in i386 backend).

    i.e
            kmovw   %k0, %edi
            vmovd   %edi, %xmm2
            vpshuflw        $0, %xmm2, %xmm0
            kmovw   %k0, %r8d
            kmovd   %k0, %r9d
    ...
    -        movl %r9d, %r11d
    +        vmovd %xmm2, %r11d

    gcc/ChangeLog:

            PR rtl-optimization/98694
            * regcprop.c (copy_value): If SRC had been assigned a mode
            narrower than the copy, we can't link DEST into the chain even
            they have same hard_regno_nregs(i.e. HImode/SImode in i386
            backend).

    gcc/testsuite/ChangeLog:

            PR rtl-optimization/98694
            * gcc.target/i386/pr98694.c: New test.

    (cherry picked from commit e711b67a9081ae84c66174a50705dc98ba993a43)


* [Bug target/98694] GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server
  2021-01-15  1:42 [Bug tree-optimization/98694] New: GCC produces incorrect code for loops with -O3 for skylake-avx512 and icelake-server vsevolod.livinskij at frtk dot ru
                   ` (13 preceding siblings ...)
  2022-05-10  8:17 ` cvs-commit at gcc dot gnu.org
@ 2022-10-28 23:30 ` pinskia at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-10-28 23:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98694

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |10.4
             Status|NEW                         |RESOLVED
      Known to work|                            |10.4.0

--- Comment #14 from Vsevolod Livinskii <vsevolod.livinskiy at gmail dot com> ---
Should this issue be marked as Resolved and Fixed?

--- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed.

