[Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
@ 2022-10-05 16:21 seurer at gcc dot gnu.org
  2022-10-06 10:43 ` [Bug target/107160] " rguenth at gcc dot gnu.org
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: seurer at gcc dot gnu.org @ 2022-10-05 16:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

            Bug ID: 107160
           Summary: [13 regression] r13-2641-g0ee1548d96884d  causes
                    verification failure in spec2006
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:0ee1548d96884d2689482054d925967a9a21d697, r13-2641-g0ee1548d96884d 

commit 0ee1548d96884d2689482054d925967a9a21d697
Author: Kewen Lin <linkw@linux.ibm.com>
Date:   Tue Sep 13 04:13:10 2022 -0500

    rs6000: Suggest unroll factor for loop vectorization

This commit is causing a verification error in a spec2006 test.

447.dealII      11440        322      35.5  *   11440       10.6            VE 


Build successes: 447.dealII(base), 447.dealII(peak)

Setting Up Run Directories
  Setting up 447.dealII ref base ppc64 default: created
(run_base_ref_ppc64.0000)
  Setting up 447.dealII ref peak ppc64 default: created
(run_peak_ref_ppc64.0000)
Running Benchmarks
  Running 447.dealII ref base ppc64 default
/home/seurer/gcc/cpu2006/bin/specinvoke -d
/home/seurer/gcc/cpu2006/benchspec/CPU2006/447.dealII/run/run_base_ref_ppc64.0000
-e speccmds.err -o speccmds.stdout -f speccmds.cmd -C -q
/home/seurer/gcc/cpu2006/bin/specinvoke -E -d
/home/seurer/gcc/cpu2006/benchspec/CPU2006/447.dealII/run/run_base_ref_ppc64.0000
-c 1 -e compare.err -o compare.stdout -f compare.cmd

*** Miscompare of grid-1.eps; for details see
   
/home/seurer/gcc/cpu2006/benchspec/CPU2006/447.dealII/run/run_base_ref_ppc64.0000/grid-1.eps.mis
Invalid run; unable to continue.


0004:  %%BoundingBox: 0 0 301 289
       %%BoundingBox: 0 0 301 286
                                ^
0012:  b 30.6779 58.2976 m 68.5015 87.3442 x
       b 121.58 122.428 m 137.988 136.638 x
              ^
0013:  b 68.5015 87.3442 m 106.325 116.391 x
       b 178.42 162.86 m 194.829 177.071 x
              ^
0014:  b 30.6779 58.2976 m 96.5094 10.9581 x
       b 121.58 179.269 m 137.988 193.479 x
              ^
0015:  b 30.6779 58.2976 m 3.86089 130.352 x
       b 137.988 136.638 m 166.408 128.434 x
               ^
0016:  b 30.6779 58.2976 m 57.3515 84.0277 x
       b 137.988 136.638 m 137.988 165.059 x
               ^
0017:  b 57.3515 84.0277 m 118.028 133.945 x
       b 137.988 165.059 m 137.988 193.479 x
               ^
0018:  b 189.158 76.8613 m 181.972 165.917 x
       b 194.829 148.65 m 194.829 177.071 x
               ^
0019:  b 181.972 165.917 m 171.837 160.85 x
       b 137.988 193.479 m 166.408 185.275 x
               ^
0020:  b 171.837 160.85 m 161.703 155.782 x
       b 166.408 185.275 m 194.829 177.071 x
               ^

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
@ 2022-10-06 10:43 ` rguenth at gcc dot gnu.org
  2022-10-06 14:04 ` seurer at gcc dot gnu.org
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-10-06 10:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
   Target Milestone|---                         |13.0

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
what other flags do you use?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
  2022-10-06 10:43 ` [Bug target/107160] " rguenth at gcc dot gnu.org
@ 2022-10-06 14:04 ` seurer at gcc dot gnu.org
  2022-10-06 19:08 ` seurer at gcc dot gnu.org
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: seurer at gcc dot gnu.org @ 2022-10-06 14:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #2 from seurer at gcc dot gnu.org ---
One of the compilation commands:

/home/seurer/gcc/git/install/gcc-test/bin/g++ -c -o auto_derivative_function.o
-DSPEC_CPU -DNDEBUG  -Iinclude -DBOOST_DISABLE_THREADS -Ddeal_II_dimension=3
-m64 -O3 -mcpu=power8 -ffast-math -funroll-loops -fpeel-loops -fvect-cost-model
-mpopcntd -mrecip=rsqrt    -DSPEC_CPU_LP64 -DSPEC_CPU_LINUX  -include cstddef
-std=gnu++98      auto_derivative_function.cc

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
  2022-10-06 10:43 ` [Bug target/107160] " rguenth at gcc dot gnu.org
  2022-10-06 14:04 ` seurer at gcc dot gnu.org
@ 2022-10-06 19:08 ` seurer at gcc dot gnu.org
  2022-10-10  2:29 ` linkw at gcc dot gnu.org
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: seurer at gcc dot gnu.org @ 2022-10-06 19:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #3 from seurer at gcc dot gnu.org ---
Note that 554.roms_r from spec2017 also fails after this commit.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2022-10-06 19:08 ` seurer at gcc dot gnu.org
@ 2022-10-10  2:29 ` linkw at gcc dot gnu.org
  2022-10-10 19:14 ` seurer at gcc dot gnu.org
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-10-10  2:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin <linkw at gcc dot gnu.org> ---
Just back from holiday, thanks for reporting! I'll take a look.


(In reply to seurer from comment #3)
> Note that 554.roms_r from spec2017 also fails after this commit.

I suspected what you saw on SPEC2017 554.roms_r is related to PR103320. The
culprit commit can make some loops unrolled (they are not previously), PR103320
showed that loops unrolling matters. So if you did add option
"-fno-unsafe-math-optimizations" for the run with "-funroll-loops", then you
need it since this commit too. Could you double check?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2022-10-10  2:29 ` linkw at gcc dot gnu.org
@ 2022-10-10 19:14 ` seurer at gcc dot gnu.org
  2022-10-12  8:31 ` linkw at gcc dot gnu.org
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: seurer at gcc dot gnu.org @ 2022-10-10 19:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #5 from seurer at gcc dot gnu.org ---
I added that option and 554.roms_r now runs OK.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2022-10-10 19:14 ` seurer at gcc dot gnu.org
@ 2022-10-12  8:31 ` linkw at gcc dot gnu.org
  2022-10-13  9:57 ` linkw at gcc dot gnu.org
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-10-12  8:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-10-12
     Ever confirmed|0                           |1

--- Comment #6 from Kewen Lin <linkw at gcc dot gnu.org> ---
Confirmed.

The option can be reduced to: -m64 -O3 -mcpu=power8 -ffast-math

The culprit loop is the one in function l1_norm of file vector.cc (actually
it's from the header file). The resulting gimple after loop vect pass looks
weird to me, it seems not to expose one issue related to math but one actual
vect issue. I'll check it further and post one reduced test case.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-10-12  8:31 ` linkw at gcc dot gnu.org
@ 2022-10-13  9:57 ` linkw at gcc dot gnu.org
  2022-10-13 10:16 ` rguenth at gcc dot gnu.org
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-10-13  9:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org

--- Comment #7 from Kewen Lin <linkw at gcc dot gnu.org> ---

One reduced test case is:

============================================================

#include <stdio.h>
#include <math.h>

#define N 128
float fl[N];

__attribute__ ((noipa, optimize (0))) void
init ()
{
  for (int i = 0; i < N; i++)
    fl[i] = i;
}

__attribute__ ((noipa)) float
foo (int n1)
{
  float sum0, sum1, sum2, sum3;
  sum0 = sum1 = sum2 = sum3 = 0.0f;

  int n = (n1 / 4) * 4;
  for (int i = 0; i < n; i += 4)
    {
      sum0 += fabs (fl[i]);
      sum1 += fabs (fl[i + 1]);
      sum2 += fabs (fl[i + 2]);
      sum3 += fabs (fl[i + 3]);
    }

  return sum0 + sum1 + sum2 + sum3;
}

__attribute__ ((optimize (0))) int
main ()
{
  init ();
  float res = foo (80);
  __builtin_printf ("res:%f\n", res);
  return 0;
}

============================================================ 
incorrect result "res:670.000000" vs expected result "res:3160.000000"

It looks it exposes one bug in vectorization reduction support. The reduction
epilogue handling looks wrong, it generates gimple code like:

    # vect_sum3_31.16_101 = PHI <vect_sum3_31.16_97(3)>
    # vect_sum3_31.16_102 = PHI <vect_sum3_31.16_98(3)>
    # vect_sum3_31.16_103 = PHI <vect_sum3_31.16_99(3)>
    # vect_sum3_31.16_104 = PHI <vect_sum3_31.16_100(3)>
    _105 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 0>;
    _106 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 32>;
    _107 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 64>;
    _108 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 96>;
    _109 = BIT_FIELD_REF <vect_sum3_31.16_102, 32, 0>;
    _110 = BIT_FIELD_REF <vect_sum3_31.16_102, 32, 32>;
    _111 = BIT_FIELD_REF <vect_sum3_31.16_102, 32, 64>;
    _112 = BIT_FIELD_REF <vect_sum3_31.16_102, 32, 96>;
...

it doesn't consider the reduced results vect_sum3_31.16_10{1,2,3,4} from the
loop can be reduced again in loop exit block as they are in the same slp group.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-10-13  9:57 ` linkw at gcc dot gnu.org
@ 2022-10-13 10:16 ` rguenth at gcc dot gnu.org
  2022-10-13 10:31 ` linkw at gcc dot gnu.org
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-10-13 10:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #7)
> One reduced test case is:
> 
> ============================================================
> 
> #include <stdio.h>
> #include <math.h>
> 
> #define N 128
> float fl[N];
> 
> __attribute__ ((noipa, optimize (0))) void
> init ()
> {
>   for (int i = 0; i < N; i++)
>     fl[i] = i;
> }
> 
> __attribute__ ((noipa)) float
> foo (int n1)
> {
>   float sum0, sum1, sum2, sum3;
>   sum0 = sum1 = sum2 = sum3 = 0.0f;
> 
>   int n = (n1 / 4) * 4;
>   for (int i = 0; i < n; i += 4)
>     {
>       sum0 += fabs (fl[i]);
>       sum1 += fabs (fl[i + 1]);
>       sum2 += fabs (fl[i + 2]);
>       sum3 += fabs (fl[i + 3]);
>     }
> 
>   return sum0 + sum1 + sum2 + sum3;
> }
> 
> __attribute__ ((optimize (0))) int
> main ()
> {
>   init ();
>   float res = foo (80);
>   __builtin_printf ("res:%f\n", res);
>   return 0;
> }
> 
> ============================================================ 
> incorrect result "res:670.000000" vs expected result "res:3160.000000"
> 
> It looks it exposes one bug in vectorization reduction support. The
> reduction epilogue handling looks wrong, it generates gimple code like:
> 
>     # vect_sum3_31.16_101 = PHI <vect_sum3_31.16_97(3)>
>     # vect_sum3_31.16_102 = PHI <vect_sum3_31.16_98(3)>
>     # vect_sum3_31.16_103 = PHI <vect_sum3_31.16_99(3)>
>     # vect_sum3_31.16_104 = PHI <vect_sum3_31.16_100(3)>
>     _105 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 0>;
>     _106 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 32>;
>     _107 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 64>;
>     _108 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 96>;
>     _109 = BIT_FIELD_REF <vect_sum3_31.16_102, 32, 0>;
>     _110 = BIT_FIELD_REF <vect_sum3_31.16_102, 32, 32>;
>     _111 = BIT_FIELD_REF <vect_sum3_31.16_102, 32, 64>;
>     _112 = BIT_FIELD_REF <vect_sum3_31.16_102, 32, 96>;
> ...
> 
> it doesn't consider the reduced results vect_sum3_31.16_10{1,2,3,4} from the
> loop can be reduced again in loop exit block as they are in the same slp
> group.

The above doesn't look wrong (but may miss the rest of the IL).  On
x86_64 this looks like

  <bb 4> [local count: 105119324]:
  # sum0_41 = PHI <sum0_28(3)>
  # sum1_39 = PHI <sum1_29(3)>
  # sum2_37 = PHI <sum2_30(3)>
  # sum3_35 = PHI <sum3_31(3)>
  # vect_sum3_31.11_59 = PHI <vect_sum3_31.11_60(3)>
  _58 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 0>;
  _57 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 32>;
  _56 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 64>;
  _55 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 96>;
  _74 = _58 + _57;
  _76 = _56 + _74;
  _78 = _55 + _76;

  <bb 5> [local count: 118111600]:
  # prephitmp_79 = PHI <_78(4), 0.0(2)>
  return prephitmp_79;

when unrolling is applied, thus with a larger VF, you should ideally
see the vectors accumulated.

Btw, I've fixed a SLP reduction issue two days ago in r13-3226-gee467644c53ee2
though that looks unrelated?

When I force a larger VF on x86 by adding a int store in the loop I see

  <bb 11> [local count: 94607391]:
  # sum0_48 = PHI <sum0_29(3)>
  # sum1_36 = PHI <sum1_30(3)>
  # sum2_35 = PHI <sum2_31(3)>
  # sum3_24 = PHI <sum3_32(3)>
  # vect_sum3_32.16_110 = PHI <vect_sum3_32.16_106(3)>
  # vect_sum3_32.16_111 = PHI <vect_sum3_32.16_107(3)>
  # vect_sum3_32.16_112 = PHI <vect_sum3_32.16_108(3)>
  # vect_sum3_32.16_113 = PHI <vect_sum3_32.16_109(3)>
  _114 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 0>;
  _115 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 32>;
  _116 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 64>;
  _117 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 96>;
  _118 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 0>;
  _119 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 32>;
  _120 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 64>;
  _121 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 96>;
  _122 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 0>;
  _123 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 32>;
  _124 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 64>;
  _125 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 96>;
  _126 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 0>;
  _127 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 32>;
  _128 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 64>;
  _129 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 96>;
  _130 = _114 + _118;
  _131 = _115 + _119;
  _132 = _116 + _120;
  _133 = _117 + _121;
  _134 = _130 + _122;
  _135 = _131 + _123;
  _136 = _132 + _124;
  _137 = _133 + _125;
  _138 = _134 + _126;

see how the lanes from the different vectors are accumulated?  (yeah,
we should simply add the vectors!)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-10-13 10:16 ` rguenth at gcc dot gnu.org
@ 2022-10-13 10:31 ` linkw at gcc dot gnu.org
  2022-10-13 11:05 ` rguenther at suse dot de
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-10-13 10:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #9 from Kewen Lin <linkw at gcc dot gnu.org> ---
> 
> The above doesn't look wrong (but may miss the rest of the IL).  On
> x86_64 this looks like
> 
>   <bb 4> [local count: 105119324]:
>   # sum0_41 = PHI <sum0_28(3)>
>   # sum1_39 = PHI <sum1_29(3)>
>   # sum2_37 = PHI <sum2_30(3)>
>   # sum3_35 = PHI <sum3_31(3)>
>   # vect_sum3_31.11_59 = PHI <vect_sum3_31.11_60(3)>
>   _58 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 0>;
>   _57 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 32>;
>   _56 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 64>;
>   _55 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 96>;
>   _74 = _58 + _57;
>   _76 = _56 + _74;
>   _78 = _55 + _76;
> 
>   <bb 5> [local count: 118111600]:
>   # prephitmp_79 = PHI <_78(4), 0.0(2)>
>   return prephitmp_79;
> 

Yeah, it looks expected without unrolling.

> when unrolling is applied, thus with a larger VF, you should ideally
> see the vectors accumulated.
> 
> Btw, I've fixed a SLP reduction issue two days ago in
> r13-3226-gee467644c53ee2
> though that looks unrelated?

Thanks for the information, I'll double check it.

> 
> When I force a larger VF on x86 by adding a int store in the loop I see
> 
>   <bb 11> [local count: 94607391]:
>   # sum0_48 = PHI <sum0_29(3)>
>   # sum1_36 = PHI <sum1_30(3)>
>   # sum2_35 = PHI <sum2_31(3)>
>   # sum3_24 = PHI <sum3_32(3)>
>   # vect_sum3_32.16_110 = PHI <vect_sum3_32.16_106(3)>
>   # vect_sum3_32.16_111 = PHI <vect_sum3_32.16_107(3)>
>   # vect_sum3_32.16_112 = PHI <vect_sum3_32.16_108(3)>
>   # vect_sum3_32.16_113 = PHI <vect_sum3_32.16_109(3)>
>   _114 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 0>;
>   _115 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 32>;
>   _116 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 64>;
>   _117 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 96>;
>   _118 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 0>;
>   _119 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 32>;
>   _120 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 64>;
>   _121 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 96>;
>   _122 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 0>;
>   _123 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 32>;
>   _124 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 64>;
>   _125 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 96>;
>   _126 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 0>;
>   _127 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 32>;
>   _128 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 64>;
>   _129 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 96>;
>   _130 = _114 + _118;
>   _131 = _115 + _119;
>   _132 = _116 + _120;
>   _133 = _117 + _121;
>   _134 = _130 + _122;
>   _135 = _131 + _123;
>   _136 = _132 + _124;
>   _137 = _133 + _125;
>   _138 = _134 + _126;
> 
> see how the lanes from the different vectors are accumulated?  (yeah,
> we should simply add the vectors!)

Yes, it's the same as what I saw on ppc64le, but the closely following dce6
removes the three vect_sum3_32 (in your dump, they are
vect_sum3_32.16_10{7,8,9}) as the subsequent joints don't actually use the
separated accumulated lane values (_138 -> sum0 ...) but only use
vect_sum3_32.16_110.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2022-10-13 10:31 ` linkw at gcc dot gnu.org
@ 2022-10-13 11:05 ` rguenther at suse dot de
  2022-10-13 11:45 ` linkw at gcc dot gnu.org
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenther at suse dot de @ 2022-10-13 11:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #10 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 13 Oct 2022, linkw at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160
> 
> --- Comment #9 from Kewen Lin <linkw at gcc dot gnu.org> ---
> > 
> > The above doesn't look wrong (but may miss the rest of the IL).  On
> > x86_64 this looks like
> > 
> >   <bb 4> [local count: 105119324]:
> >   # sum0_41 = PHI <sum0_28(3)>
> >   # sum1_39 = PHI <sum1_29(3)>
> >   # sum2_37 = PHI <sum2_30(3)>
> >   # sum3_35 = PHI <sum3_31(3)>
> >   # vect_sum3_31.11_59 = PHI <vect_sum3_31.11_60(3)>
> >   _58 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 0>;
> >   _57 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 32>;
> >   _56 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 64>;
> >   _55 = BIT_FIELD_REF <vect_sum3_31.11_59, 32, 96>;
> >   _74 = _58 + _57;
> >   _76 = _56 + _74;
> >   _78 = _55 + _76;
> > 
> >   <bb 5> [local count: 118111600]:
> >   # prephitmp_79 = PHI <_78(4), 0.0(2)>
> >   return prephitmp_79;
> > 
> 
> Yeah, it looks expected without unrolling.
> 
> > when unrolling is applied, thus with a larger VF, you should ideally
> > see the vectors accumulated.
> > 
> > Btw, I've fixed a SLP reduction issue two days ago in
> > r13-3226-gee467644c53ee2
> > though that looks unrelated?
> 
> Thanks for the information, I'll double check it.
> 
> > 
> > When I force a larger VF on x86 by adding a int store in the loop I see
> > 
> >   <bb 11> [local count: 94607391]:
> >   # sum0_48 = PHI <sum0_29(3)>
> >   # sum1_36 = PHI <sum1_30(3)>
> >   # sum2_35 = PHI <sum2_31(3)>
> >   # sum3_24 = PHI <sum3_32(3)>
> >   # vect_sum3_32.16_110 = PHI <vect_sum3_32.16_106(3)>
> >   # vect_sum3_32.16_111 = PHI <vect_sum3_32.16_107(3)>
> >   # vect_sum3_32.16_112 = PHI <vect_sum3_32.16_108(3)>
> >   # vect_sum3_32.16_113 = PHI <vect_sum3_32.16_109(3)>
> >   _114 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 0>;
> >   _115 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 32>;
> >   _116 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 64>;
> >   _117 = BIT_FIELD_REF <vect_sum3_32.16_110, 32, 96>;
> >   _118 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 0>;
> >   _119 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 32>;
> >   _120 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 64>;
> >   _121 = BIT_FIELD_REF <vect_sum3_32.16_111, 32, 96>;
> >   _122 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 0>;
> >   _123 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 32>;
> >   _124 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 64>;
> >   _125 = BIT_FIELD_REF <vect_sum3_32.16_112, 32, 96>;
> >   _126 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 0>;
> >   _127 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 32>;
> >   _128 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 64>;
> >   _129 = BIT_FIELD_REF <vect_sum3_32.16_113, 32, 96>;
> >   _130 = _114 + _118;
> >   _131 = _115 + _119;
> >   _132 = _116 + _120;
> >   _133 = _117 + _121;
> >   _134 = _130 + _122;
> >   _135 = _131 + _123;
> >   _136 = _132 + _124;
> >   _137 = _133 + _125;
> >   _138 = _134 + _126;
> > 
> > see how the lanes from the different vectors are accumulated?  (yeah,
> > we should simply add the vectors!)
> 
> Yes, it's the same as what I saw on ppc64le, but the closely following dce6
> removes the three vect_sum3_32 (in your dump, they are
> vect_sum3_32.16_10{7,8,9}) as the subsequent joints don't actually use the
> separated accumulated lane values (_138 -> sum0 ...) but only use
> vect_sum3_32.16_110.

I do - the epilog is even vectorized and it works fine at runtime.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2022-10-13 11:05 ` rguenther at suse dot de
@ 2022-10-13 11:45 ` linkw at gcc dot gnu.org
  2022-10-13 11:50 ` rguenther at suse dot de
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-10-13 11:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #11 from Kewen Lin <linkw at gcc dot gnu.org> ---
> > > Btw, I've fixed a SLP reduction issue two days ago in
> > > r13-3226-gee467644c53ee2
> > > though that looks unrelated?
> > 
> > Thanks for the information, I'll double check it.
> > 

To rebase to r13-3226 or even the latest trunk doesn't help.

> 
> I do - the epilog is even vectorized and it works fine at runtime.

You meant the code on x86 is all fine? and the case doesn't show the "res"
difference? hmm, it's weird.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2022-10-13 11:45 ` linkw at gcc dot gnu.org
@ 2022-10-13 11:50 ` rguenther at suse dot de
  2022-10-13 12:01 ` marxin at gcc dot gnu.org
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenther at suse dot de @ 2022-10-13 11:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #12 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 13 Oct 2022, linkw at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160
> 
> --- Comment #11 from Kewen Lin <linkw at gcc dot gnu.org> ---
> > > > Btw, I've fixed a SLP reduction issue two days ago in
> > > > r13-3226-gee467644c53ee2
> > > > though that looks unrelated?
> > > 
> > > Thanks for the information, I'll double check it.
> > > 
> 
> To rebase to r13-3226 or even the latest trunk doesn't help.

:/

> > 
> > I do - the epilog is even vectorized and it works fine at runtime.
> 
> You meant the code on x86 is all fine? and the case doesn't show the "res"
> difference? hmm, it's weird.

Yes.  I'll try to look at cross compiler IL.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2022-10-13 11:50 ` rguenther at suse dot de
@ 2022-10-13 12:01 ` marxin at gcc dot gnu.org
  2022-10-13 12:05 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: marxin at gcc dot gnu.org @ 2022-10-13 12:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #13 from Martin Liška <marxin at gcc dot gnu.org> ---
A bit reduced test-case that can be compiled with cross compiler:

$ cat pr107160.c
#define N 16
float fl[N];

__attribute__ ((noipa, optimize (0))) void
init ()
{
  for (int i = 0; i < N; i++)
    fl[i] = 1 << i;
}

__attribute__ ((noipa)) float
foo (int n)
{
  float sum0, sum1, sum2, sum3;
  sum0 = sum1 = sum2 = sum3 = 0.0f;

  for (int i = 0; i < n; i += 4)
    {
      sum0 += __builtin_fabs (fl[i]);
      sum1 += __builtin_fabs (fl[i + 1]);
      sum2 += __builtin_fabs (fl[i + 2]);
      sum3 += __builtin_fabs (fl[i + 3]);
    }

  return sum0 + sum1 + sum2 + sum3;
}

__attribute__ ((optimize (0))) int
main ()
{
  init ();
  float res = foo (N);
  __builtin_printf ("res:%f\n", res);
  return 0;
}

x86_64:
./a.out 
res:65535.000000

ppc64le:
./a.out
res:15.000000

so the result is wrong, it sums only first 4 elements (and not 16).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2022-10-13 12:01 ` marxin at gcc dot gnu.org
@ 2022-10-13 12:05 ` rguenth at gcc dot gnu.org
  2022-10-13 12:23 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-10-13 12:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
Aha, so the issue is that we have a vectorized epilogue here and the epilogue
of _that_ ends up doing


  <bb 11> [local count: 94607391]:
  # sum0_48 = PHI <sum0_28(3)>
  # sum1_47 = PHI <sum1_29(3)>
  # sum2_46 = PHI <sum2_30(3)>
  # sum3_45 = PHI <sum3_31(3)>
  # vect_sum3_31.16_101 = PHI <vect_sum3_31.16_97(3)>
  # vect_sum3_31.16_102 = PHI <vect_sum3_31.16_98(3)>
  # vect_sum3_31.16_103 = PHI <vect_sum3_31.16_99(3)>
  # vect_sum3_31.16_104 = PHI <vect_sum3_31.16_100(3)>
  _105 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 0>;
  _106 = BIT_FIELD_REF <vect_sum3_31.16_101, 32, 32>;
...
this is from the main vect

  <bb 17> [local count: 81467476]:
  # sum0_135 = PHI <sum0_62(12)>
  # sum1_136 = PHI <sum1_58(12)>
  # sum2_137 = PHI <sum2_54(12)>
  # sum3_138 = PHI <sum3_50(12)>
  # vect_sum3_50.25_159 = PHI <vect_sum3_50.25_158(12)>
this is from the epilogue

  <bb 15> [local count: 105119324]:
  # sum0_15 = PHI <sum0_135(17), _129(11)>
  # sum1_77 = PHI <sum1_136(17), _130(11)>
  # sum2_75 = PHI <sum2_137(17), _131(11)>
  # sum3_13 = PHI <sum3_138(17), _132(11)>
  # _160 = PHI <vect_sum3_50.25_159(17), vect_sum3_31.16_101(11)>
  _161 = BIT_FIELD_REF <_160, 32, 0>;
  _162 = BIT_FIELD_REF <_160, 32, 32>;
  _163 = BIT_FIELD_REF <_160, 32, 64>;
  _164 = BIT_FIELD_REF <_160, 32, 96>;
  _74 = _161 + _162;
  _76 = _74 + _163;
  _78 = _76 + _164;

so we fail to accumulate the main loops accumulators and just
use the first one.  On x86 the vectorized epilogue uses a smaller
vector size but the same number of accumulators.

It seems it's simply unexpected to have the unrolled SLP reduction
and a vectorized epilogue with the same vector mode (but not unrolled).

I can reproduce the failure when patching the x86 cost model to force
unrolling by 2 (maybe we want a --param to force that to aid debugging...).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2022-10-13 12:05 ` rguenth at gcc dot gnu.org
@ 2022-10-13 12:23 ` rguenth at gcc dot gnu.org
  2022-10-13 13:17 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-10-13 12:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|linkw at gcc dot gnu.org           |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think that either vect_find_reusable_accumulator should punt on this or
vect_create_epilog_for_reduction shouldn't register this accumulator.
The caller of vect_find_reusable_accumulator checks for vec_num == 1
(in the epilog) but we register for vec_num == 2.  For SLP reductions
we "fail" to reduce to a single accumulator.

  int vec_num = vec_oprnds0.length ();
  gcc_assert (vec_num == 1 || slp_node);


The following fixes this:

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 1996ecfee7a..b1442a93581 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -6232,7 +6232,8 @@ vect_create_epilog_for_reduction (loop_vec_info
loop_vinfo,
     }

   /* Record this operation if it could be reused by the epilogue loop.  */
-  if (STMT_VINFO_REDUC_TYPE (reduc_info) == TREE_CODE_REDUCTION)
+  if (STMT_VINFO_REDUC_TYPE (reduc_info) == TREE_CODE_REDUCTION
+      && vec_num == 1)
     loop_vinfo->reusable_accumulators.put (scalar_results[0],
                                           { orig_reduc_input, reduc_info });

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2022-10-13 12:23 ` rguenth at gcc dot gnu.org
@ 2022-10-13 13:17 ` cvs-commit at gcc dot gnu.org
  2022-10-13 13:19 ` [Bug target/107160] [12/13 " rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-10-13 13:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #16 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:5cbaf84c191b9a3e3cb26545c808d208bdbf2ab5

commit r13-3273-g5cbaf84c191b9a3e3cb26545c808d208bdbf2ab5
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Oct 13 14:24:05 2022 +0200

    tree-optimization/107160 - avoid reusing multiple accumulators

    Epilogue vectorization is not set up to re-use a vectorized
    accumulator consisting of more than one vector.  For non-SLP
    we always reduce to a single but for SLP that isn't happening.
    In such case we currenlty miscompile the epilog so avoid this.

            PR tree-optimization/107160
            * tree-vect-loop.cc (vect_create_epilog_for_reduction):
            Do not register accumulator if we failed to reduce it
            to a single vector.

            * gcc.dg/vect/pr107160.c: New testcase.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [12/13 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2022-10-13 13:17 ` cvs-commit at gcc dot gnu.org
@ 2022-10-13 13:19 ` rguenth at gcc dot gnu.org
  2022-10-14  2:54 ` [Bug target/107160] [12 " linkw at gcc dot gnu.org
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-10-13 13:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[13 regression]             |[12/13 regression]
                   |r13-2641-g0ee1548d96884d    |r13-2641-g0ee1548d96884d
                   |causes verification failure |causes verification failure
                   |in spec2006                 |in spec2006
           Priority|P3                          |P2
   Target Milestone|13.0                        |12.3

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk, but I think the issue is latent on the GCC 12 branch, queueing
for backports.  I am testing a followup fixing the missed optimization
(PR107247).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [12 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2022-10-13 13:19 ` [Bug target/107160] [12/13 " rguenth at gcc dot gnu.org
@ 2022-10-14  2:54 ` linkw at gcc dot gnu.org
  2022-10-17 13:10 ` cvs-commit at gcc dot gnu.org
  2022-10-17 13:11 ` rguenth at gcc dot gnu.org
  19 siblings, 0 replies; 21+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-10-14  2:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #18 from Kewen Lin <linkw at gcc dot gnu.org> ---
Thanks for the prompt fix! I just verified it fixed the SPEC2006 447.dealII
regression perfectly.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [12 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2022-10-14  2:54 ` [Bug target/107160] [12 " linkw at gcc dot gnu.org
@ 2022-10-17 13:10 ` cvs-commit at gcc dot gnu.org
  2022-10-17 13:11 ` rguenth at gcc dot gnu.org
  19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-10-17 13:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

--- Comment #19 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:d282dd56275485a88e1fe9c4ae1939b62d23b20b

commit r12-8839-gd282dd56275485a88e1fe9c4ae1939b62d23b20b
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Oct 13 14:24:05 2022 +0200

    tree-optimization/107160 - avoid reusing multiple accumulators

    Epilogue vectorization is not set up to re-use a vectorized
    accumulator consisting of more than one vector.  For non-SLP
    we always reduce to a single but for SLP that isn't happening.
    In such case we currenlty miscompile the epilog so avoid this.

            PR tree-optimization/107160
            * tree-vect-loop.cc (vect_create_epilog_for_reduction):
            Do not register accumulator if we failed to reduce it
            to a single vector.

            * gcc.dg/vect/pr107160.c: New testcase.

    (cherry picked from commit 5cbaf84c191b9a3e3cb26545c808d208bdbf2ab5)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug target/107160] [12 regression] r13-2641-g0ee1548d96884d  causes verification failure in spec2006
  2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
                   ` (18 preceding siblings ...)
  2022-10-17 13:10 ` cvs-commit at gcc dot gnu.org
@ 2022-10-17 13:11 ` rguenth at gcc dot gnu.org
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-10-17 13:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107160

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
      Known to fail|                            |12.2.0
             Status|ASSIGNED                    |RESOLVED
      Known to work|                            |12.2.1

--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2022-10-17 13:11 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-05 16:21 [Bug target/107160] New: [13 regression] r13-2641-g0ee1548d96884d causes verification failure in spec2006 seurer at gcc dot gnu.org
2022-10-06 10:43 ` [Bug target/107160] " rguenth at gcc dot gnu.org
2022-10-06 14:04 ` seurer at gcc dot gnu.org
2022-10-06 19:08 ` seurer at gcc dot gnu.org
2022-10-10  2:29 ` linkw at gcc dot gnu.org
2022-10-10 19:14 ` seurer at gcc dot gnu.org
2022-10-12  8:31 ` linkw at gcc dot gnu.org
2022-10-13  9:57 ` linkw at gcc dot gnu.org
2022-10-13 10:16 ` rguenth at gcc dot gnu.org
2022-10-13 10:31 ` linkw at gcc dot gnu.org
2022-10-13 11:05 ` rguenther at suse dot de
2022-10-13 11:45 ` linkw at gcc dot gnu.org
2022-10-13 11:50 ` rguenther at suse dot de
2022-10-13 12:01 ` marxin at gcc dot gnu.org
2022-10-13 12:05 ` rguenth at gcc dot gnu.org
2022-10-13 12:23 ` rguenth at gcc dot gnu.org
2022-10-13 13:17 ` cvs-commit at gcc dot gnu.org
2022-10-13 13:19 ` [Bug target/107160] [12/13 " rguenth at gcc dot gnu.org
2022-10-14  2:54 ` [Bug target/107160] [12 " linkw at gcc dot gnu.org
2022-10-17 13:10 ` cvs-commit at gcc dot gnu.org
2022-10-17 13:11 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).