[Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target


* [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target
@ 2021-08-31  0:15 wilson at gcc dot gnu.org
  2021-08-31  0:35 ` [Bug tree-optimization/102139] " pinskia at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: wilson at gcc dot gnu.org @ 2021-08-31  0:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

            Bug ID: 102139
           Summary: -O3 miscompile due to slp-vectorize on strict align
                    target
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wilson at gcc dot gnu.org
  Target Milestone: ---

This was originally reported here.
https://github.com/riscv/riscv-gcc/issues/289

This testcase is miscompiled at -O3 for a riscv64 target, though this is not a
bug in the riscv64 port.  I think it will fail for any strict align target.

typedef unsigned short uint16_t;

void zero_two_uint16(uint16_t* ptr) {
  ptr[0] = 0;
  ptr[1] = 0;
}

void zero(uint16_t* ptr) {
  for (int i = 0; i < 16; ++i) {
    zero_two_uint16(ptr);
    ptr += 2;
  }
}

The output is
zero:
        sd      zero,0(a0)
        sd      zero,8(a0)
        sd      zero,16(a0)
        sd      zero,24(a0)
        sd      zero,32(a0)
        sd      zero,40(a0)
        sd      zero,48(a0)
        sd      zero,56(a0)
        ret
which fails due to unaligned accesses, as a0 is only guaranteed 2-byte alignment.
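
As a minimal sketch of why those `sd` stores can trap: a `uint16_t *` is only guaranteed to be a multiple of 2, so it need not be a multiple of 8. The helper names `is_aligned` and `sd_would_be_misaligned` and the buffer `buf` are hypothetical and are not part of the original report.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the report: nonzero if p is a multiple
   of n bytes.  n must be a power of two.  */
int is_aligned(const void *p, size_t n)
{
  return ((uintptr_t)p & (n - 1)) == 0;
}

/* An 8-byte-aligned buffer of half-words; element 1 sits at byte offset 2.  */
uint16_t buf[40] __attribute__((aligned(8)));

/* Returns 1 when an 8-byte store through ptr would be misaligned.  */
int sd_would_be_misaligned(const uint16_t *ptr)
{
  return !is_aligned(ptr, 8);
}
```

Calling `zero (&buf[1])` is valid C, yet every double-word store in the output above would then land on an address that is 2 modulo 8.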

A git bisect tracked the problem down to this commit.

commit f5e18dd
Author: Kewen Lin <linkw@gcc.gnu.org>
Date: Tue Nov 3 02:51:47 2020 +0000

        pass: Run cleanup passes before SLP [PR96789]
        ...

I get correct code if I disable the fre4 pass, which is the fre pass inside
pre_slp_scalar_cleanup which was added by this patch.

The 169t.vectorize pass adds an address alignment check, and then emits a loop
with double-word stores if aligned, and a loop with half-word stores if
unaligned.  172t.cunroll fully unrolls both loops.  The 173t.fre4 pass deletes
a phi node before the half-word stores.  The 172t output has
  <bb 13> [local count: 12627204]:
  # ptr_3 = PHI <ptr_4(D)(2)>
  # ivtmp_15 = PHI <16(2)>
  *ptr_3 = 0;
and the 173t.fre4 output has
  <bb 13> [local count: 12627204]:
  *ptr_4(D) = 0;
In the 175t.slp1 pass, the block of half-word stores gets vectorized which is
wrong.  Then later 207t.dce7 notices duplicate code and deletes the second
block of stores.
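
For illustration, the versioned code that the vectorize pass produces corresponds roughly to the hand-written C below. This is a sketch, not compiler output: the function name `zero_versioned` and its return value are made up here, and `memcpy` stands in for the double-word store so that the example stays well-defined C.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative only: zero 32 half-words starting at ptr.
   Returns 1 if the double-word path ran, 0 if the half-word path ran.  */
int zero_versioned(uint16_t *ptr)
{
  const uint64_t zero64 = 0;

  if (((uintptr_t)ptr & 7) == 0)
    {
      /* Aligned version: 8 double-word stores cover 32 half-words.  */
      for (int i = 0; i < 8; ++i)
        memcpy(ptr + 4 * i, &zero64, sizeof zero64);
      return 1;
    }

  /* Unaligned version: 32 half-word stores.  This is the block that
     must not be turned into double-word stores.  */
  for (int i = 0; i < 32; ++i)
    ptr[i] = 0;
  return 0;
}
```

The miscompile amounts to treating the second loop as if the alignment test guarding the first loop also held for it.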

Comparing the full slp1 dump with fre4 disabled versus the unmodified slp1
dump, I see that the first significant difference is when computing pointer
alignment.  With fre4 disabled, I get

tmp.c:4:10: note:  recording new base alignment for vectp_ptr.8_125
  alignment:    8
  misalignment: 0
  based on:     MEM <vector(4) short unsigned int> [(uint16_t *)vectp_ptr.8_125] = { 0, 0, 0, 0 };
tmp.c:4:10: note:  recording new base alignment for ptr_3
  alignment:    2
  misalignment: 0
  based on:     *ptr_3 = 0;
tmp.c:4:10: note:   === vect_slp_analyze_instance_alignment ===
tmp.c:4:10: note:   vect_compute_data_ref_alignment:
tmp.c:4:10: note:   can't force alignment of ref: *ptr_3

It then refuses to vectorize.  With the unmodified compiler I get

tmp.c:4:10: note:  recording new base alignment for ptr_4(D)
  alignment:    8
  misalignment: 0
  based on:     MEM <vector(4) short unsigned int> [(uint16_t *)ptr_4(D)] = { 0, 0, 0, 0 };
tmp.c:4:10: note:   === vect_slp_analyze_instance_alignment ===
tmp.c:4:10: note:   vect_compute_data_ref_alignment:
tmp.c:4:10: missed:   misalign = 0 bytes of ref *ptr_4(D)

and then goes ahead and vectorizes which is wrong.

Maybe fre4 shouldn't optimize away a phi node when the pointers have different
alignment?

I noticed that before slp1 runs, the double-word store block has
  # ALIGN = 8, MISALIGN = 0
but the half-word store block does not.  After slp1 runs, both the double-word
store and the half-word store block have these notes.


* [Bug tree-optimization/102139] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
@ 2021-08-31  0:35 ` pinskia at gcc dot gnu.org
  2021-08-31  0:38 ` pinskia at gcc dot gnu.org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-31  0:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
It works fine for me on aarch64 with -O3 -mstrict-align.


* [Bug tree-optimization/102139] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
  2021-08-31  0:35 ` [Bug tree-optimization/102139] " pinskia at gcc dot gnu.org
@ 2021-08-31  0:38 ` pinskia at gcc dot gnu.org
  2021-08-31  1:24 ` [Bug tree-optimization/102139] [11/12 Regression] " pinskia at gcc dot gnu.org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-31  0:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> It works fine for me on aarch64 with -O3 -mstrict-align.

But I can reproduce it with -O3 -mgeneral-regs-only -mstrict-align


* [Bug tree-optimization/102139] [11/12 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
  2021-08-31  0:35 ` [Bug tree-optimization/102139] " pinskia at gcc dot gnu.org
  2021-08-31  0:38 ` pinskia at gcc dot gnu.org
@ 2021-08-31  1:24 ` pinskia at gcc dot gnu.org
  2021-08-31  1:35 ` pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-31  1:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |11.3
     Ever confirmed|0                           |1
            Summary|-O3 miscompile due to       |[11/12 Regression] -O3
                   |slp-vectorize on strict     |miscompile due to
                   |align target                |slp-vectorize on strict
                   |                            |align target
   Last reconfirmed|                            |2021-08-31

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
fre is fine here.
The problem is SLP.
Here is why
take:
typedef decltype(sizeof(0)) size_t;
typedef unsigned short uint16_t;
typedef unsigned uint32_t;

void zero_two_uint16(uint16_t* ptr) {
  ptr[0] = 0;
  ptr[1] = 0;
}

#define vector __attribute__((vector_size(sizeof(uint16_t)*4)))
void f(uint16_t *a)
{
    vector uint16_t *b = (vector uint16_t *)a;
    *b = (vector uint16_t){};
}

void g(uint16_t *a)
{
    size_t t = (size_t)a;
    if ((t & 0x7)==0) {
        for(int i = 0;i < 8;i++)
            f((a + i*4));
    } else {
        for(int i = 0;i < 16;i++)
            zero_two_uint16((a + i*2));
    }
}
---- CUT ----
Compile it on aarch64 with -O3 -mgeneral-regs-only
-fno-tree-loop-distribute-patterns -mstrict-align -fno-tree-loop-vectorize
and SLP produces the same result.
There is not even a PHI for a that FRE could merge, and there is no alignment
information on the pointer assignments either.

As you can see from the dump (-fdump-tree-*-all):
  # PT = nonlocal null 
  uint16_tD.1724 * a_15(D) = aD.1733;


* [Bug tree-optimization/102139] [11/12 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-08-31  1:24 ` [Bug tree-optimization/102139] [11/12 Regression] " pinskia at gcc dot gnu.org
@ 2021-08-31  1:35 ` pinskia at gcc dot gnu.org
  2021-08-31  1:47 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-31  1:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #3)
> There is not even a PHI for a that FRE could merge, and there is no alignment
> information on the pointer assignments either.

The only thing FRE does for this testcase that SLP might then get badly wrong
is propagating a_11(D) into the MEM.
From:
  # i_24 = PHI <0(2)>
  _22 = i_24 * 8;
  _25 = a_11(D) + _22;
  MEM[(vector(4) short unsigned int *)_25] = { 0, 0, 0, 0 };
Into:
  MEM[(vector(4) short unsigned int *)a_11(D)] = { 0, 0, 0, 0 };

But SLP does not seem to take into account that the two sides of the branch are
unrelated, and it still uses the alignment of the first for the second.

/app/example.cpp:6:10: note:  recording new base alignment for a_11(D)
  alignment:    2
  misalignment: 0
  based on:     *a_11(D) = 0;
/app/example.cpp:6:10: note:  recording new base alignment for a_11(D)
  alignment:    8
  misalignment: 0
  based on:     MEM[(vector(4) short unsigned int *)a_11(D)] = { 0, 0, 0, 0 };


* [Bug tree-optimization/102139] [11/12 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-08-31  1:35 ` pinskia at gcc dot gnu.org
@ 2021-08-31  1:47 ` pinskia at gcc dot gnu.org
  2021-08-31  7:38 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-31  1:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is a testcase which fails with -O3 -fno-tree-loop-ivcanon
-fno-tree-forwprop -mgeneral-regs-only -fno-tree-loop-distribute-patterns
-fno-vect-cost-model -mstrict-align -fno-tree-fre -fno-tree-loop-vectorize :


typedef decltype(sizeof(0)) size_t;
typedef unsigned short uint16_t;
typedef unsigned uint32_t;

void zero_two_uint16(uint16_t* ptr) {
  ptr[0] = 0;
  ptr[1] = 0;
}

#define vector __attribute__((vector_size(sizeof(uint16_t)*4)))
void f(uint16_t *a)
{
    vector uint16_t *b = (vector uint16_t *)a;
    *b = (vector uint16_t){};
}

void g(uint16_t *a)
{
    size_t t = (size_t)a;
    if ((t & 0x7)==0) {
        for(size_t i = 0;i < 2;i++)
            f((a + i*4));
    } else {
        for(size_t i = 0;i < 2;i++)
            zero_two_uint16((a + i*2));
    }
}

This shows that the cleanup passes don't make a difference and that some other
patch is causing it, although it was still working in GCC 10.


* [Bug tree-optimization/102139] [11/12 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-08-31  1:47 ` pinskia at gcc dot gnu.org
@ 2021-08-31  7:38 ` rguenth at gcc dot gnu.org
  2021-08-31 10:18 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-08-31  7:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |riscv
           Priority|P3                          |P2
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Mine.  This is caused by doing SLP on the whole function rather than on a
single BB but failing to update the base_alignment hash-map to also consider
flow.
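
A toy model of that failure mode, as a sketch only: the type `base_record` and both functions are invented for illustration and are not GCC's data structures. It keeps one alignment per base pointer, the maximum claimed by any access in the function, with no record of where the claim came from.

```c
/* Toy model, not GCC code: one recorded alignment per base pointer.  */
struct base_record
{
  unsigned align;   /* largest alignment any access has claimed */
};

/* Record an access.  Only the maximum is kept, with no note of which
   block or branch the claim was made in.  */
void record_flow_insensitive(struct base_record *rec, unsigned align)
{
  if (align > rec->align)
    rec->align = align;
}

/* The alignment any later query sees, wherever it asks from.  */
unsigned query_flow_insensitive(const struct base_record *rec)
{
  return rec->align;
}
```

Recording the half-word store (alignment 2) and the vector store (alignment 8) against the same base leaves 8 in the record, so a query made from the half-word branch is told 8 as well.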


* [Bug tree-optimization/102139] [11/12 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-08-31  7:38 ` rguenth at gcc dot gnu.org
@ 2021-08-31 10:18 ` rguenth at gcc dot gnu.org
  2021-08-31 10:22 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-08-31 10:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Testcase triggering a segfault on x86_64 and showing the issue inside a single
BB with a function that doesn't return.

void __attribute__((noipa))
foo (int i)
{
  if (i)
    __builtin_exit (0);
}
typedef double aligned_double __attribute__((aligned(2*sizeof(double))));
void __attribute__((noipa))
bar (double *p)
{
  p[0] = 0.;
  p[1] = 1.;
  foo (1);
  *(aligned_double *)p = 3.;
  p[1] = 4.;
}
double x[4] __attribute__((aligned(2*sizeof (double))));
int main()
{
  bar (&x[1]);
  return 0;
}


* [Bug tree-optimization/102139] [11/12 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2021-08-31 10:18 ` rguenth at gcc dot gnu.org
@ 2021-08-31 10:22 ` rguenth at gcc dot gnu.org
  2021-09-01 10:54 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-08-31 10:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|riscv                       |riscv, x86_64-*-*

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
And for a condition:

typedef double aligned_double __attribute__((aligned(2*sizeof(double))));
void __attribute__((noipa))
bar (int aligned, double *p)
{
  if (aligned)
    {
      *(aligned_double *)p = 3.;
      p[1] = 4.;
    }
  else
    {
      p[2] = 0.;
      p[3] = 1.;
    }
}
double x[8] __attribute__((aligned(2*sizeof (double))));
int main()
{
  bar (0, &x[1]);
  return 0;
}


* [Bug tree-optimization/102139] [11/12 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2021-08-31 10:22 ` rguenth at gcc dot gnu.org
@ 2021-09-01 10:54 ` cvs-commit at gcc dot gnu.org
  2021-09-01 10:55 ` [Bug tree-optimization/102139] [11 " rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-01 10:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:153766ec8351d55cfe8bd6d69bdfc0c2cef71e56

commit r12-3283-g153766ec8351d55cfe8bd6d69bdfc0c2cef71e56
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Aug 31 10:28:40 2021 +0200

    tree-optimization/102139 - fix SLP DR base alignment

    When doing whole-function SLP we have to make sure the recorded
    base alignments we compute as the maximum alignment seen for a
    base anywhere in the function is actually valid at the point
    we want to make use of it.

    To make this work we now record the stmt the alignment was derived
    from in addition to the DRs innermost behavior and we use a
    dominance check to verify the recorded info is valid when doing
    BB vectorization.  For this to work for groups inside a BB that are
    separate by a call that might not return we now store the DR
    analysis group-id permanently and use that for an additional check
    when the DRs are in the same BB.

    2021-08-31  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/102139
            * tree-vectorizer.h (vec_base_alignments): Adjust hash-map
            type to record a std::pair of the stmt-info and the innermost
            loop behavior.
            (dr_vec_info::group): New member.
            * tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
            (vect_compute_data_ref_alignment): Verify the recorded
            base alignment can be used.
            (data_ref_pair): Remove.
            (dr_group_sort_cmp): Adjust.
            (vect_analyze_data_ref_accesses): Store the group-ID in the
            dr_vec_info and operate on a vector of dr_vec_infos.

            * gcc.dg/torture/pr102139.c: New testcase.
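
    The dominance check described above can be sketched with a toy model.
    This is not the code from the commit: tracked_alignment, dominates and
    usable_alignment are hypothetical names, basic blocks are plain
    integers, and the same-BB group-id check is left out.

```c
/* Toy model, not GCC code.  idom[b] is the immediate dominator of
   block b; the entry block is its own immediate dominator.  */
struct tracked_alignment
{
  unsigned align;   /* recorded base alignment */
  int block;        /* block of the statement it was derived from */
};

/* Nonzero if block a dominates block b: walk up the dominator tree
   from b until a or the entry block is reached.  */
int dominates(const int *idom, int a, int b)
{
  for (;;)
    {
      if (a == b)
        return 1;
      if (idom[b] == b)   /* reached the entry block */
        return 0;
      b = idom[b];
    }
}

/* Use the recorded alignment only when the block it came from
   dominates the use; otherwise fall back to what the access type
   guarantees.  */
unsigned usable_alignment(const struct tracked_alignment *rec,
                          const int *idom, int use_block,
                          unsigned type_align)
{
  if (rec->align > type_align && dominates(idom, rec->block, use_block))
    return rec->align;
  return type_align;
}
```

    With an if/else diamond (entry block 0, branches 1 and 2, join 3), an
    alignment of 8 derived in block 1 is not usable in block 2, which is
    the situation in the original testcase.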


* [Bug tree-optimization/102139] [11 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2021-09-01 10:54 ` cvs-commit at gcc dot gnu.org
@ 2021-09-01 10:55 ` rguenth at gcc dot gnu.org
  2021-11-08 12:35 ` cvs-commit at gcc dot gnu.org
  2021-11-08 12:37 ` rguenth at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-01 10:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11/12 Regression] -O3      |[11 Regression] -O3
                   |miscompile due to           |miscompile due to
                   |slp-vectorize on strict     |slp-vectorize on strict
                   |align target                |align target
      Known to work|                            |12.0
      Known to fail|                            |11.2.0

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Should be fixed on trunk now.


* [Bug tree-optimization/102139] [11 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2021-09-01 10:55 ` [Bug tree-optimization/102139] [11 " rguenth at gcc dot gnu.org
@ 2021-11-08 12:35 ` cvs-commit at gcc dot gnu.org
  2021-11-08 12:37 ` rguenth at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-11-08 12:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:7f04f47d8d414c399ce7b5c8158fadc437469755

commit r11-9224-g7f04f47d8d414c399ce7b5c8158fadc437469755
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Aug 31 10:28:40 2021 +0200

    tree-optimization/102139 - fix SLP DR base alignment

    When doing whole-function SLP we have to make sure the recorded
    base alignments we compute as the maximum alignment seen for a
    base anywhere in the function is actually valid at the point
    we want to make use of it.

    To make this work we now record the stmt the alignment was derived
    from in addition to the DRs innermost behavior and we use a
    dominance check to verify the recorded info is valid when doing
    BB vectorization.  For this to work for groups inside a BB that are
    separate by a call that might not return we now store the DR
    analysis group-id permanently and use that for an additional check
    when the DRs are in the same BB.

    2021-08-31  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/102139
            * tree-vectorizer.h (vec_base_alignments): Adjust hash-map
            type to record a std::pair of the stmt-info and the innermost
            loop behavior.
            (dr_vec_info::group): New member.
            * tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
            (vect_compute_data_ref_alignment): Verify the recorded
            base alignment can be used.
            (data_ref_pair): Remove.
            (dr_group_sort_cmp): Adjust.
            (vect_analyze_data_ref_accesses): Store the group-ID in the
            dr_vec_info and operate on a vector of dr_vec_infos.

            * gcc.dg/torture/pr102139.c: New testcase.

    (cherry picked from commit 153766ec8351d55cfe8bd6d69bdfc0c2cef71e56)


* [Bug tree-optimization/102139] [11 Regression] -O3 miscompile due to slp-vectorize on strict align target
  2021-08-31  0:15 [Bug tree-optimization/102139] New: -O3 miscompile due to slp-vectorize on strict align target wilson at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2021-11-08 12:35 ` cvs-commit at gcc dot gnu.org
@ 2021-11-08 12:37 ` rguenth at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-11-08 12:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to work|                            |11.2.1
         Resolution|---                         |FIXED

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.

