public inbox for gcc-bugs@sourceware.org
* [Bug target/99102] New: SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256
From: acoplan at gcc dot gnu.org @ 2021-02-15 10:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

            Bug ID: 99102
           Summary: SVE: Wrong code with -O2 -ftree-vectorize
                    -march=armv8.2-a+sve -msve-vector-bits=256
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

AArch64 GCC miscompiles the following testcase:

long a[44];
short d, e = -7;
void b(char f, short j, short k, unsigned l) {
  for (int g = 0; g < 9; g += f)
    for (int b = 0; b < 90; b -= k)
      for (int h = 0; h < f; h++)
        for (short i = 0; i < 15; i += 4)
          if (!l)
            a[i] = j;
}
int main() {
  for (long c = 0; c < 2; ++c)
    a[c] = 7;
  b(9, d, e, 5);
  if (!a[0])
    __builtin_abort();
}

at -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256. Looking at
the generated code for b:

b:
.LFB0:
        .cfi_startproc
        adrp    x3, .LANCHOR0
        and     w0, w0, 255
        sxth    w2, w2
        add     x3, x3, :lo12:.LANCHOR0
        mov     w5, 0
        mov     z0.h, w1
        ptrue   p0.b, vl32
        mov     x1, 32
        sxth    z0.d, p0/m, z0.d
        index   z1.d, #0, x1
.L2:
        mov     w4, 0
        .p2align 3,,7
.L7:
        mov     w1, 0
        cbz     w0, .L5
        .p2align 3,,7
.L3:
        add     w1, w1, 1
        st1d    z0.d, p0, [x3, z1.d]
        cmp     w0, w1
        bne     .L3
.L5:
        sub     w4, w4, w2
        cmp     w4, 89
        ble     .L7
        add     w5, w5, w0
        cmp     w5, 8
        ble     .L2
        ret

we appear to ignore the value of the argument "l" completely: x3, which carries
it on entry, is immediately clobbered with the address of a.
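
For reference, the expected scalar behaviour can be spelled out as a small
check. This is only a sketch: b_ref and check_a are hypothetical names that do
not appear in the report, and the loop nest is copied from the testcase above
with the inner loop variable renamed so that it no longer shadows the function
name. With l != 0 the guard is never true, so a[] should be left untouched;
with l == 0 only a[0], a[4], a[8] and a[12] should be written.

```c
#include <assert.h>

long a[44];

/* Hypothetical reference copy of b() from the testcase.  */
void b_ref(char f, short j, short k, unsigned l) {
  for (int g = 0; g < 9; g += f)
    for (int n = 0; n < 90; n -= k)
      for (int h = 0; h < f; h++)
        for (short i = 0; i < 15; i += 4)
          if (!l)
            a[i] = j;
}

/* Asserts that a[] holds `stored` at indices 0, 4, 8 and 12 (the only
   indices the innermost loop can reach) and `untouched` everywhere else.  */
void check_a(long untouched, long stored) {
  for (int i = 0; i < 44; i++) {
    int written = (i % 4 == 0) && (i < 15);
    assert(a[i] == (written ? stored : untouched));
  }
}
```

Filling a[] with 7 and calling b_ref(9, 0, -7, 5) should leave check_a(7, 7)
passing; a following b_ref(9, 3, -7, 0) should make check_a(7, 3) pass.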


* [Bug target/99102] [11 Regression] SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256
From: acoplan at gcc dot gnu.org @ 2021-02-15 10:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

Alex Coplan <acoplan at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|SVE: Wrong code with -O2    |[11 Regression] SVE: Wrong
                   |-ftree-vectorize            |code with -O2
                   |-march=armv8.2-a+sve        |-ftree-vectorize
                   |-msve-vector-bits=256       |-march=armv8.2-a+sve
                   |                            |-msve-vector-bits=256
   Target Milestone|---                         |11.0

--- Comment #1 from Alex Coplan <acoplan at gcc dot gnu.org> ---
Can't reproduce on the GCC 10 branch.


* [Bug target/99102] [11 Regression] SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256
From: ktkachov at gcc dot gnu.org @ 2021-02-15 12:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktkachov at gcc dot gnu.org
           Priority|P3                          |P1


* [Bug target/99102] [11 Regression] SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256
From: joelh at gcc dot gnu.org @ 2021-02-25 16:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

Joel Hutton <joelh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-02-25

--- Comment #2 from Joel Hutton <joelh at gcc dot gnu.org> ---
Bisect shows

46c705e70e078f6a1920d92e49042125d5e18495 is the first bad commit
commit 46c705e70e078f6a1920d92e49042125d5e18495
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Wed Nov 11 11:42:46 2020 +0000

    aarch64: Support SVE comparisons for unpacked integers

    This patch adds support for comparing unpacked SVE integer vectors,
    such as byte elements stored in the bottom bytes of halfword
    containers.  It also adds support for selects between unpacked
    SVE vectors (both integer and floating-point), since selects and
    compares are closely tied via the vcond optab interface.



is the offending commit.


* [Bug target/99102] [11 Regression] SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256
From: joelh at gcc dot gnu.org @ 2021-03-03 12:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

--- Comment #3 from Joel Hutton <joelh at gcc dot gnu.org> ---
It seems the 'vect' pass is vectorizing a 'MASK_STORE' into a 'SCATTER_STORE'
and dropping the mask:

testcase.c:7:29: note:  vect_is_simple_use: operand 0, type of def: constant
testcase.c:7:29: note:  created new init_stmt: vect_cst__65 = { 0, 0, 0, 0 };
testcase.c:7:29: note:  add new stmt: mask__60.11_66 = vect_cst__64 == vect_cst__65;
testcase.c:7:29: note:  ------>vectorizing statement: _61 = &a[_10];
testcase.c:7:29: note:  ------>vectorizing statement: .MASK_STORE (_61, 128B, _60, _31);
testcase.c:7:29: note:  transform statement.
testcase.c:7:29: note:  vect_is_simple_use: operand l_23(D) == 0, type of def: internal
testcase.c:7:29: note:  vect_is_simple_use: vectype vector(4) <signed-boolean:8>
testcase.c:7:29: note:  vect_is_simple_use: operand (long intD.9) j_24(D), type of def: internal
testcase.c:7:29: note:  vect_is_simple_use: vectype vector(4) long int
Applying pattern match.pd:139, generic-match.c:24056
Applying pattern match.pd:139, generic-match.c:24056
testcase.c:7:29: note:  transform store. ncopies = 1
testcase.c:7:29: note:  vect_get_vec_defs_for_operand: _31
testcase.c:7:29: note:  vect_is_simple_use: operand (long intD.9) j_24(D), type of def: internal
testcase.c:7:29: note:    def_stmt =  _31 = (long int) j_24(D);
testcase.c:7:29: note:  vect_get_vec_defs_for_operand: _60
testcase.c:7:29: note:  vect_is_simple_use: operand l_23(D) == 0, type of def: internal
testcase.c:7:29: note:    def_stmt =  _60 = l_23(D) == 0;
testcase.c:7:29: note:  create integer_type-pointer variable to type: long int  vectorizing a pointer ref: MEM[(long int *)&a]
Applying pattern match.pd:139, generic-match.c:27580
testcase.c:7:29: note:  created &a
testcase.c:7:29: note:  add new stmt: .SCATTER_STORE (vectp_a.12_67, { 0, 32, 64, 96 }, 1, vect__31.10_63);



Before the 'vect' pass:

  <bb 4> [local count: 858993458]:
  # i_39 = PHI <i_26(17), 0(3)>
  # ivtmp_16 = PHI <ivtmp_7(17), 4(3)>
  _1 = (int) i_39;
  _2 = (long int) j_24(D);
  _45 = l_23(D) == 0;
  _46 = &a[_1];
  .MASK_STORE (_46, 128B, _45, _2);
  i.0_3 = (unsigned short) i_39;
  _4 = i.0_3 + 4;
  i_26 = (short int) _4;
  ivtmp_7 = ivtmp_16 - 1;
  if (ivtmp_7 != 0)
    goto <bb 17>; [75.00%]
  else
    goto <bb 7>; [25.00%]

  <bb 17> [local count: 644245087]:
  goto <bb 4>; [100.00%]

  <bb 7> [local count: 214748368]:
  h_22 = h_33 + 1;
  if (_14 > h_22)
    goto <bb 16>; [89.00%]
  else
    goto <bb 19>; [11.00%]


After the 'vect' pass:

  # PT = null { D.3608 } (nonlocal)
  # ALIGN = 32, MISALIGN = 0
  # vectp_a.12_67 = PHI <&aD.3608(21), vectp_a.12_68(26)>
  # ivtmp_70 = PHI <0(21), ivtmp_71(26)>
  _10 = (intD.7) i_9;
  vect__31.10_63 = (vector(4) long intD.9) vect_cst__62;
  _31 = (long intD.9) j_24(D);
  mask__60.11_66 = vect_cst__64 == vect_cst__65;
  _60 = l_23(D) == 0;
  # PT = null { D.3608 } (nonlocal)
  _61 = &aD.3608[_10];
  # .MEM_69 = VDEF <.MEM_13>
  # USE = anything
  # CLB = anything
  .SCATTER_STORE (vectp_a.12_67, { 0, 32, 64, 96 }, 1, vect__31.10_63);
  # RANGE [0, 14] NONZERO 12
  i.0_27 = (unsigned short) i_9;
  # RANGE [4, 18] NONZERO 28
  _28 = i.0_27 + 4;
  # RANGE [4, 18] NONZERO 28
  i_29 = (short intD.18) _28;
  ivtmp_30 = ivtmp_11 - 1;
  # PT = null { D.3608 } (nonlocal)
  # ALIGN = 32, MISALIGN = 0
  vectp_a.12_68 = vectp_a.12_67 + 128;
  ivtmp_71 = ivtmp_70 + 1;
  if (ivtmp_71 < 1)
    goto <bb 26>; [0.00%]
  else
    goto <bb 40>; [100.00%]
;;    succ:       26 [never (adjusted)]  count:0 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                40 [always (adjusted)]  count:214748371 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 26, loop depth 4, count 0 (estimated locally)
;;    prev block 22, next block 40, flags: (NEW, VISITED)
;;    pred:       22 [never (adjusted)]  count:0 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  goto <bb 22>; [100.00%]
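
To make the difference concrete, here is a toy scalar model of an unmasked and
a masked scatter store, written as plain C rather than GCC internals. The
function names, the LANES constant and the reading of the { 0, 32, 64, 96 }
operand as byte offsets multiplied by the scale are assumptions made for
illustration; they are inferred from the dump above, not taken from the GCC
sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LANES 4  /* assumed: vector(4) long int, as in the dump */

/* Toy model of an unmasked scatter store: every lane is written.  */
static void
model_scatter_store (void *base, const long offsets[LANES], long scale,
                     const long values[LANES])
{
  assert (base != NULL);
  for (size_t lane = 0; lane < LANES; lane++)
    memcpy ((char *) base + offsets[lane] * scale, &values[lane],
            sizeof (long));
}

/* Toy model of a masked scatter store: only lanes whose mask element
   is true are written; inactive lanes leave memory untouched.  */
static void
model_mask_scatter_store (void *base, const long offsets[LANES], long scale,
                          const long values[LANES], const bool mask[LANES])
{
  assert (base != NULL);
  for (size_t lane = 0; lane < LANES; lane++)
    if (mask[lane])
      memcpy ((char *) base + offsets[lane] * scale, &values[lane],
              sizeof (long));
}
```

In the testcase l is 5, so the broadcast comparison l == 0 gives an all-false
mask: the masked form writes nothing and a[0] keeps its value of 7, whereas the
unmasked form overwrites a[0] with j (which is 0) and triggers the abort.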


* [Bug target/99102] [11 Regression] SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256
From: joelh at gcc dot gnu.org @ 2021-03-05 16:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

--- Comment #4 from Joel Hutton <joelh at gcc dot gnu.org> ---
It seems it is vectorizing a 'MASK_STORE' into a 'SCATTER_STORE' when it should
be using a 'MASK_SCATTER_STORE'. Currently it chooses between
IFN_SCATTER_STORE and IFN_MASK_SCATTER_STORE based on 'loop_masks', which is
only non-null when the loop is fully masked and so ultimately depends on the
'using_partial_vectors' field:

  vec_loop_masks *loop_masks
    = (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
       ? &LOOP_VINFO_MASKS (loop_vinfo)
       : NULL);
  vec_loop_lens *loop_lens
    = (loop_vinfo && LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo)
       ? &LOOP_VINFO_LENS (loop_vinfo)
       : NULL);

#define LOOP_VINFO_FULLY_MASKED_P(L)            \
  (LOOP_VINFO_USING_PARTIAL_VECTORS_P (L)       \
   && !LOOP_VINFO_MASKS (L).is_empty ())

              if (memory_access_type == VMAT_GATHER_SCATTER)
                {
                  tree scale = size_int (gs_info.scale);
                  gcall *call;
                  if (loop_masks)
                    call = gimple_build_call_internal
                      (IFN_MASK_SCATTER_STORE, 5, dataref_ptr, vec_offset,
                       scale, vec_oprnd, final_mask);
                  else
                    call = gimple_build_call_internal
                      (IFN_SCATTER_STORE, 4, dataref_ptr, vec_offset,
                       scale, vec_oprnd);
                  gimple_call_set_nothrow (call, true);
                  vect_finish_stmt_generation (vinfo, stmt_info, call, gsi);
                  new_stmt = call;
                  break;
                }

              if (i > 0)
                /* Bump the vector pointer.  */
                dataref_ptr = bump_vector_ptr (vinfo, dataref_ptr, ptr_incr,
                                               gsi, stmt_info, bump);

              if (slp)
                vec_oprnd = vec_oprnds[i];


* [Bug target/99102] [11 Regression] SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256
From: cvs-commit at gcc dot gnu.org @ 2021-03-10 12:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Joel Hutton <joelh@gcc.gnu.org>:

https://gcc.gnu.org/g:99d5299376d203fe5172574c2d6b0b088e532383

commit r11-7597-g99d5299376d203fe5172574c2d6b0b088e532383
Author: Joel Hutton <joel.hutton@arm.com>
Date:   Wed Mar 10 12:22:45 2021 +0000

    [Vect] Fix mask check on Scatter loads/stores

    Previously, IFN_MASK_SCATTER_STORE was used if 'loop_masks' was
    non-null, but the mask used is 'final_mask'. This caused a bug where
    a 'MASK_STORE' was vectorized into a 'SCATTER_STORE' instead of a
    'MASK_SCATTER_STORE'. This fixes PR target/99102.

    gcc/ChangeLog:

            PR target/99102
            * tree-vect-stmts.c (vectorizable_store): Fix scatter store mask
            check condition.
            (vectorizable_load): Fix gather load mask check condition.

    gcc/testsuite/ChangeLog:

            PR target/99102
            * gcc.dg/vect/pr99102.c: New test.
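
The commit message suggests that the choice of internal function should key on
whether a mask is actually going to be passed ('final_mask') rather than on
'loop_masks'. The following is a toy model of that decision only, not the
patch itself (which is not quoted in this thread): the enum, the function names
and the use of plain pointers to stand in for 'loop_masks' and 'final_mask' are
all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two internal functions.  */
enum model_ifn { MODEL_SCATTER_STORE, MODEL_MASK_SCATTER_STORE };

/* Selection as described for the code before the fix: keyed on
   loop_masks, which is null unless the loop is fully masked.  */
enum model_ifn
choose_before_fix (const void *loop_masks, const void *final_mask)
{
  (void) final_mask;
  return loop_masks ? MODEL_MASK_SCATTER_STORE : MODEL_SCATTER_STORE;
}

/* Selection keyed on the mask that would actually be passed.  */
enum model_ifn
choose_after_fix (const void *loop_masks, const void *final_mask)
{
  (void) loop_masks;
  return final_mask ? MODEL_MASK_SCATTER_STORE : MODEL_SCATTER_STORE;
}

/* The situation in this PR: the loop is not fully masked, but the
   statement carries its own mask.  */
void
model_self_check (void)
{
  int stmt_mask = 0;  /* any non-null object stands in for the mask */
  assert (choose_before_fix (NULL, &stmt_mask) == MODEL_SCATTER_STORE);
  assert (choose_after_fix (NULL, &stmt_mask) == MODEL_MASK_SCATTER_STORE);
  assert (choose_after_fix (NULL, NULL) == MODEL_SCATTER_STORE);
}
```

In this model the pre-fix selection picks the unmasked form whenever the loop
itself is not fully masked, even though a statement mask exists, which matches
the dropped mask seen in the dumps.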


* [Bug target/99102] [11 Regression] SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256
From: joelh at gcc dot gnu.org @ 2021-03-10 12:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

Joel Hutton <joelh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Joel Hutton <joelh at gcc dot gnu.org> ---
Fixed on trunk.


* [Bug target/99102] [11 Regression] SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256
From: rsandifo at gcc dot gnu.org @ 2021-03-30 16:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

--- Comment #7 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
*** Bug 98917 has been marked as a duplicate of this bug. ***
