public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/99220] New: [11 Regression] ICE during vectorization when multiple instances do the same calculation but have different num lanes
From: tnfchris at gcc dot gnu.org @ 2021-02-23 16:40 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99220

            Bug ID: 99220
           Summary: [11 Regression] ICE during vectorization when multiple
                    instances do the same calculation but have different
                    num lanes
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: tnfchris at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64-*

The following testcase

class a {
  float b;
  float c;

public:
  a(float d, float e) : b(d), c(e) {}
  a operator+(a d) { return a(b + d.b, c + d.c); }
  a operator-(a d) { return a(b - d.b, c - d.c); }
  a operator*(a d) { return a(b * b - c * c, b * c + c * d.b); }
};
long f;
a *g;
class {
  a *h;
  long i;
  a *j;

public:
  void k() {
    a l = h[0], m = g[i], n = l * g[1], o = l * j[8];
    g[i] = m + n;
    g[i + 1] = m - n;
    j[f] = o;
  }
} p;
main() { p.k(); }

crashes with aarch64-none-elf-g++ -w -march=armv8.3-a -O3 -S main.cpp

because two nodes end up with the same pointer.  During the loop in
optimize_load_redistribution_1 that analyzes all the instances we do

      if (value)
        {
          SLP_TREE_REF_COUNT (value)++;
          SLP_TREE_CHILDREN (root)[i] = value;
          vect_free_slp_tree (node);
        }

when doing a replacement.  When this is done and the refcount for the node
reaches 0, the node is freed, which allows the libc to return the same pointer
again in the next call to new, which it does here.  Any cache entry still
keyed by the old pointer then matches the unrelated new node.
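
The address-reuse hazard described above can be reproduced outside of GCC.
The following is only a minimal sketch under stated assumptions: Node, Pool
and stale_hit_demo are hypothetical names, not GCC's slp_tree or its maps,
and the toy pool deliberately returns the most recently freed slot first so
that the reuse is deterministic (a real allocator may or may not do this).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an SLP node; not GCC's real slp_tree.
struct Node {
  int lanes;
  int refcnt;
};

// Toy allocator that hands the most recently freed slot back first,
// mimicking what a libc allocator is allowed (but not required) to do.
class Pool {
  std::vector<Node> slots_;
  std::vector<Node *> free_;

public:
  explicit Pool(std::size_t n) : slots_(n) {
    for (auto &s : slots_) free_.push_back(&s);
  }
  Node *alloc(int lanes) {
    Node *p = free_.back();
    free_.pop_back();
    *p = Node{lanes, 1};
    return p;
  }
  void release(Node *p) {
    if (--p->refcnt == 0) free_.push_back(p);
  }
};

// Returns true if a lookup for a brand-new node hits a stale cache entry.
bool stale_hit_demo() {
  Pool pool(4);
  std::unordered_map<Node *, int> cache;  // keyed by node address

  Node *first = pool.alloc(4);  // 4-lane permute from the first instance
  cache[first] = 4;             // remember a result for it
  pool.release(first);          // refcount hits 0, slot is recycled
  // BUG: the cache still holds an entry keyed by the dead address.

  Node *second = pool.alloc(2);  // 2-lane permute from the second instance
  auto it = cache.find(second);
  return second == first && it != cache.end() && it->second != second->lanes;
}
```

In this sketch the second node is found in the cache even though it was
never inserted, and the cached lane count (4) disagrees with the node's own
(2), which is the same shape of mismatch as in the dumps below.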


First instance

note:   node 0x5325f48 (max_nunits=1, refcnt=2)
note:   op: VEC_PERM_EXPR
note:           { }
note:           lane permutation { 0[0] 1[1] 0[2] 1[3] }
note:           children 0x5325db0 0x5325200

Second instance

note:   node 0x5325f48 (max_nunits=1, refcnt=1)
note:   op: VEC_PERM_EXPR
note:           { }
note:           lane permutation { 0[0] 1[1] }
note:           children 0x53255b8 0x5325530

This will end up with the illegal construction of

note:   node 0x53258e8 (max_nunits=2, refcnt=2)
note:   op template: slp_patt_57 = .COMPLEX_MUL (_16, _16);
note:           stmt 0 _16 = _14 - _15;
note:           stmt 1 _23 = _17 + _22;
note:           children 0x53257d8 0x5325d28
note:   node 0x53257d8 (max_nunits=2, refcnt=3)
note:   op template: l$b_4 = MEM[(const struct a &)_3].b;
note:           stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
note:           stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
note:           load permutation { 0 1 }
note:   node 0x5325d28 (max_nunits=2, refcnt=8)
note:   op template: l$b_4 = MEM[(const struct a &)_3].b;
note:           stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
note:           stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
note:           stmt 2 l$b_4 = MEM[(const struct a &)_3].b;
note:           stmt 3 l$c_5 = MEM[(const struct a &)_3].c;
note:           load permutation { 0 1 0 1 }

To prevent this we need to add these temporary VEC_PERM_EXPR nodes to the
bst_map cache and increase their refcnt by one.


* [Bug tree-optimization/99220] [11 Regression] ICE during vectorization when multiple instances do the same calculation but have different num lanes
From: rguenth at gcc dot gnu.org @ 2021-02-24  8:11 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99220

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0


* [Bug tree-optimization/99220] [11 Regression] ICE during vectorization when multiple instances do the same calculation but have different num lanes
From: rguenth at gcc dot gnu.org @ 2021-02-24 10:42 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99220

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 50245
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50245&action=edit
patch

So something like

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 091e727bbc3..24962af903f 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2351,6 +2351,11 @@ next:
        {
          SLP_TREE_REF_COUNT (value)++;
          SLP_TREE_CHILDREN (root)[i] = value;
+         /* ???  We know the original leafs of the replaced nodes will
+            be referenced by bst_map, only the permutes created by
+            pattern matching are not.  */
+         if (SLP_TREE_REF_COUNT (node) == 1)
+           load_map->remove (node);
          vect_free_slp_tree (node);
        }
     }

fixes the testcase, but not caching load_map across pattern matching
would make it easier to reason that this fix is complete.

Thus I am testing the attached on x86_64-linux, can you test on arm and report
back?
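
The idea behind the diff above, dropping the cache entry when the release is
the one that destroys the node, can be sketched in isolation.  This is a
hedged illustration only: Node, Pool, Cache and release_and_uncache are
hypothetical names, the LIFO pool is an assumption used to force address
reuse, and none of this is GCC's actual slp_tree or load_map code.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Node { int lanes; int refcnt; };  // hypothetical, not GCC's slp_tree

// Toy LIFO allocator so that address reuse is deterministic in this sketch.
struct Pool {
  std::vector<Node> slots;
  std::vector<Node *> free_list;
  explicit Pool(std::size_t n) : slots(n) {
    for (auto &s : slots) free_list.push_back(&s);
  }
  Node *alloc(int lanes) {
    Node *p = free_list.back();
    free_list.pop_back();
    *p = Node{lanes, 1};
    return p;
  }
  void release(Node *p) {
    if (--p->refcnt == 0) free_list.push_back(p);
  }
};

using Cache = std::unordered_map<Node *, int>;

// Mirrors the shape of the patch: erase the cache entry only when this
// release is the one that will actually destroy the node.
void release_and_uncache(Pool &pool, Cache &cache, Node *node) {
  if (node->refcnt == 1)
    cache.erase(node);
  pool.release(node);
}

// A recycled address no longer produces a stale cache hit.
bool fixed_demo() {
  Pool pool(4);
  Cache cache;
  Node *first = pool.alloc(4);
  cache[first] = 4;
  release_and_uncache(pool, cache, first);
  Node *second = pool.alloc(2);
  return second == first && cache.find(second) == cache.end();
}

// A node that is still referenced elsewhere keeps its cache entry.
bool shared_node_stays_cached() {
  Pool pool(4);
  Cache cache;
  Node *n = pool.alloc(4);
  n->refcnt = 2;  // pretend a second parent also references it
  cache[n] = 4;
  release_and_uncache(pool, cache, n);
  return n->refcnt == 1 && cache.count(n) == 1;
}
```

Checking the reference count before erasing keeps entries for nodes that
remain live, so only addresses that are about to be recycled leave the cache.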


* [Bug tree-optimization/99220] [11 Regression] ICE during vectorization when multiple instances do the same calculation but have different num lanes
From: cvs-commit at gcc dot gnu.org @ 2021-02-24 15:17 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99220

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <tnfchris@gcc.gnu.org>:

https://gcc.gnu.org/g:6c35e79b47ab582e18d851f6c5df776bac766eaf

commit r11-7359-g6c35e79b47ab582e18d851f6c5df776bac766eaf
Author: Tamar Christina <tamar.christina@arm.com>
Date:   Wed Feb 24 15:16:23 2021 +0000

    slp: fix accidental resource re-use of slp_tree (PR99220)

    The attached testcase shows a bug where two nodes end up with the same
    pointer.  During the loop that analyzes all the instances
    in optimize_load_redistribution_1 we do

          if (value)
            {
              SLP_TREE_REF_COUNT (value)++;
              SLP_TREE_CHILDREN (root)[i] = value;
              vect_free_slp_tree (node);
            }

    when doing a replacement.  When this is done and the refcount for the node
    reaches 0, the node is removed, which allows the libc to return the pointer
    again in the next call to new, which it does.

    First instance

    note:   node 0x5325f48 (max_nunits=1, refcnt=2)
    note:   op: VEC_PERM_EXPR
    note:           { }
    note:           lane permutation { 0[0] 1[1] 0[2] 1[3] }
    note:           children 0x5325db0 0x5325200

    Second instance

    note:   node 0x5325f48 (max_nunits=1, refcnt=1)
    note:   op: VEC_PERM_EXPR
    note:           { }
    note:           lane permutation { 0[0] 1[1] }
    note:           children 0x53255b8 0x5325530

    This will end up with the illegal construction of

    note:   node 0x53258e8 (max_nunits=2, refcnt=2)
    note:   op template: slp_patt_57 = .COMPLEX_MUL (_16, _16);
    note:           stmt 0 _16 = _14 - _15;
    note:           stmt 1 _23 = _17 + _22;
    note:           children 0x53257d8 0x5325d28
    note:   node 0x53257d8 (max_nunits=2, refcnt=3)
    note:   op template: l$b_4 = MEM[(const struct a &)_3].b;
    note:           stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
    note:           stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
    note:           load permutation { 0 1 }
    note:   node 0x5325d28 (max_nunits=2, refcnt=8)
    note:   op template: l$b_4 = MEM[(const struct a &)_3].b;
    note:           stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
    note:           stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
    note:           stmt 2 l$b_4 = MEM[(const struct a &)_3].b;
    note:           stmt 3 l$c_5 = MEM[(const struct a &)_3].c;
    note:           load permutation { 0 1 0 1 }

    To prevent this we remove the node from the load_map if it's
    about to be deleted.

    gcc/ChangeLog:

            PR tree-optimization/99220
            * tree-vect-slp.c (optimize_load_redistribution_1): Remove
            node from cache when it's about to be deleted.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/99220
            * g++.dg/vect/pr99220.cc: New test.


* [Bug tree-optimization/99220] [11 Regression] ICE during vectorization when multiple instances do the same calculation but have different num lanes
From: tnfchris at gcc dot gnu.org @ 2021-02-24 15:18 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99220

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
Fixed on master

