public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/94301] New: Missed vector-vector CTOR / permute simplification
@ 2020-03-24 14:43 rguenth at gcc dot gnu.org
  2020-03-24 15:13 ` [Bug tree-optimization/94301] " rguenth at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-03-24 14:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94301

            Bug ID: 94301
           Summary: Missed vector-vector CTOR / permute simplification
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

When the vectorizer creates something stupid like

  _11 = __MEM <vector(1) double> ((double *)vectp_y.3_4);
  vect__2.5_15 = _Literal (vector(2) double) {_11, _Literal (vector(1) double) { 0.0 }};
  vectp_y.3_16 = vectp_y.3_4 + 16ul;
  _17 = __MEM <vector(1) double> ((double *)vectp_y.3_16);
  vect__2.6_18 = _Literal (vector(2) double) {_17, _Literal (vector(1) double) { 0.0 }};
  vect__2.7_19 = __VEC_PERM (vect__2.5_15, vect__2.6_18, _Literal (vector(2) ssizetype) { 0l, 2l });
  _20 = __VEC_PERM (vect__2.7_19, vect__2.7_19, _Literal (vector(2) ssizetype) { 0l, 0l });
  _21 = __VEC_PERM (vect__2.7_19, vect__2.7_19, _Literal (vector(2) ssizetype) { 1l, 1l });

we fail to combine those instructions to

  _20 = { _11, _11 };
  _21 = { _17, _17 };

and instead end up with

  _11 = MEM[symbol: y, index: ivtmp.13_10, offset: _Literal (double *) 0];
  vect__2.5_15 = _Literal (vector(2) double) {_11, _Literal (vector(1) double) { 0.0 }};
  _17 = MEM[symbol: y, index: ivtmp.13_10, offset: _Literal (double *) 16];
  vect__2.7_19 = __BIT_INSERT (vect__2.5_15, _17, 64u);
  _20 = __VEC_PERM (vect__2.7_19, vect__2.7_19, _Literal (vector(2) ssizetype) { 0l, 0l });
  _21 = __VEC_PERM (vect__2.7_19, vect__2.7_19, _Literal (vector(2) ssizetype) { 1l, 1l });

where RTL expansion even ICEs when trying to expand the __BIT_INSERT,
probably because of the V1DFmode insert, which eventually reaches
store_bit_field as BLKmode:

#4  0x0000000000d27c83 in store_bit_field (str_rtx=0x7ffff6dab000, bitsize=..., bitnum=...,
    bitregion_start=..., bitregion_end=..., fieldmode=E_BLKmode, value=0x7ffff6da3888,
    reverse=false) at ../../src/trunk/gcc/expmed.c:1174


That said, the vector-vector CTORs are probably unhandled in forwprop
simplifications.
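
The expected result can be checked with a small scalar model of the
permutes.  This is only a sketch, not GCC code: the type v2 and the helpers
vec_perm2 and splat_after_merge are hypothetical names, and the model assumes
the usual VEC_PERM_EXPR rule that selector elements index the concatenation
of both operands modulo twice the lane count.

```c
#include <assert.h>
#include <stddef.h>

/* Two-lane stand-in for vector(2) double (hypothetical type).  */
typedef struct { double lane[2]; } v2;

/* Model of VEC_PERM_EXPR <a, b, sel>: lane i of the result is element
   sel[i] of the concatenation of a and b, with indices taken modulo 4.  */
static v2 vec_perm2 (v2 a, v2 b, const long sel[2])
{
  v2 r;
  for (size_t i = 0; i < 2; i++)
    {
      unsigned long k = (unsigned long) sel[i] % 4;
      r.lane[i] = k < 2 ? a.lane[k] : b.lane[k - 2];
    }
  return r;
}

/* Mirror the dump: build {x11, 0.0} and {x17, 0.0}, merge them with the
   selector { 0, 2 }, then splat lane `lane` of the merged vector.  */
static v2 splat_after_merge (double x11, double x17, long lane)
{
  assert (lane == 0 || lane == 1);
  v2 lo = { { x11, 0.0 } };
  v2 hi = { { x17, 0.0 } };
  const long merge[2] = { 0, 2 };
  v2 merged = vec_perm2 (lo, hi, merge);
  const long splat[2] = { lane, lane };
  return vec_perm2 (merged, merged, splat);
}
```

In this model splat_after_merge (a, b, 0) yields { a, a } and
splat_after_merge (a, b, 1) yields { b, b }; the zero lanes of the two
constructors never reach the result, which is why the whole sequence could
fold to the two-element CTORs shown above.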


* [Bug tree-optimization/94301] Missed vector-vector CTOR / permute simplification
  2020-03-24 14:43 [Bug tree-optimization/94301] New: Missed vector-vector CTOR / permute simplification rguenth at gcc dot gnu.org
@ 2020-03-24 15:13 ` rguenth at gcc dot gnu.org
  2020-09-01  9:32 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-03-24 15:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94301

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-03-24
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
           Keywords|                            |missed-optimization


* [Bug tree-optimization/94301] Missed vector-vector CTOR / permute simplification
  2020-03-24 14:43 [Bug tree-optimization/94301] New: Missed vector-vector CTOR / permute simplification rguenth at gcc dot gnu.org
  2020-03-24 15:13 ` [Bug tree-optimization/94301] " rguenth at gcc dot gnu.org
@ 2020-09-01  9:32 ` rguenth at gcc dot gnu.org
  2020-09-02  9:04 ` rguenth at gcc dot gnu.org
  2021-12-12 13:26 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-01  9:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94301

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Testcase

typedef double v1df __attribute__((vector_size(8)));
typedef double v2df __attribute__((vector_size(16)));
typedef long v2di __attribute__((vector_size(16)));

v2df __GIMPLE(ssa) foo (v1df x, v1df z)
{
  v2df y;

  __BB(2):
  y_2 = _Literal (v2df) { x_1(D), _Literal (v1df) { 0.0 } };
  y_3 = _Literal (v2df) { z_4(D), _Literal (v1df) { 0.0 } };
  y_5 = __VEC_PERM (y_2, y_3, _Literal (v2di) { 0l, 2l });
  y_6 = __VEC_PERM (y_5, y_5, _Literal (v2di) { 0l, 0l });
  return y_6;
}


> ./cc1 -quiet t.c -fgimple -O
during RTL pass: expand
t.c: In function 'foo':
t.c:5:20: internal compiler error: in require, at machmode.h:293
    5 | v2df __GIMPLE(ssa) foo (v1df x, v1df z)
      |                    ^~~
0xb84809 opt_mode<scalar_int_mode>::require() const
        /home/rguenther/src/gcc2/gcc/machmode.h:293
0xd5da8c store_integral_bit_field
        /home/rguenther/src/gcc2/gcc/expmed.c:1006
0xd5d2fe store_bit_field_1
        /home/rguenther/src/gcc2/gcc/expmed.c:873


This works with -O0.  With -O we expand from

  y_2 = {x_1(D), { 0.0 }};
  y_4 = BIT_INSERT_EXPR <y_2, z_3(D), 64>;
  y_5 = VEC_PERM_EXPR <y_4, y_4, { 0, 0 }>;
  return y_5;

and the issue is that 'value' is

(mem/c:BLK (plus:DI (reg/f:DI 76 virtual-incoming-args)
        (const_int 8 [0x8])) [1 z+0 S8 A64])

because V1DF isn't a supported vector mode on x86_64 and vector lowering
doesn't do anything to it either.  Eventually V1m types should fall back to
the component mode transparently.  ABI-wise we seem to pass V1DF on the
stack ...

So the "simple" patch

diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index bde6fa22b58..90fc34e5a2c 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -512,6 +512,10 @@ mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
        return mode;
     }

+  /* For single-element vectors, map it to the component mode.  */
+  if (known_eq (nunits, 1))
+    return innermode;
+
   return opt_machine_mode ();
 }


not only fixes the ICE and generates optimal code but also changes the ABI...


* [Bug tree-optimization/94301] Missed vector-vector CTOR / permute simplification
  2020-03-24 14:43 [Bug tree-optimization/94301] New: Missed vector-vector CTOR / permute simplification rguenth at gcc dot gnu.org
  2020-03-24 15:13 ` [Bug tree-optimization/94301] " rguenth at gcc dot gnu.org
  2020-09-01  9:32 ` rguenth at gcc dot gnu.org
@ 2020-09-02  9:04 ` rguenth at gcc dot gnu.org
  2021-12-12 13:26 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-02  9:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94301

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so the ICE is the vectorizer's fault in generating BLKmode vectors.

What remains is the missed optimization on __VEC_PERM:

typedef double v2df __attribute__((vector_size(16)));
typedef double v4df __attribute__((vector_size(32)));
typedef long long v4di __attribute__((vector_size(32)));

v4df __GIMPLE(ssa) foo (v2df x, v2df z)
{
  v4df y;

  __BB(2):
  y_2 = _Literal (v4df) { x_1(D), _Literal (v2df) { 0.0, 0.0 } };
  y_3 = _Literal (v4df) { z_4(D), _Literal (v2df) { 0.0, 0.0 } };
  y_5 = __VEC_PERM (y_2, y_3, _Literal (v4di) { 0l, 1l, 4l, 5l });
  return y_5;
}

remains

  y_2 = {x_1(D), { 0.0, 0.0 }};
  y_3 = {z_4(D), { 0.0, 0.0 }};
  y_5 = VEC_PERM_EXPR <y_2, y_3, { 0, 1, 4, 5 }>;

but should become just

  y_5 = {x_1(D), z_4(D)};

forwprop in simplify_permutation resorts to fold_ternary of
VEC_PERM_EXPR <{x_1(D), { 0.0, 0.0 }}, {z_4(D), { 0.0, 0.0 }}, { 0, 1, 4, 5 }>
which eventually ends up in fold_vec_perm which fails because
vec_cst_ctor_to_array cannot handle the vector typed component x_1(D).
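
One way to picture what is missing, as a rough sketch only: treat each
constructor of subvectors as a list of per-lane origins, apply the selector
to those origins, and then test whether the result regroups into whole
subvectors.  Nothing below is GCC code; lane_origin, flatten_ctor, permute,
regroup and the SRC_* ids are hypothetical names for illustration, and the
model assumes non-negative selector elements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SUB = 2, FULL = 4 };           /* lanes per subvector / full vector */
enum { SRC_X, SRC_Z, SRC_ZERO };      /* hypothetical operand ids */

/* Where a lane comes from: lane `idx` of operand `src`.  */
typedef struct { int src; int idx; } lane_origin;

/* Flatten a constructor of two subvector operands into lane origins.  */
static void flatten_ctor (int src0, int src1, lane_origin out[FULL])
{
  for (int i = 0; i < FULL; i++)
    {
      out[i].src = i < SUB ? src0 : src1;
      out[i].idx = i % SUB;
    }
}

/* Apply a VEC_PERM-style selector to the flattened operands a and b.  */
static void permute (const lane_origin a[FULL], const lane_origin b[FULL],
                     const int sel[FULL], lane_origin out[FULL])
{
  for (int i = 0; i < FULL; i++)
    {
      assert (sel[i] >= 0);           /* assumption of this model */
      int k = sel[i] % (2 * FULL);
      out[i] = k < FULL ? a[k] : b[k - FULL];
    }
}

/* Succeed when every SUB-lane group is one whole operand in lane order;
   parts[] then names the operands of the simplified constructor.  */
static bool regroup (const lane_origin in[FULL], int parts[FULL / SUB])
{
  for (int p = 0; p < FULL / SUB; p++)
    {
      int src = in[p * SUB].src;
      for (int j = 0; j < SUB; j++)
        if (in[p * SUB + j].src != src || in[p * SUB + j].idx != j)
          return false;
      parts[p] = src;
    }
  return true;
}
```

For the testcase above, flattening {x, {0.0, 0.0}} and {z, {0.0, 0.0}} and
permuting with { 0, 1, 4, 5 } regroups into the operands x and z, matching
the desired y_5 = {x_1(D), z_4(D)}.  A selector such as { 0, 2, 4, 6 } mixes
lanes from different operands and does not regroup in this model.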


* [Bug tree-optimization/94301] Missed vector-vector CTOR / permute simplification
  2020-03-24 14:43 [Bug tree-optimization/94301] New: Missed vector-vector CTOR / permute simplification rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2020-09-02  9:04 ` rguenth at gcc dot gnu.org
@ 2021-12-12 13:26 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-12-12 13:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94301

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

