public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/98365] New: Miss vectoization
@ 2020-12-18  1:13 crazylht at gmail dot com
  2021-01-05  5:09 ` [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt crazylht at gmail dot com
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2020-12-18  1:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

            Bug ID: 98365
           Summary: Miss vectoization
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
                CC: hjl.tools at gmail dot com, wwwhhhyyy333 at gmail dot com
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu

cat test.c

int foo (char a[64], char c[64])
{
  int i;
  char cnt=0;
  for (int i = 0;i != 64; i++)
    if (a[i] == c[i])
      cnt++;
  return cnt;
}

With -Ofast -mavx2, GCC fails to vectorize the loop because of an unsupported pattern (see the dumps below).

dump of loop body:
-----------
  # cnt_21 = PHI <cnt_9(7), 0(15)>
  # i_22 = PHI <i_17(7), 0(15)>
  # ivtmp_19 = PHI <ivtmp_18(7), 64(15)>
  _1 = (sizetype) i_22;
  _2 = a_14(D) + _1;
  _3 = *_2;
  _5 = c_15(D) + _1;
  _6 = *_5;
  cnt.1_7 = (unsigned char) cnt_21;
  _8 = cnt.1_7 + 1;
  cnt_16 = (char) _8;
  cnt_9 = _3 == _6 ? cnt_16 : cnt_21;
  i_17 = i_22 + 1;
  ivtmp_18 = ivtmp_19 - 1;
----------
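
For readers less used to GIMPLE, the loop body in the dump can be read roughly as the following C. This is only a sketch: the function name foo_as_dumped and the local names cnt_u, inc and cnt_inc are made up for illustration, and the comments map each statement to the SSA name it is meant to mirror.

```c
/* Hypothetical C rendering of the dumped loop body; not compiler output.  */
int foo_as_dumped (char a[64], char c[64])
{
  char cnt = 0;                                   /* cnt_21 */
  for (int i = 0; i != 64; i++)
    {
      unsigned char cnt_u = (unsigned char) cnt;  /* cnt.1_7 */
      unsigned char inc = cnt_u + 1;              /* _8 */
      char cnt_inc = (char) inc;                  /* cnt_16 */
      /* The counter is used twice per iteration: once in the increment
         above and once as the "else" value of the select below.  */
      cnt = (a[i] == c[i]) ? cnt_inc : cnt;       /* cnt_9 */
    }
  return cnt;
}
```

The increment is always computed and the comparison only selects between the old and the new counter value.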

-fopt-info
---------
test.c:5:20: note:   vec_stmt_relevant_p: stmt live but not relevant.
test.c:5:20: note:   mark relevant 1, live 1: cnt_9 = _3 == _6 ? cnt_16 : cnt_21;
test.c:5:20: note:   init: stmt relevant? i_17 = i_22 + 1;
test.c:5:20: note:   init: stmt relevant? ivtmp_18 = ivtmp_19 - 1;
test.c:5:20: note:   init: stmt relevant? if (ivtmp_18 != 0)
test.c:5:20: note:   worklist: examine stmt: cnt_9 = _3 == _6 ? cnt_16 : cnt_21;
test.c:5:20: note:   vect_is_simple_use: operand *_2, type of def: internal
test.c:5:20: note:   mark relevant 1, live 0: _3 = *_2;
test.c:5:20: note:   vect_is_simple_use: operand *_5, type of def: internal
test.c:5:20: note:   mark relevant 1, live 0: _6 = *_5;
test.c:5:20: note:   vect_is_simple_use: operand (char) _8, type of def: internal
test.c:5:20: note:   mark relevant 1, live 0: cnt_16 = (char) _8;
test.c:5:20: note:   vect_is_simple_use: operand cnt_21 = PHI <cnt_9(7), 0(15)>, type of def: unknown
test.c:5:20: missed:   Unsupported pattern.
----------------
Shouldn't the stmt cnt_21 = PHI <cnt_9(7), 0(15)> be marked relevant?


BTW: with extra -fwrapv, gcc successfully vectorized the loop.


* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
  2020-12-18  1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
@ 2021-01-05  5:09 ` crazylht at gmail dot com
  2021-01-05  5:40 ` crazylht at gmail dot com
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-05  5:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---

> Shouldn't the stmt cnt_21 = PHI <cnt_9(7), 0(15)> be marked relevant?
> 

For the stmt cnt.1_7 = (unsigned char) cnt_21, the operand is defined by a
previous iteration of the loop, which is assumed to be handled as an
induction/reduction.

But vect_analyze_scalar_cycles can't detect the reduction of cnt (cnt_9 =
_3 == _6 ? cnt_16 : cnt_21;) since scalar evolution only handles
     - an SSA_NAME,
     - an INTEGER_CST,
     - a PLUS_EXPR,
     - a POINTER_PLUS_EXPR,
     - a MINUS_EXPR,
     - an ASSERT_EXPR,
     - other cases are not yet handled.


* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
  2020-12-18  1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
  2021-01-05  5:09 ` [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt crazylht at gmail dot com
@ 2021-01-05  5:40 ` crazylht at gmail dot com
  2021-01-05  9:01 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-05  5:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---

>   cnt.1_7 = (unsigned char) cnt_21;
>   _8 = cnt.1_7 + 1;
>   cnt_16 = (char) _8;
>   cnt_9 = _3 == _6 ? cnt_16 : cnt_21;
>  

tree_if_conversion has is_cond_scalar_reduction; I'm thinking of extending
the current implementation to transform the sequence below

      loop-header:
        cnt_21 = PHI <0, cnt_9>
      ...
        if (cond_expr)
          tmp1 = (unsigned type) cnt_21
          tmp2 = tmp1 +/- rhs2
          cnt_16 = (signed type) tmp2
        cnt_9 = PHI <cnt_16, cnt_21>

to

      loop-header:
        cnt_21 = PHI <0, cnt_9>
      ...
        tmp1 = (unsigned type) cnt_21
        ifcvt = cond_expr ? rhs2 : 0
        tmp2 = tmp1 +/- ifcvt
        cnt_9 = (signed type) tmp2

I hope the vectorizer's reduction handling can cope with the sequence above.
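
To make the proposed rewrite concrete, here is a small C sketch of one loop iteration in both forms, for the "+" case with signed char. The function names step_select and step_addend are hypothetical; tmp1, tmp2, ifcvt and rhs2 follow the pseudo-IL above. It assumes the usual modulo behaviour for the (implementation-defined) conversion from unsigned char back to signed char, which is what GCC documents.

```c
/* Original form: always compute the update, then select old or new value.  */
signed char step_select (signed char cnt, int cond, unsigned char rhs2)
{
  unsigned char tmp1 = (unsigned char) cnt;
  unsigned char tmp2 = (unsigned char) (tmp1 + rhs2);
  signed char updated = (signed char) tmp2;
  return cond ? updated : cnt;
}

/* Proposed form: the condition selects the addend (rhs2 or 0), so the
   counter is used only once and the update is a plain reduction.  */
signed char step_addend (signed char cnt, int cond, unsigned char rhs2)
{
  unsigned char tmp1 = (unsigned char) cnt;
  unsigned char ifcvt = cond ? rhs2 : 0;
  unsigned char tmp2 = (unsigned char) (tmp1 + ifcvt);
  return (signed char) tmp2;
}
```

Both functions should return the same value for every input, since adding 0 and converting back is a no-op on the counter.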


* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
  2020-12-18  1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
  2021-01-05  5:09 ` [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt crazylht at gmail dot com
  2021-01-05  5:40 ` crazylht at gmail dot com
@ 2021-01-05  9:01 ` rguenth at gcc dot gnu.org
  2021-01-05  9:57 ` crazylht at gmail dot com
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-05  9:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-01-05
             Blocks|                            |53947
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that we hit

  /* If this isn't a nested cycle or if the nested cycle reduction value
     is used outside of the inner loop we cannot handle uses of the reduction
     value.  */
  if (nlatch_def_loop_uses > 1 || nphi_def_loop_uses > 1)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "reduction used in loop.\n");
      return NULL;
    }

because cnt_21 is used in both the update and the COND_EXPR.  The reduction
doesn't fit the cond reductions we support but is a blend of a cond and
regular reduction.  Making the COND-reduction support handle this case should
be possible though.

Using 'int' we arrive at handled IL:

  # cnt_19 = PHI <cnt_8(7), 0(15)>
  _ifc__32 = _4 == _7 ? 1 : 0;
  cnt_8 = cnt_19 + _ifc__32;

so adjusting if-conversion can indeed help.
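
As a rough C reading of the handled IL for the 'int' case (a sketch only; foo_int and ifc are illustrative names, not taken from any dump):

```c
/* With an int counter no narrowing conversions are needed, so the
   if-converted update is a plain sum of 0/1 values.  */
int foo_int (char a[64], char c[64])
{
  int cnt = 0;                           /* cnt_19 */
  for (int i = 0; i != 64; i++)
    {
      int ifc = (a[i] == c[i]) ? 1 : 0;  /* _ifc__32 */
      cnt = cnt + ifc;                   /* cnt_8 = cnt_19 + _ifc__32 */
    }
  return cnt;
}
```

Here the counter has a single use inside the loop, which is the shape the reduction detection quoted above accepts.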


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
  2020-12-18  1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
                   ` (2 preceding siblings ...)
  2021-01-05  9:01 ` rguenth at gcc dot gnu.org
@ 2021-01-05  9:57 ` crazylht at gmail dot com
  2021-01-05 10:06 ` crazylht at gmail dot com
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-05  9:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
> I hope the vectorizer's reduction handling can cope with the sequence above.

After hacking ifcvt, I got:

.165t.ifcvt
----
  <bb 3> [local count: 1057206201]:
  # cnt_21 = PHI <cnt_9(7), 0(15)>
  # i_22 = PHI <i_17(7), 0(15)>
  # ivtmp_19 = PHI <ivtmp_18(7), 64(15)>
  _1 = (sizetype) i_22;
  _2 = a_14(D) + _1;
  _3 = *_2;
  _5 = c_15(D) + _1;
  _6 = *_5;
  cnt.1_7 = (unsigned char) cnt_21;
  _ifc__35 = _3 == _6 ? 1 : 0;
  _nop__36 = cnt.1_7 + _ifc__35;
  cnt_9 = (char) _nop__36;
  i_17 = i_22 + 1;
  ivtmp_18 = ivtmp_19 - 1;
  if (ivtmp_18 != 0)
    goto <bb 7>; [98.44%]
-------

And the loop is successfully vectorized.

.166t.vect
------
  <bb 3> [local count: 33071249]:
  # cnt_21 = PHI <cnt_9(7), 0(2)>
  # i_22 = PHI <i_17(7), 0(2)>
  # ivtmp_19 = PHI <ivtmp_18(7), 64(2)>
  # vect_cnt_21.6_38 = PHI <vect_cnt_9.16_51(7), { 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }(2)>
  # vectp_a.7_39 = PHI <vectp_a.7_40(7), a_14(D)(2)>
  # vectp_c.10_42 = PHI <vectp_c.10_43(7), c_15(D)(2)>
  # ivtmp_56 = PHI <ivtmp_57(7), 0(2)>
  _1 = (sizetype) i_22;
  _2 = a_14(D) + _1;
  vect__3.9_41 = MEM <vector(32) char> [(char *)vectp_a.7_39];
  _3 = *_2;
  _5 = c_15(D) + _1;
  vect__6.12_44 = MEM <vector(32) char> [(char *)vectp_c.10_42];
  _6 = *_5;
  vect_cnt.13_45 = VIEW_CONVERT_EXPR<vector(32) unsigned char>(vect_cnt_21.6_38);
  cnt.1_7 = (unsigned char) cnt_21;
  _48 = vect__3.9_41 == vect__6.12_44;
  vect__ifc__35.14_49 = VEC_COND_EXPR <_48, vect_cst__46, vect_cst__47>;
  _ifc__35 = _3 == _6 ? 1 : 0;
  vect__nop__36.15_50 = vect_cnt.13_45 + vect__ifc__35.14_49;
  _nop__36 = cnt.1_7 + _ifc__35;
  vect_cnt_9.16_51 = VIEW_CONVERT_EXPR<vector(32) char>(vect__nop__36.15_50);
  cnt_9 = (char) _nop__36;
  i_17 = i_22 + 1;
  ivtmp_18 = ivtmp_19 - 1;
  vectp_a.7_40 = vectp_a.7_39 + 32;
  vectp_c.10_43 = vectp_c.10_42 + 32;
  ivtmp_57 = ivtmp_56 + 1;
  if (ivtmp_57 < 2)
    goto <bb 7>; [50.00%]
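
What the vectorized loop computes can be emulated in scalar C as below. This is only a sketch under stated assumptions: foo_lanes, LANES and acc are hypothetical names, the 32 lanes and the two vector iterations are read off the dump (vector(32) char, ivtmp_57 < 2), and the final horizontal sum stands in for a reduction epilogue that the excerpt does not show.

```c
#define LANES 32  /* vector(32) char in the dump */

int foo_lanes (char a[64], char c[64])
{
  /* One unsigned char partial counter per lane, starting at { 0, ... }.  */
  unsigned char acc[LANES] = { 0 };

  for (int iv = 0; iv < 2; iv++)              /* two vector iterations */
    for (int lane = 0; lane < LANES; lane++)
      {
        int idx = iv * LANES + lane;
        /* Per lane: VEC_COND_EXPR picks 1 or 0, then an unsigned add.  */
        acc[lane] += (a[idx] == c[idx]) ? 1 : 0;
      }

  /* Assumed epilogue: sum the lanes and convert back to char.  */
  unsigned char total = 0;
  for (int lane = 0; lane < LANES; lane++)
    total += acc[lane];
  return (char) total;
}
```

With 64 elements the total never exceeds 64, so the narrow counters cannot wrap in this test case.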


* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
  2020-12-18  1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
                   ` (3 preceding siblings ...)
  2021-01-05  9:57 ` crazylht at gmail dot com
@ 2021-01-05 10:06 ` crazylht at gmail dot com
  2021-01-06  9:22 ` crazylht at gmail dot com
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-05 10:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
> 
> And the loop is successfully vectorized.
> 

The loop is also vectorized when cnt is defined as signed short, i.e.:
int foo (short a[64], short c[64])
{
  int i;
  short cnt=0;
  for (int i = 0;i != 64; i++)
    if (a[i] == c[i])
      cnt++;
  return cnt;
}

Since signed integer overflow is undefined in C and C++, GCC performs the
narrowed signed char/short arithmetic in the corresponding unsigned type to
avoid introducing UB, so ifcvt should be able to handle those nop conversions
properly.
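
A sketch of the signed short loop with the nop conversions written out by hand, in the shape ifcvt is expected to produce. The names foo_short_ifcvt, cnt_u and ifc are illustrative, and the conversion back to short assumes GCC's documented modulo behaviour for out-of-range values (not reached with 64 elements).

```c
int foo_short_ifcvt (short a[64], short c[64])
{
  short cnt = 0;
  for (int i = 0; i != 64; i++)
    {
      /* Nop conversion: same bits, but unsigned arithmetic wraps
         instead of overflowing.  */
      unsigned short cnt_u = (unsigned short) cnt;
      unsigned short ifc = (a[i] == c[i]) ? 1 : 0;
      /* Add in the unsigned type, then convert back to the signed type.  */
      cnt = (short) (unsigned short) (cnt_u + ifc);
    }
  return cnt;
}
```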


* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
  2020-12-18  1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
                   ` (4 preceding siblings ...)
  2021-01-05 10:06 ` crazylht at gmail dot com
@ 2021-01-06  9:22 ` crazylht at gmail dot com
  2021-06-01  1:59 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-06  9:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
Created attachment 49897
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49897&action=edit
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}

Waiting for GCC 12 stage 1.


* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
  2020-12-18  1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
                   ` (5 preceding siblings ...)
  2021-01-06  9:22 ` crazylht at gmail dot com
@ 2021-06-01  1:59 ` cvs-commit at gcc dot gnu.org
  2021-06-01  2:01 ` crazylht at gmail dot com
  2021-09-17  6:41 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-06-01  1:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:28daadc98094501175c9dfe4a985871fa6aa4f94

commit r12-1138-g28daadc98094501175c9dfe4a985871fa6aa4f94
Author: liuhongt <hongtao.liu@intel.com>
Date:   Wed Jan 6 16:33:27 2021 +0800

    Extend is_cond_scalar_reduction to handle nop_expr after/before scalar
reduction.[PR98365]

    gcc/ChangeLog:

            PR tree-optimization/98365
            * tree-if-conv.c (strip_nop_cond_scalar_reduction): New function.
            (is_cond_scalar_reduction): Handle nop_expr in cond scalar
reduction.
            (convert_scalar_cond_reduction): Ditto.
            (predicate_scalar_phi): Ditto.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/98365
            * gcc.target/i386/pr98365.c: New test.


* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
  2020-12-18  1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
                   ` (6 preceding siblings ...)
  2021-06-01  1:59 ` cvs-commit at gcc dot gnu.org
@ 2021-06-01  2:01 ` crazylht at gmail dot com
  2021-09-17  6:41 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-06-01  2:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC 12.


* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
  2020-12-18  1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
                   ` (7 preceding siblings ...)
  2021-06-01  2:01 ` crazylht at gmail dot com
@ 2021-09-17  6:41 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-17  6:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0

