public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/98365] New: Miss vectoization
@ 2020-12-18 1:13 crazylht at gmail dot com
2021-01-05 5:09 ` [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt crazylht at gmail dot com
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2020-12-18 1:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
Bug ID: 98365
Summary: Miss vectoization
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
CC: hjl.tools at gmail dot com, wwwhhhyyy333 at gmail dot com
Target Milestone: ---
Host: x86_64-pc-linux-gnu
cat test.c
int foo (char a[64], char c[64])
{
  int i;
  char cnt=0;
  for (int i = 0;i != 64; i++)
    if (a[i] == c[i])
      cnt++;
  return cnt;
}
With -Ofast -mavx2, gcc fails to vectorize the loop.
Dump of the loop body:
-----------
# cnt_21 = PHI <cnt_9(7), 0(15)>
# i_22 = PHI <i_17(7), 0(15)>
# ivtmp_19 = PHI <ivtmp_18(7), 64(15)>
_1 = (sizetype) i_22;
_2 = a_14(D) + _1;
_3 = *_2;
_5 = c_15(D) + _1;
_6 = *_5;
cnt.1_7 = (unsigned char) cnt_21;
_8 = cnt.1_7 + 1;
cnt_16 = (char) _8;
cnt_9 = _3 == _6 ? cnt_16 : cnt_21;
i_17 = i_22 + 1;
ivtmp_18 = ivtmp_19 - 1;
----------
-fopt-info
---------
test.c:5:20: note: vec_stmt_relevant_p: stmt live but not relevant.
test.c:5:20: note: mark relevant 1, live 1: cnt_9 = _3 == _6 ? cnt_16 : cnt_21;
test.c:5:20: note: init: stmt relevant? i_17 = i_22 + 1;
test.c:5:20: note: init: stmt relevant? ivtmp_18 = ivtmp_19 - 1;
test.c:5:20: note: init: stmt relevant? if (ivtmp_18 != 0)
test.c:5:20: note: worklist: examine stmt: cnt_9 = _3 == _6 ? cnt_16 : cnt_21;
test.c:5:20: note: vect_is_simple_use: operand *_2, type of def: internal
test.c:5:20: note: mark relevant 1, live 0: _3 = *_2;
test.c:5:20: note: vect_is_simple_use: operand *_5, type of def: internal
test.c:5:20: note: mark relevant 1, live 0: _6 = *_5;
test.c:5:20: note: vect_is_simple_use: operand (char) _8, type of def: internal
test.c:5:20: note: mark relevant 1, live 0: cnt_16 = (char) _8;
test.c:5:20: note: vect_is_simple_use: operand cnt_21 = PHI <cnt_9(7), 0(15)>, type of def: unknown
test.c:5:20: missed: Unsupported pattern.
----------------
Shouldn't cnt_21 = PHI <cnt_9(7), 0(15)>, stmt relevant?
BTW: with extra -fwrapv, gcc successfully vectorized the loop.
* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
2020-12-18 1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
@ 2021-01-05 5:09 ` crazylht at gmail dot com
2021-01-05 5:40 ` crazylht at gmail dot com
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-05 5:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
> Shouldn't cnt_21 = PHI <cnt_9(7), 0(15)>, stmt relevant?
>
For the stmt cnt.1_7 = (unsigned char) cnt_21, the operand is defined by a
previous iteration of the loop, which is assumed to be handled as an
induction/reduction.
But vect_analyze_scalar_cycles can't detect the reduction of cnt
(cnt_9 = _3 == _6 ? cnt_16 : cnt_21;) since scalar evolution only handles
- an SSA_NAME,
- an INTEGER_CST,
- a PLUS_EXPR,
- a POINTER_PLUS_EXPR,
- a MINUS_EXPR,
- an ASSERT_EXPR,
- other cases are not yet handled.
* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
2020-12-18 1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
2021-01-05 5:09 ` [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt crazylht at gmail dot com
@ 2021-01-05 5:40 ` crazylht at gmail dot com
2021-01-05 9:01 ` rguenth at gcc dot gnu.org
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-05 5:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
> cnt.1_7 = (unsigned char) cnt_21;
> _8 = cnt.1_7 + 1;
> cnt_16 = (char) _8;
> cnt_9 = _3 == _6 ? cnt_16 : cnt_21;
>
In tree_if_conversion there's is_cond_scalar_reduction; I'm thinking of extending
the current implementation to reduce the sequence below
loop-header:
cnt_21 = PHI <0, cnt_9>
...
if (cond_expr)
tmp1 = (unsigned type) cnt_21
tmp2 = tmp1 +/- rhs2
cnt_16 = (signed type) tmp2
cnt_9 = PHI <cnt_16, cnt_21>
to
cnt_9 = PHI <0, cnt_21>
tmp1 = (unsigned type)cnt_9;
ifcvt = cond_expr ? rhs2 : 0
tmp2 = tmp1 +/- ifcvt;
cnt_21 = (signed type)tmp2;
I hope vectorizer reduction can handle the upper sequence.
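
As a rough illustration of the rewrite proposed above, here is a minimal C sketch of the two scalar shapes, assuming rhs2 is 1 and the operation is an addition. The function names count_eq_branch and count_eq_select are hypothetical and exist only for this example; they are not part of GCC.

```c
#include <stddef.h>

/* Shape before if-conversion: the unsigned round trip happens only
   on the taken path, so cnt is used by both the update and the merge.  */
signed char count_eq_branch(const signed char *a, const signed char *c,
                            size_t n)
{
    signed char cnt = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] == c[i]) {
            unsigned char tmp1 = (unsigned char) cnt;
            unsigned char tmp2 = (unsigned char) (tmp1 + 1);
            cnt = (signed char) tmp2;
        }
    }
    return cnt;
}

/* Shape after the proposed rewrite: the condition selects the addend
   (1 or 0) and the update runs unconditionally, so cnt has one use.  */
signed char count_eq_select(const signed char *a, const signed char *c,
                            size_t n)
{
    signed char cnt = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char tmp1 = (unsigned char) cnt;
        unsigned char ifcvt = (a[i] == c[i]) ? 1 : 0;
        unsigned char tmp2 = (unsigned char) (tmp1 + ifcvt);
        cnt = (signed char) tmp2;
    }
    return cnt;
}
```

Both functions return the same value for any input where the count stays within the range of signed char, which is the case for the 64-element arrays in the report.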
* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
2020-12-18 1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
2021-01-05 5:09 ` [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt crazylht at gmail dot com
2021-01-05 5:40 ` crazylht at gmail dot com
@ 2021-01-05 9:01 ` rguenth at gcc dot gnu.org
2021-01-05 9:57 ` crazylht at gmail dot com
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-05 9:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2021-01-05
Blocks| |53947
Ever confirmed|0 |1
Keywords| |missed-optimization
Status|UNCONFIRMED |NEW
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that we hit
  /* If this isn't a nested cycle or if the nested cycle reduction value
     is used outside of the inner loop we cannot handle uses of the reduction
     value.  */
  if (nlatch_def_loop_uses > 1 || nphi_def_loop_uses > 1)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "reduction used in loop.\n");
      return NULL;
    }
because cnt_21 is used in both the update and the COND_EXPR. The reduction
doesn't fit the cond reductions we support but is a blend of a cond and
regular reduction. Making the COND-reduction support handle this case should
be possible though.
Using 'int' we arrive at handled IL:
# cnt_19 = PHI <cnt_8(7), 0(15)>
_ifc__32 = _4 == _7 ? 1 : 0;
cnt_8 = cnt_19 + _ifc__32;
so adjusting if-conversion can indeed help.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
2020-12-18 1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
` (2 preceding siblings ...)
2021-01-05 9:01 ` rguenth at gcc dot gnu.org
@ 2021-01-05 9:57 ` crazylht at gmail dot com
2021-01-05 10:06 ` crazylht at gmail dot com
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-05 9:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
> I hope vectorizer reduction can handle the upper sequence.
After hacking ifcvt, I got
.165.cvt
----
<bb 3> [local count: 1057206201]:
# cnt_21 = PHI <cnt_9(7), 0(15)>
# i_22 = PHI <i_17(7), 0(15)>
# ivtmp_19 = PHI <ivtmp_18(7), 64(15)>
_1 = (sizetype) i_22;
_2 = a_14(D) + _1;
_3 = *_2;
_5 = c_15(D) + _1;
_6 = *_5;
cnt.1_7 = (unsigned char) cnt_21;
_ifc__35 = _3 == _6 ? 1 : 0;
_nop__36 = cnt.1_7 + _ifc__35;
cnt_9 = (char) _nop__36;
i_17 = i_22 + 1;
ivtmp_18 = ivtmp_19 - 1;
if (ivtmp_18 != 0)
goto <bb 7>; [98.44%]
-------
And it is successfully vectorized.
.166t.vect
------
<bb 3> [local count: 33071249]:
# cnt_21 = PHI <cnt_9(7), 0(2)>
# i_22 = PHI <i_17(7), 0(2)>
# ivtmp_19 = PHI <ivtmp_18(7), 64(2)>
# vect_cnt_21.6_38 = PHI <vect_cnt_9.16_51(7), { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }(2)>
# vectp_a.7_39 = PHI <vectp_a.7_40(7), a_14(D)(2)>
# vectp_c.10_42 = PHI <vectp_c.10_43(7), c_15(D)(2)>
# ivtmp_56 = PHI <ivtmp_57(7), 0(2)>
_1 = (sizetype) i_22;
_2 = a_14(D) + _1;
vect__3.9_41 = MEM <vector(32) char> [(char *)vectp_a.7_39];
_3 = *_2;
_5 = c_15(D) + _1;
vect__6.12_44 = MEM <vector(32) char> [(char *)vectp_c.10_42];
_6 = *_5;
vect_cnt.13_45 = VIEW_CONVERT_EXPR<vector(32) unsigned char>(vect_cnt_21.6_38);
cnt.1_7 = (unsigned char) cnt_21;
_48 = vect__3.9_41 == vect__6.12_44;
vect__ifc__35.14_49 = VEC_COND_EXPR <_48, vect_cst__46, vect_cst__47>;
_ifc__35 = _3 == _6 ? 1 : 0;
vect__nop__36.15_50 = vect_cnt.13_45 + vect__ifc__35.14_49;
_nop__36 = cnt.1_7 + _ifc__35;
vect_cnt_9.16_51 = VIEW_CONVERT_EXPR<vector(32) char>(vect__nop__36.15_50);
cnt_9 = (char) _nop__36;
i_17 = i_22 + 1;
ivtmp_18 = ivtmp_19 - 1;
vectp_a.7_40 = vectp_a.7_39 + 32;
vectp_c.10_43 = vectp_c.10_42 + 32;
ivtmp_57 = ivtmp_56 + 1;
if (ivtmp_57 < 2)
goto <bb 7>; [50.00%]
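
To make the shape of the vectorized loop above easier to follow, here is a plain C emulation, offered only as a sketch. It assumes 32 lanes of unsigned char, matching the vector(32) types in the dump. The final lane summation and the scalar tail are assumptions, since the dump shown here stops before the loop epilogue. The name count_eq_lanes is hypothetical.

```c
#include <stddef.h>

enum { LANES = 32 };  /* matches vector(32) char in the dump */

int count_eq_lanes(const char *a, const char *c, size_t n)
{
    /* One accumulator per lane, initialized to zero like the
       { 0, 0, ... } PHI argument of vect_cnt_21.  */
    unsigned char acc[LANES] = { 0 };
    size_t i = 0;

    /* Each outer iteration stands in for one vector iteration:
       load 32 elements of a and c, compare, select 1 or 0, add.  */
    for (; i + LANES <= n; i += LANES) {
        for (size_t l = 0; l < LANES; l++) {
            unsigned char sel = (a[i + l] == c[i + l]) ? 1 : 0;
            acc[l] = (unsigned char) (acc[l] + sel);
        }
    }

    /* Assumed epilogue: sum the lanes into a single count.  */
    unsigned char total = 0;
    for (size_t l = 0; l < LANES; l++)
        total = (unsigned char) (total + acc[l]);

    /* Assumed scalar tail for any elements left over.  */
    for (; i < n; i++)
        total = (unsigned char) (total + ((a[i] == c[i]) ? 1 : 0));

    return (signed char) total;
}
```

With n equal to 64 the outer loop runs twice, which corresponds to the `ivtmp_57 < 2` exit test in the dump.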
* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
2020-12-18 1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
` (3 preceding siblings ...)
2021-01-05 9:57 ` crazylht at gmail dot com
@ 2021-01-05 10:06 ` crazylht at gmail dot com
2021-01-06 9:22 ` crazylht at gmail dot com
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-05 10:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
>
> And successully vectorized.
>
The loop is also vectorized when cnt is defined as signed short,
i.e.
int foo (short a[64], short c[64])
{
  int i;
  short cnt=0;
  for (int i = 0;i != 64; i++)
    if (a[i] == c[i])
      cnt++;
  return cnt;
}
Since signed integer overflow is undefined in C++ and C, gcc always
converts signed char/short to unsigned char/short to avoid UB, and ifcvt should
be able to "properly" handle those nop conversions.
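
A small C sketch of the nop conversions mentioned above, under the assumption that the increment is carried out in the unsigned type as in the loop dump (cnt.1_7, _8, cnt_16). The helper names inc_source and inc_via_unsigned are hypothetical. The conversion back to signed char is implementation-defined for values above 127; the sketch relies on GCC's documented modulo behavior there.

```c
/* The increment as written in the source.  */
signed char inc_source(signed char cnt)
{
    cnt++;
    return cnt;
}

/* The increment as it appears in the dump: convert to unsigned char,
   add 1 (wrapping is well defined for unsigned types), convert back.  */
signed char inc_via_unsigned(signed char cnt)
{
    unsigned char tmp1 = (unsigned char) cnt;
    unsigned char tmp2 = (unsigned char) (tmp1 + 1);
    return (signed char) tmp2;
}
```

The two casts change only the type, not the bits, which is why they are called nop conversions and why if-conversion needs to look through them to see the underlying reduction.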
* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
2020-12-18 1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
` (4 preceding siblings ...)
2021-01-05 10:06 ` crazylht at gmail dot com
@ 2021-01-06 9:22 ` crazylht at gmail dot com
2021-06-01 1:59 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-01-06 9:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
Created attachment 49897
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49897&action=edit
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
Waiting for GCC12 stage1.
* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
2020-12-18 1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
` (5 preceding siblings ...)
2021-01-06 9:22 ` crazylht at gmail dot com
@ 2021-06-01 1:59 ` cvs-commit at gcc dot gnu.org
2021-06-01 2:01 ` crazylht at gmail dot com
2021-09-17 6:41 ` pinskia at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-06-01 1:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:28daadc98094501175c9dfe4a985871fa6aa4f94
commit r12-1138-g28daadc98094501175c9dfe4a985871fa6aa4f94
Author: liuhongt <hongtao.liu@intel.com>
Date: Wed Jan 6 16:33:27 2021 +0800
Extend is_cond_scalar_reduction to handle nop_expr after/before scalar
reduction.[PR98365]
gcc/ChangeLog:
PR tree-optimization/98365
* tree-if-conv.c (strip_nop_cond_scalar_reduction): New function.
(is_cond_scalar_reduction): Handle nop_expr in cond scalar
reduction.
(convert_scalar_cond_reduction): Ditto.
(predicate_scalar_phi): Ditto.
gcc/testsuite/ChangeLog:
PR tree-optimization/98365
* gcc.target/i386/pr98365.c: New test.
* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
2020-12-18 1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
` (6 preceding siblings ...)
2021-06-01 1:59 ` cvs-commit at gcc dot gnu.org
@ 2021-06-01 2:01 ` crazylht at gmail dot com
2021-09-17 6:41 ` pinskia at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2021-06-01 2:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
Hongtao.liu <crazylht at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|NEW |RESOLVED
--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC12.
* [Bug tree-optimization/98365] Miss vectoization for signed char ifcvt
2020-12-18 1:13 [Bug tree-optimization/98365] New: Miss vectoization crazylht at gmail dot com
` (7 preceding siblings ...)
2021-06-01 2:01 ` crazylht at gmail dot com
@ 2021-09-17 6:41 ` pinskia at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-17 6:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98365
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |12.0