public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/102512] New: Redundant max/min operation for vector reduction
From: crazylht at gmail dot com @ 2021-09-28 7:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102512
Bug ID: 102512
Summary: Redundant max/min operation for vector reduction
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: x86_64-*-* i?86-*-*
cat test.c
#define MAX(a, b) ((a) > (b) ? (a) : (b))

short
foo1 (short* p)
{
  short max = p[0];
  for (int i = 0; i != 8; i++)
    max = MAX(max, p[i]);
  return max;
}

short
foo2 (short* p)
{
  short max = p[0];
  for (int i = 1; i != 8; i++)
    max = MAX(max, p[i]);
  return max;
}
gcc -O3 -mavx2 -S
In foo1, the first MAX_EXPR <_10, vect__4.7_13> is redundant, since its
result is already covered by the subsequent .REDUC_MAX.
In foo2, the vectorizer fails to recognize the .REDUC_MAX pattern.
;; Function foo1 (foo1, funcdef_no=0, decl_uid=2991, cgraph_uid=1, symbol_order=0)

.248t.optimized
short int foo1 (short int * p)
{
  vector(8) short int vect_max_11.8;
  vector(8) short int vect__4.7;
  short int max;
  vector(8) short int _10;
  short int _20;

  <bb 2> [local count: 119292720]:
  max_9 = *p_8(D);
  _10 = {max_9, max_9, max_9, max_9, max_9, max_9, max_9, max_9};
  vect__4.7_13 = MEM <vector(8) short int> [(short int *)p_8(D)];
  vect_max_11.8_14 = MAX_EXPR <_10, vect__4.7_13>;
  _20 = .REDUC_MAX (vect_max_11.8_14); [tail call]
  return _20;

}
;; Function foo2 (foo2, funcdef_no=1, decl_uid=3000, cgraph_uid=2, symbol_order=1)

short int foo2 (short int * p)
{
  short int stmp_max_11.21;
  vector(4) short int vect_max_11.20;
  vector(4) short int vect__4.19;
  short int max;
  short int _4;
  short int _25;
  vector(4) short int _30;
  short int _34;
  vector(4) short int _38;
  vector(4) short int _39;
  vector(4) short int _40;
  vector(4) short int _41;
  short int _44;
  short int _46;

  <bb 2> [local count: 268435454]:
  max_9 = *p_8(D);
  _30 = {max_9, max_9, max_9, max_9};
  vect__4.19_35 = MEM <vector(4) short int> [(short int *)p_8(D) + 2B];
  vect_max_11.20_36 = MAX_EXPR <_30, vect__4.19_35>;
  _38 = VEC_PERM_EXPR <vect_max_11.20_36, { 0, 0, 0, 0 }, { 2, 3, 4, 5 }>;
  _39 = MAX_EXPR <vect_max_11.20_36, _38>;
  _40 = VEC_PERM_EXPR <_39, { 0, 0, 0, 0 }, { 1, 2, 3, 4 }>;
  _41 = MAX_EXPR <_39, _40>;
  stmp_max_11.21_42 = BIT_FIELD_REF <_41, 16, 0>;
  _4 = MEM[(short int *)p_8(D) + 10B];
  _46 = MEM[(short int *)p_8(D) + 12B];
  _34 = MAX_EXPR <_4, _46>;
  _25 = MEM[(short int *)p_8(D) + 14B];
  _44 = MAX_EXPR <_25, stmp_max_11.21_42>;
  max_26 = MAX_EXPR <_34, _44>;
  return max_26;

}
* [Bug tree-optimization/102512] Redundant max/min operation before vector reduction
From: pinskia at gcc dot gnu.org @ 2021-09-28 7:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102512
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-09-28
             Status|UNCONFIRMED                 |NEW
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I wonder whether, for the prologue in the second case, if we know the size
was originally greater than 4, we could just do an overlapping load and do
the max. This won't fix the issue fully, but it will produce better code
than we currently do.
Otherwise confirmed.
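
As a rough illustration of the overlapping-load idea, here is a sketch in plain C. It is not what the vectorizer emits: v4hi, load4, vmax4, reduc_max4, max_overlap and foo2_overlap are hypothetical names, and the struct merely stands in for a vector(4) short register. Two 4-wide loads cover between 4 and 8 elements, and any overlap between them is harmless because max(x, x) == x.

```c
#include <assert.h>
#include <string.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in for a vector(4) short register. */
typedef struct { short lane[4]; } v4hi;

static v4hi load4(const short *p)
{
  v4hi v;
  memcpy(v.lane, p, sizeof v.lane);
  return v;
}

/* Lane-wise max, modeling a vector MAX_EXPR. */
static v4hi vmax4(v4hi a, v4hi b)
{
  for (int i = 0; i < 4; i++)
    a.lane[i] = MAX(a.lane[i], b.lane[i]);
  return a;
}

/* Horizontal max, modeling .REDUC_MAX. */
static short reduc_max4(v4hi v)
{
  short m = v.lane[0];
  for (int i = 1; i < 4; i++)
    m = MAX(m, v.lane[i]);
  return m;
}

/* Max of p[0..n-1] for 4 <= n <= 8 using two possibly overlapping loads. */
short max_overlap(const short *p, int n)
{
  assert(n >= 4 && n <= 8);
  v4hi head = load4(p);          /* p[0 .. 3]     */
  v4hi tail = load4(p + n - 4);  /* p[n-4 .. n-1] */
  return reduc_max4(vmax4(head, tail));
}

/* foo2 from the report: initial value p[0], then the 7 elements p[1..7]. */
short foo2_overlap(const short *p)
{
  short m = max_overlap(p + 1, 7);  /* loads p[1..4] and p[4..7] */
  return MAX(p[0], m);
}
```

In this sketch p[4] is read by both loads, which replaces the three scalar loads and scalar MAX_EXPRs seen in the foo2 dump with one extra vector load.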
* [Bug tree-optimization/102512] Redundant max/min operation before vector reduction
From: rguenth at gcc dot gnu.org @ 2021-09-28 9:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102512
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
max_9 = *p_8(D);
_10 = {max_9, max_9, max_9, max_9, max_9, max_9, max_9, max_9};
vect__4.7_13 = MEM <vector(8) short int> [(short int *)p_8(D)];
vect_max_11.8_14 = MAX_EXPR <_10, vect__4.7_13>;
_20 = .REDUC_MAX (vect_max_11.8_14); [tail call]
It's a bit difficult to improve here: match.pd doesn't like MEMs too much,
and this all just collapses because _10 is a splat of element zero of
vect__4.7_13 ...
In theory the vectorizer could use the first full vector as initial value
or of course a vector of all SHORT_MIN. But the intent of using the first
scalar value was that this would optimize better ...
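
The "first full vector as initial value" option could look roughly like the following scalar model in C. This is a sketch under assumptions, not vectorizer output: VF and max_first_vector_init are hypothetical names, and the element count is assumed to be a positive multiple of the vector width so that no scalar tail is needed.

```c
#include <assert.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define VF 8  /* lanes in the modeled vector(8) short (hypothetical name) */

/* Max of p[0..n-1], where n is a positive multiple of VF.
   The accumulator starts as the first full vector p[0..VF-1]
   instead of a splat of p[0].  */
short max_first_vector_init(const short *p, int n)
{
  assert(n >= VF && n % VF == 0);

  short acc[VF];
  for (int l = 0; l < VF; l++)
    acc[l] = p[l];                    /* initial value: first full vector */

  for (int i = VF; i < n; i += VF)    /* loop starts at the second vector */
    for (int l = 0; l < VF; l++)
      acc[l] = MAX(acc[l], p[i + l]);

  short m = acc[0];                   /* epilogue, modeling .REDUC_MAX */
  for (int l = 1; l < VF; l++)
    m = MAX(m, acc[l]);
  return m;
}
```

With n == 8, as in foo1, the vector loop body never runs and only the load and the reduction remain.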
* [Bug tree-optimization/102512] Redundant max/min operation before vector reduction
From: rguenth at gcc dot gnu.org @ 2021-09-28 9:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102512
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
That is, the alternative is to apply the 'short max = p[0]' "bias" after
the epilogue and have the initial value be { SHORT_MIN, ... }.
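
A scalar model of that alternative might look as follows. It is only a sketch: SHORT_MIN is taken here to mean SHRT_MIN from <limits.h>, and VF and max_neutral_init are hypothetical names. The accumulator starts at the neutral element of max, and the scalar initial value (the "bias") is folded in once, after the reduction.

```c
#include <assert.h>
#include <limits.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define VF 8  /* lanes in the modeled vector(8) short (hypothetical name) */

/* Models:  max = init; for (i = 0; i != n; i++) max = MAX(max, p[i]);
   for n a non-negative multiple of VF.  */
short max_neutral_init(short init, const short *p, int n)
{
  assert(n >= 0 && n % VF == 0);

  short acc[VF];
  for (int l = 0; l < VF; l++)
    acc[l] = SHRT_MIN;                /* { SHORT_MIN, ... }: neutral for max */

  for (int i = 0; i < n; i += VF)     /* vector loop over all of p */
    for (int l = 0; l < VF; l++)
      acc[l] = MAX(acc[l], p[i + l]);

  short m = acc[0];                   /* epilogue, modeling .REDUC_MAX */
  for (int l = 1; l < VF; l++)
    m = MAX(m, acc[l]);

  return MAX(init, m);                /* apply the scalar bias afterwards */
}
```

For foo1 this would be called as max_neutral_init(p[0], p, 8); the final scalar max against p[0] is then the only place the initial value appears.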
* [Bug tree-optimization/102512] Redundant max/min operation before vector reduction
From: pinskia at gcc dot gnu.org @ 2021-12-29 6:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102512
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
* [Bug tree-optimization/102512] Redundant max/min operation before vector reduction
From: pinskia at gcc dot gnu.org @ 2024-08-24 4:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102512
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |syq at gcc dot gnu.org
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 116475 has been marked as a duplicate of this bug. ***