public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/112457] New: Possible better vectorization of different reduction min/max reduction
@ 2023-11-09 12:27 juzhe.zhong at rivai dot ai
2023-11-09 12:29 ` [Bug c/112457] " juzhe.zhong at rivai dot ai
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2023-11-09 12:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457
Bug ID: 112457
Summary: Possible better vectorization of different reduction
min/max reduction
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: juzhe.zhong at rivai dot ai
Target Milestone: ---
Hi, Richard.
GCC-14 almost has all features of RVV.
I am planning to participate on improving GCC loop vectorizer in GCC-15.
Fix FAILs of TSVC is one of my plan.
Currently we can vectorize this following case:
int idx = 0;
int max = 0;
void foo (int n, int * __restrict a){
for (int i = 0; i < n; ++i) {
max = max < a[i] ? a[i] : max;
}
}
However, if we change this case it failed:
void foo2 (int n, int * __restrict a){
for (int i = 0; i < n; ++i) {
if (max < a[i]) {
max = a[i];
} else
max = max;
}
}
Now, I notice another interesting and possible vectorization enhancement which
inspired by this patch of LLVM:
https://reviews.llvm.org/D143465
And more advance case is which is case from LLVM patch:
which is vectorization reduction with index:
void foo3 (int n, int * __restrict a){
for (int i = 0; i < n; ++i) {
if (max < a[i]) {
idx = i;
max = a[i];
}
}
}
I wonder it is a valuable optimization ? If yes, it would be one of my TODO
list.
Thanks.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug c/112457] Possible better vectorization of different reduction min/max reduction
2023-11-09 12:27 [Bug c/112457] New: Possible better vectorization of different reduction min/max reduction juzhe.zhong at rivai dot ai
@ 2023-11-09 12:29 ` juzhe.zhong at rivai dot ai
2023-11-09 12:45 ` [Bug tree-optimization/112457] " rguenth at gcc dot gnu.org
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2023-11-09 12:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457
--- Comment #1 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Reference: https://godbolt.org/z/9M1jWzMdx
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/112457] Possible better vectorization of different reduction min/max reduction
2023-11-09 12:27 [Bug c/112457] New: Possible better vectorization of different reduction min/max reduction juzhe.zhong at rivai dot ai
2023-11-09 12:29 ` [Bug c/112457] " juzhe.zhong at rivai dot ai
@ 2023-11-09 12:45 ` rguenth at gcc dot gnu.org
2024-01-02 8:58 ` juzhe.zhong at rivai dot ai
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-11-09 12:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|c |tree-optimization
Blocks| |53947
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, this is because MAX_EXPR detection fails when store motion inserts flags
(the max = max is elided) to avoid store-data races. Also when using
-Ofast we avoid this but then the next phiopt comes too late to discover
MAX after store motion is applied.
The more practical example is
int foo2 (int max, int n, int * __restrict a)
{
for (int i = 0; i < n; ++i)
if (max < a[i]) {
max = a[i];
}
return max;
}
and that's handled OK. For your second example, index reduction, there's
already bugreports.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/112457] Possible better vectorization of different reduction min/max reduction
2023-11-09 12:27 [Bug c/112457] New: Possible better vectorization of different reduction min/max reduction juzhe.zhong at rivai dot ai
2023-11-09 12:29 ` [Bug c/112457] " juzhe.zhong at rivai dot ai
2023-11-09 12:45 ` [Bug tree-optimization/112457] " rguenth at gcc dot gnu.org
@ 2024-01-02 8:58 ` juzhe.zhong at rivai dot ai
2024-01-02 16:03 ` xry111 at gcc dot gnu.org
2024-01-08 9:00 ` rguenth at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2024-01-02 8:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457
--- Comment #3 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Created attachment 56973
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56973&action=edit
min/max reduction approach with index
Hi, Richi.
I have watch all PPT/video of 2023 llvm development meeting.
Turns out they already have a feasible solution/approach to support min/max
reduction with index.
Is it Ok that I support it by following the LLVM approach ?
The attachment is the PPT of LLVM development meeting that mentioned min/max
reduction with index.
Thanks.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/112457] Possible better vectorization of different reduction min/max reduction
2023-11-09 12:27 [Bug c/112457] New: Possible better vectorization of different reduction min/max reduction juzhe.zhong at rivai dot ai
` (2 preceding siblings ...)
2024-01-02 8:58 ` juzhe.zhong at rivai dot ai
@ 2024-01-02 16:03 ` xry111 at gcc dot gnu.org
2024-01-08 9:00 ` rguenth at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-01-02 16:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457
Xi Ruoyao <xry111 at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |xry111 at gcc dot gnu.org
--- Comment #4 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
There is also:
double
test (double *p)
{
double ret = p[0];
for (int i = 1; i < 4; i++)
ret = __builtin_fmin (ret, p[i]);
return ret;
}
This is not vectorized.
And
double
test (double *p)
{
double ret = __builtin_inf(); /* or __builtin_nan("") */
for (int i = 0; i < 4; i++)
ret = __builtin_fmin (ret, p[i]);
return ret;
}
is compiled to:
_16 = .REDUC_FMIN (vect__4.7_17);
_22 = .REDUC_FMIN ({ Inf, Inf, Inf, Inf });
_20 = .FMIN (_16, _22); [tail call]
return _20;
So there is an redundant .FMIN operation.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/112457] Possible better vectorization of different reduction min/max reduction
2023-11-09 12:27 [Bug c/112457] New: Possible better vectorization of different reduction min/max reduction juzhe.zhong at rivai dot ai
` (3 preceding siblings ...)
2024-01-02 16:03 ` xry111 at gcc dot gnu.org
@ 2024-01-08 9:00 ` rguenth at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-08 9:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
You want to find the duplicate bugreport for the min/max + index reductions,
IIRC the issue is that we fail the reduction detection because of multi-use
and we should really have two conditional reductions, one on the value and
one on the index without trying to be too clever combining them into a single
one.
That is, don't try to invent sth completely new based on what LLVM does but
understand what's missing in GCCs handling of conditional reductions
(it can do conditional value and conditional index reductions just fine,
just not both at the same time IIRC).
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-01-08 9:00 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-09 12:27 [Bug c/112457] New: Possible better vectorization of different reduction min/max reduction juzhe.zhong at rivai dot ai
2023-11-09 12:29 ` [Bug c/112457] " juzhe.zhong at rivai dot ai
2023-11-09 12:45 ` [Bug tree-optimization/112457] " rguenth at gcc dot gnu.org
2024-01-02 8:58 ` juzhe.zhong at rivai dot ai
2024-01-02 16:03 ` xry111 at gcc dot gnu.org
2024-01-08 9:00 ` rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).