public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/107412] New: Miss to fold LEN_{LOAD,STORE} when the specified length equal to vector length
From: linkw at gcc dot gnu.org @ 2022-10-26 9:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107412
Bug ID: 107412
Summary: Miss to fold LEN_{LOAD,STORE} when the specified
length equal to vector length
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: linkw at gcc dot gnu.org
Target Milestone: ---
test case:
===
#define N 16
int src[N];
int dest[N];

void foo () {
  for (int i = 0; i < (N-1); i++)
    dest[i] = src[i];
}
===
Options: -mcpu=power10 -fno-tree-loop-distribute-patterns --param
vect-partial-vector-usage=2 -O2 -ftree-vectorize -funroll-loops
-fno-vect-cost-model
optimized gimple output:
void foo ()
{
  vector(16) unsigned char vect_2;
  vector(16) unsigned char vect_13;
  vector(16) unsigned char vect_34;
  vector(16) unsigned char vect_47;

  <bb 2> [local count: 67108864]:
  vect_2 = .LEN_LOAD (&src, 128B, 16, 0);
  .LEN_STORE (&dest, 128B, 16, vect_2, 0);
  vect_34 = .LEN_LOAD (&MEM <int[16]> [(void *)&src + 16B], 128B, 16, 0);
  .LEN_STORE (&MEM <int[16]> [(void *)&dest + 16B], 128B, 16, vect_34, 0);
  vect_47 = .LEN_LOAD (&MEM <int[16]> [(void *)&src + 32B], 128B, 16, 0);
  .LEN_STORE (&MEM <int[16]> [(void *)&dest + 32B], 128B, 16, vect_47, 0);
  vect_13 = .LEN_LOAD (&MEM <int[16]> [(void *)&src + 48B], 128B, 12, 0);
  .LEN_STORE (&MEM <int[16]> [(void *)&dest + 48B], 128B, 12, vect_13, 0);
  [tail call]
  return;
}
The expectation is that only one .LEN_LOAD/.LEN_STORE pair remains, the one
with length 12. The others have length 16, which is the full vector size, so
they can become normal vector loads/stores.
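The lengths 16, 16, 16 and 12 in the dump follow from the loop copying
N-1 = 15 ints, i.e. 60 bytes assuming a 4-byte int, in 16-byte vectors. A
minimal C sketch of that arithmetic is below; the helper name chunk_len and
the constant VEC_BYTES are illustrative only and are not GCC names.

```c
#include <assert.h>
#include <stddef.h>

#define VEC_BYTES 16  /* size of one vector(16) unsigned char */

/* Number of bytes the access starting at byte OFFSET has to cover when
   TOTAL bytes are copied in steps of VEC_BYTES.  */
static size_t chunk_len (size_t total, size_t offset)
{
  assert (offset < total);
  size_t remaining = total - offset;
  return remaining < VEC_BYTES ? remaining : VEC_BYTES;
}
```

With total = 60, the offsets 0, 16 and 32 yield 16 (a whole vector), and
offset 48 yields 12, the only partial access.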
* [Bug tree-optimization/107412] Miss to fold LEN_{LOAD,STORE} when the specified length equal to vector length
From: linkw at gcc dot gnu.org @ 2022-10-26 9:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107412
Kewen Lin <linkw at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org
Ever confirmed|0 |1
Last reconfirmed| |2022-10-26
Keywords| |missed-optimization
Target| |powerpc64le-linux-gnu
--- Comment #1 from Kewen Lin <linkw at gcc dot gnu.org> ---
I'm working on a patch to extend the current
gimple_fold_mask_load_store_mem_ref.
* [Bug tree-optimization/107412] Miss to fold LEN_{LOAD,STORE} when the specified length equal to vector length
From: segher at gcc dot gnu.org @ 2022-10-27 22:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107412
Segher Boessenkool <segher at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |segher at gcc dot gnu.org
--- Comment #2 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Make sure we only use "plain" accesses on machines that allow all unaligned
accesses? p8 and later I think. The load-with-length insns are even later,
but a builtin does not necessarily translate to those newer insns, so some
care is required :-)
* [Bug tree-optimization/107412] Miss to fold LEN_{LOAD,STORE} when the specified length equal to vector length
From: linkw at gcc dot gnu.org @ 2022-10-31 6:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107412
--- Comment #3 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #2)
> Make sure we only use "plain" accesses on machines that allow all unaligned
> accesses? p8 and later I think. The load-with-length insns are even later,
> but a builtin does not necessarily translate to those newer insns, so some
> care is required :-)
Thanks for raising this. For now, LEN_{LOAD,STORE} can only be generated when
the target defines the relevant len_{load,store} optabs; on Power that means
power9 and later, and on s390 presumably some recent CPU. These are internal
functions, invisible to users, and both the support and the documentation
assume they work for unaligned accesses. If some target wants to emulate them
later, more than this folding will need updating. So we don't need to worry
about it for now, IMHO. :)
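As an illustration of the semantics under discussion (not of how any target
implements them), a length-controlled load can be modelled in portable C with
memcpy, which has no alignment requirement. The names len_load_model and vec16
are hypothetical. The bytes beyond the length are zeroed here only to keep the
model deterministic; that is an assumption of this sketch, not a statement
about the internal function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VEC_BYTES 16

typedef struct { unsigned char b[VEC_BYTES]; } vec16;

/* Model of a load with length: copy the first LEN bytes found at P,
   which may have any alignment.  Trailing bytes are zeroed (sketch
   assumption, see above).  */
static vec16 len_load_model (const void *p, size_t len)
{
  assert (len <= VEC_BYTES);
  vec16 v;
  memset (v.b, 0, sizeof v.b);
  memcpy (v.b, p, len);
  return v;
}
```

When len equals VEC_BYTES the model degenerates to a plain, possibly
unaligned, 16-byte copy, which is the case the folding targets.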
* [Bug tree-optimization/107412] Miss to fold LEN_{LOAD,STORE} when the specified length equal to vector length
From: cvs-commit at gcc dot gnu.org @ 2022-11-07 8:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107412
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kewen Lin <linkw@gcc.gnu.org>:
https://gcc.gnu.org/g:8408120fecc56385b316dafec1bdfe3aac61fc05
commit r13-3716-g8408120fecc56385b316dafec1bdfe3aac61fc05
Author: Kewen Lin <linkw@linux.ibm.com>
Date: Mon Nov 7 02:07:27 2022 -0600
vect: Fold LEN_{LOAD,STORE} if it's for the whole vector [PR107412]
As the test case in PR107412 shows, we can fold IFN .LEN_{LOAD,STORE}
into a normal vector load/store if the given length is known to be
equal to the length of the whole vector. This helps to reduce overall
cycles, since a vector access with length in bytes normally has higher
latency than a normal vector access, and it also saves the setup of
the length when a constant length cannot be encoded into the
instruction (such as on Power).
PR tree-optimization/107412
gcc/ChangeLog:
* gimple-fold.cc (gimple_fold_mask_load_store_mem_ref): Rename to ...
(gimple_fold_partial_load_store_mem_ref): ... this, add one parameter
mask_p indicating it's for mask or length, and add some handlings for
IFN LEN_{LOAD,STORE}.
(gimple_fold_mask_load): Rename to ...
(gimple_fold_partial_load): ... this, add one parameter mask_p.
(gimple_fold_mask_store): Rename to ...
(gimple_fold_partial_store): ... this, add one parameter mask_p.
(gimple_fold_call): Add the handlings for IFN LEN_{LOAD,STORE},
and adjust calls on gimple_fold_mask_load_store_mem_ref to
gimple_fold_partial_load_store_mem_ref.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr107412.c: New test.
* gcc.target/powerpc/p9-vec-length-epil-8.c: Adjust scan times for
folded LEN_LOAD.
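The folding condition described in this commit message can be sketched in
plain C. This is a simplified model, not the code in gimple-fold.cc:
can_fold_to_full_access is a hypothetical name, lengths are in bytes, and the
bias operand (0 in the dump from the original report) is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when a length-controlled access of LEN bytes covers the whole
   VEC_BYTES-byte vector, so it can be treated as a normal vector load
   or store.  */
static bool can_fold_to_full_access (size_t len, size_t vec_bytes)
{
  assert (len <= vec_bytes);
  return len == vec_bytes;
}
```

For the test case, the three accesses with length 16 satisfy the condition
and the final one with length 12 does not.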
* [Bug tree-optimization/107412] Miss to fold LEN_{LOAD,STORE} when the specified length equal to vector length
From: linkw at gcc dot gnu.org @ 2022-11-07 8:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107412
Kewen Lin <linkw at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #5 from Kewen Lin <linkw at gcc dot gnu.org> ---
Fixed on trunk.
* [Bug tree-optimization/107412] Miss to fold LEN_{LOAD,STORE} when the specified length equal to vector length
From: linkw at gcc dot gnu.org @ 2022-11-07 8:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107412
Kewen Lin <linkw at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |13.0
* [Bug tree-optimization/107412] Miss to fold LEN_{LOAD,STORE} when the specified length equal to vector length
From: cvs-commit at gcc dot gnu.org @ 2022-12-05 5:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107412
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kewen Lin <linkw@gcc.gnu.org>:
https://gcc.gnu.org/g:380d62c14c99d8df13b7a86660e7ee67d01ad827
commit r13-4488-g380d62c14c99d8df13b7a86660e7ee67d01ad827
Author: Kewen Lin <linkw@linux.ibm.com>
Date: Sun Dec 4 23:27:08 2022 -0600
gimple-fold: Refine gimple_fold_partial_load_store_mem_ref [PR107412]
Following Richard's review comments, this patch uses the
untruncated type for the length used for IFN_LEN_{LOAD,STORE}
instead of "unsigned int", for better robustness. It also
avoids using to_constant and tree arithmetic for the subtraction.
Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
PR tree-optimization/107412
gcc/ChangeLog:
* gimple-fold.cc (gimple_fold_partial_load_store_mem_ref): Use
untruncated type for the length, and avoid to_constant and tree
arithmetic for subtraction.
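Why truncating a length before comparing it is fragile can be shown with a
small C example. The value used is made up for illustration and the helper
names are hypothetical; the thread does not say whether such a length can
occur in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compare after truncating the length to 32 bits: high bits are lost.  */
static bool equals_vec_size_truncated (uint64_t len, uint32_t vec_bytes)
{
  uint32_t narrow = (uint32_t) len;
  return narrow == vec_bytes;
}

/* Compare at full width: no information is lost.  */
static bool equals_vec_size_untruncated (uint64_t len, uint32_t vec_bytes)
{
  return len == (uint64_t) vec_bytes;
}

/* A length of 2^32 + 16 is not a whole-vector length, but it looks
   like one after truncation.  */
static void demo (void)
{
  uint64_t len = (UINT64_C (1) << 32) + 16;
  assert (equals_vec_size_truncated (len, 16));
  assert (!equals_vec_size_untruncated (len, 16));
}
```

Comparing in the wider type removes this class of false match, which is the
robustness concern the commit message refers to.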