public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/114061] New: GCC fails vectorization when using __builtin_prefetch
@ 2024-02-22 19:04 tnfchris at gcc dot gnu.org
2024-02-22 19:07 ` [Bug tree-optimization/114061] " pinskia at gcc dot gnu.org
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2024-02-22 19:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
Bug ID: 114061
Summary: GCC fails vectorization when using __builtin_prefetch
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
The following example:
    void foo(double * restrict a, double * restrict b, int n){
      int i;
      for(i=0; i<n; ++i){
        a[i] = a[i] + b[i];
        __builtin_prefetch(&(b[i+8]));
      }
    }
fails to vectorize because of the __builtin_prefetch.
/app/example.c:5:5: missed: statement clobbers memory: __builtin_prefetch (_10);
/app/example.c:3:13: missed: not vectorized: loop contains function calls or data references that cannot be analyzed
However, two things:
1. Prefetches are usually hints anyway and not a correctness matter. It should
be safe to elide the call and vectorize as normal.
2. SVE has vector prefetch operations which we could use here. The vector
prefetches are also predicated, so they would need to be actually code-generated.
Perhaps one solution here would be to have a vect-pattern which checks for
COND_PREFETCH support, uses it if supported, and if not just elides the prefetch?
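For anyone reproducing this, a self-contained variant of the test case might look like the sketch below. `foo_ref` is a hypothetical name introduced here purely for comparison; it is the same loop without the hint. The "missed:" diagnostics above look like the output of `-fopt-info-vec-missed`, but that flag is an assumption, since the report does not state the command line.

```c
/* Test case from the report: the prefetch is only a hint and does not
   change the values stored to a[]. */
void foo(double *restrict a, double *restrict b, int n)
{
    for (int i = 0; i < n; ++i) {
        a[i] = a[i] + b[i];
        __builtin_prefetch(&b[i + 8]);
    }
}

/* Hypothetical reference version (not part of the report): the same
   loop with the prefetch removed by hand. */
void foo_ref(double *restrict a, double *restrict b, int n)
{
    for (int i = 0; i < n; ++i)
        a[i] = a[i] + b[i];
}
```

Both functions should produce identical results for the same inputs. Since `&b[i + 8]` is formed for every `i < n`, callers of this sketch should give `b` at least `n + 8` elements so the address computation stays inside the array.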
* [Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch
From: pinskia at gcc dot gnu.org @ 2024-02-22 19:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I thought there was already one recorded about this.
* [Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch
From: tnfchris at gcc dot gnu.org @ 2024-02-22 19:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #2 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> I thought there was already one recorded about this.
I could only find https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103938 about an
ICE when prefetching a vector address.
* [Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch
From: pinskia at gcc dot gnu.org @ 2024-02-22 19:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-02-22
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Severity|normal                      |enhancement
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.
Though maybe we should drop them in the vectorized version of the loop. HW
prefetchers usually do a decent job and sometimes (maybe most) SW hinted
prefetches interfere with the HW prefetchers.
* [Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch
From: tnfchris at gcc dot gnu.org @ 2024-02-22 19:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #4 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #3)
> Confirmed.
>
> Though maybe we should drop them in the vectorized version of the loop. HW
> prefetchers usually do a decent job and sometimes (maybe most) SW hinted
> prefetches interfere with the HW prefetchers.
Definitely agree; I'm not sure how useful they are either, but some customers
definitely seem to want them.
* [Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch
From: rguenth at gcc dot gnu.org @ 2024-02-23 7:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think we could try to "vectorize" them by only updating the address (the
builtin doesn't specify a size) when that evolves in the scalar loop, updating
the step with the chosen VF.
Dependence shouldn't be a concern here.
The main issue is representational: how to handle this in data-ref and
dependence analysis (or whether to just "skip" them in the vectorizer).
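As a rough source-level picture of the idea of keeping the prefetch and only scaling its address step, the sketch below hand-writes the loop shape. It is an illustration under stated assumptions, not what the vectorizer emits: `VF` is a hypothetical fixed vectorization factor, `foo_vf` is a made-up name, and the inner `j` loop merely stands in for a single vector add.

```c
enum { VF = 4 };  /* hypothetical vectorization factor, chosen for illustration */

void foo_vf(double *restrict a, double *restrict b, int n)
{
    int i = 0;

    /* Main loop: VF elements per iteration. One prefetch is kept per
       iteration, so its address now advances by VF elements instead of 1. */
    for (; i + VF <= n; i += VF) {
        for (int j = 0; j < VF; ++j)   /* stands in for one vector add */
            a[i + j] = a[i + j] + b[i + j];
        __builtin_prefetch(&b[i + 8]);
    }

    /* Scalar epilogue for the remaining n % VF elements. */
    for (; i < n; ++i) {
        a[i] = a[i] + b[i];
        __builtin_prefetch(&b[i + 8]);
    }
}
```

Since the builtin carries no size, nothing else about the call has to change; only the evolution of the address differs from the scalar loop.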
* [Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch
From: victorldn at gcc dot gnu.org @ 2024-04-08 14:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
Victor Do Nascimento <victorldn at gcc dot gnu.org> changed:
           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |victorldn at gcc dot gnu.org
             Status|NEW                           |ASSIGNED
                 CC|                              |victorldn at gcc dot gnu.org
* [Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch
From: cvs-commit at gcc dot gnu.org @ 2024-06-12 13:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Victor Do Nascimento
<victorldn@gcc.gnu.org>:
https://gcc.gnu.org/g:adcc815a01ae009d2768b6afb546e357bd37bbd2
commit r15-1211-gadcc815a01ae009d2768b6afb546e357bd37bbd2
Author: Victor Do Nascimento <victor.donascimento@arm.com>
Date: Wed May 22 12:14:11 2024 +0100
middle-end: Drop __builtin_prefetch calls in autovectorization [PR114061]
At present the autovectorizer fails to vectorize simple loops
involving calls to `__builtin_prefetch'. A simple example of such
a loop is given below:
    void foo(double * restrict a, double * restrict b, int n){
      int i;
      for(i=0; i<n; ++i){
        a[i] = a[i] + b[i];
        __builtin_prefetch(&(b[i+8]));
      }
    }
The failure stems from two issues:
1. Given that it is typically not possible to fully reason about a
function call due to the possibility of side effects, the
autovectorizer does not attempt to vectorize loops which make such
calls.
Given the memory reference passed to `__builtin_prefetch', in the
absence of assurances about its effect on the passed memory
location the compiler deems the function unsafe to vectorize,
marking it as clobbering memory in `vect_find_stmt_data_reference'.
This leads to the failure in autovectorization.
2. Notwithstanding the above issue, though the prefetch statement
would be classed as `vect_unused_in_scope', the induction variable
that is used in the address of the prefetch is the scalar loop's and
not the vector loop's IV. That is, it still uses `i' and not `vec_iv'
because the instruction wasn't vectorized, causing DCE to think the
value is live, such that we now have both the vector and scalar
induction variables actively used in the loop.
This patch addresses both of these:
1. About the issue regarding the memory clobber, data prefetch does
not generate faults if its address argument is invalid and does not
write to memory. Therefore, it does not alter the internal state
of the program or its control flow under any circumstance. As
such, it is reasonable that the function be marked as not affecting
memory contents.
To achieve this, we add the necessary logic to
`get_references_in_stmt' to ensure that builtin functions are
given the same treatment as internal functions. If the gimple call
is to a builtin function and its function code is
`BUILT_IN_PREFETCH', we mark `clobbers_memory' as false.
2. Finding precedent in the way clobber statements are handled,
whereby the vectorizer drops these from both the scalar and
vectorized versions of a given loop, we choose to drop prefetch
hints in a similar fashion. This seems appropriate given how
software prefetch hints are typically ignored by processors across
architectures, as they seldom lead to performance gain over their
hardware counterparts.
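The two steps above can be pictured with a deliberately tiny toy model. Everything in it is hypothetical (the `toy_` names, the statement enum, the array standing in for a loop body); it does not use GCC's gimple API and is not the code from the patch. It only mirrors the logic: a prefetch call no longer counts as clobbering memory, and prefetch statements are dropped from the loop body.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement kinds; a stand-in for real gimple statements. */
enum toy_stmt { TOY_ASSIGN, TOY_CALL_PREFETCH, TOY_CALL_OTHER };

/* Step 1: a prefetch neither writes memory nor faults, so only other
   (unknown) calls are conservatively treated as clobbering memory. */
bool toy_clobbers_memory(enum toy_stmt s)
{
    return s == TOY_CALL_OTHER;
}

/* A loop body is analyzable here only if no statement clobbers memory. */
bool toy_loop_is_analyzable(const enum toy_stmt *body, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (toy_clobbers_memory(body[i]))
            return false;
    return true;
}

/* Step 2: drop prefetch statements in place, preserving the order of
   the remaining statements; returns the new statement count. */
size_t toy_drop_prefetches(enum toy_stmt *body, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; ++i)
        if (body[i] != TOY_CALL_PREFETCH)
            body[out++] = body[i];
    return out;
}
```

In this model a body of { assign, prefetch, assign } is analyzable and shrinks to two statements, while a body containing any other call is still rejected.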
gcc/ChangeLog:
PR tree-optimization/114061
* tree-data-ref.cc (get_references_in_stmt): Set
`clobbers_memory' to false for __builtin_prefetch.
* tree-vect-loop.cc (vect_transform_loop): Drop all
__builtin_prefetch calls from loops.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-prefetch-drop.c: New test.
* gcc.target/aarch64/vect-prefetch-drop.c: Likewise.
* [Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch
From: pinskia at gcc dot gnu.org @ 2024-06-12 17:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |15.0
             Status|ASSIGNED                    |RESOLVED
--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed I think.
Thread overview: 9+ messages
2024-02-22 19:04 [Bug tree-optimization/114061] New: GCC fails vectorization when using __builtin_prefetch tnfchris at gcc dot gnu.org
2024-02-22 19:07 ` [Bug tree-optimization/114061] " pinskia at gcc dot gnu.org
2024-02-22 19:09 ` tnfchris at gcc dot gnu.org
2024-02-22 19:11 ` pinskia at gcc dot gnu.org
2024-02-22 19:21 ` tnfchris at gcc dot gnu.org
2024-02-23 7:01 ` rguenth at gcc dot gnu.org
2024-04-08 14:01 ` victorldn at gcc dot gnu.org
2024-06-12 13:39 ` cvs-commit at gcc dot gnu.org
2024-06-12 17:15 ` pinskia at gcc dot gnu.org