public inbox for gcc-bugs@sourceware.org
* [Bug target/57037] New: GCC does not generate non-temporal stores on i386 with SSE2+
@ 2013-04-22 20:29 anlauf at gmx dot de
2014-12-29 20:34 ` [Bug target/57037] " anlauf at gmx dot de
2024-04-08 1:28 ` pinskia at gcc dot gnu.org
0 siblings, 2 replies; 3+ messages in thread
From: anlauf at gmx dot de @ 2013-04-22 20:29 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57037
Bug #: 57037
Summary: GCC does not generate non-temporal stores on i386 with
SSE2+
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: anlauf@gmx.de
Hello,
it appears that gcc does not generate non-temporal stores
available on i386 at least with SSE2. This is an important
optimization for some memory-bandwidth limited codes.
Example: for the stream triad kernel,
  subroutine stream_kernel_triad (a, b, c, n, s)
    integer , intent(in) :: n
    double precision :: a(*), b(*), c(*)
    double precision, intent(in) :: s
    integer :: j
    do j = 1,n
       a(j) = b(j) + s*c(j)
    end do
  end subroutine stream_kernel_triad
the Intel compiler generates vectorized code with a
throughput that is 25% higher on my Core2 than when
disabling the generation of non-temporal stores
(i.e. compiling with "-opt-streaming-stores never").
gfortran (using -Ofast -fprefetch-loop-arrays) exactly
reproduces the performance of the Intel compiler without
non-temporal stores. It appears that this is an important
optimization.
Harald
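For illustration, the triad with explicit non-temporal stores can be sketched in C with SSE2 intrinsics. This is only a hand-written approximation of what a compiler might emit, not the code produced by the Intel compiler or by GCC; the function name `triad_stream` is hypothetical, and the sketch assumes an x86 target with SSE2 and that `a` is at least 8-byte aligned.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_stream_pd, _mm_loadu_pd, ... */
#include <stddef.h>
#include <stdint.h>

/* Triad a[j] = b[j] + s*c[j], writing a[] with non-temporal (streaming)
 * stores. Hypothetical helper for illustration only. */
void triad_stream(double *a, const double *b, const double *c,
                  size_t n, double s)
{
    size_t j = 0;

    /* Assumption: a is naturally aligned for double. */
    assert(((uintptr_t)a & 7u) == 0);

    /* _mm_stream_pd requires a 16-byte aligned destination, so peel
     * one scalar iteration if a starts on an odd 8-byte boundary. */
    if (n > 0 && ((uintptr_t)a & 15u) != 0) {
        a[0] = b[0] + s * c[0];
        j = 1;
    }

    const __m128d vs = _mm_set1_pd(s);
    for (; j + 2 <= n; j += 2) {
        __m128d vb = _mm_loadu_pd(b + j);   /* unaligned loads are fine */
        __m128d vc = _mm_loadu_pd(c + j);
        /* movntpd: store that bypasses the cache hierarchy */
        _mm_stream_pd(a + j, _mm_add_pd(vb, _mm_mul_pd(vs, vc)));
    }

    /* Scalar remainder (at most one element). */
    for (; j < n; ++j)
        a[j] = b[j] + s * c[j];

    /* Make the streaming stores globally visible before returning. */
    _mm_sfence();
}
```

Streaming stores avoid reading the destination cache lines before overwriting them, which is where the bandwidth saving would come from when `a` is much larger than the cache; for small arrays that are re-read soon afterwards they can be slower than ordinary stores.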
* [Bug target/57037] GCC does not generate non-temporal stores on i386 with SSE2+
From: anlauf at gmx dot de @ 2014-12-29 20:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57037
--- Comment #1 from Harald Anlauf <anlauf at gmx dot de> ---
(In reply to Harald Anlauf from comment #0)
> gfortran (using -Ofast -fprefetch-loop-arrays) exactly
> reproduces the performance of the Intel compiler without
> non-temporal stores. It appears that this is an important
> optimization.
I tried a current snapshot from trunk (r219084) and found
that -fprefetch-loop-arrays now gives an additional boost,
matching Intel v15 for the above code, even without the
streaming stores.
* [Bug target/57037] GCC does not generate non-temporal stores on i386 with SSE2+
From: pinskia at gcc dot gnu.org @ 2024-04-08 1:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57037
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
GCC has a heuristic for deciding whether to use non-temporal stores (when used with
-fprefetch-loop-arrays), but it can be hard to trigger.
Anyway, closing as invalid, as we do have 2 testcases in the testsuite that
detect non-temporal stores, gcc.dg/tree-ssa/prefetch-{8,9}.c, and they still
pass.
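As a rough way to probe that heuristic, one could compile a loop like the following and inspect the output. This is a minimal sketch, not a copy of the testsuite files: the array size `N` and the function name `triad_static` are hypothetical, and it is only an assumption that statically sized arrays with a compile-time trip count larger than the cache make the heuristic more likely to fire. Whether any non-temporal store is actually emitted depends on the target and tuning parameters.

```c
#include <stddef.h>

/* Hypothetical size: 2M doubles (16 MiB) per array, chosen to exceed
 * typical cache sizes so that the stored data is not reused from cache. */
#define N (2u * 1024u * 1024u)

double a[N], b[N], c[N];

/* Triad over statically sized arrays with a known trip count.
 *
 * One possible way to check the result (x86, flags as named in the thread):
 *   gcc -O2 -fprefetch-loop-arrays -fdump-tree-aprefetch-details -S triad.c
 *   grep movnt triad.s
 * A match would indicate non-temporal stores; no match is also a valid
 * outcome if the heuristic decides against them. */
void triad_static(double s)
{
    for (size_t j = 0; j < N; ++j)
        a[j] = b[j] + s * c[j];
}
```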