public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/6] Improve -fprefetch-loop-arrays in general and for AArch64 in particular
@ 2017-01-30 11:27 Maxim Kuvyrkov
From: Maxim Kuvyrkov @ 2017-01-30 11:27 UTC
  To: GCC Patches; +Cc: Kyrylo Tkachov, Andrew Pinski, Richard Guenther

This patch series improves the -fprefetch-loop-arrays pass through small fixes and tweaks, and then enables it for several AArch64 cores.
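
For readers unfamiliar with the pass, the following is a minimal hand-written sketch of the kind of transformation -fprefetch-loop-arrays aims to perform automatically: issuing a prefetch for array data a fixed distance ahead of the current iteration. The function name and the PREFETCH_AHEAD distance are assumptions made for illustration; they are not taken from the patches, and the distance the pass actually picks depends on the per-core tuning parameters.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements.  A real value depends on
   the core's memory latency and is the kind of thing per-core tuning
   would decide.  */
#define PREFETCH_AHEAD 64

/* Sum an array while hinting that data a fixed distance ahead will be
   read soon.  The prefetch is only a hint; the result is the same with
   or without it.  */
long
sum_with_prefetch (const long *a, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* Stay inside the array when forming the prefetch address.  */
      if (i + PREFETCH_AHEAD < n)
        __builtin_prefetch (&a[i + PREFETCH_AHEAD], 0 /* read */,
                            3 /* keep in all cache levels */);
      sum += a[i];
    }
  return sum;
}
```

__builtin_prefetch is the GCC built-in for emitting a prefetch hint; whether it helps or hurts on a given loop is exactly what the tuning in this series is about.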

My tunings were done on and for Qualcomm hardware.  At -O3, results range from +0.5% to +1.9% for SPEC2006 INT and from +0.25% to +1.0% for SPEC2006 FP, depending on hardware revision.

This patch series also enables a restricted form of -fprefetch-loop-arrays at -O2, which improves SPEC2006 numbers as well.

The biggest improvements are on 429.mcf and 437.leslie3d, with no serious regressions on the other benchmarks.

I'm now investigating making -fprefetch-loop-arrays more aggressive for Qualcomm hardware, which improves performance on most benchmarks, but also causes big regressions on 454.calculix and 462.libquantum.  If I can fix these two regressions, prefetching will give another boost to AArch64.

Andrew just posted similar prefetching tunings for Cavium's cores, and the two patches have trivial conflicts.  I'll post mine as-is, since it addresses one of the review comments on Andrew's patch (adding a stand-alone struct for tuning parameters).

Andrew, feel free to just copy-paste it to your patch, since it is just a mechanical change.
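
To illustrate what "a stand-alone struct for tuning parameters" can look like, here is a rough sketch in which prefetch settings live in their own struct and each core's tuning record points at one.  All type names, field names and values below are hypothetical and chosen for illustration only; they are not copied from the patch.

```c
/* Hypothetical stand-alone block of prefetch tuning parameters.
   Field names and values are illustrative, not taken from the patch.  */
struct prefetch_tune_sketch
{
  int num_slots;          /* prefetches the core can have in flight */
  int l1_cache_size;      /* in KB */
  int l1_cache_line_size; /* in bytes */
  int l2_cache_size;      /* in KB */
};

/* Hypothetical per-core tuning record that refers to a prefetch block
   instead of embedding the individual fields.  */
struct core_tune_sketch
{
  const char *name;
  const struct prefetch_tune_sketch *prefetch;
};

/* One shared set of (made-up) prefetch parameters.  */
static const struct prefetch_tune_sketch example_prefetch = { 4, 32, 64, 512 };

/* Two made-up cores reusing the same prefetch block.  */
static const struct core_tune_sketch example_cores[] = {
  { "core-a", &example_prefetch },
  { "core-b", &example_prefetch },
};
```

Keeping the prefetch parameters in a separate struct means several cores can share one block, and adding a parameter touches one definition rather than every per-core table.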

All patches were bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu.
 
--
Maxim Kuvyrkov
www.linaro.org




Thread overview: 32+ messages
2017-01-30 11:27 [PATCH 0/6] Improve -fprefetch-loop-arrays in general and for AArch64 in particular Maxim Kuvyrkov
2017-01-30 11:35 ` [PATCH 1/6] Add debug counter for loop array prefetching Maxim Kuvyrkov
2017-01-30 12:33   ` Richard Biener
2017-01-30 11:43 ` [PATCH 2/6] Improve debug output of loop data prefetching Maxim Kuvyrkov
2017-01-30 12:37   ` Richard Biener
2017-01-30 11:47 ` [PATCH 3/6] Fix prefetch heuristic calculation Maxim Kuvyrkov
2017-01-30 13:28   ` Richard Biener
2017-01-30 11:53 ` [PATCH 4/6] Port prefetch configuration from aarch32 to aarch64 Maxim Kuvyrkov
2017-01-30 16:39   ` Andrew Pinski
2017-02-03 11:52     ` Maxim Kuvyrkov
2017-06-08 13:48     ` James Greenhalgh
2017-06-08 15:13       ` Richard Earnshaw (lists)
2017-06-09  7:32         ` Maxim Kuvyrkov
2017-06-09 10:04           ` James Greenhalgh
2017-06-09 15:56             ` Maxim Kuvyrkov
2017-05-29 10:07   ` Maxim Kuvyrkov
2017-01-30 12:08 ` [PATCH 5/6][AArch64] Enable -fprefetch-loop-arrays at -O3 for cores that benefit from prefetching Maxim Kuvyrkov
2017-01-30 12:31   ` Kyrill Tkachov
2017-01-30 15:03     ` Maxim Kuvyrkov
2017-02-03 11:58       ` Maxim Kuvyrkov
2017-05-29 10:09         ` Maxim Kuvyrkov
2017-06-08 16:31         ` James Greenhalgh
2017-06-09 16:00           ` Maxim Kuvyrkov
2017-01-30 12:14 ` [PATCH 6/6][AArch64] Update prefetch tuning parameters for falkor and qdf24xx tunings Maxim Kuvyrkov
2017-06-08 16:32   ` James Greenhalgh
2017-06-09 16:02     ` Maxim Kuvyrkov
2017-01-30 16:55 ` [PATCH 0/6] Improve -fprefetch-loop-arrays in general and for AArch64 in particular Andrew Pinski
2017-02-20 14:44 ` Kyrill Tkachov
2017-02-28 10:03   ` Maxim Kuvyrkov
2017-05-28  8:18     ` Andrew Pinski
2017-05-29 10:24       ` Maxim Kuvyrkov
2017-06-09 16:06       ` Maxim Kuvyrkov
