public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/53397] New: Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes
@ 2012-05-18 12:07 venkataramanan.kumar at amd dot com
  2012-05-18 12:10 ` [Bug tree-optimization/53397] " venkataramanan.kumar at amd dot com
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: venkataramanan.kumar at amd dot com @ 2012-05-18 12:07 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53397

             Bug #: 53397
           Summary: Scimark performance drops by 10x times when compiled
                    -O3 -march=amdfam10 due to generation more prefecthes
    Classification: Unclassified
           Product: gcc
           Version: tree-ssa
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: venkataramanan.kumar@amd.com

With GCC 4.7, the benchmark score drops from ~400 Mflops to ~40 Mflops, almost
tenfold.

GCC 4.7, unlike GCC 4.6, introduces prefetch instructions in the innermost
loops of "FFT_transform_internal" (FFT.c), and these cause the slowdown.

Compiling this function alone as a separate test case with
-fno-prefetch-loop-arrays brings back the original score.

The problem is exposed by revision 175474:
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175474

With GCC r175473
--------------------------
gcc -O3 -march=amdfam10 *.c -o Scimark175473 -lm

vekumar@pcedinar5:/local/home/vekumar/SciMark2_bench/SciMark2> ./Scimark175473
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to pozo@nist.gov)     **
**                                                              **
Using       2.00 seconds min time per kenel.
Composite Score:           99.67
FFT             Mflops:   498.35    (N=1024)

With GCC r175474
-------------------------
gcc -O3 -march=amdfam10 *.c -o Scimark175474 -lm

vekumar@pcedinar5:/local/home/vekumar/SciMark2_bench/SciMark2> ./Scimark175474
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to pozo@nist.gov)     **
**                                                              **
Using       2.00 seconds min time per kenel.
Composite Score:           7.73
FFT             Mflops:    38.66    (N=1024)
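Note on the workaround: the report restores the score by compiling the affected function separately with -fno-prefetch-loop-arrays. A minimal sketch of another way to scope that flag to one function, assuming GCC's `optimize` function attribute is acceptable (the GCC manual recommends it for debugging rather than production use), is shown here. The macro `NO_LOOP_PREFETCH` and the function `butterfly_pass` are hypothetical names for illustration; the loop is a stand-in with a strided, butterfly-style access pattern and is not the SciMark2 source.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC's optimize attribute prefixes the string with -f, so this
   requests -fno-prefetch-loop-arrays for the annotated function only.
   Other compilers get an empty macro and the attribute is skipped. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_PREFETCH __attribute__((optimize("no-prefetch-loop-arrays")))
#else
#define NO_LOOP_PREFETCH
#endif

/* Hypothetical stand-in for a butterfly-style inner loop: for each offset a
   in [0, dual), pair element b with element b + dual, stepping b by 2*dual.
   Each pair (x, y) is replaced in place by (x + y, x - y). */
NO_LOOP_PREFETCH
void butterfly_pass(double *data, size_t n, size_t dual)
{
    assert(data != NULL && dual > 0);
    for (size_t a = 0; a < dual; a++) {
        for (size_t b = a; b + dual < n; b += 2 * dual) {
            double x = data[b];
            double y = data[b + dual];
            data[b] = x + y;
            data[b + dual] = x - y;
        }
    }
}
```

Calling `butterfly_pass(d, 4, 1)` on `{1, 2, 3, 4}` pairs adjacent elements and leaves `{3, -1, 7, -1}`. Whether the attribute actually suppresses the prefetches for a given GCC version is best confirmed by inspecting the generated assembly (for example with `-S`) rather than assumed.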

* [Bug tree-optimization/53397] Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes
  2012-05-18 12:07 [Bug tree-optimization/53397] New: Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes venkataramanan.kumar at amd dot com
@ 2012-05-18 12:10 ` venkataramanan.kumar at amd dot com
  2012-05-18 12:16 ` rguenth at gcc dot gnu.org
  2012-10-09 15:55 ` venkataramanan.kumar at amd dot com
  2 siblings, 0 replies; 4+ messages in thread
From: venkataramanan.kumar at amd dot com @ 2012-05-18 12:10 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53397

Venkataramanan <venkataramanan.kumar at amd dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-unknown-linux-gnu
                 CC|                            |rguenth at gcc dot gnu.org
               Host|                            |x86_64-unknown-linux-gnu
              Build|                            |x86_64-unknown-linux-gnu
           Severity|normal                      |major

* [Bug tree-optimization/53397] Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes
  2012-05-18 12:07 [Bug tree-optimization/53397] New: Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes venkataramanan.kumar at amd dot com
  2012-05-18 12:10 ` [Bug tree-optimization/53397] " venkataramanan.kumar at amd dot com
@ 2012-05-18 12:16 ` rguenth at gcc dot gnu.org
  2012-10-09 15:55 ` venkataramanan.kumar at amd dot com
  2 siblings, 0 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-05-18 12:16 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53397

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-05-18
            Version|tree-ssa                    |4.7.1
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-18 12:11:07 UTC ---
Confirmed.

* [Bug tree-optimization/53397] Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes
  2012-05-18 12:07 [Bug tree-optimization/53397] New: Scimark performance drops by 10x times when compiled -O3 -march=amdfam10 due to generation more prefecthes venkataramanan.kumar at amd dot com
  2012-05-18 12:10 ` [Bug tree-optimization/53397] " venkataramanan.kumar at amd dot com
  2012-05-18 12:16 ` rguenth at gcc dot gnu.org
@ 2012-10-09 15:55 ` venkataramanan.kumar at amd dot com
  2 siblings, 0 replies; 4+ messages in thread
From: venkataramanan.kumar at amd dot com @ 2012-10-09 15:55 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53397

Venkataramanan <venkataramanan.kumar at amd dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #2 from Venkataramanan <venkataramanan.kumar at amd dot com> 2012-10-09 15:55:04 UTC ---
Fixed.

http://gcc.gnu.org/viewcvs?view=revision&revision=192261