public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/50969] New: 17% degradation in 168.wupwise for interleave via permutation
From: pthaugen at gcc dot gnu.org @ 2011-11-02 21:27 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

Bug #: 50969
Summary: 17% degradation in 168.wupwise for interleave via permutation
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: pthaugen@gcc.gnu.org
CC: bergner@gcc.gnu.org, rth@gcc.gnu.org
Host: powerpc64-linux
Target: powerpc64-linux
Build: powerpc64-linux

Created attachment 25694
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25694
benchmark file

Revision 180450 (along with 180567 to fix the ICE) causes a large degradation in cpu2000 benchmark wupwise. Additional loops are now being vectorized but result in worse performance, not sure if that means a cost issue or what. Based on prior observations, the degradation is most likely due to the permute instructions being used, which are restricted to a single VSU pipe, so two of them can't be executed in parallel.

Attached file zaxpy.f is just one of the files containing a function that degraded (zscal.f is another). The second loop is where the time is spent in the function.

The following degradations (compared to revision 180449) were observed with oprofile.

-m64 -O3 -mcpu=power7
  zaxpy : -24%
  zscal : -79%

-m64 -O3 -mcpu=power7 -funroll-loops
  zaxpy : -65%
  zscal : -61%
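For readers without the attachment, the kind of loop being discussed can be sketched in C. This is not the attached zaxpy.f (which is Fortran and is not reproduced here); the name `zaxpy_sketch`, the argument list, and the interleaved (re, im) array layout are assumptions made only for illustration. With this layout, a vectorizer that wants real and imaginary parts in separate vectors may need permute instructions to de-interleave the loads and re-interleave the stores, which is the cost the report points at.

```c
#include <stddef.h>

/* Hypothetical C analogue of a complex AXPY: y := a*x + y.
   Complex values are stored interleaved as (re, im) pairs, so splitting
   them into "all real" and "all imaginary" vectors may require permutes.
   This is an illustrative sketch, not the code from the attachment.  */
void zaxpy_sketch(size_t n, double a_re, double a_im,
                  const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        double x_re = x[2 * i];
        double x_im = x[2 * i + 1];

        /* (a_re + i*a_im) * (x_re + i*x_im), accumulated into y.  */
        y[2 * i]     += a_re * x_re - a_im * x_im;
        y[2 * i + 1] += a_re * x_im + a_im * x_re;
    }
}
```

Compiling a loop like this with and without vectorization is one way to compare the generated code, though the exact instructions depend on the compiler revision and target.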
* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: pthaugen at gcc dot gnu.org @ 2011-11-02 21:38 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #1 from Pat Haugen <pthaugen at gcc dot gnu.org> 2011-11-02 21:38:28 UTC ---
I swapped the numbers, should be:

-m64 -O3 -mcpu=power7
  zaxpy : -79%
  zscal : -24%

-m64 -O3 -mcpu=power7 -funroll-loops
  zaxpy : -61%
  zscal : -65%
* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: rguenth at gcc dot gnu.org @ 2011-11-03 8:19 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #2 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-11-03 08:19:01 UTC ---
Yes, sounds like a cost model issue.
* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: wschmidt at gcc dot gnu.org @ 2012-02-06 21:40 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #3 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2012-02-06 21:39:38 UTC ---
Author: wschmidt
Date: Mon Feb 6 21:39:34 2012
New Revision: 183944

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183944
Log:
2012-02-06  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

    PR tree-optimization/50969
    * tree-vect-stmts.c (vect_model_store_cost): Correct statement cost to
    use vec_perm rather than vector_stmt.
    (vect_model_load_cost): Likewise.
    * config/i386/i386.c (ix86_builtin_vectorization_cost): Change cost of
    vec_perm to be the same as other vector statements.
    * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Revise
    cost of vec_perm for TARGET_VSX.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/rs6000/rs6000.c
    trunk/gcc/tree-vect-stmts.c
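The effect of charging interleave permutes as vec_perm rather than vector_stmt can be illustrated with a toy cost model. The sketch below is not GCC code: the `toy_` names, the cost values, and the profitability test are all hypothetical and only echo the statement kinds named in the ChangeLog. It shows how the same number of permutes can tip a loop from "profitable" to "not profitable" once a target assigns permutes a higher cost.

```c
/* Toy model only: the enumerator names echo the ChangeLog's vector_stmt
   and vec_perm, but every value and function here is hypothetical.  */
enum toy_cost_kind { TOY_VECTOR_STMT, TOY_VEC_PERM };

/* Per-statement cost.  perm_cost stands in for a target-specific permute
   cost; an ordinary vector statement is charged 1 in this sketch.  */
static int toy_stmt_cost(enum toy_cost_kind kind, int perm_cost)
{
    return kind == TOY_VEC_PERM ? perm_cost : 1;
}

/* Total cost of the permutes needed for one interleaved access group,
   depending on which statement kind they are charged as.  */
static int toy_group_perm_cost(int n_perms, enum toy_cost_kind charged_as,
                               int perm_cost)
{
    return n_perms * toy_stmt_cost(charged_as, perm_cost);
}

/* Vectorize only if the estimated vector cost beats the scalar cost.  */
static int toy_profitable(int scalar_cost, int vector_body_cost,
                          int perm_total)
{
    return vector_body_cost + perm_total < scalar_cost;
}
```

With four permutes and an assumed permute cost of 3, charging them as ordinary vector statements gives a total of 4, while charging them as permutes gives 12; against a scalar cost of 12 and a vector body cost of 6, only the first estimate looks profitable.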
* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: wschmidt at gcc dot gnu.org @ 2012-02-06 21:43 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

William J. Schmidt <wschmidt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

--- Comment #4 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2012-02-06 21:41:47 UTC ---
Fixed with simple permute cost change for now. A better analysis of permutes will be considered in 4.8.
* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: wschmidt at gcc dot gnu.org @ 2012-02-14 19:42 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #5 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2012-02-14 19:40:22 UTC ---
Author: wschmidt
Date: Tue Feb 14 19:40:13 2012
New Revision: 184225

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184225
Log:
2012-02-14  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
            Ira Rosen  <irar@il.ibm.com>

    PR tree-optimization/50031
    PR tree-optimization/50969
    * targhooks.c (default_builtin_vectorization_cost): Handle
    vec_promote_demote.
    * target.h (enum vect_cost_for_stmt): Add vec_promote_demote.
    * tree-vect-loop.c (vect_get_single_scalar_iteraion_cost): Handle
    all types of reduction and pattern statements.
    (vect_estimate_min_profitable_iters): Likewise.
    * tree-vect-stmts.c (vect_model_promotion_demotion_cost): New function.
    (vect_model_store_cost): Use vec_perm rather than vector_stmt for
    statement cost.
    (vect_model_load_cost): Likewise.
    (vect_get_load_cost): Likewise; add dump logic for explicit realigns.
    (vectorizable_type_demotion): Call vect_model_promotion_demotion_cost.
    (vectorizable_type_promotion): Likewise.
    * config/spu/spu.c (spu_builtin_vectorization_cost): Handle
    vec_promote_demote.
    * config/i386/i386.c (ix86_builtin_vectorization_cost): Likewise.
    * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Update
    vec_perm for VSX and handle vec_promote_demote.

Modified:
    branches/ibm/gcc-4_6-branch/gcc/ChangeLog.ibm
    branches/ibm/gcc-4_6-branch/gcc/config/i386/i386.c
    branches/ibm/gcc-4_6-branch/gcc/config/rs6000/rs6000.c
    branches/ibm/gcc-4_6-branch/gcc/config/spu/spu.c
    branches/ibm/gcc-4_6-branch/gcc/target.h
    branches/ibm/gcc-4_6-branch/gcc/targhooks.c
    branches/ibm/gcc-4_6-branch/gcc/tree-vect-loop.c
    branches/ibm/gcc-4_6-branch/gcc/tree-vect-stmts.c
* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: wschmidt at gcc dot gnu.org @ 2012-03-02 14:54 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #6 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2012-03-02 14:52:09 UTC ---
Author: wschmidt
Date: Fri Mar 2 14:51:58 2012
New Revision: 184787

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184787
Log:
2012-03-02  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
            Ira Rosen  <irar@il.ibm.com>

    PR tree-optimization/50031
    PR tree-optimization/50969
    * targhooks.c (default_builtin_vectorization_cost): Handle
    vec_promote_demote.
    * target.h (enum vect_cost_for_stmt): Add vec_promote_demote.
    * tree-vect-loop.c (vect_get_single_scalar_iteraion_cost): Handle
    all types of reduction and pattern statements.
    (vect_estimate_min_profitable_iters): Likewise.
    * tree-vect-stmts.c (vect_model_promotion_demotion_cost): New function.
    (vect_model_store_cost): Use vec_perm rather than vector_stmt for
    statement cost.
    (vect_model_load_cost): Likewise.
    (vect_get_load_cost): Likewise; add dump logic for explicit realigns.
    (vectorizable_type_demotion): Call vect_model_promotion_demotion_cost.
    (vectorizable_type_promotion): Likewise.
    * config/spu/spu.c (spu_builtin_vectorization_cost): Handle
    vec_promote_demote.
    * config/i386/i386.c (ix86_builtin_vectorization_cost): Likewise.
    * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Update
    vec_perm for VSX and handle vec_promote_demote.

Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/i386/i386.c
    branches/gcc-4_6-branch/gcc/config/rs6000/rs6000.c
    branches/gcc-4_6-branch/gcc/config/spu/spu.c
    branches/gcc-4_6-branch/gcc/target.h
    branches/gcc-4_6-branch/gcc/targhooks.c
    branches/gcc-4_6-branch/gcc/tree-vect-loop.c
    branches/gcc-4_6-branch/gcc/tree-vect-stmts.c