public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/50969] New: 17% degradation in 168.wupwise for interleave via permutation
From: pthaugen at gcc dot gnu.org @ 2011-11-02 21:27 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

             Bug #: 50969
           Summary: 17% degradation in 168.wupwise for interleave via
                    permutation
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: pthaugen@gcc.gnu.org
                CC: bergner@gcc.gnu.org, rth@gcc.gnu.org
              Host: powerpc64-linux
            Target: powerpc64-linux
             Build: powerpc64-linux


Created attachment 25694
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25694
benchmark file

Revision 180450 (along with 180567 to fix the ICE) causes a large degradation
in the cpu2000 benchmark wupwise. Additional loops are now being vectorized, but
the vectorized code performs worse; I'm not sure whether that points to a cost
model issue or something else. Based on prior observations, the degradation is
most likely due to the permute instructions being used, which are restricted to
a single VSU pipe, so two of them can't be executed in parallel.
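
As a rough illustration of why the single-pipe restriction matters, the sketch
below computes a lower bound on issue cycles for a loop body. It assumes a
simplified machine with two vector pipes where permutes can issue on only one
of them and everything else can issue on either; latencies and dependencies
are ignored. The function name and the model are illustrative assumptions, not
measurements of power7.

```c
/* Simplified issue model (an assumption, not a description of real hardware):
   two vector pipes, permutes restricted to one pipe, all other vector
   operations able to use either pipe, one operation per pipe per cycle. */
int min_issue_cycles(int n_perm, int n_other)
{
    int total = n_perm + n_other;

    /* Best case if the work could be split evenly across both pipes. */
    int balanced = (total + 1) / 2;

    /* Every permute must go through the same pipe, one per cycle. */
    return n_perm > balanced ? n_perm : balanced;
}
```

Under this model a body with 2 permutes and 6 other operations still fits in
4 cycles, while 6 permutes and 2 other operations need at least 6, so a
permute-heavy vectorized loop can lose the advantage the second pipe would
otherwise give it.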

The attached file zaxpy.f is just one of the files containing a function that
degraded (zscal.f is another). The second loop is where the time is spent in
the function. The following degradations (compared to revision 180449) were
observed with oprofile.

-m64 -O3 -mcpu=power7
zaxpy : -24%
zscal : -79%

-m64 -O3 -mcpu=power7 -funroll-loops
zaxpy : -65%
zscal : -61%
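
For readers without the attachment, here is a minimal C sketch of the kind of
loop involved. It assumes the hot loop in zaxpy.f resembles the unit-stride
loop of the reference BLAS routine of the same name (y = y + a*x on complex
doubles); the function name, signature, and storage layout below are
illustrative and not taken from the benchmark source.

```c
#include <stddef.h>

/* Hypothetical C rendering of a unit-stride complex AXPY: y[i] += a * x[i].
   Complex values are stored interleaved, as Fortran COMPLEX*16 arrays are:
   the real part at index 2*i and the imaginary part at index 2*i + 1. */
void zaxpy_unit_stride(size_t n, const double a[2],
                       const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        double xr = x[2 * i];
        double xi = x[2 * i + 1];

        /* (ar + i*ai) * (xr + i*xi) = (ar*xr - ai*xi) + i*(ar*xi + ai*xr) */
        y[2 * i]     += a[0] * xr - a[1] * xi;
        y[2 * i + 1] += a[0] * xi + a[1] * xr;
    }
}
```

Because real and imaginary parts alternate in memory, a vectorized version
that loads whole vectors typically has to separate them into all-real and
all-imaginary vectors before the multiplies and re-interleave them before the
store. Those shuffles are the permutes referred to in the bug summary.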



* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: pthaugen at gcc dot gnu.org @ 2011-11-02 21:38 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #1 from Pat Haugen <pthaugen at gcc dot gnu.org> 2011-11-02 21:38:28 UTC ---
I swapped the numbers; they should be:

-m64 -O3 -mcpu=power7
zaxpy : -79%
zscal : -24%

-m64 -O3 -mcpu=power7 -funroll-loops
zaxpy : -61%
zscal : -65%



* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: rguenth at gcc dot gnu.org @ 2011-11-03  8:19 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #2 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-11-03 08:19:01 UTC ---
Yes, sounds like a cost model issue.



* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: wschmidt at gcc dot gnu.org @ 2012-02-06 21:40 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #3 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2012-02-06 21:39:38 UTC ---
Author: wschmidt
Date: Mon Feb  6 21:39:34 2012
New Revision: 183944

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183944
Log:
2012-02-06  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

    PR tree-optimization/50969
    * tree-vect-stmts.c (vect_model_store_cost): Correct statement cost to
    use vec_perm rather than vector_stmt.
    (vect_model_load_cost): Likewise.
    * config/i386/i386.c (ix86_builtin_vectorization_cost): Change cost of
    vec_perm to be the same as other vector statements.
    * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Revise
    cost of vec_perm for TARGET_VSX.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/rs6000/rs6000.c
    trunk/gcc/tree-vect-stmts.c
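
To show how a higher vec_perm cost can change the vectorizer's decision, here
is a toy cost model. It is not GCC source: the enum borrows three names from
the ChangeLog above (the real enum has more members), every other name is
hypothetical, and all cost values are made up for illustration. GCC's actual
profitability check is more involved than this single comparison.

```c
/* Toy model only. Names mirror the ChangeLog; values are invented. */
enum vect_cost_for_stmt { scalar_stmt, vector_stmt, vec_perm };

/* Hypothetical per-statement cost hook. perm_cost stands in for the
   target-specific value that the patch revises. */
static int stmt_cost(enum vect_cost_for_stmt kind, int perm_cost)
{
    switch (kind) {
    case vec_perm:
        return perm_cost;
    default:
        return 1;
    }
}

/* Cost of one vector iteration containing n_arith ordinary vector
   statements and n_perm permutes. */
int vector_iter_cost(int n_arith, int n_perm, int perm_cost)
{
    return n_arith * stmt_cost(vector_stmt, perm_cost)
         + n_perm * stmt_cost(vec_perm, perm_cost);
}

/* Vectorize only if one vector iteration is cheaper than the vf scalar
   iterations it replaces. */
int is_profitable(int scalar_iter_cost, int vf,
                  int n_arith, int n_perm, int perm_cost)
{
    return vector_iter_cost(n_arith, n_perm, perm_cost)
         < scalar_iter_cost * vf;
}
```

With a scalar iteration cost of 8, a vectorization factor of 2, 8 arithmetic
statements, and 4 permutes, the loop looks profitable when a permute is
costed like any other vector statement (12 < 16) and unprofitable once the
permute cost is raised to 4 (24 > 16). Costing permutes as vec_perm rather
than vector_stmt is what lets a target express that difference.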



* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: wschmidt at gcc dot gnu.org @ 2012-02-06 21:43 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

William J. Schmidt <wschmidt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

--- Comment #4 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2012-02-06 21:41:47 UTC ---
Fixed with a simple permute cost change for now.  A better analysis of permutes
will be considered for 4.8.



* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: wschmidt at gcc dot gnu.org @ 2012-02-14 19:42 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #5 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2012-02-14 19:40:22 UTC ---
Author: wschmidt
Date: Tue Feb 14 19:40:13 2012
New Revision: 184225

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184225
Log:
2012-02-14  Bill Schmidt <wschmidt@linux.vnet.ibm.com>
        Ira Rosen <irar@il.ibm.com>

    PR tree-optimization/50031
    PR tree-optimization/50969
    * targhooks.c (default_builtin_vectorization_cost): Handle
    vec_promote_demote.
    * target.h (enum vect_cost_for_stmt): Add vec_promote_demote.
    * tree-vect-loop.c (vect_get_single_scalar_iteraion_cost): Handle
    all types of reduction and pattern statements.
    (vect_estimate_min_profitable_iters): Likewise.
    * tree-vect-stmts.c (vect_model_promotion_demotion_cost): New function.
    (vect_model_store_cost): Use vec_perm rather than vector_stmt for
    statement cost.
    (vect_model_load_cost): Likewise.
    (vect_get_load_cost): Likewise; add dump logic for explicit realigns.
    (vectorizable_type_demotion): Call vect_model_promotion_demotion_cost.
    (vectorizable_type_promotion): Likewise.
    * config/spu/spu.c (spu_builtin_vectorization_cost): Handle
    vec_promote_demote.
    * config/i386/i386.c (ix86_builtin_vectorization_cost): Likewise.
    * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Update
    vec_perm for VSX and handle vec_promote_demote.


Modified:
    branches/ibm/gcc-4_6-branch/gcc/ChangeLog.ibm
    branches/ibm/gcc-4_6-branch/gcc/config/i386/i386.c
    branches/ibm/gcc-4_6-branch/gcc/config/rs6000/rs6000.c
    branches/ibm/gcc-4_6-branch/gcc/config/spu/spu.c
    branches/ibm/gcc-4_6-branch/gcc/target.h
    branches/ibm/gcc-4_6-branch/gcc/targhooks.c
    branches/ibm/gcc-4_6-branch/gcc/tree-vect-loop.c
    branches/ibm/gcc-4_6-branch/gcc/tree-vect-stmts.c



* [Bug tree-optimization/50969] 17% degradation in 168.wupwise for interleave via permutation
From: wschmidt at gcc dot gnu.org @ 2012-03-02 14:54 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50969

--- Comment #6 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2012-03-02 14:52:09 UTC ---
Author: wschmidt
Date: Fri Mar  2 14:51:58 2012
New Revision: 184787

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184787
Log:
2012-03-02  Bill Schmidt <wschmidt@linux.vnet.ibm.com>
        Ira Rosen <irar@il.ibm.com>

    PR tree-optimization/50031
    PR tree-optimization/50969
    * targhooks.c (default_builtin_vectorization_cost): Handle
    vec_promote_demote.
    * target.h (enum vect_cost_for_stmt): Add vec_promote_demote.
    * tree-vect-loop.c (vect_get_single_scalar_iteraion_cost): Handle
    all types of reduction and pattern statements.
    (vect_estimate_min_profitable_iters): Likewise.
    * tree-vect-stmts.c (vect_model_promotion_demotion_cost): New function.
    (vect_model_store_cost): Use vec_perm rather than vector_stmt for
    statement cost.
    (vect_model_load_cost): Likewise.
    (vect_get_load_cost): Likewise; add dump logic for explicit realigns.
    (vectorizable_type_demotion): Call vect_model_promotion_demotion_cost.
    (vectorizable_type_promotion): Likewise.
    * config/spu/spu.c (spu_builtin_vectorization_cost): Handle
    vec_promote_demote.
    * config/i386/i386.c (ix86_builtin_vectorization_cost): Likewise.
    * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Update
    vec_perm for VSX and handle vec_promote_demote.


Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/i386/i386.c
    branches/gcc-4_6-branch/gcc/config/rs6000/rs6000.c
    branches/gcc-4_6-branch/gcc/config/spu/spu.c
    branches/gcc-4_6-branch/gcc/target.h
    branches/gcc-4_6-branch/gcc/targhooks.c
    branches/gcc-4_6-branch/gcc/tree-vect-loop.c
    branches/gcc-4_6-branch/gcc/tree-vect-stmts.c

