public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/44382] Slow integer multiply
       [not found] <bug-44382-4@http.gcc.gnu.org/bugzilla/>
@ 2011-07-12 17:34 ` wschmidt at gcc dot gnu.org
  2011-09-06 16:54 ` hjl at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: wschmidt at gcc dot gnu.org @ 2011-07-12 17:34 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382

William J. Schmidt <wschmidt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wschmidt at gcc dot gnu.org

--- Comment #7 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2011-07-12 17:33:06 UTC ---
The test case from bug 45671 is as follows:

int myfunction (int a, int b, int c, int d, int e, int f, int g, int h) {
  int ret;

  ret = a + b + c + d + e + f + g + h;
  return ret;

}

Compiling with -O3 results in a series of dependent add instructions to
accumulate the sum.

        add 4,3,4
        add 4,4,5
        add 4,4,6
        add 4,4,7
        add 4,4,8
        add 4,4,9
        add 4,4,10


If we regrouped to (a+b)+(c+d)+..., we could do multiple adds in parallel on
different execution units.
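
As a rough illustration of that regrouping, the two association orders can be
written out by hand in C. This is only a sketch: the names sum_linear and
sum_tree are made up here and do not appear in the bug report, and an
optimizing compiler is free to reassociate either form, so the source shape
does not guarantee the generated code.

```c
/* Left-linear association: every add depends on the previous result,
   so the seven adds form a single dependency chain.  */
int sum_linear (int a, int b, int c, int d, int e, int f, int g, int h)
{
  return ((((((a + b) + c) + d) + e) + f) + g) + h;
}

/* Balanced association: the four innermost adds are independent of
   each other, so the longest dependency chain is three adds deep.  */
int sum_tree (int a, int b, int c, int d, int e, int f, int g, int h)
{
  return ((a + b) + (c + d)) + ((e + f) + (g + h));
}
```

Both functions return the same value for inputs that do not overflow; only
the length of the dependency chain differs.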



* [Bug middle-end/44382] Slow integer multiply
       [not found] <bug-44382-4@http.gcc.gnu.org/bugzilla/>
  2011-07-12 17:34 ` [Bug middle-end/44382] Slow integer multiply wschmidt at gcc dot gnu.org
@ 2011-09-06 16:54 ` hjl at gcc dot gnu.org
  2011-10-13 17:30 ` wschmidt at gcc dot gnu.org
  2011-10-21 14:42 ` wschmidt at gcc dot gnu.org
  3 siblings, 0 replies; 10+ messages in thread
From: hjl at gcc dot gnu.org @ 2011-09-06 16:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382

--- Comment #8 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2011-09-06 16:42:56 UTC ---
Author: hjl
Date: Tue Sep  6 16:42:47 2011
New Revision: 178602

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=178602
Log:
PR middle-end/44382: Tree reassociation improvement

gcc/

2011-09-06  Enkovich Ilya  <ilya.enkovich@intel.com>

    PR middle-end/44382
    * target.def (reassociation_width): New hook.

    * doc/tm.texi.in (reassociation_width): Likewise.

    * doc/tm.texi (reassociation_width): Likewise.

    * doc/invoke.texi (tree-reassoc-width): New param documented.

    * hooks.h (hook_int_uint_mode_1): New default hook.

    * hooks.c (hook_int_uint_mode_1): Likewise.

    * config/i386/i386.h (ix86_tune_indices): Add
    X86_TUNE_REASSOC_INT_TO_PARALLEL and
    X86_TUNE_REASSOC_FP_TO_PARALLEL.

    (TARGET_REASSOC_INT_TO_PARALLEL): New.
    (TARGET_REASSOC_FP_TO_PARALLEL): Likewise.

    * config/i386/i386.c (initial_ix86_tune_features): Add
    X86_TUNE_REASSOC_INT_TO_PARALLEL and
    X86_TUNE_REASSOC_FP_TO_PARALLEL.

    (ix86_reassociation_width): Implementation of
    new hook for i386 target.

    * params.def (PARAM_TREE_REASSOC_WIDTH): New param added.

    * tree-ssa-reassoc.c (get_required_cycles): New function.
    (get_reassociation_width): Likewise.
    (swap_ops_for_binary_stmt): Likewise.
    (rewrite_expr_tree_parallel): Likewise.

    (rewrite_expr_tree): Refactored. Part of code moved into
    swap_ops_for_binary_stmt.

    (reassociate_bb): Now checks reassociation width to be used
    and calls rewrite_expr_tree_parallel instead of rewrite_expr_tree
    if needed.

gcc/testsuite/

2011-09-06  Enkovich Ilya  <ilya.enkovich@intel.com>

    * gcc.dg/tree-ssa/pr38533.c (dg-options): Added option
    --param tree-reassoc-width=1.

    * gcc.dg/tree-ssa/reassoc-24.c: New test.
    * gcc.dg/tree-ssa/reassoc-25.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.h
    trunk/gcc/doc/invoke.texi
    trunk/gcc/doc/tm.texi
    trunk/gcc/doc/tm.texi.in
    trunk/gcc/hooks.c
    trunk/gcc/hooks.h
    trunk/gcc/params.def
    trunk/gcc/target.def
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr38533.c
    trunk/gcc/tree-ssa-reassoc.c
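
To make the idea of a reassociation width more concrete, here is a small
model of it. It is not the GCC source: model_cycles is a hypothetical name,
and the model simply assumes that every operation takes one cycle and that
at most `width` independent operations can execute per cycle. Under those
assumptions it counts the cycles needed to combine a list of operands into
one result.

```c
#include <assert.h>

/* Hypothetical model, not GCC code: count the cycles needed to combine
   OPS_NUM operands into one value when at most WIDTH independent
   binary operations can execute per cycle.  */
int model_cycles (int ops_num, int width)
{
  int cycles = 0;

  assert (ops_num >= 1 && width >= 1);
  while (ops_num > 1)
    {
      /* Each operation consumes two operands and produces one, so at
         most ops_num / 2 operations are available in this cycle.  */
      int ops = ops_num / 2;
      if (ops > width)
        ops = width;
      ops_num -= ops;
      cycles++;
    }
  return cycles;
}
```

With 8 operands, this model gives 7 cycles for width 1 (the dependent add
chain from comment #7), 4 cycles for width 2 and 3 cycles for width 4.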



* [Bug middle-end/44382] Slow integer multiply
       [not found] <bug-44382-4@http.gcc.gnu.org/bugzilla/>
  2011-07-12 17:34 ` [Bug middle-end/44382] Slow integer multiply wschmidt at gcc dot gnu.org
  2011-09-06 16:54 ` hjl at gcc dot gnu.org
@ 2011-10-13 17:30 ` wschmidt at gcc dot gnu.org
  2011-10-21 14:42 ` wschmidt at gcc dot gnu.org
  3 siblings, 0 replies; 10+ messages in thread
From: wschmidt at gcc dot gnu.org @ 2011-10-13 17:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382

--- Comment #9 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2011-10-13 17:30:14 UTC ---
Just adding some status information well after the fact...

We experimented with adding powerpc64 hooks to use the parallel reassociation
support from comment #8.  We elected not to enable this support because the
results for SPEC were negative (quite negative in some cases), due to increased
register pressure in loops where spill was already an issue.  Our plans at this
point are to live with the left-linear association, at least until the spill
costs can be mitigated in some fashion.

If parallel reassociation had some heuristics that predicted register
pressure (difficult in tree-ssa, I know), it might become practical for us.



* [Bug middle-end/44382] Slow integer multiply
       [not found] <bug-44382-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2011-10-13 17:30 ` wschmidt at gcc dot gnu.org
@ 2011-10-21 14:42 ` wschmidt at gcc dot gnu.org
  3 siblings, 0 replies; 10+ messages in thread
From: wschmidt at gcc dot gnu.org @ 2011-10-21 14:42 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382

--- Comment #10 from William J. Schmidt <wschmidt at gcc dot gnu.org> 2011-10-21 14:41:13 UTC ---
One more data point.  I repeated the experiment using -fsched-pressure. 
Although this reduced the degradations considerably, the overall results are
equivocal.  I see a few improvements and a few degradations in the 1-4% range,
with the geometric means essentially unchanged.  So even if -fsched-pressure
were the default, there wouldn't be an overwhelming case for enabling this
support on powerpc64.



* [Bug middle-end/44382] Slow integer multiply
  2010-06-02 15:04 [Bug middle-end/44382] New: " hjl dot tools at gmail dot com
                   ` (4 preceding siblings ...)
  2010-06-04 14:40 ` hjl dot tools at gmail dot com
@ 2010-09-15  4:29 ` hjl dot tools at gmail dot com
  5 siblings, 0 replies; 10+ messages in thread
From: hjl dot tools at gmail dot com @ 2010-09-15  4:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from hjl dot tools at gmail dot com  2010-09-15 04:29 -------
*** Bug 45671 has been marked as a duplicate of this bug. ***


-- 

hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pthaugen at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382



* [Bug middle-end/44382] Slow integer multiply
  2010-06-02 15:04 [Bug middle-end/44382] New: " hjl dot tools at gmail dot com
                   ` (3 preceding siblings ...)
  2010-06-04 13:57 ` hjl dot tools at gmail dot com
@ 2010-06-04 14:40 ` hjl dot tools at gmail dot com
  2010-09-15  4:29 ` hjl dot tools at gmail dot com
  5 siblings, 0 replies; 10+ messages in thread
From: hjl dot tools at gmail dot com @ 2010-06-04 14:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from hjl dot tools at gmail dot com  2010-06-04 14:40 -------
tree-ssa-reassoc.c has

    2. Left linearization of the expression trees, so that (A+B)+(C+D)
    becomes (((A+B)+C)+D), which is easier for us to rewrite later.
    During linearization, we place the operands of the binary
    expressions into a vector of operand_entry_t

I think this may always generate slower code. We may not want to
use many more registers, so we could limit ourselves to 2 temporaries.
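
One way to picture a 2-temporary limit is to split the operands between two
running sums that are combined at the end. This is a sketch under my own
assumptions: sum_two_chains is a hypothetical helper, not something found in
tree-ssa-reassoc.c, and it works on an array rather than on an expression
tree.

```c
#include <stddef.h>

/* Hypothetical sketch: add N values using two independent accumulators.
   Even-indexed and odd-indexed operands go to separate chains, so two
   adds can be in flight at once while only two temporaries are live.  */
int sum_two_chains (const int *v, size_t n)
{
  int t0 = 0, t1 = 0;
  size_t i;

  for (i = 0; i + 1 < n; i += 2)
    {
      t0 += v[i];
      t1 += v[i + 1];
    }
  if (i < n)            /* Odd operand count: one value is left over.  */
    t0 += v[i];
  return t0 + t1;       /* Final add joins the two chains.  */
}
```

The dependency chain is roughly half as long as in the left-linear form,
at the cost of one extra live value.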


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382



* [Bug middle-end/44382] Slow integer multiply
  2010-06-02 15:04 [Bug middle-end/44382] New: " hjl dot tools at gmail dot com
                   ` (2 preceding siblings ...)
  2010-06-04 13:21 ` rguenth at gcc dot gnu dot org
@ 2010-06-04 13:57 ` hjl dot tools at gmail dot com
  2010-06-04 14:40 ` hjl dot tools at gmail dot com
  2010-09-15  4:29 ` hjl dot tools at gmail dot com
  5 siblings, 0 replies; 10+ messages in thread
From: hjl dot tools at gmail dot com @ 2010-06-04 13:57 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from hjl dot tools at gmail dot com  2010-06-04 13:56 -------
(In reply to comment #3)
> Yes, reassoc linearizes instead of building a tree (saves one (or was it two?)
> registers at best).
> 

Should we always build a tree? It may increase register pressure.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382



* [Bug middle-end/44382] Slow integer multiply
  2010-06-02 15:04 [Bug middle-end/44382] New: " hjl dot tools at gmail dot com
  2010-06-02 15:15 ` [Bug middle-end/44382] " rguenth at gcc dot gnu dot org
  2010-06-04 13:08 ` hjl dot tools at gmail dot com
@ 2010-06-04 13:21 ` rguenth at gcc dot gnu dot org
  2010-06-04 13:57 ` hjl dot tools at gmail dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-06-04 13:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from rguenth at gcc dot gnu dot org  2010-06-04 13:21 -------
Yes, reassoc linearizes instead of building a tree (saves one (or was it two?)
registers at best).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382



* [Bug middle-end/44382] Slow integer multiply
  2010-06-02 15:04 [Bug middle-end/44382] New: " hjl dot tools at gmail dot com
  2010-06-02 15:15 ` [Bug middle-end/44382] " rguenth at gcc dot gnu dot org
@ 2010-06-04 13:08 ` hjl dot tools at gmail dot com
  2010-06-04 13:21 ` rguenth at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: hjl dot tools at gmail dot com @ 2010-06-04 13:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from hjl dot tools at gmail dot com  2010-06-04 13:08 -------
(In reply to comment #1)
> Because our tree reassoc doesn't re-associate them.
> 

The tree reassoc pass makes it slower:

[hjl@gnu-6 44382]$ cat x.i
extern int a, b, c, d, e, f;
void
foo ()
{
  a = (b * c) * (d * e);
}
[hjl@gnu-6 44382]$ gcc -S -O2 x.i
[hjl@gnu-6 44382]$ cat x.s
        .file   "x.i"
        .text
        .p2align 4,,15
.globl foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        movl    c(%rip), %eax
        imull   b(%rip), %eax
        imull   d(%rip), %eax
        imull   e(%rip), %eax
        movl    %eax, a(%rip)
        ret
[hjl@gnu-6 44382]$ gcc -S -O2 x.i -fno-tree-reassoc
[hjl@gnu-6 44382]$ cat x.s
        .file   "x.i"
        .text
        .p2align 4,,15
.globl foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        movl    b(%rip), %eax
        movl    d(%rip), %edx
        imull   c(%rip), %eax
        imull   e(%rip), %edx
        imull   %edx, %eax
        movl    %eax, a(%rip)
        ret
[hjl@gnu-6 44382]$ 
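
For reference, the two instruction sequences above correspond roughly to the
following two association orders in C. The names mul_chain, mul_pairs,
chain_depth and tree_depth are invented for this sketch; the depth helpers
just count the longest run of dependent operations for a given number of
operands, assuming each operation has the same latency.

```c
/* Shape after left linearization: three multiplies, each waiting on
   the previous one (matches the three chained imull instructions).  */
int mul_chain (int b, int c, int d, int e)
{
  return ((b * c) * d) * e;
}

/* Shape as written in x.i: b*c and d*e are independent, and only the
   final multiply has to wait for both.  */
int mul_pairs (int b, int c, int d, int e)
{
  return (b * c) * (d * e);
}

/* Longest dependency chain for N operands in a left-linear chain.  */
int chain_depth (int n)
{
  return n > 1 ? n - 1 : 0;
}

/* Longest dependency chain for N operands in a balanced tree.  */
int tree_depth (int n)
{
  int depth = 0;

  while (n > 1)
    {
      n = (n + 1) / 2;  /* Pair up operands; an odd one carries over.  */
      depth++;
    }
  return depth;
}
```

For the four operands in this test case the chain is 3 multiplies deep and
the tree is 2 deep, which is the difference visible in the two listings.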


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382



* [Bug middle-end/44382] Slow integer multiply
  2010-06-02 15:04 [Bug middle-end/44382] New: " hjl dot tools at gmail dot com
@ 2010-06-02 15:15 ` rguenth at gcc dot gnu dot org
  2010-06-04 13:08 ` hjl dot tools at gmail dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-06-02 15:15 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from rguenth at gcc dot gnu dot org  2010-06-02 15:15 -------
Because our tree reassoc doesn't re-associate them.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2010-06-02 15:15:06
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44382


