public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/28690]  New: Performace problem with indexed load/stores on powerpc
@ 2006-08-11  5:01 bergner at vnet dot ibm dot com
  2006-08-11 13:29 ` [Bug middle-end/28690] [4.2 Regression] " dje at gcc dot gnu dot org
                   ` (56 more replies)
  0 siblings, 57 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-08-11  5:01 UTC (permalink / raw)
  To: gcc-bugs

On some powerpc processors, it is very desirable for performance reasons to
have the base pointer of an indexed load/store insn in the rA position
rather than the rB position (example insn shown below).

    lwzx rD,rA,rB

For some test cases, we get this right, but for the following test case, we get
it wrong (regardless of -m32 or -m64):

int indexedload (int *x, int i)
{
  return x[i];
}

Results in the following powerpc asm:

indexedload:
        slwi r4,r4,2
        lwzx r3,r4,r3   # We want r3,r3,r4
        blr

Dan Berlin tracked one problem down to the reassociation pass ignoring
non-integral types like pointers.  His patch (I'll let Dan attach it here) fixed
the ordering of the address calc at the end of tree-ssa, but the rtl expanders
seem to be undoing this change, so we still end up with the wrong ordering on
the lwzx insn.
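
To make the two operand roles concrete, here is a minimal C sketch of the
address arithmetic behind x[i].  The function name and local variable names
are illustrative only and are not part of the test case above; the sketch
assumes nothing about how GCC represents this internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: spells out the address arithmetic behind x[i].
   'base' plays the role we want in rA; 'scaled' (the index times the
   element size) plays the role we want in rB.  */
int indexedload_explicit (int *x, int i)
{
  const char *base = (const char *) x;                        /* base pointer */
  ptrdiff_t scaled = (ptrdiff_t) i * (ptrdiff_t) sizeof (int); /* scaled index */
  int value;

  assert (x != NULL);
  /* base + scaled is commutative as arithmetic, but only 'base' is a
     pointer, which is the distinction the hardware cares about here.  */
  memcpy (&value, base + scaled, sizeof value);
  return value;
}
```

The sum is the same whichever operand comes first; the report is only about
which register slot the pointer operand ends up in.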


-- 
           Summary: Performace problem with indexed load/stores on powerpc
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bergner at vnet dot ibm dot com
 GCC build triplet: powerpc64-linux
  GCC host triplet: powerpc64-linux
GCC target triplet: powerpc64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
@ 2006-08-11 13:29 ` dje at gcc dot gnu dot org
  2006-08-11 14:34 ` dberlin at dberlin dot org
                   ` (55 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: dje at gcc dot gnu dot org @ 2006-08-11 13:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from dje at gcc dot gnu dot org  2006-08-11 13:29 -------
Confirmed


-- 

dje at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2006-08-11 13:29:43
               date|                            |
            Summary|Performace problem with     |[4.2 Regression] Performace
                   |indexed load/stores on      |problem with indexed
                   |powerpc                     |load/stores on powerpc



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
  2006-08-11 13:29 ` [Bug middle-end/28690] [4.2 Regression] " dje at gcc dot gnu dot org
@ 2006-08-11 14:34 ` dberlin at dberlin dot org
  2006-08-26  3:49 ` pinskia at gcc dot gnu dot org
                   ` (54 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2006-08-11 14:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from dberlin at gcc dot gnu dot org  2006-08-11 14:33 -------
Subject: Re:  [4.2 Regression] Performace problem with
 indexed load/stores on powerpc


Here is the reassoc patch that puts them in the right order at the tree
level.

Index: tree-ssa-reassoc.c
===================================================================
--- tree-ssa-reassoc.c  (revision 115962)
+++ tree-ssa-reassoc.c  (working copy)
@@ -356,6 +356,13 @@ sort_by_operand_rank (const void *pa, co
       && TREE_CODE (oeb->op) == SSA_NAME)
     return SSA_NAME_VERSION (oeb->op) - SSA_NAME_VERSION (oea->op);

+  /* For pointers, most things want the *base* pointer to go first to
+     try indexed loads. The base pointer is the one with the *lesser*
+     rank.  For everything else, put them in order from greatest rank
+     to least.  */
+  if (POINTER_TYPE_P (TREE_TYPE (oea->op)))
+    return oea->rank - oeb->rank;
+
   return oeb->rank - oea->rank;
 }

@@ -1309,7 +1316,9 @@ reassociate_bb (basic_block bb)
       if (TREE_CODE (stmt) == MODIFY_EXPR)
        {
          tree lhs = TREE_OPERAND (stmt, 0);
+         tree lhst = TREE_TYPE (lhs);
          tree rhs = TREE_OPERAND (stmt, 1);
+         tree rhst = TREE_TYPE (rhs);

          /* If this was part of an already processed tree, we don't
             need to touch it again. */
@@ -1318,10 +1327,10 @@ reassociate_bb (basic_block bb)

          /* If unsafe math optimizations we can do reassociation for
             non-integral types.  */
-         if ((!INTEGRAL_TYPE_P (TREE_TYPE (lhs))
-              || !INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
-             && (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (rhs))
-                 || !SCALAR_FLOAT_TYPE_P (TREE_TYPE(lhs))
+         if (((!INTEGRAL_TYPE_P (lhst) && !POINTER_TYPE_P (lhst))
+              || (!INTEGRAL_TYPE_P (rhst) && !POINTER_TYPE_P (rhst)))
+             && (!SCALAR_FLOAT_TYPE_P (rhst)
+                 || !SCALAR_FLOAT_TYPE_P (lhst)
                  || !flag_unsafe_math_optimizations))
            continue;




* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
  2006-08-11 13:29 ` [Bug middle-end/28690] [4.2 Regression] " dje at gcc dot gnu dot org
  2006-08-11 14:34 ` dberlin at dberlin dot org
@ 2006-08-26  3:49 ` pinskia at gcc dot gnu dot org
  2006-08-26  4:24 ` bergner at vnet dot ibm dot com
                   ` (53 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-08-26  3:49 UTC (permalink / raw)
  To: gcc-bugs



-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu dot
                   |                            |org
   Target Milestone|---                         |4.2.0



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (2 preceding siblings ...)
  2006-08-26  3:49 ` pinskia at gcc dot gnu dot org
@ 2006-08-26  4:24 ` bergner at vnet dot ibm dot com
  2006-09-01 21:50 ` mmitchel at gcc dot gnu dot org
                   ` (52 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-08-26  4:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from bergner at vnet dot ibm dot com  2006-08-26 04:24 -------
Ok, I tracked down where the expander is swapping the operands. It's occurring
in simplify-rtx.c:simplify_binary_operation() at line 1459:

  /* Make sure the constant is second.  */
  if (GET_RTX_CLASS (code) == RTX_COMM_ARITH
      && swap_commutative_operands_p (op0, op1))
    {
      tem = op0, op0 = op1, op1 = tem;
    }

In this particular case, op0 = (reg/v/f:SI 120 [ base ]) and
op1 = (mult:SI (reg/v:SI 121 [ offset ])
    (const_int 4 [0x4]))

[src being: int indexedload (int *base, int offset) { return base[offset]; }]

swap_commutative_operands_p(op0,op1) simply returns:
commutative_operand_precedence (op0) < commutative_operand_precedence (op1),
which ends up being "-1 < 4", so we swap the operands.  For powerpc, we'd
prefer that the base pointer remain the first operand for performance reasons. I'd
like other people familiar with this code to comment on how we can fix this.
One could simply bump up the priority of base pointers (i.e., "case RTX_OBJ:"),
but I personally don't know how that would affect other platforms.
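
The comparison described above can be modelled with a toy in plain C.  Only
the two precedence values (-1 for the register, 4 for the mult) come from the
analysis above; the enum and function names are hypothetical stand-ins and do
not reproduce GCC's actual code.

```c
#include <assert.h>

/* Toy classification of operands.  The two precedence values are the
   ones quoted above; everything else is a hypothetical stand-in.  */
enum toy_kind { TOY_REG, TOY_MULT };

int toy_precedence (enum toy_kind kind)
{
  switch (kind)
    {
    case TOY_REG:
      return -1;
    case TOY_MULT:
      return 4;
    default:
      assert (0);   /* no other kinds exist in this toy */
      return 0;
    }
}

/* Non-zero when the operands should be exchanged, i.e. when the first
   operand has lower precedence than the second.  */
int toy_swap_operands_p (enum toy_kind op0, enum toy_kind op1)
{
  return toy_precedence (op0) < toy_precedence (op1);
}
```

With the base register as op0 and the mult as op1, the toy predicate says
"swap", which matches the behaviour seen in the expander.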


-- 

bergner at vnet dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini at gnu dot org



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (3 preceding siblings ...)
  2006-08-26  4:24 ` bergner at vnet dot ibm dot com
@ 2006-09-01 21:50 ` mmitchel at gcc dot gnu dot org
  2006-09-03 13:51 ` bonzini at gnu dot org
                   ` (51 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2006-09-01 21:50 UTC (permalink / raw)
  To: gcc-bugs



-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (4 preceding siblings ...)
  2006-09-01 21:50 ` mmitchel at gcc dot gnu dot org
@ 2006-09-03 13:51 ` bonzini at gnu dot org
  2006-09-05 18:44 ` bergner at vnet dot ibm dot com
                   ` (50 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bonzini at gnu dot org @ 2006-09-03 13:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from bonzini at gnu dot org  2006-09-03 13:51 -------
> which ends up being "-1 < 4", so we swap the operands.  For powerpc, we'd
> prefer the base pointer remain the first operand for performance reasons. I'd
> like other people familiar with this code to comment on how we can fix this. 
> One could simply bump up the priority of base pointers (i.e., "case RTX_OBJ:"),
> but I personally don't know how that would affect other platforms.

Very much.  The canonical form enforced by swap_commutative_operands_p is
relied upon by all the code for simplifications, that expects for example a
(plus (mult A B) C) and not a (plus C (mult A B)).

If one took care to fix all of them, it could work, but it's no easy feat. :-(

I think the best solution (if it works) is to put this transformation in
rs6000's legitimize_address. Given a (plus (mult A B) C), force the mult into a
pseudo (let's call it D) and then return (plus C D) with the operands swapped.
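
A rough sketch of that rewrite on a toy expression tree follows.  The
struct, enum and function names are all hypothetical; a real implementation
would work on rtx values and emit an insn to compute the pseudo, which this
sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_REG, TOY_MULT, TOY_PLUS };

/* Minimal expression node; a hypothetical stand-in for an rtx.  */
struct toy_expr
{
  enum toy_code code;
  int regno;               /* meaningful when code == TOY_REG */
  struct toy_expr *op0;    /* meaningful for TOY_MULT and TOY_PLUS */
  struct toy_expr *op1;
};

/* Given (plus (mult A B) C) with C a register, rewrite *addr in place
   to (plus C D), where D is a fresh register standing for the product.
   'pseudo' is caller-provided storage for D.  Returns 1 if rewritten,
   0 if the address did not have the expected shape.  */
int toy_legitimize (struct toy_expr *addr, struct toy_expr *pseudo,
                    int new_regno)
{
  struct toy_expr *base;

  assert (addr != NULL && pseudo != NULL);
  if (addr->code != TOY_PLUS
      || addr->op0 == NULL || addr->op0->code != TOY_MULT
      || addr->op1 == NULL || addr->op1->code != TOY_REG)
    return 0;

  /* A real backend would emit "D = A * B" here; the toy only records D.  */
  pseudo->code = TOY_REG;
  pseudo->regno = new_regno;
  pseudo->op0 = NULL;
  pseudo->op1 = NULL;

  base = addr->op1;
  addr->op0 = base;      /* base register first */
  addr->op1 = pseudo;    /* index register second */
  return 1;
}
```

Because the result is a plus of two plain registers, the generic
canonicalization no longer has a mult to hoist into the first position.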



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (5 preceding siblings ...)
  2006-09-03 13:51 ` bonzini at gnu dot org
@ 2006-09-05 18:44 ` bergner at vnet dot ibm dot com
  2006-09-05 19:25 ` bonzini at gnu dot org
                   ` (49 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-09-05 18:44 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from bergner at vnet dot ibm dot com  2006-09-05 18:43 -------
Created an attachment (id=12190)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12190&action=view)
Patch to rs6000_legitimize_address to force base pointers into rA position of
indexed load/store instructions.

Ok, taking Paolo's suggestion of moving the change into
rs6000_legitimize_address, I'm trying the attached patch, which bootstraps fine
and fixes the base pointer order for us.  I'm running the testsuite now and
will report back when it's done.

        * config/rs6000/rs6000.c (rs6000_legitimize_address): For performance
        reasons, force PLUS register operands that are base pointers to be the
        first operand.



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (6 preceding siblings ...)
  2006-09-05 18:44 ` bergner at vnet dot ibm dot com
@ 2006-09-05 19:25 ` bonzini at gnu dot org
  2006-09-05 20:01 ` bergner at vnet dot ibm dot com
                   ` (48 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bonzini at gnu dot org @ 2006-09-05 19:25 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from bonzini at gnu dot org  2006-09-05 19:25 -------
To clarify, I make this suggestion because I think that we were getting it
right pre-4.2 just out of luck.

I also thought about having a lower commutative_operand_precedence for
REG_POINTER regs than normal regs, but in fact regs need to have the same
precedence as other rtx's of class RTX_OBJ, or you get pessimization on x86
(e.g. on crafty).



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (7 preceding siblings ...)
  2006-09-05 19:25 ` bonzini at gnu dot org
@ 2006-09-05 20:01 ` bergner at vnet dot ibm dot com
  2006-09-07  5:14 ` bergner at vnet dot ibm dot com
                   ` (47 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-09-05 20:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from bergner at vnet dot ibm dot com  2006-09-05 20:01 -------
Well, to get REG_POINTER regs to be the first operand, we'd need to increase
their commutative_operand_precedence.  I tried that change already, but that
led to an infinite recursion loop while attempting to simplify the rtl. As you
said, the current ordering seems to be relied upon by the code for
simplifications.

My rs6000_legitimize_address change is still running through the testsuite.



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (8 preceding siblings ...)
  2006-09-05 20:01 ` bergner at vnet dot ibm dot com
@ 2006-09-07  5:14 ` bergner at vnet dot ibm dot com
  2006-09-21 18:14 ` bergner at vnet dot ibm dot com
                   ` (46 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-09-07  5:14 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from bergner at vnet dot ibm dot com  2006-09-07 05:14 -------
Ok, this also passed regression tests on powerpc64-linux (32-bit and 64-bit
testsuite runs) for c, c++, fortran, objc, obj-c++ and java.

Does the attached patch look reasonable to everyone?



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (9 preceding siblings ...)
  2006-09-07  5:14 ` bergner at vnet dot ibm dot com
@ 2006-09-21 18:14 ` bergner at vnet dot ibm dot com
  2006-09-21 18:16 ` bergner at vnet dot ibm dot com
                   ` (45 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-09-21 18:14 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from bergner at vnet dot ibm dot com  2006-09-21 18:14 -------
Created an attachment (id=12305)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12305&action=view)
Patch to rs6000_legitimize_address to force base pointers into rA position of
indexed load/store instructions.

It seems -msoft-float doesn't like the operand swapping, so this patch disables
it when we specify -msoft-float on the compile command.



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (10 preceding siblings ...)
  2006-09-21 18:14 ` bergner at vnet dot ibm dot com
@ 2006-09-21 18:16 ` bergner at vnet dot ibm dot com
  2006-09-21 18:19 ` bergner at vnet dot ibm dot com
                   ` (44 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-09-21 18:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from bergner at vnet dot ibm dot com  2006-09-21 18:16 -------
(From update of attachment 12190)
Forgot to mark this patch as obsoleted by the updated patch.


-- 

bergner at vnet dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #12190|0                           |1
        is obsolete|                            |



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (11 preceding siblings ...)
  2006-09-21 18:16 ` bergner at vnet dot ibm dot com
@ 2006-09-21 18:19 ` bergner at vnet dot ibm dot com
  2006-09-22 16:30 ` bergner at vnet dot ibm dot com
                   ` (43 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-09-21 18:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from bergner at vnet dot ibm dot com  2006-09-21 18:19 -------
Created an attachment (id=12306)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12306&action=view)
Alternate patch to rs6000_legitimize_address to force base pointers into rA
position of indexed load/store instructions.

Same as the other patch, except we don't call force_reg() on constants.



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (12 preceding siblings ...)
  2006-09-21 18:19 ` bergner at vnet dot ibm dot com
@ 2006-09-22 16:30 ` bergner at vnet dot ibm dot com
  2006-09-22 16:56 ` bergner at vnet dot ibm dot com
                   ` (42 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-09-22 16:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from bergner at vnet dot ibm dot com  2006-09-22 16:30 -------
Anton discovered that we don't get multi-dimensional array accesses right, as
in the following test case:

int indexedload(int ***base, int idx0, int idx1, int idx2)
{
  return base[idx0][idx1][idx2];
}

This one leads to 3 indexed loads.  We transform the first indexed load ok, but
not the other two.  I tracked that down to force_reg (called from
break_out_memory_refs) not propagating the MEM_POINTER flag to a REG_POINTER
flag on the reg it creates.  I posted and committed a fix, which was approved:

    http://gcc.gnu.org/ml/gcc-patches/2006-09/msg00941.html

We now successfully transform all of the indexed loads in this test case.
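
For reference, the three dependent loads in that test case can be written out
by hand as below.  The function name and the intermediate variable names are
illustrative only; the point is that the base of the second and third loads
is itself a value loaded from memory.

```c
#include <assert.h>
#include <stddef.h>

/* Same result as base[idx0][idx1][idx2], written as three dependent
   indexed loads.  The first two loads produce pointers, which is why
   the pointer-ness of each loaded value matters for the later loads.  */
int indexedload_steps (int ***base, int idx0, int idx1, int idx2)
{
  int **level1;
  int *level2;

  assert (base != NULL);
  level1 = base[idx0];     /* load 1: base is the function argument */
  level2 = level1[idx1];   /* load 2: base is the value from load 1 */
  return level2[idx2];     /* load 3: base is the value from load 2 */
}
```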



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (13 preceding siblings ...)
  2006-09-22 16:30 ` bergner at vnet dot ibm dot com
@ 2006-09-22 16:56 ` bergner at vnet dot ibm dot com
  2006-09-22 17:05 ` pinskia at gcc dot gnu dot org
                   ` (41 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-09-22 16:56 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from bergner at vnet dot ibm dot com  2006-09-22 16:56 -------
Yet another test case from Anton we don't catch.  Will they never end?!?! ;)

int indexedload(int *base, int len)
{
  int i, sum = 0;
  for (i=0; i < len; i++)
    sum += base[i];
  return sum;
}

In this case, LEGITIMIZE_ADDRESS cannot help, because it is never passed an
operand that includes the base pointer.  Instead, we're passed a pseudo
register that was previously set to a calculation using the base pointer, so in
this case, we can't propagate the REG_POINTER flag.



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (14 preceding siblings ...)
  2006-09-22 16:56 ` bergner at vnet dot ibm dot com
@ 2006-09-22 17:05 ` pinskia at gcc dot gnu dot org
  2006-09-22 17:09   ` Andrew Pinski
  2006-09-22 17:09 ` pinskia at physics dot uc dot edu
                   ` (40 subsequent siblings)
  56 siblings, 1 reply; 59+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-09-22 17:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from pinskia at gcc dot gnu dot org  2006-09-22 17:05 -------
(In reply to comment #13)
> Yet another test case from Anton we don't catch.  Will they never end?!?! ;)
I bet a beer or a shot of vodka, that this is caused by MEM_REF not expanding
with a REG_POINTER.



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (15 preceding siblings ...)
  2006-09-22 17:05 ` pinskia at gcc dot gnu dot org
@ 2006-09-22 17:09 ` pinskia at physics dot uc dot edu
  2006-09-22 17:27 ` sabre at nondot dot org
                   ` (39 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: pinskia at physics dot uc dot edu @ 2006-09-22 17:09 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from pinskia at physics dot uc dot edu  2006-09-22 17:09 -------
Subject: Re:  [4.2 Regression] Performace problem
        with indexed load/stores on powerpc

On Fri, 2006-09-22 at 17:05 +0000, pinskia at gcc dot gnu dot org wrote:
> 
> ------- Comment #14 from pinskia at gcc dot gnu dot org  2006-09-22 17:05 -------
> (In reply to comment #13)
> > Yet another test case from Anton we don't catch.  Will they never end?!?! ;)
> I bet a beer or a shot of vodka, that this is caused by MEM_REF not expanding
> with a REG_POINTER.

And I lost because we have:
;; sum = sum + MEM[base: base, index: (int *) i * 4B]
(insn 29 27 30 (set (reg:SI 128)
        (ashift:SI (reg/v:SI 123 [ i ])
            (const_int 2 [0x2]))) -1 (nil)
    (nil))

(insn 30 29 31 (set (reg:SI 129)
        (mem:SI (plus:SI (reg:SI 128)
                (reg/v/f:SI 125 [ base ])) [0 S4 A32])) -1 (nil)
    (nil))

(insn 31 30 0 (set (reg/v:SI 122 [ sum ])
        (plus:SI (reg/v:SI 122 [ sum ])
            (reg:SI 129))) -1 (nil)
    (nil))



-- Pinski



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (16 preceding siblings ...)
  2006-09-22 17:09 ` pinskia at physics dot uc dot edu
@ 2006-09-22 17:27 ` sabre at nondot dot org
  2006-10-03  3:30 ` bergner at vnet dot ibm dot com
                   ` (38 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: sabre at nondot dot org @ 2006-09-22 17:27 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from sabre at nondot dot org  2006-09-22 17:27 -------
Out of curiosity, which powerpc processors are affected by this?

-Chris


-- 

sabre at nondot dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sabre at nondot dot org



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (17 preceding siblings ...)
  2006-09-22 17:27 ` sabre at nondot dot org
@ 2006-10-03  3:30 ` bergner at vnet dot ibm dot com
  2006-10-03  5:21 ` paolo dot bonzini at lu dot unisi dot ch
                   ` (37 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-10-03  3:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from bergner at vnet dot ibm dot com  2006-10-03 03:30 -------
Created an attachment (id=12375)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12375&action=view)
Patch to swap_commutative_operands_p and gen_addr_rtx to force base pointers
into rA position of indexed load/store instructions.

We propagated the MEM_POINTER/REG_POINTER flags fine.  The problem is that the
memory reference we're handed is a REG + REG which looks legitimate to us, so
we never call LEGITIMIZE_ADDRESS, so we never have a chance to swap the
operands.
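
The control flow described in the paragraph above can be modelled roughly as
follows.  This is a toy under stated assumptions: the enum, the counter and
all function names are hypothetical, and the real legitimacy test is far more
involved than a single comparison.

```c
#include <assert.h>

/* Toy address forms; hypothetical stand-ins, not GCC's representation.  */
enum toy_addr_form { TOY_REG_PLUS_REG, TOY_MULT_PLUS_REG };

static int toy_hook_calls;   /* counts invocations of the toy hook */

/* In this toy, REG + REG is already a valid address; MULT + REG is not.  */
static int toy_legitimate_p (enum toy_addr_form form)
{
  return form == TOY_REG_PLUS_REG;
}

/* Stand-in for a target hook that could reorder operands.  */
static enum toy_addr_form toy_legitimize_hook (enum toy_addr_form form)
{
  (void) form;
  toy_hook_calls++;
  return TOY_REG_PLUS_REG;
}

/* The hook only runs for addresses that are not already legitimate, so
   a REG + REG address passes through untouched, operand order included.  */
enum toy_addr_form toy_memory_address (enum toy_addr_form form)
{
  if (toy_legitimate_p (form))
    return form;
  return toy_legitimize_hook (form);
}

/* Accessor so callers can observe how often the hook ran.  */
int toy_hook_call_count (void)
{
  return toy_hook_calls;
}
```

In the toy, an address that arrives as REG + REG never reaches the hook, so
any operand reordering placed in the hook has no effect on it.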

Since we cannot fix up the latest test case in LEGITIMIZE_ADDRESS, I've decided
to attempt another swap_commutative_operands_p() /
commutative_operand_precedence() change.  However, I'm a little more selective
about when we change swap_commutative_operands_p()'s return value.  With this
patch, I'm able to transform each of the test cases so that the base address is
the first operand of the indexed load.

        * rtlanal.c (swap_commutative_operands_p): Preference a REG_POINTER
        over a non REG_POINTER.
        * tree-ssa-address.c (gen_addr_rtx): Force a REG_POINTER to be
        the first operand of a PLUS.


-- 

bergner at vnet dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #12305|0                           |1
        is obsolete|                            |
  Attachment #12306|0                           |1
        is obsolete|                            |



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (18 preceding siblings ...)
  2006-10-03  3:30 ` bergner at vnet dot ibm dot com
@ 2006-10-03  5:21 ` paolo dot bonzini at lu dot unisi dot ch
  2006-10-03 15:52 ` bergner at vnet dot ibm dot com
                   ` (36 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: paolo dot bonzini at lu dot unisi dot ch @ 2006-10-03  5:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from paolo dot bonzini at lu dot unisi dot ch  2006-10-03 05:20 -------
Subject: Re:  [4.2 Regression] Performace problem with
 indexed load/stores on powerpc


>         * rtlanal.c (swap_commutative_operands_p): Preference a REG_POINTER
>         over a non REG_POINTER.
>         * tree-ssa-address.c (gen_addr_rtx): Force a REG_POINTER to be
>         the first operand of a PLUS.
This is more gentle indeed.  Be careful however as functions calling 
commutative_operand_precedence directly may have a problem with that.  
Can you try making an address illegitimate if it is non-REG_POINTER + 
REG_POINTER?  Or set up splitters to do the transformation just before 
reload?

Paolo



* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (19 preceding siblings ...)
  2006-10-03  5:21 ` paolo dot bonzini at lu dot unisi dot ch
@ 2006-10-03 15:52 ` bergner at vnet dot ibm dot com
  2006-10-03 17:58 ` dje at gcc dot gnu dot org
                   ` (35 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-10-03 15:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from bergner at vnet dot ibm dot com  2006-10-03 15:51 -------
David has already said offline that he would reject any patch that would cause
us to view a non-REG_POINTER + REG_POINTER expression as not legitimate.  I
agree with that.

Sorry, but I'm slowly learning the machine description format.  How exactly
would adding a splitter help us?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (20 preceding siblings ...)
  2006-10-03 15:52 ` bergner at vnet dot ibm dot com
@ 2006-10-03 17:58 ` dje at gcc dot gnu dot org
  2006-10-03 18:11 ` dje at watson dot ibm dot com
                   ` (34 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: dje at gcc dot gnu dot org @ 2006-10-03 17:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from dje at gcc dot gnu dot org  2006-10-03 17:58 -------
Paolo, forcing all addresses through legitimize_address should not be the goal.
 The wrong ordering has performance effects, but is not an invalid address. 
While the performance effects are POWER-specific, canonicalizing addresses is a
general GCC issue.  GCC appears to want REG_POINTER first, but does not enforce
it.

I am willing to consider target-specific fixes as a last resort, but I do not
see any reason that GCC should not create and maintain a canonical address
ordering of REG_POINTER first.  Trying to correct this in the rs6000 backend is
a kludge.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (22 preceding siblings ...)
  2006-10-03 18:11 ` dje at watson dot ibm dot com
@ 2006-10-03 18:11 ` bonzini at gnu dot org
  2006-10-12 17:23 ` janis at gcc dot gnu dot org
                   ` (32 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bonzini at gnu dot org @ 2006-10-03 18:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from bonzini at gnu dot org  2006-10-03 18:07 -------
Note that I don't oppose at all fixing the problem in
swap_commutative_operands_p.  At the very least, you have to change
simplify-rtx.c's uses of commutative_operand_precedence to use s_c_o_p
instead, but that's a minor problem.

I'm also worried about the interaction between this change to
swap_commutative_operands_p and swap_commutative_operands_with_target which
(even though I refactored it quite recently to happen very early, in expand) is
an optimization that CSE has performed for years and has big impact on x86, for
example on crafty.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (21 preceding siblings ...)
  2006-10-03 17:58 ` dje at gcc dot gnu dot org
@ 2006-10-03 18:11 ` dje at watson dot ibm dot com
  2006-10-03 18:11 ` bonzini at gnu dot org
                   ` (33 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: dje at watson dot ibm dot com @ 2006-10-03 18:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #22 from dje at watson dot ibm dot com  2006-10-03 18:09 -------
Subject: Re:  [4.2 Regression] Performace problem with indexed load/stores on
powerpc 

        I am not suggesting that the problem has to be solved in
swap_commutative_operands, etc.  I would think that GCC should be able to
create commutative addresses where they are formed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (23 preceding siblings ...)
  2006-10-03 18:11 ` bonzini at gnu dot org
@ 2006-10-12 17:23 ` janis at gcc dot gnu dot org
  2006-11-08  3:30 ` [Bug middle-end/28690] [4.2/4.3 " bergner at vnet dot ibm dot com
                   ` (31 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: janis at gcc dot gnu dot org @ 2006-10-12 17:23 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from janis at gcc dot gnu dot org  2006-10-12 17:23 -------
Subject: Bug 28690

Author: janis
Date: Thu Oct 12 17:23:10 2006
New Revision: 117668

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=117668
Log:
        PR middle-end/28690
        * explow.c (force_reg): Set REG_POINTER flag according to
        MEM_POINTER flag.

Modified:
    branches/ibm/gcc-4_1-branch/gcc/ChangeLog
    branches/ibm/gcc-4_1-branch/gcc/explow.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (24 preceding siblings ...)
  2006-10-12 17:23 ` janis at gcc dot gnu dot org
@ 2006-11-08  3:30 ` bergner at vnet dot ibm dot com
  2006-11-08  3:35 ` pinskia at gcc dot gnu dot org
                   ` (30 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-11-08  3:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #24 from bergner at vnet dot ibm dot com  2006-11-08 03:30 -------
Ok, Anton hit another test case the last patch doesn't transform.  It's
actually the same test case as in comment #13, except, "int *base" is now a
global variable rather than a function parameter.

int *base;

int indexedload(int len)
{
  int i, sum = 0;
  for (i=0; i < len; i++)
    sum += base[i];
  return sum;
}

With this case, we get the following RTL generated for the load of *base into a
register:

;; base.1 = base
(insn 24 22 25 (set (reg:SI 129)
        (high:SI (symbol_ref:SI ("base") [flags 0x84] <var_decl 0x400a6930
base>))) -1 (nil)
    (nil))

(insn 25 24 26 (set (reg/f:SI 128)
        (lo_sum:SI (reg:SI 129)
            (symbol_ref:SI ("base") [flags 0x84] <var_decl 0x400a6930 base>)))
-1 (nil)
    (expr_list:REG_EQUAL (symbol_ref:SI ("base") [flags 0x84] <var_decl
0x400a6930 base>)
        (nil)))

(insn 26 25 0 (set (reg:SI 124 [ base.1 ])
        (mem/c/i:SI (reg/f:SI 128) [3 base+0 S4 A32])) -1 (nil)
    (nil))

There seem to be two problems here.  First, the mem/c/i:SI (reg/f:SI 128) seems
to be missing a MEM_POINTER attribute.  I tracked that down to
rtl.h:MEM_COPY_ATTRIBUTES(LHS,RHS) not copying the MEM_POINTER attribute from
RHS to LHS.  I added the code to do that, so the mem above does get the
MEM_POINTER flag set. However, the reg:SI 124 [base.1] is still missing the
REG_POINTER flag.

This second problem seems to be in expand_one_register_var(tree var).  In this
specific case, var seems to have a pointer type, but looks to be "artificial",
so we skip the code that would have marked the reg rtx with the REG_POINTER
flag:

Breakpoint 4, expand_one_register_var (var=0x400a6d90) at
/home/bergner/gcc/gcc-mainline-rtlanal/gcc/cfgexpand.c:643
643       tree type = TREE_TYPE (var);
(gdb) call debug_tree (var)
 <var_decl 0x400a6d90 base.1
    type <pointer_type 0x4009ae38
        type <integer_type 0x4009a340 int sizes-gimplified public SI
            size <integer_cst 0x4008e5e0 constant invariant 32>
            unit size <integer_cst 0x4008e2a0 constant invariant 4>
            align 32 symtab 0 alias set 2 precision 32 min <integer_cst
0x4008e580 -2147483648> max <integer_cst 0x4008e5a0 2147483647>
            pointer_to_this <pointer_type 0x4009ae38>>
        public unsigned SI size <integer_cst 0x4008e5e0 32> unit size
<integer_cst 0x4008e2a0 4>
        align 32 symtab 0 alias set 3>
    used unsigned ignored SI file indexedload5.c line 7 size <integer_cst
0x4008e5e0 32> unit size <integer_cst 0x4008e2a0 4>
    align 32 context <function_decl 0x40151f00 indexedload> chain <var_decl
0x400a6e00 D.1531>>
(gdb) p var->decl_common.artificial_flag
$8 = 1

If I clear var->decl_common.artificial_flag with the debugger and continue,
then we get the REG_POINTER flag set and the indexed load is generated with the
base pointer first like we want.  Does anyone know why var above was marked as
artificial?
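
The behaviour seen in the debugger session above can be restated as a small
toy model.  This is only a sketch inferred from that session, not the code in
cfgexpand.c; toy_decl and toy_marks_reg_pointer are hypothetical names.

```c
#include <stdbool.h>

/* Toy stand-in for the two decl properties that matter here.
   Hypothetical type, not a GCC data structure.  */
struct toy_decl
{
  bool is_pointer_type;  /* models "the type of var is a pointer type" */
  bool artificial;       /* models var->decl_common.artificial_flag */
};

/* Returns true when the pseudo created for DECL would get the
   REG_POINTER flag under the observed behaviour: the decl must have a
   pointer type and must not be compiler generated.  */
static bool
toy_marks_reg_pointer (const struct toy_decl *decl)
{
  return decl->is_pointer_type && !decl->artificial;
}
```

In this model base.1 (pointer type, artificial) is not marked, and clearing
the artificial flag is enough to get it marked, which matches what the
debugger experiment showed.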


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (25 preceding siblings ...)
  2006-11-08  3:30 ` [Bug middle-end/28690] [4.2/4.3 " bergner at vnet dot ibm dot com
@ 2006-11-08  3:35 ` pinskia at gcc dot gnu dot org
  2006-11-20 20:22 ` bergner at vnet dot ibm dot com
                   ` (29 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-11-08  3:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #25 from pinskia at gcc dot gnu dot org  2006-11-08 03:35 -------
(In reply to comment #24)
> Does anyone know why var above was marked as
> artificial?
Yes, because it is a compiler-generated decl for the load of the global.  The
main reason why we don't mark the RTL for artificial decls is because we still
get things like:
(int*)int_var
which cause problems.  This is what PTR_PLUS_EXPR, which I am creating, helps
solve.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (26 preceding siblings ...)
  2006-11-08  3:35 ` pinskia at gcc dot gnu dot org
@ 2006-11-20 20:22 ` bergner at vnet dot ibm dot com
  2006-11-29  7:56 ` bonzini at gnu dot org
                   ` (28 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-11-20 20:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #26 from bergner at vnet dot ibm dot com  2006-11-20 20:21 -------
The following patch was checked into mainline to fix the "first" problem
described in comment #24.

Author: bergner
Date: Sat Nov 11 04:20:37 2006
New Revision: 118684

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118684
Log:
        * rtl.h (MEM_COPY_ATTRIBUTES): Copy MEM_POINTER.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/rtl.h
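
As an illustration of the kind of change the log entry describes, here is a
toy attribute-copy macro that also carries the pointer flag across.  It is a
sketch only: struct toy_mem, its fields and TOY_MEM_COPY_ATTRIBUTES are
hypothetical, and the real macro in rtl.h copies a different set of flags.

```c
#include <stdbool.h>

/* Hypothetical, much reduced model of the attributes on a MEM.  */
struct toy_mem
{
  bool volatile_p;
  bool pointer_p;   /* models MEM_POINTER */
  int alias_set;
};

/* Copies every modelled attribute from RHS to LHS.  Leaving out the
   pointer_p line would reproduce the "first" problem from comment #24:
   the copy silently loses the fact that the MEM holds a pointer.  */
#define TOY_MEM_COPY_ATTRIBUTES(LHS, RHS)      \
  do {                                         \
    (LHS)->volatile_p = (RHS)->volatile_p;     \
    (LHS)->pointer_p = (RHS)->pointer_p;       \
    (LHS)->alias_set = (RHS)->alias_set;       \
  } while (0)
```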


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (27 preceding siblings ...)
  2006-11-20 20:22 ` bergner at vnet dot ibm dot com
@ 2006-11-29  7:56 ` bonzini at gnu dot org
  2006-11-29 20:11 ` bergner at vnet dot ibm dot com
                   ` (27 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bonzini at gnu dot org @ 2006-11-29  7:56 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #27 from bonzini at gnu dot org  2006-11-29 07:56 -------
This case is still not fixed:


struct s {
  int size;
  float *data;
};

void f(struct s *d, struct s *s)
{
  int i;
  for (i = 0; i < s->size; i++)
    d->data[i] += s->data[i];
}

The body of the loop is compiled to:

L4:
        slwi r2,r9,2
        addi r9,r9,1
        lfsx f0,r2,r3
        lfsx f13,r4,r2
        fadds f0,f0,f13
        stfsx f0,r2,r3
        bdnz L4

Note how r2 is twice in the first position, and once in the second.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (28 preceding siblings ...)
  2006-11-29  7:56 ` bonzini at gnu dot org
@ 2006-11-29 20:11 ` bergner at vnet dot ibm dot com
  2006-11-29 22:24 ` bergner at vnet dot ibm dot com
                   ` (26 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-11-29 20:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #28 from bergner at vnet dot ibm dot com  2006-11-29 20:11 -------
Another problem with the current patch is that we get one testsuite regression
(gfortran.fortran-torture/compile/defined_type_2.f90 at -O1).  For this simple
testcase, we end up generating bad assembler:

    mr 9,sfp

instead of:

    mr 9,1

For some reason, the stack frame pseudo isn't reloaded correctly and we spit
out the "sfp" text instead of the correct register number.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (29 preceding siblings ...)
  2006-11-29 20:11 ` bergner at vnet dot ibm dot com
@ 2006-11-29 22:24 ` bergner at vnet dot ibm dot com
  2006-12-05  4:22 ` bergner at vnet dot ibm dot com
                   ` (25 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-11-29 22:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #29 from bergner at vnet dot ibm dot com  2006-11-29 22:23 -------
Talking with Andrew on IRC, he said the test case in comment #27 fails for the
same reason as the test case in comment #24 (ie, it looks like an artificial
decl) and should be fixed with his PTR_PLUS_EXPR work.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (30 preceding siblings ...)
  2006-11-29 22:24 ` bergner at vnet dot ibm dot com
@ 2006-12-05  4:22 ` bergner at vnet dot ibm dot com
  2006-12-05  4:42 ` bergner at vnet dot ibm dot com
                   ` (24 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-12-05  4:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #30 from bergner at vnet dot ibm dot com  2006-12-05 04:22 -------
Ok, the problem from comment #28 was due to a latent bug in
reload1.c:eliminate_regs_in_insn(). The bug is that eliminate_regs_in_insn()
calls single_set() on the passed in insn.  This has been fine before, but now
with the patch, we end up passing in a parallel insn for a load with update and
the load portion of the parallel has the REG_UNUSED flag set.  This causes
single_set() to return the "update" portion of the parallel instead of
returning NULL as it would do normally with parallels.  This causes us to only
eliminate the update portion of the parallel and we skip eliminating the load
portion.  The problem insn before eliminate_regs_in_insn() looks like:

(insn 12 62 13 2 (parallel [
            (set (reg:SI 0 0 [125])
                (mem/s/j:SI (plus:SI (reg/f:SI 113 sfp)
                        (const_int 8 [0x8])) [0 S4 A32]))
            (set (reg/f:SI 9 9 [orig:124 D.965 ] [124])
                (plus:SI (reg/f:SI 113 sfp)
                    (const_int 8 [0x8])))
        ]) 373 {*movsi_update1} (nil)
    (expr_list:REG_UNUSED (reg:SI 0 0 [125])
        (nil)))

After eliminate_regs_in_insn(), we have:

(insn 12 62 13 2 (parallel [
            (set (reg:SI 0 0 [125])
                (mem/s/j:SI (plus:SI (reg/f:SI 113 sfp)
                        (const_int 8 [0x8])) [0 S4 A32]))
            (set (reg/f:SI 9 9 [orig:124 D.965 ] [124])
                (plus:SI (reg/f:SI 1 1)
                    (const_int 8 [0x8])))
        ]) 373 {*movsi_update1} (nil)
    (expr_list:REG_UNUSED (reg:SI 0 0 [125])
        (nil)))

However, calculate_needs_all_insns() ends up backing out the eliminated
(reg/f:SI 1 1) and restoring the non-eliminated (reg/f:SI 113 sfp).  The
(reg/f:SI 113 sfp) never gets eliminated after that, so we generate bogus
assembler.

In addition to the latest patch attached here, I added the following patch to
stop eliminate_regs_in_insn() from calling single_set() for parallel insns.  It
fixed the bug here and bootstrapped and regtested with no errors.  I'll post
the combined patch to gcc-patches.
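
The single_set() behaviour described above can be modelled in a few lines.
This is a simplified sketch under stated assumptions, not the rtlanal.c
implementation: toy_set and toy_single_set are hypothetical names, and the
real function applies further conditions that are left out here.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical model of one SET inside a PARALLEL.  */
struct toy_set
{
  int dest_regno;
  bool dest_unused;   /* models a REG_UNUSED note for the destination */
};

/* Returns the only set whose destination is still used, or NULL when
   there is no such set or more than one.  Sets with an unused
   destination are skipped, which is how a two-set PARALLEL can look
   like a single set.  */
static const struct toy_set *
toy_single_set (const struct toy_set *sets, size_t n)
{
  const struct toy_set *found = NULL;
  size_t i;

  for (i = 0; i < n; i++)
    {
      if (sets[i].dest_unused)
        continue;
      if (found != NULL)
        return NULL;
      found = &sets[i];
    }
  return found;
}
```

For a load with update where both results are used, the model returns NULL.
Once the loaded register (regno 0 in insn 12) is marked unused, it returns
the update set for regno 9, so a caller that only processes the returned set
never looks at the load.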


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (31 preceding siblings ...)
  2006-12-05  4:22 ` bergner at vnet dot ibm dot com
@ 2006-12-05  4:42 ` bergner at vnet dot ibm dot com
  2006-12-05 16:12 ` pthaugen at us dot ibm dot com
                   ` (23 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at vnet dot ibm dot com @ 2006-12-05  4:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #31 from bergner at vnet dot ibm dot com  2006-12-05 04:41 -------
...and here's the patch I mentioned in the previous comment:

Index: reload1.c
===================================================================
--- reload1.c   (revision 119497)
+++ reload1.c   (working copy)
@@ -2930,7 +2930,7 @@ eliminate_regs_in_insn (rtx insn, int re
   int icode = recog_memoized (insn);
   rtx old_body = PATTERN (insn);
   int insn_is_asm = asm_noperands (old_body) >= 0;
-  rtx old_set = single_set (insn);
+  rtx old_set;
   rtx new_body;
   int val = 0;
   int i;
@@ -2949,6 +2949,12 @@ eliminate_regs_in_insn (rtx insn, int re
       return 0;
     }

+  /* Guard against a PARALLEL with a REG_UNUSED note.  */
+  if (GET_CODE (PATTERN (insn)) != PARALLEL)
+    old_set = single_set (insn);
+  else
+    old_set = 0;
+
   if (old_set != 0 && REG_P (SET_DEST (old_set))
       && REGNO (SET_DEST (old_set)) < FIRST_PSEUDO_REGISTER)
     {


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (32 preceding siblings ...)
  2006-12-05  4:42 ` bergner at vnet dot ibm dot com
@ 2006-12-05 16:12 ` pthaugen at us dot ibm dot com
  2006-12-05 16:30 ` pthaugen at us dot ibm dot com
                   ` (22 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: pthaugen at us dot ibm dot com @ 2006-12-05 16:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #32 from pthaugen at us dot ibm dot com  2006-12-05 16:12 -------
Another example, pared down from the ammp benchmark in CPU2000.

void f2(int *, int *);
void mm_fv_update_nonbon(void)
{
 int j, nx;
 int naybor[27];

 f2(naybor, &nx);

 for(j=0; j< 27; j++)
     if( naybor[j]) break; /* Indexed load problem */


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (33 preceding siblings ...)
  2006-12-05 16:12 ` pthaugen at us dot ibm dot com
@ 2006-12-05 16:30 ` pthaugen at us dot ibm dot com
  2007-01-17 20:58 ` bergner at gcc dot gnu dot org
                   ` (21 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: pthaugen at us dot ibm dot com @ 2006-12-05 16:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #33 from pthaugen at us dot ibm dot com  2006-12-05 16:30 -------
My prior comment is missing the closing bracket for the procedure, but the
example is otherwise complete.
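
For convenience, here is the example with the closing bracket restored, as a
file that compiles and links on its own.  The body of f2 and the f2_calls
counter are hypothetical additions made only so the file is self-contained;
the original test case declares f2 without defining it.

```c
void f2(int *, int *);

/* Hypothetical counter, used only to observe that f2 was called.  */
static int f2_calls;

/* Hypothetical stub: fills the array so the loop below reads
   initialized data.  */
void f2(int *naybor, int *nx)
{
 int k;

 for(k=0; k< 27; k++)
     naybor[k] = (k == 5);
 *nx = 27;
 f2_calls++;
}

void mm_fv_update_nonbon(void)
{
 int j, nx;
 int naybor[27];

 f2(naybor, &nx);

 for(j=0; j< 27; j++)
     if( naybor[j]) break; /* Indexed load problem */
}
```

When looking at the generated code, it is probably better to keep f2 in a
separate file, as in the original, so that the compiler cannot see its body.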


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (34 preceding siblings ...)
  2006-12-05 16:30 ` pthaugen at us dot ibm dot com
@ 2007-01-17 20:58 ` bergner at gcc dot gnu dot org
  2007-02-12 17:30 ` bergner at gcc dot gnu dot org
                   ` (20 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at gcc dot gnu dot org @ 2007-01-17 20:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #34 from bergner at gcc dot gnu dot org  2007-01-17 20:58 -------
Created an attachment (id=12915)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12915&action=view)
Patch to commutative_operand_precedence to increase the precedence of
REG_POINTER and MEM_POINTER objects.

This patch modifies commutative_operand_precedence to increase the precedence
of REG_POINTER and MEM_POINTER objects. This obviates the need for
swap_commutative_operands_with_target which has been replaced by a call to
swap_commutative_operands_p.

This patch improves performance on POWER6 (using -mcpu=power6) by 30% across
both specint and specfp, with a 498% improvement on galgel. On POWER5 (using
-mcpu=power5), there were only negligible differences between the base and
patched compilers. It also correctly transforms all of the above test cases
except those with "artificial" pointers which will have to wait until Andrew's
PTR_PLUS_EXPR work is complete.

Paolo tested this patch on x86 and saw a 4% degradation on galgel which he said
he'd look into. However, overall across both specint and specfp, the
performance change didn't seem that bad.
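
The idea of the patch can be sketched with a toy precedence function.  This
is an illustration only and not the attached patch: the toy_* names are
hypothetical and the precedence numbers are made up; only the rule "swap when
the first operand has lower precedence than the second" is taken from how
swap_commutative_operands_p is used in this thread.

```c
#include <stdbool.h>

enum toy_kind { TOY_CONST, TOY_REG, TOY_MEM };

/* Hypothetical model of an operand of a commutative operation.  */
struct toy_operand
{
  enum toy_kind kind;
  bool pointer;   /* models REG_POINTER or MEM_POINTER */
};

/* Higher values sort first.  Pointer objects get a higher value than
   non-pointer objects of the same kind.  The numbers are illustrative.  */
static int
toy_precedence (const struct toy_operand *op)
{
  switch (op->kind)
    {
    case TOY_CONST:
      return -2;
    case TOY_MEM:
      return op->pointer ? 1 : 0;
    case TOY_REG:
      return op->pointer ? 3 : 2;
    }
  return 0;
}

/* True when X and Y should be exchanged so that the operand with the
   higher precedence comes first.  */
static bool
toy_swap_operands_p (const struct toy_operand *x,
                     const struct toy_operand *y)
{
  return toy_precedence (x) < toy_precedence (y);
}
```

With this ordering, an (index + base) pair is swapped to (base + index) and
a (base + index) pair is left alone, which is the operand order wanted for
the indexed loads and stores discussed in this bug.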


-- 

bergner at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #12375|0                           |1
        is obsolete|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (35 preceding siblings ...)
  2007-01-17 20:58 ` bergner at gcc dot gnu dot org
@ 2007-02-12 17:30 ` bergner at gcc dot gnu dot org
  2007-02-23 17:14 ` bergner at gcc dot gnu dot org
                   ` (19 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at gcc dot gnu dot org @ 2007-02-12 17:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #35 from bergner at gcc dot gnu dot org  2007-02-12 17:29 -------
Created an attachment (id=13042)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13042&action=view)
Alternate patch to commutative_operand_precedence to increase the precedence of
REG_POINTER and MEM_POINTER objects.

Ok, now that the libjava multilib problems have been fixed, I've been
able to attempt to bootstrap the patch in Comment #34 with java enabled.
In doing so, I'm now hitting an ICE while building the 64-bit libgcj.
The ICE is occurring in the same location and for the same reason as
the ICE (optabs.c:emit_cmp_and_jump_insns()) I hit when I attempted
to change swap_commutative_operands_p() (as in this alternate
patch) so that it sorted REG's by register numbers similar to how
simplify_plus_minus_op_data_cmp() sorts them.

With this attached alternate patch, we have a simple testcase that 
exposes the problem:

  void
  gomp_sem_wait_slow (int *sem, int a, int b)
  {
    __sync_bool_compare_and_swap (sem, a, b);
  }

For this testcase, in the swap_commutative_operands_p (x, y) call that guards
the gcc_assert (label) we're failing in, "x" and "y" are simple REGs that
are swapped because REGNO (y) is smaller than REGNO (x).  I don't understand
why the gcc_assert (label) is needed, but I'll try and track that down.
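
To make the swap in this testcase concrete, here is a toy version of a
register-number tie-break.  It is a sketch of the idea only, not the alternate
patch: toy_reg and toy_swap_regs_p are hypothetical, and the precedence field
stands in for whatever commutative_operand_precedence would return.

```c
#include <stdbool.h>

/* Hypothetical model of a plain REG operand.  */
struct toy_reg
{
  unsigned regno;
  int precedence;
};

/* True when X and Y should be exchanged.  Precedence decides first;
   when it is equal, the register with the smaller number goes first.  */
static bool
toy_swap_regs_p (const struct toy_reg *x, const struct toy_reg *y)
{
  if (x->precedence != y->precedence)
    return x->precedence < y->precedence;
  return y->regno < x->regno;
}
```

Two plain REGs with equal precedence are now swapped purely because of their
numbering, so a caller that did not expect a swap for such operands can be
caught out.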


-- 

bergner at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |bergner at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (36 preceding siblings ...)
  2007-02-12 17:30 ` bergner at gcc dot gnu dot org
@ 2007-02-23 17:14 ` bergner at gcc dot gnu dot org
  2007-05-14 21:28 ` mmitchel at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at gcc dot gnu dot org @ 2007-02-23 17:14 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #36 from bergner at gcc dot gnu dot org  2007-02-23 17:14 -------
Created an attachment (id=13101)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13101&action=view)
Alternate patch to commutative_operand_precedence to increase the precedence of
REG_POINTER and MEM_POINTER objects.

With help from Honza, this updated patch eliminates the ICE the previous patch
hit.


-- 

bergner at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #13042|0                           |1
        is obsolete|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (37 preceding siblings ...)
  2007-02-23 17:14 ` bergner at gcc dot gnu dot org
@ 2007-05-14 21:28 ` mmitchel at gcc dot gnu dot org
  2007-06-09  4:08 ` bergner at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2007-05-14 21:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #37 from mmitchel at gcc dot gnu dot org  2007-05-14 22:25 -------
Will not be fixed in 4.2.0; retargeting at 4.2.1.


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.2.0                       |4.2.1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (38 preceding siblings ...)
  2007-05-14 21:28 ` mmitchel at gcc dot gnu dot org
@ 2007-06-09  4:08 ` bergner at gcc dot gnu dot org
  2007-07-20  3:50 ` mmitchel at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at gcc dot gnu dot org @ 2007-06-09  4:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #38 from bergner at gcc dot gnu dot org  2007-06-09 04:08 -------
Created an attachment (id=13671)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13671&action=view)
Updated patch to address x86_64 performance issues.

This updated patch reverts the swap_commutative_operands_with_target removal
from the previous patch, since it seems x86_64 still relies on this for some
benchmarks. The change doesn't seem to affect the POWER6 performance benefits
we were seeing with the older patch.


-- 

bergner at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #12915|0                           |1
        is obsolete|                            |
  Attachment #13101|0                           |1
        is obsolete|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (39 preceding siblings ...)
  2007-06-09  4:08 ` bergner at gcc dot gnu dot org
@ 2007-07-20  3:50 ` mmitchel at gcc dot gnu dot org
  2007-08-06  8:09 ` bonzini at gnu dot org
                   ` (15 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2007-07-20  3:50 UTC (permalink / raw)
  To: gcc-bugs



-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.2.1                       |4.2.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (40 preceding siblings ...)
  2007-07-20  3:50 ` mmitchel at gcc dot gnu dot org
@ 2007-08-06  8:09 ` bonzini at gnu dot org
  2007-08-06 11:35 ` pinskia at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bonzini at gnu dot org @ 2007-08-06  8:09 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #39 from bonzini at gnu dot org  2007-08-06 08:08 -------
committed??


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (41 preceding siblings ...)
  2007-08-06  8:09 ` bonzini at gnu dot org
@ 2007-08-06 11:35 ` pinskia at gcc dot gnu dot org
  2007-08-06 11:52 ` paolo dot bonzini at lu dot unisi dot ch
                   ` (13 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2007-08-06 11:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #40 from pinskia at gcc dot gnu dot org  2007-08-06 11:35 -------
(In reply to comment #39)
> committed??

This is now more like a meta-bug; see the other two bugs that are open for
the current issues (yes, both are assigned to me and both are actively being
worked on; one depends on the other, but both are still being worked on).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (42 preceding siblings ...)
  2007-08-06 11:35 ` pinskia at gcc dot gnu dot org
@ 2007-08-06 11:52 ` paolo dot bonzini at lu dot unisi dot ch
  2007-10-09 19:26 ` mmitchel at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: paolo dot bonzini at lu dot unisi dot ch @ 2007-08-06 11:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #41 from paolo dot bonzini at lu dot unisi dot ch  2007-08-06 11:52 -------
Subject: Re:  [4.2/4.3 Regression] Performace problem
 with indexed load/stores on powerpc


> This is now more like a meta-bug; see the other two bugs that are open for
> the current issues (yes, both are assigned to me and both are actively being
> worked on; one depends on the other, but both are still being worked on).

Ah, I see.

Paolo


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (43 preceding siblings ...)
  2007-08-06 11:52 ` paolo dot bonzini at lu dot unisi dot ch
@ 2007-10-09 19:26 ` mmitchel at gcc dot gnu dot org
  2007-11-10 17:05 ` steven at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2007-10-09 19:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #42 from mmitchel at gcc dot gnu dot org  2007-10-09 19:21 -------
Change target milestone to 4.2.3, as 4.2.2 has been released.


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.2.2                       |4.2.3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (44 preceding siblings ...)
  2007-10-09 19:26 ` mmitchel at gcc dot gnu dot org
@ 2007-11-10 17:05 ` steven at gcc dot gnu dot org
  2008-01-07 18:07 ` steven at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2007-11-10 17:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #43 from steven at gcc dot gnu dot org  2007-11-10 17:05 -------
What is the status of this bug now?  Re. comment #40, a meta-bug for what?
There is only one open bug left that depends on this one.

Are we still tracking an issue in this bug?  If so, what?  If not, please close
this bug report.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (45 preceding siblings ...)
  2007-11-10 17:05 ` steven at gcc dot gnu dot org
@ 2008-01-07 18:07 ` steven at gcc dot gnu dot org
  2008-01-08 16:09 ` bergner at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-01-07 18:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #44 from steven at gcc dot gnu dot org  2008-01-07 17:34 -------
Hello world.  Please, a status update.


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |WAITING


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2/4.3 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (46 preceding siblings ...)
  2008-01-07 18:07 ` steven at gcc dot gnu dot org
@ 2008-01-08 16:09 ` bergner at gcc dot gnu dot org
  2008-01-08 16:12 ` [Bug middle-end/28690] [4.2 " steven at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at gcc dot gnu dot org @ 2008-01-08 16:09 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #45 from bergner at gcc dot gnu dot org  2008-01-08 15:44 -------
This has been fixed in mainline (4.3), but has not been fixed in 4.2.  I'm OK
with not backporting this to 4.2.  I'll give everyone a few days to object,
otherwise I'll change the Target Milestone to 4.3 and close as FIXED.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (47 preceding siblings ...)
  2008-01-08 16:09 ` bergner at gcc dot gnu dot org
@ 2008-01-08 16:12 ` steven at gcc dot gnu dot org
  2008-01-16  5:32 ` bergner at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-01-08 16:12 UTC (permalink / raw)
  To: gcc-bugs



-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
   Last reconfirmed|2006-08-11 13:29:43         |2008-01-08 15:59:22
               date|                            |
            Summary|[4.2/4.3 Regression]        |[4.2 Regression] Performace
                   |Performace problem with     |problem with indexed
                   |indexed load/stores on      |load/stores on powerpc
                   |powerpc                     |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (48 preceding siblings ...)
  2008-01-08 16:12 ` [Bug middle-end/28690] [4.2 " steven at gcc dot gnu dot org
@ 2008-01-16  5:32 ` bergner at gcc dot gnu dot org
  2008-04-08  6:40 ` ubizjak at gmail dot com
                   ` (6 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at gcc dot gnu dot org @ 2008-01-16  5:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #46 from bergner at gcc dot gnu dot org  2008-01-16 01:51 -------
This is fixed on mainline and we're not going to backport it to 4.2, so I'm
changing the target milestone.


-- 

bergner at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|4.2.3                       |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (49 preceding siblings ...)
  2008-01-16  5:32 ` bergner at gcc dot gnu dot org
@ 2008-04-08  6:40 ` ubizjak at gmail dot com
  2008-04-08  6:43 ` ubizjak at gmail dot com
                   ` (5 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: ubizjak at gmail dot com @ 2008-04-08  6:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #47 from ubizjak at gmail dot com  2008-04-08 06:39 -------
Author: bergner
Date: Mon Apr  7 17:36:59 2008
New Revision: 133985

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=133985
Log:
        PR middle-end/PR28690
        * rtlanal.c: Update copyright years.
        (commutative_operand_precedence): Give SYMBOL_REF's the same precedence
        as REG_POINTER and MEM_POINTER operands.
        * emit-rtl.c (gen_reg_rtx_and_attrs): New function.
        (set_reg_attrs_from_value): Call mark_reg_pointer as appropriate.
        * rtl.h (gen_reg_rtx_and_attrs): Add prototype for new function.
        * gcse.c: Update copyright years.
        (pre_delete): Call gen_reg_rtx_and_attrs.
        (hoist_code): Likewise.
        (build_store_vectors): Likewise.
        (delete_store): Likewise.
        * loop-invariant.c (move_invariant_reg): Likewise.
        Update copyright years.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/emit-rtl.c
    trunk/gcc/gcse.c
    trunk/gcc/loop-invariant.c
    trunk/gcc/rtl.h
    trunk/gcc/rtlanal.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (50 preceding siblings ...)
  2008-04-08  6:40 ` ubizjak at gmail dot com
@ 2008-04-08  6:43 ` ubizjak at gmail dot com
  2008-04-08 14:50 ` bergner at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: ubizjak at gmail dot com @ 2008-04-08  6:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #48 from ubizjak at gmail dot com  2008-04-08 06:43 -------
(In reply to comment #47)

>        * rtlanal.c: Update copyright years.
>        (commutative_operand_precedence): Give SYMBOL_REF's the same precedence

This change causes a regression in the i686-pc-linux-gnu testsuite:

FAIL: gcc.target/i386/addr-sel-1.c scan-assembler a\\+1
FAIL: gcc.target/i386/addr-sel-1.c scan-assembler b\\+1

Tracked in PR 35867.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (51 preceding siblings ...)
  2008-04-08  6:43 ` ubizjak at gmail dot com
@ 2008-04-08 14:50 ` bergner at gcc dot gnu dot org
  2008-04-08 15:01 ` bonzini at gnu dot org
                   ` (3 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at gcc dot gnu dot org @ 2008-04-08 14:50 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #49 from bergner at gcc dot gnu dot org  2008-04-08 14:49 -------
The offending hunk has been reverted in revision 134095.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (52 preceding siblings ...)
  2008-04-08 14:50 ` bergner at gcc dot gnu dot org
@ 2008-04-08 15:01 ` bonzini at gnu dot org
  2008-04-08 18:51 ` bergner at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  56 siblings, 0 replies; 59+ messages in thread
From: bonzini at gnu dot org @ 2008-04-08 15:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #50 from bonzini at gnu dot org  2008-04-08 15:00 -------
I guess that you had modified the precedences in order to allow additional
simplifications.  Can you report here what is missed using the current values?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (53 preceding siblings ...)
  2008-04-08 15:01 ` bonzini at gnu dot org
@ 2008-04-08 18:51 ` bergner at gcc dot gnu dot org
  2008-04-08 19:08 ` bonzini at gnu dot org
  2008-04-09 15:39 ` bergner at gcc dot gnu dot org
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at gcc dot gnu dot org @ 2008-04-08 18:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #51 from bergner at gcc dot gnu dot org  2008-04-08 18:50 -------
Ok, I dug into this a little deeper.  For the following test case:

  int array[1024];
  void
  clear_table (unsigned int n)
  {
    unsigned int i;
    for (i = 0; i < n; i++)
      array[i] = 0;
  }

When compiling this with -O1 (it's OK with -O2 or above) on powerpc{,64}-linux,
we call swap_commutative_operands_p during expand with a SYMBOL_REF and a REG,
and it currently prefers the REG first.  Later, break_out_memory_refs forces the
SYMBOL_REF into a register (with the REG_POINTER attribute set), but we're
already done swapping, so we get the wrong operand ordering.  Paolo, I wonder
if this patch might be better than the rtlanal.c hunk.  It does fix my
problem:

Index: explow.c
===================================================================
--- explow.c    (revision 134095)
+++ explow.c    (working copy)
@@ -305,7 +305,7 @@ break_out_memory_refs (rtx x)
       rtx op1 = break_out_memory_refs (XEXP (x, 1));

       if (op0 != XEXP (x, 0) || op1 != XEXP (x, 1))
-       x = gen_rtx_fmt_ee (GET_CODE (x), Pmode, op0, op1);
+       x = simplify_gen_binary (GET_CODE (x), Pmode, op0, op1);
     }

   return x;
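
The ordering problem described above can be sketched with a toy model in C.
Nothing here is GCC code: the toy_* names and the precedence numbers are
assumptions made up for illustration, and only the relative order (pointer
register before plain register before symbol) is meant to mirror the
description.  The sketch canonicalizes once, forces the symbol into a pointer
register, and then rebuilds the sum either raw or canonically:

```c
#include <stdbool.h>

/* Toy operand kinds.  These are stand-ins for illustration, not GCC's
   rtx codes.  */
enum toy_kind { TOY_SYMBOL, TOY_REG, TOY_REG_POINTER };

struct toy_plus { enum toy_kind op0, op1; };

/* Hypothetical precedence values: a larger value sorts first.  Only the
   relative order matters here (pointer register > plain register >
   symbol).  */
static int
toy_precedence (enum toy_kind k)
{
  switch (k)
    {
    case TOY_REG_POINTER: return 3;
    case TOY_REG:         return 2;
    case TOY_SYMBOL:      return 1;
    }
  return 0;
}

/* Raw constructor: keeps the operands in the order given.  */
static struct toy_plus
toy_plus_raw (enum toy_kind a, enum toy_kind b)
{
  struct toy_plus p = { a, b };
  return p;
}

/* Canonicalizing constructor: swaps the operands when the second one
   has higher precedence than the first.  */
static struct toy_plus
toy_plus_canonical (enum toy_kind a, enum toy_kind b)
{
  if (toy_precedence (a) < toy_precedence (b))
    return toy_plus_raw (b, a);
  return toy_plus_raw (a, b);
}

/* Stand-in for forcing a symbol into a register marked as a pointer.  */
static enum toy_kind
toy_force_reg (enum toy_kind k)
{
  return k == TOY_SYMBOL ? TOY_REG_POINTER : k;
}

/* Model the address of array[i]: canonicalize once, force the symbol
   into a register, then rebuild the sum either raw or canonically.  */
static struct toy_plus
toy_expand_address (bool recanonicalize)
{
  struct toy_plus p = toy_plus_canonical (TOY_SYMBOL, TOY_REG);
  enum toy_kind op0 = toy_force_reg (p.op0);
  enum toy_kind op1 = toy_force_reg (p.op1);
  return recanonicalize ? toy_plus_canonical (op0, op1)
                        : toy_plus_raw (op0, op1);
}
```

In this model, toy_expand_address (false) leaves the pointer register in the
second slot, which corresponds to the wrong ordering, while
toy_expand_address (true) puts it first.  That is the effect the patch above
is after when it rebuilds the expression through a canonicalizing helper.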


-- 

bergner at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (54 preceding siblings ...)
  2008-04-08 18:51 ` bergner at gcc dot gnu dot org
@ 2008-04-08 19:08 ` bonzini at gnu dot org
  2008-04-09 15:39 ` bergner at gcc dot gnu dot org
  56 siblings, 0 replies; 59+ messages in thread
From: bonzini at gnu dot org @ 2008-04-08 19:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #52 from bonzini at gnu dot org  2008-04-08 19:07 -------
Subject: Re:  [4.2 Regression] Performace problem with
 indexed load/stores on powerpc


> Index: explow.c
> ===================================================================
> --- explow.c    (revision 134095)
> +++ explow.c    (working copy)
> @@ -305,7 +305,7 @@ break_out_memory_refs (rtx x)
>        rtx op1 = break_out_memory_refs (XEXP (x, 1));
> 
>        if (op0 != XEXP (x, 0) || op1 != XEXP (x, 1))
> -       x = gen_rtx_fmt_ee (GET_CODE (x), Pmode, op0, op1);
> +       x = simplify_gen_binary (GET_CODE (x), Pmode, op0, op1);
>      }

Definitely a good idea.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc
  2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
                   ` (55 preceding siblings ...)
  2008-04-08 19:08 ` bonzini at gnu dot org
@ 2008-04-09 15:39 ` bergner at gcc dot gnu dot org
  56 siblings, 0 replies; 59+ messages in thread
From: bergner at gcc dot gnu dot org @ 2008-04-09 15:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #53 from bergner at gcc dot gnu dot org  2008-04-09 15:38 -------
Author: bergner
Date: Wed Apr  9 13:42:43 2008
New Revision: 134139

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=134139
Log:
        PR middle-end/PR28690
        * explow.c (break_out_memory_refs): Use simplify_gen_binary rather
        than gen_rtx_fmt_ee to perform more canonicalizations.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/explow.c


-- 

bergner at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690


^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2008-04-09 15:39 UTC | newest]

Thread overview: 59+ messages
2006-08-11  5:01 [Bug middle-end/28690] New: Performace problem with indexed load/stores on powerpc bergner at vnet dot ibm dot com
2006-08-11 13:29 ` [Bug middle-end/28690] [4.2 Regression] " dje at gcc dot gnu dot org
2006-08-11 14:34 ` dberlin at dberlin dot org
2006-08-26  3:49 ` pinskia at gcc dot gnu dot org
2006-08-26  4:24 ` bergner at vnet dot ibm dot com
2006-09-01 21:50 ` mmitchel at gcc dot gnu dot org
2006-09-03 13:51 ` bonzini at gnu dot org
2006-09-05 18:44 ` bergner at vnet dot ibm dot com
2006-09-05 19:25 ` bonzini at gnu dot org
2006-09-05 20:01 ` bergner at vnet dot ibm dot com
2006-09-07  5:14 ` bergner at vnet dot ibm dot com
2006-09-21 18:14 ` bergner at vnet dot ibm dot com
2006-09-21 18:16 ` bergner at vnet dot ibm dot com
2006-09-21 18:19 ` bergner at vnet dot ibm dot com
2006-09-22 16:30 ` bergner at vnet dot ibm dot com
2006-09-22 16:56 ` bergner at vnet dot ibm dot com
2006-09-22 17:05 ` pinskia at gcc dot gnu dot org
2006-09-22 17:09   ` Andrew Pinski
2006-09-22 17:09 ` pinskia at physics dot uc dot edu
2006-09-22 17:27 ` sabre at nondot dot org
2006-10-03  3:30 ` bergner at vnet dot ibm dot com
2006-10-03  5:21 ` paolo dot bonzini at lu dot unisi dot ch
2006-10-03 15:52 ` bergner at vnet dot ibm dot com
2006-10-03 17:58 ` dje at gcc dot gnu dot org
2006-10-03 18:11 ` dje at watson dot ibm dot com
2006-10-03 18:11 ` bonzini at gnu dot org
2006-10-12 17:23 ` janis at gcc dot gnu dot org
2006-11-08  3:30 ` [Bug middle-end/28690] [4.2/4.3 " bergner at vnet dot ibm dot com
2006-11-08  3:35 ` pinskia at gcc dot gnu dot org
2006-11-20 20:22 ` bergner at vnet dot ibm dot com
2006-11-29  7:56 ` bonzini at gnu dot org
2006-11-29 20:11 ` bergner at vnet dot ibm dot com
2006-11-29 22:24 ` bergner at vnet dot ibm dot com
2006-12-05  4:22 ` bergner at vnet dot ibm dot com
2006-12-05  4:42 ` bergner at vnet dot ibm dot com
2006-12-05 16:12 ` pthaugen at us dot ibm dot com
2006-12-05 16:30 ` pthaugen at us dot ibm dot com
2007-01-17 20:58 ` bergner at gcc dot gnu dot org
2007-02-12 17:30 ` bergner at gcc dot gnu dot org
2007-02-23 17:14 ` bergner at gcc dot gnu dot org
2007-05-14 21:28 ` mmitchel at gcc dot gnu dot org
2007-06-09  4:08 ` bergner at gcc dot gnu dot org
2007-07-20  3:50 ` mmitchel at gcc dot gnu dot org
2007-08-06  8:09 ` bonzini at gnu dot org
2007-08-06 11:35 ` pinskia at gcc dot gnu dot org
2007-08-06 11:52 ` paolo dot bonzini at lu dot unisi dot ch
2007-10-09 19:26 ` mmitchel at gcc dot gnu dot org
2007-11-10 17:05 ` steven at gcc dot gnu dot org
2008-01-07 18:07 ` steven at gcc dot gnu dot org
2008-01-08 16:09 ` bergner at gcc dot gnu dot org
2008-01-08 16:12 ` [Bug middle-end/28690] [4.2 " steven at gcc dot gnu dot org
2008-01-16  5:32 ` bergner at gcc dot gnu dot org
2008-04-08  6:40 ` ubizjak at gmail dot com
2008-04-08  6:43 ` ubizjak at gmail dot com
2008-04-08 14:50 ` bergner at gcc dot gnu dot org
2008-04-08 15:01 ` bonzini at gnu dot org
2008-04-08 18:51 ` bergner at gcc dot gnu dot org
2008-04-08 19:08 ` bonzini at gnu dot org
2008-04-09 15:39 ` bergner at gcc dot gnu dot org
