public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/34043]  New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
@ 2007-11-09 16:14 jsjodin at gcc dot gnu dot org
  2007-11-09 16:16 ` [Bug tree-optimization/34043] " jsjodin at gcc dot gnu dot org
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: jsjodin at gcc dot gnu dot org @ 2007-11-09 16:14 UTC (permalink / raw)
  To: gcc-bugs

Compile the attached file test.cpp with the following flags:
g++ -DNDEBUG -m32 -msse2 -O2 -msse2 -DSHOW_BUG gcc_bug.cpp -S

In the generated code there are useless stores and loads of %xmm0 to
-40(%eps) and -56(%eps).  If the code is compiled without -DSHOW_BUG
it will generate a more optimal version without the extra memory
accesses. See the attached generated files test_bad.s and test_good.s
for the resulting code. This is an old problem existing in 4.1.x.


-- 
           Summary: Missed optimization causing extra loads and stores when
                    using x86_64 builtin function together with aggregate
                    types.
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jsjodin at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
  2007-11-09 16:16 ` [Bug tree-optimization/34043] " jsjodin at gcc dot gnu dot org
@ 2007-11-09 16:16 ` jsjodin at gcc dot gnu dot org
  2007-11-09 16:17 ` jsjodin at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jsjodin at gcc dot gnu dot org @ 2007-11-09 16:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from jsjodin at gcc dot gnu dot org  2007-11-09 16:16 -------
Created an attachment (id=14516)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14516&action=view)
Source code to expose the missed optimization


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
@ 2007-11-09 16:16 ` jsjodin at gcc dot gnu dot org
  2007-11-09 16:16 ` jsjodin at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jsjodin at gcc dot gnu dot org @ 2007-11-09 16:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from jsjodin at gcc dot gnu dot org  2007-11-09 16:16 -------
Created an attachment (id=14517)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14517&action=view)
Assembly code for bad code.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
  2007-11-09 16:16 ` [Bug tree-optimization/34043] " jsjodin at gcc dot gnu dot org
  2007-11-09 16:16 ` jsjodin at gcc dot gnu dot org
@ 2007-11-09 16:17 ` jsjodin at gcc dot gnu dot org
  2007-11-09 18:24 ` pinskia at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jsjodin at gcc dot gnu dot org @ 2007-11-09 16:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from jsjodin at gcc dot gnu dot org  2007-11-09 16:17 -------
Created an attachment (id=14518)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14518&action=view)
Assembly code for good code.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2007-11-09 16:17 ` jsjodin at gcc dot gnu dot org
@ 2007-11-09 18:24 ` pinskia at gcc dot gnu dot org
  2007-11-13 17:07 ` jsjodin at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2007-11-09 18:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from pinskia at gcc dot gnu dot org  2007-11-09 18:24 -------
Related to PR 33790 (and most likely fixed by it).  There is another issue with
that bug relating to not deleting the extra store.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2007-11-09 18:24 ` pinskia at gcc dot gnu dot org
@ 2007-11-13 17:07 ` jsjodin at gcc dot gnu dot org
  2007-12-03 15:42 ` jsjodin at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jsjodin at gcc dot gnu dot org @ 2007-11-13 17:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from jsjodin at gcc dot gnu dot org  2007-11-13 17:07 -------
(In reply to comment #4)
> Related to PR 33790 (and most likely fixed by it).  There is another issue with
> that bug relating to not deleting the extra store.
> 

Indeed the extra load disappeared when with the patch. The store did not get
deleted as expected. I looked at the differences between the good and bad case. 
Compiling the good case has the following sequence before the fre pass:
Note: src and dst are unions

src.f = D.9650_45;
D.9630_31 = src.f;
D.9655_46 = __builtin_ia32_addps (D.9630_31, D.9630_31);
dst.f = D.9655_46;
D.9632_33 = dst.f;                

After fre the temps have been propagated and replaced the uses of dst.f:

src.f = D.9650_45;                                                              
D.9630_31 = D.9650_45;                                                          
D.9655_46 = __builtin_ia32_addps (D.9630_31, D.9630_31);                        
dst.f = D.9655_46;                                                              
D.9632_33 = D.9655_46;

The extra stores to src.f are eliminated in dce. 

The bad case has the following code before and after fre:
src.i = D.9651_44;                                                              
D.9630_31 = src.f;                                                              
D.9655_45 = __builtin_ia32_addps (D.9630_31, D.9630_31);                        
dst.f = D.9655_45;                                                              
D.9632_33 = dst.i;   

Since the src.i and src.f are probably not considered to be the same the
propagation does not work. It might be possible to handle this case if one
consideres the size of data being written and read from unions.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2007-11-13 17:07 ` jsjodin at gcc dot gnu dot org
@ 2007-12-03 15:42 ` jsjodin at gcc dot gnu dot org
  2007-12-09 22:34 ` pinskia at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jsjodin at gcc dot gnu dot org @ 2007-12-03 15:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from jsjodin at gcc dot gnu dot org  2007-12-03 15:41 -------
> Indeed the extra load disappeared when with the patch. The store did not get
> deleted as expected. I looked at the differences between the good and bad case. 

After doing some more investigation an attempt to fix the problem was done in
the value numbering pass. By using the size and a unique type for the size to
compute the value number it not longer matters which union that is accessed,
only the number of bits that are read/written. This gives good code test.cpp,
however the problem with this approach is that in FRE the different types
result in a cast, but it should really be a reinterpretation of the bits
(different types, same bit pattern). The small example below gives: int: -1,
float: -1.000000 which is wrong. The expected result is: int: -1, float: nan

The solution would be to make the reinterpretation possible somehow. 


Simple test:
---------------------------------------------------------------
#include <stdio.h>

union foo {
  int i;
  float f;
} bar;


int main()
{
  int x = 0xFFFFFFFF;
  float y = 0.0;
  bar.i = x;
  y = bar.f;
  printf("int: %d, float: %f\n", bar.i, bar.f); 

  return 0;
}
-----------------------------------------------------------------




Index: tree-ssa-sccvn.c
===================================================================
--- tree-ssa-sccvn.c    (revision 130099)
+++ tree-ssa-sccvn.c    (working copy)
@@ -456,6 +456,33 @@ shared_vuses_from_stmt (tree stmt)
   return shared_lookup_vops;
 }

+
+static tree get_generic_type_node_from_size(int size)
+{
+  tree result = NULL;
+  switch(size)
+    {
+    case 8:
+      result = unsigned_intQI_type_node;
+      break;
+    case 16:
+      result = unsigned_intHI_type_node;
+      break;
+    case 32:
+      result = unsigned_intSI_type_node;
+      break;
+    case 64:
+      result = unsigned_intDI_type_node;
+      break;
+    case 128:
+      result = unsigned_intTI_type_node;
+      break;
+    default:
+      break;
+    }
+  return result;
+}
+
 /* Copy the operations present in load/store/call REF into RESULT, a vector of
    vn_reference_op_s's.  */

@@ -521,7 +548,23 @@ copy_reference_ops_from_ref (tree ref, V
          break;
        case COMPONENT_REF:
          /* Record field as operand.  */
-         temp.op0 = TREE_OPERAND (ref, 1);
+         {
+           tree aggregate = TREE_OPERAND (ref, 0);
+           tree aggregate_type = TREE_TYPE(aggregate);
+           tree field_decl = TREE_OPERAND (ref, 1);
+           tree field_decl_type = TREE_TYPE(field_decl);
+           tree field_decl_type_size = TYPE_SIZE(field_decl_type);
+           if (TREE_CODE (aggregate_type) == UNION_TYPE
+               && TREE_CODE (field_decl_type_size) == INTEGER_CST)
+             {
+               temp.op0 = field_decl_type_size;
+               temp.type =
get_generic_type_node_from_size(TREE_INT_CST_LOW(field_decl_type_size)) != NULL 
+                             ?
get_generic_type_node_from_size(TREE_INT_CST_LOW(field_decl_type_size)) 
+                             : temp.type;
+             } else {
+               temp.op0 = TREE_OPERAND (ref, 1);
+             }
+         }
          break;
        case ARRAY_RANGE_REF:
        case ARRAY_REF:



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (5 preceding siblings ...)
  2007-12-03 15:42 ` jsjodin at gcc dot gnu dot org
@ 2007-12-09 22:34 ` pinskia at gcc dot gnu dot org
  2007-12-13  1:57 ` jsjodin at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2007-12-09 22:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from pinskia at gcc dot gnu dot org  2007-12-09 22:34 -------
>+static tree get_generic_type_node_from_size(int size)

This function needs lots of improvement because the size of the modes is based
on UNIT_BIT_SIZE (I think that is the macro name) and really you can go from a
size to a mode and then to a tree type.

Note this patch really fixes my bug (PR 32964) if PRE is extended to handle
this too.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (6 preceding siblings ...)
  2007-12-09 22:34 ` pinskia at gcc dot gnu dot org
@ 2007-12-13  1:57 ` jsjodin at gcc dot gnu dot org
  2008-01-03 17:45 ` rguenth at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jsjodin at gcc dot gnu dot org @ 2007-12-13  1:57 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from jsjodin at gcc dot gnu dot org  2007-12-13 01:56 -------
(In reply to comment #7)
> >+static tree get_generic_type_node_from_size(int size)
> 
> This function needs lots of improvement because the size of the modes is based
> on UNIT_BIT_SIZE (I think that is the macro name) and really you can go from a
> size to a mode and then to a tree type.
> 
> Note this patch really fixes my bug (PR 32964) if PRE is extended to handle
> this too.
> 

Yes I rewrote the patch to always use the first member of a union that match a
specific size. When the expression was extracted (queried from VN function) a
VIEW_CONVERT_EXPR would be insterted if the expected type did not match the
stored expression type. This caused some problems (assertions) in the compiler
that I have not been able to track down yet, although it fixed both my test
cases and for constants it doesn't cause any problems since the expression gets
folded. To avoid introducing VIEW_CONVERT_EXPR it may be possible to return an
expression (from VN) that refers to a union member with a matching type
instead. It would mean that the value number would be the same, but the
returned expression might be different. 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (7 preceding siblings ...)
  2007-12-13  1:57 ` jsjodin at gcc dot gnu dot org
@ 2008-01-03 17:45 ` rguenth at gcc dot gnu dot org
  2008-01-04 15:54 ` rguenth at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-03 17:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from rguenth at gcc dot gnu dot org  2008-01-03 17:29 -------
Confirmed.  While I have a patch to make VN handle the case of type-punning
constants, making it to generate VIEW_CONVERT_EXPRs instead runs into the
principle barrier that FRE does not do insertion.

The patch looks like

Index: tree-ssa-sccvn.c
===================================================================
*** tree-ssa-sccvn.c    (revision 131302)
--- tree-ssa-sccvn.c    (working copy)
*************** visit_reference_op_store (tree lhs, tree
*** 1231,1236 ****
--- 1247,1292 ----
        }

        vn_reference_insert (lhs, op, vdefs);
+
+       /* For unions and constant stores we can register stores to
+        the other union members as well, allowing propagation.
+        Do this for two-element unions only, as they are likely
+        used for type-punning which we want to optimize.  */
+       if (TREE_CODE (lhs) == COMPONENT_REF
+         && TREE_CODE (TREE_TYPE (TREE_OPERAND (lhs, 0))) == UNION_TYPE
+         /* We may be able to handle non-constant type-punnings as well.  */
+         && (TREE_CODE (op) == INTEGER_CST
+             || TREE_CODE (op) == REAL_CST
+             || TREE_CODE (op) == VECTOR_CST
+             || TREE_CODE (op) == COMPLEX_CST)
+         && fields_length (TREE_TYPE (TREE_OPERAND (lhs, 0))) == 2)
+       {
+         tree field = TYPE_FIELDS (TREE_TYPE (TREE_OPERAND (lhs, 0)));
+         tree val;
+
+         if (field == TREE_OPERAND (lhs, 1))
+           field = TREE_CHAIN (field);
+
+         /* Do not generate VIEW_CONVERT_EXPRs, but only handle the
+            cases we can fold it away.  */
+         val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (field), op);
+         if (val)
+           {
+             if (dump_file)
+               {
+                 fprintf (dump_file, "Value numbering store to alternate "
+                          "union field ");
+                 print_generic_expr (dump_file, field, 0);
+                 fprintf (dump_file, " with type-punned value ");
+                 print_generic_expr (dump_file, val, 0);
+                 fprintf (dump_file, "\n");
+               }
+             vn_reference_insert (build3 (COMPONENT_REF, TREE_TYPE (field),
+                                          TREE_OPERAND (lhs, 0), field,
+                                          NULL_TREE),
+                                  val, vdefs);
+           }
+       }
      }
    else
      {

and it handles the testcase in comment #6 correctly.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-01-03 17:29:45
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (8 preceding siblings ...)
  2008-01-03 17:45 ` rguenth at gcc dot gnu dot org
@ 2008-01-04 15:54 ` rguenth at gcc dot gnu dot org
  2008-01-04 15:58 ` rguenth at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-04 15:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from rguenth at gcc dot gnu dot org  2008-01-04 15:35 -------
We can 'abuse' PREs inserting of fake stores machinery to handle the case in
PR33989, where for

union a
{
  int i;
  float f;
};

void f (float *a, int *b)
{
  union a c;
  c.f = *a;
  *b = c.i;
}

we then get (PRE)

<bb 2>:
  # VUSE <SMT.6_5(D)>
  D.1549_2 = *a_1(D);
  # SFT.0_8 = VDEF <SFT.0_6(D)>
  # SFT.1_9 = VDEF <SFT.1_7(D)>
  c.f = D.1549_2;
  uniontmp.12_13 = VIEW_CONVERT_EXPR<int>(D.1549_2);
  D.1550_3 = uniontmp.12_13;
  # SMT.7_11 = VDEF <SMT.7_10(D)>
  *b_4(D) = D.1550_3;
  return;

and before expand:

<bb 2>:
  # SMT.7_11 = VDEF <SMT.7_10(D)>
  *b = VIEW_CONVERT_EXPR<int>(*a);
  return;

and finally

f:
.LFB2:
        movl    (%rdi), %eax
        movl    %eax, (%rsi)
        ret

but we need to make sure VN processes the inserted conversions (which do
not have uses) and we need to enhance PREs poolification of trees to
handle COMPONENT_REFs (and possibly more).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (9 preceding siblings ...)
  2008-01-04 15:54 ` rguenth at gcc dot gnu dot org
@ 2008-01-04 15:58 ` rguenth at gcc dot gnu dot org
  2008-02-28 16:41 ` rguenth at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-04 15:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from rguenth at gcc dot gnu dot org  2008-01-04 15:38 -------
Created an attachment (id=14879)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14879&action=view)
patch prototype

So this patch certainly will ICE in a lot of cases, I wonder if poolifying the
inserted code is worth the hassle.  It also shows that PRE of loads/stores is
probably defect for all but indirect references.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (10 preceding siblings ...)
  2008-01-04 15:58 ` rguenth at gcc dot gnu dot org
@ 2008-02-28 16:41 ` rguenth at gcc dot gnu dot org
  2008-03-14 14:53 ` rguenth at gcc dot gnu dot org
  2008-03-14 16:17 ` rguenth at gcc dot gnu dot org
  13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-02-28 16:41 UTC (permalink / raw)
  To: gcc-bugs



-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rguenth at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2008-01-03 17:29:45         |2008-02-28 16:40:35
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (11 preceding siblings ...)
  2008-02-28 16:41 ` rguenth at gcc dot gnu dot org
@ 2008-03-14 14:53 ` rguenth at gcc dot gnu dot org
  2008-03-14 16:17 ` rguenth at gcc dot gnu dot org
  13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-03-14 14:53 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from rguenth at gcc dot gnu dot org  2008-03-14 14:52 -------
Subject: Bug 34043

Author: rguenth
Date: Fri Mar 14 14:52:07 2008
New Revision: 133218

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=133218
Log:
2008-03-14  Richard Guenther  <rguenther@suse.de>

        PR tree-optimization/34043
        PR tree-optimization/33989
        * tree-ssa-pre.c (execute_pre): Allow SCCVN to do insertion
        when doing FRE.
        (bitmap_find_leader): Use extra argument to verify dominance
        relationship inside a basic-block.
        (can_PRE_operation): Add VIEW_CONVERT_EXPR.
        (find_leader_in_sets): Adjust.
        (create_component_ref_by_pieces): Take extra argument for
        dominance check, handle lookup failures.
        (find_or_generate_expression): Likewise.
        (create_expression_by_pieces): Likewise.
        (insert_into_preds_of_block): Adjust.
        (create_value_expr_from): If asked for, verify all operands
        are in the blocks AVAIL_OUT set.
        (make_values_for_stmt): Check for SSA_NAMEs that are life
        over an abnormal edge.
        (compute_avail): Remove such check.
        (do_SCCVN_insertion): New function.
        (eliminate): If we do not find a leader suitable for replacement
        insert a replacement expression from SCCVN if available.
        * tree-ssa-sccvn.h (run_scc_vn): Update prototype.
        (struct vn_ssa_aux): Add needs_insertion flag.
        * tree-ssa-sccvn.c (may_insert): New global flag.
        (copy_reference_ops_from_ref): Value-number union member access
        based on its size, not type and member if insertion is allowed.
        (visit_reference_op_load): For a weak match from union type
        punning lookup a view-converted value and insert a SSA_NAME
        for that value if that is not found.
        (visit_use): Make dumps shorter.  Do not disallow value numbering
        SSA_NAMEs that are life over an abnormal edge to constants.
        (free_scc_vn): Release inserted SSA_NAMEs.
        (run_scc_vn): New flag to specify whether insertion is allowed.
        Process SSA_NAMEs in forward order.
        * tree-ssa-loop-im.c (for_each_index): Handle invariant
        ADDR_EXPRs inside VIEW_CONVERT_EXPR.
        * fold-const.c (fold_unary): Fold VIEW_CONVERT_EXPRs from/to
        pointer type to/from integral types that do not change the
        precision to regular conversions.

        * gcc.dg/tree-ssa/ssa-fre-7.c: New testcase.
        * gcc.dg/tree-ssa/ssa-fre-8.c: Likewise.
        * gcc.dg/tree-ssa/ssa-fre-9.c: Likewise.
        * gcc.dg/tree-ssa/ssa-fre-10.c: Likewise.
        * gcc.dg/tree-ssa/ssa-pre-17.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-10.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-7.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-8.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-9.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-17.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-loop-im.c
    trunk/gcc/tree-ssa-pre.c
    trunk/gcc/tree-ssa-sccvn.c
    trunk/gcc/tree-ssa-sccvn.h


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
  2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
                   ` (12 preceding siblings ...)
  2008-03-14 14:53 ` rguenth at gcc dot gnu dot org
@ 2008-03-14 16:17 ` rguenth at gcc dot gnu dot org
  13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-03-14 16:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from rguenth at gcc dot gnu dot org  2008-03-14 16:16 -------
Fixed.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.4.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2008-03-14 16:17 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-11-09 16:14 [Bug tree-optimization/34043] New: Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types jsjodin at gcc dot gnu dot org
2007-11-09 16:16 ` [Bug tree-optimization/34043] " jsjodin at gcc dot gnu dot org
2007-11-09 16:16 ` jsjodin at gcc dot gnu dot org
2007-11-09 16:17 ` jsjodin at gcc dot gnu dot org
2007-11-09 18:24 ` pinskia at gcc dot gnu dot org
2007-11-13 17:07 ` jsjodin at gcc dot gnu dot org
2007-12-03 15:42 ` jsjodin at gcc dot gnu dot org
2007-12-09 22:34 ` pinskia at gcc dot gnu dot org
2007-12-13  1:57 ` jsjodin at gcc dot gnu dot org
2008-01-03 17:45 ` rguenth at gcc dot gnu dot org
2008-01-04 15:54 ` rguenth at gcc dot gnu dot org
2008-01-04 15:58 ` rguenth at gcc dot gnu dot org
2008-02-28 16:41 ` rguenth at gcc dot gnu dot org
2008-03-14 14:53 ` rguenth at gcc dot gnu dot org
2008-03-14 16:17 ` rguenth at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).