public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/47258] New: Extra instruction generated in 4.5.2
@ 2011-01-11 13:48 bmei at broadcom dot com
  2011-01-11 13:49 ` [Bug rtl-optimization/47258] " bmei at broadcom dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: bmei at broadcom dot com @ 2011-01-11 13:48 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258

           Summary: Extra instruction generated in 4.5.2
           Product: gcc
           Version: 4.5.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: bmei@broadcom.com


I have encountered a performance regression in 4.5.2 (and in 4.6 as well)
compared with 4.5.1.

The code is from CoreMark.

Compile the attached .i file.

~/work/install-x86-452/bin/gcc core_matrix.i -O2 -S -o x86-452.s
...
.L5:
    movl    %r8d, %r10d
.L3:
    mov    %r9d, %r8d
    movswl    (%rcx,%rax), %r11d
    addq    $2, %rax
    movswl    (%rdx,%r8,2), %r8d
    addl    $1, %r9d
    imull    %r11d, %r8d
    addl    %r10d, %r8d
    cmpq    %rbx, %rax
    jne    .L5
...

~/work/install-x86-451/bin/gcc core_matrix.i -O2 -S -o x86-451.s
...
.L3:
    mov    %r9d, %r8d
    movswl    (%rcx,%rax), %r11d
    addq    $2, %rax
    movswl    (%rdx,%r8,2), %r8d
    addl    $1, %r9d
    imull    %r11d, %r8d
    addl    %r8d, %r10d
    cmpq    %rbx, %rax
    jne    .L3
...

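4.5.2 adds a "movl %r8d, %r10d" copy at .L5 on the loop back edge, where 4.5.1
accumulates directly into %r10d. The performance hit is even worse on our
architecture, because the zero-overhead loop instruction cannot be used for the
irregular loop that 4.5.2 produces.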
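
For readers without the attachment, the inner loop in both listings has roughly the shape of the sketch below. This is a hypothetical reconstruction from the assembly (two sign-extended 16-bit loads, a 32-bit multiply, a 32-bit accumulate), not the CoreMark source; the names mat_dat, mat_res and dot_row are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical typedefs, standing in for whatever the benchmark uses. */
typedef int16_t mat_dat;   /* 16-bit matrix element */
typedef int32_t mat_res;   /* 32-bit accumulated result */

/* Multiply-accumulate one row of a (row-major, n columns) against vec.
   a is indexed with a 32-bit unsigned index and vec is walked
   sequentially, which matches the two addressing modes in the listings. */
static mat_res dot_row(const mat_dat *a, const mat_dat *vec,
                       uint32_t row, uint32_t n)
{
    mat_res acc = 0;   /* accumulator of a typedef'd type */
    for (uint32_t j = 0; j < n; j++)
        acc += (mat_res)a[row * n + j] * (mat_res)vec[j];
    return acc;
}
```

In the 4.5.1 listing the accumulator stays in %r10d across iterations. In the 4.5.2 listing the sum is formed in %r8d and copied back to %r10d at .L5.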

The configuration used is:
../gcc-4.5.1/configure
--prefix=/projects/firepath/tools/work/bmei/install-x86-451
--with-mpfr=/projects/firepath/tools/work/bmei/packages/mpfr/2.4.1/x86-64
--with-gmp=/projects/firepath/tools/work/bmei/packages/gmp/4.3.0/x86-64
--with-mpc=/projects/firepath/tools/work/bmei/packages/mpc/0.8.1/x86-64
--with-elf=/projects/firepath/tools/work/bmei/packages/libelf/x86-64
--disable-bootstrap --enable-languages=c --no-create --no-recursion


The difference between 4.5.1 and 4.5.2 seems to occur in the RTL expand pass.



* [Bug rtl-optimization/47258] Extra instruction generated in 4.5.2
  2011-01-11 13:48 [Bug rtl-optimization/47258] New: Extra instruction generated in 4.5.2 bmei at broadcom dot com
@ 2011-01-11 13:49 ` bmei at broadcom dot com
  2011-01-11 16:32 ` bmei at broadcom dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: bmei at broadcom dot com @ 2011-01-11 13:49 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258

--- Comment #1 from Bingfeng Mei <bmei at broadcom dot com> 2011-01-11 13:38:13 UTC ---
Created attachment 22944
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22944
Preprocessed test case



* [Bug rtl-optimization/47258] Extra instruction generated in 4.5.2
  2011-01-11 13:48 [Bug rtl-optimization/47258] New: Extra instruction generated in 4.5.2 bmei at broadcom dot com
  2011-01-11 13:49 ` [Bug rtl-optimization/47258] " bmei at broadcom dot com
@ 2011-01-11 16:32 ` bmei at broadcom dot com
  2011-01-11 16:36 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: bmei at broadcom dot com @ 2011-01-11 16:32 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258

--- Comment #2 from Bingfeng Mei <bmei at broadcom dot com> 2011-01-11 16:16:28 UTC ---
After trying the patches one by one, I believe the misoptimization is down to
the following patch.

Index: tree-ssa-copyrename.c
===================================================================
RCS file: /cvs/dev/tools/src/fp_gcc/gcc/tree-ssa-copyrename.c,v
retrieving revision 1.1.2.5.2.1
retrieving revision 1.1.2.5.2.2
diff -u -r1.1.2.5.2.1 -r1.1.2.5.2.2
--- tree-ssa-copyrename.c    12 Apr 2010 13:15:43 -0000    1.1.2.5.2.1
+++ tree-ssa-copyrename.c    13 Dec 2010 05:51:45 -0000    1.1.2.5.2.2
@@ -225,11 +225,11 @@
       ign2 = false;
     }

-  /* Don't coalesce if the two variables aren't type compatible.  */
-  if (!types_compatible_p (TREE_TYPE (root1), TREE_TYPE (root2)))
+  /* Don't coalesce if the two variables are not of the same type.  */
+  if (TREE_TYPE (root1) != TREE_TYPE (root2))
     {
       if (debug)
-    fprintf (debug, " : Incompatible types.  No coalesce.\n");
+    fprintf (debug, " : Different types.  No coalesce.\n");
       return false;
     }



* [Bug rtl-optimization/47258] Extra instruction generated in 4.5.2
  2011-01-11 13:48 [Bug rtl-optimization/47258] New: Extra instruction generated in 4.5.2 bmei at broadcom dot com
  2011-01-11 13:49 ` [Bug rtl-optimization/47258] " bmei at broadcom dot com
  2011-01-11 16:32 ` bmei at broadcom dot com
@ 2011-01-11 16:36 ` rguenth at gcc dot gnu.org
  2011-01-11 16:42 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-01-11 16:36 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258

--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-01-11 16:34:40 UTC ---
(In reply to comment #2)
> After trying the patches one by one, I believe the misoptimization is down to
> the following patch.

Which is a correctness patch.  You can try dumbing it down somewhat with

if (TYPE_MAIN_VARIANT (TREE_TYPE (root1))
    != TYPE_MAIN_VARIANT (TREE_TYPE (root2))
    || !types_compatible_p (TREE_TYPE (root1), TREE_TYPE (root2)))

and see if that helps.
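
To make the three candidate checks concrete, here is a toy model in plain C. It is only a sketch under loud assumptions: struct toy_type and every toy_ and refuse_ name are hypothetical and are not GCC's tree representation, and "compatible" is reduced to "same width and signedness". It assumes a typedef produces a distinct type node that shares its main variant with the underlying type.

```c
#include <stdbool.h>

/* Toy stand-in for a type node; NOT GCC's tree structure. */
struct toy_type {
    const char *name;
    const struct toy_type *main_variant;  /* itself for a base type */
    unsigned bits;
    bool is_unsigned;
};

/* int, a typedef of int, and an unrelated 32-bit signed type. */
static const struct toy_type toy_int   = { "int",   &toy_int,   32, false };
static const struct toy_type toy_res   = { "res_t", &toy_int,   32, false };
static const struct toy_type toy_other = { "other", &toy_other, 32, false };

/* Crude model of compatibility: same width, same signedness. */
static bool toy_compatible(const struct toy_type *a, const struct toy_type *b)
{
    return a->bits == b->bits && a->is_unsigned == b->is_unsigned;
}

/* Each function returns true when coalescing would be refused. */

/* Check before the patch: refuse only incompatible types. */
static bool refuse_compat_only(const struct toy_type *a,
                               const struct toy_type *b)
{
    return !toy_compatible(a, b);
}

/* Check after the patch: refuse unless both are the very same node. */
static bool refuse_exact(const struct toy_type *a, const struct toy_type *b)
{
    return a != b;
}

/* Suggested middle ground: same main variant and compatible. */
static bool refuse_main_variant(const struct toy_type *a,
                                const struct toy_type *b)
{
    return a->main_variant != b->main_variant || !toy_compatible(a, b);
}
```

In this model refuse_exact rejects the pair (toy_int, toy_res), while refuse_main_variant accepts it and still rejects (toy_int, toy_other), which refuse_compat_only would let through.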




* [Bug rtl-optimization/47258] Extra instruction generated in 4.5.2
  2011-01-11 13:48 [Bug rtl-optimization/47258] New: Extra instruction generated in 4.5.2 bmei at broadcom dot com
                   ` (2 preceding siblings ...)
  2011-01-11 16:36 ` rguenth at gcc dot gnu.org
@ 2011-01-11 16:42 ` rguenth at gcc dot gnu.org
  2011-01-13 16:33 ` bmei at broadcom dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-01-11 16:42 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258

--- Comment #4 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-01-11 16:35:23 UTC ---
But we'll create bogus debug info for the typedef type decls then.



* [Bug rtl-optimization/47258] Extra instruction generated in 4.5.2
  2011-01-11 13:48 [Bug rtl-optimization/47258] New: Extra instruction generated in 4.5.2 bmei at broadcom dot com
                   ` (3 preceding siblings ...)
  2011-01-11 16:42 ` rguenth at gcc dot gnu.org
@ 2011-01-13 16:33 ` bmei at broadcom dot com
  2011-12-15  2:09 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: bmei at broadcom dot com @ 2011-01-13 16:33 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258

--- Comment #5 from Bingfeng Mei <bmei at broadcom dot com> 2011-01-13 15:49:23 UTC ---
It works, but I do not know about the debug info issue raised in your other
comment.

> (In reply to comment #2)
> > After trying the patches one by one, I believe the misoptimization is down to
> > the following patch.
> 
> Which is a correctness patch.  You can try dumbing it down somewhat with
> 
> if (TYPE_MAIN_VARIANT (TREE_TYPE (root1))
>     != TYPE_MAIN_VARIANT (TREE_TYPE (root2))
>     || !types_compatible_p (TREE_TYPE (root1), TREE_TYPE (root2)))
> 
> and see if that helps.



* [Bug rtl-optimization/47258] Extra instruction generated in 4.5.2
  2011-01-11 13:48 [Bug rtl-optimization/47258] New: Extra instruction generated in 4.5.2 bmei at broadcom dot com
                   ` (4 preceding siblings ...)
  2011-01-13 16:33 ` bmei at broadcom dot com
@ 2011-12-15  2:09 ` pinskia at gcc dot gnu.org
  2011-12-15 10:21 ` bmei at broadcom dot com
  2012-02-02  8:39 ` [Bug tree-optimization/47258] [4.5/4.6/4.7 Regression] " pinskia at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2011-12-15  2:09 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-12-15 02:03:43 UTC ---
Can you try this patch:
Index: tree-outof-ssa.c
===================================================================
--- tree-outof-ssa.c    (revision 67191)
+++ tree-outof-ssa.c    (revision 67192)
@@ -1021,6 +1021,9 @@ insert_backedge_copies (void)
   basic_block bb;
   gimple_stmt_iterator gsi;

+  /* Make sure that edges have updated to be marked for back edges. */
+  mark_dfs_back_edges ();
+
   FOR_EACH_BB (bb)
     {
       /* Mark block as possibly needing calculation of UIDs.  */
--- CUT ---
I did not create this patch; it came from
http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01865.html

That means this is most likely already fixed on the trunk.
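
For readers unfamiliar with the idea behind the added call, the sketch below shows what marking DFS back edges means on a toy graph. It is an illustration only: struct toy_cfg, toy_visit and toy_mark_back_edges are hypothetical names, and this is not GCC's mark_dfs_back_edges or its CFG data structures. The assumption is that the pass consults a per-edge back-edge flag, so a stale flag has to be recomputed before it is read.

```c
#include <stdbool.h>

#define TOY_MAX_NODES 8
#define TOY_MAX_EDGES 16

/* Toy CFG edge with a flag playing the role of a back-edge marker. */
struct toy_edge {
    int src, dest;
    bool dfs_back;
};

struct toy_cfg {
    int n_nodes, n_edges;
    struct toy_edge edges[TOY_MAX_EDGES];
};

enum toy_color { TOY_UNSEEN = 0, TOY_ON_STACK, TOY_DONE };

/* Depth-first walk: an edge whose destination is still on the DFS
   stack closes a cycle, so it is marked as a back edge. */
static void toy_visit(struct toy_cfg *g, int node, enum toy_color *color)
{
    color[node] = TOY_ON_STACK;
    for (int i = 0; i < g->n_edges; i++) {
        struct toy_edge *e = &g->edges[i];
        if (e->src != node)
            continue;
        if (color[e->dest] == TOY_ON_STACK)
            e->dfs_back = true;
        else if (color[e->dest] == TOY_UNSEEN)
            toy_visit(g, e->dest, color);
    }
    color[node] = TOY_DONE;
}

/* Clear any stale flags, recompute them from entry, and return the
   number of back edges found. */
static int toy_mark_back_edges(struct toy_cfg *g, int entry)
{
    enum toy_color color[TOY_MAX_NODES] = { TOY_UNSEEN };
    int found = 0;

    for (int i = 0; i < g->n_edges; i++)
        g->edges[i].dfs_back = false;
    toy_visit(g, entry, color);
    for (int i = 0; i < g->n_edges; i++)
        if (g->edges[i].dfs_back)
            found++;
    return found;
}
```

For a graph with edges 0->1, 1->2, 2->1 and 2->3, only 2->1 ends up marked, whatever the flags held beforehand.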



* [Bug rtl-optimization/47258] Extra instruction generated in 4.5.2
  2011-01-11 13:48 [Bug rtl-optimization/47258] New: Extra instruction generated in 4.5.2 bmei at broadcom dot com
                   ` (5 preceding siblings ...)
  2011-12-15  2:09 ` pinskia at gcc dot gnu.org
@ 2011-12-15 10:21 ` bmei at broadcom dot com
  2012-02-02  8:39 ` [Bug tree-optimization/47258] [4.5/4.6/4.7 Regression] " pinskia at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: bmei at broadcom dot com @ 2011-12-15 10:21 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258

--- Comment #7 from Bingfeng Mei <bmei at broadcom dot com> 2011-12-15 10:18:06 UTC ---
Yes, the patch fixes the bug. Thanks.



* [Bug tree-optimization/47258] [4.5/4.6/4.7 Regression] Extra instruction generated in 4.5.2
  2011-01-11 13:48 [Bug rtl-optimization/47258] New: Extra instruction generated in 4.5.2 bmei at broadcom dot com
                   ` (6 preceding siblings ...)
  2011-12-15 10:21 ` bmei at broadcom dot com
@ 2012-02-02  8:39 ` pinskia at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-02-02  8:39 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
      Known to work|                            |4.4.0
                URL|                            |http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01865.html
           Keywords|                            |missed-optimization
          Component|rtl-optimization            |tree-optimization
         Resolution|                            |FIXED
            Summary|Extra instruction generated |[4.5/4.6/4.7 Regression]
                   |in 4.5.2                    |Extra instruction generated
                   |                            |in 4.5.2
   Target Milestone|---                         |4.7.0
      Known to fail|                            |4.5.2, 4.6.0, 4.7.0

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-02-02 08:38:55 UTC ---
Fixed for 4.7.0 by:
------------------------------------------------------------------------
r181476 | wschmidt | 2011-11-18 06:15:38 -0800 (Fri, 18 Nov 2011) | 6 lines

2011-11-18  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

        * tree-outof-ssa.c (insert_backedge_copies):  Add call to
        mark_dfs_back_edges.

