public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/42027]  New: Performance regression in convolution loop optimization
@ 2009-11-13  9:49 nbenoit at tuxfamily dot org
  2009-11-13  9:52 ` [Bug tree-optimization/42027] " nbenoit at tuxfamily dot org
                   ` (20 more replies)
  0 siblings, 21 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-11-13  9:49 UTC (permalink / raw)
  To: gcc-bugs

GCC trunk rev. 154141 appears to compile a convolution loop less efficiently
than previous stable releases; the slowdown was also observed with revision
153048.

Here are some average timings on an Intel E5320 clocked at 1.86 GHz with 4 MB
of L2 cache, Debian GNU/Linux with a 2.6.26 kernel.

* with -O2 -march=native
GCC 4.3.2               8239 ms
GCC-4.4.2               8102 ms
GCC-snapshot-20091105   9347 ms
GCC-trunk-r154141       9343 ms

* with -O2
GCC 4.3.2               8128 ms
GCC-4.4.2               8158 ms
GCC-snapshot-20091105   9824 ms
GCC-trunk-r154141       9828 ms

* with -O1
GCC 4.3.2              20926 ms
GCC-4.4.2               8277 ms
GCC-snapshot-20091105   9369 ms
GCC-trunk-r154141       9375 ms

* with -O0
GCC 4.3.2              34061 ms
GCC-4.4.2              34241 ms
GCC-snapshot-20091105  34903 ms
GCC-trunk-r154141      34910 ms


GCC was configured with: configure --prefix=/export/home/nicolas/gcc/trunk-install
--enable-languages=c --disable-multilib --disable-bootstrap
--enable-checking=release

I haven't been able to track down the origin of the performance difference.


Note that the data are not initialized in the attached code, as the slowdown is
observed whether they are or not.

---BEGIN code---
#define N  1024*512
#define M  512
#define ITER 16

double in[N];
double H[M];
double vH[N];

int main ( int argc,
           char **argv )
{
  int i, j, k;

  for ( i=0; i<ITER; ++i )
    for ( j=0; j<N; ++j )
      for ( k=0; (k<M)&&(k<=j); ++k )
        vH[j] += H[k]*in[j-k];

  return (int) vH[argc];
}
---END code---
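
The report does not say how the timings above were taken. As a minimal sketch,
assuming CPU time measured with the standard clock() function is good enough,
the same loop nest can be wrapped as follows. The function names convolve and
time_convolve_ms are hypothetical, and the sizes are passed as parameters so
that a small instance can be checked quickly.

```c
#include <time.h>

/* Same loop nest as the test case; n, m and iter replace the N, M and ITER
   macros.  The names are illustrative and not part of the original report. */
static void convolve(double *out, const double *in, const double *h,
                     int n, int m, int iter)
{
  int i, j, k;

  for (i = 0; i < iter; ++i)
    for (j = 0; j < n; ++j)
      for (k = 0; (k < m) && (k <= j); ++k)
        out[j] += h[k] * in[j - k];
}

/* CPU time spent in one call to convolve(), in milliseconds. */
static double time_convolve_ms(double *out, const double *in, const double *h,
                               int n, int m, int iter)
{
  clock_t start, end;

  start = clock();
  convolve(out, in, h, n, m, iter);
  end = clock();

  return 1000.0 * (double) (end - start) / CLOCKS_PER_SEC;
}
```

Calling time_convolve_ms with n = 1024*512, m = 512 and iter = 16 would
correspond to the sizes used in the test case; averaging several runs helps to
smooth out noise.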


-- 
           Summary: Performance regression in convolution loop optimization
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: nbenoit at tuxfamily dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
@ 2009-11-13  9:52 ` nbenoit at tuxfamily dot org
  2009-11-13 13:49 ` [Bug tree-optimization/42027] [4.5 Regression] " rguenth at gcc dot gnu dot org
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-11-13  9:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from nbenoit at tuxfamily dot org  2009-11-13 09:51 -------
Created an attachment (id=19010)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19010&action=view)
Source file with a convolution loop pattern.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
  2009-11-13  9:52 ` [Bug tree-optimization/42027] " nbenoit at tuxfamily dot org
@ 2009-11-13 13:49 ` rguenth at gcc dot gnu dot org
  2009-11-26 15:09 ` nbenoit at tuxfamily dot org
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-11-13 13:49 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from rguenth at gcc dot gnu dot org  2009-11-13 13:49 -------
It looks like it is induction variable related.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
            Summary|Performance regression in   |[4.5 Regression] Performance
                   |convolution loop            |regression in convolution
                   |optimization                |loop optimization
   Target Milestone|---                         |4.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
  2009-11-13  9:52 ` [Bug tree-optimization/42027] " nbenoit at tuxfamily dot org
  2009-11-13 13:49 ` [Bug tree-optimization/42027] [4.5 Regression] " rguenth at gcc dot gnu dot org
@ 2009-11-26 15:09 ` nbenoit at tuxfamily dot org
  2009-11-27 11:18 ` rguenth at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-11-26 15:09 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from nbenoit at tuxfamily dot org  2009-11-26 15:08 -------
With integers instead of doubles, the performance difference is even more
noticeable:

* with -O1
GCC 4.4.2               7475 ms
GCC-trunk-r154672       9390 ms


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (2 preceding siblings ...)
  2009-11-26 15:09 ` nbenoit at tuxfamily dot org
@ 2009-11-27 11:18 ` rguenth at gcc dot gnu dot org
  2009-12-01 10:11 ` nbenoit at tuxfamily dot org
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-11-27 11:18 UTC (permalink / raw)
  To: gcc-bugs



-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
           Priority|P3                          |P2
   Last reconfirmed|0000-00-00 00:00:00         |2009-11-27 11:18:44
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (3 preceding siblings ...)
  2009-11-27 11:18 ` rguenth at gcc dot gnu dot org
@ 2009-12-01 10:11 ` nbenoit at tuxfamily dot org
  2009-12-01 11:24 ` matz at gcc dot gnu dot org
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-12-01 10:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from nbenoit at tuxfamily dot org  2009-12-01 10:11 -------
It seems that this regression first appeared with revision 151080

* with -O1
GCC-4.4.2          7.4 s
GCC-trunk-r151078  7.4 s
GCC-trunk-r151079  7.4 s
GCC-trunk-r151080  9.4 s
GCC-trunk-r151081  9.4 s
GCC-trunk-r151082  9.4 s
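
To put a number on the jump between r151079 and r151080, here is a small
sketch of the relative slowdown computation. The helper name slowdown_percent
is hypothetical. With the rounded figures above (7.4 s and 9.4 s) it gives
roughly 27%.

```c
/* Relative slowdown of `candidate` with respect to `baseline`, in percent.
   Both arguments are run times in the same unit. */
static double slowdown_percent(double baseline, double candidate)
{
  return 100.0 * (candidate - baseline) / baseline;
}
```

For example, slowdown_percent(7.4, 9.4) corresponds to the r151079 and r151080
timings.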

Changelog for revision 151080
http://gcc.gnu.org/ml/gcc-patches/2009-08/msg01336.html

2009-08-25  Michael Matz  <matz@suse.de>

        * expr.h (jumpifnot_1, jumpif_1, do_jump_1): Declare.
        * dojump.c (do_jump_by_parts_greater): Take two operands instead of
        full expression.
        (do_jump_by_parts_equality, do_compare_and_jump): Ditto.
        (jumpifnot_1, jumpif_1): New wrappers for do_jump_1.
        (do_jump): Split out code for simple binary comparisons into ...
        (do_jump_1): ... this, taking the individual operands and code.
        Change callers to helper function above accordingly.
        * expr.c (expand_expr_real_1): Use jumpifnot_1 for simple binary
        comparisons.


-- 

nbenoit at tuxfamily dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (4 preceding siblings ...)
  2009-12-01 10:11 ` nbenoit at tuxfamily dot org
@ 2009-12-01 11:24 ` matz at gcc dot gnu dot org
  2009-12-13 21:51 ` matz at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-01 11:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from matz at gcc dot gnu dot org  2009-12-01 11:24 -------
Hmpf, something fishy is going on, as this patch should have been only
refactoring without influence on the generated code.  I'll look at it.


-- 

matz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |matz at gcc dot gnu dot org
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2009-11-27 11:18:44         |2009-12-01 11:24:11
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (5 preceding siblings ...)
  2009-12-01 11:24 ` matz at gcc dot gnu dot org
@ 2009-12-13 21:51 ` matz at gcc dot gnu dot org
  2009-12-13 21:53 ` matz at gcc dot gnu dot org
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-13 21:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from matz at gcc dot gnu dot org  2009-12-13 21:51 -------
Subject: Bug 42027

Author: matz
Date: Sun Dec 13 21:51:34 2009
New Revision: 155196

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155196
Log:
        PR tree-optimization/42027
        * dojump.c (do_jump <TRUTH_AND_EXPR, TRUTH_OR_EXPR>): Go to
        TRUTH_ANDIF_EXPR resp. TRUTH_ORIF_EXPR expander, instead of
        falling through.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/dojump.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (6 preceding siblings ...)
  2009-12-13 21:51 ` matz at gcc dot gnu dot org
@ 2009-12-13 21:53 ` matz at gcc dot gnu dot org
  2009-12-16 10:35 ` nbenoit at tuxfamily dot org
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-13 21:53 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from matz at gcc dot gnu dot org  2009-12-13 21:53 -------
Fixed.


-- 

matz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (7 preceding siblings ...)
  2009-12-13 21:53 ` matz at gcc dot gnu dot org
@ 2009-12-16 10:35 ` nbenoit at tuxfamily dot org
  2009-12-16 11:06 ` nbenoit at tuxfamily dot org
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-12-16 10:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from nbenoit at tuxfamily dot org  2009-12-16 10:34 -------
I am confused: a performance regression is still noticeable.

* Intel Xeon E5320 (x86_64 arch but gcc machine is i686-pc-linux-gnu), with -O1
flag
GCC-4.4.2          7364 ms
GCC-trunk-r155286  9515 ms

* Intel Xeon 5160 (x86_64 arch and gcc machine is x86_64-linux-gnu), with -O1
flag
GCC-4.4.1          5960 ms
GCC-trunk-r155286  7355 ms


Here is a diff on the assembly generated for the Intel E5320:

$ diff 442/convol.s r155286/convol.s
11c11
<       subl    $8, %esp
---
> 	subl	$12, %esp
13d12
<       movl    $H, %esi
17c16
<       imull   (%esi,%eax,4), %ebx
---
> 	imull	H(,%eax,4), %ebx
22c21
<       jg      .L10
---
> 	setle	%bl
24,25c23,25
<       jle     .L3
< .L10:
---
> 	setle	-21(%ebp)
> 	testb	%bl, -21(%ebp)
> 	jne	.L3
28c28
< .L6:
---
> .L5:
31,32c31,32
<       je      .L5
< .L8:
---
> 	je	.L4
> .L7:
34c34
<       js      .L6
---
> 	js	.L5
40c40
< .L5:
---
> .L4:
43c43
<       je      .L7
---
> 	je	.L6
46,47c46,47
<       jmp     .L8
< .L7:
---
> 	jmp	.L7
> .L6:
50c50
<       addl    $8, %esp
---
> 	addl	$12, %esp
60c60
<       .ident  "GCC: (GNU) 4.4.2"
---
> 	.ident	"GCC: (GNU) 4.5.0 20091216 (experimental)"


-- 

nbenoit at tuxfamily dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (8 preceding siblings ...)
  2009-12-16 10:35 ` nbenoit at tuxfamily dot org
@ 2009-12-16 11:06 ` nbenoit at tuxfamily dot org
  2009-12-16 12:24 ` rguenth at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-12-16 11:06 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from nbenoit at tuxfamily dot org  2009-12-16 11:06 -------
Here is a unified diff which focuses on the inner-loop exit conditions.

--- 442/convol.s
+++ r155286/convol.s

 .L3:
        movl    (%edx), %ebx
-       imull   (%esi,%eax,4), %ebx
+       imull   H(,%eax,4), %ebx
        addl    %ebx, %ecx
        addl    $1, %eax
        subl    $4, %edx
        cmpl    $511, %eax
-       jg      .L10
+       setle   %bl
        cmpl    %edi, %eax
-       jle     .L3
-.L10:
+       setle   -21(%ebp)
+       testb   %bl, -21(%ebp)
+       jne     .L3
        movl    -16(%ebp), %eax
        movl    %ecx, vH(,%eax,4)
-.L6:
+.L5:


L3 corresponds to the inner loop body.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (9 preceding siblings ...)
  2009-12-16 11:06 ` nbenoit at tuxfamily dot org
@ 2009-12-16 12:24 ` rguenth at gcc dot gnu dot org
  2009-12-16 12:54 ` nbenoit at tuxfamily dot org
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-16 12:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from rguenth at gcc dot gnu dot org  2009-12-16 12:24 -------
Which is the good version, the one with less or the one with more branches?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (10 preceding siblings ...)
  2009-12-16 12:24 ` rguenth at gcc dot gnu dot org
@ 2009-12-16 12:54 ` nbenoit at tuxfamily dot org
  2009-12-16 12:55 ` nbenoit at tuxfamily dot org
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-12-16 12:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from nbenoit at tuxfamily dot org  2009-12-16 12:53 -------
The faster variant is the one with more jumps (442/convol.s in the diff),
generated by GCC-4.4.2.
In the single-jump variant (r155286/convol.s in the diff), I guess it is the
evaluation of both conditions before jumping that slows down the program.
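
As an illustration only, the two shapes of the inner-loop exit test can be
written out at the C level. The function names are hypothetical, and a compiler
is free to emit either instruction sequence for either function, so this sketch
shows the logic rather than guaranteed code generation.

```c
/* "Jumpy" shape: two separate compare-and-branch steps.  The second
   comparison is skipped once k has reached m (compare the jg / jle pair in
   the GCC-4.4.2 assembly). */
static int keep_going_jumpy(int k, int j, int m)
{
  if (k >= m)
    return 0;
  if (k > j)
    return 0;
  return 1;
}

/* "Flag" shape: both comparisons are always evaluated to 0/1 values, which
   are then combined and tested once (compare the setle / setle / testb / jne
   sequence in the r155286 assembly). */
static int keep_going_flags(int k, int j, int m)
{
  int below_m = (k < m);
  int within_j = (k <= j);

  return below_m & within_j;
}
```

Both functions return the same value for every input; only the amount of work
done before the final branch differs.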

Looking between revisions 151079 and 151080 (which is when the regression first
appeared), the RTL expand dump shows a difference regarding branch prediction.
I do not know if it is relevant:

--- r151079/convol.c.135r.expand        2009-12-01 14:10:55.000000000 +0100
+++ r151080/convol.c.135r.expand        2009-12-01 14:11:03.000000000 +0100
@@ -5,7 +5,6 @@
 ;; Generating RTL for gimple basic block 2

 ;; Generating RTL for gimple basic block 3
-Failed to add probability note

 ;; Generating RTL for gimple basic block 4

@@ -20,13 +19,6 @@
 ;; Generating RTL for gimple basic block 9

 ;; Generating RTL for gimple basic block 10
-Purged non-fallthru edges from bb 14
-Predictions for insn 61 bb 3
-  no prediction heuristics: 50.0%
-  combined heuristics: 50.0%
-Predictions for insn 65 bb 13
-  no prediction heuristics: 50.0%
-  combined heuristics: 50.0%

I am about to attach the whole diff file.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (11 preceding siblings ...)
  2009-12-16 12:54 ` nbenoit at tuxfamily dot org
@ 2009-12-16 12:55 ` nbenoit at tuxfamily dot org
  2009-12-16 14:15 ` rguenth at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-12-16 12:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from nbenoit at tuxfamily dot org  2009-12-16 12:55 -------
Created an attachment (id=19321)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19321&action=view)
Diff of the RTL expand dump between revisions 151079 and 151080


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (12 preceding siblings ...)
  2009-12-16 12:55 ` nbenoit at tuxfamily dot org
@ 2009-12-16 14:15 ` rguenth at gcc dot gnu dot org
  2009-12-16 23:43 ` matz at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-16 14:15 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from rguenth at gcc dot gnu dot org  2009-12-16 14:14 -------
It's indeed expand that makes the difference when expanding

;; if (k <= 511 && k <= j != 0)

probably due to the way TER works now.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (13 preceding siblings ...)
  2009-12-16 14:15 ` rguenth at gcc dot gnu dot org
@ 2009-12-16 23:43 ` matz at gcc dot gnu dot org
  2009-12-17  0:27 ` matz at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-16 23:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from matz at gcc dot gnu dot org  2009-12-16 23:43 -------
That's exactly what I fixed with my last patch.  If this still results in a
difference, it's caused by a difference in the cheapness of branches.  I'll
poke at it.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (14 preceding siblings ...)
  2009-12-16 23:43 ` matz at gcc dot gnu dot org
@ 2009-12-17  0:27 ` matz at gcc dot gnu dot org
  2009-12-17  1:42 ` matz at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-17  0:27 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from matz at gcc dot gnu dot org  2009-12-17 00:27 -------
Hmm, no, it's still my fault.  Something must have gone wrong with my brain
when I thought the bug was fixed, no idea how that happened.  The patch wasn't
complete.


-- 

matz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |ASSIGNED
   Last reconfirmed|2009-12-01 11:24:11         |2009-12-17 00:27:36
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (15 preceding siblings ...)
  2009-12-17  0:27 ` matz at gcc dot gnu dot org
@ 2009-12-17  1:42 ` matz at gcc dot gnu dot org
  2009-12-17  9:32 ` nbenoit at tuxfamily dot org
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-17  1:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from matz at gcc dot gnu dot org  2009-12-17 01:42 -------
Created an attachment (id=19332)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19332&action=view)
Real fix

Now, before I blow it again, would you be so kind to test this patch (on top
of some recent trunk, doesn't have to be the newest one, you don't need to
bootstrap) if it fixes the performance problem.  For me it does now, I swear
:-)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (16 preceding siblings ...)
  2009-12-17  1:42 ` matz at gcc dot gnu dot org
@ 2009-12-17  9:32 ` nbenoit at tuxfamily dot org
  2009-12-17  9:34 ` nbenoit at tuxfamily dot org
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-12-17  9:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from nbenoit at tuxfamily dot org  2009-12-17 09:32 -------
(In reply to comment #16)
> Created an attachment (id=19332)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19332&action=view)
> Real fix
> 
> Now, before I blow it again, would you be so kind to test this patch (on top
> of some recent trunk, doesn't have to be the newest one, you don't need to
> bootstrap) if it fixes the performance problem.  For me it does now, I swear
> :-)
> 

Tested with trunk revision 155304, the regression is gone.

* Intel Xeon E5320 (x86_64 arch but gcc machine is i686-pc-linux-gnu), with -O1
flag
GCC-4.4.2          7364 ms
GCC-trunk-r155286  7360 ms

* Intel Xeon 5160 (x86_64 arch and gcc machine is x86_64-linux-gnu), with -O1
flag
GCC-4.4.1          5968 ms
GCC-trunk-r155286  5963 ms


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (17 preceding siblings ...)
  2009-12-17  9:32 ` nbenoit at tuxfamily dot org
@ 2009-12-17  9:34 ` nbenoit at tuxfamily dot org
  2009-12-20  1:16 ` matz at gcc dot gnu dot org
  2009-12-20 13:12 ` rguenth at gcc dot gnu dot org
  20 siblings, 0 replies; 22+ messages in thread
From: nbenoit at tuxfamily dot org @ 2009-12-17  9:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from nbenoit at tuxfamily dot org  2009-12-17 09:34 -------
(In reply to comment #17)
> (In reply to comment #16)
> > Created an attachment (id=19332)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19332&action=view)
> > Real fix
> > 
> > Now, before I blow it again, would you be so kind to test this patch (on top
> > of some recent trunk, doesn't have to be the newest one, you don't need to
> > bootstrap) if it fixes the performance problem.  For me it does now, I swear
> > :-)
> > 
> 
> Tested with trunk revision 155304, the regression is gone.
> 
> * Intel Xeon E5320 (x86_64 arch but gcc machine is i686-pc-linux-gnu), with -O1
> flag
> GCC-4.4.2          7364 ms
> GCC-trunk-r155286  7360 ms
> 
> * Intel Xeon 5160 (x86_64 arch and gcc machine is x86_64-linux-gnu), with -O1
> flag
> GCC-4.4.1          5968 ms
> GCC-trunk-r155286  5963 ms
> 

Oops, I copy-pasted the wrong GCC versions for the timings.
The correct versions are:
* Intel Xeon E5320 (x86_64 arch but gcc machine is i686-pc-linux-gnu), with -O1
flag
GCC-4.4.2                  7364 ms
GCC-trunk-r155304-patched  7360 ms

* Intel Xeon 5160 (x86_64 arch and gcc machine is x86_64-linux-gnu), with -O1
flag
GCC-4.4.1                  5968 ms
GCC-trunk-r155304-patched  5963 ms


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (18 preceding siblings ...)
  2009-12-17  9:34 ` nbenoit at tuxfamily dot org
@ 2009-12-20  1:16 ` matz at gcc dot gnu dot org
  2009-12-20 13:12 ` rguenth at gcc dot gnu dot org
  20 siblings, 0 replies; 22+ messages in thread
From: matz at gcc dot gnu dot org @ 2009-12-20  1:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from matz at gcc dot gnu dot org  2009-12-20 01:16 -------
Subject: Bug 42027

Author: matz
Date: Sun Dec 20 01:15:46 2009
New Revision: 155367

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155367
Log:
        PR tree-optimization/42027
        * cfgexpand.c (expand_gimple_cond): Use jumpy sequence for &, &&, |
        and || if jumps are cheap.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgexpand.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027



* [Bug tree-optimization/42027] [4.5 Regression] Performance regression in convolution loop optimization
  2009-11-13  9:49 [Bug tree-optimization/42027] New: Performance regression in convolution loop optimization nbenoit at tuxfamily dot org
                   ` (19 preceding siblings ...)
  2009-12-20  1:16 ` matz at gcc dot gnu dot org
@ 2009-12-20 13:12 ` rguenth at gcc dot gnu dot org
  20 siblings, 0 replies; 22+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-12-20 13:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from rguenth at gcc dot gnu dot org  2009-12-20 13:11 -------
Fixed.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42027

