* [Bug middle-end/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
2012-05-18 11:45 [Bug tree-optimization/53395] New: [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183 dominiq at lps dot ens.fr
@ 2012-05-18 11:48 ` rguenth at gcc dot gnu.org
2012-05-18 11:54 ` [Bug tree-optimization/53395] " dominiq at lps dot ens.fr
` (10 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-05-18 11:48 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
Richard Guenther <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|tree-optimization |middle-end
Target Milestone|--- |4.8.0
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: dominiq at lps dot ens.fr @ 2012-05-18 11:54 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
--- Comment #1 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2012-05-18 11:24:06 UTC ---
The assembly code for -O3 is almost the same for revisions 187182 and 187183.
However, with '-O3 -ffast-math', revision 187182 gives the following for the loop:
L12:
        movapd %xmm2, %xmm1
L9:
        movsd 8(%rsi), %xmm0
        andpd %xmm3, %xmm0
        comisd %xmm0, %xmm1
        movapd %xmm0, %xmm2
        maxsd %xmm1, %xmm2
        cmovb %edx, %eax
        addl $1, %edx
        addq $8, %rsi
        cmpl %ecx, %edx
        jne L12
while revision 187183 gives
L6:
        movapd %xmm2, %xmm1
L3:
        movsd 8(%rsi), %xmm0
        movapd %xmm1, %xmm3
        andpd %xmm4, %xmm0
        comisd %xmm0, %xmm1
        movapd %xmm0, %xmm2
        cmplesd %xmm1, %xmm2
        cmovb %edx, %eax
        addl $1, %edx
        addq $8, %rsi
        cmpl %ecx, %edx
        andpd %xmm2, %xmm3
        andnpd %xmm0, %xmm2
        orpd %xmm3, %xmm2
        jne L6
(For the latter revision, -ffast-math only changes ucomisd to comisd.)
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: pinskia at gcc dot gnu.org @ 2012-05-18 16:02 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
Component|middle-end |tree-optimization
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-05-18 15:25:13 UTC ---
dmax_12 = ABS_EXPR <D.1877_11>;
dmax_2 = dmax_1 >= dmax_12 ? dmax_1 : dmax_12;
__result_idamax_21 = dmax_1 >= dmax_12 ? __result_idamax_22 : i_3;
Hmm, dmax_2 should have been MAX_EXPR.
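For readers without the Fortran source at hand, the pair of selects in the dump can be pictured with a small C sketch. This is only an illustration under stated assumptions: a unit-stride array, a 1-based result as in the Fortran routine, and no NaN inputs (which -ffast-math assumes anyway). The names idamax_branch, idamax_select and abs_d are hypothetical and do not come from LAPACK or GCC.

```c
/* Hypothetical C rendering of an idamax-style kernel: unit stride,
   1-based result, no NaN inputs assumed.  Names are illustrative.  */

static double abs_d(double v) { return v < 0.0 ? -v : v; }

/* Branchy form, roughly how the source loop is written.  */
int idamax_branch(const double *x, int n)
{
    if (n < 1)
        return 0;
    int result = 1;
    double dmax = abs_d(x[0]);
    for (int i = 2; i <= n; i++) {
        double a = abs_d(x[i - 1]);
        if (a > dmax) {
            result = i;
            dmax = a;
        }
    }
    return result;
}

/* If-converted form: both updates become selects on one shared
   condition, mirroring the two COND_EXPRs in the dump.  The value
   select computes the maximum of dmax and a.  */
int idamax_select(const double *x, int n)
{
    if (n < 1)
        return 0;
    int result = 1;
    double dmax = abs_d(x[0]);
    for (int i = 2; i <= n; i++) {
        double a = abs_d(x[i - 1]);
        int keep = dmax >= a;          /* shared condition */
        result = keep ? result : i;    /* index select */
        dmax = keep ? dmax : a;        /* value select, i.e. a maximum */
    }
    return result;
}
```

Both functions return the same index for NaN-free input. The value select is the statement one would expect to be represented as a MAX_EXPR, while the index select has to stay a conditional.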
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: pinskia at gcc dot gnu.org @ 2012-05-18 17:34 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
AssignedTo|unassigned at gcc dot |pinskia at gcc dot gnu.org
|gnu.org |
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-05-18 16:13:59 UTC ---
I have a patch.
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: pinskia at gcc dot gnu.org @ 2012-05-18 17:42 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-05-18 16:03:46 UTC ---
This should fix tree-if-conv.c:
Index: tree-if-conv.c
===================================================================
--- tree-if-conv.c (revision 187647)
+++ tree-if-conv.c (working copy)
@@ -1313,8 +1313,8 @@ predicate_scalar_phi (gimple phi, tree c
                                || bb_postdominates_preds (bb));
 
           /* Build new RHS using selected condition and arguments. */
-          rhs = build3 (COND_EXPR, TREE_TYPE (res),
-                        unshare_expr (cond), arg_0, arg_1);
+          rhs = fold_build3 (COND_EXPR, TREE_TYPE (res),
+                             unshare_expr (cond), arg_0, arg_1);
         }
 
       new_stmt = gimple_build_assign (res, rhs);
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: dominiq at lps dot ens.fr @ 2012-05-18 17:46 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
--- Comment #6 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2012-05-18 17:41:21 UTC ---
> This should fix tree-if-conv.c:
It does. Thanks.
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: pinskia at gcc dot gnu.org @ 2012-05-18 17:51 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2012-05-18
Ever Confirmed|0 |1
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-05-18 15:32:04 UTC ---
This was mentioned at http://gcc.gnu.org/ml/gcc/2011-10/msg00422.html. There
are two ways of fixing this bug.
Way #1: Fix ifcvt at the tree level so that it produces a MAX_EXPR instead of
the COND_EXPR.
Way #2: Simplify the COND_EXPR to a MAX_EXPR during expansion or at some other
point.
I want to say way #1 is the correct fix.
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: rguenth at gcc dot gnu.org @ 2012-05-21 10:11 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
--- Comment #7 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-21 09:39:51 UTC ---
Note that if-conversion deliberately does not fold, so as not to destroy a
valid gimple RHS and to avoid canonicalizing the condition. Producing a
MAX_EXPR is certainly fine, of course ... (I'm to blame for not adding
testcases for some of the if-conversion improvements I've done in the last
months ...)
But I suppose that with the simple patch you at least need to gimplify the
result.
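The concern above, folding only when the result is still a simple right-hand side and otherwise keeping the original select, can be sketched with a toy model in C. This is not GCC's tree or gimple API: the node layout, the node_code enum, and the functions is_leaf and fold_cond_to_max are all hypothetical, and the sketch assumes NaN semantics need not be honored.

```c
#include <stddef.h>

/* Toy expression nodes; this is NOT GCC's tree representation.  */
enum node_code { LEAF, GE, COND, MAX };

struct node {
    enum node_code code;
    const struct node *op[3];
};

/* Hypothetical check that an operand is already "simple".  */
static int is_leaf(const struct node *n) { return n->code == LEAF; }

/* Try to turn  (a >= b ? a : b)  into  MAX (a, b).
   Writes the folded node into *out and returns 1 on success.
   Returns 0 and leaves *out untouched when the pattern does not
   match or the operands are not simple, so the caller can keep
   the original COND node unchanged.  */
int fold_cond_to_max(const struct node *cond_expr, struct node *out)
{
    if (cond_expr->code != COND)
        return 0;

    const struct node *c = cond_expr->op[0];
    const struct node *t = cond_expr->op[1];
    const struct node *f = cond_expr->op[2];

    /* The condition must compare exactly the two selected operands.  */
    if (c->code != GE || c->op[0] != t || c->op[1] != f)
        return 0;

    /* Only accept a result whose operands are already simple.  */
    if (!is_leaf(t) || !is_leaf(f))
        return 0;

    out->code = MAX;
    out->op[0] = t;
    out->op[1] = f;
    out->op[2] = NULL;
    return 1;
}
```

The point of the design is the fallback: a caller that gets 0 back simply emits the unfolded select, so a failed or unsuitable fold never produces an invalid statement.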
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: pinskia at gcc dot gnu.org @ 2012-08-28 1:18 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-08-28 01:18:17 UTC ---
Created attachment 28091
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28091
New patch based on Richard's comments
Testing a new fix that addresses Richard's comments.
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: pinskia at gcc dot gnu.org @ 2012-08-28 7:05 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-08-28 07:04:52 UTC ---
While working on this, I noticed that sometimes we don't produce what the x86
back-end calls IEEE MIN/MAX either, but that is a different issue altogether
and I have a fix for that (I ran into it while improving the last phi-opt
pass, which also converts those PHIs into COND_EXPRs like ifcvt does).
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: pinskia at gcc dot gnu.org @ 2012-09-03 20:32 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-09-03 20:31:55 UTC ---
Author: pinskia
Date: Mon Sep 3 20:31:52 2012
New Revision: 190904
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190904
Log:
2012-09-03 Andrew Pinski <apinski@cavium.com>
PR tree-opt/53395
* tree-if-conv.c (constant_or_ssa_name): New function.
(fold_build_cond_expr): New function.
(predicate_scalar_phi): Use fold_build_cond_expr instead of build3.
(predicate_mem_writes): Likewise.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-if-conv.c
* [Bug tree-optimization/53395] [4.8 Regression] The LAPACK functions i(d|s)amax are more than two times slower after revision 187183
From: pinskia at gcc dot gnu.org @ 2012-09-03 20:32 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53395
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
--- Comment #11 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-09-03 20:32:33 UTC ---
Fixed.