[Bug rtl-optimization/50904] New: Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.

public inbox for gcc-bugs@sourceware.org

* [Bug rtl-optimization/50904] New: Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.
@ 2011-10-28 17:19 venkataramanan.kumar.gnu at gmail dot com
  2011-10-28 19:14 ` [Bug rtl-optimization/50904] " dominiq at lps dot ens.fr
                   ` (49 more replies)
  0 siblings, 50 replies; 51+ messages in thread
From: venkataramanan.kumar.gnu at gmail dot com @ 2011-10-28 17:19 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

             Bug #: 50904
           Summary: Induct benchmark of polyhedron slows down when
                    -fno-protect-parens is enabled by -Ofast.
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: venkataramanan.kumar.gnu@gmail.com


Configurations:
GCC 4.7 trunk revision: 180364
Machine: AMD64 

Command line:
gfortran -Ofast induct2.f90

Description:
We observed a slowdown in the induct benchmark at -Ofast after -fprotect-parens
was disabled for -Ofast (in gcc trunk rev 173385 on 2011-05-04 for
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48864).

With the AVX ISA enabled, the slowdown is ~2%.

While analyzing the slowdown, we found that the code generated for one of
induct's hot loop nests differs between -Ofast with -fprotect-parens and
-Ofast without it, irrespective of ISA (avx, fma4) and tuning. Our observations
indicate that this is due to an interaction between the reassociation of an
expression in GIMPLE and the code hoisting done by PRE and other loop
optimizations on the RTL generated for that expression.

Details:

The following snippet shows the hot loop in subroutine
"mutual_ind_quad_cir_coil".

(-----Snip-----)
      do i = 1, 2*m
          theta = pi*real(i,longreal)/real(m,longreal)
          c_vector(1) = r_coil * cos(theta)
          c_vector(2) = r_coil * sin(theta)
!
!       compute current vector for the coil in the global coordinate system
!
          coil_tmp_vector(1) = -sin(theta)
          coil_tmp_vector(2) = cos(theta)
          coil_tmp_vector(3) = 0.0_longreal
          coil_current_vec(1) = dot_product(rotate_coil(1,:),coil_tmp_vector(:))
          coil_current_vec(2) = dot_product(rotate_coil(2,:),coil_tmp_vector(:))
          coil_current_vec(3) = dot_product(rotate_coil(3,:),coil_tmp_vector(:))
!
          do j = 1, 9
              c_vector(3) = 0.5 * h_coil * z1gauss(j)
!
!       rotate coil vector into the global coordinate system and translate it
!
              rot_c_vector(1) = dot_product(rotate_coil(1,:),c_vector(:)) + dx
              rot_c_vector(2) = dot_product(rotate_coil(2,:),c_vector(:)) + dy
              rot_c_vector(3) = dot_product(rotate_coil(3,:),c_vector(:)) + dz
!
              do k = 1, 9
                  q_vector(1) = 0.5_longreal * a * (x2gauss(k) + 1.0_longreal)
                  q_vector(2) = 0.5_longreal * b1 * (y2gauss(k) - 1.0_longreal)
                  q_vector(3) = 0.0_longreal
!
!       rotate quad vector into the global coordinate system
!
                  rot_q_vector(1) = dot_product(rotate_quad(1,:),q_vector(:))
                  rot_q_vector(2) = dot_product(rotate_quad(2,:),q_vector(:))
                  rot_q_vector(3) = dot_product(rotate_quad(3,:),q_vector(:))
!
!       compute and add in quadrature term
!
                  numerator = w1gauss(j) * w2gauss(k) *                        &
                              dot_product(coil_current_vec,current_vector)
                  denominator = sqrt(dot_product(rot_c_vector-rot_q_vector,    &
                                                 rot_c_vector-rot_q_vector))
                  l12_lower = l12_lower + numerator/denominator
              end do
          end do
      end do
(-----Snip-----)

At -Ofast, the k loop is unrolled and vectorized.

When -fprotect-parens is enabled at -Ofast, "q_vector(2) = 0.5_longreal * b1 *
(y2gauss(k) - 1.0_longreal)" and part of the expression "rot_q_vector(1) =
dot_product(rotate_quad(1,:),q_vector(:))" are hoisted out of the j loop.

When -fprotect-parens is disabled, these expressions are not hoisted out of
the loop.
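
As a rough illustration of the hoisting in question, here is a hand-written
sketch. It is not taken from induct.f90: the program name, the q2 temporary,
the accumulators and the sample values are made up, and the definition of the
longreal kind is an assumption. The expression depends only on k, so it is
invariant in the j loop and can in principle be computed once per k:

```fortran
! Hypothetical reduced example; only the names b1, y2gauss and longreal
! mirror the benchmark, everything else is illustrative.
program hoist_sketch
  implicit none
  integer, parameter :: longreal = selected_real_kind(15, 90)  ! assumed kind
  real(longreal) :: y2gauss(9), q2(9), b1, acc_inner, acc_hoisted
  integer :: j, k

  b1 = 2.0_longreal
  do k = 1, 9
     y2gauss(k) = real(k, longreal) / 10.0_longreal
  end do

  ! Form 1: the k-only expression is recomputed on every j iteration.
  acc_inner = 0.0_longreal
  do j = 1, 9
     do k = 1, 9
        acc_inner = acc_inner + 0.5_longreal * b1 * (y2gauss(k) - 1.0_longreal)
     end do
  end do

  ! Form 2: the same expression hoisted by hand out of the j loop.
  do k = 1, 9
     q2(k) = 0.5_longreal * b1 * (y2gauss(k) - 1.0_longreal)
  end do
  acc_hoisted = 0.0_longreal
  do j = 1, 9
     do k = 1, 9
        acc_hoisted = acc_hoisted + q2(k)
     end do
  end do

  print *, acc_inner, acc_hoisted
end program hoist_sketch
```

Form 2 is what we would expect the optimizers to arrive at on their own once
the k loop has been unrolled inside the j loop.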

Observations:

1) In GIMPLE, when -fprotect-parens is disabled, the expression containing
(y2gauss(k) - 1.0_longreal) is reassociated as shown below.

   induct2.f90.080t.dse1
   (-----Snip-----)
   D.8701_385 = y2gauss[D.8696_378];
   D.8702_386 = D.8701_385 - 1.0e+0;
   D.8703_387 = b1_148 * D.8702_386;
   D.8704_388 = D.8703_387 * 5.0e-1;
   (-----Snip-----)

   induct2.f90.081.reassoc1
   (-----Snip-----)
   D.8701_385 = y2gauss[D.8696_378];
   D.8702_386 = D.8701_385 + -1.0e+0;
   D.8703_387 = b1_148 * 5.0e-1;
   D.8704_388 = D.8703_387 * D.8702_386;
  (-----Snip-----)

However, when -fprotect-parens is enabled,

  induct2.f90.081.reassoc1
  (-----Snip-----)
  D.8814_395 = y2gauss[D.8808_387];
  D.8815_396 = D.8814_395 - 1.0e+0;
  D.8816_397 = ((D.8815_396));
  D.8817_398 = b1_154 * 5.0e-1;
  D.8818_399 = D.8817_398 * D.8816_397
  (-----Snip-----)

2) Due to the reassociation that happens when -fprotect-parens is disabled, the
RTL generated for the expression "0.5_longreal * b1 * (y2gauss(k) -
1.0_longreal)" also changes.

For example, for the first two elements of the y2gauss array, the following
RTL is generated:

(-----Snip-----)
(insn 525 523 526 14 (set (reg:V2DF 1124)
        (mem/u/c/i:V2DF (symbol_ref/u:DI ("*.LC82") [flags 0x2]) [8 S16 A128]))
../induct2.f90:1662 1102 {*movv2df_internal}
     (expr_list:REG_EQUAL (const_vector:V2DF [
                (const_double:DF -1.0e+0 [-0x0.8p+1])
                (const_double:DF -1.0e+0 [-0x0.8p+1])
            ])
        (nil)))

(insn 526 525 527 14 (set (reg:V2DF 1123)
        (plus:V2DF (reg:V2DF 1124)
            (mem/c:V2DF (symbol_ref:DI ("y2gauss.2335") [flags 0x2]  <var_decl
0x2aaaabb09dc0 y2gauss>) [8 MEM[(real(kind=8)[9] *)&y2gauss]+0 S16 A256])))
../induct2.f90:1662 1130 {*addv2df3}
     (expr_list:REG_EQUAL (plus:V2DF (mem/c:V2DF (symbol_ref:DI
("y2gauss.2335") [flags 0x2]  <var_decl 0x2aaaabb09dc0 y2gauss>) [8
MEM[(real(kind=8)[9] *)&y2gauss]+0 S16 A256])
            (const_vector:V2DF [
                    (const_double:DF -1.0e+0 [-0x0.8p+1])
                    (const_double:DF -1.0e+0 [-0x0.8p+1])
                ]))
        (nil)))

(insn 527 526 528 14 (set (reg:V2DF 216 [ vect_var_.1769 ])
        (mult:V2DF (reg:V2DF 1123)
            (reg:V2DF 1108))) ../induct2.f90:1662 1139 {*mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 1123)
        (nil)))
(-----Snip-----)

These RTL expressions are computed inside the j loop and are not hoisted out.

When -fprotect-parens is enabled at -Ofast, the RTL is as follows.

induct2.f90.157r.cprop1
(-----Snip-----)
(insn 536 533 537 14 (set (reg:V2DF 1172 [ MEM[(real(kind=8)[9] *)&y2gauss] ])
        (mem/c:V2DF (symbol_ref:DI ("y2gauss.2335") [flags 0x2]  <var_decl
0x2aaaabb09dc0 y2gauss>) [8 MEM[(real(kind=8)[9] *)&y2gauss]+0 S16 A256]))
induct2.f90:1662 1102 {*movv2df_internal}
     (nil))

(insn 537 536 538 14 (set (reg:V2DF 1170)
        (minus:V2DF (reg:V2DF 1172 [ MEM[(real(kind=8)[9] *)&y2gauss] ])
            (reg:V2DF 1168))) induct2.f90:1662 1131 {*subv2df3}
     (expr_list:REG_DEAD (reg:V2DF 1172 [ MEM[(real(kind=8)[9] *)&y2gauss] ])
        (expr_list:REG_EQUAL (minus:V2DF (mem/c:V2DF (symbol_ref:DI
("y2gauss.2335") [flags 0x2]  <var_decl 0x2aaaabb09dc0 y2gauss>) [8
MEM[(real(kind=8)[9] *)&y2gauss]+0 S16 A256])
                (const_vector:V2DF [
                        (const_double:DF 1.0e+0 [0x0.8p+1])
                        (const_double:DF 1.0e+0 [0x0.8p+1])
                    ]))
            (nil))))                                                            

(insn 538 537 539 14 (set (reg:V2DF 236 [ vect_var_.1777 ])
        (mult:V2DF (reg:V2DF 1170)
            (reg:V2DF 1155))) induct2.f90:1662 1139 {*mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 1170)
        (nil)))
(-----Snip-----)

Note that these expressions are hoisted out of the j loop.

In PRE (dump induct2.f90.158r.pre), the first instruction, insn 536, is
hoisted. The other two instructions, insn 537 and insn 538, are hoisted in
the induct2.f90.168r.loop2_unswitch dump.
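
For reference, the dse1 and reassoc1 orderings from observation 1 correspond
roughly to the two source-level groupings below. This is a hand-written sketch:
the program name, the scalar y and the sample values are made up, and the
longreal kind definition is an assumption. The explicit parentheses pin down
each order only while -fprotect-parens is in effect.

```fortran
! Hypothetical illustration of the two evaluation orders seen in the dumps.
program reassoc_sketch
  implicit none
  integer, parameter :: longreal = selected_real_kind(15, 90)  ! assumed kind
  real(longreal) :: b1, y, dse1_order, reassoc1_order

  b1 = 3.0_longreal
  y  = 0.25_longreal     ! stands in for one element of y2gauss

  ! Order in the dse1 dump: (b1 * (y - 1)) * 0.5
  dse1_order = (b1 * (y - 1.0_longreal)) * 0.5_longreal

  ! Order in the reassoc1 dump: (b1 * 0.5) * (y + (-1))
  reassoc1_order = (b1 * 0.5_longreal) * (y + (-1.0_longreal))

  print *, dse1_order, reassoc1_order
end program reassoc_sketch
```

In the second grouping the subtraction has become an addition of -1.0, which
matches the change from MINUS to PLUS in the RTL above.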

This difference in hoisting is responsible for the ~2% degradation of the induct
benchmark in the avx and fma4 cases. A slowdown is not expected at -Ofast, so
we are raising this as a bug.

Please provide your suggestions.


* [Bug rtl-optimization/50904] Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.
From: dominiq at lps dot ens.fr @ 2011-10-28 19:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011-10-28
                 CC|                            |burnus@net-b.de
     Ever Confirmed|0                           |1

--- Comment #1 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-10-28 19:14:18 UTC ---
On a Core2Duo, the run time of induct is 14.76s when it is compiled with -Ofast
and 14.29s when it is compiled with -fprotect-parens -Ofast.

> Please provide your suggestions.

As discussed with Tobias Burnus, I think the inclusion of -fno-protect-parens
in -Ofast was a poor choice. I'd like to see it reverted, not on the grounds of
speed, but because it violates one of the basic requirements of the Fortran
standard (BTW it breaks two of my codes).
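
A minimal sketch of the kind of code that relies on parentheses being honored
follows. It is not one of the two codes mentioned above; the program name, the
variables and the values are illustrative only, and IEEE double precision is
assumed.

```fortran
! Hypothetical example: the result depends on the parentheses being respected.
program parens_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), volatile :: x, big   ! volatile: discourage compile-time folding
  real(dp) :: r

  x   = 1.0_dp
  big = 1.0e30_dp

  ! Evaluated as written, x is absorbed when added to big, so r is 0.
  ! A compiler that is allowed to regroup this as x + (big - big)
  ! may produce 1 instead.
  r = (x + big) - big

  print *, r
end program parens_sketch
```

Comparing the output with and without -fprotect-parens at -Ofast is one way to
check whether a given compiler version regroups the expression.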


* [Bug rtl-optimization/50904] Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.
From: rguenth at gcc dot gnu.org @ 2011-10-30  9:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #2 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-10-30 09:40:38 UTC ---
I don't see why RTL invariant motion should move the one variant but not
the other.  Of course this also shows that we should, after loop unrolling
on the tree level, also perform loop invariant motion again ...


* [Bug rtl-optimization/50904] Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.
From: rguenth at gcc dot gnu.org @ 2011-10-30  9:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-10-30 09:41:23 UTC ---
It would be nice if somebody could reduce this to a more manageable testcase.


* [Bug rtl-optimization/50904] Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.
From: dominiq at lps dot ens.fr @ 2011-10-30 11:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #4 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-10-30 11:24:57 UTC ---
> It would be nice if somebody could reduce this to a more manageable testcase.

I have posted a reduced test in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265#c34 (pr34265).


* [Bug rtl-optimization/50904] Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.
From: dominiq at lps dot ens.fr @ 2011-10-30 11:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #5 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-10-30 11:34:15 UTC ---
Created attachment 25666
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25666
Further reduced test for induct.f90

The reduced test contains a single set of nested loops in
mutual_ind_quad_cir_coil.  Using the main program in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265#c34 (induct_red.f90), the
execution time is 2.0s when induct_qc.f90 is compiled with -Ofast and 1.8s when
it is compiled with -fprotect-parens -Ofast.


* [Bug rtl-optimization/50904] Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.
From: ebotcazou at gcc dot gnu.org @ 2011-11-01 13:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot
                   |                            |gnu.org

--- Comment #6 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-11-01 13:52:48 UTC ---
> I don't see why RTL invariant motion should move the one variant but not
> the other.  Of course this also shows that we should, after loop unrolling
> on the tree level, also perform loop invariant motion again ...

The problem seems to be in RTL PRE, which hoists simple loads but not loads
that are wrapped up in a PLUS or a MINUS.  Even with -fprotect-parens, load
hoisting opportunities are lost because of this.


* [Bug rtl-optimization/50904] Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.
From: venkataramanan.kumar.gnu at gmail dot com @ 2011-11-02  5:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #7 from Venkataramanan Kumar <venkataramanan.kumar.gnu at gmail dot com> 2011-11-02 05:50:44 UTC ---
(In reply to comment #6)
> > I don't see why RTL invariant motion should move the one variant but not
> > the other.  Of course this also shows that we should, after loop unrolling
> > on the tree level, also perform loop invariant motion again ...
> The problem seems to be in RTL PRE, which hoists simple loads but not loads
> that are wrapped up in a PLUS or a MINUS.  Even with -fprotect-parens, load
> hoisting opportunities are lost because of this.

Do you mean that PRE hoists expressions of this pattern?

(Snip)
(insn 536 533 537 14 (set (reg:V2DF 1172 [ MEM[(real(kind=8)[9] *)&y2gauss] ])
        (mem/c:V2DF (symbol_ref:DI ("y2gauss.2335") [flags 0x2]  <var_decl
0x2aaaabb09dc0 y2gauss>) [8 MEM[(real(kind=8)[9] *)&y2gauss]+0 S16 A256]))
induct2.f90:1662 1102 {*movv2df_internal}
(Snip)


But not expressions of this pattern?

(Snip)
(insn 526 525 527 14 (set (reg:V2DF 1123)
        (plus:V2DF (reg:V2DF 1124)
            (mem/c:V2DF (symbol_ref:DI ("y2gauss.2335") [flags 0x2]  <var_decl
0x2aaaabb09dc0 y2gauss>) [8 MEM[(real(kind=8)[9] *)&y2gauss]+0 S16 A256])))
../induct2.f90:1662 1130 {*addv2df3}
(Snip)


* [Bug rtl-optimization/50904] Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast.
From: ebotcazou at gcc dot gnu.org @ 2011-11-04 21:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|unassigned at gcc dot       |ebotcazou at gcc dot
                   |gnu.org                     |gnu.org

--- Comment #8 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-11-04 21:54:02 UTC ---
Looking into it.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenth at gcc dot gnu.org @ 2011-11-05 11:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
   Target Milestone|---                         |4.7.0

--- Comment #9 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-11-05 11:52:55 UTC ---
Fortran enabling -fno-protect-parens would be the regression, the RTL opt
problem likely isn't.  Keeping at P2 for now.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-11-07  0:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #10 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-11-07 00:32:49 UTC ---
Created attachment 25731
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25731
Tentative fix

This enhances RTL PRE.  You need an up-to-date tree to apply it.

The other approaches are TER throttling and machine description fiddling.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-11-08  0:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #25731|0                           |1
        is obsolete|                            |

--- Comment #11 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-11-08 00:33:24 UTC ---
Created attachment 25748
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25748
Tentative fix (2)

This one has a small glitch corrected.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-11-09  9:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #25748|0                           |1
        is obsolete|                            |

--- Comment #12 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-11-09 08:57:37 UTC ---
Created attachment 25764
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25764
Tentative fix (3)

Final version.  Can someone try it on his favorite Fortran benchmark?


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: venkataramanan.kumar.gnu at gmail dot com @ 2011-11-09 10:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #13 from Venkataramanan Kumar <venkataramanan.kumar.gnu at gmail dot com> 2011-11-09 10:22:39 UTC ---
(In reply to comment #12)
> Created attachment 25764 [details]
> Tentative fix (3)
> Final version.  Can someone try it on his favorite Fortran benchmark?

OK, I will check and let you know the results.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: venkataramanan.kumar.gnu at gmail dot com @ 2011-11-11 23:04 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #14 from Venkataramanan Kumar <venkataramanan.kumar.gnu at gmail dot com> 2011-11-11 22:58:01 UTC ---
I ran the Polyhedron benchmarks with -march=bdver1 and -Ofast. The induct run
time was brought down from 70.93 sec to 53.45 sec. The other benchmarks are not
affected much.

I am planning to test on an older machine.


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-11-12 17:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
                 CC|                            |rguenth at gcc dot gnu.org
          Component|rtl-optimization            |tree-optimization
         AssignedTo|ebotcazou at gcc dot        |unassigned at gcc dot
                   |gnu.org                     |gnu.org

--- Comment #15 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-11-12 16:45:13 UTC ---
> I don't see why RTL invariant motion should move the one variant but not
> the other.  Of course this also shows that we should, after loop unrolling
> on the tree level, also perform loop invariant motion again ...

AFAICS we already do that (lim3 is run after cunroll).  The problem is that
lim3 considers that the loads cannot be hoisted because they are "dependent". 
And it looks like a ccp pass is missing after cunroll as there is a lot of
cruft...

Recategorizing, at least temporarily, for further investigation.  If nothing
can be done at the Tree level, we could consider applying the RTL patch.


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: venkataramanan.kumar.gnu at gmail dot com @ 2011-11-19  7:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #16 from Venkataramanan Kumar <venkataramanan.kumar.gnu at gmail dot com> 2011-11-19 04:25:27 UTC ---
(In reply to comment #13)
> (In reply to comment #12)
> > Created attachment 25764 [details]
> > Tentative fix (3)
> > Final version.  Can someone try it on his favorite Fortran benchmark?
> Ok I will check and let you know the results.

Hi Eric, 

I tested on a machine with -march=amdfam10; with your patch, the induct run time
improves by ~5%. The other benchmarks are not affected much.

With your patch, the CPU2006 benchmark "416.gamess" fails during validation.
Minimal flags: -O3 -fno-tree-pre -mfma4. I am trying to reduce the test case.
Please let me know if you have any other suggestions.


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-11-19  9:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #17 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-11-19 07:17:58 UTC ---
> I tested on a machine with -march=amdfam10; with your patch, the induct run time
> improves by ~5%. Other benchmarks are not affected much.

OK, thanks.

> With your patch, the CPU2006 benchmark "416.gamess" fails during validation.
> The minimal flags are -O3 -fno-tree-pre -mfma4. I am trying to reduce the test case.
> Please let me know if you have any other suggestions.

I think that this should be handled at the Tree level.  The missed optimization
opportunities are there, with or without -fprotect-parens.


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenther at suse dot de @ 2011-12-01  8:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #18 from rguenther at suse dot de <rguenther at suse dot de> 2011-12-01 08:51:07 UTC ---
On Sat, 12 Nov 2011, ebotcazou at gcc dot gnu.org wrote:

> --- Comment #15 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-11-12 16:45:13 UTC ---
> > I don't see why RTL invariant motion should move the one variant but not
> > the other.  Of course this also shows that we should, after loop unrolling
> > on the tree level, also perform loop invariant motion again ...
> 
> AFAICS we already do that (lim3 is run after cunroll).  The problem is that
> lim3 considers that the loads cannot be hoisted because they are "dependent". 
> And it looks like a ccp pass is missing after cunroll as there is a lot of
> cruft...

lim3 was added as a "hack", now yes, cunroll needs ccp after it (but it's
there in the form of DOM and VRP).  It's a pass ordering issue that we
cannot ever solve.

> Recategorizing, at least temporarily, for further investigation.  If nothing
> can be done at the Tree level, we could consider applying the RTL patch.

Please - it seems like a missed optimization there, too.

Richard.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-12-01 19:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|ebotcazou at gcc dot        |
                   |gnu.org                     |
          Component|tree-optimization           |rtl-optimization
         AssignedTo|unassigned at gcc dot       |ebotcazou at gcc dot
                   |gnu.org                     |gnu.org

--- Comment #19 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-12-01 19:53:15 UTC ---
> lim3 was added as a "hack", now yes, cunroll needs ccp after it (but it's
> there in the form of DOM and VRP).  It's a pass ordering issue that we
> cannot ever solve.

OK, but that doesn't explain why LIM isn't able to hoist the loads...

> Please - it seems like a missed optimization there, too.

More of an acknowledged limitation I'd say.  And RTL passes aren't supposed to
be enhanced to plug holes in the Tree passes, but let's try anyway.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenther at suse dot de @ 2011-12-02  9:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #20 from rguenther at suse dot de <rguenther at suse dot de> 2011-12-02 09:49:39 UTC ---
On Thu, 1 Dec 2011, ebotcazou at gcc dot gnu.org wrote:

> --- Comment #19 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-12-01 19:53:15 UTC ---
> > lim3 was added as a "hack", now yes, cunroll needs ccp after it (but it's
> > there in the form of DOM and VRP).  It's a pass ordering issue that we
> > cannot ever solve.
> 
> OK, but that doesn't explain why LIM isn't able to hoist the loads...

If the expressions only become invariant after unrolling then the issue
is that without CCP LIM does not see they are invariant I suppose.
I'll have a closer look.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-12-02 10:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #21 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-12-02 10:54:45 UTC ---
> If the expressions only become invariant after unrolling then the issue
> is that without CCP LIM does not see they are invariant I suppose.

No, adding a CCP pass doesn't help (at least immediately).

> I'll have a closer look.

Thanks in advance.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenth at gcc dot gnu.org @ 2011-12-02 11:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #22 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-12-02 11:50:39 UTC ---
One thing I notice (and that's the only difference I can spot at the tree
level) is that we do not CSE the **2s of


      a = sqrt((rect_inductor%v2%x - rect_inductor%v4%x)**2 + &
               (rect_inductor%v2%y - rect_inductor%v4%y)**2 + &
               (rect_inductor%v2%z - rect_inductor%v4%z)**2)

and

      xxvec = rect_inductor%v2%x - rect_inductor%v4%x
      xyvec = rect_inductor%v2%y - rect_inductor%v4%y
      xzvec = rect_inductor%v2%z - rect_inductor%v4%z
      magnitude = sqrt(xxvec**2 + xyvec**2 + xzvec**2)

because while the former has PAREN_EXPRs the latter does not and we do not
consider

<bb 8>:
  D.2113_79 = rect_inductor_78(D)->v2.x;
  D.2114_80 = rect_inductor_78(D)->v4.x;
  D.2115_81 = D.2113_79 - D.2114_80;
  D.1959_82 = ((D.2115_81));
  D.1960_83 = __builtin_pow (D.1959_82, 2.0e+0);
...
  D.1978_168 = __builtin_pow (D.2115_81, 2.0e+0);

D.1960_83 and D.1978_168 as equivalent (they are, value-wise, but we cannot
easily replace one with the other using our current value-numbering
machinery).  We could cleverly see that ((x))**2 is equal to ((x**2))
but that would not help for seeing the CSE opportunity of the following
sum and sqrt either.

I wonder if Fortran, with -fprotect-parens, really has different
semantics for

 tem = 2 * a;
 c = b / tem;

vs.

 c = b / (2 * a);

?  Thus, is not every statement supposed to be wrapped in parens with
-fprotect-parens?  So that

 tem = 2 * a;

becomes

 tem = ( 2 * a );

implicitly?  I see that placing ()s at the toplevel of the relevant
stmts in the source has the desired effect of enabling CSE and
the tree level optimization differences vanish.

Thus, this is a question of 1) correctness of the -fprotect-parens
implementation in the frontend, 2) a question on what optimizations
we want to perform on protected expressions.

The relevant transform is CSE of

 sqrt (x*x + y*y + z*z)

and

 sqrt (((x))*((x)) + ((y))*((y)) + ((z))*((z)))
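
As a rough illustration of why the two pow calls above end up with
different value numbers, here is a toy hash-based value-numbering sketch
in Python.  It is not GCC's implementation: the tuple encoding, the
"paren" operator name and the see_through_paren switch are assumptions
made up for this example.  Treating the PAREN_EXPR as an opaque unary
operation keeps the two expressions apart; looking through it for value
purposes only makes them coincide.

```python
def value_number(expr, table, see_through_paren=False):
    """Return a value number for expr: a name string or an (op, *args) tuple."""
    if isinstance(expr, str):
        key = ("leaf", expr)
    else:
        op, *args = expr
        if op == "paren" and see_through_paren:
            # Hypothetical relaxation: ((x)) has the same value as x.
            return value_number(args[0], table, see_through_paren)
        key = (op,) + tuple(
            value_number(a, table, see_through_paren) for a in args
        )
    # Structurally identical keys share one value number.
    return table.setdefault(key, len(table))


def same_value(a, b, see_through_paren):
    table = {}
    return (value_number(a, table, see_through_paren)
            == value_number(b, table, see_through_paren))


# Toy stand-ins for D.1960_83 (with PAREN_EXPR) and D.1978_168 (without).
diff = ("sub", "v2.x", "v4.x")
protected = ("pow", ("paren", diff), "2.0")
unprotected = ("pow", diff, "2.0")

opaque_result = same_value(protected, unprotected, see_through_paren=False)
transparent_result = same_value(protected, unprotected, see_through_paren=True)
print(opaque_result, transparent_result)
```

Note that seeing the two as equal in value is a separate question from
reassociating across the parentheses, which the sketch does not attempt.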


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: burnus at gcc dot gnu.org @ 2011-12-02 14:04 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org,
                   |                            |kargl at gcc dot gnu.org,
                   |                            |pault at gcc dot gnu.org

--- Comment #23 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 14:03:41 UTC ---
(In reply to comment #22)
> I wonder if Fortran [...]

Well, let's start with the Fortran standard (Fortran 2008):

"7.1.5.2.4 Evaluation of numeric intrinsic operations

"The execution of any numeric operation whose result is not defined by the
arithmetic used by the processor is prohibited. Raising a negative-valued
primary of type real to a real power is prohibited.

"Once the interpretation of a numeric intrinsic operation is established, the
processor may evaluate any mathematically equivalent expression, provided that
the integrity of parentheses is not violated.

"Two expressions of a numeric type are mathematically equivalent if, for all
possible values of their primaries, their mathematical values are equal.
However, mathematically equivalent expressions of numeric type may produce
different computational results."

[The section then contains a few non-normative notes; cf.
http://gcc.gnu.org/wiki/GFortranStandards#Fortran_2008 ]

And for the assignment:

"7.2.1 Assignment statement" [...]
"R732  assignment-stmt  is  variable = expr"
[...]
"Execution of an intrinsic assignment causes, in effect, the evaluation of the
expression expr and all expressions within variable (7.1), the possible
conversion of expr to the type and type parameters of the variable (Table 7.9),
and the definition of the variable with the resulting value. The execution of
the assignment shall have the same effect as if the evaluation of expr and the
evaluation of all expressions in variable occurred before any portion of the
variable is defined by the assignment. The evaluation of expressions within
variable shall neither affect nor be affected by the evaluation of expr."

> with -fprotect-parens, really has different semantics for
>  tem = 2 * a;
>  c = b / tem;
> vs.
>  c = b / (2 * a);
> ?
>
> Thus, is not every statement supposed to be wrapped in parens with
> -fprotect-parens?  So that
>  tem = 2 * a;
> becomes
>  tem = ( 2 * a );
> implicitly?
[...]
> Thus, this is a question of 1) correctness of the -fprotect-parens
> implementation in the frontend, 2) a question on what optimizations
> we want to perform on protected expressions.

It somehow looks as if one needs to add parentheses implicitly; this gets more
complicated if one takes the scalarizer or inlining into account.

Contrary to the explicit parentheses, I am not aware of a program which breaks
with the extra temporary, but that does not tell much. (Side note: I think
the majority of users don't care [or know] about the protection of either
parentheses or the separate assignment statements - and are happy as long as the
result is mathematically the same. Though, some users do care, as with unprotected
parentheses their program breaks.)
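
The last sentence of the quoted 7.1.5.2.4 text, that mathematically
equivalent expressions may produce different computational results, is
easy to reproduce in IEEE double precision.  The Python sketch below
evaluates a three-term sum with both groupings; it assumes Python floats
are IEEE doubles (true on common platforms), and the values are picked
only for illustration.

```python
a, b, c = 0.1, 0.2, 0.3

left_grouped = (a + b) + c    # grouping fixed by explicit parentheses
right_grouped = a + (b + c)   # mathematically equivalent regrouping

print(repr(left_grouped), repr(right_grouped))
print(left_grouped == right_grouped)
```

The two results differ only in the last place, which is the kind of
difference that parentheses protection is meant to keep under the
programmer's control.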

* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenther at suse dot de @ 2011-12-02 14:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #24 from rguenther at suse dot de <rguenther at suse dot de> 2011-12-02 14:31:27 UTC ---
On Fri, 2 Dec 2011, burnus at gcc dot gnu.org wrote:

> --- Comment #23 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 14:03:41 UTC ---
> (In reply to comment #22)
> > I wonder if Fortran [...]
> 
> Well, let's start with the Fortran standard (Fortran 2008):
> 
> "7.1.5.2.4 Evaluation of numeric intrinsic operations
> 
> "The execution of any numeric operation whose result is not defined by the
> arithmetic used by the processor is prohibited. Raising a negative-valued
> primary of type real to a real power is prohibited.
> 
> "Once the interpretation of a numeric intrinsic operation is established, the
> processor may evaluate any mathematically equivalent expression, provided that
> the integrity of parentheses is not violated.
> 
> "Two expressions of a numeric type are mathematically equivalent if, for all
> possible values of their primaries, their mathematical values are equal.
> However, mathematically equivalent expressions of numeric type may produce
> different computational results."
> 
> [The section then contains a few non-normative notes; cf.
> http://gcc.gnu.org/wiki/GFortranStandards#Fortran_2008 ]
> 
> And for the assignment:
> 
> "7.2.1 Assignment statement" [...]
> "R732  assignment-stmt  is  variable = expr"
> [...]
> "Execution of an intrinsic assignment causes, in effect, the evaluation of the
> expression expr and all expressions within variable (7.1), the possible
> conversion of expr to the type and type parameters of the variable (Table 7.9),
> and the definition of the variable with the resulting value. The execution of
> the assignment shall have the same effect as if the evaluation of expr and the
> evaluation of all expressions in variable occurred before any portion of the
> variable is defined by the assignment. The evaluation of expressions within
> variable shall neither affect nor be affected by the evaluation of expr."
> 
> 
> > with -fprotect-parens, really has different semantics for
> >  tem = 2 * a;
> >  c = b / tem;
> > vs.
> >  c = b / (2 * a);
> > ?
> >
> > Thus, is not every statement supposed to be wrapped in parens with
> > -fprotect-parens?  So that
> >  tem = 2 * a;
> > becomes
> >  tem = ( 2 * a );
> > implicitly?
> [...]
> > Thus, this is a question of 1) correctness of the -fprotect-parens
> > implementation in the frontend, 2) a question on what optimizations
> > we want to perform on protected expressions.
> 
> It somehow looks as if one needs to add parentheses implicitly; this gets more
> complicated if one takes the scalarizer or inlining into account.
> 
> Contrary to the explicit parentheses, I am not aware of a program which breaks
> with the extra temporary, but that does not tell much. (Side note: I think
> the majority of users don't care [or know] about the protection of either
> parentheses or the separate assignment statements - and are happy as long as the
> result is mathematically the same. Though, some users do care, as with unprotected
> parentheses their program breaks.)

Every program that would break when explicit parentheses are not honored
would also break if the bracketed expression were explicitly
computed into a temporary (without explicit parentheses).  So it
should be easy to construct a testcase if you have one that breaks
with -fno-protect-parens.

Richard.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: burnus at gcc dot gnu.org @ 2011-12-02 14:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|burnus@net-b.de             |dominiq at lps dot ens.fr

--- Comment #25 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 14:40:48 UTC ---
(In reply to comment #24)
> Every program that would break when explicit parentheses are not honored
> would also break if the bracketed expression were explicitly
> computed into a temporary (without explicit parentheses).  So it
> should be easy to construct a testcase if you have one that breaks
> with -fno-protect-parens.

I vaguely recall that one of the Polyhedron benchmarks gets minutely out of the
correctness-check tolerance range with -fno-protect-parens while it stays
within without. I think Dominique has a program where the effect is more
disastrous.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenther at suse dot de @ 2011-12-02 15:04 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #26 from rguenther at suse dot de <rguenther at suse dot de> 2011-12-02 15:02:27 UTC ---
On Fri, 2 Dec 2011, burnus at gcc dot gnu.org wrote:

> --- Comment #25 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 14:40:48 UTC ---
> (In reply to comment #24)
> > Every program that would break when explicit parentheses are not honored
> > would also break if the bracketed expression were explicitly
> > computed into a temporary (without explicit parentheses).  So it
> > should be easy to construct a testcase if you have one that breaks
> > with -fno-protect-parens.
> 
> I vaguely recall that one of the Polyhedron benchmarks gets minutely out of the
> correctness-check tolerance range with -fno-protect-parens while it stays
> within without. I think Dominique has a program where the effect is more
> disastrous.

The trivial example is (x + 2**52) - 2**52 which rounds x to
an integer.  Without parens we optimize away that rounding effect.
Thus,

  real*8 x, tem
  x = 1.3d
  tem = x + 2.d**52
  x = tem - 2.d**52
  if (x.ne.1.0d)
    call abort

should not fail (minus my fortran coding errors ;)) with
-fprotect-parens
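
The arithmetic behind this trick can be checked outside any compiler.
The Python sketch below assumes IEEE double precision and spells out by
hand the two evaluation orders under discussion; Python itself never
reassociates, so the second function only imitates what an optimizer
with reassociation enabled might effectively compute.  The function
names are made up for this example.

```python
TWO52 = 2.0 ** 52   # doubles in [2**52, 2**53) are spaced exactly 1 apart


def via_temporary(x):
    tem = x + TWO52       # the sum is rounded to an integer-valued double
    return tem - TWO52    # this subtraction is exact


def reassociated(x):
    # Imitates folding the constants first: x + (2**52 - 2**52).
    return x + (TWO52 - TWO52)


x = 1.3
print(via_temporary(x), reassociated(x))
```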


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: burnus at gcc dot gnu.org @ 2011-12-02 16:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #27 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 16:02:45 UTC ---
(In reply to comment #26)
> The trivial example is (x + 2**52) - 2**52 which rounds x to
> an integer.  Without parens we optimize away that rounding effect.

Corrected example. The result I get with other compilers matches the current
behaviour of GCC/gfortran:

- GCC: Gives (of course independent of -fno-protect-parens): 1.3 with "-O1
-ffast-math", 1.0 without -ffast-math.

- Intel ifort 12.2: -O1 has 1.0, -O2 has 1.3, -assume protect_parens does not
help but -fp-model strict does (with -O2: 1.0).

- PGI pgf95 11.5-0: 1.0 with up to -O4.

- Crayftn 7.1.4.111: 1.0 for -O0, 1.3 for -O1. Option "-O fp0" gives 1.0 while
already "-O fp1" gives 1.3.

- PathScale pathf95 3.2.99: 1.0 for up to -O3, -Ofast prints 1.3. As with GCC,
-OPT:fast_math={on,off} toggles between 1.0 and 1.3

- NAG f95: 1.0 for up to -O4, 1.3 with -Ounsafe.

- Sun Fortran 95 8.3: 1.0 for -O4, 1.3 for -fast.

program test
  implicit none
  real(8), volatile :: y
  y = 1.3d0
  call sub(y)
  print *, y
! if (y /= 1.0d0) &
!   call abort
contains
  subroutine sub(x)
    real*8 x, tem
    tem = x + 2.d0**52
    x = tem - 2.d0**52
  end subroutine sub
end program test


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: howarth at nitro dot med.uc.edu @ 2011-12-02 16:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Jack Howarth <howarth at nitro dot med.uc.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |howarth at nitro dot
                   |                            |med.uc.edu

--- Comment #28 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-12-02 16:10:59 UTC ---
The failing polyhedron 2005 benchmark is linpk which can be seen with -Ofast on
x86_64-apple-darwin11...

> Value= 25.114499300     Target= 23.100000000     Tolerance= 2.0000000000    
FAIL <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> Value=0.27880142600E-10 Target=0.27858826400E-10 Tolerance=0.10000000000E-09
> Value=0.22204460500E-15 Target=0.22204460500E-15 Tolerance=0.10000000000E-14
> Value= 1.0000000000     Target= 1.0000000000     Tolerance=0.10000000000E-07
> Value= 1.0000000000     Target= 1.0000000000     Tolerance=0.10000000000E-07

linpk FAILED    1 fails and    4 passes
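
For reference, the pass/fail counts above are consistent with a plain
absolute-tolerance check on each reported value.  The Python sketch
below re-evaluates the five lines under that assumption; the comparison
rule |value - target| <= tolerance is a guess at what the validation
does, not something taken from the Polyhedron harness itself.

```python
# (value, target, tolerance) triples copied from the linpk output above.
results = [
    (25.114499300, 23.100000000, 2.0000000000),
    (0.27880142600e-10, 0.27858826400e-10, 0.10000000000e-09),
    (0.22204460500e-15, 0.22204460500e-15, 0.10000000000e-14),
    (1.0000000000, 1.0000000000, 0.10000000000e-07),
    (1.0000000000, 1.0000000000, 0.10000000000e-07),
]


def within_tolerance(value, target, tolerance):
    # Assumed rule: absolute difference compared against the tolerance.
    return abs(value - target) <= tolerance


verdicts = [within_tolerance(*r) for r in results]
fails = verdicts.count(False)
passes = verdicts.count(True)
print(f"{fails} fails and {passes} passes")
```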


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenther at suse dot de @ 2011-12-02 16:15 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #29 from rguenther at suse dot de <rguenther at suse dot de> 2011-12-02 16:13:25 UTC ---
On Fri, 2 Dec 2011, burnus at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904
> 
> --- Comment #27 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 16:02:45 UTC ---
> (In reply to comment #26)
> > The trivial example is (x + 2**52) - 2**52 which rounds x to
> > an integer.  Without parens we optimize away that rounding effect.
> 
> Corrected example. The result I get with other compilers matches the current
> behaviour of GCC/gfortran:
> 
> - GCC: Gives (of course independent of -fno-protect-parens): 1.3 with "-O1
> -ffast-math", 1.0 without -ffast-math.

Indeed, GCC does not perform FP reassociation unless some of the sub-flags
enabled by -ffast-math are on (otherwise it assumes that intermediate rounding
is to be preserved).

> - Intel ifort 12.2: -O1 has 1.0, -O2 has 1.3, -assume protect_parens does not
> help but -fp-model strict does (with -O2: 1.0).
> 
> - PGI pgf95 11.5-0: 1.0 with up to -O4.
> 
> - Crayftn 7.1.4.111: 1.0 for -O0, 1.3 for -O1. Option "-O fp0" gives 1.0 while
> already "-O fp1" gives 1.3.
> 
> - PathScale pathf95 3.2.99: 1.0 for up to -O3, -Ofast prints 1.3. As with GCC,
> -OPT:fast_math={on,off} toggles between 1.0 and 1.3
> 
> - NAG f95: 1.0 for up to -O4, 1.3 with -Ounsafe.
> 
> - Sun Fortran 95 8.3: 1.0 for -O4, 1.3 for -fast.
> 
> program test
>   implicit none
>   real(8), volatile :: y
>   y = 1.3d0
>   call sub(y)
>   print *, y
> ! if (y /= 1.0d0) &
> !   call abort
> contains
>   subroutine sub(x)
>     real*8 x, tem
>     tem = x + 2.d0**52
>     x = tem - 2.d0**52
>   end subroutine sub
> end program test

And for the sake of completeness the evaluation of sub above and

   subroutine sub2(x)
     real*8 x
     x = (x + 2.d0**52) - 2.d0**52
   end subroutine sub2

should behave consistently if I read your Fortran standard
quotations correctly.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: burnus at gcc dot gnu.org @ 2011-12-02 16:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #30 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 16:29:46 UTC ---
(In reply to comment #29)
> And for the sake of completeness the evaluation of sub above and
>      x = (x + 2.d0**52) - 2.d0**52
> should behave consistently if I read your Fortran standard
> quotations correctly.

Well, it kind of does, only when mixing (in GCC) -funsafe-math-optimizations
with -fprotect-parens or (in ifort) "-assume protect_parens" with a non-strict
-fp-model, you get different results: 1.0 with the () version and 1.3 with
the 'tmp' version.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenther at suse dot de @ 2011-12-02 16:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #31 from rguenther at suse dot de <rguenther at suse dot de> 2011-12-02 16:32:52 UTC ---
On Fri, 2 Dec 2011, burnus at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904
> 
> --- Comment #30 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 16:29:46 UTC ---
> (In reply to comment #29)
> > And for the sake of completeness the evaluation of sub above and
> >      x = (x + 2.d0**52) - 2.d0**52
> > should behave consistently if I read your Fortran standard
> > quotations correctly.
> 
> Well, it kind of does, only when mixing (in GCC) -funsafe-math-optimizations
> with -fprotect-parens or (in ifort) "-assume protect_parens" with a non-strict
> -fp-model, you get different results: 1.0 with the () version and 1.3 with
> the 'tmp' version.

Ok, which is, I suppose, a bug in both compilers.

Richard.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: dominiq at lps dot ens.fr @ 2011-12-02 16:38 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #32 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-12-02 16:37:37 UTC ---
> And for the sake of completeness the evaluation of sub above and
>
>    subroutine sub2(x)
>      real*8 x
>      x = (x + 2.d0**52) - 2.d0**52
>    end subroutine sub2
>
> should behave consistently if I read your Fortran standard
> quotations correctly.

According to my reading of

> "Once the interpretation of a numeric intrinsic operation is established, the
> processor may evaluate any mathematically equivalent expression, provided that
> the integrity of parentheses is not violated."

this is different from

>   subroutine sub(x)
>     real*8 x, tem
>     tem = x + 2.d0**52
>     x = tem - 2.d0**52
>   end subroutine sub

where 'x=tem-2.d0**52' can be evaluated as 'x=x+2.d0**52-2.d0**52' and then as
'x' (as long as x and tem are of the same kind(?)), while in the former case
the standard prohibits evaluating '(x + 2.d0**52) - 2.d0**52' as 'x'.

Note that if I replace 'tem = x + 2.d0**52' with 'tem = (x + 2.d0**52)', I get
1.0 unless I use -fno-protect-parens.
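
A minimal sketch of the arithmetic behind the 1.0 versus 1.3 results, written
in Python rather than Fortran because Python floats are IEEE binary64 like
real*8 on x86-64. The function names are hypothetical, and the second function
only imitates by hand what a reassociating optimizer might produce; it is not
what any particular compiler is confirmed to emit.

```python
# Doubles in [2**52, 2**53) are spaced exactly 1.0 apart, so adding 2**52
# rounds the sum to an integer and subtracting it again leaves that integer.
TWO52 = 2.0 ** 52


def with_parens(x):
    """Evaluate (x + 2**52) - 2**52 in the order written."""
    tem = x + TWO52
    return tem - TWO52


def reassociated(x):
    """Hand-written stand-in for x + (2**52 - 2**52), i.e. plain x."""
    return x + (TWO52 - TWO52)


print(with_parens(1.3))   # 1.0
print(reassociated(1.3))  # 1.3
```

Both forms are mathematically equal to x; only the evaluation order differs,
which is why honoring the parentheses matters for this idiom.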

All this has been discussed previously, but the only PR I have been able to
find is PR32172.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: dominiq at lps dot ens.fr @ 2011-12-02 16:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #33 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-12-02 16:45:24 UTC ---
> The failing polyhedron 2005 benchmark is linpk which can be seen with -Ofast on
> x86_64-apple-darwin11...
>
> > Value= 25.114499300     Target= 23.100000000     Tolerance= 2.0000000000    F

I think this test is not relevant: the "target" is already a residual error,
hence very sensitive to the way the computation is performed and a 10%
tolerance cannot be used to evaluate the "accuracy" of the residual.
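
A small sketch of the check that the quoted line appears to report, assuming
(this is not confirmed by the thread) that the harness fails a run when the
absolute deviation from the target exceeds the tolerance. The helper name
within_tolerance is hypothetical.

```python
def within_tolerance(value, target, tolerance):
    """Assumed pass criterion: |value - target| <= tolerance."""
    return abs(value - target) <= tolerance


value, target, tolerance = 25.114499300, 23.100000000, 2.0

deviation = abs(value - target)
relative_tolerance = tolerance / target

print(within_tolerance(value, target, tolerance))  # False
print(round(deviation, 4))                         # 2.0145
print(round(relative_tolerance * 100, 1))          # 8.7 (percent of target)
```

Under that assumption the run misses the window by about 0.0145, which is
small compared with how much a residual can move when the order of evaluation
changes.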


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: burnus at gcc dot gnu.org @ 2011-12-02 17:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #34 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 17:06:57 UTC ---
(In reply to comment #31)
> Ok, which is, I suppose, a bug in both compilers.

Kind of, though, -ffast-math by itself already is on the verge of violating the
standard. I think -fno-protect-parens could be enabled by -ffast-math as that
means that one does not really care about the exact value.

However, there are users which want to have one or the other. Namely, most
users are happy with a default -fno-protect-parens but don't dare to use
-ffast-math, while others want to have -ffast-math optimizations but with
honored parentheses.

If we want to add extra protection for
     tem = x + 2.d0**52
     x = tem - 2.d0**52
we probably need to add yet another flag as there are surely users, which want
to have protected parentheses but allow for optimizations in the 'tmp' case.
[Even if, as this PR shows, the extra optimization opportunity might lead to a
missed opportunity.] In any case, handling that well for function calls,
inlining and the scalarizer seems to be difficult. And frankly, I am not sure
whether there is any user; -ffast-math plus -fprotect-parens is already special
(cf. comment 32 for one user). Having -ffast-math plus parentheses plus
protected assignments might have even fewer users.

I believe most users simply use -O2, -O3 [-ffast-math], or -Ofast without
thinking (very) much about the options. [I also use typically either -O2, -O3
or -Ofast.]

* * *

Back to the comment 0 issue: I still do not quite understand what the double
evaluation (on tree level) of __builtin_pow in
  D.1959_82 = ((D.2115_81));
  D.1960_83 = __builtin_pow (D.1959_82, 2.0e+0);
  D.1978_168 = __builtin_pow (D.2115_81, 2.0e+0);
has to do with the -Ofast slow down. If I have understood it correctly, on tree
level, there is no reason for it while the slow-down happens on RTL level. That
-fprotect-parens makes it faster is a mere coincidence. Is that a correct rough
summary?


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-12-02 21:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #35 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-12-02 21:21:15 UTC ---
> One thing I notice (and that's the only difference I can spot at the tree
> level) is that we do not CSE the **2s of

There are many missed hoisting opportunities, with or without the switch. 
There are just a few more with the switch, hence the performance regression.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: dominiq at lps dot ens.fr @ 2011-12-03 14:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #36 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-12-03 14:54:40 UTC ---
> Kind of, though, -ffast-math by itself already is on the verge of violating the
> standard. 

I disagree with this statement, at least for codes that do not use the IEEE
intrinsic modules (not yet implemented in gfortran).  Indeed the interaction
between -ffast-math and the IEEE intrinsic modules will have to be discussed
and documented when these modules are implemented (see pr50724 for the kind of
problems).

> I think -fno-protect-parens could be enabled by -ffast-math as that
> means that one does not really care about the exact value.

PLEEEASE DON'T. The situation is bad enough with -Ofast to not make it worse:
with -fno-protect-parens gfortran no longer complies with the Fortran standard
(7.1.5.2.4 quoted in comment #23) and IMO SHOULD NOT be part of any compound
option.

Note that if I am using -ffast-math, it is not because I do "not really care
about the exact value". It is mostly because I KNOW that exceptions will be the
signature of either a bug in my code and/or a bad choice of the parameters
leading to numerical instabilities. On top of that, I think that the concept of
"exact value" for floating-point numbers is ill-posed and as a consequence I do
accept that the least significant digits may depend on the way I write the code
or on how it is optimized (small fluctuations for well-posed methods, large
ones otherwise).

> If we want to add extra protection for
>      tem = x + 2.d0**52
>      x = tem - 2.d0**52
> we probably need to add yet another flag as there are surely users, which want
> to have protected parentheses but allow for optimizations in the 'tmp' case.

If I need an extra protection, I'll put parentheses and I don't need yet
another flag. However, since the logic of the optimization is surprising the
first time you hit it, it could (should) be documented.

* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenther at suse dot de @ 2011-12-05  8:19 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #37 from rguenther at suse dot de <rguenther at suse dot de> 2011-12-05 08:18:00 UTC ---
On Fri, 2 Dec 2011, burnus at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904
> 
> --- Comment #34 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 17:06:57 UTC ---

[...]

> * * *
> 
> Back to the comment 0 issue: I still do not quite understand what the double
> evaluation (on tree level) of __builtin_pow in
>   D.1959_82 = ((D.2115_81));
>   D.1960_83 = __builtin_pow (D.1959_82, 2.0e+0);
>   D.1978_168 = __builtin_pow (D.2115_81, 2.0e+0);
> has to do with the -Ofast slow down. If I have understood it correctly, on tree
> level, there is no reason for it while the slow-down happens on RTL level.

Indeed I can find no other difference on the tree level (thus, no
invariant motion missed optimization that isn't present with both
-f[no-]protect-parens).

> That -fprotect-parens makes it faster is a mere coincidence. Is that a correct rough
> summary?

Yes.

Thus, I think if at the RTL level we see a missed invariant motion then
this is a RTL level bug (esp. if it only triggers with 
-fno-protect-parens).

Richard.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenther at suse dot de @ 2011-12-05  8:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #38 from rguenther at suse dot de <rguenther at suse dot de> 2011-12-05 08:27:08 UTC ---
On Fri, 2 Dec 2011, ebotcazou at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904
> 
> --- Comment #35 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-12-02 21:21:15 UTC ---
> > One thing I notice (and that's the only difference I can spot at the tree
> > level) is that we do not CSE the **2s of
> 
> There are many missed hoisting opportunities, with or without the switch. 
> There are just a few more with the switch, hence the performance regression.

Most of them (not sure if you mean those) are because they are
considered "cheap" by LIM and thus are not moved:

vect_px2gauss.123_641 = &x2gauss;
  invariant up to level 1, cost 1.

vect_cst_.126_659 = { 1.0e+0, 1.0e+0 };
  invariant up to level 1, cost 1.

vect_cst_.128_661 = {D.2126_109, D.2126_109};
  invariant up to level 1, cost 1.
...

vect_px2gauss.120_20 = vect_px2gauss.123_641 + 16;
  invariant up to level 1, cost 2.
...

ivtmp.176_899 = 1;
  invariant up to level 1, cost 1.
...

vect_px2gauss.120_649 = vect_px2gauss.120_336;
  invariant up to level 1, cost 5.
...

ISTR discussing removing all cost considerations for tree
level loop invariant motion and simply move everything possible
(PRE for example doesn't consider any costs and moves all
invariants).

If you use --param lim-expensive=1 you get all invariants moved
on the tree level - does that solve the slowdown issue?
The issue is of course that this might increase register pressure,
as we are not good at rematerializing, for example, constants
inside a loop.
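
As a rough illustration of the threshold being discussed, here is a toy model
in Python. It is not GCC's implementation: the function hoisted is
hypothetical, the default of 20 for lim-expensive is an assumption, and the
real pass has further rules (for instance around dependencies between
statements) that this sketch ignores. The statement/cost pairs are copied from
the dump above.

```python
# Assumed default for --param lim-expensive; treat as illustrative only.
ASSUMED_DEFAULT_LIM_EXPENSIVE = 20

# (statement, cost) pairs as printed in the LIM dump quoted above.
invariants = [
    ("vect_px2gauss.123_641 = &x2gauss", 1),
    ("vect_cst_.126_659 = { 1.0e+0, 1.0e+0 }", 1),
    ("vect_cst_.128_661 = {D.2126_109, D.2126_109}", 1),
    ("vect_px2gauss.120_20 = vect_px2gauss.123_641 + 16", 2),
    ("ivtmp.176_899 = 1", 1),
    ("vect_px2gauss.120_649 = vect_px2gauss.120_336", 5),
]


def hoisted(stmts, lim_expensive):
    """Toy rule: move a statement only if its cost reaches the threshold."""
    return [stmt for stmt, cost in stmts if cost >= lim_expensive]


print(len(hoisted(invariants, ASSUMED_DEFAULT_LIM_EXPENSIVE)))  # 0
print(len(hoisted(invariants, 1)))                              # 6
```

In this simplified picture every listed invariant stays in the loop under the
assumed default, and all of them move once the threshold is lowered to 1.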

I'll give --param lim-expensive=1 a try on SPEC 2k6


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-12-05  9:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #39 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-12-05 09:21:15 UTC ---
> Thus, I think if at the RTL level we see a missed invariant motion then
> this is a RTL level bug (esp. if it only triggers with  -fno-protect-parens).

Well, how can the RTL level invent load hoisting opportunities?  They are of
course already present at the Tree level, see the .optimized dump:

vect_var_.124_350 = MEM[(real(kind=8)[9] *)&x2gauss];

vect_var_.133_823 = MEM[(real(kind=8)[9] *)&y2gauss];

vect_var_.157_586 = MEM[(real(kind=8)[9] *)&w2gauss];

vect_var_.124_357 = MEM[(real(kind=8)[9] *)&x2gauss + 16B];

vect_var_.133_363 = MEM[(real(kind=8)[9] *)&y2gauss + 16B];

vect_var_.157_874 = MEM[(real(kind=8)[9] *)&w2gauss + 16B];

vect_var_.124_405 = MEM[(real(kind=8)[9] *)&x2gauss + 32B];

vect_var_.133_594 = MEM[(real(kind=8)[9] *)&y2gauss + 32B];

vect_var_.157_610 = MEM[(real(kind=8)[9] *)&w2gauss + 32B];

vect_var_.124_651 = MEM[(real(kind=8)[9] *)&x2gauss + 48B];

vect_var_.133_680 = MEM[(real(kind=8)[9] *)&y2gauss + 48B];

vect_var_.157_805 = MEM[(real(kind=8)[9] *)&w2gauss + 48B];


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenther at suse dot de @ 2011-12-05  9:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #40 from rguenther at suse dot de <rguenther at suse dot de> 2011-12-05 09:55:47 UTC ---
On Mon, 5 Dec 2011, ebotcazou at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904
> 
> --- Comment #39 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-12-05 09:21:15 UTC ---
> > Thus, I think if at the RTL level we see a missed invariant motion then
> > this is a RTL level bug (esp. if it only triggers with  -fno-protect-parens).
> 
> Well, how can the RTL level invent load hoisting opportunities?  They are of
> course already present at the Tree level, see the .optimized dump:
> 
> vect_var_.124_350 = MEM[(real(kind=8)[9] *)&x2gauss];

They are considered dependent because they are still decomposed as

  vect_px2gauss.123_680 = &x2gauss;
...
  vect_var_.124_350 = MEM[(real(kind=8)[9] *)vect_px2gauss.123_680];

during LIM3.  Let me check why we don't fix that up in LIM dependence
checking.


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: dominiq at lps dot ens.fr @ 2011-12-05 10:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #41 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-12-05 10:12:39 UTC ---
Using --param lim-expensive=1 when compiling induct.f90 does not change the
timing, as of today (r181994):

gfc -Ofast induct.f90                                            -> 14.62s
gfc -Ofast induct.f90 --param lim-expensive=1                    -> 14.61s
gfc -fprotect-parens -Ofast induct.f90                           -> 14.11s
gfc -fprotect-parens -Ofast induct.f90 --param lim-expensive=1   -> 14.12s

(a ~0.15s improvement over the timing in comment #1).


* [Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenth at gcc dot gnu.org @ 2011-12-05 10:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #42 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-12-05 10:19:11 UTC ---
Argh.  It seems LIM didn't get proper lifting both at tuplification and
alias-improvements time.  So its memory handling (everything it does
with VOPs) is a little very much conservative (read: it doesn't really work).

I'll look into this.


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-12-05 10:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
          Component|rtl-optimization            |tree-optimization
         AssignedTo|ebotcazou at gcc dot        |unassigned at gcc dot
                   |gnu.org                     |gnu.org

--- Comment #43 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-12-05 10:26:52 UTC ---
> Argh.  It seems LIM didn't get proper lifting both at tuplification and
> alias-improvements time.  So its memory handling (everything it does
> with VOPs) is a little very much conservative (read: it doesn't really work).

OK, re-recategorizing for further investigation.


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenth at gcc dot gnu.org @ 2011-12-05 11:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #44 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-12-05 11:11:36 UTC ---
Created attachment 25990
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25990
proposed patch

Pretty much minimal patch in testing.  I have a 2nd patch doing some cost
adjustments.


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenth at gcc dot gnu.org @ 2011-12-05 14:38 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #45 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-12-05 14:36:48 UTC ---
Author: rguenth
Date: Mon Dec  5 14:36:44 2011
New Revision: 182010

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182010
Log:
2011-12-05  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/50904
    * tree-ssa-loop-im.c (struct mem_ref): Remove vops member.
    (MEM_ANALYZABLE): New.
    (memory_references): Remove clobbered_vops and vop_ref_map
    members, add all_refs_stored_in_loop member.
    (memref_free): Adjust.
    (mem_ref_alloc): Likewise.
    (gather_mem_refs_stmt): Do not record clobbers, instead
    record refs for unanalyzable stmts.
    (gather_mem_refs_in_loops): Do not propagate clobbers.
    (struct vop_to_refs_elt, vtoe_hash, vtoe_eq, vtoe_free,
    record_vop_access, get_vop_accesses, get_vop_stores,
    add_vop_ref_mapping): Remove.
    (create_vop_ref_mapping_loop): Adjust to simply record all
    stores.
    (analyze_memory_references): Adjust.
    (refs_independent_p): Check for not analyzable refs.
    (can_sm_ref_p): Likewise.
    (ref_indep_loop_p_1): Simplify.
    (tree_ssa_lim_finalize): Adjust.

    * tree-ssa-loop-im.c (stmt_cost): Simplify, use LIM_EXPENSIVE
    rather than magic constants.  Assign zero cost to PAREN_EXPR
    and SSA_NAME copies.  Assign cost proportional to the vector
    size for vector constructors.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-loop-im.c


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: rguenth at gcc dot gnu.org @ 2011-12-05 14:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #46 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-12-05 14:37:14 UTC ---
Fixed (fingers crossing).


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: ebotcazou at gcc dot gnu.org @ 2011-12-05 17:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot
                   |                            |gnu.org

--- Comment #47 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-12-05 17:29:21 UTC ---
> Fixed (fingers crossing).

That's great, thanks (you can count mine as well :-).


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: dominiq at lps dot ens.fr @ 2011-12-05 17:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #48 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-12-05 17:58:13 UTC ---
> Fixed (fingers crossing).

So far, so good! The runtime for induct compiled with -fprotect-parens -Ofast
went down from 14.11s to 13.14s, and compiled with -Ofast from 14.62s to 13.32s
(still ~1% slower). The timings for the other tests in the polyhedron suite
were basically unchanged (i.e., within the noise margin).
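
The percentages implied by these timings can be checked with a few lines of
Python. This is only arithmetic on the numbers quoted in this thread; the
helper names are hypothetical.

```python
def slowdown_pct(slow, fast):
    """How much slower 'slow' is than 'fast', in percent of 'fast'."""
    return (slow - fast) / fast * 100.0


def improvement_pct(before, after):
    """Run-time reduction from 'before' to 'after', in percent of 'before'."""
    return (before - after) / before * 100.0


# -Ofast versus -fprotect-parens -Ofast, before and after the patch.
gap_before = slowdown_pct(14.62, 14.11)
gap_after = slowdown_pct(13.32, 13.14)

# Effect of the patch on each configuration.
gain_ofast = improvement_pct(14.62, 13.32)
gain_protect = improvement_pct(14.11, 13.14)

print(round(gap_before, 1), round(gap_after, 1))     # 3.6 1.4
print(round(gain_ofast, 1), round(gain_protect, 1))  # 8.9 6.9
```

So the gap between the two configurations shrinks from roughly 3.6% to roughly
1.4%, while both get faster in absolute terms.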

Thanks for the patch.


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: venkataramanan.kumar.gnu at gmail dot com @ 2011-12-06 10:00 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #49 from Venkataramanan Kumar <venkataramanan.kumar.gnu at gmail dot com> 2011-12-06 09:59:39 UTC ---
I am planning to test the patch on the polyhedron benchmarks.
Then I will test it on SPEC CPU2006.


* [Bug tree-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast
From: venkataramanan.kumar.gnu at gmail dot com @ 2011-12-07 13:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #50 from Venkataramanan Kumar <venkataramanan.kumar.gnu at gmail dot com> 2011-12-07 13:18:57 UTC ---
On the machine I used, the Induct run time improves from 68.9 seconds to 55.94
seconds with -Ofast. I will report on the other benchmarks and SPEC2006 once I
complete testing.



end of thread, other threads:[~2011-12-07 13:21 UTC | newest]

Thread overview: 51+ messages
2011-10-28 17:19 [Bug rtl-optimization/50904] New: Induct benchmark of polyhedron slows down when -fno-protect-parens is enabled by -Ofast venkataramanan.kumar.gnu at gmail dot com
2011-10-28 19:14 ` [Bug rtl-optimization/50904] " dominiq at lps dot ens.fr
2011-10-30  9:41 ` rguenth at gcc dot gnu.org
2011-10-30  9:41 ` rguenth at gcc dot gnu.org
2011-10-30 11:25 ` dominiq at lps dot ens.fr
2011-10-30 11:35 ` dominiq at lps dot ens.fr
2011-11-01 13:53 ` ebotcazou at gcc dot gnu.org
2011-11-02  5:51 ` venkataramanan.kumar.gnu at gmail dot com
2011-11-04 21:55 ` ebotcazou at gcc dot gnu.org
2011-11-05 11:54 ` [Bug rtl-optimization/50904] [4.7 regression] pessimization " rguenth at gcc dot gnu.org
2011-11-07  0:33 ` ebotcazou at gcc dot gnu.org
2011-11-08  0:43 ` ebotcazou at gcc dot gnu.org
2011-11-09  9:03 ` ebotcazou at gcc dot gnu.org
2011-11-09 10:40 ` venkataramanan.kumar.gnu at gmail dot com
2011-11-11 23:04 ` venkataramanan.kumar.gnu at gmail dot com
2011-11-12 17:22 ` [Bug tree-optimization/50904] " ebotcazou at gcc dot gnu.org
2011-11-19  7:18 ` venkataramanan.kumar.gnu at gmail dot com
2011-11-19  9:09 ` ebotcazou at gcc dot gnu.org
2011-12-01  8:51 ` rguenther at suse dot de
2011-12-01 19:53 ` [Bug rtl-optimization/50904] " ebotcazou at gcc dot gnu.org
2011-12-02  9:49 ` rguenther at suse dot de
2011-12-02 10:56 ` ebotcazou at gcc dot gnu.org
2011-12-02 11:51 ` rguenth at gcc dot gnu.org
2011-12-02 14:04 ` burnus at gcc dot gnu.org
2011-12-02 14:32 ` rguenther at suse dot de
2011-12-02 14:41 ` burnus at gcc dot gnu.org
2011-12-02 15:04 ` rguenther at suse dot de
2011-12-02 16:03 ` burnus at gcc dot gnu.org
2011-12-02 16:13 ` howarth at nitro dot med.uc.edu
2011-12-02 16:15 ` rguenther at suse dot de
2011-12-02 16:30 ` burnus at gcc dot gnu.org
2011-12-02 16:33 ` rguenther at suse dot de
2011-12-02 16:38 ` dominiq at lps dot ens.fr
2011-12-02 16:47 ` dominiq at lps dot ens.fr
2011-12-02 17:07 ` burnus at gcc dot gnu.org
2011-12-02 21:21 ` ebotcazou at gcc dot gnu.org
2011-12-03 14:55 ` dominiq at lps dot ens.fr
2011-12-05  8:19 ` rguenther at suse dot de
2011-12-05  8:27 ` rguenther at suse dot de
2011-12-05  9:21 ` ebotcazou at gcc dot gnu.org
2011-12-05  9:57 ` rguenther at suse dot de
2011-12-05 10:13 ` dominiq at lps dot ens.fr
2011-12-05 10:21 ` rguenth at gcc dot gnu.org
2011-12-05 10:28 ` [Bug tree-optimization/50904] " ebotcazou at gcc dot gnu.org
2011-12-05 11:13 ` rguenth at gcc dot gnu.org
2011-12-05 14:38 ` rguenth at gcc dot gnu.org
2011-12-05 14:40 ` rguenth at gcc dot gnu.org
2011-12-05 17:30 ` ebotcazou at gcc dot gnu.org
2011-12-05 17:59 ` dominiq at lps dot ens.fr
2011-12-06 10:00 ` venkataramanan.kumar.gnu at gmail dot com
2011-12-07 13:21 ` venkataramanan.kumar.gnu at gmail dot com
