[Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340

public inbox for gcc-bugs@sourceware.org

* [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
@ 2012-05-14 11:43 dominiq at lps dot ens.fr
  2012-05-14 11:49 ` [Bug tree-optimization/53342] " rguenth at gcc dot gnu.org
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: dominiq at lps dot ens.fr @ 2012-05-14 11:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

             Bug #: 53342
           Summary: [4.8 Regression] rnflow.f90 is ~5% slower after
                    revision 187340
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: dominiq@lps.ens.fr
                CC: matz@gcc.gnu.org, rguenth@gcc.gnu.org,
                    ubizjak@gmail.com


Created attachment 27401
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27401
source evlrnf.f90 extracted from rnflow.f90

On x86_64-apple-darwin10, rnflow.f90 is ~5% slower after revision 187340

[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187339/bin/gfortran -O3 -ffast-math -funroll-loops rnflow.f90
[macbook] test/dbg_rnflow% time a.out > /dev/null
27.542u 0.348s 0:27.93 99.8%    0+0k 0+0io 0pf+0w
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187340/bin/gfortran -O3 -ffast-math -funroll-loops rnflow.f90
[macbook] test/dbg_rnflow% time a.out > /dev/null
29.196u 0.348s 0:29.59 99.7%    0+0k 0+0io 0pf+0w

The slowdown comes from the optimization of evlrnf (compiled on top of the
last set in pr53340#c1)

[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187339/bin/gfortran -c -O3 -ffast-math -funroll-loops evlrnf.f90
[macbook] test/dbg_rnflow% makeo ; time a.out > /dev/null
21.168u 0.348s 0:21.52 99.9%    0+0k 0+0io 0pf+0w
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187340/bin/gfortran -c -O3 -ffast-math -funroll-loops evlrnf.f90
[macbook] test/dbg_rnflow% makeo ; time a.out > /dev/null
22.758u 0.347s 0:23.11 99.9%    0+0k 0+0io 0pf+0w
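
As a quick check of the timings above, the relative slowdown can be computed
from the reported user times.  This is only a minimal sketch; the helper name
slowdown_pct is made up for illustration and is not part of the report.

```c
/* Relative slowdown, in percent, of `after` compared with `before`.
   Both arguments are run times in seconds (hypothetical helper). */
static double slowdown_pct(double before, double after)
{
    return 100.0 * (after - before) / before;
}
```

With the user times quoted above, slowdown_pct(27.542, 29.196) is about 6.0
for the whole program and slowdown_pct(21.168, 22.758) is about 7.5 when only
evlrnf.f90 is rebuilt.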



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
@ 2012-05-14 11:49 ` rguenth at gcc dot gnu.org
  2012-05-14 12:33 ` matz at gcc dot gnu.org
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-05-14 11:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-05-14
   Target Milestone|---                         |4.8.0
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-14 11:44:42 UTC ---
It should now be possible to fix PR53185 in a different way without disabling
peeling for alignment.



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
  2012-05-14 11:49 ` [Bug tree-optimization/53342] " rguenth at gcc dot gnu.org
@ 2012-05-14 12:33 ` matz at gcc dot gnu.org
  2012-09-07 11:51 ` rguenth at gcc dot gnu.org
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: matz at gcc dot gnu.org @ 2012-05-14 12:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

Michael Matz <matz at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot       |matz at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #2 from Michael Matz <matz at gcc dot gnu.org> 2012-05-14 11:49:15 UTC ---
Yeah.



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
  2012-05-14 11:49 ` [Bug tree-optimization/53342] " rguenth at gcc dot gnu.org
  2012-05-14 12:33 ` matz at gcc dot gnu.org
@ 2012-09-07 11:51 ` rguenth at gcc dot gnu.org
  2012-11-13 18:46 ` ubizjak at gmail dot com
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-09-07 11:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (2 preceding siblings ...)
  2012-09-07 11:51 ` rguenth at gcc dot gnu.org
@ 2012-11-13 18:46 ` ubizjak at gmail dot com
  2012-12-10 12:11 ` jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ubizjak at gmail dot com @ 2012-11-13 18:46 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

--- Comment #3 from Uros Bizjak <ubizjak at gmail dot com> 2012-11-13 18:46:13 UTC ---
(In reply to comment #2)
> Yeah.

Is there any progress on this issue?



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (3 preceding siblings ...)
  2012-11-13 18:46 ` ubizjak at gmail dot com
@ 2012-12-10 12:11 ` jakub at gcc dot gnu.org
  2012-12-10 12:26 ` rguenther at suse dot de
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-12-10 12:11 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-12-10 12:10:41 UTC ---
Is this slower compared to pre-r186530 gfortran, or just compared to r186530
through r187339?  I think before that change we vectorized just 25 loops on
this testcase, not 28 loops as now, and supposedly the loops for which the
r187340 change is a problem are only those that weren't vectorized before at
all.  If the latter, then this wouldn't be a regression.

Is there an easy way to detect if peeling could turn a simple_iv vectorized
load into a non-simple_iv one?

Michael, do you plan to work on this?



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (4 preceding siblings ...)
  2012-12-10 12:11 ` jakub at gcc dot gnu.org
@ 2012-12-10 12:26 ` rguenther at suse dot de
  2013-01-11 11:53 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenther at suse dot de @ 2012-12-10 12:26 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

--- Comment #5 from rguenther at suse dot de <rguenther at suse dot de> 2012-12-10 12:26:21 UTC ---
On Mon, 10 Dec 2012, jakub at gcc dot gnu.org wrote:

> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342
> 
> Jakub Jelinek <jakub at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org
> 
> --- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-12-10 12:10:41 UTC ---
> Is this slower compared to pre-r186530 gfortran, or just compared to r186530
> through r187339?  I think before that change we vectorized just 25 loops on
> this testcase, not 28 loops as now, and supposedly the loops for which the
> r187340 change is a problem are only those that weren't vectorized before at
> all.  If the latter, then this wouldn't be a regression.
> 
> Is there an easy way to detect if peeling could turn a simple_iv vectorized
> load into a non-simple_iv one?

See my "this could be done differently now" comment - you should be
able to re-use the original SCEV result, thus the non-simple_iv case
should never pop up "late".

Richard.



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (5 preceding siblings ...)
  2012-12-10 12:26 ` rguenther at suse dot de
@ 2013-01-11 11:53 ` jakub at gcc dot gnu.org
  2013-01-14 15:56 ` matz at gcc dot gnu.org
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-01-11 11:53 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-01-11 11:52:59 UTC ---
Created attachment 29144
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29144
hackish attempt



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (6 preceding siblings ...)
  2013-01-11 11:53 ` jakub at gcc dot gnu.org
@ 2013-01-14 15:56 ` matz at gcc dot gnu.org
  2013-01-14 16:45 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: matz at gcc dot gnu.org @ 2013-01-14 15:56 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

--- Comment #7 from Michael Matz <matz at gcc dot gnu.org> 2013-01-14 15:55:51 UTC ---
The patch should lead to wrong code at some places (when peeling for
alignment actually does something).  The problem is, you
calculate base and step before peeling and cache that.  Caching the step
is fine, but caching the base is not, as the peeling specifically changes
the initial value of the accessed pointer.  For instance in the testcase
of pr53185.c we have this loop after peeling:

  bb_6 (preds = {bb_8 bb_26 }, succs = {bb_8 bb_23 })
  {
    # .MEM_27 = PHI <.MEM_21(8), .MEM_51(26)>
    # e.1_29 = PHI <e.4_22(8), e.1_52(26)>
    _10 = (long unsigned int) e.1_29;
    _11 = _10 * 4;
    _12 = f_5(D) + _11;
    _14 = (int) e.1_29;
    _16 = _14 * pretmp_38;
    _17 = (long unsigned int) _16;
    _18 = _17 * 4;
    _19 = pretmp_35 + _18;
    # VUSE <.MEM_27>
    _20 = *_19;
    # .MEM_21 = VDEF <.MEM_27>
    *_12 = _20;
    e.4_22 = e.1_29 + 1;
    if (e.4_22 < a.5_26)
      goto <bb 8>;
    else
      goto <bb 23>;
  }

Note the initial value of e.1_52 for e.1_29.  But your cached
information sets
  iv.base = pretmp_35
  iv.step = (long unsigned int) pretmp_38 * 4

It actually should be
  iv.base = pretmp_35 + 4 * ((long unsigned int) (pretmp_38 * (int) e.1_52))

The casts here are actually the cause for simple_iv not working in this
case.  This expression would have to be calculated outside the loop
and used as stride_base.  I don't see where this could easily be done
from existing places (like where the peeling loop is generated).
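
To make the base/step point concrete, here is a minimal sketch in plain C.
It is a model of the idea only, not vectorizer code: the struct and function
names are hypothetical.  A strided access reads base + i * step; once npeel
iterations have run in a peeling loop, the step is unchanged but the base
seen by the remaining loop has moved by npeel * step.

```c
#include <stddef.h>

/* Hypothetical model of a strided access: iteration i reads base + i * step. */
struct strided_iv {
    const int *base;   /* address read by the first iteration of the loop */
    ptrdiff_t step;    /* stride, in elements, between two iterations */
};

/* Address that iteration i of the loop described by iv reads. */
static const int *iv_address(struct strided_iv iv, ptrdiff_t i)
{
    return iv.base + i * iv.step;
}

/* IV of the loop that remains after npeel iterations ran in a prologue:
   the step can be reused as cached, the base must advance by npeel * step. */
static struct strided_iv iv_after_peel(struct strided_iv iv, ptrdiff_t npeel)
{
    struct strided_iv peeled = { iv.base + npeel * iv.step, iv.step };
    return peeled;
}
```

If the base cached before peeling were reused, the first iteration of the
remaining loop would read iv_address(iv, 0) instead of iv_address(iv, npeel),
which is the kind of wrong address described above.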



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (7 preceding siblings ...)
  2013-01-14 15:56 ` matz at gcc dot gnu.org
@ 2013-01-14 16:45 ` jakub at gcc dot gnu.org
  2013-02-05 12:12 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-01-14 16:45 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-01-14 16:45:11 UTC ---
Can't we then compute the final values of the bases after the peeling loop, and
add those gimplified after the peeling loop, then use them in the next loop?
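
One way to read this suggestion, sketched in plain C rather than GIMPLE: the
base is recomputed once, after the peeling loop and before the main loop.
Everything here is an assumption made for illustration (the function name,
the 16-byte alignment target, the factor of 4 standing in for the vector
width); it is not the patch attached to this bug.

```c
#include <stddef.h>
#include <stdint.h>

/* dst[i] = src[i * stride] for i in [0, n), structured like a peeled loop. */
static void strided_copy_peeled(int *dst, const int *src,
                                size_t stride, size_t n)
{
    size_t i = 0;

    /* Peeling loop: scalar iterations until dst + i is 16-byte aligned. */
    while (i < n && ((uintptr_t)(dst + i) % 16) != 0) {
        dst[i] = src[i * stride];
        i++;
    }

    if (i + 4 <= n) {
        /* Final value of the base after the peeling loop, computed once
           outside the main loop and used there as the stride base. */
        size_t peeled = i;
        const int *stride_base = src + peeled * stride;

        /* Main loop: four elements per iteration (scalar stand-in for a
           vector body), addressed relative to stride_base. */
        for (; i + 4 <= n; i += 4) {
            size_t k = i - peeled;
            for (size_t j = 0; j < 4; j++)
                dst[i + j] = stride_base[(k + j) * stride];
        }
    }

    /* Epilogue: leftover iterations. */
    for (; i < n; i++)
        dst[i] = src[i * stride];
}
```

The result is the same as the plain loop whatever the alignment of dst,
because the main loop starts from the advanced base rather than from src.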



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (8 preceding siblings ...)
  2013-01-14 16:45 ` jakub at gcc dot gnu.org
@ 2013-02-05 12:12 ` rguenth at gcc dot gnu.org
  2013-02-05 13:08 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-05 12:12 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|matz at gcc dot gnu.org     |rguenth at gcc dot gnu.org

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-05 12:12:20 UTC ---
Created attachment 29355
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29355
patch

I was thinking of something as simple as the attached.  (TODO: simplify and
localize vect_check_strided_load)

We need a runtime testcase that verifies PR53185 is vectorized correctly.



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (9 preceding siblings ...)
  2013-02-05 12:12 ` rguenth at gcc dot gnu.org
@ 2013-02-05 13:08 ` rguenth at gcc dot gnu.org
  2013-02-05 13:36 ` dominiq at lps dot ens.fr
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-05 13:08 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #29355|0                           |1
        is obsolete|                            |

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-05 13:07:51 UTC ---
Created attachment 29356
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29356
patch

This is what I am testing now.  Of course nothing tells us (yet) whether it
fixes this bug, the slowdown of rnflow.f90, which might be caused by us
vectorizing this loop at all (as far as I understand, we did not vectorize it
before the change).



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (10 preceding siblings ...)
  2013-02-05 13:08 ` rguenth at gcc dot gnu.org
@ 2013-02-05 13:36 ` dominiq at lps dot ens.fr
  2013-02-05 13:52 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: dominiq at lps dot ens.fr @ 2013-02-05 13:36 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

--- Comment #11 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2013-02-05 13:35:51 UTC ---
After an incremental update of r195753 with the patch in comment #10, compiling
rnflow.f90 with '-O3 -ffast-math -funroll-loops' gives an executable which
segfaults.

  0: 0: 0.135 -> Read sequence
  0: 0:42.259 -> extract extrema
  0: 0:42.361 -> Generate raw transitions counts
  0: 0:42.566 -> Compute Markov matrix
  0: 0:42.595 -> Calculate theoretical rainflow
==6222== Invalid read of size 4
==6222==    at 0x10001E9FF: evlrnf_ (rnflow.f90:2703)
==6222==    by 0xFFFFFFFFFFFFFF15: ???
==6222==    by 0x10166D0CB: __strtodg (in /usr/lib/libSystem.B.dylib)
==6222==    by 0xE4: ???
==6222==  Address 0x507d4c030 is not stack'd, malloc'd or (recently) free'd
==6222== 
==6222== 
==6222== Process terminating with default action of signal 11 (SIGSEGV)
==6222==  General Protection Fault
==6222==    at 0x10170EFC1: dyld_stub_binder (in /usr/lib/libSystem.B.dylib)
==6222==    by 0x1015AC02F: ??? (in /opt/gcc/gcc4.8w/lib/libgfortran.3.dylib)
==6222==    by 0xE64: ???
==6222==    by 0x10149AE8A: _gfortrani_st_vprintf (in /opt/gcc/gcc4.8w/lib/libgfortran.3.dylib)

If needed, I'll do a clean bootstrap later this afternoon.



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (11 preceding siblings ...)
  2013-02-05 13:36 ` dominiq at lps dot ens.fr
@ 2013-02-05 13:52 ` rguenth at gcc dot gnu.org
  2013-02-05 15:34 ` rguenth at gcc dot gnu.org
  2013-02-05 15:36 ` rguenth at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-05 13:52 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #29356|0                           |1
        is obsolete|                            |

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-05 13:52:16 UTC ---
Created attachment 29359
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29359
updated patch

Updated patch that does not miscompile rnflow.  Simplified as well...

  0: 0:17.264 -> Completed program execution   (with patch)
  0: 0:19.462 -> Completed program execution   (with strided load vectorization disabled)
  0: 0:17.929 -> Completed program execution   (without patch)

Measured with -Ofast -funroll-loops and generic tuning on an old iCore (fam 6,
model 30).

Not sure what the original regression was against (certainly not against
enabling strided load vectorization?)



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (12 preceding siblings ...)
  2013-02-05 13:52 ` rguenth at gcc dot gnu.org
@ 2013-02-05 15:34 ` rguenth at gcc dot gnu.org
  2013-02-05 15:36 ` rguenth at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-05 15:34 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-05 15:33:51 UTC ---
Author: rguenth
Date: Tue Feb  5 15:33:35 2013
New Revision: 195759

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=195759
Log:
2013-02-05  Richard Biener  <rguenther@suse.de>

    PR tree-optimization/53342
    PR tree-optimization/53185
    * tree-vectorizer.h (vect_check_strided_load): Remove.
    * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Do
    not disallow peeling for vectorized strided loads.
    (vect_check_strided_load): Make static and simplify.
    (vect_analyze_data_refs): Adjust.
    * tree-vect-stmts.c (vectorizable_load): Handle peeled loops
    correctly when vectorizing strided loads.

    * gcc.dg/vect/pr53185-2.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/vect/pr53185-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-data-refs.c
    trunk/gcc/tree-vect-stmts.c
    trunk/gcc/tree-vectorizer.h



* [Bug tree-optimization/53342] [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340
  2012-05-14 11:43 [Bug tree-optimization/53342] New: [4.8 Regression] rnflow.f90 is ~5% slower after revision 187340 dominiq at lps dot ens.fr
                   ` (13 preceding siblings ...)
  2013-02-05 15:34 ` rguenth at gcc dot gnu.org
@ 2013-02-05 15:36 ` rguenth at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-05 15:36 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53342

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-05 15:36:14 UTC ---
I suppose fixed.  Not a regression to 4.7.x anyway(?)

