public inbox for gcc-bugs@sourceware.org
* [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto
@ 2015-04-08 13:41 rguenth at gcc dot gnu.org
  2015-04-08 13:45 ` [Bug ipa/65701] " rguenth at gcc dot gnu.org
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-04-08 13:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

            Bug ID: 65701
           Summary: r221530 makes 187.facerec drop with -Ofast -flto
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Keywords: lto, missed-optimization
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
                CC: hubicka at gcc dot gnu.org

+2015-03-20  Jan Hubicka  <hubicka@ucw.cz>
+
+       * ipa-inline.c (can_inline_edge_p): Short circuit if inline_failed
+       already is final.
+       (ipa_inline): Recompute inline_failed codes.
+       * cif-code.def (FUNCTION_NOT_OPTIMIZED, REDEFINED_EXTERN_INLINE,
+       USES_COMDAT_LOCAL, ATTRIBUTE_MISMATCH, UNREACHABLE): Declare as 
+       CIF_FINAL_ERROR.

makes 187.facerec drop in
http://gcc.opensuse.org/SPEC/CFP/sb-megrez-head-64/recent.html, but only for
LTO.

revision range is 221529 (good) 221531 (bad).


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
@ 2015-04-08 13:45 ` rguenth at gcc dot gnu.org
  2015-04-08 13:47 ` rguenth at gcc dot gnu.org
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-04-08 13:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
-Ofast -march=native, that is.  (which may be the key to the issue?)


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
  2015-04-08 13:45 ` [Bug ipa/65701] " rguenth at gcc dot gnu.org
@ 2015-04-08 13:47 ` rguenth at gcc dot gnu.org
  2015-04-08 16:37 ` hubicka at gcc dot gnu.org
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-04-08 13:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
build log

/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
FaceRecTypes.o              -Ofast -march=native -flto=8 -fno-fat-lto-objects 
FaceRecTypes.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
parameterRoutines.o              -Ofast -march=native -flto=8
-fno-fat-lto-objects  parameterRoutines.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
cfftb.o              -Ofast -march=native -flto=8 -fno-fat-lto-objects 
cfftb.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
cfftf.o              -Ofast -march=native -flto=8 -fno-fat-lto-objects 
cfftf.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
cffti.o              -Ofast -march=native -flto=8 -fno-fat-lto-objects 
cffti.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
fft2d.o              -Ofast -march=native -flto=8 -fno-fat-lto-objects 
fft2d.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
gaborRoutines.o              -Ofast -march=native -flto=8 -fno-fat-lto-objects 
gaborRoutines.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
imageRoutines.o              -Ofast -march=native -flto=8 -fno-fat-lto-objects 
imageRoutines.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
graphRoutines.o              -Ofast -march=native -flto=8 -fno-fat-lto-objects 
graphRoutines.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran -c -o
FaceRec.o              -Ofast -march=native -flto=8 -fno-fat-lto-objects 
FaceRec.f90
/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/bin/gfortran
-Wl,-rpath=/gcc/spec/sb-megrez-head-64/x86_64/install-201503200620/lib64     
-Ofast -march=native -flto=8 -fno-fat-lto-objects  FaceRecTypes.o
parameterRoutines.o cfftb.o cfftf.o cffti.o fft2d.o gaborRoutines.o
imageRoutines.o graphRoutines.o FaceRec.o     -o facerec


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
  2015-04-08 13:45 ` [Bug ipa/65701] " rguenth at gcc dot gnu.org
  2015-04-08 13:47 ` rguenth at gcc dot gnu.org
@ 2015-04-08 16:37 ` hubicka at gcc dot gnu.org
  2015-04-09  0:03 ` hubicka at gcc dot gnu.org
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-08 16:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #3 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Yep, I looked into this regression a bit.  The patch just avoids some "false
positives" when inlining functions called once (i.e. cases where we think the
function will be optimized out but it really won't, so we end up with
duplication) and also some "false negatives".
As such, it can change whether fairly large functions get inlined. I will
oprofile to figure out which one it is here.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2015-04-08 16:37 ` hubicka at gcc dot gnu.org
@ 2015-04-09  0:03 ` hubicka at gcc dot gnu.org
  2015-04-09  4:08 ` hubicka at gcc dot gnu.org
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-09  0:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-04-09
     Ever confirmed|0                           |1

--- Comment #4 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
The difference in inlining decision is 
(- is with patch reverted, + is current mainline):

 Why inlining failed?
-function body not available                       :      568 calls,  4150392 freq, 0 count
---param large-function-growth limit reached       :        3 calls,     3000 freq, 0 count
+function body not available                       :      568 calls,  3750333 freq, 0 count
+--param large-function-growth limit reached       :        2 calls,   101000 freq, 0 count
 --param large-stack-frame-growth limit reached    :        1 calls,     1000 freq, 0 count
---param max-inline-insns-auto limit reached       :       37 calls,   265369 freq, 0 count
+--param max-inline-insns-auto limit reached       :       37 calls,   275141 freq, 0 count

that actually means a lot of changes in the particular inlining decisions,
because several large functions get inlined differently.

The beginning of the inline changes seems as expected: the growths are
corrected on mainline and thus we start considering functions in a different
order.

I will need to profile this.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2015-04-09  0:03 ` hubicka at gcc dot gnu.org
@ 2015-04-09  4:08 ` hubicka at gcc dot gnu.org
  2015-04-09 17:44 ` hubicka at gcc dot gnu.org
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-09  4:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #5 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
The profile difference is:

 52.31%  facerec  facerec            [.] MAIN__.lto_priv.3
 16.68%  facerec  facerec            [.] topcostfct.3487.lto_priv.4
  8.28%  facerec  facerec            [.] __gaborroutines_MOD_gabortrafo
  7.91%  facerec  facerec            [.] cfftb_
  7.20%  facerec  libgfortran.so.3   [.] _gfortrani_cshift0_r4
  2.76%  facerec  facerec            [.] __fft2d_MOD_fft2db
  1.54%  facerec  facerec            [.] __graphroutines_MOD_graphsimfct.constprop.0
  0.53%  facerec  libc-2.13.so       [.] __memcpy_ssse3

(mainline) WRT

 59.16%  facerec  facerec            [.] MAIN__.lto_priv.3
 10.95%  facerec  facerec            [.] __gaborroutines_MOD_gabortrafo
 10.51%  facerec  facerec            [.] cfftb1_
  9.33%  facerec  libgfortran.so.3   [.] _gfortrani_cshift0_r4
  3.64%  facerec  facerec            [.] __fft2d_MOD_fft2db
  2.07%  facerec  facerec            [.] __graphroutines_MOD_graphsimfct.constprop.0
  0.67%  facerec  libc-2.13.so       [.] __memcpy_ssse3
  0.57%  facerec  libgfortran.so.3   [.] _gfortrani_read_radix
  0.43%  facerec  libgcc_s.so.1      [.] __udivti3
  0.36%  facerec  libgfortran.so.3   [.] formatted_transfer
patch reverted. I wonder if we don't want to inline udivti... I suppose the
problem is that we no longer inline topcostfct, which we do not inline
because...

not inlinable: localmove.constprop/304 -> topcostfct/208, --param
large-function-growth limit reached

while patched tree succeeds:

Inlining topcostfct size 1393.
 Called once from localmove.constprop 740 insns.
                Accounting size:1132.00, time:12187.80 on predicate:(true)

Bumping the large-function-insns limit up to 4000 makes the function get
inlined but, curiously enough, causes further degradation. The profile is now:

 66.35%  facerec  facerec            [.] MAIN__.lto_priv.3
  8.93%  facerec  facerec            [.] __gaborroutines_MOD_gabortrafo
  8.72%  facerec  facerec            [.] cfftb_
  7.77%  facerec  libgfortran.so.3   [.] _gfortrani_cshift0_r4
  2.96%  facerec  facerec            [.] __fft2d_MOD_fft2db
  1.68%  facerec  facerec            [.] __graphroutines_MOD_graphsimfct.constprop.0
  0.55%  facerec  libc-2.13.so       [.] __memcpy_ssse3
  0.47%  facerec  libgfortran.so.3   [.] _gfortrani_read_radix
  0.34%  facerec  libgcc_s.so.1      [.] __udivti3
  0.30%  facerec  libgfortran.so.3   [.] formatted_transfer
  0.22%  facerec  libgfortran.so.3   [.] next_format0
  0.22%  facerec  facerec            [.] cfftf_
  0.20%  facerec  libgfortran.so.3   [.] _gfortrani_read_block_form
so basically identical except that mainline inlines cfftb1_ and the patched
tree inlines cfftb_ which is a wrapper. Perhaps the wrapper heuristics may be
generalized for this.

From: michal.misiaszek at kofinder dot com @ 2015-04-09 04:50 UTC
  To: gcc-bugs
Subject: [Bug gcov-profile/43341] pragma pack changes padding in struct gcov_info on 64-bit archs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43341

Michal Misiaszek <michal.misiaszek at kofinder dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |michal.misiaszek at kofinder dot c
                   |                            |om

--- Comment #8 from Michal Misiaszek <michal.misiaszek at kofinder dot com> ---
Hi,
I was struggling for the last 2 nights with my application, which generated
coverage files for all source files but one. In the end I found out that the
C++ file was including a .h file. If I commented out the .h file then coverage
was generated. The only unusual thing I noticed was #pragma pack(1).
When I removed it the coverage was generated again!
After a short search I found this bug. I can reproduce it all the time.
The versions of g++ and gcov are below:
michal@ubuntu:~/git/blpr$ g++ --version
g++ (Ubuntu 4.9.1-16ubuntu6) 4.9.1
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

michal@ubuntu:~/git/blpr$ gcov --version
gcov (Ubuntu 4.9.1-16ubuntu6) 4.9.1
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.

Can you somehow patch it ?
Regards
Michal


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2015-04-09  4:08 ` hubicka at gcc dot gnu.org
@ 2015-04-09 17:44 ` hubicka at gcc dot gnu.org
  2015-04-09 17:56 ` hubicka at gcc dot gnu.org
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-09 17:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #6 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Strengthening the wrapper heuristics:
Index: ipa-inline.c
===================================================================
--- ipa-inline.c        (revision 221909)
+++ ipa-inline.c        (working copy)
@@ -1124,8 +1124,8 @@ edge_badness (struct cgraph_edge *edge,
          /* ... and edges executed only conditionally ... */
          && edge->frequency < CGRAPH_FREQ_BASE
          /* ... consider case where callee is not inline but caller is ... */
-         && ((!DECL_DECLARED_INLINE_P (edge->callee->decl)
-              && DECL_DECLARED_INLINE_P (caller->decl))
+         && ((DECL_DECLARED_INLINE_P (edge->callee->decl)
+              <= DECL_DECLARED_INLINE_P (caller->decl))
              /* ... or when early optimizers decided to split and edge
                 frequency still indicates splitting is a win ... */
              || (callee->split_part && !caller->split_part

and bumping up the large-function-insns to 4000 makes the hot inline decisions
look the same. Still does not solve the benchmark. This time it seems that MAIN
got slower because we inlined more into it.
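
To see how much the patch above loosens the wrapper heuristic, it may help to enumerate both conditions. This is a Python sketch rather than GCC code: the function names are hypothetical, and it assumes DECL_DECLARED_INLINE_P only ever yields 0 or 1, so that `<=` in the patched C code behaves as an implication.

```python
from itertools import product


def old_condition(callee_inline, caller_inline):
    """Pre-patch test: callee is not declared inline but the caller is."""
    return (not callee_inline) and caller_inline


def new_condition(callee_inline, caller_inline):
    """Patched test: `callee <= caller` on 0/1 flags, which is false only
    when the callee is declared inline and the caller is not."""
    return callee_inline <= caller_inline


if __name__ == "__main__":
    print("callee caller  old    new")
    for callee, caller in product((False, True), repeat=2):
        print(f"{int(callee):^6} {int(caller):^6}  "
              f"{str(old_condition(callee, caller)):<6} "
              f"{str(new_condition(callee, caller)):<6}")
```

Under that assumption the old test accepts one of the four flag combinations and the patched test accepts three, which matches the description of the change as a strengthening of the heuristic.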


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2015-04-09 17:44 ` hubicka at gcc dot gnu.org
@ 2015-04-09 17:56 ` hubicka at gcc dot gnu.org
  2015-04-09 18:19 ` hubicka at gcc dot gnu.org
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-09 17:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #7 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
OK, and setting --param large-function-insns=1000 gets the performance back.
The key seems to be in not inlining too much into main.  The hotspot changes
from:

  1.11  3682:   mov    0x60(%rsp),%rdx
  9.32  3687:   vmovss (%rax,%r12,2),%xmm5
  1.44          vmovss (%rax),%xmm6
  4.46          inc    %rdi
  0.01          add    $0x10,%rcx
  1.17          vinser $0x10,(%rax,%r13,1),%xmm5,%xmm0
  1.92          vinser $0x10,(%rax,%r12,1),%xmm6,%xmm1
  0.28          add    %r14,%rax
  0.07          vmovlh %xmm0,%xmm1,%xmm0
  2.48          vfmadd %xmm3,-0x10(%rcx),%xmm0,%xmm3
  5.15          cmp    %rdi,%rdx
  0.01          ja     3687
  1.21          vhaddp %xmm3,%xmm3,%xmm3
 10.30          mov    0x58(%rsp),%rax
  0.03          mov    %r13,0x10(%rsp)
  0.00          add    %rax,%rsi
  1.18          vhaddp %xmm3,%xmm3,%xmm3
 10.80          vaddss %xmm3,%xmm4,%xmm4
  4.47          cmp    0x68(%rsp),%rax

(the slower variant) to:

  1.38          xor    %ecx,%ecx
  6.04  17c0:   vmovss (%rax,%r11,2),%xmm3
  0.18          mov    0x90(%rsp),%rsi
  1.43          inc    %rcx
  1.42          vmovss (%rax),%xmm5
  0.36          vmovss (%rdx,%rbx,2),%xmm6
  2.81          vmovss (%rdx),%xmm7
  0.90          vinser $0x10,(%rax,%rsi,1),%xmm3,%xmm2
  2.96          mov    0x88(%rsp),%rsi
  0.04          vinser $0x10,(%rax,%r11,1),%xmm5,%xmm4
  2.76          add    0x70(%rsp),%rax
  0.07          vinser $0x10,(%rdx,%rbx,1),%xmm7,%xmm3
  0.02          vmovlh %xmm2,%xmm4,%xmm4
  2.69          vinser $0x10,(%rdx,%rsi,1),%xmm6,%xmm2
  1.13          add    0x78(%rsp),%rdx
  0.04          vmovlh %xmm2,%xmm3,%xmm2
  0.01          vfmadd %xmm0,%xmm2,%xmm4,%xmm0
  2.74          cmp    %rcx,0x80(%rsp)
  0.07          ja     17c0
  1.39          vhaddp %xmm0,%xmm0,%xmm0
  4.45          mov    0x48(%rsp),%rsi
  1.42          vhaddp %xmm0,%xmm0,%xmm0
  7.96          vaddss %xmm0,%xmm1,%xmm1
  4.09          cmp    %r15,0x60(%rsp)
  0.01          je     18b1

(the faster variant, dunno why)

From: hubicka at ucw dot cz @ 2015-04-09 17:59 UTC
  To: gcc-bugs
Subject: [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #8 from Jan Hubicka <hubicka at ucw dot cz> ---
With spaces removed to be readable
>
>   1.11  3682:   mov    0x60(%rsp),%rdx
>   9.32  3687:   vmovss (%rax,%r12,2),%xmm5
>   1.44          vmovss (%rax),%xmm6
>   4.46          inc    %rdi
>   0.01          add    $0x10,%rcx
>   1.17          vinser $0x10,(%rax,%r13,1),%xmm5,%xmm0
>   1.92          vinser $0x10,(%rax,%r12,1),%xmm6,%xmm1
>   0.28          add    %r14,%rax
>   0.07          vmovlh %xmm0,%xmm1,%xmm0
>   2.48          vfmadd %xmm3,-0x10(%rcx),%xmm0,%xmm3
>   5.15          cmp    %rdi,%rdx
>   0.01          ja     3687
>   1.21          vhaddp %xmm3,%xmm3,%xmm3
>  10.30          mov    0x58(%rsp),%rax
>   0.03          mov    %r13,0x10(%rsp)
>   0.00          add    %rax,%rsi
>   1.18          vhaddp %xmm3,%xmm3,%xmm3
>  10.80          vaddss %xmm3,%xmm4,%xmm4
>   4.47          cmp    0x68(%rsp),%rax
>
> (the slower variant) to:
>
>   1.38          xor    %ecx,%ecx
>   6.04  17c0:   vmovss (%rax,%r11,2),%xmm3
>   0.18          mov    0x90(%rsp),%rsi
>   1.43          inc    %rcx
>   1.42          vmovss (%rax),%xmm5
>   0.36          vmovss (%rdx,%rbx,2),%xmm6
>   2.81          vmovss (%rdx),%xmm7
>   0.90          vinser $0x10,(%rax,%rsi,1),%xmm3,%xmm2
>   2.96          mov    0x88(%rsp),%rsi
>   0.04          vinser $0x10,(%rax,%r11,1),%xmm5,%xmm4
>   2.76          add    0x70(%rsp),%rax
>   0.07          vinser $0x10,(%rdx,%rbx,1),%xmm7,%xmm3
>   0.02          vmovlh %xmm2,%xmm4,%xmm4
>   2.69          vinser $0x10,(%rdx,%rsi,1),%xmm6,%xmm2
>   1.13          add    0x78(%rsp),%rdx
>   0.04          vmovlh %xmm2,%xmm3,%xmm2
>   0.01          vfmadd %xmm0,%xmm2,%xmm4,%xmm0
>   2.74          cmp    %rcx,0x80(%rsp)
>   0.07          ja     17c0
>   1.39          vhaddp %xmm0,%xmm0,%xmm0
>   4.45          mov    0x48(%rsp),%rsi
>   1.42          vhaddp %xmm0,%xmm0,%xmm0
>   7.96          vaddss %xmm0,%xmm1,%xmm1
>   4.09          cmp    %r15,0x60(%rsp)
>   0.01          je     18b1
>
> (the faster variant, dunno why)


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2015-04-09 17:56 ` hubicka at gcc dot gnu.org
@ 2015-04-09 18:19 ` hubicka at gcc dot gnu.org
  2015-04-09 19:40 ` hubicka at gcc dot gnu.org
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-09 18:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #9 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
To me it seems like more inlining enables us to SRA the array descriptor, which
in turn enables the vectorizer to vectorize differently and slow down the code?


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2015-04-09 18:19 ` hubicka at gcc dot gnu.org
@ 2015-04-09 19:40 ` hubicka at gcc dot gnu.org
  2015-04-09 19:45 ` hubicka at gcc dot gnu.org
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-09 19:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenther at suse dot de,
                   |                            |vmakarov at redhat dot com

--- Comment #10 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
This is on clean mainline and bdver1 machine.

GCC with patch reverted runtime is:
real    0m50.714s
user    0m50.402s
sys     0m0.356s

and now with different inliner settings:

(talos4)$ sh compile

real    1m4.636s
user    1m4.270s
sys     0m0.420s
(talos4)$ sh compile --param large-function-insns=1000

real    0m51.063s
user    0m50.742s
sys     0m0.364s
(talos4)$ sh compile --param large-function-insns=100000 --param
large-stack-frame=100000

real    1m1.369s
user    1m1.012s
sys     0m0.407s
(talos4)$ sh compile -fno-tree-vectorize

real    1m0.629s
user    1m0.299s
sys     0m0.381s
(talos4)$ sh compile -fno-tree-vectorize --param large-function-insns=1000

real    0m53.375s
user    0m53.053s
sys     0m0.367s
(talos4)$ sh compile -fno-tree-vectorize --param large-function-insns=100000
--param large-stack-frame=100000

real    0m55.131s
user    0m54.826s
sys     0m0.351s

--param large-function-insns=1000 is thus the winner, but apparently by
accident.
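
For reference, the user times above can be put on one scale relative to the
run with the patch reverted. This is only a small sketch over the numbers
quoted in this comment; the helper names (parse_time, slowdown_percent,
report) are made up for illustration and are not part of any GCC tooling.

```python
import re


def parse_time(text):
    """Convert a `time`-style duration such as '1m4.270s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# User time with the patch reverted, as quoted in this comment.
BASELINE = parse_time("0m50.402s")

# User times per extra flag set, copied from the transcript above.
RUNS = {
    "(no extra flags)": "1m4.270s",
    "--param large-function-insns=1000": "0m50.742s",
    "--param large-function-insns=100000 --param large-stack-frame=100000":
        "1m1.012s",
    "-fno-tree-vectorize": "1m0.299s",
    "-fno-tree-vectorize --param large-function-insns=1000": "0m53.053s",
    "-fno-tree-vectorize --param large-function-insns=100000 "
    "--param large-stack-frame=100000": "0m54.826s",
}


def slowdown_percent(text, baseline=BASELINE):
    """Percentage by which a run is slower than the baseline."""
    return (parse_time(text) / baseline - 1.0) * 100.0


def report():
    """One line per configuration, fastest first."""
    ordered = sorted(RUNS.items(), key=lambda item: parse_time(item[1]))
    return [f"{slowdown_percent(user):6.1f}%  {flags}" for flags, user in ordered]


if __name__ == "__main__":
    print("\n".join(report()))
```

Sorting by user time rather than real time is a choice made here; the two
track each other closely in all of the runs above.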

It seems that the tree vectorizer actually makes the code slower when more
inlining and SRA happen. Richard, perhaps with your vect-costmodel-fu, you can
take a look? It may also be just an RA issue, but I do not see particularly
many spills in the internal loops.

To completely flatten the whole benchmark, one also needs to bump up
max-inline-insns-auto. This seems to further degrade performance both with and
without the vectorizer, so it may also be just a register pressure and
IRA issue.

Richard, since it is the second time we run into large-function-insns being
beneficial, I wonder if you can patch frescobaldi or czerny (so we have c++
benchmark and LTO spec covered) with change of the parameter value?

The current value was never really tuned; it is quite possibly just too large.
I will see if I can get anything useful out of firefox benchmarks.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2015-04-09 19:40 ` hubicka at gcc dot gnu.org
@ 2015-04-09 19:45 ` hubicka at gcc dot gnu.org
  2015-04-10  9:10 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-09 19:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #11 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Also:
(talos4)$ sh compile  -fno-tree-sra

real    0m52.668s
user    0m52.365s
sys     0m0.348s

So it indeed looks like an issue related either to the vectorizer getting too
fancy with prior SRA or simply to register pressure.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2015-04-09 19:45 ` hubicka at gcc dot gnu.org
@ 2015-04-10  9:10 ` rguenth at gcc dot gnu.org
  2015-04-10  9:35 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-04-10  9:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Ganesh.Gopalasubramanian@am
                   |                            |d.com

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
I notice some (obvious) differences (just glancing at -fopt-info)

graphRoutines.f90:393
graphRoutines.f90:359

are not peeled for alignment when vectorized in the good case.

But it seems that's ok (well, we're peeling too much for alignment IMHO...).
In the fast variant we vectorize strided loads while in the slow variant
we can use vector loads for one of the loads (and we made sure to use
aligned loads by peeling).

  1.11 │3682:   mov    0x60(%rsp),%rdx
  9.32 │3687:┌─→vmovss (%rax,%r12,2),%xmm5
  1.44 │     │  vmovss (%rax),%xmm6
  4.46 │     │  inc    %rdi
  0.01 │     │  add    $0x10,%rcx
  1.17 │     │  vinser $0x10,(%rax,%r13,1),%xmm5,%xmm0
  1.92 │     │  vinser $0x10,(%rax,%r12,1),%xmm6,%xmm1
  0.28 │     │  add    %r14,%rax
  0.07 │     │  vmovlh %xmm0,%xmm1,%xmm0
  2.48 │     │  vfmadd %xmm3,-0x10(%rcx),%xmm0,%xmm3
  5.15 │     │  cmp    %rdi,%rdx
  0.01 │     └──ja     3687

so maybe the vfmadd with a memory operand is just bad for the pipeline
(I suspect bad for the decoder at least).

To me it really looks like trunk generates better code but we run into
a very odd bdver2 architectural issue (if the above loop is really the issue).
You could try disabling peeling for alignment with --param
vect-max-peeling-for-alignment=0 (so you get unaligned load and a vfmadd
without memory operand).

I don't think this is a RA issue.

Ganesh?
From: "redi at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/65728] template instantiation complains of sizeof failing due to incomplete definition
Date: Fri, 10 Apr 2015 09:13:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65728

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Your testcase is invalid because it has an uninitialized reference member that
can never be initialized.

If you fix that, I think it should compile, so G++ is wrong.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2015-04-10  9:10 ` rguenth at gcc dot gnu.org
@ 2015-04-10  9:35 ` rguenth at gcc dot gnu.org
  2015-04-10 10:09 ` [Bug ipa/65701] [5 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2 rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-04-10  9:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
Having two extra loops (prologue and epilogue) in the deep loop nest may not be
the best idea.
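
To make the cost concrete, here is a rough model of how peeling for alignment
splits an inner loop into a scalar prologue, a vector body and a scalar
epilogue. It is only an illustration and not the vectorizer's actual logic:
the function name split_for_alignment, the 4-byte element size, the 16-byte
vector width and the example addresses are all assumptions.

```python
def split_for_alignment(start_addr, n_iters, elem_size=4, vector_bytes=16):
    """Return (prologue, vector_iters, epilogue) iteration counts.

    Assumes start_addr is a multiple of elem_size and that the loop walks
    contiguous elements, one per scalar iteration.
    """
    lanes = vector_bytes // elem_size
    misalign = start_addr % vector_bytes
    # Scalar iterations needed to reach the next vector boundary.
    prologue = ((vector_bytes - misalign) % vector_bytes) // elem_size
    # A short loop may finish before it ever becomes aligned.
    prologue = min(prologue, n_iters)
    remaining = n_iters - prologue
    vector_iters = remaining // lanes   # full-width vector iterations
    epilogue = remaining % lanes        # scalar leftovers
    return prologue, vector_iters, epilogue


if __name__ == "__main__":
    # Hypothetical start addresses and trip counts, chosen for illustration.
    for addr, n in [(0x1000, 10), (0x1008, 10), (0x1004, 2)]:
        print(hex(addr), n, split_for_alignment(addr, n))
```

With short trip counts a large share of the iterations ends up in the scalar
prologue and epilogue, and both extra loops are entered again on every pass of
the enclosing loops.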


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] [5 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2015-04-10  9:35 ` rguenth at gcc dot gnu.org
@ 2015-04-10 10:09 ` rguenth at gcc dot gnu.org
  2015-04-10 19:46 ` hubicka at gcc dot gnu.org
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-04-10 10:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*
   Target Milestone|---                         |5.0
            Summary|r221530 makes 187.facerec   |[5 Regression] r221530
                   |drop with -Ofast -flto      |makes 187.facerec drop with
                   |                            |-Ofast -flto on bdver2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] [5 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2015-04-10 10:09 ` [Bug ipa/65701] [5 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2 rguenth at gcc dot gnu.org
@ 2015-04-10 19:46 ` hubicka at gcc dot gnu.org
  2015-04-12 22:42 ` [Bug ipa/65701] [5/6 " hubicka at gcc dot gnu.org
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-10 19:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #15 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
It would be nice to test it on an AVX-enabled Intel CPU.  There are IMO at
least two things: first is the vectorizer oddity, second is that the fastest
code seems to happen with large-function-insns=1000.

I have no problem with adjusting this parameter - it was largely untuned for
years, but I would like to have some idea why the inlining hurts and if we can
fix that instead. Sadly the functions are quite big...


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] [5/6 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2015-04-10 19:46 ` hubicka at gcc dot gnu.org
@ 2015-04-12 22:42 ` hubicka at gcc dot gnu.org
  2015-04-15  7:57 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu.org @ 2015-04-12 22:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #16 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
However, the SPEC score seems to indicate that well over half of the
performance gap has been closed by the vectorizer change. Good ;)


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] [5/6 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2015-04-12 22:42 ` [Bug ipa/65701] [5/6 " hubicka at gcc dot gnu.org
@ 2015-04-15  7:57 ` rguenth at gcc dot gnu.org
  2015-05-22  9:09 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-04-15  7:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
   Target Milestone|5.0                         |5.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] [5/6 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2015-04-15  7:57 ` rguenth at gcc dot gnu.org
@ 2015-05-22  9:09 ` rguenth at gcc dot gnu.org
  2015-05-26 11:05 ` [Bug ipa/65701] [5 " rguenth at gcc dot gnu.org
  2015-07-16  9:16 ` rguenth at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-05-22  9:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Fri May 22 09:08:46 2015
New Revision: 223528

URL: https://gcc.gnu.org/viewcvs?rev=223528&root=gcc&view=rev
Log:
2015-05-22  Richard Biener  <rguenther@suse.de>

        PR tree-optimization/65701
        * tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
        Move peeling cost models into one place.  Peel for alignment
        for single loads only if an aligned load is cheaper than
        an unaligned load.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-vect-data-refs.c


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] [5 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2015-05-22  9:09 ` rguenth at gcc dot gnu.org
@ 2015-05-26 11:05 ` rguenth at gcc dot gnu.org
  2015-07-16  9:16 ` rguenth at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-05-26 11:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |6.0
            Summary|[5/6 Regression] r221530    |[5 Regression] r221530
                   |makes 187.facerec drop with |makes 187.facerec drop with
                   |-Ofast -flto on bdver2      |-Ofast -flto on bdver2

--- Comment #18 from Richard Biener <rguenth at gcc dot gnu.org> ---
Facerec is back on the tester.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug ipa/65701] [5 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2
  2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2015-05-26 11:05 ` [Bug ipa/65701] [5 " rguenth at gcc dot gnu.org
@ 2015-07-16  9:16 ` rguenth at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-07-16  9:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|5.2                         |5.3

--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 5.2 is being released, adjusting target milestone to 5.3.


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2015-07-16  9:16 UTC | newest]

Thread overview: 20+ messages
2015-04-08 13:41 [Bug ipa/65701] New: r221530 makes 187.facerec drop with -Ofast -flto rguenth at gcc dot gnu.org
2015-04-08 13:45 ` [Bug ipa/65701] " rguenth at gcc dot gnu.org
2015-04-08 13:47 ` rguenth at gcc dot gnu.org
2015-04-08 16:37 ` hubicka at gcc dot gnu.org
2015-04-09  0:03 ` hubicka at gcc dot gnu.org
2015-04-09  4:08 ` hubicka at gcc dot gnu.org
2015-04-09 17:44 ` hubicka at gcc dot gnu.org
2015-04-09 17:56 ` hubicka at gcc dot gnu.org
2015-04-09 18:19 ` hubicka at gcc dot gnu.org
2015-04-09 19:40 ` hubicka at gcc dot gnu.org
2015-04-09 19:45 ` hubicka at gcc dot gnu.org
2015-04-10  9:10 ` rguenth at gcc dot gnu.org
2015-04-10  9:35 ` rguenth at gcc dot gnu.org
2015-04-10 10:09 ` [Bug ipa/65701] [5 Regression] r221530 makes 187.facerec drop with -Ofast -flto on bdver2 rguenth at gcc dot gnu.org
2015-04-10 19:46 ` hubicka at gcc dot gnu.org
2015-04-12 22:42 ` [Bug ipa/65701] [5/6 " hubicka at gcc dot gnu.org
2015-04-15  7:57 ` rguenth at gcc dot gnu.org
2015-05-22  9:09 ` rguenth at gcc dot gnu.org
2015-05-26 11:05 ` [Bug ipa/65701] [5 " rguenth at gcc dot gnu.org
2015-07-16  9:16 ` rguenth at gcc dot gnu.org
