[Bug rtl-optimization/60086] suboptimal asm generated for a loop (store/load false aliasing)

public inbox for gcc-bugs@sourceware.org

From: "amonakov at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/60086] suboptimal asm generated for a loop (store/load false aliasing)
Date: Fri, 07 Feb 2014 14:33:00 -0000
Message-ID: <bug-60086-4-FHiLXfMsb5@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-60086-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60086

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #7 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #1)
> alignment, but still the scheduler doesn't reorder the loads vs. the store,
> unless -O3 -mavx -fschedule-insns.  The reason why the second scheduler
> doesn't reorder those is that RA allocates the same register

I think you usually want -fschedule-insns (pre-regalloc scheduling) or
-frename-registers rather than -fselective-scheduling2 when the goal is to
work around RA conservativeness.  Unfortunately, stack accesses in the loop
prevent sched2 from using the additional freedom supplied by regrename for AVX
code in this case (when tuning is enabled).  The stack accesses seem to be a
trunk regression, judging by the good code supplied in the opening comment.

(-O3 -mavx -fschedule-insns or -frename-registers, same modulo ymm* names;
the %rbp-based accesses in the loop are pretty bad, but otherwise it's
scheduled as desired)
.L9:    
        movq    -136(%rbp), %rdx
        vmovapd (%r9,%rax), %ymm1
        addq    $1, %rdi
        vmovapd (%r10,%rax), %ymm0
        vaddpd  (%rdx,%rax), %ymm1, %ymm1
        movq    -144(%rbp), %rdx
        vaddpd  (%rdx,%rax), %ymm0, %ymm0
        vmovapd %ymm1, (%r9,%rax)
        vmovapd %ymm0, (%r10,%rax)
        addq    $32, %rax
        cmpq    %rdi, -152(%rbp)
        ja      .L9

(-O3 -fschedule-insns or -frename-registers, same modulo xmm* names, scheduled
as desired)
.L7:
        movapd  (%r9,%rax), %xmm0
        addq    $1, %rdi
        movapd  (%r10,%rax), %xmm2
        addpd   (%r11,%rax), %xmm0
        addpd   (%rcx,%rax), %xmm2
        movaps  %xmm0, (%r9,%rax)
        movaps  %xmm2, (%r10,%rax)
        addq    $16, %rax
        cmpq    %rdi, %r8
        ja      .L7

(-mavx -O3 -mtune=corei7-avx -frename-registers, stack-based references prevent
good scheduling)
.L9:
        movq    -136(%rbp), %rdx
        addq    $1, %rdi
        vmovapd (%r9,%rax), %ymm0
        vmovapd (%r10,%rax), %ymm3
        vaddpd  (%rdx,%rax), %ymm0, %ymm2
        movq    -144(%rbp), %rdx
        vmovapd %ymm2, (%r9,%rax)
        vaddpd  (%rdx,%rax), %ymm3, %ymm4
        vmovapd %ymm4, (%r10,%rax)
        addq    $32, %rax
        cmpq    %rdi, -152(%rbp)
        ja      .L9

(-mavx -O3 -mtune=corei7-avx -fschedule-insns -fno-ivopts, no spilling in the
loop, scheduled as desired)
.L9:    
        addq    $32, %rcx
        addq    $32, %r10
        vmovapd (%rdx), %ymm1
        addq    $32, %rsi
        vmovapd (%rdi), %ymm0
        addq    $32, %r11
        addq    $1, %rax
        addq    $32, %rdx
        vaddpd  -32(%rcx), %ymm1, %ymm1
        addq    $32, %rdi
        vaddpd  -32(%r10), %ymm0, %ymm0
        vmovapd %ymm1, -32(%rsi)
        vmovapd %ymm0, -32(%r11)
        cmpq    %rax, -184(%rbp)
        ja      .L9
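
For readers without the opening comment at hand: the testcase is not
reproduced in this message, but judging only from the listings above, each
loop loads from two arrays, adds elements from two other arrays, and stores
the results back. A minimal C sketch of a loop with that shape follows. The
function name add_pairs, the parameter names, and the use of restrict are
assumptions made for illustration, not the reporter's code; the aligned moves
(vmovapd/movapd) also suggest the original carries alignment hints, which are
omitted here. "Scheduled as desired" in the listings means both loads are
issued before either store.

```c
#include <stddef.h>

/* Hypothetical reconstruction inferred from the asm listings in this
 * message; NOT the testcase from the opening comment of the bug.
 * Two independent read-modify-write streams: a += c and b += d.
 * With restrict, the store to a[i] cannot alias the load of b[i],
 * so a scheduler is free to hoist both loads above both stores. */
void add_pairs(size_t n,
               double *restrict a, double *restrict b,
               const double *restrict c, const double *restrict d)
{
    for (size_t i = 0; i < n; i++) {
        a[i] += c[i];   /* load a[i], add c[i], store a[i] */
        b[i] += d[i];   /* load b[i], add d[i], store b[i] */
    }
}
```

Compiling a file like this with the flag combinations quoted above and
comparing the order of loads and stores in the loop body is one way to check
whether a given GCC version shows the same behaviour; the exact output will
depend on the compiler version and target.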
