public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
From: "jakub at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/60086] suboptimal asm generated for a loop (store/load false aliasing)
Date: Thu, 06 Feb 2014 08:28:00 -0000	[thread overview]
Message-ID: <bug-60086-4-7N1FaxzNqA@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-60086-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60086

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-02-06
                 CC|                            |abel at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC right now only handles __restrict on function parameters, so in this case
the aliasing info isn't known.  While the loop is versioned for aliasing at
runtime, the info about that is only known during the vectorizer, therefore
e.g. scheduler can hardly know it.  The pointers to overaligned memory is
something you should generally avoid, __builtin_assume_aligned is what can be
used to tell the compiler about the alignment instead, overaligned types often
actually hurt generated code instead of improving it.  And the way you are
calling posix_memalign is IMHO a strict aliasing violation.

Perhaps GCC could handle posix_memalign specially as builtin if declared with
the right prototype (and optionally some new attribute) and derive both the
aliasing and alignment info from it, like the taking of the address of the
pointer in it isn't really an escape site of any kind, all the call does is
return two values instead of just one, so it could be folded into passing an
address of some temporary to the call instead and then loading from the
temporary and using some special pass-thru builtin that would tell GCC that the
pointer is really malloc-like (non-aliasing anything else) and also
use__builtin_assume_aligned.  The GNU memalign is far better than
posix_memalign from this POV.

Anyway, if I rewrite your testcase as:
#include <stdlib.h>
#include <stdio.h>

__attribute__((noinline)) void
foo (double *__restrict__ a, double *__restrict__ b, double *__restrict__ c,
double *__restrict__ d, unsigned long NSIZE)
{
  unsigned long i, j;
  a = __builtin_assume_aligned (a, 32);
  b = __builtin_assume_aligned (b, 32);
  c = __builtin_assume_aligned (c, 32);
  d = __builtin_assume_aligned (d, 32);
  // initialize memory
  for(i=0; i<NSIZE; i++){
    a[i] = 0;
    b[i] = 0;
    c[i] = 0;
    d[i] = 0;
  }

  // outer loop - repeat short tests
  for(j=0; j<10000; j++){

    // inner loop - do the work
    for(i=0; i<NSIZE; i++){
      a[i] += b[i];
      c[i] += d[i];
    }

    // dummy - prevent loop interchange
    if(a[NSIZE/2]<0) printf("%lf\n", a[NSIZE/2]);
  }
}

int main(int argc, char*argv[])
{
  unsigned long NSIZE = atol(argv[1]);
  void *a, *b, *c, *d;

  // allocate starting from page boundary
  posix_memalign(&a, 4096, sizeof(double)*(NSIZE));
  posix_memalign(&b, 4096, sizeof(double)*(NSIZE));
  posix_memalign(&c, 4096, sizeof(double)*(NSIZE));
  posix_memalign(&d, 4096, sizeof(double)*(NSIZE));
  foo ((double *) a, (double *) b, (double *) c, (double *) d, NSIZE);
  return 0;
}

we don't do versioning for alias and also (as before) assume sufficient
alignment, but still the scheduler doesn't reorder the loads vs. the store,
unless -O3 -mavx -fschedule-insns.  The reason why the second scheduler doesn't
reorder those is that RA allocates the same register.  With -O3 -mavx
-fselective-scheduling2 the stores are also changed, but we end up with a
weird:
.L9:
        movq    -136(%rbp), %rdx
        vmovapd (%r9,%rax), %ymm0
        addq    $1, %rdi
        vmovapd (%r10,%rax), %ymm8
        vaddpd  (%rdx,%rax), %ymm0, %ymm0
        movq    -144(%rbp), %rdx
        vaddpd  (%rdx,%rax), %ymm8, %ymm9
        vmovapd %ymm0, (%r9,%rax)
        vmovapd %ymm8, %ymm0
        vmovapd %ymm9, %ymm0
        vmovapd %ymm0, (%r10,%rax)
        addq    $32, %rax
        cmpq    %rdi, -152(%rbp)
        ja      .L9
Why there is the vmovapd %ymm8, %ymm0 is a mystery, and vmovapd %ymm9, %ymm0
could be very well merged with the store into vmovapd %ymm9, (%r10,%rax).


  reply	other threads:[~2014-02-06  8:28 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-05 22:41 [Bug rtl-optimization/60086] New: " marcin.krotkiewski at gmail dot com
2014-02-06  8:28 ` jakub at gcc dot gnu.org [this message]
2014-02-06  9:34 ` [Bug rtl-optimization/60086] " marcin.krotkiewski at gmail dot com
2014-02-06 10:10 ` mpolacek at gcc dot gnu.org
2014-02-06 10:22 ` rguenth at gcc dot gnu.org
2014-02-07  8:52 ` abel at gcc dot gnu.org
2014-02-07  8:53 ` abel at gcc dot gnu.org
2014-02-07 14:33 ` amonakov at gcc dot gnu.org
2014-02-07 16:43 ` marcin.krotkiewski at gmail dot com
2014-02-07 17:21 ` amonakov at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-60086-4-7N1FaxzNqA@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).