public inbox for gcc-bugs@sourceware.org
From: "dwwork at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug fortran/102510] Function call has unnecessary stride check
Date: Tue, 28 Sep 2021 13:55:47 +0000
Message-ID: <bug-102510-4-aa3UGBvg4N@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-102510-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102510

--- Comment #2 from Dalon Work <dwwork at gmail dot com> ---
Thanks for the information. Based on your comments, I've created 2 new
subroutines that call the "bad" function. The first places the result in a
contiguous array, while the second places the result in a strided array.
(https://godbolt.org/z/bTnWr3bMn)

The first:

subroutine add2vecs3(a,b,c)
    real(r32), dimension(8), intent(in) :: a,b
    real(r32), dimension(8), intent(out) :: c
    c = add2vecs2(a,b)
end subroutine

With "-O3 -mavx", this subroutine becomes fully vectorized:

__blah_MOD_add2vecs3:
        vmovups ymm0, YMMWORD PTR [rdi]
        vaddps  ymm0, ymm0, YMMWORD PTR [rsi]
        vmovups YMMWORD PTR [rdx], ymm0
        vzeroupper
        ret

The second:

subroutine add2vecs4(a,b,c)
    real(r32), dimension(8), intent(in) :: a,b
    real(r32), dimension(16), intent(out) :: c
    c(1:16:2) = add2vecs2(a,b)
end subroutine

In this case the addition itself is still vectorized, but the result is written
out one element at a time into the strided destination:

__blah_MOD_add2vecs4:
        vmovups ymm0, YMMWORD PTR [rsi]
        vaddps  ymm0, ymm0, YMMWORD PTR [rdi]
        vmovss  DWORD PTR [rdx], xmm0
        vextractps      DWORD PTR [rdx+8], xmm0, 1
        vextractps      DWORD PTR [rdx+16], xmm0, 2
        vextractps      DWORD PTR [rdx+24], xmm0, 3
        vextractf128    xmm0, ymm0, 0x1
        vmovss  DWORD PTR [rdx+32], xmm0
        vextractps      DWORD PTR [rdx+40], xmm0, 1
        vextractps      DWORD PTR [rdx+48], xmm0, 2
        vextractps      DWORD PTR [rdx+56], xmm0, 3
        vzeroupper
        ret

From this, it seems you are correct. The result is passed in as a descriptor
to a block of memory, and from that the function figures out the best way to
fill in the data. Perhaps other compilers handle this differently, but there we
have it.

Changing this behavior might be difficult or impossible, as this would be an
ABI change, would it not? It's arguable whether it's even worth changing.
What I had assumed is that the compiler would have a contiguous block of memory
available for the return result, and that any necessary striding would happen
outside the function.
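
To make that assumption concrete, here is a minimal sketch of doing the
striding by hand at the call site. The module name blah_sketch, the subroutine
add2vecs5, and the local tmp are hypothetical names introduced only for
illustration, and the body of add2vecs2 is an assumed stand-in for the "bad"
function from the original report, which is not reproduced in this comment.
The sketch makes no claim about what code GCC generates for it.

```fortran
module blah_sketch
    use, intrinsic :: iso_fortran_env, only: r32 => real32
    implicit none
contains
    ! Assumed stand-in for the "bad" function: adds two 8-element vectors
    ! and returns the sum as an array-valued function result.
    function add2vecs2(a, b) result(c)
        real(r32), dimension(8), intent(in) :: a, b
        real(r32), dimension(8) :: c
        c = a + b
    end function

    ! Hypothetical variant of add2vecs4: the function result first lands
    ! in a contiguous temporary, and the strided copy into c is a separate
    ! step performed by the caller.
    subroutine add2vecs5(a, b, c)
        real(r32), dimension(8), intent(in) :: a, b
        real(r32), dimension(16), intent(out) :: c
        real(r32), dimension(8) :: tmp
        tmp = add2vecs2(a, b)   ! contiguous destination
        c(1:16:2) = tmp         ! striding happens outside the function
    end subroutine
end module
```

As in add2vecs4, only the odd-indexed elements of c are defined after the call.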

