From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id DBD7C3858408; Tue, 28 Sep 2021 19:03:22 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DBD7C3858408
From: "anlauf at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug fortran/102510] Function call has unnecessary stride check
Date: Tue, 28 Sep 2021 19:03:22 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: fortran
X-Bugzilla-Version: 11.2.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: anlauf at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-102510-4-MmWWEXTdrJ@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-102510-4@http.gcc.gnu.org/bugzilla/>
References: <bug-102510-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 28 Sep 2021 19:03:23 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102510

--- Comment #3 from anlauf at gcc dot gnu.org ---
It helps to look at the (Fortran) context.  As written, the subroutine version
is declared with explicit size contiguous arrays.  If the caller has a
non-contiguous (strided) result array, it needs to pack/unpack.  For the
function version - as is - we might need a temporary to handle different
situations.
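
A minimal sketch of what "pack/unpack" means here, written out by hand.  It
is only an illustration under stated assumptions, not the code gfortran
actually emits: the names pack_demo, add8 and tmp are hypothetical, and a
compiler is free to skip the copy-in for an intent(out) dummy.

```fortran
! Illustration only: the explicit temporary below mimics the copy-in/copy-out
! a compiler may generate when a strided section is passed to an
! explicit-shape dummy.  pack_demo, add8 and tmp are hypothetical names.
program pack_demo
  implicit none
  real    :: a(8), b(8), d(16), e(16), tmp(8)
  integer :: i

  a = [(real(i), i = 1, 8)]
  b = 10.0
  d = 0.0
  e = 0.0

  ! Direct call: the actual argument d(1:16:2) is not contiguous.
  call add8 (a, b, d(1:16:2))

  ! Hand-written equivalent using a contiguous temporary.
  tmp = e(1:16:2)          ! pack (copy-in); may be skipped for intent(out)
  call add8 (a, b, tmp)
  e(1:16:2) = tmp          ! unpack (copy-out)

  ! Both paths should leave the same values in the strided elements.
  if (any (d /= e)) error stop "results differ"
  print '(8f6.1)', d(1:16:2)
contains
  subroutine add8 (x, y, z)
    real, intent(in)  :: x(8), y(8)
    real, intent(out) :: z(8)
    z = x + y
  end subroutine add8
end program pack_demo
```

If the call is inlined, the optimizer has a chance to remove the temporary
and store directly into the strided elements.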

However, if you offer the compiler the chance to inline the calls, and use
optimization to inline the packing, you may get better code than you think.

Compile this example with -O3 -mavx:

module p
  use iso_fortran_env, only: r32 => real32
  real(r32), dimension(8)  :: a,b
  real(r32), dimension(8)  :: c1, c2
  real(r32), dimension(16) :: d1, d2
contains
  subroutine add2vecs1(a,b,c)
    real(r32), dimension(8), intent(in) :: a,b
    real(r32), dimension(8), intent(out) :: c
    c = a + b
  end subroutine add2vecs1
  function add2vecs2(a,b)
    real(r32), dimension(8), intent(in) :: a,b
    real(r32), dimension(8) :: add2vecs2
    add2vecs2 = a + b
  end function add2vecs2
  !-
  subroutine s1 ()
    call add2vecs1 (a, b, c1)
  end subroutine s1
  !-
  subroutine s2 ()
    c2         = add2vecs2 (a, b)
  end subroutine s2
  !-
  subroutine s3 ()
    call add2vecs1 (a, b, d1(1:16:2))
  end subroutine s3
  !-
  subroutine s4 ()
    d2(1:16:2) = add2vecs2 (a, b)
  end subroutine s4
end

You'll find that s1 and s2 compile to the same code, and so do the strided
versions s3 and s4 (at least this is my reading of the assembly, but correct
me if I am wrong).

Is there really more to expect?