From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 72BAC3858000; Thu,  2 Nov 2023 10:00:50 +0000 (GMT)
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/112331] Fail vectorization after loop
 interchange
Date: Thu, 02 Nov 2023 10:00:48 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: short_desc cc
Message-ID: <bug-112331-4-NtWTkbJvXZ@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-112331-4@http.gcc.gnu.org/bugzilla/>
References: <bug-112331-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112331

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|middle-end: Fail            |Fail vectorization after
                   |vectorization               |loop interchange
                 CC|                            |rguenth at gcc dot gnu.org
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, the "issue" is that we are performing loop interchange on this benchmark
loop and the vectorizer doesn't like the zero-step in the then innermost loop.

It's not a practical example; nobody would write such an outer loop in practice.

There's a missed optimization in that we fail to elide the then inner loop.
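
To illustrate the two points above, here is a hand-written, source-level
sketch of the loop nest before interchange, after interchange, and with the
redundant inner loop elided. It is not GCC's actual output. LEN and REPS are
hypothetical stand-ins for LEN_1D and 2*iterations, and the function names
are made up for this example. The sketch assumes no use of 'a' between
repetitions, which is the situation that allows the interchange.

```c
#include <string.h>

typedef float real_t;

/* Hypothetical sizes, standing in for LEN_1D and 2*iterations. */
enum { LEN = 64, REPS = 8 };

/* Original nest: repetition loop outside, element loop inside. */
static void original_nest(real_t *a, const real_t *b)
{
    for (int nl = 0; nl < REPS; nl++)
        for (int i = 1; i < LEN; i += 2)
            a[i] = a[i - 1] + b[i];
}

/* Hand-interchanged nest: the innermost loop now runs over nl, but none
   of a[i], a[i - 1] or b[i] depends on nl, so every access in the
   innermost loop has a step of zero. */
static void interchanged_nest(real_t *a, const real_t *b)
{
    for (int i = 1; i < LEN; i += 2)
        for (int nl = 0; nl < REPS; nl++)
            a[i] = a[i - 1] + b[i];
}

/* Inner loop elided: each nl iteration stores the same value, because
   only odd elements are written and a[i - 1] is always an even element.
   Valid as long as REPS >= 1. */
static void elided_nest(real_t *a, const real_t *b)
{
    for (int i = 1; i < LEN; i += 2)
        a[i] = a[i - 1] + b[i];
}

/* Fill both arrays with small values that are exact in float. */
static void fill(real_t *a, real_t *b)
{
    for (int i = 0; i < LEN; i++) {
        a[i] = (real_t)i;
        b[i] = (real_t)(2 * i + 1);
    }
}

/* Returns 1 if all three variants leave 'a' in the same state. */
int nests_agree(void)
{
    real_t a0[LEN], a1[LEN], a2[LEN], b[LEN];

    fill(a0, b);
    fill(a1, b);
    fill(a2, b);

    original_nest(a0, b);
    interchanged_nest(a1, b);
    elided_nest(a2, b);

    return memcmp(a0, a1, sizeof a0) == 0
        && memcmp(a0, a2, sizeof a0) == 0;
}
```

Calling nests_agree() from a small driver compares the three variants on
the same input. The elided form is an ordinary single loop with a step of
two and no zero-step access.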

The solution is to insert a use of 'a' after the inner loop, like TSVC
benchmarks usually have:

real_t s111(struct args_t * func_args)
{
//    linear dependence testing
//    no dependence - vectorizable

    initialise_arrays(__func__);

    for (int nl = 0; nl < 2*iterations; nl++) {
        for (int i = 1; i < LEN_1D; i += 2) {
            a[i] = a[i - 1] + b[i];
        }
        dummy(a, b, c, d, e, aa, bb, cc, 0.);
    }

    return calc_checksum(__func__);
}

then it just works(TM).

WONTFIX (in the vectorizer).  In "theory" the interchanged loop could be
vectorized by outer loop vectorization.  But as said, IMHO a waste of time
to cheat badly written benchmarks.