From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 615323851C20; Mon, 16 Nov 2020 20:11:37 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 615323851C20
From: "already5chosen at yahoo dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7
 times slower than -O3
Date: Mon, 16 Nov 2020 20:11:37 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 10.2.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: already5chosen at yahoo dot com
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-97832-4-jNhR40UY1K@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-97832-4@http.gcc.gnu.org/bugzilla/>
References: <bug-97832-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 16 Nov 2020 20:11:37 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
--- Comment #3 from Michael_S <already5chosen at yahoo dot com> ---
(In reply to Richard Biener from comment #2)
> It's again reassociation making a mess out of the natural SLP opportunity
> (and thus SLP discovery fails miserably).
>
> One idea worth playing with would be to change reassociation to rank
> references from the same load group (as later vectorization would
> discover) the same.
>
> That said, further analysis and maybe a smaller testcase to look at is useful
> here.  There is, after all, the opportunity to turn "bad" association at the
> source level to good for vectorization when -ffast-math is enabled as well.

It turns out that a much simpler kernel suffers from the same problem.

void foo1x1(double* restrict y, const double* restrict x, int clen)
{
  // AoSoA layout: each block of 8 doubles holds 4 real parts, then 4 imaginary parts.
  int xi = clen & 2;
  double f_re = x[0+xi+0];
  double f_im = x[4+xi+0];
  int clen2 = (clen+xi) * 2;
  #pragma GCC unroll 0
  for (int c = 0; c < clen2; c += 8) {
    // y[c] = y[c] - x[c]*conj(f);
    #pragma GCC unroll 4
    for (int k = 0; k < 4; ++k) {
      double x_re = x[c+0+k];
      double x_im = x[c+4+k];
      double y_re = y[c+0+k];
      double y_im = y[c+4+k];
      y_re = y_re - x_re * f_re - x_im * f_im;
      y_im = y_im + x_re * f_im - x_im * f_re;
      y[c+0+k] = y_re;
      y[c+4+k] = y_im;
    }
  }
}

Maybe it's possible to simplify it further, but probably not by much.
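
For anyone who wants to confirm that a build of the reduced kernel still computes y = y - x*conj(f) on the split re/im blocks, here is a minimal sketch of a correctness check. The kernel is repeated so that the file compiles on its own. The names ref1x1, check_foo1x1 and MAX_DOUBLES are hypothetical helpers made up for this sketch and are not part of the original testcase. The input values are chosen to be exactly representable, so both association orders should give identical results. The file would presumably be built twice, e.g. with -O3 -mavx2 -mfma and with -Ofast -mavx2 -mfma (the options named in the bug title). This checks results only and does not measure the slowdown.

```c
#include <complex.h>

/* Reduced kernel from this comment, repeated so the file compiles on its own. */
void foo1x1(double* restrict y, const double* restrict x, int clen)
{
  int xi = clen & 2;
  double f_re = x[0+xi+0];
  double f_im = x[4+xi+0];
  int clen2 = (clen+xi) * 2;
  #pragma GCC unroll 0
  for (int c = 0; c < clen2; c += 8) {
    #pragma GCC unroll 4
    for (int k = 0; k < 4; ++k) {
      double x_re = x[c+0+k];
      double x_im = x[c+4+k];
      double y_re = y[c+0+k];
      double y_im = y[c+4+k];
      y_re = y_re - x_re * f_re - x_im * f_im;
      y_im = y_im + x_re * f_im - x_im * f_re;
      y[c+0+k] = y_re;
      y[c+4+k] = y_im;
    }
  }
}

/* Hypothetical reference: the same update written with C99 complex types. */
static void ref1x1(double* y, const double* x, int clen)
{
  int xi = clen & 2;
  double complex f = x[xi] + x[4+xi] * I;
  int clen2 = (clen+xi) * 2;
  for (int c = 0; c < clen2; c += 8) {
    for (int k = 0; k < 4; ++k) {
      /* Gather one complex element from the 4-re / 4-im block. */
      double complex xv = x[c+k] + x[c+4+k] * I;
      double complex yv = y[c+k] + y[c+4+k] * I;
      yv -= xv * conj(f);
      y[c+k]   = creal(yv);
      y[c+4+k] = cimag(yv);
    }
  }
}

enum { MAX_DOUBLES = 64 };  /* size of the fixed test buffers (assumption) */

/* Returns the largest absolute difference between kernel and reference,
   or -1.0 if clen does not fit the fixed-size test buffers. */
double check_foo1x1(int clen)
{
  double x[MAX_DOUBLES], y[MAX_DOUBLES], y_ref[MAX_DOUBLES];
  if (clen < 0 || clen > MAX_DOUBLES / 2 - 2)
    return -1.0;
  /* Small multiples of 1/4 and 1/8: every product and sum is exact. */
  for (int i = 0; i < MAX_DOUBLES; ++i) {
    x[i] = 0.25 * i - 3.0;
    y[i] = y_ref[i] = 1.5 - 0.125 * i;
  }
  foo1x1(y, x, clen);
  ref1x1(y_ref, x, clen);
  double max_diff = 0.0;
  for (int i = 0; i < MAX_DOUBLES; ++i) {
    double d = y[i] - y_ref[i];
    if (d < 0.0) d = -d;
    if (d > max_diff) max_diff = d;
  }
  return max_diff;
}
```

Calling check_foo1x1(8) exercises the xi == 0 path and check_foo1x1(6) the xi == 2 path.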