From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id F24D23858D28; Mon,  7 Aug 2023 09:10:56 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org F24D23858D28
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1691399456;
	bh=TVwxnZ5xpmYezWeGbraqxJIfd8PXEpl1J9G/nDtogNU=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=Mjs4Zq1ZAX1zIJtWNiw6eWJoNyomtYVLk27yGlA+Y7nkPALMOa/p3BRTKUJ+IlY3U
	 EvggE6ASA71zLMHRzUQKqKNEmmJ1Vh0phwfdr+cYNA7bnIKqGodgbC7W9ePmaMkq9M
	 bMHquCvuTjM20Ym/iqJ4bYfCmkVop8I6oV6uAgxQ=
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/49955] Fails to do partial basic-block SLP
Date: Mon, 07 Aug 2023 09:10:50 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 4.7.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: bug_status assigned_to
Message-ID: <bug-49955-4-ecfQf3C1GA@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-49955-4@http.gcc.gnu.org/bugzilla/>
References: <bug-49955-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49955

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
The loop in comment#1 isn't vectorized because we do not have interleaving
support for a group size of 5:

t.f:18:17: missed:   the size of the group of accesses is not a power of 2 or not equal to 3
t.f:18:17: missed:   not falling back to elementwise accesses
t.f:19:72: missed:   not vectorized: relevant stmt not supported: t1_83 = (*q_82(D))[_21];
t.f:18:17: missed:  bad operation or unsupported loop bound.

We don't try to SLP this because there's just a single-lane reduction.
There's not really a loop vectorization opportunity and, as comment#3 says,
there's at most a BB reduction opportunity.  We try to analyze that now:

  _58 = powmult_9 + powmult_107;
  t7_108 = _58 + powmult_88;
  t7_109 =3D __builtin_sqrt (t7_108);
  M.7_110 =3D MAX_EXPR <t7_109, t8_126>;

and

t.f:28:72: note:   Starting SLP discovery for
t.f:28:72: note:     powmult_88 = _106 * _106;
t.f:28:72: note:     powmult_9 = _101 * _101;
t.f:28:72: note:     powmult_107 = _96 * _96;
t.f:28:72: note:   starting SLP discovery for node 0x50ef8a0
t.f:28:72: note:   Build SLP for powmult_88 = _106 * _106;
t.f:28:72: note:   get vectype for scalar type (group size 3): real(kind=8)
t.f:28:72: note:   vectype: vector(2) real(kind=8)
t.f:28:72: note:   nunits = 2
t.f:28:72: missed:   Build SLP failed: unrolling required in basic block SLP

We do not yet have code to limit a BB reduction vectorization to a subset
of lanes.  In this case the reduction is uniform, so choosing any
power-of-two number of elements would work, but ideally we'd let SLP
discovery figure out the "best" lane combination to vectorize.  There's
more missing support for BB reduction vectorization beyond this.
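
To illustrate what limiting the reduction to a subset of lanes would mean,
here is a hand-written C sketch.  It is not vectorizer code and not what GCC
emits; the function names and the array-based "vector" are assumptions made
up for the example.  It models the three-lane sum of squares feeding
__builtin_sqrt above, with two lanes handled as a vector(2) and the third
lane left scalar:

```c
#include <assert.h>

/* Scalar reduction in the same order as the GIMPLE above:
   (x0*x0 + x1*x1) + x2*x2.  */
double sum_squares_scalar(const double x[3])
{
    return x[0] * x[0] + x[1] * x[1] + x[2] * x[2];
}

/* Hypothetical partial vectorization: lanes `first` and `second` are
   treated as one two-lane vector (modelled here as a plain array),
   the remaining lane stays scalar.  */
double sum_squares_partial(const double x[3], int first, int second)
{
    assert(first != second);
    assert(first >= 0 && first < 3 && second >= 0 && second < 3);

    int rest = 3 - first - second;        /* index of the leftover lane */
    double v[2] = { x[first], x[second] };
    double sq[2];

    for (int i = 0; i < 2; ++i)           /* stands in for one vector multiply */
        sq[i] = v[i] * v[i];

    /* Reduce the two vector lanes, then add the scalar leftover.  */
    return (sq[0] + sq[1]) + x[rest] * x[rest];
}
```

Because all three lanes perform the same operation, any of the pairs (0,1),
(0,2) or (1,2) is a valid choice here.  Note that picking a different pair
changes the order of the floating-point additions, so I would expect this to
require reassociation to be permitted.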