From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 3D5AC3893C44; Wed, 9 Jun 2021 12:41:51 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3D5AC3893C44 From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug tree-optimization/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7 times slower than -O3 Date: Wed, 09 Jun 2021 12:41:50 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: tree-optimization X-Bugzilla-Version: 10.2.0 X-Bugzilla-Keywords: missed-optimization X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: ASSIGNED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Jun 2021 12:41:51 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D97832 --- Comment #12 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:ce670e4faafb296d1f1a7828d20f8c8ba4686797 commit r12-1329-gce670e4faafb296d1f1a7828d20f8c8ba4686797 Author: Richard Biener Date: Wed Nov 18 14:17:34 2020 +0100 tree-optimization/97832 - handle associatable chains in SLP discovery This makes SLP discovery handle associatable (including mixed plus/minus) chains better by swapping operands across the whole chain. To work this adds caching of the 'matches' lanes for failed SLP discovery attempts, thereby fixing a failed SLP discovery for the slp-pr98855.cc testcase which results in building an operand from scalars as expected. Unfortunately this makes us trip over the cost threshold so I'm XFAILing the testcase for now. For BB vectorization all this doesn't work because we have no way to distinguish good from bad associations as we eventually build operands from scalars and thus not fail in the classical sense. 2021-05-31 Richard Biener PR tree-optimization/97832 * tree-vectorizer.h (_slp_tree::failed): New. * tree-vect-slp.c (_slp_tree::_slp_tree): Initialize failed member. (_slp_tree::~_slp_tree): Free failed. (vect_build_slp_tree): Retain failed nodes and record matches in them, copying that back out when running into a cached fail. Dump start and end of discovery. (dt_sort_cmp): New. (vect_build_slp_tree_2): Handle associatable chains together doing more aggressive operand swapping. * gcc.dg/vect/pr97832-1.c: New testcase. * gcc.dg/vect/pr97832-2.c: Likewise. * gcc.dg/vect/pr97832-3.c: Likewise. * g++.dg/vect/slp-pr98855.cc: XFAIL.=