From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/111784] [14 Regression] aarch64: ldp_stp_{15,16,17,18}.c test failures since r14-4579
Date: Mon, 23 Oct 2023 02:24:32 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization, testsuite-fail
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: linkw at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111784

--- Comment #3 from CVS Commits ---
The master branch has been updated by Kewen Lin:

https://gcc.gnu.org/g:1908775f7982bd2de36df5d94396eca0865bad9a

commit r14-4842-g1908775f7982bd2de36df5d94396eca0865bad9a
Author: Kewen Lin
Date:   Sun Oct 22 21:18:40 2023 -0500

vect: Cost adjacent vector loads/stores together [PR111784]

As suggested in comments [1][2], this patch changes the way some adjacent vector loads/stores are costed, from
costing one by one to costing them together once with the total number. This helps fix the exposed regression PR111784 on aarch64, since aarch64-specific costing can make different decisions depending on the costing way (counting with the total number vs. counting one by one). Based on a reduced test case from PR111784, considering only vec_num already fixes the regression, but vector loads/stores with regard to ncopies are also adjacent accesses, so they are considered as well.

btw, this patch leaves the costing of dr_explicit_realign and dr_explicit_realign_optimized alone to keep it simple. Changing the costing way could make a difference for them, since one of their costs depends on targetm.vectorize.builtin_mask_for_load and is counted according to the number of times it is called. IIUC, these two dr_alignment_support kinds are mainly used for old Power? (which only has 16-byte-aligned vector load/store and no unaligned vector load/store).

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630742.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630744.html

        PR tree-optimization/111784

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_store): Adjust costing way for
        adjacent vector stores, by costing them with the total number
        rather than costing them one by one.
        (vectorizable_load): Adjust costing way for adjacent vector loads,
        by costing them with the total number rather than costing them one
        by one.
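
To illustrate why the two costing ways described in the commit message can disagree, here is a minimal Python sketch. It is not GCC's cost model and not the aarch64 hook: `toy_store_cost`, `PAIR_COST`, and the rule that two adjacent stores share one paired instruction are assumptions made up for illustration. The only point is that a target cost function that is not linear in the count may return a different total when it is called once with the total number than when it is called once per access.

```python
PAIR_COST = 2  # hypothetical cost of one paired-store instruction (made-up value)


def toy_store_cost(count: int) -> int:
    """Hypothetical target hook: every two adjacent stores share one paired instruction."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # Round up: a leftover odd store still needs an instruction of its own.
    return ((count + 1) // 2) * PAIR_COST


def cost_one_by_one(n_adjacent: int) -> int:
    """Old way: call the hook once per access, each time with count == 1."""
    return sum(toy_store_cost(1) for _ in range(n_adjacent))


def cost_together(n_adjacent: int) -> int:
    """New way: call the hook once with the total number of adjacent accesses."""
    return toy_store_cost(n_adjacent)
```

In this toy model, four adjacent stores cost 8 when costed one by one (`cost_one_by_one(4)`) but 4 when costed together (`cost_together(4)`), while a single store costs the same either way. A hook that sees only `count == 1` on every call has no chance to recognise the pairing.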