From: "meissner at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/81594] Optimize PowerPC vector set and store
Date: Wed, 25 Mar 2020 15:38:18 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81594

--- Comment #6 from Michael Meissner ---
If you look at the original patch, it did try to do this optimization. When I looked at it some time later, the combiner no longer generated the sequence because it thought it was slower (due to length, etc.).
You could spend a lot of time tuning the code so that the combiner eventually generates it again, but it was simpler to just put the peephole in to catch the cases that show up. If you want to take on the bug and do it earlier, go ahead. A peephole2 might not catch all uses, but it prevents whack-a-mole, where a change causes other code generation changes down the pike.

Note, the original patch was written in the power8 time frame, and it would now need to be adjusted for power9 and future systems (i.e. the patch only does the splitting if the value is in an FPR or GPR, while on power9 it could be in a traditional Altivec register).

However, the splitter uses reload_completed, which you always seem to object to. It could be done before register allocation, but then you would need to make sure that no other pass recombines the two separate items back into a vector.