From: "luoxhu at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le
Date: Tue, 26 Jul 2022 03:53:31 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069

--- Comment #15 from luoxhu at gcc dot gnu.org ---
In combine, the vec_select(vec_concat(...)) and the following vec_select are
combined into a single extract instruction, which seems reasonable for both LE
and BE?

R146:  0 1 2 3
R141:  4 5 6 7
R150:  2 6 3 7   // vec_select(vec_concat(r146:V4SI,r141:V4SI),[2 6 3 7])
R151:  R150[3]   // vec_select(r150:V4SI,3)

=>

R151:  R141[3]   // vec_select(r141:V4SI,3)

Trying 21 -> 24:
   21: r150:V4SI=vec_select(vec_concat(r146:V4SI,r141:V4SI),parallel)
      REG_DEAD r146:V4SI
      REG_DEAD r141:V4SI
   24: {r151:SI=vec_select(r150:V4SI,parallel);clobber scratch;}
Failed to match this instruction:
(parallel [
        (set (reg:SI 151)
            (vec_select:SI (reg:V4SI 141)
                (parallel [
                        (const_int 3 [0x3])
                    ])))
        (clobber (scratch:SI))
        (set (reg:V4SI 150)
            (vec_select:V4SI (vec_concat:V8SI (reg:V4SI 146)
                    (reg:V4SI 141))
                (parallel [
                        (const_int 2 [0x2])
                        (const_int 6 [0x6])
                        (const_int 3 [0x3])
                        (const_int 7 [0x7])
                    ])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:SI 151)
            (vec_select:SI (reg:V4SI 141)
                (parallel [
                        (const_int 3 [0x3])
                    ])))
        (set (reg:V4SI 150)
            (vec_select:V4SI (vec_concat:V8SI (reg:V4SI 146)
                    (reg:V4SI 141))
                (parallel [
                        (const_int 2 [0x2])
                        (const_int 6 [0x6])
                        (const_int 3 [0x3])
                        (const_int 7 [0x7])
                    ])))
    ])
Successfully matched this instruction:
(set (reg:V4SI 150)
    (vec_select:V4SI (vec_concat:V8SI (reg:V4SI 146)
            (reg:V4SI 141))
        (parallel [
                (const_int 2 [0x2])
                (const_int 6 [0x6])
                (const_int 3 [0x3])
                (const_int 7 [0x7])
            ])))
Successfully matched this instruction:
(set (reg:SI 151)
    (vec_select:SI (reg:V4SI 141)
        (parallel [
                (const_int 3 [0x3])
            ])))
allowing combination of insns 21 and 24
original costs 4 + 4 = 8
replacement costs 4 + 4 = 8
modifying insn i2    21: r150:V4SI=vec_select(vec_concat(r146:V4SI,r141:V4SI),parallel)
      REG_DEAD r146:V4SI
deferring rescan insn with uid = 21.
modifying insn i3    24: {r151:SI=vec_select(r141:V4SI,parallel);clobber scratch;}
      REG_DEAD r141:V4SI
deferring rescan insn with uid = 24.

I guess the previous unspec implementation bypassed the LE + LE swap check, so
now in split2, we should generate vextuwlx instead of vextuwrx on little
endian?
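
For reference, the index arithmetic behind the combination of insns 21 and 24
can be checked with a small model. This is only a sketch, not GCC code: the
helper names (vec_concat, vec_select, fold_select_of_concat) are made up for
illustration, and elements are treated as abstract element numbers. It says
nothing about how those numbers map onto register bytes on LE versus BE, which
is exactly the open question for split2.

```python
def vec_concat(a, b):
    """Model of RTL vec_concat: the elements of a followed by those of b."""
    return list(a) + list(b)


def vec_select(vec, indices):
    """Model of RTL vec_select: pick the listed element numbers from vec."""
    return [vec[i] for i in indices]


def fold_select_of_concat(outer_index, inner_indices, len_a):
    """Fold a one-element select applied on top of a select over
    vec_concat(a, b).  Returns (operand, element): which source operand
    the folded select reads, and which element number inside it."""
    idx = inner_indices[outer_index]
    if idx < len_a:
        return ("a", idx)
    return ("b", idx - len_a)


# Values taken from the dump above.
r146 = [0, 1, 2, 3]
r141 = [4, 5, 6, 7]
perm = [2, 6, 3, 7]

# Before combine: insn 21 followed by insn 24.
r150 = vec_select(vec_concat(r146, r141), perm)
r151_before = vec_select(r150, [3])[0]

# After combine: insn 24 reads one of the source registers directly.
operand, elem = fold_select_of_concat(3, perm, len(r146))
source = r146 if operand == "a" else r141
r151_after = vec_select(source, [elem])[0]

print(operand, elem, r151_before, r151_after)
```

In this abstract model the folded form picks element 3 of r141 and yields the
same value as the original two-insn sequence, so the combine step itself looks
endian-neutral; any LE/BE difference would have to come from how the extract
is later lowered to vextuwlx or vextuwrx.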