From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 568D83858C1F; Fri, 14 Apr 2023 07:32:41 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 568D83858C1F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1681457561; bh=7mX+6KJpiBgGjfLaKGVlqbsRAe4RRwIhT6x61IGui3U=; h=From:To:Subject:Date:In-Reply-To:References:From; b=e6fJImy7drJ1qfUr8ojm3mJ60XJs972ByJUJYJADJ3JXMc3FR5jy8qOPVp6RvN8gC rZdidgKXCZqxvC4rH39V5DJ/QEXRv8GvRZ3a2kHAz+eSxtokFbEq2j96ZayJo88qmA XuaWxIBBmbrLe8UF016rXgi5nJaq3cOpz5EwDq7o= From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug target/109040] [13 Regression] wrong code with v16hi compare & mask on riscv64 at -O2 since r13-4907-g2e886eef7f2b5a Date: Fri, 14 Apr 2023 07:32:40 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: target X-Bugzilla-Version: 13.0 X-Bugzilla-Keywords: wrong-code X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: RESOLVED X-Bugzilla-Resolution: DUPLICATE X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: 13.0 X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D109040 --- Comment #9 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:9d1a6119590ef828f9782a7083d03e535bc2f2cf commit r13-7178-g9d1a6119590ef828f9782a7083d03e535bc2f2cf Author: Jakub Jelinek Date: Fri Apr 14 09:20:49 2023 +0200 combine: Fix AND handling for WORD_REGISTER_OPERATIONS targets [PR10904= 0] The following testcase is miscompiled on riscv since the addition of *mvconst_internal define_insn_and_split. We have: (insn 36 35 39 2 (set (mem/c:SI (plus:SI (reg/f:SI 65 frame) (const_int -64 [0xffffffffffffffc0])) [2 S4 A128]) (reg:SI 166)) "pr109040.c":9:11 178 {*movsi_internal} (expr_list:REG_DEAD (reg:SI 166) (nil))) (insn 39 36 40 2 (set (reg:SI 171) (zero_extend:SI (mem/c:HI (plus:SI (reg/f:SI 65 frame) (const_int -64 [0xffffffffffffffc0])) [0 S2 A128])= )) "pr109040.c":9:11 111 {*zero_extendhisi2} (nil)) and RTL DSE's replace_read since r0-86337-g18b526e806ab6455 handles even different modes like in the above case, and so it optimizes it int= o: (insn 47 35 39 2 (set (reg:HI 175) (subreg:HI (reg:SI 166) 0)) "pr109040.c":9:11 179 {*movhi_inter= nal} (expr_list:REG_DEAD (reg:SI 166) (nil))) (insn 39 47 40 2 (set (reg:SI 171) (zero_extend:SI (reg:HI 175))) "pr109040.c":9:11 111 {*zero_extendhisi2} (expr_list:REG_DEAD (reg:HI 175) (nil))) Pseudo 166 is result of AND with 0x8084c constant (forced into a regist= er). Combine attempts to combine the AND with the insn 47 above created by D= SE, and turns it because of WORD_REGISTER_OPERATIONS and its assumption that all the subword operations are actually done on word mode into: (set (subreg:SI (reg:HI 175) 0) (and:SI (reg:SI 167 [ m ]) (reg:SI 168))) and later on the ZERO_EXTEND is thrown away. We then see (and:SI (subreg:SI (reg:HI 175) 0) (const_int 0x84c)) and optimize that into (subreg:SI (and:HI (reg:HI 175) (const_int 0x84c)) 0) which is still fine, in WORD_REGISTER_OPERATIONS the AND in HImode will set all upper bits up to BITS_PER_WORD to zeros. But later on simplify_binary_operation_1 or simplify_and_const_int_1 sees that because nonzero_bits ((reg:HI 175), HImode) =3D=3D 0x84c, we = can optimize the AND into (reg:HI 175). That isn't correct, because while the low 16 bits of that REG are known to have all bits but 0x84c cleare= d, we don't know that all the upper 16 bits are all clear as well. So, for WORD_REGISTER_OPERATIONS for integral modes smaller than word m= ode, we need to check all bits from word_mode in nonzero_bits for the optimizations. 2023-04-14 Jeff Law Jakub Jelinek PR target/108947 PR target/109040 * combine.cc (simplify_and_const_int_1): Compute nonzero_bits in word_mode rather than mode if WORD_REGISTER_OPERATIONS and mode= is smaller than word_mode. * simplify-rtx.cc (simplify_context::simplify_binary_operation_= 1) : Likewise. * gcc.dg/pr108947.c: New test. * gcc.c-torture/execute/pr109040.c: New test.=