From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/105778] Shift by register --- unnecessary AND instruction
Date: Thu, 02 Jun 2022 08:41:58 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105778

--- Comment #7 from CVS Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:dcfdd2851b297e0005a8490b7f867ca45d1ad340

commit r13-927-gdcfdd2851b297e0005a8490b7f867ca45d1ad340
Author: Jakub Jelinek
Date:   Thu Jun 2 10:40:12 2022 +0200

    i386: Optimize away shift count masking of shifts/rotates some more [PR105778]

    As the following testcase shows, our x86 backend support for optimizing
    out useless masking of shift/rotate counts when using instructions that
    naturally modulo the count themselves is insufficient.
    The *_mask define_insn_and_split patterns use
      (subreg:QI (and:SI (match_operand:SI) (match_operand "const_int_operand")))
    for the masking, but that can catch only the case where the masking is
    done in SImode, so typically in SImode in the source.
    We then have another set of patterns, *_mask_1, which use
      (and:QI (match_operand:QI) (match_operand "const_int_operand"))
    instead.  If the masking is done in DImode or, in theory, in HImode, we
    don't match it.
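    For instance, in the following sketch (an illustrative example, not
    necessarily the committed testcase) the masking happens in DImode, so
    neither set of patterns catches it, even though the hardware shift
    already reduces the count modulo 64:

        unsigned long long
        shl (unsigned long long x, unsigned long long n)
        {
          /* shlq only uses the low 6 bits of the count register, so the
             "& 63" is redundant; before this fix the DImode AND was not
             matched and survived into the generated assembly.  */
          return x << (n & 63);
        }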
    The following patch does 4 different things to improve this:
    1) drops the mode from AND and MATCH_OPERAND inside of the subreg:QI
       and replaces that by checking that the register shift count has
       SWI48 mode - I think doing it this way is cheaper than adding
       another mode iterator to patterns which already use another mode
       iterator and sometimes a code iterator as well
    2) the doubleword shift patterns were only handling the case where the
       shift count is masked with a constant that has the most significant
       bit clear, i.e. where we know the shift count is less than half the
       number of bits in the double-word.  If the mask is equal to half
       the number of bits in the double-word minus 1, the masking was
       optimized away; otherwise the AND was kept.  But if the most
       significant bit isn't clear, we use a word-sized shift and the SHRD
       instruction, where the former does the modulo itself and the latter
       modulo with 64 / 32 depending on what mode the CPU is in (so 64 for
       a 128-bit doubleword and 32 for a 64-bit doubleword).  So we can
       also optimize away the masking when the mask has all the relevant
       bits set; masking with the most significant bit will remain for the
       cmove test (see the doubleword sketch after the ChangeLog below)
    3) as requested, this patch adds a bunch of force_reg calls before
       gen_lowpart
    4) 1-3 above unfortunately regressed
       +FAIL: gcc.target/i386/bt-mask-2.c scan-assembler-not and[lq][ \\t]
       +FAIL: gcc.target/i386/pr57819.c scan-assembler-not and[lq][ \\t]
       where during combine we match the new pattern we didn't match
       before, and in the end don't match the pattern the tests were
       looking for.  These 2 tests are fixed by the addition of the
       *jcc_bt<mode>_mask_1 pattern and a small tweak to the target's
       rtx_costs, because even with the pattern around we'd refuse to
       match it, as it appeared to have a higher instruction cost

    2022-06-02  Jakub Jelinek

            PR target/105778
            * config/i386/i386.md (*ashl<dwi>3_doubleword_mask): Remove :SI
            from AND and its operands and just verify operands[2] has
            HImode, SImode or for TARGET_64BIT DImode.  Allow operands[3]
            to be a mask with all low 6 (64-bit) or 5 (32-bit) bits set
            and in that case just throw away the masking.  Use force_reg
            before calling gen_lowpart.
            (*ashl<dwi>3_doubleword_mask_1): Allow operands[3] to be a
            mask with all low 6 (64-bit) or 5 (32-bit) bits set and in
            that case just throw away the masking.
            (*ashl<mode>3_doubleword): Rename to ...
            (ashl<mode>3_doubleword): ... this.
            (*ashl<mode>3_mask): Remove :SI from AND and its operands and
            just verify operands[2] has HImode, SImode or for
            TARGET_64BIT DImode.  Use force_reg before calling
            gen_lowpart.
            (*<shift_insn><mode>3_mask): Likewise.
            (*<shift_insn><dwi>3_doubleword_mask): Likewise.  Allow
            operands[3] to be a mask with all low 6 (64-bit) or
            5 (32-bit) bits set and in that case just throw away the
            masking.  Use force_reg before calling gen_lowpart.
            (*<shift_insn><dwi>3_doubleword_mask_1): Allow operands[3] to
            be a mask with all low 6 (64-bit) or 5 (32-bit) bits set and
            in that case just throw away the masking.
            (*<shift_insn><mode>3_doubleword): Rename to ...
            (<shift_insn><mode>3_doubleword): ... this.
            (*<rotate_insn><mode>3_mask): Remove :SI from AND and its
            operands and just verify operands[2] has HImode, SImode or
            for TARGET_64BIT DImode.  Use force_reg before calling
            gen_lowpart.
            (splitter after it): Remove :SI from AND and its operands and
            just verify operands[2] has HImode, SImode or for
            TARGET_64BIT DImode.
            (*<btsc><mode>_mask, *btr<mode>_mask): Remove :SI from AND
            and its operands and just verify operands[1] has HImode,
            SImode or for TARGET_64BIT DImode.  Use force_reg before
            calling gen_lowpart.
            (*jcc_bt<mode>_mask_1): New define_insn_and_split pattern.
            * config/i386/i386.cc (ix86_rtx_costs): For ZERO_EXTRACT with
            ZERO_EXTEND QI->SI in last operand ignore the cost of the
            ZERO_EXTEND.

            * gcc.target/i386/pr105778.c: New test.
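    To illustrate item 2) above, a minimal doubleword sketch (assuming a
    64-bit target where unsigned __int128 is available; this is not the
    committed testcase):

        unsigned __int128
        shl128 (unsigned __int128 x, unsigned int n)
        {
          /* The mask 127 has all bits relevant to a 128-bit shift set.
             Both the word-sized shift and SHRD reduce the count mod 64
             in 64-bit mode, so the "& 127" can be optimized away; only
             the most-significant-bit test feeding the internal cmove
             remains.  */
          return x << (n & 127);
        }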