From mboxrd@z Thu Jan  1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 48)
	id AB82D385842B; Mon, 30 May 2022 21:21:30 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AB82D385842B
From: "zero at smallinteger dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/105778] New: Rotate by register --- unnecessary AND instruction
Date: Mon, 30 May 2022 21:21:30 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.1.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: zero at smallinteger dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone attachments.created
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Mon, 30 May 2022 21:21:30 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105778

            Bug ID: 105778
           Summary: Rotate by register --- unnecessary AND instruction
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zero at smallinteger dot com
  Target Milestone: ---

Created attachment 53053
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53053&action=edit
Sample code

With -O2, some x86 shift-by-register instructions are preceded by an
unnecessary AND instruction.
The AND instruction is unnecessary because the shift-by-register
instructions already mask the shift count held in the register.

In the sample code, the #if 0 branch produces the code

        mov     rax, rdi
        mov     ecx, edi
        shr     rax, cl
        ret

but the #if 1 branch produces the code

        mov     rcx, rdi
        mov     rax, rdi
        and     ecx, 63
        shr     rax, cl
        ret

even though the two sequences have the same behavior. Note that the
and ecx, 63 is unnecessary here because shr rax, cl will already operate
on the bottom 6 bits of ecx anyway, as per the Intel manual.

As noted in the code's comments, some explicit masks other than 0x3f may
produce even more inefficient code, e.g.:

        movabs  rcx, 35184372088831
        mov     rax, rdi
        and     rcx, rdi
        shr     rax, cl
        ret

while some other masks like 0xff and 0xffff eliminate the explicit and
altogether.

Found with gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0. Verified with
godbolt for all gcc versions from 9.4.0 through trunk.

For the sake of completeness, I could not get clang to reproduce this
problem. The latest classic ICC compiler available in Godbolt (2021.5.0)
can emit code with MOVABS as above. However, the newer ICX Intel compiler
behaves like clang (this seems reasonable).
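
The attachment is not reproduced in this message, so the following is
only a minimal sketch of what the two branches might look like. It
assumes a single 64-bit argument used as both the value and the shift
count, which is what the listings above suggest. All function names are
hypothetical and are not taken from the attachment.

```c
#include <stdint.h>

/* Hypothetical stand-in for the "#if 0" branch: no explicit mask.
 * In ISO C this is undefined when the count is >= 64, so it is only
 * meaningful for x < 64. */
uint64_t shift_unmasked(uint64_t x)
{
    return x >> x;
}

/* Hypothetical stand-in for the "#if 1" branch: explicit 6-bit mask.
 * The mask makes the shift well defined for every x, and it matches
 * what a 64-bit SHR by CL does with the count in hardware. */
uint64_t shift_masked(uint64_t x)
{
    return x >> (x & 63);
}

/* Same idea with a separate count argument (also an assumption). */
uint64_t shr_by(uint64_t x, unsigned n)
{
    return x >> (n & 63);
}
```

Compiling this with something like gcc -O2 -S -masm=intel should make it
possible to compare the code generated for the masked and unmasked
variants. The exact output depends on the GCC version, so it may not
match the listings above instruction for instruction.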