From: "mareksz1958 at wp dot pl"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug other/105429] New: Unnecessary moves generated by the compiler.
Date: Thu, 28 Apr 2022 20:02:46 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105429

            Bug ID: 105429
           Summary: Unnecessary moves generated by the compiler.
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mareksz1958 at wp dot pl
  Target Milestone: ---

The following C code:

>>>
#include <stddef.h>     /* size_t */
#include <stdint.h>     /* uint32_t, uint8_t */
#include <immintrin.h>  /* _mm_crc32_u64 (needs SSE4.2) */

uint32_t crc(uint32_t current, const uint8_t *buffer, size_t size) {
    for(size_t i = 0; i < size; i++)
        current = _mm_crc32_u64(current, buffer[i]);
    return current;
}
<<<

generates inefficient assembly at every optimisation level because of an
extra `mov eax, eax'. The -Os and -O3 output is below:

>>>
# -Os
crc:
        movl    %edi, %eax
        xorl    %ecx, %ecx
.L2:
        cmpq    %rdx, %rcx
        je      .L5
        movzbl  (%rsi,%rcx), %edi
        movl    %eax, %eax
        incq    %rcx
        crc32q  %rdi, %rax
        jmp     .L2
.L5:
        ret

# -O3
crc:
        movl    %edi, %eax
        testq   %rdx, %rdx
        je      .L6
        leaq    (%rsi,%rdx), %rcx
.L3:
        movzbl  (%rsi), %edx
        movl    %eax, %eax
        addq    $1, %rsi
        crc32q  %rdx, %rax
        cmpq    %rsi, %rcx
        jne     .L3
.L6:
        ret
<<<

The problem seems to be present in all GCC versions I have access to. The
redundant move greatly worsens the performance of the generated code. When
`_mm_crc32_u64' is replaced by any other function, the problem seems to
disappear.
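A possible workaround sketch, not taken from the report: it assumes the extra move comes from re-zero-extending the 32-bit `current` to the 64-bit first operand of `_mm_crc32_u64' on every iteration, so it keeps the accumulator 64 bits wide for the whole loop and narrows it once at the end. The names `crc_wide', `crc_wide_hw', `crc_ref' and `crc_step_ref' are hypothetical. `crc_ref' is a bitwise software model of the instruction (reflected CRC-32C polynomial, no inversion) and is there only to check that the result is unchanged; whether GCC actually drops the move for this variant would need to be confirmed by inspecting the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define CRC_HAVE_HW 1
#else
#define CRC_HAVE_HW 0
#endif

/* Software model of one crc32q step: fold the 8 little-endian bytes of v
   into crc using the reflected CRC-32C polynomial, without inversion. */
static uint32_t crc_step_ref(uint32_t crc, uint64_t v)
{
    for (int byte = 0; byte < 8; byte++) {
        crc ^= (uint8_t)(v >> (8 * byte));
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Portable reference with the same semantics as crc() in the report:
   each input byte is fed as a zero-extended 64-bit operand. */
static uint32_t crc_ref(uint32_t current, const uint8_t *buffer, size_t size)
{
    for (size_t i = 0; i < size; i++)
        current = crc_step_ref(current, buffer[i]);
    return current;
}

#if CRC_HAVE_HW
/* Hypothetical variant: the accumulator stays 64 bits wide across
   iterations and is narrowed only once, after the loop. */
__attribute__((target("sse4.2")))
static uint32_t crc_wide_hw(uint32_t current, const uint8_t *buffer, size_t size)
{
    unsigned long long acc = current;   /* zero-extended once, up front */
    for (size_t i = 0; i < size; i++)
        acc = _mm_crc32_u64(acc, buffer[i]);
    return (uint32_t)acc;
}
#endif

/* Uses the hardware path when the CPU reports SSE4.2, otherwise falls
   back to the software reference. */
static uint32_t crc_wide(uint32_t current, const uint8_t *buffer, size_t size)
{
#if CRC_HAVE_HW
    if (__builtin_cpu_supports("sse4.2"))
        return crc_wide_hw(current, buffer, size);
#endif
    return crc_ref(current, buffer, size);
}
```

`crc_wide' takes the same arguments as `crc' above, so it can be dropped in for comparison; compiling with -O3 -S and looking at the loop body of `crc_wide_hw' should show whether the move is still emitted.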