From: "gnu at kosak dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/112835] New: inverting the result of memcmp() produces inefficient code
Date: Sun, 03 Dec 2023 18:09:34 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112835

            Bug ID: 112835
           Summary: inverting the result of memcmp() produces inefficient code
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gnu at kosak dot com
  Target Milestone: ---

Created attachment 56778 -->
https://gcc.gnu.org/bugzilla/attachment.cgi?id=56778&action=edit
the .ii file from --save-temps

Hello,

The assembly output of this program does a suboptimal thing in two different places. It moves a constant (0 or 1) into %eax and then does a `testl` and `sete` to invert that constant. This would be reasonable code for `!x` when x is a runtime `int` value. However, when x is a constant, it would be simpler to emit a single load instruction.

```
#include <cstring>

bool calc(const void *a, const void *b) {
  return !std::memcmp(a, b, 12);
}
```

Relevant assembly on x86_64, invoked with g++ -S -O3 test.cc

```
_Z4calcPKvS0_:
.LFB31:
        .cfi_startproc
        endbr64
        movq    (%rsi), %rax
        cmpq    %rax, (%rdi)
        je      .L5
.L2:
        movl    $1, %eax
        testl   %eax, %eax
        sete    %al
        ret
        .p2align 4,,10
        .p2align 3
.L5:
        movl    8(%rsi), %eax
        cmpl    %eax, 8(%rdi)
        jne     .L2
        xorl    %eax, %eax
        testl   %eax, %eax
        sete    %al
        ret
        .cfi_endproc
```

In the above, the code at label .L2 (movl/testl/sete) would be better off as the single instruction "xorl %eax,%eax", and likewise the code at the latter part of .L5 (xorl/testl/sete) would be better off as the single instruction "movl $1, %eax".

You can see the compiler doing this, though with inverted logic, if you simply delete the ! in the source code.

Apologies if I've posted this to the wrong component.

Output from g++ -v:

```
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-linux-gnu/13/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 13.2.0-4ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-13 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-offload-defaulted --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Ubuntu 13.2.0-4ubuntu3)
```
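
For anyone reproducing this, here is a minimal sketch that puts the two variants mentioned above side by side in one translation unit, so their assembly can be compared directly. `calc` is the function from the report. `calc_differs` is a hypothetical name chosen here for the version with the `!` deleted, which is why its result has the inverted meaning.

```cpp
#include <cstring>

// Function from the report: true when the first 12 bytes are equal.
bool calc(const void *a, const void *b) {
  return !std::memcmp(a, b, 12);
}

// Comparison variant (name is an assumption): the same call with the '!'
// deleted, so it returns true when the first 12 bytes differ.
bool calc_differs(const void *a, const void *b) {
  return std::memcmp(a, b, 12);
}
```

Compiling this with the same command as above (g++ -S -O3 test.cc) should emit both functions into one .s file. The exact instructions may vary with the GCC version and target.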