From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id DBD9B3858D20; Tue, 30 May 2023 08:35:59 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DBD9B3858D20
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1685435759;
	bh=bOHzGK97Z0rvCw/u4/VtyKZyaDPvRdnMxq6/dmorbwU=;
	h=From:To:Subject:Date:From;
	b=e9JFwWPVoHjwWUx/mN9vol5DEvhaR69YyWWY061VT1UuuL9nLqYtcPu1dKjOBXqEE
	 eN1Txxl3wPbpsSVqI87HbVVGzIiTejGGEA3FHIKgwhtoMS4dVgTsm3Y4g22gLniDJU
	 rNgUC/7fSE27xUaYWvY/4NVVmpN3WgCLziyurL4Q=
From: "ptk.prasertsuk at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements
Date: Tue, 30 May 2023 08:35:53 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.1.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: ptk.prasertsuk at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone attachments.created
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

            Bug ID: 110035
           Summary: Missed optimization for dependent assignment statements
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ptk.prasertsuk at gmail dot com
  Target Milestone: ---

Created attachment 55212 -->
https://gcc.gnu.org/bugzilla/attachment.cgi?id=55212&action=edit
Test case, compiled with -std=c++20 -O2

The test case, when compiled, produces additional move instructions:

        movdqu  (%rdi), %xmm2
        movdqu  16(%rdi), %xmm1
        movdqu  32(%rdi), %xmm0
        movl    $48, %edi
        movaps  %xmm2, 32(%rsp)
        movaps  %xmm1, 16(%rsp)
        movaps  %xmm0, (%rsp)
        call    _Znwm@PLT
        movdqa  32(%rsp), %xmm2
        movdqa  16(%rsp), %xmm1
        movdqa  (%rsp), %xmm0
        movq    %rax, %rdi
        movups  %xmm2, (%rax)
        movups  %xmm1, 16(%rax)
        movups  %xmm0, 32(%rax)

compared to the more optimized result from clang++ 14.0.0 with the same flags:

        callq   _Znwm@PLT
        movups  (%rbx), %xmm0
        movups  16(%rbx), %xmm1
        movups  32(%rbx), %xmm2
        movups  %xmm0, (%rax)
        movups  %xmm1, 16(%rax)
        movups  %xmm2, 32(%rax)
        movq    %rax, %rdi

Clang has MemCpyOptPass, which detects and removes the memory dependency of the second set of move instructions. This allows the Dead Store Elimination pass to remove the first set of move instructions.

g++-12 -v
Using built-in specs.
COLLECT_GCC=g++-12
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 12.1.0-2ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-12 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-12-sZcx2y/gcc-12-12.1.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-sZcx2y/gcc-12-12.1.0/debian/tmp-gcn/usr --enable-offload-defaulted --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.1.0 (Ubuntu 12.1.0-2ubuntu1~22.04)
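
For readers without access to attachment 55212, the kind of source pattern the summary describes ("dependent assignment statements" around a 48-byte allocation) might look roughly like the sketch below. This is a hypothetical illustration, not the attached test case: the names Payload and clone_via_temporary are invented here, and it has not been verified that this exact code yields the assembly quoted in the report.

```cpp
#include <cassert>   // only needed for caller-side checks such as the one in the note below
#include <cstdint>

// Hypothetical 48-byte aggregate; the size matches the `movl $48, %edi`
// argument passed to operator new (_Znwm) in the quoted assembly.
struct Payload {
    std::uint64_t words[6];
};
static_assert(sizeof(Payload) == 48, "sketch assumes a 48-byte object");

// Two dependent assignments: src -> tmp, then tmp -> *dst.
// A compiler that forwards the second copy to read from src directly
// can then drop the store to tmp as dead.
Payload* clone_via_temporary(const Payload& src) {
    Payload tmp = src;           // first copy (into a stack temporary)
    Payload* dst = new Payload;  // call to operator new(48)
    *dst = tmp;                  // second copy, depends on the first
    return dst;
}
```

A caller could check the behaviour with something like `Payload in{{1, 2, 3, 4, 5, 6}}; Payload* out = clone_via_temporary(in); assert(out->words[5] == 6); delete out;`. Whether a given compiler version keeps or removes the temporary for this sketch would need to be confirmed by inspecting its output with -std=c++20 -O2.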