From: "tnfchris at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/101197] __builtin_memmove does not perform constant optimizations
Date: Mon, 16 Aug 2021 10:56:01 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101197

--- Comment #13 from Tamar Christina ---
(In reply to cqwrteur from comment #12)
> (In reply to cqwrteur from comment #11)
> > (In reply to Tamar Christina from comment #10)
> > > (In reply to cqwrteur from comment #9)
> > > > (In reply to Tamar Christina from comment #8)
> > > > > (In reply to Jakub Jelinek from comment #6)
> > > > > > Shouldn't that be a different PR with details? I mean, this PR is that we
> > > > > > should expand shorter memmove inline even if the regions do overlap.
> > > > >
> > > > > Sure, I'm still trying to create a minimal representative example (it's C++
> > > > > and templated) unless just pointing at the github is enough.
> > > > >
> > > > > To be clear though, just inlining memmove at all will cover most of the
> > > > > distance, it's just that you require less registers.
> > > >
> > > > inline things like memcpy and memmove will lead to serious binary bloat. The
> > > > compiler usually picks to emit call to libc's memcpy and memmove that is
> > > > usually highly optimized with assembly code.
> > >
> > > Yes your binary will grow, but on small memcopy and memmove. the calling
> > > overhead, not to mention the register allocation overhead you might get from
> > > having to spill your caller saves more than makes up for it.
> > >
> > > We already inline memcpy and memset. there's no reason not to do memmove,
> > > especially at -O3.
> >
> > That is false. inline memcpy and memset only works when the size is constant.
>
> more for type punning reason.

(In reply to cqwrteur from comment #11)
> That is false. inline memcpy and memset only works when the size is constant.

How do you think you know when the size is small?

> but on small memcopy and memmove.

By logic this means you know the size is constant.
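
For illustration only, here is a minimal sketch in C of what "the size is small and constant" makes possible, assuming an 8-byte move. The names `move8` and `move8_builtin` are hypothetical and do not come from the PR or from GCC; the sketch only shows why a fixed-size move can be written as one load followed by one store without caring whether the regions overlap. It does not claim anything about the code GCC currently emits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the bug report): move exactly 8 bytes.
   The size is a compile-time constant, so a compiler is free to expand
   each memcpy below into a plain load or store. The whole source is read
   into a temporary before anything is written, so the result is correct
   even when dst and src overlap, which is the guarantee memmove gives. */
static inline void move8(void *dst, const void *src)
{
    uint64_t tmp;
    memcpy(&tmp, src, sizeof tmp);  /* read all 8 source bytes first */
    memcpy(dst, &tmp, sizeof tmp);  /* then write them to the destination */
}

/* The same operation written directly as a constant-size memmove,
   which is the form this PR is about. */
static inline void move8_builtin(void *dst, const void *src)
{
    __builtin_memmove(dst, src, 8);
}
```

Both helpers give the same result on overlapping buffers; the open question in this PR is only whether the second form gets expanded inline rather than turned into a libc call.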