From: "bugdal at aerifal dot cx"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/32667] block copy with exact overlap is expanded as memcpy
Date: Thu, 04 Jan 2024 13:22:39 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 4.2.0
X-Bugzilla-Keywords: documentation, wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: bugdal at aerifal dot cx
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #57 from Rich Felker ---
I think one could reasonably envision an implementation that does some sort of vector loads/stores where, due to some performance constraint or to avoid special-casing a possible page boundary past the end of the copy, it only wants to load N bits at a time, but the efficient store instruction always stores a full vector of 2N bits.
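To make the scenario above concrete, here is a minimal sketch in plain C that simulates such a copy loop. Everything in it is an assumption for illustration: the name `wide_store_copy`, the 4-byte "narrow load" width `HALF`, and the zero filler in the upper half of each store. It is not taken from any real memcpy implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { HALF = 4 };  /* hypothetical "N bits" load width, in bytes */

/* Hypothetical copy routine: each step loads HALF bytes but stores
   2*HALF bytes. The upper half of every store is filler that the
   next step is expected to overwrite with real data. */
static void wide_store_copy(unsigned char *dest, const unsigned char *src,
                            size_t n)
{
    assert(dest != NULL && src != NULL);

    size_t i = 0;
    /* Main loop: runs only while a full 2*HALF store still fits in dest. */
    for (; i + 2 * HALF <= n; i += HALF) {
        unsigned char vec[2 * HALF] = {0};
        memcpy(vec, src + i, HALF);       /* narrow load: HALF bytes   */
        memcpy(dest + i, vec, 2 * HALF);  /* wide store: 2*HALF bytes  */
    }
    /* Tail: plain byte copy so nothing is written past dest + n. */
    for (; i < n; i++)
        dest[i] = src[i];
}
```

With disjoint buffers the filler is always overwritten before the function returns, so the result is a correct copy. With `dest == src`, the first wide store clobbers bytes HALF..2*HALF-1 of the source before they have been loaded, so everything after the first HALF bytes is corrupted.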
Of course, one could also argue quite reasonably that this is a weird enough thing to do that the implementation should then just check for src==dest and early-out.

I'm far less concerned about whether such mechanical breakage exists, and more concerned about the consequences of LTO/whole-program-analysis where something in the translation process can see the violated restrict qualifier, infer UB, and blow everything up.

The change being requested here is really one of removing the restrict qualification from the arguments and making a custom weaker condition. This may in turn have consequences on what types of transformations are possible.
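For illustration only, the "custom weaker condition" could be sketched as a wrapper whose parameters are not restrict-qualified and whose contract is "dest and src are either disjoint or identical". The name `copy_disjoint_or_same` is hypothetical, and the debug check assumes `uintptr_t` is available and that converted pointer values compare like addresses, which is implementation-defined rather than guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical interface: no restrict qualifiers, so a caller passing
   dest == src does not violate any qualifier-based aliasing promise
   at this level. */
static void *copy_disjoint_or_same(void *dest, const void *src, size_t n)
{
    /* Exact overlap: the copy is a no-op, so return early. */
    if (dest == src)
        return dest;

    /* Debug-only check of the remaining precondition (no partial
       overlap). Assumes a flat address space; see the note above. */
    uintptr_t d = (uintptr_t)dest;
    uintptr_t s = (uintptr_t)src;
    assert((d < s ? s - d : d - s) >= n);

    /* The regions are disjoint here, so memcpy's own requirement holds. */
    return memcpy(dest, src, n);
}
```

The early-out keeps the exact-overlap case away from memcpy entirely, so nothing downstream ever sees two identical pointers passed as restrict-qualified arguments. The cost is one extra comparison per call.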