From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/111844] missed optimization
Date: Tue, 17 Oct 2023 06:54:19 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Changed-Fields: cc

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111844

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener ---
We are not optimizing the code at all on the GIMPLE level but expand from

  [local count: 1073741824]:
  memcpy (&p, buf_5(D), 88);
  _1 = p.x;
  inc.0_2 = (unsigned int) inc_7(D);
  _3 = _1 + inc.0_2;
  p.x = _3;
  memcpy (buf_5(D), &p, 88);
  p ={v} {CLOBBER(eol)};
  return;

where when expanding memcpy inline during RTL expanding we seem to be able
to clean up after that.

It seems to me this is a task for SRA (again...) which should be more
forgiving to select stmts requiring address-taking of locals but only when
they are not rewritten plus analyzing memcpy, memset (and other select
builtins) as to their effect.

SRA handles the following by means of totally scalarizing 'p':

void foo(P* buf, int inc) {
  P p;
  p = *buf;
  p.x += inc;
  *buf = p;
}

and you get

_Z3fooP1Pi:
.LFB16:
        .cfi_startproc
        addl    %esi, (%rdi)
        ret

with or without the call to bar ().

You could argue more aggressive "inline expanding" memcpy (to
char[] = char[] in this case) would be asked for but I think this might
confuse SRA and I'm not sure we apply the same costing as to whether to
inline-expand the "memcpy" at RTL expansion time.
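
For readers who want to compare the two forms side by side, here is a minimal sketch, not the test case from the bug report. The memcpy variant is reconstructed from the GIMPLE dump above and may differ from the reporter's code. The layout of `P` is an assumption: only the field `x` (unsigned, apparently at offset 0) and the total size of 88 bytes come from the dump, while the `rest` member, the names `foo_memcpy`, `foo_assign` and `variants_agree`, and the `void *` parameter are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Assumed layout: `x` and the 88-byte size are taken from the GIMPLE dump;
// the `rest` padding array is invented purely to reach that size.
struct P {
    std::uint32_t x;
    std::uint32_t rest[21];
};
static_assert(sizeof(P) == 88, "sketch assumes the 88-byte size seen in the dump");

// memcpy form: corresponds to the GIMPLE that currently survives
// until RTL expansion.
void foo_memcpy(void *buf, int inc) {
    P p;
    std::memcpy(&p, buf, sizeof(P));
    p.x += inc;
    std::memcpy(buf, &p, sizeof(P));
}

// Aggregate-copy form: the variant from the comment that SRA
// totally scalarizes.
void foo_assign(P *buf, int inc) {
    P p;
    p = *buf;
    p.x += inc;
    *buf = p;
}

// Hypothetical helper: checks that both forms leave the buffer in the
// same state and that only `x` was incremented (with unsigned wraparound).
bool variants_agree(std::uint32_t start, int inc) {
    P a{};
    P b{};
    a.x = start;
    b.x = start;
    foo_memcpy(&a, inc);
    foo_assign(&b, inc);
    return std::memcmp(&a, &b, sizeof(P)) == 0 &&
           a.x == start + static_cast<std::uint32_t>(inc);
}
```

Both functions have the same observable effect on the buffer. Compiling them with optimization enabled and a GIMPLE dump option such as -fdump-tree-optimized should make it possible to check whether the memcpy form still carries the two calls at that level, as described above.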