From: "dumoulin.thibaut at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/101995] New: regression built-in memset missed-optimization arm -Os
Date: Fri, 20 Aug 2021 09:35:53 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101995

            Bug ID: 101995
           Summary: regression built-in memset missed-optimization arm -Os
           Product: gcc
           Version: 10.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dumoulin.thibaut at gmail dot com
  Target Milestone: ---

For cortex-m4 -Os, GCC10 produces bigger assembly code than GCC7 when memset is called.
Here is a C example that triggers the regression:

```C
#include <string.h>

struct foo_t {
    int a;
    int b;
    int c;
    int d;
};

/* Random function modifying foo with another value than 0 */
void doStuff(struct foo_t *foo) {
    foo->b = foo->a + foo->c;
}

void twoLinesFunction(struct foo_t *foo) {
    /* R0 is saved in GCC10 but not in GCC7 */
    memset(foo, 0x00, sizeof(struct foo_t));
    doStuff(foo);
}

int main(void) {
    struct foo_t foo;
    twoLinesFunction(&foo);
    return 0;
}
```

compile command: `gcc -Os -mcpu=cortex-m4`

GCC7.3.1 produces:

```asm
<twoLinesFunction>:
    push    {r3, lr}
    movs    r2, #16
    movs    r1, #0
    bl      8168 <memset>
    ldmia.w sp!, {r3, lr}
    b.w     8104 <doStuff>
```

While GCC10.3.0 produces:

```asm
<twoLinesFunction>:
    push    {r4, lr}
    movs    r2, #16
    mov     r4, r0        --> back up r0
    movs    r1, #0
    bl      8174 <memset>
    mov     r0, r4        --> restore r0
    ldmia.w sp!, {r4, lr}
    b.w     810c <doStuff>
```

The main function remains the same.

The builtin memset does not change R0 (it returns its first argument, so R0 still holds `foo` after the call), so there is no need to save it and restore it afterwards. GCC7 is more efficient here.

GCC10 should not back up R0 around this builtin in this case; doing so produces bigger and slower code.

PR https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61241 also refers to this behavior and has a patch implementing the optimization, but I'm not sure when the optimization was lost.
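
To make the argument about R0 concrete, here is a minimal sketch that relies only on the C guarantee that memset returns its destination pointer. `twoLinesFunctionChained` and `checkMemsetReturnsDest` are hypothetical names that do not appear in the test case above. The chained form merely expresses in source what the missing optimization would infer; whether GCC10 emits the shorter sequence for it has not been verified here.

```C
#include <assert.h>
#include <string.h>

struct foo_t {
    int a;
    int b;
    int c;
    int d;
};

/* Same helper as in the test case. */
void doStuff(struct foo_t *foo) {
    foo->b = foo->a + foo->c;
}

/* Hypothetical variant: feed memset's return value straight into
 * doStuff, so the source itself says the pointer comes back from
 * the call (in R0 under the ARM calling convention). */
void twoLinesFunctionChained(struct foo_t *foo) {
    doStuff(memset(foo, 0x00, sizeof(struct foo_t)));
}

/* Hypothetical check of the property the argument depends on:
 * memset returns the destination pointer it was given. */
void checkMemsetReturnsDest(struct foo_t *foo) {
    void *ret = memset(foo, 0x00, sizeof(struct foo_t));
    assert(ret == (void *)foo);
}
```

Compiling this with the same `gcc -Os -mcpu=cortex-m4` command and comparing the disassembly of `twoLinesFunctionChained` against `twoLinesFunction` would show whether the extra `mov r4, r0` / `mov r0, r4` pair goes away when the return value is used explicitly.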