From: "david at westcontrol dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/98884] New: Implement empty struct optimisations on ARM
Date: Fri, 29 Jan 2021 11:18:47 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98884

            Bug ID: 98884
           Summary: Implement empty struct optimisations on ARM
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: david at westcontrol dot com
  Target Milestone: ---

Empty "tag" structs (or classes) are useful for strong typing, function
options, and so on. The rules of C++ require these to have a non-zero size
(so that addresses of different instances are valid and distinct), but they
contain no significant data.
Ideally, therefore, the compiler will not generate code that sets values or
copies values when passing around such types. Unfortunately, that is not
quite the case.

Consider these two examples, with foo1 creating a tag type, and foo2 passing
it on:

struct Tag {
    friend Tag make_tag();
private:
    Tag() {}
};

Tag make_tag() {
    return Tag{};
};

void needs_tag(Tag);

void foo1(void) {
    Tag t = make_tag();
    needs_tag(t);
}


struct Tag1 {};
struct Tag2 {};
struct Tag3 {};
struct Tag4 {};
struct Tag5 {};

void needs_tags(int x, Tag1 t1, Tag2 t2, Tag3 t3, Tag4 t4, Tag5 t5);

void foo2(Tag1 t1, Tag2 t2, Tag3 t3, Tag4 t4, Tag5 t5)
{
    needs_tags(12345, t1, t2, t3, t4, t5);
}

(Here is a godbolt link for convenience: )

On x86, since gcc 8, this has been quite efficient (this is all with -O2):

make_tag():
        xor     eax, eax
        ret
foo1():
        jmp     needs_tag(Tag)
foo2(Tag1, Tag2, Tag3, Tag4, Tag5):
        mov     edi, 12345
        jmp     needs_tags(int, Tag1, Tag2, Tag3, Tag4, Tag5)

The contents of the tag instances are basically ignored. The exception is
"make_tag", where the return register is set to 0 unnecessarily.

But on ARM it is a different matter. This is for the Cortex-M4:

make_tag():
        mov     r0, #0
        bx      lr
foo1():
        mov     r0, #0
        b       needs_tag(Tag)
foo2(Tag1, Tag2, Tag3, Tag4, Tag5):
        push    {lr}
        sub     sp, sp, #12
        mov     r2, #0
        mov     r3, r2
        strb    r2, [sp, #4]
        strb    r2, [sp]
        mov     r1, r2
        movw    r0, #12345
        bl      needs_tags(int, Tag1, Tag2, Tag3, Tag4, Tag5)
        add     sp, sp, #12
        ldr     pc, [sp], #4

The needless register and stack allocations, initialisations and copying
mean that this technique has a significant overhead for something that
should really "disappear in the compilation".

The x86 port manages this well. Is it possible to get such optimisations
into the ARM port too?


Oh, and for comparison, clang with the same options (-std=c++17 -Wall
-Wextra -O2 -mcpu=cortex-m4) gives:

make_tag():
        bx      lr
foo1():
        b       needs_tag(Tag)
foo2(Tag1, Tag2, Tag3, Tag4, Tag5):
        movw    r0, #12345
        b       needs_tags(int, Tag1, Tag2, Tag3, Tag4, Tag5)
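
For anyone who wants to check the language rule behind this report, here is a minimal sketch of why a tag object has to exist at all even though it carries no data. The names TagA, TagB, scale, distinct_addresses and self_check are made up for this illustration and are not part of the test case above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tag types, named only for this illustration.
struct TagA {};
struct TagB {};

// A complete object of an empty class type still has non-zero size.
static_assert(sizeof(TagA) >= 1, "empty class objects have non-zero size");
// The type itself has no non-static data members.
static_assert(std::is_empty<TagA>::value, "TagA carries no data");

// Hypothetical overload set: the tag selects the overload at compile time,
// and its (non-existent) contents are never read.
int scale(int x, TagA) { return x * 2; }
int scale(int x, TagB) { return x * 3; }

// Two distinct objects of the same empty type must have distinct addresses.
bool distinct_addresses() {
    TagA a, b;
    return &a != &b;
}

// Small runtime check tying the pieces together.
void self_check() {
    assert(distinct_addresses());
    assert(scale(7, TagA{}) == 14);
    assert(scale(7, TagB{}) == 21);
}
```

The size and address requirements apply to the objects as the language sees them. They do not by themselves require the compiler to materialise a value in a register or stack slot when a tag is passed by value, which is the behaviour the listings above compare.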