From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ipa/115097] Strange suboptimal codegen specifically at -O2 when copying struct type
Date: Wed, 15 May 2024 07:05:20 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115097

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |ipa
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener ---
So actually it seems that the reason is ICF plus inlining:

t.ii:2:3: optimized: Semantic equality hit:A test1(A&)/0->A test2(A&&)/1
t.ii:2:3: optimized: Assembler symbol names:_Z5test1R1A/0->_Z5test2O1A/1
t.ii:2:3: optimized: Semantic equality hit:A test1(A&)/0->A test3(const A&)/2
t.ii:2:3: optimized: Assembler symbol names:_Z5test1R1A/0->_Z5test3RK1A/2
t.ii:2:3: optimized: Semantic equality hit:A test1(A&)/0->A test4(const A&&)/3
t.ii:2:3: optimized: Assembler symbol names:_Z5test1R1A/0->_Z5test4OK1A/3
optimized:  Inlined A test1(A&)/4 into A test2(A&&)/1 which now has time 4.000000 and size 5, net change of -1.
optimized:  Inlined A test1(A&)/5 into A test3(const A&)/2 which now has time 4.000000 and size 5, net change of -1.
optimized:  Inlined A test1(A&)/6 into A test4(const A&&)/3 which now has time 4.000000 and size 5, net change of -1.

For some reason we "optimize" the functions to the following in IPA ICF:

struct A test4 (const struct A & a)
{
  struct A retval.6;

  [local count: 1073741824]:
  retval.6 = test1 (a_2(D)); [tail call]
  return retval.6;

}

struct A test3 (const struct A & a)
{
  struct A retval.5;

  [local count: 1073741824]:
  retval.5 = test1 (a_2(D)); [tail call]
  return retval.5;

}

struct A test2 (struct A & a)
{
  struct A retval.4;

  [local count: 1073741824]:
  retval.4 = test1 (a_2(D)); [tail call]
  return retval.4;

}

and then we inline them back, introducing the extra copy.

Why do we use tail-calls here instead of aliases? Why do we lack cost
modeling here? Why do we inline back? It looks like a pointless exercise
to me ...

With -fdisable-ipa-inline we get

_Z5test2O1A:
.LFB5:
        .cfi_startproc
        jmp     _Z5test1R1A

so that's at least reasonable and what's expected, I suppose. So one could
argue that the bug is in the inliner, which introduces the extra copy (IIRC
there's a bug about this), but still.
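
For readers without the original testcase at hand, here is a minimal sketch of a translation unit consistent with the signatures in the dump above. Only the four function signatures are taken from the mangled names (_Z5test1R1A, _Z5test2O1A, _Z5test3RK1A, _Z5test4OK1A); the definition of A as four ints is an assumption made for illustration and may differ from the struct in the report.

```cpp
// Hypothetical reproducer sketch. The layout of A is an assumption;
// the report's actual struct definition is not shown in this message.
struct A {
    int a, b, c, d;
};

// Four functions with identical bodies and different reference kinds.
// Bodies this similar are the kind of thing IPA ICF may consider
// semantically equal, as the "Semantic equality hit" lines suggest.
A test1(A &a)        { return a; }
A test2(A &&a)       { return a; }
A test3(const A &a)  { return a; }
A test4(const A &&a) { return a; }
```

Compiling this with g++ -O2 -S, once as-is and once with -fdisable-ipa-inline added, is one way to compare the code generated for test2 through test4 against the single jmp shown above. Whether a given compiler version actually shows the extra copy with this particular struct layout is not something this sketch guarantees.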