From: "dthorn at google dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/106725] LTO semantics for __attribute__((leaf))
Date: Tue, 01 Nov 2022 01:32:59 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 12.2.0
X-Bugzilla-Keywords: documentation, lto
X-Bugzilla-Severity: normal
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106725

--- Comment #6 from Daniel Thornburgh ---
I spent a little more time on this, and here's a more concrete reproducer of
GCC's current behavior.

The setup again has 3 files: main.c, lto.c, and ext.c. lto.c is a simple
getter-setter interface wrapping a global int. main.c sets the value using
this interface, then makes an __attribute__((leaf)) call to ext.c. That call
sets the value to 0. This should be legal, since the call doesn't call back
into main.c; it calls into lto.c.

$ tail -n+1 *.c
==> ext.c <==
void set_value(int v);
void external_call(void) { set_value(0); }

==> lto.c <==
static int value;
void set_value(int v) { value = v; }
int get_value(void) { return value; }

==> main.c <==
#include <stdio.h>
void set_value(int v);
int get_value(void);
__attribute__((leaf)) void external_call(void);
int main(void) {
  set_value(42);
  external_call();
  printf("%d\n", get_value());
}

If we compile main.c and lto.c together using the pre-WHOPR module-merging
flow, the resulting binary assumes that the external call cannot clobber the
value, and it thus prints 42 rather than zero.

$ gcc -c -O2 ext.c
$ gcc -O2 -flto-partition=none main.o lto.o ext.o
$ ./a.out
42

If you instead use WHOPR, it looks like this optimization doesn't trigger:

$ gcc -O2 -flto main.o lto.o ext.o
$ ./a.out
0

At least in the unpartitioned case, it looks like the optimizer is considering
__attribute__((leaf)) to apply to the whole LTO unit. I'm unsure what WPA's
semantics are, since there may be other reasons why this optimization wasn't
taken there.
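
To make the two observed outputs easier to compare, here is a minimal single-file sketch of the same call sequence. It is an illustration, not GCC's actual transformation or output: the functions run_sequence_reference, run_sequence_forwarded_model, and self_check are hypothetical names introduced for this sketch. Because external_call is defined in the same file, no leaf attribute is involved (GCC's documentation states that the attribute has no effect on functions defined within the current compilation unit), so the first function shows what the C semantics require, while the second hand-models an optimizer that assumes the call cannot modify the value.

```c
#include <assert.h>

/* Single-file model of the reproducer. The run_sequence_* functions and
   self_check are hypothetical and exist only for this illustration. */

static int value;

void set_value(int v) { value = v; }
int get_value(void) { return value; }

/* Defined in this file, so no leaf attribute is in play here. */
void external_call(void) { set_value(0); }

/* What the C abstract machine requires: the store of 0 performed inside
   external_call() is visible to the later get_value(). */
int run_sequence_reference(void) {
  set_value(42);
  external_call();
  return get_value();
}

/* Hand-written model of the behavior seen with -flto-partition=none:
   the call is assumed not to modify `value`, so the later load is
   replaced by the constant stored earlier. */
int run_sequence_forwarded_model(void) {
  int forwarded = 42;    /* value known from the set_value(42) call */
  set_value(forwarded);
  external_call();       /* assumed (incorrectly) not to touch `value` */
  return forwarded;      /* load of `value` replaced by the constant */
}

/* Sanity check of both expectations. */
void self_check(void) {
  assert(run_sequence_reference() == 0);
  assert(run_sequence_forwarded_model() == 42);
}
```

The reference function corresponds to the 0 printed in the WHOPR build, and the forwarded model corresponds to the 42 printed in the unpartitioned build.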