From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 1BFFF3858408; Fri, 19 Jan 2024 16:43:39 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1BFFF3858408
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1705682619;
	bh=SZj1ly5kurohwlDA+p8GJG4D2HlUsOt2DFOJLlBob5E=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=X/WupRlRDs9AdsU3udkBeNVl4qBhHAuFfKMWQ679W23mEvoAEazAGDEZZaRxlwhzT
	 ZTR950Dm0YLqT8+TcgXIvtAj4hUUVzEYusIefIm1v7MP2uEODSMnpJLgF5xAdlK0aw
	 QUAEVK8o3v1c0qGaGZBWfBE+VUGi7hUyAtgeKTBI=
From: "hubicka at ucw dot cz" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ipa/113478] -Os does not inline single instruction function
Date: Fri, 19 Jan 2024 16:43:38 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: ipa
X-Bugzilla-Version: 13.2.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: hubicka at ucw dot cz
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-113478-4-fWxEIuZ8Lj@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-113478-4@http.gcc.gnu.org/bugzilla/>
References: <bug-113478-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113478
--- Comment #4 from Jan Hubicka <hubicka at ucw dot cz> ---
> Possibly, at least when we know it doesn't expand to a libatomic call?  OTOH
> even then a function just wrapping such call should probably be inlined,
> so the question is whether the problem that
> is estimated as too big compared to the call calling the function
> (OTOH a1.test () has no arguments while __atomic_load_1 has two).

If we really want to optimize for size, calling a function with one
parameter is shorter than calling a function with two parameters.  The
code size model takes into account when the offline copy of the function
will disappear, and it also has some bias towards understanding that a
lot of comdat functions are not really shared across units.

The testcase calls the function 15 times, and I guess a wrapper function
on most architectures is shorter than 15 load-zero instructions...
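
To make the size argument above concrete, here is a rough
back-of-the-envelope sketch in C++.  It is not GCC's actual inliner cost
model: the struct, the function names and all unit costs are made up for
illustration.  It only shows why 15 call sites that each set up one extra
argument can outweigh a single offline wrapper.

```cpp
// Hypothetical size model; names and unit costs are illustrative
// assumptions, not values taken from GCC's inliner.
struct SizeModel {
    int call_insn;     // size of one call instruction
    int per_argument;  // size of setting up one argument at a call site
    int wrapper_body;  // size of the offline wrapper (load zero + tail call)
};

// Wrapper kept out of line: every call site passes wrapper_args arguments,
// and one shared offline copy of the wrapper body remains.
constexpr int size_not_inlined(SizeModel m, int sites, int wrapper_args) {
    return sites * (m.call_insn + wrapper_args * m.per_argument)
           + m.wrapper_body;
}

// Wrapper inlined everywhere and its offline copy removed: every call site
// now passes callee_args arguments directly to the wrapped function.
constexpr int size_inlined(SizeModel m, int sites, int callee_args) {
    return sites * (m.call_insn + callee_args * m.per_argument);
}

// Assumed costs: call = 5 units, one argument = 2 units, wrapper = 7 units.
constexpr SizeModel example{5, 2, 7};

// 15 call sites, wrapper takes 1 argument, wrapped call takes 2.
static_assert(size_not_inlined(example, 15, 1) == 112, "15*(5+2)+7");
static_assert(size_inlined(example, 15, 2) == 135, "15*(5+4)");
```

With these made-up numbers the out-of-line wrapper wins at 15 call sites
(112 vs 135 units), while at a single call site inlining would win (14 vs
9 units) because the offline copy disappears.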

We now have -Os and -Oz and two-level optimize_size predicates.  We may
make this less restrictive at the lower size optimization level.  But when
optimizing for size, and if __atomic_load were an ordinary function call, I
think the decision is correct.