public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/114262] New: Over-inlining when optimizing for size?
@ 2024-03-07 2:00 lh_mouse at 126 dot com
2024-03-07 2:07 ` [Bug tree-optimization/114262] " pinskia at gcc dot gnu.org
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: lh_mouse at 126 dot com @ 2024-03-07 2:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262
Bug ID: 114262
Summary: Over-inlining when optimizing for size?
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: lh_mouse at 126 dot com
Target Milestone: ---
(https://gcc.godbolt.org/z/a4ox6oEfT)
```
struct impl;
struct impl* get_impl(int key);
int get_value(struct impl* p);
extern __inline__ __attribute__((__gnu_inline__))
int get_value_by_key(int key)
{
    struct impl* p = get_impl(key);
    if(!p)
        return -1;
    return get_value(p);
}

int real_get_value_by_key(int key)
{
    return get_value_by_key(key);
}
```
There are actually two functions here: one is `gnu_inline` and the other is a
non-inline one. It looks to me that by marking a function `gnu_inline`, I assert
that 'somewhere I shall provide an external definition for you', so when
optimizing for size, GCC may generate a call instead of using the more complex
inline definition.
The `real_get_value_by_key` function is deliberately written as a sibling call,
so ideally this should be
```
real_get_value_by_key:
jmp get_value_by_key
```
and not
```
real_get_value_by_key:
push rsi
call get_impl
test rax, rax
je .L2
mov rdi, rax
pop rcx
jmp get_value
.L2:
or eax, -1
pop rdx
ret
```
It still gets inlined with `-finline-limit=0` and can only be disabled by
`-fno-inline`. I have no idea how it is controlled.
---------------------------
# Trivia
These are two `gnu_inline` functions from the same library. Most of the time
they should both be inlined in user code. However, external definitions are
required when optimization is not turned on, or when their addresses are taken,
so they must still exist. As they are unlikely to be used anyway, optimizing
for size makes much more sense.
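A minimal sketch of the arrangement described above, assuming GCC (or a
compatible compiler) in C mode. The layout of `struct impl`, the `table` array,
the bodies of `get_impl` and `get_value`, and the `call_through_pointer` helper
are all hypothetical stand-ins added so the example is self-contained; they are
not from the library in question. Normally the external definition would live
in its own source file; both are put in one file here only for brevity, which
GCC should accept for a `gnu_inline` function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical implementation type and lookup table. */
struct impl { int value; };
static struct impl table[2] = { { 10 }, { 20 } };

struct impl* get_impl(int key)
{
    return (key >= 0 && key < 2) ? &table[key] : NULL;
}

int get_value(struct impl* p)
{
    assert(p != NULL);
    return p->value;
}

/* Inline definition: used only for inlining, never emitted as a symbol. */
extern __inline__ __attribute__((__gnu_inline__))
int get_value_by_key(int key)
{
    struct impl* p = get_impl(key);
    if(!p)
        return -1;
    return get_value(p);
}

/* External definition: this is what non-inlined calls and
   address-of expressions end up referring to. */
int get_value_by_key(int key)
{
    struct impl* p = get_impl(key);
    if(!p)
        return -1;
    return get_value(p);
}

/* Taking the address needs a real symbol, so the external
   definition has to exist even if every direct call is inlined. */
int call_through_pointer(int key)
{
    int (*fn)(int) = get_value_by_key;
    return fn(key);
}
```

Without the second definition, a build with optimization turned off would be
expected to fail at link time, because nothing would provide the
`get_value_by_key` symbol.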
* [Bug tree-optimization/114262] Over-inlining when optimizing for size?
From: pinskia at gcc dot gnu.org @ 2024-03-07 2:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I thought it was documented that gnu_inline also causes always_inline if
optimization is turned on but I can't seem to find that ...
* [Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function
From: lh_mouse at 126 dot com @ 2024-03-07 2:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262
--- Comment #2 from LIU Hao <lh_mouse at 126 dot com> ---
(In reply to Andrew Pinski from comment #1)
> I thought it was documented that gnu_inline also causes always_inline if
> optimization is turned on but I can't seem to find that ...
Is that the case in GCC source? I think I would have to find a workaround for
it.
* [Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function
From: pinskia at gcc dot gnu.org @ 2024-03-07 2:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |documentation
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The C++ front-end does:

    /* Handle gnu_inline attribute. */
    if (GNU_INLINE_P (decl1))
      {
        DECL_EXTERNAL (decl1) = 1;
        DECL_NOT_REALLY_EXTERN (decl1) = 0;
        DECL_INTERFACE_KNOWN (decl1) = 1;
        DECL_DISREGARD_INLINE_LIMITS (decl1) = 1;
      }

The C front-end does:

    /* For GNU C extern inline functions disregard inline limits. */
    if (DECL_EXTERNAL (fndecl)
        && DECL_DECLARED_INLINE_P (fndecl)
        && (flag_gnu89_inline
            || lookup_attribute ("gnu_inline", DECL_ATTRIBUTES (fndecl))))
      DECL_DISREGARD_INLINE_LIMITS (fndecl) = 1;

This is specifically from r0-82849-gc536a6a77a19a8, but it was done differently
before that (using a language hook).
https://gcc.gnu.org/pipermail/gcc-patches/2007-July/221806.html
https://gcc.gnu.org/pipermail/gcc-patches/2007-August/223406.html
It looks like it has been this way since r0-37737-g4838c5ee553f06 (2001) (or
rather that is when it was used by the tree inliner; I don't want to dig further
back to understand the RTL inliner). So it looks like this is just missing
documentation ...
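To make the C front-end condition quoted above easier to read, here is a small
standalone model of it. This is only a sketch: `struct decl_model`, its fields,
and `disregards_inline_limits` are hypothetical names invented for
illustration, not GCC data structures or functions. Each field merely stands in
for the corresponding GCC predicate named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the properties the quoted
   condition inspects; this is not how GCC represents declarations. */
struct decl_model {
    bool is_external;          /* models DECL_EXTERNAL */
    bool declared_inline;      /* models DECL_DECLARED_INLINE_P */
    bool has_gnu_inline_attr;  /* models lookup_attribute ("gnu_inline", ...) */
};

/* Mirrors the quoted condition: returns true when the model says
   the inline limits would be disregarded for this declaration. */
bool disregards_inline_limits(const struct decl_model* d, bool gnu89_inline_mode)
{
    assert(d != NULL);
    return d->is_external
        && d->declared_inline
        && (gnu89_inline_mode || d->has_gnu_inline_attr);
}
```

In this model, a declaration that is external, declared inline, and carries the
attribute answers true whether or not gnu89 inline mode is on, which would be
consistent with `-finline-limit=0` having no effect in the original report.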
* [Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function
From: lh_mouse at 126 dot com @ 2024-03-07 3:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262
--- Comment #4 from LIU Hao <lh_mouse at 126 dot com> ---
(In reply to Andrew Pinski from comment #3)
> It looks like it has been this way since r0-37737-g4838c5ee553f06 (2001) (or
> rather that is when it was used by the tree inline; I don't want to dig
> further back to understand the RTL inliner). So looks like this is just
> missing documentation ...
It's not just about `gnu_inline`. If we switch to C++ inline we get the same
result:
(https://gcc.godbolt.org/z/ehbjqj5xh)
```
struct impl;
struct impl* get_impl(int key);
int get_value(struct impl* p);
extern inline
int get_value_by_key(int key)
{
    struct impl* p = get_impl(key);
    if(!p)
        return -1;
    return get_value(p);
}

int real_get_value_by_key(int key)
{
    return get_value_by_key(key);
}
```
GCC outputs:
```
real_get_value_by_key(int):
push rsi
call get_impl(int)
test rax, rax
je .L2
mov rdi, rax
pop rcx
jmp get_value(impl*)
.L2:
or eax, -1
pop rdx
ret
```
If we switch to C99 `extern inline`, it produces the desired result:
```
get_value_by_key:
push rsi
call get_impl
test rax, rax
je .L2
mov rdi, rax
pop rcx
jmp get_value
.L2:
or eax, -1
pop rdx
ret
real_get_value_by_key:
jmp get_value_by_key
```
The only difference between the C99 `extern inline` and C++ `extern inline` is
that the C++ external definition is COMDAT.
* [Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function
From: pinskia at gcc dot gnu.org @ 2024-03-07 3:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to LIU Hao from comment #4)
> The only difference between the C99 `extern inline` and C++ `extern inline`
> is that the C++ external definition is COMDAT.
Well, not really. COMDAT changes the heuristics here, though. The reason is that
C++ inline functions are most likely smaller and should really be inlined. This
is all inlining heuristics: figuring out locally, within the TU, whether the
function might be called from another TU or not.
Note GCC has not retuned its -Os heuristics for a long time because they have
been decent enough for most folks, and corner cases like this almost never come
up.
* [Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function
From: hubicka at ucw dot cz @ 2024-03-07 15:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262
--- Comment #6 from Jan Hubicka <hubicka at ucw dot cz> ---
> Note GCC has not retuned its -Os heuristics for a long time because they have
> been decent enough for most folks, and corner cases like this almost never come
> up.
There were quite a few changes to the -Os heuristics :)
One of the bigger challenges is that we see more and more C++ code built
with -Os which relies on certain functions being inlined and optimized
in context, so we had to get more optimistic in the hope that inlined code
will optimize well.
COMDAT functions are more likely to be inlined because statistics show that
many of them are not really shared between translation units
(see the --param=comdat-sharing-probability parameter). This was necessary to
get reasonable code for Firefox approximately 15 years ago.
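To illustrate the reasoning about COMDAT sharing with some arithmetic: if the
out-of-line copy of a COMDAT function is shared with other translation units
only with some probability, then inlining every local caller can be expected to
make that copy removable the rest of the time. The sketch below is a toy model
under that assumption only. The function name, its parameters, and the rounding
are hypothetical, and it should not be read as GCC's actual heuristic.

```c
#include <assert.h>

/* Toy model only: all names are hypothetical and this is not GCC code.
   caller_growth: size added to the callers by inlining every call site.
   body_size: size of the out-of-line copy of the function.
   sharing_probability_percent: assumed chance (0..100) that another
   translation unit shares the out-of-line copy anyway. */
int expected_growth(int caller_growth, int body_size,
                    int sharing_probability_percent)
{
    assert(sharing_probability_percent >= 0
           && sharing_probability_percent <= 100);
    /* Expected size recovered by dropping the out-of-line copy,
       rounded to the nearest unit. */
    int expected_saving =
        (body_size * (100 - sharing_probability_percent) + 50) / 100;
    return caller_growth - expected_saving;
}
```

With an arbitrary sharing probability of 20, `expected_growth(30, 40, 20)`
comes out as -2, so the model predicts a small net size win from inlining; with
a probability of 100 the same call yields 30, a pure size increase.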
* [Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function
From: lh_mouse at 126 dot com @ 2024-03-07 17:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262
--- Comment #7 from LIU Hao <lh_mouse at 126 dot com> ---
(In reply to Jan Hubicka from comment #6)
> > Note GCC has not retuned its -Os heuristics for a long time because they have
> > been decent enough for most folks, and corner cases like this almost never come
> > up.
> There were quite a few changes to the -Os heuristics :)
> One of the bigger challenges is that we see more and more C++ code built
> with -Os which relies on certain functions being inlined and optimized
> in context, so we had to get more optimistic in the hope that inlined code
> will optimize well.
>
> COMDAT functions are more likely to be inlined because statistics show that
> many of them are not really shared between translation units
> (see the --param=comdat-sharing-probability parameter). This was necessary to
> get reasonable code for Firefox approximately 15 years ago.
So is there no way to get the C99 `extern inline` behavior? That is, are sibling
calls to gnu_inline functions always inlined, even when optimizing for size?