public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming
@ 2023-06-21  7:05 rguenth at gcc dot gnu.org
  2023-06-21  7:06 ` [Bug ipa/110334] " rguenth at gcc dot gnu.org
                   ` (27 more replies)
  0 siblings, 28 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-21  7:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

            Bug ID: 110334
           Summary: [13/14 Regresssion] unused functions not eliminated
                    before LTO streaming
           Product: gcc
           Version: 13.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

https://bugzilla.suse.com/show_bug.cgi?id=1212101

reports crashes of Firefox because with LTO we end up with comdats built
with different ISA flags and the linker choosing the "wrong" one.  While
that's general a user error the TUs in question have no uses of the
offending symbols and they are indeed eliminated when optimizing.  But
this elimination happens only after LTO streaming which is unfortunate.

We also seem to lack any diagnostic at WPA time that we have COMDAT
(group members) from several TUs that are not built with the same options.
Apart from diagnosing this (with -Wodr?) a possible DWIM solution could
be to clone the COMDAT (groups), overriding the linker decision and
preserving the edges to the original refered copy.  That's also superior
to using symbol visibility in the TUs as it would allow to share the
comdats across TUs if they are built with the same set of flags.

But first and foremost I wonder why we do not eliminate these symbols
before LTO streaming.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
@ 2023-06-21  7:06 ` rguenth at gcc dot gnu.org
  2023-06-21  7:08 ` rguenth at gcc dot gnu.org
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-21  7:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
   Target Milestone|---                         |13.2
           Keywords|                            |lto, missed-optimization

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
  2023-06-21  7:06 ` [Bug ipa/110334] " rguenth at gcc dot gnu.org
@ 2023-06-21  7:08 ` rguenth at gcc dot gnu.org
  2023-06-21  7:12 ` rguenth at gcc dot gnu.org
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-21  7:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 55378
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55378&action=edit
preprocessed testcase

This is one of the TUs in question, preprocessed with GCC 13.  It was built
with -O3 -mavx

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
  2023-06-21  7:06 ` [Bug ipa/110334] " rguenth at gcc dot gnu.org
  2023-06-21  7:08 ` rguenth at gcc dot gnu.org
@ 2023-06-21  7:12 ` rguenth at gcc dot gnu.org
  2023-06-22 11:15 ` rguenth at gcc dot gnu.org
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-21  7:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The desired symbol table before LTO streaming has no symbols in the
skvx:: namespace

Just look at the final symbol table when not compiling with -flto, it takes
until the 'Optimized Symbol table' to do that, interestingly we have

Reclaiming functions:
Reclaiming variables:
Clearing address taken flags:
Optimized Symbol table:

so it doesn't claim to do anything.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-06-21  7:12 ` rguenth at gcc dot gnu.org
@ 2023-06-22 11:15 ` rguenth at gcc dot gnu.org
  2023-06-22 11:45 ` rguenth at gcc dot gnu.org
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-22 11:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
So it's actually that we require the instantiations but we are not able to
fully optimize avx::[rect_]memset* before LTO streaming.  In fact we're not too
far from that:

void skvx::Vec<2, unsigned int>::Vec (struct Vec * const this, unsigned int
D.78460)
{ 
  unsigned int _3(D);

  <bb 2> [local count: 1073741824]:
  MEM[(struct VecStorage *)this_2(D)] ={v} {CLOBBER};
  MEM[(struct Vec *)this_2(D)] ={v} {CLOBBER};
  MEM[(struct Vec *)this_2(D)].val = _3(D);
  MEM[(struct Vec *)this_2(D) + 4B] ={v} {CLOBBER};
  MEM[(struct Vec *)this_2(D) + 4B].val = _3(D);
  return;
}

void skvx::Vec<4, unsigned int>::Vec (struct Vec * const this, unsigned int
D.78672)
{ 
  unsigned int _3(D);
  struct Vec * _4;
  struct Vec * _5;

  <bb 2> [local count: 1073741824]:
  MEM[(struct VecStorage *)this_2(D)] ={v} {CLOBBER};
  _4 = &MEM[(struct VecStorage *)this_2(D)].lo;
  skvx::Vec<2, unsigned int>::Vec (_4, _3(D));
  _5 = &MEM[(struct VecStorage *)this_2(D)].hi;
  skvx::Vec<2, unsigned int>::Vec (_5, _3(D));
  return;

}

void avx::memsetT<unsigned int> (unsigned int * buffer, unsigned int value, int
count)
{
  struct Vec wideValue;
  long unsigned int _17;

  <bb 2> [local count: 118111600]:
  MEM[(struct VecStorage *)&wideValue] ={v} {CLOBBER};
  skvx::Vec<4, unsigned int>::Vec (&MEM[(struct VecStorage *)&wideValue].lo,
value_8(D));
  skvx::Vec<4, unsigned int>::Vec (&MEM[(struct VecStorage *)&wideValue].hi,
value_8(D));


we go inlining skvx::Vec<8, unsigned int>::Vec here which eventually
expands

__attribute__((always_inline))
void skvx::VecStorage<8, unsigned int>::VecStorage (struct VecStorage * const
this, unsigned int s)
{

but we do not inline into that function and as we do not iterate early
inlining we stop here.

Removing all always_inline from the TU fixes the issue.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-06-22 11:15 ` rguenth at gcc dot gnu.org
@ 2023-06-22 11:45 ` rguenth at gcc dot gnu.org
  2023-06-22 12:02 ` rguenth at gcc dot gnu.org
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-22 11:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we run into

      /* Never inline regular functions into always-inline functions
         during incremental inlining.  This sucks as functions calling
         always inline functions will get less optimized, but at the
         same time inlining of functions calling always inline
         function into an always inline function might introduce
         cycles of edges to be always inlined in the callgraph.

         We might want to be smarter and just avoid this type of inlining.  */
      || (DECL_DISREGARD_INLINE_LIMITS (node->decl)
          && lookup_attribute ("always_inline",
                               DECL_ATTRIBUTES (node->decl))))

which I think is sensible but the issue is that for some reason the
skvx::Vec<4, unsigned int>::Vec (_1, s_6(D)) calls are not always_inline.

(gdb) p node->callees        
$3 = <cgraph_edge* 0x7ffff301c068 (<cgraph_node * 0x7ffff2f43000 "__ct_base
"/8008> -> <cgraph_node * 0x7ffff3338dd0 "__ct_comp "/8006>)>
(gdb) p node->callees->next_callee 
$4 = <cgraph_edge* 0x7ffff301c000 (<cgraph_node * 0x7ffff2f43000 "__ct_base
"/8008> -> <cgraph_node * 0x7ffff3338dd0 "__ct_comp "/8006>)>

__attribute__((always_inline))
void skvx::VecStorage<8, unsigned int>::VecStorage (struct VecStorage * const
this, unsigned int s)
{
  struct Vec * _1;
  struct Vec * _2;

  <bb 2> :
  *this_4(D) ={v} {CLOBBER};
  _1 = &this_4(D)->lo;
  skvx::Vec<4, unsigned int>::Vec (_1, s_6(D));
  _2 = &this_4(D)->hi;
  skvx::Vec<4, unsigned int>::Vec (_2, s_6(D));
  return;

the location info points to

    __attribute__((always_inline)) Vec(typename ConvertNative<N, T>::type
native) : Vec(bit_pun<Vec>(native)) {}

I'm not sure what fails to work here.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-06-22 11:45 ` rguenth at gcc dot gnu.org
@ 2023-06-22 12:02 ` rguenth at gcc dot gnu.org
  2023-06-23 10:47 ` hubicka at gcc dot gnu.org
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-22 12:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the CTOR we call is built (well, cloned from that) via

implicitly_declare_fn (kind=sfk_inheriting_constructor, type=<record_type
0x7ffff3423e70 Vec>, const_p=false, pattern_fn=<function_decl 0x7ffff3427600
__ct >, inherited_parms=<tree_list 0x7ffff34298c0>) at
/space/rguenther/src/gcc11queue/gcc/cp/method.cc:3245

and 'pattern_fn' has the always_inline attribute here but we don't seem
to copy that anywhere?

-fno-new-inheriting-ctors also "fixes" the optimization issue for me.

So do we fail to copy DECL_ATTRIBUTES for these kind of implicitely
declared functions (which are in fact explicitely declared?!)?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2023-06-22 12:02 ` rguenth at gcc dot gnu.org
@ 2023-06-23 10:47 ` hubicka at gcc dot gnu.org
  2023-06-23 11:21 ` rguenth at gcc dot gnu.org
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: hubicka at gcc dot gnu.org @ 2023-06-23 10:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #6 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Comdats are really in conflict with the fact that we have command line options.
I blame C++ standard for that and I don't think there is fully satisfactory
solution to this problem.

I was playing with the idea of warning when at lto time when comdats have
different command line options, but this triggers way too often in practice. We
would need to determine "dangerous" one i.e. when -fno-avx2 is replaced by
-favx2 code. 
There are many ways one can stubly change semantics of the IL which makes
merging possibly dangerous which is done often in larger projects, like
firefox.

With syntactic aliases it is possible to keep multiple copies of comdat
function through merging process so inlining will chose corresponding one, but
it does make other things harder. One important anoyance is that it makes it a
lot harder to estimate overall size of the translation unit and how inlining
affects it. We currently assume that every function will need offline unless
all calls to it disappears. We will need to understand that this is true across
all syntacit aliases.

Also that conditional that disables early inliner for all always_inlines is
probably bit harmful these days as libstdc++ indeed has quite interesting set
of always_inlines that call normal inlines. I noticed that just recently while
looking into the push_back implementation how complex maze it got.

I will fix early inliner to allow safe inlining to always_inlines.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2023-06-23 10:47 ` hubicka at gcc dot gnu.org
@ 2023-06-23 11:21 ` rguenth at gcc dot gnu.org
  2023-06-23 12:59 ` hubicka at ucw dot cz
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-23 11:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #6)
> Comdats are really in conflict with the fact that we have command line
> options. I blame C++ standard for that and I don't think there is fully
> satisfactory solution to this problem.
> 
> I was playing with the idea of warning when at lto time when comdats have
> different command line options, but this triggers way too often in practice.

Really? :/

> We would need to determine "dangerous" one i.e. when -fno-avx2 is replaced
> by -favx2 code. 
> There are many ways one can stubly change semantics of the IL which makes
> merging possibly dangerous which is done often in larger projects, like
> firefox.

I think it would be desirable to diagnose these, maybe with an option to
selectively disable this specific diagnostic.  Because while it is not
always a correctness issue it can be a performance issue as well.

> With syntactic aliases it is possible to keep multiple copies of comdat
> function through merging process so inlining will chose corresponding one,
> but it does make other things harder. One important anoyance is that it
> makes it a lot harder to estimate overall size of the translation unit and
> how inlining affects it. We currently assume that every function will need
> offline unless all calls to it disappears. We will need to understand that
> this is true across all syntacit aliases.
> 
> Also that conditional that disables early inliner for all always_inlines is
> probably bit harmful these days as libstdc++ indeed has quite interesting
> set of always_inlines that call normal inlines. I noticed that just recently
> while looking into the push_back implementation how complex maze it got.
> 
> I will fix early inliner to allow safe inlining to always_inlines.

Beware of new always-inline calls then appearing after greedy inlining
(though that's exactly the case that we try to avoid here).  I suppose
you could disable inlining of a function which contains always-inline
calls or simply functions that did not yet have the early inliner run
on them (so keep the current behavior in cycles).  Beware of indirect
always-inline calls then.

Btw, for Skia the issue is really that some auto-generated CTOR isn't
marked always-inline but everything else is.  Maybe they should use
flatten instead of always-inline ...

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2023-06-23 11:21 ` rguenth at gcc dot gnu.org
@ 2023-06-23 12:59 ` hubicka at ucw dot cz
  2023-06-23 13:07   ` Jan Hubicka
  2023-06-23 13:07 ` hubicka at ucw dot cz
                   ` (18 subsequent siblings)
  27 siblings, 1 reply; 31+ messages in thread
From: hubicka at ucw dot cz @ 2023-06-23 12:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #8 from Jan Hubicka <hubicka at ucw dot cz> ---
> > I was playing with the idea of warning when at lto time when comdats have
> > different command line options, but this triggers way too often in practice.
> 
> Really? :/
Yep, for example firefox consist of many extra libraries (skia, video
codecs, image format decoders...) each developed independently LTOed
into one libxul.  Each of them has its own configure that tampers with
command line options.
These brings in libstdc++ comdats with different command line
sets (over 100 of different command lines for std::vector).

Firefox is bit extreme, but it is common for other bigger projects, too.
> 
> I think it would be desirable to diagnose these, maybe with an option to
> selectively disable this specific diagnostic.  Because while it is not
> always a correctness issue it can be a performance issue as well.

I can dig out the patch then.  But it was simply printing something like
warning: merging comdat function with function of same name but
different flags
note: first difference is -f.....
which was produced similar way we do diffs for optimization and target
attributes.

Now those -f.... were usually quite misleading as they tended to be
internal flags implied by something else (such as -Ofast instead of
-O2).  Often the picked -f flag was obviously harmless and false
positive, however other -f flag might have been important.

So I concluded that it is not very easy information to interpret from user
perspective.

But indeed I agree htat this is not very obvious source of interesting
correctness & performance issues.
> 
> Beware of new always-inline calls then appearing after greedy inlining
> (though that's exactly the case that we try to avoid here).  I suppose
> you could disable inlining of a function which contains always-inline
> calls or simply functions that did not yet have the early inliner run
> on them (so keep the current behavior in cycles).  Beware of indirect
> always-inline calls then.
> 
> Btw, for Skia the issue is really that some auto-generated CTOR isn't
> marked always-inline but everything else is.  Maybe they should use
> flatten instead of always-inline ...

We disable inlining to always_inline during early inline, but not greedy
inline.  Both of them can turn indirect calls to direct calls.  So I was
thinking that as first cut we can inline only callees with no indirect
calls and no inlinable direct calls into always_inlines with no indirect
calls.

Are you worried about possibility of early opt inventing new call into
builtin function that is defined as always_inline?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-23 12:59 ` hubicka at ucw dot cz
@ 2023-06-23 13:07   ` Jan Hubicka
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Hubicka @ 2023-06-23 13:07 UTC (permalink / raw)
  To: hubicka at ucw dot cz; +Cc: gcc-bugs

Just so it is somewhere, here is a testcase that we can't inline leaf
functions to always_inlines unless we do some tracking of what calls
were formerly indirect calls.

We really overloaded always_inline from the original semantics "drop
inlining heuristics" into "be sure that result is inlined" while for
the second it does not make sense to take its address.
Clang apparently simply does not error on failes always inlines which
makes its life easier.

int n;
typedef void (*fnptr)();
fnptr get_me();
__attribute__ ((always_inline))
inline void test(void)
{
        if (n < 10)
          (get_me())();
        n++;
        return;
}
fnptr get_me()
{
        return test;
}
void
foo()
{
        test();
}


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2023-06-23 12:59 ` hubicka at ucw dot cz
@ 2023-06-23 13:07 ` hubicka at ucw dot cz
  2023-06-26  6:39 ` rguenther at suse dot de
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: hubicka at ucw dot cz @ 2023-06-23 13:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #9 from Jan Hubicka <hubicka at ucw dot cz> ---
Just so it is somewhere, here is a testcase that we can't inline leaf
functions to always_inlines unless we do some tracking of what calls
were formerly indirect calls.

We really overloaded always_inline from the original semantics "drop
inlining heuristics" into "be sure that result is inlined" while for
the second it does not make sense to take its address.
Clang apparently simply does not error on failes always inlines which
makes its life easier.

int n;
typedef void (*fnptr)();
fnptr get_me();
__attribute__ ((always_inline))
inline void test(void)
{
        if (n < 10)
          (get_me())();
        n++;
        return;
}
fnptr get_me()
{
        return test;
}
void
foo()
{
        test();
}

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2023-06-23 13:07 ` hubicka at ucw dot cz
@ 2023-06-26  6:39 ` rguenther at suse dot de
  2023-06-26 17:50 ` hubicka at ucw dot cz
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenther at suse dot de @ 2023-06-26  6:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #10 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 23 Jun 2023, hubicka at ucw dot cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
> 
> --- Comment #8 from Jan Hubicka <hubicka at ucw dot cz> ---
> > > I was playing with the idea of warning when at lto time when comdats have
> > > different command line options, but this triggers way too often in practice.
> > 
> > Really? :/
> Yep, for example firefox consist of many extra libraries (skia, video
> codecs, image format decoders...) each developed independently LTOed
> into one libxul.  Each of them has its own configure that tampers with
> command line options.
> These brings in libstdc++ comdats with different command line
> sets (over 100 of different command lines for std::vector).
> 
> Firefox is bit extreme, but it is common for other bigger projects, too.
> > 
> > I think it would be desirable to diagnose these, maybe with an option to
> > selectively disable this specific diagnostic.  Because while it is not
> > always a correctness issue it can be a performance issue as well.
> 
> I can dig out the patch then.  But it was simply printing something like
> warning: merging comdat function with function of same name but
> different flags
> note: first difference is -f.....
> which was produced similar way we do diffs for optimization and target
> attributes.
> 
> Now those -f.... were usually quite misleading as they tended to be
> internal flags implied by something else (such as -Ofast instead of
> -O2).  Often the picked -f flag was obviously harmless and false
> positive, however other -f flag might have been important.
> 
> So I concluded that it is not very easy information to interpret from user
> perspective.
> 
> But indeed I agree htat this is not very obvious source of interesting
> correctness & performance issues.

Hmm, maybe we should then really go the semantic alias route for
QOI reasons?  We can then still diagnose "preserving comdat symbol
because of different flags in TU X vs. Y" and maybe we can also estimate
a total unit growth because of doing that?

> > 
> > Beware of new always-inline calls then appearing after greedy inlining
> > (though that's exactly the case that we try to avoid here).  I suppose
> > you could disable inlining of a function which contains always-inline
> > calls or simply functions that did not yet have the early inliner run
> > on them (so keep the current behavior in cycles).  Beware of indirect
> > always-inline calls then.
> > 
> > Btw, for Skia the issue is really that some auto-generated CTOR isn't
> > marked always-inline but everything else is.  Maybe they should use
> > flatten instead of always-inline ...
> 
> We disable inlining to always_inline during early inline, but not greedy
> inline.  Both of them can turn indirect calls to direct calls.  So I was
> thinking that as first cut we can inline only callees with no indirect
> calls and no inlinable direct calls into always_inlines with no indirect
> calls.
>
> Are you worried about possibility of early opt inventing new call into
> builtin function that is defined as always_inline?

No, just that we end up with a always-inline cycle (aka recursion) which
we used to time/memory bomb on.  That is, sth like the following has
a "valid" inline solution (valid aka all always-inline calls are inlined),
but only if we break the cycle at the 'forward' call.  We used to
mishandle this when unlucky with ordering (depending on which edge
we enter the cycle when ordering for early opts).

static void forward (void);
static inline void __attribute__((always_inline)) always() { forward (); 
}
static void forward (void) { always (); };

void entry1 () { forward1 (); }
void entry2 () { always (); }

now you can make this less obvious a cycle via indirect calls, maybe
not without taking the address of always-inline functions which would
be on the border of being a user error.

Richard.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2023-06-26  6:39 ` rguenther at suse dot de
@ 2023-06-26 17:50 ` hubicka at ucw dot cz
  2023-06-27  6:41 ` rguenther at suse dot de
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: hubicka at ucw dot cz @ 2023-06-26 17:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #11 from Jan Hubicka <hubicka at ucw dot cz> ---
Hi,
what about this. It should make at least quite basic inlining to happen
to always_inline. I do not think many critical always_inlines have
indirect calls in them.  The test for lto is quite bad and I can
work on solving this incrementally (it would be nice to have this
tested and possibly backport it).

diff --git a/gcc/ipa-inline.cc b/gcc/ipa-inline.cc
index efc8df7d4e0..dcec07e49e1 100644
--- a/gcc/ipa-inline.cc
+++ b/gcc/ipa-inline.cc
@@ -702,6 +702,38 @@ can_early_inline_edge_p (struct cgraph_edge *e)
   if (!can_inline_edge_p (e, true, true)
       || !can_inline_edge_by_limits_p (e, true, false, true))
     return false;
+  /* When inlining regular functions into always-inline functions
+     during early inlining watch for possible inline cycles.  */
+  if (DECL_DISREGARD_INLINE_LIMITS (caller->decl)
+      && lookup_attribute ("always_inline", DECL_ATTRIBUTES (caller->decl))
+      && (!DECL_DISREGARD_INLINE_LIMITS (callee->decl)
+         || !lookup_attribute ("always_inline", DECL_ATTRIBUTES
(callee->decl))))
+    {
+      /* If there are indirect calls, inlining may produce direct call.
+        TODO: We may lift this restriction if we avoid errors on formely
+        indirect calls to always_inline functions.  Taking address
+        of always_inline function is generally bad idea and should
+        have been declared as undefined, but sadly we allow this.  */
+      if (caller->indirect_calls || e->callee->indirect_calls)
+       return false;
+      for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee)
+       {
+         struct cgraph_node *callee2 = e2->callee->ultimate_alias_target ();
+         /* As early inliner runs in RPO order, we will see uninlined
+            always_inline calls only in the case of cyclic graphs.  */
+         if (DECL_DISREGARD_INLINE_LIMITS (callee2->decl)
+             || lookup_attribute ("always_inline", callee2->decl))
+           return false;
+         /* With LTO watch for case where function is later replaced
+            by always_inline definition.
+            TODO: We may either stop treating noninlined cross-module always
+            inlines as errors, or we can extend decl merging to produce
+            syntacic alias and honor always inline only in units it has
+            been declared as such.  */
+         if (flag_lto && callee2->externally_visible)
+           return false;
+       }
+    }
   return true;
 }

@@ -3034,18 +3066,7 @@ early_inliner (function *fun)

   if (!optimize
       || flag_no_inline
-      || !flag_early_inlining
-      /* Never inline regular functions into always-inline functions
-        during incremental inlining.  This sucks as functions calling
-        always inline functions will get less optimized, but at the
-        same time inlining of functions calling always inline
-        function into an always inline function might introduce
-        cycles of edges to be always inlined in the callgraph.
-
-        We might want to be smarter and just avoid this type of inlining.  */
-      || (DECL_DISREGARD_INLINE_LIMITS (node->decl)
-         && lookup_attribute ("always_inline",
-                              DECL_ATTRIBUTES (node->decl))))
+      || !flag_early_inlining)
     ;
   else if (lookup_attribute ("flatten",
                             DECL_ATTRIBUTES (node->decl)) != NULL)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2023-06-26 17:50 ` hubicka at ucw dot cz
@ 2023-06-27  6:41 ` rguenther at suse dot de
  2023-06-28 10:00   ` Jan Hubicka
  2023-06-28  4:42 ` cvs-commit at gcc dot gnu.org
                   ` (14 subsequent siblings)
  27 siblings, 1 reply; 31+ messages in thread
From: rguenther at suse dot de @ 2023-06-27  6:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #12 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 26 Jun 2023, hubicka at ucw dot cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
> 
> --- Comment #11 from Jan Hubicka <hubicka at ucw dot cz> ---
> Hi,
> what about this. It should make at least quite basic inlining to happen
> to always_inline. I do not think many critical always_inlines have
> indirect calls in them.  The test for lto is quite bad and I can
> work on solving this incrementally (it would be nice to have this
> tested and possibly backport it).
> 
> diff --git a/gcc/ipa-inline.cc b/gcc/ipa-inline.cc
> index efc8df7d4e0..dcec07e49e1 100644
> --- a/gcc/ipa-inline.cc
> +++ b/gcc/ipa-inline.cc
> @@ -702,6 +702,38 @@ can_early_inline_edge_p (struct cgraph_edge *e)
>    if (!can_inline_edge_p (e, true, true)
>        || !can_inline_edge_by_limits_p (e, true, false, true))
>      return false;
> +  /* When inlining regular functions into always-inline functions
> +     during early inlining watch for possible inline cycles.  */
> +  if (DECL_DISREGARD_INLINE_LIMITS (caller->decl)
> +      && lookup_attribute ("always_inline", DECL_ATTRIBUTES (caller->decl))
> +      && (!DECL_DISREGARD_INLINE_LIMITS (callee->decl)
> +         || !lookup_attribute ("always_inline", DECL_ATTRIBUTES
> (callee->decl))))
> +    {
> +      /* If there are indirect calls, inlining may produce direct call.
> +        TODO: We may lift this restriction if we avoid errors on formely
> +        indirect calls to always_inline functions.  Taking address
> +        of always_inline function is generally bad idea and should
> +        have been declared as undefined, but sadly we allow this.  */
> +      if (caller->indirect_calls || e->callee->indirect_calls)

why disallow caller->indirect_calls?

> +       return false;
> +      for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee)

I don't think this flys - it looks quadratic.  Can we compute this
in the inline summary once instead?

As for indirect calls, can we maybe mark initial direct GIMPLE call
stmts as "always-inline" and only look at that marking, thus an
indirect call will never become "always-inline"?  Iff cgraph edges
prevail during all early inlining we could mark call edges for
this purpose?

> +       {
> +         struct cgraph_node *callee2 = e2->callee->ultimate_alias_target ();
> +         /* As early inliner runs in RPO order, we will see uninlined
> +            always_inline calls only in the case of cyclic graphs.  */
> +         if (DECL_DISREGARD_INLINE_LIMITS (callee2->decl)
> +             || lookup_attribute ("always_inline", callee2->decl))
> +           return false;
> +         /* With LTO watch for case where function is later replaced
> +            by always_inline definition.
> +            TODO: We may either stop treating noninlined cross-module always
> +            inlines as errors, or we can extend decl merging to produce
> +            syntacic alias and honor always inline only in units it has
> +            been declared as such.  */
> +         if (flag_lto && callee2->externally_visible)
> +           return false;
> +       }
> +    }
>    return true;
>  }
> 
> @@ -3034,18 +3066,7 @@ early_inliner (function *fun)
> 
>    if (!optimize
>        || flag_no_inline
> -      || !flag_early_inlining
> -      /* Never inline regular functions into always-inline functions
> -        during incremental inlining.  This sucks as functions calling
> -        always inline functions will get less optimized, but at the
> -        same time inlining of functions calling always inline
> -        function into an always inline function might introduce
> -        cycles of edges to be always inlined in the callgraph.
> -
> -        We might want to be smarter and just avoid this type of inlining.  */
> -      || (DECL_DISREGARD_INLINE_LIMITS (node->decl)
> -         && lookup_attribute ("always_inline",
> -                              DECL_ATTRIBUTES (node->decl))))
> +      || !flag_early_inlining)
>      ;
>    else if (lookup_attribute ("flatten",
>                              DECL_ATTRIBUTES (node->decl)) != NULL)
> 
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2023-06-27  6:41 ` rguenther at suse dot de
@ 2023-06-28  4:42 ` cvs-commit at gcc dot gnu.org
  2023-06-28 10:00 ` hubicka at ucw dot cz
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-06-28  4:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Jason Merrill <jason@gcc.gnu.org>:

https://gcc.gnu.org/g:abdf0b6cdff5783b97f35ad61ae31433f0569dfd

commit r14-2149-gabdf0b6cdff5783b97f35ad61ae31433f0569dfd
Author: Jason Merrill <jason@redhat.com>
Date:   Tue Jun 27 05:15:01 2023 -0400

    c++: inherited constructor attributes

    Inherited constructors are like constructor clones; they don't exist from
    the language perspective, so they should copy the attributes in the same
    way.  But it doesn't make sense to copy alias or ifunc attributes in either
    case.  Unlike handle_copy_attribute, we do want to copy inlining
attributes.

    The discussion of PR110334 pointed out that we weren't copying the
    always_inline attribute, leading to poor inlining choices.

            PR c++/110334

    gcc/cp/ChangeLog:

            * cp-tree.h (clone_attrs): Declare.
            * method.cc (implicitly_declare_fn): Use it for inherited
            constructor.
            * optimize.cc (clone_attrs): New.
            (maybe_clone_body): Use it.

    gcc/testsuite/ChangeLog:

            * g++.dg/cpp1z/nodiscard-inh1.C: New test.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-27  6:41 ` rguenther at suse dot de
@ 2023-06-28 10:00   ` Jan Hubicka
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Hubicka @ 2023-06-28 10:00 UTC (permalink / raw)
  To: rguenther at suse dot de; +Cc: gcc-bugs

> 
> why disallow caller->indirect_calls?
See testcase in comment #9
> 
> > +       return false;
> > +      for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee)
> 
> I don't think this flys - it looks quadratic.  Can we compute this
> in the inline summary once instead?

I guess I can place a cache there.  I think this check will become more
global over time so it more fits IMO here.
> 
> As for indirect calls, can we maybe mark initial direct GIMPLE call
> stmts as "always-inline" and only look at that marking, thus an
> indirect call will never become "always-inline"?  Iff cgraph edges
> prevail during all early inlining we could mark call edges for
> this purpose?

I also think we need call site specific info.
Tagging gimple call statements and copying the info to gimple edges will
probably be needed here.  We want to keep the info from early inlining
to late inlining since we output errors late.
We already have plenty of GF_CALL_ flags, so adding one should be easy?

Honza

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2023-06-28  4:42 ` cvs-commit at gcc dot gnu.org
@ 2023-06-28 10:00 ` hubicka at ucw dot cz
  2023-06-28 10:20 ` rguenther at suse dot de
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: hubicka at ucw dot cz @ 2023-06-28 10:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #14 from Jan Hubicka <hubicka at ucw dot cz> ---
> 
> why disallow caller->indirect_calls?
See testcase in comment #9
> 
> > +       return false;
> > +      for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee)
> 
> I don't think this flys - it looks quadratic.  Can we compute this
> in the inline summary once instead?

I guess I can place a cache there.  I think this check will become more
global over time so it more fits IMO here.
> 
> As for indirect calls, can we maybe mark initial direct GIMPLE call
> stmts as "always-inline" and only look at that marking, thus an
> indirect call will never become "always-inline"?  Iff cgraph edges
> prevail during all early inlining we could mark call edges for
> this purpose?

I also think we need call site specific info.
Tagging gimple call statements and copying the info to gimple edges will
probably be needed here.  We want to keep the info from early inlining
to late inlining since we output errors late.
We already have plenty of GF_CALL_ flags, so adding one should be easy?

Honza

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2023-06-28 10:00 ` hubicka at ucw dot cz
@ 2023-06-28 10:20 ` rguenther at suse dot de
  2023-06-28 10:45 ` hubicka at ucw dot cz
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenther at suse dot de @ 2023-06-28 10:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #15 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 28 Jun 2023, hubicka at ucw dot cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
> 
> --- Comment #14 from Jan Hubicka <hubicka at ucw dot cz> ---
> > 
> > why disallow caller->indirect_calls?
> See testcase in comment #9
> > 
> > > +       return false;
> > > +      for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee)
> > 
> > I don't think this flys - it looks quadratic.  Can we compute this
> > in the inline summary once instead?
> 
> I guess I can place a cache there.  I think this check will become more
> global over time so it more fits IMO here.
> > 
> > As for indirect calls, can we maybe mark initial direct GIMPLE call
> > stmts as "always-inline" and only look at that marking, thus an
> > indirect call will never become "always-inline"?  Iff cgraph edges
> > prevail during all early inlining we could mark call edges for
> > this purpose?
> 
> I also think we need call site specific info.
> Tagging gimple call statements and copying the info to gimple edges will
> probably be needed here.  We want to keep the info from early inlining
> to late inlining since we output errors late.
> We already have plenty of GF_CALL_ flags, so adding one should be easy?

We have 3 bits left :/  I was hoping that cgraph_edge lives long
enough?  But I suppose we're not keeping them across the early opts
pipeline.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2023-06-28 10:20 ` rguenther at suse dot de
@ 2023-06-28 10:45 ` hubicka at ucw dot cz
  2023-06-28 21:06 ` cvs-commit at gcc dot gnu.org
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: hubicka at ucw dot cz @ 2023-06-28 10:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #16 from Jan Hubicka <hubicka at ucw dot cz> ---
> > We already have plenty of GF_CALL_ flags, so adding one should be easy?
> 
> We have 3 bits left :/  I was hoping that cgraph_edge lives long
> enough?  But I suppose we're not keeping them across the early opts
> pipeline.

Hmm, so we have too many flags.  Indeed problem is that we don't want to
keep callgraph edges across all modifications gimple optimization passes
does.
Eventualy such annotations can probably go to hash_map just like we do
for EH regions etc.

Honza

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2023-06-28 10:45 ` hubicka at ucw dot cz
@ 2023-06-28 21:06 ` cvs-commit at gcc dot gnu.org
  2023-07-03  7:05 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-06-28 21:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #17 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jan Hubicka <hubicka@gcc.gnu.org>:

https://gcc.gnu.org/g:d88fd2e1d0720e6f892da9ff48e9a301a7ad0ad4

commit r14-2172-gd88fd2e1d0720e6f892da9ff48e9a301a7ad0ad4
Author: Jan Hubicka <jh@suse.cz>
Date:   Wed Jun 28 23:05:28 2023 +0200

    Enable early inlining into always_inline functions

    Early inliner currently skips always_inline functions and moreover we
ignore
    calls from always_inline in ipa_reverse_postorder.  This leads to disabling
    most of propagation done using early optimization that is quite bad when
    early inline functions are not leaf functions, which is now quite common
    in libstdc++.

    This patch instead of fully disabling the inline checks calls in callee.
    I am quite conservative about what can be inlined as this patch is bit
    touchy anyway.  To avoid problems with always_inline being optimized
    after early inline I extended inline_always_inline_functions to lazilly
    compute fnsummary when needed.

    gcc/ChangeLog:

            PR middle-end/110334
            * ipa-fnsummary.h (ipa_fn_summary): Add
            safe_to_inline_to_always_inline.
            * ipa-inline.cc (can_early_inline_edge_p): ICE
            if SSA is not built; do cycle checking for
            always_inline functions.
            (inline_always_inline_functions): Be recrusive;
            watch for cycles; do not updat overall summary.
            (early_inliner): Do not give up on always_inlines.
            * ipa-utils.cc (ipa_reverse_postorder): Do not skip
            always inlines.

    gcc/testsuite/ChangeLog:

            PR middle-end/110334
            * g++.dg/opt/pr66119.C: Disable early inlining.
            * gcc.c-torture/compile/pr110334.c: New test.
            * gcc.dg/tree-ssa/pr110334.c: New test.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2023-06-28 21:06 ` cvs-commit at gcc dot gnu.org
@ 2023-07-03  7:05 ` jakub at gcc dot gnu.org
  2023-07-10  7:40 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-07-03  7:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #18 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
For skia, wouldn't the best fix be stop compiling the TU(s) with -mavx but
instead
#pragma GCC push_options
#pragma GCC target("avx")
after including headers and
#pragma GCC pop_options
at the end (plus clang equivalents)?
Then whatever is defined in that file would be -mavx and the comdats if inlined
would do too, but if they aren't inlined would still not have it?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (18 preceding siblings ...)
  2023-07-03  7:05 ` jakub at gcc dot gnu.org
@ 2023-07-10  7:40 ` rguenth at gcc dot gnu.org
  2023-07-10  8:24 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-10  7:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
It seems that the C++ FE change in comment#13 causes libreoffice to fail to
build with

[  553s]
/home/abuild/rpmbuild/BUILD/libreoffice-7.5.4.2/workdir/UnpackedTarball/skia/include/private/SkVx.h:
In function 'interpret_skvm':
[  553s]
/home/abuild/rpmbuild/BUILD/libreoffice-7.5.4.2/workdir/UnpackedTarball/skia/include/private/SkVx.h:150:
error: inlining failed in call to 'always_inline' '__ct_comp ': target specific
option mismatch
[  553s]   150 |     using VecStorage<N,T>::VecStorage;
[  553s]       | 

I have yet to reproduce and extract a testcase though.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (19 preceding siblings ...)
  2023-07-10  7:40 ` rguenth at gcc dot gnu.org
@ 2023-07-10  8:24 ` rguenth at gcc dot gnu.org
  2023-07-10  8:24 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-10  8:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #19)
> It seems that the C++ FE change in comment#13 causes libreoffice to fail to
> build with
> 
> [  553s]
> /home/abuild/rpmbuild/BUILD/libreoffice-7.5.4.2/workdir/UnpackedTarball/skia/
> include/private/SkVx.h: In function 'interpret_skvm':
> [  553s]
> /home/abuild/rpmbuild/BUILD/libreoffice-7.5.4.2/workdir/UnpackedTarball/skia/
> include/private/SkVx.h:150: error: inlining failed in call to
> 'always_inline' '__ct_comp ': target specific option mismatch
> [  553s]   150 |     using VecStorage<N,T>::VecStorage;
> [  553s]       | 
> 
> I have yet to reproduce and extract a testcase though.

libreoffice has a skia import and links its Library/libskialo.so library
with LTO, combining the AVX, HSW, SSE41, SSE42, SSSE3, CRC32, SKX, ...
SkOpts_*.o objects which likely follow a very similar scheme as the firefox
copy.  SkOpts_hsw.cpp looks like

#include "src/core/SkOpts.h"

#define SK_OPTS_NS hsw
#include "src/core/SkCubicSolver.h"
#include "src/opts/SkBitmapProcState_opts.h"
#include "src/opts/SkBlitRow_opts.h"
#include "src/opts/SkRasterPipeline_opts.h"
#include "src/opts/SkSwizzler_opts.h"
#include "src/opts/SkUtils_opts.h"
#include "src/opts/SkVM_opts.h"

namespace SkOpts {
    void Init_hsw() {
        blit_row_color32     = hsw::blit_row_color32;
        blit_row_s32a_opaque = hsw::blit_row_s32a_opaque;

        S32_alpha_D32_filter_DX  = hsw::S32_alpha_D32_filter_DX;

        cubic_solver = SK_OPTS_NS::cubic_solver;

        RGBA_to_BGRA          = SK_OPTS_NS::RGBA_to_BGRA;
        RGBA_to_rgbA          = SK_OPTS_NS::RGBA_to_rgbA;
        RGBA_to_bgrA          = SK_OPTS_NS::RGBA_to_bgrA;
        gray_to_RGB1          = SK_OPTS_NS::gray_to_RGB1;
        grayA_to_RGBA         = SK_OPTS_NS::grayA_to_RGBA;
        grayA_to_rgbA         = SK_OPTS_NS::grayA_to_rgbA;
        inverted_CMYK_to_RGB1 = SK_OPTS_NS::inverted_CMYK_to_RGB1;
        inverted_CMYK_to_BGR1 = SK_OPTS_NS::inverted_CMYK_to_BGR1;

    #define M(st) stages_highp[SkRasterPipeline::st] = (StageFn)SK_OPTS_NS::st;
        SK_RASTER_PIPELINE_STAGES(M)
        just_return_highp = (StageFn)SK_OPTS_NS::just_return;
        start_pipeline_highp = SK_OPTS_NS::start_pipeline;
    #undef M

    #define M(st) stages_lowp[SkRasterPipeline::st] =
(StageFn)SK_OPTS_NS::lowp::st;
        SK_RASTER_PIPELINE_STAGES(M)
        just_return_lowp = (StageFn)SK_OPTS_NS::lowp::just_return;
        start_pipeline_lowp = SK_OPTS_NS::lowp::start_pipeline;
    #undef M

        interpret_skvm = SK_OPTS_NS::interpret_skvm;
    }
}  // namespace SkOpts

the question in the end is why we fail to process the always_inlines
early here and only discover them late.  Unfortunately the inline
diagnostic doesn't print the original TU name (that would be useful
for all late LTO diagnostics).

I'm not sure whether this is an issue with the C++ frontend change or with
the ODR issues present in Skia.  Eventually since we're dealing with
aliases(?), always_inline isn't appropriately detected or still missing
on some callees.

Unfortunately there isn't a knob to diagnose late inlined always-inline
functions.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (20 preceding siblings ...)
  2023-07-10  8:24 ` rguenth at gcc dot gnu.org
@ 2023-07-10  8:24 ` rguenth at gcc dot gnu.org
  2023-07-11 14:45 ` hubicka at ucw dot cz
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-10  8:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 55512
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55512&action=edit
another testcase

This one needs -mavx2 -mf16c -mfma -fPIC -O2 -std=c++17

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (21 preceding siblings ...)
  2023-07-10  8:24 ` rguenth at gcc dot gnu.org
@ 2023-07-11 14:45 ` hubicka at ucw dot cz
  2023-07-11 14:46 ` hubicka at ucw dot cz
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: hubicka at ucw dot cz @ 2023-07-11 14:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #22 from Jan Hubicka <hubicka at ucw dot cz> ---
I will cook up the patch to keep multiple variants of nodes pre-inline
and we will see how much that affects compile time & how hard it will be
to get unit size esitmates right.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (22 preceding siblings ...)
  2023-07-11 14:45 ` hubicka at ucw dot cz
@ 2023-07-11 14:46 ` hubicka at ucw dot cz
  2023-07-12  7:05 ` rguenther at suse dot de
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: hubicka at ucw dot cz @ 2023-07-11 14:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #23 from Jan Hubicka <hubicka at ucw dot cz> ---
But it would be nice to see why the functions are not early inlined.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (23 preceding siblings ...)
  2023-07-11 14:46 ` hubicka at ucw dot cz
@ 2023-07-12  7:05 ` rguenther at suse dot de
  2023-07-27  9:26 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: rguenther at suse dot de @ 2023-07-12  7:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #24 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 11 Jul 2023, hubicka at ucw dot cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
> 
> --- Comment #23 from Jan Hubicka <hubicka at ucw dot cz> ---
> But it would be nice to see why the functions are not early inlined.

Note this was on the 13 branch with the C++ fix backported, so
before your changes to allow to inline into always-inline.  But yes,
it would be nice to confirm the issue is gone on trunk and to see
where always_inline is still missing (if it is).

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (24 preceding siblings ...)
  2023-07-12  7:05 ` rguenther at suse dot de
@ 2023-07-27  9:26 ` rguenth at gcc dot gnu.org
  2023-11-25 10:00 ` egallager at gcc dot gnu.org
  2023-11-27  7:28 ` rguenth at gcc dot gnu.org
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-27  9:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|13.2                        |13.3

--- Comment #25 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 13.2 is being released, retargeting bugs to GCC 13.3.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (25 preceding siblings ...)
  2023-07-27  9:26 ` rguenth at gcc dot gnu.org
@ 2023-11-25 10:00 ` egallager at gcc dot gnu.org
  2023-11-27  7:28 ` rguenth at gcc dot gnu.org
  27 siblings, 0 replies; 31+ messages in thread
From: egallager at gcc dot gnu.org @ 2023-11-25 10:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

Eric Gallager <egallager at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |egallager at gcc dot gnu.org

--- Comment #26 from Eric Gallager <egallager at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #20)
> Unfortunately there isn't a knob to diagnose late inlined always-inline
> functions.

Is there a separate bug for this?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming
  2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
                   ` (26 preceding siblings ...)
  2023-11-25 10:00 ` egallager at gcc dot gnu.org
@ 2023-11-27  7:28 ` rguenth at gcc dot gnu.org
  27 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-11-27  7:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334

--- Comment #27 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Eric Gallager from comment #26)
> (In reply to Richard Biener from comment #20)
> > Unfortunately there isn't a knob to diagnose late inlined always-inline
> > functions.
> 
> Is there a separate bug for this?

I don't think so.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2023-11-27  7:28 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-21  7:05 [Bug ipa/110334] New: [13/14 Regresssion] unused functions not eliminated before LTO streaming rguenth at gcc dot gnu.org
2023-06-21  7:06 ` [Bug ipa/110334] " rguenth at gcc dot gnu.org
2023-06-21  7:08 ` rguenth at gcc dot gnu.org
2023-06-21  7:12 ` rguenth at gcc dot gnu.org
2023-06-22 11:15 ` rguenth at gcc dot gnu.org
2023-06-22 11:45 ` rguenth at gcc dot gnu.org
2023-06-22 12:02 ` rguenth at gcc dot gnu.org
2023-06-23 10:47 ` hubicka at gcc dot gnu.org
2023-06-23 11:21 ` rguenth at gcc dot gnu.org
2023-06-23 12:59 ` hubicka at ucw dot cz
2023-06-23 13:07   ` Jan Hubicka
2023-06-23 13:07 ` hubicka at ucw dot cz
2023-06-26  6:39 ` rguenther at suse dot de
2023-06-26 17:50 ` hubicka at ucw dot cz
2023-06-27  6:41 ` rguenther at suse dot de
2023-06-28 10:00   ` Jan Hubicka
2023-06-28  4:42 ` cvs-commit at gcc dot gnu.org
2023-06-28 10:00 ` hubicka at ucw dot cz
2023-06-28 10:20 ` rguenther at suse dot de
2023-06-28 10:45 ` hubicka at ucw dot cz
2023-06-28 21:06 ` cvs-commit at gcc dot gnu.org
2023-07-03  7:05 ` jakub at gcc dot gnu.org
2023-07-10  7:40 ` rguenth at gcc dot gnu.org
2023-07-10  8:24 ` rguenth at gcc dot gnu.org
2023-07-10  8:24 ` rguenth at gcc dot gnu.org
2023-07-11 14:45 ` hubicka at ucw dot cz
2023-07-11 14:46 ` hubicka at ucw dot cz
2023-07-12  7:05 ` rguenther at suse dot de
2023-07-27  9:26 ` rguenth at gcc dot gnu.org
2023-11-25 10:00 ` egallager at gcc dot gnu.org
2023-11-27  7:28 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).