[Bug lto/107014] New: flatten+lto

public inbox for gcc-bugs@sourceware.org

* [Bug lto/107014] New: flatten+lto
From: jirislaby at gmail dot com @ 2022-09-23  7:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

            Bug ID: 107014
           Summary: flatten+lto
           Product: gcc
           Version: 12.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: lto
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jirislaby at gmail dot com
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Maybe this is simply a dup of bug 77472, but I am not sure what the proper
solution is supposed to be.

In the kernel, when LTO is enabled, we disable flatten completely, so that the
build is able to finish -- in about 2 minutes, with 4G of RAM.

With flatten enabled, the build had not finished after 10 minutes and was using
40G of RAM at that point.

There is only a single user of flatten in the kernel: pcpu_build_alloc_info().
Here:
https://github.com/torvalds/linux/blob/bf682942cd26ce9cd5e87f73ae099b383041e782/mm/percpu.c#L2852-L2855

That single use is enough to keep the link from finishing in a reasonable time
or with a reasonable amount of RAM. I suppose a lot of inlining happens in such
a long function. But shouldn't there really be some limit?

I am going to link this bug in the commit message disabling flatten on LTO, so
that we have a reference of why/how...
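
A minimal sketch of the workaround described above, for illustration only. The
symbol EXAMPLE_LTO_BUILD and all example_* names are hypothetical stand-ins; the
thread does not show the actual kernel change or name the config option it uses.

```c
/* Hypothetical switch: defined only when building with LTO. */
#ifdef EXAMPLE_LTO_BUILD
# define example_flatten                        /* expands to nothing under LTO */
#else
# define example_flatten __attribute__((flatten))
#endif

/* Small helper that flatten would pull into its caller. */
static int example_helper(int x)
{
    return x * 2;
}

/* With EXAMPLE_LTO_BUILD defined this is an ordinary function;
 * otherwise GCC is asked to inline every call made from its body. */
example_flatten int example_build_info(int n)
{
    return example_helper(n) + 1;
}
```

Either way the function computes the same result; only the inlining request
changes.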

* [Bug lto/107014] flatten+lto fails the kernel build
From: marxin at gcc dot gnu.org @ 2022-09-23  7:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-09-23
             Status|UNCONFIRMED                 |NEW
                 CC|                            |hubicka at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Martin Liška <marxin at gcc dot gnu.org> ---
@Honza: Can you comment on this, please?

* [Bug lto/107014] flatten+lto fails the kernel build
From: rguenth at gcc dot gnu.org @ 2022-09-23  8:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The whole point of "flatten" is that there's _no_ limit.  Looking at the
function I don't see why you'd ever use that?

If the desire is to force inlining a specific call then I think there's
currently no good way to get that.  It doesn't seem to be possible to use an
always-inline alias to get this behavior at least.  Some not-yet-existing
#pragma GCC inline might be a better solution than "flatten".

But as said, I'd just remove "flatten" - is it known what it was added for?
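
To illustrate the distinction being drawn here, a minimal sketch that is not
taken from the thread: flatten is placed on the caller and asks GCC to inline
every call in its body (and the calls that inlining exposes) without the usual
size limits, while always_inline is placed on one callee and affects only that
function. All example_* names are hypothetical.

```c
static int example_leaf(int x)
{
    return x + 1;
}

static int example_middle(int x)
{
    return example_leaf(x) * 2;
}

/* flatten on the caller: example_middle, and the example_leaf calls it
 * contains, are all candidates for unconditional inlining here. */
__attribute__((flatten)) int example_caller_flat(int x)
{
    return example_middle(x) + example_leaf(x);
}

/* always_inline on a callee: only this one function is force-inlined. */
static inline __attribute__((always_inline)) int example_forced(int x)
{
    return x - 1;
}

/* example_forced is inlined; example_middle is left to the normal heuristics. */
int example_caller_plain(int x)
{
    return example_forced(x) + example_middle(x);
}
```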

* [Bug lto/107014] flatten+lto fails the kernel build
From: amonakov at gcc dot gnu.org @ 2022-09-23  8:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
It was added to force inlining of small helpers that outgrow limits when
building with gcov profiling:

https://github.com/torvalds/linux/commit/258e0815e2b1706e87c0d874211097aa8a7aa52f

(lack of inlining triggered a sanity check, as explained in the commit)


I am surprised that "flatten" blows up on this function. Is that with any
config, or again some specific settings like gcov? Is there an existing lkml
thread about this?

* [Bug lto/107014] flatten+lto fails the kernel build
From: jirislaby at gmail dot com @ 2022-09-23  8:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

--- Comment #4 from Jiri Slaby <jirislaby at gmail dot com> ---
(In reply to Alexander Monakov from comment #3)
> It was added to force inlining of small helpers that outgrow limits when
> building with gcov profiling:

(with clang)

> I am surprised that "flatten" blows up on this function. Is that with any
> config, or again some specific settings like gcov? Is there an existing lkml
> thread about this?

Yes, linked in the commit log:
https://lore.kernel.org/all/CAK8P3a2ZWfNeXKSm8K_SUhhwkor17jFo3xApLXjzfPqX0eUDUA@mail.gmail.com/

* [Bug lto/107014] flatten+lto fails the kernel build
From: amonakov at gcc dot gnu.org @ 2022-09-23  8:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

--- Comment #5 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Jiri Slaby from comment #4)
> > I am surprised that "flatten" blows up on this function. Is that with any
> > config, or again some specific settings like gcov? Is there an existing lkml
> > thread about this?
> 
> Yes, linked in the commit log:
> https://lore.kernel.org/all/CAK8P3a2ZWfNeXKSm8K_SUhhwkor17jFo3xApLXjzfPqX0eUDUA@mail.gmail.com/

I mean now, about compile time blowup with LTO.

* [Bug lto/107014] flatten+lto fails the kernel build
From: jirislaby at gmail dot com @ 2022-09-23  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

--- Comment #6 from Jiri Slaby <jirislaby at gmail dot com> ---
(In reply to Alexander Monakov from comment #5)
> I mean now, about compile time blowup with LTO.

No, LTO is not supported by upstream (yet) ;).

The point is what I should do when submitting the LTO support. Disabling
flatten in the kernel completely does not sound quite right. I wanted to
confirm that this is not a compiler issue -- you are saying the function shoots
itself in the foot with __flatten__.

So thinking about the best approach...

* [Bug lto/107014] flatten+lto fails the kernel build
From: amonakov at gcc dot gnu.org @ 2022-09-23 10:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

--- Comment #7 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
I wanted to understand what gets exposed in LTO mode that causes a blowup.

I'd say flatten is not appropriate for this function (I don't think you want to
force inlining of memset or _find_next_bit?), so it might be better to go back
to the original issue and solve the problem in a more focused way (e.g.
force-inlining the function which needs to access __initdata if you really need
the verification that triggers otherwise).
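
A rough sketch of that more focused approach, under stated assumptions: the
example_* macros below are user-space stand-ins for the kernel's __init,
__initdata and __always_inline, the section names are invented, and the helper
is a simplified placeholder rather than the real cpumask code. The idea is that
only the helper touching init-only data is force-inlined, so its access ends up
inside the init function instead of in an out-of-line copy in regular text.

```c
/* Stand-ins for kernel annotations (assumed names, ELF-style sections). */
#define example_init          __attribute__((section(".example.init.text")))
#define example_initdata      __attribute__((section(".example.init.data")))
#define example_always_inline inline __attribute__((always_inline))

/* Data that is only valid during initialization. */
static unsigned long example_mask example_initdata = 0xffUL;

/* Force-inlined helper: the access to the mask is emitted in the caller. */
static example_always_inline void example_clear_bit(unsigned int bit,
                                                    unsigned long *mask)
{
    *mask &= ~(1UL << bit);
}

/* Init-only function; no flatten needed, other callees stay out of line. */
example_init unsigned long example_build_info(unsigned int cpu)
{
    example_clear_bit(cpu, &example_mask);
    return example_mask;
}
```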

* [Bug lto/107014] flatten+lto fails the kernel build
From: jakub at gcc dot gnu.org @ 2022-09-23 10:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Isn't cpumask_clear_cpu using the __always_inline macro, which expands to inline
__attribute__((__always_inline__)), and so should always be inlined?
Anyway, if you want to inline a particular function at some specific spot in
some section, perhaps add a __flatten __always_inline wrapper around it,
possibly also placed in the __init section?
With LTO, the flatten attribute will really try to inline into the function
everything that is called from it and isn't explicitly noinline.
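
One way to read the wrapper suggestion, as a hedged sketch with hypothetical
example_* names: the wrapper carries both attributes, so flatten pulls the
wrapped callee into the wrapper and always_inline pulls the wrapper into its
caller, while every other call in the large function keeps the normal inlining
heuristics.

```c
/* Ordinary callee that is not itself marked for forced inlining. */
static int example_helper(int x)
{
    return x * 3;
}

/* Wrapper: flatten inlines example_helper into it, and always_inline
 * inlines the wrapper into whichever function calls it. */
static inline __attribute__((flatten, always_inline))
int example_helper_inlined(int x)
{
    return example_helper(x);
}

/* Large function: only this one call site is force-inlined. */
int example_big_function(int x)
{
    return example_helper_inlined(x) + 1;
}
```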

* [Bug lto/107014] flatten+lto fails the kernel build
From: andi-gcc at firstfloor dot org @ 2022-09-25 19:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014

Andi Kleen <andi-gcc at firstfloor dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andi-gcc at firstfloor dot org

--- Comment #9 from Andi Kleen <andi-gcc at firstfloor dot org> ---
I suspect what happens is that it hits some kernel initialization function.
If those functions don't use initcall, the LTO build can inline them all into
each other (because they are only called once), creating a single big
initialization function. With flatten that will create an extremely large
function that takes a long time to process.

I suspect any use of flatten would be better off using always_inline, since that
affects only a single function. This should probably be fixed upstream in the
kernel.
