* PROPOSAL: Extend inline asm syntax with size spec [not found] <20181003213100.189959-1-namit@vmware.com> @ 2018-10-07 9:46 ` Borislav Petkov 2018-10-07 13:23 ` Segher Boessenkool [not found] ` <4F2F1BCE-7875-4160-9E1E-9F8EF962D989@vmware.com> 0 siblings, 2 replies; 40+ messages in thread From: Borislav Petkov @ 2018-10-07 9:46 UTC (permalink / raw) To: gcc, Richard Biener, Michael Matz Cc: Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa Hi people, this is an attempt to see whether gcc's inline asm heuristic when estimating inline asm statements' cost for better inlining can be improved. AFAIU, the problematic arises when one ends up using a lot of inline asm statements in the kernel but due to the inline asm cost estimation heuristic which counts lines, I think, for example like in this here macro: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/cpufeature.h#n162 the resulting code ends up not inlining the functions themselves which use this macro. I.e., you see a CALL <function> instead of its body getting inlined directly. Even though it should be because the actual instructions are only a couple in most cases and all those other directives end up in another section anyway. The issue is explained below in the forwarded mail in a larger detail too. Now, Richard suggested doing something like: 1) inline asm ("...") 2) asm ("..." : : : : <size-expr>) 3) asm ("...") __attribute__((asm_size(<size-expr>))); with which user can tell gcc what the size of that inline asm statement is and thus allow for more precise cost estimation and in the end better inlining. And FWIW 3) looks pretty straight-forward to me because attributes are pretty common anyways. But I'm sure there are other options and I'm sure people will have better/different ideas so feel free to chime in. Thx. On Wed, Oct 03, 2018 at 02:30:50PM -0700, Nadav Amit wrote: > This patch-set deals with an interesting yet stupid problem: kernel code > that does not get inlined despite its simplicity. There are several > causes for this behavior: "cold" attribute on __init, different function > optimization levels; conditional constant computations based on > __builtin_constant_p(); and finally large inline assembly blocks. > > This patch-set deals with the inline assembly problem. I separated these > patches from the others (that were sent in the RFC) for easier > inclusion. I also separated the removal of unnecessary new-lines which > would be sent separately. > > The problem with inline assembly is that inline assembly is often used > by the kernel for things that are other than code - for example, > assembly directives and data. GCC however is oblivious to the content of > the blocks and assumes their cost in space and time is proportional to > the number of the perceived assembly "instruction", according to the > number of newlines and semicolons. Alternatives, paravirt and other > mechanisms are affected, causing code not to be inlined, and degrading > compilation quality in general. > > The solution that this patch-set carries for this problem is to create > an assembly macro, and then call it from the inline assembly block. As > a result, the compiler sees a single "instruction" and assigns the more > appropriate cost to the code. > > To avoid uglification of the code, as many noted, the macros are first > precompiled into an assembly file, which is later assembled together > with the C files. This also enables to avoid duplicate implementation > that was set before for the asm and C code. This can be seen in the > exception table changes. > > Overall this patch-set slightly increases the kernel size (my build was > done using my Ubuntu 18.04 config + localyesconfig for the record): > > text data bss dec hex filename > 18140829 10224724 2957312 31322865 1ddf2f1 ./vmlinux before > 18163608 10227348 2957312 31348268 1de562c ./vmlinux after (+0.1%) > > The number of static functions in the image is reduced by 379, but > actually inlining is even better, which does not always shows in these > numbers: a function may be inlined causing the calling function not to > be inlined. > > I ran some limited number of benchmarks, and in general the performance > impact is not very notable. You can still see >10 cycles shaved off some > syscalls that manipulate page-tables (e.g., mprotect()), in which > paravirt caused many functions not to be inlined. In addition this > patch-set can prevent issues such as [1], and improves code readability > and maintainability. > > Update: Rasmus recently caused me (inadvertently) to become paranoid > about the dependencies. To clarify: if any of the headers changes, any c > file which uses macros that are included in macros.S would be fine as > long as it includes the header as well (as it should). Adding an > assertion to check this is done might become slightly ugly, and nobody > else is concerned about it. Another minor issue is that changes of > macros.S would not trigger a global rebuild, but that is pretty similar > to changes of the Makefile that do not trigger a rebuild. > > [1] https://patchwork.kernel.org/patch/10450037/ > > v8->v9: * Restoring the '-pipe' parameter (Rasmus) > * Adding Kees's tested-by tag (Kees) > > v7->v8: * Add acks (Masahiro, Max) > * Rebase on 4.19 (Ingo) > > v6->v7: * Fix context switch tracking (Ingo) > * Fix xtensa build error (Ingo) > * Rebase on 4.18-rc8 > > v5->v6: * Removing more code from jump-labels (PeterZ) > * Fix build issue on i386 (0-day, PeterZ) > > v4->v5: * Makefile fixes (Masahiro, Sam) > > v3->v4: * Changed naming of macros in 2 patches (PeterZ) > * Minor cleanup of the paravirt patch > > v2->v3: * Several build issues resolved (0-day) > * Wrong comments fix (Josh) > * Change asm vs C order in refcount (Kees) > > v1->v2: * Compiling the macros into a separate .s file, improving > readability (Linus) > * Improving assembly formatting, applying most of the comments > according to my judgment (Jan) > * Adding exception-table, cpufeature and jump-labels > * Removing new-line cleanup; to be submitted separately > > Cc: Masahiro Yamada <yamada.masahiro@socionext.com> > Cc: Sam Ravnborg <sam@ravnborg.org> > Cc: Alok Kataria <akataria@vmware.com> > Cc: Christopher Li <sparse@chrisli.org> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > Cc: "H. Peter Anvin" <hpa@zytor.com> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Jan Beulich <JBeulich@suse.com> > Cc: Josh Poimboeuf <jpoimboe@redhat.com> > Cc: Juergen Gross <jgross@suse.com> > Cc: Kate Stewart <kstewart@linuxfoundation.org> > Cc: Kees Cook <keescook@chromium.org> > Cc: linux-sparse@vger.kernel.org > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Philippe Ombredanne <pombredanne@nexb.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: virtualization@lists.linux-foundation.org > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: x86@kernel.org > Cc: Chris Zankel <chris@zankel.net> > Cc: Max Filippov <jcmvbkbc@gmail.com> > Cc: linux-xtensa@linux-xtensa.org > > Nadav Amit (10): > xtensa: defining LINKER_SCRIPT for the linker script > Makefile: Prepare for using macros for inline asm > x86: objtool: use asm macro for better compiler decisions > x86: refcount: prevent gcc distortions > x86: alternatives: macrofy locks for better inlining > x86: bug: prevent gcc distortions > x86: prevent inline distortion by paravirt ops > x86: extable: use macros instead of inline assembly > x86: cpufeature: use macros instead of inline assembly > x86: jump-labels: use macros instead of inline assembly > > Makefile | 9 ++- > arch/x86/Makefile | 7 ++ > arch/x86/entry/calling.h | 2 +- > arch/x86/include/asm/alternative-asm.h | 20 ++++-- > arch/x86/include/asm/alternative.h | 11 +-- > arch/x86/include/asm/asm.h | 61 +++++++--------- > arch/x86/include/asm/bug.h | 98 +++++++++++++++----------- > arch/x86/include/asm/cpufeature.h | 82 ++++++++++++--------- > arch/x86/include/asm/jump_label.h | 77 ++++++++------------ > arch/x86/include/asm/paravirt_types.h | 56 +++++++-------- > arch/x86/include/asm/refcount.h | 74 +++++++++++-------- > arch/x86/kernel/macros.S | 16 +++++ > arch/xtensa/kernel/Makefile | 4 +- > include/asm-generic/bug.h | 8 +-- > include/linux/compiler.h | 56 +++++++++++---- > scripts/Kbuild.include | 4 +- > scripts/mod/Makefile | 2 + > 17 files changed, 331 insertions(+), 256 deletions(-) > create mode 100644 arch/x86/kernel/macros.S > > -- > 2.17.1 > -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-07 9:46 ` PROPOSAL: Extend inline asm syntax with size spec Borislav Petkov @ 2018-10-07 13:23 ` Segher Boessenkool 2018-10-07 14:14 ` Borislav Petkov 2018-10-07 15:53 ` Michael Matz [not found] ` <4F2F1BCE-7875-4160-9E1E-9F8EF962D989@vmware.com> 1 sibling, 2 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-10-07 13:23 UTC (permalink / raw) To: Borislav Petkov Cc: gcc, Richard Biener, Michael Matz, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: > this is an attempt to see whether gcc's inline asm heuristic when > estimating inline asm statements' cost for better inlining can be > improved. GCC already estimates the *size* of inline asm, and this is required *for correctness*. So any workaround that works against this will only end in tears. Taking size as an estimate of cost is not very good. But in your motivating example you actually *do* care mostly about size, namely, for the inlining decisions. So I guess the real issue is that the inline asm size estimate for x86 isn't very good (since it has to be pessimistic, and x86 insns can be huge)? > Now, Richard suggested doing something like: > > 1) inline asm ("...") What would the semantics of this be? > 2) asm ("..." : : : : <size-expr>) This potentially conflicts with the syntax for asm goto. > 3) asm ("...") __attribute__((asm_size(<size-expr>))); Eww. > with which user can tell gcc what the size of that inline asm statement > is and thus allow for more precise cost estimation and in the end better > inlining. More precise *size* estimates, yes. And if the user lies he should not be surprised to get assembler errors, etc. > And FWIW 3) looks pretty straight-forward to me because attributes are > pretty common anyways. I don't like 2) either. But 1) looks interesting, depends what its semantics would be? "Don't count this insn's size for inlining decisions", maybe? Another option is to just force inlining for those few functions where GCC currently makes an inlining decision you don't like. Or are there more than a few? Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-07 13:23 ` Segher Boessenkool @ 2018-10-07 14:14 ` Borislav Petkov 2018-10-07 15:14 ` Segher Boessenkool 2018-10-07 15:53 ` Michael Matz 1 sibling, 1 reply; 40+ messages in thread From: Borislav Petkov @ 2018-10-07 14:14 UTC (permalink / raw) To: Segher Boessenkool Cc: gcc, Richard Biener, Michael Matz, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Sun, Oct 07, 2018 at 08:22:28AM -0500, Segher Boessenkool wrote: > GCC already estimates the *size* of inline asm, and this is required > *for correctness*. I didn't say it didn't - but the heuristic could use improving. > So I guess the real issue is that the inline asm size estimate for x86 > isn't very good (since it has to be pessimistic, and x86 insns can be > huge)? Well, the size thing could be just a "parameter" or "hint" of sorts, to tell gcc to inline the function X which is inlining the asm statement into the function Y which is calling function X. If you look at the patchset, it is moving everything to asm macros where gcc is apparently able to do better inlining. > > 3) asm ("...") __attribute__((asm_size(<size-expr>))); > > Eww. Why? > More precise *size* estimates, yes. And if the user lies he should not > be surprised to get assembler errors, etc. Yes. Another option would be if gcc parses the inline asm directly and does a more precise size estimation. Which is a lot more involved and complicated solution so I guess we wanna look at the simpler ones first. :-) > I don't like 2) either. But 1) looks interesting, depends what its > semantics would be? "Don't count this insn's size for inlining decisions", > maybe? Or simply "this asm statement has a size of 1" to mean, inline it everywhere. Which has the same caveats as above. > Another option is to just force inlining for those few functions where > GCC currently makes an inlining decision you don't like. Or are there > more than a few? I'm afraid they're more than a few and this should work automatically, if possible. Thx. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-07 14:14 ` Borislav Petkov @ 2018-10-07 15:14 ` Segher Boessenkool [not found] ` <20181008055838.GA66819@gmail.com> 0 siblings, 1 reply; 40+ messages in thread From: Segher Boessenkool @ 2018-10-07 15:14 UTC (permalink / raw) To: Borislav Petkov Cc: gcc, Richard Biener, Michael Matz, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Sun, Oct 07, 2018 at 04:13:49PM +0200, Borislav Petkov wrote: > On Sun, Oct 07, 2018 at 08:22:28AM -0500, Segher Boessenkool wrote: > > GCC already estimates the *size* of inline asm, and this is required > > *for correctness*. > > I didn't say it didn't - but the heuristic could use improving. How? It is as sharp an estimate as can be *already*: number of insns times maximum size per insn. If you get this wrong, conditional branches (and similar things, but conditional branches usually hit first, and hurt most) will stop working correctly, unless binutils uses relaxation for those on your architecture (most don't). > > So I guess the real issue is that the inline asm size estimate for x86 > > isn't very good (since it has to be pessimistic, and x86 insns can be > > huge)? > > Well, the size thing could be just a "parameter" or "hint" of sorts, to > tell gcc to inline the function X which is inlining the asm statement > into the function Y which is calling function X. If you look at the > patchset, it is moving everything to asm macros where gcc is apparently > able to do better inlining. Yes, that will cause fewer problems I think: do not override size _at all_, but give a hint to the inliner. > > > 3) asm ("...") __attribute__((asm_size(<size-expr>))); > > > > Eww. > > Why? Attributes are clumsy and clunky and kludgy. It never is well-defined how attributes interact, and the more attributes we define and use, the more that matters. > > More precise *size* estimates, yes. And if the user lies he should not > > be surprised to get assembler errors, etc. > > Yes. > > Another option would be if gcc parses the inline asm directly and > does a more precise size estimation. Which is a lot more involved and > complicated solution so I guess we wanna look at the simpler ones first. > > :-) Which is *impossible* to do. Inline assembler is free-form text. > > I don't like 2) either. But 1) looks interesting, depends what its > > semantics would be? "Don't count this insn's size for inlining decisions", > > maybe? > > Or simply "this asm statement has a size of 1" to mean, inline it > everywhere. Which has the same caveats as above. "Has minimum length" then (size 1 cannot work on most archs). > > Another option is to just force inlining for those few functions where > > GCC currently makes an inlining decision you don't like. Or are there > > more than a few? > > I'm afraid they're more than a few and this should work automatically, > if possible. Would counting *all* asms as having minimum length for inlining decisions work? Will that give bad side effects? Or since this problem is quite specific to x86, maybe some target hook is wanted? Things work quite well elsewhere as-is, degrading that is not a good idea. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <20181008055838.GA66819@gmail.com>]
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <20181008055838.GA66819@gmail.com> @ 2018-10-08 7:53 ` Segher Boessenkool 0 siblings, 0 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-10-08 7:53 UTC (permalink / raw) To: Ingo Molnar Cc: Borislav Petkov, gcc, Richard Biener, Michael Matz, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Mon, Oct 08, 2018 at 07:58:38AM +0200, Ingo Molnar wrote: > * Segher Boessenkool <segher@kernel.crashing.org> wrote: > > > > More precise *size* estimates, yes. And if the user lies he should not > > > > be surprised to get assembler errors, etc. > > > > > > Yes. > > > > > > Another option would be if gcc parses the inline asm directly and > > > does a more precise size estimation. Which is a lot more involved and > > > complicated solution so I guess we wanna look at the simpler ones first. > > > > > > :-) > > > > Which is *impossible* to do. Inline assembler is free-form text. > > "Impossible" is false: only under GCC's model and semantics of inline > asm that is, and only under the (false) assumption that the semantics > of the asm statement (which is a GCC extension to begin with) cannot > be changed like it has been changed multiple times in the past. > > "Difficult", "not worth our while", perhaps. If we throw out our current definition of inline assembler, and of the internal backend interfaces, then sure you can do it. This of course invalidates all code that uses GCC inline assembler, and all GCC backends (both in-tree and out-of-tree, both current and historical). If other compilers think everyone should rewrite all of their code because those compiler do inline asm wro^H^H^Hdifferently, that is their problem; GCC should not deny all history and screw over all its users. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-07 13:23 ` Segher Boessenkool 2018-10-07 14:14 ` Borislav Petkov @ 2018-10-07 15:53 ` Michael Matz 2018-10-08 7:31 ` Segher Boessenkool [not found] ` <20181008061323.GB66819@gmail.com> 1 sibling, 2 replies; 40+ messages in thread From: Michael Matz @ 2018-10-07 15:53 UTC (permalink / raw) To: Segher Boessenkool Cc: Borislav Petkov, gcc, Richard Biener, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa Hi Segher, On Sun, 7 Oct 2018, Segher Boessenkool wrote: > On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: > > this is an attempt to see whether gcc's inline asm heuristic when > > estimating inline asm statements' cost for better inlining can be > > improved. > > GCC already estimates the *size* of inline asm, and this is required > *for correctness*. So any workaround that works against this will only > end in tears. You're right and wrong. GCC can't even estimate the size of mildly complicated inline asms right now, so your claim of it being necessary for correctness can't be true in this absolute form. I know what you try to say, but still, consider inline asms like this: insn1 .section bla insn2 .previous or invoke_asm_macro foo,bar in both cases GCCs size estimate will be wrong however you want to count it. This is actually the motivating example for the kernel guys, the games they play within their inline asms make the estimates be wildly wrong to a point it interacts with the inliner. > So I guess the real issue is that the inline asm size estimate for x86 > isn't very good (since it has to be pessimistic, and x86 insns can be > huge)? No, see above, even if we were to improve the size estimates (e.g. based on some average instruction size) the kernel examples would still be off because they switch sections back and forth, use asm macros and computed .fill directives and maybe further stuff. GCC will never be able to accurately calculate these sizes (without an built-in assembler which hopefully noone proposes). So, there is a case for extending the inline-asm facility to say "size is complicated here, assume this for inline decisions". > > Now, Richard suggested doing something like: > > > > 1) inline asm ("...") > > What would the semantics of this be? The size of the inline asm wouldn't be counted towards the inliner size limits (or be counted as "1"). > I don't like 2) either. But 1) looks interesting, depends what its > semantics would be? "Don't count this insn's size for inlining decisions", > maybe? TBH, I like the inline asm (...) suggestion most currently, but what if we want to add more attributes to asms? We could add further special keywords to the clobber list: asm ("...." : : : "cc,memory,inline"); sure, it might seem strange to "clobber" inline, but if we reinterpret the clobber list as arbitrary set of attributes for this asm, it'd be fine. > Another option is to just force inlining for those few functions where > GCC currently makes an inlining decision you don't like. Or are there > more than a few? I think the examples I saw from Boris were all indirect inlines: static inline void foo() { asm("large-looking-but-small-asm"); } static void bar1() { ... foo() ... } static void bar2() { ... foo() ... } void goo (void) { bar1(); } // bar1 should have been inlined So, while the immediate asm user was marked as always inline that in turn caused users of it to become non-inlined. I'm assuming the kernel guys did proper measurements that they _really_ get some non-trivial speed benefit by inlining bar1/bar2, but for some reasons (I didn't inquire) didn't want to mark them all as inline as well. Ciao, Michael. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-07 15:53 ` Michael Matz @ 2018-10-08 7:31 ` Segher Boessenkool 2018-10-08 9:09 ` Richard Biener [not found] ` <20181008061323.GB66819@gmail.com> 1 sibling, 1 reply; 40+ messages in thread From: Segher Boessenkool @ 2018-10-08 7:31 UTC (permalink / raw) To: Michael Matz Cc: Borislav Petkov, gcc, Richard Biener, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa Hi! On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote: > On Sun, 7 Oct 2018, Segher Boessenkool wrote: > > On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: > > > this is an attempt to see whether gcc's inline asm heuristic when > > > estimating inline asm statements' cost for better inlining can be > > > improved. > > > > GCC already estimates the *size* of inline asm, and this is required > > *for correctness*. So any workaround that works against this will only > > end in tears. > > You're right and wrong. GCC can't even estimate the size of mildly > complicated inline asms right now, so your claim of it being necessary for > correctness can't be true in this absolute form. I know what you try to > say, but still, consider inline asms like this: > > insn1 > .section bla > insn2 > .previous > > or > invoke_asm_macro foo,bar > > in both cases GCCs size estimate will be wrong however you want to count > it. This is actually the motivating example for the kernel guys, the > games they play within their inline asms make the estimates be wildly > wrong to a point it interacts with the inliner. Right. The manual says: """ Some targets require that GCC track the size of each instruction used in order to generate correct code. Because the final length of the code produced by an @code{asm} statement is only known by the assembler, GCC must make an estimate as to how big it will be. It does this by counting the number of instructions in the pattern of the @code{asm} and multiplying that by the length of the longest instruction supported by that processor. (When working out the number of instructions, it assumes that any occurrence of a newline or of whatever statement separator character is supported by the assembler -- typically @samp{;} --- indicates the end of an instruction.) Normally, GCC's estimate is adequate to ensure that correct code is generated, but it is possible to confuse the compiler if you use pseudo instructions or assembler macros that expand into multiple real instructions, or if you use assembler directives that expand to more space in the object file than is needed for a single instruction. If this happens then the assembler may produce a diagnostic saying that a label is unreachable. """ It *is* necessary for correctness, except you can do things that can confuse the compiler and then you are on your own anyway. > > So I guess the real issue is that the inline asm size estimate for x86 > > isn't very good (since it has to be pessimistic, and x86 insns can be > > huge)? > > No, see above, even if we were to improve the size estimates (e.g. based > on some average instruction size) the kernel examples would still be off > because they switch sections back and forth, use asm macros and computed > .fill directives and maybe further stuff. GCC will never be able to > accurately calculate these sizes What *is* such a size, anyway? If it can be spread over multiple sections (some of which support section merging), and you can have huge alignments, etc. What is needed here is not knowing the maximum size of the binary output (however you want to define that), but some way for the compiler to understand how bad it is to inline some assembler. Maybe manual direction, maybe just the current jeuristics can be tweaked a bit, maybe we need to invent some attribute or two. > (without an built-in assembler which hopefully noone proposes). Not me, that's for sure. > So, there is a case for extending the inline-asm facility to say > "size is complicated here, assume this for inline decisions". Yeah, that's an option. It may be too complicated though, or just not useful in its generality, so that everyone will use "1" (or "1 normal size instruction"), and then we are better off just making something for _that_ (or making it the default). > > > Now, Richard suggested doing something like: > > > > > > 1) inline asm ("...") > > > > What would the semantics of this be? > > The size of the inline asm wouldn't be counted towards the inliner size > limits (or be counted as "1"). That sounds like a good option. > > I don't like 2) either. But 1) looks interesting, depends what its > > semantics would be? "Don't count this insn's size for inlining decisions", > > maybe? > > TBH, I like the inline asm (...) suggestion most currently, but what if we > want to add more attributes to asms? We could add further special > keywords to the clobber list: > asm ("...." : : : "cc,memory,inline"); > sure, it might seem strange to "clobber" inline, but if we reinterpret the > clobber list as arbitrary set of attributes for this asm, it'd be fine. All of a targets register names and alternative register names are allowed in the clobber list. Will that never conflict with an attribute name? We already *have* syntax for specifying attributes on an asm (on *any* statement even), so mixing these two things has no advantage. Both "cc" and "memory" have their own problems of course, adding more things to this just feels bad. It may not be so bad ;-) > > Another option is to just force inlining for those few functions where > > GCC currently makes an inlining decision you don't like. Or are there > > more than a few? > > I think the examples I saw from Boris were all indirect inlines: > > static inline void foo() { asm("large-looking-but-small-asm"); } > static void bar1() { ... foo() ... } > static void bar2() { ... foo() ... } > void goo (void) { bar1(); } // bar1 should have been inlined > > So, while the immediate asm user was marked as always inline that in turn > caused users of it to become non-inlined. I'm assuming the kernel guys > did proper measurements that they _really_ get some non-trivial speed > benefit by inlining bar1/bar2, but for some reasons (I didn't inquire) > didn't want to mark them all as inline as well. Yeah that makes sense, like if this happens with the fixup stuff, it will quickly spiral out of control. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-08 7:31 ` Segher Boessenkool @ 2018-10-08 9:09 ` Richard Biener 2018-10-08 10:03 ` Segher Boessenkool 2018-10-09 15:54 ` Segher Boessenkool 0 siblings, 2 replies; 40+ messages in thread From: Richard Biener @ 2018-10-08 9:09 UTC (permalink / raw) To: Segher Boessenkool Cc: Michael Matz, Borislav Petkov, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Mon, 8 Oct 2018, Segher Boessenkool wrote: > Hi! > > On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote: > > On Sun, 7 Oct 2018, Segher Boessenkool wrote: > > > On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: > > > > this is an attempt to see whether gcc's inline asm heuristic when > > > > estimating inline asm statements' cost for better inlining can be > > > > improved. > > > > > > GCC already estimates the *size* of inline asm, and this is required > > > *for correctness*. So any workaround that works against this will only > > > end in tears. > > > > You're right and wrong. GCC can't even estimate the size of mildly > > complicated inline asms right now, so your claim of it being necessary for > > correctness can't be true in this absolute form. I know what you try to > > say, but still, consider inline asms like this: > > > > insn1 > > .section bla > > insn2 > > .previous > > > > or > > invoke_asm_macro foo,bar > > > > in both cases GCCs size estimate will be wrong however you want to count > > it. This is actually the motivating example for the kernel guys, the > > games they play within their inline asms make the estimates be wildly > > wrong to a point it interacts with the inliner. > > Right. The manual says: > > """ > Some targets require that GCC track the size of each instruction used > in order to generate correct code. Because the final length of the > code produced by an @code{asm} statement is only known by the > assembler, GCC must make an estimate as to how big it will be. It > does this by counting the number of instructions in the pattern of the > @code{asm} and multiplying that by the length of the longest > instruction supported by that processor. (When working out the number > of instructions, it assumes that any occurrence of a newline or of > whatever statement separator character is supported by the assembler -- > typically @samp{;} --- indicates the end of an instruction.) > > Normally, GCC's estimate is adequate to ensure that correct > code is generated, but it is possible to confuse the compiler if you use > pseudo instructions or assembler macros that expand into multiple real > instructions, or if you use assembler directives that expand to more > space in the object file than is needed for a single instruction. > If this happens then the assembler may produce a diagnostic saying that > a label is unreachable. > """ > > It *is* necessary for correctness, except you can do things that can > confuse the compiler and then you are on your own anyway. > > > > So I guess the real issue is that the inline asm size estimate for x86 > > > isn't very good (since it has to be pessimistic, and x86 insns can be > > > huge)? > > > > No, see above, even if we were to improve the size estimates (e.g. based > > on some average instruction size) the kernel examples would still be off > > because they switch sections back and forth, use asm macros and computed > > .fill directives and maybe further stuff. GCC will never be able to > > accurately calculate these sizes > > What *is* such a size, anyway? If it can be spread over multiple sections > (some of which support section merging), and you can have huge alignments, > etc. What is needed here is not knowing the maximum size of the binary > output (however you want to define that), but some way for the compiler > to understand how bad it is to inline some assembler. Maybe manual > direction, maybe just the current jeuristics can be tweaked a bit, maybe > we need to invent some attribute or two. > > > (without an built-in assembler which hopefully noone proposes). > > Not me, that's for sure. > > > So, there is a case for extending the inline-asm facility to say > > "size is complicated here, assume this for inline decisions". > > Yeah, that's an option. It may be too complicated though, or just not > useful in its generality, so that everyone will use "1" (or "1 normal > size instruction"), and then we are better off just making something > for _that_ (or making it the default). > > > > > Now, Richard suggested doing something like: > > > > > > > > 1) inline asm ("...") > > > > > > What would the semantics of this be? > > > > The size of the inline asm wouldn't be counted towards the inliner size > > limits (or be counted as "1"). > > That sounds like a good option. Yes, I also like it for simplicity. It also avoids the requirement of translating the number (in bytes?) given by the user to "number of GIMPLE instructions" as needed by the inliner. > > > I don't like 2) either. But 1) looks interesting, depends what its > > > semantics would be? "Don't count this insn's size for inlining decisions", > > > maybe? > > > > TBH, I like the inline asm (...) suggestion most currently, but what if we > > want to add more attributes to asms? We could add further special > > keywords to the clobber list: > > asm ("...." : : : "cc,memory,inline"); > > sure, it might seem strange to "clobber" inline, but if we reinterpret the > > clobber list as arbitrary set of attributes for this asm, it'd be fine. > > All of a targets register names and alternative register names are > allowed in the clobber list. Will that never conflict with an attribute > name? We already *have* syntax for specifying attributes on an asm (on > *any* statement even), so mixing these two things has no advantage. Heh, but I failed to make an example with attribute synatx working. IIRC attributes do not work on stmts. What could work is to use a #pragma though. Richard. > Both "cc" and "memory" have their own problems of course, adding more > things to this just feels bad. It may not be so bad ;-) > > > > Another option is to just force inlining for those few functions where > > > GCC currently makes an inlining decision you don't like. Or are there > > > more than a few? > > > > I think the examples I saw from Boris were all indirect inlines: > > > > static inline void foo() { asm("large-looking-but-small-asm"); } > > static void bar1() { ... foo() ... } > > static void bar2() { ... foo() ... } > > void goo (void) { bar1(); } // bar1 should have been inlined > > > > So, while the immediate asm user was marked as always inline that in turn > > caused users of it to become non-inlined. I'm assuming the kernel guys > > did proper measurements that they _really_ get some non-trivial speed > > benefit by inlining bar1/bar2, but for some reasons (I didn't inquire) > > didn't want to mark them all as inline as well. > > Yeah that makes sense, like if this happens with the fixup stuff, it will > quickly spiral out of control. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-08 9:09 ` Richard Biener @ 2018-10-08 10:03 ` Segher Boessenkool 2018-10-09 15:54 ` Segher Boessenkool 1 sibling, 0 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-10-08 10:03 UTC (permalink / raw) To: Richard Biener Cc: Michael Matz, Borislav Petkov, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Mon, Oct 08, 2018 at 11:07:46AM +0200, Richard Biener wrote: > > All of a targets register names and alternative register names are > > allowed in the clobber list. Will that never conflict with an attribute > > name? We already *have* syntax for specifying attributes on an asm (on > > *any* statement even), so mixing these two things has no advantage. > > Heh, but I failed to make an example with attribute synatx working. > IIRC attributes do not work on stmts. What could work is to use > a #pragma though. Apparently statement attributes currently(?) only work for null statements. Oh well. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-08 9:09 ` Richard Biener 2018-10-08 10:03 ` Segher Boessenkool @ 2018-10-09 15:54 ` Segher Boessenkool 2018-10-10 7:12 ` Richard Biener ` (2 more replies) 1 sibling, 3 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-10-09 15:54 UTC (permalink / raw) To: Richard Biener Cc: Michael Matz, Borislav Petkov, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Mon, Oct 08, 2018 at 11:07:46AM +0200, Richard Biener wrote: > On Mon, 8 Oct 2018, Segher Boessenkool wrote: > > On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote: > > > On Sun, 7 Oct 2018, Segher Boessenkool wrote: > > > > On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: > > > > > Now, Richard suggested doing something like: > > > > > > > > > > 1) inline asm ("...") > > > > > > > > What would the semantics of this be? > > > > > > The size of the inline asm wouldn't be counted towards the inliner size > > > limits (or be counted as "1"). > > > > That sounds like a good option. > > Yes, I also like it for simplicity. It also avoids the requirement > of translating the number (in bytes?) given by the user to > "number of GIMPLE instructions" as needed by the inliner. This patch implements this, for C only so far. And the syntax is "asm inline", which is more in line with other syntax. How does this look? Segher diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 1f173fc..94b1d41 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -6287,7 +6287,7 @@ static tree c_parser_asm_statement (c_parser *parser) { tree quals, str, outputs, inputs, clobbers, labels, ret; - bool simple, is_goto; + bool simple, is_goto, is_inline; location_t asm_loc = c_parser_peek_token (parser)->location; int section, nsections; @@ -6318,6 +6318,13 @@ c_parser_asm_statement (c_parser *parser) is_goto = true; } + is_inline = false; + if (c_parser_next_token_is_keyword (parser, RID_INLINE)) + { + c_parser_consume_token (parser); + is_inline = true; + } + /* ??? Follow the C++ parser rather than using the lex_untranslated_string kludge. */ parser->lex_untranslated_string = true; @@ -6393,7 +6400,8 @@ c_parser_asm_statement (c_parser *parser) c_parser_skip_to_end_of_block_or_statement (parser); ret = build_asm_stmt (quals, build_asm_expr (asm_loc, str, outputs, inputs, - clobbers, labels, simple)); + clobbers, labels, simple, + is_inline)); error: parser->lex_untranslated_string = false; diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h index 017c01c..f5629300 100644 --- a/gcc/c/c-tree.h +++ b/gcc/c/c-tree.h @@ -677,7 +677,8 @@ extern tree build_compound_literal (location_t, tree, tree, bool, extern void check_compound_literal_type (location_t, struct c_type_name *); extern tree c_start_case (location_t, location_t, tree, bool); extern void c_finish_case (tree, tree); -extern tree build_asm_expr (location_t, tree, tree, tree, tree, tree, bool); +extern tree build_asm_expr (location_t, tree, tree, tree, tree, tree, bool, + bool); extern tree build_asm_stmt (tree, tree); extern int c_types_compatible_p (tree, tree); extern tree c_begin_compound_stmt (bool); diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index 9d09b8d..e013100 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -10064,7 +10064,7 @@ build_asm_stmt (tree cv_qualifier, tree args) are subtly different. We use a ASM_EXPR node to represent this. */ tree build_asm_expr (location_t loc, tree string, tree outputs, tree inputs, - tree clobbers, tree labels, bool simple) + tree clobbers, tree labels, bool simple, bool is_inline) { tree tail; tree args; @@ -10182,6 +10182,7 @@ build_asm_expr (location_t loc, tree string, tree outputs, tree inputs, as volatile. */ ASM_INPUT_P (args) = simple; ASM_VOLATILE_P (args) = (noutputs == 0); + ASM_INLINE_P (args) = is_inline; return args; } diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c index 83e2273..1a00fa3 100644 --- a/gcc/gimple-pretty-print.c +++ b/gcc/gimple-pretty-print.c @@ -2019,6 +2019,8 @@ dump_gimple_asm (pretty_printer *buffer, gasm *gs, int spc, dump_flags_t flags) pp_string (buffer, "__asm__"); if (gimple_asm_volatile_p (gs)) pp_string (buffer, " __volatile__"); + if (gimple_asm_inline_p (gs)) + pp_string (buffer, " __inline__"); if (gimple_asm_nlabels (gs)) pp_string (buffer, " goto"); pp_string (buffer, "(\""); diff --git a/gcc/gimple.h b/gcc/gimple.h index a5dda93..8a58e07 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -137,6 +137,7 @@ enum gimple_rhs_class enum gf_mask { GF_ASM_INPUT = 1 << 0, GF_ASM_VOLATILE = 1 << 1, + GF_ASM_INLINE = 1 << 2, GF_CALL_FROM_THUNK = 1 << 0, GF_CALL_RETURN_SLOT_OPT = 1 << 1, GF_CALL_TAILCALL = 1 << 2, @@ -3911,7 +3912,7 @@ gimple_asm_volatile_p (const gasm *asm_stmt) } -/* If VOLATLE_P is true, mark asm statement ASM_STMT as volatile. */ +/* If VOLATILE_P is true, mark asm statement ASM_STMT as volatile. */ static inline void gimple_asm_set_volatile (gasm *asm_stmt, bool volatile_p) @@ -3923,6 +3924,27 @@ gimple_asm_set_volatile (gasm *asm_stmt, bool volatile_p) } +/* Return true ASM_STMT ASM_STMT is an asm statement marked inline. */ + +static inline bool +gimple_asm_inline_p (const gasm *asm_stmt) +{ + return (asm_stmt->subcode & GF_ASM_INLINE) != 0; +} + + +/* If INLINE_P is true, mark asm statement ASM_STMT as inline. */ + +static inline void +gimple_asm_set_inline (gasm *asm_stmt, bool inline_p) +{ + if (inline_p) + asm_stmt->subcode |= GF_ASM_INLINE; + else + asm_stmt->subcode &= ~GF_ASM_INLINE; +} + + /* If INPUT_P is true, mark asm ASM_STMT as an ASM_INPUT. */ static inline void diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 509fc2f..10b80f2 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -6315,6 +6315,7 @@ gimplify_asm_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p) gimple_asm_set_volatile (stmt, ASM_VOLATILE_P (expr) || noutputs == 0); gimple_asm_set_input (stmt, ASM_INPUT_P (expr)); + gimple_asm_set_inline (stmt, ASM_INLINE_P (expr)); gimplify_seq_add_stmt (pre_p, stmt); } diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c index ba39ea3..5361139 100644 --- a/gcc/ipa-icf-gimple.c +++ b/gcc/ipa-icf-gimple.c @@ -993,6 +993,9 @@ func_checker::compare_gimple_asm (const gasm *g1, const gasm *g2) if (gimple_asm_input_p (g1) != gimple_asm_input_p (g2)) return false; + if (gimple_asm_inline_p (g1) != gimple_asm_inline_p (g2)) + return false; + if (gimple_asm_ninputs (g1) != gimple_asm_ninputs (g2)) return false; diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 9134253..6b1d2ea 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -4108,6 +4108,8 @@ estimate_num_insns (gimple *stmt, eni_weights *weights) with very long asm statements. */ if (count > 1000) count = 1000; + if (gimple_asm_inline_p (as_a <gasm *> (stmt))) + count = !!count; return MAX (1, count); } diff --git a/gcc/tree.h b/gcc/tree.h index 2e45f9d..160b3a7 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1245,6 +1245,9 @@ extern tree maybe_wrap_with_location (tree, location_t); ASM_OPERAND with no operands. */ #define ASM_INPUT_P(NODE) (ASM_EXPR_CHECK (NODE)->base.static_flag) #define ASM_VOLATILE_P(NODE) (ASM_EXPR_CHECK (NODE)->base.public_flag) +/* Nonzero if we want to consider this asm as minimum length and cost + for inlining decisions. */ +#define ASM_INLINE_P(NODE) (ASM_EXPR_CHECK (NODE)->base.protected_flag) /* COND_EXPR accessors. */ #define COND_EXPR_COND(NODE) (TREE_OPERAND (COND_EXPR_CHECK (NODE), 0)) ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-09 15:54 ` Segher Boessenkool @ 2018-10-10 7:12 ` Richard Biener 2018-10-10 7:54 ` Segher Boessenkool [not found] ` <20181010072240.GB103159@gmail.com> 2018-10-10 16:31 ` Nadav Amit [not found] ` <CAK7LNARmrdtyE7JRAdJH_zbfvD_cej+TGLh+5KfT20o538KUcA@mail.gmail.com> 2 siblings, 2 replies; 40+ messages in thread From: Richard Biener @ 2018-10-10 7:12 UTC (permalink / raw) To: Segher Boessenkool Cc: Michael Matz, Borislav Petkov, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Tue, 9 Oct 2018, Segher Boessenkool wrote: > On Mon, Oct 08, 2018 at 11:07:46AM +0200, Richard Biener wrote: > > On Mon, 8 Oct 2018, Segher Boessenkool wrote: > > > On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote: > > > > On Sun, 7 Oct 2018, Segher Boessenkool wrote: > > > > > On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: > > > > > > Now, Richard suggested doing something like: > > > > > > > > > > > > 1) inline asm ("...") > > > > > > > > > > What would the semantics of this be? > > > > > > > > The size of the inline asm wouldn't be counted towards the inliner size > > > > limits (or be counted as "1"). > > > > > > That sounds like a good option. > > > > Yes, I also like it for simplicity. It also avoids the requirement > > of translating the number (in bytes?) given by the user to > > "number of GIMPLE instructions" as needed by the inliner. > > This patch implements this, for C only so far. And the syntax is > "asm inline", which is more in line with other syntax. > > How does this look? Looks good. A few nits - you need to document this in extend.texi, the tree flag use needs documenting in tree-core.h, and we need a testcase (I'd suggest one that shows we inline a function with "large" asm inline () even at -Os). Oh, and I don't think we want C and C++ to diverge - so you need to cook up C++ support as well. Can kernel folks give this a second and third thought please so we don't implement sth that in the end won't satisfy you guys? Thanks for doing this, Richard. > > Segher > > > > diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c > index 1f173fc..94b1d41 100644 > --- a/gcc/c/c-parser.c > +++ b/gcc/c/c-parser.c > @@ -6287,7 +6287,7 @@ static tree > c_parser_asm_statement (c_parser *parser) > { > tree quals, str, outputs, inputs, clobbers, labels, ret; > - bool simple, is_goto; > + bool simple, is_goto, is_inline; > location_t asm_loc = c_parser_peek_token (parser)->location; > int section, nsections; > > @@ -6318,6 +6318,13 @@ c_parser_asm_statement (c_parser *parser) > is_goto = true; > } > > + is_inline = false; > + if (c_parser_next_token_is_keyword (parser, RID_INLINE)) > + { > + c_parser_consume_token (parser); > + is_inline = true; > + } > + > /* ??? Follow the C++ parser rather than using the > lex_untranslated_string kludge. */ > parser->lex_untranslated_string = true; > @@ -6393,7 +6400,8 @@ c_parser_asm_statement (c_parser *parser) > c_parser_skip_to_end_of_block_or_statement (parser); > > ret = build_asm_stmt (quals, build_asm_expr (asm_loc, str, outputs, inputs, > - clobbers, labels, simple)); > + clobbers, labels, simple, > + is_inline)); > > error: > parser->lex_untranslated_string = false; > diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h > index 017c01c..f5629300 100644 > --- a/gcc/c/c-tree.h > +++ b/gcc/c/c-tree.h > @@ -677,7 +677,8 @@ extern tree build_compound_literal (location_t, tree, tree, bool, > extern void check_compound_literal_type (location_t, struct c_type_name *); > extern tree c_start_case (location_t, location_t, tree, bool); > extern void c_finish_case (tree, tree); > -extern tree build_asm_expr (location_t, tree, tree, tree, tree, tree, bool); > +extern tree build_asm_expr (location_t, tree, tree, tree, tree, tree, bool, > + bool); > extern tree build_asm_stmt (tree, tree); > extern int c_types_compatible_p (tree, tree); > extern tree c_begin_compound_stmt (bool); > diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c > index 9d09b8d..e013100 100644 > --- a/gcc/c/c-typeck.c > +++ b/gcc/c/c-typeck.c > @@ -10064,7 +10064,7 @@ build_asm_stmt (tree cv_qualifier, tree args) > are subtly different. We use a ASM_EXPR node to represent this. */ > tree > build_asm_expr (location_t loc, tree string, tree outputs, tree inputs, > - tree clobbers, tree labels, bool simple) > + tree clobbers, tree labels, bool simple, bool is_inline) > { > tree tail; > tree args; > @@ -10182,6 +10182,7 @@ build_asm_expr (location_t loc, tree string, tree outputs, tree inputs, > as volatile. */ > ASM_INPUT_P (args) = simple; > ASM_VOLATILE_P (args) = (noutputs == 0); > + ASM_INLINE_P (args) = is_inline; > > return args; > } > diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c > index 83e2273..1a00fa3 100644 > --- a/gcc/gimple-pretty-print.c > +++ b/gcc/gimple-pretty-print.c > @@ -2019,6 +2019,8 @@ dump_gimple_asm (pretty_printer *buffer, gasm *gs, int spc, dump_flags_t flags) > pp_string (buffer, "__asm__"); > if (gimple_asm_volatile_p (gs)) > pp_string (buffer, " __volatile__"); > + if (gimple_asm_inline_p (gs)) > + pp_string (buffer, " __inline__"); > if (gimple_asm_nlabels (gs)) > pp_string (buffer, " goto"); > pp_string (buffer, "(\""); > diff --git a/gcc/gimple.h b/gcc/gimple.h > index a5dda93..8a58e07 100644 > --- a/gcc/gimple.h > +++ b/gcc/gimple.h > @@ -137,6 +137,7 @@ enum gimple_rhs_class > enum gf_mask { > GF_ASM_INPUT = 1 << 0, > GF_ASM_VOLATILE = 1 << 1, > + GF_ASM_INLINE = 1 << 2, > GF_CALL_FROM_THUNK = 1 << 0, > GF_CALL_RETURN_SLOT_OPT = 1 << 1, > GF_CALL_TAILCALL = 1 << 2, > @@ -3911,7 +3912,7 @@ gimple_asm_volatile_p (const gasm *asm_stmt) > } > > > -/* If VOLATLE_P is true, mark asm statement ASM_STMT as volatile. */ > +/* If VOLATILE_P is true, mark asm statement ASM_STMT as volatile. */ > > static inline void > gimple_asm_set_volatile (gasm *asm_stmt, bool volatile_p) > @@ -3923,6 +3924,27 @@ gimple_asm_set_volatile (gasm *asm_stmt, bool volatile_p) > } > > > +/* Return true ASM_STMT ASM_STMT is an asm statement marked inline. */ > + > +static inline bool > +gimple_asm_inline_p (const gasm *asm_stmt) > +{ > + return (asm_stmt->subcode & GF_ASM_INLINE) != 0; > +} > + > + > +/* If INLINE_P is true, mark asm statement ASM_STMT as inline. */ > + > +static inline void > +gimple_asm_set_inline (gasm *asm_stmt, bool inline_p) > +{ > + if (inline_p) > + asm_stmt->subcode |= GF_ASM_INLINE; > + else > + asm_stmt->subcode &= ~GF_ASM_INLINE; > +} > + > + > /* If INPUT_P is true, mark asm ASM_STMT as an ASM_INPUT. */ > > static inline void > diff --git a/gcc/gimplify.c b/gcc/gimplify.c > index 509fc2f..10b80f2 100644 > --- a/gcc/gimplify.c > +++ b/gcc/gimplify.c > @@ -6315,6 +6315,7 @@ gimplify_asm_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p) > > gimple_asm_set_volatile (stmt, ASM_VOLATILE_P (expr) || noutputs == 0); > gimple_asm_set_input (stmt, ASM_INPUT_P (expr)); > + gimple_asm_set_inline (stmt, ASM_INLINE_P (expr)); > > gimplify_seq_add_stmt (pre_p, stmt); > } > diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c > index ba39ea3..5361139 100644 > --- a/gcc/ipa-icf-gimple.c > +++ b/gcc/ipa-icf-gimple.c > @@ -993,6 +993,9 @@ func_checker::compare_gimple_asm (const gasm *g1, const gasm *g2) > if (gimple_asm_input_p (g1) != gimple_asm_input_p (g2)) > return false; > > + if (gimple_asm_inline_p (g1) != gimple_asm_inline_p (g2)) > + return false; > + > if (gimple_asm_ninputs (g1) != gimple_asm_ninputs (g2)) > return false; > > diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c > index 9134253..6b1d2ea 100644 > --- a/gcc/tree-inline.c > +++ b/gcc/tree-inline.c > @@ -4108,6 +4108,8 @@ estimate_num_insns (gimple *stmt, eni_weights *weights) > with very long asm statements. */ > if (count > 1000) > count = 1000; > + if (gimple_asm_inline_p (as_a <gasm *> (stmt))) > + count = !!count; > return MAX (1, count); > } > > diff --git a/gcc/tree.h b/gcc/tree.h > index 2e45f9d..160b3a7 100644 > --- a/gcc/tree.h > +++ b/gcc/tree.h > @@ -1245,6 +1245,9 @@ extern tree maybe_wrap_with_location (tree, location_t); > ASM_OPERAND with no operands. */ > #define ASM_INPUT_P(NODE) (ASM_EXPR_CHECK (NODE)->base.static_flag) > #define ASM_VOLATILE_P(NODE) (ASM_EXPR_CHECK (NODE)->base.public_flag) > +/* Nonzero if we want to consider this asm as minimum length and cost > + for inlining decisions. */ > +#define ASM_INLINE_P(NODE) (ASM_EXPR_CHECK (NODE)->base.protected_flag) > > /* COND_EXPR accessors. */ > #define COND_EXPR_COND(NODE) (TREE_OPERAND (COND_EXPR_CHECK (NODE), 0)) > > -- Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-10 7:12 ` Richard Biener @ 2018-10-10 7:54 ` Segher Boessenkool [not found] ` <20181010072240.GB103159@gmail.com> 1 sibling, 0 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-10-10 7:54 UTC (permalink / raw) To: Richard Biener Cc: Michael Matz, Borislav Petkov, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Wed, Oct 10, 2018 at 09:12:48AM +0200, Richard Biener wrote: > On Tue, 9 Oct 2018, Segher Boessenkool wrote: > > On Mon, Oct 08, 2018 at 11:07:46AM +0200, Richard Biener wrote: > > > On Mon, 8 Oct 2018, Segher Boessenkool wrote: > > > > On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote: > > > > > On Sun, 7 Oct 2018, Segher Boessenkool wrote: > > > > > > On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: > > > > > > > Now, Richard suggested doing something like: > > > > > > > > > > > > > > 1) inline asm ("...") > > > > > > > > > > > > What would the semantics of this be? > > > > > > > > > > The size of the inline asm wouldn't be counted towards the inliner size > > > > > limits (or be counted as "1"). > > > > > > > > That sounds like a good option. > > > > > > Yes, I also like it for simplicity. It also avoids the requirement > > > of translating the number (in bytes?) given by the user to > > > "number of GIMPLE instructions" as needed by the inliner. > > > > This patch implements this, for C only so far. And the syntax is > > "asm inline", which is more in line with other syntax. > > > > How does this look? > > Looks good. A few nits - you need to document this in extend.texi, the Yup. > tree flag use needs documenting in tree-core.h, Ah yes. > and we need a testcase > (I'd suggest one that shows we inline a function with "large" asm inline > () even at -Os). I have one. Oh, and I probably should do a comment at the one line of code that isn't just bookkeeping ;-) > Oh, and I don't think we want C and C++ to diverge - so you need to > cook up C++ support as well. Right, that's why I said "C only so far". > Can kernel folks give this a second and third thought please so we > don't implement sth that in the end won't satisfy you guys? Or actually try it out and see if it has the desired effect! Nothing beats field trials. I'll do the C++ thing today hopefully, and send things to gcc-patches@. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <20181010072240.GB103159@gmail.com>]
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <20181010072240.GB103159@gmail.com> @ 2018-10-10 8:03 ` Segher Boessenkool 2018-10-10 8:19 ` Borislav Petkov 2018-10-10 10:29 ` Richard Biener 0 siblings, 2 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-10-10 8:03 UTC (permalink / raw) To: Ingo Molnar Cc: Richard Biener, Michael Matz, Borislav Petkov, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Wed, Oct 10, 2018 at 09:22:40AM +0200, Ingo Molnar wrote: > * Richard Biener <rguenther@suse.de> wrote: > > Can kernel folks give this a second and third thought please so we > > don't implement sth that in the end won't satisfy you guys? > > So this basically passes '0 size' to the inliner, which should be better > than passing in the explicit size, as we'd inevitably get it wrong > in cases. The code immediately after this makes it size 1, even for things like asm(""), I suppose this works better for the inliner. But that's a detail (and it might change); the description says "consider this asm as minimum length and cost for inlining decisions", which works for either 0 or 1. > I also like 'size 0' for the reason that we tend to write assembly code > and mark it 'inline' if we really think it matters to performance, > so making it more likely to be inlined when used within another inline > function is a plus as well. You can think of it as meaning "we want this asm inlined always", and then whether that actually happens depends on if the function around it is inlined or not. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-10 8:03 ` Segher Boessenkool @ 2018-10-10 8:19 ` Borislav Petkov 2018-10-10 8:36 ` Richard Biener 2018-10-10 18:55 ` Segher Boessenkool 2018-10-10 10:29 ` Richard Biener 1 sibling, 2 replies; 40+ messages in thread From: Borislav Petkov @ 2018-10-10 8:19 UTC (permalink / raw) To: Segher Boessenkool, Ingo Molnar Cc: Richard Biener, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Wed, Oct 10, 2018 at 03:03:25AM -0500, Segher Boessenkool wrote: > The code immediately after this makes it size 1, even for things like > asm(""), I suppose this works better for the inliner. But that's a detail > (and it might change); the description says "consider this asm as minimum > length and cost for inlining decisions", which works for either 0 or 1. Thanks for implementing this, much appreciated. If you need people to test stuff, lemme know. > You can think of it as meaning "we want this asm inlined always", and then > whether that actually happens depends on if the function around it is > inlined or not. My only concern is how we would catch the other extremity where the inline asm grows too big and we end up inlining it everywhere and thus getting fat. The 0day bot already builds tinyconfigs but we should be looking at vmlinux size growth too. Thx. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-10 8:19 ` Borislav Petkov @ 2018-10-10 8:36 ` Richard Biener 2018-10-10 18:55 ` Segher Boessenkool 1 sibling, 0 replies; 40+ messages in thread From: Richard Biener @ 2018-10-10 8:36 UTC (permalink / raw) To: Borislav Petkov Cc: Segher Boessenkool, Ingo Molnar, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Wed, 10 Oct 2018, Borislav Petkov wrote: > On Wed, Oct 10, 2018 at 03:03:25AM -0500, Segher Boessenkool wrote: > > The code immediately after this makes it size 1, even for things like > > asm(""), I suppose this works better for the inliner. But that's a detail > > (and it might change); the description says "consider this asm as minimum > > length and cost for inlining decisions", which works for either 0 or 1. > > Thanks for implementing this, much appreciated. If you need people to > test stuff, lemme know. > > > You can think of it as meaning "we want this asm inlined always", and then > > whether that actually happens depends on if the function around it is > > inlined or not. > > My only concern is how we would catch the other extremity where the > inline asm grows too big and we end up inlining it everywhere and thus > getting fat. The 0day bot already builds tinyconfigs but we should be > looking at vmlinux size growth too. Well, it's like always-inline functions, you have to be careful and _not_ put it everywhere... It's also somewhat different to always-inline functions as those lose their special-ness once inlined (the inlined body is properly accounted for size). Richard. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-10 8:19 ` Borislav Petkov 2018-10-10 8:36 ` Richard Biener @ 2018-10-10 18:55 ` Segher Boessenkool 2018-10-10 19:14 ` Borislav Petkov 1 sibling, 1 reply; 40+ messages in thread From: Segher Boessenkool @ 2018-10-10 18:55 UTC (permalink / raw) To: Borislav Petkov Cc: Ingo Molnar, Richard Biener, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Wed, Oct 10, 2018 at 10:19:06AM +0200, Borislav Petkov wrote: > On Wed, Oct 10, 2018 at 03:03:25AM -0500, Segher Boessenkool wrote: > > The code immediately after this makes it size 1, even for things like > > asm(""), I suppose this works better for the inliner. But that's a detail > > (and it might change); the description says "consider this asm as minimum > > length and cost for inlining decisions", which works for either 0 or 1. > > Thanks for implementing this, much appreciated. If you need people to > test stuff, lemme know. It would be great to hear from kernel people if it works adequately for what you guys want it for :-) > > You can think of it as meaning "we want this asm inlined always", and then > > whether that actually happens depends on if the function around it is > > inlined or not. > > My only concern is how we would catch the other extremity where the > inline asm grows too big and we end up inlining it everywhere and thus > getting fat. The 0day bot already builds tinyconfigs but we should be > looking at vmlinux size growth too. But this isn't really different from other always_inline concerns afaics? So you should be able to catch it the same way, too. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-10 18:55 ` Segher Boessenkool @ 2018-10-10 19:14 ` Borislav Petkov 2018-10-13 19:33 ` Borislav Petkov 0 siblings, 1 reply; 40+ messages in thread From: Borislav Petkov @ 2018-10-10 19:14 UTC (permalink / raw) To: Segher Boessenkool Cc: Ingo Molnar, Richard Biener, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Wed, Oct 10, 2018 at 01:54:33PM -0500, Segher Boessenkool wrote: > It would be great to hear from kernel people if it works adequately for > what you guys want it for :-) Sure, ping me when you have the final version and I'll try to build gcc with it and do some size comparisons. Thx. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-10 19:14 ` Borislav Petkov @ 2018-10-13 19:33 ` Borislav Petkov [not found] ` <alpine.LNX.2.20.13.1810132355180.13914@monopod.intra.ispras.ru> ` (2 more replies) 0 siblings, 3 replies; 40+ messages in thread From: Borislav Petkov @ 2018-10-13 19:33 UTC (permalink / raw) To: Segher Boessenkool Cc: Ingo Molnar, Richard Biener, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton Ok, with Segher's help I've been playing with his patch ontop of bleeding edge gcc 9 and here are my observations. Please double-check me for booboos so that they can be addressed while there's time. So here's what I see ontop of 4.19-rc7: First marked the alternative asm() as inline and undeffed the "inline" keyword. I need to do that because of the funky games we do with "inline" redefinitions in include/linux/compiler_types.h. And Segher hinted at either doing: asm volatile inline(... or asm volatile __inline__(... but both "inline" variants are defined as macros in that file. Which means we either need to #undef inline before using it in asm() or come up with something cleverer. Anyway, did this: --- diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 4cd6a3b71824..7c0639087da7 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -165,11 +165,13 @@ static inline int alternatives_text_reserved(void *start, void *end) * For non barrier like inlines please define new variants * without volatile and memory clobber. */ + +#undef inline #define alternative(oldinstr, newinstr, feature) \ - asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory") + asm volatile inline(ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory") #define alternative_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \ - asm volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) ::: "memory") + asm volatile inline(ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) ::: "memory") /* * Alternative inline assembly with input. --- Build failed at link time with: arch/x86/boot/compressed/cmdline.o: In function `native_save_fl': cmdline.c:(.text+0x0): multiple definition of `native_save_fl' arch/x86/boot/compressed/misc.o:misc.c:(.text+0x1e0): first defined here arch/x86/boot/compressed/cmdline.o: In function `native_restore_fl': cmdline.c:(.text+0x10): multiple definition of `native_restore_fl' arch/x86/boot/compressed/misc.o:misc.c:(.text+0x1f0): first defined here arch/x86/boot/compressed/error.o: In function `native_save_fl': error.c:(.text+0x0): multiple definition of `native_save_fl' which I had to fix with this: --- diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h index 15450a675031..0d772598c37c 100644 --- a/arch/x86/include/asm/irqflags.h +++ b/arch/x86/include/asm/irqflags.h @@ -14,8 +14,7 @@ */ /* Declaration required for gcc < 4.9 to prevent -Werror=missing-prototypes */ -extern inline unsigned long native_save_fl(void); -extern inline unsigned long native_save_fl(void) +static inline unsigned long native_save_fl(void) { unsigned long flags; @@ -33,8 +32,7 @@ ex --- That "extern inline" declaration looks fishy to me anyway, maybe not really needed ? Apparently, gcc < 4.9 complains with -Werror=missing-prototypes... Then the build worked and the results look like this: text data bss dec hex filename 17287384 5040656 2019532 24347572 17383b4 vmlinux-before 17288020 5040656 2015436 24344112 1737630 vmlinux-2nd-version so some inlining must be happening. Then I did this: --- diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c index 9c5606d88f61..a0170344cf08 100644 --- a/arch/x86/lib/usercopy_64.c +++ b/arch/x86/lib/usercopy_64.c @@ -20,7 +20,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size) /* no memory constraint because it doesn't change any memory gcc knows about */ stac(); - asm volatile( + asm volatile inline( " testq %[size8],%[size8]\n" " jz 4f\n" "0: movq $0,(%[dst])\n" --- to force inlining of a somewhat bigger asm() statement. And yap, more got inlined: text data bss dec hex filename 17287384 5040656 2019532 24347572 17383b4 vmlinux-before 17288020 5040656 2015436 24344112 1737630 vmlinux-2nd-version 17288076 5040656 2015436 24344168 1737668 vmlinux-2nd-version__clear_user so more stuff gets inlined. Looking at the asm output, it had before: --- clear_user: # ./arch/x86/include/asm/current.h:15: return this_cpu_read_stable(current_task); #APP # 15 "./arch/x86/include/asm/current.h" 1 movq %gs:current_task,%rax #, pfo_ret__ # 0 "" 2 # arch/x86/lib/usercopy_64.c:51: if (access_ok(VERIFY_WRITE, to, n)) #NO_APP movq 2520(%rax), %rdx # pfo_ret___12->thread.addr_limit.seg, _1 movq %rdi, %rax # to, tmp93 addq %rsi, %rax # n, tmp93 jc .L3 #, # arch/x86/lib/usercopy_64.c:51: if (access_ok(VERIFY_WRITE, to, n)) cmpq %rax, %rdx # tmp93, _1 jb .L3 #, # arch/x86/lib/usercopy_64.c:52: return __clear_user(to, n); jmp __clear_user # --- note the JMP to __clear_user. After marking the asm() in __clear_user() as inline, clear_user() inlines __clear_user() directly: --- clear_user: # ./arch/x86/include/asm/current.h:15: return this_cpu_read_stable(current_task); #APP # 15 "./arch/x86/include/asm/current.h" 1 movq %gs:current_task,%rax #, pfo_ret__ # 0 "" 2 # arch/x86/lib/usercopy_64.c:51: if (access_ok(VERIFY_WRITE, to, n)) #NO_APP movq 2520(%rax), %rdx # pfo_ret___12->thread.addr_limit.seg, _1 movq %rdi, %rax # to, tmp95 addq %rsi, %rax # n, tmp95 jc .L8 #, # arch/x86/lib/usercopy_64.c:51: if (access_ok(VERIFY_WRITE, to, n)) cmpq %rax, %rdx # tmp95, _1 jb .L8 #, # ./arch/x86/include/asm/smap.h:58: alternative("", __stringify(__ASM_STAC), X86_FEATURE_SMAP); ... this last line is the stac() macro which gets inlined due to the alternative() inlined macro above. So I guess this all looks like what we wanted. And if this lands in gcc9, we would need to do a asm_volatile() macro which is defined differently based on the compiler used. Thoughts, suggestions, etc are most welcome. Thx. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <alpine.LNX.2.20.13.1810132355180.13914@monopod.intra.ispras.ru>]
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <alpine.LNX.2.20.13.1810132355180.13914@monopod.intra.ispras.ru> @ 2018-10-13 21:31 ` Borislav Petkov 0 siblings, 0 replies; 40+ messages in thread From: Borislav Petkov @ 2018-10-13 21:31 UTC (permalink / raw) To: Alexander Monakov Cc: Segher Boessenkool, Ingo Molnar, Richard Biener, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Sun, Oct 14, 2018 at 12:14:02AM +0300, Alexander Monakov wrote: > I apologize for coming in late here with an alternative proposal, but would > you be happy if GCC gave you a way to designate a portion of the asm template > string that shouldn't be counted as its cost because it doesn't go into the > .text section? This wouldn't interact with your redefinitions of the inline > keyword, and you could do something like (assuming we go with %` ... %` > delimiters) I don't mind it but I see you guys are still discussing what would be the better solution here, on the gcc ML. And we, as one user, are a happy camper as long as it does what it is meant to do. But how the feature looks like syntactically is something for gcc folk to decide as they're going to support it for the foreseeable future and I'm very well aware of how important it is for a supportable feature to be designed properly. Thx. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-13 19:33 ` Borislav Petkov [not found] ` <alpine.LNX.2.20.13.1810132355180.13914@monopod.intra.ispras.ru> @ 2018-10-25 10:23 ` Borislav Petkov [not found] ` <20181031125526.GA13219@hirez.programming.kicks-ass.net> 2 siblings, 0 replies; 40+ messages in thread From: Borislav Petkov @ 2018-10-25 10:23 UTC (permalink / raw) To: Segher Boessenkool, Ingo Molnar, Richard Biener, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton Ping. This is a good point in time, methinks, where kernel folk on CC here should have a look at this and speak up whether it is useful for us in this form. Frankly, I'm a bit unsure on the aspect of us using this and supporting old compilers which don't have it and new compilers which do. Because the old compilers should get to see the now upstreamed assembler macros and the new compilers will simply inline the "asm volatile inline" constructs. And how ugly that would become... I guess we can try this with smaller macros first and keep them all nicely hidden in header files. On Sat, Oct 13, 2018 at 09:33:35PM +0200, Borislav Petkov wrote: > Ok, > > with Segher's help I've been playing with his patch ontop of bleeding > edge gcc 9 and here are my observations. Please double-check me for > booboos so that they can be addressed while there's time. > > So here's what I see ontop of 4.19-rc7: > > First marked the alternative asm() as inline and undeffed the "inline" > keyword. I need to do that because of the funky games we do with > "inline" redefinitions in include/linux/compiler_types.h. > > And Segher hinted at either doing: > > asm volatile inline(... > > or > > asm volatile __inline__(... > > but both "inline" variants are defined as macros in that file. > > Which means we either need to #undef inline before using it in asm() or > come up with something cleverer. > > Anyway, did this: > > --- > diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h > index 4cd6a3b71824..7c0639087da7 100644 > --- a/arch/x86/include/asm/alternative.h > +++ b/arch/x86/include/asm/alternative.h > @@ -165,11 +165,13 @@ static inline int alternatives_text_reserved(void *start, void *end) > * For non barrier like inlines please define new variants > * without volatile and memory clobber. > */ > + > +#undef inline > #define alternative(oldinstr, newinstr, feature) \ > - asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory") > + asm volatile inline(ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory") > > #define alternative_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \ > - asm volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) ::: "memory") > + asm volatile inline(ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) ::: "memory") > > /* > * Alternative inline assembly with input. > --- > > Build failed at link time with: > > arch/x86/boot/compressed/cmdline.o: In function `native_save_fl': > cmdline.c:(.text+0x0): multiple definition of `native_save_fl' > arch/x86/boot/compressed/misc.o:misc.c:(.text+0x1e0): first defined here > arch/x86/boot/compressed/cmdline.o: In function `native_restore_fl': > cmdline.c:(.text+0x10): multiple definition of `native_restore_fl' > arch/x86/boot/compressed/misc.o:misc.c:(.text+0x1f0): first defined here > arch/x86/boot/compressed/error.o: In function `native_save_fl': > error.c:(.text+0x0): multiple definition of `native_save_fl' > > which I had to fix with this: > > --- > diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h > index 15450a675031..0d772598c37c 100644 > --- a/arch/x86/include/asm/irqflags.h > +++ b/arch/x86/include/asm/irqflags.h > @@ -14,8 +14,7 @@ > */ > > /* Declaration required for gcc < 4.9 to prevent -Werror=missing-prototypes */ > -extern inline unsigned long native_save_fl(void); > -extern inline unsigned long native_save_fl(void) > +static inline unsigned long native_save_fl(void) > { > unsigned long flags; > > @@ -33,8 +32,7 @@ ex > --- > > That "extern inline" declaration looks fishy to me anyway, maybe not really > needed ? Apparently, gcc < 4.9 complains with -Werror=missing-prototypes... > > Then the build worked and the results look like this: > > text data bss dec hex filename > 17287384 5040656 2019532 24347572 17383b4 vmlinux-before > 17288020 5040656 2015436 24344112 1737630 vmlinux-2nd-version > > so some inlining must be happening. > > Then I did this: > > --- > diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c > index 9c5606d88f61..a0170344cf08 100644 > --- a/arch/x86/lib/usercopy_64.c > +++ b/arch/x86/lib/usercopy_64.c > @@ -20,7 +20,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size) > /* no memory constraint because it doesn't change any memory gcc knows > about */ > stac(); > - asm volatile( > + asm volatile inline( > " testq %[size8],%[size8]\n" > " jz 4f\n" > "0: movq $0,(%[dst])\n" > --- > > to force inlining of a somewhat bigger asm() statement. And yap, more > got inlined: > > text data bss dec hex filename > 17287384 5040656 2019532 24347572 17383b4 vmlinux-before > 17288020 5040656 2015436 24344112 1737630 vmlinux-2nd-version > 17288076 5040656 2015436 24344168 1737668 vmlinux-2nd-version__clear_user > > so more stuff gets inlined. > > Looking at the asm output, it had before: > > --- > clear_user: > # ./arch/x86/include/asm/current.h:15: return this_cpu_read_stable(current_task); > #APP > # 15 "./arch/x86/include/asm/current.h" 1 > movq %gs:current_task,%rax #, pfo_ret__ > # 0 "" 2 > # arch/x86/lib/usercopy_64.c:51: if (access_ok(VERIFY_WRITE, to, n)) > #NO_APP > movq 2520(%rax), %rdx # pfo_ret___12->thread.addr_limit.seg, _1 > movq %rdi, %rax # to, tmp93 > addq %rsi, %rax # n, tmp93 > jc .L3 #, > # arch/x86/lib/usercopy_64.c:51: if (access_ok(VERIFY_WRITE, to, n)) > cmpq %rax, %rdx # tmp93, _1 > jb .L3 #, > # arch/x86/lib/usercopy_64.c:52: return __clear_user(to, n); > jmp __clear_user # > --- > > note the JMP to __clear_user. After marking the asm() in __clear_user() as > inline, clear_user() inlines __clear_user() directly: > > --- > clear_user: > # ./arch/x86/include/asm/current.h:15: return this_cpu_read_stable(current_task); > #APP > # 15 "./arch/x86/include/asm/current.h" 1 > movq %gs:current_task,%rax #, pfo_ret__ > # 0 "" 2 > # arch/x86/lib/usercopy_64.c:51: if (access_ok(VERIFY_WRITE, to, n)) > #NO_APP > movq 2520(%rax), %rdx # pfo_ret___12->thread.addr_limit.seg, _1 > movq %rdi, %rax # to, tmp95 > addq %rsi, %rax # n, tmp95 > jc .L8 #, > # arch/x86/lib/usercopy_64.c:51: if (access_ok(VERIFY_WRITE, to, n)) > cmpq %rax, %rdx # tmp95, _1 > jb .L8 #, > # ./arch/x86/include/asm/smap.h:58: alternative("", __stringify(__ASM_STAC), X86_FEATURE_SMAP); > ... > > this last line is the stac() macro which gets inlined due to the > alternative() inlined macro above. > > So I guess this all looks like what we wanted. > > And if this lands in gcc9, we would need to do a asm_volatile() macro > which is defined differently based on the compiler used. > > Thoughts, suggestions, etc are most welcome. > > Thx. > > -- > Regards/Gruss, > Boris. > > Good mailing practices for 400: avoid top-posting and trim the reply. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <20181031125526.GA13219@hirez.programming.kicks-ass.net>]
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <20181031125526.GA13219@hirez.programming.kicks-ass.net> @ 2018-10-31 22:04 ` Segher Boessenkool 2018-11-01 5:23 ` Joe Perches 1 sibling, 0 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-10-31 22:04 UTC (permalink / raw) To: Peter Zijlstra Cc: Borislav Petkov, Ingo Molnar, Richard Biener, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Wed, Oct 31, 2018 at 01:55:26PM +0100, Peter Zijlstra wrote: > Anyway, with the below patch, I get: > > text data bss dec hex filename > 17385183 5064780 1953892 24403855 1745f8f defconfig-build/vmlinux > 17385678 5064780 1953892 24404350 174617e defconfig-build/vmlinux > > Which shows we inline more (look for asm_volatile for the actual > changes). > > > So yes, this seems like something we could work with. Great to hear. Note that with my latest patches (see https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01931.html ) there is no required order "asm volatile inline" anymore, so you can just say "asm inline volatile". (And similar for "goto" as well). Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <20181031125526.GA13219@hirez.programming.kicks-ass.net> 2018-10-31 22:04 ` Segher Boessenkool @ 2018-11-01 5:23 ` Joe Perches [not found] ` <20181101090114.GE3159@hirez.programming.kicks-ass.net> 1 sibling, 1 reply; 40+ messages in thread From: Joe Perches @ 2018-11-01 5:23 UTC (permalink / raw) To: Peter Zijlstra, Borislav Petkov Cc: Segher Boessenkool, Ingo Molnar, Richard Biener, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Wed, 2018-10-31 at 13:55 +0100, Peter Zijlstra wrote: > > Anyway, with the below patch, I get: > > text data bss dec hex filename > 17385183 5064780 1953892 24403855 1745f8f defconfig-build/vmlinux > 17385678 5064780 1953892 24404350 174617e defconfig-build/vmlinux > > Which shows we inline more (look for asm_volatile for the actual > changes). [] > scripts/checkpatch.pl | 8 ++--- > scripts/genksyms/keywords.c | 4 +-- > scripts/kernel-doc | 4 +-- I believe these should be excluded from the conversions. Other than that, generic conversion by script seems a good idea. ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <20181101090114.GE3159@hirez.programming.kicks-ass.net>]
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <20181101090114.GE3159@hirez.programming.kicks-ass.net> @ 2018-11-01 9:20 ` Joe Perches 0 siblings, 0 replies; 40+ messages in thread From: Joe Perches @ 2018-11-01 9:20 UTC (permalink / raw) To: Peter Zijlstra Cc: Borislav Petkov, Segher Boessenkool, Ingo Molnar, Richard Biener, Michael Matz, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Thu, 2018-11-01 at 10:01 +0100, Peter Zijlstra wrote: > On Wed, Oct 31, 2018 at 10:20:00PM -0700, Joe Perches wrote: > > On Wed, 2018-10-31 at 13:55 +0100, Peter Zijlstra wrote: > > > Anyway, with the below patch, I get: > > > > > > text data bss dec hex filename > > > 17385183 5064780 1953892 24403855 1745f8f defconfig-build/vmlinux > > > 17385678 5064780 1953892 24404350 174617e defconfig-build/vmlinux > > > > > > Which shows we inline more (look for asm_volatile for the actual > > > changes). > > [] > > > scripts/checkpatch.pl | 8 ++--- > > > scripts/genksyms/keywords.c | 4 +-- > > > scripts/kernel-doc | 4 +-- > > > > I believe these should be excluded from the conversions. > > Probably, yes. It compiled, which was all I cared about :-) > > BTW, if we do that conversion, we should upgrade the checkpatch warn to > an error I suppose. More like remove altogether as __inline and __inline__ will no longer be #defined $ git grep -P 'define\s+__inline' include/linux/compiler_types.h:#define __inline__ inline include/linux/compiler_types.h:#define __inline inline ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-10 8:03 ` Segher Boessenkool 2018-10-10 8:19 ` Borislav Petkov @ 2018-10-10 10:29 ` Richard Biener 1 sibling, 0 replies; 40+ messages in thread From: Richard Biener @ 2018-10-10 10:29 UTC (permalink / raw) To: Segher Boessenkool Cc: Ingo Molnar, Michael Matz, Borislav Petkov, gcc, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa, Andrew Morton On Wed, 10 Oct 2018, Segher Boessenkool wrote: > On Wed, Oct 10, 2018 at 09:22:40AM +0200, Ingo Molnar wrote: > > * Richard Biener <rguenther@suse.de> wrote: > > > Can kernel folks give this a second and third thought please so we > > > don't implement sth that in the end won't satisfy you guys? > > > > So this basically passes '0 size' to the inliner, which should be better > > than passing in the explicit size, as we'd inevitably get it wrong > > in cases. > > The code immediately after this makes it size 1, even for things like > asm(""), I suppose this works better for the inliner. But that's a detail > (and it might change); the description says "consider this asm as minimum > length and cost for inlining decisions", which works for either 0 or 1. It was made 1 as otherwise the inliner happily explodes on void foo () { asm(""); foo(); } > > I also like 'size 0' for the reason that we tend to write assembly code > > and mark it 'inline' if we really think it matters to performance, > > so making it more likely to be inlined when used within another inline > > function is a plus as well. > > You can think of it as meaning "we want this asm inlined always", and then > whether that actually happens depends on if the function around it is > inlined or not. Richard. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-09 15:54 ` Segher Boessenkool 2018-10-10 7:12 ` Richard Biener @ 2018-10-10 16:31 ` Nadav Amit 2018-10-10 19:31 ` Segher Boessenkool 2018-10-11 7:05 ` Richard Biener [not found] ` <CAK7LNARmrdtyE7JRAdJH_zbfvD_cej+TGLh+5KfT20o538KUcA@mail.gmail.com> 2 siblings, 2 replies; 40+ messages in thread From: Nadav Amit @ 2018-10-10 16:31 UTC (permalink / raw) To: Segher Boessenkool, Richard Biener Cc: Michael Matz, Borislav Petkov, gcc, Ingo Molnar, LKML, X86 ML, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa at 7:53 AM, Segher Boessenkool <segher@kernel.crashing.org> wrote: > On Mon, Oct 08, 2018 at 11:07:46AM +0200, Richard Biener wrote: >> On Mon, 8 Oct 2018, Segher Boessenkool wrote: >>> On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote: >>>> On Sun, 7 Oct 2018, Segher Boessenkool wrote: >>>>> On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: >>>>>> Now, Richard suggested doing something like: >>>>>> >>>>>> 1) inline asm ("...") >>>>> >>>>> What would the semantics of this be? >>>> >>>> The size of the inline asm wouldn't be counted towards the inliner size >>>> limits (or be counted as "1"). >>> >>> That sounds like a good option. >> >> Yes, I also like it for simplicity. It also avoids the requirement >> of translating the number (in bytes?) given by the user to >> "number of GIMPLE instructions" as needed by the inliner. > > This patch implements this, for C only so far. And the syntax is > "asm inline", which is more in line with other syntax. > > How does this look? It looks good to me in general. I have a couple of reservations, but I suspect you will not want to address them: 1. It is not backward compatible, requiring a C macro to wrap it, as the kernel might be built with different compilers. 2. It is specific to asm. I do not have in mind another use case (excluding the __builtin_constant_p), but it would be nicer IMHO to have a builtin saying “ignore the cost of this statement” for the matter of optimizations. Regards, Nadav ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-10 16:31 ` Nadav Amit @ 2018-10-10 19:31 ` Segher Boessenkool 2018-10-11 7:05 ` Richard Biener 1 sibling, 0 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-10-10 19:31 UTC (permalink / raw) To: Nadav Amit Cc: Richard Biener, Michael Matz, Borislav Petkov, gcc, Ingo Molnar, LKML, X86 ML, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa Hi Nadav, On Wed, Oct 10, 2018 at 04:31:41PM +0000, Nadav Amit wrote: > at 7:53 AM, Segher Boessenkool <segher@kernel.crashing.org> wrote: > > How does this look? > > It looks good to me in general. I have a couple of reservations, but I > suspect you will not want to address them: > > 1. It is not backward compatible, requiring a C macro to wrap it, as the > kernel might be built with different compilers. How *could* it be backward compatible? There should be an error or at least a warning if the compiler does not support this, in general. For the kernel, the kernel already has plenty of infrastructure to support this (compiler.h etc.) For other applications it is quite trivial, too. > 2. It is specific to asm. Yes, and that is on purpose. > I do not have in mind another use case (excluding > the __builtin_constant_p), but it would be nicer IMHO to have a builtin > saying “ignore the cost of this statement” for the matter of optimizations. That is a hundred or a thousand times more work to design and implement (including testing etc.) I'm not going to do it, but feel free to try yourself! Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-10 16:31 ` Nadav Amit 2018-10-10 19:31 ` Segher Boessenkool @ 2018-10-11 7:05 ` Richard Biener 1 sibling, 0 replies; 40+ messages in thread From: Richard Biener @ 2018-10-11 7:05 UTC (permalink / raw) To: Nadav Amit Cc: Segher Boessenkool, Michael Matz, Borislav Petkov, gcc, Ingo Molnar, LKML, X86 ML, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa [-- Attachment #1: Type: text/plain, Size: 2323 bytes --] On Wed, 10 Oct 2018, Nadav Amit wrote: > at 7:53 AM, Segher Boessenkool <segher@kernel.crashing.org> wrote: > > > On Mon, Oct 08, 2018 at 11:07:46AM +0200, Richard Biener wrote: > >> On Mon, 8 Oct 2018, Segher Boessenkool wrote: > >>> On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote: > >>>> On Sun, 7 Oct 2018, Segher Boessenkool wrote: > >>>>> On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: > >>>>>> Now, Richard suggested doing something like: > >>>>>> > >>>>>> 1) inline asm ("...") > >>>>> > >>>>> What would the semantics of this be? > >>>> > >>>> The size of the inline asm wouldn't be counted towards the inliner size > >>>> limits (or be counted as "1"). > >>> > >>> That sounds like a good option. > >> > >> Yes, I also like it for simplicity. It also avoids the requirement > >> of translating the number (in bytes?) given by the user to > >> "number of GIMPLE instructions" as needed by the inliner. > > > > This patch implements this, for C only so far. And the syntax is > > "asm inline", which is more in line with other syntax. > > > > How does this look? > > It looks good to me in general. I have a couple of reservations, but I > suspect you will not want to address them: > > 1. It is not backward compatible, requiring a C macro to wrap it, as the > kernel might be built with different compilers. > > 2. It is specific to asm. I do not have in mind another use case (excluding > the __builtin_constant_p), but it would be nicer IMHO to have a builtin > saying âignore the cost of this statementâ for the matter of optimizations. The only easy possibility that comes to my mid is sth like __attribute__((always_inline, zero_cost)) foo () { ... your stmts ... } and us, upon inlining, marking the inlined stmts properly. That would also work for the asm() case and it would be backwards compatible (well, you'd get a diagnostic for the unknown zero_cost attribute). There's the slight complication that if you have, say _1 = _2 * 3; // zero-cost _4 = _1 * 2; and optimization ends up combining those to _4 = _2 * 6; then is this stmt supposed to be zero-cost or not? Compare that to _1 = _2 * 3; _4 = _1 * 2; // zero-cost So outside of asm() there are new issues that come up with respect to expected (cost) semantics. Richard. ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <CAK7LNARmrdtyE7JRAdJH_zbfvD_cej+TGLh+5KfT20o538KUcA@mail.gmail.com>]
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <CAK7LNARmrdtyE7JRAdJH_zbfvD_cej+TGLh+5KfT20o538KUcA@mail.gmail.com> @ 2018-11-29 13:09 ` Borislav Petkov via gcc 2018-11-29 13:16 ` Richard Biener 2018-11-30 9:06 ` Segher Boessenkool 1 sibling, 1 reply; 40+ messages in thread From: Borislav Petkov via gcc @ 2018-11-29 13:09 UTC (permalink / raw) To: Masahiro Yamada Cc: Segher Boessenkool, Nadav Amit, Ingo Molnar, H. Peter Anvin, rguenther, matz, gcc, Linux Kernel Mailing List, X86 ML, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra (Intel), Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Thu, Nov 29, 2018 at 08:46:34PM +0900, Masahiro Yamada wrote: > But, I'd like to ask if x86 people want to keep this macros.s approach. > Revert 77b0bf55bc675 right now > assuming the compiler will eventually solve the issue? Yap, considering how elegant the compiler solution is and how much problems this macros-in-asm thing causes, I think we should patch out the latter and wait for gcc9. I mean, the savings are not so mind-blowing to have to deal with the fallout. But this is just my opinion. Thx. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-11-29 13:09 ` Borislav Petkov via gcc @ 2018-11-29 13:16 ` Richard Biener 2018-11-29 13:25 ` Borislav Petkov via gcc 0 siblings, 1 reply; 40+ messages in thread From: Richard Biener @ 2018-11-29 13:16 UTC (permalink / raw) To: Borislav Petkov Cc: Masahiro Yamada, Segher Boessenkool, Nadav Amit, Ingo Molnar, H. Peter Anvin, matz, gcc, Linux Kernel Mailing List, X86 ML, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra (Intel), Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Thu, 29 Nov 2018, Borislav Petkov wrote: > On Thu, Nov 29, 2018 at 08:46:34PM +0900, Masahiro Yamada wrote: > > But, I'd like to ask if x86 people want to keep this macros.s approach. > > Revert 77b0bf55bc675 right now > > assuming the compiler will eventually solve the issue? > > Yap, considering how elegant the compiler solution is and how much > problems this macros-in-asm thing causes, I think we should patch > out the latter and wait for gcc9. I mean, the savings are not so > mind-blowing to have to deal with the fallout. > > But this is just my opinion. I'd be not opposed to backporting the asm inline support. Of course we still have to be happy with it and install the patch ;) Are you (kernel folks) happy with asm inline ()? Richard. > Thx. > > -- Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-11-29 13:16 ` Richard Biener @ 2018-11-29 13:25 ` Borislav Petkov via gcc 2018-11-29 13:52 ` Richard Biener 0 siblings, 1 reply; 40+ messages in thread From: Borislav Petkov via gcc @ 2018-11-29 13:25 UTC (permalink / raw) To: Richard Biener Cc: Masahiro Yamada, Segher Boessenkool, Nadav Amit, Ingo Molnar, H. Peter Anvin, matz, gcc, Linux Kernel Mailing List, X86 ML, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra (Intel), Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Thu, Nov 29, 2018 at 02:09:25PM +0100, Richard Biener wrote: > I'd be not opposed to backporting the asm inline support. Even better! :-) > Of course we still have to be happy with it and install the patch ;) > > Are you (kernel folks) happy with asm inline ()? Yes, I think we are: https://lkml.kernel.org/r/20181031125526.GA13219@hirez.programming.kicks-ass.net Thx. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-11-29 13:25 ` Borislav Petkov via gcc @ 2018-11-29 13:52 ` Richard Biener 0 siblings, 0 replies; 40+ messages in thread From: Richard Biener @ 2018-11-29 13:52 UTC (permalink / raw) To: Borislav Petkov Cc: Masahiro Yamada, Segher Boessenkool, Nadav Amit, Ingo Molnar, H. Peter Anvin, matz, gcc, Linux Kernel Mailing List, X86 ML, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra (Intel), Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Thu, 29 Nov 2018, Borislav Petkov wrote: > On Thu, Nov 29, 2018 at 02:09:25PM +0100, Richard Biener wrote: > > I'd be not opposed to backporting the asm inline support. > > Even better! :-) > > > Of course we still have to be happy with it and install the patch ;) > > > > Are you (kernel folks) happy with asm inline ()? > > Yes, I think we are: > > https://lkml.kernel.org/r/20181031125526.GA13219@hirez.programming.kicks-ass.net OK, Segher - can you ping the latest version of the patch please? Thanks, Richard. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <CAK7LNARmrdtyE7JRAdJH_zbfvD_cej+TGLh+5KfT20o538KUcA@mail.gmail.com> 2018-11-29 13:09 ` Borislav Petkov via gcc @ 2018-11-30 9:06 ` Segher Boessenkool 2018-11-30 10:04 ` Boris Petkov via gcc 1 sibling, 1 reply; 40+ messages in thread From: Segher Boessenkool @ 2018-11-30 9:06 UTC (permalink / raw) To: Masahiro Yamada Cc: Nadav Amit, Ingo Molnar, H. Peter Anvin, rguenther, matz, Borislav Petkov, gcc, Linux Kernel Mailing List, X86 ML, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra (Intel), Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Thu, Nov 29, 2018 at 08:46:34PM +0900, Masahiro Yamada wrote: > On Wed, Oct 10, 2018 at 1:14 AM Segher Boessenkool > <segher@kernel.crashing.org> wrote: > > > > On Mon, Oct 08, 2018 at 11:07:46AM +0200, Richard Biener wrote: > > > On Mon, 8 Oct 2018, Segher Boessenkool wrote: > > > > On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote: > > > > > On Sun, 7 Oct 2018, Segher Boessenkool wrote: > > > > > > On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote: > > > > > > > Now, Richard suggested doing something like: > > > > > > > > > > > > > > 1) inline asm ("...") > > > > > > > > > > > > What would the semantics of this be? > > > > > > > > > > The size of the inline asm wouldn't be counted towards the inliner size > > > > > limits (or be counted as "1"). > > > > > > > > That sounds like a good option. > > > > > > Yes, I also like it for simplicity. It also avoids the requirement > > > of translating the number (in bytes?) given by the user to > > > "number of GIMPLE instructions" as needed by the inliner. > > > > This patch implements this, for C only so far. And the syntax is > > "asm inline", which is more in line with other syntax. > > > > How does this look? > > > Thank you very much for your work. > > > https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01932.html > > How is the progress of this in GCC ML? Latest patch was pinged a few times: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01569.html . I'll ping it again. Will fix the subject as well if I remember to, sigh. > I am really hoping the issue will be solved by compiler > instead of the in-kernel workaround. This will only be fixed from GCC 9 on, if the compiler adopts it. The kernel wants to support ancient GCC, so it will need to have a workaround for older GCC versions anyway. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-11-30 9:06 ` Segher Boessenkool @ 2018-11-30 10:04 ` Boris Petkov via gcc 2018-12-02 6:03 ` Segher Boessenkool 0 siblings, 1 reply; 40+ messages in thread From: Boris Petkov via gcc @ 2018-11-30 10:04 UTC (permalink / raw) To: Segher Boessenkool, Masahiro Yamada Cc: Nadav Amit, Ingo Molnar, H. Peter Anvin, rguenther, matz, gcc, Linux Kernel Mailing List, X86 ML, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra (Intel), Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On November 29, 2018 1:25:02 PM GMT+01:00, Segher Boessenkool <segher@kernel.crashing.org> wrote: >This will only be fixed from GCC 9 on, if the compiler adopts it. The >kernel wants to support ancient GCC, so it will need to have a >workaround >for older GCC versions anyway. What about backporting it, like Richard says? -- Sent from a small device: formatting sux and brevity is inevitable. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-11-30 10:04 ` Boris Petkov via gcc @ 2018-12-02 6:03 ` Segher Boessenkool 0 siblings, 0 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-12-02 6:03 UTC (permalink / raw) To: Boris Petkov Cc: Masahiro Yamada, Nadav Amit, Ingo Molnar, H. Peter Anvin, rguenther, matz, gcc, Linux Kernel Mailing List, X86 ML, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra (Intel), Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Fri, Nov 30, 2018 at 10:06:02AM +0100, Boris Petkov wrote: > On November 29, 2018 1:25:02 PM GMT+01:00, Segher Boessenkool <segher@kernel.crashing.org> wrote: > >This will only be fixed from GCC 9 on, if the compiler adopts it. The > >kernel wants to support ancient GCC, so it will need to have a > >workaround > >for older GCC versions anyway. > > What about backporting it, like Richard says? Let me first get it into GCC trunk :-) It should backport fine, sure; and I'll work on that. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <20181008061323.GB66819@gmail.com>]
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <20181008061323.GB66819@gmail.com> @ 2018-10-08 8:19 ` Segher Boessenkool 0 siblings, 0 replies; 40+ messages in thread From: Segher Boessenkool @ 2018-10-08 8:19 UTC (permalink / raw) To: Ingo Molnar Cc: Michael Matz, Borislav Petkov, gcc, Richard Biener, Nadav Amit, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On Mon, Oct 08, 2018 at 08:13:23AM +0200, Ingo Molnar wrote: > * Michael Matz <matz@suse.de> wrote: > > (without an built-in assembler which hopefully noone proposes). > There are disadvantages (the main one is having to implement it), but a built-in assembler has > numerous advantages as well: > > - Better optimizations: for example -Os could more accurately estimate true instruction size. GCC already knows the exact instruction size in almost all cases, for most backends. It is an advantage to not *have to* keep track of exact insn size in all cases. > - Better inlining: as the examples in this thread are showing. That's a red herring, the actual issue is inlining makes some spectacularly wrong decisions for code involving asm currently. That would not be solved by outputting binary code instead of assembler code. All those decisions are done long before code is output. > - Better padding/alignment: right now GCC has no notion about the precise cache layout of the > assembly code it generates and the code alignment options it has are crude. This isn't true. Maybe some targets do not care. And of course GCC only knows this as far as it knows the alignments of the sections it outputs into; for example, ASLR is a nice performance killer at times. And if your linker scripts align sections to less than a cache line things do not look rosy either, etc. > It got away with > this so far because the x86 rule of thumb is that dense code is usually the right choice. > - Better compiler performance: it would be faster as well to immediately emit assembly > instructions, just like GCC's preprocessor library use speeds up compilation *significantly* > instead of creating a separate preprocessor task. So how much faster do you think it would be? Do you have actual numbers? And no, -O0 does not count. > - Better future integration of assembly blocks: GCC could begin to actually understand the > assembly statements in inline asm and allow more user-friendly extensions to its > historically complex and difficult to master inline asm syntax. If you want to add a different kind of inline assembler, you can already. There is no need for outputting binary code for that. > I mean, it's a fact that the GNU project has *already* defined their own assembly syntax which > departs from decades old platform assembly syntax GCC's asm syntax is over three decades old itself. When it was defined all other C inline assembler syntaxes were much younger than that. I don't see what your argument is here. > - and how the assembler is called by the > compiler is basically an implementation detail, not a conceptual choice. But the *format* of the interface representation, in this case textual assembler code, is quite fundamental. > The random > multi-process unidirectional assembler choice of the past should not be treated as orthodoxy. Then I have good news for you: no assembler works that way these days, and that has been true for decades. Segher ^ permalink raw reply [flat|nested] 40+ messages in thread
[parent not found: <4F2F1BCE-7875-4160-9E1E-9F8EF962D989@vmware.com>]
* [RESEND] PROPOSAL: Extend inline asm syntax with size spec [not found] ` <4F2F1BCE-7875-4160-9E1E-9F8EF962D989@vmware.com> @ 2018-10-07 16:13 ` Nadav Amit 2018-10-07 16:46 ` Richard Biener 1 sibling, 0 replies; 40+ messages in thread From: Nadav Amit @ 2018-10-07 16:13 UTC (permalink / raw) To: Borislav Petkov, gcc, Richard Biener, Michael Matz Cc: Ingo Molnar, LKML, X86 ML, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa Second try after being blocked by gcc mailing list: at 9:09 AM, Nadav Amit <namit@vmware.com<mailto:namit@vmware.com>> wrote: at 2:18 AM, Borislav Petkov <bp@alien8.de<mailto:bp@alien8.de>> wrote: Hi people, this is an attempt to see whether gcc's inline asm heuristic when estimating inline asm statements' cost for better inlining can be improved. AFAIU, the problematic arises when one ends up using a lot of inline asm statements in the kernel but due to the inline asm cost estimation heuristic which counts lines, I think, for example like in this here macro: https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Farch%2Fx86%2Finclude%2Fasm%2Fcpufeature.h%23n162&data=02%7C01%7Cnamit%40vmware.com%7C6db1258c65ea45bbe4ea08d62c35ceec%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636745007006838299&sdata=iehl%2Fb8h%2BZE%2Frqb4qjac19WekSgOObc9%2BM1Jto1VgF4%3D&reserved=0 the resulting code ends up not inlining the functions themselves which use this macro. I.e., you see a CALL <function> instead of its body getting inlined directly. Even though it should be because the actual instructions are only a couple in most cases and all those other directives end up in another section anyway. The issue is explained below in the forwarded mail in a larger detail too. Now, Richard suggested doing something like: 1) inline asm ("...") 2) asm ("..." : : : : <size-expr>) 3) asm ("...") __attribute__((asm_size(<size-expr>))); with which user can tell gcc what the size of that inline asm statement is and thus allow for more precise cost estimation and in the end better inlining. And FWIW 3) looks pretty straight-forward to me because attributes are pretty common anyways. But I'm sure there are other options and I'm sure people will have better/different ideas so feel free to chime in. Thanks for taking care of it. I would like to mention a second issue, since you may want to resolve both with a single solution: not inlining conditional __builtin_constant_p(), in which there are two code-paths - one for constants and one for variables. Consider for example the Linux kernel ilog2 macro, which has a condition based on __builtin_constant_p() ( https://elixir.bootlin.com/linux/v4.19-rc7/source/include/linux/log2.h#L160 ). The compiler mistakenly considers the “heavy” code-path that is supposed to be evaluated only in compilation time to evaluate the code size. This causes the kernel to consider functions such as kmalloc() as “big”. kmalloc() is marked with always_inline attribute, so instead the calling functions, such as kzalloc() are not inlined. When I thought about hacking gcc to solve this issue, I considered an intrinsic that would override the cost of a given statement. This solution is not too nice, but may solve both issues. In addition, note that AFAIU the impact of a wrong cost of code estimation can also impact loop and other optimizations. Regards, Nadav ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec [not found] ` <4F2F1BCE-7875-4160-9E1E-9F8EF962D989@vmware.com> 2018-10-07 16:13 ` [RESEND] " Nadav Amit @ 2018-10-07 16:46 ` Richard Biener 2018-10-07 19:06 ` Nadav Amit 1 sibling, 1 reply; 40+ messages in thread From: Richard Biener @ 2018-10-07 16:46 UTC (permalink / raw) To: Nadav Amit, Borislav Petkov, gcc, Michael Matz Cc: Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa On October 7, 2018 6:09:30 PM GMT+02:00, Nadav Amit <namit@vmware.com> wrote: >at 2:18 AM, Borislav Petkov <bp@alien8.de> wrote: > >> Hi people, >> >> this is an attempt to see whether gcc's inline asm heuristic when >> estimating inline asm statements' cost for better inlining can be >> improved. >> >> AFAIU, the problematic arises when one ends up using a lot of inline >> asm statements in the kernel but due to the inline asm cost >estimation >> heuristic which counts lines, I think, for example like in this here >> macro: >> >> >https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Farch%2Fx86%2Finclude%2Fasm%2Fcpufeature.h%23n162&data=02%7C01%7Cnamit%40vmware.com%7C6db1258c65ea45bbe4ea08d62c35ceec%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636745007006838299&sdata=iehl%2Fb8h%2BZE%2Frqb4qjac19WekSgOObc9%2BM1Jto1VgF4%3D&reserved=0 >> >> the resulting code ends up not inlining the functions themselves >which >> use this macro. I.e., you see a CALL <function> instead of its body >> getting inlined directly. >> >> Even though it should be because the actual instructions are only a >> couple in most cases and all those other directives end up in another >> section anyway. >> >> The issue is explained below in the forwarded mail in a larger detail >> too. >> >> Now, Richard suggested doing something like: >> >> 1) inline asm ("...") >> 2) asm ("..." : : : : <size-expr>) >> 3) asm ("...") __attribute__((asm_size(<size-expr>))); >> >> with which user can tell gcc what the size of that inline asm >statement >> is and thus allow for more precise cost estimation and in the end >better >> inlining. >> >> And FWIW 3) looks pretty straight-forward to me because attributes >are >> pretty common anyways. >> >> But I'm sure there are other options and I'm sure people will have >> better/different ideas so feel free to chime in. > >Thanks for taking care of it. I would like to mention a second issue, >since >you may want to resolve both with a single solution: not inlining >conditional __builtin_constant_p(), in which there are two code-paths - >one >for constants and one for variables. > >Consider for example the Linux kernel ilog2 macro, which has a >condition >based on __builtin_constant_p() ( >https://elixir.bootlin.com/linux/v4.19-rc7/source/include/linux/log2.h#L160 >). The compiler mistakenly considers the “heavy” code-path that is >supposed >to be evaluated only in compilation time to evaluate the code size. But this is a misconception about __builtin_constant_p. It doesn't guard sth like 'constexpr' regions. If you try to use it with those semantics you'll fail (appearantly you do). Of course IPA CP code size estimates when seeing a constant fed to bcp might be not optimal, that's another issue of course. Richard. >This >causes the kernel to consider functions such as kmalloc() as “big”. >kmalloc() is marked with always_inline attribute, so instead the >calling >functions, such as kzalloc() are not inlined. > >When I thought about hacking gcc to solve this issue, I considered an >intrinsic that would override the cost of a given statement. This >solution >is not too nice, but may solve both issues. > >In addition, note that AFAIU the impact of a wrong cost of code >estimation >can also impact loop and other optimizations. > >Regards, >Nadav ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-07 16:46 ` Richard Biener @ 2018-10-07 19:06 ` Nadav Amit 2018-10-07 19:52 ` Jeff Law 2018-10-08 7:46 ` Richard Biener 0 siblings, 2 replies; 40+ messages in thread From: Nadav Amit @ 2018-10-07 19:06 UTC (permalink / raw) To: Richard Biener, Borislav Petkov, gcc, Michael Matz Cc: Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa at 9:46 AM, Richard Biener <rguenther@suse.de> wrote: > On October 7, 2018 6:09:30 PM GMT+02:00, Nadav Amit <namit@vmware.com> wrote: >> at 2:18 AM, Borislav Petkov <bp@alien8.de> wrote: >> >>> Hi people, >>> >>> this is an attempt to see whether gcc's inline asm heuristic when >>> estimating inline asm statements' cost for better inlining can be >>> improved. >>> >>> AFAIU, the problematic arises when one ends up using a lot of inline >>> asm statements in the kernel but due to the inline asm cost >> estimation >>> heuristic which counts lines, I think, for example like in this here >>> macro: >> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Farch%2Fx86%2Finclude%2Fasm%2Fcpufeature.h%23n162&data=02%7C01%7Cnamit%40vmware.com%7C860403cecb874db64b7e08d62c746f46%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636745275975505381&sdata=Nd0636K9Z1IsUs1RWSRAhVuVboLxlBCB4peiAMfmQzQ%3D&reserved=0 >>> the resulting code ends up not inlining the functions themselves >> which >>> use this macro. I.e., you see a CALL <function> instead of its body >>> getting inlined directly. >>> >>> Even though it should be because the actual instructions are only a >>> couple in most cases and all those other directives end up in another >>> section anyway. >>> >>> The issue is explained below in the forwarded mail in a larger detail >>> too. >>> >>> Now, Richard suggested doing something like: >>> >>> 1) inline asm ("...") >>> 2) asm ("..." : : : : <size-expr>) >>> 3) asm ("...") __attribute__((asm_size(<size-expr>))); >>> >>> with which user can tell gcc what the size of that inline asm >> statement >>> is and thus allow for more precise cost estimation and in the end >> better >>> inlining. >>> >>> And FWIW 3) looks pretty straight-forward to me because attributes >> are >>> pretty common anyways. >>> >>> But I'm sure there are other options and I'm sure people will have >>> better/different ideas so feel free to chime in. >> >> Thanks for taking care of it. I would like to mention a second issue, >> since >> you may want to resolve both with a single solution: not inlining >> conditional __builtin_constant_p(), in which there are two code-paths - >> one >> for constants and one for variables. >> >> Consider for example the Linux kernel ilog2 macro, which has a >> condition >> based on __builtin_constant_p() ( >> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv4.19-rc7%2Fsource%2Finclude%2Flinux%2Flog2.h%23L160&data=02%7C01%7Cnamit%40vmware.com%7C860403cecb874db64b7e08d62c746f46%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636745275975515386&sdata=Hk39Za9%2FxcFyK0sGENB24d6QySjsDGzF%2FwqjnUEMiGk%3D&reserved=0 >> ). The compiler mistakenly considers the “heavy” code-path that is >> supposed >> to be evaluated only in compilation time to evaluate the code size. > > But this is a misconception about __builtin_constant_p. It doesn't guard sth like 'constexpr' regions. If you try to use it with those semantics you'll fail (appearantly you do). > > Of course IPA CP code size estimates when seeing a constant fed to bcp might be not optimal, that's another issue of course. I understand that this is might not be the right way to implement macros such as ilog2() and test_bit(), but this code is around for some time. I thought of using __builtin_choose_expr() instead of ternary operator, but this introduces a different problem, as the variable version is used instead of the constant one in many cases. From my brief experiments with llvm, it appears that llvm does not have both of these issues (wrong cost attributed to inline asm and conditions based on __builtin_constant_p()). So what alternative do you propose to implement ilog2() like behavior? I was digging through the gcc code to find a workaround with no success. Thanks, Nadav ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-07 19:06 ` Nadav Amit @ 2018-10-07 19:52 ` Jeff Law 2018-10-08 7:46 ` Richard Biener 1 sibling, 0 replies; 40+ messages in thread From: Jeff Law @ 2018-10-07 19:52 UTC (permalink / raw) To: Nadav Amit, Richard Biener, Borislav Petkov, gcc, Michael Matz Cc: Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov On 10/7/18 1:06 PM, Nadav Amit wrote: > at 9:46 AM, Richard Biener <rguenther@suse.de> wrote: > >> On October 7, 2018 6:09:30 PM GMT+02:00, Nadav Amit <namit@vmware.com> wrote: >>> at 2:18 AM, Borislav Petkov <bp@alien8.de> wrote: >>> >>>> Hi people, >>>> >>>> this is an attempt to see whether gcc's inline asm heuristic when >>>> estimating inline asm statements' cost for better inlining can be >>>> improved. >>>> >>>> AFAIU, the problematic arises when one ends up using a lot of inline >>>> asm statements in the kernel but due to the inline asm cost >>> estimation >>>> heuristic which counts lines, I think, for example like in this here >>>> macro: >>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Farch%2Fx86%2Finclude%2Fasm%2Fcpufeature.h%23n162&data=02%7C01%7Cnamit%40vmware.com%7C860403cecb874db64b7e08d62c746f46%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636745275975505381&sdata=Nd0636K9Z1IsUs1RWSRAhVuVboLxlBCB4peiAMfmQzQ%3D&reserved=0 >>>> the resulting code ends up not inlining the functions themselves >>> which >>>> use this macro. I.e., you see a CALL <function> instead of its body >>>> getting inlined directly. >>>> >>>> Even though it should be because the actual instructions are only a >>>> couple in most cases and all those other directives end up in another >>>> section anyway. >>>> >>>> The issue is explained below in the forwarded mail in a larger detail >>>> too. >>>> >>>> Now, Richard suggested doing something like: >>>> >>>> 1) inline asm ("...") >>>> 2) asm ("..." : : : : <size-expr>) >>>> 3) asm ("...") __attribute__((asm_size(<size-expr>))); >>>> >>>> with which user can tell gcc what the size of that inline asm >>> statement >>>> is and thus allow for more precise cost estimation and in the end >>> better >>>> inlining. >>>> >>>> And FWIW 3) looks pretty straight-forward to me because attributes >>> are >>>> pretty common anyways. >>>> >>>> But I'm sure there are other options and I'm sure people will have >>>> better/different ideas so feel free to chime in. >>> >>> Thanks for taking care of it. I would like to mention a second issue, >>> since >>> you may want to resolve both with a single solution: not inlining >>> conditional __builtin_constant_p(), in which there are two code-paths - >>> one >>> for constants and one for variables. >>> >>> Consider for example the Linux kernel ilog2 macro, which has a >>> condition >>> based on __builtin_constant_p() ( >>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv4.19-rc7%2Fsource%2Finclude%2Flinux%2Flog2.h%23L160&data=02%7C01%7Cnamit%40vmware.com%7C860403cecb874db64b7e08d62c746f46%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636745275975515386&sdata=Hk39Za9%2FxcFyK0sGENB24d6QySjsDGzF%2FwqjnUEMiGk%3D&reserved=0 >>> ). The compiler mistakenly considers the âheavyâ code-path that is >>> supposed >>> to be evaluated only in compilation time to evaluate the code size. >> >> But this is a misconception about __builtin_constant_p. It doesn't guard sth like 'constexpr' regions. If you try to use it with those semantics you'll fail (appearantly you do). >> >> Of course IPA CP code size estimates when seeing a constant fed to bcp might be not optimal, that's another issue of course. > > I understand that this is might not be the right way to implement macros > such as ilog2() and test_bit(), but this code is around for some time. That doesn't make it right -- and there's been numerous bogus bugs reported against ilog2 because the authors of ilog2 haven't had a clear understanding of the semantics of builtin_constant_p. Jeff ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: PROPOSAL: Extend inline asm syntax with size spec 2018-10-07 19:06 ` Nadav Amit 2018-10-07 19:52 ` Jeff Law @ 2018-10-08 7:46 ` Richard Biener 1 sibling, 0 replies; 40+ messages in thread From: Richard Biener @ 2018-10-08 7:46 UTC (permalink / raw) To: Nadav Amit Cc: Borislav Petkov, gcc, Michael Matz, Ingo Molnar, linux-kernel, x86, Masahiro Yamada, Sam Ravnborg, Alok Kataria, Christopher Li, Greg Kroah-Hartman, H. Peter Anvin, Jan Beulich, Josh Poimboeuf, Juergen Gross, Kate Stewart, Kees Cook, linux-sparse, Peter Zijlstra, Philippe Ombredanne, Thomas Gleixner, virtualization, Linus Torvalds, Chris Zankel, Max Filippov, linux-xtensa [-- Attachment #1: Type: text/plain, Size: 4977 bytes --] On Sun, 7 Oct 2018, Nadav Amit wrote: > at 9:46 AM, Richard Biener <rguenther@suse.de> wrote: > > > On October 7, 2018 6:09:30 PM GMT+02:00, Nadav Amit <namit@vmware.com> wrote: > >> at 2:18 AM, Borislav Petkov <bp@alien8.de> wrote: > >> > >>> Hi people, > >>> > >>> this is an attempt to see whether gcc's inline asm heuristic when > >>> estimating inline asm statements' cost for better inlining can be > >>> improved. > >>> > >>> AFAIU, the problematic arises when one ends up using a lot of inline > >>> asm statements in the kernel but due to the inline asm cost > >> estimation > >>> heuristic which counts lines, I think, for example like in this here > >>> macro: > >> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Farch%2Fx86%2Finclude%2Fasm%2Fcpufeature.h%23n162&data=02%7C01%7Cnamit%40vmware.com%7C860403cecb874db64b7e08d62c746f46%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636745275975505381&sdata=Nd0636K9Z1IsUs1RWSRAhVuVboLxlBCB4peiAMfmQzQ%3D&reserved=0 > >>> the resulting code ends up not inlining the functions themselves > >> which > >>> use this macro. I.e., you see a CALL <function> instead of its body > >>> getting inlined directly. > >>> > >>> Even though it should be because the actual instructions are only a > >>> couple in most cases and all those other directives end up in another > >>> section anyway. > >>> > >>> The issue is explained below in the forwarded mail in a larger detail > >>> too. > >>> > >>> Now, Richard suggested doing something like: > >>> > >>> 1) inline asm ("...") > >>> 2) asm ("..." : : : : <size-expr>) > >>> 3) asm ("...") __attribute__((asm_size(<size-expr>))); > >>> > >>> with which user can tell gcc what the size of that inline asm > >> statement > >>> is and thus allow for more precise cost estimation and in the end > >> better > >>> inlining. > >>> > >>> And FWIW 3) looks pretty straight-forward to me because attributes > >> are > >>> pretty common anyways. > >>> > >>> But I'm sure there are other options and I'm sure people will have > >>> better/different ideas so feel free to chime in. > >> > >> Thanks for taking care of it. I would like to mention a second issue, > >> since > >> you may want to resolve both with a single solution: not inlining > >> conditional __builtin_constant_p(), in which there are two code-paths - > >> one > >> for constants and one for variables. > >> > >> Consider for example the Linux kernel ilog2 macro, which has a > >> condition > >> based on __builtin_constant_p() ( > >> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv4.19-rc7%2Fsource%2Finclude%2Flinux%2Flog2.h%23L160&data=02%7C01%7Cnamit%40vmware.com%7C860403cecb874db64b7e08d62c746f46%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636745275975515386&sdata=Hk39Za9%2FxcFyK0sGENB24d6QySjsDGzF%2FwqjnUEMiGk%3D&reserved=0 > >> ). The compiler mistakenly considers the âheavyâ code-path that is > >> supposed > >> to be evaluated only in compilation time to evaluate the code size. > > > > But this is a misconception about __builtin_constant_p. It doesn't guard sth like 'constexpr' regions. If you try to use it with those semantics you'll fail (appearantly you do). > > > > Of course IPA CP code size estimates when seeing a constant fed to bcp might be not optimal, that's another issue of course. > > I understand that this is might not be the right way to implement macros > such as ilog2() and test_bit(), but this code is around for some time. > > I thought of using __builtin_choose_expr() instead of ternary operator, but > this introduces a different problem, as the variable version is used instead > of the constant one in many cases. From my brief experiments with llvm, it > appears that llvm does not have both of these issues (wrong cost attributed > to inline asm and conditions based on __builtin_constant_p()). > > So what alternative do you propose to implement ilog2() like behavior? I was > digging through the gcc code to find a workaround with no success. 1) Don't try to cheat the compilers constant propagation abilities 2) Use a language that allows this (C++) 3) Define (and implement) the corresponding GNU C extension __builtin_constant_p() isn't the right fit (I wonder what it was implemented for in the first place though...). I suppose you want sth like if (__builtin_constant_p (x)) return __constexpr ...; or use a call and have constexpr functions. Note it wouldn't be C++-constexpr like since you want the constexpr evaluation to happen very late in the compilation to benefit from optimizations and you are fine with the non-constexpr path. Properly defining a language extension is hard. Richard. > > Thanks, > Nadav > > -- Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) ^ permalink raw reply [flat|nested] 40+ messages in thread
end of thread, other threads:[~2018-12-02 6:03 UTC | newest] Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <20181003213100.189959-1-namit@vmware.com> 2018-10-07 9:46 ` PROPOSAL: Extend inline asm syntax with size spec Borislav Petkov 2018-10-07 13:23 ` Segher Boessenkool 2018-10-07 14:14 ` Borislav Petkov 2018-10-07 15:14 ` Segher Boessenkool [not found] ` <20181008055838.GA66819@gmail.com> 2018-10-08 7:53 ` Segher Boessenkool 2018-10-07 15:53 ` Michael Matz 2018-10-08 7:31 ` Segher Boessenkool 2018-10-08 9:09 ` Richard Biener 2018-10-08 10:03 ` Segher Boessenkool 2018-10-09 15:54 ` Segher Boessenkool 2018-10-10 7:12 ` Richard Biener 2018-10-10 7:54 ` Segher Boessenkool [not found] ` <20181010072240.GB103159@gmail.com> 2018-10-10 8:03 ` Segher Boessenkool 2018-10-10 8:19 ` Borislav Petkov 2018-10-10 8:36 ` Richard Biener 2018-10-10 18:55 ` Segher Boessenkool 2018-10-10 19:14 ` Borislav Petkov 2018-10-13 19:33 ` Borislav Petkov [not found] ` <alpine.LNX.2.20.13.1810132355180.13914@monopod.intra.ispras.ru> 2018-10-13 21:31 ` Borislav Petkov 2018-10-25 10:23 ` Borislav Petkov [not found] ` <20181031125526.GA13219@hirez.programming.kicks-ass.net> 2018-10-31 22:04 ` Segher Boessenkool 2018-11-01 5:23 ` Joe Perches [not found] ` <20181101090114.GE3159@hirez.programming.kicks-ass.net> 2018-11-01 9:20 ` Joe Perches 2018-10-10 10:29 ` Richard Biener 2018-10-10 16:31 ` Nadav Amit 2018-10-10 19:31 ` Segher Boessenkool 2018-10-11 7:05 ` Richard Biener [not found] ` <CAK7LNARmrdtyE7JRAdJH_zbfvD_cej+TGLh+5KfT20o538KUcA@mail.gmail.com> 2018-11-29 13:09 ` Borislav Petkov via gcc 2018-11-29 13:16 ` Richard Biener 2018-11-29 13:25 ` Borislav Petkov via gcc 2018-11-29 13:52 ` Richard Biener 2018-11-30 9:06 ` Segher Boessenkool 2018-11-30 10:04 ` Boris Petkov via gcc 2018-12-02 6:03 ` Segher Boessenkool [not found] ` <20181008061323.GB66819@gmail.com> 2018-10-08 8:19 ` Segher Boessenkool [not found] ` <4F2F1BCE-7875-4160-9E1E-9F8EF962D989@vmware.com> 2018-10-07 16:13 ` [RESEND] " Nadav Amit 2018-10-07 16:46 ` Richard Biener 2018-10-07 19:06 ` Nadav Amit 2018-10-07 19:52 ` Jeff Law 2018-10-08 7:46 ` Richard Biener
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).