* Re: [PING] [PATCH] Add direct support for Linux kernel __fentry__ patching
@ 2014-09-27 11:21 Dominique Dhumieres
2014-09-27 15:35 ` Andi Kleen
0 siblings, 1 reply; 5+ messages in thread
From: Dominique Dhumieres @ 2014-09-27 11:21 UTC (permalink / raw)
To: andi; +Cc: law, gcc-patches
The new tests fail on darwin:
/opt/gcc/work/gcc/testsuite/gcc.target/i386/nop-mcount.c:1:0: error: -mnop-mcount is not implemented for -fPIC
and gcc.target/i386/record-mcount.c fails because mcount_loc is not found in the assembly.
TIA
Dominique
* Re: [PING] [PATCH] Add direct support for Linux kernel __fentry__ patching
2014-09-27 11:21 [PING] [PATCH] Add direct support for Linux kernel __fentry__ patching Dominique Dhumieres
@ 2014-09-27 15:35 ` Andi Kleen
[not found] ` <889168F4-6DA0-4B9B-BB19-0B9A3B349D99@lps.ens.fr>
0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2014-09-27 15:35 UTC (permalink / raw)
To: Dominique Dhumieres; +Cc: andi, law, gcc-patches
On Sat, Sep 27, 2014 at 01:21:29PM +0200, Dominique Dhumieres wrote:
> The new tests fail on darwin:
>
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/nop-mcount.c:1:0: error: -mnop-mcount is not implemented for -fPIC
>
> and gcc.target/i386/record-mcount.c fails because mcount_loc is not found in the assembly.
Sorry. Here's a patch. I'll install it as obvious unless someone
complains. Also I hope it's the last mcount patch for now :-)
-Andi
diff --git a/gcc/testsuite/gcc.target/i386/nop-mcount.c b/gcc/testsuite/gcc.target/i386/nop-mcount.c
index 2592231..942cae0 100644
--- a/gcc/testsuite/gcc.target/i386/nop-mcount.c
+++ b/gcc/testsuite/gcc.target/i386/nop-mcount.c
@@ -1,5 +1,5 @@
/* Test -mnop-mcount */
-/* { dg-do compile } */
+/* { dg-do compile { target *-*-linux* } } */
/* { dg-options "-pg -mfentry -mrecord-mcount -mnop-mcount" } */
/* { dg-final { scan-assembler-not "__fentry__" } } */
/* Origin: Andi Kleen */
diff --git a/gcc/testsuite/gcc.target/i386/record-mcount.c b/gcc/testsuite/gcc.target/i386/record-mcount.c
index dae413e..26b0dbc 100644
--- a/gcc/testsuite/gcc.target/i386/record-mcount.c
+++ b/gcc/testsuite/gcc.target/i386/record-mcount.c
@@ -1,5 +1,5 @@
/* Test -mrecord-mcount */
-/* { dg-do compile } */
+/* { dg-do compile { target *-*-linux* } } */
/* { dg-options "-pg -mrecord-mcount" } */
/* { dg-final { scan-assembler "mcount_loc" } } */
/* Origin: Andi Kleen */
* [PATCH] Add direct support for Linux kernel __fentry__ patching
@ 2014-09-02 3:25 Andi Kleen
2014-09-15 16:49 ` Andi Kleen
0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2014-09-02 3:25 UTC (permalink / raw)
To: gcc-patches; +Cc: Andi Kleen, rostedt
From: Andi Kleen <ak@linux.intel.com>
The Linux kernel dynamically patches __fentry__ calls in and
out at runtime. This allows using function tracing for debugging
in production kernels without a significant performance penalty.
For this it needs a table pointing to each __fentry__ call.
Currently a special Perl script scans the object file and
generates the table in a special section. When the kernel
boots, it nops the calls and later patches them in again
as needed.
The recordmcount.pl script in the kernel works, but it seems
cleaner and faster to support the code generation of the patch table
directly in gcc.
This also allows nopping the calls directly at code generation
time, which skips a patching step at kernel boot.
I also expect that a patchable production tracing facility is useful
for other applications.
For example it could be used in ftracer
(https://github.com/andikleen/ftracer)
Having a nop area at the beginning of each function can also
be useful for other things. For example, it can be used to patch
functions at runtime to point to different functions, to do
binary updates without restarting the program (like ksplice or
similar).
This patch implements two new options for the i386 target:
-mrecord-mcount
Generate a __mcount_loc section entry for each __fentry__ or mcount
call. The section is compatible with the kernel convention
and the data is put into a section loaded at runtime.
-mnop-mcount
Generate the mcount/__fentry__ call as a 5 byte nop that can be
patched in later. The nop is generated as a single instruction,
as the Linux kernel run time patching relies on this.
Limitations:
- I didn't implement -mnop-mcount for -fPIC. This
would need a good single instruction 6 byte NOP and it seems a
bit pointless, as the patching would prevent text sharing.
- I didn't implement nopping for targets that pass a variable
to mcount.
- The facility could be useful on other architectures too. Currently
the mcount code is target specific, so I made it an i386 option.
Passes bootstrap and testing on x86_64-linux.
Cc: rostedt@goodmis.org
gcc/:
2014-09-01 Andi Kleen <ak@linux.intel.com>
* config/i386/i386.c (x86_print_call_or_nop): New function.
(x86_function_profiler): Support -mnop-mcount and
-mrecord-mcount.
* config/i386/i386.opt (-mnop-mcount, -mrecord-mcount): Add.
* doc/invoke.texi: Document -mnop-mcount, -mrecord-mcount.
* testsuite/gcc.target/i386/nop-mcount.c: New file.
* testsuite/gcc.target/i386/record-mcount.c: New file.
---
gcc/config/i386/i386.c | 34 +++++++++++++++++++++++----
gcc/config/i386/i386.opt | 9 +++++++
gcc/doc/invoke.texi | 17 +++++++++++++-
gcc/testsuite/gcc.target/i386/nop-mcount.c | 24 +++++++++++++++++++
gcc/testsuite/gcc.target/i386/record-mcount.c | 24 +++++++++++++++++++
5 files changed, 102 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/nop-mcount.c
create mode 100644 gcc/testsuite/gcc.target/i386/record-mcount.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 61b33782..a651aa1 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3974,6 +3974,13 @@ ix86_option_override_internal (bool main_args_p,
}
}
+#ifndef NO_PROFILE_COUNTERS
+ if (flag_nop_mcount)
+ error ("-mnop-mcount is not compatible with this target");
+#endif
+ if (flag_nop_mcount && flag_pic)
+ error ("-mnop-mcount is not implemented for -fPIC");
+
/* Accept -msseregparm only if at least SSE support is enabled. */
if (TARGET_SSEREGPARM_P (opts->x_target_flags)
&& ! TARGET_SSE_P (opts->x_ix86_isa_flags))
@@ -39042,6 +39049,17 @@ x86_field_alignment (tree field, int computed)
return computed;
}
+/* Print call to TARGET to FILE. */
+
+static void
+x86_print_call_or_nop (FILE *file, const char *target)
+{
+ if (flag_nop_mcount)
+ fprintf (file, "1:\tnopl 0x00(%%eax,%%eax,1)\n"); /* 5 byte nop. */
+ else
+ fprintf (file, "1:\tcall\t%s\n", target);
+}
+
/* Output assembler code to FILE to increment profiler label # LABELNO
for profiling a function entry. */
void
@@ -39049,7 +39067,6 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
{
const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE
: MCOUNT_NAME);
-
if (TARGET_64BIT)
{
#ifndef NO_PROFILE_COUNTERS
@@ -39057,9 +39074,9 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
#endif
if (!TARGET_PECOFF && flag_pic)
- fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
+ fprintf (file, "1:\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
else
- fprintf (file, "\tcall\t%s\n", mcount_name);
+ x86_print_call_or_nop (file, mcount_name);
}
else if (flag_pic)
{
@@ -39067,7 +39084,7 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
LPREFIX, labelno);
#endif
- fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
+ fprintf (file, "1:\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
}
else
{
@@ -39075,7 +39092,14 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
LPREFIX, labelno);
#endif
- fprintf (file, "\tcall\t%s\n", mcount_name);
+ x86_print_call_or_nop (file, mcount_name);
+ }
+
+ if (flag_record_mcount)
+ {
+ fprintf (file, "\t.section __mcount_loc, \"r\"\n");
+ fprintf (file, "\t.%s 1b\n", TARGET_64BIT ? "quad" : "long");
+ fprintf (file, "\t.previous\n");
}
}
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 9208b76..acf6b37 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -789,6 +789,15 @@ mfentry
Target Report Var(flag_fentry) Init(-1)
Emit profiling counter call at function entry before prologue.
+mrecord-mcount
+Target Report Var(flag_record_mcount) Init(0)
+Generate __mcount_loc section with all mcount or __fentry__ calls.
+
+mnop-mcount
+Target Report Var(flag_nop_mcount) Init(0)
+Generate mcount/__fentry__ calls as nops. To activate they need to be
+patched in.
+
m8bit-idiv
Target Report Mask(USE_8BIT_IDIV) Save
Expand 32bit/64bit integer divide into 8bit unsigned integer divide with run-time check
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d15d4a9..43fd6b7 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -691,7 +691,7 @@ Objective-C and Objective-C++ Dialects}.
-momit-leaf-frame-pointer -mno-red-zone -mno-tls-direct-seg-refs @gol
-mcmodel=@var{code-model} -mabi=@var{name} -maddress-mode=@var{mode} @gol
-m32 -m64 -mx32 -m16 -mlarge-data-threshold=@var{num} @gol
--msse2avx -mfentry -m8bit-idiv @gol
+-msse2avx -mfentry -mrecord-mcount -mnop-mcount -m8bit-idiv @gol
-mavx256-split-unaligned-load -mavx256-split-unaligned-store @gol
-mstack-protector-guard=@var{guard}}
@@ -15954,6 +15954,21 @@ counter call before the prologue.
Note: On x86 architectures the attribute @code{ms_hook_prologue}
isn't possible at the moment for @option{-mfentry} and @option{-pg}.
+@item -mrecord-mcount
+@itemx -mno-record-mcount
+@opindex mrecord-mcount
+If profiling is active (@option{-pg}), generate a __mcount_loc section
+that contains pointers to each profiling call. This is useful for
+automatically patching calls in and out.
+
+@item -mnop-mcount
+@itemx -mno-nop-mcount
+@opindex mnop-mcount
+If profiling is active (@option{-pg}), generate the calls to
+the profiling functions as nops. This is useful when they
+should be patched in later dynamically. This is likely only
+useful together with @option{-mrecord-mcount}.
+
@item -m8bit-idiv
@itemx -mno-8bit-idiv
@opindex 8bit-idiv
diff --git a/gcc/testsuite/gcc.target/i386/nop-mcount.c b/gcc/testsuite/gcc.target/i386/nop-mcount.c
new file mode 100644
index 0000000..2592231
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/nop-mcount.c
@@ -0,0 +1,24 @@
+/* Test -mnop-mcount */
+/* { dg-do compile } */
+/* { dg-options "-pg -mfentry -mrecord-mcount -mnop-mcount" } */
+/* { dg-final { scan-assembler-not "__fentry__" } } */
+/* Origin: Andi Kleen */
+extern void foobar(char *);
+
+void func(void)
+{
+ foobar ("Hello world\n");
+}
+
+void func2(void)
+{
+ int i;
+ for (i = 0; i < 10; i++)
+ foobar ("Hello world");
+}
+
+void func3(a)
+char *a;
+{
+ foobar("Hello world");
+}
diff --git a/gcc/testsuite/gcc.target/i386/record-mcount.c b/gcc/testsuite/gcc.target/i386/record-mcount.c
new file mode 100644
index 0000000..dae413e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/record-mcount.c
@@ -0,0 +1,24 @@
+/* Test -mrecord-mcount */
+/* { dg-do compile } */
+/* { dg-options "-pg -mrecord-mcount" } */
+/* { dg-final { scan-assembler "mcount_loc" } } */
+/* Origin: Andi Kleen */
+extern void foobar(char *);
+
+void func(void)
+{
+ foobar ("Hello world\n");
+}
+
+void func2(void)
+{
+ int i;
+ for (i = 0; i < 10; i++)
+ foobar ("Hello world");
+}
+
+void func3(a)
+char *a;
+{
+ foobar("Hello world");
+}
--
2.1.0
* Re: [PATCH] Add direct support for Linux kernel __fentry__ patching
2014-09-02 3:25 Andi Kleen
@ 2014-09-15 16:49 ` Andi Kleen
2014-09-22 14:49 ` [PING] " Andi Kleen
0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2014-09-15 16:49 UTC (permalink / raw)
To: gcc-patches; +Cc: rostedt
Andi Kleen <andi@firstfloor.org> writes:
Ping!
--
ak@linux.intel.com -- Speaking for myself only
* [PING] [PATCH] Add direct support for Linux kernel __fentry__ patching
2014-09-15 16:49 ` Andi Kleen
@ 2014-09-22 14:49 ` Andi Kleen
0 siblings, 0 replies; 5+ messages in thread
From: Andi Kleen @ 2014-09-22 14:49 UTC (permalink / raw)
To: gcc-patches
Andi Kleen <andi@firstfloor.org> writes:
Ping!
--
ak@linux.intel.com -- Speaking for myself only