* [patch i386]: Add for win32 targets pre-prologue profiling feature
@ 2010-07-13 12:47 Kai Tietz
  2010-07-13 16:28 ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-13 12:47 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 1850 bytes --]

Hello,

This patch adds a pre-prologue profiling call for i386/x86_64 win32
targets. It additionally takes care that, when the top-profiler call is
enabled, the frame pointer is omitted where possible.

One side note about "hotfix" and profiling: the top-profiler call is
emitted in ix86_asm_output_function_label. The reason is that for ix86
the call needs to be placed before the code pattern, while for x86_64 it
can be placed after it. Otherwise the x86 pattern for the patchable
region would be corrupted, as this pattern contains the frame-register
setup. So I think that the macro PROFILE_BEFORE_PROLOGUE isn't usable
here.

2010-07-13  Kai Tietz

	* config/i386/cygming.h (PROFILE_CALL_AT_TOP): New macro.
	(MCOUNT_NAME): Define specific sub-target macro.
	* config/i386/cygming.opt: New option -fprofile-top.
	* config/i386/i386.c (ix86_function_regparm): Special handling
	for active profiling.
	(ix86_function_sseregparm): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_select_alt_pic_regnum): Likewise.
	(ix86_save_reg): Likewise.
	(ix86_expand_prologue): Likewise.
	(x86_function_profiler_intern): New internal function.
	(ix86_asm_output_function_label): Output pre-prologue profiler call.
	(x86_function_profiler): Emit profiling after prologue when no
	top-profiling is enabled.
	* config/i386/i386.h (PROFILE_CALL_AT_TOP): Define macro with a
	default of zero.
	* doc/invoke.texi (-fprofile-top): Add documentation.

Tested for i686-pc-linux-gnu, x86_64-pc-mingw32, and i686-pc-mingw32.
OK to apply?

Regards,
Kai

PS: If there is general interest in this feature for all i386 targets,
the patch can easily be adjusted for that.

-- | (\_/) This is Bunny. Copy and paste | (='.'=) Bunny into your signature to help | (")_(") him gain world domination [-- Attachment #2: profile_top.diff --] [-- Type: application/octet-stream, Size: 9336 bytes --] Index: gcc/gcc/config/i386/cygming.h =================================================================== --- gcc.orig/gcc/config/i386/cygming.h 2010-07-09 11:24:42.000000000 +0200 +++ gcc/gcc/config/i386/cygming.h 2010-07-13 12:34:57.332581900 +0200 @@ -39,6 +39,12 @@ along with GCC; see the file COPYING3. #undef DEFAULT_ABI #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI) +#undef PROFILE_CALL_AT_TOP +#define PROFILE_CALL_AT_TOP (flag_profile_top != 0) + +#undef MCOUNT_NAME +#define MCOUNT_NAME (PROFILE_CALL_AT_TOP ? "_mcount_top" : "_mcount") + #if ! defined (USE_MINGW64_LEADING_UNDERSCORES) #undef USER_LABEL_PREFIX #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_") Index: gcc/gcc/config/i386/cygming.opt =================================================================== --- gcc.orig/gcc/config/i386/cygming.opt 2009-12-11 14:56:47.000000000 +0100 +++ gcc/gcc/config/i386/cygming.opt 2010-07-13 12:26:24.374762200 +0200 @@ -53,3 +53,7 @@ Use the GNU extension to the PE format f muse-libstdc-wrappers Target Condition({defined (USE_CYGWIN_LIBSTDCXX_WRAPPERS)}) Compile code that relies on Cygwin DLL wrappers to support C++ operator new/delete replacement + +fprofile-top +Common Report Var(flag_profile_top) Init(0) +Emit for profiling code the profiler-callback call before prologue. 
Index: gcc/gcc/config/i386/i386.c =================================================================== --- gcc.orig/gcc/config/i386/i386.c 2010-07-13 12:03:39.000000000 +0200 +++ gcc/gcc/config/i386/i386.c 2010-07-13 14:25:24.527182900 +0200 @@ -4841,7 +4841,7 @@ ix86_function_regparm (const_tree type, if (decl && TREE_CODE (decl) == FUNCTION_DECL && optimize - && !profile_flag) + && !(profile_flag && !PROFILE_CALL_AT_TOP)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl)); @@ -4913,7 +4913,8 @@ ix86_function_sseregparm (const_tree typ /* For local functions, pass up to SSE_REGPARM_MAX SFmode (and DFmode for SSE2) arguments in SSE registers. */ - if (decl && TARGET_SSE_MATH && optimize && !profile_flag) + if (decl && TARGET_SSE_MATH && optimize + && !(profile_flag && !PROFILE_CALL_AT_TOP)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl)); @@ -5132,7 +5133,43 @@ ix86_cfun_abi (void) return cfun->machine->call_abi; } -/* Write the extra assembler code needed to declare a function properly. */ +/* Output assembler code to FILE to increment profiler label # LABELNO + for profiling a function entry. 
*/ +static void +x86_function_profiler_intern (FILE *file, int labelno ATTRIBUTE_UNUSED) +{ + if (TARGET_64BIT) + { +#ifndef NO_PROFILE_COUNTERS + fprintf (file, "\tleaq\t%sP%d(%%rip),%%r11\n", LPREFIX, labelno); +#endif + + if (DEFAULT_ABI == SYSV_ABI && flag_pic) + fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", MCOUNT_NAME); + else + fprintf (file, "\tcall\t%s\n", MCOUNT_NAME); + } + else if (flag_pic) + { +#ifndef NO_PROFILE_COUNTERS + fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n", + LPREFIX, labelno); +#endif + fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", MCOUNT_NAME); + } + else + { +#ifndef NO_PROFILE_COUNTERS + fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n", + LPREFIX, labelno); +#endif + fprintf (file, "\tcall\t%s\n", MCOUNT_NAME); + } +} + +/* Write the extra assembler code needed to declare a function properly. + Output call to profiler if profiling is enabled and it should be emitted + before prologue, */ void ix86_asm_output_function_label (FILE *asm_out_file, const char *fname, @@ -5151,6 +5188,11 @@ ix86_asm_output_function_label (FILE *as ASM_OUTPUT_LABEL (asm_out_file, fname); + /* We output profiling call before hotfix region, caused by the fact + that we would otherwise destroy for x86 the magic sequence. */ + if (!TARGET_64BIT && PROFILE_CALL_AT_TOP && profile_flag) + x86_function_profiler_intern (asm_out_file, 0); + /* Output magic byte marker, if hot-patch attribute is set. For x86 case frame-pointer prologue will be emitted in expand_prologue. */ @@ -5164,6 +5206,9 @@ ix86_asm_output_function_label (FILE *as /* movl.s %edi, %edi. */ asm_fprintf (asm_out_file, ASM_BYTE "0x8b, 0xff\n"); } + /* We output profiling call after hotfix region for x86_64. 
*/ + if (TARGET_64BIT && PROFILE_CALL_AT_TOP && profile_flag) + x86_function_profiler_intern (asm_out_file, 0); } /* regclass.c */ @@ -7875,7 +7920,7 @@ ix86_frame_pointer_required (void) || ix86_current_function_calls_tls_descriptor)) return true; - if (crtl->profile) + if (crtl->profile && !PROFILE_CALL_AT_TOP) return true; return false; @@ -8143,7 +8188,7 @@ gen_push (rtx arg) static unsigned int ix86_select_alt_pic_regnum (void) { - if (current_function_is_leaf && !crtl->profile + if (current_function_is_leaf && !(crtl->profile && !PROFILE_CALL_AT_TOP) && !ix86_current_function_calls_tls_descriptor) { int i, drap; @@ -8167,7 +8212,7 @@ ix86_save_reg (unsigned int regno, int m if (pic_offset_table_rtx && regno == REAL_PIC_OFFSET_TABLE_REGNUM && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM) - || crtl->profile + || (crtl->profile && !PROFILE_CALL_AT_TOP) || crtl->calls_eh_return || crtl->uses_const_pool)) { @@ -9443,7 +9488,7 @@ ix86_expand_prologue (void) pic_reg_used = false; if (pic_offset_table_rtx && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM) - || crtl->profile)) + || (crtl->profile && !PROFILE_CALL_AT_TOP))) { unsigned int alt_pic_reg_used = ix86_select_alt_pic_regnum (); @@ -9480,7 +9525,7 @@ ix86_expand_prologue (void) when mcount needs it. Blockage to avoid call movement across mcount call is emitted in generic code after the NOTE_INSN_PROLOGUE_END note. */ - if (crtl->profile && pic_reg_used) + if (crtl->profile && !PROFILE_CALL_AT_TOP && pic_reg_used) emit_insn (gen_prologue_use (pic_offset_table_rtx)); if (crtl->drap_reg && !crtl->stack_realign_needed) @@ -27283,37 +27328,13 @@ x86_field_alignment (tree field, int com } /* Output assembler code to FILE to increment profiler label # LABELNO - for profiling a function entry. */ + for profiling a function entry, if profiling call should be emitted + after prologue. 
*/ void -x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED) +x86_function_profiler (FILE *file ATTRIBUTE_UNUSED, int labelno ATTRIBUTE_UNUSED) { - if (TARGET_64BIT) - { -#ifndef NO_PROFILE_COUNTERS - fprintf (file, "\tleaq\t%sP%d(%%rip),%%r11\n", LPREFIX, labelno); -#endif - - if (DEFAULT_ABI == SYSV_ABI && flag_pic) - fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file); - else - fputs ("\tcall\t" MCOUNT_NAME "\n", file); - } - else if (flag_pic) - { -#ifndef NO_PROFILE_COUNTERS - fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n", - LPREFIX, labelno); -#endif - fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file); - } - else - { -#ifndef NO_PROFILE_COUNTERS - fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n", - LPREFIX, labelno); -#endif - fputs ("\tcall\t" MCOUNT_NAME "\n", file); - } + if (!PROFILE_CALL_AT_TOP) + x86_function_profiler_intern (file, labelno); } #ifdef ASM_OUTPUT_MAX_SKIP_PAD Index: gcc/gcc/config/i386/i386.h =================================================================== --- gcc.orig/gcc/config/i386/i386.h 2010-07-09 11:24:42.000000000 +0200 +++ gcc/gcc/config/i386/i386.h 2010-07-13 12:33:52.273634200 +0200 @@ -1601,6 +1601,8 @@ typedef struct ix86_args { #define MCOUNT_NAME "_mcount" +#define PROFILE_CALL_AT_TOP 0 + #define PROFILE_COUNT_REGISTER "edx" /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function, Index: gcc/gcc/doc/invoke.texi =================================================================== --- gcc.orig/gcc/doc/invoke.texi 2010-07-09 11:24:43.000000000 +0200 +++ gcc/gcc/doc/invoke.texi 2010-07-13 12:49:46.418596000 +0200 @@ -884,7 +884,7 @@ See i386 and x86-64 Options. @emph{i386 and x86-64 Windows Options} @gccoptlist{-mconsole -mcygwin -mno-cygwin -mdll -mnop-fun-dllimport -mthread -municode -mwin32 -mwindows --fno-set-stack-executable} +-fno-set-stack-executable -fprofile-top} @emph{Xstormy16 Options} @gccoptlist{-msim} @@ -17108,6 +17108,15 @@ set. 
This is necessary for binaries runn Windows, as there the user32 API, which is used to set executable privileges, isn't available. +@item -fprofile-top +@opindex fprofile-top +This option is available for Cygwin and MinGW targets. It +specifies that for profiling the call to profiler should be +done before prologue. Default behavior is that profiler-call +is done after prologue is established. When active it calls +the @code{_mcount_top} function, otherwise the @code{_mcount} +function. + @item -mpe-aligned-commons @opindex mpe-aligned-commons This option is available for Cygwin and MinGW targets. It ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-13 12:47 [patch i386]: Add for win32 targets pre-prologue profiling feature Kai Tietz
@ 2010-07-13 16:28 ` Richard Henderson
  2010-07-14 10:20   ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2010-07-13 16:28 UTC (permalink / raw)
  To: Kai Tietz; +Cc: GCC Patches

On 07/13/2010 05:47 AM, Kai Tietz wrote:
> Hello,
>
> This patch adds for i386/x86_64 win32 targets the feature of
> pre-prologue profiling call. It additional takes care that for enabled
> top-profiler call, the frame-pointer gets omitted, if possible.
> One side-note here about "hotfix" and profiling. The top-profiler call
> gets emitted in the ix86_asm_output_function_label. This is caused by
> the fact, that for ix86 the call needs to be placed before the
> code-pattern, and for x86_64 it can be placed after. Otherwise the x86
> pattern for patchable region would be corrupted, as this pattern
> contains frame-register setup. So I think that the use of the macro
> PROFILE_BEFORE_PROLOGUE isn't usable here.

Huh?  This is exactly the opposite of what we discussed yesterday on IRC.

For hotfix we have

	.rept 16
	.byte 0xcc
	.endr
function:
	mov.s edi, edi
	push ebp
	mov.s esp, ebp

If *any* of the above is not exactly so, then the runtime pattern match
fails and the hotfix fails.  If we were to write the profiler first,
then we might as well not bother with the hotfix pieces, because they
will never match.

... of course it doesn't help that we emit the last two insns above
within the prologue, so if we simply place the profile before the
prologue, we'll *still* be splitting the hotfix sequence for 32-bit.

I think the best thing to do is to diagnose hotfix+profile and generate
an error.  I don't think there's anything reasonable we can do.

In the end I don't think there's anything your AT_TOP macro does that
PROFILE_BEFORE_PROLOGUE doesn't do just as well.

r~ ^ permalink raw reply [flat|nested] 27+ messages in thread
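
The runtime pattern match described in the message above can be sketched in C. This is only an illustration under stated assumptions, not the code Windows or GCC actually uses: the function name `hotfix_pattern_matches`, the flat byte-buffer layout, and the offset argument are hypothetical. The byte values are the encodings of the three instructions listed (`8b ff` for `mov.s edi, edi`, `55` for `push ebp`, `8b ec` for `mov.s esp, ebp`), preceded by the 16 bytes of `0xcc` padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Padding length taken from the ".rept 16 / .byte 0xcc" in the message. */
enum { HOTFIX_PAD_LEN = 16 };

/* mov.s %edi,%edi ; push %ebp ; mov.s %esp,%ebp */
static const unsigned char hotfix_entry[] = { 0x8b, 0xff, 0x55, 0x8b, 0xec };

/* Hypothetical checker: BUF holds LEN bytes of code and ENTRY is the
   offset of the function label inside BUF.  Returns 1 only if the
   padding and the entry sequence are both exactly as expected.  */
int
hotfix_pattern_matches (const unsigned char *buf, size_t len, size_t entry)
{
  size_t i;

  assert (buf != NULL);

  /* The padding must fit before the label, the entry bytes after it.  */
  if (entry < HOTFIX_PAD_LEN || entry > len
      || len - entry < sizeof hotfix_entry)
    return 0;

  for (i = entry - HOTFIX_PAD_LEN; i < entry; i++)
    if (buf[i] != 0xcc)
      return 0;

  return memcmp (buf + entry, hotfix_entry, sizeof hotfix_entry) == 0;
}
```

With this sketch, a function whose first bytes after the label are a `call` to the profiler (opcode `e8` followed by a displacement) fails the check, which is the situation the message warns about.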
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-13 16:28 ` Richard Henderson
@ 2010-07-14 10:20   ` Kai Tietz
  2010-07-14 11:49     ` Dave Korn
  2010-07-14 12:16     ` Andi Kleen
  0 siblings, 2 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-14 10:20 UTC (permalink / raw)
  To: Richard Henderson; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 3516 bytes --]

Hello,

2010/7/13 Richard Henderson <rth@redhat.com>:
> On 07/13/2010 05:47 AM, Kai Tietz wrote:
>> Hello,
>>
>> This patch adds for i386/x86_64 win32 targets the feature of
>> pre-prologue profiling call. It additional takes care that for enabled
>> top-profiler call, the frame-pointer gets omitted, if possible.
>> One side-note here about "hotfix" and profiling. The top-profiler call
>> gets emitted in the ix86_asm_output_function_label. This is caused by
>> the fact, that for ix86 the call needs to be placed before the
>> code-pattern, and for x86_64 it can be placed after. Otherwise the x86
>> pattern for patchable region would be corrupted, as this pattern
>> contains frame-register setup. So I think that the use of the macro
>> PROFILE_BEFORE_PROLOGUE isn't usable here.
>
> Huh?  This is exactly the opposite of what we discussed yesterday on IRC.
>
> For hotfix we have
>
>	.rept 16
>	.byte 0xcc
>	.endr
> function:
>	mov.s edi, edi
>	push ebp
>	mov.s esp, ebp
>
> If *any* of the above is not exactly so, then the runtime pattern match
> fails and the hotfix fails.  If we were to write the profiler first,
> then we might as well not bother with the hotfix pieces, because they
> will never match.
>
> ... of course it doesn't help that we emit the last two insns above
> within the prologue, so if we simply place the profile before the
> prologue, we'll *still* be splitting the hotfix sequence for 32-bit.
>
> I think the best thing to do is to diagnose hotfix+profile and generate
> an error.  I don't think there's anything reasonable we can do.
>
> In the end I don't think there's anything your AT_TOP macro does that
> PROFILE_BEFORE_PROLOGUE doesn't do just as well.
>
>
> r~
>

This patch implements it via a new hook, TARGET_PROFILE_BEFORE_PROLOGUE.
For now the feature is active only for win32 i386 targets and is
controlled by the internal target macro PROFILE_SUPPORT_BEFORE_PROLOGUE.

2010-07-14  Kai Tietz

	* config/i386/cygming.h (PROFILE_SUPPORT_BEFORE_PROLOGUE): New.
	(MCOUNT_NAME): Win32 specific version.
	* config/i386/cygming.opt (mprofile-top): New option.
	* config/i386/i386.c (ix86_profile_before_prologue): New hook.
	(ix86_function_regparm): Handle profiling before prologue case.
	(ix86_function_sseregparm): Likewise.
	(ix86_cfun_abi): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_select_alt_pic_regnum): Likewise.
	(ix86_save_reg): Likewise.
	(ix86_expand_prologue): Likewise.  Additionally emit a sorry ()
	for 32-bit "hotfix" combined with profiling before the prologue.
	(x86_function_profiler): Use fprintf instead of fputs for
	assembly output.
	(TARGET_PROFILE_BEFORE_PROLOGUE): Define target hook.
	* doc/invoke.texi (mprofile-top): Document option.
	* doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): Add documentation.
	* doc/tm.texi: Regenerated.
	* final.c (final_start_function): Replace PROFILE_BEFORE_PROLOGUE
	guard by target hook.
	(profile_after_prologue): Likewise.
	* function.c (thread_prologue_and_epilogue_insns): Likewise.
	* target.def (profile_before_prologue): New hook.
	* targhooks.c (default_profile_before_prologue): New.
	* targhooks.h (default_profile_before_prologue): New.

Tested for i686-pc-linux-gnu, i686-pc-mingw32, and x86_64-pc-mingw32.
OK to apply?

Regards,
Kai

-- | (\_/) This is Bunny.
Copy and paste | (='.'=) Bunny into your signature to help | (")_(") him gain world domination [-- Attachment #2: profile.top.diff --] [-- Type: application/octet-stream, Size: 13878 bytes --] Index: gcc/gcc/config/i386/cygming.h =================================================================== --- gcc.orig/gcc/config/i386/cygming.h 2010-07-13 17:00:51.000000000 +0200 +++ gcc/gcc/config/i386/cygming.h 2010-07-14 10:15:25.467867800 +0200 @@ -39,6 +39,14 @@ along with GCC; see the file COPYING3. #undef DEFAULT_ABI #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI) +#undef PROFILE_SUPPORT_BEFORE_PROLOGUE +#define PROFILE_SUPPORT_BEFORE_PROLOGUE flag_profile_top + +/* Choose the correct profiler mcount name. For checking we are using the + ix86_profile_before_prologue function as flag_profile_top is tri-state. */ +#undef MCOUNT_NAME +#define MCOUNT_NAME (ix86_profile_before_prologue () ? "_mcount_top" : "_mcount") + #if ! defined (USE_MINGW64_LEADING_UNDERSCORES) #undef USER_LABEL_PREFIX #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_") Index: gcc/gcc/config/i386/cygming.opt =================================================================== --- gcc.orig/gcc/config/i386/cygming.opt 2010-07-13 17:00:51.000000000 +0200 +++ gcc/gcc/config/i386/cygming.opt 2010-07-14 09:49:33.869102000 +0200 @@ -53,3 +53,7 @@ Use the GNU extension to the PE format f muse-libstdc-wrappers Target Condition({defined (USE_CYGWIN_LIBSTDCXX_WRAPPERS)}) Compile code that relies on Cygwin DLL wrappers to support C++ operator new/delete replacement + +mprofile-top +Target Report Var(flag_profile_top) Init(-1) +Emit profiling code before prologue. 
Index: gcc/gcc/config/i386/i386.c =================================================================== --- gcc.orig/gcc/config/i386/i386.c 2010-07-13 17:00:51.000000000 +0200 +++ gcc/gcc/config/i386/i386.c 2010-07-14 10:05:54.966610900 +0200 @@ -4770,6 +4770,30 @@ ix86_handle_cconv_attribute (tree *node, return NULL_TREE; } +/* Return true, if profiling code should be emitted before + prologue. Otherwise it returns false. + Note: For x86 with "hotfix" it is sorried. */ +static bool +ix86_profile_before_prologue (void) +{ +#ifdef PROFILE_SUPPORT_BEFORE_PROLOGUE + static int flag_value = -1; + if (flag_value == -1) + { + flag_value = PROFILE_SUPPORT_BEFORE_PROLOGUE; + if (flag_value == -1) + { + /* Set it to default 0. We need this tri-state for later + checking of compatiblity and target preferences. */ + flag_value = 0; + } + } + return flag_value != 0; +#else + return false; +#endif +} + /* Return 0 if the attributes for two types are incompatible, 1 if they are compatible, and 2 if they are nearly compatible (which causes a warning to be generated). */ @@ -4841,7 +4865,7 @@ ix86_function_regparm (const_tree type, if (decl && TREE_CODE (decl) == FUNCTION_DECL && optimize - && !profile_flag) + && !(profile_flag && !ix86_profile_before_prologue ())) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl)); @@ -4913,7 +4937,8 @@ ix86_function_sseregparm (const_tree typ /* For local functions, pass up to SSE_REGPARM_MAX SFmode (and DFmode for SSE2) arguments in SSE registers. */ - if (decl && TARGET_SSE_MATH && optimize && !profile_flag) + if (decl && TARGET_SSE_MATH && optimize + && !(profile_flag && !ix86_profile_before_prologue ())) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. 
*/ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl)); @@ -5132,7 +5157,9 @@ ix86_cfun_abi (void) return cfun->machine->call_abi; } -/* Write the extra assembler code needed to declare a function properly. */ +/* Write the extra assembler code needed to declare a function properly. + Output call to profiler if profiling is enabled and it should be emitted + before prologue, */ void ix86_asm_output_function_label (FILE *asm_out_file, const char *fname, @@ -7875,7 +7902,7 @@ ix86_frame_pointer_required (void) || ix86_current_function_calls_tls_descriptor)) return true; - if (crtl->profile) + if (crtl->profile && !ix86_profile_before_prologue ()) return true; return false; @@ -8143,7 +8170,8 @@ gen_push (rtx arg) static unsigned int ix86_select_alt_pic_regnum (void) { - if (current_function_is_leaf && !crtl->profile + if (current_function_is_leaf + && !(crtl->profile && !ix86_profile_before_prologue ()) && !ix86_current_function_calls_tls_descriptor) { int i, drap; @@ -8167,7 +8195,7 @@ ix86_save_reg (unsigned int regno, int m if (pic_offset_table_rtx && regno == REAL_PIC_OFFSET_TABLE_REGNUM && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM) - || crtl->profile + || (crtl->profile && !ix86_profile_before_prologue ()) || crtl->calls_eh_return || crtl->uses_const_pool)) { @@ -9191,6 +9219,11 @@ ix86_expand_prologue (void) { rtx push, mov; + /* Check if profiling is active and we shall use profiling before + prologue variant. If so sorry. 
*/ + if (crtl->profile && ix86_profile_before_prologue () != 0) + sorry ("ms_hook_prologue attribute isn't compatible with -mprofile-top for 32-bit"); + /* Make sure the function starts with 8b ff movl.s %edi,%edi (emited by ix86_asm_output_function_label) 55 push %ebp @@ -9443,7 +9476,7 @@ ix86_expand_prologue (void) pic_reg_used = false; if (pic_offset_table_rtx && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM) - || crtl->profile)) + || (crtl->profile && !ix86_profile_before_prologue ()))) { unsigned int alt_pic_reg_used = ix86_select_alt_pic_regnum (); @@ -9480,7 +9513,7 @@ ix86_expand_prologue (void) when mcount needs it. Blockage to avoid call movement across mcount call is emitted in generic code after the NOTE_INSN_PROLOGUE_END note. */ - if (crtl->profile && pic_reg_used) + if (crtl->profile && !ix86_profile_before_prologue () && pic_reg_used) emit_insn (gen_prologue_use (pic_offset_table_rtx)); if (crtl->drap_reg && !crtl->stack_realign_needed) @@ -27285,7 +27318,7 @@ x86_field_alignment (tree field, int com /* Output assembler code to FILE to increment profiler label # LABELNO for profiling a function entry. 
*/ void -x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED) +x86_function_profiler (FILE *file ATTRIBUTE_UNUSED, int labelno ATTRIBUTE_UNUSED) { if (TARGET_64BIT) { @@ -27294,9 +27327,9 @@ x86_function_profiler (FILE *file, int l #endif if (DEFAULT_ABI == SYSV_ABI && flag_pic) - fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file); + fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", MCOUNT_NAME); else - fputs ("\tcall\t" MCOUNT_NAME "\n", file); + fprintf (file, "\tcall\t%s\n", MCOUNT_NAME); } else if (flag_pic) { @@ -27304,7 +27337,7 @@ x86_function_profiler (FILE *file, int l fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n", LPREFIX, labelno); #endif - fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file); + fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", MCOUNT_NAME); } else { @@ -27312,7 +27345,7 @@ x86_function_profiler (FILE *file, int l fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n", LPREFIX, labelno); #endif - fputs ("\tcall\t" MCOUNT_NAME "\n", file); + fprintf (file, "\tcall\t%s\n", MCOUNT_NAME); } } @@ -31360,6 +31393,9 @@ ix86_enum_va_list (int idx, const char * #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD #endif +#undef TARGET_PROFILE_BEFORE_PROLOGUE +#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue + #undef TARGET_ASM_UNALIGNED_HI_OP #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP #undef TARGET_ASM_UNALIGNED_SI_OP Index: gcc/gcc/doc/invoke.texi =================================================================== --- gcc.orig/gcc/doc/invoke.texi 2010-07-13 17:00:51.000000000 +0200 +++ gcc/gcc/doc/invoke.texi 2010-07-14 10:10:11.655367800 +0200 @@ -884,7 +884,7 @@ See i386 and x86-64 Options. 
@emph{i386 and x86-64 Windows Options} @gccoptlist{-mconsole -mcygwin -mno-cygwin -mdll -mnop-fun-dllimport -mthread -municode -mwin32 -mwindows --fno-set-stack-executable} +-mprofile-top -fno-set-stack-executable} @emph{Xstormy16 Options} @gccoptlist{-msim} @@ -17100,6 +17100,13 @@ specifies that a GUI application is to b instructing the linker to set the PE header subsystem type appropriately. +@item -mprofile-top +@opindex mprofile-top +This option is available for Cygwin and MinGW targets. It +specifies that for profiling the call to profiler should be +done before prologue. Default behavior is that profiler-call +is done after the prologue is established. + @item -fno-set-stack-executable @opindex fno-set-stack-executable This option is available for MinGW targets. It specifies that Index: gcc/gcc/doc/tm.texi.in =================================================================== --- gcc.orig/gcc/doc/tm.texi.in 2010-07-13 12:03:30.000000000 +0200 +++ gcc/gcc/doc/tm.texi.in 2010-07-14 09:47:06.485920000 +0200 @@ -7101,6 +7101,14 @@ Contains the value true if the target pl ``small data'' into a separate section. The default value is false. @end deftypevr +@hook TARGET_PROFILE_BEFORE_PROLOGUE +It returns true if target wants profile code emitted before +prologue. + +The default version of this hook use the target macro +@code{PROFILE_BEFORE_PROLOGUE}. +@end deftypefn + @hook TARGET_BINDS_LOCAL_P Returns true if @var{exp} names an object for which name resolution rules must resolve to the current ``module'' (dynamic shared library Index: gcc/gcc/final.c =================================================================== --- gcc.orig/gcc/final.c 2010-07-09 11:24:47.000000000 +0200 +++ gcc/gcc/final.c 2010-07-14 09:47:06.376496000 +0200 @@ -1546,10 +1546,8 @@ final_start_function (rtx first ATTRIBUT /* The Sun386i and perhaps other machines don't work right if the profiling code comes after the prologue. 
*/ -#ifdef PROFILE_BEFORE_PROLOGUE - if (crtl->profile) + if (targetm.profile_before_prologue () && crtl->profile) profile_function (file); -#endif /* PROFILE_BEFORE_PROLOGUE */ #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue) if (dwarf2out_do_frame ()) @@ -1591,10 +1589,8 @@ final_start_function (rtx first ATTRIBUT static void profile_after_prologue (FILE *file ATTRIBUTE_UNUSED) { -#ifndef PROFILE_BEFORE_PROLOGUE - if (crtl->profile) + if (!targetm.profile_before_prologue () && crtl->profile) profile_function (file); -#endif /* not PROFILE_BEFORE_PROLOGUE */ } static void Index: gcc/gcc/function.c =================================================================== --- gcc.orig/gcc/function.c 2010-07-06 13:15:38.000000000 +0200 +++ gcc/gcc/function.c 2010-07-14 09:47:06.392128000 +0200 @@ -5100,13 +5100,11 @@ thread_prologue_and_epilogue_insns (void record_insns (seq, NULL, &prologue_insn_hash); emit_note (NOTE_INSN_PROLOGUE_END); -#ifndef PROFILE_BEFORE_PROLOGUE /* Ensure that instructions are not moved into the prologue when profiling is on. The call to the profiling routine can be emitted within the live range of a call-clobbered register. */ - if (crtl->profile) + if (!targetm.profile_before_prologue () && crtl->profile) emit_insn (gen_blockage ()); -#endif seq = get_insns (); end_sequence (); Index: gcc/gcc/target.def =================================================================== --- gcc.orig/gcc/target.def 2010-07-09 11:24:47.000000000 +0200 +++ gcc/gcc/target.def 2010-07-14 09:47:06.392128000 +0200 @@ -1218,6 +1218,13 @@ DEFHOOK bool, (const_tree exp), default_binds_local_p) +/* Check if profiling code is before or after prologue. */ +DEFHOOK +(profile_before_prologue, + "", + bool, (void), + default_profile_before_prologue) + /* Modify and return the identifier of a DECL's external name, originally identified by ID, as required by the target, (eg, append @nn to windows32 stdcall function names). 
Index: gcc/gcc/targhooks.c =================================================================== --- gcc.orig/gcc/targhooks.c 2010-07-09 11:24:47.000000000 +0200 +++ gcc/gcc/targhooks.c 2010-07-14 09:47:06.392128000 +0200 @@ -1197,4 +1197,14 @@ default_register_move_cost (enum machine #endif } +bool +default_profile_before_prologue (void) +{ +#ifndef PROFILE_BEFORE_PROLOGUE + return false; +#else + return true; +#endif +} + #include "gt-targhooks.h" Index: gcc/gcc/targhooks.h =================================================================== --- gcc.orig/gcc/targhooks.h 2010-07-09 11:24:47.000000000 +0200 +++ gcc/gcc/targhooks.h 2010-07-14 09:47:06.392128000 +0200 @@ -150,3 +150,4 @@ extern int default_memory_move_cost (enu extern int default_register_move_cost (enum machine_mode, reg_class_t, reg_class_t); +extern bool default_profile_before_prologue (void); Index: gcc/gcc/doc/tm.texi =================================================================== --- gcc.orig/gcc/doc/tm.texi 2010-07-13 12:03:30.000000000 +0200 +++ gcc/gcc/doc/tm.texi 2010-07-14 10:34:44.219257400 +0200 @@ -7101,6 +7101,14 @@ Contains the value true if the target pl ``small data'' into a separate section. The default value is false. @end deftypevr +@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void) +It returns true if target wants profile code emitted before +prologue. + +The default version of this hook use the target macro +@code{PROFILE_BEFORE_PROLOGUE}. +@end deftypefn + @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp}) Returns true if @var{exp} names an object for which name resolution rules must resolve to the current ``module'' (dynamic shared library ^ permalink raw reply [flat|nested] 27+ messages in thread
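
The effect of replacing the `PROFILE_BEFORE_PROLOGUE` preprocessor guards in final.c with a run-time hook can be sketched as follows. This is a simplified model, not GCC code: `struct target_hooks`, `hook_default`, `hook_before`, and `layout_function_entry` are hypothetical names, and the profiler call and prologue are reduced to the letters 'M' and 'P'. The point it illustrates is that the two guarded call sites are complementary, so the profiler call is emitted exactly once, either before or after the prologue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the targetm hook table.  */
struct target_hooks
{
  bool (*profile_before_prologue) (void);
};

/* Models the default hook when PROFILE_BEFORE_PROLOGUE is not defined.  */
bool hook_default (void) { return false; }

/* Models a target that asks for the call before the prologue.  */
bool hook_before (void) { return true; }

/* Writes the entry layout into OUT as a string: 'M' is the profiler
   call, 'P' is the prologue.  OUT must hold at least 4 chars.  */
void
layout_function_entry (const struct target_hooks *t, bool profiling,
                       char out[4])
{
  size_t n = 0;
  bool before;

  assert (t != NULL && t->profile_before_prologue != NULL);
  before = t->profile_before_prologue ();

  /* Corresponds to the guard in final_start_function.  */
  if (before && profiling)
    out[n++] = 'M';

  out[n++] = 'P';

  /* Corresponds to the guard in profile_after_prologue.  */
  if (!before && profiling)
    out[n++] = 'M';

  out[n] = '\0';
}
```

Because the decision is now a function call rather than an `#ifdef`, a target can make it depend on a command-line option, which is what the i386 hook in the patch does.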
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-14 10:20   ` Kai Tietz
@ 2010-07-14 11:49     ` Dave Korn
  2010-07-14 12:11       ` Kai Tietz
  2010-07-14 12:16     ` Andi Kleen
  1 sibling, 1 reply; 27+ messages in thread
From: Dave Korn @ 2010-07-14 11:49 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Richard Henderson, GCC Patches

On 14/07/2010 11:20, Kai Tietz wrote:

>	* config/i386/cygming.h (PROFILE_SUPPORT_BEFORE_PROLOGUE): New.
>	(MCOUNT_NAME): Win32 specific version.

> +
> +/* Choose the correct profiler mcount name.  For checking we are using the
> +   ix86_profile_before_prologue function as flag_profile_top is tri-state.  */
> +#undef MCOUNT_NAME
> +#define MCOUNT_NAME (ix86_profile_before_prologue () ? "_mcount_top" : "_mcount")
> +
>  #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
>  #undef USER_LABEL_PREFIX
>  #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")

  Shouldn't MCOUNT_NAME take USER_LABEL_PREFIX into account?

    cheers,
      DaveK

^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-14 11:49     ` Dave Korn
@ 2010-07-14 12:11       ` Kai Tietz
  0 siblings, 0 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-14 12:11 UTC (permalink / raw)
  To: Dave Korn; +Cc: Richard Henderson, GCC Patches

2010/7/14 Dave Korn <dave.korn.cygwin@gmail.com>:
> On 14/07/2010 11:20, Kai Tietz wrote:
>
>>	* config/i386/cygming.h (PROFILE_SUPPORT_BEFORE_PROLOGUE): New.
>>	(MCOUNT_NAME): Win32 specific version.
>
>> +
>> +/* Choose the correct profiler mcount name.  For checking we are using the
>> +   ix86_profile_before_prologue function as flag_profile_top is tri-state.  */
>> +#undef MCOUNT_NAME
>> +#define MCOUNT_NAME (ix86_profile_before_prologue () ? "_mcount_top" : "_mcount")
>> +
>>  #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
>>  #undef USER_LABEL_PREFIX
>>  #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
>
>
>  Shouldn't MCOUNT_NAME take USER_LABEL_PREFIX into account?
>
>    cheers,
>      DaveK
>
>

Well, as this MCOUNT_NAME ("_mcount") is already widely used for 64-bit,
I don't want to change anything here. But in general you are right.

Cheers,
Kai

--
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

^ permalink raw reply	[flat|nested] 27+ messages in thread
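
Dave's suggestion could look roughly like the sketch below. It is an illustration only and not part of either patch: the helper `profiler_symbol` and the unprefixed base names `mcount` and `mcount_top` are assumptions made here, whereas the patch hard-codes `"_mcount"` and `"_mcount_top"`. The prefix argument stands in for `USER_LABEL_PREFIX`, which in cygming.h is `""` for 64-bit and `"_"` for 32-bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: builds the assembler-level profiler symbol from
   a label prefix and an unprefixed base name.  Returns false if BUF
   (of size N) is too small for the result plus the terminating NUL.  */
bool
profiler_symbol (char *buf, size_t n, const char *prefix,
                 bool before_prologue)
{
  /* Assumed base names; the patch itself uses fixed "_mcount" names.  */
  const char *base = before_prologue ? "mcount_top" : "mcount";
  size_t plen, blen;

  assert (buf != NULL && prefix != NULL);
  plen = strlen (prefix);
  blen = strlen (base);

  if (plen + blen + 1 > n)
    return false;

  memcpy (buf, prefix, plen);
  memcpy (buf + plen, base, blen + 1);  /* Copies the NUL as well.  */
  return true;
}
```

As Kai notes in his reply, switching to such a scheme would change the symbol already in use on 64-bit, so it is a compatibility question rather than a purely cosmetic one.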
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature 2010-07-14 10:20 ` Kai Tietz 2010-07-14 11:49 ` Dave Korn @ 2010-07-14 12:16 ` Andi Kleen 2010-07-14 12:38 ` Kai Tietz 1 sibling, 1 reply; 27+ messages in thread From: Andi Kleen @ 2010-07-14 12:16 UTC (permalink / raw) To: Kai Tietz; +Cc: Richard Henderson, GCC Patches Kai Tietz <ktietz70@googlemail.com> writes: > > This patch implements it by new hook TARGET_PROFILE_BEFORE_PROLOGUE. > This feature is for now just active for win32 i386 targets and is > controlled by internal target macro PROFILE_SUPPORT_BEFORE_PROLOGUE. IMHO the infrastructure in my old patch for this for Linux was a little cleaner. Unfortunately that patch is still not applied. http://thread.gmane.org/gmane.comp.gcc.patches/197870 But if you're going to resubmit this it would be good to merge the two at least. -Andi -- ak@linux.intel.com -- Speaking for myself only. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature 2010-07-14 12:16 ` Andi Kleen @ 2010-07-14 12:38 ` Kai Tietz 2010-07-15 18:08 ` Kai Tietz 0 siblings, 1 reply; 27+ messages in thread From: Kai Tietz @ 2010-07-14 12:38 UTC (permalink / raw) To: Andi Kleen; +Cc: Richard Henderson, GCC Patches Hello Andi, 2010/7/14 Andi Kleen <andi@firstfloor.org>: > Kai Tietz <ktietz70@googlemail.com> writes: >> >> This patch implements it by new hook TARGET_PROFILE_BEFORE_PROLOGUE. >> This feature is for now just active for win32 i386 targets and is >> controlled by internal target macro PROFILE_SUPPORT_BEFORE_PROLOGUE. > > IMHO the infrastructure in my old patch for this for Linux was a little > cleaner. Unfortunately that patch is still not applied. > > http://thread.gmane.org/gmane.comp.gcc.patches/197870 > > But if you're going to resubmit this it would be good to merge the two > at least. > > -Andi > > -- > ak@linux.intel.com -- Speaking for myself only. > Moving the -mprologue-top option (the name is a bit unimaginative) into i386.opt was my initial idea. The downside is that all i386 targets would then have this option, and proper MCOUNT_NAME entries would have to be made up for all of them, although those targets do not support a before-prologue profile call so far. I spoke with rth about this and he pointed out that this is possibly something to be avoided. As for your implementation, I see one big disadvantage in comparison to mine. For x64 with SEH unwind information (I am working on that patch and am preparing things here), a different frame layout is needed for the frame-pointer prologue/epilogue, which enforces the use of before-prologue profiling. So a target function hook instead of a target variable is necessary, as otherwise the default setting for specific modes gets problematic. E.g. for the x64 frame layout the option needs to be implicitly enabled.
But of course I'll take a look at your patch and see whether I am able to merge things into a new version of it after the first review is done. Cheers, Kai ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature 2010-07-14 12:38 ` Kai Tietz @ 2010-07-15 18:08 ` Kai Tietz 2010-07-16 17:06 ` Richard Henderson ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: Kai Tietz @ 2010-07-15 18:08 UTC (permalink / raw) To: Richard Henderson; +Cc: Andi Kleen, GCC Patches [-- Attachment #1: Type: text/plain, Size: 1849 bytes --] Hello Andi, I updated my patch so that it should be trivial to add the counter function for before-prologue profiling to the linux target with a one-liner. Just make sure that the macro MCOUNT_NAME_BEFORE_PROLOGUE is defined for the i386 target. I reworked the patch so that the option is now named -mfentry and is available for all i386 targets that define the counter function's name via MCOUNT_NAME_BEFORE_PROLOGUE. Additionally I added some option checks for targets which don't support before-prologue profiling. ChangeLog 2010-07-15 Kai Tietz * config/i386/cygming.h (MCOUNT_NAME): New. (MCOUNT_NAME_BEFORE_PROLOGUE): New. * config/i386/i386.c (ix86_profile_before_prologue): New. (override_options): Add special handling for -mfentry. (ix86_function_regparm): Likewise. (ix86_function_sseregparm): Likewise. (ix86_frame_pointer_required): Likewise. (ix86_select_alt_pic_regnum): Likewise. (ix86_save_reg): (ix86_expand_prologue): (x86_function_profiler): (TARGET_PROFILE_BEFORE_PROLOGUE): Define hook. * config/i386/i386.opt (mfentry): New. * doc/invoke.texi (mfentry): Add documentation. * doc/tm.texi: Regenerated. * doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New. * final.c (final_start_function): Replace macro PROFILE_BEFORE_PROLOGUE by target hook. * function.c (thread_prologue_and_epilogue_insns): Likewise. * target.def (profile_before_prologue): New hook. * targhooks.c (default_profile_before_prologue): New. * targhooks.h (default_profile_before_prologue): New. 
Tested for i686-pc-mingw32, x86_64-pc-mingw32, and i686-pc-linux-gnu to see if option check works. Ok for apply? Regards, Kai -- | (\_/) This is Bunny. Copy and paste | (='.'=) Bunny into your signature to help | (")_(") him gain world domination [-- Attachment #2: profileb.diff --] [-- Type: application/octet-stream, Size: 14056 bytes --] Index: gcc/gcc/config/i386/cygming.h =================================================================== --- gcc.orig/gcc/config/i386/cygming.h 2010-07-15 19:15:51.651349100 +0200 +++ gcc/gcc/config/i386/cygming.h 2010-07-15 19:49:18.100111400 +0200 @@ -39,6 +39,13 @@ along with GCC; see the file COPYING3. #undef DEFAULT_ABI #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI) +/* Choose the correct profiler mcount name. */ +#undef MCOUNT_NAME +#define MCOUNT_NAME "_mcount" + +#undef MCOUNT_NAME_BEFORE_PROLOGUE +#define MCOUNT_NAME_BEFORE_PROLOGUE "_mcount_top" + #if ! defined (USE_MINGW64_LEADING_UNDERSCORES) #undef USER_LABEL_PREFIX #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_") Index: gcc/gcc/config/i386/i386.c =================================================================== --- gcc.orig/gcc/config/i386/i386.c 2010-07-15 19:15:51.653349100 +0200 +++ gcc/gcc/config/i386/i386.c 2010-07-15 19:55:46.197309300 +0200 @@ -2768,6 +2768,15 @@ software_prefetching_beneficial_p (void) } } +/* Return true, if profiling code should be emitted before + prologue. Otherwise it returns false. + Note: For x86 with "hotfix" it is sorried. */ +static bool +ix86_profile_before_prologue (void) +{ + return flag_fentry != 0; +} + /* Function that is callable from the debugger to print the current options. 
*/ void @@ -3671,6 +3680,28 @@ override_options (bool main_args_p) target_flags |= MASK_CLD & ~target_flags_explicit; #endif + { + int default_profile_top_flag = 0; + int only_default = 1; + +#if defined(PROFILE_BEFORE_PROLOGUE) + default_profile_top_flag = 1; +#endif +#if defined(MCOUNT_NAME) && defined (MCOUNT_NAME_BEFORE_PROLOGUE) + only_default = 0; +#endif + + if (flag_fentry == -1) + flag_fentry = default_profile_top_flag; + else if (flag_fentry != default_profile_top_flag && only_default) + { + if (!default_profile_top_flag) + sorry ("-mfentry isn't supported for this target"); + else + sorry ("-mno-fentry isn't supported for this target"); + flag_fentry = default_profile_top_flag; + } + } /* Save the initial options in case the user does function specific options */ if (main_args_p) target_option_default_node = target_option_current_node @@ -4841,7 +4872,7 @@ ix86_function_regparm (const_tree type, if (decl && TREE_CODE (decl) == FUNCTION_DECL && optimize - && !profile_flag) + && !(profile_flag && !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl)); @@ -4913,7 +4944,8 @@ ix86_function_sseregparm (const_tree typ /* For local functions, pass up to SSE_REGPARM_MAX SFmode (and DFmode for SSE2) arguments in SSE registers. */ - if (decl && TARGET_SSE_MATH && optimize && !profile_flag) + if (decl && TARGET_SSE_MATH && optimize + && !(profile_flag && !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. 
*/ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl)); @@ -7875,7 +7907,7 @@ ix86_frame_pointer_required (void) || ix86_current_function_calls_tls_descriptor)) return true; - if (crtl->profile) + if (crtl->profile && !flag_fentry) return true; return false; @@ -8143,7 +8175,8 @@ gen_push (rtx arg) static unsigned int ix86_select_alt_pic_regnum (void) { - if (current_function_is_leaf && !crtl->profile + if (current_function_is_leaf + && !(crtl->profile && !flag_fentry) && !ix86_current_function_calls_tls_descriptor) { int i, drap; @@ -8167,7 +8200,7 @@ ix86_save_reg (unsigned int regno, int m if (pic_offset_table_rtx && regno == REAL_PIC_OFFSET_TABLE_REGNUM && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM) - || crtl->profile + || (crtl->profile && !flag_fentry) || crtl->calls_eh_return || crtl->uses_const_pool)) { @@ -9191,6 +9224,11 @@ ix86_expand_prologue (void) { rtx push, mov; + /* Check if profiling is active and we shall use profiling before + prologue variant. If so sorry. */ + if (crtl->profile && flag_fentry != 0) + sorry ("ms_hook_prologue attribute isn't compatible with -mfentry for 32-bit"); + /* Make sure the function starts with 8b ff movl.s %edi,%edi (emited by ix86_asm_output_function_label) 55 push %ebp @@ -9443,7 +9481,7 @@ ix86_expand_prologue (void) pic_reg_used = false; if (pic_offset_table_rtx && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM) - || crtl->profile)) + || (crtl->profile && !flag_fentry))) { unsigned int alt_pic_reg_used = ix86_select_alt_pic_regnum (); @@ -9480,7 +9518,7 @@ ix86_expand_prologue (void) when mcount needs it. Blockage to avoid call movement across mcount call is emitted in generic code after the NOTE_INSN_PROLOGUE_END note. 
*/ - if (crtl->profile && pic_reg_used) + if (crtl->profile && !flag_fentry && pic_reg_used) emit_insn (gen_prologue_use (pic_offset_table_rtx)); if (crtl->drap_reg && !crtl->stack_realign_needed) @@ -27282,11 +27320,26 @@ x86_field_alignment (tree field, int com return computed; } +#if !defined(MCOUNT_NAME) && !defined(MCOUNT_NAME_BEFORE_PROLOGUE) +#error MCOUNT_NAME ,and/or MCOUNT_NAME_BEFORE_PROLOGUE have to be define +#endif + +/* Make sure both are getting defined. */ +#ifndef MCOUNT_NAME +#define MCOUNT_NAME MCOUNT_NAME_BEFORE_PROLOGUE +#endif +#ifndef MCOUNT_NAME_BEFORE_PROLOGUE +#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME +#endif + /* Output assembler code to FILE to increment profiler label # LABELNO for profiling a function entry. */ void x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED) { + const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE + : MCOUNT_NAME); + if (TARGET_64BIT) { #ifndef NO_PROFILE_COUNTERS @@ -27294,9 +27347,9 @@ x86_function_profiler (FILE *file, int l #endif if (DEFAULT_ABI == SYSV_ABI && flag_pic) - fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file); + fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name); else - fputs ("\tcall\t" MCOUNT_NAME "\n", file); + fprintf (file, "\tcall\t%s\n", mcount_name); } else if (flag_pic) { @@ -27304,7 +27357,7 @@ x86_function_profiler (FILE *file, int l fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n", LPREFIX, labelno); #endif - fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file); + fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name); } else { @@ -27312,7 +27365,7 @@ x86_function_profiler (FILE *file, int l fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n", LPREFIX, labelno); #endif - fputs ("\tcall\t" MCOUNT_NAME "\n", file); + fprintf (file, "\tcall\t%s\n", mcount_name); } } @@ -31360,6 +31413,9 @@ ix86_enum_va_list (int idx, const char * #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD #endif +#undef 
TARGET_PROFILE_BEFORE_PROLOGUE +#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue + #undef TARGET_ASM_UNALIGNED_HI_OP #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP #undef TARGET_ASM_UNALIGNED_SI_OP Index: gcc/gcc/config/i386/i386.opt =================================================================== --- gcc.orig/gcc/config/i386/i386.opt 2010-07-15 19:15:51.665349100 +0200 +++ gcc/gcc/config/i386/i386.opt 2010-07-15 19:19:10.376715500 +0200 @@ -375,3 +375,7 @@ Support RDRND built-in functions and cod mf16c Target Report Mask(ISA_F16C) Var(ix86_isa_flags) VarExists Save Support F16C built-in functions and code generation + +mfentry +Target Report Var(flag_fentry) Init(-1) +Emit profiling counter call at function entry before prologue. Index: gcc/gcc/doc/invoke.texi =================================================================== --- gcc.orig/gcc/doc/invoke.texi 2010-07-15 19:15:51.666349100 +0200 +++ gcc/gcc/doc/invoke.texi 2010-07-15 19:19:10.385716000 +0200 @@ -601,7 +601,7 @@ Objective-C and Objective-C++ Dialects}. -momit-leaf-frame-pointer -mno-red-zone -mno-tls-direct-seg-refs @gol -mcmodel=@var{code-model} -mabi=@var{name} @gol -m32 -m64 -mlarge-data-threshold=@var{num} @gol --msse2avx} +-msse2avx -mfentry} @emph{IA-64 Options} @gccoptlist{-mbig-endian -mlittle-endian -mgnu-as -mgnu-ld -mno-pic @gol @@ -12466,6 +12466,14 @@ For systems that use GNU libc, the defau @opindex msse2avx Specify that the assembler should encode SSE instructions with VEX prefix. The option @option{-mavx} turns this on by default. + +@item -mfentry +@itemx -mno-fentry +@opindex mfentry +If profiling is active @option{-pg} put the profiling +counter call before prologue. +Note: On x86 architectures the attribute @code{ms_hook_prologue} +isn't possible at the moment for @option{-mfentry} and @option{-pg}. 
@end table These @samp{-m} switches are supported in addition to the above Index: gcc/gcc/doc/tm.texi =================================================================== --- gcc.orig/gcc/doc/tm.texi 2010-07-15 19:15:51.668349100 +0200 +++ gcc/gcc/doc/tm.texi 2010-07-15 19:19:10.393716500 +0200 @@ -7101,6 +7101,14 @@ Contains the value true if the target pl ``small data'' into a separate section. The default value is false. @end deftypevr +@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void) +It returns true if target wants profile code emitted before +prologue. + +The default version of this hook use the target macro +@code{PROFILE_BEFORE_PROLOGUE}. +@end deftypefn + @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp}) Returns true if @var{exp} names an object for which name resolution rules must resolve to the current ``module'' (dynamic shared library Index: gcc/gcc/doc/tm.texi.in =================================================================== --- gcc.orig/gcc/doc/tm.texi.in 2010-07-15 19:15:51.673349100 +0200 +++ gcc/gcc/doc/tm.texi.in 2010-07-15 19:19:10.399716800 +0200 @@ -7101,6 +7101,14 @@ Contains the value true if the target pl ``small data'' into a separate section. The default value is false. @end deftypevr +@hook TARGET_PROFILE_BEFORE_PROLOGUE +It returns true if target wants profile code emitted before +prologue. + +The default version of this hook use the target macro +@code{PROFILE_BEFORE_PROLOGUE}. 
+@end deftypefn + @hook TARGET_BINDS_LOCAL_P Returns true if @var{exp} names an object for which name resolution rules must resolve to the current ``module'' (dynamic shared library Index: gcc/gcc/final.c =================================================================== --- gcc.orig/gcc/final.c 2010-07-15 19:15:51.674349100 +0200 +++ gcc/gcc/final.c 2010-07-15 19:19:10.403717000 +0200 @@ -1546,10 +1546,8 @@ final_start_function (rtx first ATTRIBUT /* The Sun386i and perhaps other machines don't work right if the profiling code comes after the prologue. */ -#ifdef PROFILE_BEFORE_PROLOGUE - if (crtl->profile) + if (targetm.profile_before_prologue () && crtl->profile) profile_function (file); -#endif /* PROFILE_BEFORE_PROLOGUE */ #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue) if (dwarf2out_do_frame ()) @@ -1591,10 +1589,8 @@ final_start_function (rtx first ATTRIBUT static void profile_after_prologue (FILE *file ATTRIBUTE_UNUSED) { -#ifndef PROFILE_BEFORE_PROLOGUE - if (crtl->profile) + if (!targetm.profile_before_prologue () && crtl->profile) profile_function (file); -#endif /* not PROFILE_BEFORE_PROLOGUE */ } static void Index: gcc/gcc/function.c =================================================================== --- gcc.orig/gcc/function.c 2010-07-15 19:15:51.675349100 +0200 +++ gcc/gcc/function.c 2010-07-15 19:19:10.408717300 +0200 @@ -5100,13 +5100,11 @@ thread_prologue_and_epilogue_insns (void record_insns (seq, NULL, &prologue_insn_hash); emit_note (NOTE_INSN_PROLOGUE_END); -#ifndef PROFILE_BEFORE_PROLOGUE /* Ensure that instructions are not moved into the prologue when profiling is on. The call to the profiling routine can be emitted within the live range of a call-clobbered register. 
*/ - if (crtl->profile) + if (!targetm.profile_before_prologue () && crtl->profile) emit_insn (gen_blockage ()); -#endif seq = get_insns (); end_sequence (); Index: gcc/gcc/target.def =================================================================== --- gcc.orig/gcc/target.def 2010-07-15 19:15:51.676349100 +0200 +++ gcc/gcc/target.def 2010-07-15 19:19:10.411717500 +0200 @@ -1218,6 +1218,13 @@ DEFHOOK bool, (const_tree exp), default_binds_local_p) +/* Check if profiling code is before or after prologue. */ +DEFHOOK +(profile_before_prologue, + "", + bool, (void), + default_profile_before_prologue) + /* Modify and return the identifier of a DECL's external name, originally identified by ID, as required by the target, (eg, append @nn to windows32 stdcall function names). Index: gcc/gcc/targhooks.c =================================================================== --- gcc.orig/gcc/targhooks.c 2010-07-15 19:15:51.678349100 +0200 +++ gcc/gcc/targhooks.c 2010-07-15 19:19:10.413717600 +0200 @@ -1197,4 +1197,14 @@ default_register_move_cost (enum machine #endif } +bool +default_profile_before_prologue (void) +{ +#ifndef PROFILE_BEFORE_PROLOGUE + return false; +#else + return true; +#endif +} + #include "gt-targhooks.h" Index: gcc/gcc/targhooks.h =================================================================== --- gcc.orig/gcc/targhooks.h 2010-07-15 19:15:51.687349100 +0200 +++ gcc/gcc/targhooks.h 2010-07-15 19:19:10.416717800 +0200 @@ -150,3 +150,4 @@ extern int default_memory_move_cost (enu extern int default_register_move_cost (enum machine_mode, reg_class_t, reg_class_t); +extern bool default_profile_before_prologue (void); ^ permalink raw reply [flat|nested] 27+ messages in thread
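The core of the patch above is replacing the compile-time macro PROFILE_BEFORE_PROLOGUE by a hook that is queried at run time. A minimal stand-alone model of that idea might look as follows. It is a sketch, not GCC code: `struct target_hooks`, `model_flag_fentry`, `model_ix86_profile_before_prologue`, `enum profile_site` and `choose_profile_site` are hypothetical names introduced for illustration; only `default_profile_before_prologue` mirrors the function in the patch.

```c
#include <stdbool.h>

/* Hypothetical, reduced hook table; the real table has many
   more members.  */
struct target_hooks
{
  bool (*profile_before_prologue) (void);
};

/* Mirrors the default from the patch: the answer is fixed at compile
   time by the presence of the PROFILE_BEFORE_PROLOGUE macro.  */
bool
default_profile_before_prologue (void)
{
#ifdef PROFILE_BEFORE_PROLOGUE
  return true;
#else
  return false;
#endif
}

/* Stand-in for the option variable; 0 means "after prologue".  */
int model_flag_fentry = 0;

/* Models a target-specific hook: the answer depends on an option
   that is only known at run time.  */
bool
model_ix86_profile_before_prologue (void)
{
  return model_flag_fentry != 0;
}

enum profile_site { PROFILE_NONE, PROFILE_BEFORE, PROFILE_AFTER };

/* Decide where the profiler call goes, as the two call sites in the
   patch do between them: before the prologue if the hook says so,
   otherwise after it, and nowhere when profiling is off.  */
enum profile_site
choose_profile_site (const struct target_hooks *t, bool profiling)
{
  if (!profiling)
    return PROFILE_NONE;
  return t->profile_before_prologue () ? PROFILE_BEFORE : PROFILE_AFTER;
}
```

The point of the function pointer is that a target can base the answer on option state or on the current function, which a plain macro or variable default cannot easily express.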
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature 2010-07-15 18:08 ` Kai Tietz @ 2010-07-16 17:06 ` Richard Henderson 2010-07-17 6:52 ` Kai Tietz 2010-07-16 20:53 ` Gerald Pfeifer 2010-07-16 23:57 ` Andi Kleen 2 siblings, 1 reply; 27+ messages in thread From: Richard Henderson @ 2010-07-16 17:06 UTC (permalink / raw) To: Kai Tietz; +Cc: Andi Kleen, GCC Patches On 07/15/2010 11:08 AM, Kai Tietz wrote: > (override_options): Add special handling for -mfentry. > (ix86_function_regparm): Likewise. > (ix86_function_sseregparm): Likewise. Why are you adding the special-casing to the two regparm functions? You do realize that suddenly your mcount_top function may have zero free registers with which to do its job. I think that's a bad idea. > (ix86_save_reg): > (ix86_expand_prologue): > (x86_function_profiler): No changes? ;-) x86_function_profiler is broken for -m32 -fentry -fpic. It uses %ebx which has not been set up. I think perhaps this combination cannot really be supported, and this should be diagnosed back in override_options. > +@hook TARGET_PROFILE_BEFORE_PROLOGUE > +It returns true if target wants profile code emitted before > +prologue. > + > +The default version of this hook use the target macro > +@code{PROFILE_BEFORE_PROLOGUE}. > +@end deftypefn The text should go in target.def, in the ""; only the @hook line goes here. r~ ^ permalink raw reply [flat|nested] 27+ messages in thread
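The suggestion to diagnose the 32-bit PIC combination in override_options can be sketched as a small stand-alone model of resolving a tri-state option. This is an illustration under stated assumptions, not GCC code: `resolve_fentry`, `struct fentry_result` and the diagnostic wording are hypothetical, and -1 is taken to mean "not given on the command line", as with Init(-1) in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: the resolved flag plus an optional
   diagnostic text (NULL when the request is accepted).  */
struct fentry_result
{
  int flag;
  const char *diag;
};

/* Resolve a tri-state request (-1 unset, 0 off, 1 on).  For 32-bit
   PIC code the before-prologue call is rejected, because %ebx is not
   yet set up at function entry.  */
struct fentry_result
resolve_fentry (int requested, bool target_64bit, bool pic,
                int target_default)
{
  struct fentry_result r = { requested, NULL };
  bool unsupported = !target_64bit && pic;

  /* An unset option silently takes the default, which is forced to
     "off" when the combination cannot work.  */
  if (r.flag == -1)
    r.flag = unsupported ? 0 : target_default;

  /* An explicit request that cannot work is diagnosed and dropped.  */
  if (r.flag != 0 && unsupported)
    {
      r.diag = "before-prologue profiling is not supported for 32-bit PIC";
      r.flag = 0;
    }
  return r;
}
```

Only an explicit request triggers the diagnostic; a target default that cannot be honoured falls back quietly to profiling after the prologue.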
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature 2010-07-16 17:06 ` Richard Henderson @ 2010-07-17 6:52 ` Kai Tietz 2010-07-20 2:27 ` Richard Henderson 0 siblings, 1 reply; 27+ messages in thread From: Kai Tietz @ 2010-07-17 6:52 UTC (permalink / raw) To: Richard Henderson; +Cc: Andi Kleen, GCC Patches [-- Attachment #1: Type: text/plain, Size: 3077 bytes --] Hello Richard, thanks for the review. 2010/7/16 Richard Henderson <rth@redhat.com>: > On 07/15/2010 11:08 AM, Kai Tietz wrote: >> (override_options): Add special handling for -mfentry. >> (ix86_function_regparm): Likewise. >> (ix86_function_sseregparm): Likewise. > > Why are you adding the special-casing to the two regparm functions? > You do realize that suddenly your mcount_top function may have zero > free registers with which to do its job. I think that's a bad idea. Well, as the mcount before-prologue call has to be transparent (saving and restoring all modified registers), I don't see that having no free registers is a problem. >> (ix86_save_reg): >> (ix86_expand_prologue): >> (x86_function_profiler): > > No changes? ;-) Well, I missed the Likewise :} > x86_function_profiler is broken for -m32 -fentry -fpic. It uses > %ebx which has not been set up. I think perhaps this combination > cannot really be supported, and this should be diagnosed back in > override_options. Yeah, added it to my updated patch. >> +@hook TARGET_PROFILE_BEFORE_PROLOGUE >> +It returns true if target wants profile code emitted before >> +prologue. >> + >> +The default version of this hook use the target macro >> +@code{PROFILE_BEFORE_PROLOGUE}. >> +@end deftypefn > > The text should go in target.def, in the ""; only the @hook line goes here. Ok, moved into target.def (btw I think it is one of the first hooks using this feature). After discussion on IRC, some of the changes to the checks about fpic turned out to be superfluous, as they can't in fact be reached. So I removed the additional flag_fentry checks there. 
The option -fpic for x86 (32-bit) indeed can't be combined with profiling and -mfentry. So I added a sorry and fall back to profiling after the prologue, which sets up ebx correctly. ChangeLog * config/i386/cygming.h (MCOUNT_NAME): New. (MCOUNT_NAME_BEFORE_PROLOGUE): New. * config/i386/i386.c (ix86_profile_before_prologue): New. (override_options): Add special handling for -mfentry. (ix86_function_regparm): Likewise. (ix86_function_sseregparm): Likewise. (ix86_frame_pointer_required): Likewise. (ix86_expand_prologue): Check for ms_hook_prologue. (x86_function_profiler): Adjust mcount output. (TARGET_PROFILE_BEFORE_PROLOGUE): Define hook. * config/i386/i386.opt (mfentry): New. * doc/invoke.texi (mfentry): Add documentation. * doc/tm.texi: Regenerated. * doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New. * final.c (final_start_function): Replace macro PROFILE_BEFORE_PROLOGUE by target hook. * function.c (thread_prologue_and_epilogue_insns): Likewise. * target.def (profile_before_prologue): New hook. * targhooks.c (default_profile_before_prologue): New. * targhooks.h (default_profile_before_prologue): New. Retested for x86_64-pc-mingw32, i686-pc-cygwin, and i686-pc-linux-gnu. Ok for apply? Regards, Kai [-- Attachment #2: profileb.diff --] [-- Type: application/octet-stream, Size: 12735 bytes --] Index: gcc/gcc/config/i386/cygming.h =================================================================== --- gcc.orig/gcc/config/i386/cygming.h 2010-07-15 19:15:51.651349100 +0200 +++ gcc/gcc/config/i386/cygming.h 2010-07-15 19:49:18.100111400 +0200 @@ -39,6 +39,13 @@ #undef DEFAULT_ABI #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI) +/* Choose the correct profiler mcount name. */ +#undef MCOUNT_NAME +#define MCOUNT_NAME "_mcount" + +#undef MCOUNT_NAME_BEFORE_PROLOGUE +#define MCOUNT_NAME_BEFORE_PROLOGUE "_mcount_top" + #if ! 
defined (USE_MINGW64_LEADING_UNDERSCORES) #undef USER_LABEL_PREFIX #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_") Index: gcc/gcc/config/i386/i386.c =================================================================== --- gcc.orig/gcc/config/i386/i386.c 2010-07-15 19:15:51.653349100 +0200 +++ gcc/gcc/config/i386/i386.c 2010-07-17 07:54:24.270241900 +0200 @@ -2768,6 +2768,15 @@ } } +/* Return true, if profiling code should be emitted before + prologue. Otherwise it returns false. + Note: For x86 with "hotfix" it is sorried. */ +static bool +ix86_profile_before_prologue (void) +{ + return flag_fentry != 0; +} + /* Function that is callable from the debugger to print the current options. */ void @@ -3671,6 +3680,34 @@ target_flags |= MASK_CLD & ~target_flags_explicit; #endif + { + int default_profile_top_flag = 0; + int only_default = 1; + bool force_default_profile_top_flag = (!TARGET_64BIT && flag_pic); + +#if defined(PROFILE_BEFORE_PROLOGUE) + default_profile_top_flag = (force_default_profile_top_flag ? 0 : 1); +#endif +#if defined(MCOUNT_NAME) && defined(MCOUNT_NAME_BEFORE_PROLOGUE) + only_default = 0; +#endif + + if (flag_fentry == -1) + flag_fentry = default_profile_top_flag; + if (flag_fentry != 0 && force_default_profile_top_flag) + { + sorry ("-mfentry isn't support for x86 in combination with -fpic"); + flag_fentry = 0; + } + else if (flag_fentry != default_profile_top_flag && only_default) + { + if (!default_profile_top_flag) + sorry ("-mfentry isn't supported for this target"); + else + sorry ("-mno-fentry isn't supported for this target"); + flag_fentry = default_profile_top_flag; + } + } /* Save the initial options in case the user does function specific options */ if (main_args_p) target_option_default_node = target_option_current_node @@ -4841,7 +4878,7 @@ if (decl && TREE_CODE (decl) == FUNCTION_DECL && optimize - && !profile_flag) + && !(profile_flag && !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. 
*/ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl)); @@ -4913,7 +4950,8 @@ /* For local functions, pass up to SSE_REGPARM_MAX SFmode (and DFmode for SSE2) arguments in SSE registers. */ - if (decl && TARGET_SSE_MATH && optimize && !profile_flag) + if (decl && TARGET_SSE_MATH && optimize + && !(profile_flag && !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl)); @@ -7875,7 +7913,7 @@ || ix86_current_function_calls_tls_descriptor)) return true; - if (crtl->profile) + if (crtl->profile && !flag_fentry) return true; return false; @@ -8143,7 +8181,8 @@ static unsigned int ix86_select_alt_pic_regnum (void) { - if (current_function_is_leaf && !crtl->profile + if (current_function_is_leaf + && !crtl->profile && !ix86_current_function_calls_tls_descriptor) { int i, drap; @@ -9191,6 +9230,11 @@ { rtx push, mov; + /* Check if profiling is active and we shall use profiling before + prologue variant. If so sorry. */ + if (crtl->profile && flag_fentry != 0) + sorry ("ms_hook_prologue attribute isn't compatible with -mfentry for 32-bit"); + /* Make sure the function starts with 8b ff movl.s %edi,%edi (emited by ix86_asm_output_function_label) 55 push %ebp @@ -9480,7 +9524,7 @@ when mcount needs it. Blockage to avoid call movement across mcount call is emitted in generic code after the NOTE_INSN_PROLOGUE_END note. */ - if (crtl->profile && pic_reg_used) + if (crtl->profile && !flag_fentry && pic_reg_used) emit_insn (gen_prologue_use (pic_offset_table_rtx)); if (crtl->drap_reg && !crtl->stack_realign_needed) @@ -27282,11 +27326,26 @@ return computed; } +#if !defined(MCOUNT_NAME) && !defined(MCOUNT_NAME_BEFORE_PROLOGUE) +#error MCOUNT_NAME ,and/or MCOUNT_NAME_BEFORE_PROLOGUE have to be define +#endif + +/* Make sure both are getting defined. 
*/ +#ifndef MCOUNT_NAME +#define MCOUNT_NAME MCOUNT_NAME_BEFORE_PROLOGUE +#endif +#ifndef MCOUNT_NAME_BEFORE_PROLOGUE +#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME +#endif + /* Output assembler code to FILE to increment profiler label # LABELNO for profiling a function entry. */ void x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED) { + const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE + : MCOUNT_NAME); + if (TARGET_64BIT) { #ifndef NO_PROFILE_COUNTERS @@ -27294,9 +27353,9 @@ #endif if (DEFAULT_ABI == SYSV_ABI && flag_pic) - fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file); + fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name); else - fputs ("\tcall\t" MCOUNT_NAME "\n", file); + fprintf (file, "\tcall\t%s\n", mcount_name); } else if (flag_pic) { @@ -27304,7 +27363,7 @@ fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n", LPREFIX, labelno); #endif - fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file); + fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name); } else { @@ -27312,7 +27371,7 @@ fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n", LPREFIX, labelno); #endif - fputs ("\tcall\t" MCOUNT_NAME "\n", file); + fprintf (file, "\tcall\t%s\n", mcount_name); } } @@ -31360,6 +31419,9 @@ #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD #endif +#undef TARGET_PROFILE_BEFORE_PROLOGUE +#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue + #undef TARGET_ASM_UNALIGNED_HI_OP #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP #undef TARGET_ASM_UNALIGNED_SI_OP Index: gcc/gcc/config/i386/i386.opt =================================================================== --- gcc.orig/gcc/config/i386/i386.opt 2010-07-15 19:15:51.665349100 +0200 +++ gcc/gcc/config/i386/i386.opt 2010-07-15 19:19:10.376715500 +0200 @@ -375,3 +375,7 @@ mf16c Target Report Mask(ISA_F16C) Var(ix86_isa_flags) VarExists Save Support F16C built-in functions and code generation + +mfentry +Target Report 
Var(flag_fentry) Init(-1) +Emit profiling counter call at function entry before prologue. Index: gcc/gcc/doc/invoke.texi =================================================================== --- gcc.orig/gcc/doc/invoke.texi 2010-07-15 19:15:51.666349100 +0200 +++ gcc/gcc/doc/invoke.texi 2010-07-15 19:19:10.385716000 +0200 @@ -601,7 +601,7 @@ -momit-leaf-frame-pointer -mno-red-zone -mno-tls-direct-seg-refs @gol -mcmodel=@var{code-model} -mabi=@var{name} @gol -m32 -m64 -mlarge-data-threshold=@var{num} @gol --msse2avx} +-msse2avx -mfentry} @emph{IA-64 Options} @gccoptlist{-mbig-endian -mlittle-endian -mgnu-as -mgnu-ld -mno-pic @gol @@ -12466,6 +12466,14 @@ @opindex msse2avx Specify that the assembler should encode SSE instructions with VEX prefix. The option @option{-mavx} turns this on by default. + +@item -mfentry +@itemx -mno-fentry +@opindex mfentry +If profiling is active @option{-pg} put the profiling +counter call before prologue. +Note: On x86 architectures the attribute @code{ms_hook_prologue} +isn't possible at the moment for @option{-mfentry} and @option{-pg}. @end table These @samp{-m} switches are supported in addition to the above Index: gcc/gcc/doc/tm.texi =================================================================== --- gcc.orig/gcc/doc/tm.texi 2010-07-15 19:15:51.668349100 +0200 +++ gcc/gcc/doc/tm.texi 2010-07-17 08:41:34.231106400 +0200 @@ -7101,6 +7101,13 @@ ``small data'' into a separate section. The default value is false. @end deftypevr +@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void) +It returns true if target wants profile code emitted before prologue. + +The default version of this hook use the target macro +@code{PROFILE_BEFORE_PROLOGUE}. 
+@end deftypefn + @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp}) Returns true if @var{exp} names an object for which name resolution rules must resolve to the current ``module'' (dynamic shared library Index: gcc/gcc/doc/tm.texi.in =================================================================== --- gcc.orig/gcc/doc/tm.texi.in 2010-07-15 19:15:51.673349100 +0200 +++ gcc/gcc/doc/tm.texi.in 2010-07-17 08:04:39.523432400 +0200 @@ -7101,6 +7101,8 @@ ``small data'' into a separate section. The default value is false. @end deftypevr +@hook TARGET_PROFILE_BEFORE_PROLOGUE + @hook TARGET_BINDS_LOCAL_P Returns true if @var{exp} names an object for which name resolution rules must resolve to the current ``module'' (dynamic shared library Index: gcc/gcc/final.c =================================================================== --- gcc.orig/gcc/final.c 2010-07-15 19:15:51.674349100 +0200 +++ gcc/gcc/final.c 2010-07-15 19:19:10.403717000 +0200 @@ -1546,10 +1546,8 @@ /* The Sun386i and perhaps other machines don't work right if the profiling code comes after the prologue. 
*/ -#ifdef PROFILE_BEFORE_PROLOGUE - if (crtl->profile) + if (targetm.profile_before_prologue () && crtl->profile) profile_function (file); -#endif /* PROFILE_BEFORE_PROLOGUE */ #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue) if (dwarf2out_do_frame ()) @@ -1591,10 +1589,8 @@ static void profile_after_prologue (FILE *file ATTRIBUTE_UNUSED) { -#ifndef PROFILE_BEFORE_PROLOGUE - if (crtl->profile) + if (!targetm.profile_before_prologue () && crtl->profile) profile_function (file); -#endif /* not PROFILE_BEFORE_PROLOGUE */ } static void Index: gcc/gcc/function.c =================================================================== --- gcc.orig/gcc/function.c 2010-07-15 19:15:51.675349100 +0200 +++ gcc/gcc/function.c 2010-07-15 19:19:10.408717300 +0200 @@ -5100,13 +5100,11 @@ record_insns (seq, NULL, &prologue_insn_hash); emit_note (NOTE_INSN_PROLOGUE_END); -#ifndef PROFILE_BEFORE_PROLOGUE /* Ensure that instructions are not moved into the prologue when profiling is on. The call to the profiling routine can be emitted within the live range of a call-clobbered register. */ - if (crtl->profile) + if (!targetm.profile_before_prologue () && crtl->profile) emit_insn (gen_blockage ()); -#endif seq = get_insns (); end_sequence (); Index: gcc/gcc/target.def =================================================================== --- gcc.orig/gcc/target.def 2010-07-15 19:15:51.676349100 +0200 +++ gcc/gcc/target.def 2010-07-17 08:38:38.423050800 +0200 @@ -1218,6 +1218,15 @@ bool, (const_tree exp), default_binds_local_p) +/* Check if profiling code is before or after prologue. 
*/ +DEFHOOK +(profile_before_prologue, + "It returns true if target wants profile code emitted before prologue.\n\n\ +The default version of this hook use the target macro\n\ +@code{PROFILE_BEFORE_PROLOGUE}.", + bool, (void), + default_profile_before_prologue) + /* Modify and return the identifier of a DECL's external name, originally identified by ID, as required by the target, (eg, append @nn to windows32 stdcall function names). Index: gcc/gcc/targhooks.c =================================================================== --- gcc.orig/gcc/targhooks.c 2010-07-15 19:15:51.678349100 +0200 +++ gcc/gcc/targhooks.c 2010-07-15 19:19:10.413717600 +0200 @@ -1197,4 +1197,14 @@ #endif } +bool +default_profile_before_prologue (void) +{ +#ifndef PROFILE_BEFORE_PROLOGUE + return false; +#else + return true; +#endif +} + #include "gt-targhooks.h" Index: gcc/gcc/targhooks.h =================================================================== --- gcc.orig/gcc/targhooks.h 2010-07-15 19:15:51.687349100 +0200 +++ gcc/gcc/targhooks.h 2010-07-15 19:19:10.416717800 +0200 @@ -150,3 +150,4 @@ extern int default_register_move_cost (enum machine_mode, reg_class_t, reg_class_t); +extern bool default_profile_before_prologue (void); ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-17 6:52 ` Kai Tietz
@ 2010-07-20 2:27 ` Richard Henderson
  2010-07-28 8:36 ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2010-07-20 2:27 UTC (permalink / raw)
To: Kai Tietz; +Cc: Andi Kleen, GCC Patches

On 07/16/2010 11:51 PM, Kai Tietz wrote:
> +#ifndef MCOUNT_NAME_BEFORE_PROLOGUE
> +#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME
> +#endif

I begin to wonder if it wouldn't be better to just go ahead and use
Andi's "__fentry__" symbol for all sub-targets. It's really the only
thing that makes sense given the -mfentry option name.

> + bool force_default_profile_top_flag = (!TARGET_64BIT && flag_pic);
> + default_profile_top_flag = (force_default_profile_top_flag ? 0 : 1);

These are confusing names and definitions, because they don't
correspond to their names. For instance, f_d_p_t_f does not
force the default. Its definition forces p_t_f off.

Do you see what I mean?

> +#ifndef PROFILE_BEFORE_PROLOGUE
> + return false;
> +#else
> + return true;
> +#endif

Avoid the double negative in favor of the positive.

r~
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-20 2:27 ` Richard Henderson
@ 2010-07-28 8:36 ` Kai Tietz
  2010-07-28 16:00 ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-28 8:36 UTC (permalink / raw)
To: Richard Henderson; +Cc: Andi Kleen, GCC Patches

2010/7/20 Richard Henderson <rth@redhat.com>:
> On 07/16/2010 11:51 PM, Kai Tietz wrote:
>> +#ifndef MCOUNT_NAME_BEFORE_PROLOGUE
>> +#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME
>> +#endif
>
> I begin to wonder if it wouldn't be better to just go ahead and use
> Andi's "__fentry__" symbol for all sub-targets. It's really the only
> thing that makes sense given the -mfentry option name.

Well, maybe it would. But I see that in the current source the
MCOUNT_NAME symbol isn't the same for all targets either. Some add
prefixes to it to indicate private scope, etc.
But well, I can give it a whirl. Making it the default for all targets
by defining it in i386.h is fine by me, but this will cause link-time
failures for targets whose profiler library does not provide this new
entry. Defining it for each target separately makes analysis of
support possible, but well, whatever you want here.

>> + bool force_default_profile_top_flag = (!TARGET_64BIT && flag_pic);
>> + default_profile_top_flag = (force_default_profile_top_flag ? 0 : 1);
>
> These are confusing names and definitions, because they don't
> correspond to their names. For instance, f_d_p_t_f does not
> force the default. Its definition forces p_t_f off.
>
> Do you see what I mean?

Yeah, this logic (and especially the names) isn't a lucky choice, as
f_d_p_t_f can either be the default for p_t_f, or it is the forced
default, which turns p_t_f off (if conflicts are detected for PIC and,
as planned, for the additional -mseh frame layout).

>> +#ifndef PROFILE_BEFORE_PROLOGUE
>> + return false;
>> +#else
>> + return true;
>> +#endif
>
> Avoid the double negative in favor of the positive.

Ok ;)

Regards,
Kai

--
| (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-28 8:36 ` Kai Tietz
@ 2010-07-28 16:00 ` Richard Henderson
  2010-07-28 16:01 ` Andi Kleen
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2010-07-28 16:00 UTC (permalink / raw)
To: Kai Tietz; +Cc: Andi Kleen, GCC Patches

On 07/28/2010 12:53 AM, Kai Tietz wrote:
> But well, I can give it a whirl. Making it the default for all targets
> by defining it in i386.h is fine by me, but this will cause link-time
> failures for targets whose profiler library does not provide this new
> entry. Defining it for each target separately makes analysis of
> support possible, but well, whatever you want here.

Yes, but there are consumers like the Linux kernel that plan to
provide their own version of the entry point.

r~
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-28 16:00 ` Richard Henderson
@ 2010-07-28 16:01 ` Andi Kleen
  2010-07-28 17:28 ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2010-07-28 16:01 UTC (permalink / raw)
To: Richard Henderson; +Cc: Kai Tietz, Andi Kleen, GCC Patches

On Wed, Jul 28, 2010 at 08:49:45AM -0700, Richard Henderson wrote:
> On 07/28/2010 12:53 AM, Kai Tietz wrote:
> > But well, I can give it a whirl. Making it the default for all targets
> > by defining it in i386.h is fine by me, but this will cause link-time
> > failures for targets whose profiler library does not provide this new
> > entry. Defining it for each target separately makes analysis of
> > support possible, but well, whatever you want here.
>
> Yes, but there are consumers like the Linux kernel that plan to
> provide their own version of the entry point.

At least on Linux it has to be guarded by the explicit -mfentry
option for now. Later on the default could be set by testing glibc
support in autoconf.

I don't see any reason not to enable a non-default -mfentry on any i386
target.

-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature 2010-07-28 16:01 ` Andi Kleen @ 2010-07-28 17:28 ` Kai Tietz 2010-07-28 17:40 ` Richard Henderson 0 siblings, 1 reply; 27+ messages in thread From: Kai Tietz @ 2010-07-28 17:28 UTC (permalink / raw) To: Andi Kleen; +Cc: Richard Henderson, GCC Patches [-- Attachment #1: Type: text/plain, Size: 2180 bytes --] 2010/7/28 Andi Kleen <andi@firstfloor.org>: > On Wed, Jul 28, 2010 at 08:49:45AM -0700, Richard Henderson wrote: >> On 07/28/2010 12:53 AM, Kai Tietz wrote: >> > But well, I can give it a whirl. To >> > make it default for all targets by defining the default in i386.h is >> > fine by me, but this will cause for targets - not providing in their >> > profiler-library this new entry - link-time failures. By defining it >> > for each target separately makes analyzis of support possible, but >> > well, what ever you want here. >> >> Yes, but there are consumers like the linux kernel that plan to >> provide their own version of the entry point. > > At least on Linux it has to be guarded by the explicit -mfentry > option for now. Later on the default could be set by testing glibc > support in autoconf. > > I don't see any reason to not enable non default -mfentry on any i386 > target. > > -Andi > -- > ak@linux.intel.com -- Speaking for myself only. > Well, so here is the updated patch for this. ChangeLog 2010-07-28 Kai Tietz * config/i386/i386.h (MCOUNT_NAME_BEFORE_PROLOGUE): New. * config/i386/i386.c (ix86_profile_before_prologue): New. (override_options): Add special handling for -mfentry. (ix86_function_regparm): Likewise. (ix86_function_sseregparm): Likewise. (ix86_frame_pointer_required): Likewise. (ix86_expand_prologue): Check for ms_hook_prologue. (x86_function_profiler): Adjust mcount output. (TARGET_PROFILE_BEFORE_PROLOGUE): Define hook. * config/i386/i386.opt (mfentry): New. * doc/invoke.texi (mfentry): Add documentation. * doc/tm.texi: Regenerated.. 
* doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New. * final.c (final_start_function): Replace macro PROFILE_BEFORE_PROLOGUE by target hook. * function.c (thread_prologue_and_epilogue_insns): Likewise. * target.def (profile_before_prologue): New hook. * targhooks.c (default_profile_before_prologue): New. * targhooks.h (default_profile_before_prologue): New. Ok for apply? Regards, Kai -- | (\_/) This is Bunny. Copy and paste | (='.'=) Bunny into your signature to help | (")_(") him gain world domination [-- Attachment #2: profileb.diff --] [-- Type: application/octet-stream, Size: 12345 bytes --] Index: gcc/gcc/config/i386/i386.c =================================================================== --- gcc.orig/gcc/config/i386/i386.c 2010-07-24 20:23:22.210750300 +0200 +++ gcc/gcc/config/i386/i386.c 2010-07-28 19:08:29.665001500 +0200 @@ -2768,6 +2768,15 @@ software_prefetching_beneficial_p (void) } } +/* Return true, if profiling code should be emitted before + prologue. Otherwise it returns false. + Note: For x86 with "hotfix" it is sorried. */ +static bool +ix86_profile_before_prologue (void) +{ + return flag_fentry != 0; +} + /* Function that is callable from the debugger to print the current options. */ void @@ -3671,6 +3680,18 @@ override_options (bool main_args_p) target_flags |= MASK_CLD & ~target_flags_explicit; #endif + if (flag_fentry == -1) +#if defined(PROFILE_BEFORE_PROLOGUE) + flag_fentry = ((!TARGET_64BIT && flag_pic) ? 
0 : 1); +#else + flag_fentry = 0; +#endif + if (flag_fentry != 0 && !TARGET_64BIT && flag_pic) + { + sorry ("-mfentry isn't support for x86 in combination with -fpic"); + flag_fentry = 0; + } + /* Save the initial options in case the user does function specific options */ if (main_args_p) target_option_default_node = target_option_current_node @@ -4841,7 +4862,7 @@ ix86_function_regparm (const_tree type, if (decl && TREE_CODE (decl) == FUNCTION_DECL && optimize - && !profile_flag) + && !(profile_flag && !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl)); @@ -4913,7 +4934,8 @@ ix86_function_sseregparm (const_tree typ /* For local functions, pass up to SSE_REGPARM_MAX SFmode (and DFmode for SSE2) arguments in SSE registers. */ - if (decl && TARGET_SSE_MATH && optimize && !profile_flag) + if (decl && TARGET_SSE_MATH && optimize + && !(profile_flag && !flag_fentry)) { /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. */ struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl)); @@ -7883,7 +7905,7 @@ ix86_frame_pointer_required (void) || ix86_current_function_calls_tls_descriptor)) return true; - if (crtl->profile) + if (crtl->profile && !flag_fentry) return true; return false; @@ -8151,7 +8173,8 @@ gen_push (rtx arg) static unsigned int ix86_select_alt_pic_regnum (void) { - if (current_function_is_leaf && !crtl->profile + if (current_function_is_leaf + && !crtl->profile && !ix86_current_function_calls_tls_descriptor) { int i, drap; @@ -9199,6 +9222,11 @@ ix86_expand_prologue (void) { rtx push, mov; + /* Check if profiling is active and we shall use profiling before + prologue variant. If so sorry. 
*/ + if (crtl->profile && flag_fentry != 0) + sorry ("ms_hook_prologue attribute isn't compatible with -mfentry for 32-bit"); + /* Make sure the function starts with 8b ff movl.s %edi,%edi (emited by ix86_asm_output_function_label) 55 push %ebp @@ -9488,7 +9516,7 @@ ix86_expand_prologue (void) when mcount needs it. Blockage to avoid call movement across mcount call is emitted in generic code after the NOTE_INSN_PROLOGUE_END note. */ - if (crtl->profile && pic_reg_used) + if (crtl->profile && !flag_fentry && pic_reg_used) emit_insn (gen_prologue_use (pic_offset_table_rtx)); if (crtl->drap_reg && !crtl->stack_realign_needed) @@ -27296,6 +27324,9 @@ x86_field_alignment (tree field, int com void x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED) { + const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE + : MCOUNT_NAME); + if (TARGET_64BIT) { #ifndef NO_PROFILE_COUNTERS @@ -27303,9 +27334,9 @@ x86_function_profiler (FILE *file, int l #endif if (DEFAULT_ABI == SYSV_ABI && flag_pic) - fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file); + fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name); else - fputs ("\tcall\t" MCOUNT_NAME "\n", file); + fprintf (file, "\tcall\t%s\n", mcount_name); } else if (flag_pic) { @@ -27313,7 +27344,7 @@ x86_function_profiler (FILE *file, int l fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n", LPREFIX, labelno); #endif - fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file); + fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name); } else { @@ -27321,7 +27352,7 @@ x86_function_profiler (FILE *file, int l fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n", LPREFIX, labelno); #endif - fputs ("\tcall\t" MCOUNT_NAME "\n", file); + fprintf (file, "\tcall\t%s\n", mcount_name); } } @@ -31369,6 +31400,9 @@ ix86_enum_va_list (int idx, const char * #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD #endif +#undef TARGET_PROFILE_BEFORE_PROLOGUE +#define 
TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue + #undef TARGET_ASM_UNALIGNED_HI_OP #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP #undef TARGET_ASM_UNALIGNED_SI_OP Index: gcc/gcc/config/i386/i386.opt =================================================================== --- gcc.orig/gcc/config/i386/i386.opt 2010-07-24 20:22:00.867096900 +0200 +++ gcc/gcc/config/i386/i386.opt 2010-07-28 18:57:58.735914400 +0200 @@ -375,3 +375,7 @@ Support RDRND built-in functions and cod mf16c Target Report Mask(ISA_F16C) Var(ix86_isa_flags) VarExists Save Support F16C built-in functions and code generation + +mfentry +Target Report Var(flag_fentry) Init(-1) +Emit profiling counter call at function entry before prologue. Index: gcc/gcc/doc/invoke.texi =================================================================== --- gcc.orig/gcc/doc/invoke.texi 2010-07-24 20:22:00.869096900 +0200 +++ gcc/gcc/doc/invoke.texi 2010-07-28 18:57:58.767916200 +0200 @@ -601,7 +601,7 @@ Objective-C and Objective-C++ Dialects}. -momit-leaf-frame-pointer -mno-red-zone -mno-tls-direct-seg-refs @gol -mcmodel=@var{code-model} -mabi=@var{name} @gol -m32 -m64 -mlarge-data-threshold=@var{num} @gol --msse2avx} +-msse2avx -mfentry} @emph{IA-64 Options} @gccoptlist{-mbig-endian -mlittle-endian -mgnu-as -mgnu-ld -mno-pic @gol @@ -12467,6 +12467,14 @@ For systems that use GNU libc, the defau @opindex msse2avx Specify that the assembler should encode SSE instructions with VEX prefix. The option @option{-mavx} turns this on by default. + +@item -mfentry +@itemx -mno-fentry +@opindex mfentry +If profiling is active @option{-pg} put the profiling +counter call before prologue. +Note: On x86 architectures the attribute @code{ms_hook_prologue} +isn't possible at the moment for @option{-mfentry} and @option{-pg}. 
@end table These @samp{-m} switches are supported in addition to the above Index: gcc/gcc/doc/tm.texi =================================================================== --- gcc.orig/gcc/doc/tm.texi 2010-07-24 20:22:00.872096900 +0200 +++ gcc/gcc/doc/tm.texi 2010-07-28 18:57:58.810918700 +0200 @@ -7076,6 +7076,13 @@ Contains the value true if the target pl ``small data'' into a separate section. The default value is false. @end deftypevr +@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void) +It returns true if target wants profile code emitted before prologue. + +The default version of this hook use the target macro +@code{PROFILE_BEFORE_PROLOGUE}. +@end deftypefn + @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp}) Returns true if @var{exp} names an object for which name resolution rules must resolve to the current ``module'' (dynamic shared library Index: gcc/gcc/doc/tm.texi.in =================================================================== --- gcc.orig/gcc/doc/tm.texi.in 2010-07-24 20:22:00.879096900 +0200 +++ gcc/gcc/doc/tm.texi.in 2010-07-28 18:57:58.840920400 +0200 @@ -7076,6 +7076,8 @@ Contains the value true if the target pl ``small data'' into a separate section. The default value is false. @end deftypevr +@hook TARGET_PROFILE_BEFORE_PROLOGUE + @hook TARGET_BINDS_LOCAL_P Returns true if @var{exp} names an object for which name resolution rules must resolve to the current ``module'' (dynamic shared library Index: gcc/gcc/final.c =================================================================== --- gcc.orig/gcc/final.c 2010-07-24 20:22:00.880096900 +0200 +++ gcc/gcc/final.c 2010-07-28 18:57:58.865921800 +0200 @@ -1546,10 +1546,8 @@ final_start_function (rtx first ATTRIBUT /* The Sun386i and perhaps other machines don't work right if the profiling code comes after the prologue. 
*/ -#ifdef PROFILE_BEFORE_PROLOGUE - if (crtl->profile) + if (targetm.profile_before_prologue () && crtl->profile) profile_function (file); -#endif /* PROFILE_BEFORE_PROLOGUE */ #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue) if (dwarf2out_do_frame ()) @@ -1591,10 +1589,8 @@ final_start_function (rtx first ATTRIBUT static void profile_after_prologue (FILE *file ATTRIBUTE_UNUSED) { -#ifndef PROFILE_BEFORE_PROLOGUE - if (crtl->profile) + if (!targetm.profile_before_prologue () && crtl->profile) profile_function (file); -#endif /* not PROFILE_BEFORE_PROLOGUE */ } static void Index: gcc/gcc/function.c =================================================================== --- gcc.orig/gcc/function.c 2010-07-24 20:22:00.882096900 +0200 +++ gcc/gcc/function.c 2010-07-28 18:57:58.895923600 +0200 @@ -5183,13 +5183,11 @@ thread_prologue_and_epilogue_insns (void record_insns (seq, NULL, &prologue_insn_hash); emit_note (NOTE_INSN_PROLOGUE_END); -#ifndef PROFILE_BEFORE_PROLOGUE /* Ensure that instructions are not moved into the prologue when profiling is on. The call to the profiling routine can be emitted within the live range of a call-clobbered register. */ - if (crtl->profile) + if (!targetm.profile_before_prologue () && crtl->profile) emit_insn (gen_blockage ()); -#endif seq = get_insns (); end_sequence (); Index: gcc/gcc/target.def =================================================================== --- gcc.orig/gcc/target.def 2010-07-24 20:22:00.885096900 +0200 +++ gcc/gcc/target.def 2010-07-28 18:57:58.921925000 +0200 @@ -1218,6 +1218,15 @@ DEFHOOK bool, (const_tree exp), default_binds_local_p) +/* Check if profiling code is before or after prologue. 
*/ +DEFHOOK +(profile_before_prologue, + "It returns true if target wants profile code emitted before prologue.\n\n\ +The default version of this hook use the target macro\n\ +@code{PROFILE_BEFORE_PROLOGUE}.", + bool, (void), + default_profile_before_prologue) + /* Modify and return the identifier of a DECL's external name, originally identified by ID, as required by the target, (eg, append @nn to windows32 stdcall function names). Index: gcc/gcc/targhooks.c =================================================================== --- gcc.orig/gcc/targhooks.c 2010-07-24 20:22:00.886096900 +0200 +++ gcc/gcc/targhooks.c 2010-07-28 19:10:55.778358700 +0200 @@ -1197,4 +1197,14 @@ default_register_move_cost (enum machine #endif } +bool +default_profile_before_prologue (void) +{ +#ifdef PROFILE_BEFORE_PROLOGUE + return true; +#else + return false; +#endif +} + #include "gt-targhooks.h" Index: gcc/gcc/targhooks.h =================================================================== --- gcc.orig/gcc/targhooks.h 2010-07-24 20:22:00.897096900 +0200 +++ gcc/gcc/targhooks.h 2010-07-28 18:57:58.963927400 +0200 @@ -150,3 +150,4 @@ extern int default_memory_move_cost (enu extern int default_register_move_cost (enum machine_mode, reg_class_t, reg_class_t); +extern bool default_profile_before_prologue (void); Index: gcc/gcc/config/i386/i386.h =================================================================== --- gcc.orig/gcc/config/i386/i386.h 2010-07-20 19:50:41.000000000 +0200 +++ gcc/gcc/config/i386/i386.h 2010-07-28 19:00:14.958705900 +0200 @@ -1607,6 +1607,8 @@ typedef struct ix86_args { #define MCOUNT_NAME "_mcount" +#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__" + #define PROFILE_COUNT_REGISTER "edx" /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function, ^ permalink raw reply [flat|nested] 27+ messages in thread
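For readers skimming the patch above: its generic part replaces the compile-time `#ifdef PROFILE_BEFORE_PROLOGUE` test with a run-time query through a target hook, and the i386 part picks the counter symbol at run time. The following is a minimal standalone sketch of that idea, not GCC code. `struct target_hooks`, `emit_function`, `emit_profiler_call`, `append`, `ix86_like_profile_before_prologue` and the `profiling` parameter are hypothetical stand-ins for `targetm`, `final_start_function`/`profile_after_prologue`, `x86_function_profiler`, and `crtl->profile`.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MCOUNT_NAME "_mcount"
#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"

/* Stand-in for the option variable; 0 = off, nonzero = on.  */
static int flag_fentry;

/* Default hook: mirrors the old macro, positive test first.  */
static bool default_profile_before_prologue(void)
{
#ifdef PROFILE_BEFORE_PROLOGUE
  return true;
#else
  return false;
#endif
}

/* Hypothetical target override: decided by a run-time option.  */
static bool ix86_like_profile_before_prologue(void)
{
  return flag_fentry != 0;
}

/* Hypothetical, much reduced version of the target vector.  */
struct target_hooks {
  bool (*profile_before_prologue)(void);
};

/* Append TEXT to the NUL-terminated buffer OUT of capacity SIZE.  */
static void append(char *out, size_t size, const char *text)
{
  size_t used = strlen(out);
  snprintf(out + used, size - used, "%s", text);
}

/* The symbol is now a run-time value, so it is passed through %s
   instead of being pasted into the string literal.  */
static void emit_profiler_call(char *out, size_t size)
{
  const char *name = flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE : MCOUNT_NAME;
  size_t used = strlen(out);
  snprintf(out + used, size - used, "\tcall\t%s\n", name);
}

/* Emit the profiler call either before or after the prologue,
   depending on what the hook answers.  */
static void emit_function(const struct target_hooks *t, bool profiling,
                          char *out, size_t size)
{
  out[0] = '\0';
  if (t->profile_before_prologue() && profiling)
    emit_profiler_call(out, size);
  append(out, size, "\t<prologue>\n");
  if (!t->profile_before_prologue() && profiling)
    emit_profiler_call(out, size);
  append(out, size, "\t<body>\n");
}
```

Because the two call sites test the same hook with opposite polarity, exactly one of them emits the call when profiling is on. That is the property the paired `#ifdef`/`#ifndef` blocks provided before, now decided per invocation rather than per build.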
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-28 17:28 ` Kai Tietz
@ 2010-07-28 17:40 ` Richard Henderson
  2010-07-28 18:14 ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2010-07-28 17:40 UTC (permalink / raw)
To: Kai Tietz; +Cc: Andi Kleen, GCC Patches

On 07/28/2010 10:23 AM, Kai Tietz wrote:
> * config/i386/i386.h (MCOUNT_NAME_BEFORE_PROLOGUE): New.
> * config/i386/i386.c (ix86_profile_before_prologue): New.
> (override_options): Add special handling for -mfentry.
> (ix86_function_regparm): Likewise.
> (ix86_function_sseregparm): Likewise.
> (ix86_frame_pointer_required): Likewise.
> (ix86_expand_prologue): Check for ms_hook_prologue.
> (x86_function_profiler): Adjust mcount output.
> (TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
> * config/i386/i386.opt (mfentry): New.
> * doc/invoke.texi (mfentry): Add documentation.
> * doc/tm.texi: Regenerated.
> * doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
> * final.c (final_start_function): Replace macro
> PROFILE_BEFORE_PROLOGUE by target hook.
> * function.c (thread_prologue_and_epilogue_insns): Likewise.
> * target.def (profile_before_prologue): New hook.
> * targhooks.c (default_profile_before_prologue): New.
> * targhooks.h (default_profile_before_prologue): New.
>
> Ok for apply?

Nearly.

> + if (flag_fentry == -1)
> +#if defined(PROFILE_BEFORE_PROLOGUE)
> + flag_fentry = ((!TARGET_64BIT && flag_pic) ? 0 : 1);
> +#else
> + flag_fentry = 0;
> +#endif
> + if (flag_fentry != 0 && !TARGET_64BIT && flag_pic)
> + {
> + sorry ("-mfentry isn't support for x86 in combination with -fpic");
> + flag_fentry = 0;
> + }

Better as

  if (!TARGET_64BIT && flag_pic)
    {
      if (flag_fentry > 0)
        sorry ("-mfentry isn't support for x86 in combination with -fpic");
      flag_fentry = 0;
    }
  if (flag_fentry < 0)
    {
#if defined(PROFILE_BEFORE_PROLOGUE)
      flag_fentry = 1;
#else
      flag_fentry = 0;
#endif
    }

Ok with that change.

r~
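The ordering suggested in the review above (apply the hard constraint first, then fill in the default only if the option was not given) can be checked in isolation. This is a sketch under assumptions, not the code in i386.c: `resolve_fentry` and its parameters are hypothetical names that stand in for `flag_fentry`, `TARGET_64BIT`, `flag_pic` and the `PROFILE_BEFORE_PROLOGUE` test, and a boolean out-parameter replaces the call to `sorry ()`.

```c
#include <stdbool.h>

/* Tri-state option value, matching Init(-1) in the .opt entry:
   -1 = option not given, 0 = -mno-fentry, 1 = -mfentry.

   Hypothetical helper: returns the resolved value (0 or 1) and sets
   *diagnosed when an explicit -mfentry had to be rejected.  */
static int resolve_fentry(int requested, bool target_64bit, bool pic,
                          bool before_prologue_default, bool *diagnosed)
{
  int flag = requested;
  *diagnosed = false;

  /* Hard constraint first: 32-bit PIC code cannot use the feature.
     Only an explicit request deserves a diagnostic.  */
  if (!target_64bit && pic)
    {
      if (flag > 0)
        *diagnosed = true;  /* the real code calls sorry () here */
      flag = 0;
    }

  /* Only a still-unset option falls back to the target default.  */
  if (flag < 0)
    flag = before_prologue_default ? 1 : 0;

  return flag;
}
```

Compared with the first version of the hunk, the PIC test appears once instead of twice, and the diagnostic can only fire for an explicit `-mfentry`, never for a default.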
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-28 17:40 ` Richard Henderson
@ 2010-07-28 18:14 ` Kai Tietz
  0 siblings, 0 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-28 18:14 UTC (permalink / raw)
To: Richard Henderson; +Cc: Andi Kleen, GCC Patches

2010/7/28 Richard Henderson <rth@redhat.com>:
> On 07/28/2010 10:23 AM, Kai Tietz wrote:
>> * config/i386/i386.h (MCOUNT_NAME_BEFORE_PROLOGUE): New.
>> * config/i386/i386.c (ix86_profile_before_prologue): New.
>> (override_options): Add special handling for -mfentry.
>> (ix86_function_regparm): Likewise.
>> (ix86_function_sseregparm): Likewise.
>> (ix86_frame_pointer_required): Likewise.
>> (ix86_expand_prologue): Check for ms_hook_prologue.
>> (x86_function_profiler): Adjust mcount output.
>> (TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
>> * config/i386/i386.opt (mfentry): New.
>> * doc/invoke.texi (mfentry): Add documentation.
>> * doc/tm.texi: Regenerated.
>> * doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
>> * final.c (final_start_function): Replace macro
>> PROFILE_BEFORE_PROLOGUE by target hook.
>> * function.c (thread_prologue_and_epilogue_insns): Likewise.
>> * target.def (profile_before_prologue): New hook.
>> * targhooks.c (default_profile_before_prologue): New.
>> * targhooks.h (default_profile_before_prologue): New.
>>
>> Ok for apply?
>
> Nearly.
>
>> + if (flag_fentry == -1)
>> +#if defined(PROFILE_BEFORE_PROLOGUE)
>> + flag_fentry = ((!TARGET_64BIT && flag_pic) ? 0 : 1);
>> +#else
>> + flag_fentry = 0;
>> +#endif
>> + if (flag_fentry != 0 && !TARGET_64BIT && flag_pic)
>> + {
>> + sorry ("-mfentry isn't support for x86 in combination with -fpic");
>> + flag_fentry = 0;
>> + }
>
> Better as
>
>   if (!TARGET_64BIT && flag_pic)
>     {
>       if (flag_fentry > 0)
>         sorry ("-mfentry isn't support for x86 in combination with -fpic");
>       flag_fentry = 0;
>     }
>   if (flag_fentry < 0)
>     {
> #if defined(PROFILE_BEFORE_PROLOGUE)
>       flag_fentry = 1;
> #else
>       flag_fentry = 0;
> #endif
>     }
>
> Ok with that change.
>
>
> r~
>

Applied at revision 162651 with your suggested code change, and with
the diagnostic corrected to say "supported" and "32-bit" instead of
"support" and "x86".

Regards,
Kai

--
| (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-15 18:08 ` Kai Tietz
  2010-07-16 17:06 ` Richard Henderson
@ 2010-07-16 20:53 ` Gerald Pfeifer
  2010-07-18 12:20 ` Kai Tietz
  2010-07-16 23:57 ` Andi Kleen
  2 siblings, 1 reply; 27+ messages in thread
From: Gerald Pfeifer @ 2010-07-16 20:53 UTC (permalink / raw)
To: Kai Tietz; +Cc: Richard Henderson, Andi Kleen, GCC Patches

On Thu, 15 Jul 2010, Kai Tietz wrote:
> * doc/invoke.texi (mfentry): Add documentation.

+Note: On x86 architectures the attribute @code{ms_hook_prologue}
+isn't possible at the moment for @option{-mfentry} and @option{-pg}.

How about: "...is not supported for...at the moment."?

And if you could add an item to http://gcc.gnu.org/gcc-4.6/changes.html
when this goes in, that would be good.

Gerald
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature 2010-07-16 20:53 ` Gerald Pfeifer @ 2010-07-18 12:20 ` Kai Tietz 2010-07-18 20:52 ` Gerald Pfeifer 0 siblings, 1 reply; 27+ messages in thread From: Kai Tietz @ 2010-07-18 12:20 UTC (permalink / raw) To: Gerald Pfeifer; +Cc: Richard Henderson, Andi Kleen, GCC Patches [-- Attachment #1: Type: text/plain, Size: 659 bytes --] 2010/7/16 Gerald Pfeifer <gerald@pfeifer.com>: > On Thu, 15 Jul 2010, Kai Tietz wrote: >> * doc/invoke.texi (mfentry): Add documentation. > > +Note: On x86 architectures the attribute @code{ms_hook_prologue} > +isn't possible at the moment for @option{-mfentry} and @option{-pg}. > > How about: "...is not supported for...at the moment."? > > And if you could add an item to http://gcc.gnu.org/gcc-4.6/changes.html > when this goes in, that would be good. > > Gerald > Something like this? Regards, Kai -- | (\_/) This is Bunny. Copy and paste | (='.'=) Bunny into your signature to help | (")_(") him gain world domination [-- Attachment #2: gcc46_change_profile.diff --] [-- Type: application/octet-stream, Size: 438 bytes --] Index: gcc-4.6/changes.html =================================================================== --- gcc-4.6.orig/changes.html 2010-07-16 13:35:00.000000000 +0200 +++ gcc-4.6/changes.html 2010-07-18 14:18:14.061973200 +0200 @@ -197,7 +197,8 @@ <h3>IA-32/x86-64</h3> <ul> - <li>...</li> + <li>Support of emitting profiler counter call before prologue via + command line option <code>-mfentry</code>.</li> </ul> <h3>MIPS</h3> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-18 12:20 ` Kai Tietz
@ 2010-07-18 20:52 ` Gerald Pfeifer
  2010-07-18 20:54 ` Kai Tietz
  2010-07-28 18:06 ` Kai Tietz
  0 siblings, 2 replies; 27+ messages in thread
From: Gerald Pfeifer @ 2010-07-18 20:52 UTC (permalink / raw)
To: Kai Tietz; +Cc: Richard Henderson, Andi Kleen, gcc-patches

On Sun, 18 Jul 2010, Kai Tietz wrote:
> Something like this?

+ <li>Support of emitting profiler counter call before prologue via
+ command line option <code>-mfentry</code>.</li>

How about

  <li>Support for emitting profiler counter calls before function
  prologues. This is enabled via a new command-line option
  <code>-mfentry</code>.</li>

?

If this, or a variation, is fine with you, let's give Richard a day or
two to comment and then go ahead and commit.

Thanks,
Gerald
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature 2010-07-18 20:52 ` Gerald Pfeifer @ 2010-07-18 20:54 ` Kai Tietz 2010-07-28 18:06 ` Kai Tietz 1 sibling, 0 replies; 27+ messages in thread From: Kai Tietz @ 2010-07-18 20:54 UTC (permalink / raw) To: Gerald Pfeifer; +Cc: Richard Henderson, Andi Kleen, gcc-patches 2010/7/18 Gerald Pfeifer <gerald@pfeifer.com>: > On Sun, 18 Jul 2010, Kai Tietz wrote: >> Something like this? > > + <li>Support of emitting profiler counter call before prologue via > + command line option <code>-mfentry</code>.</li> > > How about > > <li>Support for emitting profiler counter calls before function > prologues. This is enabled via a new command-line option > <code>-mfentry</code>.</li> > > ? > > If this, or a variation, is fine with you, let's give Richard a day or > two to comment and then go ahead and commit. > > Thanks, > Gerald > Thanks, I'll take the variant you suggested. Regards, Kai -- | (\_/) This is Bunny. Copy and paste | (='.'=) Bunny into your signature to help | (")_(") him gain world domination ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature 2010-07-18 20:52 ` Gerald Pfeifer 2010-07-18 20:54 ` Kai Tietz @ 2010-07-28 18:06 ` Kai Tietz 1 sibling, 0 replies; 27+ messages in thread From: Kai Tietz @ 2010-07-28 18:06 UTC (permalink / raw) To: Gerald Pfeifer; +Cc: Richard Henderson, Andi Kleen, gcc-patches 2010/7/18 Gerald Pfeifer <gerald@pfeifer.com>: > On Sun, 18 Jul 2010, Kai Tietz wrote: >> Something like this? > > + <li>Support of emitting profiler counter call before prologue via > + command line option <code>-mfentry</code>.</li> > > How about > > <li>Support for emitting profiler counter calls before function > prologues. This is enabled via a new command-line option > <code>-mfentry</code>.</li> > > ? > > If this, or a variation, is fine with you, let's give Richard a day or > two to comment and then go ahead and commit. > > Thanks, > Gerald > Ok, added changelog entry as Index: changes.html =================================================================== RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/changes.html,v retrieving revision 1.33 diff -u -3 -r1.33 changes.html --- changes.html 24 Jul 2010 19:37:48 -0000 1.33 +++ changes.html 28 Jul 2010 18:02:55 -0000 @@ -200,7 +200,9 @@ <h3>IA-32/x86-64</h3> <ul> - <li>...</li> + <li>Support for emitting profiler counter calls before function + prologues. This is enabled via a new command-line option + <code>-mfentry</code>.</li> </ul> <h3>MIPS</h3> Regards, Kai -- | (\_/) This is Bunny. Copy and paste | (='.'=) Bunny into your signature to help | (")_(") him gain world domination ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-15 18:08 ` Kai Tietz
  2010-07-16 17:06 ` Richard Henderson
  2010-07-16 20:53 ` Gerald Pfeifer
@ 2010-07-16 23:57 ` Andi Kleen
  2010-07-17  5:34 ` Kai Tietz
  2 siblings, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2010-07-16 23:57 UTC
To: Kai Tietz; +Cc: Richard Henderson, Andi Kleen, GCC Patches

On Thu, Jul 15, 2010 at 08:08:24PM +0200, Kai Tietz wrote:
> Hello Andy,
>
> I updated my patch in that way, that it should be trivial to add the
> counter function for before prologue profiling to linux target by a
> one-liner.
> Just make sure that for the i386-target the macro
> MCOUNT_NAME_BEFORE_PROLOGUE is defined.
>
> I reworked the patch so that the option is now named -mfentry and it
> is available for all i386 targets, if they have defined the counter
> function's name via MCOUNT_NAME_BEFORE_PROLOGUE in target.
> Additionally I added some option-checks for targets, which don't
> support before prologue profiling.

Kai,

I tried the patch on x86_64-linux but it doesn't work for me. First I added
the define to linux.h:

diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h
index 81dfd1e..54051ed 100644
--- a/gcc/config/i386/linux.h
+++ b/gcc/config/i386/linux.h
@@ -48,6 +48,10 @@ along with GCC; see the file COPYING3.  If not see

 #define NO_PROFILE_COUNTERS 1

+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"
+
 #undef MCOUNT_NAME
 #define MCOUNT_NAME "mcount"

But when I try to set -mfentry on a simple test program I get

sorry, unimplemented: -mfentry isn't supported for this target

I think that's because of

+#if defined(PROFILE_BEFORE_PROLOGUE)
+  default_profile_top_flag = 1;
+#endif
+#if defined(MCOUNT_NAME) && defined (MCOUNT_NAME_BEFORE_PROLOGUE)
+  only_default = 0;
+#endif
+
+  if (flag_fentry == -1)
+    flag_fentry = default_profile_top_flag;
+  else if (flag_fentry != default_profile_top_flag && only_default)
+    {
+      if (!default_profile_top_flag)
+        sorry ("-mfentry isn't supported for this target");
+      else
+        sorry ("-mno-fentry isn't supported for this target");

and PROFILE_BEFORE_PROLOGUE is never set for i386, so
default_profile_top_flag is always 0.

-Andi
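The option check quoted in the message above can be sketched as a small stand-alone function, which makes it easier to see when the "sorry" diagnostic is reachable. This is only a model under stated assumptions: the two preprocessor conditions are turned into boolean parameters, and `struct fentry_result` and `resolve_fentry` are hypothetical names that do not exist in GCC. Like the quoted excerpt, the sketch stops at the diagnostic and does not reset the flag afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type for this sketch; not part of GCC.  */
struct fentry_result
{
  int flag_fentry;    /* Value of the flag after the check.  */
  const char *sorry;  /* Diagnostic text, or NULL if none was issued.  */
};

/* Model of the quoted check.
   FLAG_FENTRY is -1 when neither -mfentry nor -mno-fentry was given.
   HAVE_PROFILE_BEFORE_PROLOGUE stands for
     defined(PROFILE_BEFORE_PROLOGUE).
   HAVE_BOTH_NAMES stands for
     defined(MCOUNT_NAME) && defined(MCOUNT_NAME_BEFORE_PROLOGUE).  */
static struct fentry_result
resolve_fentry (int flag_fentry, bool have_profile_before_prologue,
                bool have_both_names)
{
  struct fentry_result r = { flag_fentry, NULL };
  int default_profile_top_flag = have_profile_before_prologue ? 1 : 0;
  bool only_default = !have_both_names;

  if (r.flag_fentry == -1)
    /* No explicit option: take the target default.  */
    r.flag_fentry = default_profile_top_flag;
  else if (r.flag_fentry != default_profile_top_flag && only_default)
    /* Explicit option differs from the only supported placement.  */
    r.sorry = (default_profile_top_flag
               ? "-mno-fentry isn't supported for this target"
               : "-mfentry isn't supported for this target");

  return r;
}
```

In this model, `resolve_fentry (1, false, true)` issues no diagnostic, while `resolve_fentry (1, false, false)` does. The second case corresponds to a configuration in which MCOUNT_NAME_BEFORE_PROLOGUE never reaches i386.c.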
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-16 23:57 ` Andi Kleen
@ 2010-07-17  5:34 ` Kai Tietz
  2010-07-17  9:45 ` Andi Kleen
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-17 5:34 UTC
To: Andi Kleen; +Cc: Richard Henderson, GCC Patches

2010/7/17 Andi Kleen <andi@firstfloor.org>:
> On Thu, Jul 15, 2010 at 08:08:24PM +0200, Kai Tietz wrote:
>> Hello Andy,
>>
>> I updated my patch in that way, that it should be trivial to add the
>> counter function for before prologue profiling to linux target by a
>> one-liner.
>> Just make sure that for the i386-target the macro
>> MCOUNT_NAME_BEFORE_PROLOGUE is defined.
>>
>> I reworked the patch so that the option is now named -mfentry and it
>> is available for all i386 targets, if they have defined the counter
>> function's name via MCOUNT_NAME_BEFORE_PROLOGUE in target.
>> Additionally I added some option-checks for targets, which don't
>> support before prologue profiling.
>
> Kai,
>
> I tried the patch on x86_64-linux but it doesn't work for me. First I added
> the define to linux.h:
>
> diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h
> index 81dfd1e..54051ed 100644
> --- a/gcc/config/i386/linux.h
> +++ b/gcc/config/i386/linux.h
> @@ -48,6 +48,10 @@ along with GCC; see the file COPYING3.  If not see
>
>  #define NO_PROFILE_COUNTERS 1
>
> +/* Choose the correct profiler mcount name.  */
> +#undef MCOUNT_NAME_BEFORE_PROLOGUE
> +#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"
> +
>  #undef MCOUNT_NAME
>  #define MCOUNT_NAME "mcount"
>
> But when I try to set -mfentry on a simple test program I get
>
> sorry, unimplemented: -mfentry isn't supported for this target
>
> I think that's because of
>
> +#if defined(PROFILE_BEFORE_PROLOGUE)
> +  default_profile_top_flag = 1;
> +#endif
> +#if defined(MCOUNT_NAME) && defined (MCOUNT_NAME_BEFORE_PROLOGUE)
> +  only_default = 0;
> +#endif
> +
> +  if (flag_fentry == -1)
> +    flag_fentry = default_profile_top_flag;
> +  else if (flag_fentry != default_profile_top_flag && only_default)
> +    {
> +      if (!default_profile_top_flag)
> +        sorry ("-mfentry isn't supported for this target");
> +      else
> +        sorry ("-mno-fentry isn't supported for this target");
>
>
> and PROFILE_BEFORE_PROLOGUE is never set for i386, so
> default_profile_top_flag is always 0.
>
> -Andi
>

Hmm, I can't reproduce this. Clearly, if PROFILE_BEFORE_PROLOGUE isn't
set, the default remains the profile counter call after the prologue.
But if 'defined(MCOUNT_NAME) && defined (MCOUNT_NAME_BEFORE_PROLOGUE)'
holds, the variable only_default is false, so the error you are showing
shouldn't be reachable. Am I missing something here?

Kai
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-17  5:34 ` Kai Tietz
@ 2010-07-17  9:45 ` Andi Kleen
  2010-07-18 11:37 ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2010-07-17 9:45 UTC
To: Kai Tietz; +Cc: Andi Kleen, Richard Henderson, GCC Patches

> Hmm, I can't reproduce this. Clearly, if PROFILE_BEFORE_PROLOGUE isn't
> set, the default remains the profile counter call after the prologue.
> But if 'defined(MCOUNT_NAME) && defined (MCOUNT_NAME_BEFORE_PROLOGUE)'
> holds, the variable only_default is false, so the error you are showing
> shouldn't be reachable. Am I missing something here?

Looks like it was my fault. It seems adding the define to linux.h is not
enough for it to reach i386.c (MCOUNT_NAME_BEFORE_PROLOGUE is undefined)
on my configuration (x86_64-linux).

If I add it to linux64.h too, it works. I thought linux64.h would inherit
from linux.h, but apparently that's not the case.

With this patch a simple test program works for a 64-bit build.

-Andi

2010-07-17  Andi Kleen  <ak@linux.intel.com>

        * config/i386/linux.h (MCOUNT_NAME_BEFORE_PROLOGUE): Define.
        * config/i386/linux64.h (MCOUNT_NAME_BEFORE_PROLOGUE): Define.
        (MCOUNT_NAME): Define.

diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h
index 81dfd1e..54051ed 100644
--- a/gcc/config/i386/linux.h
+++ b/gcc/config/i386/linux.h
@@ -48,6 +48,10 @@ along with GCC; see the file COPYING3.  If not see

 #define NO_PROFILE_COUNTERS 1

+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"
+
 #undef MCOUNT_NAME
 #define MCOUNT_NAME "mcount"

diff --git a/gcc/config/i386/linux64.h b/gcc/config/i386/linux64.h
index 33b4dc9..f40fa31 100644
--- a/gcc/config/i386/linux64.h
+++ b/gcc/config/i386/linux64.h
@@ -123,3 +123,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    x86_64 glibc provides it in %fs:0x28.  */
 #define TARGET_THREAD_SSP_OFFSET (TARGET_64BIT ? 0x28 : 0x14)
 #endif
+
+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"
+
+#undef MCOUNT_NAME
+#define MCOUNT_NAME "mcount"
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-17  9:45 ` Andi Kleen
@ 2010-07-18 11:37 ` Kai Tietz
  2010-07-18 11:46 ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-18 11:37 UTC
To: Richard Henderson; +Cc: GCC Patches, Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 1477 bytes --]

Hello,

I found one remaining nit for cygwin/mingw, which this patch corrects.
For win32 targets, an initial call to _monstartup has to be made in main
before mcount gets called. To handle this, I added the new macro
PROFILE_HOOK_BEFORE_PROFILE to cygming.h.

So here is the new patch with an updated ChangeLog:

        * config/i386/cygming.h (MCOUNT_NAME): New.
        (MCOUNT_NAME_BEFORE_PROLOGUE): New.
        (PROFILE_HOOK_BEFORE_PROFILE): New.
        (PROFILE_HOOK): Check if not fentry is active.
        * config/i386/i386.c (ix86_profile_before_prologue): New.
        (override_options): Add special handling for -mfentry.
        (ix86_function_regparm): Likewise.
        (ix86_function_sseregparm): Likewise.
        (ix86_frame_pointer_required): Likewise.
        (ix86_expand_prologue): Check for ms_hook_prologue.
        (x86_function_profiler): Adjust mcount output and
        call PROFILE_HOOK_BEFORE_PROFILE.
        (TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
        * config/i386/i386.opt (mfentry): New.
        * doc/invoke.texi (mfentry): Add documentation.
        * doc/tm.texi: Regenerated.
        * doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
        * final.c (final_start_function): Replace macro
        PROFILE_BEFORE_PROLOGUE by target hook.
        * function.c (thread_prologue_and_epilogue_insns): Likewise.
        * target.def (profile_before_prologue): New hook.
        * targhooks.c (default_profile_before_prologue): New.
        * targhooks.h (default_profile_before_prologue): New.

Tested for i686-pc-mingw32, i686-pc-cygwin, and x86_64-pc-mingw32. OK to apply?

Regards,
Kai

[-- Attachment #2: profileb.diff --]
[-- Type: application/octet-stream, Size: 14555 bytes --]

Index: gcc/gcc/config/i386/cygming.h
===================================================================
--- gcc.orig/gcc/config/i386/cygming.h 2010-07-18 12:15:00.061060600 +0200
+++ gcc/gcc/config/i386/cygming.h 2010-07-18 13:35:41.901998000 +0200
@@ -39,6 +39,13 @@ along with GCC; see the file COPYING3.
 #undef DEFAULT_ABI
 #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI)

+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME
+#define MCOUNT_NAME "_mcount"
+
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "_mcount_top"
+
 #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
 #undef USER_LABEL_PREFIX
 #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
@@ -327,7 +334,7 @@ do { \

 #undef PROFILE_HOOK
 #define PROFILE_HOOK(LABEL) \
-  if (MAIN_NAME_P (DECL_NAME (current_function_decl))) \
+  if (!flag_fentry && MAIN_NAME_P (DECL_NAME (current_function_decl))) \
     { \
       emit_call_insn (gen_rtx_CALL (VOIDmode, \
         gen_rtx_MEM (FUNCTION_MODE, \
@@ -335,6 +342,13 @@ do { \
         const0_rtx)); \
     }

+#undef PROFILE_HOOK_BEFORE_PROFILE
+#define PROFILE_HOOK_BEFORE_PROFILE(FILE, LABEL) \
+  if (flag_fentry && MAIN_NAME_P (DECL_NAME (current_function_decl))) \
+    { \
+      fprintf ((FILE), "\tcall\t%s_monstartup\n", user_label_prefix); \
+    }
+
 /* Java Native Interface (JNI) methods on Win32 are invoked using the
    stdcall calling convention.  */
 #undef MODIFY_JNI_METHOD_CALL
Index: gcc/gcc/config/i386/i386.c
===================================================================
--- gcc.orig/gcc/config/i386/i386.c 2010-07-18 12:15:00.062060600 +0200
+++ gcc/gcc/config/i386/i386.c 2010-07-18 13:34:31.547974000 +0200
@@ -2768,6 +2768,15 @@ software_prefetching_beneficial_p (void)
     }
 }

+/* Return true, if profiling code should be emitted before
+   prologue. Otherwise it returns false.
+   Note: For x86 with "hotfix" it is sorried.  */
+static bool
+ix86_profile_before_prologue (void)
+{
+  return flag_fentry != 0;
+}
+
 /* Function that is callable from the debugger to print the current
    options.  */
 void
@@ -3671,6 +3680,34 @@ override_options (bool main_args_p)
     target_flags |= MASK_CLD & ~target_flags_explicit;
 #endif
+  {
+    int default_profile_top_flag = 0;
+    int only_default = 1;
+    bool force_default_profile_top_flag = (!TARGET_64BIT && flag_pic);
+
+#if defined(PROFILE_BEFORE_PROLOGUE)
+    default_profile_top_flag = (force_default_profile_top_flag ? 0 : 1);
+#endif
+#if defined(MCOUNT_NAME) && defined(MCOUNT_NAME_BEFORE_PROLOGUE)
+    only_default = 0;
+#endif
+
+    if (flag_fentry == -1)
+      flag_fentry = default_profile_top_flag;
+    if (flag_fentry != 0 && force_default_profile_top_flag)
+      {
+        sorry ("-mfentry isn't support for x86 in combination with -fpic");
+        flag_fentry = 0;
+      }
+    else if (flag_fentry != default_profile_top_flag && only_default)
+      {
+        if (!default_profile_top_flag)
+          sorry ("-mfentry isn't supported for this target");
+        else
+          sorry ("-mno-fentry isn't supported for this target");
+        flag_fentry = default_profile_top_flag;
+      }
+  }

   /* Save the initial options in case the user does function specific options */
   if (main_args_p)
     target_option_default_node = target_option_current_node
@@ -4841,7 +4878,7 @@ ix86_function_regparm (const_tree type,
   if (decl
       && TREE_CODE (decl) == FUNCTION_DECL
       && optimize
-      && !profile_flag)
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
@@ -4913,7 +4950,8 @@ ix86_function_sseregparm (const_tree typ

   /* For local functions, pass up to SSE_REGPARM_MAX SFmode
      (and DFmode for SSE2) arguments in SSE registers.  */
-  if (decl && TARGET_SSE_MATH && optimize && !profile_flag)
+  if (decl && TARGET_SSE_MATH && optimize
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl));
@@ -7878,7 +7916,7 @@ ix86_frame_pointer_required (void)
           || ix86_current_function_calls_tls_descriptor))
     return true;

-  if (crtl->profile)
+  if (crtl->profile && !flag_fentry)
     return true;

   return false;
@@ -8146,7 +8184,8 @@ gen_push (rtx arg)
 static unsigned int
 ix86_select_alt_pic_regnum (void)
 {
-  if (current_function_is_leaf && !crtl->profile
+  if (current_function_is_leaf
+      && !crtl->profile
       && !ix86_current_function_calls_tls_descriptor)
     {
       int i, drap;
@@ -9194,6 +9233,11 @@ ix86_expand_prologue (void)
     {
       rtx push, mov;

+      /* Check if profiling is active and we shall use profiling before
+         prologue variant. If so sorry.  */
+      if (crtl->profile && flag_fentry != 0)
+        sorry ("ms_hook_prologue attribute isn't compatible with -mfentry for 32-bit");
+
       /* Make sure the function starts with
          8b ff     movl.s %edi,%edi (emited by ix86_asm_output_function_label)
          55        push   %ebp
@@ -9483,7 +9527,7 @@ ix86_expand_prologue (void)
      when mcount needs it.  Blockage to avoid call movement across mcount
      call is emitted in generic code after the NOTE_INSN_PROLOGUE_END
      note.  */
-  if (crtl->profile && pic_reg_used)
+  if (crtl->profile && !flag_fentry && pic_reg_used)
     emit_insn (gen_prologue_use (pic_offset_table_rtx));

   if (crtl->drap_reg && !crtl->stack_realign_needed)
@@ -27285,11 +27329,30 @@ x86_field_alignment (tree field, int com
   return computed;
 }

+#if !defined(MCOUNT_NAME) && !defined(MCOUNT_NAME_BEFORE_PROLOGUE)
+#error MCOUNT_NAME ,and/or MCOUNT_NAME_BEFORE_PROLOGUE have to be define
+#endif
+
+/* Make sure both are getting defined.  */
+#ifndef MCOUNT_NAME
+#define MCOUNT_NAME MCOUNT_NAME_BEFORE_PROLOGUE
+#endif
+#ifndef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME
+#endif
+
 /* Output assembler code to FILE to increment profiler label # LABELNO
    for profiling a function entry.  */
 void
 x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
 {
+  const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE
+                                         : MCOUNT_NAME);
+
+#ifdef PROFILE_HOOK_BEFORE_PROFILE
+  PROFILE_HOOK_BEFORE_PROFILE (file, labelno);
+#endif
+
   if (TARGET_64BIT)
     {
 #ifndef NO_PROFILE_COUNTERS
@@ -27297,9 +27360,9 @@ x86_function_profiler (FILE *file, int l
 #endif

       if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-        fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file);
+        fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
       else
-        fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+        fprintf (file, "\tcall\t%s\n", mcount_name);
     }
   else if (flag_pic)
     {
@@ -27307,7 +27370,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
                LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file);
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
     }
   else
     {
@@ -27315,7 +27378,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
                LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+      fprintf (file, "\tcall\t%s\n", mcount_name);
     }
 }

@@ -31363,6 +31426,9 @@ ix86_enum_va_list (int idx, const char *
 #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD
 #endif

+#undef TARGET_PROFILE_BEFORE_PROLOGUE
+#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue
+
 #undef TARGET_ASM_UNALIGNED_HI_OP
 #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP
 #undef TARGET_ASM_UNALIGNED_SI_OP
Index: gcc/gcc/config/i386/i386.opt
===================================================================
--- gcc.orig/gcc/config/i386/i386.opt 2010-07-18 12:15:00.078060600 +0200
+++ gcc/gcc/config/i386/i386.opt 2010-07-18 13:34:31.568974000 +0200
@@ -375,3 +375,7 @@ Support RDRND built-in functions and cod
 mf16c
 Target Report Mask(ISA_F16C) Var(ix86_isa_flags) VarExists Save
 Support F16C built-in functions and code generation
+
+mfentry
+Target Report Var(flag_fentry) Init(-1)
+Emit profiling counter call at function entry before prologue.
Index: gcc/gcc/doc/invoke.texi
===================================================================
--- gcc.orig/gcc/doc/invoke.texi 2010-07-18 12:15:00.081060600 +0200
+++ gcc/gcc/doc/invoke.texi 2010-07-18 13:34:31.591974000 +0200
@@ -601,7 +601,7 @@ Objective-C and Objective-C++ Dialects}.
 -momit-leaf-frame-pointer -mno-red-zone -mno-tls-direct-seg-refs @gol
 -mcmodel=@var{code-model} -mabi=@var{name} @gol
 -m32 -m64 -mlarge-data-threshold=@var{num} @gol
--msse2avx}
+-msse2avx -mfentry}

 @emph{IA-64 Options}
 @gccoptlist{-mbig-endian -mlittle-endian -mgnu-as -mgnu-ld -mno-pic @gol
@@ -12466,6 +12466,14 @@ For systems that use GNU libc, the defau
 @opindex msse2avx
 Specify that the assembler should encode SSE instructions with VEX
 prefix.  The option @option{-mavx} turns this on by default.
+
+@item -mfentry
+@itemx -mno-fentry
+@opindex mfentry
+If profiling is active @option{-pg} put the profiling
+counter call before prologue.
+Note: On x86 architectures the attribute @code{ms_hook_prologue}
+isn't possible at the moment for @option{-mfentry} and @option{-pg}.
 @end table

 These @samp{-m} switches are supported in addition to the above
Index: gcc/gcc/doc/tm.texi
===================================================================
--- gcc.orig/gcc/doc/tm.texi 2010-07-18 12:15:00.082060600 +0200
+++ gcc/gcc/doc/tm.texi 2010-07-18 12:25:33.157271600 +0200
@@ -7101,6 +7101,13 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr

+@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void)
+It returns true if target wants profile code emitted before prologue.
+
+The default version of this hook use the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp})
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/doc/tm.texi.in
===================================================================
--- gcc.orig/gcc/doc/tm.texi.in 2010-07-18 12:15:00.089060600 +0200
+++ gcc/gcc/doc/tm.texi.in 2010-07-18 12:25:33.163271900 +0200
@@ -7101,6 +7101,8 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr

+@hook TARGET_PROFILE_BEFORE_PROLOGUE
+
 @hook TARGET_BINDS_LOCAL_P
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/final.c
===================================================================
--- gcc.orig/gcc/final.c 2010-07-18 12:15:00.090060600 +0200
+++ gcc/gcc/final.c 2010-07-18 12:25:33.167272200 +0200
@@ -1546,10 +1546,8 @@ final_start_function (rtx first ATTRIBUT

   /* The Sun386i and perhaps other machines don't work right
      if the profiling code comes after the prologue.  */
-#ifdef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* PROFILE_BEFORE_PROLOGUE */

 #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
   if (dwarf2out_do_frame ())
@@ -1591,10 +1589,8 @@ final_start_function (rtx first ATTRIBUT
 static void
 profile_after_prologue (FILE *file ATTRIBUTE_UNUSED)
 {
-#ifndef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (!targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* not PROFILE_BEFORE_PROLOGUE */
 }

 static void
Index: gcc/gcc/function.c
===================================================================
--- gcc.orig/gcc/function.c 2010-07-18 12:15:00.091060600 +0200
+++ gcc/gcc/function.c 2010-07-18 12:25:33.172272400 +0200
@@ -5179,13 +5179,11 @@ thread_prologue_and_epilogue_insns (void
       record_insns (seq, NULL, &prologue_insn_hash);
       emit_note (NOTE_INSN_PROLOGUE_END);

-#ifndef PROFILE_BEFORE_PROLOGUE
       /* Ensure that instructions are not moved into the prologue when
          profiling is on.  The call to the profiling routine can be
          emitted within the live range of a call-clobbered register.  */
-      if (crtl->profile)
+      if (!targetm.profile_before_prologue () && crtl->profile)
         emit_insn (gen_blockage ());
-#endif

       seq = get_insns ();
       end_sequence ();
Index: gcc/gcc/target.def
===================================================================
--- gcc.orig/gcc/target.def 2010-07-18 12:15:00.093060600 +0200
+++ gcc/gcc/target.def 2010-07-18 12:25:33.175272600 +0200
@@ -1218,6 +1218,15 @@ DEFHOOK
  bool, (const_tree exp),
  default_binds_local_p)

+/* Check if profiling code is before or after prologue.  */
+DEFHOOK
+(profile_before_prologue,
+ "It returns true if target wants profile code emitted before prologue.\n\n\
+The default version of this hook use the target macro\n\
+@code{PROFILE_BEFORE_PROLOGUE}.",
+ bool, (void),
+ default_profile_before_prologue)
+
 /* Modify and return the identifier of a DECL's external name,
    originally identified by ID, as required by the target,
    (eg, append @nn to windows32 stdcall function names).
Index: gcc/gcc/targhooks.c
===================================================================
--- gcc.orig/gcc/targhooks.c 2010-07-18 12:15:00.094060600 +0200
+++ gcc/gcc/targhooks.c 2010-07-18 12:25:33.177272700 +0200
@@ -1197,4 +1197,14 @@ default_register_move_cost (enum machine
 #endif
 }

+bool
+default_profile_before_prologue (void)
+{
+#ifndef PROFILE_BEFORE_PROLOGUE
+  return false;
+#else
+  return true;
+#endif
+}
+
 #include "gt-targhooks.h"
Index: gcc/gcc/targhooks.h
===================================================================
--- gcc.orig/gcc/targhooks.h 2010-07-18 12:15:00.110060600 +0200
+++ gcc/gcc/targhooks.h 2010-07-18 12:25:33.180272900 +0200
@@ -150,3 +150,4 @@ extern int default_memory_move_cost (enu

 extern int default_register_move_cost (enum machine_mode, reg_class_t,
                                        reg_class_t);
+extern bool default_profile_before_prologue (void);
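The central idea of the patch above, replacing the compile-time macro PROFILE_BEFORE_PROLOGUE with a hook that is queried at run time, can be sketched in isolation. This is a simplified model, not GCC code: `struct mini_target`, `mini_targetm`, `profile_site`, and `mcount_name` are hypothetical names, and the two counter names are the values from Andi Kleen's Linux patch earlier in the thread.

```c
#include <stdbool.h>

/* Counter function names, as in the Linux patch from this thread.  */
#define MCOUNT_NAME "mcount"
#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"

/* Stands in for the variable behind -mfentry / -mno-fentry.  */
static int flag_fentry;

/* Hypothetical miniature of a target hook vector; GCC's real
   target vector has many more members.  */
struct mini_target
{
  bool (*profile_before_prologue) (void);
};

/* Same logic as ix86_profile_before_prologue in the patch.  */
static bool
mini_ix86_profile_before_prologue (void)
{
  return flag_fentry != 0;
}

static const struct mini_target mini_targetm =
{
  mini_ix86_profile_before_prologue
};

enum profile_site { PROFILE_NONE, PROFILE_BEFORE, PROFILE_AFTER };

/* Models the two tests the patch puts into final.c: exactly one of
   the "before" and "after" sites fires when profiling is active.  */
static enum profile_site
profile_site (const struct mini_target *t, bool crtl_profile)
{
  if (t->profile_before_prologue () && crtl_profile)
    return PROFILE_BEFORE;
  if (!t->profile_before_prologue () && crtl_profile)
    return PROFILE_AFTER;
  return PROFILE_NONE;
}

/* Models the name selection at the top of x86_function_profiler.  */
static const char *
mcount_name (void)
{
  return flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE : MCOUNT_NAME;
}
```

Because the decision is a function call rather than an `#ifdef`, a single compiler binary can choose either placement per invocation, which is what makes a command-line option like -mfentry possible.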
* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-18 11:37 ` Kai Tietz
@ 2010-07-18 11:46 ` Kai Tietz
  0 siblings, 0 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-18 11:46 UTC
To: Richard Henderson; +Cc: GCC Patches, Andi Kleen

2010/7/18 Kai Tietz <ktietz70@googlemail.com>:
> Hello,
>
> I found one remaining nit for cygwin/mingw, which this patch corrects.
> For win32 targets, an initial call to _monstartup has to be made in main
> before mcount gets called. To handle this, I added the new macro
> PROFILE_HOOK_BEFORE_PROFILE to cygming.h.
>
> So here is the new patch with an updated ChangeLog:
>
>        * config/i386/cygming.h (MCOUNT_NAME): New.
>        (MCOUNT_NAME_BEFORE_PROLOGUE): New.
>        (PROFILE_HOOK_BEFORE_PROFILE): New.
>        (PROFILE_HOOK): Check if not fentry is active.
>        * config/i386/i386.c (ix86_profile_before_prologue): New.
>        (override_options): Add special handling for -mfentry.
>        (ix86_function_regparm): Likewise.
>        (ix86_function_sseregparm): Likewise.
>        (ix86_frame_pointer_required): Likewise.
>        (ix86_expand_prologue): Check for ms_hook_prologue.
>        (x86_function_profiler): Adjust mcount output and
>        call PROFILE_HOOK_BEFORE_PROFILE.
>        (TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
>        * config/i386/i386.opt (mfentry): New.
>        * doc/invoke.texi (mfentry): Add documentation.
>        * doc/tm.texi: Regenerated.
>        * doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
>        * final.c (final_start_function): Replace macro
>        PROFILE_BEFORE_PROLOGUE by target hook.
>        * function.c (thread_prologue_and_epilogue_insns): Likewise.
>        * target.def (profile_before_prologue): New hook.
>        * targhooks.c (default_profile_before_prologue): New.
>        * targhooks.h (default_profile_before_prologue): New.
>
>
> Tested for i686-pc-mingw32, i686-pc-cygwin, and x86_64-pc-mingw32. OK to apply?
>
> Regards,
> Kai
>

Hmm, it doesn't hurt, but it seems to me that with the old behavior the
_monstartup hook got called after the mcount function. So I am not sure
the last patch is necessary at all. I therefore withdraw the most recent
patch and fall back to the previous one.

Sorry for the noise.

Kai
end of thread, other threads:[~2010-07-28 18:06 UTC | newest]

Thread overview: 27+ messages
2010-07-13 12:47 [patch i386]: Add for win32 targets pre-prologue profiling feature Kai Tietz
2010-07-13 16:28 ` Richard Henderson
2010-07-14 10:20 ` Kai Tietz
2010-07-14 11:49 ` Dave Korn
2010-07-14 12:11 ` Kai Tietz
2010-07-14 12:16 ` Andi Kleen
2010-07-14 12:38 ` Kai Tietz
2010-07-15 18:08 ` Kai Tietz
2010-07-16 17:06 ` Richard Henderson
2010-07-17  6:52 ` Kai Tietz
2010-07-20  2:27 ` Richard Henderson
2010-07-28  8:36 ` Kai Tietz
2010-07-28 16:00 ` Richard Henderson
2010-07-28 16:01 ` Andi Kleen
2010-07-28 17:28 ` Kai Tietz
2010-07-28 17:40 ` Richard Henderson
2010-07-28 18:14 ` Kai Tietz
2010-07-16 20:53 ` Gerald Pfeifer
2010-07-18 12:20 ` Kai Tietz
2010-07-18 20:52 ` Gerald Pfeifer
2010-07-18 20:54 ` Kai Tietz
2010-07-28 18:06 ` Kai Tietz
2010-07-16 23:57 ` Andi Kleen
2010-07-17  5:34 ` Kai Tietz
2010-07-17  9:45 ` Andi Kleen
2010-07-18 11:37 ` Kai Tietz
2010-07-18 11:46 ` Kai Tietz