public inbox for gcc-patches@gcc.gnu.org
* [patch i386]: Add for win32 targets pre-prologue profiling feature
@ 2010-07-13 12:47 Kai Tietz
  2010-07-13 16:28 ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-13 12:47 UTC
  To: GCC Patches; +Cc: Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 1850 bytes --]

Hello,

This patch adds pre-prologue profiling calls for i386/x86_64 win32
targets. It additionally makes sure that, when the top-profiler call is
enabled, the frame pointer is omitted where possible.
One side note about "hotfix" and profiling: the top-profiler call is
emitted in ix86_asm_output_function_label. The reason is that for ix86
the call needs to be placed before the code pattern, while for x86_64
it can be placed after it. Otherwise the x86 pattern for the patchable
region would be corrupted, as this pattern contains the frame-register
setup. So I think the macro PROFILE_BEFORE_PROLOGUE isn't usable here.

2010-07-13  Kai Tietz

	* config/i386/cygming.h (PROFILE_CALL_AT_TOP): New macro.
	(MCOUNT_NAME): Define specific sub-target macro.
	* config/i386/cygming.opt: New option -fprofile-top.
	* config/i386/i386.c (ix86_function_regparm): Special
	handling for active profiling.
	(ix86_function_sseregparm): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_select_alt_pic_regnum): Likewise.
	(ix86_save_reg): Likewise.
	(ix86_expand_prologue): Likewise.
	(x86_function_profiler_intern): New internal function.
	(ix86_asm_output_function_label): Output preprologue
	profiler call.
	(x86_function_profiler): Emit profiling after prologue
	when no top-profiling is enabled.
	* config/i386/i386.h (PROFILE_CALL_AT_TOP): Define
	macro with a default of zero.
	* doc/invoke.texi (-fprofile-top): Add documentation.

Tested for i686-pc-linux-gnu, x86_64-pc-mingw32, and i686-pc-mingw32.
OK to apply?

Regards,
Kai

PS: If there is general interest in this feature for all i386
targets, the patch can easily be adjusted accordingly.
-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

[-- Attachment #2: profile_top.diff --]
[-- Type: application/octet-stream, Size: 9336 bytes --]

Index: gcc/gcc/config/i386/cygming.h
===================================================================
--- gcc.orig/gcc/config/i386/cygming.h	2010-07-09 11:24:42.000000000 +0200
+++ gcc/gcc/config/i386/cygming.h	2010-07-13 12:34:57.332581900 +0200
@@ -39,6 +39,12 @@ along with GCC; see the file COPYING3.  
 #undef DEFAULT_ABI
 #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI)
 
+#undef PROFILE_CALL_AT_TOP
+#define PROFILE_CALL_AT_TOP (flag_profile_top != 0)
+
+#undef MCOUNT_NAME
+#define MCOUNT_NAME (PROFILE_CALL_AT_TOP ? "_mcount_top" : "_mcount")
+
 #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
 #undef USER_LABEL_PREFIX
 #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
Index: gcc/gcc/config/i386/cygming.opt
===================================================================
--- gcc.orig/gcc/config/i386/cygming.opt	2009-12-11 14:56:47.000000000 +0100
+++ gcc/gcc/config/i386/cygming.opt	2010-07-13 12:26:24.374762200 +0200
@@ -53,3 +53,7 @@ Use the GNU extension to the PE format f
 muse-libstdc-wrappers
 Target Condition({defined (USE_CYGWIN_LIBSTDCXX_WRAPPERS)})
 Compile code that relies on Cygwin DLL wrappers to support C++ operator new/delete replacement
+
+fprofile-top
+Common Report Var(flag_profile_top) Init(0)
+Emit for profiling code the profiler-callback call before prologue.
Index: gcc/gcc/config/i386/i386.c
===================================================================
--- gcc.orig/gcc/config/i386/i386.c	2010-07-13 12:03:39.000000000 +0200
+++ gcc/gcc/config/i386/i386.c	2010-07-13 14:25:24.527182900 +0200
@@ -4841,7 +4841,7 @@ ix86_function_regparm (const_tree type, 
   if (decl
       && TREE_CODE (decl) == FUNCTION_DECL
       && optimize
-      && !profile_flag)
+      && !(profile_flag && !PROFILE_CALL_AT_TOP))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
@@ -4913,7 +4913,8 @@ ix86_function_sseregparm (const_tree typ
 
   /* For local functions, pass up to SSE_REGPARM_MAX SFmode
      (and DFmode for SSE2) arguments in SSE registers.  */
-  if (decl && TARGET_SSE_MATH && optimize && !profile_flag)
+  if (decl && TARGET_SSE_MATH && optimize
+      && !(profile_flag && !PROFILE_CALL_AT_TOP))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl));
@@ -5132,7 +5133,43 @@ ix86_cfun_abi (void)
   return cfun->machine->call_abi;
 }
 
-/* Write the extra assembler code needed to declare a function properly.  */
+/* Output assembler code to FILE to increment profiler label # LABELNO
+   for profiling a function entry.  */
+static void
+x86_function_profiler_intern (FILE *file, int labelno ATTRIBUTE_UNUSED)
+{
+  if (TARGET_64BIT)
+    {
+#ifndef NO_PROFILE_COUNTERS
+      fprintf (file, "\tleaq\t%sP%d(%%rip),%%r11\n", LPREFIX, labelno);
+#endif
+
+      if (DEFAULT_ABI == SYSV_ABI && flag_pic)
+	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", MCOUNT_NAME);
+      else
+	fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
+    }
+  else if (flag_pic)
+    {
+#ifndef NO_PROFILE_COUNTERS
+      fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
+	       LPREFIX, labelno);
+#endif
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", MCOUNT_NAME);
+    }
+  else
+    {
+#ifndef NO_PROFILE_COUNTERS
+      fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
+	       LPREFIX, labelno);
+#endif
+      fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
+    }
+}
+
+/* Write the extra assembler code needed to declare a function properly.
+   Output call to profiler if profiling is enabled and it should be emitted
+   before prologue.  */
 
 void
 ix86_asm_output_function_label (FILE *asm_out_file, const char *fname,
@@ -5151,6 +5188,11 @@ ix86_asm_output_function_label (FILE *as
 
   ASM_OUTPUT_LABEL (asm_out_file, fname);
 
+  /* We output profiling call before hotfix region, caused by the fact
+     that we would otherwise destroy for x86 the magic sequence.  */
+  if (!TARGET_64BIT && PROFILE_CALL_AT_TOP && profile_flag)
+    x86_function_profiler_intern (asm_out_file, 0);
+
   /* Output magic byte marker, if hot-patch attribute is set.
      For x86 case frame-pointer prologue will be emitted in
      expand_prologue.  */
@@ -5164,6 +5206,9 @@ ix86_asm_output_function_label (FILE *as
         /* movl.s %edi, %edi.  */
 	asm_fprintf (asm_out_file, ASM_BYTE "0x8b, 0xff\n");
     }
+  /* We output profiling call after hotfix region for x86_64.  */
+  if (TARGET_64BIT && PROFILE_CALL_AT_TOP && profile_flag)
+    x86_function_profiler_intern (asm_out_file, 0);
 }
 
 /* regclass.c  */
@@ -7875,7 +7920,7 @@ ix86_frame_pointer_required (void)
 	  || ix86_current_function_calls_tls_descriptor))
     return true;
 
-  if (crtl->profile)
+  if (crtl->profile && !PROFILE_CALL_AT_TOP)
     return true;
 
   return false;
@@ -8143,7 +8188,7 @@ gen_push (rtx arg)
 static unsigned int
 ix86_select_alt_pic_regnum (void)
 {
-  if (current_function_is_leaf && !crtl->profile
+  if (current_function_is_leaf && !(crtl->profile && !PROFILE_CALL_AT_TOP)
       && !ix86_current_function_calls_tls_descriptor)
     {
       int i, drap;
@@ -8167,7 +8212,7 @@ ix86_save_reg (unsigned int regno, int m
   if (pic_offset_table_rtx
       && regno == REAL_PIC_OFFSET_TABLE_REGNUM
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile
+	  || (crtl->profile && !PROFILE_CALL_AT_TOP)
 	  || crtl->calls_eh_return
 	  || crtl->uses_const_pool))
     {
@@ -9443,7 +9488,7 @@ ix86_expand_prologue (void)
   pic_reg_used = false;
   if (pic_offset_table_rtx
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile))
+	  || (crtl->profile && !PROFILE_CALL_AT_TOP)))
     {
       unsigned int alt_pic_reg_used = ix86_select_alt_pic_regnum ();
 
@@ -9480,7 +9525,7 @@ ix86_expand_prologue (void)
      when mcount needs it.  Blockage to avoid call movement across mcount
      call is emitted in generic code after the NOTE_INSN_PROLOGUE_END
      note.  */
-  if (crtl->profile && pic_reg_used)
+  if (crtl->profile && !PROFILE_CALL_AT_TOP && pic_reg_used)
     emit_insn (gen_prologue_use (pic_offset_table_rtx));
 
   if (crtl->drap_reg && !crtl->stack_realign_needed)
@@ -27283,37 +27328,13 @@ x86_field_alignment (tree field, int com
 }
 
 /* Output assembler code to FILE to increment profiler label # LABELNO
-   for profiling a function entry.  */
+   for profiling a function entry, if profiling call should be emitted
+   after prologue.  */
 void
-x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
+x86_function_profiler (FILE *file ATTRIBUTE_UNUSED, int labelno ATTRIBUTE_UNUSED)
 {
-  if (TARGET_64BIT)
-    {
-#ifndef NO_PROFILE_COUNTERS
-      fprintf (file, "\tleaq\t%sP%d(%%rip),%%r11\n", LPREFIX, labelno);
-#endif
-
-      if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-	fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file);
-      else
-	fputs ("\tcall\t" MCOUNT_NAME "\n", file);
-    }
-  else if (flag_pic)
-    {
-#ifndef NO_PROFILE_COUNTERS
-      fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
-	       LPREFIX, labelno);
-#endif
-      fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file);
-    }
-  else
-    {
-#ifndef NO_PROFILE_COUNTERS
-      fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
-	       LPREFIX, labelno);
-#endif
-      fputs ("\tcall\t" MCOUNT_NAME "\n", file);
-    }
+  if (!PROFILE_CALL_AT_TOP)
+    x86_function_profiler_intern (file, labelno);
 }
 
 #ifdef ASM_OUTPUT_MAX_SKIP_PAD
Index: gcc/gcc/config/i386/i386.h
===================================================================
--- gcc.orig/gcc/config/i386/i386.h	2010-07-09 11:24:42.000000000 +0200
+++ gcc/gcc/config/i386/i386.h	2010-07-13 12:33:52.273634200 +0200
@@ -1601,6 +1601,8 @@ typedef struct ix86_args {
 
 #define MCOUNT_NAME "_mcount"
 
+#define PROFILE_CALL_AT_TOP 0
+
 #define PROFILE_COUNT_REGISTER "edx"
 
 /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
Index: gcc/gcc/doc/invoke.texi
===================================================================
--- gcc.orig/gcc/doc/invoke.texi	2010-07-09 11:24:43.000000000 +0200
+++ gcc/gcc/doc/invoke.texi	2010-07-13 12:49:46.418596000 +0200
@@ -884,7 +884,7 @@ See i386 and x86-64 Options.
 @emph{i386 and x86-64 Windows Options}
 @gccoptlist{-mconsole -mcygwin -mno-cygwin -mdll
 -mnop-fun-dllimport -mthread -municode -mwin32 -mwindows
--fno-set-stack-executable}
+-fno-set-stack-executable -fprofile-top}
 
 @emph{Xstormy16 Options}
 @gccoptlist{-msim}
@@ -17108,6 +17108,15 @@ set. This is necessary for binaries runn
 Windows, as there the user32 API, which is used to set executable
 privileges, isn't available.
 
+@item -fprofile-top
+@opindex fprofile-top
+This option is available for Cygwin and MinGW targets.  It
+specifies that for profiling the call to profiler should be
+done before prologue.  Default behavior is that profiler-call
+is done after prologue is established. When active it calls
+the @code{_mcount_top} function, otherwise the @code{_mcount}
+function.
+
 @item -mpe-aligned-commons
 @opindex mpe-aligned-commons
 This option is available for Cygwin and MinGW targets.  It


* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-13 12:47 [patch i386]: Add for win32 targets pre-prologue profiling feature Kai Tietz
@ 2010-07-13 16:28 ` Richard Henderson
  2010-07-14 10:20   ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2010-07-13 16:28 UTC
  To: Kai Tietz; +Cc: GCC Patches

On 07/13/2010 05:47 AM, Kai Tietz wrote:
> Hello,
> 
> This patch adds pre-prologue profiling calls for i386/x86_64 win32
> targets. It additionally makes sure that, when the top-profiler call
> is enabled, the frame pointer is omitted where possible.
> One side note about "hotfix" and profiling: the top-profiler call is
> emitted in ix86_asm_output_function_label. The reason is that for
> ix86 the call needs to be placed before the code pattern, while for
> x86_64 it can be placed after it. Otherwise the x86 pattern for the
> patchable region would be corrupted, as this pattern contains the
> frame-register setup. So I think the macro PROFILE_BEFORE_PROLOGUE
> isn't usable here.

Huh?  This is exactly the opposite of what we discussed yesterday on IRC.

For hotfix we have

	.rept 16
	.byte 0xcc
	.endr
function:
	mov.s	edi, edi
	push	ebp
	mov.s	esp, ebp

If *any* of the above is not exactly so, then the runtime pattern match
fails and the hotfix fails.  If we were to write the profiler first,
then we might as well not bother with the hotfix pieces, because they
will never match.

... of course it doesn't help that we emit the last two insns above
within the prologue, so if we simply place the profile before the 
prologue, we'll *still* be splitting the hotfix sequence for 32-bit.

I think the best thing to do is to diagnose hotfix+profile and generate
an error.  I don't think there's anything reasonable we can do.

In the end I don't think there's anything your AT_TOP macro does that 
PROFILE_BEFORE_PROLOGUE doesn't do just as well.


r~
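
To make the pattern-match argument above concrete, here is a minimal
sketch in C. It is not the Windows hot-patching code; the helper name
hotfix_pattern_matches and the constant HOTFIX_PAD_LEN are hypothetical,
and the only facts taken from the thread are the 16 bytes of 0xcc padding
and the three-instruction entry sequence (assumed here to encode as
8b ff, 55, 8b ec).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Number of 0xcc padding bytes in front of the function label, as in the
   listing quoted in the thread.  */
#define HOTFIX_PAD_LEN 16

/* movl.s %edi,%edi ; push %ebp ; movl.s %esp,%ebp  */
static const unsigned char hotfix_entry[] = { 0x8b, 0xff, 0x55, 0x8b, 0xec };

/* Hypothetical matcher: CODE points at the first padding byte, so the
   function label sits HOTFIX_PAD_LEN bytes further on.  Return true only
   if both the padding and the entry sequence are exactly as expected.  */
static bool
hotfix_pattern_matches (const unsigned char *code, size_t len)
{
  size_t i;

  assert (code != NULL);
  if (len < HOTFIX_PAD_LEN + sizeof hotfix_entry)
    return false;
  for (i = 0; i < HOTFIX_PAD_LEN; i++)
    if (code[i] != 0xcc)
      return false;
  return memcmp (code + HOTFIX_PAD_LEN, hotfix_entry,
		 sizeof hotfix_entry) == 0;
}
```

Under these assumptions a buffer holding the padding followed directly
by the entry bytes matches, while the same buffer with a 5-byte
"call rel32" (opcode 0xe8) placed at the label does not, because the
first byte after the padding is no longer 0x8b.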


* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-13 16:28 ` Richard Henderson
@ 2010-07-14 10:20   ` Kai Tietz
  2010-07-14 11:49     ` Dave Korn
  2010-07-14 12:16     ` Andi Kleen
  0 siblings, 2 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-14 10:20 UTC
  To: Richard Henderson; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 3516 bytes --]

Hello,

2010/7/13 Richard Henderson <rth@redhat.com>:
> On 07/13/2010 05:47 AM, Kai Tietz wrote:
>> Hello,
>>
>> This patch adds pre-prologue profiling calls for i386/x86_64 win32
>> targets. It additionally makes sure that, when the top-profiler call
>> is enabled, the frame pointer is omitted where possible.
>> One side note about "hotfix" and profiling: the top-profiler call is
>> emitted in ix86_asm_output_function_label. The reason is that for
>> ix86 the call needs to be placed before the code pattern, while for
>> x86_64 it can be placed after it. Otherwise the x86 pattern for the
>> patchable region would be corrupted, as this pattern contains the
>> frame-register setup. So I think the macro PROFILE_BEFORE_PROLOGUE
>> isn't usable here.
>
> Huh?  This is exactly the opposite of what we discussed yesterday on IRC.
>
> For hotfix we have
>
>        .rept 16
>        .byte 0xcc
>        .endr
> function:
>        mov.s   edi, edi
>        push    ebp
>        mov.s   esp, ebp
>
> If *any* of the above is not exactly so, then the runtime pattern match
> fails and the hotfix fails.  If we were to write the profiler first,
> then we might as well not bother with the hotfix pieces, because they
> will never match.
>
> ... of course it doesn't help that we emit the last two insns above
> within the prologue, so if we simply place the profile before the
> prologue, we'll *still* be splitting the hotfix sequence for 32-bit.
>
> I think the best thing to do is to diagnose hotfix+profile and generate
> an error.  I don't think there's anything reasonable we can do.
>
> In the end I don't think there's anything your AT_TOP macro does that
> PROFILE_BEFORE_PROLOGUE doesn't do just as well.
>
>
> r~
>

This patch implements it via a new target hook,
TARGET_PROFILE_BEFORE_PROLOGUE. For now the feature is active only for
win32 i386 targets and is controlled by the internal target macro
PROFILE_SUPPORT_BEFORE_PROLOGUE.
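
As a rough illustration of the idea (a compile-time #ifdef replaced by a
run-time hook with a macro-driven default), here is a minimal sketch in
C. It is not GCC code: struct target_hooks, example_profile_before_prologue,
profile_site_for and enum profile_site are hypothetical names, and
flag_profile_top is only a stand-in variable for the tri-state option
value (-1 = not given, 0 = off, 1 = on). Unlike the patch, the sketch does
not cache the option value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Default hook: falls back to the old compile-time macro.  The macro is
   left undefined here, so the default answers "after the prologue".  */
static bool
default_profile_before_prologue (void)
{
#ifdef PROFILE_BEFORE_PROLOGUE
  return true;
#else
  return false;
#endif
}

/* Stand-in for the tri-state option variable: -1 = not given on the
   command line, 0 = explicitly off, 1 = explicitly on.  */
static int flag_profile_top = -1;

/* Hypothetical target-specific hook: "not given" is treated like "off".  */
static bool
example_profile_before_prologue (void)
{
  return flag_profile_top > 0;
}

/* Hypothetical, much reduced hook table.  */
struct target_hooks
{
  bool (*profile_before_prologue) (void);
};

enum profile_site { PROFILE_NONE, PROFILE_BEFORE, PROFILE_AFTER };

/* Decide at run time where the profiler call goes, instead of deciding
   with #ifdef when the compiler itself is built.  */
static enum profile_site
profile_site_for (const struct target_hooks *targetm, bool profiling)
{
  assert (targetm != NULL && targetm->profile_before_prologue != NULL);
  if (!profiling)
    return PROFILE_NONE;
  return targetm->profile_before_prologue () ? PROFILE_BEFORE
					      : PROFILE_AFTER;
}
```

The point of the design is that generic code only asks the hook, so a
target can make the answer depend on a command-line option, which a
plain preprocessor guard cannot do.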

2010-07-14  Kai Tietz

	* config/i386/cygming.h (PROFILE_SUPPORT_BEFORE_PROLOGUE): New.
	(MCOUNT_NAME): Win32 specific version.
	* config/i386/cygming.opt (mprofile-top): New option.
	* config/i386/i386.c (ix86_profile_before_prologue):
	New hook.
	(ix86_function_regparm): Handle profiling before
	prologue case.
	(ix86_function_sseregparm): Likewise.
	(ix86_cfun_abi): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_select_alt_pic_regnum): Likewise.
	(ix86_save_reg): Likewise.
	(ix86_expand_prologue): Likewise.
	Additionally emit sorry for 32-bit "hotfix" combined with
	profiling code before the prologue.
	(x86_function_profiler): Use fprintf instead of fputs for
	assembly output.
	(TARGET_PROFILE_BEFORE_PROLOGUE): Define target hook.
	* doc/invoke.texi (mprofile-top): Document option.
	* doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE):
	Add documentation.
	* doc/tm.texi: Regenerated.
	* final.c (final_start_function): Replace
	PROFILE_BEFORE_PROLOGUE guard by target hook.
	(profile_after_prologue): Likewise.
	* function.c (thread_prologue_and_epilogue_insns):
	Likewise.
	* target.def (profile_before_prologue): New hook.
	* targhooks.c (default_profile_before_prologue): New.
	* targhooks.h (default_profile_before_prologue): New.

Tested for i686-pc-linux-gnu, i686-pc-mingw32, and x86_64-pc-mingw32.
OK to apply?

Regards,
Kai



[-- Attachment #2: profile.top.diff --]
[-- Type: application/octet-stream, Size: 13878 bytes --]

Index: gcc/gcc/config/i386/cygming.h
===================================================================
--- gcc.orig/gcc/config/i386/cygming.h	2010-07-13 17:00:51.000000000 +0200
+++ gcc/gcc/config/i386/cygming.h	2010-07-14 10:15:25.467867800 +0200
@@ -39,6 +39,14 @@ along with GCC; see the file COPYING3.  
 #undef DEFAULT_ABI
 #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI)
 
+#undef PROFILE_SUPPORT_BEFORE_PROLOGUE
+#define PROFILE_SUPPORT_BEFORE_PROLOGUE flag_profile_top
+
+/* Choose the correct profiler mcount name. For checking we are using the
+   ix86_profile_before_prologue function as flag_profile_top is tri-state.  */
+#undef MCOUNT_NAME
+#define MCOUNT_NAME (ix86_profile_before_prologue () ? "_mcount_top" : "_mcount")
+
 #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
 #undef USER_LABEL_PREFIX
 #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
Index: gcc/gcc/config/i386/cygming.opt
===================================================================
--- gcc.orig/gcc/config/i386/cygming.opt	2010-07-13 17:00:51.000000000 +0200
+++ gcc/gcc/config/i386/cygming.opt	2010-07-14 09:49:33.869102000 +0200
@@ -53,3 +53,7 @@ Use the GNU extension to the PE format f
 muse-libstdc-wrappers
 Target Condition({defined (USE_CYGWIN_LIBSTDCXX_WRAPPERS)})
 Compile code that relies on Cygwin DLL wrappers to support C++ operator new/delete replacement
+
+mprofile-top
+Target Report Var(flag_profile_top) Init(-1)
+Emit profiling code before prologue.
Index: gcc/gcc/config/i386/i386.c
===================================================================
--- gcc.orig/gcc/config/i386/i386.c	2010-07-13 17:00:51.000000000 +0200
+++ gcc/gcc/config/i386/i386.c	2010-07-14 10:05:54.966610900 +0200
@@ -4770,6 +4770,30 @@ ix86_handle_cconv_attribute (tree *node,
   return NULL_TREE;
 }
 
+/* Return true, if profiling code should be emitted before
+   prologue. Otherwise it returns false.
+   Note: For x86 with "hotfix" it is sorried.  */
+static bool
+ix86_profile_before_prologue (void)
+{
+#ifdef PROFILE_SUPPORT_BEFORE_PROLOGUE
+  static int flag_value = -1;
+  if (flag_value == -1)
+    {
+      flag_value = PROFILE_SUPPORT_BEFORE_PROLOGUE;
+      if (flag_value == -1)
+	{
+	  /* Set it to default 0.  We need this tri-state for later
+	     checking of compatibility and target preferences.  */
+	  flag_value = 0;
+	}
+    }
+  return flag_value != 0;
+#else
+  return false;
+#endif
+}
+
 /* Return 0 if the attributes for two types are incompatible, 1 if they
    are compatible, and 2 if they are nearly compatible (which causes a
    warning to be generated).  */
@@ -4841,7 +4865,7 @@ ix86_function_regparm (const_tree type, 
   if (decl
       && TREE_CODE (decl) == FUNCTION_DECL
       && optimize
-      && !profile_flag)
+      && !(profile_flag && !ix86_profile_before_prologue ()))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
@@ -4913,7 +4937,8 @@ ix86_function_sseregparm (const_tree typ
 
   /* For local functions, pass up to SSE_REGPARM_MAX SFmode
      (and DFmode for SSE2) arguments in SSE registers.  */
-  if (decl && TARGET_SSE_MATH && optimize && !profile_flag)
+  if (decl && TARGET_SSE_MATH && optimize
+      && !(profile_flag && !ix86_profile_before_prologue ()))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl));
@@ -5132,7 +5157,9 @@ ix86_cfun_abi (void)
   return cfun->machine->call_abi;
 }
 
-/* Write the extra assembler code needed to declare a function properly.  */
+/* Write the extra assembler code needed to declare a function properly.
+   Output call to profiler if profiling is enabled and it should be emitted
+   before prologue.  */
 
 void
 ix86_asm_output_function_label (FILE *asm_out_file, const char *fname,
@@ -7875,7 +7902,7 @@ ix86_frame_pointer_required (void)
 	  || ix86_current_function_calls_tls_descriptor))
     return true;
 
-  if (crtl->profile)
+  if (crtl->profile && !ix86_profile_before_prologue ())
     return true;
 
   return false;
@@ -8143,7 +8170,8 @@ gen_push (rtx arg)
 static unsigned int
 ix86_select_alt_pic_regnum (void)
 {
-  if (current_function_is_leaf && !crtl->profile
+  if (current_function_is_leaf
+      && !(crtl->profile && !ix86_profile_before_prologue ())
       && !ix86_current_function_calls_tls_descriptor)
     {
       int i, drap;
@@ -8167,7 +8195,7 @@ ix86_save_reg (unsigned int regno, int m
   if (pic_offset_table_rtx
       && regno == REAL_PIC_OFFSET_TABLE_REGNUM
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile
+	  || (crtl->profile && !ix86_profile_before_prologue ())
 	  || crtl->calls_eh_return
 	  || crtl->uses_const_pool))
     {
@@ -9191,6 +9219,11 @@ ix86_expand_prologue (void)
     {
       rtx push, mov;
 
+      /* Check if profiling is active and we shall use profiling before
+         prologue variant. If so sorry.  */
+      if (crtl->profile && ix86_profile_before_prologue () != 0)
+        sorry ("ms_hook_prologue attribute isn't compatible with -mprofile-top for 32-bit");
+
       /* Make sure the function starts with
 	 8b ff     movl.s %edi,%edi (emited by ix86_asm_output_function_label)
 	 55        push   %ebp
@@ -9443,7 +9476,7 @@ ix86_expand_prologue (void)
   pic_reg_used = false;
   if (pic_offset_table_rtx
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile))
+	  || (crtl->profile && !ix86_profile_before_prologue ())))
     {
       unsigned int alt_pic_reg_used = ix86_select_alt_pic_regnum ();
 
@@ -9480,7 +9513,7 @@ ix86_expand_prologue (void)
      when mcount needs it.  Blockage to avoid call movement across mcount
      call is emitted in generic code after the NOTE_INSN_PROLOGUE_END
      note.  */
-  if (crtl->profile && pic_reg_used)
+  if (crtl->profile && !ix86_profile_before_prologue () && pic_reg_used)
     emit_insn (gen_prologue_use (pic_offset_table_rtx));
 
   if (crtl->drap_reg && !crtl->stack_realign_needed)
@@ -27285,7 +27318,7 @@ x86_field_alignment (tree field, int com
 /* Output assembler code to FILE to increment profiler label # LABELNO
    for profiling a function entry.  */
 void
-x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
+x86_function_profiler (FILE *file ATTRIBUTE_UNUSED, int labelno ATTRIBUTE_UNUSED)
 {
   if (TARGET_64BIT)
     {
@@ -27294,9 +27327,9 @@ x86_function_profiler (FILE *file, int l
 #endif
 
       if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-	fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file);
+	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", MCOUNT_NAME);
       else
-	fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+	fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
     }
   else if (flag_pic)
     {
@@ -27304,7 +27337,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file);
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", MCOUNT_NAME);
     }
   else
     {
@@ -27312,7 +27345,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+      fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
     }
 }
 
@@ -31360,6 +31393,9 @@ ix86_enum_va_list (int idx, const char *
 #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD
 #endif
 
+#undef TARGET_PROFILE_BEFORE_PROLOGUE
+#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue
+
 #undef TARGET_ASM_UNALIGNED_HI_OP
 #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP
 #undef TARGET_ASM_UNALIGNED_SI_OP
Index: gcc/gcc/doc/invoke.texi
===================================================================
--- gcc.orig/gcc/doc/invoke.texi	2010-07-13 17:00:51.000000000 +0200
+++ gcc/gcc/doc/invoke.texi	2010-07-14 10:10:11.655367800 +0200
@@ -884,7 +884,7 @@ See i386 and x86-64 Options.
 @emph{i386 and x86-64 Windows Options}
 @gccoptlist{-mconsole -mcygwin -mno-cygwin -mdll
 -mnop-fun-dllimport -mthread -municode -mwin32 -mwindows
--fno-set-stack-executable}
+-mprofile-top -fno-set-stack-executable}
 
 @emph{Xstormy16 Options}
 @gccoptlist{-msim}
@@ -17100,6 +17100,13 @@ specifies that a GUI application is to b
 instructing the linker to set the PE header subsystem type
 appropriately.
 
+@item -mprofile-top
+@opindex mprofile-top
+This option is available for Cygwin and MinGW targets. It
+specifies that for profiling the call to profiler should be
+done before prologue.  Default behavior is that profiler-call
+is done after the prologue is established.
+
 @item -fno-set-stack-executable
 @opindex fno-set-stack-executable
 This option is available for MinGW targets. It specifies that
Index: gcc/gcc/doc/tm.texi.in
===================================================================
--- gcc.orig/gcc/doc/tm.texi.in	2010-07-13 12:03:30.000000000 +0200
+++ gcc/gcc/doc/tm.texi.in	2010-07-14 09:47:06.485920000 +0200
@@ -7101,6 +7101,14 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@hook TARGET_PROFILE_BEFORE_PROLOGUE
+It returns true if target wants profile code emitted before
+prologue.
+
+The default version of this hook use the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @hook TARGET_BINDS_LOCAL_P
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/final.c
===================================================================
--- gcc.orig/gcc/final.c	2010-07-09 11:24:47.000000000 +0200
+++ gcc/gcc/final.c	2010-07-14 09:47:06.376496000 +0200
@@ -1546,10 +1546,8 @@ final_start_function (rtx first ATTRIBUT
 
   /* The Sun386i and perhaps other machines don't work right
      if the profiling code comes after the prologue.  */
-#ifdef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* PROFILE_BEFORE_PROLOGUE */
 
 #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
   if (dwarf2out_do_frame ())
@@ -1591,10 +1589,8 @@ final_start_function (rtx first ATTRIBUT
 static void
 profile_after_prologue (FILE *file ATTRIBUTE_UNUSED)
 {
-#ifndef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (!targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* not PROFILE_BEFORE_PROLOGUE */
 }
 
 static void
Index: gcc/gcc/function.c
===================================================================
--- gcc.orig/gcc/function.c	2010-07-06 13:15:38.000000000 +0200
+++ gcc/gcc/function.c	2010-07-14 09:47:06.392128000 +0200
@@ -5100,13 +5100,11 @@ thread_prologue_and_epilogue_insns (void
       record_insns (seq, NULL, &prologue_insn_hash);
       emit_note (NOTE_INSN_PROLOGUE_END);
 
-#ifndef PROFILE_BEFORE_PROLOGUE
       /* Ensure that instructions are not moved into the prologue when
 	 profiling is on.  The call to the profiling routine can be
 	 emitted within the live range of a call-clobbered register.  */
-      if (crtl->profile)
+      if (!targetm.profile_before_prologue () && crtl->profile)
         emit_insn (gen_blockage ());
-#endif
 
       seq = get_insns ();
       end_sequence ();
Index: gcc/gcc/target.def
===================================================================
--- gcc.orig/gcc/target.def	2010-07-09 11:24:47.000000000 +0200
+++ gcc/gcc/target.def	2010-07-14 09:47:06.392128000 +0200
@@ -1218,6 +1218,13 @@ DEFHOOK
  bool, (const_tree exp),
  default_binds_local_p)
 
+/* Check if profiling code is before or after prologue.  */
+DEFHOOK
+(profile_before_prologue,
+ "",
+ bool, (void),
+ default_profile_before_prologue)
+
 /* Modify and return the identifier of a DECL's external name,
    originally identified by ID, as required by the target,
    (eg, append @nn to windows32 stdcall function names).
Index: gcc/gcc/targhooks.c
===================================================================
--- gcc.orig/gcc/targhooks.c	2010-07-09 11:24:47.000000000 +0200
+++ gcc/gcc/targhooks.c	2010-07-14 09:47:06.392128000 +0200
@@ -1197,4 +1197,14 @@ default_register_move_cost (enum machine
 #endif
 }
 
+bool
+default_profile_before_prologue (void)
+{
+#ifndef PROFILE_BEFORE_PROLOGUE
+  return false;
+#else
+  return true;
+#endif
+}
+
 #include "gt-targhooks.h"
Index: gcc/gcc/targhooks.h
===================================================================
--- gcc.orig/gcc/targhooks.h	2010-07-09 11:24:47.000000000 +0200
+++ gcc/gcc/targhooks.h	2010-07-14 09:47:06.392128000 +0200
@@ -150,3 +150,4 @@ extern int default_memory_move_cost (enu
 extern int default_register_move_cost (enum machine_mode, reg_class_t,
 				       reg_class_t);
 
+extern bool default_profile_before_prologue (void);
Index: gcc/gcc/doc/tm.texi
===================================================================
--- gcc.orig/gcc/doc/tm.texi	2010-07-13 12:03:30.000000000 +0200
+++ gcc/gcc/doc/tm.texi	2010-07-14 10:34:44.219257400 +0200
@@ -7101,6 +7101,14 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void)
+It returns true if target wants profile code emitted before
+prologue.
+
+The default version of this hook use the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp})
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library


* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-14 10:20   ` Kai Tietz
@ 2010-07-14 11:49     ` Dave Korn
  2010-07-14 12:11       ` Kai Tietz
  2010-07-14 12:16     ` Andi Kleen
  1 sibling, 1 reply; 27+ messages in thread
From: Dave Korn @ 2010-07-14 11:49 UTC
  To: Kai Tietz; +Cc: Richard Henderson, GCC Patches

On 14/07/2010 11:20, Kai Tietz wrote:

> 	* config/i386/cygming.h (PROFILE_SUPPORT_BEFORE_PROLOGUE): New.
> 	(MCOUNT_NAME): Win32 specific version.

> +
> +/* Choose the correct profiler mcount name. For checking we are using the
> +   ix86_profile_before_prologue function as flag_profile_top is tri-state.  */
> +#undef MCOUNT_NAME
> +#define MCOUNT_NAME (ix86_profile_before_prologue () ? "_mcount_top" : "_mcount")
> +
>  #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
>  #undef USER_LABEL_PREFIX
>  #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")


  Shouldn't MCOUNT_NAME take USER_LABEL_PREFIX into account?

    cheers,
      DaveK


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-14 11:49     ` Dave Korn
@ 2010-07-14 12:11       ` Kai Tietz
  0 siblings, 0 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-14 12:11 UTC (permalink / raw)
  To: Dave Korn; +Cc: Richard Henderson, GCC Patches

2010/7/14 Dave Korn <dave.korn.cygwin@gmail.com>:
> On 14/07/2010 11:20, Kai Tietz wrote:
>
>>       * config/i386/cygming.h (PROFILE_SUPPORT_BEFORE_PROLOGUE): New.
>>       (MCOUNT_NAME): Win32 specific version.
>
>> +
>> +/* Choose the correct profiler mcount name. For checking we are using the
>> +   ix86_profile_before_prologue function as flag_profile_top is tri-state.  */
>> +#undef MCOUNT_NAME
>> +#define MCOUNT_NAME (ix86_profile_before_prologue () ? "_mcount_top" : "_mcount")
>> +
>>  #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
>>  #undef USER_LABEL_PREFIX
>>  #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
>
>
>  Shouldn't MCOUNT_NAME take USER_LABEL_PREFIX into account?
>
>    cheers,
>      DaveK
>
>

Well, as this MCOUNT_NAME ("_mcount") is already widely used for
64-bit, I don't want to change anything here. But in general you are
right.
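
Dave's question can be illustrated with a small standalone sketch. It
assumes a hypothetical helper, build_profiler_symbol, which is not part
of GCC; it only shows what prepending the user label prefix to a base
name might look like, with the prefix chosen as in the cygming.h
definition quoted above ("" for 64-bit, "_" for 32-bit).

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for USER_LABEL_PREFIX as defined in cygming.h:
   empty for 64-bit targets, "_" for 32-bit targets.  */
const char *
user_label_prefix (int target_64bit)
{
  return target_64bit ? "" : "_";
}

/* Hypothetical helper (not a GCC function): write the prefix followed
   by BASE into BUF.  Returns 0 on success, -1 if BUF is too small.  */
int
build_profiler_symbol (char *buf, size_t size, int target_64bit,
                       const char *base)
{
  int n = snprintf (buf, size, "%s%s",
                    user_label_prefix (target_64bit), base);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

With the base name "mcount" this yields "_mcount" for 32-bit and
"mcount" for 64-bit. The 64-bit result differs from the hard-coded
"_mcount" already in use there, which is the compatibility concern
raised in the reply above.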

Cheers,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-14 10:20   ` Kai Tietz
  2010-07-14 11:49     ` Dave Korn
@ 2010-07-14 12:16     ` Andi Kleen
  2010-07-14 12:38       ` Kai Tietz
  1 sibling, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2010-07-14 12:16 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Richard Henderson, GCC Patches

Kai Tietz <ktietz70@googlemail.com> writes:
>
> This patch implements it by new hook TARGET_PROFILE_BEFORE_PROLOGUE.
> This feature is for now just active for win32 i386 targets and is
> controlled by internal target macro PROFILE_SUPPORT_BEFORE_PROLOGUE.

IMHO the infrastructure in my old patch for this for Linux was a little
cleaner. Unfortunately that patch is still not applied.

http://thread.gmane.org/gmane.comp.gcc.patches/197870

But if you're going to resubmit this it would be good to merge the two
at least.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-14 12:16     ` Andi Kleen
@ 2010-07-14 12:38       ` Kai Tietz
  2010-07-15 18:08         ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-14 12:38 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Richard Henderson, GCC Patches

Hello Andi,

2010/7/14 Andi Kleen <andi@firstfloor.org>:
> Kai Tietz <ktietz70@googlemail.com> writes:
>>
>> This patch implements it by new hook TARGET_PROFILE_BEFORE_PROLOGUE.
>> This feature is for now just active for win32 i386 targets and is
>> controlled by internal target macro PROFILE_SUPPORT_BEFORE_PROLOGUE.
>
> IMHO the infrastructure in my old patch for this for Linux was a little
> cleaner. Unfortunately that patch is still not applied.
>
> http://thread.gmane.org/gmane.comp.gcc.patches/197870
>
> But if you're going to resubmit this it would be good to merge the two
> at least.
>
> -Andi
>
> --
> ak@linux.intel.com -- Speaking for myself only.
>

Moving the -mprologue-top option (the name is a bit unimaginative) into
i386.opt was my initial idea. The downside is that all i386 targets
would then have this option, and proper MCOUNT_NAME entries would have
to be made up for all of them, even though those targets do not support
a before-prologue profile call so far. I spoke with rth about this and
he pointed out that this is possibly something to be avoided.
As for your implementation, I see one big disadvantage in comparison to
mine. For x64 with SEH unwind information (I am working on the patch
and am preparing things here), a different frame layout is needed for
the frame-pointer prologue/epilogue, which enforces the use of
before-prologue profiling. A target-function hook instead of a
target variable is therefore necessary, as otherwise the default
setting for specific modes gets problematic.
E.g. for the x64 frame layout the option needs to be implicitly enabled.

But of course I'll take a look at your patch and see whether I can
merge things into a new version of it once the first review is
done.
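
The difference between a hook function and a plain variable can be
sketched as follows. This is only an illustration under assumptions:
the struct, its field names, and the rule that an explicit user setting
wins over the implicit default are hypothetical and are not taken from
GCC or from either patch.

```c
#include <stdbool.h>

/* Hypothetical, simplified target description; the field names are
   illustrative and do not match GCC's real target structures.  */
struct target_state
{
  bool is_64bit;         /* compiling for x64 */
  bool seh_unwind;       /* SEH unwind information in use */
  int flag_profile_top;  /* -1 = not given, 0 = off, 1 = on */
};

/* Variable-style setting: one fixed answer, decided up front.  */
const bool profile_before_prologue_var = false;

/* Hook-style setting: the answer can depend on the current mode.
   Assumption for this sketch: x64 with SEH implicitly enables
   before-prologue profiling unless the user chose explicitly.  */
bool
profile_before_prologue_hook (const struct target_state *t)
{
  if (t->flag_profile_top != -1)
    return t->flag_profile_top != 0;
  return t->is_64bit && t->seh_unwind;
}
```

A caller that consults the hook gets a per-mode default without any
extra option handling, which a single variable cannot express.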

Cheers,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-14 12:38       ` Kai Tietz
@ 2010-07-15 18:08         ` Kai Tietz
  2010-07-16 17:06           ` Richard Henderson
                             ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-15 18:08 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Andi Kleen, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1849 bytes --]

Hello Andi,

I updated my patch so that it should be trivial to add the counter
function for before-prologue profiling to the linux target with a
one-liner.
Just make sure that the macro MCOUNT_NAME_BEFORE_PROLOGUE is defined
for the i386 target.

I reworked the patch so that the option is now named -mfentry and is
available for all i386 targets that define the counter function's name
via MCOUNT_NAME_BEFORE_PROLOGUE.
Additionally, I added some option checks for targets that don't
support before-prologue profiling.
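
As a rough illustration of how the counter function's name is picked
and used, here is a standalone sketch modelled on the
x86_function_profiler change in the attached patch. The helper
format_profiler_call is hypothetical: it formats into a buffer instead
of writing to the assembler file, it omits the counter-label lines, and
the two names are the cygming.h values from the patch.

```c
#include <stdio.h>
#include <stddef.h>

/* Names as defined for cygming.h in the attached patch.  */
#define MCOUNT_NAME "_mcount"
#define MCOUNT_NAME_BEFORE_PROLOGUE "_mcount_top"

/* Hypothetical helper: format the profiler call into BUF.
   FENTRY models flag_fentry, SYSV_ABI models
   "DEFAULT_ABI == SYSV_ABI", and PIC models flag_pic.
   Returns the snprintf result.  */
int
format_profiler_call (char *buf, size_t size, int fentry,
                      int target_64bit, int sysv_abi, int pic)
{
  const char *name = fentry ? MCOUNT_NAME_BEFORE_PROLOGUE : MCOUNT_NAME;

  if (target_64bit)
    {
      if (sysv_abi && pic)
        return snprintf (buf, size, "\tcall\t*%s@GOTPCREL(%%rip)\n", name);
      return snprintf (buf, size, "\tcall\t%s\n", name);
    }
  if (pic)
    /* 32-bit PIC goes through %ebx, which must already be set up.  */
    return snprintf (buf, size, "\tcall\t*%s@GOT(%%ebx)\n", name);
  return snprintf (buf, size, "\tcall\t%s\n", name);
}
```

The only thing -mfentry changes in this model is the symbol name; the
addressing form still depends on the bitness and PIC settings.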

ChangeLog

2010-07-15  Kai Tietz

	* config/i386/cygming.h (MCOUNT_NAME): New.
	(MCOUNT_NAME_BEFORE_PROLOGUE): New.
	* config/i386/i386.c (ix86_profile_before_prologue): New.
	(override_options): Add special handling for -mfentry.
	(ix86_function_regparm): Likewise.
	(ix86_function_sseregparm): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_select_alt_pic_regnum): Likewise.
	(ix86_save_reg):
	(ix86_expand_prologue):
	(x86_function_profiler):
	(TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
	* config/i386/i386.opt (mfentry): New.
	* doc/invoke.texi (mfentry): Add documentation.
	* doc/tm.texi: Regenerated.
	* doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
	* final.c (final_start_function): Replace macro
	PROFILE_BEFORE_PROLOGUE by target hook.
	* function.c (thread_prologue_and_epilogue_insns): Likewise.
	* target.def (profile_before_prologue): New hook.
	* targhooks.c (default_profile_before_prologue): New.
	* targhooks.h (default_profile_before_prologue): New.

Tested for i686-pc-mingw32, x86_64-pc-mingw32, and i686-pc-linux-gnu
to see if option check works. Ok for apply?

Regards,
Kai


-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

[-- Attachment #2: profileb.diff --]
[-- Type: application/octet-stream, Size: 14056 bytes --]

Index: gcc/gcc/config/i386/cygming.h
===================================================================
--- gcc.orig/gcc/config/i386/cygming.h	2010-07-15 19:15:51.651349100 +0200
+++ gcc/gcc/config/i386/cygming.h	2010-07-15 19:49:18.100111400 +0200
@@ -39,6 +39,13 @@ along with GCC; see the file COPYING3.
 #undef DEFAULT_ABI
 #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI)
 
+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME
+#define MCOUNT_NAME "_mcount"
+
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "_mcount_top"
+
 #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
 #undef USER_LABEL_PREFIX
 #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
Index: gcc/gcc/config/i386/i386.c
===================================================================
--- gcc.orig/gcc/config/i386/i386.c	2010-07-15 19:15:51.653349100 +0200
+++ gcc/gcc/config/i386/i386.c	2010-07-15 19:55:46.197309300 +0200
@@ -2768,6 +2768,15 @@ software_prefetching_beneficial_p (void)
     }
 }
 
+/* Return true if profiling code should be emitted before the
+   prologue, false otherwise.
+   Note: For x86 with "hotfix" a sorry is issued instead.  */
+static bool
+ix86_profile_before_prologue (void)
+{
+  return flag_fentry != 0;
+}
+
 /* Function that is callable from the debugger to print the current
    options.  */
 void
@@ -3671,6 +3680,28 @@ override_options (bool main_args_p)
     target_flags |= MASK_CLD & ~target_flags_explicit;
 #endif
 
+  {
+    int default_profile_top_flag = 0;
+    int only_default = 1;
+
+#if defined(PROFILE_BEFORE_PROLOGUE)
+    default_profile_top_flag = 1;
+#endif
+#if defined(MCOUNT_NAME) && defined (MCOUNT_NAME_BEFORE_PROLOGUE)
+    only_default = 0;
+#endif
+
+    if (flag_fentry == -1)
+      flag_fentry = default_profile_top_flag;
+    else if (flag_fentry != default_profile_top_flag && only_default)
+      {
+        if (!default_profile_top_flag)
+          sorry ("-mfentry isn't supported for this target");
+        else
+          sorry ("-mno-fentry isn't supported for this target");
+	flag_fentry = default_profile_top_flag;
+      }
+  }
   /* Save the initial options in case the user does function specific options */
   if (main_args_p)
     target_option_default_node = target_option_current_node
@@ -4841,7 +4872,7 @@ ix86_function_regparm (const_tree type,
   if (decl
       && TREE_CODE (decl) == FUNCTION_DECL
       && optimize
-      && !profile_flag)
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
@@ -4913,7 +4944,8 @@ ix86_function_sseregparm (const_tree typ
 
   /* For local functions, pass up to SSE_REGPARM_MAX SFmode
      (and DFmode for SSE2) arguments in SSE registers.  */
-  if (decl && TARGET_SSE_MATH && optimize && !profile_flag)
+  if (decl && TARGET_SSE_MATH && optimize
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl));
@@ -7875,7 +7907,7 @@ ix86_frame_pointer_required (void)
 	  || ix86_current_function_calls_tls_descriptor))
     return true;
 
-  if (crtl->profile)
+  if (crtl->profile && !flag_fentry)
     return true;
 
   return false;
@@ -8143,7 +8175,8 @@ gen_push (rtx arg)
 static unsigned int
 ix86_select_alt_pic_regnum (void)
 {
-  if (current_function_is_leaf && !crtl->profile
+  if (current_function_is_leaf
+      && !(crtl->profile && !flag_fentry)
       && !ix86_current_function_calls_tls_descriptor)
     {
       int i, drap;
@@ -8167,7 +8200,7 @@ ix86_save_reg (unsigned int regno, int m
   if (pic_offset_table_rtx
       && regno == REAL_PIC_OFFSET_TABLE_REGNUM
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile
+	  || (crtl->profile && !flag_fentry)
 	  || crtl->calls_eh_return
 	  || crtl->uses_const_pool))
     {
@@ -9191,6 +9224,11 @@ ix86_expand_prologue (void)
     {
       rtx push, mov;
 
+      /* Check if profiling is active and we shall use profiling before
+         prologue variant. If so sorry.  */
+      if (crtl->profile && flag_fentry != 0)
+        sorry ("ms_hook_prologue attribute isn't compatible with -mfentry for 32-bit");
+
       /* Make sure the function starts with
 	 8b ff     movl.s %edi,%edi (emited by ix86_asm_output_function_label)
 	 55        push   %ebp
@@ -9443,7 +9481,7 @@ ix86_expand_prologue (void)
   pic_reg_used = false;
   if (pic_offset_table_rtx
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile))
+	  || (crtl->profile && !flag_fentry)))
     {
       unsigned int alt_pic_reg_used = ix86_select_alt_pic_regnum ();
 
@@ -9480,7 +9518,7 @@ ix86_expand_prologue (void)
      when mcount needs it.  Blockage to avoid call movement across mcount
      call is emitted in generic code after the NOTE_INSN_PROLOGUE_END
      note.  */
-  if (crtl->profile && pic_reg_used)
+  if (crtl->profile && !flag_fentry && pic_reg_used)
     emit_insn (gen_prologue_use (pic_offset_table_rtx));
 
   if (crtl->drap_reg && !crtl->stack_realign_needed)
@@ -27282,11 +27320,26 @@ x86_field_alignment (tree field, int com
   return computed;
 }
 
+#if !defined(MCOUNT_NAME) && !defined(MCOUNT_NAME_BEFORE_PROLOGUE)
+#error MCOUNT_NAME and/or MCOUNT_NAME_BEFORE_PROLOGUE have to be defined
+#endif
+
+/* Make sure both are getting defined.  */
+#ifndef MCOUNT_NAME
+#define MCOUNT_NAME MCOUNT_NAME_BEFORE_PROLOGUE
+#endif
+#ifndef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME
+#endif
+
 /* Output assembler code to FILE to increment profiler label # LABELNO
    for profiling a function entry.  */
 void
 x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
 {
+  const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE
+					 : MCOUNT_NAME);
+
   if (TARGET_64BIT)
     {
 #ifndef NO_PROFILE_COUNTERS
@@ -27294,9 +27347,9 @@ x86_function_profiler (FILE *file, int l
 #endif
 
       if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-	fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file);
+	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
       else
-	fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+	fprintf (file, "\tcall\t%s\n", mcount_name);
     }
   else if (flag_pic)
     {
@@ -27304,7 +27357,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file);
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
     }
   else
     {
@@ -27312,7 +27365,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+      fprintf (file, "\tcall\t%s\n", mcount_name);
     }
 }
 
@@ -31360,6 +31413,9 @@ ix86_enum_va_list (int idx, const char *
 #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD
 #endif
 
+#undef TARGET_PROFILE_BEFORE_PROLOGUE
+#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue
+
 #undef TARGET_ASM_UNALIGNED_HI_OP
 #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP
 #undef TARGET_ASM_UNALIGNED_SI_OP
Index: gcc/gcc/config/i386/i386.opt
===================================================================
--- gcc.orig/gcc/config/i386/i386.opt	2010-07-15 19:15:51.665349100 +0200
+++ gcc/gcc/config/i386/i386.opt	2010-07-15 19:19:10.376715500 +0200
@@ -375,3 +375,7 @@ Support RDRND built-in functions and cod
 mf16c
 Target Report Mask(ISA_F16C) Var(ix86_isa_flags) VarExists Save
 Support F16C built-in functions and code generation
+
+mfentry
+Target Report Var(flag_fentry) Init(-1)
+Emit profiling counter call at function entry before prologue.
Index: gcc/gcc/doc/invoke.texi
===================================================================
--- gcc.orig/gcc/doc/invoke.texi	2010-07-15 19:15:51.666349100 +0200
+++ gcc/gcc/doc/invoke.texi	2010-07-15 19:19:10.385716000 +0200
@@ -601,7 +601,7 @@ Objective-C and Objective-C++ Dialects}.
 -momit-leaf-frame-pointer  -mno-red-zone -mno-tls-direct-seg-refs @gol
 -mcmodel=@var{code-model} -mabi=@var{name} @gol
 -m32  -m64 -mlarge-data-threshold=@var{num} @gol
--msse2avx}
+-msse2avx -mfentry}
 
 @emph{IA-64 Options}
 @gccoptlist{-mbig-endian  -mlittle-endian  -mgnu-as  -mgnu-ld  -mno-pic @gol
@@ -12466,6 +12466,14 @@ For systems that use GNU libc, the defau
 @opindex msse2avx
 Specify that the assembler should encode SSE instructions with VEX
 prefix.  The option @option{-mavx} turns this on by default.
+
+@item -mfentry
+@itemx -mno-fentry
+@opindex mfentry
+If profiling is active (@option{-pg}), put the profiling
+counter call before the prologue.
+Note: On x86 architectures the attribute @code{ms_hook_prologue}
+isn't possible at the moment for @option{-mfentry} and @option{-pg}.
 @end table
 
 These @samp{-m} switches are supported in addition to the above
Index: gcc/gcc/doc/tm.texi
===================================================================
--- gcc.orig/gcc/doc/tm.texi	2010-07-15 19:15:51.668349100 +0200
+++ gcc/gcc/doc/tm.texi	2010-07-15 19:19:10.393716500 +0200
@@ -7101,6 +7101,14 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void)
+It returns true if the target wants profile code emitted before the
+prologue.
+
+The default version of this hook uses the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp})
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/doc/tm.texi.in
===================================================================
--- gcc.orig/gcc/doc/tm.texi.in	2010-07-15 19:15:51.673349100 +0200
+++ gcc/gcc/doc/tm.texi.in	2010-07-15 19:19:10.399716800 +0200
@@ -7101,6 +7101,14 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@hook TARGET_PROFILE_BEFORE_PROLOGUE
+It returns true if the target wants profile code emitted before the
+prologue.
+
+The default version of this hook uses the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @hook TARGET_BINDS_LOCAL_P
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/final.c
===================================================================
--- gcc.orig/gcc/final.c	2010-07-15 19:15:51.674349100 +0200
+++ gcc/gcc/final.c	2010-07-15 19:19:10.403717000 +0200
@@ -1546,10 +1546,8 @@ final_start_function (rtx first ATTRIBUT
 
   /* The Sun386i and perhaps other machines don't work right
      if the profiling code comes after the prologue.  */
-#ifdef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* PROFILE_BEFORE_PROLOGUE */
 
 #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
   if (dwarf2out_do_frame ())
@@ -1591,10 +1589,8 @@ final_start_function (rtx first ATTRIBUT
 static void
 profile_after_prologue (FILE *file ATTRIBUTE_UNUSED)
 {
-#ifndef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (!targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* not PROFILE_BEFORE_PROLOGUE */
 }
 
 static void
Index: gcc/gcc/function.c
===================================================================
--- gcc.orig/gcc/function.c	2010-07-15 19:15:51.675349100 +0200
+++ gcc/gcc/function.c	2010-07-15 19:19:10.408717300 +0200
@@ -5100,13 +5100,11 @@ thread_prologue_and_epilogue_insns (void
       record_insns (seq, NULL, &prologue_insn_hash);
       emit_note (NOTE_INSN_PROLOGUE_END);
 
-#ifndef PROFILE_BEFORE_PROLOGUE
       /* Ensure that instructions are not moved into the prologue when
 	 profiling is on.  The call to the profiling routine can be
 	 emitted within the live range of a call-clobbered register.  */
-      if (crtl->profile)
+      if (!targetm.profile_before_prologue () && crtl->profile)
         emit_insn (gen_blockage ());
-#endif
 
       seq = get_insns ();
       end_sequence ();
Index: gcc/gcc/target.def
===================================================================
--- gcc.orig/gcc/target.def	2010-07-15 19:15:51.676349100 +0200
+++ gcc/gcc/target.def	2010-07-15 19:19:10.411717500 +0200
@@ -1218,6 +1218,13 @@ DEFHOOK
  bool, (const_tree exp),
  default_binds_local_p)
 
+/* Check if profiling code is before or after prologue.  */
+DEFHOOK
+(profile_before_prologue,
+ "",
+ bool, (void),
+ default_profile_before_prologue)
+
 /* Modify and return the identifier of a DECL's external name,
    originally identified by ID, as required by the target,
    (eg, append @nn to windows32 stdcall function names).
Index: gcc/gcc/targhooks.c
===================================================================
--- gcc.orig/gcc/targhooks.c	2010-07-15 19:15:51.678349100 +0200
+++ gcc/gcc/targhooks.c	2010-07-15 19:19:10.413717600 +0200
@@ -1197,4 +1197,14 @@ default_register_move_cost (enum machine
 #endif
 }
 
+bool
+default_profile_before_prologue (void)
+{
+#ifndef PROFILE_BEFORE_PROLOGUE
+  return false;
+#else
+  return true;
+#endif
+}
+
 #include "gt-targhooks.h"
Index: gcc/gcc/targhooks.h
===================================================================
--- gcc.orig/gcc/targhooks.h	2010-07-15 19:15:51.687349100 +0200
+++ gcc/gcc/targhooks.h	2010-07-15 19:19:10.416717800 +0200
@@ -150,3 +150,4 @@ extern int default_memory_move_cost (enu
 extern int default_register_move_cost (enum machine_mode, reg_class_t,
 				       reg_class_t);
 
+extern bool default_profile_before_prologue (void);


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-15 18:08         ` Kai Tietz
@ 2010-07-16 17:06           ` Richard Henderson
  2010-07-17  6:52             ` Kai Tietz
  2010-07-16 20:53           ` Gerald Pfeifer
  2010-07-16 23:57           ` Andi Kleen
  2 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2010-07-16 17:06 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Andi Kleen, GCC Patches

On 07/15/2010 11:08 AM, Kai Tietz wrote:
> 	(override_options): Add special handling for -mfentry.
> 	(ix86_function_regparm): Likewise.
> 	(ix86_function_sseregparm): Likewise.

Why are you adding the special-casing to the two regparm functions?
You do realize that suddenly your mcount_top function may have zero
free registers with which to do its job.  I think that's a bad idea.

> 	(ix86_save_reg):
> 	(ix86_expand_prologue):
> 	(x86_function_profiler):

No changes?  ;-)

x86_function_profiler is broken for -m32 -mfentry -fpic.  It uses
%ebx which has not been set up.  I think perhaps this combination
cannot really be supported, and this should be diagnosed back in
override_options.

> +@hook TARGET_PROFILE_BEFORE_PROLOGUE
> +It returns true if target wants profile code emitted before
> +prologue.
> +
> +The default version of this hook use the target macro
> +@code{PROFILE_BEFORE_PROLOGUE}.
> +@end deftypefn

The text should go in target.def, in the ""; only the @hook line goes here.


r~


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-15 18:08         ` Kai Tietz
  2010-07-16 17:06           ` Richard Henderson
@ 2010-07-16 20:53           ` Gerald Pfeifer
  2010-07-18 12:20             ` Kai Tietz
  2010-07-16 23:57           ` Andi Kleen
  2 siblings, 1 reply; 27+ messages in thread
From: Gerald Pfeifer @ 2010-07-16 20:53 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Richard Henderson, Andi Kleen, GCC Patches

On Thu, 15 Jul 2010, Kai Tietz wrote:
> 	* doc/invoke.texi (mfentry): Add documentation.

+Note: On x86 architectures the attribute @code{ms_hook_prologue}
+isn't possible at the moment for @option{-mfentry} and @option{-pg}.

How about: "...is not supported for...at the moment."?

And if you could add an item to http://gcc.gnu.org/gcc-4.6/changes.html
when this goes in, that would be good.

Gerald


* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-15 18:08         ` Kai Tietz
  2010-07-16 17:06           ` Richard Henderson
  2010-07-16 20:53           ` Gerald Pfeifer
@ 2010-07-16 23:57           ` Andi Kleen
  2010-07-17  5:34             ` Kai Tietz
  2 siblings, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2010-07-16 23:57 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Richard Henderson, Andi Kleen, GCC Patches

On Thu, Jul 15, 2010 at 08:08:24PM +0200, Kai Tietz wrote:
> Hello Andi,
> 
> I updated my patch in that way, that it should be trivial to add the
> counter function for before prologue profiling to linux target by a
> one-liner.
> Just make sure that for the i386-target the macro
> MCOUNT_NAME_BEFORE_PROLOGUE is defined.
> 
> I reworked the patch so that the option is now named -mfentry and it
> is available for all i386 targets, if they have defined the counter
> function's name via MCOUNT_NAME_BEFORE_PROLOGUE in target.
> Additionally I added some option-checks for targets, which don't
> support before prologue profiling.

Kai,

I tried the patch on x86_64-linux but it doesn't work for me. First I added 
the define to linux

diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h
index 81dfd1e..54051ed 100644
--- a/gcc/config/i386/linux.h
+++ b/gcc/config/i386/linux.h
@@ -48,6 +48,10 @@ along with GCC; see the file COPYING3.  If not see
 
 #define NO_PROFILE_COUNTERS	1
 
+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"
+
 #undef MCOUNT_NAME
 #define MCOUNT_NAME "mcount"

But when I try to set -mfentry on a simple test program I get

 sorry, unimplemented: -mfentry isn't supported for this target

I think that's because of

+#if defined(PROFILE_BEFORE_PROLOGUE)
+    default_profile_top_flag = 1;
+#endif
+#if defined(MCOUNT_NAME) && defined (MCOUNT_NAME_BEFORE_PROLOGUE)
+    only_default = 0;
+#endif
+
+    if (flag_fentry == -1)
+      flag_fentry = default_profile_top_flag;
+    else if (flag_fentry != default_profile_top_flag && only_default)
+      {
+        if (!default_profile_top_flag)
+          sorry ("-mfentry isn't supported for this target");
+        else
+          sorry ("-mno-fentry isn't supported for this target");


and PROFILE_BEFORE_PROLOGUE is never set for i386, so
default_profile_top_flag is always 0.

-Andi


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-16 23:57           ` Andi Kleen
@ 2010-07-17  5:34             ` Kai Tietz
  2010-07-17  9:45               ` Andi Kleen
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-17  5:34 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Richard Henderson, GCC Patches

2010/7/17 Andi Kleen <andi@firstfloor.org>:
> On Thu, Jul 15, 2010 at 08:08:24PM +0200, Kai Tietz wrote:
>> Hello Andi,
>>
>> I updated my patch in that way, that it should be trivial to add the
>> counter function for before prologue profiling to linux target by a
>> one-liner.
>> Just make sure that for the i386-target the macro
>> MCOUNT_NAME_BEFORE_PROLOGUE is defined.
>>
>> I reworked the patch so that the option is now named -mfentry and it
>> is available for all i386 targets, if they have defined the counter
>> function's name via MCOUNT_NAME_BEFORE_PROLOGUE in target.
>> Additionally I added some option-checks for targets, which don't
>> support before prologue profiling.
>
> Kai,
>
> I tried the patch on x86_64-linux but it doesn't work for me. First I added
> the define to linux
>
> diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h
> index 81dfd1e..54051ed 100644
> --- a/gcc/config/i386/linux.h
> +++ b/gcc/config/i386/linux.h
> @@ -48,6 +48,10 @@ along with GCC; see the file COPYING3.  If not see
>
>  #define NO_PROFILE_COUNTERS    1
>
> +/* Choose the correct profiler mcount name.  */
> +#undef MCOUNT_NAME_BEFORE_PROLOGUE
> +#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"
> +
>  #undef MCOUNT_NAME
>  #define MCOUNT_NAME "mcount"
>
> But when I try to set -mfentry on a simple test program I get
>
>  sorry, unimplemented: -mfentry isn't supported for this target
>
> I think that's because of
>
> +#if defined(PROFILE_BEFORE_PROLOGUE)
> +    default_profile_top_flag = 1;
> +#endif
> +#if defined(MCOUNT_NAME) && defined (MCOUNT_NAME_BEFORE_PROLOGUE)
> +    only_default = 0;
> +#endif
> +
> +    if (flag_fentry == -1)
> +      flag_fentry = default_profile_top_flag;
> +    else if (flag_fentry != default_profile_top_flag && only_default)
> +      {
> +        if (!default_profile_top_flag)
> +          sorry ("-mfentry isn't supported for this target");
> +        else
> +          sorry ("-mno-fentry isn't supported for this target");
>
>
> and PROFILE_BEFORE_PROLOGUE is never set for i386, default_profile_flag
> is always 0
>
> -Andi
>

Hmm, I can't reproduce this. Clearly, if PROFILE_BEFORE_PROLOGUE isn't
set, the default remains the profile counter call after the prologue.
But if you have 'defined(MCOUNT_NAME) && defined
(MCOUNT_NAME_BEFORE_PROLOGUE)', the variable only_default is false, so
the error you are showing shouldn't be reachable. Am I missing
something here?
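
The option handling under discussion can be modelled outside of GCC.
The sketch below is a standalone restatement of the block added to
override_options in the quoted patch; the function resolve_fentry and
its result struct are hypothetical, and the two boolean parameters
stand in for the preprocessor conditions.

```c
#include <stdbool.h>

/* Result of resolving the tri-state -mfentry flag.  */
struct fentry_result
{
  int flag_fentry;  /* final value: 0 or 1 */
  bool sorry;       /* true if a "sorry" diagnostic would be issued */
};

/* Standalone model of the override_options block.
   FLAG is -1 (option not given), 0 (-mno-fentry) or 1 (-mfentry).
   DEFAULT_TOP models "PROFILE_BEFORE_PROLOGUE is defined".
   BOTH_NAMES models "MCOUNT_NAME and MCOUNT_NAME_BEFORE_PROLOGUE are
   both defined".  */
struct fentry_result
resolve_fentry (int flag, bool default_top, bool both_names)
{
  struct fentry_result r = { flag, false };
  int default_flag = default_top ? 1 : 0;
  bool only_default = !both_names;

  if (flag == -1)
    r.flag_fentry = default_flag;
  else if (flag != default_flag && only_default)
    {
      /* The request differs from the only supported setting.  */
      r.sorry = true;
      r.flag_fentry = default_flag;
    }
  return r;
}
```

In this model, -mfentry with both names defined is accepted even when
PROFILE_BEFORE_PROLOGUE is undefined; the diagnostic appears only when
both_names is false.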

Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-16 17:06           ` Richard Henderson
@ 2010-07-17  6:52             ` Kai Tietz
  2010-07-20  2:27               ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-17  6:52 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Andi Kleen, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 3077 bytes --]

Hello Richard,

thanks for the review.

2010/7/16 Richard Henderson <rth@redhat.com>:
> On 07/15/2010 11:08 AM, Kai Tietz wrote:
>>       (override_options): Add special handling for -mfentry.
>>       (ix86_function_regparm): Likewise.
>>       (ix86_function_sseregparm): Likewise.
>
> Why are you adding the special-casing to the two regparm functions?
> You do realize that suddenly your mcount_top function may have zero
> free registers with which to do its job.  I think that's a bad idea.

Well, as the before-prologue mcount call has to be transparent (saving
and restoring all registers it modifies), I don't see why having no
free registers would be a problem.

>>       (ix86_save_reg):
>>       (ix86_expand_prologue):
>>       (x86_function_profiler):
>
> No changes?  ;-)
Well, missed the Likewise :}

> x86_function_profiler is broken for -m32 -mfentry -fpic.  It uses
> %ebx which has not been set up.  I think perhaps this combination
> cannot really be supported, and this should be diagnosed back in
> override_options.

Yes, I added it to my updated patch.

>> +@hook TARGET_PROFILE_BEFORE_PROLOGUE
>> +It returns true if target wants profile code emitted before
>> +prologue.
>> +
>> +The default version of this hook use the target macro
>> +@code{PROFILE_BEFORE_PROLOGUE}.
>> +@end deftypefn
>
> The text should go in target.def, in the ""; only the @hook line goes here.

Ok, moved into target.def (btw I think it is one of the first hooks
using this feature).

Following the discussion on IRC, some of the changes to the fpic checks
are superfluous, as they can't in fact be reached. So I removed the
additional flag_fentry checks there.
The option -fpic is indeed not possible on 32-bit x86 together with
profiling and -mfentry. So I added a sorry and fall back to profiling
after the prologue, which sets up ebx correctly.
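
A minimal sketch of the kind of check described above, under
assumptions: the function check_fentry_pic is hypothetical, and the
real condition and diagnostic in the updated patch may differ (for
instance, it may also depend on whether profiling is enabled).

```c
#include <stdbool.h>

/* Hypothetical model of the extra option check.  Returns the effective
   fentry setting and sets *SORRY when the requested combination would
   be rejected.  */
int
check_fentry_pic (int flag_fentry, bool target_64bit, bool flag_pic,
                  bool *sorry)
{
  *sorry = false;
  if (flag_fentry && !target_64bit && flag_pic)
    {
      /* %ebx is not set up before the prologue, so fall back to
         profiling after the prologue.  */
      *sorry = true;
      return 0;
    }
  return flag_fentry;
}
```

Only the 32-bit PIC case is rejected in this model; 64-bit PIC and
32-bit non-PIC keep the requested setting.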

ChangeLog

	* config/i386/cygming.h (MCOUNT_NAME): New.
	(MCOUNT_NAME_BEFORE_PROLOGUE): New.
	* config/i386/i386.c (ix86_profile_before_prologue): New.
	(override_options): Add special handling for -mfentry.
	(ix86_function_regparm): Likewise.
	(ix86_function_sseregparm): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_expand_prologue): Check for ms_hook_prologue.
	(x86_function_profiler): Adjust mcount output.
	(TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
	* config/i386/i386.opt (mfentry): New.
	* doc/invoke.texi (mfentry): Add documentation.
	* doc/tm.texi: Regenerated.
	* doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
	* final.c (final_start_function): Replace macro
	PROFILE_BEFORE_PROLOGUE by target hook.
	* function.c (thread_prologue_and_epilogue_insns): Likewise.
	* target.def (profile_before_prologue): New hook.
	* targhooks.c (default_profile_before_prologue): New.
	* targhooks.h (default_profile_before_prologue): New.

Retested for x86_64-pc-mingw32, i686-pc-cygwin, and i686-pc-linux-gnu.
OK to apply?

Regards,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

[-- Attachment #2: profileb.diff --]
[-- Type: application/octet-stream, Size: 12735 bytes --]

Index: gcc/gcc/config/i386/cygming.h
===================================================================
--- gcc.orig/gcc/config/i386/cygming.h	2010-07-15 19:15:51.651349100 +0200
+++ gcc/gcc/config/i386/cygming.h	2010-07-15 19:49:18.100111400 +0200
@@ -39,6 +39,13 @@
 #undef DEFAULT_ABI
 #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI)
 
+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME
+#define MCOUNT_NAME "_mcount"
+
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "_mcount_top"
+
 #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
 #undef USER_LABEL_PREFIX
 #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
Index: gcc/gcc/config/i386/i386.c
===================================================================
--- gcc.orig/gcc/config/i386/i386.c	2010-07-15 19:15:51.653349100 +0200
+++ gcc/gcc/config/i386/i386.c	2010-07-17 07:54:24.270241900 +0200
@@ -2768,6 +2768,15 @@
     }
 }
 
+/* Return true if profiling code should be emitted before the
+   prologue, false otherwise.
+   Note: For 32-bit x86 with "hotfix" (ms_hook_prologue) it is sorried.  */
+static bool
+ix86_profile_before_prologue (void)
+{
+  return flag_fentry != 0;
+}
+
 /* Function that is callable from the debugger to print the current
    options.  */
 void
@@ -3671,6 +3680,34 @@
     target_flags |= MASK_CLD & ~target_flags_explicit;
 #endif
 
+  {
+    int default_profile_top_flag = 0;
+    int only_default = 1;
+    bool force_default_profile_top_flag = (!TARGET_64BIT && flag_pic);
+
+#if defined(PROFILE_BEFORE_PROLOGUE)
+    default_profile_top_flag = (force_default_profile_top_flag ? 0 : 1);
+#endif
+#if defined(MCOUNT_NAME) && defined(MCOUNT_NAME_BEFORE_PROLOGUE)
+    only_default = 0;
+#endif
+
+    if (flag_fentry == -1)
+      flag_fentry = default_profile_top_flag;
+    if (flag_fentry != 0 && force_default_profile_top_flag)
+      {
+	sorry ("-mfentry isn't supported for x86 in combination with -fpic");
+	flag_fentry = 0;
+      }
+    else if (flag_fentry != default_profile_top_flag && only_default)
+      {
+        if (!default_profile_top_flag)
+          sorry ("-mfentry isn't supported for this target");
+        else
+          sorry ("-mno-fentry isn't supported for this target");
+	flag_fentry = default_profile_top_flag;
+      }
+  }
   /* Save the initial options in case the user does function specific options */
   if (main_args_p)
     target_option_default_node = target_option_current_node
@@ -4841,7 +4878,7 @@
   if (decl
       && TREE_CODE (decl) == FUNCTION_DECL
       && optimize
-      && !profile_flag)
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
@@ -4913,7 +4950,8 @@
 
   /* For local functions, pass up to SSE_REGPARM_MAX SFmode
      (and DFmode for SSE2) arguments in SSE registers.  */
-  if (decl && TARGET_SSE_MATH && optimize && !profile_flag)
+  if (decl && TARGET_SSE_MATH && optimize
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl));
@@ -7875,7 +7913,7 @@
 	  || ix86_current_function_calls_tls_descriptor))
     return true;
 
-  if (crtl->profile)
+  if (crtl->profile && !flag_fentry)
     return true;
 
   return false;
@@ -8143,7 +8181,8 @@
 static unsigned int
 ix86_select_alt_pic_regnum (void)
 {
-  if (current_function_is_leaf && !crtl->profile
+  if (current_function_is_leaf
+      && !crtl->profile
       && !ix86_current_function_calls_tls_descriptor)
     {
       int i, drap;
@@ -9191,6 +9230,11 @@
     {
       rtx push, mov;
 
+      /* Check if profiling is active and we shall use profiling before
+         prologue variant. If so sorry.  */
+      if (crtl->profile && flag_fentry != 0)
+        sorry ("ms_hook_prologue attribute isn't compatible with -mfentry for 32-bit");
+
       /* Make sure the function starts with
 	 8b ff     movl.s %edi,%edi (emited by ix86_asm_output_function_label)
 	 55        push   %ebp
@@ -9480,7 +9524,7 @@
      when mcount needs it.  Blockage to avoid call movement across mcount
      call is emitted in generic code after the NOTE_INSN_PROLOGUE_END
      note.  */
-  if (crtl->profile && pic_reg_used)
+  if (crtl->profile && !flag_fentry && pic_reg_used)
     emit_insn (gen_prologue_use (pic_offset_table_rtx));
 
   if (crtl->drap_reg && !crtl->stack_realign_needed)
@@ -27282,11 +27326,26 @@
   return computed;
 }
 
+#if !defined(MCOUNT_NAME) && !defined(MCOUNT_NAME_BEFORE_PROLOGUE)
+#error MCOUNT_NAME and/or MCOUNT_NAME_BEFORE_PROLOGUE have to be defined
+#endif
+
+/* Make sure both are getting defined.  */
+#ifndef MCOUNT_NAME
+#define MCOUNT_NAME MCOUNT_NAME_BEFORE_PROLOGUE
+#endif
+#ifndef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME
+#endif
+
 /* Output assembler code to FILE to increment profiler label # LABELNO
    for profiling a function entry.  */
 void
 x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
 {
+  const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE
+					 : MCOUNT_NAME);
+
   if (TARGET_64BIT)
     {
 #ifndef NO_PROFILE_COUNTERS
@@ -27294,9 +27353,9 @@
 #endif
 
       if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-	fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file);
+	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
       else
-	fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+	fprintf (file, "\tcall\t%s\n", mcount_name);
     }
   else if (flag_pic)
     {
@@ -27304,7 +27363,7 @@
       fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file);
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
     }
   else
     {
@@ -27312,7 +27371,7 @@
       fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+      fprintf (file, "\tcall\t%s\n", mcount_name);
     }
 }
 
@@ -31360,6 +31419,9 @@
 #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD
 #endif
 
+#undef TARGET_PROFILE_BEFORE_PROLOGUE
+#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue
+
 #undef TARGET_ASM_UNALIGNED_HI_OP
 #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP
 #undef TARGET_ASM_UNALIGNED_SI_OP
Index: gcc/gcc/config/i386/i386.opt
===================================================================
--- gcc.orig/gcc/config/i386/i386.opt	2010-07-15 19:15:51.665349100 +0200
+++ gcc/gcc/config/i386/i386.opt	2010-07-15 19:19:10.376715500 +0200
@@ -375,3 +375,7 @@
 mf16c
 Target Report Mask(ISA_F16C) Var(ix86_isa_flags) VarExists Save
 Support F16C built-in functions and code generation
+
+mfentry
+Target Report Var(flag_fentry) Init(-1)
+Emit profiling counter call at function entry before prologue.
Index: gcc/gcc/doc/invoke.texi
===================================================================
--- gcc.orig/gcc/doc/invoke.texi	2010-07-15 19:15:51.666349100 +0200
+++ gcc/gcc/doc/invoke.texi	2010-07-15 19:19:10.385716000 +0200
@@ -601,7 +601,7 @@
 -momit-leaf-frame-pointer  -mno-red-zone -mno-tls-direct-seg-refs @gol
 -mcmodel=@var{code-model} -mabi=@var{name} @gol
 -m32  -m64 -mlarge-data-threshold=@var{num} @gol
--msse2avx}
+-msse2avx -mfentry}
 
 @emph{IA-64 Options}
 @gccoptlist{-mbig-endian  -mlittle-endian  -mgnu-as  -mgnu-ld  -mno-pic @gol
@@ -12466,6 +12466,14 @@
 @opindex msse2avx
 Specify that the assembler should encode SSE instructions with VEX
 prefix.  The option @option{-mavx} turns this on by default.
+
+@item -mfentry
+@itemx -mno-fentry
+@opindex mfentry
+If profiling is active @option{-pg} put the profiling
+counter call before prologue.
+Note: On x86 architectures the attribute @code{ms_hook_prologue}
+isn't possible at the moment for @option{-mfentry} and @option{-pg}.
 @end table
 
 These @samp{-m} switches are supported in addition to the above
Index: gcc/gcc/doc/tm.texi
===================================================================
--- gcc.orig/gcc/doc/tm.texi	2010-07-15 19:15:51.668349100 +0200
+++ gcc/gcc/doc/tm.texi	2010-07-17 08:41:34.231106400 +0200
@@ -7101,6 +7101,13 @@
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void)
+It returns true if target wants profile code emitted before prologue.
+
+The default version of this hook use the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp})
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/doc/tm.texi.in
===================================================================
--- gcc.orig/gcc/doc/tm.texi.in	2010-07-15 19:15:51.673349100 +0200
+++ gcc/gcc/doc/tm.texi.in	2010-07-17 08:04:39.523432400 +0200
@@ -7101,6 +7101,8 @@
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@hook TARGET_PROFILE_BEFORE_PROLOGUE
+
 @hook TARGET_BINDS_LOCAL_P
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/final.c
===================================================================
--- gcc.orig/gcc/final.c	2010-07-15 19:15:51.674349100 +0200
+++ gcc/gcc/final.c	2010-07-15 19:19:10.403717000 +0200
@@ -1546,10 +1546,8 @@
 
   /* The Sun386i and perhaps other machines don't work right
      if the profiling code comes after the prologue.  */
-#ifdef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* PROFILE_BEFORE_PROLOGUE */
 
 #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
   if (dwarf2out_do_frame ())
@@ -1591,10 +1589,8 @@
 static void
 profile_after_prologue (FILE *file ATTRIBUTE_UNUSED)
 {
-#ifndef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (!targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* not PROFILE_BEFORE_PROLOGUE */
 }
 
 static void
Index: gcc/gcc/function.c
===================================================================
--- gcc.orig/gcc/function.c	2010-07-15 19:15:51.675349100 +0200
+++ gcc/gcc/function.c	2010-07-15 19:19:10.408717300 +0200
@@ -5100,13 +5100,11 @@
       record_insns (seq, NULL, &prologue_insn_hash);
       emit_note (NOTE_INSN_PROLOGUE_END);
 
-#ifndef PROFILE_BEFORE_PROLOGUE
       /* Ensure that instructions are not moved into the prologue when
 	 profiling is on.  The call to the profiling routine can be
 	 emitted within the live range of a call-clobbered register.  */
-      if (crtl->profile)
+      if (!targetm.profile_before_prologue () && crtl->profile)
         emit_insn (gen_blockage ());
-#endif
 
       seq = get_insns ();
       end_sequence ();
Index: gcc/gcc/target.def
===================================================================
--- gcc.orig/gcc/target.def	2010-07-15 19:15:51.676349100 +0200
+++ gcc/gcc/target.def	2010-07-17 08:38:38.423050800 +0200
@@ -1218,6 +1218,15 @@
  bool, (const_tree exp),
  default_binds_local_p)
 
+/* Check if profiling code is before or after prologue.  */
+DEFHOOK
+(profile_before_prologue,
+ "It returns true if target wants profile code emitted before prologue.\n\n\
+The default version of this hook use the target macro\n\
+@code{PROFILE_BEFORE_PROLOGUE}.",
+ bool, (void),
+ default_profile_before_prologue)
+
 /* Modify and return the identifier of a DECL's external name,
    originally identified by ID, as required by the target,
    (eg, append @nn to windows32 stdcall function names).
Index: gcc/gcc/targhooks.c
===================================================================
--- gcc.orig/gcc/targhooks.c	2010-07-15 19:15:51.678349100 +0200
+++ gcc/gcc/targhooks.c	2010-07-15 19:19:10.413717600 +0200
@@ -1197,4 +1197,14 @@
 #endif
 }
 
+bool
+default_profile_before_prologue (void)
+{
+#ifndef PROFILE_BEFORE_PROLOGUE
+  return false;
+#else
+  return true;
+#endif
+}
+
 #include "gt-targhooks.h"
Index: gcc/gcc/targhooks.h
===================================================================
--- gcc.orig/gcc/targhooks.h	2010-07-15 19:15:51.687349100 +0200
+++ gcc/gcc/targhooks.h	2010-07-15 19:19:10.416717800 +0200
@@ -150,3 +150,4 @@
 extern int default_register_move_cost (enum machine_mode, reg_class_t,
 				       reg_class_t);
 
+extern bool default_profile_before_prologue (void);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-17  5:34             ` Kai Tietz
@ 2010-07-17  9:45               ` Andi Kleen
  2010-07-18 11:37                 ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2010-07-17  9:45 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Andi Kleen, Richard Henderson, GCC Patches

> Hmm, I can't reproduce this. Clear if PROFILE_BEFORE_PROLOGUE isn't
> set, then default remains profile counter call after prologue. But if
> you have 'defined(MCOUNT_NAME) && defined
> (MCOUNT_NAME_BEFORE_PROLOGUE)' the variable only_default is false, and
> so the error you are showing shouldn't be reachable. Do I miss here
> something?

Looks like it was my fault.

It seems adding the define to linux.h is not enough for it to reach
i386.c (MCOUNT_NAME_BEFORE_PROLOGUE is still undefined) on my
configuration (x86_64-linux).

If I add it to linux64.h too, it works. I thought linux64.h would
inherit from linux.h, but apparently that's not the case. With this
patch a simple test program works for a 64-bit build.
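
For the record, the name fallback involved here can be sketched in
plain C. This is only an illustration built from the preprocessor block
in Kai's patch: the single #define stands in for a target header that
provides one name only, and select_mcount_name and
has_distinct_fentry_name are hypothetical helpers, not GCC functions.

```c
#include <assert.h>
#include <string.h>

/* Illustrative only: a target header that defines just one of the two
   names, which is what i386.c saw on x86_64-linux before linux64.h
   was changed.  */
#define MCOUNT_NAME "mcount"

/* Fallback as in the patch: require at least one name, then mirror it.  */
#if !defined(MCOUNT_NAME) && !defined(MCOUNT_NAME_BEFORE_PROLOGUE)
#error MCOUNT_NAME and/or MCOUNT_NAME_BEFORE_PROLOGUE have to be defined
#endif
#ifndef MCOUNT_NAME
#define MCOUNT_NAME MCOUNT_NAME_BEFORE_PROLOGUE
#endif
#ifndef MCOUNT_NAME_BEFORE_PROLOGUE
#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME
#endif

/* Hypothetical helper mirroring the selection in x86_function_profiler.
   FENTRY must already be resolved to 0 or 1 (never the initial -1).  */
static const char *
select_mcount_name (int fentry)
{
  assert (fentry == 0 || fentry == 1);
  return fentry ? MCOUNT_NAME_BEFORE_PROLOGUE : MCOUNT_NAME;
}

/* Hypothetical helper: nonzero if -mfentry would call another symbol.  */
static int
has_distinct_fentry_name (void)
{
  return strcmp (MCOUNT_NAME, MCOUNT_NAME_BEFORE_PROLOGUE) != 0;
}
```

With only one name defined, both settings end up calling the same
symbol. As far as I can tell from the patch, the check in
override_options tests the macros as the target headers left them, so
in that situation only_default stays 1 and -mfentry is rejected.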

-Andi

2010-07-17  Andi Kleen  <ak@linux.intel.com>

	* config/i386/linux.h (MCOUNT_NAME_BEFORE_PROLOGUE): Define.
	* config/i386/linux64.h (MCOUNT_NAME_BEFORE_PROLOGUE): Define.
	(MCOUNT_NAME): Define.

diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h
index 81dfd1e..54051ed 100644
--- a/gcc/config/i386/linux.h
+++ b/gcc/config/i386/linux.h
@@ -48,6 +48,10 @@ along with GCC; see the file COPYING3.  If not see
 
 #define NO_PROFILE_COUNTERS	1
 
+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"
+
 #undef MCOUNT_NAME
 #define MCOUNT_NAME "mcount"
 
diff --git a/gcc/config/i386/linux64.h b/gcc/config/i386/linux64.h
index 33b4dc9..f40fa31 100644
--- a/gcc/config/i386/linux64.h
+++ b/gcc/config/i386/linux64.h
@@ -123,3 +123,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    x86_64 glibc provides it in %fs:0x28.  */
 #define TARGET_THREAD_SSP_OFFSET	(TARGET_64BIT ? 0x28 : 0x14)
 #endif
+
+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"
+
+#undef MCOUNT_NAME
+#define MCOUNT_NAME "mcount"


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-17  9:45               ` Andi Kleen
@ 2010-07-18 11:37                 ` Kai Tietz
  2010-07-18 11:46                   ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-18 11:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: GCC Patches, Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 1477 bytes --]

Hello,

I found one remaining nit for cygwin/mingw, which this patch corrects.
For win32 targets an initial call to _monstartup has to be made in main
before mcount gets called. To handle this I added the new macro
PROFILE_HOOK_BEFORE_PROFILE to cygming.h.
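
As a rough illustration of the resulting output order, here is a small
stand-alone C sketch. It only models the text that
x86_function_profiler prints for cygwin/mingw in the simple non-PIC
case without profile counters; emit_entry_profile and its parameters
are hypothetical names for this example and not part of GCC. In the
non-fentry case the real compiler emits the _monstartup call as RTL
through PROFILE_HOOK instead, which this sketch does not model.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model of the assembler text printed at function entry.
   FENTRY models flag_fentry, IS_MAIN models the MAIN_NAME_P test and
   LABEL_PREFIX models user_label_prefix.  */
static void
emit_entry_profile (FILE *out, int fentry, int is_main,
                    const char *label_prefix)
{
  assert (out != NULL && label_prefix != NULL);

  /* With -mfentry, main first initialises the profiler runtime.  */
  if (fentry && is_main)
    fprintf (out, "\tcall\t%s_monstartup\n", label_prefix);

  /* Then the profiler call itself, using the names from cygming.h.  */
  fprintf (out, "\tcall\t%s\n", fentry ? "_mcount_top" : "_mcount");
}
```

Called for main with fentry set and a "_" label prefix, the sketch
prints the __monstartup call first and the _mcount_top call second.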

So here is the new patch with the updated ChangeLog:

	* config/i386/cygming.h (MCOUNT_NAME): New.
	(MCOUNT_NAME_BEFORE_PROLOGUE): New.
	(PROFILE_HOOK_BEFORE_PROFILE): New.
	(PROFILE_HOOK): Check that fentry is not active.
	* config/i386/i386.c (ix86_profile_before_prologue): New.
	(override_options): Add special handling for -mfentry.
	(ix86_function_regparm): Likewise.
	(ix86_function_sseregparm): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_expand_prologue): Check for ms_hook_prologue.
	(x86_function_profiler): Adjust mcount output and
	call PROFILE_HOOK_BEFORE_PROFILE.
	(TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
	* config/i386/i386.opt (mfentry): New.
	* doc/invoke.texi (mfentry): Add documentation.
	* doc/tm.texi: Regenerated.
	* doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
	* final.c (final_start_function): Replace macro
	PROFILE_BEFORE_PROLOGUE by target hook.
	* function.c (thread_prologue_and_epilogue_insns): Likewise.
	* target.def (profile_before_prologue): New hook.
	* targhooks.c (default_profile_before_prologue): New.
	* targhooks.h (default_profile_before_prologue): New.


Tested for i686-pc-mingw32, i686-pc-cygwin, and x86_64-pc-mingw32. OK to apply?

Regards,
Kai

[-- Attachment #2: profileb.diff --]
[-- Type: application/octet-stream, Size: 14555 bytes --]

Index: gcc/gcc/config/i386/cygming.h
===================================================================
--- gcc.orig/gcc/config/i386/cygming.h	2010-07-18 12:15:00.061060600 +0200
+++ gcc/gcc/config/i386/cygming.h	2010-07-18 13:35:41.901998000 +0200
@@ -39,6 +39,13 @@ along with GCC; see the file COPYING3.
 #undef DEFAULT_ABI
 #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI)
 
+/* Choose the correct profiler mcount name.  */
+#undef MCOUNT_NAME
+#define MCOUNT_NAME "_mcount"
+
+#undef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE "_mcount_top"
+
 #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
 #undef USER_LABEL_PREFIX
 #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
@@ -327,7 +334,7 @@ do {						\
 
 #undef PROFILE_HOOK
 #define PROFILE_HOOK(LABEL)						\
-  if (MAIN_NAME_P (DECL_NAME (current_function_decl)))			\
+  if (!flag_fentry && MAIN_NAME_P (DECL_NAME (current_function_decl)))	\
     {									\
       emit_call_insn (gen_rtx_CALL (VOIDmode,				\
 	gen_rtx_MEM (FUNCTION_MODE,					\
@@ -335,6 +342,13 @@ do {						\
 	const0_rtx));							\
     }
 
+#undef PROFILE_HOOK_BEFORE_PROFILE
+#define PROFILE_HOOK_BEFORE_PROFILE(FILE, LABEL)			\
+  if (flag_fentry && MAIN_NAME_P (DECL_NAME (current_function_decl)))	\
+    {									\
+      fprintf ((FILE), "\tcall\t%s_monstartup\n", user_label_prefix);	\
+    }
+
 /* Java Native Interface (JNI) methods on Win32 are invoked using the
    stdcall calling convention.  */
 #undef MODIFY_JNI_METHOD_CALL
Index: gcc/gcc/config/i386/i386.c
===================================================================
--- gcc.orig/gcc/config/i386/i386.c	2010-07-18 12:15:00.062060600 +0200
+++ gcc/gcc/config/i386/i386.c	2010-07-18 13:34:31.547974000 +0200
@@ -2768,6 +2768,15 @@ software_prefetching_beneficial_p (void)
     }
 }
 
+/* Return true if profiling code should be emitted before the
+   prologue, false otherwise.
+   Note: For 32-bit x86 with "hotfix" (ms_hook_prologue) it is sorried.  */
+static bool
+ix86_profile_before_prologue (void)
+{
+  return flag_fentry != 0;
+}
+
 /* Function that is callable from the debugger to print the current
    options.  */
 void
@@ -3671,6 +3680,34 @@ override_options (bool main_args_p)
     target_flags |= MASK_CLD & ~target_flags_explicit;
 #endif
 
+  {
+    int default_profile_top_flag = 0;
+    int only_default = 1;
+    bool force_default_profile_top_flag = (!TARGET_64BIT && flag_pic);
+
+#if defined(PROFILE_BEFORE_PROLOGUE)
+    default_profile_top_flag = (force_default_profile_top_flag ? 0 : 1);
+#endif
+#if defined(MCOUNT_NAME) && defined(MCOUNT_NAME_BEFORE_PROLOGUE)
+    only_default = 0;
+#endif
+
+    if (flag_fentry == -1)
+      flag_fentry = default_profile_top_flag;
+    if (flag_fentry != 0 && force_default_profile_top_flag)
+      {
+	sorry ("-mfentry isn't supported for x86 in combination with -fpic");
+	flag_fentry = 0;
+      }
+    else if (flag_fentry != default_profile_top_flag && only_default)
+      {
+        if (!default_profile_top_flag)
+          sorry ("-mfentry isn't supported for this target");
+        else
+          sorry ("-mno-fentry isn't supported for this target");
+	flag_fentry = default_profile_top_flag;
+      }
+  }
   /* Save the initial options in case the user does function specific options */
   if (main_args_p)
     target_option_default_node = target_option_current_node
@@ -4841,7 +4878,7 @@ ix86_function_regparm (const_tree type,
   if (decl
       && TREE_CODE (decl) == FUNCTION_DECL
       && optimize
-      && !profile_flag)
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
@@ -4913,7 +4950,8 @@ ix86_function_sseregparm (const_tree typ
 
   /* For local functions, pass up to SSE_REGPARM_MAX SFmode
      (and DFmode for SSE2) arguments in SSE registers.  */
-  if (decl && TARGET_SSE_MATH && optimize && !profile_flag)
+  if (decl && TARGET_SSE_MATH && optimize
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl));
@@ -7878,7 +7916,7 @@ ix86_frame_pointer_required (void)
 	  || ix86_current_function_calls_tls_descriptor))
     return true;
 
-  if (crtl->profile)
+  if (crtl->profile && !flag_fentry)
     return true;
 
   return false;
@@ -8146,7 +8184,8 @@ gen_push (rtx arg)
 static unsigned int
 ix86_select_alt_pic_regnum (void)
 {
-  if (current_function_is_leaf && !crtl->profile
+  if (current_function_is_leaf
+      && !crtl->profile
       && !ix86_current_function_calls_tls_descriptor)
     {
       int i, drap;
@@ -9194,6 +9233,11 @@ ix86_expand_prologue (void)
     {
       rtx push, mov;
 
+      /* Check if profiling is active and we shall use profiling before
+         prologue variant. If so sorry.  */
+      if (crtl->profile && flag_fentry != 0)
+        sorry ("ms_hook_prologue attribute isn't compatible with -mfentry for 32-bit");
+
       /* Make sure the function starts with
 	 8b ff     movl.s %edi,%edi (emited by ix86_asm_output_function_label)
 	 55        push   %ebp
@@ -9483,7 +9527,7 @@ ix86_expand_prologue (void)
      when mcount needs it.  Blockage to avoid call movement across mcount
      call is emitted in generic code after the NOTE_INSN_PROLOGUE_END
      note.  */
-  if (crtl->profile && pic_reg_used)
+  if (crtl->profile && !flag_fentry && pic_reg_used)
     emit_insn (gen_prologue_use (pic_offset_table_rtx));
 
   if (crtl->drap_reg && !crtl->stack_realign_needed)
@@ -27285,11 +27329,30 @@ x86_field_alignment (tree field, int com
   return computed;
 }
 
+#if !defined(MCOUNT_NAME) && !defined(MCOUNT_NAME_BEFORE_PROLOGUE)
+#error MCOUNT_NAME and/or MCOUNT_NAME_BEFORE_PROLOGUE have to be defined
+#endif
+
+/* Make sure both are getting defined.  */
+#ifndef MCOUNT_NAME
+#define MCOUNT_NAME MCOUNT_NAME_BEFORE_PROLOGUE
+#endif
+#ifndef MCOUNT_NAME_BEFORE_PROLOGUE
+#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME
+#endif
+
 /* Output assembler code to FILE to increment profiler label # LABELNO
    for profiling a function entry.  */
 void
 x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
 {
+  const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE
+					 : MCOUNT_NAME);
+
+#ifdef PROFILE_HOOK_BEFORE_PROFILE
+  PROFILE_HOOK_BEFORE_PROFILE (file, labelno);
+#endif
+
   if (TARGET_64BIT)
     {
 #ifndef NO_PROFILE_COUNTERS
@@ -27297,9 +27360,9 @@ x86_function_profiler (FILE *file, int l
 #endif
 
       if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-	fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file);
+	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
       else
-	fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+	fprintf (file, "\tcall\t%s\n", mcount_name);
     }
   else if (flag_pic)
     {
@@ -27307,7 +27370,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file);
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
     }
   else
     {
@@ -27315,7 +27378,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+      fprintf (file, "\tcall\t%s\n", mcount_name);
     }
 }
 
@@ -31363,6 +31426,9 @@ ix86_enum_va_list (int idx, const char *
 #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD
 #endif
 
+#undef TARGET_PROFILE_BEFORE_PROLOGUE
+#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue
+
 #undef TARGET_ASM_UNALIGNED_HI_OP
 #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP
 #undef TARGET_ASM_UNALIGNED_SI_OP
Index: gcc/gcc/config/i386/i386.opt
===================================================================
--- gcc.orig/gcc/config/i386/i386.opt	2010-07-18 12:15:00.078060600 +0200
+++ gcc/gcc/config/i386/i386.opt	2010-07-18 13:34:31.568974000 +0200
@@ -375,3 +375,7 @@ Support RDRND built-in functions and cod
 mf16c
 Target Report Mask(ISA_F16C) Var(ix86_isa_flags) VarExists Save
 Support F16C built-in functions and code generation
+
+mfentry
+Target Report Var(flag_fentry) Init(-1)
+Emit profiling counter call at function entry before prologue.
Index: gcc/gcc/doc/invoke.texi
===================================================================
--- gcc.orig/gcc/doc/invoke.texi	2010-07-18 12:15:00.081060600 +0200
+++ gcc/gcc/doc/invoke.texi	2010-07-18 13:34:31.591974000 +0200
@@ -601,7 +601,7 @@ Objective-C and Objective-C++ Dialects}.
 -momit-leaf-frame-pointer  -mno-red-zone -mno-tls-direct-seg-refs @gol
 -mcmodel=@var{code-model} -mabi=@var{name} @gol
 -m32  -m64 -mlarge-data-threshold=@var{num} @gol
--msse2avx}
+-msse2avx -mfentry}
 
 @emph{IA-64 Options}
 @gccoptlist{-mbig-endian  -mlittle-endian  -mgnu-as  -mgnu-ld  -mno-pic @gol
@@ -12466,6 +12466,14 @@ For systems that use GNU libc, the defau
 @opindex msse2avx
 Specify that the assembler should encode SSE instructions with VEX
 prefix.  The option @option{-mavx} turns this on by default.
+
+@item -mfentry
+@itemx -mno-fentry
+@opindex mfentry
+If profiling is active @option{-pg} put the profiling
+counter call before prologue.
+Note: On x86 architectures the attribute @code{ms_hook_prologue}
+isn't possible at the moment for @option{-mfentry} and @option{-pg}.
 @end table
 
 These @samp{-m} switches are supported in addition to the above
Index: gcc/gcc/doc/tm.texi
===================================================================
--- gcc.orig/gcc/doc/tm.texi	2010-07-18 12:15:00.082060600 +0200
+++ gcc/gcc/doc/tm.texi	2010-07-18 12:25:33.157271600 +0200
@@ -7101,6 +7101,13 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void)
+It returns true if target wants profile code emitted before prologue.
+
+The default version of this hook use the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp})
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/doc/tm.texi.in
===================================================================
--- gcc.orig/gcc/doc/tm.texi.in	2010-07-18 12:15:00.089060600 +0200
+++ gcc/gcc/doc/tm.texi.in	2010-07-18 12:25:33.163271900 +0200
@@ -7101,6 +7101,8 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@hook TARGET_PROFILE_BEFORE_PROLOGUE
+
 @hook TARGET_BINDS_LOCAL_P
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/final.c
===================================================================
--- gcc.orig/gcc/final.c	2010-07-18 12:15:00.090060600 +0200
+++ gcc/gcc/final.c	2010-07-18 12:25:33.167272200 +0200
@@ -1546,10 +1546,8 @@ final_start_function (rtx first ATTRIBUT
 
   /* The Sun386i and perhaps other machines don't work right
      if the profiling code comes after the prologue.  */
-#ifdef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* PROFILE_BEFORE_PROLOGUE */
 
 #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
   if (dwarf2out_do_frame ())
@@ -1591,10 +1589,8 @@ final_start_function (rtx first ATTRIBUT
 static void
 profile_after_prologue (FILE *file ATTRIBUTE_UNUSED)
 {
-#ifndef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (!targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* not PROFILE_BEFORE_PROLOGUE */
 }
 
 static void
Index: gcc/gcc/function.c
===================================================================
--- gcc.orig/gcc/function.c	2010-07-18 12:15:00.091060600 +0200
+++ gcc/gcc/function.c	2010-07-18 12:25:33.172272400 +0200
@@ -5179,13 +5179,11 @@ thread_prologue_and_epilogue_insns (void
       record_insns (seq, NULL, &prologue_insn_hash);
       emit_note (NOTE_INSN_PROLOGUE_END);
 
-#ifndef PROFILE_BEFORE_PROLOGUE
       /* Ensure that instructions are not moved into the prologue when
 	 profiling is on.  The call to the profiling routine can be
 	 emitted within the live range of a call-clobbered register.  */
-      if (crtl->profile)
+      if (!targetm.profile_before_prologue () && crtl->profile)
         emit_insn (gen_blockage ());
-#endif
 
       seq = get_insns ();
       end_sequence ();
Index: gcc/gcc/target.def
===================================================================
--- gcc.orig/gcc/target.def	2010-07-18 12:15:00.093060600 +0200
+++ gcc/gcc/target.def	2010-07-18 12:25:33.175272600 +0200
@@ -1218,6 +1218,15 @@ DEFHOOK
  bool, (const_tree exp),
  default_binds_local_p)
 
+/* Check if profiling code is before or after prologue.  */
+DEFHOOK
+(profile_before_prologue,
+ "It returns true if target wants profile code emitted before prologue.\n\n\
+The default version of this hook use the target macro\n\
+@code{PROFILE_BEFORE_PROLOGUE}.",
+ bool, (void),
+ default_profile_before_prologue)
+
 /* Modify and return the identifier of a DECL's external name,
    originally identified by ID, as required by the target,
    (eg, append @nn to windows32 stdcall function names).
Index: gcc/gcc/targhooks.c
===================================================================
--- gcc.orig/gcc/targhooks.c	2010-07-18 12:15:00.094060600 +0200
+++ gcc/gcc/targhooks.c	2010-07-18 12:25:33.177272700 +0200
@@ -1197,4 +1197,14 @@ default_register_move_cost (enum machine
 #endif
 }
 
+bool
+default_profile_before_prologue (void)
+{
+#ifndef PROFILE_BEFORE_PROLOGUE
+  return false;
+#else
+  return true;
+#endif
+}
+
 #include "gt-targhooks.h"
Index: gcc/gcc/targhooks.h
===================================================================
--- gcc.orig/gcc/targhooks.h	2010-07-18 12:15:00.110060600 +0200
+++ gcc/gcc/targhooks.h	2010-07-18 12:25:33.180272900 +0200
@@ -150,3 +150,4 @@ extern int default_memory_move_cost (enu
 extern int default_register_move_cost (enum machine_mode, reg_class_t,
 				       reg_class_t);
 
+extern bool default_profile_before_prologue (void);
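
As an illustration of what the final.c and function.c hunks above achieve, here is a minimal self-contained sketch. It is not GCC code: `struct target_hooks`, `always_before_prologue`, and `emit_function_entry` are hypothetical names invented for this example, and the "profile;"/"prologue;" strings merely stand in for emitted assembly. The point it tries to show is that, once the compile-time macro test becomes a run-time hook, exactly one of the two call sites emits the profiler call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for GCC's target vector; only one hook is modelled.  */
struct target_hooks
{
  bool (*profile_before_prologue) (void);
};

/* Same shape as the default hook in the patch: true only when the
   legacy macro PROFILE_BEFORE_PROLOGUE is defined.  */
bool
default_profile_before_prologue (void)
{
#ifdef PROFILE_BEFORE_PROLOGUE
  return true;
#else
  return false;
#endif
}

/* Hypothetical target override that always asks for the early call.  */
bool
always_before_prologue (void)
{
  return true;
}

/* Append SRC to the NUL-terminated string in BUF without overflowing SIZE.  */
static void
append (char *buf, size_t size, const char *src)
{
  strncat (buf, src, size - strlen (buf) - 1);
}

/* Write the entry sequence of a function to BUF in emission order.
   The two tests mirror final_start_function and profile_after_prologue.  */
void
emit_function_entry (const struct target_hooks *t, bool profile,
                     char *buf, size_t size)
{
  assert (t != NULL && t->profile_before_prologue != NULL);
  assert (buf != NULL && size > 0);

  buf[0] = '\0';
  if (t->profile_before_prologue () && profile)
    append (buf, size, "profile;");
  append (buf, size, "prologue;");
  if (!t->profile_before_prologue () && profile)
    append (buf, size, "profile;");
}
```

Because both conditions query the same hook, a target can switch the placement per compilation (for example from an option) without rebuilding the middle end, which the old macro could not express.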


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-18 11:37                 ` Kai Tietz
@ 2010-07-18 11:46                   ` Kai Tietz
  0 siblings, 0 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-18 11:46 UTC (permalink / raw)
  To: Richard Henderson; +Cc: GCC Patches, Andi Kleen

2010/7/18 Kai Tietz <ktietz70@googlemail.com>:
> Hello,
>
> I found one missing nit for cygwin/mingw, which is corrected by this
> patch. For win32 target an initial call to _monstartup has to be done
> in main before mcount get called. For this I added to cygming.h the
> new macro PROFILE_HOOK_BEFORE_PROFILE to handle it.
>
> So new patch with updated ChangeLog
>
>        * config/i386/cygming.h (MCOUNT_NAME): New.
>        (MCOUNT_NAME_BEFORE_PROLOGUE): New.
>        (PROFILE_HOOK_BEFORE_PROFILE): New.
>        (PROFILE_HOOK): Check if not fentry is active.
>        * config/i386/i386.c (ix86_profile_before_prologue): New.
>        (override_options): Add special handling for -mfentry.
>        (ix86_function_regparm): Likewise.
>        (ix86_function_sseregparm): Likewise.
>        (ix86_frame_pointer_required): Likewise.
>        (ix86_expand_prologue): Check for ms_hook_prologue.
>        (x86_function_profiler): Adjust mcount output and
>        call PROFILE_HOOK_BEFORE_PROFILE.
>        (TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
>        * config/i386/i386.opt (mfentry): New.
>        * doc/invoke.texi (mfentry): Add documentation.
>        * doc/tm.texi: Regenerated..
>        * doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
>        * final.c (final_start_function): Replace macro
>        PROFILE_BEFORE_PROLOGUE by target hook.
>        * function.c (thread_prologue_and_epilogue_insns): Likewise.
>        * target.def (profile_before_prologue): New hook.
>        * targhooks.c (default_profile_before_prologue): New.
>        * targhooks.h (default_profile_before_prologue): New.
>
>
> Tested for i686-pc-mingw32, i686-pc-cygwin, and x86_64-pc-mingw32. Ok for apply?
>
> Regards,
> Kai
>

Hmm, it doesn't hurt, but it seems to me that with the old behavior the
monstartup hook got called after the mcount function. So I am not sure
whether the last patch is necessary at all. I withdraw the recent patch
and fall back to the one before it.

Sorry for the noise.
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-16 20:53           ` Gerald Pfeifer
@ 2010-07-18 12:20             ` Kai Tietz
  2010-07-18 20:52               ` Gerald Pfeifer
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-18 12:20 UTC (permalink / raw)
  To: Gerald Pfeifer; +Cc: Richard Henderson, Andi Kleen, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 659 bytes --]

2010/7/16 Gerald Pfeifer <gerald@pfeifer.com>:
> On Thu, 15 Jul 2010, Kai Tietz wrote:
>>       * doc/invoke.texi (mfentry): Add documentation.
>
> +Note: On x86 architectures the attribute @code{ms_hook_prologue}
> +isn't possible at the moment for @option{-mfentry} and @option{-pg}.
>
> How about: "...is not supported for...at the moment."?
>
> And if you could add an item to http://gcc.gnu.org/gcc-4.6/changes.html
> when this goes in, that would be good.
>
> Gerald
>

Something like this?

Regards,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

[-- Attachment #2: gcc46_change_profile.diff --]
[-- Type: application/octet-stream, Size: 438 bytes --]

Index: gcc-4.6/changes.html
===================================================================
--- gcc-4.6.orig/changes.html	2010-07-16 13:35:00.000000000 +0200
+++ gcc-4.6/changes.html	2010-07-18 14:18:14.061973200 +0200
@@ -197,7 +197,8 @@
 
 <h3>IA-32/x86-64</h3>
   <ul>
-    <li>...</li>
+	<li>Support of emitting profiler counter call before prologue via
+	command line option <code>-mfentry</code>.</li>
   </ul>
 
 <h3>MIPS</h3>


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-18 12:20             ` Kai Tietz
@ 2010-07-18 20:52               ` Gerald Pfeifer
  2010-07-18 20:54                 ` Kai Tietz
  2010-07-28 18:06                 ` Kai Tietz
  0 siblings, 2 replies; 27+ messages in thread
From: Gerald Pfeifer @ 2010-07-18 20:52 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Richard Henderson, Andi Kleen, gcc-patches

On Sun, 18 Jul 2010, Kai Tietz wrote:
> Something like this?

+       <li>Support of emitting profiler counter call before prologue via
+       command line option <code>-mfentry</code>.</li>

How about

  <li>Support for emitting profiler counter calls before function
  prologues.  This is enabled via a new command-line option
  <code>-mfentry</code>.</li>

?

If this, or a variation, is fine with you, let's give Richard a day or
two to comment and then go ahead and commit.

Thanks,
Gerald


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-18 20:52               ` Gerald Pfeifer
@ 2010-07-18 20:54                 ` Kai Tietz
  2010-07-28 18:06                 ` Kai Tietz
  1 sibling, 0 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-18 20:54 UTC (permalink / raw)
  To: Gerald Pfeifer; +Cc: Richard Henderson, Andi Kleen, gcc-patches

2010/7/18 Gerald Pfeifer <gerald@pfeifer.com>:
> On Sun, 18 Jul 2010, Kai Tietz wrote:
>> Something like this?
>
> +       <li>Support of emitting profiler counter call before prologue via
> +       command line option <code>-mfentry</code>.</li>
>
> How about
>
>  <li>Support for emitting profiler counter calls before function
>  prologues.  This is enabled via a new command-line option
>  <code>-mfentry</code>.</li>
>
> ?
>
> If this, or a variation, is fine with you, let's give Richard a day or
> two to comment and then go ahead and commit.
>
> Thanks,
> Gerald
>

Thanks, I'll take the variant you suggested.

Regards,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-17  6:52             ` Kai Tietz
@ 2010-07-20  2:27               ` Richard Henderson
  2010-07-28  8:36                 ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2010-07-20  2:27 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Andi Kleen, GCC Patches

On 07/16/2010 11:51 PM, Kai Tietz wrote:
> +#ifndef MCOUNT_NAME_BEFORE_PROLOGUE
> +#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME
> +#endif

I begin to wonder if it wouldn't be better to just go ahead and use
Andi's "__fentry__" symbol for all sub-targets.  It's really the only
thing that makes sense given the -mfentry option name.

> +    bool force_default_profile_top_flag = (!TARGET_64BIT && flag_pic);
> +    default_profile_top_flag = (force_default_profile_top_flag ? 0 : 1);

These are confusing names and definitions, because the definitions
don't correspond to the names.  For instance, f_d_p_t_f does not
force the default.  Its definition forces p_t_f off.

Do you see what I mean?

> +#ifndef PROFILE_BEFORE_PROLOGUE
> +  return false;
> +#else
> +  return true;
> +#endif

Avoid the double negative in favor of the positive.


r~


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-20  2:27               ` Richard Henderson
@ 2010-07-28  8:36                 ` Kai Tietz
  2010-07-28 16:00                   ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-28  8:36 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Andi Kleen, GCC Patches

2010/7/20 Richard Henderson <rth@redhat.com>:
> On 07/16/2010 11:51 PM, Kai Tietz wrote:
>> +#ifndef MCOUNT_NAME_BEFORE_PROLOGUE
>> +#define MCOUNT_NAME_BEFORE_PROLOGUE MCOUNT_NAME
>> +#endif
>
> I begin to wonder if it wouldn't be better to just go ahead and use
> Andi's "__fentry__" symbol for all sub-targets.  It's really the only
> thing that makes sense given the -mfentry option name.

Well, maybe it would. But I see that in the current source the MCOUNT_NAME
symbol isn't the same for all targets either. Some add prefixes to it
to indicate private scope, etc. But well, I can give it a whirl. Making
it the default for all targets by defining the default in i386.h is
fine by me, but this will cause link-time failures for targets whose
profiler library does not provide this new entry. Defining it
for each target separately makes analysis of support possible, but
well, whatever you want here.

>> +    bool force_default_profile_top_flag = (!TARGET_64BIT && flag_pic);
>> +    default_profile_top_flag = (force_default_profile_top_flag ? 0 : 1);
>
> These are confusing names and definitions, because they don't
> correspond to their names.  For instance, f_d_p_t_f does not
> force the default.  It's definition forces p_t_f off.
>
> Do you see what I mean?

Yeah, this logic (and especially the names) isn't a lucky choice, as
f_d_p_t_f can be either the default for p_t_f, or the forced
default, which turns p_t_f off (when conflicts are detected for PIC, and
as planned for the additional -mseh frame layout).

>> +#ifndef PROFILE_BEFORE_PROLOGUE
>> +  return false;
>> +#else
>> +  return true;
>> +#endif
>
> Avoid the double negative in favor of the positive.

Ok ;)

Regards,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-28  8:36                 ` Kai Tietz
@ 2010-07-28 16:00                   ` Richard Henderson
  2010-07-28 16:01                     ` Andi Kleen
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2010-07-28 16:00 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Andi Kleen, GCC Patches

On 07/28/2010 12:53 AM, Kai Tietz wrote:
> But well, I can give it a whirl. To
> make it default for all targets by defining the default in i386.h is
> fine by me, but this will cause for targets - not providing in their
> profiler-library this new entry - link-time failures. By defining it
> for each target separately makes analyzis of support possible, but
> well, what ever you want here.

Yes, but there are consumers like the linux kernel that plan to 
provide their own version of the entry point.


r~


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-28 16:00                   ` Richard Henderson
@ 2010-07-28 16:01                     ` Andi Kleen
  2010-07-28 17:28                       ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2010-07-28 16:01 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Kai Tietz, Andi Kleen, GCC Patches

On Wed, Jul 28, 2010 at 08:49:45AM -0700, Richard Henderson wrote:
> On 07/28/2010 12:53 AM, Kai Tietz wrote:
> > But well, I can give it a whirl. To
> > make it default for all targets by defining the default in i386.h is
> > fine by me, but this will cause for targets - not providing in their
> > profiler-library this new entry - link-time failures. By defining it
> > for each target separately makes analyzis of support possible, but
> > well, what ever you want here.
> 
> Yes, but there are consumers like the linux kernel that plan to 
> provide their own version of the entry point.

At least on Linux it has to be guarded by the explicit -mfentry 
option for now. Later on the default could be set by testing glibc 
support in autoconf.

I don't see any reason not to enable a non-default -mfentry on any i386
target.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-28 16:01                     ` Andi Kleen
@ 2010-07-28 17:28                       ` Kai Tietz
  2010-07-28 17:40                         ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-28 17:28 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Richard Henderson, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 2180 bytes --]

2010/7/28 Andi Kleen <andi@firstfloor.org>:
> On Wed, Jul 28, 2010 at 08:49:45AM -0700, Richard Henderson wrote:
>> On 07/28/2010 12:53 AM, Kai Tietz wrote:
>> > But well, I can give it a whirl. To
>> > make it default for all targets by defining the default in i386.h is
>> > fine by me, but this will cause for targets - not providing in their
>> > profiler-library this new entry - link-time failures. By defining it
>> > for each target separately makes analyzis of support possible, but
>> > well, what ever you want here.
>>
>> Yes, but there are consumers like the linux kernel that plan to
>> provide their own version of the entry point.
>
> At least on Linux it has to be guarded by the explicit -mfentry
> option for now. Later on the default could be set by testing glibc
> support in autoconf.
>
> I don't see any reason to not enable non default -mfentry on any i386
> target.
>
> -Andi
> --
> ak@linux.intel.com -- Speaking for myself only.
>

Well, so here is the updated patch for this.

ChangeLog

2010-07-28  Kai Tietz

	* config/i386/i386.h (MCOUNT_NAME_BEFORE_PROLOGUE): New.
	* config/i386/i386.c (ix86_profile_before_prologue): New.
	(override_options): Add special handling for -mfentry.
	(ix86_function_regparm): Likewise.
	(ix86_function_sseregparm): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_expand_prologue): Check for ms_hook_prologue.
	(x86_function_profiler): Adjust mcount output.
	(TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
	* config/i386/i386.opt (mfentry): New.
	* doc/invoke.texi (mfentry): Add documentation.
	* doc/tm.texi: Regenerated..
	* doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
	* final.c (final_start_function): Replace macro
	PROFILE_BEFORE_PROLOGUE by target hook.
	* function.c (thread_prologue_and_epilogue_insns): Likewise.
	* target.def (profile_before_prologue): New hook.
	* targhooks.c (default_profile_before_prologue): New.
	* targhooks.h (default_profile_before_prologue): New.

Ok for apply?

Regards,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

[-- Attachment #2: profileb.diff --]
[-- Type: application/octet-stream, Size: 12345 bytes --]

Index: gcc/gcc/config/i386/i386.c
===================================================================
--- gcc.orig/gcc/config/i386/i386.c	2010-07-24 20:23:22.210750300 +0200
+++ gcc/gcc/config/i386/i386.c	2010-07-28 19:08:29.665001500 +0200
@@ -2768,6 +2768,15 @@ software_prefetching_beneficial_p (void)
     }
 }
 
+/* Return true if profiling code should be emitted before the
+   prologue, otherwise return false.
+   Note: For x86 with "hotfix" a sorry () diagnostic is issued.  */
+static bool
+ix86_profile_before_prologue (void)
+{
+  return flag_fentry != 0;
+}
+
 /* Function that is callable from the debugger to print the current
    options.  */
 void
@@ -3671,6 +3680,18 @@ override_options (bool main_args_p)
     target_flags |= MASK_CLD & ~target_flags_explicit;
 #endif
 
+  if (flag_fentry == -1)
+#if defined(PROFILE_BEFORE_PROLOGUE)
+    flag_fentry = ((!TARGET_64BIT && flag_pic) ? 0 : 1);
+#else
+    flag_fentry = 0;
+#endif
+  if (flag_fentry != 0 && !TARGET_64BIT && flag_pic)
+    {
+      sorry ("-mfentry isn't support for x86 in combination with -fpic");
+      flag_fentry = 0;
+    }
+
   /* Save the initial options in case the user does function specific options */
   if (main_args_p)
     target_option_default_node = target_option_current_node
@@ -4841,7 +4862,7 @@ ix86_function_regparm (const_tree type,
   if (decl
       && TREE_CODE (decl) == FUNCTION_DECL
       && optimize
-      && !profile_flag)
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
@@ -4913,7 +4934,8 @@ ix86_function_sseregparm (const_tree typ
 
   /* For local functions, pass up to SSE_REGPARM_MAX SFmode
      (and DFmode for SSE2) arguments in SSE registers.  */
-  if (decl && TARGET_SSE_MATH && optimize && !profile_flag)
+  if (decl && TARGET_SSE_MATH && optimize
+      && !(profile_flag && !flag_fentry))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl));
@@ -7883,7 +7905,7 @@ ix86_frame_pointer_required (void)
 	  || ix86_current_function_calls_tls_descriptor))
     return true;
 
-  if (crtl->profile)
+  if (crtl->profile && !flag_fentry)
     return true;
 
   return false;
@@ -8151,7 +8173,8 @@ gen_push (rtx arg)
 static unsigned int
 ix86_select_alt_pic_regnum (void)
 {
-  if (current_function_is_leaf && !crtl->profile
+  if (current_function_is_leaf
+      && !crtl->profile
       && !ix86_current_function_calls_tls_descriptor)
     {
       int i, drap;
@@ -9199,6 +9222,11 @@ ix86_expand_prologue (void)
     {
       rtx push, mov;
 
+      /* Check if profiling is active and we shall use profiling before
+         prologue variant. If so sorry.  */
+      if (crtl->profile && flag_fentry != 0)
+        sorry ("ms_hook_prologue attribute isn't compatible with -mfentry for 32-bit");
+
       /* Make sure the function starts with
 	 8b ff     movl.s %edi,%edi (emited by ix86_asm_output_function_label)
 	 55        push   %ebp
@@ -9488,7 +9516,7 @@ ix86_expand_prologue (void)
      when mcount needs it.  Blockage to avoid call movement across mcount
      call is emitted in generic code after the NOTE_INSN_PROLOGUE_END
      note.  */
-  if (crtl->profile && pic_reg_used)
+  if (crtl->profile && !flag_fentry && pic_reg_used)
     emit_insn (gen_prologue_use (pic_offset_table_rtx));
 
   if (crtl->drap_reg && !crtl->stack_realign_needed)
@@ -27296,6 +27324,9 @@ x86_field_alignment (tree field, int com
 void
 x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
 {
+  const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE
+					 : MCOUNT_NAME);
+
   if (TARGET_64BIT)
     {
 #ifndef NO_PROFILE_COUNTERS
@@ -27303,9 +27334,9 @@ x86_function_profiler (FILE *file, int l
 #endif
 
       if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-	fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file);
+	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
       else
-	fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+	fprintf (file, "\tcall\t%s\n", mcount_name);
     }
   else if (flag_pic)
     {
@@ -27313,7 +27344,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file);
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
     }
   else
     {
@@ -27321,7 +27352,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+      fprintf (file, "\tcall\t%s\n", mcount_name);
     }
 }
 
@@ -31369,6 +31400,9 @@ ix86_enum_va_list (int idx, const char *
 #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD
 #endif
 
+#undef TARGET_PROFILE_BEFORE_PROLOGUE
+#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue
+
 #undef TARGET_ASM_UNALIGNED_HI_OP
 #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP
 #undef TARGET_ASM_UNALIGNED_SI_OP
Index: gcc/gcc/config/i386/i386.opt
===================================================================
--- gcc.orig/gcc/config/i386/i386.opt	2010-07-24 20:22:00.867096900 +0200
+++ gcc/gcc/config/i386/i386.opt	2010-07-28 18:57:58.735914400 +0200
@@ -375,3 +375,7 @@ Support RDRND built-in functions and cod
 mf16c
 Target Report Mask(ISA_F16C) Var(ix86_isa_flags) VarExists Save
 Support F16C built-in functions and code generation
+
+mfentry
+Target Report Var(flag_fentry) Init(-1)
+Emit profiling counter call at function entry before prologue.
Index: gcc/gcc/doc/invoke.texi
===================================================================
--- gcc.orig/gcc/doc/invoke.texi	2010-07-24 20:22:00.869096900 +0200
+++ gcc/gcc/doc/invoke.texi	2010-07-28 18:57:58.767916200 +0200
@@ -601,7 +601,7 @@ Objective-C and Objective-C++ Dialects}.
 -momit-leaf-frame-pointer  -mno-red-zone -mno-tls-direct-seg-refs @gol
 -mcmodel=@var{code-model} -mabi=@var{name} @gol
 -m32  -m64 -mlarge-data-threshold=@var{num} @gol
--msse2avx}
+-msse2avx -mfentry}
 
 @emph{IA-64 Options}
 @gccoptlist{-mbig-endian  -mlittle-endian  -mgnu-as  -mgnu-ld  -mno-pic @gol
@@ -12467,6 +12467,14 @@ For systems that use GNU libc, the defau
 @opindex msse2avx
 Specify that the assembler should encode SSE instructions with VEX
 prefix.  The option @option{-mavx} turns this on by default.
+
+@item -mfentry
+@itemx -mno-fentry
+@opindex mfentry
+If profiling is active (@option{-pg}), put the profiling
+counter call before the prologue.
+Note: On x86 architectures the attribute @code{ms_hook_prologue}
+is not supported for @option{-mfentry} and @option{-pg} at the moment.
 @end table
 
 These @samp{-m} switches are supported in addition to the above
Index: gcc/gcc/doc/tm.texi
===================================================================
--- gcc.orig/gcc/doc/tm.texi	2010-07-24 20:22:00.872096900 +0200
+++ gcc/gcc/doc/tm.texi	2010-07-28 18:57:58.810918700 +0200
@@ -7076,6 +7076,13 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void)
+It returns true if target wants profile code emitted before prologue.
+
+The default version of this hook use the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp})
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/doc/tm.texi.in
===================================================================
--- gcc.orig/gcc/doc/tm.texi.in	2010-07-24 20:22:00.879096900 +0200
+++ gcc/gcc/doc/tm.texi.in	2010-07-28 18:57:58.840920400 +0200
@@ -7076,6 +7076,8 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@hook TARGET_PROFILE_BEFORE_PROLOGUE
+
 @hook TARGET_BINDS_LOCAL_P
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/final.c
===================================================================
--- gcc.orig/gcc/final.c	2010-07-24 20:22:00.880096900 +0200
+++ gcc/gcc/final.c	2010-07-28 18:57:58.865921800 +0200
@@ -1546,10 +1546,8 @@ final_start_function (rtx first ATTRIBUT
 
   /* The Sun386i and perhaps other machines don't work right
      if the profiling code comes after the prologue.  */
-#ifdef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* PROFILE_BEFORE_PROLOGUE */
 
 #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
   if (dwarf2out_do_frame ())
@@ -1591,10 +1589,8 @@ final_start_function (rtx first ATTRIBUT
 static void
 profile_after_prologue (FILE *file ATTRIBUTE_UNUSED)
 {
-#ifndef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (!targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* not PROFILE_BEFORE_PROLOGUE */
 }
 
 static void
Index: gcc/gcc/function.c
===================================================================
--- gcc.orig/gcc/function.c	2010-07-24 20:22:00.882096900 +0200
+++ gcc/gcc/function.c	2010-07-28 18:57:58.895923600 +0200
@@ -5183,13 +5183,11 @@ thread_prologue_and_epilogue_insns (void
       record_insns (seq, NULL, &prologue_insn_hash);
       emit_note (NOTE_INSN_PROLOGUE_END);
 
-#ifndef PROFILE_BEFORE_PROLOGUE
       /* Ensure that instructions are not moved into the prologue when
 	 profiling is on.  The call to the profiling routine can be
 	 emitted within the live range of a call-clobbered register.  */
-      if (crtl->profile)
+      if (!targetm.profile_before_prologue () && crtl->profile)
         emit_insn (gen_blockage ());
-#endif
 
       seq = get_insns ();
       end_sequence ();
Index: gcc/gcc/target.def
===================================================================
--- gcc.orig/gcc/target.def	2010-07-24 20:22:00.885096900 +0200
+++ gcc/gcc/target.def	2010-07-28 18:57:58.921925000 +0200
@@ -1218,6 +1218,15 @@ DEFHOOK
  bool, (const_tree exp),
  default_binds_local_p)
 
+/* Check if profiling code is before or after prologue.  */
+DEFHOOK
+(profile_before_prologue,
+ "It returns true if target wants profile code emitted before prologue.\n\n\
+The default version of this hook use the target macro\n\
+@code{PROFILE_BEFORE_PROLOGUE}.",
+ bool, (void),
+ default_profile_before_prologue)
+
 /* Modify and return the identifier of a DECL's external name,
    originally identified by ID, as required by the target,
    (eg, append @nn to windows32 stdcall function names).
Index: gcc/gcc/targhooks.c
===================================================================
--- gcc.orig/gcc/targhooks.c	2010-07-24 20:22:00.886096900 +0200
+++ gcc/gcc/targhooks.c	2010-07-28 19:10:55.778358700 +0200
@@ -1197,4 +1197,14 @@ default_register_move_cost (enum machine
 #endif
 }
 
+bool
+default_profile_before_prologue (void)
+{
+#ifdef PROFILE_BEFORE_PROLOGUE
+  return true;
+#else
+  return false;
+#endif
+}
+
 #include "gt-targhooks.h"
Index: gcc/gcc/targhooks.h
===================================================================
--- gcc.orig/gcc/targhooks.h	2010-07-24 20:22:00.897096900 +0200
+++ gcc/gcc/targhooks.h	2010-07-28 18:57:58.963927400 +0200
@@ -150,3 +150,4 @@ extern int default_memory_move_cost (enu
 extern int default_register_move_cost (enum machine_mode, reg_class_t,
 				       reg_class_t);
 
+extern bool default_profile_before_prologue (void);
Index: gcc/gcc/config/i386/i386.h
===================================================================
--- gcc.orig/gcc/config/i386/i386.h	2010-07-20 19:50:41.000000000 +0200
+++ gcc/gcc/config/i386/i386.h	2010-07-28 19:00:14.958705900 +0200
@@ -1607,6 +1607,8 @@ typedef struct ix86_args {
 
 #define MCOUNT_NAME "_mcount"
 
+#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"
+
 #define PROFILE_COUNT_REGISTER "edx"
 
 /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
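
The x86_function_profiler hunks above replace compile-time string concatenation (fputs with a literal MCOUNT_NAME) by fprintf with a run-time name, which is why each literal `%` in a register name has to become `%%`. The following sketch shows that selection and formatting in isolation. It is a simplification made for illustration, not the patched function: `format_profiler_call` and its boolean parameters are hypothetical, the counter-label output is left out, and the 64-bit case ignores the DEFAULT_ABI check. Only the two symbol names are taken from the patch's i386.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Default symbol names as defined in the patch's i386.h.  */
#define MCOUNT_NAME "_mcount"
#define MCOUNT_NAME_BEFORE_PROLOGUE "__fentry__"

/* Format the profiler call instruction into BUF (hypothetical helper).
   Returns what snprintf returns: the length of the formatted text.  */
int
format_profiler_call (char *buf, size_t size, bool fentry,
                      bool is_64bit, bool pic)
{
  /* The name is now chosen at run time, so it can no longer be pasted
     into the format string by the preprocessor.  */
  const char *mcount_name = fentry ? MCOUNT_NAME_BEFORE_PROLOGUE
                                   : MCOUNT_NAME;

  assert (buf != NULL && size > 0);

  if (is_64bit && pic)
    /* "%%rip" yields a literal "%rip" in the output.  */
    return snprintf (buf, size, "\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
  if (!is_64bit && pic)
    return snprintf (buf, size, "\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
  return snprintf (buf, size, "\tcall\t%s\n", mcount_name);
}
```

Note that the sketch will happily format a 32-bit PIC call to `__fentry__`; in the patch that combination is rejected earlier, in override_options.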


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-28 17:28                       ` Kai Tietz
@ 2010-07-28 17:40                         ` Richard Henderson
  2010-07-28 18:14                           ` Kai Tietz
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2010-07-28 17:40 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Andi Kleen, GCC Patches

On 07/28/2010 10:23 AM, Kai Tietz wrote:
> 	* config/i386/i386.h (MCOUNT_NAME_BEFORE_PROLOGUE): New.
> 	* config/i386/i386.c (ix86_profile_before_prologue): New.
> 	(override_options): Add special handling for -mfentry.
> 	(ix86_function_regparm): Likewise.
> 	(ix86_function_sseregparm): Likewise.
> 	(ix86_frame_pointer_required): Likewise.
> 	(ix86_expand_prologue): Check for ms_hook_prologue.
> 	(x86_function_profiler): Adjust mcount output.
> 	(TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
> 	* config/i386/i386.opt (mfentry): New.
> 	* doc/invoke.texi (mfentry): Add documentation.
> 	* doc/tm.texi: Regenerated..
> 	* doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
> 	* final.c (final_start_function): Replace macro
> 	PROFILE_BEFORE_PROLOGUE by target hook.
> 	* function.c (thread_prologue_and_epilogue_insns): Likewise.
> 	* target.def (profile_before_prologue): New hook.
> 	* targhooks.c (default_profile_before_prologue): New.
> 	* targhooks.h (default_profile_before_prologue): New.
> 
> Ok for apply?

Nearly.

> +  if (flag_fentry == -1)
> +#if defined(PROFILE_BEFORE_PROLOGUE)
> +    flag_fentry = ((!TARGET_64BIT && flag_pic) ? 0 : 1);
> +#else
> +    flag_fentry = 0;
> +#endif
> +  if (flag_fentry != 0 && !TARGET_64BIT && flag_pic)
> +    {
> +      sorry ("-mfentry isn't support for x86 in combination with -fpic");
> +      flag_fentry = 0;
> +    }

Better as

  if (!TARGET_64BIT && flag_pic)
    {
      if (flag_fentry > 0)
        sorry ("-mfentry isn't supported for x86 in combination with -fpic");
      flag_fentry = 0;
    }
  if (flag_fentry < 0)
    {
#if defined(PROFILE_BEFORE_PROLOGUE)
      flag_fentry = 1;
#else
      flag_fentry = 0;
#endif
    }

Ok with that change.


r~
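
The restructured option handling suggested above can be checked in isolation with a small sketch. It assumes the tri-state convention from i386.opt (Init(-1): -1 means not given, 0 means -mno-fentry, 1 means -mfentry). The function `resolve_fentry` and its parameters are hypothetical names for this example; the `diagnosed` flag stands in for the sorry () call, and `target_default_before_prologue` stands in for whether PROFILE_BEFORE_PROLOGUE is defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Resolve the tri-state -mfentry value to 0 or 1 (hypothetical helper).
   *DIAGNOSED is set when the user explicitly requested -mfentry in a
   configuration that cannot support it (32-bit with PIC).  */
int
resolve_fentry (int flag_fentry, bool is_64bit, bool pic,
                bool target_default_before_prologue, bool *diagnosed)
{
  assert (flag_fentry >= -1 && flag_fentry <= 1);
  assert (diagnosed != NULL);

  *diagnosed = false;

  /* Unsupported combination first: complain only on an explicit request,
     but force the feature off in every case.  */
  if (!is_64bit && pic)
    {
      if (flag_fentry > 0)
        *diagnosed = true;
      flag_fentry = 0;
    }

  /* Still unset: fall back to the target's default.  */
  if (flag_fentry < 0)
    flag_fentry = target_default_before_prologue ? 1 : 0;

  return flag_fentry;
}
```

Handling the unsupported case before the default means the default branch never has to repeat the 32-bit PIC test, which is the duplication present in the original hunk.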


* Re: [patch i386]: Add for win32 targets pre-prologue profiling  feature
  2010-07-18 20:52               ` Gerald Pfeifer
  2010-07-18 20:54                 ` Kai Tietz
@ 2010-07-28 18:06                 ` Kai Tietz
  1 sibling, 0 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-28 18:06 UTC (permalink / raw)
  To: Gerald Pfeifer; +Cc: Richard Henderson, Andi Kleen, gcc-patches

2010/7/18 Gerald Pfeifer <gerald@pfeifer.com>:
> On Sun, 18 Jul 2010, Kai Tietz wrote:
>> Something like this?
>
> +       <li>Support of emitting profiler counter call before prologue via
> +       command line option <code>-mfentry</code>.</li>
>
> How about
>
>  <li>Support for emitting profiler counter calls before function
>  prologues.  This is enabled via a new command-line option
>  <code>-mfentry</code>.</li>
>
> ?
>
> If this, or a variation, is fine with you, let's give Richard a day or
> two to comment and then go ahead and commit.
>
> Thanks,
> Gerald
>

OK, I added the changes.html entry as follows:

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/changes.html,v
retrieving revision 1.33
diff -u -3 -r1.33 changes.html
--- changes.html        24 Jul 2010 19:37:48 -0000      1.33
+++ changes.html        28 Jul 2010 18:02:55 -0000
@@ -200,7 +200,9 @@

 <h3>IA-32/x86-64</h3>
   <ul>
-    <li>...</li>
+       <li>Support for emitting profiler counter calls before function
+       prologues.  This is enabled via a new command-line option
+       <code>-mfentry</code>.</li>
   </ul>

 <h3>MIPS</h3>

Regards,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

* Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
  2010-07-28 17:40                         ` Richard Henderson
@ 2010-07-28 18:14                           ` Kai Tietz
  0 siblings, 0 replies; 27+ messages in thread
From: Kai Tietz @ 2010-07-28 18:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Andi Kleen, GCC Patches

2010/7/28 Richard Henderson <rth@redhat.com>:
> On 07/28/2010 10:23 AM, Kai Tietz wrote:
>>       * config/i386/i386.h (MCOUNT_NAME_BEFORE_PROLOGUE): New.
>>       * config/i386/i386.c (ix86_profile_before_prologue): New.
>>       (override_options): Add special handling for -mfentry.
>>       (ix86_function_regparm): Likewise.
>>       (ix86_function_sseregparm): Likewise.
>>       (ix86_frame_pointer_required): Likewise.
>>       (ix86_expand_prologue): Check for ms_hook_prologue.
>>       (x86_function_profiler): Adjust mcount output.
>>       (TARGET_PROFILE_BEFORE_PROLOGUE): Define hook.
>>       * config/i386/i386.opt (mfentry): New.
>>       * doc/invoke.texi (mfentry): Add documentation.
>>       * doc/tm.texi: Regenerated.
>>       * doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE): New.
>>       * final.c (final_start_function): Replace macro
>>       PROFILE_BEFORE_PROLOGUE by target hook.
>>       * function.c (thread_prologue_and_epilogue_insns): Likewise.
>>       * target.def (profile_before_prologue): New hook.
>>       * targhooks.c (default_profile_before_prologue): New.
>>       * targhooks.h (default_profile_before_prologue): New.
>>
>> Ok for apply?
>
> Nearly.
>
>> +  if (flag_fentry == -1)
>> +#if defined(PROFILE_BEFORE_PROLOGUE)
>> +    flag_fentry = ((!TARGET_64BIT && flag_pic) ? 0 : 1);
>> +#else
>> +    flag_fentry = 0;
>> +#endif
>> +  if (flag_fentry != 0 && !TARGET_64BIT && flag_pic)
>> +    {
>> +      sorry ("-mfentry isn't support for x86 in combination with -fpic");
>> +      flag_fentry = 0;
>> +    }
>
> Better as
>
>  if (!TARGET_64BIT && flag_pic)
>    {
>      if (flag_fentry > 0)
>        sorry ("-mfentry isn't support for x86 in combination with -fpic");
>      flag_fentry = 0;
>    }
>  if (flag_fentry < 0)
>    {
> #if defined(PROFILE_BEFORE_PROLOGUE)
>      flag_fentry = 1;
> #else
>      flag_fentry = 0;
> #endif
>    }
>
> Ok with that change.
>
>
> r~
>

Applied at revision 162651 with your suggested code change, and with the
diagnostic reworded to say "supported" and "32-bit" instead of "x86".
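
The ChangeLog quoted above replaces the PROFILE_BEFORE_PROLOGUE macro with a target hook that has a generic default and an i386 override. A rough sketch of that arrangement follows. It is an assumption-laden miniature, not the committed code: struct target_hooks and profile_call_precedes_prologue are hypothetical, the real hook table is generated from target.def, and the body shown for ix86_profile_before_prologue is a guess based on the -mfentry description.

```c
#include <stdbool.h>

/* Hypothetical miniature of a target-hook table.  */
struct target_hooks
{
  bool (*profile_before_prologue) (void);
};

/* Generic default: keep the old macro's meaning, so targets that
   define PROFILE_BEFORE_PROLOGUE see no change in behavior.  */
bool
default_profile_before_prologue (void)
{
#ifdef PROFILE_BEFORE_PROLOGUE
  return true;
#else
  return false;
#endif
}

/* Stand-in for the -mfentry option variable, already resolved
   to 0 or 1 by option processing.  */
int flag_fentry = 0;

/* i386 override (assumed body): the decision is made at run time
   from the option instead of at build time from a macro.  */
bool
ix86_profile_before_prologue (void)
{
  return flag_fentry != 0;
}

/* Hypothetical caller, playing the role of the checks in final.c
   and function.c that used to test the macro with #ifdef.  */
bool
profile_call_precedes_prologue (const struct target_hooks *t)
{
  return t->profile_before_prologue ();
}
```

The point of the hook is that a macro can only be decided when the compiler is built, whereas -mfentry has to be selectable per compilation.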

Regards,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

Thread overview: 27+ messages
2010-07-13 12:47 [patch i386]: Add for win32 targets pre-prologue profiling feature Kai Tietz
2010-07-13 16:28 ` Richard Henderson
2010-07-14 10:20   ` Kai Tietz
2010-07-14 11:49     ` Dave Korn
2010-07-14 12:11       ` Kai Tietz
2010-07-14 12:16     ` Andi Kleen
2010-07-14 12:38       ` Kai Tietz
2010-07-15 18:08         ` Kai Tietz
2010-07-16 17:06           ` Richard Henderson
2010-07-17  6:52             ` Kai Tietz
2010-07-20  2:27               ` Richard Henderson
2010-07-28  8:36                 ` Kai Tietz
2010-07-28 16:00                   ` Richard Henderson
2010-07-28 16:01                     ` Andi Kleen
2010-07-28 17:28                       ` Kai Tietz
2010-07-28 17:40                         ` Richard Henderson
2010-07-28 18:14                           ` Kai Tietz
2010-07-16 20:53           ` Gerald Pfeifer
2010-07-18 12:20             ` Kai Tietz
2010-07-18 20:52               ` Gerald Pfeifer
2010-07-18 20:54                 ` Kai Tietz
2010-07-28 18:06                 ` Kai Tietz
2010-07-16 23:57           ` Andi Kleen
2010-07-17  5:34             ` Kai Tietz
2010-07-17  9:45               ` Andi Kleen
2010-07-18 11:37                 ` Kai Tietz
2010-07-18 11:46                   ` Kai Tietz