public inbox for gcc-patches@gcc.gnu.org
* [patch i386]: Add for win32 targets pre-prologue profiling feature
@ 2010-07-13 12:47 Kai Tietz
  2010-07-13 16:28 ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Kai Tietz @ 2010-07-13 12:47 UTC
  To: GCC Patches; +Cc: Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 1850 bytes --]

Hello,

This patch adds a pre-prologue profiling call for i386/x86_64 win32
targets. It additionally ensures that, when the top-profiler call is
enabled, the frame pointer is omitted where possible.
One side note about "hotfix" and profiling: the top-profiler call is
emitted in ix86_asm_output_function_label. The reason is that for
ix86 the call has to be placed before the hot-patch code pattern,
while for x86_64 it can be placed after it. Otherwise the x86 pattern
for the patchable region would be corrupted, because that pattern
contains the frame-register setup. So I think the macro
PROFILE_BEFORE_PROLOGUE isn't usable here.
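
The placement rule described above can be sketched roughly as follows.
This is only an illustrative model, not code from the patch: the names
target_opts, call_site, profiler_call_site and emit_function_label are
hypothetical, and the hot-patch marker is printed as a placeholder
comment instead of the real byte sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for compiler state; none of these are GCC names.  */
struct target_opts
{
  bool is_64bit;        /* models TARGET_64BIT */
  bool profile;         /* models profile_flag */
  bool profile_at_top;  /* models PROFILE_CALL_AT_TOP */
  bool hotpatch;        /* models the hot-patch attribute being set */
};

enum call_site
{
  CALL_NONE,             /* profiling is disabled */
  CALL_BEFORE_HOTPATCH,  /* 32-bit: ahead of the patchable pattern */
  CALL_AFTER_HOTPATCH,   /* 64-bit: behind the patchable pattern */
  CALL_AFTER_PROLOGUE    /* default placement, handled elsewhere */
};

/* Decide where the profiler call goes for the given options.  */
enum call_site
profiler_call_site (const struct target_opts *o)
{
  assert (o != NULL);
  if (!o->profile)
    return CALL_NONE;
  if (!o->profile_at_top)
    return CALL_AFTER_PROLOGUE;
  return o->is_64bit ? CALL_AFTER_HOTPATCH : CALL_BEFORE_HOTPATCH;
}

/* Print the function label, then the call and the marker in the order
   chosen by profiler_call_site.  */
void
emit_function_label (FILE *out, const char *fname,
                     const struct target_opts *o)
{
  enum call_site site = profiler_call_site (o);
  /* The callback name is picked at run time, so it is printed with %s
     rather than pasted in as a string literal.  */
  const char *mcount = o->profile_at_top ? "_mcount_top" : "_mcount";

  fprintf (out, "%s:\n", fname);
  if (site == CALL_BEFORE_HOTPATCH)
    fprintf (out, "\tcall\t%s\n", mcount);
  if (o->hotpatch)
    fputs ("\t# hot-patch marker bytes go here\n", out);
  if (site == CALL_AFTER_HOTPATCH)
    fprintf (out, "\tcall\t%s\n", mcount);
}
```

Calling emit_function_label (stdout, "_foo", &opts) with a 32-bit,
top-profiling, hot-patched configuration prints the label, then the
call, then the marker placeholder; switching is_64bit to true moves the
call below the marker.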

2010-07-13  Kai Tietz

	* config/i386/cygming.h (PROFILE_CALL_AT_TOP): New macro.
	(MCOUNT_NAME): Define specific sub-target macro.
	* config/i386/cygming.opt: New option -fprofile-top.
	* config/i386/i386.c (ix86_function_regparm): Special
	handling for active profiling.
	(ix86_function_sseregparm): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_select_alt_pic_regnum): Likewise.
	(ix86_save_reg): Likewise.
	(ix86_expand_prologue): Likewise.
	(x86_function_profiler_intern): New internal function.
	(ix86_asm_output_function_label): Output pre-prologue
	profiler call.
	(x86_function_profiler): Emit profiling after prologue
	when top-profiling is not enabled.
	* config/i386/i386.h (PROFILE_CALL_AT_TOP): Define
	macro with a default of zero.
	* doc/invoke.texi (-fprofile-top): Add documentation.
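
The "Special handling for active profiling" entries above all rewrite
the same test: a check that used to depend on profiling alone now
depends on profiling with the call placed after the prologue. A
minimal sketch of that predicate, with hypothetical helper names that
do not appear in the patch:

```c
#include <stdbool.h>

/* Hypothetical helper: true when a profiler call will be emitted after
   the prologue, i.e. profiling is on and top-profiling is off.  */
static bool
profiles_after_prologue (bool profile, bool call_at_top)
{
  return profile && !call_at_top;
}

/* Models the patched check in ix86_frame_pointer_required: profiling
   forces a frame pointer only in the post-prologue case.  */
bool
frame_pointer_forced_by_profiling (bool profile, bool call_at_top)
{
  return profiles_after_prologue (profile, call_at_top);
}

/* Models the patched guard in ix86_function_regparm and
   ix86_function_sseregparm: the local calling-convention tuning is
   skipped only in the post-prologue case.  */
bool
local_regparm_allowed (bool optimize, bool profile, bool call_at_top)
{
  return optimize && !profiles_after_prologue (profile, call_at_top);
}
```

With profiling off, both helpers behave exactly as before the patch;
the only changed row of the truth table is profiling on together with
top-profiling on.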

Tested on i686-pc-linux-gnu, x86_64-pc-mingw32, and i686-pc-mingw32.
OK to apply?

Regards,
Kai

PS: If there is general interest in this feature for all i386
targets, the patch can easily be adjusted accordingly.
-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

[-- Attachment #2: profile_top.diff --]
[-- Type: application/octet-stream, Size: 9336 bytes --]

Index: gcc/gcc/config/i386/cygming.h
===================================================================
--- gcc.orig/gcc/config/i386/cygming.h	2010-07-09 11:24:42.000000000 +0200
+++ gcc/gcc/config/i386/cygming.h	2010-07-13 12:34:57.332581900 +0200
@@ -39,6 +39,12 @@ along with GCC; see the file COPYING3.  
 #undef DEFAULT_ABI
 #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI)
 
+#undef PROFILE_CALL_AT_TOP
+#define PROFILE_CALL_AT_TOP (flag_profile_top != 0)
+
+#undef MCOUNT_NAME
+#define MCOUNT_NAME (PROFILE_CALL_AT_TOP ? "_mcount_top" : "_mcount")
+
 #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
 #undef USER_LABEL_PREFIX
 #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
Index: gcc/gcc/config/i386/cygming.opt
===================================================================
--- gcc.orig/gcc/config/i386/cygming.opt	2009-12-11 14:56:47.000000000 +0100
+++ gcc/gcc/config/i386/cygming.opt	2010-07-13 12:26:24.374762200 +0200
@@ -53,3 +53,7 @@ Use the GNU extension to the PE format f
 muse-libstdc-wrappers
 Target Condition({defined (USE_CYGWIN_LIBSTDCXX_WRAPPERS)})
 Compile code that relies on Cygwin DLL wrappers to support C++ operator new/delete replacement
+
+fprofile-top
+Common Report Var(flag_profile_top) Init(0)
+When profiling, emit the profiler-callback call before the prologue.
Index: gcc/gcc/config/i386/i386.c
===================================================================
--- gcc.orig/gcc/config/i386/i386.c	2010-07-13 12:03:39.000000000 +0200
+++ gcc/gcc/config/i386/i386.c	2010-07-13 14:25:24.527182900 +0200
@@ -4841,7 +4841,7 @@ ix86_function_regparm (const_tree type, 
   if (decl
       && TREE_CODE (decl) == FUNCTION_DECL
       && optimize
-      && !profile_flag)
+      && !(profile_flag && !PROFILE_CALL_AT_TOP))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
@@ -4913,7 +4913,8 @@ ix86_function_sseregparm (const_tree typ
 
   /* For local functions, pass up to SSE_REGPARM_MAX SFmode
      (and DFmode for SSE2) arguments in SSE registers.  */
-  if (decl && TARGET_SSE_MATH && optimize && !profile_flag)
+  if (decl && TARGET_SSE_MATH && optimize
+      && !(profile_flag && !PROFILE_CALL_AT_TOP))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl));
@@ -5132,7 +5133,43 @@ ix86_cfun_abi (void)
   return cfun->machine->call_abi;
 }
 
-/* Write the extra assembler code needed to declare a function properly.  */
+/* Output assembler code to FILE to increment profiler label # LABELNO
+   for profiling a function entry.  */
+static void
+x86_function_profiler_intern (FILE *file, int labelno ATTRIBUTE_UNUSED)
+{
+  if (TARGET_64BIT)
+    {
+#ifndef NO_PROFILE_COUNTERS
+      fprintf (file, "\tleaq\t%sP%d(%%rip),%%r11\n", LPREFIX, labelno);
+#endif
+
+      if (DEFAULT_ABI == SYSV_ABI && flag_pic)
+	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", MCOUNT_NAME);
+      else
+	fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
+    }
+  else if (flag_pic)
+    {
+#ifndef NO_PROFILE_COUNTERS
+      fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
+	       LPREFIX, labelno);
+#endif
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", MCOUNT_NAME);
+    }
+  else
+    {
+#ifndef NO_PROFILE_COUNTERS
+      fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
+	       LPREFIX, labelno);
+#endif
+      fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
+    }
+}
+
+/* Write the extra assembler code needed to declare a function properly.
+   Output call to profiler if profiling is enabled and it should be emitted
+   before the prologue.  */
 
 void
 ix86_asm_output_function_label (FILE *asm_out_file, const char *fname,
@@ -5151,6 +5188,11 @@ ix86_asm_output_function_label (FILE *as
 
   ASM_OUTPUT_LABEL (asm_out_file, fname);
 
+  /* Output the profiling call before the hotfix region, because on x86
+     it would otherwise destroy the magic sequence.  */
+  if (!TARGET_64BIT && PROFILE_CALL_AT_TOP && profile_flag)
+    x86_function_profiler_intern (asm_out_file, 0);
+
   /* Output magic byte marker, if hot-patch attribute is set.
      For x86 case frame-pointer prologue will be emitted in
      expand_prologue.  */
@@ -5164,6 +5206,9 @@ ix86_asm_output_function_label (FILE *as
         /* movl.s %edi, %edi.  */
 	asm_fprintf (asm_out_file, ASM_BYTE "0x8b, 0xff\n");
     }
+  /* For x86_64, output the profiling call after the hotfix region.  */
+  if (TARGET_64BIT && PROFILE_CALL_AT_TOP && profile_flag)
+    x86_function_profiler_intern (asm_out_file, 0);
 }
 
 /* regclass.c  */
@@ -7875,7 +7920,7 @@ ix86_frame_pointer_required (void)
 	  || ix86_current_function_calls_tls_descriptor))
     return true;
 
-  if (crtl->profile)
+  if (crtl->profile && !PROFILE_CALL_AT_TOP)
     return true;
 
   return false;
@@ -8143,7 +8188,7 @@ gen_push (rtx arg)
 static unsigned int
 ix86_select_alt_pic_regnum (void)
 {
-  if (current_function_is_leaf && !crtl->profile
+  if (current_function_is_leaf && !(crtl->profile && !PROFILE_CALL_AT_TOP)
       && !ix86_current_function_calls_tls_descriptor)
     {
       int i, drap;
@@ -8167,7 +8212,7 @@ ix86_save_reg (unsigned int regno, int m
   if (pic_offset_table_rtx
       && regno == REAL_PIC_OFFSET_TABLE_REGNUM
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile
+	  || (crtl->profile && !PROFILE_CALL_AT_TOP)
 	  || crtl->calls_eh_return
 	  || crtl->uses_const_pool))
     {
@@ -9443,7 +9488,7 @@ ix86_expand_prologue (void)
   pic_reg_used = false;
   if (pic_offset_table_rtx
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile))
+	  || (crtl->profile && !PROFILE_CALL_AT_TOP)))
     {
       unsigned int alt_pic_reg_used = ix86_select_alt_pic_regnum ();
 
@@ -9480,7 +9525,7 @@ ix86_expand_prologue (void)
      when mcount needs it.  Blockage to avoid call movement across mcount
      call is emitted in generic code after the NOTE_INSN_PROLOGUE_END
      note.  */
-  if (crtl->profile && pic_reg_used)
+  if (crtl->profile && !PROFILE_CALL_AT_TOP && pic_reg_used)
     emit_insn (gen_prologue_use (pic_offset_table_rtx));
 
   if (crtl->drap_reg && !crtl->stack_realign_needed)
@@ -27283,37 +27328,13 @@ x86_field_alignment (tree field, int com
 }
 
 /* Output assembler code to FILE to increment profiler label # LABELNO
-   for profiling a function entry.  */
+   for profiling a function entry, if the profiling call should be emitted
+   after the prologue.  */
 void
-x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
+x86_function_profiler (FILE *file ATTRIBUTE_UNUSED, int labelno ATTRIBUTE_UNUSED)
 {
-  if (TARGET_64BIT)
-    {
-#ifndef NO_PROFILE_COUNTERS
-      fprintf (file, "\tleaq\t%sP%d(%%rip),%%r11\n", LPREFIX, labelno);
-#endif
-
-      if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-	fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file);
-      else
-	fputs ("\tcall\t" MCOUNT_NAME "\n", file);
-    }
-  else if (flag_pic)
-    {
-#ifndef NO_PROFILE_COUNTERS
-      fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
-	       LPREFIX, labelno);
-#endif
-      fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file);
-    }
-  else
-    {
-#ifndef NO_PROFILE_COUNTERS
-      fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
-	       LPREFIX, labelno);
-#endif
-      fputs ("\tcall\t" MCOUNT_NAME "\n", file);
-    }
+  if (!PROFILE_CALL_AT_TOP)
+    x86_function_profiler_intern (file, labelno);
 }
 
 #ifdef ASM_OUTPUT_MAX_SKIP_PAD
Index: gcc/gcc/config/i386/i386.h
===================================================================
--- gcc.orig/gcc/config/i386/i386.h	2010-07-09 11:24:42.000000000 +0200
+++ gcc/gcc/config/i386/i386.h	2010-07-13 12:33:52.273634200 +0200
@@ -1601,6 +1601,8 @@ typedef struct ix86_args {
 
 #define MCOUNT_NAME "_mcount"
 
+#define PROFILE_CALL_AT_TOP 0
+
 #define PROFILE_COUNT_REGISTER "edx"
 
 /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
Index: gcc/gcc/doc/invoke.texi
===================================================================
--- gcc.orig/gcc/doc/invoke.texi	2010-07-09 11:24:43.000000000 +0200
+++ gcc/gcc/doc/invoke.texi	2010-07-13 12:49:46.418596000 +0200
@@ -884,7 +884,7 @@ See i386 and x86-64 Options.
 @emph{i386 and x86-64 Windows Options}
 @gccoptlist{-mconsole -mcygwin -mno-cygwin -mdll
 -mnop-fun-dllimport -mthread -municode -mwin32 -mwindows
--fno-set-stack-executable}
+-fno-set-stack-executable -fprofile-top}
 
 @emph{Xstormy16 Options}
 @gccoptlist{-msim}
@@ -17108,6 +17108,15 @@ set. This is necessary for binaries runn
 Windows, as there the user32 API, which is used to set executable
 privileges, isn't available.
 
+@item -fprofile-top
+@opindex fprofile-top
+This option is available for Cygwin and MinGW targets.  It
+specifies that, when profiling, the call to the profiler is
+made before the prologue.  By default the profiler call is
+made after the prologue is established.  When the option is
+active the @code{_mcount_top} function is called, otherwise
+the @code{_mcount} function.
+
 @item -mpe-aligned-commons
 @opindex mpe-aligned-commons
 This option is available for Cygwin and MinGW targets.  It


end of thread, other threads:[~2010-07-28 18:06 UTC | newest]

Thread overview: 27+ messages
2010-07-13 12:47 [patch i386]: Add for win32 targets pre-prologue profiling feature Kai Tietz
2010-07-13 16:28 ` Richard Henderson
2010-07-14 10:20   ` Kai Tietz
2010-07-14 11:49     ` Dave Korn
2010-07-14 12:11       ` Kai Tietz
2010-07-14 12:16     ` Andi Kleen
2010-07-14 12:38       ` Kai Tietz
2010-07-15 18:08         ` Kai Tietz
2010-07-16 17:06           ` Richard Henderson
2010-07-17  6:52             ` Kai Tietz
2010-07-20  2:27               ` Richard Henderson
2010-07-28  8:36                 ` Kai Tietz
2010-07-28 16:00                   ` Richard Henderson
2010-07-28 16:01                     ` Andi Kleen
2010-07-28 17:28                       ` Kai Tietz
2010-07-28 17:40                         ` Richard Henderson
2010-07-28 18:14                           ` Kai Tietz
2010-07-16 20:53           ` Gerald Pfeifer
2010-07-18 12:20             ` Kai Tietz
2010-07-18 20:52               ` Gerald Pfeifer
2010-07-18 20:54                 ` Kai Tietz
2010-07-28 18:06                 ` Kai Tietz
2010-07-16 23:57           ` Andi Kleen
2010-07-17  5:34             ` Kai Tietz
2010-07-17  9:45               ` Andi Kleen
2010-07-18 11:37                 ` Kai Tietz
2010-07-18 11:46                   ` Kai Tietz
