public inbox for gcc-patches@gcc.gnu.org
From: Kai Tietz <ktietz70@googlemail.com>
To: Richard Henderson <rth@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [patch i386]: Add for win32 targets pre-prologue profiling feature
Date: Wed, 14 Jul 2010 10:20:00 -0000
Message-ID: <AANLkTincDxzbxpCAMMiBbv52X2G3tE5lgortV820BqVq@mail.gmail.com>
In-Reply-To: <4C3C942D.3010101@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3516 bytes --]

Hello,

2010/7/13 Richard Henderson <rth@redhat.com>:
> On 07/13/2010 05:47 AM, Kai Tietz wrote:
>> Hello,
>>
>> This patch adds a pre-prologue profiling call for i386/x86_64 win32
>> targets. It additionally ensures that, when the top-profiler call is
>> enabled, the frame pointer is omitted where possible.
>> One side note about "hotfix" and profiling: the top-profiler call
>> is emitted in ix86_asm_output_function_label. The reason is that
>> for ix86 the call needs to be placed before the code pattern, while
>> for x86_64 it can be placed after. Otherwise the x86 pattern for the
>> patchable region would be corrupted, as this pattern contains the
>> frame-register setup. So I think the macro PROFILE_BEFORE_PROLOGUE
>> isn't usable here.
>
> Huh?  This is exactly the opposite of what we discussed yesterday on IRC.
>
> For hotfix we have
>
>        .rept 16
>        .byte 0xcc
>        .endr
> function:
>        mov.s   edi, edi
>        push    ebp
>        mov.s   esp, ebp
>
> If *any* of the above is not exactly so, then the runtime pattern match
> fails and the hotfix fails.  If we were to write the profiler first,
> then we might as well not bother with the hotfix pieces, because they
> will never match.
>
> ... of course it doesn't help that we emit the last two insns above
> within the prologue, so if we simply place the profile before the
> prologue, we'll *still* be splitting the hotfix sequence for 32-bit.
>
> I think the best thing to do is to diagnose hotfix+profile and generate
> an error.  I don't think there's anything reasonable we can do.
>
> In the end I don't think there's anything your AT_TOP macro does that
> PROFILE_BEFORE_PROLOGUE doesn't do just as well.
>
>
> r~
>
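
To make the constraint described above concrete, here is a minimal
sketch of the kind of byte-level check a runtime hot-patcher might
perform. The helper name, the buffer layout, and the entry offset are
illustrative assumptions, not GCC or Windows code. The padding count
follows the ".rept 16" in the quoted sequence; 8b ff and 55 are the
encodings named in the i386.c comment in the attached patch, and 8b ec
for "mov.s esp, ebp" is assumed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: number of 0xcc padding bytes expected before the
   function label, taken from the ".rept 16" in the quoted sequence.  */
#define HOTFIX_PAD 16

/* Hypothetical helper (not GCC or Windows code).  BUF holds LEN bytes
   of code and ENTRY is the offset of the function label.  Return true
   only if the padding and the first three instructions match exactly.  */
bool
hotfix_pattern_matches (const unsigned char *buf, size_t len, size_t entry)
{
  /* mov.s edi,edi; push ebp; mov.s esp,ebp (8b ec is an assumption).  */
  static const unsigned char head[] = { 0x8b, 0xff, 0x55, 0x8b, 0xec };
  size_t i;

  /* Reject offsets that leave no room for the padding or the head.  */
  if (entry < HOTFIX_PAD || entry > len || len - entry < sizeof head)
    return false;

  /* Every padding byte before the label must be 0xcc.  */
  for (i = entry - HOTFIX_PAD; i < entry; i++)
    if (buf[i] != 0xcc)
      return false;

  /* The bytes at the label must be exactly the expected sequence.  */
  return memcmp (buf + entry, head, sizeof head) == 0;
}
```

If a profiler call were emitted at the label ahead of these
instructions, the first byte there would be a call opcode rather than
8b, so a check of this shape would return false. That is the failure
mode described in the quoted reply.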

This revised patch implements the feature via a new target hook,
TARGET_PROFILE_BEFORE_PROLOGUE. For now the feature is active only for
win32 i386 targets and is controlled by the internal target macro
PROFILE_SUPPORT_BEFORE_PROLOGUE.
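
As a rough illustration of the tri-state handling behind that macro,
the sketch below models the option value as a plain int and shows the
shape of the guard used at the profiling checks in i386.c. Both
function names are hypothetical; the real code is
ix86_profile_before_prologue in the attached patch, which also caches
the resolved value in a function-local static.

```c
#include <stdbool.h>

/* Hypothetical pure version of the tri-state resolution.  OPTION is -1
   when -mprofile-top was not given (the Init(-1) default in
   cygming.opt), 0 when it is explicitly disabled, and nonzero when it
   is enabled.  */
bool
resolve_profile_top (int option)
{
  if (option == -1)
    option = 0;	/* Unset falls back to profiling after the prologue.  */
  return option != 0;
}

/* Shape of the guard the patch applies in ix86_frame_pointer_required
   and similar places: profiling constrains the function only when the
   profiler call is emitted after the prologue.  */
bool
profile_forces_frame_pointer (bool profiling, bool before_prologue)
{
  return profiling && !before_prologue;
}
```

Because the patch caches the result on first use, the value seen by the
first call is the one used for the rest of the compilation.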

2010-07-14  Kai Tietz

	* config/i386/cygming.h (PROFILE_SUPPORT_BEFORE_PROLOGUE): New.
	(MCOUNT_NAME): Win32 specific version.
	* config/i386/cygming.opt (mprofile-top): New option.
	* config/i386/i386.c (ix86_profile_before_prologue):
	New hook.
	(ix86_function_regparm): Handle profiling before
	prologue case.
	(ix86_function_sseregparm): Likewise.
	(ix86_cfun_abi): Likewise.
	(ix86_frame_pointer_required): Likewise.
	(ix86_select_alt_pic_regnum): Likewise.
	(ix86_save_reg): Likewise.
	(ix86_expand_prologue): Likewise.
	Additionally emit sorry for 32-bit "hotfix" combined with
	profiling before prologue.
	(x86_function_profiler): Use fprintf instead of fputs for
	assembly output.
	(TARGET_PROFILE_BEFORE_PROLOGUE): Define target hook.
	* doc/invoke.texi (mprofile-top): Document option.
	* doc/tm.texi.in (TARGET_PROFILE_BEFORE_PROLOGUE):
	Add documentation.
	* doc/tm.texi: Regenerated.
	* final.c (final_start_function): Replace
	PROFILE_BEFORE_PROLOGUE guard by target hook.
	(profile_after_prologue): Likewise.
	* function.c (thread_prologue_and_epilogue_insns):
	Likewise.
	* target.def (profile_before_prologue): New hook.
	* targhooks.c (default_profile_before_prologue): New.
	* targhooks.h (default_profile_before_prologue): New.
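
The final.c, function.c, and targhooks.c entries above replace a
compile-time #ifdef guard with a runtime hook query. A minimal sketch
of that dispatch follows. The struct, enum, and function names are
stand-ins invented for illustration; in GCC the hook is reached as
targetm.profile_before_prologue ().

```c
#include <stdbool.h>

/* Stand-in for the part of the target vector this patch extends.  */
struct target_hooks
{
  bool (*profile_before_prologue) (void);
};

/* Models default_profile_before_prologue on a target that does not
   define PROFILE_BEFORE_PROLOGUE.  */
bool
default_hook (void)
{
  return false;
}

/* Models a target that wants the profiler call before the prologue.  */
bool
top_hook (void)
{
  return true;
}

enum profile_site { PROFILE_NONE, PROFILE_BEFORE, PROFILE_AFTER };

/* Models the two guarded call sites: final_start_function emits the
   call when the hook returns true, profile_after_prologue emits it
   when the hook returns false.  Exactly one of them fires.  */
enum profile_site
profile_call_site (const struct target_hooks *t, bool profiling)
{
  if (!profiling)
    return PROFILE_NONE;
  return t->profile_before_prologue () ? PROFILE_BEFORE : PROFILE_AFTER;
}
```

The point of the hook is that the decision can depend on a command-line
option at run time, which a preprocessor guard cannot express.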

Tested on i686-pc-linux-gnu, i686-pc-mingw32, and x86_64-pc-mingw32.
OK to apply?

Regards,
Kai


-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

[-- Attachment #2: profile.top.diff --]
[-- Type: application/octet-stream, Size: 13878 bytes --]

Index: gcc/gcc/config/i386/cygming.h
===================================================================
--- gcc.orig/gcc/config/i386/cygming.h	2010-07-13 17:00:51.000000000 +0200
+++ gcc/gcc/config/i386/cygming.h	2010-07-14 10:15:25.467867800 +0200
@@ -39,6 +39,14 @@ along with GCC; see the file COPYING3.  
 #undef DEFAULT_ABI
 #define DEFAULT_ABI (TARGET_64BIT ? MS_ABI : SYSV_ABI)
 
+#undef PROFILE_SUPPORT_BEFORE_PROLOGUE
+#define PROFILE_SUPPORT_BEFORE_PROLOGUE flag_profile_top
+
+/* Choose the correct profiler mcount name. For checking we are using the
+   ix86_profile_before_prologue function as flag_profile_top is tri-state.  */
+#undef MCOUNT_NAME
+#define MCOUNT_NAME (ix86_profile_before_prologue () ? "_mcount_top" : "_mcount")
+
 #if ! defined (USE_MINGW64_LEADING_UNDERSCORES)
 #undef USER_LABEL_PREFIX
 #define USER_LABEL_PREFIX (TARGET_64BIT ? "" : "_")
Index: gcc/gcc/config/i386/cygming.opt
===================================================================
--- gcc.orig/gcc/config/i386/cygming.opt	2010-07-13 17:00:51.000000000 +0200
+++ gcc/gcc/config/i386/cygming.opt	2010-07-14 09:49:33.869102000 +0200
@@ -53,3 +53,7 @@ Use the GNU extension to the PE format f
 muse-libstdc-wrappers
 Target Condition({defined (USE_CYGWIN_LIBSTDCXX_WRAPPERS)})
 Compile code that relies on Cygwin DLL wrappers to support C++ operator new/delete replacement
+
+mprofile-top
+Target Report Var(flag_profile_top) Init(-1)
+Emit profiling code before prologue.
Index: gcc/gcc/config/i386/i386.c
===================================================================
--- gcc.orig/gcc/config/i386/i386.c	2010-07-13 17:00:51.000000000 +0200
+++ gcc/gcc/config/i386/i386.c	2010-07-14 10:05:54.966610900 +0200
@@ -4770,6 +4770,30 @@ ix86_handle_cconv_attribute (tree *node,
   return NULL_TREE;
 }
 
+/* Return true if profiling code should be emitted before the
+   prologue, false otherwise.
+   Note: for 32-bit x86 combined with "hotfix" a sorry is issued.  */
+static bool
+ix86_profile_before_prologue (void)
+{
+#ifdef PROFILE_SUPPORT_BEFORE_PROLOGUE
+  static int flag_value = -1;
+  if (flag_value == -1)
+    {
+      flag_value = PROFILE_SUPPORT_BEFORE_PROLOGUE;
+      if (flag_value == -1)
+	{
+	  /* Set it to default 0.  We need this tri-state for later
+	     checking of compatibility and target preferences.  */
+	  flag_value = 0;
+	}
+    }
+  return flag_value != 0;
+#else
+  return false;
+#endif
+}
+
 /* Return 0 if the attributes for two types are incompatible, 1 if they
    are compatible, and 2 if they are nearly compatible (which causes a
    warning to be generated).  */
@@ -4841,7 +4865,7 @@ ix86_function_regparm (const_tree type, 
   if (decl
       && TREE_CODE (decl) == FUNCTION_DECL
       && optimize
-      && !profile_flag)
+      && !(profile_flag && !ix86_profile_before_prologue ()))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
@@ -4913,7 +4937,8 @@ ix86_function_sseregparm (const_tree typ
 
   /* For local functions, pass up to SSE_REGPARM_MAX SFmode
      (and DFmode for SSE2) arguments in SSE registers.  */
-  if (decl && TARGET_SSE_MATH && optimize && !profile_flag)
+  if (decl && TARGET_SSE_MATH && optimize
+      && !(profile_flag && !ix86_profile_before_prologue ()))
     {
       /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
       struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE(decl));
@@ -5132,7 +5157,9 @@ ix86_cfun_abi (void)
   return cfun->machine->call_abi;
 }
 
-/* Write the extra assembler code needed to declare a function properly.  */
+/* Write the extra assembler code needed to declare a function properly.
+   Output call to profiler if profiling is enabled and it should be emitted
+   before prologue.  */
 
 void
 ix86_asm_output_function_label (FILE *asm_out_file, const char *fname,
@@ -7875,7 +7902,7 @@ ix86_frame_pointer_required (void)
 	  || ix86_current_function_calls_tls_descriptor))
     return true;
 
-  if (crtl->profile)
+  if (crtl->profile && !ix86_profile_before_prologue ())
     return true;
 
   return false;
@@ -8143,7 +8170,8 @@ gen_push (rtx arg)
 static unsigned int
 ix86_select_alt_pic_regnum (void)
 {
-  if (current_function_is_leaf && !crtl->profile
+  if (current_function_is_leaf
+      && !(crtl->profile && !ix86_profile_before_prologue ())
       && !ix86_current_function_calls_tls_descriptor)
     {
       int i, drap;
@@ -8167,7 +8195,7 @@ ix86_save_reg (unsigned int regno, int m
   if (pic_offset_table_rtx
       && regno == REAL_PIC_OFFSET_TABLE_REGNUM
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile
+	  || (crtl->profile && !ix86_profile_before_prologue ())
 	  || crtl->calls_eh_return
 	  || crtl->uses_const_pool))
     {
@@ -9191,6 +9219,11 @@ ix86_expand_prologue (void)
     {
       rtx push, mov;
 
+      /* Check if profiling is active and the profiling-before-prologue
+         variant is in use.  If so, issue a sorry.  */
+      if (crtl->profile && ix86_profile_before_prologue () != 0)
+        sorry ("ms_hook_prologue attribute isn't compatible with -mprofile-top for 32-bit");
+
       /* Make sure the function starts with
 	 8b ff     movl.s %edi,%edi (emited by ix86_asm_output_function_label)
 	 55        push   %ebp
@@ -9443,7 +9476,7 @@ ix86_expand_prologue (void)
   pic_reg_used = false;
   if (pic_offset_table_rtx
       && (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM)
-	  || crtl->profile))
+	  || (crtl->profile && !ix86_profile_before_prologue ())))
     {
       unsigned int alt_pic_reg_used = ix86_select_alt_pic_regnum ();
 
@@ -9480,7 +9513,7 @@ ix86_expand_prologue (void)
      when mcount needs it.  Blockage to avoid call movement across mcount
      call is emitted in generic code after the NOTE_INSN_PROLOGUE_END
      note.  */
-  if (crtl->profile && pic_reg_used)
+  if (crtl->profile && !ix86_profile_before_prologue () && pic_reg_used)
     emit_insn (gen_prologue_use (pic_offset_table_rtx));
 
   if (crtl->drap_reg && !crtl->stack_realign_needed)
@@ -27285,7 +27318,7 @@ x86_field_alignment (tree field, int com
 /* Output assembler code to FILE to increment profiler label # LABELNO
    for profiling a function entry.  */
 void
-x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
+x86_function_profiler (FILE *file ATTRIBUTE_UNUSED, int labelno ATTRIBUTE_UNUSED)
 {
   if (TARGET_64BIT)
     {
@@ -27294,9 +27327,9 @@ x86_function_profiler (FILE *file, int l
 #endif
 
       if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-	fputs ("\tcall\t*" MCOUNT_NAME "@GOTPCREL(%rip)\n", file);
+	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", MCOUNT_NAME);
       else
-	fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+	fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
     }
   else if (flag_pic)
     {
@@ -27304,7 +27337,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t*" MCOUNT_NAME "@GOT(%ebx)\n", file);
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", MCOUNT_NAME);
     }
   else
     {
@@ -27312,7 +27345,7 @@ x86_function_profiler (FILE *file, int l
       fprintf (file, "\tmovl\t$%sP%d,%%" PROFILE_COUNT_REGISTER "\n",
 	       LPREFIX, labelno);
 #endif
-      fputs ("\tcall\t" MCOUNT_NAME "\n", file);
+      fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
     }
 }
 
@@ -31360,6 +31393,9 @@ ix86_enum_va_list (int idx, const char *
 #define TARGET_ASM_ALIGNED_DI_OP ASM_QUAD
 #endif
 
+#undef TARGET_PROFILE_BEFORE_PROLOGUE
+#define TARGET_PROFILE_BEFORE_PROLOGUE ix86_profile_before_prologue
+
 #undef TARGET_ASM_UNALIGNED_HI_OP
 #define TARGET_ASM_UNALIGNED_HI_OP TARGET_ASM_ALIGNED_HI_OP
 #undef TARGET_ASM_UNALIGNED_SI_OP
Index: gcc/gcc/doc/invoke.texi
===================================================================
--- gcc.orig/gcc/doc/invoke.texi	2010-07-13 17:00:51.000000000 +0200
+++ gcc/gcc/doc/invoke.texi	2010-07-14 10:10:11.655367800 +0200
@@ -884,7 +884,7 @@ See i386 and x86-64 Options.
 @emph{i386 and x86-64 Windows Options}
 @gccoptlist{-mconsole -mcygwin -mno-cygwin -mdll
 -mnop-fun-dllimport -mthread -municode -mwin32 -mwindows
--fno-set-stack-executable}
+-mprofile-top -fno-set-stack-executable}
 
 @emph{Xstormy16 Options}
 @gccoptlist{-msim}
@@ -17100,6 +17100,13 @@ specifies that a GUI application is to b
 instructing the linker to set the PE header subsystem type
 appropriately.
 
+@item -mprofile-top
+@opindex mprofile-top
+This option is available for Cygwin and MinGW targets.  It
+specifies that, when profiling, the call to the profiler is
+emitted before the prologue.  By default the profiler call
+is emitted after the prologue is established.
+
 @item -fno-set-stack-executable
 @opindex fno-set-stack-executable
 This option is available for MinGW targets. It specifies that
Index: gcc/gcc/doc/tm.texi.in
===================================================================
--- gcc.orig/gcc/doc/tm.texi.in	2010-07-13 12:03:30.000000000 +0200
+++ gcc/gcc/doc/tm.texi.in	2010-07-14 09:47:06.485920000 +0200
@@ -7101,6 +7101,14 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@hook TARGET_PROFILE_BEFORE_PROLOGUE
+It returns true if the target wants profile code emitted before
+the prologue.
+
+The default version of this hook uses the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @hook TARGET_BINDS_LOCAL_P
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
Index: gcc/gcc/final.c
===================================================================
--- gcc.orig/gcc/final.c	2010-07-09 11:24:47.000000000 +0200
+++ gcc/gcc/final.c	2010-07-14 09:47:06.376496000 +0200
@@ -1546,10 +1546,8 @@ final_start_function (rtx first ATTRIBUT
 
   /* The Sun386i and perhaps other machines don't work right
      if the profiling code comes after the prologue.  */
-#ifdef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* PROFILE_BEFORE_PROLOGUE */
 
 #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
   if (dwarf2out_do_frame ())
@@ -1591,10 +1589,8 @@ final_start_function (rtx first ATTRIBUT
 static void
 profile_after_prologue (FILE *file ATTRIBUTE_UNUSED)
 {
-#ifndef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (!targetm.profile_before_prologue () && crtl->profile)
     profile_function (file);
-#endif /* not PROFILE_BEFORE_PROLOGUE */
 }
 
 static void
Index: gcc/gcc/function.c
===================================================================
--- gcc.orig/gcc/function.c	2010-07-06 13:15:38.000000000 +0200
+++ gcc/gcc/function.c	2010-07-14 09:47:06.392128000 +0200
@@ -5100,13 +5100,11 @@ thread_prologue_and_epilogue_insns (void
       record_insns (seq, NULL, &prologue_insn_hash);
       emit_note (NOTE_INSN_PROLOGUE_END);
 
-#ifndef PROFILE_BEFORE_PROLOGUE
       /* Ensure that instructions are not moved into the prologue when
 	 profiling is on.  The call to the profiling routine can be
 	 emitted within the live range of a call-clobbered register.  */
-      if (crtl->profile)
+      if (!targetm.profile_before_prologue () && crtl->profile)
         emit_insn (gen_blockage ());
-#endif
 
       seq = get_insns ();
       end_sequence ();
Index: gcc/gcc/target.def
===================================================================
--- gcc.orig/gcc/target.def	2010-07-09 11:24:47.000000000 +0200
+++ gcc/gcc/target.def	2010-07-14 09:47:06.392128000 +0200
@@ -1218,6 +1218,13 @@ DEFHOOK
  bool, (const_tree exp),
  default_binds_local_p)
 
+/* Check if profiling code is before or after prologue.  */
+DEFHOOK
+(profile_before_prologue,
+ "",
+ bool, (void),
+ default_profile_before_prologue)
+
 /* Modify and return the identifier of a DECL's external name,
    originally identified by ID, as required by the target,
    (eg, append @nn to windows32 stdcall function names).
Index: gcc/gcc/targhooks.c
===================================================================
--- gcc.orig/gcc/targhooks.c	2010-07-09 11:24:47.000000000 +0200
+++ gcc/gcc/targhooks.c	2010-07-14 09:47:06.392128000 +0200
@@ -1197,4 +1197,14 @@ default_register_move_cost (enum machine
 #endif
 }
 
+bool
+default_profile_before_prologue (void)
+{
+#ifndef PROFILE_BEFORE_PROLOGUE
+  return false;
+#else
+  return true;
+#endif
+}
+
 #include "gt-targhooks.h"
Index: gcc/gcc/targhooks.h
===================================================================
--- gcc.orig/gcc/targhooks.h	2010-07-09 11:24:47.000000000 +0200
+++ gcc/gcc/targhooks.h	2010-07-14 09:47:06.392128000 +0200
@@ -150,3 +150,4 @@ extern int default_memory_move_cost (enu
 extern int default_register_move_cost (enum machine_mode, reg_class_t,
 				       reg_class_t);
 
+extern bool default_profile_before_prologue (void);
Index: gcc/gcc/doc/tm.texi
===================================================================
--- gcc.orig/gcc/doc/tm.texi	2010-07-13 12:03:30.000000000 +0200
+++ gcc/gcc/doc/tm.texi	2010-07-14 10:34:44.219257400 +0200
@@ -7101,6 +7101,14 @@ Contains the value true if the target pl
 ``small data'' into a separate section.  The default value is false.
 @end deftypevr
 
+@deftypefn {Target Hook} bool TARGET_PROFILE_BEFORE_PROLOGUE (void)
+It returns true if the target wants profile code emitted before
+the prologue.
+
+The default version of this hook uses the target macro
+@code{PROFILE_BEFORE_PROLOGUE}.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_BINDS_LOCAL_P (const_tree @var{exp})
 Returns true if @var{exp} names an object for which name resolution
 rules must resolve to the current ``module'' (dynamic shared library
