public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* MSVC hook function prologue
@ 2009-09-02 20:46 Stefan Dösinger
  2009-09-02 22:05 ` Paolo Bonzini
  0 siblings, 1 reply; 20+ messages in thread
From: Stefan Dösinger @ 2009-09-02 20:46 UTC (permalink / raw)
  To: gcc

[-- Attachment #1: Type: text/plain, Size: 2229 bytes --]

Hello,
After a rather long break due to other work I tried to revive my work on 
support for the function prologue used in Win32 API functions on Windows - a 
function prologue that some apps running in Wine expect.

This thread from January explains what I am trying to do:
http://gcc.gnu.org/ml/gcc/2009-01/msg00089.html

Essentially I want a function attrib that starts the function with this 
sequence, no matter what other parameters, code in the function, attributes 
or whatever are used:
8b ff          mov    %edi,%edi
55             push   %ebp
8b ec          mov    %esp,%ebp

I have attached the latest version of my patch for comments. It is mainly 
rebased against gcc changes that were made in the meantime. I also improved 
the REG_FRAME_RELATED_EXPR notes a bit, and only set it if the movs and pops 
are used for the frame pointer setup.

I also now know that I don't(or cannot) care about 64 bit right now. The 
windows apps currently do Windows API function hooking only in 32 bit, and 
there is no emerging scheme yet for hooking Win64 functions in the same way.

Currently I still have these problems:
*) There is apparently some plugin framework in the works. Can this 
functionality implemented as a plugin?

*) The way I read the msvc_prologue attribute seems wrong to me. I could read 
it directly in ix86_expand_prologue, but I got lost in the different trees 
gcc uses. I'm yet again trying to find this in the code and in the docs.

*) The code generated if no frame pointer is needed isn't pretty, but Wine 
will always need a frame pointer, so any optimization in that area won't get 
much test exposure.

*) The stack alignment code + msvc_prologue is used by Wine on osx though. 
Currently I pop %ebp after the 5 byte prologue, and the normal code recreates 
the frame pointer afterwards. My understanding is that I can avoid this by 
keeping the original frame pointer, but adjusting a lot of offsets after the 
alignment to find the function parameters and align the stack properly on 
calls. However, this is currently above my head.

*) What other changes are needed to get a functionality like this into 
mainline?

Thank you,
Stefan Dösinger

[-- Attachment #2: msvc-prologue.diff --]
[-- Type: text/x-diff, Size: 6963 bytes --]

Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac	(revision 151348)
+++ gcc/configure.ac	(working copy)
@@ -3035,6 +3035,12 @@
       [AC_DEFINE(HAVE_AS_IX86_SAHF, 1,
         [Define if your assembler supports the sahf mnemonic.])])
 
+    gcc_GAS_CHECK_FEATURE([swap suffix],
+      gcc_cv_as_ix86_swap,,,
+      [movl.s %esp, %ebp],,
+      [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
+        [Define if your assembler supports the swap suffix.])])
+
     gcc_GAS_CHECK_FEATURE([different section symbol subtraction],
       gcc_cv_as_ix86_diff_sect_delta,,,
       [.section .rodata
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	(revision 151348)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2388,6 +2388,9 @@
      to be used. MS_ABI means ms abi. Otherwise SYSV_ABI means sysv abi.  */
   enum calling_abi call_abi;
   struct machine_cfa_state cfa;
+  /* This value is used for i386 targets and specifies if the function
+   * should start with the hooking-friendly Win32 function prologue       */
+  int msvc_prologue;
 };
 #endif
 
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 151348)
+++ gcc/config/i386/i386.md	(working copy)
@@ -237,6 +237,7 @@
    (UNSPECV_RDTSC		18)
    (UNSPECV_RDTSCP		19)
    (UNSPECV_RDPMC		20)
+   (UNSPECV_VSWAPMOV	21)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -15747,6 +15748,16 @@
    (set_attr "length_immediate" "0")
    (set_attr "modrm" "0")])
 
+(define_insn "vswapmov"
+[(unspec_volatile [(match_operand 0 "register_operand" "0")
+                   (match_operand 1 "register_operand" "1")]
+ UNSPECV_VSWAPMOV )]
+  ""
+  "movl.s\t%1,%0"
+  [(set_attr "length" "2")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 ;; Pad to 16-byte boundary, max skip in op0.  Used to avoid
 ;; branch prediction penalty for the third jump in a 16-byte
 ;; block on K8.
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 151348)
+++ gcc/config/i386/i386.c	(working copy)
@@ -4777,6 +4777,19 @@
   return ix86_abi;
 }
 
+static int
+ix86_function_msvc_prologue (const_tree fntype)
+{
+  if (!TARGET_64BIT && fntype != NULL)
+    {
+      if(lookup_attribute ("msvc_prologue", TYPE_ATTRIBUTES (fntype)))
+        {
+          return 1;
+        }
+    }
+  return 0;
+}
+
 static enum calling_abi
 ix86_function_abi (const_tree fndecl)
 {
@@ -4808,6 +4821,11 @@
     cfun->machine->call_abi = ix86_abi;
   else
     cfun->machine->call_abi = ix86_function_type_abi (TREE_TYPE (fndecl));
+
+  if (fndecl == NULL_TREE)
+    cfun->machine->msvc_prologue = 0;
+  else
+    cfun->machine->msvc_prologue = ix86_function_msvc_prologue (TREE_TYPE (fndecl));
 }
 
 /* MS and SYSV ABI have different set of call used registers.  Avoid expensive
@@ -8316,6 +8334,7 @@
   bool pic_reg_used;
   struct ix86_frame frame;
   HOST_WIDE_INT allocate;
+  int gen_frame_pointer = frame_pointer_needed;
 
   ix86_finalize_stack_realign_flags ();
 
@@ -8328,6 +8347,45 @@
 
   ix86_compute_frame_layout (&frame);
 
+  if(cfun->machine->msvc_prologue)
+  {
+    rtx push, mov;
+    /* Make sure the function starts with
+       8b ff     movl.s %edi,%edi
+       55        push   %ebp
+       8b ec     movl.s %esp,%ebp
+
+       This matches the hookable function prologue generated by msvc. Wine uses this
+       to enable Windows games to hook the Win32 API functions provided by Wine */
+    insn = emit_insn(gen_vswapmov(gen_rtx_REG (Pmode, DI_REG), gen_rtx_REG (Pmode, DI_REG)));
+    push = emit_insn (gen_push (hard_frame_pointer_rtx));
+    mov = emit_insn(gen_vswapmov(hard_frame_pointer_rtx, stack_pointer_rtx));
+
+    if(frame_pointer_needed && !(crtl->drap_reg && crtl->stack_realign_needed))
+    {
+      /* The push %ebp and movl.s %esp, %ebp already set up the frame pointer. No need to do
+         this again. */
+      gen_frame_pointer = 0;
+      RTX_FRAME_RELATED_P (push) = 1;
+      RTX_FRAME_RELATED_P (mov) = 1;
+      add_reg_note (mov, REG_FRAME_RELATED_EXPR,
+                    gen_rtx_SET (VOIDmode, hard_frame_pointer_rtx, stack_pointer_rtx));
+      if (ix86_cfa_state->reg == stack_pointer_rtx)
+        ix86_cfa_state->reg = hard_frame_pointer_rtx;
+    }
+    else
+    {
+      /* If the frame pointer is not needed, pop %ebp again. This could be optimized for cases where
+         ebp needs to be backed up for some other reason.
+
+         If stack realignment is needed, pop the base pointer again, align the stack, and later
+         regenerate the frame pointer setup. The frame pointer generated by the msvc prologue
+         is not aligned, so it can't be used */
+      insn = emit_insn ((*ix86_gen_pop1) (hard_frame_pointer_rtx));
+    }
+  }
+
+  
   /* Emit prologue code to adjust stack alignment and setup DRAP, in case
      of DRAP is needed and stack realignment is really needed after reload */
   if (crtl->drap_reg && crtl->stack_realign_needed)
@@ -8377,14 +8435,12 @@
   /* Note: AT&T enter does NOT have reversed args.  Enter is probably
      slower on all targets.  Also sdb doesn't like it.  */
 
-  if (frame_pointer_needed)
+  if (gen_frame_pointer)
     {
       insn = emit_insn (gen_push (hard_frame_pointer_rtx));
       RTX_FRAME_RELATED_P (insn) = 1;
-
       insn = emit_move_insn (hard_frame_pointer_rtx, stack_pointer_rtx);
       RTX_FRAME_RELATED_P (insn) = 1;
-
       if (ix86_cfa_state->reg == stack_pointer_rtx)
         ix86_cfa_state->reg = hard_frame_pointer_rtx;
     }
@@ -26069,11 +26125,24 @@
     }
   if (!TARGET_64BIT)
     {
-      warning (OPT_Wattributes, "%qE attribute only available for 64-bit",
-	       name);
-      *no_add_attrs = true;
-      return NULL_TREE;
+      if(!is_attribute_p ("msvc_prologue", name))
+        {
+          warning (OPT_Wattributes, "%qE attribute only available for 64-bit",
+	           name);
+          *no_add_attrs = true;
+          return NULL_TREE;
+        }
     }
+  else
+    {
+      if(is_attribute_p ("msvc_prologue", name))
+        {
+          warning (OPT_Wattributes, "%qE attribute only available for 32-bit",
+                   name);
+          *no_add_attrs = true;
+          return NULL_TREE;
+        }
+    }
 
   /* Can combine regparm with all attributes but fastcall.  */
   if (is_attribute_p ("ms_abi", name))
@@ -28983,6 +29052,9 @@
   /* ms_abi and sysv_abi calling convention function attributes.  */
   { "ms_abi", 0, 0, false, true, true, ix86_handle_abi_attribute },
   { "sysv_abi", 0, 0, false, true, true, ix86_handle_abi_attribute },
+#ifdef HAVE_AS_IX86_SWAP
+  { "msvc_prologue", 0, 0, false, true, true, ix86_handle_abi_attribute },
+#endif
   /* End element.  */
   { NULL,        0, 0, false, false, false, NULL }
 };

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-02 20:46 MSVC hook function prologue Stefan Dösinger
@ 2009-09-02 22:05 ` Paolo Bonzini
  2009-09-02 22:27   ` Kai Tietz
                     ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Paolo Bonzini @ 2009-09-02 22:05 UTC (permalink / raw)
  To: Stefan Dösinger; +Cc: gcc

> Currently I still have these problems:
> *) There is apparently some plugin framework in the works. Can this
> functionality implemented as a plugin?

No, plugins do not affect the backend.

> *) The stack alignment code + msvc_prologue is used by Wine on osx though.
> Currently I pop %ebp after the 5 byte prologue, and the normal code recreates
> the frame pointer afterwards. My understanding is that I can avoid this by
> keeping the original frame pointer, but adjusting a lot of offsets after the
> alignment to find the function parameters and align the stack properly on
> calls. However, this is currently above my head.

I don't think this would prevent the patch from getting the patch in.

> *) What other changes are needed to get a functionality like this into
> mainline?

I think right now I'd make only two cosmetic adjustments:

Here:

    if (!TARGET_64BIT)
      {
-      warning (OPT_Wattributes, "%qE attribute only available for 64-bit",
-	       name);
-      *no_add_attrs = true;
-      return NULL_TREE;
+      if(!is_attribute_p ("msvc_prologue", name))

I'd say

    if (TARGET_64BIT
        ? !is_attribute_p ("msvc_prologue", name))
        : is_attribute_p ("msvc_prologue", name))
      {
        warning (OPT_Wattributes, "%qE attribute not available for "
                 "%d-bit", name, TARGET_64BIT ? 64 : 32);
        *no_add_attrs = true;
        return NULL_TREE;
     }

And here:

+(define_insn "vswapmov"
+[(unspec_volatile [(match_operand 0 "register_operand" "0")
+                   (match_operand 1 "register_operand" "1")]
+ UNSPECV_VSWAPMOV )]

I'd try defining the insn as

(define_insn "vswapmov"
[(set (match_operand 0 "register_operand" "0")
       (match_operand 1 "register_operand" "1")
  (unspec_volatile [] UNSPECV_VSWAPMOV)]

I think this should work and would be nicer to the compiler's dataflow 
analysis, but it should be tested of course.

I'm not an x86 maintainer, though.

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-02 22:05 ` Paolo Bonzini
@ 2009-09-02 22:27   ` Kai Tietz
  2009-09-02 23:39     ` Paolo Bonzini
  2009-09-03  7:50   ` Stefan Dösinger
  2009-09-04 11:45   ` Stefan Dösinger
  2 siblings, 1 reply; 20+ messages in thread
From: Kai Tietz @ 2009-09-02 22:27 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Stefan Dösinger, gcc

2009/9/3 Paolo Bonzini <bonzini@gnu.org>:
>   if (TARGET_64BIT
>       ? !is_attribute_p ("msvc_prologue", name))
>       : is_attribute_p ("msvc_prologue", name))
>     {
>       warning (OPT_Wattributes, "%qE attribute not available for "
>                "%d-bit", name, TARGET_64BIT ? 64 : 32);
>       *no_add_attrs = true;
>       return NULL_TREE;
>    }

Shouldn't it be this instead?

  if (is_attribute_p ("msvc_prologue", name))
    {
      warning (OPT_Wattributes, "%qE attribute not available for "
               "%d-bit", name, TARGET_64BIT ? 64 : 32);
      *no_add_attrs = true;
      return NULL_TREE;
   }

as otherwise for 64-bit target warning would be shown always?

Cheers,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-02 22:27   ` Kai Tietz
@ 2009-09-02 23:39     ` Paolo Bonzini
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Bonzini @ 2009-09-02 23:39 UTC (permalink / raw)
  To: Kai Tietz; +Cc: Stefan Dösinger, gcc

On 09/03/2009 12:27 AM, Kai Tietz wrote:
> 2009/9/3 Paolo Bonzini<bonzini@gnu.org>:
>>    if (TARGET_64BIT
>>        ? !is_attribute_p ("msvc_prologue", name))
>>        : is_attribute_p ("msvc_prologue", name))
>>      {
>>        warning (OPT_Wattributes, "%qE attribute not available for "
>>                 "%d-bit", name, TARGET_64BIT ? 64 : 32);
>>        *no_add_attrs = true;
>>        return NULL_TREE;
>>     }
>
> Shouldn't it be this instead?
>
>    if (is_attribute_p ("msvc_prologue", name))

You mean TARGET_64BIT && is_attribute_p (...)?

>      {
>        warning (OPT_Wattributes, "%qE attribute not available for "
>                 "%d-bit", name, TARGET_64BIT ? 64 : 32);
>        *no_add_attrs = true;
>        return NULL_TREE;
>     }
>
> as otherwise for 64-bit target warning would be shown always?

I don't know, I was just reworking Stefan's patch.  He didn't include 
function names (-p) in the patch so I don't know what function this is 
part of.

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-02 22:05 ` Paolo Bonzini
  2009-09-02 22:27   ` Kai Tietz
@ 2009-09-03  7:50   ` Stefan Dösinger
  2009-09-03  7:52     ` Paolo Bonzini
  2009-09-04 11:45   ` Stefan Dösinger
  2 siblings, 1 reply; 20+ messages in thread
From: Stefan Dösinger @ 2009-09-03  7:50 UTC (permalink / raw)
  To: gcc; +Cc: Paolo Bonzini

Am Thursday 03 September 2009 00:04:43 schrieb Paolo Bonzini:

>> *) The stack alignment code + msvc_prologue is used by Wine on osx though.
>> ...
> I don't think this would prevent the patch from getting the patch in.
Ok, I'll read the patch contribution guidelines again and hope for the best. 
Ideally I'd like gcc to generate efficient code for this, since we use this 
on OSX, but I can live with the current situation for now - very few of the 
functions we have to make hookable are performance critical.

> > *) What other changes are needed to get a functionality like this into
> > mainline?
>
> I think right now I'd make only two cosmetic adjustments:
I'll look into them - the 2nd one seems to make sense to me, for the first one 
I have to look.

> > as otherwise for 64-bit target warning would be shown always?

> I don't know, I was just reworking Stefan's patch.  He didn't include 
> function names (-p) in the patch so I don't know what function this is 
> part of.
It was ix86_handle_abi_attribute. I'm usually using git, and don't like cvs 
and svn too much. It seems svn diff doesn't support a -p option here. Maybe 
I'll just switch to git-svn.

Thanks for your help!

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-03  7:50   ` Stefan Dösinger
@ 2009-09-03  7:52     ` Paolo Bonzini
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Bonzini @ 2009-09-03  7:52 UTC (permalink / raw)
  To: Stefan Dösinger; +Cc: gcc

>> I don't know, I was just reworking Stefan's patch.  He didn't include
>> function names (-p) in the patch so I don't know what function this is
>> part of.
> It was ix86_handle_abi_attribute. I'm usually using git, and don't like cvs
> and svn too much. It seems svn diff doesn't support a -p option here. Maybe
> I'll just switch to git-svn.

You can use "svn diff -x -p" or put a script like this in ~/bin/svn-diff

#! /bin/sh

p=p
format=u

for i
do
   case "$i" in -*[uUcCy]*) format= ;; esac
   case "$i" in -*W*) format=y ;; esac
   case "$i" in -*p*) p= ;; esac
   case "$i" in --|[!-]*) break ;; esac
done

test -n "$format$p" && set -- -$format$p "$@"
exec diff "$@"


... and place this in ~/.subversion/config:

[helpers]
diff-cmd = /path/to/bin/svn-diff

You're welcome!

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-02 22:05 ` Paolo Bonzini
  2009-09-02 22:27   ` Kai Tietz
  2009-09-03  7:50   ` Stefan Dösinger
@ 2009-09-04 11:45   ` Stefan Dösinger
  2009-09-04 11:47     ` Paolo Bonzini
  2 siblings, 1 reply; 20+ messages in thread
From: Stefan Dösinger @ 2009-09-04 11:45 UTC (permalink / raw)
  To: gcc; +Cc: Paolo Bonzini

Am Thursday 03 September 2009 00:04:43 schrieb Paolo Bonzini:
> (define_insn "vswapmov"
> [(set (match_operand 0 "register_operand" "0")
>        (match_operand 1 "register_operand" "1")
>   (unspec_volatile [] UNSPECV_VSWAPMOV)]
I ran into a problem with this: build/genattrtab doesn't like the empty 
operand list for the unspec_volatile. So after looking at some other insns I 
added a const_int 0. There was also a parenthesis missing, which I added 
after the "register_operand" "1"), to close the "set".

(define_insn "vswapmov"
  [(set (match_operand 0 "register_operand" "0")
        (match_operand 1 "register_operand" "1"))
   (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
  ""
  "movl.s\t%1,%0"
  [(set_attr "length" "2")
   (set_attr "length_immediate" "0")
   (set_attr "modrm" "0")])

This however leads to the following errors and warnings:
build/genrecog ../.././gcc/config/i386/i386.md \
          insn-conditions.md > tmp-recog.c
../.././gcc/config/i386/i386.md:15751: operand 0 missing output reload
../.././gcc/config/i386/i386.md:15751: warning: operand 0 missing mode?
../.././gcc/config/i386/i386.md:15751: warning: operand 1 missing mode?

I guess the error isn't about the const_int 0, but about operand 0. Any ideas?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-04 11:45   ` Stefan Dösinger
@ 2009-09-04 11:47     ` Paolo Bonzini
  2009-09-04 12:18       ` Stefan Dösinger
  0 siblings, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2009-09-04 11:47 UTC (permalink / raw)
  To: Stefan Dösinger; +Cc: gcc


> I guess the error isn't about the const_int 0, but about operand 0. Any ideas?

Yes, you need this:

     [(set (match_operand:SI 0 "register_operand" "=r")
           (match_operand:SI 1 "register_operand" "r"))
      (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-04 11:47     ` Paolo Bonzini
@ 2009-09-04 12:18       ` Stefan Dösinger
  2009-09-04 12:23         ` Paolo Bonzini
  0 siblings, 1 reply; 20+ messages in thread
From: Stefan Dösinger @ 2009-09-04 12:18 UTC (permalink / raw)
  To: gcc; +Cc: Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 1811 bytes --]

Am Friday 04 September 2009 13:47:20 schrieb Paolo Bonzini:
> > I guess the error isn't about the const_int 0, but about operand 0. Any
> > ideas?
>
> Yes, you need this:
>
>      [(set (match_operand:SI 0 "register_operand" "=r")
>            (match_operand:SI 1 "register_operand" "r"))
>       (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
That works, thanks!

I just found the "=r" and "r" stuff myself almost at the same time your mail 
arrived. But what does the "SI" do? I haven't found 

Now I went a step further, and implemented the suggestion from amylaar in this 
mail:
http://gcc.gnu.org/ml/gcc/2009-01/msg00174.html

> If you make it a parallel where the actual oprtation is paired with an
> empty unspec, no REG_FRAME_RELATED_EXPR is needed.  If the actual operation
> is hidden in the RTL, however, you have to add it in a 
REG_FRAME_RELATED_EXPR.
> The latter alternative is more complicated.  However, there is a benefit to
> choosing this: win the stack realign or !frame_pointer_needed cases, the
> (early) move of esp to ebp is not really supposed to establish a frame
> pointer, and thus you then don't want any cfi information emitted for it.
> Thus, you can then simply leave out the REG_FRAME_RELATED_EXPR note.

Now the definition looks like this:
(define_insn "vswapmov"
  [(parallel
    [(set (match_operand:SI 0 "register_operand" "=r")
          (match_operand:SI 1 "register_operand" "r"))
     (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
   )]
  ""
  "movl.s\t%1,%0"
  [(set_attr "length" "2")
   (set_attr "length_immediate" "0")
   (set_attr "modrm" "0")])

I am still compiling, so I don't know if it works yet.

I attached the current state of the whole patch. I added the attribute to the 
documentation, and generated the patch with function names this time.

[-- Attachment #2: msvc_prologue.diff --]
[-- Type: text/x-diff, Size: 7709 bytes --]

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 151419)
+++ gcc/doc/extend.texi	(working copy)
@@ -2672,6 +2672,14 @@ when targeting Windows.  On all other systems, the
 
 Note, This feature is currently sorried out for Windows targets trying to
 
+@item msvc_prologue
+@cindex @code{msvc_prologue} attribute
+
+On 32 bit x86-*-* targets, you can use this function attribute to make
+gcc generate the "hot-patching" function prologue used in Win32 API
+functions in Microsoft Windows XP Service Pack 2 and newer. This requires
+support for the swap suffix in the assembler.
+
 @item naked
 @cindex function without a prologue/epilogue code
 Use this attribute on the ARM, AVR, IP2K and SPU ports to indicate that
Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac	(revision 151419)
+++ gcc/configure.ac	(working copy)
@@ -3035,6 +3035,12 @@ foo:	nop
       [AC_DEFINE(HAVE_AS_IX86_SAHF, 1,
         [Define if your assembler supports the sahf mnemonic.])])
 
+    gcc_GAS_CHECK_FEATURE([swap suffix],
+      gcc_cv_as_ix86_swap,,,
+      [movl.s %esp, %ebp],,
+      [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
+        [Define if your assembler supports the swap suffix.])])
+
     gcc_GAS_CHECK_FEATURE([different section symbol subtraction],
       gcc_cv_as_ix86_diff_sect_delta,,,
       [.section .rodata
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	(revision 151419)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2388,6 +2388,9 @@ struct GTY(()) machine_function {
      to be used. MS_ABI means ms abi. Otherwise SYSV_ABI means sysv abi.  */
   enum calling_abi call_abi;
   struct machine_cfa_state cfa;
+  /* This value is used for i386 targets and specifies if the function
+   * should start with the hooking-friendly Win32 function prologue       */
+  int msvc_prologue;
 };
 #endif
 
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 151419)
+++ gcc/config/i386/i386.md	(working copy)
@@ -237,6 +237,7 @@
    (UNSPECV_RDTSC		18)
    (UNSPECV_RDTSCP		19)
    (UNSPECV_RDPMC		20)
+   (UNSPECV_VSWAPMOV	21)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -15747,6 +15748,18 @@
    (set_attr "length_immediate" "0")
    (set_attr "modrm" "0")])
 
+(define_insn "vswapmov"
+  [(parallel
+    [(set (match_operand:SI 0 "register_operand" "=r")
+          (match_operand:SI 1 "register_operand" "r"))
+     (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
+   )]
+  ""
+  "movl.s\t%1,%0"
+  [(set_attr "length" "2")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 ;; Pad to 16-byte boundary, max skip in op0.  Used to avoid
 ;; branch prediction penalty for the third jump in a 16-byte
 ;; block on K8.
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 151419)
+++ gcc/config/i386/i386.c	(working copy)
@@ -4777,6 +4777,19 @@ ix86_function_type_abi (const_tree fntype)
   return ix86_abi;
 }
 
+static int
+ix86_function_msvc_prologue (const_tree fntype)
+{
+  if (!TARGET_64BIT && fntype != NULL)
+    {
+      if(lookup_attribute ("msvc_prologue", TYPE_ATTRIBUTES (fntype)))
+        {
+          return 1;
+        }
+    }
+  return 0;
+}
+
 static enum calling_abi
 ix86_function_abi (const_tree fndecl)
 {
@@ -4808,6 +4821,11 @@ ix86_call_abi_override (const_tree fndecl)
     cfun->machine->call_abi = ix86_abi;
   else
     cfun->machine->call_abi = ix86_function_type_abi (TREE_TYPE (fndecl));
+
+  if (fndecl == NULL_TREE)
+    cfun->machine->msvc_prologue = 0;
+  else
+    cfun->machine->msvc_prologue = ix86_function_msvc_prologue (TREE_TYPE (fndecl));
 }
 
 /* MS and SYSV ABI have different set of call used registers.  Avoid expensive
@@ -8316,6 +8334,7 @@ ix86_expand_prologue (void)
   bool pic_reg_used;
   struct ix86_frame frame;
   HOST_WIDE_INT allocate;
+  int gen_frame_pointer = frame_pointer_needed;
 
   ix86_finalize_stack_realign_flags ();
 
@@ -8328,6 +8347,44 @@ ix86_expand_prologue (void)
 
   ix86_compute_frame_layout (&frame);
 
+  if(cfun->machine->msvc_prologue)
+  {
+    rtx push, mov;
+    /* Make sure the function starts with
+       8b ff     movl.s %edi,%edi
+       55        push   %ebp
+       8b ec     movl.s %esp,%ebp
+
+       This matches the hookable function prologue in Win32 API functions in Microsoft Windows
+       XP Service Pack 2 and newer. Wine uses this to enable Windows apps to hook the Win32 API
+       functions provided by Wine */
+    insn = emit_insn(gen_vswapmov(gen_rtx_REG (Pmode, DI_REG), gen_rtx_REG (Pmode, DI_REG)));
+    push = emit_insn (gen_push (hard_frame_pointer_rtx));
+    mov = emit_insn(gen_vswapmov(hard_frame_pointer_rtx, stack_pointer_rtx));
+
+    if(frame_pointer_needed && !(crtl->drap_reg && crtl->stack_realign_needed))
+    {
+      /* The push %ebp and movl.s %esp, %ebp already set up the frame pointer. No need to do
+         this again. */
+      gen_frame_pointer = 0;
+      RTX_FRAME_RELATED_P (push) = 1;
+      RTX_FRAME_RELATED_P (mov) = 1;
+      if (ix86_cfa_state->reg == stack_pointer_rtx)
+        ix86_cfa_state->reg = hard_frame_pointer_rtx;
+    }
+    else
+    {
+      /* If the frame pointer is not needed, pop %ebp again. This could be optimized for cases where
+         ebp needs to be backed up for some other reason.
+
+         If stack realignment is needed, pop the base pointer again, align the stack, and later
+         regenerate the frame pointer setup. The frame pointer generated by the msvc prologue
+         is not aligned, so it can't be used */
+      insn = emit_insn ((*ix86_gen_pop1) (hard_frame_pointer_rtx));
+    }
+  }
+
+  
   /* Emit prologue code to adjust stack alignment and setup DRAP, in case
      of DRAP is needed and stack realignment is really needed after reload */
   if (crtl->drap_reg && crtl->stack_realign_needed)
@@ -8377,14 +8434,12 @@ ix86_expand_prologue (void)
   /* Note: AT&T enter does NOT have reversed args.  Enter is probably
      slower on all targets.  Also sdb doesn't like it.  */
 
-  if (frame_pointer_needed)
+  if (gen_frame_pointer)
     {
       insn = emit_insn (gen_push (hard_frame_pointer_rtx));
       RTX_FRAME_RELATED_P (insn) = 1;
-
       insn = emit_move_insn (hard_frame_pointer_rtx, stack_pointer_rtx);
       RTX_FRAME_RELATED_P (insn) = 1;
-
       if (ix86_cfa_state->reg == stack_pointer_rtx)
         ix86_cfa_state->reg = hard_frame_pointer_rtx;
     }
@@ -26067,10 +26122,10 @@ ix86_handle_abi_attribute (tree *node, tree name,
       *no_add_attrs = true;
       return NULL_TREE;
     }
-  if (!TARGET_64BIT)
+  if (TARGET_64BIT ? is_attribute_p ("msvc_prologue", name) : !is_attribute_p ("msvc_prologue", name))
     {
-      warning (OPT_Wattributes, "%qE attribute only available for 64-bit",
-	       name);
+      warning (OPT_Wattributes, "%qE attribute only available for %d-bit",
+	       name, TARGET_64BIT ? 32 : 64);
       *no_add_attrs = true;
       return NULL_TREE;
     }
@@ -28983,6 +29038,9 @@ static const struct attribute_spec ix86_attribute_
   /* ms_abi and sysv_abi calling convention function attributes.  */
   { "ms_abi", 0, 0, false, true, true, ix86_handle_abi_attribute },
   { "sysv_abi", 0, 0, false, true, true, ix86_handle_abi_attribute },
+#ifdef HAVE_AS_IX86_SWAP
+  { "msvc_prologue", 0, 0, false, true, true, ix86_handle_abi_attribute },
+#endif
   /* End element.  */
   { NULL,        0, 0, false, false, false, NULL }
 };

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-04 12:18       ` Stefan Dösinger
@ 2009-09-04 12:23         ` Paolo Bonzini
  2009-09-04 12:50           ` Stefan Dösinger
  0 siblings, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2009-09-04 12:23 UTC (permalink / raw)
  To: Stefan Dösinger; +Cc: gcc

>> Yes, you need this:
>>
>>       [(set (match_operand:SI 0 "register_operand" "=r")
>>             (match_operand:SI 1 "register_operand" "r"))
>>        (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
> That works, thanks!
>
> I just found the "=r" and "r" stuff myself almost at the same time your mail
> arrived. But what does the "SI" do? I haven't found

It specifies that the register must be 32-bit (SImode, Single-word 
Integer mode).  It just shuts up the warnings.

> Now the definition looks like this:
> (define_insn "vswapmov"
>    [(parallel
>      [(set (match_operand:SI 0 "register_operand" "=r")
>            (match_operand:SI 1 "register_operand" "r"))
>       (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
>     )]

The parallel is implicit in define_insn, so it is not different.  It 
does not make any harm I guess, but it looks "weird" to a more familiar 
eye. :-)

>    ""
>    "movl.s\t%1,%0"
>    [(set_attr "length" "2")
>     (set_attr "length_immediate" "0")
>     (set_attr "modrm" "0")])
>
> I am still compiling, so I don't know if it works yet.
>
> I attached the current state of the whole patch. I added the attribute to the
> documentation, and generated the patch with function names this time.

Here:

+#ifdef HAVE_AS_IX86_SWAP
+  { "msvc_prologue", 0, 0, false, true, true, ix86_handle_abi_attribute },
+#endif

it's better to always provide the attribute, and call "sorry" in 
ix86_function_msvc_prologue if you don't have the .s suffix.

Another two nits since I've found a more serious one: :-)

1) do not remove spurious lines.

        RTX_FRAME_RELATED_P (insn) = 1;
-
        insn = emit_move_insn (hard_frame_pointer_rtx, stack_pointer_rtx);
        RTX_FRAME_RELATED_P (insn) = 1;
-
        if (ix86_cfa_state->reg == stack_pointer_rtx)


2) extra long line, go to new line *before* ? and colon:

+  if (TARGET_64BIT ? is_attribute_p ("msvc_prologue", name) : 
!is_attribute_p ("msvc_prologue", name))

Thanks!!

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-04 12:23         ` Paolo Bonzini
@ 2009-09-04 12:50           ` Stefan Dösinger
  2009-09-04 14:35             ` Stefan Dösinger
  2009-09-06  9:36             ` Andreas Schwab
  0 siblings, 2 replies; 20+ messages in thread
From: Stefan Dösinger @ 2009-09-04 12:50 UTC (permalink / raw)
  To: gcc; +Cc: Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 918 bytes --]

Am Friday 04 September 2009 14:23:39 schrieb Paolo Bonzini:
> The parallel is implicit in define_insn, so it is not different.  It
> does not make any harm I guess, but it looks "weird" to a more familiar
> eye. :-)
Ok, I removed it again :-)

> +#ifdef HAVE_AS_IX86_SWAP
> +  { "msvc_prologue", 0, 0, false, true, true, ix86_handle_abi_attribute },
> +#endif
>
> it's better to always provide the attribute, and call "sorry" in
> ix86_function_msvc_prologue if you don't have the .s suffix.
Fixed!

> Another two nits since I've found a more serious one: :-)
>
> 1) do not remove spurious lines.
Ooops. Forgot to read the diff...

> 2) extra long line, go to new line *before* ? and colon:
>
> +  if (TARGET_64BIT ? is_attribute_p ("msvc_prologue", name) :
> !is_attribute_p ("msvc_prologue", name))
Fixed!

I attached another version of the patch - I restarted the compile, so I still 
don't know if it fully works.

[-- Attachment #2: msvc_prologue.diff --]
[-- Type: text/x-diff, Size: 7708 bytes --]

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 151419)
+++ gcc/doc/extend.texi	(working copy)
@@ -2672,6 +2672,14 @@ when targeting Windows.  On all other systems, the
 
 Note, This feature is currently sorried out for Windows targets trying to
 
+@item msvc_prologue
+@cindex @code{msvc_prologue} attribute
+
+On 32 bit x86-*-* targets, you can use this function attribute to make
+gcc generate the "hot-patching" function prologue used in Win32 API
+functions in Microsoft Windows XP Service Pack 2 and newer. This requires
+support for the swap suffix in the assembler. (GNU Binutils 2.19.51 or later)
+
 @item naked
 @cindex function without a prologue/epilogue code
 Use this attribute on the ARM, AVR, IP2K and SPU ports to indicate that
Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac	(revision 151419)
+++ gcc/configure.ac	(working copy)
@@ -3035,6 +3035,12 @@ foo:	nop
       [AC_DEFINE(HAVE_AS_IX86_SAHF, 1,
         [Define if your assembler supports the sahf mnemonic.])])
 
+    gcc_GAS_CHECK_FEATURE([swap suffix],
+      gcc_cv_as_ix86_swap,,,
+      [movl.s %esp, %ebp],,
+      [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
+        [Define if your assembler supports the swap suffix.])])
+
     gcc_GAS_CHECK_FEATURE([different section symbol subtraction],
       gcc_cv_as_ix86_diff_sect_delta,,,
       [.section .rodata
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	(revision 151419)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2388,6 +2388,9 @@ struct GTY(()) machine_function {
      to be used. MS_ABI means ms abi. Otherwise SYSV_ABI means sysv abi.  */
   enum calling_abi call_abi;
   struct machine_cfa_state cfa;
+  /* This value is used for i386 targets and specifies if the function
+   * should start with the hooking-friendly Win32 function prologue       */
+  int msvc_prologue;
 };
 #endif
 
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 151419)
+++ gcc/config/i386/i386.md	(working copy)
@@ -237,6 +237,7 @@
    (UNSPECV_RDTSC		18)
    (UNSPECV_RDTSCP		19)
    (UNSPECV_RDPMC		20)
+   (UNSPECV_VSWAPMOV	21)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -15747,6 +15748,16 @@
    (set_attr "length_immediate" "0")
    (set_attr "modrm" "0")])
 
+(define_insn "vswapmov"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (match_operand:SI 1 "register_operand" "r"))
+   (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
+  ""
+  "movl.s\t%1,%0"
+  [(set_attr "length" "2")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 ;; Pad to 16-byte boundary, max skip in op0.  Used to avoid
 ;; branch prediction penalty for the third jump in a 16-byte
 ;; block on K8.
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 151419)
+++ gcc/config/i386/i386.c	(working copy)
@@ -4777,6 +4777,24 @@ ix86_function_type_abi (const_tree fntype)
   return ix86_abi;
 }
 
+static int
+ix86_function_msvc_prologue (const_tree fntype)
+{
+  if (!TARGET_64BIT && fntype != NULL)
+    {
+      if(lookup_attribute ("msvc_prologue", TYPE_ATTRIBUTES (fntype)))
+        {
+#ifdef HAVE_AS_IX86_SWAP
+          return 1;
+#else
+          sorry ("msvc_prologue needs swap suffix support in as");
+          return 0;
+#endif
+        }
+    }
+  return 0;
+}
+
 static enum calling_abi
 ix86_function_abi (const_tree fndecl)
 {
@@ -4808,6 +4826,11 @@ ix86_call_abi_override (const_tree fndecl)
     cfun->machine->call_abi = ix86_abi;
   else
     cfun->machine->call_abi = ix86_function_type_abi (TREE_TYPE (fndecl));
+
+  if (fndecl == NULL_TREE)
+    cfun->machine->msvc_prologue = 0;
+  else
+    cfun->machine->msvc_prologue = ix86_function_msvc_prologue (TREE_TYPE (fndecl));
 }
 
 /* MS and SYSV ABI have different set of call used registers.  Avoid expensive
@@ -8316,6 +8339,7 @@ ix86_expand_prologue (void)
   bool pic_reg_used;
   struct ix86_frame frame;
   HOST_WIDE_INT allocate;
+  int gen_frame_pointer = frame_pointer_needed;
 
   ix86_finalize_stack_realign_flags ();
 
@@ -8328,6 +8352,43 @@ ix86_expand_prologue (void)
 
   ix86_compute_frame_layout (&frame);
 
+  if(cfun->machine->msvc_prologue)
+  {
+    rtx push, mov;
+    /* Make sure the function starts with
+       8b ff     movl.s %edi,%edi
+       55        push   %ebp
+       8b ec     movl.s %esp,%ebp
+
+       This matches the hookable function prologue in Win32 API functions in Microsoft Windows
+       XP Service Pack 2 and newer. Wine uses this to enable Windows apps to hook the Win32 API
+       functions provided by Wine */
+    insn = emit_insn(gen_vswapmov(gen_rtx_REG (Pmode, DI_REG), gen_rtx_REG (Pmode, DI_REG)));
+    push = emit_insn (gen_push (hard_frame_pointer_rtx));
+    mov = emit_insn(gen_vswapmov(hard_frame_pointer_rtx, stack_pointer_rtx));
+
+    if(frame_pointer_needed && !(crtl->drap_reg && crtl->stack_realign_needed))
+    {
+      /* The push %ebp and movl.s %esp, %ebp already set up the frame pointer. No need to do
+         this again. */
+      gen_frame_pointer = 0;
+      RTX_FRAME_RELATED_P (push) = 1;
+      RTX_FRAME_RELATED_P (mov) = 1;
+      if (ix86_cfa_state->reg == stack_pointer_rtx)
+        ix86_cfa_state->reg = hard_frame_pointer_rtx;
+    }
+    else
+    {
+      /* If the frame pointer is not needed, pop %ebp again. This could be optimized for cases where
+         ebp needs to be backed up for some other reason.
+
+         If stack realignment is needed, pop the base pointer again, align the stack, and later
+         regenerate the frame pointer setup. The frame pointer generated by the msvc prologue
+         is not aligned, so it can't be used */
+      insn = emit_insn ((*ix86_gen_pop1) (hard_frame_pointer_rtx));
+    }
+  }
+
   /* Emit prologue code to adjust stack alignment and setup DRAP, in case
      of DRAP is needed and stack realignment is really needed after reload */
   if (crtl->drap_reg && crtl->stack_realign_needed)
@@ -8377,7 +8438,7 @@ ix86_expand_prologue (void)
   /* Note: AT&T enter does NOT have reversed args.  Enter is probably
      slower on all targets.  Also sdb doesn't like it.  */
 
-  if (frame_pointer_needed)
+  if (gen_frame_pointer)
     {
       insn = emit_insn (gen_push (hard_frame_pointer_rtx));
       RTX_FRAME_RELATED_P (insn) = 1;
@@ -26067,12 +26128,14 @@ ix86_handle_abi_attribute (tree *node, tree name,
       *no_add_attrs = true;
       return NULL_TREE;
     }
-  if (!TARGET_64BIT)
+  if (TARGET_64BIT
+      ? is_attribute_p ("msvc_prologue", name)
+      : !is_attribute_p ("msvc_prologue", name))
     {
-      warning (OPT_Wattributes, "%qE attribute only available for 64-bit",
-	       name);
-      *no_add_attrs = true;
-      return NULL_TREE;
+      warning (OPT_Wattributes, "%qE attribute only available for %d-bit",
+        name, TARGET_64BIT ? 32 : 64);
+       *no_add_attrs = true;
+       return NULL_TREE;
     }
 
   /* Can combine regparm with all attributes but fastcall.  */
@@ -28983,6 +29046,7 @@ static const struct attribute_spec ix86_attribute_
   /* ms_abi and sysv_abi calling convention function attributes.  */
   { "ms_abi", 0, 0, false, true, true, ix86_handle_abi_attribute },
   { "sysv_abi", 0, 0, false, true, true, ix86_handle_abi_attribute },
+  { "msvc_prologue", 0, 0, false, true, true, ix86_handle_abi_attribute },
   /* End element.  */
   { NULL,        0, 0, false, false, false, NULL }
 };

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-04 12:50           ` Stefan Dösinger
@ 2009-09-04 14:35             ` Stefan Dösinger
  2009-09-05 12:43               ` Paolo Bonzini
  2009-09-06  9:36             ` Andreas Schwab
  1 sibling, 1 reply; 20+ messages in thread
From: Stefan Dösinger @ 2009-09-04 14:35 UTC (permalink / raw)
  To: gcc; +Cc: Paolo Bonzini

Am Friday 04 September 2009 14:49:42 schrieb Stefan Dösinger:
> I attached another version of the patch - I restarted the compile, so I
> still don't know if it fully works.
Seems to be working - gcc compiles fine, my test function has the right 
starting bytes. Wine compiles and runs, and Steam successfully hooks the 
functions.

I'll add a test case for this feature as well, and submit it next week. Before 
I finally submit it I want to make sure Alexandre hasn't changed his mind 
about all this.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-04 14:35             ` Stefan Dösinger
@ 2009-09-05 12:43               ` Paolo Bonzini
  2009-09-05 13:41                 ` Stefan Dösinger
  0 siblings, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2009-09-05 12:43 UTC (permalink / raw)
  To: Stefan Dösinger; +Cc: gcc

On Fri, Sep 4, 2009 at 16:35, Stefan Dösinger<stefan@codeweavers.com> wrote:
> Am Friday 04 September 2009 14:49:42 schrieb Stefan Dösinger:
>> I attached another version of the patch - I restarted the compile, so I
>> still don't know if it fully works.
> Seems to be working - gcc compiles fine, my test function has the right
> starting bytes. Wine compiles and runs, and Steam successfully hooks the
> functions.
>
> I'll add a test case for this feature as well, and submit it next week. Before
> I finally submit it I want to make sure Alexandre hasn't changed his mind
> about all this.

Are there non-Microsoft DLLs that expect to be hooked this way?  If
so, I think the patch is interesting for gcc independent of whether it
is useful for Wine.

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-05 12:43               ` Paolo Bonzini
@ 2009-09-05 13:41                 ` Stefan Dösinger
  0 siblings, 0 replies; 20+ messages in thread
From: Stefan Dösinger @ 2009-09-05 13:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc

> Are there non-Microsoft DLLs that expect to be hooked this way?  If
> so, I think the patch is interesting for gcc independent of whether it
> is useful for Wine.
I haven't seen any so far. Its certainly possible some server apps  
have the 2 byte nop at the beginning of functions for a similar hot- 
patching mechanism, but beyond that I don't think any app needs this.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-04 12:50           ` Stefan Dösinger
  2009-09-04 14:35             ` Stefan Dösinger
@ 2009-09-06  9:36             ` Andreas Schwab
  2009-09-08 20:12               ` Stefan Dösinger
  1 sibling, 1 reply; 20+ messages in thread
From: Andreas Schwab @ 2009-09-06  9:36 UTC (permalink / raw)
  To: Stefan Dösinger; +Cc: gcc, Paolo Bonzini

Stefan Dösinger <stefan@codeweavers.com> writes:

> Index: gcc/doc/extend.texi
> ===================================================================
> --- gcc/doc/extend.texi	(revision 151419)
> +++ gcc/doc/extend.texi	(working copy)
> @@ -2672,6 +2672,14 @@ when targeting Windows.  On all other systems, the
>  
>  Note, This feature is currently sorried out for Windows targets trying to
>  
> +@item msvc_prologue
> +@cindex @code{msvc_prologue} attribute
> +
> +On 32 bit x86-*-* targets, you can use this function attribute to make

There are no x86-*-* targets, they must match i[34567]86-*-*.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-06  9:36             ` Andreas Schwab
@ 2009-09-08 20:12               ` Stefan Dösinger
  0 siblings, 0 replies; 20+ messages in thread
From: Stefan Dösinger @ 2009-09-08 20:12 UTC (permalink / raw)
  To: gcc

[-- Attachment #1: Type: text/plain, Size: 727 bytes --]

Ok, Alexandre hasn't changed his opinion, the function attrib is ok with him.

I attached another version of the patch, this time adding some testcases. Two 
more questions though:

*) How can I skip the tests if msvc_prologue is not available because as 
doesn't support the swap suffix? I think it would be wrong if the tests 
failed in that case because the compiler says sorry().
I am currently scanning the other tests for hints, haven't found any yet.

*) Is the way I added the new files in the diff ok? (diff -u /dev/null 
newfile). Or is there some more SVN-ish way?

Am Sunday 06 September 2009 11:36:23 schrieb Andreas Schwab:
> There are no x86-*-* targets, they must match i[34567]86-*-*.
Fixed that issue as well

[-- Attachment #2: msvc_prologue.diff --]
[-- Type: text/x-diff, Size: 9384 bytes --]

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 151512)
+++ gcc/doc/extend.texi	(working copy)
@@ -2672,6 +2672,14 @@ when targeting Windows.  On all other systems, the
 
 Note, This feature is currently sorried out for Windows targets trying to
 
+@item msvc_prologue
+@cindex @code{msvc_prologue} attribute
+
+On 32 bit i[34567]86-*-* targets, you can use this function attribute to make
+gcc generate the "hot-patching" function prologue used in Win32 API
+functions in Microsoft Windows XP Service Pack 2 and newer. This requires
+support for the swap suffix in the assembler. (GNU Binutils 2.19.51 or later)
+
 @item naked
 @cindex function without a prologue/epilogue code
 Use this attribute on the ARM, AVR, IP2K and SPU ports to indicate that
Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac	(revision 151512)
+++ gcc/configure.ac	(working copy)
@@ -3035,6 +3035,12 @@ foo:	nop
       [AC_DEFINE(HAVE_AS_IX86_SAHF, 1,
         [Define if your assembler supports the sahf mnemonic.])])
 
+    gcc_GAS_CHECK_FEATURE([swap suffix],
+      gcc_cv_as_ix86_swap,,,
+      [movl.s %esp, %ebp],,
+      [AC_DEFINE(HAVE_AS_IX86_SWAP, 1,
+        [Define if your assembler supports the swap suffix.])])
+
     gcc_GAS_CHECK_FEATURE([different section symbol subtraction],
       gcc_cv_as_ix86_diff_sect_delta,,,
       [.section .rodata
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	(revision 151512)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2388,6 +2388,9 @@ struct GTY(()) machine_function {
      to be used. MS_ABI means ms abi. Otherwise SYSV_ABI means sysv abi.  */
   enum calling_abi call_abi;
   struct machine_cfa_state cfa;
+  /* This value is used for i386 targets and specifies if the function
+   * should start with the hooking-friendly Win32 function prologue       */
+  int msvc_prologue;
 };
 #endif
 
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 151512)
+++ gcc/config/i386/i386.md	(working copy)
@@ -237,6 +237,7 @@
    (UNSPECV_RDTSC		18)
    (UNSPECV_RDTSCP		19)
    (UNSPECV_RDPMC		20)
+   (UNSPECV_VSWAPMOV	21)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -15747,6 +15748,16 @@
    (set_attr "length_immediate" "0")
    (set_attr "modrm" "0")])
 
+(define_insn "vswapmov"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (match_operand:SI 1 "register_operand" "r"))
+   (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
+  ""
+  "movl.s\t%1,%0"
+  [(set_attr "length" "2")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 ;; Pad to 16-byte boundary, max skip in op0.  Used to avoid
 ;; branch prediction penalty for the third jump in a 16-byte
 ;; block on K8.
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 151512)
+++ gcc/config/i386/i386.c	(working copy)
@@ -4777,6 +4777,24 @@ ix86_function_type_abi (const_tree fntype)
   return ix86_abi;
 }
 
+static int
+ix86_function_msvc_prologue (const_tree fntype)
+{
+  if (!TARGET_64BIT && fntype != NULL)
+    {
+      if(lookup_attribute ("msvc_prologue", TYPE_ATTRIBUTES (fntype)))
+        {
+#ifdef HAVE_AS_IX86_SWAP
+          return 1;
+#else
+          sorry ("msvc_prologue needs swap suffix support in as");
+          return 0;
+#endif
+        }
+    }
+  return 0;
+}
+
 static enum calling_abi
 ix86_function_abi (const_tree fndecl)
 {
@@ -4808,6 +4826,11 @@ ix86_call_abi_override (const_tree fndecl)
     cfun->machine->call_abi = ix86_abi;
   else
     cfun->machine->call_abi = ix86_function_type_abi (TREE_TYPE (fndecl));
+
+  if (fndecl == NULL_TREE)
+    cfun->machine->msvc_prologue = 0;
+  else
+    cfun->machine->msvc_prologue = ix86_function_msvc_prologue (TREE_TYPE (fndecl));
 }
 
 /* MS and SYSV ABI have different set of call used registers.  Avoid expensive
@@ -8316,6 +8339,7 @@ ix86_expand_prologue (void)
   bool pic_reg_used;
   struct ix86_frame frame;
   HOST_WIDE_INT allocate;
+  int gen_frame_pointer = frame_pointer_needed;
 
   ix86_finalize_stack_realign_flags ();
 
@@ -8328,6 +8352,43 @@ ix86_expand_prologue (void)
 
   ix86_compute_frame_layout (&frame);
 
+  if(cfun->machine->msvc_prologue)
+  {
+    rtx push, mov;
+    /* Make sure the function starts with
+       8b ff     movl.s %edi,%edi
+       55        push   %ebp
+       8b ec     movl.s %esp,%ebp
+
+       This matches the hookable function prologue in Win32 API functions in Microsoft Windows
+       XP Service Pack 2 and newer. Wine uses this to enable Windows apps to hook the Win32 API
+       functions provided by Wine */
+    insn = emit_insn(gen_vswapmov(gen_rtx_REG (Pmode, DI_REG), gen_rtx_REG (Pmode, DI_REG)));
+    push = emit_insn (gen_push (hard_frame_pointer_rtx));
+    mov = emit_insn(gen_vswapmov(hard_frame_pointer_rtx, stack_pointer_rtx));
+
+    if(frame_pointer_needed && !(crtl->drap_reg && crtl->stack_realign_needed))
+    {
+      /* The push %ebp and movl.s %esp, %ebp already set up the frame pointer. No need to do
+         this again. */
+      gen_frame_pointer = 0;
+      RTX_FRAME_RELATED_P (push) = 1;
+      RTX_FRAME_RELATED_P (mov) = 1;
+      if (ix86_cfa_state->reg == stack_pointer_rtx)
+        ix86_cfa_state->reg = hard_frame_pointer_rtx;
+    }
+    else
+    {
+      /* If the frame pointer is not needed, pop %ebp again. This could be optimized for cases where
+         ebp needs to be backed up for some other reason.
+
+         If stack realignment is needed, pop the base pointer again, align the stack, and later
+         regenerate the frame pointer setup. The frame pointer generated by the msvc prologue
+         is not aligned, so it can't be used */
+      insn = emit_insn ((*ix86_gen_pop1) (hard_frame_pointer_rtx));
+    }
+  }
+
   /* Emit prologue code to adjust stack alignment and setup DRAP, in case
      of DRAP is needed and stack realignment is really needed after reload */
   if (crtl->drap_reg && crtl->stack_realign_needed)
@@ -8377,7 +8438,7 @@ ix86_expand_prologue (void)
   /* Note: AT&T enter does NOT have reversed args.  Enter is probably
      slower on all targets.  Also sdb doesn't like it.  */
 
-  if (frame_pointer_needed)
+  if (gen_frame_pointer)
     {
       insn = emit_insn (gen_push (hard_frame_pointer_rtx));
       RTX_FRAME_RELATED_P (insn) = 1;
@@ -26067,12 +26128,14 @@ ix86_handle_abi_attribute (tree *node, tree name,
       *no_add_attrs = true;
       return NULL_TREE;
     }
-  if (!TARGET_64BIT)
+  if (TARGET_64BIT
+      ? is_attribute_p ("msvc_prologue", name)
+      : !is_attribute_p ("msvc_prologue", name))
     {
-      warning (OPT_Wattributes, "%qE attribute only available for 64-bit",
-	       name);
-      *no_add_attrs = true;
-      return NULL_TREE;
+      warning (OPT_Wattributes, "%qE attribute only available for %d-bit",
+        name, TARGET_64BIT ? 32 : 64);
+       *no_add_attrs = true;
+       return NULL_TREE;
     }
 
   /* Can combine regparm with all attributes but fastcall.  */
@@ -28983,6 +29046,7 @@ static const struct attribute_spec ix86_attribute_
   /* ms_abi and sysv_abi calling convention function attributes.  */
   { "ms_abi", 0, 0, false, true, true, ix86_handle_abi_attribute },
   { "sysv_abi", 0, 0, false, true, true, ix86_handle_abi_attribute },
+  { "msvc_prologue", 0, 0, false, true, true, ix86_handle_abi_attribute },
   /* End element.  */
   { NULL,        0, 0, false, false, false, NULL }
 };
--- /dev/null	2009-09-08 12:59:27.979502008 +0200
+++ gcc/testsuite/gcc.misc-tests/msvc_prologue.c	2009-09-08 22:01:37.000000000 +0200
@@ -0,0 +1,17 @@
+/* Test that the msvc_prologue attribute generates the correct code.  */
+
+int __attribute__((__msvc_prologue__)) foo()
+{
+    unsigned char *ptr = (unsigned char *) foo;
+    if(*ptr++ != 0x8b) return 1;
+    if(*ptr++ != 0xff) return 1;
+    if(*ptr++ != 0x55) return 1;
+    if(*ptr++ != 0x8b) return 1;
+    if(*ptr++ != 0xec) return 1;
+    return 0;
+}
+
+int main ()
+{
+  return foo();
+}
--- /dev/null	2009-09-08 12:59:27.979502008 +0200
+++ gcc/testsuite/gcc.misc-tests/msvc_prologue.exp	2009-09-08 17:51:23.000000000 +0200
@@ -0,0 +1,28 @@
+#   Copyright (C) 1997, 2007 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+global PERF_TEST
+if { ![info exists PERF_TEST] || "$PERF_TEST" != "yes" } {
+    return
+}
+
+load_lib mike-gcc.exp
+
+prebase
+set actions run
+set compiler_output "^$"
+set program_output "^$"
+postbase msvc-prologue.c $run $groups

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
@ 2009-09-07 14:27 Ross Ridge
  0 siblings, 0 replies; 20+ messages in thread
From: Ross Ridge @ 2009-09-07 14:27 UTC (permalink / raw)
  To: gcc

Paolo Bonzini writes:
>The naked attribute has been proposed and bashed to death multiple
>times on the GCC list too.

No, not really.  It's been proposed a few times, but the discussion never
gets anywhere because the i386 maintainers quickly put their foot down
and end it.  That hasn't stopped other ports from implementing a naked
attribute or for that matter developers like me creating their own
private implementations.

					Ross Ridge

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-06  9:16 ` Stefan Dösinger
@ 2009-09-07 11:11   ` Paolo Bonzini
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Bonzini @ 2009-09-07 11:11 UTC (permalink / raw)
  To: Stefan Dösinger; +Cc: gcc, Ross Ridge

On 09/06/2009 11:15 AM, Stefan Dösinger wrote:
> Am Saturday 05 September 2009 17:08:19 schrieb Ross Ridge:
>> If this patch is essentially only for one application, maybe the idea
>> of implementing a more generally useful naked attribute would be the
>> way to go.  I implemented a naked attribute in my private sources to
>> do something similar, although supporting hookable prologues was just
>> a small part of its more general use in supporting an assembler based API.
> We don't really like the naked attribute, because it makes maintaining a C
> function that uses it a pain. Alexandre once said that he would reject any
> solution for the hook problem that is based on the naked attribute. This
> especially becomes a pain when the function has to do stack realignment, like
> all our Win32 functions on OSX.

The naked attribute has been proposed and bashed to death multiple times 
on the GCC list too.

Paolo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
  2009-09-05 15:08 Ross Ridge
@ 2009-09-06  9:16 ` Stefan Dösinger
  2009-09-07 11:11   ` Paolo Bonzini
  0 siblings, 1 reply; 20+ messages in thread
From: Stefan Dösinger @ 2009-09-06  9:16 UTC (permalink / raw)
  To: gcc; +Cc: Ross Ridge

Am Saturday 05 September 2009 17:08:19 schrieb Ross Ridge:
> If this patch is essentially only for one application, maybe the idea
> of implementing a more generally useful naked attribute would be the
> way to go.  I implemented a naked attribute in my private sources to
> do something similar, although supporting hookable prologues was just
> a small part of its more general use in supporting an assembler based API.
We don't really like the naked attribute, because it makes maintaining a C 
function that uses it a pain. Alexandre once said that he would reject any 
solution for the hook problem that is based on the naked attribute. This 
especially becomes a pain when the function has to do stack realignment, like 
all our Win32 functions on OSX.

But yeah, this functionality will probably be used only by Wine, since Linux 
and OSX offer more comfortable hooking mechanisms than opcode replacements. 
Although you never know, perhaps someone else finds a use for this I did not 
anticipate.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: MSVC hook function prologue
@ 2009-09-05 15:08 Ross Ridge
  2009-09-06  9:16 ` Stefan Dösinger
  0 siblings, 1 reply; 20+ messages in thread
From: Ross Ridge @ 2009-09-05 15:08 UTC (permalink / raw)
  To: gcc

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 613 bytes --]

Paolo Bonzini writes:
>Are there non-Microsoft DLLs that expect to be hooked this way?  If
>so, I think the patch is interesting for gcc independent of whether it
>is useful for Wine.

Stefan Dösinger writes:
>I haven't seen any so far. ...

If this patch is essentially only for one application, maybe the idea
of implementing a more generally useful naked attribute would be the
way to go.  I implemented a naked attribute in my private sources to
do something similar, although supporting hookable prologues was just
a small part of its more general use in supporting an assembler based API.

					Ross Ridge

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2009-09-08 20:12 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-02 20:46 MSVC hook function prologue Stefan Dösinger
2009-09-02 22:05 ` Paolo Bonzini
2009-09-02 22:27   ` Kai Tietz
2009-09-02 23:39     ` Paolo Bonzini
2009-09-03  7:50   ` Stefan Dösinger
2009-09-03  7:52     ` Paolo Bonzini
2009-09-04 11:45   ` Stefan Dösinger
2009-09-04 11:47     ` Paolo Bonzini
2009-09-04 12:18       ` Stefan Dösinger
2009-09-04 12:23         ` Paolo Bonzini
2009-09-04 12:50           ` Stefan Dösinger
2009-09-04 14:35             ` Stefan Dösinger
2009-09-05 12:43               ` Paolo Bonzini
2009-09-05 13:41                 ` Stefan Dösinger
2009-09-06  9:36             ` Andreas Schwab
2009-09-08 20:12               ` Stefan Dösinger
2009-09-05 15:08 Ross Ridge
2009-09-06  9:16 ` Stefan Dösinger
2009-09-07 11:11   ` Paolo Bonzini
2009-09-07 14:27 Ross Ridge

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).