public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/9] RFC: Add optimization -foutline-msabi-xlougues (for Wine 64)
@ 2016-11-15 20:00 Daniel Santos
From: Daniel Santos @ 2016-11-15 20:00 UTC (permalink / raw)
  To: gcc-patches

Due to differences between the 64-bit Microsoft and System V ABIs, any 
msabi function that calls a sysv function must consider RSI, RDI and 
XMM6-15 as clobbered. The result is that such functions are bloated with 
SSE saves/restores costing as much as 106 bytes each (up to 200-ish 
bytes per function). This patch set targets 64-bit Wine and aims to 
mitigate some of those costs.

A few save & restore stubs are added to the static portion of libgcc, 
and the pro/epilogues of such functions use these stubs instead, thus 
reducing .text size. Since we're already tinkering with stubs, they also 
manage the save/restore of up to 6 additional registers. Analysis of 
building 64-bit Wine demonstrates a reduction of .text by around 20%. I 
haven't produced performance data yet, but as this is my first attempt 
at modifying gcc I would rather ask for comments early in the process.

The basic theory is that a reduction of I-cache misses will offset the 
extra instructions required for implementation. In addition, since there 
are only a handful of stubs that will be in memory, I'm using the larger 
mov instructions instead of push/pop to facilitate better parallelization.

Here is a sample of what these prologues/epilogues look like:

Prologue (in this case, SP adjustment was properly combined with later 
stack allocation):
     7b833800:   48 8d 44 24 88          lea -0x78(%rsp),%rax
     7b833805:   48 81 ec 58 01 00 00    sub    $0x158,%rsp
     7b83380c:   e8 95 6f 05 00          callq  7b88a7a6 <__savms64_17>

Epilogue (r10 stores the value to restore the stack pointer to):
     7b83386c:   48 8d b4 24 e0 00 00    lea 0xe0(%rsp),%rsi
     7b833873:   00
     7b833874:   4c 8d 56 78             lea 0x78(%rsi),%r10
     7b833878:   e9 c9 6f 05 00          jmpq   7b88a846 <__resms64x_17>

Prologue, stack realignment case (this shows the uncombined SP 
modifications, described below):
     7b833800:   55                      push   %rbp
     7b833801:   48 8d 44 24 90          lea -0x70(%rsp),%rax
     7b833806:   48 89 e5                mov    %rsp,%rbp
     7b833809:   48 83 e0 f0             and $0xfffffffffffffff0,%rax
     7b83380d:   48 8d 60 90             lea -0x70(%rax),%rsp
     7b833811:   e8 cc 79 05 00          callq  7b88b1e2 <__savms64r_17>
     7b833816:   48 89 cb                mov    %rcx,%rbx   # reordered insn from body
     7b833819:   48 83 ec 70             sub    $0x70,%rsp

Epilogue, stack realignment case:
     7b833875:   48 8d b4 24 e0 00 00    lea 0xe0(%rsp),%rsi
     7b83387c:   00
     7b83387d:   e9 ac 79 05 00          jmpq   7b88b22e <__resms64rx_17>


Questions and (known) outstanding issues:

 1. I have added the new -f optimization to common.opt, but since it
    only impacts x86_64, should it be a machine-specific -m option
    instead?
 2. In the prologues that realign the stack, stack pointer modifications
    aren't being combined, presumably because I'm using a lea after
    realigning via rax.
 3. My x86 assembly expertise is limited, so I would appreciate any
    feedback on my stubs & emitted code.
 4. Documentation is still missing.
 5. A Changelog entry is still missing.
 6. This is my first major work on a GNU project and I have not yet
    fully reviewed all of the relevant GNU coding conventions, so some
    code may still be non-compliant.
 7. Regression tests have only been run on my old Phenom; I have not
    yet tested on an AVX CPU (which should use vmovaps instead of
    movaps).
 8. My test program is inadequate (and is not included in this patch
    set).  During development it failed to produce many of the
    optimization errors that I hit when building Wine, so I've been
    building 64-bit Wine and running Wine's tests in the meantime.
 9. I need to devise a meaningful benchmarking strategy.
10. I have not yet examined how this may or may not affect -flto or
    where additional optimization opportunities in the lto driver may exist.
11. There are a few more optimization opportunities that I haven't
    attempted to exploit yet and prefer to leave for later projects.
      * In the case of stack realignment and all 17 registers being
        clobbered, I can combine the majority of the prologue
        (alignment, saving frame pointer, etc.) in the stub.
      * With these stubs being in the static portion of libgcc, each
        Wine "dll" gets a separate copy. Since the average number of
        dlls a Windows program loads seems to be at least 15, a
        mechanism for linking them dynamically from libwine.so could
        save a little more .text and icache.
      * Ultimately, good static analysis of local sysv functions can
        completely eliminate the need to save SSE registers in some cases.
12. Use of hard frame pointers disables the optimization unless we're
    also realigning the stack. I've implemented this in another (local)
    branch, but haven't tested it yet.


 gcc/common.opt                 |   7 +
 gcc/config/i386/i386.c         | 729 ++++++++++++++++++++++++++++++++++++++---
 gcc/config/i386/i386.h         |  22 +-
 gcc/config/i386/predicates.md  | 148 +++++++++
 gcc/config/i386/sse.md         |  56 ++++
 libgcc/config.host             |   2 +-
 libgcc/config/i386/i386-asm.h  |  82 +++++
 libgcc/config/i386/resms64.S   |  63 ++++
 libgcc/config/i386/resms64f.S  |  59 ++++
 libgcc/config/i386/resms64fx.S |  61 ++++
 libgcc/config/i386/resms64x.S  |  65 ++++
 libgcc/config/i386/savms64.S   |  63 ++++
 libgcc/config/i386/savms64f.S  |  64 ++++
 libgcc/config/i386/t-msabi     |   7 +
 14 files changed, 1379 insertions(+), 49 deletions(-)

Feedback and comments would be most appreciated!

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/9] Change type of x86_64_ms_sysv_extra_clobbered_registers
From: Daniel Santos @ 2016-11-15 20:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

This will need to be unsigned for a subsequent patch. Also adds the
constant NUM_X86_64_MS_CLOBBERED_REGS for brevity.
---
 gcc/config/i386/i386.c | 8 +++-----
 gcc/config/i386/i386.h | 4 +++-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a5c4ba7..56cc67d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2421,7 +2421,7 @@ static int const x86_64_int_return_registers[4] =
 
 /* Additional registers that are clobbered by SYSV calls.  */
 
-int const x86_64_ms_sysv_extra_clobbered_registers[12] =
+unsigned const x86_64_ms_sysv_extra_clobbered_registers[12] =
 {
   SI_REG, DI_REG,
   XMM6_REG, XMM7_REG,
@@ -28209,11 +28209,9 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
   else if (TARGET_64BIT_MS_ABI
 	   && (!callarg2 || INTVAL (callarg2) != -2))
     {
-      int const cregs_size
-	= ARRAY_SIZE (x86_64_ms_sysv_extra_clobbered_registers);
-      int i;
+      unsigned i;
 
-      for (i = 0; i < cregs_size; i++)
+      for (i = 0; i < NUM_X86_64_MS_CLOBBERED_REGS; i++)
 	{
 	  int regno = x86_64_ms_sysv_extra_clobbered_registers[i];
 	  machine_mode mode = SSE_REGNO_P (regno) ? TImode : DImode;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index add7a64..a45b66a 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2172,7 +2172,9 @@ extern int const dbx_register_map[FIRST_PSEUDO_REGISTER];
 extern int const dbx64_register_map[FIRST_PSEUDO_REGISTER];
 extern int const svr4_dbx_register_map[FIRST_PSEUDO_REGISTER];
 
-extern int const x86_64_ms_sysv_extra_clobbered_registers[12];
+extern unsigned const x86_64_ms_sysv_extra_clobbered_registers[12];
+#define NUM_X86_64_MS_CLOBBERED_REGS \
+  (ARRAY_SIZE (x86_64_ms_sysv_extra_clobbered_registers))
 
 /* Before the prologue, RA is at 0(%esp).  */
 #define INCOMING_RETURN_ADDR_RTX \
-- 
2.9.0


* [PATCH 5/9] Add patterns and predicates foutline-msabi-xlouges
From: Daniel Santos @ 2016-11-15 20:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

Adds the predicates save_multiple and restore_multiple to predicates.md,
which are used by the following patterns in sse.md:

* save_multiple - insn that calls a save stub
* save_multiple_realign - insn that calls a save stub and also manages
  a realign and hard frame pointer
* restore_multiple - call_insn that calls a restore stub and returns to
  the function to allow a sibling call (which should typically optimize
  better than using the restore stub as the tail call)
* restore_multiple_and_return - a jump_insn that is the return from
  a function (tail call)
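For reference, the shape of parallel these predicates are intended to match is roughly the following (the symbol name, offsets and register choices are illustrative, not the exact emitted RTL):

```
(parallel [(use (symbol_ref "__savms64_17"))
           (const_int 0)
           (set (mem:TI (plus:DI (reg:DI ax) (const_int -16)))
                (reg:TI xmm15))
           (set (mem:TI (plus:DI (reg:DI ax) (const_int -32)))
                (reg:TI xmm14))
           ;; ... one set per stub-saved register, all addressed
           ;; relative to RAX (RSI for the restore case) ...
           (set (mem:DI (reg:DI ax)) (reg:DI si))])
```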
---
 gcc/config/i386/predicates.md | 148 ++++++++++++++++++++++++++++++++++++++++++
 gcc/config/i386/sse.md        |  56 ++++++++++++++++
 2 files changed, 204 insertions(+)

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 219674e..f50bba9a 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1663,3 +1663,151 @@
   (ior (match_operand 0 "register_operand")
        (and (match_code "const_int")
 	    (match_test "op == constm1_rtx"))))
+
+;; Return true if:
+;; * first op is a symbol reference,
+;; * >= 14 operands, and
+;; * operands 2 to end save a register to a memory location that's RAX or an offset of RAX.
+(define_predicate "save_multiple"
+  (match_code "parallel")
+{
+  const unsigned nregs = XVECLEN (op, 0);
+  rtx head = XVECEXP (op, 0, 0);
+  unsigned i;
+
+  if (GET_CODE (head) != USE)
+    return false;
+  else
+    {
+      rtx op0 = XEXP (head, 0);
+      if (op0 == NULL_RTX || GET_CODE (op0) != SYMBOL_REF)
+	return false;
+    }
+
+  if (nregs < 14)
+    return false;
+
+  for (i = 2; i < nregs; i++)
+    {
+      rtx e, src, dest;
+
+      e = XVECEXP (op, 0, i);
+
+      switch (GET_CODE (e))
+	{
+	  case SET:
+	    src  = SET_SRC (e);
+	    dest = SET_DEST (e);
+
+	    /* storing a register to memory.  */
+	    if (GET_CODE (src) == REG && GET_CODE (dest) == MEM)
+	      {
+		rtx addr = XEXP (dest, 0);
+
+		/* Good if dest address is in RAX.  */
+		if (GET_CODE (addr) == REG
+		    && REGNO (addr) == AX_REG)
+		  continue;
+
+		/* Good if dest address is offset of RAX.  */
+		if (GET_CODE (addr) == PLUS
+		    && GET_CODE (XEXP (addr, 0)) == REG
+		    && REGNO (XEXP (addr, 0)) == AX_REG)
+		  continue;
+	      }
+	    break;
+
+	  default:
+	    break;
+	}
+	return false;
+    }
+  return true;
+})
+
+;; Return true if:
+;; * first op is (return) or a use (symbol reference),
+;; * >= 14 operands, and
+;; * operands 2 to end are one of:
+;;   - restoring a register from a memory location that's an offset of RSI.
+;;   - clobbering a reg
+;;   - adjusting SP
+(define_predicate "restore_multiple"
+  (match_code "parallel")
+{
+  const unsigned nregs = XVECLEN (op, 0);
+  rtx head = XVECEXP (op, 0, 0);
+  unsigned i;
+
+  switch (GET_CODE (head))
+    {
+      case RETURN:
+	i = 3;
+	break;
+
+      case USE:
+      {
+	rtx op0 = XEXP (head, 0);
+
+	if (op0 == NULL_RTX || GET_CODE (op0) != SYMBOL_REF)
+	  return false;
+
+	i = 1;
+	break;
+      }
+
+      default:
+	return false;
+    }
+
+  if (nregs < i + 12)
+    return false;
+
+  for (; i < nregs; i++)
+    {
+      rtx e, src, dest;
+
+      e = XVECEXP (op, 0, i);
+
+      switch (GET_CODE (e))
+	{
+	  case CLOBBER:
+	    continue;
+
+	  case SET:
+	    src  = SET_SRC (e);
+	    dest = SET_DEST (e);
+
+	    /* Restoring a register from memory.  */
+	    if (GET_CODE (src) == MEM && GET_CODE (dest) == REG)
+	      {
+		rtx addr = XEXP (src, 0);
+
+		/* Good if src address is in RSI.  */
+		if (GET_CODE (addr) == REG
+		    && REGNO (addr) == SI_REG)
+		  continue;
+
+		/* Good if src address is offset of RSI.  */
+		if (GET_CODE (addr) == PLUS
+		    && GET_CODE (XEXP (addr, 0)) == REG
+		    && REGNO (XEXP (addr, 0)) == SI_REG)
+		  continue;
+	      }
+
+	    /* Good if adjusting the stack pointer.  */
+	    if (GET_CODE (dest) == REG
+		&& REGNO (dest) == SP_REG
+		&& GET_CODE (src) == PLUS
+		&& GET_CODE (XEXP (src, 0)) == REG
+		&& REGNO (XEXP (src, 0)) == SP_REG)
+	      continue;
+	    break;
+
+	  default:
+	    break;
+	}
+	return false;
+    }
+  return true;
+})
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 14fcd67..b9dac15 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -19397,3 +19397,59 @@
   [(set_attr "type" "sselog")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
+
+;; Save multiple registers out-of-line
+(define_insn "save_multiple<mode>"
+  [(match_parallel 0 "save_multiple"
+    [(use (match_operand:P 1 "symbol_operand"))
+     (const_int 0)
+    ])]
+  "TARGET_SSE && TARGET_64BIT"
+  "call\t%P1")
+
+;; Save multiple registers out-of-line after realignment
+(define_insn "save_multiple_realign<mode>"
+  [(match_parallel 0 "save_multiple"
+    [(use (match_operand:P 1 "symbol_operand"))
+     (set (reg:P SP_REG) (plus:P (reg:P AX_REG)
+	  (match_operand:DI 2 "const_int_operand")))
+    ])]
+  "TARGET_SSE && TARGET_64BIT"
+  "leaq\t%c2(%%rax),%%rsp;\n\tcall\t%P1")
+
+;; Save multiple registers out-of-line after realignment
+(define_insn "save_multiple_realign_enter<mode>"
+  [(match_parallel 0 "save_multiple"
+    [(use (match_operand:P 1 "symbol_operand"))
+     (const_int 1)
+    ])]
+  "TARGET_SSE && TARGET_64BIT"
+  "call\t%P1")
+
+;; Restore multiple registers out-of-line
+(define_insn "restore_multiple<mode>"
+  [(match_parallel 0 "restore_multiple"
+    [(use (match_operand:P 1 "symbol_operand"))])]
+  "TARGET_SSE && TARGET_64BIT"
+  "call\t%P1")
+
+;; Restore multiple registers out-of-line and return
+(define_insn "restore_multiple_and_return<mode>"
+  [(match_parallel 0 "restore_multiple"
+    [(return)
+     (use (match_operand:P 1 "symbol_operand"))
+     (set (reg:DI R10_REG) (plus:DI (reg:DI SI_REG)
+	  (match_operand:DI 2 "const_int_operand")))
+    ])]
+  "TARGET_SSE && TARGET_64BIT"
+  "leaq\t%c2(%%rsi),%%r10;\n\tjmp\t%P1")
+
+;; Restore multiple registers out-of-line and return
+(define_insn "restore_multiple_leave_return<mode>"
+  [(match_parallel 0 "restore_multiple"
+    [(return)
+     (use (match_operand:P 1 "symbol_operand"))
+     (const_int 0)
+    ])]
+  "TARGET_SSE && TARGET_64BIT"
+  "jmp\t%P1")
-- 
2.9.0


* [PATCH 2/9] Minor refactor in ix86_compute_frame_layout
From: Daniel Santos @ 2016-11-15 20:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

This refactor is separated from a future patch that actually alters
ix86_compute_frame_layout.
---
 gcc/config/i386/i386.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 56cc67d..5ed8fb6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12256,6 +12256,7 @@ ix86_builtin_setjmp_frame_value (void)
 static void
 ix86_compute_frame_layout (struct ix86_frame *frame)
 {
+  struct machine_function *m = cfun->machine;
   unsigned HOST_WIDE_INT stack_alignment_needed;
   HOST_WIDE_INT offset;
   unsigned HOST_WIDE_INT preferred_alignment;
@@ -12290,19 +12291,19 @@ ix86_compute_frame_layout (struct ix86_frame *frame)
      scheduling that can be done, which means that there's very little point
      in doing anything except PUSHs.  */
   if (TARGET_SEH)
-    cfun->machine->use_fast_prologue_epilogue = false;
+    m->use_fast_prologue_epilogue = false;
 
   /* During reload iteration the amount of registers saved can change.
      Recompute the value as needed.  Do not recompute when amount of registers
      didn't change as reload does multiple calls to the function and does not
      expect the decision to change within single iteration.  */
   else if (!optimize_bb_for_size_p (ENTRY_BLOCK_PTR_FOR_FN (cfun))
-           && cfun->machine->use_fast_prologue_epilogue_nregs != frame->nregs)
+	   && m->use_fast_prologue_epilogue_nregs != frame->nregs)
     {
       int count = frame->nregs;
       struct cgraph_node *node = cgraph_node::get (current_function_decl);
 
-      cfun->machine->use_fast_prologue_epilogue_nregs = count;
+      m->use_fast_prologue_epilogue_nregs = count;
 
       /* The fast prologue uses move instead of push to save registers.  This
          is significantly longer, but also executes faster as modern hardware
@@ -12319,14 +12320,14 @@ ix86_compute_frame_layout (struct ix86_frame *frame)
       if (node->frequency < NODE_FREQUENCY_NORMAL
 	  || (flag_branch_probabilities
 	      && node->frequency < NODE_FREQUENCY_HOT))
-        cfun->machine->use_fast_prologue_epilogue = false;
+	m->use_fast_prologue_epilogue = false;
       else
-        cfun->machine->use_fast_prologue_epilogue
+	m->use_fast_prologue_epilogue
 	   = !expensive_function_p (count);
     }
 
   frame->save_regs_using_mov
-    = (TARGET_PROLOGUE_USING_MOVE && cfun->machine->use_fast_prologue_epilogue
+    = (TARGET_PROLOGUE_USING_MOVE && m->use_fast_prologue_epilogue
        /* If static stack checking is enabled and done with probes,
 	  the registers need to be saved before allocating the frame.  */
        && flag_stack_check != STATIC_BUILTIN_STACK_CHECK);
-- 
2.9.0


* [PATCH 7/9] Modify ix86_save_reg to optionally omit stub-managed registers
From: Daniel Santos @ 2016-11-15 20:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

Adds static HARD_REG_SET stub_managed_regs to track registers that will
be managed by the pro/epilogue stubs for the function.

Adds a third parameter bool ignore_outlined to ix86_save_reg to specify
whether or not the count should include registers marked in
stub_managed_regs.
---
 gcc/config/i386/i386.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f39b847..cb4e688 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12321,10 +12321,14 @@ ix86_hard_regno_scratch_ok (unsigned int regno)
 	      && df_regs_ever_live_p (regno)));
 }
 
+/* Registers whose save & restore will be managed by stubs called from
+   the pro/epilogue (initialized in ix86_compute_frame_layout).  */
+static HARD_REG_SET GTY(()) stub_managed_regs;
+
 /* Return TRUE if we need to save REGNO.  */
 
 static bool
-ix86_save_reg (unsigned int regno, bool maybe_eh_return)
+ix86_save_reg (unsigned int regno, bool maybe_eh_return, bool ignore_outlined)
 {
   /* If there are no caller-saved registers, we preserve all registers,
      except for MMX and x87 registers which aren't supported when saving
@@ -12392,6 +12396,10 @@ ix86_save_reg (unsigned int regno, bool maybe_eh_return)
 	}
     }
 
+  if (ignore_outlined && cfun->machine->outline_ms_sysv
+      && in_hard_reg_set_p (stub_managed_regs, DImode, regno))
+    return false;
+
   if (crtl->drap_reg
       && regno == REGNO (crtl->drap_reg)
       && !cfun->machine->no_drap_save_restore)
@@ -12412,7 +12420,7 @@ ix86_nsaved_regs (void)
   int regno;
 
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true))
+    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true, false))
       nregs ++;
   return nregs;
 }
@@ -12428,7 +12436,7 @@ ix86_nsaved_sseregs (void)
   if (!TARGET_64BIT_MS_ABI)
     return 0;
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (SSE_REGNO_P (regno) && ix86_save_reg (regno, true))
+    if (SSE_REGNO_P (regno) && ix86_save_reg (regno, true, false))
       nregs ++;
   return nregs;
 }
@@ -12508,6 +12516,7 @@ ix86_compute_frame_layout (struct ix86_frame *frame)
 
   frame->nregs = ix86_nsaved_regs ();
   frame->nsseregs = ix86_nsaved_sseregs ();
+  CLEAR_HARD_REG_SET (stub_managed_regs);
 
   /* 64-bit MS ABI seem to require stack alignment to be always 16,
      except for function prologues, leaf functions and when the defult
@@ -12819,7 +12828,7 @@ ix86_emit_save_regs (void)
   rtx_insn *insn;
 
   for (regno = FIRST_PSEUDO_REGISTER - 1; regno-- > 0; )
-    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true))
+    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true, true))
       {
 	insn = emit_insn (gen_push (gen_rtx_REG (word_mode, regno)));
 	RTX_FRAME_RELATED_P (insn) = 1;
@@ -12901,7 +12910,7 @@ ix86_emit_save_regs_using_mov (HOST_WIDE_INT cfa_offset)
   unsigned int regno;
 
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true))
+    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true, true))
       {
         ix86_emit_save_reg_using_mov (word_mode, regno, cfa_offset);
 	cfa_offset -= UNITS_PER_WORD;
@@ -12916,7 +12925,7 @@ ix86_emit_save_sse_regs_using_mov (HOST_WIDE_INT cfa_offset)
   unsigned int regno;
 
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (SSE_REGNO_P (regno) && ix86_save_reg (regno, true))
+    if (SSE_REGNO_P (regno) && ix86_save_reg (regno, true, true))
       {
 	ix86_emit_save_reg_using_mov (V4SFmode, regno, cfa_offset);
 	cfa_offset -= GET_MODE_SIZE (V4SFmode);
@@ -13296,13 +13305,13 @@ get_scratch_register_on_entry (struct scratch_reg *sr)
 	       && !static_chain_p
 	       && drap_regno != CX_REG)
 	regno = CX_REG;
-      else if (ix86_save_reg (BX_REG, true))
+      else if (ix86_save_reg (BX_REG, true, false))
 	regno = BX_REG;
       /* esi is the static chain register.  */
       else if (!(regparm == 3 && static_chain_p)
-	       && ix86_save_reg (SI_REG, true))
+	       && ix86_save_reg (SI_REG, true, false))
 	regno = SI_REG;
-      else if (ix86_save_reg (DI_REG, true))
+      else if (ix86_save_reg (DI_REG, true, false))
 	regno = DI_REG;
       else
 	{
@@ -14403,7 +14412,7 @@ ix86_emit_restore_regs_using_mov (HOST_WIDE_INT cfa_offset,
   unsigned int regno;
 
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, maybe_eh_return))
+    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, maybe_eh_return, true))
       {
 	rtx reg = gen_rtx_REG (word_mode, regno);
 	rtx mem;
@@ -14442,7 +14451,7 @@ ix86_emit_restore_sse_regs_using_mov (HOST_WIDE_INT cfa_offset,
   unsigned int regno;
 
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (SSE_REGNO_P (regno) && ix86_save_reg (regno, maybe_eh_return))
+    if (SSE_REGNO_P (regno) && ix86_save_reg (regno, maybe_eh_return, true))
       {
 	rtx reg = gen_rtx_REG (V4SFmode, regno);
 	rtx mem;
-- 
2.9.0


* [PATCH 4/9] Add struct fields and option for foutline-msabi-xlouges
From: Daniel Santos @ 2016-11-15 20:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

Adds foutline-msabi-xlogues to common.opt and various fields to the
structs machine_function and ix86_frame.
---
 gcc/common.opt         |  7 +++++++
 gcc/config/i386/i386.c | 35 ++++++++++++++++++++++++++++++-----
 gcc/config/i386/i386.h | 18 ++++++++++++++++++
 3 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 5e8d72d..e9570b0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3075,4 +3075,11 @@ fipa-ra
 Common Report Var(flag_ipa_ra) Optimization
 Use caller save register across calls if possible.
 
+foutline-msabi-xlogues
+Common Report Var(flag_outline_msabi_xlogues) Optimization
+Outline pro/epilogues to save/restore registers clobbered by calling
+sysv_abi functions from within a 64-bit ms_abi function.  This reduces
+.text size at the expense of a few more instructions being executed
+per function.
+
 ; This comment is to ensure we retain the blank line above.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 5ed8fb6..4cc3c8f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2449,13 +2449,37 @@ struct GTY(()) stack_local_entry {
 
    saved frame pointer			if frame_pointer_needed
 					<- HARD_FRAME_POINTER
-   [saved regs]
-					<- regs_save_offset
-   [padding0]
+   [Normal case:
 
-   [saved SSE regs]
+     [saved regs]
+					<- regs_save_offset
+     [padding0]
+
+     [saved SSE regs]
+
+   ][ms x64 --> sysv with -foutline-msabi-xlogues:
+     [padding0]
+			<- Start of out-of-line, stub-saved/restored regs
+			   (see libgcc/config/i386/msabi.S)
+     [XMM6-15]
+     [RSI]
+     [RDI]
+     [?RBX]		only if RBX is clobbered
+     [?RBP]		only if RBP and RBX are clobbered
+     [?R12]		only if R12 and all previous regs are clobbered
+     [?R13]		only if R13 and all previous regs are clobbered
+     [?R14]		only if R14 and all previous regs are clobbered
+     [?R15]		only if R15 and all previous regs are clobbered
+			<- end of stub-saved/restored regs
+     [padding1]
+			<- outlined_save_offset
+     [saved regs]	Any remaining regs are saved in-line
+			<- regs_save_offset
+     [saved SSE regs]	not yet verified, but I *think* that there should be no
+			other SSE regs to save here.
+   ]
 					<- sse_regs_save_offset
-   [padding1]          |
+   [padding2]
 		       |		<- FRAME_POINTER
    [va_arg registers]  |
 		       |
@@ -2477,6 +2501,7 @@ struct ix86_frame
   HOST_WIDE_INT hard_frame_pointer_offset;
   HOST_WIDE_INT stack_pointer_offset;
   HOST_WIDE_INT hfp_save_offset;
+  HOST_WIDE_INT outlined_save_offset;
   HOST_WIDE_INT reg_save_offset;
   HOST_WIDE_INT sse_reg_save_offset;
 
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index a45b66a..e6b79df 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2575,6 +2575,24 @@ struct GTY(()) machine_function {
      pass arguments and can be used for indirect sibcall.  */
   BOOL_BITFIELD arg_reg_available : 1;
 
+  /* If true, we're out-of-lining reg save/restore for regs clobbered
+     by ms_abi functions calling a sysv function.  */
+  BOOL_BITFIELD outline_ms_sysv : 1;
+
+  /* If true, the incoming 16-byte aligned stack has an offset (of 8) and
+     needs padding.  */
+  BOOL_BITFIELD outline_ms_sysv_pad_in : 1;
+
+  /* If true, the size of the stub save area plus inline int reg saves will
+     result in an 8 byte offset, so needs padding.  */
+  BOOL_BITFIELD outline_ms_sysv_pad_out : 1;
+
+  /* This is the number of extra registers saved by stub (valid range is
+     0-6). Each additional register is only saved/restored by the stubs
+     if all successive ones are. (Will always be zero when using a hard
+     frame pointer.) */
+  unsigned int outline_ms_sysv_extra_regs:3;
+
   /* During prologue/epilogue generation, the current frame state.
      Otherwise, the frame state at the end of the prologue.  */
   struct machine_frame_state fs;
-- 
2.9.0


* [PATCH 3/9] Add msabi pro/epilogue stubs to libgcc
From: Daniel Santos @ 2016-11-15 20:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

Adds libgcc/config/i386/i386-asm.h to manage common cpp and gas macros.
Stubs use the following naming convention:

  (sav|res)ms64[f][x]

    sav|res     Save or restore
    ms64        Avoid possible name collisions with future stubs
                (specific to the 64-bit msabi --> sysv scenario)
    [f]         Variant for hard frame pointer (and stack realignment)
    [x]         Tail-call variant (is the return from the function)
---
 libgcc/config.host             |  2 +-
 libgcc/config/i386/i386-asm.h  | 82 ++++++++++++++++++++++++++++++++++++++++++
 libgcc/config/i386/resms64.S   | 63 ++++++++++++++++++++++++++++++++
 libgcc/config/i386/resms64f.S  | 59 ++++++++++++++++++++++++++++++
 libgcc/config/i386/resms64fx.S | 61 +++++++++++++++++++++++++++++++
 libgcc/config/i386/resms64x.S  | 65 +++++++++++++++++++++++++++++++++
 libgcc/config/i386/savms64.S   | 63 ++++++++++++++++++++++++++++++++
 libgcc/config/i386/savms64f.S  | 64 +++++++++++++++++++++++++++++++++
 libgcc/config/i386/t-msabi     |  7 ++++
 9 files changed, 465 insertions(+), 1 deletion(-)
 create mode 100644 libgcc/config/i386/i386-asm.h
 create mode 100644 libgcc/config/i386/resms64.S
 create mode 100644 libgcc/config/i386/resms64f.S
 create mode 100644 libgcc/config/i386/resms64fx.S
 create mode 100644 libgcc/config/i386/resms64x.S
 create mode 100644 libgcc/config/i386/savms64.S
 create mode 100644 libgcc/config/i386/savms64f.S
 create mode 100644 libgcc/config/i386/t-msabi

diff --git a/libgcc/config.host b/libgcc/config.host
index 64beb21..07bb269 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1335,7 +1335,7 @@ case ${host} in
 i[34567]86-*-linux* | x86_64-*-linux* | \
   i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
   i[34567]86-*-gnu*)
-	tmake_file="${tmake_file} t-tls i386/t-linux t-slibgcc-libgcc"
+	tmake_file="${tmake_file} t-tls i386/t-linux i386/t-msabi t-slibgcc-libgcc"
 	if test "$libgcc_cv_cfi" = "yes"; then
 		tmake_file="${tmake_file} t-stack i386/t-stack-i386"
 	fi
diff --git a/libgcc/config/i386/i386-asm.h b/libgcc/config/i386/i386-asm.h
new file mode 100644
index 0000000..73acf5c
--- /dev/null
+++ b/libgcc/config/i386/i386-asm.h
@@ -0,0 +1,82 @@
+/* Defines common preprocessor and assembly macros for use by various stubs.
+ *
+ *   Copyright (C) 2016 Free Software Foundation, Inc.
+ *   Written By Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ *
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef I386_ASM_H
+#define I386_ASM_H
+
+#ifdef __ELF__
+# define ELFFN(fn) .type fn,@function
+#else
+# define ELFFN(fn)
+#endif
+
+#define FUNC_START(fn)	\
+	.global fn;	\
+	ELFFN (fn);	\
+fn:
+
+#define HIDDEN_FUNC(fn)\
+	FUNC_START (fn)	\
+	.hidden fn;	\
+
+#define FUNC_END(fn) .size fn,.-fn
+
+#ifdef __SSE2__
+# ifdef __AVX__
+#  define MOVAPS vmovaps
+# else
+#  define MOVAPS movaps
+# endif
+
+/* Save SSE registers 6-15.  off is the offset from %rax to the %xmm6 slot.  */
+.macro SSE_SAVE off=0
+	MOVAPS %xmm15,(\off - 0x90)(%rax)
+	MOVAPS %xmm14,(\off - 0x80)(%rax)
+	MOVAPS %xmm13,(\off - 0x70)(%rax)
+	MOVAPS %xmm12,(\off - 0x60)(%rax)
+	MOVAPS %xmm11,(\off - 0x50)(%rax)
+	MOVAPS %xmm10,(\off - 0x40)(%rax)
+	MOVAPS %xmm9, (\off - 0x30)(%rax)
+	MOVAPS %xmm8, (\off - 0x20)(%rax)
+	MOVAPS %xmm7, (\off - 0x10)(%rax)
+	MOVAPS %xmm6, \off(%rax)
+.endm
+
+/* Restore SSE registers 6-15.  off is the offset from %rsi to the %xmm6 slot.  */
+.macro SSE_RESTORE off=0
+	MOVAPS (\off - 0x90)(%rsi), %xmm15
+	MOVAPS (\off - 0x80)(%rsi), %xmm14
+	MOVAPS (\off - 0x70)(%rsi), %xmm13
+	MOVAPS (\off - 0x60)(%rsi), %xmm12
+	MOVAPS (\off - 0x50)(%rsi), %xmm11
+	MOVAPS (\off - 0x40)(%rsi), %xmm10
+	MOVAPS (\off - 0x30)(%rsi), %xmm9
+	MOVAPS (\off - 0x20)(%rsi), %xmm8
+	MOVAPS (\off - 0x10)(%rsi), %xmm7
+	MOVAPS \off(%rsi), %xmm6
+.endm
+
+#endif /* __SSE2__ */
+#endif /* I386_ASM_H */
diff --git a/libgcc/config/i386/resms64.S b/libgcc/config/i386/resms64.S
new file mode 100644
index 0000000..57065ba
--- /dev/null
+++ b/libgcc/config/i386/resms64.S
@@ -0,0 +1,63 @@
+/* Epilogue stub for 64-bit ms/sysv clobbers: restore
+ *
+ *   Copyright (C) 2016 Free Software Foundation, Inc.
+ *   Written By Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ *
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef __x86_64__
+#include "i386-asm.h"
+
+/* Epilogue routine for restoring 64-bit ms/sysv registers.
+ *
+ * typical use:
+ * lea		xxx(%rsp), %rsi		# xxx = SP adjustment to point to -0x70
+ * 					# offset for data
+ * callq	__resms64_<nregs>
+ * subq		$xxx,%rsp		# xxx = SP adjustment to restore stack
+ */
+	.text
+HIDDEN_FUNC(__resms64_18)
+	mov	-0x70(%rsi),%r15
+HIDDEN_FUNC(__resms64_17)
+	mov	-0x68(%rsi),%r14
+HIDDEN_FUNC(__resms64_16)
+	mov	-0x60(%rsi),%r13
+HIDDEN_FUNC(__resms64_15)
+	mov	-0x58(%rsi),%r12
+HIDDEN_FUNC(__resms64_14)
+	mov	-0x50(%rsi),%rbp
+HIDDEN_FUNC(__resms64_13)
+	mov	-0x48(%rsi),%rbx
+HIDDEN_FUNC(__resms64_12)
+	mov	-0x40(%rsi),%rdi
+	SSE_RESTORE off=0x60
+	mov	-0x38(%rsi),%rsi
+	ret
+FUNC_END(__resms64_12)
+FUNC_END(__resms64_13)
+FUNC_END(__resms64_14)
+FUNC_END(__resms64_15)
+FUNC_END(__resms64_16)
+FUNC_END(__resms64_17)
+FUNC_END(__resms64_18)
+
+#endif /* __x86_64__ */
diff --git a/libgcc/config/i386/resms64f.S b/libgcc/config/i386/resms64f.S
new file mode 100644
index 0000000..7317906
--- /dev/null
+++ b/libgcc/config/i386/resms64f.S
@@ -0,0 +1,59 @@
+/* Epilogue stub for 64-bit ms/sysv clobbers: restore (with hard frame pointer)
+ *
+ *   Copyright (C) 2016 Free Software Foundation, Inc.
+ *   Written By Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ *
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef __x86_64__
+#include "i386-asm.h"
+
+/* Epilogue routine for restoring 64-bit ms/sysv registers when hard frame
+ * pointer is used.
+ *
+ * TODO: Will this routine ever be used? Hard frame pointers disable sibling
+ * calls, in which case the epilogue will use the "x" (returning from fn)
+ * version of the stub.
+ */
+	.text
+HIDDEN_FUNC(__resms64f_17)
+	mov	-0x68(%rsi),%r15
+HIDDEN_FUNC(__resms64f_16)
+	mov	-0x60(%rsi),%r14
+HIDDEN_FUNC(__resms64f_15)
+	mov	-0x58(%rsi),%r13
+HIDDEN_FUNC(__resms64f_14)
+	mov	-0x50(%rsi),%r12
+HIDDEN_FUNC(__resms64f_13)
+	mov	-0x48(%rsi),%rbx
+HIDDEN_FUNC(__resms64f_12)
+	mov	-0x40(%rsi),%rdi
+	SSE_RESTORE off=0x60
+	mov	-0x38(%rsi),%rsi
+	ret
+FUNC_END(__resms64f_12)
+FUNC_END(__resms64f_13)
+FUNC_END(__resms64f_14)
+FUNC_END(__resms64f_15)
+FUNC_END(__resms64f_16)
+FUNC_END(__resms64f_17)
+
+#endif /* __x86_64__ */
diff --git a/libgcc/config/i386/resms64fx.S b/libgcc/config/i386/resms64fx.S
new file mode 100644
index 0000000..18a43b1
--- /dev/null
+++ b/libgcc/config/i386/resms64fx.S
@@ -0,0 +1,61 @@
+/* Epilogue stub for 64-bit ms/sysv clobbers: restore, leave and return
+ *
+ *   Copyright (C) 2016 Free Software Foundation, Inc.
+ *   Written By Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ *
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef __x86_64__
+#include "i386-asm.h"
+
+/* Epilogue routine for 64-bit ms/sysv registers when the hard frame pointer
+ * is used -- restores the registers, restores the frame pointer, and returns
+ * from the function.
+ *
+ * typical use:
+ * lea    xxx(%rsp),%rsi
+ * jmp    __resms64fx_<nregs>
+ */
+	.text
+HIDDEN_FUNC(__resms64fx_17)
+	mov	-0x68(%rsi),%r15
+HIDDEN_FUNC(__resms64fx_16)
+	mov	-0x60(%rsi),%r14
+HIDDEN_FUNC(__resms64fx_15)
+	mov	-0x58(%rsi),%r13
+HIDDEN_FUNC(__resms64fx_14)
+	mov	-0x50(%rsi),%r12
+HIDDEN_FUNC(__resms64fx_13)
+	mov	-0x48(%rsi),%rbx
+HIDDEN_FUNC(__resms64fx_12)
+	mov	-0x40(%rsi),%rdi
+	SSE_RESTORE off=0x60
+	mov	-0x38(%rsi),%rsi
+	leaveq
+	ret
+FUNC_END(__resms64fx_12)
+FUNC_END(__resms64fx_13)
+FUNC_END(__resms64fx_14)
+FUNC_END(__resms64fx_15)
+FUNC_END(__resms64fx_16)
+FUNC_END(__resms64fx_17)
+
+#endif /* __x86_64__ */
diff --git a/libgcc/config/i386/resms64x.S b/libgcc/config/i386/resms64x.S
new file mode 100644
index 0000000..bec02f0
--- /dev/null
+++ b/libgcc/config/i386/resms64x.S
@@ -0,0 +1,65 @@
+/* Epilogue stub for 64-bit ms/sysv clobbers: restore and return
+ *
+ *   Copyright (C) 2016 Free Software Foundation, Inc.
+ *   Written By Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ *
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef __x86_64__
+#include "i386-asm.h"
+
+/* Epilogue routine for restoring 64-bit ms/sysv registers and returning from
+ * function.
+ *
+ * typical use:
+ * lea    xxx(%rsp), %rsi	# xxx = SP adjustment to point to -0x70 offset
+ * 				# for data
+ * lea    xxx(%rsp), %r10	# xxx = SP adjustment to restore stack
+ * jmp    __resms64x_<nregs>
+ */
+	.text
+HIDDEN_FUNC(__resms64x_18)
+	mov	-0x70(%rsi),%r15
+HIDDEN_FUNC(__resms64x_17)
+	mov	-0x68(%rsi),%r14
+HIDDEN_FUNC(__resms64x_16)
+	mov	-0x60(%rsi),%r13
+HIDDEN_FUNC(__resms64x_15)
+	mov	-0x58(%rsi),%r12
+HIDDEN_FUNC(__resms64x_14)
+	mov	-0x50(%rsi),%rbp
+HIDDEN_FUNC(__resms64x_13)
+	mov	-0x48(%rsi),%rbx
+HIDDEN_FUNC(__resms64x_12)
+	mov	-0x40(%rsi),%rdi
+	SSE_RESTORE off=0x60
+	mov	-0x38(%rsi),%rsi
+	mov	%r10,%rsp
+	ret
+FUNC_END(__resms64x_12)
+FUNC_END(__resms64x_13)
+FUNC_END(__resms64x_14)
+FUNC_END(__resms64x_15)
+FUNC_END(__resms64x_16)
+FUNC_END(__resms64x_17)
+FUNC_END(__resms64x_18)
+
+#endif /* __x86_64__ */
diff --git a/libgcc/config/i386/savms64.S b/libgcc/config/i386/savms64.S
new file mode 100644
index 0000000..18bd6f1
--- /dev/null
+++ b/libgcc/config/i386/savms64.S
@@ -0,0 +1,63 @@
+/* Prologue stub for 64-bit ms/sysv clobbers: save
+ *
+ *   Copyright (C) 2016 Free Software Foundation, Inc.
+ *   Written By Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ *
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef __x86_64__
+#include "i386-asm.h"
+
+/* Prologue routine for saving 64-bit ms/sysv registers.
+ *
+ * typical use:
+ * lea    -xxx(%rsp), %rax	# xxx is 0x70 or 0x78 (depending upon incoming
+ * 				# stack alignment offset)
+ * subq   $xxx, %rsp		# xxx is however much stack space the fn needs
+ * callq  __savms64_<nregs>
+ */
+	.text
+HIDDEN_FUNC(__savms64_18)
+	mov	%r15,-0x70(%rax)
+HIDDEN_FUNC(__savms64_17)
+	mov	%r14,-0x68(%rax)
+HIDDEN_FUNC(__savms64_16)
+	mov	%r13,-0x60(%rax)
+HIDDEN_FUNC(__savms64_15)
+	mov	%r12,-0x58(%rax)
+HIDDEN_FUNC(__savms64_14)
+	mov	%rbp,-0x50(%rax)
+HIDDEN_FUNC(__savms64_13)
+	mov	%rbx,-0x48(%rax)
+HIDDEN_FUNC(__savms64_12)
+	mov	%rdi,-0x40(%rax)
+	mov	%rsi,-0x38(%rax)
+	SSE_SAVE off=0x60
+	ret
+FUNC_END(__savms64_12)
+FUNC_END(__savms64_13)
+FUNC_END(__savms64_14)
+FUNC_END(__savms64_15)
+FUNC_END(__savms64_16)
+FUNC_END(__savms64_17)
+FUNC_END(__savms64_18)
+
+#endif /* __x86_64__ */
diff --git a/libgcc/config/i386/savms64f.S b/libgcc/config/i386/savms64f.S
new file mode 100644
index 0000000..f1e4a41
--- /dev/null
+++ b/libgcc/config/i386/savms64f.S
@@ -0,0 +1,64 @@
+/* Prologue stub for 64-bit ms/sysv clobbers: save (with hard frame pointer)
+ *
+ *   Copyright (C) 2016 Free Software Foundation, Inc.
+ *   Written By Daniel Santos <daniel.santos@pobox.com>
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 3, or (at your option) any
+ * later version.
+ *
+ * This file is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Under Section 7 of GPL version 3, you are granted additional
+ * permissions described in the GCC Runtime Library Exception, version
+ * 3.1, as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU General Public License and
+ * a copy of the GCC Runtime Library Exception along with this program;
+ * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef __x86_64__
+#include "i386-asm.h"
+
+/* Prologue routine for saving 64-bit ms/sysv registers when realignment is
+ * needed and a hard frame pointer is used.
+ *
+ * typical use:
+ * push   %rbp
+ * mov    %rsp,%rbp
+ * lea    -xxx(%rsp), %rax	# xxx is 0x70 or 0x78 (depending upon incoming
+ * 				# stack alignment offset)
+ * and    $0xfffffffffffffff0,%rax
+ * lea    -xxx(%rax),%rsp	# xxx additional stack space is needed
+ * callq  __savms64f_<nregs>
+ */
+	.text
+HIDDEN_FUNC(__savms64f_17)
+	mov	%r15,-0x68(%rax)
+HIDDEN_FUNC(__savms64f_16)
+	mov	%r14,-0x60(%rax)
+HIDDEN_FUNC(__savms64f_15)
+	mov	%r13,-0x58(%rax)
+HIDDEN_FUNC(__savms64f_14)
+	mov	%r12,-0x50(%rax)
+HIDDEN_FUNC(__savms64f_13)
+	mov	%rbx,-0x48(%rax)
+HIDDEN_FUNC(__savms64f_12)
+	mov	%rdi,-0x40(%rax)
+	mov	%rsi,-0x38(%rax)
+	SSE_SAVE off=0x60
+	ret
+FUNC_END(__savms64f_12)
+FUNC_END(__savms64f_13)
+FUNC_END(__savms64f_14)
+FUNC_END(__savms64f_15)
+FUNC_END(__savms64f_16)
+FUNC_END(__savms64f_17)
+
+#endif /* __x86_64__ */
diff --git a/libgcc/config/i386/t-msabi b/libgcc/config/i386/t-msabi
new file mode 100644
index 0000000..dbb0fa0
--- /dev/null
+++ b/libgcc/config/i386/t-msabi
@@ -0,0 +1,7 @@
+# Makefile fragment to support -foutline-msabi-xlogues
+LIB2ADD_ST += $(srcdir)/config/i386/savms64.S \
+	      $(srcdir)/config/i386/resms64.S \
+	      $(srcdir)/config/i386/resms64x.S \
+	      $(srcdir)/config/i386/savms64f.S \
+	      $(srcdir)/config/i386/resms64f.S \
+	      $(srcdir)/config/i386/resms64fx.S
-- 
2.9.0

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 6/9] Adds class xlouge_layout to i386.c
  2016-11-15 20:00 [PATCH 0/9] RFC: Add optimization -foutline-msabi-xlougues (for Wine 64) Daniel Santos
                   ` (5 preceding siblings ...)
  2016-11-15 20:03 ` [PATCH 5/9] Add patterns and predicates foutline-msabi-xlouges Daniel Santos
@ 2016-11-15 20:04 ` Daniel Santos
  2016-11-15 20:04 ` [PATCH 9/9] Add remainder of foutline-msabi-xlogues implementation Daniel Santos
  2016-11-15 20:04 ` [PATCH 8/9] Modify ix86_compute_frame_layout for foutline-msabi-xlogues Daniel Santos
  8 siblings, 0 replies; 12+ messages in thread
From: Daniel Santos @ 2016-11-15 20:04 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

This C++ class adds the basic support for -foutline-msabi-xlogues by
managing the layout (where each register is stored, based upon stack
alignment and other facets of the optimization) and providing the proper
symbol rtx for the required stub.

xlogue_layout should not be used until ix86_compute_frame_layout has been
called, as its behavior is dependent upon data in crtl and cfun->machine.
Once ix86_compute_frame_layout has been called, the static member function
xlogue_layout::get_instance can be used to retrieve the appropriate
(constant) instance of xlogue_layout.
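
As a sanity check on the offset table in the REG_ORDER comment below, the
constructor's offset assignment can be modeled with a short Python sketch
(illustrative only, not GCC code; register names stand in for the REG
macros). SSE registers occupy 16 bytes each and general-purpose registers 8,
with offsets relative to the incoming stack pointer:

```python
# Save order used by the stubs; rbp is skipped when a hard frame pointer
# is used, since the frame pointer mechanism manages it.
REG_ORDER = ["xmm15", "xmm14", "xmm13", "xmm12", "xmm11", "xmm10",
             "xmm9", "xmm8", "xmm7", "xmm6",
             "rsi", "rdi", "rbx", "rbp", "r12", "r13", "r14", "r15"]

def layout(stack_align_off_in=0, hfp=False):
    """Model of xlogue_layout's offset loop: offset of each saved register
    below the incoming stack pointer."""
    offsets = {}
    offset = stack_align_off_in
    for reg in REG_ORDER:
        if reg == "rbp" and hfp:
            continue  # managed by the hard frame pointer itself
        offset += 16 if reg.startswith("xmm") else 8
        offsets[reg] = offset
    return offsets
```

The aligned case reproduces the first column of the comment table (xmm15 at
0x10 through r15 at 0xe0), the "+ 8" case shifts everything by 8, and the
realigned case drops rbp.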
---
 gcc/config/i386/i386.c | 218 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 218 insertions(+)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 4cc3c8f..f39b847 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2429,6 +2429,224 @@ unsigned const x86_64_ms_sysv_extra_clobbered_registers[12] =
   XMM12_REG, XMM13_REG, XMM14_REG, XMM15_REG
 };
 
+enum xlogue_stub {
+  XLOGUE_STUB_SAVE,
+  XLOGUE_STUB_RESTORE,
+  XLOGUE_STUB_RESTORE_TAIL,
+  XLOGUE_STUB_SAVE_HFP,
+  XLOGUE_STUB_RESTORE_HFP,
+  XLOGUE_STUB_RESTORE_HFP_TAIL,
+
+  XLOGUE_STUB_COUNT
+};
+
+enum xlogue_stub_sets {
+  XLOGUE_SET_ALIGNED,
+  XLOGUE_SET_ALIGNED_PLUS_8,
+  XLOGUE_SET_UNALIGNED,
+
+  XLOGUE_SET_COUNT
+};
+
+/* Register save/restore layout used by the out-of-line stubs.  */
+class xlogue_layout {
+public:
+  struct reginfo {
+    unsigned regno;
+    HOST_WIDE_INT offset;	/* Offset used by stub base pointer (rax or
+				   rsi) to where each register is stored.  */
+  };
+
+  unsigned get_nregs () const			{return m_nregs;}
+  HOST_WIDE_INT get_stack_align_off_in () const	{return m_stack_align_off_in;}
+
+  const reginfo &get_reginfo (unsigned reg) const
+    {
+      gcc_assert (reg < m_nregs);
+      return m_regs[reg];
+    }
+
+  /* Returns an rtx for the stub's symbol based upon
+       1.) the specified stub (save, restore or restore_ret) and
+       2.) the value of cfun->machine->outline_ms_sysv_extra_regs and
+       3.) whether or not stack alignment is being performed.  */
+  rtx get_stub_rtx (enum xlogue_stub stub) const;
+
+  /* Returns the amount of stack space (including padding) that the stub
+     needs to store registers based upon data in the machine_function.  */
+  HOST_WIDE_INT get_stack_space_used () const
+    {
+      const struct machine_function &m = *cfun->machine;
+      unsigned last_reg = m.outline_ms_sysv_extra_regs + MIN_REGS;
+
+      gcc_assert (m.outline_ms_sysv_extra_regs <= MAX_EXTRA_REGS);
+      return m_regs[last_reg - 1].offset
+	     + (m.outline_ms_sysv_pad_out ? 8 : 0)
+	     + STUB_INDEX_OFFSET;
+    }
+
+  /* Returns the offset for the base pointer used by the stub.  */
+  HOST_WIDE_INT get_stub_ptr_offset () const
+    {
+      return STUB_INDEX_OFFSET + m_stack_align_off_in;
+    }
+
+  static const struct xlogue_layout &get_instance ();
+
+  static const HOST_WIDE_INT STUB_INDEX_OFFSET = 0x70;
+  static const unsigned MIN_REGS = 12;
+  static const unsigned MAX_REGS = 18;
+  static const unsigned MAX_EXTRA_REGS = MAX_REGS - MIN_REGS;
+  static const unsigned VARIANT_COUNT = MAX_EXTRA_REGS + 1;
+  static const unsigned STUB_NAME_MAX_LEN = 16;
+  static const char * const STUB_BASE_NAMES[XLOGUE_STUB_COUNT];
+  static const unsigned REG_ORDER[MAX_REGS];
+  static const unsigned REG_ORDER_REALIGN[MAX_REGS];
+
+private:
+  xlogue_layout ();
+  xlogue_layout (HOST_WIDE_INT stack_align_off_in, bool hfp);
+  xlogue_layout (const xlogue_layout &);
+  ~xlogue_layout ();
+
+  /* True if hard frame pointer is used.  */
+  bool m_hfp;
+
+  /* Max number of registers this layout manages.  */
+  unsigned m_nregs;
+
+  /* Incoming offset from 16-byte alignment.  */
+  HOST_WIDE_INT m_stack_align_off_in;
+  struct reginfo m_regs[MAX_REGS];
+  rtx m_syms[XLOGUE_STUB_COUNT][VARIANT_COUNT];
+  char m_stub_names[XLOGUE_STUB_COUNT][VARIANT_COUNT][STUB_NAME_MAX_LEN];
+
+  static const struct xlogue_layout GTY(()) s_instances[XLOGUE_SET_COUNT];
+};
+
+const char * const xlogue_layout::STUB_BASE_NAMES[XLOGUE_STUB_COUNT] = {
+  "savms64",
+  "resms64",
+  "resms64x",
+  "savms64f",
+  "resms64f",
+  "resms64fx"
+};
+
+const unsigned xlogue_layout::REG_ORDER[xlogue_layout::MAX_REGS] = {
+/* The below offset values are where each register is stored for the layout
+   relative to incoming stack pointer.  The value of each m_regs[].offset will
+   be relative to the incoming base pointer (rax or rsi) used by the stub.
+
+			  FP offset	FP offset
+    Register		   aligned	aligned + 8	realigned*/
+    XMM15_REG,		/* 0x10		0x18		0x10	*/
+    XMM14_REG,		/* 0x20		0x28		0x20	*/
+    XMM13_REG,		/* 0x30		0x38		0x30	*/
+    XMM12_REG,		/* 0x40		0x48		0x40	*/
+    XMM11_REG,		/* 0x50		0x58		0x50	*/
+    XMM10_REG,		/* 0x60		0x68		0x60	*/
+    XMM9_REG,		/* 0x70		0x78		0x70	*/
+    XMM8_REG,		/* 0x80		0x88		0x80	*/
+    XMM7_REG,		/* 0x90		0x98		0x90	*/
+    XMM6_REG,		/* 0xa0		0xa8		0xa0	*/
+    SI_REG,		/* 0xa8		0xb0		0xa8	*/
+    DI_REG,		/* 0xb0		0xb8		0xb0	*/
+    BX_REG,		/* 0xb8		0xc0		0xb8	*/
+    BP_REG,		/* 0xc0		0xc8		N/A	*/
+    R12_REG,		/* 0xc8		0xd0		0xc0	*/
+    R13_REG,		/* 0xd0		0xd8		0xc8	*/
+    R14_REG,		/* 0xd8		0xe0		0xd0	*/
+    R15_REG,		/* 0xe0		0xe8		0xd8	*/
+};
+
+const struct xlogue_layout GTY(())
+xlogue_layout::s_instances[XLOGUE_SET_COUNT] = {
+  xlogue_layout (0, false),
+  xlogue_layout (8, false),
+  xlogue_layout (0, true)
+};
+
+const struct xlogue_layout &xlogue_layout::get_instance ()
+{
+  enum xlogue_stub_sets stub_set;
+
+  if (crtl->stack_realign_needed)
+    stub_set = XLOGUE_SET_UNALIGNED;
+  else if (cfun->machine->outline_ms_sysv_pad_in)
+    stub_set = XLOGUE_SET_ALIGNED_PLUS_8;
+  else
+    stub_set = XLOGUE_SET_ALIGNED;
+
+  return s_instances[stub_set];
+}
+
+xlogue_layout::xlogue_layout (HOST_WIDE_INT stack_align_off_in, bool hfp)
+  : m_hfp (hfp) , m_nregs (hfp ? 17 : 18),
+    m_stack_align_off_in (stack_align_off_in)
+{
+  memset (m_regs, 0, sizeof (m_regs));
+  memset (m_syms, 0, sizeof (m_syms));
+  memset (m_stub_names, 0, sizeof (m_stub_names));
+
+  gcc_assert (!hfp || !stack_align_off_in);
+  gcc_assert (!(stack_align_off_in & (~8)));
+
+  HOST_WIDE_INT offset = stack_align_off_in;
+  unsigned i, j;
+  for (i = j = 0; i < MAX_REGS; ++i)
+    {
+      unsigned regno = REG_ORDER[i];
+
+      if (regno == BP_REG && hfp)
+	continue;
+      if (SSE_REGNO_P (regno))
+	{
+	  offset += 16;
+	  /* Verify that SSE regs are always aligned.  */
+	  gcc_assert (!((stack_align_off_in + offset) & 15));
+	}
+      else
+	offset += 8;
+
+      m_regs[j].regno    = regno;
+      m_regs[j++].offset = offset - STUB_INDEX_OFFSET;
+    }
+    gcc_assert (j == m_nregs);
+}
+
+xlogue_layout::~xlogue_layout ()
+{
+}
+
+rtx xlogue_layout::get_stub_rtx (enum xlogue_stub stub) const
+{
+  const unsigned n_extra_regs = cfun->machine->outline_ms_sysv_extra_regs;
+  gcc_assert (n_extra_regs <= MAX_EXTRA_REGS);
+  gcc_assert (stub < XLOGUE_STUB_COUNT);
+
+  /* FIXME: For some reason, cached symbols go bad, so disable it for now.
+     Should we just remove the rtx cache or do we need to reset it at some
+     point? */
+  if (true || !m_syms[stub][n_extra_regs])
+    {
+      xlogue_layout *writey_this = const_cast<xlogue_layout*>(this);
+      char *stub_name = writey_this->m_stub_names[stub][n_extra_regs];
+      rtx sym;
+      int res;
+
+      res = snprintf (stub_name, STUB_NAME_MAX_LEN - 1, "__%s_%u",
+		      STUB_BASE_NAMES[stub], 12 + n_extra_regs);
+      gcc_assert (res <= (int)STUB_NAME_MAX_LEN);
+
+      sym = gen_rtx_SYMBOL_REF (Pmode, stub_name);
+      writey_this->m_syms[stub][n_extra_regs] = sym;
+    }
+
+    gcc_assert (m_syms[stub][n_extra_regs]);
+    return m_syms[stub][n_extra_regs];
+}
+
 /* Define the structure for the machine field in struct function.  */
 
 struct GTY(()) stack_local_entry {
-- 
2.9.0


* [PATCH 9/9] Add remainder of foutline-msabi-xlogues implementation
  2016-11-15 20:00 [PATCH 0/9] RFC: Add optimization -foutline-msabi-xlougues (for Wine 64) Daniel Santos
                   ` (6 preceding siblings ...)
  2016-11-15 20:04 ` [PATCH 6/9] Adds class xlouge_layout to i386.c Daniel Santos
@ 2016-11-15 20:04 ` Daniel Santos
  2016-11-15 20:04 ` [PATCH 8/9] Modify ix86_compute_frame_layout for foutline-msabi-xlogues Daniel Santos
  8 siblings, 0 replies; 12+ messages in thread
From: Daniel Santos @ 2016-11-15 20:04 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

Adds functions emit_msabi_outlined_save and emit_msabi_outlined_restore,
which are called from ix86_expand_prologue and ix86_expand_epilogue,
respectively.
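
emit_msabi_outlined_save allocates the stack space reported by
xlogue_layout::get_stack_space_used (patch 6). That quantity can be modeled
with a short Python sketch (illustrative only, not GCC code, and a
simplification that ignores the hard-frame-pointer case, where rbp is not
stub-managed): ten SSE registers at 16 bytes each, then rsi and rdi plus up
to six extra general-purpose registers at 8 bytes each, plus any padding:

```python
def stack_space_used(extra_regs, pad_out=False, align_in=0):
    """Model of get_stack_space_used: bytes of stack (including padding)
    the stub needs to store registers."""
    assert 0 <= extra_regs <= 6  # MAX_EXTRA_REGS
    # 10 SSE saves of 16 bytes, then (2 + extra_regs) GP saves of 8 bytes.
    last_off = align_in + 10 * 16 + (2 + extra_regs) * 8
    return last_off + (8 if pad_out else 0)
```

The minimum case (12 registers) needs 0xb0 bytes; the full 18-register case
needs 0xe0, matching the r15 slot in patch 6's offset table.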
---
 gcc/config/i386/i386.c | 307 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 288 insertions(+), 19 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f3149ef..42ce9c1 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13900,6 +13900,114 @@ ix86_elim_entry_set_got (rtx reg)
     }
 }
 
+static rtx
+gen_frame_set (rtx reg, rtx frame_reg, int offset, bool store)
+{
+  rtx addr, mem;
+
+  if (offset)
+    addr = gen_rtx_PLUS (Pmode, frame_reg, GEN_INT (offset));
+  mem = gen_frame_mem (GET_MODE (reg), offset ? addr : frame_reg);
+  return gen_rtx_SET (store ? mem : reg, store ? reg : mem);
+}
+
+static inline rtx
+gen_frame_load (rtx reg, rtx frame_reg, int offset)
+{
+  return gen_frame_set (reg, frame_reg, offset, false);
+}
+
+static inline rtx
+gen_frame_store (rtx reg, rtx frame_reg, int offset)
+{
+  return gen_frame_set (reg, frame_reg, offset, true);
+}
+
+static void
+emit_msabi_outlined_save (const struct ix86_frame &frame)
+{
+  struct machine_function *m = cfun->machine;
+  const unsigned ncregs = NUM_X86_64_MS_CLOBBERED_REGS
+			  + m->outline_ms_sysv_extra_regs;
+  rtvec v = rtvec_alloc (ncregs - 1 + 3);
+  rtx insn, sym, tmp;
+  rtx rax = gen_rtx_REG (word_mode, AX_REG);
+  unsigned i = 0;
+  unsigned j;
+  const struct xlogue_layout &xlogue = xlogue_layout::get_instance ();
+  HOST_WIDE_INT stack_used = xlogue.get_stack_space_used ();
+  HOST_WIDE_INT stack_alloc_size = stack_used;
+  HOST_WIDE_INT rax_offset = xlogue.get_stub_ptr_offset ();
+  bool realign = crtl->stack_realign_needed;
+
+  gcc_assert (TARGET_64BIT);
+  gcc_assert (!crtl->need_drap);
+
+  /* Verify that the incoming stack 16-byte alignment offset matches the
+     layout we're using.  */
+  gcc_assert ((m->fs.sp_offset & 15) == xlogue.get_stack_align_off_in ());
+
+  tmp = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (-rax_offset));
+  insn = emit_insn (gen_rtx_SET (rax, tmp));
+
+  /* Combine as many other allocations as possible.  */
+  if (frame.nregs == 0)
+    {
+      if (frame.nsseregs != 0)
+	stack_alloc_size = frame.sse_reg_save_offset - m->fs.sp_offset;
+      else
+	stack_alloc_size = frame.reg_save_offset - m->fs.sp_offset;
+
+      gcc_assert (stack_alloc_size >= stack_used);
+    }
+
+  sym = xlogue.get_stub_rtx (realign ? XLOGUE_STUB_SAVE_HFP
+				     : XLOGUE_STUB_SAVE);
+  RTVEC_ELT (v, i++) = gen_rtx_USE (VOIDmode, sym);
+
+  /* Take care of any stack realignment here.  */
+  if (realign)
+    {
+      int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
+      rtx rax_sp_offset = GEN_INT (-(stack_alloc_size - rax_offset));
+
+      gcc_assert (align_bytes > MIN_STACK_BOUNDARY / BITS_PER_UNIT);
+
+      /* Align rax.  */
+      insn = emit_insn (ix86_gen_andsp (rax, rax, GEN_INT (-align_bytes)));
+      RTX_FRAME_RELATED_P (insn) = 1;
+
+      tmp = gen_rtx_PLUS (Pmode, rax, rax_sp_offset);
+      tmp = gen_rtx_SET (stack_pointer_rtx, tmp);
+      RTVEC_ELT (v, i++) = tmp;
+      m->fs.sp_offset += stack_alloc_size;
+    }
+  else
+    {
+      pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+				GEN_INT (-stack_alloc_size), -1,
+				m->fs.cfa_reg == stack_pointer_rtx);
+      RTVEC_ELT (v, i++) = const0_rtx;
+    }
+
+  for (j = 0; j < ncregs; ++j)
+    {
+      const xlogue_layout::reginfo &r = xlogue.get_reginfo (j);
+      rtx store;
+      rtx reg;
+
+      reg = gen_rtx_REG (SSE_REGNO_P (r.regno) ? V4SFmode : word_mode,
+			 r.regno);
+      store = gen_frame_store (reg, rax, -r.offset);
+      RTVEC_ELT (v, i++) = store;
+    }
+
+  gcc_assert (i == (unsigned)GET_NUM_ELEM (v));
+
+  insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, v));
+  RTX_FRAME_RELATED_P (insn) = true;
+}
+
 /* Expand the prologue into a bunch of separate insns.  */
 
 void
@@ -14113,6 +14221,11 @@ ix86_expand_prologue (void)
 	}
     }
 
+  /* Call to outlining stub occurs after pushing frame pointer (if it was
+     needed).  */
+  if (m->outline_ms_sysv)
+      emit_msabi_outlined_save (frame);
+
   if (!int_registers_saved)
     {
       /* If saving registers via PUSH, do so now.  */
@@ -14141,20 +14254,24 @@ ix86_expand_prologue (void)
       int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
       gcc_assert (align_bytes > MIN_STACK_BOUNDARY / BITS_PER_UNIT);
 
-      /* The computation of the size of the re-aligned stack frame means
-	 that we must allocate the size of the register save area before
-	 performing the actual alignment.  Otherwise we cannot guarantee
-	 that there's enough storage above the realignment point.  */
-      if (m->fs.sp_offset != frame.sse_reg_save_offset)
-        pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
-				   GEN_INT (m->fs.sp_offset
-					    - frame.sse_reg_save_offset),
-				   -1, false);
+      /* If using stub, stack will have already been aligned.  */
+      if (!m->outline_ms_sysv)
+	{
+	  /* The computation of the size of the re-aligned stack frame means
+	    that we must allocate the size of the register save area before
+	    performing the actual alignment.  Otherwise we cannot guarantee
+	    that there's enough storage above the realignment point.  */
+	  if (m->fs.sp_offset != frame.sse_reg_save_offset)
+	    pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+				      GEN_INT (m->fs.sp_offset
+						- frame.sse_reg_save_offset),
+				      -1, false);
 
-      /* Align the stack.  */
-      insn = emit_insn (ix86_gen_andsp (stack_pointer_rtx,
-					stack_pointer_rtx,
-					GEN_INT (-align_bytes)));
+	  /* Align the stack.  */
+	  insn = emit_insn (ix86_gen_andsp (stack_pointer_rtx,
+					    stack_pointer_rtx,
+					    GEN_INT (-align_bytes)));
+	}
 
       /* For the purposes of register save area addressing, the stack
          pointer is no longer valid.  As for the value of sp_offset,
@@ -14484,17 +14601,19 @@ ix86_emit_restore_regs_using_pop (void)
   unsigned int regno;
 
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, false))
+    if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, false, true))
       ix86_emit_restore_reg_using_pop (gen_rtx_REG (word_mode, regno));
 }
 
-/* Emit code and notes for the LEAVE instruction.  */
+/* Emit code and notes for the LEAVE instruction.  If INSN is non-null, the
+   emit is omitted and only the notes are attached.  */
 
 static void
-ix86_emit_leave (void)
+ix86_emit_leave (rtx_insn *insn)
 {
   struct machine_function *m = cfun->machine;
-  rtx_insn *insn = emit_insn (ix86_gen_leave ());
+  if (!insn)
+    insn = emit_insn (ix86_gen_leave ());
 
   ix86_add_queued_cfa_restore_notes (insn);
 
@@ -14586,6 +14705,138 @@ ix86_emit_restore_sse_regs_using_mov (HOST_WIDE_INT cfa_offset,
       }
 }
 
+static void
+emit_msabi_outlined_restore (const struct ix86_frame &frame, bool use_call,
+			     int style)
+{
+  struct machine_function *m = cfun->machine;
+  const unsigned ncregs = NUM_X86_64_MS_CLOBBERED_REGS
+			  + m->outline_ms_sysv_extra_regs;
+  rtvec v = rtvec_alloc (ncregs - 1 + (use_call ? 3 : 5));
+  rtx_insn *insn;
+  rtx sym, tmp;
+  rtx rsi = gen_rtx_REG (word_mode, SI_REG);
+  rtx note = NULL_RTX;
+  unsigned i = 0;
+  unsigned j;
+  const struct xlogue_layout &xlogue = xlogue_layout::get_instance ();
+  HOST_WIDE_INT stack_restore_offset;
+  HOST_WIDE_INT stub_ptr_offset = xlogue.get_stub_ptr_offset ();
+  HOST_WIDE_INT rsi_offset;
+  rtx rsi_frame_load = NULL_RTX;
+  HOST_WIDE_INT rsi_restore_offset = 0x7fffffff;
+  bool realign = crtl->stack_realign_needed;
+  enum xlogue_stub stub;
+
+  stack_restore_offset = m->fs.sp_offset - frame.hard_frame_pointer_offset;
+  rsi_offset = stack_restore_offset - stub_ptr_offset;
+  gcc_assert (!m->fs.fp_valid || realign);
+
+  tmp = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (rsi_offset));
+  insn = emit_insn (gen_rtx_SET (rsi, tmp));
+
+  if (realign)
+    stub = use_call ? XLOGUE_STUB_RESTORE_HFP
+		    : XLOGUE_STUB_RESTORE_HFP_TAIL;
+  else
+    stub = use_call ? XLOGUE_STUB_RESTORE
+		    : XLOGUE_STUB_RESTORE_TAIL;
+
+  sym = xlogue.get_stub_rtx (stub);
+
+  /* If:
+     + we need to pop incoming args,
+     + a sibling call will follow, or
+     + we have a hard frame pointer
+     then we want to call the epilogue stub instead of jumping to it.  */
+  if (use_call)
+    RTVEC_ELT (v, i++) = gen_rtx_USE (VOIDmode, sym);
+  else
+    {
+      RTVEC_ELT (v, i++) = ret_rtx;
+      RTVEC_ELT (v, i++) = gen_rtx_USE (VOIDmode, sym);
+      if (realign)
+	{
+	  gcc_assert (m->fs.fp_valid);
+	  gcc_assert (m->fs.cfa_reg == hard_frame_pointer_rtx);
+
+	  RTVEC_ELT (v, i++) = const0_rtx;
+	}
+      else
+	{
+	  rtx r10 = gen_rtx_REG (DImode, R10_REG);
+
+	  gcc_assert (!m->fs.fp_valid);
+	  gcc_assert (m->fs.cfa_reg == stack_pointer_rtx);
+	  gcc_assert (m->fs.sp_valid);
+
+	  tmp = GEN_INT (stub_ptr_offset);
+	  tmp = gen_rtx_PLUS (Pmode, rsi, tmp);
+	  RTVEC_ELT (v, i++) = gen_rtx_SET (r10, tmp);
+	  m->fs.sp_offset -= stack_restore_offset;
+	  note = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
+			       GEN_INT (stack_restore_offset));
+	  note = gen_rtx_SET (stack_pointer_rtx, note);
+	}
+    }
+
+  RTVEC_ELT (v, i++) = gen_rtx_CLOBBER (VOIDmode,
+					gen_rtx_REG (CCmode, FLAGS_REG));
+
+  for (j = 0; j < ncregs; ++j)
+    {
+      const xlogue_layout::reginfo &r = xlogue.get_reginfo (j);
+      enum machine_mode mode = SSE_REGNO_P (r.regno) ? V4SFmode : word_mode;
+      rtx reg, restore_note;
+
+      reg = gen_rtx_REG (mode, r.regno);
+      restore_note = gen_frame_load (reg, rsi, r.offset);
+
+      /* Save RSI frame load insn & note to add later.  */
+      if (r.regno == SI_REG)
+	{
+	  gcc_assert (!rsi_frame_load);
+	  rsi_frame_load = restore_note;
+	  rsi_restore_offset = r.offset;
+	}
+      else
+	{
+	  RTVEC_ELT (v, i++) = restore_note;
+	  ix86_add_cfa_restore_note (NULL, reg, r.offset);
+	}
+    }
+
+  /* Add RSI frame load & restore note at the end.  */
+  gcc_assert (rsi_frame_load);
+  RTVEC_ELT (v, i++) = rsi_frame_load;
+  ix86_add_cfa_restore_note (NULL, gen_rtx_REG (DImode, SI_REG),
+			     rsi_restore_offset);
+
+  gcc_assert (i == (unsigned)GET_NUM_ELEM (v));
+
+  tmp = gen_rtx_PARALLEL (VOIDmode, v);
+  if (use_call)
+    insn = emit_insn (tmp);
+  else
+    {
+      insn = emit_jump_insn (tmp);
+      JUMP_LABEL (insn) = ret_rtx;
+
+      if (realign)
+	ix86_emit_leave (insn);
+      else
+	add_reg_note (insn, REG_CFA_ADJUST_CFA, note);
+    }
+
+  RTX_FRAME_RELATED_P (insn) = true;
+  ix86_add_queued_cfa_restore_notes (insn);
+
+  if (use_call)
+    pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+			       GEN_INT (stack_restore_offset), style,
+			       m->fs.cfa_reg == stack_pointer_rtx);
+}
+
 /* Restore function stack, frame, and registers.  */
 
 void
@@ -14596,6 +14847,7 @@ ix86_expand_epilogue (int style)
   struct ix86_frame frame;
   bool restore_regs_via_mov;
   bool using_drap;
+  bool restore_stub_uses_call = false;
 
   ix86_finalize_stack_realign_flags ();
   ix86_compute_frame_layout (&frame);
@@ -14800,6 +15052,10 @@ ix86_expand_epilogue (int style)
 					      - frame.reg_save_offset),
 				     style, false);
 	}
+      /* If using an out-of-line stub and there are no int regs to restore
+	 inline then we want to let the stub handle the stack restore.  */
+      else if (m->outline_ms_sysv && !frame.nregs)
+	;
       else if (m->fs.sp_offset != frame.reg_save_offset)
 	{
 	  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
@@ -14812,6 +15068,15 @@ ix86_expand_epilogue (int style)
       ix86_emit_restore_regs_using_pop ();
     }
 
+  if (m->outline_ms_sysv)
+    {
+      int popc = crtl->args.pops_args && crtl->args.size ? crtl->args.size : 0;
+
+      restore_stub_uses_call = popc || style == 0 || (m->fs.fp_valid
+			       && !crtl->stack_realign_needed);
+      emit_msabi_outlined_restore (frame, restore_stub_uses_call, style);
+    }
+
   /* If we used a stack pointer and haven't already got rid of it,
      then do so now.  */
   if (m->fs.fp_valid)
@@ -14825,7 +15090,7 @@ ix86_expand_epilogue (int style)
       else if (TARGET_USE_LEAVE
 	       || optimize_bb_for_size_p (EXIT_BLOCK_PTR_FOR_FN (cfun))
 	       || !cfun->machine->use_fast_prologue_epilogue)
-	ix86_emit_leave ();
+	ix86_emit_leave (NULL);
       else
         {
 	  pro_epilogue_adjust_stack (stack_pointer_rtx,
@@ -14935,7 +15200,7 @@ ix86_expand_epilogue (int style)
       else
 	emit_jump_insn (gen_simple_return_pop_internal (popc));
     }
-  else
+  else if (!m->outline_ms_sysv || restore_stub_uses_call)
     emit_jump_insn (gen_simple_return_internal ());
 
   /* Restore the state back to the state from the prologue,
@@ -28586,6 +28851,10 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
 
 	  clobber_reg (&use, gen_rtx_REG (mode, regno));
 	}
+
+      /* Set here, but it may get cleared later.  */
+      if (flag_outline_msabi_xlogues)
+	cfun->machine->outline_ms_sysv = true;
     }
 
   if (vec_len > 1)
-- 
2.9.0

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 8/9] Modify ix86_compute_frame_layout for foutline-msabi-xlogues
  2016-11-15 20:00 [PATCH 0/9] RFC: Add optimization -foutline-msabi-xlougues (for Wine 64) Daniel Santos
                   ` (7 preceding siblings ...)
  2016-11-15 20:04 ` [PATCH 9/9] Add remainder of foutline-msabi-xlogues implementation Daniel Santos
@ 2016-11-15 20:04 ` Daniel Santos
  8 siblings, 0 replies; 12+ messages in thread
From: Daniel Santos @ 2016-11-15 20:04 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

ix86_compute_frame_layout will now populate the fields added to structs
machine_function and ix86_frame, and modify the frame layout specifically
to facilitate the use of the save & restore stubs.
---
 gcc/config/i386/i386.c | 117 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 116 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index cb4e688..f3149ef 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12516,6 +12516,8 @@ ix86_compute_frame_layout (struct ix86_frame *frame)
 
   frame->nregs = ix86_nsaved_regs ();
   frame->nsseregs = ix86_nsaved_sseregs ();
+  m->outline_ms_sysv_pad_in = 0;
+  m->outline_ms_sysv_pad_out = 0;
   CLEAR_HARD_REG_SET (stub_managed_regs);
 
   /* 64-bit MS ABI seem to require stack alignment to be always 16,
@@ -12531,6 +12533,61 @@ ix86_compute_frame_layout (struct ix86_frame *frame)
       crtl->stack_alignment_needed = 128;
     }
 
+  /* m->outline_ms_sysv is initially enabled in ix86_expand_call for all
+     64-bit ms_abi functions that call a sysv function.  So this is where
+     we prune away cases where we actually don't want to out-of-line the
+     pro/epilogues.  */
+  if (m->outline_ms_sysv)
+  {
+    gcc_assert (TARGET_64BIT_MS_ABI);
+    gcc_assert (flag_outline_msabi_xlogues);
+
+    /* Do we need to handle SEH and disable the optimization? */
+    gcc_assert (!TARGET_SEH);
+
+    if (!TARGET_SSE)
+      m->outline_ms_sysv = false;
+
+    /* Don't break hot-patched functions.  */
+    else if (ix86_function_ms_hook_prologue (current_function_decl))
+      m->outline_ms_sysv = false;
+
+    /* TODO: Still need to add support for hard frame pointers when stack
+       realignment is not needed.  */
+    else if (crtl->stack_realign_finalized
+	     && (frame_pointer_needed && !crtl->stack_realign_needed))
+      {
+	static bool warned = false;
+	if (!warned)
+	  {
+	    warned = true;
+	    warning (OPT_foutline_msabi_xlogues,
+		     "not currently supported with hard frame pointers when "
+		     "not realigning stack.");
+	  }
+	m->outline_ms_sysv = false;
+      }
+
+    /* TODO: Cases that have not yet been examined.  */
+    else if (crtl->calls_eh_return
+	     || crtl->need_drap
+	     || m->static_chain_on_stack
+	     || ix86_using_red_zone ()
+	     || flag_split_stack)
+      {
+	static bool warned = false;
+	if (!warned)
+	  {
+	    warned = true;
+	    warning (OPT_foutline_msabi_xlogues,
+		     "not currently supported with the following: SEH, "
+		     "DRAP, static call chains on the stack, red zones or "
+		     "split stack.");
+	  }
+	m->outline_ms_sysv = false;
+      }
+  }
+
   stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
   preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;
 
@@ -12599,6 +12656,60 @@ ix86_compute_frame_layout (struct ix86_frame *frame)
   /* The traditional frame pointer location is at the top of the frame.  */
   frame->hard_frame_pointer_offset = offset;
 
+  if (m->outline_ms_sysv)
+    {
+      unsigned i;
+      HOST_WIDE_INT offset_after_int_regs;
+
+      gcc_assert (!(offset & 7));
+
+      /* Select an appropriate layout for incoming stack offset.  */
+      m->outline_ms_sysv_pad_in = (!crtl->stack_realign_needed && (offset & 8));
+      const struct xlogue_layout &xlogue = xlogue_layout::get_instance ();
+
+      gcc_assert (frame->nregs >= 2);
+      gcc_assert (frame->nsseregs >= 10);
+
+      for (i = 0; i < xlogue.get_nregs (); ++i)
+	{
+	  unsigned regno = xlogue.get_reginfo (i).regno;
+
+	  if (ix86_save_reg (regno, false, false))
+	    {
+	      add_to_hard_reg_set (&stub_managed_regs, DImode, regno);
+	      /* For the purposes of pro/epilogue generation, we'll only count
+		 regs that aren't saved/restored by out-of-line stubs.  */
+	      if (SSE_REGNO_P (regno))
+		--frame->nsseregs;
+	      else
+		--frame->nregs;
+	    }
+	  else
+	    break;
+	}
+
+      gcc_assert (i >= xlogue_layout::MIN_REGS);
+      gcc_assert (i <= xlogue_layout::MAX_REGS);
+      gcc_assert (frame->nregs >= 0);
+      gcc_assert (frame->nsseregs >= 0);
+      m->outline_ms_sysv_extra_regs = i - xlogue_layout::MIN_REGS;
+
+      /* If, after saving any remaining int regs we need padding for
+	 16-byte alignment, we insert that padding prior to remaining int
+	 reg saves.  */
+      offset_after_int_regs = xlogue.get_stack_space_used ()
+			      + frame->nregs * UNITS_PER_WORD;
+      if (offset_after_int_regs & 8)
+	{
+	  m->outline_ms_sysv_pad_out = 1;
+	  offset_after_int_regs += UNITS_PER_WORD;
+	}
+
+      gcc_assert (!(offset_after_int_regs & 15));
+      offset += xlogue.get_stack_space_used ();
+      frame->outlined_save_offset = offset;
+    }
+
   /* Register save area */
   offset += frame->nregs * UNITS_PER_WORD;
   frame->reg_save_offset = offset;
@@ -12611,6 +12722,10 @@ ix86_compute_frame_layout (struct ix86_frame *frame)
   /* Align and set SSE register save area.  */
   if (frame->nsseregs)
     {
+      if (m->outline_ms_sysv)
+	/* If the stack is not 16-byte aligned here, it is a bug.  */
+	gcc_assert (!(offset & 15));
+
       /* The only ABI that has saved SSE registers (Win64) also has a
 	 16-byte aligned default stack, and thus we don't need to be
 	 within the re-aligned local stack frame to save them.  In case
@@ -12618,7 +12733,7 @@ ix86_compute_frame_layout (struct ix86_frame *frame)
 	 unaligned move of SSE register will be emitted, so there is
 	 no point to round up the SSE register save area outside the
 	 re-aligned local stack frame to 16 bytes.  */
-      if (ix86_incoming_stack_boundary >= 128)
+      else if (ix86_incoming_stack_boundary >= 128)
 	offset = ROUND_UP (offset, 16);
       offset += frame->nsseregs * 16;
     }
-- 
2.9.0


* Re: [PATCH 5/9] Add patterns and predicates foutline-msabi-xlouges
  2016-11-15 20:03 ` [PATCH 5/9] Add patterns and predicates foutline-msabi-xlouges Daniel Santos
@ 2016-11-15 21:06   ` Daniel Santos
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Santos @ 2016-11-15 21:06 UTC (permalink / raw)
  To: gcc-patches

On 11/15/2016 02:06 PM, Daniel Santos wrote:
> +;; Save multiple registers out-of-line after realignment
> +(define_insn "save_multiple_realign<mode>"
> +  [(match_parallel 0 "save_multiple"
> +    [(use (match_operand:P 1 "symbol_operand"))
> +     (set (reg:P SP_REG) (plus:P (reg:P AX_REG)
> +	  (match_operand:DI 2 "const_int_operand")))
> +    ])]
> +  "TARGET_SSE && TARGET_64BIT"
> +  "leaq\t%c2(%%rax),%%rsp;\n\tcall\t%P1")

This pattern was included by mistake (it's incorrect and improperly
documented). It is meant to be the pattern that manages the enter and
realignment in the special optimization case where all 17 registers are
clobbered, since the enter, stack realignment and allocation can be done
in savms64f.S just prior to the symbol __savms64f_17. Please ignore it
for now.

Daniel


* [PATCH 1/9] Change type of x86_64_ms_sysv_extra_clobbered_registers
  2016-11-23  5:11 [PATCH v2 0/9] Add optimization -moutline-msabi-xlougues (for Wine 64) Daniel Santos
@ 2016-11-23  5:16 ` Daniel Santos
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Santos @ 2016-11-23  5:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: Daniel Santos

This will need to be unsigned for a subsequent patch. Also adds the
constant NUM_X86_64_MS_CLOBBERED_REGS for brevity.
---
 gcc/config/i386/i386.c | 8 +++-----
 gcc/config/i386/i386.h | 4 +++-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a5c4ba7..56cc67d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2421,7 +2421,7 @@ static int const x86_64_int_return_registers[4] =
 
 /* Additional registers that are clobbered by SYSV calls.  */
 
-int const x86_64_ms_sysv_extra_clobbered_registers[12] =
+unsigned const x86_64_ms_sysv_extra_clobbered_registers[12] =
 {
   SI_REG, DI_REG,
   XMM6_REG, XMM7_REG,
@@ -28209,11 +28209,9 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
   else if (TARGET_64BIT_MS_ABI
 	   && (!callarg2 || INTVAL (callarg2) != -2))
     {
-      int const cregs_size
-	= ARRAY_SIZE (x86_64_ms_sysv_extra_clobbered_registers);
-      int i;
+      unsigned i;
 
-      for (i = 0; i < cregs_size; i++)
+      for (i = 0; i < NUM_X86_64_MS_CLOBBERED_REGS; i++)
 	{
 	  int regno = x86_64_ms_sysv_extra_clobbered_registers[i];
 	  machine_mode mode = SSE_REGNO_P (regno) ? TImode : DImode;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index add7a64..a45b66a 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2172,7 +2172,9 @@ extern int const dbx_register_map[FIRST_PSEUDO_REGISTER];
 extern int const dbx64_register_map[FIRST_PSEUDO_REGISTER];
 extern int const svr4_dbx_register_map[FIRST_PSEUDO_REGISTER];
 
-extern int const x86_64_ms_sysv_extra_clobbered_registers[12];
+extern unsigned const x86_64_ms_sysv_extra_clobbered_registers[12];
+#define NUM_X86_64_MS_CLOBBERED_REGS \
+  (ARRAY_SIZE (x86_64_ms_sysv_extra_clobbered_registers))
 
 /* Before the prologue, RA is at 0(%esp).  */
 #define INCOMING_RETURN_ADDR_RTX \
-- 
2.9.0



Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-15 20:00 [PATCH 0/9] RFC: Add optimization -foutline-msabi-xlougues (for Wine 64) Daniel Santos
2016-11-15 20:03 ` [PATCH 3/9] Add msabi pro/epilogue stubs to libgcc Daniel Santos
2016-11-15 20:03 ` [PATCH 7/9] Modify ix86_save_reg to optionally omit stub-managed registers Daniel Santos
2016-11-15 20:03 ` [PATCH 1/9] Change type of x86_64_ms_sysv_extra_clobbered_registers Daniel Santos
2016-11-15 20:03 ` [PATCH 2/9] Minor refactor in ix86_compute_frame_layout Daniel Santos
2016-11-15 20:03 ` [PATCH 4/9] Add struct fields and option for foutline-msabi-xlouges Daniel Santos
2016-11-15 20:03 ` [PATCH 5/9] Add patterns and predicates foutline-msabi-xlouges Daniel Santos
2016-11-15 21:06   ` Daniel Santos
2016-11-15 20:04 ` [PATCH 6/9] Adds class xlouge_layout to i386.c Daniel Santos
2016-11-15 20:04 ` [PATCH 9/9] Add remainder of foutline-msabi-xlogues implementation Daniel Santos
2016-11-15 20:04 ` [PATCH 8/9] Modify ix86_compute_frame_layout for foutline-msabi-xlogues Daniel Santos
2016-11-23  5:11 [PATCH v2 0/9] Add optimization -moutline-msabi-xlougues (for Wine 64) Daniel Santos
2016-11-23  5:16 ` [PATCH 1/9] Change type of x86_64_ms_sysv_extra_clobbered_registers Daniel Santos
