* GO closures for powerpc linux
From: Alan Modra @ 2014-11-20 1:44 UTC
To: libffi-discuss
GO closures for powerpc linux
Plus .cfi async unwind info, rearrangement of ffi_call_LINUX64 and
ffi_call_SYSV function params to avoid register copies, and tweaks to
trampolines.
This, along with rth's followup patch, has been tested on powerpc-linux,
powerpc64-linux and powerpc64le-linux.
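For readers new to Go closures, here is a minimal C model of what the new ffi_go_closure_sysv and ffi_go_closure_linux64 entry points do. It is only a sketch and does not use libffi: model_cif, model_go_closure, adder_closure, model_go_entry and run_go_closure_model are hypothetical names, and the three-word layout (entry address, cif, fun) is an assumption inferred from the loads in the assembly below. The point it illustrates is that the closure pointer arriving in the static chain register is itself handed to the user function as user_data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ffi_cif, so the sketch builds without libffi.  */
typedef struct { int nargs; } model_cif;

typedef void (*model_fun) (model_cif *, void *, void **, void *);

/* Assumed layout, inferred from the loads in the ffi_go_closure_*
   entry points: entry address first, then cif, then fun.  */
typedef struct
{
  void *tramp;
  model_cif *cif;
  model_fun fun;
} model_go_closure;

/* A Go-style closure: the header words followed by captured data.  */
typedef struct
{
  model_go_closure head;
  int captured;
} adder_closure;

/* User function.  user_data is the closure itself, so the captured
   variable is reachable through it.  */
static void
add_captured (model_cif *cif, void *rvalue, void **avalue, void *user_data)
{
  adder_closure *self = user_data;
  (void) cif;
  *(int *) rvalue = *(int *) avalue[0] + self->captured;
}

/* C rendering of the assembly entry point: read cif and fun through the
   static chain pointer, then pass that same pointer on as user_data.  */
static void
model_go_entry (void *static_chain, void *rvalue, void **avalue)
{
  model_go_closure *c = static_chain;
  c->fun (c->cif, rvalue, avalue, static_chain);
}

/* Build a closure that captures 40 and call it with one int argument.  */
int
run_go_closure_model (int x)
{
  model_cif cif = { 1 };
  adder_closure clo = { { NULL, &cif, add_captured }, 40 };
  void *args[1] = { &x };
  int result = 0;

  model_go_entry (&clo, &result, args);
  return result;
}
```

Calling run_go_closure_model (2) returns 42 in this model. Contrast this with an ordinary ffi_closure, where user_data is a separate field stored after the trampoline rather than the closure pointer itself.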
If you're using rth's gcc fork
git://github.com/rth7680/gcc.git rth/go-closure
then you'll want to first apply the following upstream libffi commit
commit fa5f25c20f76a6ef5e950a7ccbce826672c8a620
Author: Marcus Comstedt <marcus@mc.pp.se>
Date: Sat Jan 4 19:00:08 2014 +0100
Linux/ppc64: Remove assumption on contents of r11 in closure
and this gcc patch
Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h (revision 217330)
+++ gcc/config/rs6000/linux64.h (working copy)
@@ -115,6 +115,14 @@
if (dot_symbols) \
error ("-mcall-aixdesc incompatible with -mabi=elfv2"); \
} \
+ if (DEFAULT_ABI == ABI_AIX \
+ && strcmp (lang_hooks.name, "GNU Go") == 0) \
+ { \
+ if (global_options_set.x_TARGET_POINTERS_TO_NESTED_FUNCTIONS \
+ && TARGET_POINTERS_TO_NESTED_FUNCTIONS) \
+ error ("-mpointers-to-nested-functions is incompatible with Go"); \
+ TARGET_POINTERS_TO_NESTED_FUNCTIONS = 0; \
+ } \
if (rs6000_isa_flags & OPTION_MASK_RELOCATABLE) \
{ \
rs6000_isa_flags &= ~OPTION_MASK_RELOCATABLE; \
* src/powerpc/ffitarget.h (FFI_GO_CLOSURES): Define.
* src/powerpc/ffi.c (ffi_call_int): New function with extra
closure param, and args rearranged on ffi_call_linux64 and
ffi_call_SYSV calls, extracted from ..
(ffi_call): ..here.
(ffi_call_go, ffi_prep_go_closure): New functions.
* src/powerpc/ffi_linux64.c (ffi_prep_closure_loc_linux64): Make
hidden. Only flush insn part of ELFv2 trampoline. Don't shuffle
ELFv1 trampoline.
(ffi_closure_helper_LINUX64): Replace closure param with cif, fun,
user_data params.
* src/powerpc/ffi_powerpc.h (ffi_go_closure_sysv): Declare.
(ffi_go_closure_linux64): Declare.
(ffi_call_SYSV, ffi_call_LINUX64): Update.
(ffi_prep_closure_loc_sysv, ffi_prep_closure_loc_linux64): Declare.
(ffi_closure_helper_SYSV, ffi_closure_helper_LINUX64): Update.
* src/powerpc/ffi_sysv.c (ASM_NEEDS_REGISTERS): Increase to 6.
(ffi_prep_closure_loc_sysv): Use bcl in trampoline, put data words
last, flush just the insn part.
(ffi_closure_helper_SYSV): Replace closure param with cif, fun and
user_data params.
* src/powerpc/linux64.S (ffi_call_LINUX64): Replace hand-written
.eh_frame with .cfi directives. Adjust for changed param order.
Pass extra "closure" param to user function in static chain. Add
.cfi directives to describe epilogue. Don't provide traceback
table for ELFv2 or _CALL_LINUX.
* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Replace
hand-written .eh_frame with .cfi directives. Adjust for changed
ffi_closure_helper_LINUX64 params. Add .cfi directives to
describe epilogue. Don't provide traceback table for ELFv2 or
_CALL_LINUX.
(ffi_go_closure_linux64): New function.
* src/powerpc/sysv.S: Remove redundant .globl ffi_prep_args_SYSV.
(ffi_call_SYSV): Make hidden. Replace hand-written .eh_frame with
.cfi directives. Adjust for changed params. Pass extra "closure"
param to user function in static chain. Add .cfi directives to
describe epilogue.
* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Make hidden.
Replace hand-written .eh_frame with .cfi directives. Adjust for
changed ffi_closure_helper_SYSV params. Add .cfi directives to
describe epilogue. Don't just use nops in the dead __NO_FPRS__
epilogues.
(ffi_go_closure_sysv): New function.
---
src/powerpc/ffi.c | 40 ++++++++-
src/powerpc/ffi_linux64.c | 24 ++---
src/powerpc/ffi_powerpc.h | 29 ++++--
src/powerpc/ffi_sysv.c | 33 +++----
src/powerpc/ffitarget.h | 1 +
src/powerpc/linux64.S | 73 +++++----------
src/powerpc/linux64_closure.S | 202 +++++++++++++++++++++++++++++++-----------
src/powerpc/ppc_closure.S | 165 ++++++++++++++++++----------------
src/powerpc/sysv.S | 149 +++++++++++--------------------
9 files changed, 403 insertions(+), 313 deletions(-)
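A note on the new 32-bit trampoline layout, since the offsets in the lwz instructions are easy to misread. The sketch below is a standalone check, not libffi code: the instruction words are copied from the ffi_sysv.c hunk in this patch, while d_field and tramp_word_loaded are hypothetical helper names. It relies on the fact that bcl sets LR to the address of the following instruction, so after "mflr r11" the register holds the address of trampoline word 2.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction words from the new ffi_prep_closure_loc_sysv.  */
static const uint32_t tramp_insns[8] = {
  0x7c0802a6,  /* mflr  r0 */
  0x429f0005,  /* bcl   20,31,.+4 */
  0x7d6802a6,  /* mflr  r11 */
  0x7c0803a6,  /* mtlr  r0 */
  0x800b0018,  /* lwz   r0,24(r11) */
  0x816b001c,  /* lwz   r11,28(r11) */
  0x7c0903a6,  /* mtctr r0 */
  0x4e800420   /* bctr */
};

/* Signed 16-bit displacement of a D-form load such as lwz.  */
static int
d_field (uint32_t insn)
{
  return (int16_t) (insn & 0xffff);
}

/* Index of the trampoline word read by the load at INSN_INDEX, given
   that r11 holds the address of the word right after the bcl.  */
int
tramp_word_loaded (int insn_index)
{
  int lr_byte_offset = (1 + 1) * 4;  /* bcl is word 1; LR is word 2.  */

  return (lr_byte_offset + d_field (tramp_insns[insn_index])) / 4;
}
```

With these words, tramp_word_loaded (4) is 8 and tramp_word_loaded (5) is 9, which matches the function and context pointers being stored in tramp[8] and tramp[9]. The eight instruction words occupy 8 * 4 = 32 bytes, the same length now passed to flush_icache.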
diff --git a/src/powerpc/ffi.c b/src/powerpc/ffi.c
index efb441b..7eb543e 100644
--- a/src/powerpc/ffi.c
+++ b/src/powerpc/ffi.c
@@ -70,8 +70,12 @@ ffi_prep_cif_machdep_var (ffi_cif *cif,
#endif
}
-void
-ffi_call(ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue)
+static void
+ffi_call_int (ffi_cif *cif,
+ void (*fn) (void),
+ void *rvalue,
+ void **avalue,
+ void *closure)
{
/* The final SYSV ABI says that structures smaller or equal 8 bytes
are returned in r3/r4. A draft ABI used by linux instead returns
@@ -97,9 +101,10 @@ ffi_call(ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue)
ecif.rvalue = alloca (cif->rtype->size);
#ifdef POWERPC64
- ffi_call_LINUX64 (&ecif, -(long) cif->bytes, cif->flags, ecif.rvalue, fn);
+ ffi_call_LINUX64 (&ecif, fn, ecif.rvalue, cif->flags, closure,
+ -(long) cif->bytes);
#else
- ffi_call_SYSV (&ecif, -cif->bytes, cif->flags, ecif.rvalue, fn);
+ ffi_call_SYSV (&ecif, fn, ecif.rvalue, cif->flags, closure, -cif->bytes);
#endif
/* Check for a bounce-buffered return value */
@@ -125,6 +130,18 @@ ffi_call(ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue)
}
}
+void
+ffi_call (ffi_cif *cif, void (*fn) (void), void *rvalue, void **avalue)
+{
+ ffi_call_int (cif, fn, rvalue, avalue, NULL);
+}
+
+void
+ffi_call_go (ffi_cif *cif, void (*fn) (void), void *rvalue, void **avalue,
+ void *closure)
+{
+ ffi_call_int (cif, fn, rvalue, avalue, closure);
+}
ffi_status
ffi_prep_closure_loc (ffi_closure *closure,
@@ -139,3 +156,18 @@ ffi_prep_closure_loc (ffi_closure *closure,
return ffi_prep_closure_loc_sysv (closure, cif, fun, user_data, codeloc);
#endif
}
+
+ffi_status
+ffi_prep_go_closure (ffi_go_closure *closure,
+ ffi_cif *cif,
+ void (*fun) (ffi_cif *, void *, void **, void *))
+{
+#ifdef POWERPC64
+ closure->tramp = ffi_go_closure_linux64;
+#else
+ closure->tramp = ffi_go_closure_sysv;
+#endif
+ closure->cif = cif;
+ closure->fun = fun;
+ return FFI_OK;
+}
diff --git a/src/powerpc/ffi_linux64.c b/src/powerpc/ffi_linux64.c
index b087af8..b84b91f 100644
--- a/src/powerpc/ffi_linux64.c
+++ b/src/powerpc/ffi_linux64.c
@@ -667,7 +667,8 @@ flush_icache (char *wraddr, char *xaddr, int size)
}
#endif
-ffi_status
+
+ffi_status FFI_HIDDEN
ffi_prep_closure_loc_linux64 (ffi_closure *closure,
ffi_cif *cif,
void (*fun) (ffi_cif *, void *, void **, void *),
@@ -688,17 +689,17 @@ ffi_prep_closure_loc_linux64 (ffi_closure *closure,
/* 2: .quad context */
*(void **) &tramp[4] = (void *) ffi_closure_LINUX64;
*(void **) &tramp[6] = codeloc;
- flush_icache ((char *)tramp, (char *)codeloc, FFI_TRAMPOLINE_SIZE);
+ flush_icache ((char *) tramp, (char *) codeloc, 4 * 4);
#else
void **tramp = (void **) &closure->tramp[0];
if (cif->abi < FFI_LINUX || cif->abi >= FFI_LAST_ABI)
return FFI_BAD_ABI;
- /* Copy function address and TOC from ffi_closure_LINUX64. */
- memcpy (tramp, (char *) ffi_closure_LINUX64, 16);
- tramp[2] = tramp[1];
+ /* Copy function address and TOC from ffi_closure_LINUX64 OPD. */
+ memcpy (&tramp[0], (void **) ffi_closure_LINUX64, sizeof (void *));
tramp[1] = codeloc;
+ memcpy (&tramp[2], (void **) ffi_closure_LINUX64 + 1, sizeof (void *));
#endif
closure->cif = cif;
@@ -710,8 +711,12 @@ ffi_prep_closure_loc_linux64 (ffi_closure *closure,
int FFI_HIDDEN
-ffi_closure_helper_LINUX64 (ffi_closure *closure, void *rvalue,
- unsigned long *pst, ffi_dblfl *pfr)
+ffi_closure_helper_LINUX64 (ffi_cif *cif,
+ void (*fun) (ffi_cif *, void *, void **, void *),
+ void *user_data,
+ void *rvalue,
+ unsigned long *pst,
+ ffi_dblfl *pfr)
{
/* rvalue is the pointer to space for return value in closure assembly */
/* pst is the pointer to parameter save area
@@ -721,11 +726,9 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure, void *rvalue,
void **avalue;
ffi_type **arg_types;
unsigned long i, avn, nfixedargs;
- ffi_cif *cif;
ffi_dblfl *end_pfr = pfr + NUM_FPR_ARG_REGISTERS64;
unsigned long align;
- cif = closure->cif;
avalue = alloca (cif->nargs * sizeof (void *));
/* Copy the caller's structure return value address so that the
@@ -925,8 +928,7 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure, void *rvalue,
i++;
}
-
- (closure->fun) (cif, rvalue, avalue, closure->user_data);
+ (*fun) (cif, rvalue, avalue, user_data);
/* Tell ffi_closure_LINUX64 how to perform return type promotions. */
if ((cif->flags & FLAG_RETURNS_SMST) != 0)
diff --git a/src/powerpc/ffi_powerpc.h b/src/powerpc/ffi_powerpc.h
index 2e61653..3dcd6b5 100644
--- a/src/powerpc/ffi_powerpc.h
+++ b/src/powerpc/ffi_powerpc.h
@@ -56,22 +56,39 @@ typedef union
} ffi_dblfl;
void FFI_HIDDEN ffi_closure_SYSV (void);
-void FFI_HIDDEN ffi_call_SYSV(extended_cif *, unsigned, unsigned, unsigned *,
- void (*)(void));
+void FFI_HIDDEN ffi_go_closure_sysv (void);
+void FFI_HIDDEN ffi_call_SYSV(extended_cif *, void (*)(void), void *,
+ unsigned, void *, int);
void FFI_HIDDEN ffi_prep_types_sysv (ffi_abi);
ffi_status FFI_HIDDEN ffi_prep_cif_sysv (ffi_cif *);
-int FFI_HIDDEN ffi_closure_helper_SYSV (ffi_closure *, void *, unsigned long *,
+ffi_status FFI_HIDDEN ffi_prep_closure_loc_sysv (ffi_closure *,
+ ffi_cif *,
+ void (*) (ffi_cif *, void *,
+ void **, void *),
+ void *, void *);
+int FFI_HIDDEN ffi_closure_helper_SYSV (ffi_cif *,
+ void (*) (ffi_cif *, void *,
+ void **, void *),
+ void *, void *, unsigned long *,
ffi_dblfl *, unsigned long *);
-void FFI_HIDDEN ffi_call_LINUX64(extended_cif *, unsigned long, unsigned long,
- unsigned long *, void (*)(void));
+void FFI_HIDDEN ffi_call_LINUX64(extended_cif *, void (*) (void), void *,
+ unsigned long, void *, long);
void FFI_HIDDEN ffi_closure_LINUX64 (void);
+void FFI_HIDDEN ffi_go_closure_linux64 (void);
void FFI_HIDDEN ffi_prep_types_linux64 (ffi_abi);
ffi_status FFI_HIDDEN ffi_prep_cif_linux64 (ffi_cif *);
ffi_status FFI_HIDDEN ffi_prep_cif_linux64_var (ffi_cif *, unsigned int,
unsigned int);
void FFI_HIDDEN ffi_prep_args64 (extended_cif *, unsigned long *const);
-int FFI_HIDDEN ffi_closure_helper_LINUX64 (ffi_closure *, void *,
+ffi_status FFI_HIDDEN ffi_prep_closure_loc_linux64 (ffi_closure *, ffi_cif *,
+ void (*) (ffi_cif *, void *,
+ void **, void *),
+ void *, void *);
+int FFI_HIDDEN ffi_closure_helper_LINUX64 (ffi_cif *,
+ void (*) (ffi_cif *, void *,
+ void **, void *),
+ void *, void *,
unsigned long *, ffi_dblfl *);
diff --git a/src/powerpc/ffi_sysv.c b/src/powerpc/ffi_sysv.c
index fbe85fe..646c340 100644
--- a/src/powerpc/ffi_sysv.c
+++ b/src/powerpc/ffi_sysv.c
@@ -36,7 +36,7 @@
/* About the SYSV ABI. */
-#define ASM_NEEDS_REGISTERS 4
+#define ASM_NEEDS_REGISTERS 6
#define NUM_GPR_ARG_REGISTERS 8
#define NUM_FPR_ARG_REGISTERS 8
@@ -654,18 +654,18 @@ ffi_prep_closure_loc_sysv (ffi_closure *closure,
tramp = (unsigned int *) &closure->tramp[0];
tramp[0] = 0x7c0802a6; /* mflr r0 */
- tramp[1] = 0x4800000d; /* bl 10 <trampoline_initial+0x10> */
- tramp[4] = 0x7d6802a6; /* mflr r11 */
- tramp[5] = 0x7c0803a6; /* mtlr r0 */
- tramp[6] = 0x800b0000; /* lwz r0,0(r11) */
- tramp[7] = 0x816b0004; /* lwz r11,4(r11) */
- tramp[8] = 0x7c0903a6; /* mtctr r0 */
- tramp[9] = 0x4e800420; /* bctr */
- *(void **) &tramp[2] = (void *) ffi_closure_SYSV; /* function */
- *(void **) &tramp[3] = codeloc; /* context */
+ tramp[1] = 0x429f0005; /* bcl 20,31,.+4 */
+ tramp[2] = 0x7d6802a6; /* mflr r11 */
+ tramp[3] = 0x7c0803a6; /* mtlr r0 */
+ tramp[4] = 0x800b0018; /* lwz r0,24(r11) */
+ tramp[5] = 0x816b001c; /* lwz r11,28(r11) */
+ tramp[6] = 0x7c0903a6; /* mtctr r0 */
+ tramp[7] = 0x4e800420; /* bctr */
+ *(void **) &tramp[8] = (void *) ffi_closure_SYSV; /* function */
+ *(void **) &tramp[9] = codeloc; /* context */
/* Flush the icache. */
- flush_icache ((char *)tramp, (char *)codeloc, FFI_TRAMPOLINE_SIZE);
+ flush_icache ((char *)tramp, (char *)codeloc, 8 * 4);
closure->cif = cif;
closure->fun = fun;
@@ -682,8 +682,12 @@ ffi_prep_closure_loc_sysv (ffi_closure *closure,
following helper function to do most of the work. */
int
-ffi_closure_helper_SYSV (ffi_closure *closure, void *rvalue,
- unsigned long *pgr, ffi_dblfl *pfr,
+ffi_closure_helper_SYSV (ffi_cif *cif,
+ void (*fun) (ffi_cif *, void *, void **, void *),
+ void *user_data,
+ void *rvalue,
+ unsigned long *pgr,
+ ffi_dblfl *pfr,
unsigned long *pst)
{
/* rvalue is the pointer to space for return value in closure assembly */
@@ -699,7 +703,6 @@ ffi_closure_helper_SYSV (ffi_closure *closure, void *rvalue,
#endif
long ng = 0; /* number of general registers already used */
- ffi_cif *cif = closure->cif;
unsigned size = cif->rtype->size;
unsigned short rtypenum = cif->rtype->type;
@@ -915,7 +918,7 @@ ffi_closure_helper_SYSV (ffi_closure *closure, void *rvalue,
i++;
}
- (closure->fun) (cif, rvalue, avalue, closure->user_data);
+ (*fun) (cif, rvalue, avalue, user_data);
/* Tell ffi_closure_SYSV how to perform return type promotions.
Because the FFI_SYSV ABI returns the structures <= 8 bytes in
diff --git a/src/powerpc/ffitarget.h b/src/powerpc/ffitarget.h
index 84aa586..0f66d31 100644
--- a/src/powerpc/ffitarget.h
+++ b/src/powerpc/ffitarget.h
@@ -138,6 +138,7 @@ typedef enum ffi_abi {
#define FFI_CLOSURES 1
#define FFI_NATIVE_RAW_API 0
#if defined (POWERPC) || defined (POWERPC_FREEBSD)
+# define FFI_GO_CLOSURES 1
# define FFI_TARGET_SPECIFIC_VARIADIC 1
# define FFI_EXTRA_CIF_FIELDS unsigned nfixedargs
#endif
diff --git a/src/powerpc/linux64.S b/src/powerpc/linux64.S
index d2acb70..b2ae60e 100644
--- a/src/powerpc/linux64.S
+++ b/src/powerpc/linux64.S
@@ -32,8 +32,9 @@
#ifdef POWERPC64
.hidden ffi_call_LINUX64
.globl ffi_call_LINUX64
-# if _CALL_ELF == 2
.text
+ .cfi_startproc
+# if _CALL_ELF == 2
ffi_call_LINUX64:
addis %r2, %r12, .TOC.-ffi_call_LINUX64@ha
addi %r2, %r2, .TOC.-ffi_call_LINUX64@l
@@ -57,20 +58,26 @@ ffi_call_LINUX64:
.ffi_call_LINUX64:
# endif
# endif
-.LFB1:
mflr %r0
std %r28, -32(%r1)
std %r29, -24(%r1)
std %r30, -16(%r1)
std %r31, -8(%r1)
+ std %r7, 8(%r1) /* closure, saved in cr field. */
std %r0, 16(%r1)
mr %r28, %r1 /* our AP. */
-.LCFI0:
- stdux %r1, %r1, %r4
- mr %r31, %r5 /* flags, */
- mr %r30, %r6 /* rvalue, */
- mr %r29, %r7 /* function address. */
+ .cfi_def_cfa_register 28
+ .cfi_offset 65, 16
+ .cfi_offset 31, -8
+ .cfi_offset 30, -16
+ .cfi_offset 29, -24
+ .cfi_offset 28, -32
+
+ stdux %r1, %r1, %r8
+ mr %r31, %r6 /* flags, */
+ mr %r30, %r5 /* rvalue, */
+ mr %r29, %r4 /* function address. */
/* Save toc pointer, not for the ffi_prep_args64 call, but for the later
bctrl function call. */
# if _CALL_ELF == 2
@@ -92,7 +99,6 @@ ffi_call_LINUX64:
# else
ld %r12, 0(%r29)
ld %r2, 8(%r29)
- ld %r11, 16(%r29)
# endif
/* Now do the call. */
/* Set up cr1 with bits 4-7 of the flags. */
@@ -130,6 +136,7 @@ ffi_call_LINUX64:
2:
/* Make the call. */
+ ld %r11, 8(%r28)
bctrl
/* This must follow the call immediately, the unwinder
@@ -151,6 +158,7 @@ ffi_call_LINUX64:
.Ldone_return_value:
/* Restore the registers we used and return. */
mr %r1, %r28
+ .cfi_def_cfa_register 1
ld %r0, 16(%r28)
ld %r28, -32(%r28)
mtlr %r0
@@ -160,6 +168,7 @@ ffi_call_LINUX64:
blr
.Lfp_return_value:
+ .cfi_def_cfa_register 28
bf 28, .Lfloat_return_value
stfd %f1, 0(%r30)
mtcrf 0x02, %r31 /* cr6 */
@@ -199,61 +208,19 @@ ffi_call_LINUX64:
std %r4, 8(%r30)
b .Ldone_return_value
-.LFE1:
- .long 0
- .byte 0,12,0,1,128,4,0,0
+ .cfi_endproc
# if _CALL_ELF == 2
.size ffi_call_LINUX64,.-ffi_call_LINUX64
# else
# ifdef _CALL_LINUX
.size ffi_call_LINUX64,.-.L.ffi_call_LINUX64
# else
+ .long 0
+ .byte 0,12,0,1,128,4,0,0
.size .ffi_call_LINUX64,.-.ffi_call_LINUX64
# endif
# endif
- .section .eh_frame,EH_FRAME_FLAGS,@progbits
-.Lframe1:
- .4byte .LECIE1-.LSCIE1 # Length of Common Information Entry
-.LSCIE1:
- .4byte 0x0 # CIE Identifier Tag
- .byte 0x1 # CIE Version
- .ascii "zR\0" # CIE Augmentation
- .uleb128 0x1 # CIE Code Alignment Factor
- .sleb128 -8 # CIE Data Alignment Factor
- .byte 0x41 # CIE RA Column
- .uleb128 0x1 # Augmentation size
- .byte 0x14 # FDE Encoding (pcrel udata8)
- .byte 0xc # DW_CFA_def_cfa
- .uleb128 0x1
- .uleb128 0x0
- .align 3
-.LECIE1:
-.LSFDE1:
- .4byte .LEFDE1-.LASFDE1 # FDE Length
-.LASFDE1:
- .4byte .LASFDE1-.Lframe1 # FDE CIE offset
- .8byte .LFB1-. # FDE initial location
- .8byte .LFE1-.LFB1 # FDE address range
- .uleb128 0x0 # Augmentation size
- .byte 0x2 # DW_CFA_advance_loc1
- .byte .LCFI0-.LFB1
- .byte 0xd # DW_CFA_def_cfa_register
- .uleb128 0x1c
- .byte 0x11 # DW_CFA_offset_extended_sf
- .uleb128 0x41
- .sleb128 -2
- .byte 0x9f # DW_CFA_offset, column 0x1f
- .uleb128 0x1
- .byte 0x9e # DW_CFA_offset, column 0x1e
- .uleb128 0x2
- .byte 0x9d # DW_CFA_offset, column 0x1d
- .uleb128 0x3
- .byte 0x9c # DW_CFA_offset, column 0x1c
- .uleb128 0x4
- .align 3
-.LEFDE1:
-
#endif
#if (defined __ELF__ && defined __linux__) || _CALL_ELF == 2
diff --git a/src/powerpc/linux64_closure.S b/src/powerpc/linux64_closure.S
index 97421a4..1364225 100644
--- a/src/powerpc/linux64_closure.S
+++ b/src/powerpc/linux64_closure.S
@@ -33,8 +33,9 @@
#ifdef POWERPC64
FFI_HIDDEN (ffi_closure_LINUX64)
.globl ffi_closure_LINUX64
-# if _CALL_ELF == 2
.text
+ .cfi_startproc
+# if _CALL_ELF == 2
ffi_closure_LINUX64:
addis %r2, %r12, .TOC.-ffi_closure_LINUX64@ha
addi %r2, %r2, .TOC.-ffi_closure_LINUX64@l
@@ -73,20 +74,18 @@ ffi_closure_LINUX64:
# define RETVAL PARMSAVE+64
# endif
-.LFB1:
# if _CALL_ELF == 2
ld %r12, FFI_TRAMPOLINE_SIZE(%r11) # closure->cif
mflr %r0
lwz %r12, 28(%r12) # cif->flags
mtcrf 0x40, %r12
addi %r12, %r1, PARMSAVE
- bt 7, .Lparmsave
+ bt 7, 0f
# Our caller has not allocated a parameter save area.
# We need to allocate one here and use it to pass gprs to
# ffi_closure_helper_LINUX64.
addi %r12, %r1, -STACKFRAME+PARMSAVE
-.Lparmsave:
- std %r0, 16(%r1)
+0:
# Save general regs into parm save area
std %r3, 0(%r12)
std %r4, 8(%r12)
@@ -98,7 +97,7 @@ ffi_closure_LINUX64:
std %r10, 56(%r12)
# load up the pointer to the parm save area
- mr %r5, %r12
+ mr %r7, %r12
# else
# copy r2 to r11 and load TOC into r2
mr %r11, %r2
@@ -116,12 +115,19 @@ ffi_closure_LINUX64:
std %r9, PARMSAVE+48(%r1)
std %r10, PARMSAVE+56(%r1)
- std %r0, 16(%r1)
-
# load up the pointer to the parm save area
- addi %r5, %r1, PARMSAVE
+ addi %r7, %r1, PARMSAVE
# endif
+ std %r0, 16(%r1)
+
+ # closure->cif
+ ld %r3, FFI_TRAMPOLINE_SIZE(%r11)
+ # closure->fun
+ ld %r4, FFI_TRAMPOLINE_SIZE+8(%r11)
+ # closure->user_data
+ ld %r5, FFI_TRAMPOLINE_SIZE+16(%r11)
+.Ldoclosure:
# next save fpr 1 to fpr 13
stfd %f1, -104+(0*8)(%r1)
stfd %f2, -104+(1*8)(%r1)
@@ -138,16 +144,14 @@ ffi_closure_LINUX64:
stfd %f13, -104+(12*8)(%r1)
# load up the pointer to the saved fpr registers */
- addi %r6, %r1, -104
+ addi %r8, %r1, -104
# load up the pointer to the result storage
- addi %r4, %r1, -STACKFRAME+RETVAL
+ addi %r6, %r1, -STACKFRAME+RETVAL
stdu %r1, -STACKFRAME(%r1)
-.LCFI0:
-
- # get the context pointer from the trampoline
- mr %r3, %r11
+ .cfi_def_cfa_offset STACKFRAME
+ .cfi_offset 65, 16
# make the call
# if defined _CALL_LINUX || _CALL_ELF == 2
@@ -182,7 +186,9 @@ ffi_closure_LINUX64:
# case FFI_TYPE_VOID
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
nop
# case FFI_TYPE_INT
# ifdef __LITTLE_ENDIAN__
@@ -192,17 +198,23 @@ ffi_closure_LINUX64:
# endif
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_FLOAT
lfs %f1, RETVAL+0(%r1)
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_DOUBLE
lfd %f1, RETVAL+0(%r1)
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_LONGDOUBLE
lfd %f1, RETVAL+0(%r1)
mtlr %r0
@@ -216,7 +228,9 @@ ffi_closure_LINUX64:
# endif
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_SINT8
# ifdef __LITTLE_ENDIAN__
lbz %r3, RETVAL+0(%r1)
@@ -235,7 +249,9 @@ ffi_closure_LINUX64:
mtlr %r0
.Lfinish:
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_SINT16
# ifdef __LITTLE_ENDIAN__
lha %r3, RETVAL+0(%r1)
@@ -244,7 +260,9 @@ ffi_closure_LINUX64:
# endif
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_UINT32
# ifdef __LITTLE_ENDIAN__
lwz %r3, RETVAL+0(%r1)
@@ -253,7 +271,9 @@ ffi_closure_LINUX64:
# endif
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_SINT32
# ifdef __LITTLE_ENDIAN__
lwa %r3, RETVAL+0(%r1)
@@ -262,27 +282,37 @@ ffi_closure_LINUX64:
# endif
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_UINT64
ld %r3, RETVAL+0(%r1)
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_SINT64
ld %r3, RETVAL+0(%r1)
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_TYPE_STRUCT
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
nop
# case FFI_TYPE_POINTER
ld %r3, RETVAL+0(%r1)
mtlr %r0
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
# case FFI_V2_TYPE_FLOAT_HOMOG
lfs %f1, RETVAL+0(%r1)
lfs %f2, RETVAL+4(%r1)
@@ -299,7 +329,9 @@ ffi_closure_LINUX64:
lfd %f7, RETVAL+48(%r1)
lfd %f8, RETVAL+56(%r1)
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
.Lmorefloat:
lfs %f4, RETVAL+12(%r1)
mtlr %r0
@@ -308,13 +340,16 @@ ffi_closure_LINUX64:
lfs %f7, RETVAL+24(%r1)
lfs %f8, RETVAL+28(%r1)
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
.Lsmall:
# ifdef __LITTLE_ENDIAN__
ld %r3,RETVAL+0(%r1)
mtlr %r0
ld %r4,RETVAL+8(%r1)
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
# else
# A struct smaller than a dword is returned in the low bits of r3
@@ -328,63 +363,128 @@ ffi_closure_LINUX64:
mtlr %r0
ld %r4,RETVAL+8(%r1)
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset STACKFRAME
.Lsmalldown:
addi %r5, %r5, FFI_V2_TYPE_SMALL_STRUCT + 7
mtlr %r0
sldi %r5, %r5, 3
addi %r1, %r1, STACKFRAME
+ .cfi_def_cfa_offset 0
srd %r3, %r3, %r5
blr
# endif
-.LFE1:
- .long 0
- .byte 0,12,0,1,128,0,0,0
+ .cfi_endproc
# if _CALL_ELF == 2
.size ffi_closure_LINUX64,.-ffi_closure_LINUX64
# else
# ifdef _CALL_LINUX
.size ffi_closure_LINUX64,.-.L.ffi_closure_LINUX64
# else
+ .long 0
+ .byte 0,12,0,1,128,0,0,0
.size .ffi_closure_LINUX64,.-.ffi_closure_LINUX64
# endif
# endif
- .section .eh_frame,EH_FRAME_FLAGS,@progbits
-.Lframe1:
- .4byte .LECIE1-.LSCIE1 # Length of Common Information Entry
-.LSCIE1:
- .4byte 0x0 # CIE Identifier Tag
- .byte 0x1 # CIE Version
- .ascii "zR\0" # CIE Augmentation
- .uleb128 0x1 # CIE Code Alignment Factor
- .sleb128 -8 # CIE Data Alignment Factor
- .byte 0x41 # CIE RA Column
- .uleb128 0x1 # Augmentation size
- .byte 0x14 # FDE Encoding (pcrel udata8)
- .byte 0xc # DW_CFA_def_cfa
- .uleb128 0x1
- .uleb128 0x0
- .align 3
-.LECIE1:
-.LSFDE1:
- .4byte .LEFDE1-.LASFDE1 # FDE Length
-.LASFDE1:
- .4byte .LASFDE1-.Lframe1 # FDE CIE offset
- .8byte .LFB1-. # FDE initial location
- .8byte .LFE1-.LFB1 # FDE address range
- .uleb128 0x0 # Augmentation size
- .byte 0x2 # DW_CFA_advance_loc1
- .byte .LCFI0-.LFB1
- .byte 0xe # DW_CFA_def_cfa_offset
- .uleb128 STACKFRAME
- .byte 0x11 # DW_CFA_offset_extended_sf
- .uleb128 0x41
- .sleb128 -2
- .align 3
-.LEFDE1:
+ FFI_HIDDEN (ffi_go_closure_linux64)
+ .globl ffi_go_closure_linux64
+ .text
+ .cfi_startproc
+# if _CALL_ELF == 2
+ffi_go_closure_linux64:
+ addis %r2, %r12, .TOC.-ffi_go_closure_linux64@ha
+ addi %r2, %r2, .TOC.-ffi_go_closure_linux64@l
+ .localentry ffi_go_closure_linux64, . - ffi_go_closure_linux64
+# else
+ .section ".opd","aw"
+ .align 3
+ffi_go_closure_linux64:
+# ifdef _CALL_LINUX
+ .quad .L.ffi_go_closure_linux64,.TOC.@tocbase,0
+ .type ffi_go_closure_linux64,@function
+ .text
+.L.ffi_go_closure_linux64:
+# else
+ FFI_HIDDEN (.ffi_go_closure_linux64)
+ .globl .ffi_go_closure_linux64
+ .quad .ffi_go_closure_linux64,.TOC.@tocbase,0
+ .size ffi_go_closure_linux64,24
+ .type .ffi_go_closure_linux64,@function
+ .text
+.ffi_go_closure_linux64:
+# endif
+# endif
+
+# if _CALL_ELF == 2
+ ld %r12, 8(%r11) # closure->cif
+ mflr %r0
+ lwz %r12, 28(%r12) # cif->flags
+ mtcrf 0x40, %r12
+ addi %r12, %r1, PARMSAVE
+ bt 7, 0f
+ # Our caller has not allocated a parameter save area.
+ # We need to allocate one here and use it to pass gprs to
+ # ffi_closure_helper_LINUX64.
+ addi %r12, %r1, -STACKFRAME+PARMSAVE
+0:
+ # Save general regs into parm save area
+ std %r3, 0(%r12)
+ std %r4, 8(%r12)
+ std %r5, 16(%r12)
+ std %r6, 24(%r12)
+ std %r7, 32(%r12)
+ std %r8, 40(%r12)
+ std %r9, 48(%r12)
+ std %r10, 56(%r12)
+
+ # load up the pointer to the parm save area
+ mr %r7, %r12
+# else
+ # copy r2 to r11 and load TOC into r2
+ mr %r11, %r2
+ ld %r2, 16(%r11)
+
+ mflr %r0
+ # Save general regs into parm save area
+ # This is the parameter save area set up by our caller.
+ std %r3, PARMSAVE+0(%r1)
+ std %r4, PARMSAVE+8(%r1)
+ std %r5, PARMSAVE+16(%r1)
+ std %r6, PARMSAVE+24(%r1)
+ std %r7, PARMSAVE+32(%r1)
+ std %r8, PARMSAVE+40(%r1)
+ std %r9, PARMSAVE+48(%r1)
+ std %r10, PARMSAVE+56(%r1)
+
+ # load up the pointer to the parm save area
+ addi %r7, %r1, PARMSAVE
+# endif
+ std %r0, 16(%r1)
+
+ # closure->cif
+ ld %r3, 8(%r11)
+ # closure->fun
+ ld %r4, 16(%r11)
+ # user_data
+ mr %r5, %r11
+ b .Ldoclosure
+
+ .cfi_endproc
+# if _CALL_ELF == 2
+ .size ffi_go_closure_linux64,.-ffi_go_closure_linux64
+# else
+# ifdef _CALL_LINUX
+ .size ffi_go_closure_linux64,.-.L.ffi_go_closure_linux64
+# else
+ .long 0
+ .byte 0,12,0,1,128,0,0,0
+ .size .ffi_go_closure_linux64,.-.ffi_go_closure_linux64
+# endif
+# endif
#endif
#if (defined __ELF__ && defined __linux__) || _CALL_ELF == 2
diff --git a/src/powerpc/ppc_closure.S b/src/powerpc/ppc_closure.S
index 075922c..b6d209d 100644
--- a/src/powerpc/ppc_closure.S
+++ b/src/powerpc/ppc_closure.S
@@ -33,13 +33,14 @@
#ifndef POWERPC64
+FFI_HIDDEN(ffi_closure_SYSV)
ENTRY(ffi_closure_SYSV)
-.LFB1:
+ .cfi_startproc
stwu %r1,-144(%r1)
-.LCFI0:
+ .cfi_def_cfa_offset 144
mflr %r0
-.LCFI1:
stw %r0,148(%r1)
+ .cfi_offset 65, 4
# we want to build up an areas for the parameters passed
# in registers (both floating point and integer)
@@ -48,6 +49,17 @@ ENTRY(ffi_closure_SYSV)
stw %r3, 16(%r1)
stw %r4, 20(%r1)
stw %r5, 24(%r1)
+
+ # set up registers for the routine that does the work
+
+ # closure->cif
+ lwz %r3,FFI_TRAMPOLINE_SIZE(%r11)
+ # closure->fun
+ lwz %r4,FFI_TRAMPOLINE_SIZE+4(%r11)
+ # closure->user_data
+ lwz %r5,FFI_TRAMPOLINE_SIZE+8(%r11)
+
+.Ldoclosure:
stw %r6, 28(%r1)
stw %r7, 32(%r1)
stw %r8, 36(%r1)
@@ -66,23 +78,18 @@ ENTRY(ffi_closure_SYSV)
stfd %f8, 104(%r1)
#endif
- # set up registers for the routine that actually does the work
- # get the context pointer from the trampoline
- mr %r3,%r11
+ # pointer to the result storage
+ addi %r6,%r1,112
- # now load up the pointer to the result storage
- addi %r4,%r1,112
+ # pointer to the saved gpr registers
+ addi %r7,%r1,16
- # now load up the pointer to the saved gpr registers
- addi %r5,%r1,16
+ # pointer to the saved fpr registers
+ addi %r8,%r1,48
- # now load up the pointer to the saved fpr registers */
- addi %r6,%r1,48
-
- # now load up the pointer to the outgoing parameter
- # stack in the previous frame
+ # pointer to the outgoing parameter save area in the previous frame
# i.e. the previous frame pointer + 8
- addi %r7,%r1,152
+ addi %r9,%r1,152
# make the call
bl ffi_closure_helper_SYSV@local
@@ -101,7 +108,6 @@ ENTRY(ffi_closure_SYSV)
add %r3,%r3,%r4 # add contents of table to table address
mtctr %r3
bctr # jump to it
-.LFE1:
# Each of the ret_typeX code fragments has to be exactly 16 bytes long
# (4 instructions). For cache effectiveness we align to a 16 byte boundary
@@ -111,7 +117,9 @@ ENTRY(ffi_closure_SYSV)
.Lret_type0:
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
nop
# case FFI_TYPE_INT
@@ -119,31 +127,33 @@ ENTRY(ffi_closure_SYSV)
mtlr %r0
.Lfinish:
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_TYPE_FLOAT
#ifndef __NO_FPRS__
lfs %f1,112+0(%r1)
- mtlr %r0
- addi %r1,%r1,144
#else
nop
- nop
- nop
#endif
+ mtlr %r0
+ addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_TYPE_DOUBLE
#ifndef __NO_FPRS__
lfd %f1,112+0(%r1)
- mtlr %r0
- addi %r1,%r1,144
#else
nop
- nop
- nop
#endif
+ mtlr %r0
+ addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_TYPE_LONGDOUBLE
#ifndef __NO_FPRS__
@@ -152,10 +162,12 @@ ENTRY(ffi_closure_SYSV)
mtlr %r0
b .Lfinish
#else
- nop
- nop
- nop
+ mtlr %r0
+ addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
+ nop
#endif
# case FFI_TYPE_UINT8
@@ -166,7 +178,9 @@ ENTRY(ffi_closure_SYSV)
#endif
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_TYPE_SINT8
#ifdef __LITTLE_ENDIAN__
@@ -186,7 +200,9 @@ ENTRY(ffi_closure_SYSV)
#endif
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_TYPE_SINT16
#ifdef __LITTLE_ENDIAN__
@@ -196,19 +212,25 @@ ENTRY(ffi_closure_SYSV)
#endif
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_TYPE_UINT32
lwz %r3,112+0(%r1)
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_TYPE_SINT32
lwz %r3,112+0(%r1)
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_TYPE_UINT64
lwz %r3,112+0(%r1)
@@ -225,14 +247,18 @@ ENTRY(ffi_closure_SYSV)
# case FFI_TYPE_STRUCT
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
nop
# case FFI_TYPE_POINTER
lwz %r3,112+0(%r1)
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_TYPE_UINT128
lwz %r3,112+0(%r1)
@@ -245,20 +271,26 @@ ENTRY(ffi_closure_SYSV)
lbz %r3,112+0(%r1)
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_SYSV_TYPE_SMALL_STRUCT + 2. Two byte struct.
lhz %r3,112+0(%r1)
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_SYSV_TYPE_SMALL_STRUCT + 3. Three byte struct.
lwz %r3,112+0(%r1)
#ifdef __LITTLE_ENDIAN__
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
#else
srwi %r3,%r3,8
mtlr %r0
@@ -269,7 +301,9 @@ ENTRY(ffi_closure_SYSV)
lwz %r3,112+0(%r1)
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
# case FFI_SYSV_TYPE_SMALL_STRUCT + 5. Five byte struct.
lwz %r3,112+0(%r1)
@@ -319,64 +353,43 @@ ENTRY(ffi_closure_SYSV)
or %r4,%r6,%r4
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
+ .cfi_def_cfa_offset 144
#endif
.Luint128:
lwz %r6,112+12(%r1)
mtlr %r0
addi %r1,%r1,144
+ .cfi_def_cfa_offset 0
blr
-
+ .cfi_endproc
END(ffi_closure_SYSV)
- .section ".eh_frame",EH_FRAME_FLAGS,@progbits
-.Lframe1:
- .4byte .LECIE1-.LSCIE1 # Length of Common Information Entry
-.LSCIE1:
- .4byte 0x0 # CIE Identifier Tag
- .byte 0x1 # CIE Version
-#if defined _RELOCATABLE || defined __PIC__
- .ascii "zR\0" # CIE Augmentation
-#else
- .ascii "\0" # CIE Augmentation
-#endif
- .uleb128 0x1 # CIE Code Alignment Factor
- .sleb128 -4 # CIE Data Alignment Factor
- .byte 0x41 # CIE RA Column
-#if defined _RELOCATABLE || defined __PIC__
- .uleb128 0x1 # Augmentation size
- .byte 0x1b # FDE Encoding (pcrel sdata4)
-#endif
- .byte 0xc # DW_CFA_def_cfa
- .uleb128 0x1
- .uleb128 0x0
- .align 2
-.LECIE1:
-.LSFDE1:
- .4byte .LEFDE1-.LASFDE1 # FDE Length
-.LASFDE1:
- .4byte .LASFDE1-.Lframe1 # FDE CIE offset
-#if defined _RELOCATABLE || defined __PIC__
- .4byte .LFB1-. # FDE initial location
-#else
- .4byte .LFB1 # FDE initial location
-#endif
- .4byte .LFE1-.LFB1 # FDE address range
-#if defined _RELOCATABLE || defined __PIC__
- .uleb128 0x0 # Augmentation size
-#endif
- .byte 0x4 # DW_CFA_advance_loc4
- .4byte .LCFI0-.LFB1
- .byte 0xe # DW_CFA_def_cfa_offset
- .uleb128 144
- .byte 0x4 # DW_CFA_advance_loc4
- .4byte .LCFI1-.LCFI0
- .byte 0x11 # DW_CFA_offset_extended_sf
- .uleb128 0x41
- .sleb128 -1
- .align 2
-.LEFDE1:
+
+FFI_HIDDEN(ffi_go_closure_sysv)
+ENTRY(ffi_go_closure_sysv)
+ .cfi_startproc
+ stwu %r1,-144(%r1)
+ .cfi_def_cfa_offset 144
+ mflr %r0
+ stw %r0,148(%r1)
+ .cfi_offset 65, 4
+
+ stw %r3, 16(%r1)
+ stw %r4, 20(%r1)
+ stw %r5, 24(%r1)
+
+ # closure->cif
+ lwz %r3,4(%r11)
+ # closure->fun
+ lwz %r4,8(%r11)
+ # user_data
+ mr %r5,%r11
+ b .Ldoclosure
+ .cfi_endproc
+END(ffi_go_closure_sysv)
#if defined __ELF__ && defined __linux__
.section .note.GNU-stack,"",@progbits
diff --git a/src/powerpc/sysv.S b/src/powerpc/sysv.S
index fed2380..1474ce7 100644
--- a/src/powerpc/sysv.S
+++ b/src/powerpc/sysv.S
@@ -31,34 +31,35 @@
#include <powerpc/asm.h>
#ifndef POWERPC64
- .globl ffi_prep_args_SYSV
+FFI_HIDDEN(ffi_call_SYSV)
ENTRY(ffi_call_SYSV)
-.LFB1:
+ .cfi_startproc
/* Save the old stack pointer as AP. */
- mr %r8,%r1
+ mr %r10,%r1
+ .cfi_def_cfa_register 10
-.LCFI0:
/* Allocate the stack space we need. */
- stwux %r1,%r1,%r4
+ stwux %r1,%r1,%r8
/* Save registers we use. */
mflr %r9
- stw %r28,-16(%r8)
-.LCFI1:
- stw %r29,-12(%r8)
-.LCFI2:
- stw %r30, -8(%r8)
-.LCFI3:
- stw %r31, -4(%r8)
-.LCFI4:
- stw %r9, 4(%r8)
-.LCFI5:
+ stw %r28,-16(%r10)
+ stw %r29,-12(%r10)
+ stw %r30, -8(%r10)
+ stw %r31, -4(%r10)
+ stw %r9, 4(%r10)
+ .cfi_offset 65, 4
+ .cfi_offset 31, -4
+ .cfi_offset 30, -8
+ .cfi_offset 29, -12
+ .cfi_offset 28, -16
/* Save arguments over call... */
- mr %r31,%r5 /* flags, */
- mr %r30,%r6 /* rvalue, */
- mr %r29,%r7 /* function address, */
- mr %r28,%r8 /* our AP. */
-.LCFI6:
+ stw %r7, -20(%r10) /* closure, */
+ mr %r31,%r6 /* flags, */
+ mr %r30,%r5 /* rvalue, */
+ mr %r29,%r4 /* function address, */
+ mr %r28,%r10 /* our AP. */
+ .cfi_def_cfa_register 28
/* Call ffi_prep_args_SYSV. */
mr %r4,%r1
@@ -70,35 +71,36 @@ ENTRY(ffi_call_SYSV)
/* Get the address to call into CTR. */
mtctr %r29
/* Load all those argument registers. */
- lwz %r3,-16-(8*4)(%r28)
- lwz %r4,-16-(7*4)(%r28)
- lwz %r5,-16-(6*4)(%r28)
- lwz %r6,-16-(5*4)(%r28)
+ lwz %r3,-24-(8*4)(%r28)
+ lwz %r4,-24-(7*4)(%r28)
+ lwz %r5,-24-(6*4)(%r28)
+ lwz %r6,-24-(5*4)(%r28)
bf- 5,1f
nop
- lwz %r7,-16-(4*4)(%r28)
- lwz %r8,-16-(3*4)(%r28)
- lwz %r9,-16-(2*4)(%r28)
- lwz %r10,-16-(1*4)(%r28)
+ lwz %r7,-24-(4*4)(%r28)
+ lwz %r8,-24-(3*4)(%r28)
+ lwz %r9,-24-(2*4)(%r28)
+ lwz %r10,-24-(1*4)(%r28)
nop
1:
#ifndef __NO_FPRS__
/* Load all the FP registers. */
bf- 6,2f
- lfd %f1,-16-(8*4)-(8*8)(%r28)
- lfd %f2,-16-(8*4)-(7*8)(%r28)
- lfd %f3,-16-(8*4)-(6*8)(%r28)
- lfd %f4,-16-(8*4)-(5*8)(%r28)
+ lfd %f1,-24-(8*4)-(8*8)(%r28)
+ lfd %f2,-24-(8*4)-(7*8)(%r28)
+ lfd %f3,-24-(8*4)-(6*8)(%r28)
+ lfd %f4,-24-(8*4)-(5*8)(%r28)
nop
- lfd %f5,-16-(8*4)-(4*8)(%r28)
- lfd %f6,-16-(8*4)-(3*8)(%r28)
- lfd %f7,-16-(8*4)-(2*8)(%r28)
- lfd %f8,-16-(8*4)-(1*8)(%r28)
+ lfd %f5,-24-(8*4)-(4*8)(%r28)
+ lfd %f6,-24-(8*4)-(3*8)(%r28)
+ lfd %f7,-24-(8*4)-(2*8)(%r28)
+ lfd %f8,-24-(8*4)-(1*8)(%r28)
#endif
2:
/* Make the call. */
+ lwz %r11, -20(%r28)
bctrl
/* Now, deal with the return value. */
@@ -125,11 +127,24 @@ L(done_return_value):
lwz %r30, -8(%r28)
lwz %r29,-12(%r28)
lwz %r28,-16(%r28)
+ .cfi_remember_state
+ /* At this point we don't have a cfa register. Say all our
+ saved regs have been restored. */
+ .cfi_same_value 65
+ .cfi_same_value 31
+ .cfi_same_value 30
+ .cfi_same_value 29
+ .cfi_same_value 28
+ /* Hopefully this works.. */
+ .cfi_def_cfa_register 1
+ .cfi_offset 1, 0
lwz %r1,0(%r1)
+ .cfi_same_value 1
blr
#ifndef __NO_FPRS__
L(fp_return_value):
+ .cfi_restore_state
bf 28,L(float_return_value)
stfd %f1,0(%r30)
mtcrf 0x02,%r31 /* cr6 */
@@ -150,70 +165,10 @@ L(small_struct_return_value):
stw %r3, 0(%r30)
stw %r4, 4(%r30)
b L(done_return_value)
+ .cfi_endproc
-.LFE1:
END(ffi_call_SYSV)
- .section ".eh_frame",EH_FRAME_FLAGS,@progbits
-.Lframe1:
- .4byte .LECIE1-.LSCIE1 /* Length of Common Information Entry */
-.LSCIE1:
- .4byte 0x0 /* CIE Identifier Tag */
- .byte 0x1 /* CIE Version */
-#if defined _RELOCATABLE || defined __PIC__
- .ascii "zR\0" /* CIE Augmentation */
-#else
- .ascii "\0" /* CIE Augmentation */
-#endif
- .uleb128 0x1 /* CIE Code Alignment Factor */
- .sleb128 -4 /* CIE Data Alignment Factor */
- .byte 0x41 /* CIE RA Column */
-#if defined _RELOCATABLE || defined __PIC__
- .uleb128 0x1 /* Augmentation size */
- .byte 0x1b /* FDE Encoding (pcrel sdata4) */
-#endif
- .byte 0xc /* DW_CFA_def_cfa */
- .uleb128 0x1
- .uleb128 0x0
- .align 2
-.LECIE1:
-.LSFDE1:
- .4byte .LEFDE1-.LASFDE1 /* FDE Length */
-.LASFDE1:
- .4byte .LASFDE1-.Lframe1 /* FDE CIE offset */
-#if defined _RELOCATABLE || defined __PIC__
- .4byte .LFB1-. /* FDE initial location */
-#else
- .4byte .LFB1 /* FDE initial location */
-#endif
- .4byte .LFE1-.LFB1 /* FDE address range */
-#if defined _RELOCATABLE || defined __PIC__
- .uleb128 0x0 /* Augmentation size */
-#endif
- .byte 0x4 /* DW_CFA_advance_loc4 */
- .4byte .LCFI0-.LFB1
- .byte 0xd /* DW_CFA_def_cfa_register */
- .uleb128 0x08
- .byte 0x4 /* DW_CFA_advance_loc4 */
- .4byte .LCFI5-.LCFI0
- .byte 0x11 /* DW_CFA_offset_extended_sf */
- .uleb128 0x41
- .sleb128 -1
- .byte 0x9f /* DW_CFA_offset, column 0x1f */
- .uleb128 0x1
- .byte 0x9e /* DW_CFA_offset, column 0x1e */
- .uleb128 0x2
- .byte 0x9d /* DW_CFA_offset, column 0x1d */
- .uleb128 0x3
- .byte 0x9c /* DW_CFA_offset, column 0x1c */
- .uleb128 0x4
- .byte 0x4 /* DW_CFA_advance_loc4 */
- .4byte .LCFI6-.LCFI5
- .byte 0xd /* DW_CFA_def_cfa_register */
- .uleb128 0x1c
- .align 2
-.LEFDE1:
-
#if defined __ELF__ && defined __linux__
.section .note.GNU-stack,"",@progbits
#endif
--
2.1.0
--
Alan Modra
Australia Development Lab, IBM
* powerpc: Fix ffi_go_closure_linux64
From: Alan Modra @ 2014-11-20 1:48 UTC (permalink / raw)
To: libffi-discuss
Unlike ffi_closure_LINUX64, which has to copy r2 to r11 and then load the
TOC, ffi_go_closure_linux64 is called normally, so the TOC is already in
r2 and the closure is already in r11 on entry.
* powerpc/linux64_closure.S (ffi_closure_LINUX64): Remove a
register dependency chain.
(ffi_go_closure_linux64): Don't load r11 or r2.
---
src/powerpc/linux64_closure.S | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
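
For readers following the register conventions, here is a rough C sketch of
what a Go closure entry point does with the pointer it receives in r11. It
mirrors the 32-bit ffi_go_closure_sysv sequence from the previous message
(cif at offset 4, fun at offset 8, the closure itself passed on as
user_data). All sketch_* names are hypothetical stand-ins and not libffi
declarations, and the three-pointer layout is an assumption inferred from
those loads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ffi_cif; only its address matters here. */
typedef struct sketch_cif { int nargs; } sketch_cif;

typedef void (*sketch_fun)(sketch_cif *cif, void *ret, void **args,
                           void *user_data);

/* Assumed layout: three pointer-sized fields, so on a 32-bit target
   cif sits at offset 4 and fun at offset 8.  */
typedef struct sketch_go_closure {
  void *tramp;
  sketch_cif *cif;
  sketch_fun fun;
} sketch_go_closure;

/* The three values the assembly places in r3, r4 and r5.  */
typedef struct sketch_dispatch {
  sketch_cif *cif;
  sketch_fun fun;
  void *user_data;
} sketch_dispatch;

/* Models the entry point: the closure pointer arrives as an ordinary
   argument (r11 in the real code), not via a trampoline.  */
sketch_dispatch sketch_go_entry(sketch_go_closure *closure)
{
  sketch_dispatch d;
  d.cif = closure->cif;     /* lwz %r3,4(%r11) */
  d.fun = closure->fun;     /* lwz %r4,8(%r11) */
  d.user_data = closure;    /* mr  %r5,%r11    */
  return d;
}
```

Because the caller supplies the closure pointer directly, nothing in this
path needs to recover it from r2, which is why the 64-bit fix can simply
drop the mr/ld pair.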
diff --git a/src/powerpc/linux64_closure.S b/src/powerpc/linux64_closure.S
index 1364225..6487d2a 100644
--- a/src/powerpc/linux64_closure.S
+++ b/src/powerpc/linux64_closure.S
@@ -101,7 +101,7 @@ ffi_closure_LINUX64:
# else
# copy r2 to r11 and load TOC into r2
mr %r11, %r2
- ld %r2, 16(%r11)
+ ld %r2, 16(%r2)
mflr %r0
# Save general regs into parm save area
@@ -444,10 +444,6 @@ ffi_go_closure_linux64:
# load up the pointer to the parm save area
mr %r7, %r12
# else
- # copy r2 to r11 and load TOC into r2
- mr %r11, %r2
- ld %r2, 16(%r11)
-
mflr %r0
# Save general regs into parm save area
# This is the parameter save area set up by our caller.
--
2.1.0
--
Alan Modra
Australia Development Lab, IBM