From: Alan Modra
To: libffi-discuss@sourceware.org
Date: Thu, 20 Nov 2014 01:44:00 -0000
Subject: GO closures for powerpc linux
Message-ID: <20141120014435.GH22459@bubble.grove.modra.org>

GO closures for powerpc linux

Plus .cfi async unwind info, rearrangement of ffi_call_LINUX64 and
ffi_call_SYSV function params to avoid register copies, tweaks to
trampolines.

This, along with rth's followup patch, has been tested on powerpc-linux,
powerpc64-linux and powerpc64le-linux.  If you're using rth's gcc fork

  git://github.com/rth7680/gcc.git rth/go-closure

then you'll want to first apply the following upstream libffi commit

  commit fa5f25c20f76a6ef5e950a7ccbce826672c8a620
  Author: Marcus Comstedt
  Date:   Sat Jan 4 19:00:08 2014 +0100

      Linux/ppc64: Remove assumption on contents of r11 in closure

and this gcc patch

Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h	(revision 217330)
+++ gcc/config/rs6000/linux64.h	(working copy)
@@ -115,6 +115,14 @@
 	      if (dot_symbols)					\
 		error ("-mcall-aixdesc incompatible with -mabi=elfv2"); \
 	    }							\
+	  if (DEFAULT_ABI == ABI_AIX				\
+	      && strcmp (lang_hooks.name, "GNU Go") == 0)	\
+	    {							\
+	      if (global_options_set.x_TARGET_POINTERS_TO_NESTED_FUNCTIONS \
+		  && TARGET_POINTERS_TO_NESTED_FUNCTIONS)	\
+		error ("-mpointers-to-nested-functions is incompatible with Go"); \
+	      TARGET_POINTERS_TO_NESTED_FUNCTIONS = 0;		\
+	    }							\
 	  if (rs6000_isa_flags & OPTION_MASK_RELOCATABLE)	\
 	    {							\
 	      rs6000_isa_flags &= ~OPTION_MASK_RELOCATABLE;	\

	* src/powerpc/ffitarget.h (FFI_GO_CLOSURES): Define.
	* src/powerpc/ffi.c (ffi_call_int): New function with extra
	closure param, and args rearranged on ffi_call_LINUX64 and
	ffi_call_SYSV calls, extracted from ..
	(ffi_call): ..here.
	(ffi_call_go, ffi_prep_go_closure): New functions.
	* src/powerpc/ffi_linux64.c (ffi_prep_closure_loc_linux64): Make
	hidden.  Only flush insn part of ELFv2 trampoline.  Don't shuffle
	ELFv1 trampoline.
	(ffi_closure_helper_LINUX64): Replace closure param with cif, fun,
	user_data params.
	* src/powerpc/ffi_powerpc.h (ffi_go_closure_sysv): Declare.
	(ffi_go_closure_linux64): Declare.
	(ffi_call_SYSV, ffi_call_LINUX64): Update.
	(ffi_prep_closure_loc_sysv, ffi_prep_closure_loc_linux64): Declare.
	(ffi_closure_helper_SYSV, ffi_closure_helper_LINUX64): Update.
	* src/powerpc/ffi_sysv.c (ASM_NEEDS_REGISTERS): Increase to 6.
	(ffi_prep_closure_loc_sysv): Use bcl in trampoline, put data
	words last, flush just the insn part.
	(ffi_closure_helper_SYSV): Replace closure param with cif, fun and
	user_data params.
	* src/powerpc/linux64.S (ffi_call_LINUX64): Replace hand-written
	.eh_frame with .cfi directives.  Adjust for changed param order.
	Pass extra "closure" param to user function in static chain.  Add
	.cfi directives to describe epilogue.  Don't provide traceback
	table for ELFv2 or _CALL_LINUX.
	* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Replace
	hand-written .eh_frame with .cfi directives.  Adjust for changed
	ffi_closure_helper_LINUX64 params.  Add .cfi directives to
	describe epilogue.  Don't provide traceback table for ELFv2 or
	_CALL_LINUX.
	(ffi_go_closure_linux64): New function.
	* src/powerpc/sysv.S: Remove redundant .globl ffi_prep_args_SYSV.
	(ffi_call_SYSV): Make hidden.  Replace hand-written .eh_frame with
	.cfi directives.  Adjust for changed params.  Pass extra "closure"
	param to user function in static chain.  Add .cfi directives to
	describe epilogue.
	* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Make hidden.
	Replace hand-written .eh_frame with .cfi directives.  Adjust for
	changed ffi_closure_helper_SYSV params.  Add .cfi directives to
	describe epilogue.  Don't just use nops in the dead __NO_FPRS__
	epilogues.
	(ffi_go_closure_sysv): New function.
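
For readers checking the ffi_sysv.c trampoline entry above ("Use bcl in
trampoline, put data words last"), here is a rough sketch of the offset
arithmetic.  The instruction words are copied from the
ffi_prep_closure_loc_sysv hunk in this patch; the helper names
(fill_sysv_tramp_insns, d_form_disp, load_target_offset,
check_sysv_tramp_layout) are made up for illustration and are not part
of libffi.  It assumes only that bcl leaves the address of the following
instruction in LR, which mflr r11 then copies.

```c
#include <assert.h>
#include <stdint.h>

enum { TRAMP_WORDS = 10, INSN_WORDS = 8, WORD_BYTES = 4 };

/* Instruction words as written by the patched ffi_prep_closure_loc_sysv.
   Words 8 and 9 hold data (function address and context); they are left
   as 0 here because this sketch only checks offsets.  */
static void
fill_sysv_tramp_insns (uint32_t tramp[TRAMP_WORDS])
{
  tramp[0] = 0x7c0802a6;  /* mflr  r0 */
  tramp[1] = 0x429f0005;  /* bcl   20,31,.+4 */
  tramp[2] = 0x7d6802a6;  /* mflr  r11 */
  tramp[3] = 0x7c0803a6;  /* mtlr  r0 */
  tramp[4] = 0x800b0018;  /* lwz   r0,24(r11) */
  tramp[5] = 0x816b001c;  /* lwz   r11,28(r11) */
  tramp[6] = 0x7c0903a6;  /* mtctr r0 */
  tramp[7] = 0x4e800420;  /* bctr */
  tramp[8] = 0;           /* data: function (placeholder) */
  tramp[9] = 0;           /* data: context (placeholder) */
}

/* Sign-extended 16-bit displacement field of a D-form insn such as lwz.  */
static int
d_form_disp (uint32_t insn)
{
  int d = (int) (insn & 0xffff);
  return d >= 0x8000 ? d - 0x10000 : d;
}

/* Byte offset from the start of the trampoline read by the lwz at word
   LOAD_IDX, given that r11 holds the address of the instruction after
   the bcl at word BCL_IDX.  */
static int
load_target_offset (const uint32_t *tramp, int bcl_idx, int load_idx)
{
  int r11_offset = (bcl_idx + 1) * WORD_BYTES;
  return r11_offset + d_form_disp (tramp[load_idx]);
}

/* Hypothetical self-check: the two loads hit the two trailing data
   words, and the instruction part is 8 * 4 bytes long.  */
static void
check_sysv_tramp_layout (void)
{
  uint32_t tramp[TRAMP_WORDS];

  fill_sysv_tramp_insns (tramp);
  assert (load_target_offset (tramp, 1, 4) == 8 * WORD_BYTES);
  assert (load_target_offset (tramp, 1, 5) == 9 * WORD_BYTES);
  assert (INSN_WORDS * WORD_BYTES == 32);
}
```

With the data words moved to tramp[8] and tramp[9], only the first
8 * 4 bytes hold instructions, which matches the shortened flush_icache
length in the patch.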
--- src/powerpc/ffi.c | 40 ++++++++- src/powerpc/ffi_linux64.c | 24 ++--- src/powerpc/ffi_powerpc.h | 29 ++++-- src/powerpc/ffi_sysv.c | 33 +++---- src/powerpc/ffitarget.h | 1 + src/powerpc/linux64.S | 73 +++++---------- src/powerpc/linux64_closure.S | 202 +++++++++++++++++++++++++++++++----------- src/powerpc/ppc_closure.S | 165 ++++++++++++++++++---------------- src/powerpc/sysv.S | 149 +++++++++++-------------------- 9 files changed, 403 insertions(+), 313 deletions(-) diff --git a/src/powerpc/ffi.c b/src/powerpc/ffi.c index efb441b..7eb543e 100644 --- a/src/powerpc/ffi.c +++ b/src/powerpc/ffi.c @@ -70,8 +70,12 @@ ffi_prep_cif_machdep_var (ffi_cif *cif, #endif } -void -ffi_call(ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue) +static void +ffi_call_int (ffi_cif *cif, + void (*fn) (void), + void *rvalue, + void **avalue, + void *closure) { /* The final SYSV ABI says that structures smaller or equal 8 bytes are returned in r3/r4. A draft ABI used by linux instead returns @@ -97,9 +101,10 @@ ffi_call(ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue) ecif.rvalue = alloca (cif->rtype->size); #ifdef POWERPC64 - ffi_call_LINUX64 (&ecif, -(long) cif->bytes, cif->flags, ecif.rvalue, fn); + ffi_call_LINUX64 (&ecif, fn, ecif.rvalue, cif->flags, closure, + -(long) cif->bytes); #else - ffi_call_SYSV (&ecif, -cif->bytes, cif->flags, ecif.rvalue, fn); + ffi_call_SYSV (&ecif, fn, ecif.rvalue, cif->flags, closure, -cif->bytes); #endif /* Check for a bounce-buffered return value */ @@ -125,6 +130,18 @@ ffi_call(ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue) } } +void +ffi_call (ffi_cif *cif, void (*fn) (void), void *rvalue, void **avalue) +{ + ffi_call_int (cif, fn, rvalue, avalue, NULL); +} + +void +ffi_call_go (ffi_cif *cif, void (*fn) (void), void *rvalue, void **avalue, + void *closure) +{ + ffi_call_int (cif, fn, rvalue, avalue, closure); +} ffi_status ffi_prep_closure_loc (ffi_closure *closure, @@ -139,3 +156,18 @@ ffi_prep_closure_loc 
(ffi_closure *closure, return ffi_prep_closure_loc_sysv (closure, cif, fun, user_data, codeloc); #endif } + +ffi_status +ffi_prep_go_closure (ffi_go_closure *closure, + ffi_cif *cif, + void (*fun) (ffi_cif *, void *, void **, void *)) +{ +#ifdef POWERPC64 + closure->tramp = ffi_go_closure_linux64; +#else + closure->tramp = ffi_go_closure_sysv; +#endif + closure->cif = cif; + closure->fun = fun; + return FFI_OK; +} diff --git a/src/powerpc/ffi_linux64.c b/src/powerpc/ffi_linux64.c index b087af8..b84b91f 100644 --- a/src/powerpc/ffi_linux64.c +++ b/src/powerpc/ffi_linux64.c @@ -667,7 +667,8 @@ flush_icache (char *wraddr, char *xaddr, int size) } #endif -ffi_status + +ffi_status FFI_HIDDEN ffi_prep_closure_loc_linux64 (ffi_closure *closure, ffi_cif *cif, void (*fun) (ffi_cif *, void *, void **, void *), @@ -688,17 +689,17 @@ ffi_prep_closure_loc_linux64 (ffi_closure *closure, /* 2: .quad context */ *(void **) &tramp[4] = (void *) ffi_closure_LINUX64; *(void **) &tramp[6] = codeloc; - flush_icache ((char *)tramp, (char *)codeloc, FFI_TRAMPOLINE_SIZE); + flush_icache ((char *) tramp, (char *) codeloc, 4 * 4); #else void **tramp = (void **) &closure->tramp[0]; if (cif->abi < FFI_LINUX || cif->abi >= FFI_LAST_ABI) return FFI_BAD_ABI; - /* Copy function address and TOC from ffi_closure_LINUX64. */ - memcpy (tramp, (char *) ffi_closure_LINUX64, 16); - tramp[2] = tramp[1]; + /* Copy function address and TOC from ffi_closure_LINUX64 OPD. 
*/ + memcpy (&tramp[0], (void **) ffi_closure_LINUX64, sizeof (void *)); tramp[1] = codeloc; + memcpy (&tramp[2], (void **) ffi_closure_LINUX64 + 1, sizeof (void *)); #endif closure->cif = cif; @@ -710,8 +711,12 @@ ffi_prep_closure_loc_linux64 (ffi_closure *closure, int FFI_HIDDEN -ffi_closure_helper_LINUX64 (ffi_closure *closure, void *rvalue, - unsigned long *pst, ffi_dblfl *pfr) +ffi_closure_helper_LINUX64 (ffi_cif *cif, + void (*fun) (ffi_cif *, void *, void **, void *), + void *user_data, + void *rvalue, + unsigned long *pst, + ffi_dblfl *pfr) { /* rvalue is the pointer to space for return value in closure assembly */ /* pst is the pointer to parameter save area @@ -721,11 +726,9 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure, void *rvalue, void **avalue; ffi_type **arg_types; unsigned long i, avn, nfixedargs; - ffi_cif *cif; ffi_dblfl *end_pfr = pfr + NUM_FPR_ARG_REGISTERS64; unsigned long align; - cif = closure->cif; avalue = alloca (cif->nargs * sizeof (void *)); /* Copy the caller's structure return value address so that the @@ -925,8 +928,7 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure, void *rvalue, i++; } - - (closure->fun) (cif, rvalue, avalue, closure->user_data); + (*fun) (cif, rvalue, avalue, user_data); /* Tell ffi_closure_LINUX64 how to perform return type promotions. 
*/ if ((cif->flags & FLAG_RETURNS_SMST) != 0) diff --git a/src/powerpc/ffi_powerpc.h b/src/powerpc/ffi_powerpc.h index 2e61653..3dcd6b5 100644 --- a/src/powerpc/ffi_powerpc.h +++ b/src/powerpc/ffi_powerpc.h @@ -56,22 +56,39 @@ typedef union } ffi_dblfl; void FFI_HIDDEN ffi_closure_SYSV (void); -void FFI_HIDDEN ffi_call_SYSV(extended_cif *, unsigned, unsigned, unsigned *, - void (*)(void)); +void FFI_HIDDEN ffi_go_closure_sysv (void); +void FFI_HIDDEN ffi_call_SYSV(extended_cif *, void (*)(void), void *, + unsigned, void *, int); void FFI_HIDDEN ffi_prep_types_sysv (ffi_abi); ffi_status FFI_HIDDEN ffi_prep_cif_sysv (ffi_cif *); -int FFI_HIDDEN ffi_closure_helper_SYSV (ffi_closure *, void *, unsigned long *, +ffi_status FFI_HIDDEN ffi_prep_closure_loc_sysv (ffi_closure *, + ffi_cif *, + void (*) (ffi_cif *, void *, + void **, void *), + void *, void *); +int FFI_HIDDEN ffi_closure_helper_SYSV (ffi_cif *, + void (*) (ffi_cif *, void *, + void **, void *), + void *, void *, unsigned long *, ffi_dblfl *, unsigned long *); -void FFI_HIDDEN ffi_call_LINUX64(extended_cif *, unsigned long, unsigned long, - unsigned long *, void (*)(void)); +void FFI_HIDDEN ffi_call_LINUX64(extended_cif *, void (*) (void), void *, + unsigned long, void *, long); void FFI_HIDDEN ffi_closure_LINUX64 (void); +void FFI_HIDDEN ffi_go_closure_linux64 (void); void FFI_HIDDEN ffi_prep_types_linux64 (ffi_abi); ffi_status FFI_HIDDEN ffi_prep_cif_linux64 (ffi_cif *); ffi_status FFI_HIDDEN ffi_prep_cif_linux64_var (ffi_cif *, unsigned int, unsigned int); void FFI_HIDDEN ffi_prep_args64 (extended_cif *, unsigned long *const); -int FFI_HIDDEN ffi_closure_helper_LINUX64 (ffi_closure *, void *, +ffi_status FFI_HIDDEN ffi_prep_closure_loc_linux64 (ffi_closure *, ffi_cif *, + void (*) (ffi_cif *, void *, + void **, void *), + void *, void *); +int FFI_HIDDEN ffi_closure_helper_LINUX64 (ffi_cif *, + void (*) (ffi_cif *, void *, + void **, void *), + void *, void *, unsigned long *, ffi_dblfl *); diff --git 
a/src/powerpc/ffi_sysv.c b/src/powerpc/ffi_sysv.c index fbe85fe..646c340 100644 --- a/src/powerpc/ffi_sysv.c +++ b/src/powerpc/ffi_sysv.c @@ -36,7 +36,7 @@ /* About the SYSV ABI. */ -#define ASM_NEEDS_REGISTERS 4 +#define ASM_NEEDS_REGISTERS 6 #define NUM_GPR_ARG_REGISTERS 8 #define NUM_FPR_ARG_REGISTERS 8 @@ -654,18 +654,18 @@ ffi_prep_closure_loc_sysv (ffi_closure *closure, tramp = (unsigned int *) &closure->tramp[0]; tramp[0] = 0x7c0802a6; /* mflr r0 */ - tramp[1] = 0x4800000d; /* bl 10 */ - tramp[4] = 0x7d6802a6; /* mflr r11 */ - tramp[5] = 0x7c0803a6; /* mtlr r0 */ - tramp[6] = 0x800b0000; /* lwz r0,0(r11) */ - tramp[7] = 0x816b0004; /* lwz r11,4(r11) */ - tramp[8] = 0x7c0903a6; /* mtctr r0 */ - tramp[9] = 0x4e800420; /* bctr */ - *(void **) &tramp[2] = (void *) ffi_closure_SYSV; /* function */ - *(void **) &tramp[3] = codeloc; /* context */ + tramp[1] = 0x429f0005; /* bcl 20,31,.+4 */ + tramp[2] = 0x7d6802a6; /* mflr r11 */ + tramp[3] = 0x7c0803a6; /* mtlr r0 */ + tramp[4] = 0x800b0018; /* lwz r0,24(r11) */ + tramp[5] = 0x816b001c; /* lwz r11,28(r11) */ + tramp[6] = 0x7c0903a6; /* mtctr r0 */ + tramp[7] = 0x4e800420; /* bctr */ + *(void **) &tramp[8] = (void *) ffi_closure_SYSV; /* function */ + *(void **) &tramp[9] = codeloc; /* context */ /* Flush the icache. */ - flush_icache ((char *)tramp, (char *)codeloc, FFI_TRAMPOLINE_SIZE); + flush_icache ((char *)tramp, (char *)codeloc, 8 * 4); closure->cif = cif; closure->fun = fun; @@ -682,8 +682,12 @@ ffi_prep_closure_loc_sysv (ffi_closure *closure, following helper function to do most of the work. 
*/ int -ffi_closure_helper_SYSV (ffi_closure *closure, void *rvalue, - unsigned long *pgr, ffi_dblfl *pfr, +ffi_closure_helper_SYSV (ffi_cif *cif, + void (*fun) (ffi_cif *, void *, void **, void *), + void *user_data, + void *rvalue, + unsigned long *pgr, + ffi_dblfl *pfr, unsigned long *pst) { /* rvalue is the pointer to space for return value in closure assembly */ @@ -699,7 +703,6 @@ ffi_closure_helper_SYSV (ffi_closure *closure, void *rvalue, #endif long ng = 0; /* number of general registers already used */ - ffi_cif *cif = closure->cif; unsigned size = cif->rtype->size; unsigned short rtypenum = cif->rtype->type; @@ -915,7 +918,7 @@ ffi_closure_helper_SYSV (ffi_closure *closure, void *rvalue, i++; } - (closure->fun) (cif, rvalue, avalue, closure->user_data); + (*fun) (cif, rvalue, avalue, user_data); /* Tell ffi_closure_SYSV how to perform return type promotions. Because the FFI_SYSV ABI returns the structures <= 8 bytes in diff --git a/src/powerpc/ffitarget.h b/src/powerpc/ffitarget.h index 84aa586..0f66d31 100644 --- a/src/powerpc/ffitarget.h +++ b/src/powerpc/ffitarget.h @@ -138,6 +138,7 @@ typedef enum ffi_abi { #define FFI_CLOSURES 1 #define FFI_NATIVE_RAW_API 0 #if defined (POWERPC) || defined (POWERPC_FREEBSD) +# define FFI_GO_CLOSURES 1 # define FFI_TARGET_SPECIFIC_VARIADIC 1 # define FFI_EXTRA_CIF_FIELDS unsigned nfixedargs #endif diff --git a/src/powerpc/linux64.S b/src/powerpc/linux64.S index d2acb70..b2ae60e 100644 --- a/src/powerpc/linux64.S +++ b/src/powerpc/linux64.S @@ -32,8 +32,9 @@ #ifdef POWERPC64 .hidden ffi_call_LINUX64 .globl ffi_call_LINUX64 -# if _CALL_ELF == 2 .text + .cfi_startproc +# if _CALL_ELF == 2 ffi_call_LINUX64: addis %r2, %r12, .TOC.-ffi_call_LINUX64@ha addi %r2, %r2, .TOC.-ffi_call_LINUX64@l @@ -57,20 +58,26 @@ ffi_call_LINUX64: .ffi_call_LINUX64: # endif # endif -.LFB1: mflr %r0 std %r28, -32(%r1) std %r29, -24(%r1) std %r30, -16(%r1) std %r31, -8(%r1) + std %r7, 8(%r1) /* closure, saved in cr field. 
*/ std %r0, 16(%r1) mr %r28, %r1 /* our AP. */ -.LCFI0: - stdux %r1, %r1, %r4 - mr %r31, %r5 /* flags, */ - mr %r30, %r6 /* rvalue, */ - mr %r29, %r7 /* function address. */ + .cfi_def_cfa_register 28 + .cfi_offset 65, 16 + .cfi_offset 31, -8 + .cfi_offset 30, -16 + .cfi_offset 29, -24 + .cfi_offset 28, -32 + + stdux %r1, %r1, %r8 + mr %r31, %r6 /* flags, */ + mr %r30, %r5 /* rvalue, */ + mr %r29, %r4 /* function address. */ /* Save toc pointer, not for the ffi_prep_args64 call, but for the later bctrl function call. */ # if _CALL_ELF == 2 @@ -92,7 +99,6 @@ ffi_call_LINUX64: # else ld %r12, 0(%r29) ld %r2, 8(%r29) - ld %r11, 16(%r29) # endif /* Now do the call. */ /* Set up cr1 with bits 4-7 of the flags. */ @@ -130,6 +136,7 @@ ffi_call_LINUX64: 2: /* Make the call. */ + ld %r11, 8(%r28) bctrl /* This must follow the call immediately, the unwinder @@ -151,6 +158,7 @@ ffi_call_LINUX64: .Ldone_return_value: /* Restore the registers we used and return. */ mr %r1, %r28 + .cfi_def_cfa_register 1 ld %r0, 16(%r28) ld %r28, -32(%r28) mtlr %r0 @@ -160,6 +168,7 @@ ffi_call_LINUX64: blr .Lfp_return_value: + .cfi_def_cfa_register 28 bf 28, .Lfloat_return_value stfd %f1, 0(%r30) mtcrf 0x02, %r31 /* cr6 */ @@ -199,61 +208,19 @@ ffi_call_LINUX64: std %r4, 8(%r30) b .Ldone_return_value -.LFE1: - .long 0 - .byte 0,12,0,1,128,4,0,0 + .cfi_endproc # if _CALL_ELF == 2 .size ffi_call_LINUX64,.-ffi_call_LINUX64 # else # ifdef _CALL_LINUX .size ffi_call_LINUX64,.-.L.ffi_call_LINUX64 # else + .long 0 + .byte 0,12,0,1,128,4,0,0 .size .ffi_call_LINUX64,.-.ffi_call_LINUX64 # endif # endif - .section .eh_frame,EH_FRAME_FLAGS,@progbits -.Lframe1: - .4byte .LECIE1-.LSCIE1 # Length of Common Information Entry -.LSCIE1: - .4byte 0x0 # CIE Identifier Tag - .byte 0x1 # CIE Version - .ascii "zR\0" # CIE Augmentation - .uleb128 0x1 # CIE Code Alignment Factor - .sleb128 -8 # CIE Data Alignment Factor - .byte 0x41 # CIE RA Column - .uleb128 0x1 # Augmentation size - .byte 0x14 # FDE Encoding (pcrel 
udata8) - .byte 0xc # DW_CFA_def_cfa - .uleb128 0x1 - .uleb128 0x0 - .align 3 -.LECIE1: -.LSFDE1: - .4byte .LEFDE1-.LASFDE1 # FDE Length -.LASFDE1: - .4byte .LASFDE1-.Lframe1 # FDE CIE offset - .8byte .LFB1-. # FDE initial location - .8byte .LFE1-.LFB1 # FDE address range - .uleb128 0x0 # Augmentation size - .byte 0x2 # DW_CFA_advance_loc1 - .byte .LCFI0-.LFB1 - .byte 0xd # DW_CFA_def_cfa_register - .uleb128 0x1c - .byte 0x11 # DW_CFA_offset_extended_sf - .uleb128 0x41 - .sleb128 -2 - .byte 0x9f # DW_CFA_offset, column 0x1f - .uleb128 0x1 - .byte 0x9e # DW_CFA_offset, column 0x1e - .uleb128 0x2 - .byte 0x9d # DW_CFA_offset, column 0x1d - .uleb128 0x3 - .byte 0x9c # DW_CFA_offset, column 0x1c - .uleb128 0x4 - .align 3 -.LEFDE1: - #endif #if (defined __ELF__ && defined __linux__) || _CALL_ELF == 2 diff --git a/src/powerpc/linux64_closure.S b/src/powerpc/linux64_closure.S index 97421a4..1364225 100644 --- a/src/powerpc/linux64_closure.S +++ b/src/powerpc/linux64_closure.S @@ -33,8 +33,9 @@ #ifdef POWERPC64 FFI_HIDDEN (ffi_closure_LINUX64) .globl ffi_closure_LINUX64 -# if _CALL_ELF == 2 .text + .cfi_startproc +# if _CALL_ELF == 2 ffi_closure_LINUX64: addis %r2, %r12, .TOC.-ffi_closure_LINUX64@ha addi %r2, %r2, .TOC.-ffi_closure_LINUX64@l @@ -73,20 +74,18 @@ ffi_closure_LINUX64: # define RETVAL PARMSAVE+64 # endif -.LFB1: # if _CALL_ELF == 2 ld %r12, FFI_TRAMPOLINE_SIZE(%r11) # closure->cif mflr %r0 lwz %r12, 28(%r12) # cif->flags mtcrf 0x40, %r12 addi %r12, %r1, PARMSAVE - bt 7, .Lparmsave + bt 7, 0f # Our caller has not allocated a parameter save area. # We need to allocate one here and use it to pass gprs to # ffi_closure_helper_LINUX64. 
addi %r12, %r1, -STACKFRAME+PARMSAVE -.Lparmsave: - std %r0, 16(%r1) +0: # Save general regs into parm save area std %r3, 0(%r12) std %r4, 8(%r12) @@ -98,7 +97,7 @@ ffi_closure_LINUX64: std %r10, 56(%r12) # load up the pointer to the parm save area - mr %r5, %r12 + mr %r7, %r12 # else # copy r2 to r11 and load TOC into r2 mr %r11, %r2 @@ -116,12 +115,19 @@ ffi_closure_LINUX64: std %r9, PARMSAVE+48(%r1) std %r10, PARMSAVE+56(%r1) - std %r0, 16(%r1) - # load up the pointer to the parm save area - addi %r5, %r1, PARMSAVE + addi %r7, %r1, PARMSAVE # endif + std %r0, 16(%r1) + + # closure->cif + ld %r3, FFI_TRAMPOLINE_SIZE(%r11) + # closure->fun + ld %r4, FFI_TRAMPOLINE_SIZE+8(%r11) + # closure->user_data + ld %r5, FFI_TRAMPOLINE_SIZE+16(%r11) +.Ldoclosure: # next save fpr 1 to fpr 13 stfd %f1, -104+(0*8)(%r1) stfd %f2, -104+(1*8)(%r1) @@ -138,16 +144,14 @@ ffi_closure_LINUX64: stfd %f13, -104+(12*8)(%r1) # load up the pointer to the saved fpr registers */ - addi %r6, %r1, -104 + addi %r8, %r1, -104 # load up the pointer to the result storage - addi %r4, %r1, -STACKFRAME+RETVAL + addi %r6, %r1, -STACKFRAME+RETVAL stdu %r1, -STACKFRAME(%r1) -.LCFI0: - - # get the context pointer from the trampoline - mr %r3, %r11 + .cfi_def_cfa_offset STACKFRAME + .cfi_offset 65, 16 # make the call # if defined _CALL_LINUX || _CALL_ELF == 2 @@ -182,7 +186,9 @@ ffi_closure_LINUX64: # case FFI_TYPE_VOID mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME nop # case FFI_TYPE_INT # ifdef __LITTLE_ENDIAN__ @@ -192,17 +198,23 @@ ffi_closure_LINUX64: # endif mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_TYPE_FLOAT lfs %f1, RETVAL+0(%r1) mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_TYPE_DOUBLE lfd %f1, RETVAL+0(%r1) mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case 
FFI_TYPE_LONGDOUBLE lfd %f1, RETVAL+0(%r1) mtlr %r0 @@ -216,7 +228,9 @@ ffi_closure_LINUX64: # endif mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_TYPE_SINT8 # ifdef __LITTLE_ENDIAN__ lbz %r3, RETVAL+0(%r1) @@ -235,7 +249,9 @@ ffi_closure_LINUX64: mtlr %r0 .Lfinish: addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_TYPE_SINT16 # ifdef __LITTLE_ENDIAN__ lha %r3, RETVAL+0(%r1) @@ -244,7 +260,9 @@ ffi_closure_LINUX64: # endif mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_TYPE_UINT32 # ifdef __LITTLE_ENDIAN__ lwz %r3, RETVAL+0(%r1) @@ -253,7 +271,9 @@ ffi_closure_LINUX64: # endif mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_TYPE_SINT32 # ifdef __LITTLE_ENDIAN__ lwa %r3, RETVAL+0(%r1) @@ -262,27 +282,37 @@ ffi_closure_LINUX64: # endif mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_TYPE_UINT64 ld %r3, RETVAL+0(%r1) mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_TYPE_SINT64 ld %r3, RETVAL+0(%r1) mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_TYPE_STRUCT mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME nop # case FFI_TYPE_POINTER ld %r3, RETVAL+0(%r1) mtlr %r0 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME # case FFI_V2_TYPE_FLOAT_HOMOG lfs %f1, RETVAL+0(%r1) lfs %f2, RETVAL+4(%r1) @@ -299,7 +329,9 @@ ffi_closure_LINUX64: lfd %f7, RETVAL+48(%r1) lfd %f8, RETVAL+56(%r1) addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME .Lmorefloat: lfs %f4, RETVAL+12(%r1) mtlr %r0 @@ -308,13 +340,16 @@ ffi_closure_LINUX64: lfs %f7, RETVAL+24(%r1) lfs %f8, 
RETVAL+28(%r1) addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME .Lsmall: # ifdef __LITTLE_ENDIAN__ ld %r3,RETVAL+0(%r1) mtlr %r0 ld %r4,RETVAL+8(%r1) addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr # else # A struct smaller than a dword is returned in the low bits of r3 @@ -328,63 +363,128 @@ ffi_closure_LINUX64: mtlr %r0 ld %r4,RETVAL+8(%r1) addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset STACKFRAME .Lsmalldown: addi %r5, %r5, FFI_V2_TYPE_SMALL_STRUCT + 7 mtlr %r0 sldi %r5, %r5, 3 addi %r1, %r1, STACKFRAME + .cfi_def_cfa_offset 0 srd %r3, %r3, %r5 blr # endif -.LFE1: - .long 0 - .byte 0,12,0,1,128,0,0,0 + .cfi_endproc # if _CALL_ELF == 2 .size ffi_closure_LINUX64,.-ffi_closure_LINUX64 # else # ifdef _CALL_LINUX .size ffi_closure_LINUX64,.-.L.ffi_closure_LINUX64 # else + .long 0 + .byte 0,12,0,1,128,0,0,0 .size .ffi_closure_LINUX64,.-.ffi_closure_LINUX64 # endif # endif - .section .eh_frame,EH_FRAME_FLAGS,@progbits -.Lframe1: - .4byte .LECIE1-.LSCIE1 # Length of Common Information Entry -.LSCIE1: - .4byte 0x0 # CIE Identifier Tag - .byte 0x1 # CIE Version - .ascii "zR\0" # CIE Augmentation - .uleb128 0x1 # CIE Code Alignment Factor - .sleb128 -8 # CIE Data Alignment Factor - .byte 0x41 # CIE RA Column - .uleb128 0x1 # Augmentation size - .byte 0x14 # FDE Encoding (pcrel udata8) - .byte 0xc # DW_CFA_def_cfa - .uleb128 0x1 - .uleb128 0x0 - .align 3 -.LECIE1: -.LSFDE1: - .4byte .LEFDE1-.LASFDE1 # FDE Length -.LASFDE1: - .4byte .LASFDE1-.Lframe1 # FDE CIE offset - .8byte .LFB1-. 
# FDE initial location - .8byte .LFE1-.LFB1 # FDE address range - .uleb128 0x0 # Augmentation size - .byte 0x2 # DW_CFA_advance_loc1 - .byte .LCFI0-.LFB1 - .byte 0xe # DW_CFA_def_cfa_offset - .uleb128 STACKFRAME - .byte 0x11 # DW_CFA_offset_extended_sf - .uleb128 0x41 - .sleb128 -2 - .align 3 -.LEFDE1: + FFI_HIDDEN (ffi_go_closure_linux64) + .globl ffi_go_closure_linux64 + .text + .cfi_startproc +# if _CALL_ELF == 2 +ffi_go_closure_linux64: + addis %r2, %r12, .TOC.-ffi_go_closure_linux64@ha + addi %r2, %r2, .TOC.-ffi_go_closure_linux64@l + .localentry ffi_go_closure_linux64, . - ffi_go_closure_linux64 +# else + .section ".opd","aw" + .align 3 +ffi_go_closure_linux64: +# ifdef _CALL_LINUX + .quad .L.ffi_go_closure_linux64,.TOC.@tocbase,0 + .type ffi_go_closure_linux64,@function + .text +.L.ffi_go_closure_linux64: +# else + FFI_HIDDEN (.ffi_go_closure_linux64) + .globl .ffi_go_closure_linux64 + .quad .ffi_go_closure_linux64,.TOC.@tocbase,0 + .size ffi_go_closure_linux64,24 + .type .ffi_go_closure_linux64,@function + .text +.ffi_go_closure_linux64: +# endif +# endif + +# if _CALL_ELF == 2 + ld %r12, 8(%r11) # closure->cif + mflr %r0 + lwz %r12, 28(%r12) # cif->flags + mtcrf 0x40, %r12 + addi %r12, %r1, PARMSAVE + bt 7, 0f + # Our caller has not allocated a parameter save area. + # We need to allocate one here and use it to pass gprs to + # ffi_closure_helper_LINUX64. + addi %r12, %r1, -STACKFRAME+PARMSAVE +0: + # Save general regs into parm save area + std %r3, 0(%r12) + std %r4, 8(%r12) + std %r5, 16(%r12) + std %r6, 24(%r12) + std %r7, 32(%r12) + std %r8, 40(%r12) + std %r9, 48(%r12) + std %r10, 56(%r12) + + # load up the pointer to the parm save area + mr %r7, %r12 +# else + # copy r2 to r11 and load TOC into r2 + mr %r11, %r2 + ld %r2, 16(%r11) + + mflr %r0 + # Save general regs into parm save area + # This is the parameter save area set up by our caller. 
+ std %r3, PARMSAVE+0(%r1) + std %r4, PARMSAVE+8(%r1) + std %r5, PARMSAVE+16(%r1) + std %r6, PARMSAVE+24(%r1) + std %r7, PARMSAVE+32(%r1) + std %r8, PARMSAVE+40(%r1) + std %r9, PARMSAVE+48(%r1) + std %r10, PARMSAVE+56(%r1) + + # load up the pointer to the parm save area + addi %r7, %r1, PARMSAVE +# endif + std %r0, 16(%r1) + + # closure->cif + ld %r3, 8(%r11) + # closure->fun + ld %r4, 16(%r11) + # user_data + mr %r5, %r11 + b .Ldoclosure + + .cfi_endproc +# if _CALL_ELF == 2 + .size ffi_go_closure_linux64,.-ffi_go_closure_linux64 +# else +# ifdef _CALL_LINUX + .size ffi_go_closure_linux64,.-.L.ffi_go_closure_linux64 +# else + .long 0 + .byte 0,12,0,1,128,0,0,0 + .size .ffi_go_closure_linux64,.-.ffi_go_closure_linux64 +# endif +# endif #endif #if (defined __ELF__ && defined __linux__) || _CALL_ELF == 2 diff --git a/src/powerpc/ppc_closure.S b/src/powerpc/ppc_closure.S index 075922c..b6d209d 100644 --- a/src/powerpc/ppc_closure.S +++ b/src/powerpc/ppc_closure.S @@ -33,13 +33,14 @@ #ifndef POWERPC64 +FFI_HIDDEN(ffi_closure_SYSV) ENTRY(ffi_closure_SYSV) -.LFB1: + .cfi_startproc stwu %r1,-144(%r1) -.LCFI0: + .cfi_def_cfa_offset 144 mflr %r0 -.LCFI1: stw %r0,148(%r1) + .cfi_offset 65, 4 # we want to build up an areas for the parameters passed # in registers (both floating point and integer) @@ -48,6 +49,17 @@ ENTRY(ffi_closure_SYSV) stw %r3, 16(%r1) stw %r4, 20(%r1) stw %r5, 24(%r1) + + # set up registers for the routine that does the work + + # closure->cif + lwz %r3,FFI_TRAMPOLINE_SIZE(%r11) + # closure->fun + lwz %r4,FFI_TRAMPOLINE_SIZE+4(%r11) + # closure->user_data + lwz %r5,FFI_TRAMPOLINE_SIZE+8(%r11) + +.Ldoclosure: stw %r6, 28(%r1) stw %r7, 32(%r1) stw %r8, 36(%r1) @@ -66,23 +78,18 @@ ENTRY(ffi_closure_SYSV) stfd %f8, 104(%r1) #endif - # set up registers for the routine that actually does the work - # get the context pointer from the trampoline - mr %r3,%r11 + # pointer to the result storage + addi %r6,%r1,112 - # now load up the pointer to the result storage - 
addi %r4,%r1,112 + # pointer to the saved gpr registers + addi %r7,%r1,16 - # now load up the pointer to the saved gpr registers - addi %r5,%r1,16 + # pointer to the saved fpr registers + addi %r8,%r1,48 - # now load up the pointer to the saved fpr registers */ - addi %r6,%r1,48 - - # now load up the pointer to the outgoing parameter - # stack in the previous frame + # pointer to the outgoing parameter save area in the previous frame # i.e. the previous frame pointer + 8 - addi %r7,%r1,152 + addi %r9,%r1,152 # make the call bl ffi_closure_helper_SYSV@local @@ -101,7 +108,6 @@ ENTRY(ffi_closure_SYSV) add %r3,%r3,%r4 # add contents of table to table address mtctr %r3 bctr # jump to it -.LFE1: # Each of the ret_typeX code fragments has to be exactly 16 bytes long # (4 instructions). For cache effectiveness we align to a 16 byte boundary @@ -111,7 +117,9 @@ ENTRY(ffi_closure_SYSV) .Lret_type0: mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 nop # case FFI_TYPE_INT @@ -119,31 +127,33 @@ ENTRY(ffi_closure_SYSV) mtlr %r0 .Lfinish: addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_TYPE_FLOAT #ifndef __NO_FPRS__ lfs %f1,112+0(%r1) - mtlr %r0 - addi %r1,%r1,144 #else nop - nop - nop #endif + mtlr %r0 + addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_TYPE_DOUBLE #ifndef __NO_FPRS__ lfd %f1,112+0(%r1) - mtlr %r0 - addi %r1,%r1,144 #else nop - nop - nop #endif + mtlr %r0 + addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_TYPE_LONGDOUBLE #ifndef __NO_FPRS__ @@ -152,10 +162,12 @@ ENTRY(ffi_closure_SYSV) mtlr %r0 b .Lfinish #else - nop - nop - nop + mtlr %r0 + addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 + nop #endif # case FFI_TYPE_UINT8 @@ -166,7 +178,9 @@ ENTRY(ffi_closure_SYSV) #endif mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_TYPE_SINT8 #ifdef __LITTLE_ENDIAN__ @@ -186,7 
+200,9 @@ ENTRY(ffi_closure_SYSV) #endif mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_TYPE_SINT16 #ifdef __LITTLE_ENDIAN__ @@ -196,19 +212,25 @@ ENTRY(ffi_closure_SYSV) #endif mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_TYPE_UINT32 lwz %r3,112+0(%r1) mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_TYPE_SINT32 lwz %r3,112+0(%r1) mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_TYPE_UINT64 lwz %r3,112+0(%r1) @@ -225,14 +247,18 @@ ENTRY(ffi_closure_SYSV) # case FFI_TYPE_STRUCT mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 nop # case FFI_TYPE_POINTER lwz %r3,112+0(%r1) mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_TYPE_UINT128 lwz %r3,112+0(%r1) @@ -245,20 +271,26 @@ ENTRY(ffi_closure_SYSV) lbz %r3,112+0(%r1) mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_SYSV_TYPE_SMALL_STRUCT + 2. Two byte struct. lhz %r3,112+0(%r1) mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_SYSV_TYPE_SMALL_STRUCT + 3. Three byte struct. lwz %r3,112+0(%r1) #ifdef __LITTLE_ENDIAN__ mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 #else srwi %r3,%r3,8 mtlr %r0 @@ -269,7 +301,9 @@ ENTRY(ffi_closure_SYSV) lwz %r3,112+0(%r1) mtlr %r0 addi %r1,%r1,144 + .cfi_def_cfa_offset 0 blr + .cfi_def_cfa_offset 144 # case FFI_SYSV_TYPE_SMALL_STRUCT + 5. Five byte struct. 
 	lwz %r3,112+0(%r1)
@@ -319,64 +353,43 @@ ENTRY(ffi_closure_SYSV)
 	or %r4,%r6,%r4
 	mtlr %r0
 	addi %r1,%r1,144
+	.cfi_def_cfa_offset 0
 	blr
+	.cfi_def_cfa_offset 144
 #endif

 .Luint128:
 	lwz %r6,112+12(%r1)
 	mtlr %r0
 	addi %r1,%r1,144
+	.cfi_def_cfa_offset 0
 	blr
-
+	.cfi_endproc
 END(ffi_closure_SYSV)

-	.section ".eh_frame",EH_FRAME_FLAGS,@progbits
-.Lframe1:
-	.4byte .LECIE1-.LSCIE1 # Length of Common Information Entry
-.LSCIE1:
-	.4byte 0x0 # CIE Identifier Tag
-	.byte 0x1 # CIE Version
-#if defined _RELOCATABLE || defined __PIC__
-	.ascii "zR\0" # CIE Augmentation
-#else
-	.ascii "\0" # CIE Augmentation
-#endif
-	.uleb128 0x1 # CIE Code Alignment Factor
-	.sleb128 -4 # CIE Data Alignment Factor
-	.byte 0x41 # CIE RA Column
-#if defined _RELOCATABLE || defined __PIC__
-	.uleb128 0x1 # Augmentation size
-	.byte 0x1b # FDE Encoding (pcrel sdata4)
-#endif
-	.byte 0xc # DW_CFA_def_cfa
-	.uleb128 0x1
-	.uleb128 0x0
-	.align 2
-.LECIE1:
-.LSFDE1:
-	.4byte .LEFDE1-.LASFDE1 # FDE Length
-.LASFDE1:
-	.4byte .LASFDE1-.Lframe1 # FDE CIE offset
-#if defined _RELOCATABLE || defined __PIC__
-	.4byte .LFB1-. # FDE initial location
-#else
-	.4byte .LFB1 # FDE initial location
-#endif
-	.4byte .LFE1-.LFB1 # FDE address range
-#if defined _RELOCATABLE || defined __PIC__
-	.uleb128 0x0 # Augmentation size
-#endif
-	.byte 0x4 # DW_CFA_advance_loc4
-	.4byte .LCFI0-.LFB1
-	.byte 0xe # DW_CFA_def_cfa_offset
-	.uleb128 144
-	.byte 0x4 # DW_CFA_advance_loc4
-	.4byte .LCFI1-.LCFI0
-	.byte 0x11 # DW_CFA_offset_extended_sf
-	.uleb128 0x41
-	.sleb128 -1
-	.align 2
-.LEFDE1:
+
+FFI_HIDDEN(ffi_go_closure_sysv)
+ENTRY(ffi_go_closure_sysv)
+	.cfi_startproc
+	stwu %r1,-144(%r1)
+	.cfi_def_cfa_offset 144
+	mflr %r0
+	stw %r0,148(%r1)
+	.cfi_offset 65, 4
+
+	stw %r3, 16(%r1)
+	stw %r4, 20(%r1)
+	stw %r5, 24(%r1)
+
+	# closure->cif
+	lwz %r3,4(%r11)
+	# closure->fun
+	lwz %r4,8(%r11)
+	# user_data
+	mr %r5,%r11
+	b .Ldoclosure
+	.cfi_endproc
+END(ffi_go_closure_sysv)

 #if defined __ELF__ && defined __linux__
 	.section .note.GNU-stack,"",@progbits
diff --git a/src/powerpc/sysv.S b/src/powerpc/sysv.S
index fed2380..1474ce7 100644
--- a/src/powerpc/sysv.S
+++ b/src/powerpc/sysv.S
@@ -31,34 +31,35 @@
 #include

 #ifndef POWERPC64
-	.globl ffi_prep_args_SYSV
+FFI_HIDDEN(ffi_call_SYSV)
 ENTRY(ffi_call_SYSV)
-.LFB1:
+	.cfi_startproc
 	/* Save the old stack pointer as AP. */
-	mr %r8,%r1
+	mr %r10,%r1
+	.cfi_def_cfa_register 10

-.LCFI0:
 	/* Allocate the stack space we need. */
-	stwux %r1,%r1,%r4
+	stwux %r1,%r1,%r8
 	/* Save registers we use. */
 	mflr %r9
-	stw %r28,-16(%r8)
-.LCFI1:
-	stw %r29,-12(%r8)
-.LCFI2:
-	stw %r30, -8(%r8)
-.LCFI3:
-	stw %r31, -4(%r8)
-.LCFI4:
-	stw %r9, 4(%r8)
-.LCFI5:
+	stw %r28,-16(%r10)
+	stw %r29,-12(%r10)
+	stw %r30, -8(%r10)
+	stw %r31, -4(%r10)
+	stw %r9, 4(%r10)
+	.cfi_offset 65, 4
+	.cfi_offset 31, -4
+	.cfi_offset 30, -8
+	.cfi_offset 29, -12
+	.cfi_offset 28, -16
 	/* Save arguments over call... */
-	mr %r31,%r5 /* flags, */
-	mr %r30,%r6 /* rvalue, */
-	mr %r29,%r7 /* function address, */
-	mr %r28,%r8 /* our AP. */
-.LCFI6:
+	stw %r7, -20(%r10) /* closure, */
+	mr %r31,%r6 /* flags, */
+	mr %r30,%r5 /* rvalue, */
+	mr %r29,%r4 /* function address, */
+	mr %r28,%r10 /* our AP. */
+	.cfi_def_cfa_register 28

 	/* Call ffi_prep_args_SYSV. */
 	mr %r4,%r1
@@ -70,35 +71,36 @@ ENTRY(ffi_call_SYSV)
 	/* Get the address to call into CTR. */
 	mtctr %r29
 	/* Load all those argument registers. */
-	lwz %r3,-16-(8*4)(%r28)
-	lwz %r4,-16-(7*4)(%r28)
-	lwz %r5,-16-(6*4)(%r28)
-	lwz %r6,-16-(5*4)(%r28)
+	lwz %r3,-24-(8*4)(%r28)
+	lwz %r4,-24-(7*4)(%r28)
+	lwz %r5,-24-(6*4)(%r28)
+	lwz %r6,-24-(5*4)(%r28)
 	bf- 5,1f
 	nop
-	lwz %r7,-16-(4*4)(%r28)
-	lwz %r8,-16-(3*4)(%r28)
-	lwz %r9,-16-(2*4)(%r28)
-	lwz %r10,-16-(1*4)(%r28)
+	lwz %r7,-24-(4*4)(%r28)
+	lwz %r8,-24-(3*4)(%r28)
+	lwz %r9,-24-(2*4)(%r28)
+	lwz %r10,-24-(1*4)(%r28)
 	nop
 1:

 #ifndef __NO_FPRS__
 	/* Load all the FP registers. */
 	bf- 6,2f
-	lfd %f1,-16-(8*4)-(8*8)(%r28)
-	lfd %f2,-16-(8*4)-(7*8)(%r28)
-	lfd %f3,-16-(8*4)-(6*8)(%r28)
-	lfd %f4,-16-(8*4)-(5*8)(%r28)
+	lfd %f1,-24-(8*4)-(8*8)(%r28)
+	lfd %f2,-24-(8*4)-(7*8)(%r28)
+	lfd %f3,-24-(8*4)-(6*8)(%r28)
+	lfd %f4,-24-(8*4)-(5*8)(%r28)
 	nop
-	lfd %f5,-16-(8*4)-(4*8)(%r28)
-	lfd %f6,-16-(8*4)-(3*8)(%r28)
-	lfd %f7,-16-(8*4)-(2*8)(%r28)
-	lfd %f8,-16-(8*4)-(1*8)(%r28)
+	lfd %f5,-24-(8*4)-(4*8)(%r28)
+	lfd %f6,-24-(8*4)-(3*8)(%r28)
+	lfd %f7,-24-(8*4)-(2*8)(%r28)
+	lfd %f8,-24-(8*4)-(1*8)(%r28)
 #endif
 2:

 	/* Make the call. */
+	lwz %r11, -20(%r28)
 	bctrl

 	/* Now, deal with the return value. */
@@ -125,11 +127,24 @@ L(done_return_value):
 	lwz %r30, -8(%r28)
 	lwz %r29,-12(%r28)
 	lwz %r28,-16(%r28)
+	.cfi_remember_state
+	/* At this point we don't have a cfa register. Say all our
+	   saved regs have been restored. */
+	.cfi_same_value 65
+	.cfi_same_value 31
+	.cfi_same_value 30
+	.cfi_same_value 29
+	.cfi_same_value 28
+	/* Hopefully this works.. */
+	.cfi_def_cfa_register 1
+	.cfi_offset 1, 0
 	lwz %r1,0(%r1)
+	.cfi_same_value 1
 	blr

 #ifndef __NO_FPRS__
 L(fp_return_value):
+	.cfi_restore_state
 	bf 28,L(float_return_value)
 	stfd %f1,0(%r30)
 	mtcrf 0x02,%r31 /* cr6 */
@@ -150,70 +165,10 @@ L(small_struct_return_value):
 	stw %r3, 0(%r30)
 	stw %r4, 4(%r30)
 	b L(done_return_value)
+	.cfi_endproc

-.LFE1:
 END(ffi_call_SYSV)

-	.section ".eh_frame",EH_FRAME_FLAGS,@progbits
-.Lframe1:
-	.4byte .LECIE1-.LSCIE1 /* Length of Common Information Entry */
-.LSCIE1:
-	.4byte 0x0 /* CIE Identifier Tag */
-	.byte 0x1 /* CIE Version */
-#if defined _RELOCATABLE || defined __PIC__
-	.ascii "zR\0" /* CIE Augmentation */
-#else
-	.ascii "\0" /* CIE Augmentation */
-#endif
-	.uleb128 0x1 /* CIE Code Alignment Factor */
-	.sleb128 -4 /* CIE Data Alignment Factor */
-	.byte 0x41 /* CIE RA Column */
-#if defined _RELOCATABLE || defined __PIC__
-	.uleb128 0x1 /* Augmentation size */
-	.byte 0x1b /* FDE Encoding (pcrel sdata4) */
-#endif
-	.byte 0xc /* DW_CFA_def_cfa */
-	.uleb128 0x1
-	.uleb128 0x0
-	.align 2
-.LECIE1:
-.LSFDE1:
-	.4byte .LEFDE1-.LASFDE1 /* FDE Length */
-.LASFDE1:
-	.4byte .LASFDE1-.Lframe1 /* FDE CIE offset */
-#if defined _RELOCATABLE || defined __PIC__
-	.4byte .LFB1-. /* FDE initial location */
-#else
-	.4byte .LFB1 /* FDE initial location */
-#endif
-	.4byte .LFE1-.LFB1 /* FDE address range */
-#if defined _RELOCATABLE || defined __PIC__
-	.uleb128 0x0 /* Augmentation size */
-#endif
-	.byte 0x4 /* DW_CFA_advance_loc4 */
-	.4byte .LCFI0-.LFB1
-	.byte 0xd /* DW_CFA_def_cfa_register */
-	.uleb128 0x08
-	.byte 0x4 /* DW_CFA_advance_loc4 */
-	.4byte .LCFI5-.LCFI0
-	.byte 0x11 /* DW_CFA_offset_extended_sf */
-	.uleb128 0x41
-	.sleb128 -1
-	.byte 0x9f /* DW_CFA_offset, column 0x1f */
-	.uleb128 0x1
-	.byte 0x9e /* DW_CFA_offset, column 0x1e */
-	.uleb128 0x2
-	.byte 0x9d /* DW_CFA_offset, column 0x1d */
-	.uleb128 0x3
-	.byte 0x9c /* DW_CFA_offset, column 0x1c */
-	.uleb128 0x4
-	.byte 0x4 /* DW_CFA_advance_loc4 */
-	.4byte .LCFI6-.LCFI5
-	.byte 0xd /* DW_CFA_def_cfa_register */
-	.uleb128 0x1c
-	.align 2
-.LEFDE1:
-
 #if defined __ELF__ && defined __linux__
 	.section .note.GNU-stack,"",@progbits
 #endif
-- 
2.1.0

-- 
Alan Modra
Australia Development Lab, IBM