* [Patch 0/4] PowerPC64 Linux split stack support
From: Alan Modra @ 2015-05-18 2:54 UTC
To: gcc-patches; +Cc: David Edelsohn

The following series of patches adds -fsplit-stack support for
powerpc64-linux. Each was cumulatively bootstrapped and regression
tested on powerpc64-linux and powerpc64le-linux.

-- 
Alan Modra
Australia Development Lab, IBM

* [PATCH 1/4] rs6000_stack_info changes for -fsplit-stack
From: Alan Modra @ 2015-05-18 2:54 UTC
To: gcc-patches; +Cc: David Edelsohn

This patch changes rs6000_stack_info to keep save areas offsets even
when not used. I need lr_save_offset valid for split-stack, and it
seemed reasonable to treat the other offsets the same. Not zeroing
the offsets requires just one change in code that uses them, the
use_backchain_to_restore_sp expression in rs6000_emit_epilogue, not
counting the debug_stack_info changes.

        * config/rs6000/rs6000.c (rs6000_stack_info): Don't zero offsets
        when not saving registers.
        (debug_stack_info): Adjust to omit printing unused offsets,
        as before.
        (rs6000_emit_epilogue): Adjust use_backchain_to_restore_sp
        expression.

diff -urp gcc-virgin/gcc/config/rs6000/rs6000.c gcc-stack-info1/gcc/config/rs6000/rs6000.c
--- gcc-virgin/gcc/config/rs6000/rs6000.c 2015-05-15 14:15:38.157244403 +0930
+++ gcc-stack-info1/gcc/config/rs6000/rs6000.c 2015-05-18 09:44:34.027608414 +0930
@@ -22014,31 +22014,6 @@ rs6000_stack_info (void)
   else
     info_ptr->push_p = non_fixed_size > (TARGET_32BIT ? 220 : 288);
 
-  /* Zero offsets if we're not saving those registers. */
-  if (info_ptr->fp_size == 0)
-    info_ptr->fp_save_offset = 0;
-
-  if (info_ptr->gp_size == 0)
-    info_ptr->gp_save_offset = 0;
-
-  if (! TARGET_ALTIVEC_ABI || info_ptr->altivec_size == 0)
-    info_ptr->altivec_save_offset = 0;
-
-  /* Zero VRSAVE offset if not saved and restored. */
-  if (! TARGET_ALTIVEC_VRSAVE || info_ptr->vrsave_mask == 0)
-    info_ptr->vrsave_save_offset = 0;
-
-  if (! TARGET_SPE_ABI
-      || info_ptr->spe_64bit_regs_used == 0
-      || info_ptr->spe_gp_size == 0)
-    info_ptr->spe_gp_save_offset = 0;
-
-  if (! info_ptr->lr_save_p)
-    info_ptr->lr_save_offset = 0;
-
-  if (! info_ptr->cr_save_p)
-    info_ptr->cr_save_offset = 0;
-
   return info_ptr;
 }
 
@@ -22144,28 +22119,28 @@ debug_stack_info (rs6000_stack_t *info)
   if (info->calls_p)
     fprintf (stderr, "\tcalls_p = %5d\n", info->calls_p);
 
-  if (info->gp_save_offset)
+  if (info->gp_size)
     fprintf (stderr, "\tgp_save_offset = %5d\n", info->gp_save_offset);
 
-  if (info->fp_save_offset)
+  if (info->fp_size)
     fprintf (stderr, "\tfp_save_offset = %5d\n", info->fp_save_offset);
 
-  if (info->altivec_save_offset)
+  if (info->altivec_size)
     fprintf (stderr, "\taltivec_save_offset = %5d\n",
              info->altivec_save_offset);
 
-  if (info->spe_gp_save_offset)
+  if (info->spe_gp_size == 0)
     fprintf (stderr, "\tspe_gp_save_offset = %5d\n",
              info->spe_gp_save_offset);
 
-  if (info->vrsave_save_offset)
+  if (info->vrsave_size)
     fprintf (stderr, "\tvrsave_save_offset = %5d\n",
              info->vrsave_save_offset);
 
-  if (info->lr_save_offset)
+  if (info->lr_save_p)
     fprintf (stderr, "\tlr_save_offset = %5d\n", info->lr_save_offset);
 
-  if (info->cr_save_offset)
+  if (info->cr_save_p)
     fprintf (stderr, "\tcr_save_offset = %5d\n", info->cr_save_offset);
 
   if (info->varargs_save_offset)
@@ -24736,7 +24711,9 @@ rs6000_emit_epilogue (int sibcall)
      here will not trigger at the moment; We don't actually need a
      frame pointer for alloca, but the generic parts of the compiler
      give us one anyway. */
-  use_backchain_to_restore_sp = (info->total_size > 32767 - info->lr_save_offset
+  use_backchain_to_restore_sp = (info->total_size + (info->lr_save_p
+                                                     ? info->lr_save_offset
+                                                     : 0) > 32767
                                  || (cfun->calls_alloca
                                      && !frame_pointer_needed));
   restore_lr = (info->lr_save_p

-- 
Alan Modra
Australia Development Lab, IBM

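The reason the use_backchain_to_restore_sp test has to change can be shown with a small standalone sketch. The old test, total_size > 32767 - lr_save_offset, is the same as total_size + lr_save_offset > 32767, so it only gave the intended answer while lr_save_offset was forced to zero for functions that do not save LR. The struct, the function names and the sample numbers below are hypothetical stand-ins rather than GCC code, and the calls_alloca clause is left out.

```c
#include <stdbool.h>

/* Hypothetical cut-down stand-in for rs6000_stack_t: only the fields
   the epilogue test reads.  */
struct stack_info
{
  long total_size;     /* total bytes allocated for the frame */
  bool lr_save_p;      /* true if LR is actually saved */
  int lr_save_offset;  /* offset of the LR save slot */
};

/* Old form of the test.  Only right while rs6000_stack_info zeroed
   lr_save_offset whenever lr_save_p was false.  */
bool
backchain_old (const struct stack_info *info)
{
  return info->total_size > 32767 - info->lr_save_offset;
}

/* Form used after the patch: the offset only counts when LR is saved,
   so an offset that is kept but unused cannot change the result.  */
bool
backchain_new (const struct stack_info *info)
{
  return (info->total_size
          + (info->lr_save_p ? info->lr_save_offset : 0)) > 32767;
}
```

With a kept offset of 16 and no LR save, a 32760-byte frame trips the old test but not the new one; when LR is saved, both forms agree.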
* Re: [PATCH 1/4] rs6000_stack_info changes for -fsplit-stack
From: David Edelsohn @ 2015-05-18 18:08 UTC
To: GCC Patches

On Sun, May 17, 2015 at 10:54 PM, Alan Modra <amodra@gmail.com> wrote:
> This patch changes rs6000_stack_info to keep save areas offsets even
> when not used. I need lr_save_offset valid for split-stack, and it
> seemed reasonable to treat the other offsets the same. Not zeroing
> the offsets requires just one change in code that uses them, the
> use_backchain_to_restore_sp expression in rs6000_emit_epilogue, not
> counting the debug_stack_info changes.
>
>         * config/rs6000/rs6000.c (rs6000_stack_info): Don't zero offsets
>         when not saving registers.
>         (debug_stack_info): Adjust to omit printing unused offsets,
>         as before.
>         (rs6000_emit_epilogue): Adjust use_backchain_to_restore_sp
>         expression.

I think that the vrsave_save_offset change may break saving of
callee-saved VRs. See PR 55276.

Additional points for converting the testcase into one that can be
included in the GCC testsuite.

Thanks, David

* Re: [PATCH 1/4] rs6000_stack_info changes for -fsplit-stack
From: Alan Modra @ 2015-05-20 1:45 UTC
To: David Edelsohn; +Cc: GCC Patches

On Mon, May 18, 2015 at 02:05:59PM -0400, David Edelsohn wrote:
> On Sun, May 17, 2015 at 10:54 PM, Alan Modra <amodra@gmail.com> wrote:
> > This patch changes rs6000_stack_info to keep save areas offsets even
> > when not used. I need lr_save_offset valid for split-stack, and it
> > seemed reasonable to treat the other offsets the same. Not zeroing
> > the offsets requires just one change in code that uses them, the
> > use_backchain_to_restore_sp expression in rs6000_emit_epilogue, not
> > counting the debug_stack_info changes.
> >
> >         * config/rs6000/rs6000.c (rs6000_stack_info): Don't zero offsets
> >         when not saving registers.
> >         (debug_stack_info): Adjust to omit printing unused offsets,
> >         as before.
> >         (rs6000_emit_epilogue): Adjust use_backchain_to_restore_sp
> >         expression.
>
> I think that the vrsave_save_offset change may break saving of
> callee-saved VRs. See PR 55276.

I checked. It doesn't break that testcase. PR 55276 was really
caused by using vrsave_mask for two purposes, firstly to track which
altivec registers have been saved, and secondly to control use of the
vrsave stack slot and whether mfvrsave/mtvrsave insns are generated.
Patch 2/4 removes this conflation.

-- 
Alan Modra
Australia Development Lab, IBM

* Re: [PATCH 1/4] rs6000_stack_info changes for -fsplit-stack
From: David Edelsohn @ 2015-05-20 13:07 UTC
To: GCC Patches, Alan Modra

On Tue, May 19, 2015 at 9:09 PM, Alan Modra <amodra@gmail.com> wrote:
> On Mon, May 18, 2015 at 02:05:59PM -0400, David Edelsohn wrote:
>> On Sun, May 17, 2015 at 10:54 PM, Alan Modra <amodra@gmail.com> wrote:
>> > This patch changes rs6000_stack_info to keep save areas offsets even
>> > when not used. I need lr_save_offset valid for split-stack, and it
>> > seemed reasonable to treat the other offsets the same. Not zeroing
>> > the offsets requires just one change in code that uses them, the
>> > use_backchain_to_restore_sp expression in rs6000_emit_epilogue, not
>> > counting the debug_stack_info changes.
>> >
>> >         * config/rs6000/rs6000.c (rs6000_stack_info): Don't zero offsets
>> >         when not saving registers.
>> >         (debug_stack_info): Adjust to omit printing unused offsets,
>> >         as before.
>> >         (rs6000_emit_epilogue): Adjust use_backchain_to_restore_sp
>> >         expression.
>>
>> I think that the vrsave_save_offset change may break saving of
>> callee-saved VRs. See PR 55276.
>
> I checked. It doesn't break that testcase. PR 55276 was really
> caused by using vrsave_mask for two purposes, firstly to track which
> altivec registers have been saved, and secondly to control use of the
> vrsave stack slot and whether mfvrsave/mtvrsave insns are generated.
> Patch 2/4 removes this conflation.

Okay, but that confirms Patch 1 is not safe without the patch series.

- David

* Re: [PATCH 1/4] rs6000_stack_info changes for -fsplit-stack
From: Alan Modra @ 2015-05-20 13:29 UTC
To: David Edelsohn; +Cc: GCC Patches

On Wed, May 20, 2015 at 09:02:40AM -0400, David Edelsohn wrote:
> On Tue, May 19, 2015 at 9:09 PM, Alan Modra <amodra@gmail.com> wrote:
> > On Mon, May 18, 2015 at 02:05:59PM -0400, David Edelsohn wrote:
> >> On Sun, May 17, 2015 at 10:54 PM, Alan Modra <amodra@gmail.com> wrote:
> >> > This patch changes rs6000_stack_info to keep save areas offsets even
> >> > when not used. I need lr_save_offset valid for split-stack, and it
> >> > seemed reasonable to treat the other offsets the same. Not zeroing
> >> > the offsets requires just one change in code that uses them, the
> >> > use_backchain_to_restore_sp expression in rs6000_emit_epilogue, not
> >> > counting the debug_stack_info changes.
> >> >
> >> >         * config/rs6000/rs6000.c (rs6000_stack_info): Don't zero offsets
> >> >         when not saving registers.
> >> >         (debug_stack_info): Adjust to omit printing unused offsets,
> >> >         as before.
> >> >         (rs6000_emit_epilogue): Adjust use_backchain_to_restore_sp
> >> >         expression.
> >>
> >> I think that the vrsave_save_offset change may break saving of
> >> callee-saved VRs. See PR 55276.
> >
> > I checked. It doesn't break that testcase. PR 55276 was really
> > caused by using vrsave_mask for two purposes, firstly to track which
> > altivec registers have been saved, and secondly to control use of the
> > vrsave stack slot and whether mfvrsave/mtvrsave insns are generated.
> > Patch 2/4 removes this conflation.
>
> Okay, but that confirms Patch 1 is not safe without the patch series.

No, patch 1/4 is safe by itself. That's what I tested when I said I'd
checked. Patch 2/4 doesn't correct a fault in patch 1/4. The
explanation I gave re PR 55276 is saying that patch 2/4 prevents the
confusion that caused PR 55276 from re-occurring, at least as far as
vrsave_mask is concerned.

-- 
Alan Modra
Australia Development Lab, IBM

* [PATCH 2/4] prologue and epilogue tidy and -mno-vrsave bug fix
From: Alan Modra @ 2015-05-18 2:55 UTC
To: gcc-patches; +Cc: David Edelsohn

This patch tidies the prologue and epilogue altivec code a little.
A number of places using info->altivec_size unnecessarily also test
TARGET_ALTIVEC_ABI, when rs6000_stack_info() guarantees that
info->altivec_size is zero if !TARGET_ALTIVEC_ABI.

Similarly by inspection of rs6000_stack_info() code,
TARGET_ALTIVEC_VRSAVE && info->vrsave_mask != 0, used when deciding to
save or restore vrsave, can be replaced with info->vrsave_size. I
also removed the TARGET_ALTIVEC test used with save/restore of vrsave.
I believe it is redundant because compute_vrsave_mask() will return 0
when no altivec registers are used (and of course you can't use them
without TARGET_ALTIVEC), except for Darwin where TARGET_ALTIVEC is
forced. The vrsave changes make the code actually doing the save or
restore visually consistent with code that sets up a frame register
for vrsave.

Finally, I've changed two places that use info->vrsave_mask to test
whether vrsave is saved or restored, to use info->vrsave_size. This
is a bug fix for -mno-vrsave.

        * config/rs6000/rs6000.c (struct rs6000_stack): Correct comments.
        (rs6000_stack_info): Don't zero offsets when not saving registers.
        (debug_stack_info): Adjust to omit printing unused offsets,
        as before.
        (direct_return): Test vrsave_size rather than vrsave_mask.
        (rs6000_emit_prologue): Likewise. Remove redundant altivec tests.
        (rs6000_emit_epilogue): Likewise.

diff -urp gcc-stack-info1/gcc/config/rs6000/rs6000.c gcc-stack-info2/gcc/config/rs6000/rs6000.c
--- gcc-stack-info1/gcc/config/rs6000/rs6000.c 2015-05-18 09:44:34.027608414 +0930
+++ gcc-stack-info2/gcc/config/rs6000/rs6000.c 2015-05-16 13:33:37.170406399 +0930
@@ -155,10 +155,9 @@ typedef struct rs6000_stack {
   int gp_size;                  /* size of saved GP registers */
   int fp_size;                  /* size of saved FP registers */
   int altivec_size;             /* size of saved AltiVec registers */
-  int cr_size;                  /* size to hold CR if not in save_size */
-  int vrsave_size;              /* size to hold VRSAVE if not in save_size */
-  int altivec_padding_size;     /* size of altivec alignment padding if
-                                   not in save_size */
+  int cr_size;                  /* size to hold CR if not in fixed area */
+  int vrsave_size;              /* size to hold VRSAVE */
+  int altivec_padding_size;     /* size of altivec alignment padding */
   int spe_gp_size;              /* size of 64-bit GPR save size for SPE */
   int spe_padding_size;
   HOST_WIDE_INT total_size;     /* total bytes allocated for stack */
@@ -5206,7 +5205,7 @@ direct_return (void)
           && info->first_altivec_reg_save == LAST_ALTIVEC_REGNO + 1
           && ! info->lr_save_p
           && ! info->cr_save_p
-          && info->vrsave_mask == 0
+          && info->vrsave_size == 0
           && ! info->push_p)
         return 1;
     }
@@ -23637,7 +23636,7 @@ rs6000_emit_prologue (void)
           || info->first_fp_reg_save < 64
           || info->first_gp_reg_save < 32
           || info->altivec_size != 0
-          || info->vrsave_mask != 0
+          || info->vrsave_size != 0
           || crtl->calls_eh_return)
         ptr_regno = 12;
       else
@@ -24185,7 +24184,7 @@ rs6000_emit_prologue (void)
 
   /* Save AltiVec registers if needed. Save here because the red zone does
      not always include AltiVec registers. */
-  if (!WORLD_SAVE_P (info) && TARGET_ALTIVEC_ABI
+  if (!WORLD_SAVE_P (info)
       && info->altivec_size != 0 && (strategy & SAVE_INLINE_VRS) == 0)
     {
       int end_save = info->altivec_save_offset + info->altivec_size;
@@ -24221,7 +24220,7 @@ rs6000_emit_prologue (void)
           frame_off = ptr_off;
         }
     }
-  else if (!WORLD_SAVE_P (info) && TARGET_ALTIVEC_ABI
+  else if (!WORLD_SAVE_P (info)
            && info->altivec_size != 0)
     {
       int i;
@@ -24263,9 +24262,7 @@ rs6000_emit_prologue (void)
      epilogue. */
 
   if (!WORLD_SAVE_P (info)
-      && TARGET_ALTIVEC
-      && TARGET_ALTIVEC_VRSAVE
-      && info->vrsave_mask != 0)
+      && info->vrsave_size != 0)
     {
       rtx reg, vrsave;
       int offset;
@@ -24827,8 +24824,7 @@ rs6000_emit_epilogue (int sibcall)
 
   /* Restore AltiVec registers if we must do so before adjusting the
      stack. */
-  if (TARGET_ALTIVEC_ABI
-      && info->altivec_size != 0
+  if (info->altivec_size != 0
       && (ALWAYS_RESTORE_ALTIVEC_BEFORE_POP
           || (DEFAULT_ABI != ABI_V4
               && offset_below_red_zone_p (info->altivec_save_offset))))
@@ -24915,9 +24911,7 @@ rs6000_emit_epilogue (int sibcall)
     }
 
   /* Restore VRSAVE if we must do so before adjusting the stack. */
-  if (TARGET_ALTIVEC
-      && TARGET_ALTIVEC_VRSAVE
-      && info->vrsave_mask != 0
+  if (info->vrsave_size != 0
       && (ALWAYS_RESTORE_ALTIVEC_BEFORE_POP
           || (DEFAULT_ABI != ABI_V4
               && offset_below_red_zone_p (info->vrsave_save_offset))))
@@ -25011,7 +25005,6 @@ rs6000_emit_epilogue (int sibcall)
 
   /* Restore AltiVec registers if we have not done so already. */
   if (!ALWAYS_RESTORE_ALTIVEC_BEFORE_POP
-      && TARGET_ALTIVEC_ABI
       && info->altivec_size != 0
       && (DEFAULT_ABI == ABI_V4
           || !offset_below_red_zone_p (info->altivec_save_offset)))
@@ -25119,9 +25112,7 @@ rs6000_emit_epilogue (int sibcall)
 
   /* Restore VRSAVE if we have not done so already. */
   if (!ALWAYS_RESTORE_ALTIVEC_BEFORE_POP
-      && TARGET_ALTIVEC
-      && TARGET_ALTIVEC_VRSAVE
-      && info->vrsave_mask != 0
+      && info->vrsave_size != 0
       && (DEFAULT_ABI == ABI_V4
           || !offset_below_red_zone_p (info->vrsave_save_offset)))
     {

-- 
Alan Modra
Australia Development Lab, IBM

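The -mno-vrsave fix in this patch comes down to which field is tested. A minimal standalone sketch of the difference follows. It is not GCC code: the struct, the function names, the boolean standing in for TARGET_ALTIVEC_VRSAVE and the 4-byte slot size are all assumptions made for illustration, and the derivation of vrsave_size simply follows the description in the patch text.

```c
#include <stdbool.h>

/* Hypothetical model of the two fields; not the real rs6000_stack_t.  */
struct vr_info
{
  unsigned int vrsave_mask;  /* which AltiVec registers are in use */
  int vrsave_size;           /* bytes reserved for the VRSAVE slot */
};

/* Assumed derivation: a VRSAVE slot exists only when VRSAVE handling
   is enabled and at least one AltiVec register is used.  The size of
   4 bytes is an assumption.  */
struct vr_info
compute_vr_info (bool target_altivec_vrsave, unsigned int used_mask)
{
  struct vr_info info;
  info.vrsave_mask = used_mask;
  info.vrsave_size = (target_altivec_vrsave && used_mask != 0) ? 4 : 0;
  return info;
}

/* Test that looks only at the mask, so it ignores -mno-vrsave.  */
bool
saves_vrsave_by_mask (const struct vr_info *info)
{
  return info->vrsave_mask != 0;
}

/* Test on the slot size, as used after the patch.  */
bool
saves_vrsave_by_size (const struct vr_info *info)
{
  return info->vrsave_size != 0;
}
```

Under these assumptions the two tests disagree only when AltiVec registers are in use but VRSAVE handling is turned off, which is the -mno-vrsave case the patch addresses.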
* Re: [PATCH 2/4] prologue and epilogue tidy and -mno-vrsave bug fix
From: David Edelsohn @ 2015-05-19 14:35 UTC
To: GCC Patches, Alan Modra

On Sun, May 17, 2015 at 10:54 PM, Alan Modra <amodra@gmail.com> wrote:
> This patch tidies the prologue and epilogue altivec code a little.
> A number of places using info->altivec_size unnecessarily also test
> TARGET_ALTIVEC_ABI, when rs6000_stack_info() guarantees that
> info->altivec_size is zero if !TARGET_ALTIVEC_ABI.
>
> Similarly by inspection of rs6000_stack_info() code,
> TARGET_ALTIVEC_VRSAVE && info->vrsave_mask != 0, used when deciding to
> save or restore vrsave, can be replaced with info->vrsave_size. I
> also removed the TARGET_ALTIVEC test used with save/restore of vrsave.
> I believe it is redundant because compute_vrsave_mask() will return 0
> when no altivec registers are used (and of course you can't use them
> without TARGET_ALTIVEC), except for Darwin where TARGET_ALTIVEC is
> forced. The vrsave changes make the code actually doing the save or
> restore visually consistent with code that sets up a frame register
> for vrsave.
>
> Finally, I've changed two places that use info->vrsave_mask to test
> whether vrsave is saved or restored, to use info->vrsave_size. This
> is a bug fix for -mno-vrsave.
>
>         * config/rs6000/rs6000.c (struct rs6000_stack): Correct comments.
>         (rs6000_stack_info): Don't zero offsets when not saving registers.
>         (debug_stack_info): Adjust to omit printing unused offsets,
>         as before.
>         (direct_return): Test vrsave_size rather than vrsave_mask.
>         (rs6000_emit_prologue): Likewise. Remove redundant altivec tests.
>         (rs6000_emit_epilogue): Likewise.

This patch is okay.

My only concern is Patch 1 causing a regression for the PR that I
mentioned.

Thanks, David

* [PATCH 3/4] split-stack for powerpc64
From: Alan Modra @ 2015-05-18 2:55 UTC
To: gcc-patches; +Cc: David Edelsohn

This patch adds -fsplit-stack support for PowerPC64 Linux. I haven't
made any real attempt to support ppc32 at this stage, but that should
mostly be a matter of writing __morestack for ppc32.

The idea of split-stack is to allocate just enough stack to execute a
function, with checks added before function entry and on alloca to
ensure the stack is large enough. If the stack size is insufficient,
a new stack segment is allocated for the function. The new stack and
old stack are not necessarily contiguous. For powerpc64, function
arguments on the old stack are accessed by using an arg_pointer
register rather than accessing them relative to the stack pointer or
frame pointer as is usually done. (x86 copies function arguments
from the old stack to the new, but needs an arg pointer for variable
argument lists.) Unwinding is handled by a personality routine that
knows how to find stack segments.

Split-stack prologue on function entry (local entry point for ELFv2)
is as follows. This goes before the usual function prologue.

entry:
        ld %r0,-0x7000-64(%r13)  # tcbhead_t.__private_ss
        addis %r12,%r1,-allocate@ha
        addi %r12,%r12,-allocate@l
        cmpld %cr7,%r12,%r0
        bge+ %cr7,enough
        mflr %r0
        std %r0,16(%r1)
        bl __morestack
        ld %r0,16(%r1)
        mtlr %r0
        blr
enough:
        # usual function prologue, modified a little at the end to set up the
        # arg_pointer in %r12, starts here. The arg_pointer is initialized,
        # if it is used, with
        addi %r12,%r1,frame_size
        bge %cr7,.+8
        mr %r12,%r29

Notes:
1) A function that does not allocate a stack frame, does not have a
   split-stack prologue.
2) __morestack must be local.
   __morestack has a non-standard calling convention, with the desired
   stack being passed in %r12. It saves arg passing regs, calls
   __generic_morestack to allocate a new stack segment, restores the
   arg passing regs and sets r29 to point at the old stack, then calls
   its return address + 12 to execute the function. After the function
   returns __morestack saves return regs, calls __generic_releasestack,
   and returns to the split-stack prologue, which immediately returns.
   This scheme keeps hardware return prediction valid. __morestack
   must also ensure cr7 is correctly set.
3) Basic-block reordering (enabled with -O2) will move the six
   instructions after the "bge+" out of line.
4) When the stack allocation is less than 32k these two instructions
        addis %r12,%r1,-allocate@ha
        addi %r12,%r12,-allocate@l
   are rewritten as
        addi %r12,%r1,-allocate
        nop
   The addi may also be rewritten as a nop in the rare case that the
   stack allocation is exactly a multiple of 64k.
5) When the linker detects a call from split-stack to non-split-stack
   code, it adds 16k (or more) to the value found in "allocate"
   instructions. So non-split-stack code gets a larger stack. The
   amount is tunable by a linker option. The edit means powerpc64 does
   not need to implement __morestack_non_split, necessary on x86
   because insufficient space is available there to edit the stack
   comparison code. This feature is only implemented in the GNU gold
   linker.
6) We won't handle >2G stack initially and perhaps never. Supporting
   multiple threads each requiring more than 2G of stack is probably
   not that important, and likely to OOM at run time. (It would be
   possible to easily handle up to 4G by rounding the allocation up to
   a multiple of 64k and using two addis instructions in the
   split-stack prologue.)
7) If __morestack is called, then there are two stack frames between
   the function and its caller. Immediately above is a small 32 byte
   frame on the new stack, there so that a back-chain is always present
   no matter the value of r1. This could be reduced to 16 bytes but I
   thought it better to waste a few bytes for 32-byte alignment in case
   powerpc64 goes to 32-byte aligned stacks. Above that frame is the
   __morestack frame on the old stack.
8) If the normal function prologue uses r12 as a frame pointer, as it
   always does when the frame size is larger than 32k, then the arg
   pointer is set up with
        addi %r12,%r12,to_top_of_frame
        bge %cr7,.+8
        mr %r12,%r29
   omitting the addi if to_top_of_frame is zero.

gcc/
        * common/config/rs6000/rs6000-common.c (TARGET_SUPPORTS_SPLIT_STACK):
        Define.
        (rs6000_supports_split_stack): New function.
        * gcc/config/rs6000/rs6000.c (machine_function): Add
        split_stack_arg_pointer.
        (TARGET_EXTRA_LIVE_ON_ENTRY, TARGET_INTERNAL_ARG_POINTER): Define.
        (setup_incoming_varargs): Use crtl->args.internal_arg_pointer
        rather than virtual_incoming_args_rtx.
        (rs6000_va_start): Likewise.
        (split_stack_arg_pointer_used_p): New function.
        (rs6000_emit_prologue): Set up arg pointer for -fsplit-stack.
        (morestack_ref): New var.
        (gen_add3_const, rs6000_expand_split_stack_prologue,
        rs6000_internal_arg_pointer, rs6000_live_on_entry,
        rs6000_split_stack_space_check): New functions.
        (rs6000_elf_file_end): Call file_end_indicate_split_stack.
        * gcc/config/rs6000/rs6000.md (UNSPEC_STACK_CHECK): Define.
        (UNSPECV_SPLIT_STACK_RETURN): Define.
        (split_stack_prologue, load_split_stack_limit,
        load_split_stack_limit_di, load_split_stack_limit_si,
        split_stack_return, split_stack_space_check): New expands and insns.
        * gcc/config/rs6000/rs6000-protos.h
        (rs6000_expand_split_stack_prologue): Declare.
        (rs6000_split_stack_space_check): Declare.
libgcc/
        * config/rs6000/morestack.S: New.
        * config/rs6000/t-stack-rs6000: New.
        * config.host (powerpc*-*-linux*): Add t-stack and t-stack-rs6000
        to tmake_file.
        * generic-morestack.c: Don't build for powerpc 32-bit.

diff -urpN gcc-stack-info2/gcc/common/config/rs6000/rs6000-common.c gcc-split-stack1/gcc/common/config/rs6000/rs6000-common.c
--- gcc-stack-info2/gcc/common/config/rs6000/rs6000-common.c 2015-05-15 14:15:38.145244889 +0930
+++ gcc-split-stack1/gcc/common/config/rs6000/rs6000-common.c 2015-05-15 01:57:37.417258829 +0930
@@ -288,6 +288,29 @@ rs6000_handle_option (struct gcc_options
   return true;
 }
 
+/* -fsplit-stack uses a field in the TCB, available with glibc-2.18. */
+
+static bool
+rs6000_supports_split_stack (bool report,
+                             struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+#ifndef TARGET_GLIBC_MAJOR
+#define TARGET_GLIBC_MAJOR 0
+#endif
+#ifndef TARGET_GLIBC_MINOR
+#define TARGET_GLIBC_MINOR 0
+#endif
+  /* Note: Can't test DEFAULT_ABI here, it isn't set until later. */
+  if (TARGET_GLIBC_MAJOR * 1000 + TARGET_GLIBC_MINOR >= 2018
+      && TARGET_64BIT
+      && TARGET_ELF)
+    return true;
+
+  if (report)
+    error ("%<-fsplit-stack%> currently only supported on PowerPC64 GNU/Linux with glibc-2.18 or later");
+  return false;
+}
+
 #undef TARGET_HANDLE_OPTION
 #define TARGET_HANDLE_OPTION rs6000_handle_option
 
@@ -300,4 +323,7 @@ rs6000_handle_option (struct gcc_options
 #undef TARGET_OPTION_OPTIMIZATION_TABLE
 #define TARGET_OPTION_OPTIMIZATION_TABLE rs6000_option_optimization_table
 
+#undef TARGET_SUPPORTS_SPLIT_STACK
+#define TARGET_SUPPORTS_SPLIT_STACK rs6000_supports_split_stack
+
 struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
diff -urpN gcc-stack-info2/gcc/config/rs6000/rs6000.c gcc-split-stack1/gcc/config/rs6000/rs6000.c
--- gcc-stack-info2/gcc/config/rs6000/rs6000.c 2015-05-16 13:33:37.170406399 +0930
+++ gcc-split-stack1/gcc/config/rs6000/rs6000.c 2015-05-16 14:54:55.483454632 +0930
@@ -187,6 +187,8 @@ typedef struct GTY(()) machine_function
      64-bits wide and is allocated early enough so that the offset
      does not overflow the 16-bit load/store offset field. */
   rtx sdmode_stack_slot;
+  /* Alternative internal arg pointer for -fsplit-stack. */
+  rtx split_stack_arg_pointer;
   /* Flag if r2 setup is needed with ELFv2 ABI. */
   bool r2_setup_needed;
 } machine_function;
@@ -1190,6 +1192,7 @@ static bool rs6000_debug_cannot_change_m
                                                    machine_mode,
                                                    enum reg_class);
 static bool rs6000_save_toc_in_prologue_p (void);
+static rtx rs6000_internal_arg_pointer (void);
 
 rtx (*rs6000_legitimize_reload_address_ptr) (rtx, machine_mode, int, int,
                                              int, int *)
@@ -1411,6 +1414,12 @@ static const struct attribute_spec rs600
 #undef TARGET_SET_UP_BY_PROLOGUE
 #define TARGET_SET_UP_BY_PROLOGUE rs6000_set_up_by_prologue
 
+#undef TARGET_EXTRA_LIVE_ON_ENTRY
+#define TARGET_EXTRA_LIVE_ON_ENTRY rs6000_live_on_entry
+
+#undef TARGET_INTERNAL_ARG_POINTER
+#define TARGET_INTERNAL_ARG_POINTER rs6000_internal_arg_pointer
+
 #undef TARGET_HAVE_TLS
 #define TARGET_HAVE_TLS HAVE_AS_TLS
 
@@ -11150,7 +11159,7 @@ setup_incoming_varargs (cumulative_args_
   else
     {
       first_reg_offset = next_cum.words;
-      save_area = virtual_incoming_args_rtx;
+      save_area = crtl->args.internal_arg_pointer;
 
       if (targetm.calls.must_pass_in_stack (mode, type))
         first_reg_offset += rs6000_arg_size (TYPE_MODE (type), type);
@@ -11344,7 +11353,7 @@ rs6000_va_start (tree valist, rtx nextar
     }
 
   /* Find the overflow area. */
-  t = make_tree (TREE_TYPE (ovf), virtual_incoming_args_rtx);
+  t = make_tree (TREE_TYPE (ovf), crtl->args.internal_arg_pointer);
   if (words != 0)
     t = fold_build_pointer_plus_hwi (t, words * MIN_UNITS_PER_WORD);
   t = build2 (MODIFY_EXPR, TREE_TYPE (ovf), ovf, t);
@@ -23425,6 +23434,48 @@ rs6000_reg_live_or_pic_offset_p (int reg
               || (DEFAULT_ABI == ABI_DARWIN && flag_pic))));
 }
 
+/* Return whether the split-stack arg pointer (r12) is used. */
+
+static bool
+split_stack_arg_pointer_used_p (void)
+{
+  /* If the pseudo holding the arg pointer is no longer a pseudo,
+     then the arg pointer is used. */
+  if (cfun->machine->split_stack_arg_pointer != NULL_RTX
+      && (!REG_P (cfun->machine->split_stack_arg_pointer)
+          || (REGNO (cfun->machine->split_stack_arg_pointer)
+              < FIRST_PSEUDO_REGISTER)))
+    return true;
+
+  /* Unfortunately we also need to do some code scanning, since
+     r12 may have been substituted for the pseudo. */
+  rtx_insn *insn;
+  basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+  FOR_BB_INSNS (bb, insn)
+    if (NONDEBUG_INSN_P (insn))
+      {
+        /* A call destroys r12. */
+        if (CALL_P (insn))
+          return false;
+
+        df_ref use;
+        FOR_EACH_INSN_USE (use, insn)
+          {
+            rtx x = DF_REF_REG (use);
+            if (REG_P (x) && REGNO (x) == 12)
+              return true;
+          }
+        df_ref def;
+        FOR_EACH_INSN_DEF (def, insn)
+          {
+            rtx x = DF_REF_REG (def);
+            if (REG_P (x) && REGNO (x) == 12)
+              return false;
+          }
+      }
+  return bitmap_bit_p (DF_LR_OUT (bb), 12);
+}
+
 /* Emit function prologue as insns. */
 
 void
@@ -24376,6 +24427,40 @@ rs6000_emit_prologue (void)
       rtx reg = gen_rtx_REG (reg_mode, TOC_REGNUM);
       emit_insn (gen_frame_store (reg, sp_reg_rtx, RS6000_TOC_SAVE_SLOT));
     }
+
+  if (flag_split_stack && split_stack_arg_pointer_used_p ())
+    {
+      /* Set up the arg pointer (r12) for -fsplit-stack code. If
+         __morestack was called, it left the arg pointer to the old
+         stack in r29. Otherwise, the arg pointer is the top of the
+         current frame. */
+      if (frame_off != 0 || REGNO (frame_reg_rtx) != 12)
+        {
+          rtx r12 = gen_rtx_REG (Pmode, 12);
+          if (frame_off == 0)
+            emit_move_insn (r12, frame_reg_rtx);
+          else
+            emit_insn (gen_add3_insn (r12, frame_reg_rtx, GEN_INT (frame_off)));
+        }
+      if (info->push_p)
+        {
+          rtx r12 = gen_rtx_REG (Pmode, 12);
+          rtx r29 = gen_rtx_REG (Pmode, 29);
+          rtx cr7 = gen_rtx_REG (CCUNSmode, CR7_REGNO);
+          rtx not_more = gen_label_rtx ();
+          rtx jump;
+
+          jump = gen_rtx_IF_THEN_ELSE (VOIDmode,
+                                       gen_rtx_GEU (VOIDmode, cr7, const0_rtx),
+                                       gen_rtx_LABEL_REF (VOIDmode, not_more),
+                                       pc_rtx);
+          jump = emit_jump_insn (gen_rtx_SET (pc_rtx, jump));
+          JUMP_LABEL (jump) = not_more;
+          LABEL_NUSES (not_more) += 1;
+          emit_move_insn (r12, r29);
+          emit_label (not_more);
+        }
+    }
 }
 
 /* Output .extern statements for the save/restore routines we use. */
@@ -25803,6 +25888,178 @@ rs6000_output_function_epilogue (FILE *f
       fputs ("\t.align 2\n", file);
     }
 }
+
+/* -fsplit-stack support. */
+
+/* A SYMBOL_REF for __morestack. */
+static GTY(()) rtx morestack_ref;
+
+static rtx
+gen_add3_const (rtx rt, rtx ra, long c)
+{
+  if (TARGET_64BIT)
+    return gen_adddi3 (rt, ra, GEN_INT (c));
+ else
+    return gen_addsi3 (rt, ra, GEN_INT (c));
+}
+
+/* Emit -fsplit-stack prologue, which goes before the regular function
+   prologue (at local entry point in the case of ELFv2). */
+
+void
+rs6000_expand_split_stack_prologue (void)
+{
+  rs6000_stack_t *info = rs6000_stack_info ();
+  unsigned HOST_WIDE_INT allocate;
+  long alloc_hi, alloc_lo;
+  rtx r0, r1, r12, lr, ok_label, compare, jump, call_fusage;
+  rtx_insn *insn;
+
+  gcc_assert (flag_split_stack && reload_completed);
+
+  if (!info->push_p)
+    return;
+
+  allocate = info->total_size;
+  if (allocate > (unsigned HOST_WIDE_INT) 1 << 31)
+    {
+      sorry ("Stack frame larger than 2G is not supported for -fsplit-stack");
+      return;
+    }
+  if (morestack_ref == NULL_RTX)
+    {
+      morestack_ref = gen_rtx_SYMBOL_REF (Pmode, "__morestack");
+      SYMBOL_REF_FLAGS (morestack_ref) |= (SYMBOL_FLAG_LOCAL
+                                           | SYMBOL_FLAG_FUNCTION);
+    }
+
+  r0 = gen_rtx_REG (Pmode, 0);
+  r1 = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
+  r12 = gen_rtx_REG (Pmode, 12);
+  emit_insn (gen_load_split_stack_limit (r0));
+  /* Always emit two insns here to calculate the requested stack,
+     so that the linker can edit them when adjusting size for calling
+     non-split-stack code. */
+  alloc_hi = (-allocate + 0x8000) & ~0xffffL;
+  alloc_lo = -allocate - alloc_hi;
+  if (alloc_hi != 0)
+    {
+      emit_insn (gen_add3_const (r12, r1, alloc_hi));
+      if (alloc_lo != 0)
+        emit_insn (gen_add3_const (r12, r12, alloc_lo));
+      else
+        emit_insn (gen_nop ());
+    }
+  else
+    {
+      emit_insn (gen_add3_const (r12, r1, alloc_lo));
+      emit_insn (gen_nop ());
+    }
+
+  compare = gen_rtx_REG (CCUNSmode, CR7_REGNO);
+  emit_insn (gen_rtx_SET (compare, gen_rtx_COMPARE (CCUNSmode, r12, r0)));
+  ok_label = gen_label_rtx ();
+  jump = gen_rtx_IF_THEN_ELSE (VOIDmode,
+                               gen_rtx_GEU (VOIDmode, compare, const0_rtx),
+                               gen_rtx_LABEL_REF (VOIDmode, ok_label),
+                               pc_rtx);
+  jump = emit_jump_insn (gen_rtx_SET (pc_rtx, jump));
+  JUMP_LABEL (jump) = ok_label;
+  /* Mark the jump as very likely to be taken. */
+  add_int_reg_note (jump, REG_BR_PROB,
+                    REG_BR_PROB_BASE - REG_BR_PROB_BASE / 100);
+
+  lr = gen_rtx_REG (Pmode, LR_REGNO);
+  insn = emit_move_insn (r0, lr);
+  RTX_FRAME_RELATED_P (insn) = 1;
+  insn = emit_insn (gen_frame_store (r0, r1, info->lr_save_offset));
+  RTX_FRAME_RELATED_P (insn) = 1;
+
+  insn = emit_call_insn (gen_call (gen_rtx_MEM (SImode, morestack_ref),
+                                   const0_rtx, const0_rtx));
+  call_fusage = NULL_RTX;
+  use_reg (&call_fusage, r12);
+  add_function_usage_to (insn, call_fusage);
+  emit_insn (gen_frame_load (r0, r1, info->lr_save_offset));
+  insn = emit_move_insn (lr, r0);
+  add_reg_note (insn, REG_CFA_RESTORE, lr);
+  RTX_FRAME_RELATED_P (insn) = 1;
+  emit_insn (gen_split_stack_return ());
+
+  emit_label (ok_label);
+  LABEL_NUSES (ok_label) = 1;
+}
+
+/* Return the internal arg pointer used for function incoming
+   arguments. When -fsplit-stack, the arg pointer is r12 so we need
+   to copy it to a pseudo in order for it to be preserved over calls
+   and suchlike. We'd really like to use a pseudo here for the
+   internal arg pointer but data-flow analysis is not prepared to
+   accept pseudos as live at the beginning of a function. */
+
+static rtx
+rs6000_internal_arg_pointer (void)
+{
+  if (flag_split_stack)
+    {
+      if (cfun->machine->split_stack_arg_pointer == NULL_RTX)
+        {
+          rtx pat;
+
+          cfun->machine->split_stack_arg_pointer = gen_reg_rtx (Pmode);
+          REG_POINTER (cfun->machine->split_stack_arg_pointer) = 1;
+
+          /* Put the pseudo initialization right after the note at the
+             beginning of the function. */
+          pat = gen_rtx_SET (cfun->machine->split_stack_arg_pointer,
+                             gen_rtx_REG (Pmode, 12));
+          push_topmost_sequence ();
+          emit_insn_after (pat, get_insns ());
+          pop_topmost_sequence ();
+        }
+      return plus_constant (Pmode, cfun->machine->split_stack_arg_pointer,
+                            FIRST_PARM_OFFSET (current_function_decl));
+    }
+  return virtual_incoming_args_rtx;
+}
+
+/* We may have to tell the dataflow pass that the split stack prologue
+   is initializing a register. */
+
+static void
+rs6000_live_on_entry (bitmap regs)
+{
+  if (flag_split_stack)
+    bitmap_set_bit (regs, 12);
+}
+
+/* Emit -fsplit-stack dynamic stack allocation space check. */
+
+void
+rs6000_split_stack_space_check (rtx size, rtx label)
+{
+  rtx sp = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
+  rtx limit = gen_reg_rtx (Pmode);
+  rtx requested = gen_reg_rtx (Pmode);
+  rtx cmp = gen_reg_rtx (CCUNSmode);
+  rtx jump;
+
+  emit_insn (gen_load_split_stack_limit (limit));
+  if (CONST_INT_P (size))
+    emit_insn (gen_add3_insn (requested, sp, GEN_INT (-INTVAL (size))));
+  else
+    {
+      size = force_reg (Pmode, size);
+      emit_move_insn (requested, gen_rtx_MINUS (Pmode, sp, size));
+    }
+  emit_insn (gen_rtx_SET (cmp, gen_rtx_COMPARE (CCUNSmode, requested, limit)));
+  jump = gen_rtx_IF_THEN_ELSE (VOIDmode,
+                               gen_rtx_GEU (VOIDmode, cmp, const0_rtx),
+                               gen_rtx_LABEL_REF (VOIDmode, label),
+                               pc_rtx);
+  jump = emit_jump_insn (gen_rtx_SET (pc_rtx, jump));
+  JUMP_LABEL (jump) = label;
+}
 \f
 /* A C compound statement that outputs the assembler code for a thunk
    function, used to implement C++ virtual function calls with
@@ -29811,6 +30068,9 @@ rs6000_elf_file_end (void)
   if (TARGET_32BIT || DEFAULT_ABI == ABI_ELFv2)
     file_end_indicate_exec_stack ();
 #endif
+
+  if (flag_split_stack)
+    file_end_indicate_split_stack ();
 }
 #endif
 
diff -urpN gcc-stack-info2/gcc/config/rs6000/rs6000.md gcc-split-stack1/gcc/config/rs6000/rs6000.md
--- gcc-stack-info2/gcc/config/rs6000/rs6000.md 2015-05-15 14:15:38.177243589 +0930
+++ gcc-split-stack1/gcc/config/rs6000/rs6000.md 2015-05-15 02:01:15.776472615 +0930
@@ -140,6 +140,7 @@
    UNSPEC_PACK_128BIT
    UNSPEC_LSQ
    UNSPEC_FUSION_GPR
+   UNSPEC_STACK_CHECK
   ])
 
 ;;
@@ -157,6 +158,7 @@
    UNSPECV_NLGR ; non-local goto receiver
    UNSPECV_MFFS ; Move from FPSCR
    UNSPECV_MTFSF ; Move to FPSCR Fields
+   UNSPECV_SPLIT_STACK_RETURN ; A camouflaged return
   ])
 
 \f
@@ -12345,6 +12347,72 @@
 }"
   [(set_attr "type" "load")])
 \f
+;; Handle -fsplit-stack.
+ +(define_expand "split_stack_prologue" + [(const_int 0)] + "" +{ + rs6000_expand_split_stack_prologue (); + DONE; +}) + +(define_expand "load_split_stack_limit" + [(set (match_operand 0) + (unspec [(const_int 0)] UNSPEC_STACK_CHECK))] + "" +{ + emit_insn (gen_rtx_SET (operands[0], + gen_rtx_UNSPEC (Pmode, + gen_rtvec (1, const0_rtx), + UNSPEC_STACK_CHECK))); + DONE; +}) + +(define_insn "load_split_stack_limit_di" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") + (unspec:DI [(const_int 0)] UNSPEC_STACK_CHECK))] + "TARGET_64BIT" + "ld %0,-0x7040(13)" + [(set_attr "type" "load") + (set_attr "update" "no") + (set_attr "indexed" "no")]) + +(define_insn "load_split_stack_limit_si" + [(set (match_operand:SI 0 "gpc_reg_operand" "=r") + (unspec:SI [(const_int 0)] UNSPEC_STACK_CHECK))] + "!TARGET_64BIT" + "lwz %0,-0x7020(2)" + [(set_attr "type" "load") + (set_attr "update" "no") + (set_attr "indexed" "no")]) + +;; A return instruction which the middle-end doesn't see. +(define_insn "split_stack_return" + [(unspec_volatile [(const_int 0)] UNSPECV_SPLIT_STACK_RETURN)] + "" + "blr" + [(set_attr "type" "jmpreg")]) + +;; If there are operand 0 bytes available on the stack, jump to +;; operand 1. 
+(define_expand "split_stack_space_check" + [(set (match_dup 2) + (unspec [(const_int 0)] UNSPEC_STACK_CHECK)) + (set (match_dup 3) + (minus (reg STACK_POINTER_REGNUM) + (match_operand 0))) + (set (match_dup 4) (compare:CCUNS (match_dup 3) (match_dup 2))) + (set (pc) (if_then_else + (geu (match_dup 4) (const_int 0)) + (label_ref (match_operand 1)) + (pc)))] + "" +{ + rs6000_split_stack_space_check (operands[0], operands[1]); + DONE; +}) +\f (define_insn "bpermd_<mode>" [(set (match_operand:P 0 "gpc_reg_operand" "=r") (unspec:P [(match_operand:P 1 "gpc_reg_operand" "r") diff -urpN gcc-stack-info2/gcc/config/rs6000/rs6000-protos.h gcc-split-stack1/gcc/config/rs6000/rs6000-protos.h --- gcc-stack-info2/gcc/config/rs6000/rs6000-protos.h 2015-05-15 14:15:38.149244726 +0930 +++ gcc-split-stack1/gcc/config/rs6000/rs6000-protos.h 2015-05-15 01:57:37.417258829 +0930 @@ -191,6 +191,8 @@ extern void rs6000_emit_prologue (void); extern void rs6000_emit_load_toc_table (int); extern unsigned int rs6000_dbx_register_number (unsigned int, unsigned int); extern void rs6000_emit_epilogue (int); +extern void rs6000_expand_split_stack_prologue (void); +extern void rs6000_split_stack_space_check (rtx, rtx); extern void rs6000_emit_eh_reg_restore (rtx, rtx); extern const char * output_isel (rtx *); extern void rs6000_call_aix (rtx, rtx, rtx, rtx); diff -urpN gcc-stack-info2/libgcc/config/rs6000/morestack.S gcc-split-stack1/libgcc/config/rs6000/morestack.S --- gcc-stack-info2/libgcc/config/rs6000/morestack.S 1970-01-01 09:30:00.000000000 +0930 +++ gcc-split-stack1/libgcc/config/rs6000/morestack.S 2015-05-15 14:54:02.247603731 +0930 @@ -0,0 +1,351 @@ +#ifdef __powerpc64__ +# PowerPC64 support for -fsplit-stack. +# Copyright (C) 2009-2015 Free Software Foundation, Inc. +# Contributed by Alan Modra <amodra@gmail.com>. + +# This file is part of GCC. 
+ +# GCC is free software; you can redistribute it and/or modify it under +# the terms of the GNU General Public License as published by the Free +# Software Foundation; either version 3, or (at your option) any later +# version. + +# GCC is distributed in the hope that it will be useful, but WITHOUT ANY +# WARRANTY; without even the implied warranty of MERCHANTABILITY or +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +# for more details. + +# Under Section 7 of GPL version 3, you are granted additional +# permissions described in the GCC Runtime Library Exception, version +# 3.1, as published by the Free Software Foundation. + +# You should have received a copy of the GNU General Public License and +# a copy of the GCC Runtime Library Exception along with this program; +# see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +# <http://www.gnu.org/licenses/>. + +#if _CALL_ELF == 2 + .abiversion 2 +#define PARAMS 32 +#else + .abiversion 1 +#define PARAMS 48 +#endif +#define MORESTACK_FRAMESIZE (PARAMS+96) +#define PARAMREG_SAVE -MORESTACK_FRAMESIZE+PARAMS+0 +#define STATIC_CHAIN_SAVE -MORESTACK_FRAMESIZE+PARAMS+64 +#define R29_SAVE -MORESTACK_FRAMESIZE+PARAMS+72 +#define LINKREG_SAVE -MORESTACK_FRAMESIZE+PARAMS+80 +#define NEWSTACKSIZE_SAVE -MORESTACK_FRAMESIZE+PARAMS+88 + +# Excess space needed to call ld.so resolver for lazy plt +# resolution. Go uses sigaltstack so this doesn't need to +# also cover signal frame size. +#define BACKOFF 4096 +# Large excess allocated when calling non-split-stack code. 
+#define NON_SPLIT_STACK 0x100000 + + +#if _CALL_ELF == 2 + +#define BODY_LABEL(name) name + +#define ENTRY0(name) \ + .global name; \ + .hidden name; \ + .type name,@function; \ +name##: + +#define ENTRY(name) \ + ENTRY0(name); \ +0: addis %r2,%r12,.TOC.-0b@ha; \ + addi %r2,%r2,.TOC.-0b@l; \ + .localentry name, .-name + +#else + +#define BODY_LABEL(name) .L.##name + +#define ENTRY0(name) \ + .global name; \ + .hidden name; \ + .type name,@function; \ + .pushsection ".opd","aw"; \ + .p2align 3; \ +name##: .quad BODY_LABEL (name), .TOC.@tocbase, 0; \ + .popsection; \ +BODY_LABEL(name)##: + +#define ENTRY(name) ENTRY0(name) + +#endif + +#define SIZE(name) .size name, .-BODY_LABEL(name) + + + .text +# Just like __morestack, but with larger excess allocation +ENTRY0(__morestack_non_split) +.LFB1: + .cfi_startproc +# We use a cleanup to restore the tcbhead_t.__private_ss if +# an exception is thrown through this code. +#ifdef __PIC__ + .cfi_personality 0x9b,DW.ref.__gcc_personality_v0 + .cfi_lsda 0x1b,.LLSDA1 +#else + .cfi_personality 0x3,__gcc_personality_v0 + .cfi_lsda 0x3,.LLSDA1 +#endif +# LR is already saved by the split-stack prologue code. +# We may as well have the unwinder skip over the call in the +# prologue too. + .cfi_offset %lr,16 + + addis %r12,%r12,-NON_SPLIT_STACK@h + SIZE (__morestack_non_split) +# Fall through into __morestack + + +# This function is called with non-standard calling conventions. +# On entry, r12 is the requested stack pointer. One version of the +# split-stack prologue that calls __morestack looks like +# ld %r0,-0x7000-64(%r13) +# addis %r12,%r1,-allocate@ha +# addi %r12,%r12,-allocate@l +# cmpld %r12,%r0 +# bge+ enough +# mflr %r0 +# std %r0,16(%r1) +# bl __morestack +# ld %r0,16(%r1) +# mtlr %r0 +# blr +# enough: +# The normal function prologue follows here, with a small addition at +# the end to set up the arg pointer. 
The arg pointer is set up with: +# addi %r12,%r1,offset +# bge %cr7,.+8 +# mr %r12,%r29 +# +# Note that the lr save slot 16(%r1) has already been used. +# r3 thru r11 possibly contain arguments and a static chain +# pointer for the function we're calling, so must be preserved. +# cr7 must also be preserved. + +ENTRY0(__morestack) +# Save parameter passing registers, our arguments, lr, r29 +# and use r29 as a frame pointer. + std %r3,PARAMREG_SAVE+0(%r1) + sub %r3,%r1,%r12 # calculate requested stack size + mflr %r12 + std %r4,PARAMREG_SAVE+8(%r1) + std %r5,PARAMREG_SAVE+16(%r1) + std %r6,PARAMREG_SAVE+24(%r1) + std %r7,PARAMREG_SAVE+32(%r1) + addi %r3,%r3,BACKOFF + std %r8,PARAMREG_SAVE+40(%r1) + std %r9,PARAMREG_SAVE+48(%r1) + std %r10,PARAMREG_SAVE+56(%r1) + std %r11,STATIC_CHAIN_SAVE(%r1) + std %r29,R29_SAVE(%r1) + std %r12,LINKREG_SAVE(%r1) + std %r3,NEWSTACKSIZE_SAVE(%r1) # new stack size + mr %r29,%r1 + .cfi_offset %r29,R29_SAVE + .cfi_def_cfa_register %r29 + stdu %r1,-MORESTACK_FRAMESIZE(%r1) + + # void __morestack_block_signals (void) + bl __morestack_block_signals + + # void *__generic_morestack (size_t *pframe_size, + # void *old_stack, + # size_t param_size) + addi %r3,%r29,NEWSTACKSIZE_SAVE + mr %r4,%r29 + li %r5,0 # no copying from old stack + bl __generic_morestack + +# Start using new stack + stdu %r29,-32(%r3) # back-chain + mr %r1,%r3 + +# Set __private_ss stack guard for the new stack. + ld %r12,NEWSTACKSIZE_SAVE(%r29) # modified size + addi %r3,%r3,BACKOFF-32 + sub %r3,%r3,%r12 +# Note that a signal frame has $pc pointing at the instruction +# where the signal occurred. For something like a timer +# interrupt this means the instruction has already executed, +# thus the region starts at the instruction modifying +# __private_ss, not one instruction after. 
+.LEHB0: + std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss + + # void __morestack_unblock_signals (void) + bl __morestack_unblock_signals + +# Set up for a call to the target function, located 3 +# instructions after __morestack's return address. +# + ld %r12,LINKREG_SAVE(%r29) + ld %r3,PARAMREG_SAVE+0(%r29) # restore arg regs + ld %r4,PARAMREG_SAVE+8(%r29) + ld %r5,PARAMREG_SAVE+16(%r29) + ld %r6,PARAMREG_SAVE+24(%r29) + ld %r7,PARAMREG_SAVE+32(%r29) + ld %r8,PARAMREG_SAVE+40(%r29) + ld %r9,PARAMREG_SAVE+48(%r29) + addi %r0,%r12,12 # add 3 instructions + ld %r10,PARAMREG_SAVE+56(%r29) + ld %r11,STATIC_CHAIN_SAVE(%r29) + cmpld %cr7,%r12,%r0 # indicate we were called + mtctr %r0 + bctrl # call caller! + +# On return, save regs possibly used to return a value, and +# possibly trashed by calls to __morestack_block_signals, +# __generic_releasestack and __morestack_unblock_signals. +# Assume those calls don't use vector or floating point regs. + std %r3,PARAMREG_SAVE+0(%r29) + std %r4,PARAMREG_SAVE+8(%r29) + std %r5,PARAMREG_SAVE+16(%r29) + std %r6,PARAMREG_SAVE+24(%r29) +#if _CALL_ELF == 2 + std %r7,PARAMREG_SAVE+32(%r29) + std %r8,PARAMREG_SAVE+40(%r29) + std %r9,PARAMREG_SAVE+48(%r29) + std %r10,PARAMREG_SAVE+56(%r29) +#endif + + bl __morestack_block_signals + + # void *__generic_releasestack (size_t *pavailable) + addi %r3,%r29,NEWSTACKSIZE_SAVE + bl __generic_releasestack + +# Reset __private_ss stack guard to value for old stack + ld %r12,NEWSTACKSIZE_SAVE(%r29) + addi %r3,%r3,BACKOFF + sub %r3,%r3,%r12 +.LEHE0: + std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss + + bl __morestack_unblock_signals + +# Use old stack again. + mr %r1,%r29 + +# Restore return value regs, and return. 
+ ld %r0,LINKREG_SAVE(%r29) + mtlr %r0 + ld %r3,PARAMREG_SAVE+0(%r29) + ld %r4,PARAMREG_SAVE+8(%r29) + ld %r5,PARAMREG_SAVE+16(%r29) + ld %r6,PARAMREG_SAVE+24(%r29) +#if _CALL_ELF == 2 + ld %r7,PARAMREG_SAVE+32(%r29) + ld %r8,PARAMREG_SAVE+40(%r29) + ld %r9,PARAMREG_SAVE+48(%r29) + ld %r10,PARAMREG_SAVE+56(%r29) +#endif + ld %r29,R29_SAVE(%r29) + .cfi_def_cfa_register %r1 + blr + +# This is the cleanup code called by the stack unwinder when +# unwinding through code between .LEHB0 and .LEHE0 above. +cleanup: + .cfi_def_cfa_register %r29 + std %r3,PARAMREG_SAVE(%r29) # Save exception header + # size_t __generic_findstack (void *stack) + mr %r3,%r29 + bl __generic_findstack + sub %r3,%r29,%r3 + addi %r3,%r3,BACKOFF + std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss + ld %r3,PARAMREG_SAVE(%r29) + bl _Unwind_Resume + nop + .cfi_endproc + SIZE (__morestack) + + + .section .gcc_except_table,"a",@progbits + .p2align 2 +.LLSDA1: + .byte 0xff # @LPStart format (omit) + .byte 0xff # @TType format (omit) + .byte 0x1 # call-site format (uleb128) + .uleb128 .LLSDACSE1-.LLSDACSB1 # Call-site table length +.LLSDACSB1: + .uleb128 .LEHB0-.LFB1 # region 0 start + .uleb128 .LEHE0-.LEHB0 # length + .uleb128 cleanup-.LFB1 # landing pad + .uleb128 0 # no action, ie. a cleanup +.LLSDACSE1: + + +#ifdef __PIC__ +# Build a position independent reference to the personality function. + .hidden DW.ref.__gcc_personality_v0 + .weak DW.ref.__gcc_personality_v0 + .section .data.DW.ref.__gcc_personality_v0,"awG",@progbits,DW.ref.__gcc_personality_v0,comdat + .p2align 3 +DW.ref.__gcc_personality_v0: + .quad __gcc_personality_v0 + .type DW.ref.__gcc_personality_v0, @object + .size DW.ref.__gcc_personality_v0, 8 +#endif + + + .text +# Initialize the stack guard when the program starts or when a +# new thread starts. This is called from a constructor. +# void __stack_split_initialize (void) +ENTRY(__stack_split_initialize) + addi %r3,%r1,-0x4000 # We should have at least 16K. 
+ std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss + # void __generic_morestack_set_initial_sp (void *sp, size_t len) + mr %r3,%r1 + li %r4, 0x4000 + b __generic_morestack_set_initial_sp + SIZE (__stack_split_initialize) + + +# Return current __private_ss +# void *__morestack_get_guard (void) +ENTRY0(__morestack_get_guard) + ld %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss + blr + SIZE (__morestack_get_guard) + + +# Set __private_ss +# void __morestack_set_guard (void *ptr) +ENTRY0(__morestack_set_guard) + std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss + blr + SIZE (__morestack_set_guard) + + +# Return the stack guard value for given stack +# void *__morestack_make_guard (void *stack, size_t size) +ENTRY0(__morestack_make_guard) + sub %r3,%r3,%r4 + addi %r3,%r3,BACKOFF + blr + SIZE (__morestack_make_guard) + + +# Make __stack_split_initialize a high priority constructor. + .section .ctors.65535,"aw",@progbits + .p2align 3 + .quad __stack_split_initialize + .quad __morestack_load_mmap + + .section .note.GNU-stack,"",@progbits + .section .note.GNU-split-stack,"",@progbits + .section .note.GNU-no-split-stack,"",@progbits +#endif /* __powerpc64__ */ diff -urpN gcc-stack-info2/libgcc/config/rs6000/t-stack-rs6000 gcc-split-stack1/libgcc/config/rs6000/t-stack-rs6000 --- gcc-stack-info2/libgcc/config/rs6000/t-stack-rs6000 1970-01-01 09:30:00.000000000 +0930 +++ gcc-split-stack1/libgcc/config/rs6000/t-stack-rs6000 2015-05-15 01:57:37.429258346 +0930 @@ -0,0 +1,2 @@ +# Makefile fragment to support -fsplit-stack for powerpc. 
+LIB2ADD_ST += $(srcdir)/config/rs6000/morestack.S diff -urpN gcc-stack-info2/libgcc/config.host gcc-split-stack1/libgcc/config.host --- gcc-stack-info2/libgcc/config.host 2015-05-15 14:15:38.193242938 +0930 +++ gcc-split-stack1/libgcc/config.host 2015-05-15 01:57:37.429258346 +0930 @@ -1021,6 +1021,7 @@ powerpc-*-rtems*) ;; powerpc*-*-linux*) tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-savresfgpr rs6000/t-crtstuff rs6000/t-linux t-dfprules rs6000/t-ppc64-fp t-slibgcc-libgcc" + tmake_file="${tmake_file} t-stack rs6000/t-stack-rs6000" case $ppc_fp_type in 64) ;; diff -urpN gcc-stack-info2/libgcc/generic-morestack.c gcc-split-stack1/libgcc/generic-morestack.c --- gcc-stack-info2/libgcc/generic-morestack.c 2015-05-15 14:15:38.193242938 +0930 +++ gcc-split-stack1/libgcc/generic-morestack.c 2015-05-15 01:57:37.429258346 +0930 @@ -23,6 +23,9 @@ a copy of the GCC Runtime Library Except see the files COPYING3 and COPYING.RUNTIME respectively. If not, see <http://www.gnu.org/licenses/>. */ +/* powerpc 32-bit not supported. */ +#if !defined __powerpc__ || defined __powerpc64__ + #include "tconfig.h" #include "tsystem.h" #include "coretypes.h" @@ -935,6 +938,7 @@ __splitstack_find (void *segment_arg, vo nsp -= 12 * sizeof (void *); #elif defined (__i386__) nsp -= 6 * sizeof (void *); +#elif defined __powerpc64__ #else #error "unrecognized target" #endif @@ -1170,3 +1174,4 @@ __splitstack_find_context (void *context } #endif /* !defined (inhibit_libc) */ +#endif /* not powerpc 32-bit */ -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 32+ messages in thread
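A note on the alloc_hi/alloc_lo arithmetic in rs6000_expand_split_stack_prologue above: the negated frame size is split into a 64k-multiple high part (for addis) and a signed 16-bit low part (for addi). The following is only a minimal standalone sketch of that split, not code from the patch. The names split_alloc and split_allocation are hypothetical, and int64_t stands in for the patch's long/HOST_WIDE_INT types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only: the two immediates used to compute
   r12 = r1 - allocate with an addis/addi pair.  */
typedef struct
{
  int64_t hi;	/* multiple of 0x10000, added by addis */
  int64_t lo;	/* signed 16-bit remainder, added by addi */
} split_alloc;

static split_alloc
split_allocation (uint64_t allocate)
{
  /* The patch rejects frames larger than 2G before this point.  */
  assert (allocate <= (uint64_t) 1 << 31);

  int64_t neg = -(int64_t) allocate;
  split_alloc s;

  /* Adding 0x8000 before masking rounds to the nearest multiple of
     64k, so the remainder always fits a signed 16-bit immediate.  */
  s.hi = (neg + 0x8000) & ~(int64_t) 0xffff;
  s.lo = neg - s.hi;
  return s;
}
```

With this rounding, a 112 byte frame gives hi == 0 (only the addi does any work, the other slot being a nop), and a frame of exactly 64k gives lo == 0 (only the addis does any work). Those are the two special cases the prologue code pads with gen_nop.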
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-18 2:55 ` [PATCH 3/4] split-stack for powerpc64 Alan Modra @ 2015-05-18 7:05 ` Alan Modra 2015-05-19 12:48 ` Lynn A. Boger ` (2 subsequent siblings) 3 siblings, 0 replies; 32+ messages in thread From: Alan Modra @ 2015-05-18 7:05 UTC (permalink / raw) To: gcc-patches, David Edelsohn On Mon, May 18, 2015 at 12:24:51PM +0930, Alan Modra wrote: > + error ("%<-fsplit-stack%> currently only supported on PowerPC64 GNU/Linux with glibc-2.18 or later"); I forgot to comment on this. 2.19 is actually when __private_ss appeared in the ppc tcbhead_t, but I misread the commit date and thought it was 2.18. I was going to correct that, but then wondered if glibc allocates any spare bytes, and it looks like it does. We have a struct pthread before tcbhead_t, and struct pthread is aligned according to TCB_ALIGNMENT which is 16 for powerpc. So 2.18's 56 byte tcbhead_t, which is laid out so that the end coincides with tp-0x7000, must be preceded by 8 bytes of padding. Enough for __private_ss, and we don't care if it isn't initially zero. -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 32+ messages in thread
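The padding argument in the message above can be checked with a little arithmetic. This is only a sketch under the assumptions stated in that message (56 byte tcbhead_t in glibc 2.18, TCB_ALIGNMENT of 16, tcbhead_t ending at tp-0x7000) plus one more assumption, that the end address itself is 16-byte aligned. The enum names and both helper functions are hypothetical and come from neither glibc nor the patch.

```c
#include <assert.h>

/* Assumed values, taken from the message above.  */
enum
{
  TP_BIAS = 0x7000,		/* tcbhead_t ends at tp - 0x7000 */
  TCB_ALIGN = 16,		/* stands in for TCB_ALIGNMENT on powerpc */
  TCBHEAD_SIZE_2_18 = 56,	/* sizeof (tcbhead_t) in glibc 2.18 */
  PRIVATE_SS_SIZE = 8		/* one 64-bit pointer */
};

/* Bytes of padding between an ALIGNMENT-aligned struct and a
   following block of SIZE bytes whose end is also aligned.  */
static long
padding_before_tcbhead (long size, long alignment)
{
  assert (size > 0 && alignment > 0);
  return (alignment - size % alignment) % alignment;
}

/* Offset of __private_ss from the thread pointer (r13), assuming it
   occupies the 8 bytes just in front of the 2.18 tcbhead_t.  */
static long
private_ss_offset (void)
{
  return -TP_BIAS - (TCBHEAD_SIZE_2_18 + PRIVATE_SS_SIZE);
}
```

Under these assumptions padding_before_tcbhead (56, 16) is 8, which is just enough for __private_ss, and private_ss_offset () is -0x7040, the displacement used by the ld in load_split_stack_limit_di.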
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-18 2:55 ` [PATCH 3/4] split-stack for powerpc64 Alan Modra 2015-05-18 7:05 ` Alan Modra @ 2015-05-19 12:48 ` Lynn A. Boger 2015-05-20 1:03 ` Alan Modra 2015-05-19 14:37 ` David Edelsohn 2015-05-21 2:10 ` Alan Modra 3 siblings, 1 reply; 32+ messages in thread From: Lynn A. Boger @ 2015-05-19 12:48 UTC (permalink / raw) To: gcc-patches, David Edelsohn Questions on the use of the options for split stack: - The way this is implemented, split stack is generated if the target platform supports split stack, on ppc64/ppc64le as well as on x86, and the use of -fno-split-stack doesn't seem to affect it for any of these. Is that the way it should work? I would expect -fno-split-stack to disable it completely. - The comments say that the gold linker is used for some situations but I don't see any reference in the code to enabling the gold linker for ppc64le, ppc64, or x86. Is the user expected to add the option for the gold linker if needed? (I realize this is more complicated because the gold linker support for this on Power was just added upstream, and the gold linker might not even be installed on the user's system.) On 05/17/2015 09:54 PM, Alan Modra wrote: > This patch adds -fsplit-stack support for PowerPC64 Linux. I haven't > made any real attempt to support ppc32 at this stage, but that should > mostly be a matter of writing __morestack for ppc32. > > The idea of split-stack is to allocate just enough stack to execute a > function, with checks added before function entry and on alloca to > ensure the stack is large enough. If stack size is insufficient, a > new stack segment is allocated for the function. The new stack and > old stack are not necessarily contiguous. For powerpc64, function > arguments on the old stack are accessed by using an arg_pointer > register rather than accessing them relative to the stack pointer or > frame pointer as is usually done. 
(x86 copies function arguments from > the old stack to the new, but needs an arg pointer for variable > argument lists.) Unwinding is handled by a personality routine that > knows how to find stack segments. > > Split-stack prologue on function entry (local entry point for ELFv2) > is as follows. This goes before the usual function prologue. > > entry: > ld %r0,-0x7000-64(%r13) # tcbhead_t.__private_ss > addis %r12,%r1,-allocate@ha > addi %r12,%r12,-allocate@l > cmpld %cr7,%r12,%r0 > bge+ %cr7,enough > mflr %r0 > std %r0,16(%r1) > bl __morestack > ld %r0,16(%r1) > mtlr %r0 > blr > enough: > # usual function prologue, modified a little at the end to set up the > # arg_pointer in %r12, starts here. The arg_pointer is initialized, > # if it is used, with > addi %r12,%r1,frame_size > bge %cr7,.+8 > mr %r12,%r29 > > Notes: > 1) A function that does not allocate a stack frame, does not have a > split-stack prologue. > > 2) __morestack must be local. __morestack has a non-standard calling > convention, with the desired stack being passed in %r12. It saves arg > passing regs, calls __generic_morestack to allocate a new stack > segment, restores the arg passing regs and sets r29 to point at the > old stack, then calls its return address + 12 to execute the function. > After the function returns __morestack saves return regs, calls > __generic_releasestack, and returns to the split-stack prologue, which > immediately returns. This scheme keeps hardware return prediction > valid. __morestack must also ensure cr7 is correctly set. > > 3) Basic-block reordering (enabled with -O2) will move the six > instructions after the "bge+" out of line. > > 4) When the stack allocation is less than 32k these two instructions > addis %r12,%r1,-allocate@ha > addi %r12,%r12,-allocate@l > are rewritten as > addi %r12,%r1,-allocate > nop > The addi may also be rewritten as a nop in the rare case that the > stack allocation is exactly a multiple of 64k. 
> > 5) When the linker detects a call from split-stack to non-split-stack > code, it adds 16k (or more) to the value found in "allocate" > instructions. So non-split-stack code gets a larger stack. The > amount is tunable by a linker option. The edit means powerpc64 does > not need to implement __morestack_non_split, necessary on x86 because > insufficient space is available there to edit the stack comparison > code. This feature is only implemented in the GNU gold linker. > > 6) We won't handle >2G stack initially and perhaps never. Supporting > multiple threads each requiring more than 2G of stack is probably not > that important, and likely to OOM at run time. (It would be possible > to easily handle up to 4G by rounding the allocation up to a multiple > of 64k and using two addis instructions in the split-stack prologue.) > > 7) If __morestack is called, then there are two stack frames between > the function and its caller. Immediately above is a small 32 byte > frame on the new stack, there so that a back-chain is always present > no matter the value of r1. This could be reduced to 16 bytes but I > thought it better to waste a few bytes for 32-byte alignment in case > powerpc64 goes to 32-byte aligned stacks. Above that frame is the > __morestack frame on the old stack. > > 8) If the normal function prologue uses r12 as a frame pointer, as it > always does when the frame size is larger than 32k, then the arg > pointer is set up with > addi %r12,%r12,to_top_of_frame > bge %cr7,.+8 > mr %r12,%r29 > omitting the addi if to_top_of_frame is zero. > > gcc/ > * common/config/rs6000/rs6000-common.c (TARGET_SUPPORTS_SPLIT_STACK): > Define. > (rs6000_supports_split_stack): New function. > * gcc/config/rs6000/rs6000.c (machine_function): Add > split_stack_arg_pointer. > (TARGET_EXTRA_LIVE_ON_ENTRY, TARGET_INTERNAL_ARG_POINTER): Define. > (setup_incoming_varargs): Use crtl->args.internal_arg_pointer > rather than virtual_incoming_args_rtx. > (rs6000_va_start): Likewise. 
> (split_stack_arg_pointer_used_p): New function. > (rs6000_emit_prologue): Set up arg pointer for -fsplit-stack. > (morestack_ref): New var. > (gen_add3_const, rs6000_expand_split_stack_prologue, > rs6000_internal_arg_pointer, rs6000_live_on_entry, > rs6000_split_stack_space_check): New functions. > (rs6000_elf_file_end): Call file_end_indicate_split_stack. > * gcc/config/rs6000/rs6000.md (UNSPEC_STACK_CHECK): Define. > (UNSPECV_SPLIT_STACK_RETURN): Define. > (split_stack_prologue, load_split_stack_limit, > load_split_stack_limit_di, load_split_stack_limit_si, > split_stack_return, split_stack_space_check): New expands and insns. > * gcc/config/rs6000/rs6000-protos.h > (rs6000_expand_split_stack_prologue): Declare. > (rs6000_split_stack_space_check): Declare. > libgcc/ > * config/rs6000/morestack.S: New. > * config/rs6000/t-stack-rs6000: New. > * config.host (powerpc*-*-linux*): Add t-stack and t-stack-rs6000 > to tmake_file. > * generic-morestack.c: Don't build for powerpc 32-bit. > > diff -urpN gcc-stack-info2/gcc/common/config/rs6000/rs6000-common.c gcc-split-stack1/gcc/common/config/rs6000/rs6000-common.c > --- gcc-stack-info2/gcc/common/config/rs6000/rs6000-common.c 2015-05-15 14:15:38.145244889 +0930 > +++ gcc-split-stack1/gcc/common/config/rs6000/rs6000-common.c 2015-05-15 01:57:37.417258829 +0930 > @@ -288,6 +288,29 @@ rs6000_handle_option (struct gcc_options > return true; > } > > +/* -fsplit-stack uses a field in the TCB, available with glibc-2.18. */ > + > +static bool > +rs6000_supports_split_stack (bool report, > + struct gcc_options *opts ATTRIBUTE_UNUSED) > +{ > +#ifndef TARGET_GLIBC_MAJOR > +#define TARGET_GLIBC_MAJOR 0 > +#endif > +#ifndef TARGET_GLIBC_MINOR > +#define TARGET_GLIBC_MINOR 0 > +#endif > + /* Note: Can't test DEFAULT_ABI here, it isn't set until later. 
*/ > + if (TARGET_GLIBC_MAJOR * 1000 + TARGET_GLIBC_MINOR >= 2018 > + && TARGET_64BIT > + && TARGET_ELF) > + return true; > + > + if (report) > + error ("%<-fsplit-stack%> currently only supported on PowerPC64 GNU/Linux with glibc-2.18 or later"); > + return false; > +} > + > #undef TARGET_HANDLE_OPTION > #define TARGET_HANDLE_OPTION rs6000_handle_option > > @@ -300,4 +323,7 @@ rs6000_handle_option (struct gcc_options > #undef TARGET_OPTION_OPTIMIZATION_TABLE > #define TARGET_OPTION_OPTIMIZATION_TABLE rs6000_option_optimization_table > > +#undef TARGET_SUPPORTS_SPLIT_STACK > +#define TARGET_SUPPORTS_SPLIT_STACK rs6000_supports_split_stack > + > struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER; > diff -urpN gcc-stack-info2/gcc/config/rs6000/rs6000.c gcc-split-stack1/gcc/config/rs6000/rs6000.c > --- gcc-stack-info2/gcc/config/rs6000/rs6000.c 2015-05-16 13:33:37.170406399 +0930 > +++ gcc-split-stack1/gcc/config/rs6000/rs6000.c 2015-05-16 14:54:55.483454632 +0930 > @@ -187,6 +187,8 @@ typedef struct GTY(()) machine_function > 64-bits wide and is allocated early enough so that the offset > does not overflow the 16-bit load/store offset field. */ > rtx sdmode_stack_slot; > + /* Alternative internal arg pointer for -fsplit-stack. */ > + rtx split_stack_arg_pointer; > /* Flag if r2 setup is needed with ELFv2 ABI. 
*/ > bool r2_setup_needed; > } machine_function; > @@ -1190,6 +1192,7 @@ static bool rs6000_debug_cannot_change_m > machine_mode, > enum reg_class); > static bool rs6000_save_toc_in_prologue_p (void); > +static rtx rs6000_internal_arg_pointer (void); > > rtx (*rs6000_legitimize_reload_address_ptr) (rtx, machine_mode, int, int, > int, int *) > @@ -1411,6 +1414,12 @@ static const struct attribute_spec rs600 > #undef TARGET_SET_UP_BY_PROLOGUE > #define TARGET_SET_UP_BY_PROLOGUE rs6000_set_up_by_prologue > > +#undef TARGET_EXTRA_LIVE_ON_ENTRY > +#define TARGET_EXTRA_LIVE_ON_ENTRY rs6000_live_on_entry > + > +#undef TARGET_INTERNAL_ARG_POINTER > +#define TARGET_INTERNAL_ARG_POINTER rs6000_internal_arg_pointer > + > #undef TARGET_HAVE_TLS > #define TARGET_HAVE_TLS HAVE_AS_TLS > > @@ -11150,7 +11159,7 @@ setup_incoming_varargs (cumulative_args_ > else > { > first_reg_offset = next_cum.words; > - save_area = virtual_incoming_args_rtx; > + save_area = crtl->args.internal_arg_pointer; > > if (targetm.calls.must_pass_in_stack (mode, type)) > first_reg_offset += rs6000_arg_size (TYPE_MODE (type), type); > @@ -11344,7 +11353,7 @@ rs6000_va_start (tree valist, rtx nextar > } > > /* Find the overflow area. */ > - t = make_tree (TREE_TYPE (ovf), virtual_incoming_args_rtx); > + t = make_tree (TREE_TYPE (ovf), crtl->args.internal_arg_pointer); > if (words != 0) > t = fold_build_pointer_plus_hwi (t, words * MIN_UNITS_PER_WORD); > t = build2 (MODIFY_EXPR, TREE_TYPE (ovf), ovf, t); > @@ -23425,6 +23434,48 @@ rs6000_reg_live_or_pic_offset_p (int reg > || (DEFAULT_ABI == ABI_DARWIN && flag_pic)))); > } > > +/* Return whether the split-stack arg pointer (r12) is used. */ > + > +static bool > +split_stack_arg_pointer_used_p (void) > +{ > + /* If the pseudo holding the arg pointer is no longer a pseudo, > + then the arg pointer is used. 
*/ > + if (cfun->machine->split_stack_arg_pointer != NULL_RTX > + && (!REG_P (cfun->machine->split_stack_arg_pointer) > + || (REGNO (cfun->machine->split_stack_arg_pointer) > + < FIRST_PSEUDO_REGISTER))) > + return true; > + > + /* Unfortunately we also need to do some code scanning, since > + r12 may have been substituted for the pseudo. */ > + rtx_insn *insn; > + basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun); > + FOR_BB_INSNS (bb, insn) > + if (NONDEBUG_INSN_P (insn)) > + { > + /* A call destroys r12. */ > + if (CALL_P (insn)) > + return false; > + > + df_ref use; > + FOR_EACH_INSN_USE (use, insn) > + { > + rtx x = DF_REF_REG (use); > + if (REG_P (x) && REGNO (x) == 12) > + return true; > + } > + df_ref def; > + FOR_EACH_INSN_DEF (def, insn) > + { > + rtx x = DF_REF_REG (def); > + if (REG_P (x) && REGNO (x) == 12) > + return false; > + } > + } > + return bitmap_bit_p (DF_LR_OUT (bb), 12); > +} > + > /* Emit function prologue as insns. */ > > void > @@ -24376,6 +24427,40 @@ rs6000_emit_prologue (void) > rtx reg = gen_rtx_REG (reg_mode, TOC_REGNUM); > emit_insn (gen_frame_store (reg, sp_reg_rtx, RS6000_TOC_SAVE_SLOT)); > } > + > + if (flag_split_stack && split_stack_arg_pointer_used_p ()) > + { > + /* Set up the arg pointer (r12) for -fsplit-stack code. If > + __morestack was called, it left the arg pointer to the old > + stack in r29. Otherwise, the arg pointer is the top of the > + current frame. 
*/ > + if (frame_off != 0 || REGNO (frame_reg_rtx) != 12) > + { > + rtx r12 = gen_rtx_REG (Pmode, 12); > + if (frame_off == 0) > + emit_move_insn (r12, frame_reg_rtx); > + else > + emit_insn (gen_add3_insn (r12, frame_reg_rtx, GEN_INT (frame_off))); > + } > + if (info->push_p) > + { > + rtx r12 = gen_rtx_REG (Pmode, 12); > + rtx r29 = gen_rtx_REG (Pmode, 29); > + rtx cr7 = gen_rtx_REG (CCUNSmode, CR7_REGNO); > + rtx not_more = gen_label_rtx (); > + rtx jump; > + > + jump = gen_rtx_IF_THEN_ELSE (VOIDmode, > + gen_rtx_GEU (VOIDmode, cr7, const0_rtx), > + gen_rtx_LABEL_REF (VOIDmode, not_more), > + pc_rtx); > + jump = emit_jump_insn (gen_rtx_SET (pc_rtx, jump)); > + JUMP_LABEL (jump) = not_more; > + LABEL_NUSES (not_more) += 1; > + emit_move_insn (r12, r29); > + emit_label (not_more); > + } > + } > } > > /* Output .extern statements for the save/restore routines we use. */ > @@ -25803,6 +25888,178 @@ rs6000_output_function_epilogue (FILE *f > fputs ("\t.align 2\n", file); > } > } > + > +/* -fsplit-stack support. */ > + > +/* A SYMBOL_REF for __morestack. */ > +static GTY(()) rtx morestack_ref; > + > +static rtx > +gen_add3_const (rtx rt, rtx ra, long c) > +{ > + if (TARGET_64BIT) > + return gen_adddi3 (rt, ra, GEN_INT (c)); > + else > + return gen_addsi3 (rt, ra, GEN_INT (c)); > +} > + > +/* Emit -fsplit-stack prologue, which goes before the regular function > + prologue (at local entry point in the case of ELFv2). 
*/ > + > +void > +rs6000_expand_split_stack_prologue (void) > +{ > + rs6000_stack_t *info = rs6000_stack_info (); > + unsigned HOST_WIDE_INT allocate; > + long alloc_hi, alloc_lo; > + rtx r0, r1, r12, lr, ok_label, compare, jump, call_fusage; > + rtx_insn *insn; > + > + gcc_assert (flag_split_stack && reload_completed); > + > + if (!info->push_p) > + return; > + > + allocate = info->total_size; > + if (allocate > (unsigned HOST_WIDE_INT) 1 << 31) > + { > + sorry ("Stack frame larger than 2G is not supported for -fsplit-stack"); > + return; > + } > + if (morestack_ref == NULL_RTX) > + { > + morestack_ref = gen_rtx_SYMBOL_REF (Pmode, "__morestack"); > + SYMBOL_REF_FLAGS (morestack_ref) |= (SYMBOL_FLAG_LOCAL > + | SYMBOL_FLAG_FUNCTION); > + } > + > + r0 = gen_rtx_REG (Pmode, 0); > + r1 = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); > + r12 = gen_rtx_REG (Pmode, 12); > + emit_insn (gen_load_split_stack_limit (r0)); > + /* Always emit two insns here to calculate the requested stack, > + so that the linker can edit them when adjusting size for calling > + non-split-stack code. */ > + alloc_hi = (-allocate + 0x8000) & ~0xffffL; > + alloc_lo = -allocate - alloc_hi; > + if (alloc_hi != 0) > + { > + emit_insn (gen_add3_const (r12, r1, alloc_hi)); > + if (alloc_lo != 0) > + emit_insn (gen_add3_const (r12, r12, alloc_lo)); > + else > + emit_insn (gen_nop ()); > + } > + else > + { > + emit_insn (gen_add3_const (r12, r1, alloc_lo)); > + emit_insn (gen_nop ()); > + } > + > + compare = gen_rtx_REG (CCUNSmode, CR7_REGNO); > + emit_insn (gen_rtx_SET (compare, gen_rtx_COMPARE (CCUNSmode, r12, r0))); > + ok_label = gen_label_rtx (); > + jump = gen_rtx_IF_THEN_ELSE (VOIDmode, > + gen_rtx_GEU (VOIDmode, compare, const0_rtx), > + gen_rtx_LABEL_REF (VOIDmode, ok_label), > + pc_rtx); > + jump = emit_jump_insn (gen_rtx_SET (pc_rtx, jump)); > + JUMP_LABEL (jump) = ok_label; > + /* Mark the jump as very likely to be taken. 
*/ > + add_int_reg_note (jump, REG_BR_PROB, > + REG_BR_PROB_BASE - REG_BR_PROB_BASE / 100); > + > + lr = gen_rtx_REG (Pmode, LR_REGNO); > + insn = emit_move_insn (r0, lr); > + RTX_FRAME_RELATED_P (insn) = 1; > + insn = emit_insn (gen_frame_store (r0, r1, info->lr_save_offset)); > + RTX_FRAME_RELATED_P (insn) = 1; > + > + insn = emit_call_insn (gen_call (gen_rtx_MEM (SImode, morestack_ref), > + const0_rtx, const0_rtx)); > + call_fusage = NULL_RTX; > + use_reg (&call_fusage, r12); > + add_function_usage_to (insn, call_fusage); > + emit_insn (gen_frame_load (r0, r1, info->lr_save_offset)); > + insn = emit_move_insn (lr, r0); > + add_reg_note (insn, REG_CFA_RESTORE, lr); > + RTX_FRAME_RELATED_P (insn) = 1; > + emit_insn (gen_split_stack_return ()); > + > + emit_label (ok_label); > + LABEL_NUSES (ok_label) = 1; > +} > + > +/* Return the internal arg pointer used for function incoming > + arguments. When -fsplit-stack, the arg pointer is r12 so we need > + to copy it to a pseudo in order for it to be preserved over calls > + and suchlike. We'd really like to use a pseudo here for the > + internal arg pointer but data-flow analysis is not prepared to > + accept pseudos as live at the beginning of a function. */ > + > +static rtx > +rs6000_internal_arg_pointer (void) > +{ > + if (flag_split_stack) > + { > + if (cfun->machine->split_stack_arg_pointer == NULL_RTX) > + { > + rtx pat; > + > + cfun->machine->split_stack_arg_pointer = gen_reg_rtx (Pmode); > + REG_POINTER (cfun->machine->split_stack_arg_pointer) = 1; > + > + /* Put the pseudo initialization right after the note at the > + beginning of the function. 
*/ > + pat = gen_rtx_SET (cfun->machine->split_stack_arg_pointer, > + gen_rtx_REG (Pmode, 12)); > + push_topmost_sequence (); > + emit_insn_after (pat, get_insns ()); > + pop_topmost_sequence (); > + } > + return plus_constant (Pmode, cfun->machine->split_stack_arg_pointer, > + FIRST_PARM_OFFSET (current_function_decl)); > + } > + return virtual_incoming_args_rtx; > +} > + > +/* We may have to tell the dataflow pass that the split stack prologue > + is initializing a register. */ > + > +static void > +rs6000_live_on_entry (bitmap regs) > +{ > + if (flag_split_stack) > + bitmap_set_bit (regs, 12); > +} > + > +/* Emit -fsplit-stack dynamic stack allocation space check. */ > + > +void > +rs6000_split_stack_space_check (rtx size, rtx label) > +{ > + rtx sp = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); > + rtx limit = gen_reg_rtx (Pmode); > + rtx requested = gen_reg_rtx (Pmode); > + rtx cmp = gen_reg_rtx (CCUNSmode); > + rtx jump; > + > + emit_insn (gen_load_split_stack_limit (limit)); > + if (CONST_INT_P (size)) > + emit_insn (gen_add3_insn (requested, sp, GEN_INT (-INTVAL (size)))); > + else > + { > + size = force_reg (Pmode, size); > + emit_move_insn (requested, gen_rtx_MINUS (Pmode, sp, size)); > + } > + emit_insn (gen_rtx_SET (cmp, gen_rtx_COMPARE (CCUNSmode, requested, limit))); > + jump = gen_rtx_IF_THEN_ELSE (VOIDmode, > + gen_rtx_GEU (VOIDmode, cmp, const0_rtx), > + gen_rtx_LABEL_REF (VOIDmode, label), > + pc_rtx); > + jump = emit_jump_insn (gen_rtx_SET (pc_rtx, jump)); > + JUMP_LABEL (jump) = label; > +} > \f > /* A C compound statement that outputs the assembler code for a thunk > function, used to implement C++ virtual function calls with > @@ -29811,6 +30068,9 @@ rs6000_elf_file_end (void) > if (TARGET_32BIT || DEFAULT_ABI == ABI_ELFv2) > file_end_indicate_exec_stack (); > #endif > + > + if (flag_split_stack) > + file_end_indicate_split_stack (); > } > #endif > > diff -urpN gcc-stack-info2/gcc/config/rs6000/rs6000.md 
gcc-split-stack1/gcc/config/rs6000/rs6000.md > --- gcc-stack-info2/gcc/config/rs6000/rs6000.md 2015-05-15 14:15:38.177243589 +0930 > +++ gcc-split-stack1/gcc/config/rs6000/rs6000.md 2015-05-15 02:01:15.776472615 +0930 > @@ -140,6 +140,7 @@ > UNSPEC_PACK_128BIT > UNSPEC_LSQ > UNSPEC_FUSION_GPR > + UNSPEC_STACK_CHECK > ]) > > ;; > @@ -157,6 +158,7 @@ > UNSPECV_NLGR ; non-local goto receiver > UNSPECV_MFFS ; Move from FPSCR > UNSPECV_MTFSF ; Move to FPSCR Fields > + UNSPECV_SPLIT_STACK_RETURN ; A camouflaged return > ]) > > \f > @@ -12345,6 +12347,72 @@ > }" > [(set_attr "type" "load")]) > \f > +;; Handle -fsplit-stack. > + > +(define_expand "split_stack_prologue" > + [(const_int 0)] > + "" > +{ > + rs6000_expand_split_stack_prologue (); > + DONE; > +}) > + > +(define_expand "load_split_stack_limit" > + [(set (match_operand 0) > + (unspec [(const_int 0)] UNSPEC_STACK_CHECK))] > + "" > +{ > + emit_insn (gen_rtx_SET (operands[0], > + gen_rtx_UNSPEC (Pmode, > + gen_rtvec (1, const0_rtx), > + UNSPEC_STACK_CHECK))); > + DONE; > +}) > + > +(define_insn "load_split_stack_limit_di" > + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") > + (unspec:DI [(const_int 0)] UNSPEC_STACK_CHECK))] > + "TARGET_64BIT" > + "ld %0,-0x7040(13)" > + [(set_attr "type" "load") > + (set_attr "update" "no") > + (set_attr "indexed" "no")]) > + > +(define_insn "load_split_stack_limit_si" > + [(set (match_operand:SI 0 "gpc_reg_operand" "=r") > + (unspec:SI [(const_int 0)] UNSPEC_STACK_CHECK))] > + "!TARGET_64BIT" > + "lwz %0,-0x7020(2)" > + [(set_attr "type" "load") > + (set_attr "update" "no") > + (set_attr "indexed" "no")]) > + > +;; A return instruction which the middle-end doesn't see. > +(define_insn "split_stack_return" > + [(unspec_volatile [(const_int 0)] UNSPECV_SPLIT_STACK_RETURN)] > + "" > + "blr" > + [(set_attr "type" "jmpreg")]) > + > +;; If there are operand 0 bytes available on the stack, jump to > +;; operand 1. 
> +(define_expand "split_stack_space_check" > + [(set (match_dup 2) > + (unspec [(const_int 0)] UNSPEC_STACK_CHECK)) > + (set (match_dup 3) > + (minus (reg STACK_POINTER_REGNUM) > + (match_operand 0))) > + (set (match_dup 4) (compare:CCUNS (match_dup 3) (match_dup 2))) > + (set (pc) (if_then_else > + (geu (match_dup 4) (const_int 0)) > + (label_ref (match_operand 1)) > + (pc)))] > + "" > +{ > + rs6000_split_stack_space_check (operands[0], operands[1]); > + DONE; > +}) > +\f > (define_insn "bpermd_<mode>" > [(set (match_operand:P 0 "gpc_reg_operand" "=r") > (unspec:P [(match_operand:P 1 "gpc_reg_operand" "r") > diff -urpN gcc-stack-info2/gcc/config/rs6000/rs6000-protos.h gcc-split-stack1/gcc/config/rs6000/rs6000-protos.h > --- gcc-stack-info2/gcc/config/rs6000/rs6000-protos.h 2015-05-15 14:15:38.149244726 +0930 > +++ gcc-split-stack1/gcc/config/rs6000/rs6000-protos.h 2015-05-15 01:57:37.417258829 +0930 > @@ -191,6 +191,8 @@ extern void rs6000_emit_prologue (void); > extern void rs6000_emit_load_toc_table (int); > extern unsigned int rs6000_dbx_register_number (unsigned int, unsigned int); > extern void rs6000_emit_epilogue (int); > +extern void rs6000_expand_split_stack_prologue (void); > +extern void rs6000_split_stack_space_check (rtx, rtx); > extern void rs6000_emit_eh_reg_restore (rtx, rtx); > extern const char * output_isel (rtx *); > extern void rs6000_call_aix (rtx, rtx, rtx, rtx); > diff -urpN gcc-stack-info2/libgcc/config/rs6000/morestack.S gcc-split-stack1/libgcc/config/rs6000/morestack.S > --- gcc-stack-info2/libgcc/config/rs6000/morestack.S 1970-01-01 09:30:00.000000000 +0930 > +++ gcc-split-stack1/libgcc/config/rs6000/morestack.S 2015-05-15 14:54:02.247603731 +0930 > @@ -0,0 +1,351 @@ > +#ifdef __powerpc64__ > +# PowerPC64 support for -fsplit-stack. > +# Copyright (C) 2009-2015 Free Software Foundation, Inc. > +# Contributed by Alan Modra <amodra@gmail.com>. > + > +# This file is part of GCC. 
> + > +# GCC is free software; you can redistribute it and/or modify it under > +# the terms of the GNU General Public License as published by the Free > +# Software Foundation; either version 3, or (at your option) any later > +# version. > + > +# GCC is distributed in the hope that it will be useful, but WITHOUT ANY > +# WARRANTY; without even the implied warranty of MERCHANTABILITY or > +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License > +# for more details. > + > +# Under Section 7 of GPL version 3, you are granted additional > +# permissions described in the GCC Runtime Library Exception, version > +# 3.1, as published by the Free Software Foundation. > + > +# You should have received a copy of the GNU General Public License and > +# a copy of the GCC Runtime Library Exception along with this program; > +# see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > +# <http://www.gnu.org/licenses/>. > + > +#if _CALL_ELF == 2 > + .abiversion 2 > +#define PARAMS 32 > +#else > + .abiversion 1 > +#define PARAMS 48 > +#endif > +#define MORESTACK_FRAMESIZE (PARAMS+96) > +#define PARAMREG_SAVE -MORESTACK_FRAMESIZE+PARAMS+0 > +#define STATIC_CHAIN_SAVE -MORESTACK_FRAMESIZE+PARAMS+64 > +#define R29_SAVE -MORESTACK_FRAMESIZE+PARAMS+72 > +#define LINKREG_SAVE -MORESTACK_FRAMESIZE+PARAMS+80 > +#define NEWSTACKSIZE_SAVE -MORESTACK_FRAMESIZE+PARAMS+88 > + > +# Excess space needed to call ld.so resolver for lazy plt > +# resolution. Go uses sigaltstack so this doesn't need to > +# also cover signal frame size. > +#define BACKOFF 4096 > +# Large excess allocated when calling non-split-stack code. 
> +#define NON_SPLIT_STACK 0x100000 > + > + > +#if _CALL_ELF == 2 > + > +#define BODY_LABEL(name) name > + > +#define ENTRY0(name) \ > + .global name; \ > + .hidden name; \ > + .type name,@function; \ > +name##: > + > +#define ENTRY(name) \ > + ENTRY0(name); \ > +0: addis %r2,%r12,.TOC.-0b@ha; \ > + addi %r2,%r2,.TOC.-0b@l; \ > + .localentry name, .-name > + > +#else > + > +#define BODY_LABEL(name) .L.##name > + > +#define ENTRY0(name) \ > + .global name; \ > + .hidden name; \ > + .type name,@function; \ > + .pushsection ".opd","aw"; \ > + .p2align 3; \ > +name##: .quad BODY_LABEL (name), .TOC.@tocbase, 0; \ > + .popsection; \ > +BODY_LABEL(name)##: > + > +#define ENTRY(name) ENTRY0(name) > + > +#endif > + > +#define SIZE(name) .size name, .-BODY_LABEL(name) > + > + > + .text > +# Just like __morestack, but with larger excess allocation > +ENTRY0(__morestack_non_split) > +.LFB1: > + .cfi_startproc > +# We use a cleanup to restore the tcbhead_t.__private_ss if > +# an exception is thrown through this code. > +#ifdef __PIC__ > + .cfi_personality 0x9b,DW.ref.__gcc_personality_v0 > + .cfi_lsda 0x1b,.LLSDA1 > +#else > + .cfi_personality 0x3,__gcc_personality_v0 > + .cfi_lsda 0x3,.LLSDA1 > +#endif > +# LR is already saved by the split-stack prologue code. > +# We may as well have the unwinder skip over the call in the > +# prologue too. > + .cfi_offset %lr,16 > + > + addis %r12,%r12,-NON_SPLIT_STACK@h > + SIZE (__morestack_non_split) > +# Fall through into __morestack > + > + > +# This function is called with non-standard calling conventions. > +# On entry, r12 is the requested stack pointer. 
One version of the > +# split-stack prologue that calls __morestack looks like > +# ld %r0,-0x7000-64(%r13) > +# addis %r12,%r1,-allocate@ha > +# addi %r12,%r12,-allocate@l > +# cmpld %r12,%r0 > +# bge+ enough > +# mflr %r0 > +# std %r0,16(%r1) > +# bl __morestack > +# ld %r0,16(%r1) > +# mtlr %r0 > +# blr > +# enough: > +# The normal function prologue follows here, with a small addition at > +# the end to set up the arg pointer. The arg pointer is set up with: > +# addi %r12,%r1,offset > +# bge %cr7,.+8 > +# mr %r12,%r29 > +# > +# Note that the lr save slot 16(%r1) has already been used. > +# r3 thru r11 possibly contain arguments and a static chain > +# pointer for the function we're calling, so must be preserved. > +# cr7 must also be preserved. > + > +ENTRY0(__morestack) > +# Save parameter passing registers, our arguments, lr, r29 > +# and use r29 as a frame pointer. > + std %r3,PARAMREG_SAVE+0(%r1) > + sub %r3,%r1,%r12 # calculate requested stack size > + mflr %r12 > + std %r4,PARAMREG_SAVE+8(%r1) > + std %r5,PARAMREG_SAVE+16(%r1) > + std %r6,PARAMREG_SAVE+24(%r1) > + std %r7,PARAMREG_SAVE+32(%r1) > + addi %r3,%r3,BACKOFF > + std %r8,PARAMREG_SAVE+40(%r1) > + std %r9,PARAMREG_SAVE+48(%r1) > + std %r10,PARAMREG_SAVE+56(%r1) > + std %r11,STATIC_CHAIN_SAVE(%r1) > + std %r29,R29_SAVE(%r1) > + std %r12,LINKREG_SAVE(%r1) > + std %r3,NEWSTACKSIZE_SAVE(%r1) # new stack size > + mr %r29,%r1 > + .cfi_offset %r29,R29_SAVE > + .cfi_def_cfa_register %r29 > + stdu %r1,-MORESTACK_FRAMESIZE(%r1) > + > + # void __morestack_block_signals (void) > + bl __morestack_block_signals > + > + # void *__generic_morestack (size_t *pframe_size, > + # void *old_stack, > + # size_t param_size) > + addi %r3,%r29,NEWSTACKSIZE_SAVE > + mr %r4,%r29 > + li %r5,0 # no copying from old stack > + bl __generic_morestack > + > +# Start using new stack > + stdu %r29,-32(%r3) # back-chain > + mr %r1,%r3 > + > +# Set __private_ss stack guard for the new stack. 
> + ld %r12,NEWSTACKSIZE_SAVE(%r29) # modified size > + addi %r3,%r3,BACKOFF-32 > + sub %r3,%r3,%r12 > +# Note that a signal frame has $pc pointing at the instruction > +# where the signal occurred. For something like a timer > +# interrupt this means the instruction has already executed, > +# thus the region starts at the instruction modifying > +# __private_ss, not one instruction after. > +.LEHB0: > + std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss > + > + # void __morestack_unblock_signals (void) > + bl __morestack_unblock_signals > + > +# Set up for a call to the target function, located 3 > +# instructions after __morestack's return address. > +# > + ld %r12,LINKREG_SAVE(%r29) > + ld %r3,PARAMREG_SAVE+0(%r29) # restore arg regs > + ld %r4,PARAMREG_SAVE+8(%r29) > + ld %r5,PARAMREG_SAVE+16(%r29) > + ld %r6,PARAMREG_SAVE+24(%r29) > + ld %r7,PARAMREG_SAVE+32(%r29) > + ld %r8,PARAMREG_SAVE+40(%r29) > + ld %r9,PARAMREG_SAVE+48(%r29) > + addi %r0,%r12,12 # add 3 instructions > + ld %r10,PARAMREG_SAVE+56(%r29) > + ld %r11,STATIC_CHAIN_SAVE(%r29) > + cmpld %cr7,%r12,%r0 # indicate we were called > + mtctr %r0 > + bctrl # call caller! > + > +# On return, save regs possibly used to return a value, and > +# possibly trashed by calls to __morestack_block_signals, > +# __generic_releasestack and __morestack_unblock_signals. > +# Assume those calls don't use vector or floating point regs. 
> + std %r3,PARAMREG_SAVE+0(%r29) > + std %r4,PARAMREG_SAVE+8(%r29) > + std %r5,PARAMREG_SAVE+16(%r29) > + std %r6,PARAMREG_SAVE+24(%r29) > +#if _CALL_ELF == 2 > + std %r7,PARAMREG_SAVE+32(%r29) > + std %r8,PARAMREG_SAVE+40(%r29) > + std %r9,PARAMREG_SAVE+48(%r29) > + std %r10,PARAMREG_SAVE+56(%r29) > +#endif > + > + bl __morestack_block_signals > + > + # void *__generic_releasestack (size_t *pavailable) > + addi %r3,%r29,NEWSTACKSIZE_SAVE > + bl __generic_releasestack > + > +# Reset __private_ss stack guard to value for old stack > + ld %r12,NEWSTACKSIZE_SAVE(%r29) > + addi %r3,%r3,BACKOFF > + sub %r3,%r3,%r12 > +.LEHE0: > + std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss > + > + bl __morestack_unblock_signals > + > +# Use old stack again. > + mr %r1,%r29 > + > +# Restore return value regs, and return. > + ld %r0,LINKREG_SAVE(%r29) > + mtlr %r0 > + ld %r3,PARAMREG_SAVE+0(%r29) > + ld %r4,PARAMREG_SAVE+8(%r29) > + ld %r5,PARAMREG_SAVE+16(%r29) > + ld %r6,PARAMREG_SAVE+24(%r29) > +#if _CALL_ELF == 2 > + ld %r7,PARAMREG_SAVE+32(%r29) > + ld %r8,PARAMREG_SAVE+40(%r29) > + ld %r9,PARAMREG_SAVE+48(%r29) > + ld %r10,PARAMREG_SAVE+56(%r29) > +#endif > + ld %r29,R29_SAVE(%r29) > + .cfi_def_cfa_register %r1 > + blr > + > +# This is the cleanup code called by the stack unwinder when > +# unwinding through code between .LEHB0 and .LEHE0 above. 
> +cleanup: > + .cfi_def_cfa_register %r29 > + std %r3,PARAMREG_SAVE(%r29) # Save exception header > + # size_t __generic_findstack (void *stack) > + mr %r3,%r29 > + bl __generic_findstack > + sub %r3,%r29,%r3 > + addi %r3,%r3,BACKOFF > + std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss > + ld %r3,PARAMREG_SAVE(%r29) > + bl _Unwind_Resume > + nop > + .cfi_endproc > + SIZE (__morestack) > + > + > + .section .gcc_except_table,"a",@progbits > + .p2align 2 > +.LLSDA1: > + .byte 0xff # @LPStart format (omit) > + .byte 0xff # @TType format (omit) > + .byte 0x1 # call-site format (uleb128) > + .uleb128 .LLSDACSE1-.LLSDACSB1 # Call-site table length > +.LLSDACSB1: > + .uleb128 .LEHB0-.LFB1 # region 0 start > + .uleb128 .LEHE0-.LEHB0 # length > + .uleb128 cleanup-.LFB1 # landing pad > + .uleb128 0 # no action, ie. a cleanup > +.LLSDACSE1: > + > + > +#ifdef __PIC__ > +# Build a position independent reference to the personality function. > + .hidden DW.ref.__gcc_personality_v0 > + .weak DW.ref.__gcc_personality_v0 > + .section .data.DW.ref.__gcc_personality_v0,"awG",@progbits,DW.ref.__gcc_personality_v0,comdat > + .p2align 3 > +DW.ref.__gcc_personality_v0: > + .quad __gcc_personality_v0 > + .type DW.ref.__gcc_personality_v0, @object > + .size DW.ref.__gcc_personality_v0, 8 > +#endif > + > + > + .text > +# Initialize the stack guard when the program starts or when a > +# new thread starts. This is called from a constructor. > +# void __stack_split_initialize (void) > +ENTRY(__stack_split_initialize) > + addi %r3,%r1,-0x4000 # We should have at least 16K. 
> + std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss > + # void __generic_morestack_set_initial_sp (void *sp, size_t len) > + mr %r3,%r1 > + li %r4, 0x4000 > + b __generic_morestack_set_initial_sp > + SIZE (__stack_split_initialize) > + > + > +# Return current __private_ss > +# void *__morestack_get_guard (void) > +ENTRY0(__morestack_get_guard) > + ld %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss > + blr > + SIZE (__morestack_get_guard) > + > + > +# Set __private_ss > +# void __morestack_set_guard (void *ptr) > +ENTRY0(__morestack_set_guard) > + std %r3,-0x7000-64(%r13) # tcbhead_t.__private_ss > + blr > + SIZE (__morestack_set_guard) > + > + > +# Return the stack guard value for given stack > +# void *__morestack_make_guard (void *stack, size_t size) > +ENTRY0(__morestack_make_guard) > + sub %r3,%r3,%r4 > + addi %r3,%r3,BACKOFF > + blr > + SIZE (__morestack_make_guard) > + > + > +# Make __stack_split_initialize a high priority constructor. > + .section .ctors.65535,"aw",@progbits > + .p2align 3 > + .quad __stack_split_initialize > + .quad __morestack_load_mmap > + > + .section .note.GNU-stack,"",@progbits > + .section .note.GNU-split-stack,"",@progbits > + .section .note.GNU-no-split-stack,"",@progbits > +#endif /* __powerpc64__ */ > diff -urpN gcc-stack-info2/libgcc/config/rs6000/t-stack-rs6000 gcc-split-stack1/libgcc/config/rs6000/t-stack-rs6000 > --- gcc-stack-info2/libgcc/config/rs6000/t-stack-rs6000 1970-01-01 09:30:00.000000000 +0930 > +++ gcc-split-stack1/libgcc/config/rs6000/t-stack-rs6000 2015-05-15 01:57:37.429258346 +0930 > @@ -0,0 +1,2 @@ > +# Makefile fragment to support -fsplit-stack for powerpc. 
> +LIB2ADD_ST += $(srcdir)/config/rs6000/morestack.S > diff -urpN gcc-stack-info2/libgcc/config.host gcc-split-stack1/libgcc/config.host > --- gcc-stack-info2/libgcc/config.host 2015-05-15 14:15:38.193242938 +0930 > +++ gcc-split-stack1/libgcc/config.host 2015-05-15 01:57:37.429258346 +0930 > @@ -1021,6 +1021,7 @@ powerpc-*-rtems*) > ;; > powerpc*-*-linux*) > tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-savresfgpr rs6000/t-crtstuff rs6000/t-linux t-dfprules rs6000/t-ppc64-fp t-slibgcc-libgcc" > + tmake_file="${tmake_file} t-stack rs6000/t-stack-rs6000" > case $ppc_fp_type in > 64) > ;; > diff -urpN gcc-stack-info2/libgcc/generic-morestack.c gcc-split-stack1/libgcc/generic-morestack.c > --- gcc-stack-info2/libgcc/generic-morestack.c 2015-05-15 14:15:38.193242938 +0930 > +++ gcc-split-stack1/libgcc/generic-morestack.c 2015-05-15 01:57:37.429258346 +0930 > @@ -23,6 +23,9 @@ a copy of the GCC Runtime Library Except > see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > <http://www.gnu.org/licenses/>. */ > > +/* powerpc 32-bit not supported. */ > +#if !defined __powerpc__ || defined __powerpc64__ > + > #include "tconfig.h" > #include "tsystem.h" > #include "coretypes.h" > @@ -935,6 +938,7 @@ __splitstack_find (void *segment_arg, vo > nsp -= 12 * sizeof (void *); > #elif defined (__i386__) > nsp -= 6 * sizeof (void *); > +#elif defined __powerpc64__ > #else > #error "unrecognized target" > #endif > @@ -1170,3 +1174,4 @@ __splitstack_find_context (void *context > } > > #endif /* !defined (inhibit_libc) */ > +#endif /* not powerpc 32-bit */ > ^ permalink raw reply [flat|nested] 32+ messages in thread
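Editorial sketch, not part of the posted patch: the prologue code in rs6000_expand_split_stack_prologue splits the negated frame size into a high part and a low part so that it can always be materialised with two add instructions. The C below mirrors that arithmetic on plain integers. The helper names split_allocate and check_split are hypothetical and exist only for illustration. The sketch assumes the high part is a multiple of 0x10000 and the low part fits a signed 16-bit immediate, which is what the expression in the patch appears to guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: split -allocate into hi + lo the same way the
   patch computes alloc_hi and alloc_lo.  hi is a multiple of 0x10000,
   lo is what remains after subtracting hi.  */
void
split_allocate (uint64_t allocate, int64_t *hi, int64_t *lo)
{
  int64_t neg = -(int64_t) allocate;

  *hi = (neg + 0x8000) & ~(int64_t) 0xffff;
  *lo = neg - *hi;
}

/* Hypothetical helper: assert the properties the two-insn sequence
   relies on for a given frame size.  */
void
check_split (uint64_t allocate)
{
  int64_t hi, lo;

  split_allocate (allocate, &hi, &lo);
  assert (hi + lo == -(int64_t) allocate);   /* Nothing is lost.  */
  assert ((hi & 0xffff) == 0);               /* hi has no low 16 bits.  */
  assert (lo >= -0x8000 && lo <= 0x7fff);    /* lo is a signed 16-bit value.  */
}
```

Rounding with + 0x8000 before masking is what keeps the low part inside the signed 16-bit range; a frame of 0x8001 bytes, for example, becomes hi = -0x10000 and lo = 0x7fff.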
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-19 12:48 ` Lynn A. Boger @ 2015-05-20 1:03 ` Alan Modra 2015-05-20 12:14 ` Lynn A. Boger 0 siblings, 1 reply; 32+ messages in thread From: Alan Modra @ 2015-05-20 1:03 UTC (permalink / raw) To: Lynn A. Boger; +Cc: gcc-patches, David Edelsohn On Tue, May 19, 2015 at 07:40:15AM -0500, Lynn A. Boger wrote: > Questions on the use of the options for split stack: > > - The way this is implemented, split stack is generated if the > target platform supports split stack, on ppc64/ppc64le as well > as on x86, and the use of -fno-split-stack doesn't seem to affect it > for any of these. Is that the way it should work? I would expect > -fno-split-stack to disable it completely. Can you give a testcase to show what you mean? Picking one of the go testsuite programs at random, I see $ gcc/xgcc -Bgcc/ -S -I powerpc64le-linux/libgo /src/gcc-virgin/gcc/testsuite/go.test/test/args.go $ grep morestack args.s bl __morestack bl __morestack $ gcc/xgcc -Bgcc/ -fno-split-stack -S -I powerpc64le-linux/libgo /src/gcc-virgin/gcc/testsuite/go.test/test/args.go $ grep morestack args.s $ That shows -fno-split-stack being honoured. > - The comments say that the gold linker is used for some > situations but I don't see any reference in the code to enabling > the gold linker for ppc64le, ppc64, or x86. Is the user expected > to add the option for the gold linker if needed? At the moment I believe this is true. -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-20 1:03 ` Alan Modra @ 2015-05-20 12:14 ` Lynn A. Boger 2015-05-20 12:59 ` David Edelsohn 0 siblings, 1 reply; 32+ messages in thread From: Lynn A. Boger @ 2015-05-20 12:14 UTC (permalink / raw) To: gcc-patches, David Edelsohn On 05/19/2015 07:52 PM, Alan Modra wrote: > On Tue, May 19, 2015 at 07:40:15AM -0500, Lynn A. Boger wrote: >> Questions on the use of the options for split stack: >> >> - The way this is implemented, split stack is generated if the >> target platform supports split stack, on ppc64/ppc64le as well >> as on x86, and the use of -fno-split-stack doesn't seem to affect it >> for any of these. Is that the way it should work? I would expect >> -fno-split-stack to disable it completely. > Can you give a testcase to show what you mean? Picking one of the go > testsuite programs at random, I see > $ gcc/xgcc -Bgcc/ -S -I powerpc64le-linux/libgo /src/gcc-virgin/gcc/testsuite/go.test/test/args.go > $ grep morestack args.s > bl __morestack > bl __morestack > $ gcc/xgcc -Bgcc/ -fno-split-stack -S -I powerpc64le-linux/libgo /src/gcc-virgin/gcc/testsuite/go.test/test/args.go > $ grep morestack args.s > $ > That shows -fno-split-stack being honoured. You are correct. I made some mistake in my testing. >> - The comments say that the gold linker is used for some >> situations but I don't see any reference in the code to enabling >> the gold linker for ppc64le, ppc64, or x86. Is the user expected >> to add the option for the gold linker if needed? > At the moment I believe this is true. I have been trying to use the gold linker with your patch and it seems to work fine. I added the following to the STACK_SPLIT_SPEC in gcc/gcc.c to enable the gold linker if -fsplit-stack is set, but that will cause problems on systems where the gold linker (and the correct level of binutils for Power) is not available. Is the gold linker an absolute requirement for using split stack? 
Could configure determine whether gold is available and generate this one way or another? --- gcc.c (revision 223217) +++ gcc.c (working copy) @@ -541,7 +541,8 @@ proper position among the other output files. */ libgcc. This is not yet a real spec, though it could become one; it is currently just stuffed into LINK_SPEC. FIXME: This wrapping only works with GNU ld and gold. */ -#define STACK_SPLIT_SPEC " %{fsplit-stack: --wrap=pthread_create}" +#define STACK_SPLIT_SPEC \ + " %{fsplit-stack: --wrap=pthread_create -fuse-ld=gold}" #ifndef LIBASAN_SPEC #define STATIC_LIBASAN_LIBS \ ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-20 12:14 ` Lynn A. Boger @ 2015-05-20 12:59 ` David Edelsohn 2015-05-20 16:01 ` Lynn A. Boger 0 siblings, 1 reply; 32+ messages in thread From: David Edelsohn @ 2015-05-20 12:59 UTC (permalink / raw) To: Lynn A. Boger; +Cc: GCC Patches On Wed, May 20, 2015 at 8:13 AM, Lynn A. Boger <laboger@linux.vnet.ibm.com> wrote: > > > On 05/19/2015 07:52 PM, Alan Modra wrote: >> >> On Tue, May 19, 2015 at 07:40:15AM -0500, Lynn A. Boger wrote: >>> >>> Questions on the use of the options for split stack: >>> >>> - The way this is implemented, split stack is generated if the >>> target platform supports split stack, on ppc64/ppc64le as well >>> as on x86, and the use of -fno-split-stack doesn't seem to affect it >>> for any of these. Is that the way it should work? I would expect >>> -fno-split-stack to disable it completely. >> >> Can you give a testcase to show what you mean? Picking one of the go >> testsuite programs at random, I see >> $ gcc/xgcc -Bgcc/ -S -I powerpc64le-linux/libgo >> /src/gcc-virgin/gcc/testsuite/go.test/test/args.go >> $ grep morestack args.s >> bl __morestack >> bl __morestack >> $ gcc/xgcc -Bgcc/ -fno-split-stack -S -I powerpc64le-linux/libgo >> /src/gcc-virgin/gcc/testsuite/go.test/test/args.go >> $ grep morestack args.s >> $ >> That shows -fno-split-stack being honoured. > > You are correct. I made some mistake in my testing. >>> >>> - The comments say that the gold linker is used for some >>> situations but I don't see any reference in the code to enabling >>> the gold linker for ppc64le, ppc64, or x86. Is the user expected >>> to add the option for the gold linker if needed? >> >> At the moment I believe this is true. > > > I have been trying to use the gold linker with your patch and seems to work > fine. 
I added the following to > the STACK_SPLIT_SPEC in gcc/gcc.c to enable the gold linker if -fsplit-stack > is set, but that will cause problems > on systems where the gold linker (and the correct level of binutils for > Power) is not available. Is this an > absolute requirement to use split stack? Could the configure determine if > gold is available and > generate this one way or another? > > --- gcc.c (revision 223217) > +++ gcc.c (working copy) > @@ -541,7 +541,8 @@ proper position among the other output files. */ > libgcc. This is not yet a real spec, though it could become one; > it is currently just stuffed into LINK_SPEC. FIXME: This wrapping > only works with GNU ld and gold. */ > -#define STACK_SPLIT_SPEC " %{fsplit-stack: --wrap=pthread_create}" > +#define STACK_SPLIT_SPEC \ > + " %{fsplit-stack: --wrap=pthread_create -fuse-ld=gold}" > > #ifndef LIBASAN_SPEC > #define STATIC_LIBASAN_LIBS \ Lynn, split-stack does not require Gold linker. This is a non-starter. Gold is necessary for some corner cases of mixing split-stack and non-split-stack modules. - David ^ permalink raw reply [flat|nested] 32+ messages in thread
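Editorial sketch, not part of the thread: the morestack.S in the posted patch has a second entry point, __morestack_non_split, described there as "just like __morestack, but with larger excess allocation", which is relevant to the mixed split-stack and non-split-stack case mentioned above. The C below models only the size arithmetic visible in that assembly (BACKOFF, NON_SPLIT_STACK, the prologue comparison and __morestack_make_guard). The function names are hypothetical, and how a linker arranges for the larger allocation to be requested is not modelled because the quoted patch text does not show it.

```c
#include <assert.h>
#include <stdint.h>

#define BACKOFF 4096              /* Value from the posted morestack.S.  */
#define NON_SPLIT_STACK 0x100000  /* Value from the posted morestack.S.  */

/* Model of the prologue test "cmpld %r12,%r0; bge+ enough": the
   requested stack pointer must not fall below the guard value.  */
int
enough_stack (uint64_t sp, uint64_t allocate, uint64_t guard)
{
  return sp - allocate >= guard;
}

/* Model of "sub %r3,%r1,%r12; addi %r3,%r3,BACKOFF" in __morestack:
   the size handed to __generic_morestack.  */
uint64_t
morestack_request (uint64_t sp, uint64_t requested_sp)
{
  return sp - requested_sp + BACKOFF;
}

/* Model of __morestack_non_split: lower the requested stack pointer
   by NON_SPLIT_STACK, then fall through into __morestack.  */
uint64_t
morestack_non_split_request (uint64_t sp, uint64_t requested_sp)
{
  return morestack_request (sp, requested_sp - NON_SPLIT_STACK);
}

/* Model of __morestack_make_guard: "sub %r3,%r3,%r4; addi %r3,%r3,BACKOFF".  */
uint64_t
make_guard (uint64_t stack, uint64_t size)
{
  return stack - size + BACKOFF;
}
```

Under these assumptions a 256-byte frame asks for 256 + 4096 bytes through the ordinary entry point and an extra 0x100000 bytes through the non-split one, which illustrates why code that will call functions without their own stack checks is given a much larger segment.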
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-20 12:59 ` David Edelsohn @ 2015-05-20 16:01 ` Lynn A. Boger 0 siblings, 0 replies; 32+ messages in thread From: Lynn A. Boger @ 2015-05-20 16:01 UTC (permalink / raw) To: David Edelsohn; +Cc: GCC Patches Any time Go code built with gccgo is linked against libraries built with gcc (without split stack), split-stack and non-split-stack code can end up mixed. I think that will be a common case. My understanding is that if you don't use the gold linker in these cases, the app could fail and it won't be clear why. Maybe the gold linker isn't required to make it work in most cases, but some cases will fail without it. On 05/20/2015 07:58 AM, David Edelsohn wrote: > On Wed, May 20, 2015 at 8:13 AM, Lynn A. Boger > <laboger@linux.vnet.ibm.com> wrote: >> >> On 05/19/2015 07:52 PM, Alan Modra wrote: >>> On Tue, May 19, 2015 at 07:40:15AM -0500, Lynn A. Boger wrote: >>>> Questions on the use of the options for split stack: >>>> >>>> - The way this is implemented, split stack is generated if the >>>> target platform supports split stack, on ppc64/ppc64le as well >>>> as on x86, and the use of -fno-split-stack doesn't seem to affect it >>>> for any of these. Is that the way it should work? I would expect >>>> -fno-split-stack to disable it completely. >>> Can you give a testcase to show what you mean? Picking one of the go >>> testsuite programs at random, I see >>> $ gcc/xgcc -Bgcc/ -S -I powerpc64le-linux/libgo >>> /src/gcc-virgin/gcc/testsuite/go.test/test/args.go >>> $ grep morestack args.s >>> bl __morestack >>> bl __morestack >>> $ gcc/xgcc -Bgcc/ -fno-split-stack -S -I powerpc64le-linux/libgo >>> /src/gcc-virgin/gcc/testsuite/go.test/test/args.go >>> $ grep morestack args.s >>> $ >>> That shows -fno-split-stack being honoured. >> You are correct. I made some mistake in my testing. 
>>>> - The comments say that the gold linker is used for some >>>> situations but I don't see any reference in the code to enabling >>>> the gold linker for ppc64le, ppc64, or x86. Is the user expected >>>> to add the option for the gold linker if needed? >>> At the moment I believe this is true. >> >> I have been trying to use the gold linker with your patch and seems to work >> fine. I added the following to >> the STACK_SPLIT_SPEC in gcc/gcc.c to enable the gold linker if -fsplit-stack >> is set, but that will cause problems >> on systems where the gold linker (and the correct level of binutils for >> Power) is not available. Is this an >> absolute requirement to use split stack? Could the configure determine if >> gold is available and >> generate this one way or another? >> >> --- gcc.c (revision 223217) >> +++ gcc.c (working copy) >> @@ -541,7 +541,8 @@ proper position among the other output files. */ >> libgcc. This is not yet a real spec, though it could become one; >> it is currently just stuffed into LINK_SPEC. FIXME: This wrapping >> only works with GNU ld and gold. */ >> -#define STACK_SPLIT_SPEC " %{fsplit-stack: --wrap=pthread_create}" >> +#define STACK_SPLIT_SPEC \ >> + " %{fsplit-stack: --wrap=pthread_create -fuse-ld=gold}" >> >> #ifndef LIBASAN_SPEC >> #define STATIC_LIBASAN_LIBS \ > Lynn, > > split-stack does not require Gold linker. This is a non-starter. > > Gold is necessary for some corner cases of mixing split-stack and > non-split-stack modules. > > - David > > > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-18 2:55 ` [PATCH 3/4] split-stack for powerpc64 Alan Modra 2015-05-18 7:05 ` Alan Modra 2015-05-19 12:48 ` Lynn A. Boger @ 2015-05-19 14:37 ` David Edelsohn 2015-05-21 2:10 ` Alan Modra 3 siblings, 0 replies; 32+ messages in thread From: David Edelsohn @ 2015-05-19 14:37 UTC (permalink / raw) To: GCC Patches, Alan Modra On Sun, May 17, 2015 at 10:54 PM, Alan Modra <amodra@gmail.com> wrote: > This patch adds -fsplit-stack support for PowerPC64 Linux. I haven't > made any real attempt to support ppc32 at this stage, but that should > mostly be a matter of writing __morestack for ppc32. > > The idea of split-stack is to allocate just enough stack to execute a > function, with checks added before function entry and on alloca to > ensure the stack is large enough. If the stack size is insufficient, a > new stack segment is allocated for the function. The new stack and > old stack are not necessarily contiguous. For powerpc64, function > arguments on the old stack are accessed by using an arg_pointer > register rather than accessing them relative to the stack pointer or > frame pointer as is usually done. (x86 copies function arguments from > the old stack to the new, but needs an arg pointer for variable > argument lists.) Unwinding is handled by a personality routine that > knows how to find stack segments. > > Split-stack prologue on function entry (local entry point for ELFv2) > is as follows. This goes before the usual function prologue. > > entry: > ld %r0,-0x7000-64(%r13) # tcbhead_t.__private_ss > addis %r12,%r1,-allocate@ha > addi %r12,%r12,-allocate@l > cmpld %cr7,%r12,%r0 > bge+ %cr7,enough > mflr %r0 > std %r0,16(%r1) > bl __morestack > ld %r0,16(%r1) > mtlr %r0 > blr > enough: > # usual function prologue, modified a little at the end to set up the > # arg_pointer in %r12, starts here.
The arg_pointer is initialized, > # if it is used, with > addi %r12,%r1,frame_size > bge %cr7,.+8 > mr %r12,%r29 > > Notes: > 1) A function that does not allocate a stack frame, does not have a > split-stack prologue. > > 2) __morestack must be local. __morestack has a non-standard calling > convention, with the desired stack being passed in %r12. It saves arg > passing regs, calls __generic_morestack to allocate a new stack > segment, restores the arg passing regs and sets r29 to point at the > old stack, then calls its return address + 12 to execute the function. > After the function returns __morestack saves return regs, calls > __generic_releasestack, and returns to the split-stack prologue, which > immediately returns. This scheme keeps hardware return prediction > valid. __morestack must also ensure cr7 is correctly set. > > 3) Basic-block reordering (enabled with -O2) will move the six > instructions after the "bge+" out of line. > > 4) When the stack allocation is less than 32k these two instructions > addis %r12,%r1,-allocate@ha > addi %r12,%r12,-allocate@l > are rewritten as > addi %r12,%r1,-allocate > nop > The addi may also be rewritten as a nop in the rare case that the > stack allocation is exactly a multiple of 64k. > > 5) When the linker detects a call from split-stack to non-split-stack > code, it adds 16k (or more) to the value found in "allocate" > instructions. So non-split-stack code gets a larger stack. The > amount is tunable by a linker option. The edit means powerpc64 does > not need to implement __morestack_non_split, necessary on x86 because > insufficient space is available there to edit the stack comparison > code. This feature is only implemented in the GNU gold linker. > > 6) We won't handle >2G stack initially and perhaps never. Supporting > multiple threads each requiring more than 2G of stack is probably not > that important, and likely to OOM at run time. 
(It would be possible > to easily handle up to 4G by rounding the allocation up to a multiple > of 64k and using two addis instructions in the split-stack prologue.) > > 7) If __morestack is called, then there are two stack frames between > the function and its caller. Immediately above is a small 32 byte > frame on the new stack, there so that a back-chain is always present > no matter the value of r1. This could be reduced to 16 bytes but I > thought it better to waste a few bytes for 32-byte alignment in case > powerpc64 goes to 32-byte aligned stacks. Above that frame is the > __morestack frame on the old stack. > > 8) If the normal function prologue uses r12 as a frame pointer, as it > always does when the frame size is larger than 32k, then the arg > pointer is set up with > addi %r12,%r12,to_top_of_frame > bge %cr7,.+8 > mr %r12,%r29 > omitting the addi if to_top_of_frame is zero. > > gcc/ > * common/config/rs6000/rs6000-common.c (TARGET_SUPPORTS_SPLIT_STACK): > Define. > (rs6000_supports_split_stack): New function. > * gcc/config/rs6000/rs6000.c (machine_function): Add > split_stack_arg_pointer. > (TARGET_EXTRA_LIVE_ON_ENTRY, TARGET_INTERNAL_ARG_POINTER): Define. > (setup_incoming_varargs): Use crtl->args.internal_arg_pointer > rather than virtual_incoming_args_rtx. > (rs6000_va_start): Likewise. > (split_stack_arg_pointer_used_p): New function. > (rs6000_emit_prologue): Set up arg pointer for -fsplit-stack. > (morestack_ref): New var. > (gen_add3_const, rs6000_expand_split_stack_prologue, > rs6000_internal_arg_pointer, rs6000_live_on_entry, > rs6000_split_stack_space_check): New functions. > (rs6000_elf_file_end): Call file_end_indicate_split_stack. > * gcc/config/rs6000/rs6000.md (UNSPEC_STACK_CHECK): Define. > (UNSPECV_SPLIT_STACK_RETURN): Define. > (split_stack_prologue, load_split_stack_limit, > load_split_stack_limit_di, load_split_stack_limit_si, > split_stack_return, split_stack_space_check): New expands and insns. 
> * gcc/config/rs6000/rs6000-protos.h > (rs6000_expand_split_stack_prologue): Declare. > (rs6000_split_stack_space_check): Declare. > libgcc/ > * config/rs6000/morestack.S: New. > * config/rs6000/t-stack-rs6000: New. > * config.host (powerpc*-*-linux*): Add t-stack and t-stack-rs6000 > to tmake_file. > * generic-morestack.c: Don't build for powerpc 32-bit. This patch is okay. I'll let you and Lynn discuss the meaning of options. Thanks, David ^ permalink raw reply [flat|nested] 32+ messages in thread
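As an aside on the prologue quoted above, here is a minimal C sketch of the arithmetic behind the addis/addi pair (note 4) and the unsigned limit comparison. It is only a model under stated assumptions: split_ha_l and stack_is_enough are hypothetical helper names, not functions from the patch, and the stack limit is passed in as a plain parameter rather than loaded from the thread control block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: split a signed displacement into the two halves
   used by an addis/addi pair.  addi sign-extends its 16-bit immediate,
   so the high half has to absorb the borrow; that is what the @ha and
   @l operators express.  */
void split_ha_l(int64_t value, int64_t *ha, int64_t *lo)
{
    /* Sign-extended low 16 bits.  */
    int64_t low = ((value & 0xffff) ^ 0x8000) - 0x8000;

    *lo = low;
    /* value - low is an exact multiple of 64k, so the division is exact.  */
    *ha = (value - low) / 65536;
    assert(*ha * 65536 + *lo == value);
}

/* Hypothetical model of
       addis %r12,%r1,-allocate@ha
       addi  %r12,%r12,-allocate@l
       cmpld %cr7,%r12,%r0
       bge+  %cr7,enough
   Returns true when the current segment is large enough, false when the
   prologue would go on to call __morestack.  Assumes allocate <= 2G.  */
bool stack_is_enough(uint64_t sp, uint64_t allocate, uint64_t limit)
{
    int64_t ha, lo;
    uint64_t r12;

    split_ha_l(-(int64_t) allocate, &ha, &lo);
    r12 = sp + (uint64_t) (ha * 65536);  /* addis */
    r12 += (uint64_t) lo;                /* addi */
    return r12 >= limit;                 /* unsigned compare, like cmpld */
}
```

With an allocation of 10144 bytes the high half comes out as zero, which is the case where the addis can become a plain addi followed by a nop; with an allocation of exactly 64k the low half is zero, which is the case where the addi can become a nop.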
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-18 2:55 ` [PATCH 3/4] split-stack for powerpc64 Alan Modra ` (2 preceding siblings ...) 2015-05-19 14:37 ` David Edelsohn @ 2015-05-21 2:10 ` Alan Modra 2015-05-29 16:09 ` Alan Modra 3 siblings, 1 reply; 32+ messages in thread From: Alan Modra @ 2015-05-21 2:10 UTC (permalink / raw) To: gcc-patches, David Edelsohn Older assemblers don't understand .abiversion, so I'm committing the following as obvious to fix a problem Michael Meissner found when building gcc for powerpc64-linux. PR libgcc/66225 * config/rs6000/morestack.S: Remove ".abiversion 1". Index: libgcc/config/rs6000/morestack.S =================================================================== --- libgcc/config/rs6000/morestack.S (revision 223463) +++ libgcc/config/rs6000/morestack.S (working copy) @@ -28,7 +28,6 @@ .abiversion 2 #define PARAMS 32 #else - .abiversion 1 #define PARAMS 48 #endif #define MORESTACK_FRAMESIZE (PARAMS+96) -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-21 2:10 ` Alan Modra @ 2015-05-29 16:09 ` Alan Modra 2015-05-29 17:25 ` David Edelsohn 0 siblings, 1 reply; 32+ messages in thread From: Alan Modra @ 2015-05-29 16:09 UTC (permalink / raw) To: gcc-patches, David Edelsohn Two fixes for -fsplit-stack on powerpc64. I goofed on the block scanned for uses of r12. ENTRY_BLOCK_PTR_FOR_FN is the fake one. The next block is the first one having insns. The second change emits an error if people use a global register variable r29 with -fsplit-stack. For example: register struct important_stuff *quick_access __asm__ ("r29"); Such code does exist in the wild, but probably doesn't currently use -fsplit-stack. The problem is that r29 is saved by morestack then used to pass the old stack pointer to the normal function prologue. So on entry to a function, r29 will be modified. It would be possible to restore r29 from the morestack frame in order to let the function body see the original r29, I wrote such a patch, but that trick isn't safe if someone is using r29 in a signal handler. I suppose we could restore r29 and downgrade the error to a warning that using a global register asm r29 isn't safe in signal handlers with -fsplit-stack. What do you think, David? * config/rs6000/rs6000.c (split_stack_arg_pointer_used_p): Scan correct block for use of r12. (rs6000_expand_split_stack_prologue): Error on r29 asm global reg. Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 223857) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -23450,7 +23450,7 @@ split_stack_arg_pointer_used_p (void) /* Unfortunately we also need to do some code scanning, since r12 may have been substituted for the pseudo. 
*/ rtx_insn *insn; - basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun); + basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb; FOR_BB_INSNS (bb, insn) if (NONDEBUG_INSN_P (insn)) { @@ -25942,6 +25942,13 @@ rs6000_expand_split_stack_prologue (void) if (!info->push_p) return; + if (global_regs[29]) + { + error ("-fsplit-stack uses register r29"); + inform (DECL_SOURCE_LOCATION (global_regs_decl[29]), + "conflicts with %qD", global_regs_decl[29]); + } + allocate = info->total_size; if (allocate > (unsigned HOST_WIDE_INT) 1 << 31) { -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 32+ messages in thread
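To illustrate the first fix above, here is a toy C model of why scanning the entry block finds nothing. The structures toy_insn and toy_bb and both functions are hypothetical, greatly simplified stand-ins for GCC's basic_block and insn chains; the real split_stack_arg_pointer_used_p does more than this sketch shows.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn: only records whether it uses r12.  */
struct toy_insn {
    bool uses_r12;
    struct toy_insn *next;
};

/* Hypothetical stand-in for a basic block.  */
struct toy_bb {
    struct toy_insn *insns;   /* NULL for the fake entry block */
    struct toy_bb *next_bb;
};

/* Walk the insns of one block, in the spirit of FOR_BB_INSNS.  */
bool toy_block_uses_r12(const struct toy_bb *bb)
{
    const struct toy_insn *insn;

    for (insn = bb->insns; insn != NULL; insn = insn->next)
        if (insn->uses_r12)
            return true;
    return false;
}

/* Corrected scan: skip the fake entry block and look at the block that
   follows it, which is the first one holding insns.  */
bool toy_arg_pointer_used(const struct toy_bb *entry)
{
    return toy_block_uses_r12(entry->next_bb);
}
```

Scanning the entry block itself always answers "not used" in this model because it holds no insns, which matches the description of the bug.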
* Re: [PATCH 3/4] split-stack for powerpc64 2015-05-29 16:09 ` Alan Modra @ 2015-05-29 17:25 ` David Edelsohn 0 siblings, 0 replies; 32+ messages in thread From: David Edelsohn @ 2015-05-29 17:25 UTC (permalink / raw) To: GCC Patches, Alan Modra On Fri, May 29, 2015 at 10:47 AM, Alan Modra <amodra@gmail.com> wrote: > Two fixes for -fsplit-stack on powerpc64. I goofed on the block > scanned for uses of r12. ENTRY_BLOCK_PTR_FOR_FN is the fake one. The > next block is the first one having insns. > > The second change emits an error if people use a global register > variable r29 with -fsplit-stack. For example: > register struct important_stuff *quick_access __asm__ ("r29"); > > Such code does exist in the wild, but probably doesn't currently use > -fsplit-stack. > > The problem is that r29 is saved by morestack then used to pass the > old stack pointer to the normal function prologue. So on entry to a > function, r29 will be modified. It would be possible to restore r29 > from the morestack frame in order to let the function body see the > original r29, I wrote such a patch, but that trick isn't safe if > someone is using r29 in a signal handler. > > I suppose we could restore r29 and downgrade the error to a warning > that using a global register asm r29 isn't safe in signal handlers > with -fsplit-stack. What do you think, David? > > * config/rs6000/rs6000.c (split_stack_arg_pointer_used_p): Scan > correct block for use of r12. > (rs6000_expand_split_stack_prologue): Error on r29 asm global reg. Okay. I doubt that many developers explicitly use r29 as a global register. I agree with you that it probably is better to error out completely instead of likely generating buggy code for this particular combination of r29 asm and split stack. Thanks, David ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 4/4] Split-stack arg pointer init refinement 2015-05-18 2:54 [Patch 0/4] PowerPC64 Linux split stack support Alan Modra ` (2 preceding siblings ...) 2015-05-18 2:55 ` [PATCH 3/4] split-stack for powerpc64 Alan Modra @ 2015-05-18 3:42 ` Alan Modra 2015-06-13 12:05 ` [Patch 0/4] PowerPC64 Linux split stack support Andreas Schwab 4 siblings, 0 replies; 32+ messages in thread From: Alan Modra @ 2015-05-18 3:42 UTC (permalink / raw) To: gcc-patches This small refinement to the -fsplit-stack prologue arg pointer initialization improves code generation. Compare the -O2 gcc/testsuite/gcc.dg/split-3.c code for down() below. before after mflr 0 mflr 0 std 31,-8(1) std 31,-8(1) std 0,16(1) mr 12,1 stdu 1,-10144(1) std 0,16(1) addi 12,1,10144 stdu 1,-10144(1) bge 7,.L7 bge 7,.L7 mr 12,29 mr 12,29 .L7: .L7: * config/rs6000/rs6000.c (rs6000_emit_allocate_stack): Return stack adjusting insn. Formatting. (rs6000_emit_prologue): Track stack adjusting insn, and use of r12. If possible, emit first -fsplit-stack arg pointer insn before stack adjust. Don't use r12 to save cr if split-stack. diff -urpN gcc-split-stack1/gcc/config/rs6000/rs6000.c gcc-split-stack2/gcc/config/rs6000/rs6000.c --- gcc-split-stack1/gcc/config/rs6000/rs6000.c 2015-05-18 10:17:11.341628090 +0930 +++ gcc-split-stack2/gcc/config/rs6000/rs6000.c 2015-05-18 10:16:58.758131165 +0930 @@ -22608,7 +22608,7 @@ rs6000_emit_stack_tie (rtx fp, bool hard If COPY_REG, make sure a copy of the old frame is left there. The generated code may use hard register 0 as a temporary. */ -static void +static rtx_insn * rs6000_emit_allocate_stack (HOST_WIDE_INT size, rtx copy_reg, int copy_off) { rtx_insn *insn; @@ -22621,7 +22621,7 @@ rs6000_emit_allocate_stack (HOST_WIDE_IN { warning (0, "stack frame too large"); emit_insn (gen_trap ()); - return; + return 0; } if (crtl->limit_stack) @@ -22672,9 +22672,9 @@ rs6000_emit_allocate_stack (HOST_WIDE_IN insn = emit_insn (TARGET_32BIT ? 
gen_movsi_update_stack (stack_reg, stack_reg, - todec, stack_reg) + todec, stack_reg) : gen_movdi_di_update_stack (stack_reg, stack_reg, - todec, stack_reg)); + todec, stack_reg)); /* Since we didn't use gen_frame_mem to generate the MEM, grab it now and set the alias set/attributes. The above gen_*_update calls will generate a PARALLEL with the MEM set being the first @@ -22692,6 +22692,7 @@ rs6000_emit_allocate_stack (HOST_WIDE_IN add_reg_note (insn, REG_FRAME_RELATED_EXPR, gen_rtx_SET (stack_reg, gen_rtx_PLUS (Pmode, stack_reg, GEN_INT (-size)))); + return insn; } #define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP) @@ -23496,6 +23497,10 @@ rs6000_emit_prologue (void) /* Offset to top of frame for frame_reg and sp respectively. */ HOST_WIDE_INT frame_off = 0; HOST_WIDE_INT sp_off = 0; + /* sp_adjust is the stack adjusting instruction, tracked so that the + insn setting up the split-stack arg pointer can be emitted just + prior to it, when r12 is not used here for other purposes. */ + rtx_insn *sp_adjust = 0; #ifdef ENABLE_CHECKING /* Track and check usage of r0, r11, r12. 
*/ @@ -23714,7 +23719,10 @@ rs6000_emit_prologue (void) ptr_off = info->altivec_save_offset + info->altivec_size; frame_off = -ptr_off; } - rs6000_emit_allocate_stack (info->total_size, ptr_reg, ptr_off); + sp_adjust = rs6000_emit_allocate_stack (info->total_size, + ptr_reg, ptr_off); + if (REGNO (frame_reg_rtx) == 12) + sp_adjust = 0; sp_off = info->total_size; if (frame_reg_rtx != sp_reg_rtx) rs6000_emit_stack_tie (frame_reg_rtx, false); @@ -23755,7 +23763,8 @@ rs6000_emit_prologue (void) if (!WORLD_SAVE_P (info) && info->cr_save_p && REGNO (frame_reg_rtx) != cr_save_regno - && !(using_static_chain_p && cr_save_regno == 11)) + && !(using_static_chain_p && cr_save_regno == 11) + && !(flag_split_stack && cr_save_regno == 12 && sp_adjust)) { cr_save_rtx = gen_rtx_REG (SImode, cr_save_regno); START_USE (cr_save_regno); @@ -23901,6 +23910,8 @@ rs6000_emit_prologue (void) int end_save = info->gp_save_offset + info->gp_size; int ptr_off; + if (ptr_regno == 12) + sp_adjust = 0; if (!ptr_set_up) ptr_reg = gen_rtx_REG (Pmode, ptr_regno); @@ -24219,7 +24230,10 @@ rs6000_emit_prologue (void) } else if (REGNO (frame_reg_rtx) == 1) frame_off = info->total_size; - rs6000_emit_allocate_stack (info->total_size, ptr_reg, ptr_off); + sp_adjust = rs6000_emit_allocate_stack (info->total_size, + ptr_reg, ptr_off); + if (REGNO (frame_reg_rtx) == 12) + sp_adjust = 0; sp_off = info->total_size; if (frame_reg_rtx != sp_reg_rtx) rs6000_emit_stack_tie (frame_reg_rtx, false); @@ -24249,6 +24263,8 @@ rs6000_emit_prologue (void) gcc_checking_assert (scratch_regno == 11 || scratch_regno == 12); NOT_INUSE (0); + if (scratch_regno == 12) + sp_adjust = 0; if (end_save + frame_off != 0) { rtx offset = GEN_INT (end_save + frame_off); @@ -24326,7 +24342,7 @@ rs6000_emit_prologue (void) if ((DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) && !using_static_chain_p) save_regno = 11; - else if (REGNO (frame_reg_rtx) == 12) + else if (flag_split_stack || REGNO (frame_reg_rtx) == 12) { save_regno = 11; 
if (using_static_chain_p) @@ -24372,6 +24388,7 @@ rs6000_emit_prologue (void) rtx lr = gen_rtx_REG (Pmode, LR_REGNO); rtx tmp = gen_rtx_REG (Pmode, 12); + sp_adjust = 0; insn = emit_move_insn (tmp, lr); RTX_FRAME_RELATED_P (insn) = 1; @@ -24434,7 +24451,13 @@ rs6000_emit_prologue (void) __morestack was called, it left the arg pointer to the old stack in r29. Otherwise, the arg pointer is the top of the current frame. */ - if (frame_off != 0 || REGNO (frame_reg_rtx) != 12) + if (sp_adjust) + { + rtx r12 = gen_rtx_REG (Pmode, 12); + rtx set_r12 = gen_rtx_SET (r12, sp_reg_rtx); + emit_insn_before (set_r12, sp_adjust); + } + else if (frame_off != 0 || REGNO (frame_reg_rtx) != 12) { rtx r12 = gen_rtx_REG (Pmode, 12); if (frame_off == 0) -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 32+ messages in thread
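The before/after columns in this patch compute the same arg pointer; the refinement only replaces an addi that depends on the stdu with a register copy issued ahead of it. A small C model of the two sequences follows, as a sketch only: toy_regs and both function names are hypothetical, and the cr7 result left by the split-stack prologue is reduced to a boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register file holding only the registers of interest.  */
struct toy_regs {
    uint64_t r1;   /* stack pointer */
    uint64_t r12;  /* split-stack arg pointer */
    uint64_t r29;  /* old stack pointer left by __morestack */
};

/* "before" column: stdu 1,-frame(1); addi 12,1,frame; bge 7,.L7; mr 12,29 */
void arg_pointer_before_patch(struct toy_regs *r, uint64_t frame,
                              bool morestack_called)
{
    r->r1 -= frame;           /* stdu: allocate the frame */
    r->r12 = r->r1 + frame;   /* addi: top of the new frame */
    if (morestack_called)     /* bge not taken */
        r->r12 = r->r29;
}

/* "after" column: mr 12,1; stdu 1,-frame(1); bge 7,.L7; mr 12,29 */
void arg_pointer_after_patch(struct toy_regs *r, uint64_t frame,
                             bool morestack_called)
{
    r->r12 = r->r1;           /* mr: copy sp before it is adjusted */
    r->r1 -= frame;           /* stdu: allocate the frame */
    if (morestack_called)     /* bge not taken */
        r->r12 = r->r29;
}
```

In both versions r12 ends up at the top of the current frame when the stack was large enough, and at the old stack pointer from r29 when __morestack ran.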
* Re: [Patch 0/4] PowerPC64 Linux split stack support 2015-05-18 2:54 [Patch 0/4] PowerPC64 Linux split stack support Alan Modra ` (3 preceding siblings ...) 2015-05-18 3:42 ` [PATCH 4/4] Split-stack arg pointer init refinement Alan Modra @ 2015-06-13 12:05 ` Andreas Schwab 2015-06-15 9:09 ` Alan Modra 4 siblings, 1 reply; 32+ messages in thread From: Andreas Schwab @ 2015-06-13 12:05 UTC (permalink / raw) To: gcc-patches; +Cc: David Edelsohn /usr/bin/mkdir -p .; files=`echo ../../../../libgo/go/errors/errors.go | sed -e 's/[^ ]*\.gox//g'`; /bin/sh ./libtool --tag GO --mode=compile /daten/gcc/gcc-20150613/Build/./gcc/gccgo -B/daten/gcc/gcc-20150613/Build/./gcc/ -B/usr/powerpc64-linux/bin/ -B/usr/powerpc64-linux/lib/ -isystem /usr/powerpc64-linux/include -isystem /usr/powerpc64-linux/sys-include -m32 -O2 -g -I . -c -fgo-pkgpath=`echo errors.lo | sed -e 's/.lo$//' -e 's/-go$//'` -o errors.lo $files libtool: compile: /daten/gcc/gcc-20150613/Build/./gcc/gccgo -B/daten/gcc/gcc-20150613/Build/./gcc/ -B/usr/powerpc64-linux/bin/ -B/usr/powerpc64-linux/lib/ -isystem /usr/powerpc64-linux/include -isystem /usr/powerpc64-linux/sys-include -m32 -O2 -g -I . -c -fgo-pkgpath=errors ../../../../libgo/go/errors/errors.go go1: error: ‘-fsplit-stack’ currently only supported on PowerPC64 GNU/Linux with glibc-2.18 or later go1: error: ‘-fsplit-stack’ is not supported by this compiler configuration make[2]: *** [errors.lo] Error 1 make[2]: Leaving directory `/daten/gcc/gcc-20150613/Build/powerpc64-linux/32/libgo' make[1]: *** [all-recursive] Error 1 Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Patch 0/4] PowerPC64 Linux split stack support 2015-06-13 12:05 ` [Patch 0/4] PowerPC64 Linux split stack support Andreas Schwab @ 2015-06-15 9:09 ` Alan Modra 2015-06-15 9:51 ` Andreas Schwab ` (3 more replies) 0 siblings, 4 replies; 32+ messages in thread From: Alan Modra @ 2015-06-15 9:09 UTC (permalink / raw) To: Andreas Schwab; +Cc: gcc-patches, David Edelsohn On Sat, Jun 13, 2015 at 12:46:18PM +0200, Andreas Schwab wrote: > /usr/bin/mkdir -p .; files=`echo ../../../../libgo/go/errors/errors.go | sed -e 's/[^ ]*\.gox//g'`; /bin/sh ./libtool --tag GO --mode=compile /daten/gcc/gcc-20150613/Build/./gcc/gccgo -B/daten/gcc/gcc-20150613/Build/./gcc/ -B/usr/powerpc64-linux/bin/ -B/usr/powerpc64-linux/lib/ -isystem /usr/powerpc64-linux/include -isystem /usr/powerpc64-linux/sys-include -m32 -O2 -g -I . -c -fgo-pkgpath=`echo errors.lo | sed -e 's/.lo$//' -e 's/-go$//'` -o errors.lo $files > libtool: compile: /daten/gcc/gcc-20150613/Build/./gcc/gccgo -B/daten/gcc/gcc-20150613/Build/./gcc/ -B/usr/powerpc64-linux/bin/ -B/usr/powerpc64-linux/lib/ -isystem /usr/powerpc64-linux/include -isystem /usr/powerpc64-linux/sys-include -m32 -O2 -g -I . -c -fgo-pkgpath=errors ../../../../libgo/go/errors/errors.go > go1: error: ‘-fsplit-stack’ currently only supported on PowerPC64 GNU/Linux with glibc-2.18 or later > go1: error: ‘-fsplit-stack’ is not supported by this compiler configuration > make[2]: *** [errors.lo] Error 1 > make[2]: Leaving directory `/daten/gcc/gcc-20150613/Build/powerpc64-linux/32/libgo' > make[1]: *** [all-recursive] Error 1 This untested patch ought to fix the problem, I think. My BE test environment had gold installed but not a sufficiently recent glibc. The LE test environment of course didn't build any 32-bit multilibs. Oops. * configure.ac (libgo_cv_c_split_stack_supported): Unset for powerpc. * configure: Regenerate.
diff --git a/libgo/configure.ac b/libgo/configure.ac index 7c403a5..2ddcdfd 100644 --- a/libgo/configure.ac +++ b/libgo/configure.ac @@ -366,6 +366,13 @@ esac AC_SUBST(OSCFLAGS) dnl Use -fsplit-stack when compiling C code if available. +case "$target" in + powerpc*-*-*) + # Don't use cached value. Support is available only for 64-bit, + # so the result from a 64-bit multilib is not valid for 32-bit. + unset libgo_cv_c_split_stack_supported + ;; +esac AC_CACHE_CHECK([whether -fsplit-stack is supported], [libgo_cv_c_split_stack_supported], [CFLAGS_hold=$CFLAGS -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Patch 0/4] PowerPC64 Linux split stack support 2015-06-15 9:09 ` Alan Modra @ 2015-06-15 9:51 ` Andreas Schwab 2015-06-15 17:31 ` Andreas Schwab ` (2 subsequent siblings) 3 siblings, 0 replies; 32+ messages in thread From: Andreas Schwab @ 2015-06-15 9:51 UTC (permalink / raw) To: gcc-patches; +Cc: David Edelsohn Alan Modra <amodra@gmail.com> writes: > diff --git a/libgo/configure.ac b/libgo/configure.ac > index 7c403a5..2ddcdfd 100644 > --- a/libgo/configure.ac > +++ b/libgo/configure.ac > @@ -366,6 +366,13 @@ esac > AC_SUBST(OSCFLAGS) > > dnl Use -fsplit-stack when compiling C code if available. > +case "$target" in > + powerpc*-*-*) > + # Don't use cached value. Support is available only for 64-bit, > + # so the result from a 64-bit multilib is not valid for 32-bit. > + unset libgo_cv_c_split_stack_supported Where does this cached value come from? There shouldn't be any sharing between multilib builds. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Patch 0/4] PowerPC64 Linux split stack support 2015-06-15 9:09 ` Alan Modra 2015-06-15 9:51 ` Andreas Schwab @ 2015-06-15 17:31 ` Andreas Schwab 2015-06-15 17:56 ` Andreas Schwab 2015-06-15 19:14 ` Andreas Schwab 3 siblings, 0 replies; 32+ messages in thread From: Andreas Schwab @ 2015-06-15 17:31 UTC (permalink / raw) To: gcc-patches; +Cc: David Edelsohn Alan Modra <amodra@gmail.com> writes: > This untested patch ought to fix the problem, I think. There is no -fsplit-stack in the Makefile, and the configure script has already determined the correct settings. $ grep -e -fsplit-stack libgo/Makefile 32/libgo/Makefile libgo/Makefile:SPLIT_STACK = -fsplit-stack $ grep split_stack libgo/config.log 32/libgo/config.log libgo/config.log:libgo_cv_c_linker_supports_split_stack=no libgo/config.log:libgo_cv_c_split_stack_supported=yes 32/libgo/config.log:libgo_cv_c_linker_supports_split_stack=no 32/libgo/config.log:libgo_cv_c_split_stack_supported=no Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Patch 0/4] PowerPC64 Linux split stack support 2015-06-15 9:09 ` Alan Modra 2015-06-15 9:51 ` Andreas Schwab 2015-06-15 17:31 ` Andreas Schwab @ 2015-06-15 17:56 ` Andreas Schwab 2015-06-15 19:14 ` Andreas Schwab 3 siblings, 0 replies; 32+ messages in thread From: Andreas Schwab @ 2015-06-15 17:56 UTC (permalink / raw) To: gcc-patches; +Cc: David Edelsohn The bug is of course that like DEFAULT_ABI, rs6000_isa_flags hasn't been determined yet. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Patch 0/4] PowerPC64 Linux split stack support 2015-06-15 9:09 ` Alan Modra ` (2 preceding siblings ...) 2015-06-15 17:56 ` Andreas Schwab @ 2015-06-15 19:14 ` Andreas Schwab 2015-06-19 13:04 ` Andreas Schwab 2015-07-01 14:09 ` Lynn A. Boger 3 siblings, 2 replies; 32+ messages in thread From: Andreas Schwab @ 2015-06-15 19:14 UTC (permalink / raw) To: gcc-patches; +Cc: David Edelsohn * go-lang.c (go_langhook_init_options_struct): Don't set x_flag_split_stack. (go_langhook_post_options): Set it here instead. --- gcc/go/go-lang.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/gcc/go/go-lang.c b/gcc/go/go-lang.c index ce4dd9b..d952e0f 100644 --- a/gcc/go/go-lang.c +++ b/gcc/go/go-lang.c @@ -158,10 +158,6 @@ go_langhook_init_options_struct (struct gcc_options *opts) opts->x_flag_errno_math = 0; opts->frontend_set_flag_errno_math = true; - /* We turn on stack splitting if we can. */ - if (targetm_common.supports_split_stack (false, opts)) - opts->x_flag_split_stack = 1; - /* Exceptions are used to handle recovering from panics. */ opts->x_flag_exceptions = 1; opts->x_flag_non_call_exceptions = 1; @@ -295,6 +291,11 @@ go_langhook_post_options (const char **pfilename ATTRIBUTE_UNUSED) && global_options.x_write_symbols == NO_DEBUG) global_options.x_write_symbols = PREFERRED_DEBUGGING_TYPE; + /* We turn on stack splitting if we can. */ + if (!global_options_set.x_flag_split_stack + && targetm_common.supports_split_stack (false, &global_options)) + global_options.x_flag_split_stack = 1; + /* Returning false means that the backend should be used. */ return false; } -- 2.4.3 -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
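The effect of moving the default from the init hook to the post-options hook can be sketched with a toy model in C. Everything here is hypothetical and only illustrates the ordering problem: toy_options, toy_options_set and the three functions are invented for this sketch, and the target's support test is reduced to "is the selected ABI 64-bit", on the assumption that this is the property not yet known when the init hook runs.

```c
#include <stdbool.h>

/* Hypothetical miniature of the option state.  */
struct toy_options {
    int flag_split_stack;
    bool is_64bit;           /* changed by a later -m32 */
};

/* Hypothetical miniature of "which options did the user set explicitly".  */
struct toy_options_set {
    bool flag_split_stack;
};

/* Hypothetical stand-in for the target's support test.  */
bool toy_supports_split_stack(const struct toy_options *opts)
{
    return opts->is_64bit;
}

/* Old placement: runs before the command line is processed, so it only
   sees the configured default (64-bit here).  */
void toy_init_options(struct toy_options *opts)
{
    if (toy_supports_split_stack(opts))
        opts->flag_split_stack = 1;
}

/* New placement: runs after the command line is processed, and leaves an
   explicit -fsplit-stack or -fno-split-stack alone.  */
void toy_post_options(struct toy_options *opts,
                      const struct toy_options_set *set)
{
    if (!set->flag_split_stack && toy_supports_split_stack(opts))
        opts->flag_split_stack = 1;
}
```

In this model the old placement leaves the flag set when -m32 arrives afterwards, which corresponds to the go1 error in the 32-bit multilib build; the new placement sees the final target state and also respects an explicit user choice.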
* Re: [Patch 0/4] PowerPC64 Linux split stack support 2015-06-15 19:14 ` Andreas Schwab @ 2015-06-19 13:04 ` Andreas Schwab 2015-07-30 19:23 ` Lynn A. Boger 2015-07-01 14:09 ` Lynn A. Boger 1 sibling, 1 reply; 32+ messages in thread From: Andreas Schwab @ 2015-06-19 13:04 UTC (permalink / raw) To: gcc-patches; +Cc: Ian Lance Taylor, David Edelsohn > * go-lang.c (go_langhook_init_options_struct): Don't set > x_flag_split_stack. > (go_langhook_post_options): Set it here instead. > --- > gcc/go/go-lang.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/gcc/go/go-lang.c b/gcc/go/go-lang.c > index ce4dd9b..d952e0f 100644 > --- a/gcc/go/go-lang.c > +++ b/gcc/go/go-lang.c > @@ -158,10 +158,6 @@ go_langhook_init_options_struct (struct gcc_options *opts) > opts->x_flag_errno_math = 0; > opts->frontend_set_flag_errno_math = true; > > - /* We turn on stack splitting if we can. */ > - if (targetm_common.supports_split_stack (false, opts)) > - opts->x_flag_split_stack = 1; > - > /* Exceptions are used to handle recovering from panics. */ > opts->x_flag_exceptions = 1; > opts->x_flag_non_call_exceptions = 1; > @@ -295,6 +291,11 @@ go_langhook_post_options (const char **pfilename ATTRIBUTE_UNUSED) > && global_options.x_write_symbols == NO_DEBUG) > global_options.x_write_symbols = PREFERRED_DEBUGGING_TYPE; > > + /* We turn on stack splitting if we can. */ > + if (!global_options_set.x_flag_split_stack > + && targetm_common.supports_split_stack (false, &global_options)) > + global_options.x_flag_split_stack = 1; > + > /* Returning false means that the backend should be used. */ > return false; > } This fixes the bootstrap error, and probably makes it possible to use DEFAULT_ABI in rs6000_supports_split_stack. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Patch 0/4] PowerPC64 Linux split stack support 2015-06-19 13:04 ` Andreas Schwab @ 2015-07-30 19:23 ` Lynn A. Boger 2015-07-31 0:32 ` Ian Lance Taylor 0 siblings, 1 reply; 32+ messages in thread From: Lynn A. Boger @ 2015-07-30 19:23 UTC (permalink / raw) To: Andreas Schwab, gcc-patches; +Cc: Ian Lance Taylor, David Edelsohn Can this patch be submitted to fix the ppc 32 bootstrap error? On 06/19/2015 07:58 AM, Andreas Schwab wrote: >> * go-lang.c (go_langhook_init_options_struct): Don't set >> x_flag_split_stack. >> (go_langhook_post_options): Set it here instead. >> --- >> gcc/go/go-lang.c | 9 +++++---- >> 1 file changed, 5 insertions(+), 4 deletions(-) >> >> diff --git a/gcc/go/go-lang.c b/gcc/go/go-lang.c >> index ce4dd9b..d952e0f 100644 >> --- a/gcc/go/go-lang.c >> +++ b/gcc/go/go-lang.c >> @@ -158,10 +158,6 @@ go_langhook_init_options_struct (struct gcc_options *opts) >> opts->x_flag_errno_math = 0; >> opts->frontend_set_flag_errno_math = true; >> >> - /* We turn on stack splitting if we can. */ >> - if (targetm_common.supports_split_stack (false, opts)) >> - opts->x_flag_split_stack = 1; >> - >> /* Exceptions are used to handle recovering from panics. */ >> opts->x_flag_exceptions = 1; >> opts->x_flag_non_call_exceptions = 1; >> @@ -295,6 +291,11 @@ go_langhook_post_options (const char **pfilename ATTRIBUTE_UNUSED) >> && global_options.x_write_symbols == NO_DEBUG) >> global_options.x_write_symbols = PREFERRED_DEBUGGING_TYPE; >> >> + /* We turn on stack splitting if we can. */ >> + if (!global_options_set.x_flag_split_stack >> + && targetm_common.supports_split_stack (false, &global_options)) >> + global_options.x_flag_split_stack = 1; >> + >> /* Returning false means that the backend should be used. */ >> return false; >> } > This fixes the bootstrap error, and probably makes it possible to use > DEFAULT_ABI in rs6000_supports_split_stack. > > Andreas. > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Patch 0/4] PowerPC64 Linux split stack support
  2015-07-30 19:23 ` Lynn A. Boger
@ 2015-07-31 0:32   ` Ian Lance Taylor
  0 siblings, 0 replies; 32+ messages in thread
From: Ian Lance Taylor @ 2015-07-31 0:32 UTC (permalink / raw)
  To: Lynn A. Boger
  Cc: Andreas Schwab, gcc-patches, Ian Lance Taylor, David Edelsohn

On Thu, Jul 30, 2015 at 10:46 AM, Lynn A. Boger
<laboger@linux.vnet.ibm.com> wrote:
> Can this patch be submitted to fix the ppc 32 bootstrap error?
>
> On 06/19/2015 07:58 AM, Andreas Schwab wrote:
>>>
>>> 	* go-lang.c (go_langhook_init_options_struct): Don't set
>>> 	x_flag_split_stack.
>>> 	(go_langhook_post_options): Set it here instead.

This is OK.

Thanks.

Ian

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Patch 0/4] PowerPC64 Linux split stack support
  2015-06-15 19:14 ` Andreas Schwab
  2015-06-19 13:04   ` Andreas Schwab
@ 2015-07-01 14:09   ` Lynn A. Boger
  2015-07-01 14:15     ` Andreas Schwab
  1 sibling, 1 reply; 32+ messages in thread
From: Lynn A. Boger @ 2015-07-01 14:09 UTC (permalink / raw)
  To: gcc-patches

If further testing is needed on this patch I can do it, but I need more
information about which variations need to be tested.

It's not clear to me what distro/gcc/glibc versions and type of build
causes the error. I have not been able to reproduce the
original problem.

On 06/15/2015 01:58 PM, Andreas Schwab wrote:
> 	* go-lang.c (go_langhook_init_options_struct): Don't set
> 	x_flag_split_stack.
> 	(go_langhook_post_options): Set it here instead.
> ---
>  gcc/go/go-lang.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/go/go-lang.c b/gcc/go/go-lang.c
> index ce4dd9b..d952e0f 100644
> --- a/gcc/go/go-lang.c
> +++ b/gcc/go/go-lang.c
> @@ -158,10 +158,6 @@ go_langhook_init_options_struct (struct gcc_options *opts)
>    opts->x_flag_errno_math = 0;
>    opts->frontend_set_flag_errno_math = true;
>
> -  /* We turn on stack splitting if we can. */
> -  if (targetm_common.supports_split_stack (false, opts))
> -    opts->x_flag_split_stack = 1;
> -
>    /* Exceptions are used to handle recovering from panics. */
>    opts->x_flag_exceptions = 1;
>    opts->x_flag_non_call_exceptions = 1;
> @@ -295,6 +291,11 @@ go_langhook_post_options (const char **pfilename ATTRIBUTE_UNUSED)
>        && global_options.x_write_symbols == NO_DEBUG)
>      global_options.x_write_symbols = PREFERRED_DEBUGGING_TYPE;
>
> +  /* We turn on stack splitting if we can. */
> +  if (!global_options_set.x_flag_split_stack
> +      && targetm_common.supports_split_stack (false, &global_options))
> +    global_options.x_flag_split_stack = 1;
> +
>    /* Returning false means that the backend should be used. */
>    return false;
>  }

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Patch 0/4] PowerPC64 Linux split stack support
  2015-07-01 14:09 ` Lynn A. Boger
@ 2015-07-01 14:15   ` Andreas Schwab
  2015-07-17 12:37     ` Lynn A. Boger
  0 siblings, 1 reply; 32+ messages in thread
From: Andreas Schwab @ 2015-07-01 14:15 UTC (permalink / raw)
  To: Lynn A. Boger; +Cc: gcc-patches

"Lynn A. Boger" <laboger@linux.vnet.ibm.com> writes:

> It's not clear to me what distro/gcc/glibc versions and type of build
> causes the error. I have not been able to reproduce the
> original problem.

The failure mode is quite obvious: go_langhook_init_options_struct is
called before the options are parsed, so -m32 hasn't been acted upon and
supports_split_stack falsely returns true.

Andreas.

--
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply [flat|nested] 32+ messages in thread
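
[Editorial illustration] The ordering problem described in this message can be sketched as a toy driver. Everything here is hypothetical (the `struct opts` fields, `parse_options`, `compile_with`), and the support rule (64-bit only) is an assumption made for the example. The only point is that a hook which runs before the command line is parsed cannot see -m32, whereas one that runs afterwards can.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option state; is_32bit stays false until "-m32" has been
   parsed.  */
struct opts
{
  bool is_32bit;
  int flag_split_stack;
};

/* Assumed target rule for this sketch: only 64-bit supports split
   stacks.  */
static bool
supports_split_stack (const struct opts *o)
{
  return !o->is_32bit;
}

/* Old placement: runs before any argument has been looked at.  */
static void
init_options_struct (struct opts *o)
{
  if (supports_split_stack (o))
    o->flag_split_stack = 1;
}

/* Minimal option parser: only "-m32" is recognised.  */
static void
parse_options (struct opts *o, int argc, const char *const *argv)
{
  assert (argc == 0 || argv != NULL);
  for (int i = 0; i < argc; i++)
    if (strcmp (argv[i], "-m32") == 0)
      o->is_32bit = true;
}

/* New placement: runs after parsing, so it sees the effect of -m32.  */
static void
post_options (struct opts *o)
{
  if (supports_split_stack (o))
    o->flag_split_stack = 1;
}

/* Run the toy pipeline with the default applied either in the early
   hook or in the late hook; return the resulting flag.  */
static int
compile_with (bool default_in_init_hook, int argc, const char *const *argv)
{
  struct opts o = { false, 0 };
  if (default_in_init_hook)
    init_options_struct (&o);
  parse_options (&o, argc, argv);
  if (!default_in_init_hook)
    post_options (&o);
  return o.flag_split_stack;
}
```

Calling `compile_with` with "-m32" and the early hook leaves the flag set even though the assumed 32-bit target does not support it; with the late hook the flag stays clear.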
* Re: [Patch 0/4] PowerPC64 Linux split stack support
  2015-07-01 14:15 ` Andreas Schwab
@ 2015-07-17 12:37   ` Lynn A. Boger
  0 siblings, 0 replies; 32+ messages in thread
From: Lynn A. Boger @ 2015-07-17 12:37 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: gcc-patches

I have tested this and it fixes the problem.

On 07/01/2015 09:15 AM, Andreas Schwab wrote:
> "Lynn A. Boger" <laboger@linux.vnet.ibm.com> writes:
>
>> It's not clear to me what distro/gcc/glibc versions and type of build
>> causes the error. I have not been able to reproduce the
>> original problem.
> The failure mode is quite obvious: go_langhook_init_options_struct is
> called before the options are parsed, so -m32 hasn't been acted upon and
> supports_split_stack falsely returns true.
>
> Andreas.
>

^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2015-07-30 22:41 UTC | newest]

Thread overview: 32+ messages
2015-05-18 2:54 [Patch 0/4] PowerPC64 Linux split stack support Alan Modra
2015-05-18 2:54 ` [PATCH 1/4] rs6000_stack_info changes for -fsplit-stack Alan Modra
2015-05-18 18:08 ` David Edelsohn
2015-05-20 1:45 ` Alan Modra
2015-05-20 13:07 ` David Edelsohn
2015-05-20 13:29 ` Alan Modra
2015-05-18 2:55 ` [PATCH 2/4] prologue and epilogue tidy and -mno-vrsave bug fix Alan Modra
2015-05-19 14:35 ` David Edelsohn
2015-05-18 2:55 ` [PATCH 3/4] split-stack for powerpc64 Alan Modra
2015-05-18 7:05 ` Alan Modra
2015-05-19 12:48 ` Lynn A. Boger
2015-05-20 1:03 ` Alan Modra
2015-05-20 12:14 ` Lynn A. Boger
2015-05-20 12:59 ` David Edelsohn
2015-05-20 16:01 ` Lynn A. Boger
2015-05-19 14:37 ` David Edelsohn
2015-05-21 2:10 ` Alan Modra
2015-05-29 16:09 ` Alan Modra
2015-05-29 17:25 ` David Edelsohn
2015-05-18 3:42 ` [PATCH 4/4] Split-stack arg pointer init refinement Alan Modra
2015-06-13 12:05 ` [Patch 0/4] PowerPC64 Linux split stack support Andreas Schwab
2015-06-15 9:09 ` Alan Modra
2015-06-15 9:51 ` Andreas Schwab
2015-06-15 17:31 ` Andreas Schwab
2015-06-15 17:56 ` Andreas Schwab
2015-06-15 19:14 ` Andreas Schwab
2015-06-19 13:04 ` Andreas Schwab
2015-07-30 19:23 ` Lynn A. Boger
2015-07-31 0:32 ` Ian Lance Taylor
2015-07-01 14:09 ` Lynn A. Boger
2015-07-01 14:15 ` Andreas Schwab
2015-07-17 12:37 ` Lynn A. Boger