Message-Id: <200210310323.g9V3NpGD009509@hiauly1.hia.nrc.ca>
Subject: Re: Call rewrite for PA
To: dave@hiauly1.hia.nrc.ca (John David Anglin)
Date: Wed, 30 Oct 2002 19:23:00 -0000
From: "John David Anglin"
Cc: gcc-patches@gcc.gnu.org, law@redhat.com
In-Reply-To: from "John David Anglin" at Oct 23, 2002 04:24:08 pm

The following patch has been applied to the main. It contains everything
but the revised arg pointer handling proposed for hppa64. This patch fixes
a number of bugs in the handling of long calls as previously discussed and
moves the length computation for millicode and regular calls from pa.md to
pa.c.

This version has been tested on hppa64-hp-hpux11.11 and
hppa-unknown-linux-gnu. Applied

Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)

2002-10-30 John David Anglin

	* pa-linux.h (ASM_OUTPUT_EXTERNAL_LIBCALL): Define.
	* pa-protos.h (attr_length_millicode_call, attr_length_call,
	pa_init_machine_status): Declare new global functions.
	* pa.c (copy_fp_args, length_fp_args, get_plabel): Declare and
	implement new functions.
	(attr_length_millicode_call, attr_length_call): Implement.
	(total_code_bytes): Change type to long.
	(pa_output_function_prologue): Compute total_code_bytes on
	TARGET_64BIT. Reset counter if flag_function_sections.
	(output_deferred_plabels): Set output alignment to 3 for
	TARGET_64BIT.
	(output_cbranch): Move call to gen_label_rtx.
	(output_millicode_call): Rewrite adding long TARGET_64BIT call,
	expose delay slot in all variants, shorten pc-relative calls.
	(output_call): Rewrite adding long TARGET_64BIT call, improved
	delay slot usage and exposure, various new call variants, and
	shortened sequences for some variants on TARGET_PA_20.
	Miscellaneous format changes.
	* pa.h (total_code_bytes): Change type to long.
	(MASK_LONG_CALLS, TARGET_LONG_CALLS, TARGET_LONG_ABS_CALL,
	TARGET_LONG_PIC_SDIFF_CALL, TARGET_LONG_PIC_PCREL_CALL): Define.
	(TARGET_SWITCHES): Add "-mlong-calls" and "-mno-long-calls" options.
	(EXTRA_CONSTRAINT, GO_IF_LEGITIMATE_ADDRESS,
	LEGITIMIZE_RELOAD_ADDRESS): Don't use long floating point loads and
	stores on TARGET_ELF32.
	* pa.md (define_delay): Allow insns in delay on
	TARGET_PORTABLE_RUNTIME.
	(unnamed patterns for mulsi3, divsi3, udivsi3, modsi3, umodsi3 and
	canonicalize_funcptr_for_compare expanders): Calculate attribute
	length using attr_length_millicode_call().
	(call_internal_symref, call_value_internal_symref): Clobber
	register 1. Calculate attribute length using attr_length_call().
	(call_internal_reg_64bit, call_value_internal_reg_64bit): Move gp
	load to delay slot.
	(sibcall, sibcall_value): Rewrite.
	(sibcall_internal_symref, sibcall_value_internal_symref): Clobber
	register 1. Use attr_length_call().
	(sibcall_internal_symref_64bit,
	sibcall_value_internal_symref_64bit): New patterns.
	(unnamed pattern for canonicalize_funcptr_for_compare): Rewrite.
	* som.h (MEMBER_TYPE_FORCES_BLK): Define.
	* t-pa64 (TARGET_LIBGCC2_CFLAGS): Add "-mlong-calls".
* doc/invoke.texi (mlong-calls): Document. Index: config/pa/pa-linux.h =================================================================== RCS file: /cvsroot/gcc/gcc/gcc/config/pa/pa-linux.h,v retrieving revision 1.26 diff -u -3 -p -r1.26 pa-linux.h --- config/pa/pa-linux.h 3 Oct 2002 04:05:54 -0000 1.26 +++ config/pa/pa-linux.h 30 Oct 2002 17:06:37 -0000 @@ -196,6 +196,19 @@ Boston, MA 02111-1307, USA. */ } \ while (0) +/* As well as globalizing the label, we need to encode the label + to ensure a plabel is generated in an indirect call. */ + +#undef ASM_OUTPUT_EXTERNAL_LIBCALL +#define ASM_OUTPUT_EXTERNAL_LIBCALL(FILE, FUN) \ + do \ + { \ + if (!FUNCTION_NAME_P (XSTR (FUN, 0))) \ + hppa_encode_label (FUN); \ + (*targetm.asm_out.globalize_label) (FILE, XSTR (FUN, 0)); \ + } \ + while (0) + /* Linux always uses gas. */ #undef TARGET_GAS #define TARGET_GAS 1 Index: config/pa/pa-protos.h =================================================================== RCS file: /cvsroot/gcc/gcc/gcc/config/pa/pa-protos.h,v retrieving revision 1.18 diff -u -3 -p -r1.18 pa-protos.h --- config/pa/pa-protos.h 20 Oct 2002 22:37:12 -0000 1.18 +++ config/pa/pa-protos.h 30 Oct 2002 17:06:37 -0000 @@ -105,6 +105,8 @@ extern int jump_in_call_delay PARAMS ((r extern enum reg_class secondary_reload_class PARAMS ((enum reg_class, enum machine_mode, rtx)); extern int hppa_fpstore_bypass_p PARAMS ((rtx, rtx)); +extern int attr_length_millicode_call PARAMS ((rtx, int)); +extern int attr_length_call PARAMS ((rtx, int)); /* Declare functions defined in pa.c and used in templates. 
*/ Index: config/pa/pa.c =================================================================== RCS file: /cvsroot/gcc/gcc/gcc/config/pa/pa.c,v retrieving revision 1.184 diff -u -3 -p -r1.184 pa.c --- config/pa/pa.c 22 Oct 2002 23:05:19 -0000 1.184 +++ config/pa/pa.c 30 Oct 2002 17:06:42 -0000 @@ -121,11 +121,13 @@ static void pa_globalize_label PARAMS (( ATTRIBUTE_UNUSED; static void pa_asm_output_mi_thunk PARAMS ((FILE *, tree, HOST_WIDE_INT, HOST_WIDE_INT, tree)); - +static void copy_fp_args PARAMS ((rtx)) ATTRIBUTE_UNUSED; +static int length_fp_args PARAMS ((rtx)) ATTRIBUTE_UNUSED; +static struct deferred_plabel *get_plabel PARAMS ((const char *)) + ATTRIBUTE_UNUSED; /* Save the operands last given to a compare for use when we generate a scc or bcc insn. */ - rtx hppa_compare_op0, hppa_compare_op1; enum cmp_type hppa_branch_type; @@ -149,12 +151,10 @@ static rtx find_addr_reg PARAMS ((rtx)); /* Keep track of the number of bytes we have output in the CODE subspaces during this compilation so we'll know when to emit inline long-calls. */ - -unsigned int total_code_bytes; +unsigned long total_code_bytes; /* Variables to handle plabels that we discover are necessary at assembly output time. They are output after the current function. */ - struct deferred_plabel GTY(()) { rtx internal_label; @@ -3197,14 +3197,14 @@ pa_output_function_prologue (file, size) fputs ("\n\t.ENTRY\n", file); /* If we're using GAS and SOM, and not using the portable runtime model, - then we don't need to accumulate the total number of code bytes. */ + or function sections, then we don't need to accumulate the total number + of code bytes. */ if ((TARGET_GAS && TARGET_SOM && ! TARGET_PORTABLE_RUNTIME) - /* FIXME: we can't handle long calls for TARGET_64BIT. 
*/ - || TARGET_64BIT) + || flag_function_sections) total_code_bytes = 0; else if (INSN_ADDRESSES_SET_P ()) { - unsigned int old_total = total_code_bytes; + unsigned long old_total = total_code_bytes; total_code_bytes += INSN_ADDRESSES (INSN_UID (get_last_nonnote_insn ())); total_code_bytes += FUNCTION_BOUNDARY / BITS_PER_UNIT; @@ -4726,6 +4726,47 @@ output_global_address (file, x, round_co output_addr_const (file, x); } +static struct deferred_plabel * +get_plabel (fname) + const char *fname; +{ + size_t i; + + /* See if we have already put this function on the list of deferred + plabels. This list is generally small, so a linear search is not + too ugly. If it proves too slow replace it with something faster. */ + for (i = 0; i < n_deferred_plabels; i++) + if (strcmp (fname, deferred_plabels[i].name) == 0) + break; + + /* If the deferred plabel list is empty, or this entry was not found + on the list, create a new entry on the list. */ + if (deferred_plabels == NULL || i == n_deferred_plabels) + { + const char *real_name; + + if (deferred_plabels == 0) + deferred_plabels = (struct deferred_plabel *) + ggc_alloc (sizeof (struct deferred_plabel)); + else + deferred_plabels = (struct deferred_plabel *) + ggc_realloc (deferred_plabels, + ((n_deferred_plabels + 1) + * sizeof (struct deferred_plabel))); + + i = n_deferred_plabels++; + deferred_plabels[i].internal_label = gen_label_rtx (); + deferred_plabels[i].name = ggc_strdup (fname); + + /* Gross. We have just implicitly taken the address of this function, + mark it as such. */ + real_name = (*targetm.strip_name_encoding) (fname); + TREE_SYMBOL_REFERENCED (get_identifier (real_name)) = 1; + } + + return &deferred_plabels[i]; +} + void output_deferred_plabels (file) FILE *file; @@ -4737,7 +4778,7 @@ output_deferred_plabels (file) if (n_deferred_plabels) { data_section (); - ASM_OUTPUT_ALIGN (file, 2); + ASM_OUTPUT_ALIGN (file, TARGET_64BIT ? 3 : 2); } /* Now output the deferred plabels. 
*/ @@ -5323,9 +5364,9 @@ hppa_va_arg (valist, type) const char * output_cbranch (operands, nullify, length, negated, insn) - rtx *operands; - int nullify, length, negated; - rtx insn; + rtx *operands; + int nullify, length, negated; + rtx insn; { static char buf[100]; int useskip = 0; @@ -5499,12 +5540,11 @@ output_cbranch (operands, nullify, lengt xoperands[1] = operands[1]; xoperands[2] = operands[2]; xoperands[3] = operands[3]; - if (TARGET_SOM || ! TARGET_GAS) - xoperands[4] = gen_label_rtx (); output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); - if (TARGET_SOM || ! TARGET_GAS) + if (TARGET_SOM || !TARGET_GAS) { + xoperands[4] = gen_label_rtx (); output_asm_insn ("addil L'%l0-%l4,%%r1", xoperands); ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", CODE_LABEL_NUMBER (xoperands[4])); @@ -5536,10 +5576,10 @@ output_cbranch (operands, nullify, lengt const char * output_bb (operands, nullify, length, negated, insn, which) - rtx *operands ATTRIBUTE_UNUSED; - int nullify, length, negated; - rtx insn; - int which; + rtx *operands ATTRIBUTE_UNUSED; + int nullify, length, negated; + rtx insn; + int which; { static char buf[100]; int useskip = 0; @@ -5684,10 +5724,10 @@ output_bb (operands, nullify, length, ne const char * output_bvb (operands, nullify, length, negated, insn, which) - rtx *operands ATTRIBUTE_UNUSED; - int nullify, length, negated; - rtx insn; - int which; + rtx *operands ATTRIBUTE_UNUSED; + int nullify, length, negated; + rtx insn; + int which; { static char buf[100]; int useskip = 0; @@ -6043,442 +6083,594 @@ output_movb (operands, insn, which_alter } } +/* Copy any FP arguments in INSN into integer registers. */ +static void +copy_fp_args (insn) + rtx insn; +{ + rtx link; + rtx xoperands[2]; -/* INSN is a millicode call. It may have an unconditional jump in its delay - slot. + for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1)) + { + int arg_mode, regno; + rtx use = XEXP (link, 0); - CALL_DEST is the routine we are calling. */ + if (! 
(GET_CODE (use) == USE + && GET_CODE (XEXP (use, 0)) == REG + && FUNCTION_ARG_REGNO_P (REGNO (XEXP (use, 0))))) + continue; -const char * -output_millicode_call (insn, call_dest) - rtx insn; - rtx call_dest; -{ - int attr_length = get_attr_length (insn); - int seq_length = dbr_sequence_length (); - int distance; - rtx xoperands[4]; - rtx seq_insn; + arg_mode = GET_MODE (XEXP (use, 0)); + regno = REGNO (XEXP (use, 0)); - xoperands[3] = gen_rtx_REG (Pmode, TARGET_64BIT ? 2 : 31); + /* Is it a floating point register? */ + if (regno >= 32 && regno <= 39) + { + /* Copy the FP register into an integer register via memory. */ + if (arg_mode == SFmode) + { + xoperands[0] = XEXP (use, 0); + xoperands[1] = gen_rtx_REG (SImode, 26 - (regno - 32) / 2); + output_asm_insn ("{fstws|fstw} %0,-16(%%sr0,%%r30)", xoperands); + output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands); + } + else + { + xoperands[0] = XEXP (use, 0); + xoperands[1] = gen_rtx_REG (DImode, 25 - (regno - 34) / 2); + output_asm_insn ("{fstds|fstd} %0,-16(%%sr0,%%r30)", xoperands); + output_asm_insn ("ldw -12(%%sr0,%%r30),%R1", xoperands); + output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands); + } + } + } +} + +/* Compute length of the FP argument copy sequence for INSN. */ +static int +length_fp_args (insn) + rtx insn; +{ + int length = 0; + rtx link; - /* Handle common case -- empty delay slot or no jump in the delay slot, - and we're sure that the branch will reach the beginning of the $CODE$ - subspace. The within reach form of the $$sh_func_adrs call has - a length of 28 and attribute type of multi. This length is the - same as the maximum length of an out of reach PIC call to $$div. 
*/ - if ((seq_length == 0 - && (attr_length == 8 - || (attr_length == 28 && get_attr_type (insn) == TYPE_MULTI))) - || (seq_length != 0 - && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN - && attr_length == 4)) + for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1)) { - xoperands[0] = call_dest; - output_asm_insn ("{bl|b,l} %0,%3%#", xoperands); - return ""; + int arg_mode, regno; + rtx use = XEXP (link, 0); + + if (! (GET_CODE (use) == USE + && GET_CODE (XEXP (use, 0)) == REG + && FUNCTION_ARG_REGNO_P (REGNO (XEXP (use, 0))))) + continue; + + arg_mode = GET_MODE (XEXP (use, 0)); + regno = REGNO (XEXP (use, 0)); + + /* Is it a floating point register? */ + if (regno >= 32 && regno <= 39) + { + if (arg_mode == SFmode) + length += 8; + else + length += 12; + } } - /* This call may not reach the beginning of the $CODE$ subspace. */ - if (attr_length > 8) + return length; +} + +/* We include the delay slot in the returned length as it is better to + over estimate the length than to under estimate it. */ + +int +attr_length_millicode_call (insn, length) + rtx insn; + int length; +{ + unsigned long distance = total_code_bytes + INSN_ADDRESSES (INSN_UID (insn)); + + if (distance < total_code_bytes) + distance = -1; + + if (TARGET_64BIT) { - int delay_insn_deleted = 0; + if (!TARGET_LONG_CALLS && distance < 7600000) + return length + 8; - /* We need to emit an inline long-call branch. */ - if (seq_length != 0 - && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN) - { - /* A non-jump insn in the delay slot. By definition we can - emit this insn before the call. */ - final_scan_insn (NEXT_INSN (insn), asm_out_file, optimize, 0, 0); + return length + 20; + } + else if (TARGET_PORTABLE_RUNTIME) + return length + 24; + else + { + if (!TARGET_LONG_CALLS && distance < 240000) + return length + 8; - /* Now delete the delay insn. 
*/ - PUT_CODE (NEXT_INSN (insn), NOTE); - NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; - NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; - delay_insn_deleted = 1; - } + if (TARGET_LONG_ABS_CALL && !flag_pic) + return length + 12; - /* PIC long millicode call sequence. */ - if (flag_pic) - { - xoperands[0] = call_dest; - if (TARGET_SOM || ! TARGET_GAS) - xoperands[1] = gen_label_rtx (); + return length + 24; + } +} - /* Get our address + 8 into %r1. */ - output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); +/* INSN is a function call. It may have an unconditional jump + in its delay slot. - if (TARGET_SOM || ! TARGET_GAS) - { - /* Add %r1 to the offset of our target from the next insn. */ - output_asm_insn ("addil L%%%0-%1,%%r1", xoperands); - ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", - CODE_LABEL_NUMBER (xoperands[1])); - output_asm_insn ("ldo R%%%0-%1(%%r1),%%r1", xoperands); - } - else - { - output_asm_insn ("addil L%%%0-$PIC_pcrel$0+4,%%r1", xoperands); - output_asm_insn ("ldo R%%%0-$PIC_pcrel$0+8(%%r1),%%r1", - xoperands); - } + CALL_DEST is the routine we are calling. */ - /* Get the return address into %r31. */ - output_asm_insn ("blr 0,%3", xoperands); +const char * +output_millicode_call (insn, call_dest) + rtx insn; + rtx call_dest; +{ + int attr_length = get_attr_length (insn); + int seq_length = dbr_sequence_length (); + int distance; + rtx seq_insn; + rtx xoperands[3]; - /* Branch to our target which is in %r1. */ - output_asm_insn ("bv,n %%r0(%%r1)", xoperands); + xoperands[0] = call_dest; + xoperands[2] = gen_rtx_REG (Pmode, TARGET_64BIT ? 2 : 31); - /* Empty delay slot. Note this insn gets fetched twice and - executed once. To be safe we use a nop. */ - output_asm_insn ("nop", xoperands); + /* Handle the common case where we are sure that the branch will + reach the beginning of the $CODE$ subspace. The within reach + form of the $$sh_func_adrs call has a length of 28. 
Because + it has an attribute type of multi, it never has a non-zero + sequence length. The length of the $$sh_func_adrs is the same + as certain out of reach PIC calls to other routines. */ + if (!TARGET_LONG_CALLS + && ((seq_length == 0 + && (attr_length == 12 + || (attr_length == 28 && get_attr_type (insn) == TYPE_MULTI))) + || (seq_length != 0 && attr_length == 8))) + { + output_asm_insn ("{bl|b,l} %0,%2", xoperands); + } + else + { + if (TARGET_64BIT) + { + /* It might seem that one insn could be saved by accessing + the millicode function using the linkage table. However, + this doesn't work in shared libraries and other dynamically + loaded objects. Using a pc-relative sequence also avoids + problems related to the implicit use of the gp register. */ + output_asm_insn ("b,l .+8,%%r1", xoperands); + output_asm_insn ("addil L'%0-$PIC_pcrel$0+4,%%r1", xoperands); + output_asm_insn ("ldo R'%0-$PIC_pcrel$0+8(%%r1),%%r1", xoperands); + output_asm_insn ("bve,l (%%r1),%%r2", xoperands); } - /* Pure portable runtime doesn't allow be/ble; we also don't have - PIC support in the assembler/linker, so this sequence is needed. */ else if (TARGET_PORTABLE_RUNTIME) { - xoperands[0] = call_dest; - /* Get the address of our target into %r29. */ - output_asm_insn ("ldil L%%%0,%%r29", xoperands); - output_asm_insn ("ldo R%%%0(%%r29),%%r29", xoperands); + /* Pure portable runtime doesn't allow be/ble; we also don't + have PIC support in the assembler/linker, so this sequence + is needed. */ + + /* Get the address of our target into %r1. */ + output_asm_insn ("ldil L'%0,%%r1", xoperands); + output_asm_insn ("ldo R'%0(%%r1),%%r1", xoperands); /* Get our return address into %r31. */ - output_asm_insn ("blr %%r0,%3", xoperands); + output_asm_insn ("{bl|b,l} .+8,%%r31", xoperands); + output_asm_insn ("addi 8,%%r31,%%r31", xoperands); - /* Jump to our target address in %r29. */ - output_asm_insn ("bv,n %%r0(%%r29)", xoperands); - - /* Empty delay slot. 
Note this insn gets fetched twice and - executed once. To be safe we use a nop. */ - output_asm_insn ("nop", xoperands); + /* Jump to our target address in %r1. */ + output_asm_insn ("bv %%r0(%%r1)", xoperands); } - /* If we're allowed to use be/ble instructions, then this is the - best sequence to use for a long millicode call. */ - else + else if (!flag_pic) { - xoperands[0] = call_dest; - output_asm_insn ("ldil L%%%0,%3", xoperands); + output_asm_insn ("ldil L'%0,%%r1", xoperands); if (TARGET_PA_20) - output_asm_insn ("be,l R%%%0(%%sr4,%3),%%sr0,%%r31", xoperands); + output_asm_insn ("be,l R'%0(%%sr4,%%r1),%%sr0,%%r31", xoperands); else - output_asm_insn ("ble R%%%0(%%sr4,%3)", xoperands); - output_asm_insn ("nop", xoperands); + output_asm_insn ("ble R'%0(%%sr4,%%r1)", xoperands); } - - /* If we had a jump in the call's delay slot, output it now. */ - if (seq_length != 0 && !delay_insn_deleted) + else { - xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - output_asm_insn ("b,n %0", xoperands); + if (TARGET_SOM || !TARGET_GAS) + { + /* The HP assembler can generate relocations for the + difference of two symbols. GAS can do this for a + millicode symbol but not an arbitrary external + symbol when generating SOM output. */ + xoperands[1] = gen_label_rtx (); + output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + output_asm_insn ("addi 16,%%r1,%%r31", xoperands); + ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", + CODE_LABEL_NUMBER (xoperands[1])); + output_asm_insn ("addil L'%0-%l1,%%r1", xoperands); + output_asm_insn ("ldo R'%0-%l1(%%r1),%%r1", xoperands); + } + else + { + output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + output_asm_insn ("addi 16,%%r1,%%r31", xoperands); + output_asm_insn ("addil L'%0-$PIC_pcrel$0+8,%%r1", xoperands); + output_asm_insn ("ldo R'%0-$PIC_pcrel$0+12(%%r1),%%r1", + xoperands); + } - /* Now delete the delay insn. 
*/ - PUT_CODE (NEXT_INSN (insn), NOTE); - NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; - NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + /* Jump to our target address in %r1. */ + output_asm_insn ("bv %%r0(%%r1)", xoperands); } - return ""; } - /* This call has an unconditional jump in its delay slot and the - call is known to reach its target or the beginning of the current - subspace. */ - - /* Use the containing sequence insn's address. */ - seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0))); + if (seq_length == 0) + output_asm_insn ("nop", xoperands); - distance = INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn)))) - - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8; + /* We are done if there isn't a jump in the delay slot. */ + if (seq_length == 0 || GET_CODE (NEXT_INSN (insn)) != JUMP_INSN) + return ""; - /* If the branch was too far away, emit a normal call followed - by a nop, followed by the unconditional branch. + /* This call has an unconditional jump in its delay slot. */ + xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - If the branch is close, then adjust %r2 from within the - call's delay slot. */ + /* See if the return address can be adjusted. Use the containing + sequence insn's address. */ + seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0))); + distance = (INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn)))) + - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8); - xoperands[0] = call_dest; - xoperands[1] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - if (! VAL_14_BITS_P (distance)) - output_asm_insn ("{bl|b,l} %0,%3\n\tnop\n\tb,n %1", xoperands); - else + if (VAL_14_BITS_P (distance)) { - xoperands[2] = gen_label_rtx (); - output_asm_insn ("\n\t{bl|b,l} %0,%3\n\tldo %1-%2(%3),%3", - xoperands); + xoperands[1] = gen_label_rtx (); + output_asm_insn ("ldo %0-%1(%2),%2", xoperands); ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", - CODE_LABEL_NUMBER (xoperands[2])); + CODE_LABEL_NUMBER (xoperands[1])); } + else + /* ??? 
This branch may not reach its target. */ + output_asm_insn ("nop\n\tb,n %0", xoperands); /* Delete the jump. */ PUT_CODE (NEXT_INSN (insn), NOTE); NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + return ""; } -/* INSN is either a function call. It may have an unconditional jump +/* We include the delay slot in the returned length as it is better to + over estimate the length than to under estimate it. */ + +int +attr_length_call (insn, sibcall) + rtx insn; + int sibcall; +{ + unsigned long distance = total_code_bytes + INSN_ADDRESSES (INSN_UID (insn)); + + if (distance < total_code_bytes) + distance = -1; + + if (TARGET_64BIT) + { + if (!TARGET_LONG_CALLS + && ((!sibcall && distance < 7600000) || distance < 240000)) + return 8; + + return (sibcall ? 28 : 24); + } + else + { + if (!TARGET_LONG_CALLS + && ((TARGET_PA_20 && !sibcall && distance < 7600000) + || distance < 240000)) + return 8; + + if (TARGET_LONG_ABS_CALL && !flag_pic) + return 12; + + if ((TARGET_SOM && TARGET_LONG_PIC_SDIFF_CALL) + || (TARGET_GAS && TARGET_LONG_PIC_PCREL_CALL)) + { + if (TARGET_PA_20) + return 20; + + return 28; + } + else + { + int length = 0; + + if (TARGET_SOM) + length += length_fp_args (insn); + + if (flag_pic) + length += 4; + + if (TARGET_PA_20) + return (length + 32); + + if (!sibcall) + length += 8; + + return (length + 40); + } + } +} + +/* INSN is a function call. It may have an unconditional jump in its delay slot. CALL_DEST is the routine we are calling. 
*/ const char * output_call (insn, call_dest, sibcall) - rtx insn; - rtx call_dest; - int sibcall; + rtx insn; + rtx call_dest; + int sibcall; { + int delay_insn_deleted = 0; + int delay_slot_filled = 0; int attr_length = get_attr_length (insn); int seq_length = dbr_sequence_length (); - int distance; - rtx xoperands[4]; - rtx seq_insn; + rtx xoperands[2]; + + xoperands[0] = call_dest; - /* Handle common case -- empty delay slot or no jump in the delay slot, - and we're sure that the branch will reach the beginning of the $CODE$ - subspace. */ - if ((seq_length == 0 && attr_length == 12) - || (seq_length != 0 - && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN - && attr_length == 8)) + /* Handle the common case where we're sure that the branch will reach + the beginning of the $CODE$ subspace. */ + if (!TARGET_LONG_CALLS + && ((seq_length == 0 && attr_length == 12) + || (seq_length != 0 && attr_length == 8))) { - xoperands[0] = call_dest; xoperands[1] = gen_rtx_REG (word_mode, sibcall ? 0 : 2); - output_asm_insn ("{bl|b,l} %0,%1%#", xoperands); - return ""; + output_asm_insn ("{bl|b,l} %0,%1", xoperands); } - - /* This call may not reach the beginning of the $CODE$ subspace. */ - if (attr_length > 12) + else { - int delay_insn_deleted = 0; - rtx xoperands[2]; - rtx link; - - /* We need to emit an inline long-call branch. Furthermore, - because we're changing a named function call into an indirect - function call well after the parameters have been set up, we - need to make sure any FP args appear in both the integer - and FP registers. Also, we need move any delay slot insn - out of the delay slot. And finally, we can't rely on the linker - being able to fix the call to $$dyncall! -- Yuk!. */ - if (seq_length != 0 - && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN) - { - /* A non-jump insn in the delay slot. By definition we can - emit this insn before the call (and in fact before argument - relocating. 
*/ - final_scan_insn (NEXT_INSN (insn), asm_out_file, optimize, 0, 0); - - /* Now delete the delay insn. */ - PUT_CODE (NEXT_INSN (insn), NOTE); - NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; - NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; - delay_insn_deleted = 1; - } - - /* Now copy any FP arguments into integer registers. */ - for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1)) - { - int arg_mode, regno; - rtx use = XEXP (link, 0); - if (! (GET_CODE (use) == USE - && GET_CODE (XEXP (use, 0)) == REG - && FUNCTION_ARG_REGNO_P (REGNO (XEXP (use, 0))))) - continue; - - arg_mode = GET_MODE (XEXP (use, 0)); - regno = REGNO (XEXP (use, 0)); - /* Is it a floating point register? */ - if (regno >= 32 && regno <= 39) - { - /* Copy from the FP register into an integer register - (via memory). */ - if (arg_mode == SFmode) - { - xoperands[0] = XEXP (use, 0); - xoperands[1] = gen_rtx_REG (SImode, 26 - (regno - 32) / 2); - output_asm_insn ("{fstws|fstw} %0,-16(%%sr0,%%r30)", - xoperands); - output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands); - } - else - { - xoperands[0] = XEXP (use, 0); - xoperands[1] = gen_rtx_REG (DImode, 25 - (regno - 34) / 2); - output_asm_insn ("{fstds|fstd} %0,-16(%%sr0,%%r30)", - xoperands); - output_asm_insn ("ldw -12(%%sr0,%%r30),%R1", xoperands); - output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands); - } + if (TARGET_64BIT) + { + /* ??? As far as I can tell, the HP linker doesn't support the + long pc-relative sequence described in the 64-bit runtime + architecture. So, we use a slightly longer indirect call. */ + struct deferred_plabel *p = get_plabel (XSTR (call_dest, 0)); + + xoperands[0] = p->internal_label; + xoperands[1] = gen_label_rtx (); + + /* If this isn't a sibcall, we put the load of %r27 into the + delay slot. We can't do this in a sibcall as we don't + have a second call-clobbered scratch register available. 
*/ + if (seq_length != 0 + && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN + && !sibcall) + { + final_scan_insn (NEXT_INSN (insn), asm_out_file, + optimize, 0, 0); + + /* Now delete the delay insn. */ + PUT_CODE (NEXT_INSN (insn), NOTE); + NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; + NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + delay_insn_deleted = 1; + } + + output_asm_insn ("addil LT'%0,%%r27", xoperands); + output_asm_insn ("ldd RT'%0(%%r1),%%r1", xoperands); + output_asm_insn ("ldd 0(%%r1),%%r1", xoperands); + + if (sibcall) + { + output_asm_insn ("ldd 24(%%r1),%%r27", xoperands); + output_asm_insn ("ldd 16(%%r1),%%r1", xoperands); + output_asm_insn ("bve (%%r1)", xoperands); + } + else + { + output_asm_insn ("ldd 16(%%r1),%%r2", xoperands); + output_asm_insn ("bve,l (%%r2),%%r2", xoperands); + output_asm_insn ("ldd 24(%%r1),%%r27", xoperands); + delay_slot_filled = 1; } } - - /* Don't have to worry about TARGET_PORTABLE_RUNTIME here since - we don't have any direct calls in that case. */ + else { - size_t i; - const char *name = XSTR (call_dest, 0); + int indirect_call = 0; - /* See if we have already put this function on the list - of deferred plabels. This list is generally small, - so a liner search is not too ugly. If it proves too - slow replace it with something faster. */ - for (i = 0; i < n_deferred_plabels; i++) - if (strcmp (name, deferred_plabels[i].name) == 0) - break; - - /* If the deferred plabel list is empty, or this entry was - not found on the list, create a new entry on the list. */ - if (deferred_plabels == NULL || i == n_deferred_plabels) - { - const char *real_name; - - if (deferred_plabels == 0) - deferred_plabels = (struct deferred_plabel *) - ggc_alloc (sizeof (struct deferred_plabel)); + /* Emit a long call. There are several different sequences + of increasing length and complexity. In most cases, + they don't allow an instruction in the delay slot. 
*/ + if (!(TARGET_LONG_ABS_CALL && !flag_pic) + && !(TARGET_SOM && TARGET_LONG_PIC_SDIFF_CALL) + && !(TARGET_GAS && TARGET_LONG_PIC_PCREL_CALL)) + indirect_call = 1; + + if (seq_length != 0 + && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN + && !sibcall + && (!TARGET_PA_20 || indirect_call)) + { + /* A non-jump insn in the delay slot. By definition we can + emit this insn before the call (and in fact before argument + relocating. */ + final_scan_insn (NEXT_INSN (insn), asm_out_file, optimize, 0, 0); + + /* Now delete the delay insn. */ + PUT_CODE (NEXT_INSN (insn), NOTE); + NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; + NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + delay_insn_deleted = 1; + } + + if (TARGET_LONG_ABS_CALL && !flag_pic) + { + /* This is the best sequence for making long calls in + non-pic code. Unfortunately, GNU ld doesn't provide + the stub needed for external calls, and GAS's support + for this with the SOM linker is buggy. */ + output_asm_insn ("ldil L'%0,%%r1", xoperands); + if (sibcall) + output_asm_insn ("be R'%0(%%sr4,%%r1)", xoperands); else - deferred_plabels = (struct deferred_plabel *) - ggc_realloc (deferred_plabels, - ((n_deferred_plabels + 1) - * sizeof (struct deferred_plabel))); - - i = n_deferred_plabels++; - deferred_plabels[i].internal_label = gen_label_rtx (); - deferred_plabels[i].name = ggc_strdup (name); - - /* Gross. We have just implicitly taken the address of this - function, mark it as such. */ - real_name = (*targetm.strip_name_encoding) (name); - TREE_SYMBOL_REFERENCED (get_identifier (real_name)) = 1; - } - - /* We have to load the address of the function using a procedure - label (plabel). Inline plabels can lose for PIC and other - cases, so avoid them by creating a 32bit plabel in the data - segment. */ - if (flag_pic) - { - xoperands[0] = deferred_plabels[i].internal_label; - if (TARGET_SOM || ! 
TARGET_GAS) - xoperands[1] = gen_label_rtx (); - - output_asm_insn ("addil LT%%%0,%%r19", xoperands); - output_asm_insn ("ldw RT%%%0(%%r1),%%r22", xoperands); - output_asm_insn ("ldw 0(%%r22),%%r22", xoperands); - - /* Get our address + 8 into %r1. */ - output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + { + if (TARGET_PA_20) + output_asm_insn ("be,l R'%0(%%sr4,%%r1),%%sr0,%%r31", + xoperands); + else + output_asm_insn ("ble R'%0(%%sr4,%%r1)", xoperands); - if (TARGET_SOM || ! TARGET_GAS) + output_asm_insn ("copy %%r31,%%r2", xoperands); + delay_slot_filled = 1; + } + } + else + { + if (TARGET_SOM && TARGET_LONG_PIC_SDIFF_CALL) { - /* Add %r1 to the offset of dyncall from the next insn. */ - output_asm_insn ("addil L%%$$dyncall-%1,%%r1", xoperands); + /* The HP assembler and linker can handle relocations + for the difference of two symbols. GAS and the HP + linker can't do this when one of the symbols is + external. */ + xoperands[1] = gen_label_rtx (); + output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + output_asm_insn ("addil L'%0-%l1,%%r1", xoperands); ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", CODE_LABEL_NUMBER (xoperands[1])); - output_asm_insn ("ldo R%%$$dyncall-%1(%%r1),%%r1", xoperands); - } - else + output_asm_insn ("ldo R'%0-%l1(%%r1),%%r1", xoperands); + } + else if (TARGET_GAS && TARGET_LONG_PIC_PCREL_CALL) { - output_asm_insn ("addil L%%$$dyncall-$PIC_pcrel$0+4,%%r1", + /* GAS currently can't generate the relocations that + are needed for the SOM linker under HP-UX using this + sequence. The GNU linker doesn't generate the stubs + that are needed for external calls on TARGET_ELF32 + with this sequence. For now, we have to use a + longer plabel sequence when using GAS. 
*/ + output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + output_asm_insn ("addil L'%0-$PIC_pcrel$0+4,%%r1", xoperands); - output_asm_insn ("ldo R%%$$dyncall-$PIC_pcrel$0+8(%%r1),%%r1", + output_asm_insn ("ldo R'%0-$PIC_pcrel$0+8(%%r1),%%r1", xoperands); } + else + { + /* Emit a long plabel-based call sequence. This is + essentially an inline implementation of $$dyncall. + We don't actually try to call $$dyncall as this is + as difficult as calling the function itself. */ + struct deferred_plabel *p = get_plabel (XSTR (call_dest, 0)); + + xoperands[0] = p->internal_label; + xoperands[1] = gen_label_rtx (); + + /* Since the call is indirect, FP arguments in registers + need to be copied to the general registers. Then, the + argument relocation stub will copy them back. */ + if (TARGET_SOM) + copy_fp_args (insn); - /* Get the return address into %r31. */ - output_asm_insn ("blr %%r0,%%r31", xoperands); + if (flag_pic) + { + output_asm_insn ("addil LT'%0,%%r19", xoperands); + output_asm_insn ("ldw RT'%0(%%r1),%%r1", xoperands); + output_asm_insn ("ldw 0(%%r1),%%r1", xoperands); + } + else + { + output_asm_insn ("addil LR'%0-$global$,%%r27", + xoperands); + output_asm_insn ("ldw RR'%0-$global$(%%r1),%%r1", + xoperands); + } - /* Branch to our target which is in %r1. */ - output_asm_insn ("bv %%r0(%%r1)", xoperands); + output_asm_insn ("bb,>=,n %%r1,30,.+16", xoperands); + output_asm_insn ("depi 0,31,2,%%r1", xoperands); + output_asm_insn ("ldw 4(%%sr0,%%r1),%%r19", xoperands); + output_asm_insn ("ldw 0(%%sr0,%%r1),%%r1", xoperands); - if (sibcall) - { - /* This call never returns, so we do not need to fix the - return pointer. */ - output_asm_insn ("nop", xoperands); - } - else - { - /* Copy the return address into %r2 also. 
*/ - output_asm_insn ("copy %%r31,%%r2", xoperands); + if (!sibcall && !TARGET_PA_20) + { + output_asm_insn ("{bl|b,l} .+8,%%r2", xoperands); + output_asm_insn ("addi 16,%%r2,%%r2", xoperands); + } } - } - else - { - xoperands[0] = deferred_plabels[i].internal_label; - /* Get the address of our target into %r22. */ - output_asm_insn ("addil LR%%%0-$global$,%%r27", xoperands); - output_asm_insn ("ldw RR%%%0-$global$(%%r1),%%r22", xoperands); - - /* Get the high part of the address of $dyncall into %r2, then - add in the low part in the branch instruction. */ - output_asm_insn ("ldil L%%$$dyncall,%%r2", xoperands); if (TARGET_PA_20) - output_asm_insn ("be,l R%%$$dyncall(%%sr4,%%r2),%%sr0,%%r31", - xoperands); - else - output_asm_insn ("ble R%%$$dyncall(%%sr4,%%r2)", xoperands); - - if (sibcall) { - /* This call never returns, so we do not need to fix the - return pointer. */ - output_asm_insn ("nop", xoperands); + if (sibcall) + output_asm_insn ("bve (%%r1)", xoperands); + else + { + if (indirect_call) + { + output_asm_insn ("bve,l (%%r1),%%r2", xoperands); + output_asm_insn ("stw %%r2,-24(%%sp)", xoperands); + delay_slot_filled = 1; + } + else + output_asm_insn ("bve,l (%%r1),%%r2", xoperands); + } } else { - /* Copy the return address into %r2 also. */ - output_asm_insn ("copy %%r31,%%r2", xoperands); - } - } - } + output_asm_insn ("ldsid (%%r1),%%r31\n\tmtsp %%r31,%%sr0", + xoperands); - /* If we had a jump in the call's delay slot, output it now. */ - if (seq_length != 0 && !delay_insn_deleted) - { - xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - output_asm_insn ("b,n %0", xoperands); + if (sibcall) + output_asm_insn ("be 0(%%sr0,%%r1)", xoperands); + else + { + output_asm_insn ("ble 0(%%sr0,%%r1)", xoperands); - /* Now delete the delay insn. 
*/ - PUT_CODE (NEXT_INSN (insn), NOTE); - NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; - NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + if (indirect_call) + output_asm_insn ("stw %%r31,-24(%%sp)", xoperands); + else + output_asm_insn ("copy %%r31,%%r2", xoperands); + delay_slot_filled = 1; + } + } + } } - return ""; } - /* This call has an unconditional jump in its delay slot and the - call is known to reach its target or the beginning of the current - subspace. */ + if (seq_length == 0 || (delay_insn_deleted && !delay_slot_filled)) + output_asm_insn ("nop", xoperands); - /* Use the containing sequence insn's address. */ - seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0))); + /* We are done if there isn't a jump in the delay slot. */ + if (seq_length == 0 + || delay_insn_deleted + || GET_CODE (NEXT_INSN (insn)) != JUMP_INSN) + return ""; - distance = INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn)))) - - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8; + /* A sibcall should never have a branch in the delay slot. */ + if (sibcall) + abort (); - /* If the branch is too far away, emit a normal call followed - by a nop, followed by the unconditional branch. If the branch - is close, then adjust %r2 in the call's delay slot. */ + /* This call has an unconditional jump in its delay slot. */ + xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - xoperands[0] = call_dest; - xoperands[1] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - if (! VAL_14_BITS_P (distance)) - output_asm_insn ("{bl|b,l} %0,%%r2\n\tnop\n\tb,n %1", xoperands); - else + if (!delay_slot_filled) { - xoperands[3] = gen_label_rtx (); - output_asm_insn ("\n\t{bl|b,l} %0,%%r2\n\tldo %1-%3(%%r2),%%r2", - xoperands); - ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", - CODE_LABEL_NUMBER (xoperands[3])); + /* See if the return address can be adjusted. Use the containing + sequence insn's address. 
*/ + rtx seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0))); + int distance = (INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn)))) + - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8); + + if (VAL_14_BITS_P (distance)) + { + xoperands[1] = gen_label_rtx (); + output_asm_insn ("ldo %0-%1(%%r2),%%r2", xoperands); + ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", + CODE_LABEL_NUMBER (xoperands[1])); + } + else + /* ??? This branch may not reach its target. */ + output_asm_insn ("nop\n\tb,n %0", xoperands); } + else + /* ??? This branch may not reach its target. */ + output_asm_insn ("b,n %0", xoperands); /* Delete the jump. */ PUT_CODE (NEXT_INSN (insn), NOTE); NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + return ""; } @@ -6580,8 +6772,8 @@ pa_asm_output_mi_thunk (file, thunk_fnde { if (! TARGET_64BIT && ! TARGET_PORTABLE_RUNTIME && flag_pic) { - fprintf (file, "\taddil LT%%%s,%%r19\n", lab); - fprintf (file, "\tldw RT%%%s(%%r1),%%r22\n", lab); + fprintf (file, "\taddil LT'%s,%%r19\n", lab); + fprintf (file, "\tldw RT'%s(%%r1),%%r22\n", lab); fprintf (file, "\tldw 0(%%sr0,%%r22),%%r22\n"); fprintf (file, "\tbb,>=,n %%r22,30,.+16\n"); fprintf (file, "\tdepi 0,31,2,%%r22\n"); @@ -6603,13 +6795,13 @@ pa_asm_output_mi_thunk (file, thunk_fnde { if (! TARGET_64BIT && !
TARGET_PORTABLE_RUNTIME && flag_pic) { - fprintf (file, "\taddil L%%"); + fprintf (file, "\taddil L'"); fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta); - fprintf (file, ",%%r26\n\tldo R%%"); + fprintf (file, ",%%r26\n\tldo R'"); fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta); fprintf (file, "(%%r1),%%r26\n"); - fprintf (file, "\taddil LT%%%s,%%r19\n", lab); - fprintf (file, "\tldw RT%%%s(%%r1),%%r22\n", lab); + fprintf (file, "\taddil LT'%s,%%r19\n", lab); + fprintf (file, "\tldw RT'%s(%%r1),%%r22\n", lab); fprintf (file, "\tldw 0(%%sr0,%%r22),%%r22\n"); fprintf (file, "\tbb,>=,n %%r22,30,.+16\n"); fprintf (file, "\tdepi 0,31,2,%%r22\n"); @@ -6620,9 +6812,9 @@ pa_asm_output_mi_thunk (file, thunk_fnde } else { - fprintf (file, "\taddil L%%"); + fprintf (file, "\taddil L'"); fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta); - fprintf (file, ",%%r26\n\tb %s\n\tldo R%%", target_name); + fprintf (file, ",%%r26\n\tb %s\n\tldo R'", target_name); fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta); fprintf (file, "(%%r1),%%r26\n"); } @@ -6634,7 +6826,7 @@ pa_asm_output_mi_thunk (file, thunk_fnde data_section (); fprintf (file, "\t.align 4\n"); ASM_OUTPUT_INTERNAL_LABEL (file, "LTHN", current_thunk_number); - fprintf (file, "\t.word P%%%s\n", target_name); + fprintf (file, "\t.word P'%s\n", target_name); function_section (thunk_fndecl); } current_thunk_number++; Index: config/pa/pa.h =================================================================== RCS file: /cvsroot/gcc/gcc/gcc/config/pa/pa.h,v retrieving revision 1.173 diff -u -3 -p -r1.173 pa.h --- config/pa/pa.h 20 Oct 2002 22:37:12 -0000 1.173 +++ config/pa/pa.h 30 Oct 2002 17:06:44 -0000 @@ -31,7 +31,7 @@ enum cmp_type /* comparison type */ }; /* For long call handling. */ -extern unsigned int total_code_bytes; +extern unsigned long total_code_bytes; /* Which processor to schedule for. 
*/ @@ -152,6 +152,12 @@ extern int target_flags; #define TARGET_GNU_LD (target_flags & MASK_GNU_LD) #endif +/* Force generation of long calls. */ +#define MASK_LONG_CALLS 32768 +#ifndef TARGET_LONG_CALLS +#define TARGET_LONG_CALLS (target_flags & MASK_LONG_CALLS) +#endif + #ifndef TARGET_PA_10 #define TARGET_PA_10 (target_flags & (MASK_PA_11 | MASK_PA_20) == 0) #endif @@ -179,6 +185,27 @@ extern int target_flags; #define TARGET_SOM 0 #endif +/* The following three defines are potential target switches. The current + defines are optimal given the current capabilities of GAS and GNU ld. */ + +/* Define to a C expression evaluating to true to use long absolute calls. + Currently, only the HP assembler and SOM linker support long absolute + calls. They are used only in non-pic code. */ +#define TARGET_LONG_ABS_CALL (TARGET_SOM && !TARGET_GAS) + +/* Define to a C expression evaluating to true to use long pic symbol + difference calls. This is a call variant similar to the long pic + pc-relative call. Long pic symbol difference calls are only used with + the HP SOM linker. Currently, only the HP assembler supports these + calls. GAS doesn't allow an arbitrary difference of two symbols. */ +#define TARGET_LONG_PIC_SDIFF_CALL (!TARGET_GAS) + +/* Define to a C expression evaluating to true to use long pic + pc-relative calls. Long pic pc-relative calls are only used with + GAS. Currently, they are usable for calls within a module but + not for external calls. */ +#define TARGET_LONG_PIC_PCREL_CALL 0 + /* Macro to define tables used to set the flags. This is a list in braces of target switches with each switch being { "NAME", VALUE, "HELP_STRING" }.
VALUE is the bits to set, @@ -237,6 +264,10 @@ extern int target_flags; N_("Generate code for huge switch statements") }, \ { "no-big-switch", -MASK_BIG_SWITCH, \ N_("Do not generate code for huge switch statements") }, \ + { "long-calls", MASK_LONG_CALLS, \ + N_("Always generate long calls") }, \ + { "no-long-calls", -MASK_LONG_CALLS, \ + N_("Generate long calls only when needed") }, \ { "linker-opt", 0, \ N_("Enable linker optimizations") }, \ SUBTARGET_SWITCHES \ @@ -1193,8 +1224,14 @@ extern int may_call_alloca; /* Using DFmode forces only short displacements \ to be recognized as valid in reg+d addresses. \ However, this is not necessary for PA2.0 since\ - it has long FP loads/stores. */ \ + it has long FP loads/stores. \ + \ + FIXME: the ELF32 linker clobbers the LSB of \ + the FP register number in {fldw,fstw} insns. \ + Thus, we only allow long FP loads/stores on \ + TARGET_64BIT. */ \ && memory_address_p ((TARGET_PA_20 \ + && !TARGET_ELF32 \ ? GET_MODE (OP) \ : DFmode), \ XEXP (OP, 0)) \ @@ -1300,7 +1337,7 @@ extern int may_call_alloca; if (GET_CODE (index) == CONST_INT \ && ((INT_14_BITS (index) \ && (TARGET_SOFT_FLOAT \ - || (TARGET_PA_20 \ + || (TARGET_PA_20 \ && ((MODE == SFmode \ && (INTVAL (index) % 4) == 0)\ || (MODE == DFmode \ @@ -1327,6 +1364,7 @@ extern int may_call_alloca; /* We can allow symbolic LO_SUM addresses\ for PA2.0. */ \ || (TARGET_PA_20 \ + && !TARGET_ELF32 \ && GET_CODE (XEXP (X, 1)) != CONST_INT)\ || ((MODE) != SFmode \ && (MODE) != DFmode))) \ @@ -1340,6 +1378,7 @@ extern int may_call_alloca; /* We can allow symbolic LO_SUM addresses\ for PA2.0. 
*/ \ || (TARGET_PA_20 \ + && !TARGET_ELF32 \ && GET_CODE (XEXP (X, 1)) != CONST_INT)\ || ((MODE) != SFmode \ && (MODE) != DFmode))) \ @@ -1354,7 +1393,7 @@ extern int may_call_alloca; && REG_OK_FOR_BASE_P (XEXP (X, 0)) \ && GET_CODE (XEXP (X, 1)) == UNSPEC \ && (TARGET_SOFT_FLOAT \ - || TARGET_PA_20 \ + || (TARGET_PA_20 && !TARGET_ELF32) \ || ((MODE) != SFmode \ && (MODE) != DFmode))) \ goto ADDR; \ @@ -1386,7 +1425,7 @@ do { \ rtx new, temp = NULL_RTX; \ \ mask = (GET_MODE_CLASS (MODE) == MODE_FLOAT \ - ? (TARGET_PA_20 ? 0x3fff : 0x1f) : 0x3fff); \ + ? (TARGET_PA_20 && !TARGET_ELF32 ? 0x3fff : 0x1f) : 0x3fff); \ \ if (optimize \ && GET_CODE (AD) == PLUS) \ Index: config/pa/pa.md =================================================================== RCS file: /cvsroot/gcc/gcc/gcc/config/pa/pa.md,v retrieving revision 1.113 diff -u -3 -p -r1.113 pa.md --- config/pa/pa.md 11 Sep 2002 02:45:09 -0000 1.113 +++ config/pa/pa.md 30 Oct 2002 17:06:46 -0000 @@ -105,12 +105,9 @@ (define_delay (eq_attr "type" "call") [(eq_attr "in_call_delay" "true") (nil) (nil)]) -;; millicode call delay slot description. Note it disallows delay slot -;; when TARGET_PORTABLE_RUNTIME is true. +;; Millicode call delay slot description. (define_delay (eq_attr "type" "milli") - [(and (eq_attr "in_call_delay" "true") - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") (const_int 0))) - (nil) (nil)]) + [(eq_attr "in_call_delay" "true") (nil) (nil)]) ;; Return and other similar instructions. 
(define_delay (eq_attr "type" "branch,parallel_branch") @@ -4089,27 +4086,7 @@ "!TARGET_64BIT" "* return output_mul_insn (0, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) (mult:SI (reg:SI 26) (reg:SI 25))) @@ -4120,7 +4097,7 @@ "TARGET_64BIT" "* return output_mul_insn (0, insn);" [(set_attr "type" "milli") - (set (attr "length") (const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_expand "muldi3" [(set (match_operand:DI 0 "register_operand" "") @@ -4211,27 +4188,7 @@ "* return output_div_insn (operands, 0, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) @@ -4245,7 +4202,7 @@ "* return output_div_insn (operands, 0, insn);" [(set_attr "type" "milli") - (set (attr "length") (const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_expand 
"udivsi3" [(set (reg:SI 26) (match_operand:SI 1 "move_operand" "")) @@ -4261,6 +4218,7 @@ " { operands[3] = gen_reg_rtx (SImode); + if (TARGET_64BIT) { operands[5] = gen_rtx_REG (SImode, 2); @@ -4287,27 +4245,7 @@ "* return output_div_insn (operands, 1, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) @@ -4321,7 +4259,7 @@ "* return output_div_insn (operands, 1, insn);" [(set_attr "type" "milli") - (set (attr "length") (const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_expand "modsi3" [(set (reg:SI 26) (match_operand:SI 1 "move_operand" "")) @@ -4360,27 +4298,7 @@ "* return output_mod_insn (0, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) (mod:SI (reg:SI 26) (reg:SI 25))) @@ -4393,7 +4311,7 @@ "* return output_mod_insn (0, insn);" [(set_attr "type" "milli") - (set (attr "length") 
(const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_expand "umodsi3" [(set (reg:SI 26) (match_operand:SI 1 "move_operand" "")) @@ -4432,27 +4350,7 @@ "* return output_mod_insn (1, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) (umod:SI (reg:SI 26) (reg:SI 25))) @@ -4465,7 +4363,7 @@ "* return output_mod_insn (1, insn);" [(set_attr "type" "milli") - (set (attr "length") (const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) ;;- and instructions ;; We define DImode `and` so with DImode `not` we can get @@ -6036,11 +5934,12 @@ call_insn = emit_call_insn (gen_call_internal_reg (operands[1])); } + if (TARGET_64BIT) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); + if (flag_pic) { use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), pic_offset_table_rtx); - if (TARGET_64BIT) - use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); /* After each call we must restore the PIC register, even if it doesn't appear to be used. */ @@ -6052,6 +5951,7 @@ (define_insn "call_internal_symref" [(call (mem:SI (match_operand 0 "call_operand_address" "")) (match_operand 1 "" "i")) + (clobber (reg:SI 1)) (clobber (reg:SI 2)) (use (const_int 0))] "! 
TARGET_PORTABLE_RUNTIME" @@ -6061,21 +5961,7 @@ return output_call (insn, operands[0], 0); }" [(set_attr "type" "call") - (set (attr "length") -;; If we're sure that we can either reach the target or that the -;; linker can use a long-branch stub, then the length is at most -;; 8 bytes. -;; -;; For long-calls the length will be at most 68 bytes (non-pic) -;; or 84 bytes (pic). */ -;; Else we have to use a long-call; - (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (const_int 8) - (if_then_else (eq (symbol_ref "flag_pic") - (const_int 0)) - (const_int 68) - (const_int 84))))]) + (set (attr "length") (symbol_ref "attr_length_call (insn, 0)"))]) (define_insn "call_internal_reg_64bit" [(call (mem:SI (match_operand:DI 0 "register_operand" "r")) @@ -6086,15 +5972,16 @@ "* { /* ??? Needs more work. Length computation, split into multiple insns, - do not use %r22 directly, expose delay slot. */ - return \"ldd 16(%0),%%r2\;ldd 24(%0),%%r27\;bve,l (%%r2),%%r2\;nop\"; + expose delay slot. */ + return \"ldd 16(%0),%%r2\;bve,l (%%r2),%%r2\;ldd 24(%0),%%r27\"; }" [(set_attr "type" "dyncall") - (set (attr "length") (const_int 16))]) + (set (attr "length") (const_int 12))]) (define_insn "call_internal_reg" [(call (mem:SI (reg:SI 22)) (match_operand 0 "" "i")) + (clobber (reg:SI 1)) (clobber (reg:SI 2)) (use (const_int 1))] "" @@ -6218,11 +6105,13 @@ call_insn = emit_call_insn (gen_call_value_internal_reg (operands[0], operands[2])); } + + if (TARGET_64BIT) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); + if (flag_pic) { use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), pic_offset_table_rtx); - if (TARGET_64BIT) - use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); /* After each call we must restore the PIC register, even if it doesn't appear to be used. 
*/ @@ -6235,6 +6124,7 @@ [(set (match_operand 0 "" "=rf") (call (mem:SI (match_operand 1 "call_operand_address" "")) (match_operand 2 "" "i"))) + (clobber (reg:SI 1)) (clobber (reg:SI 2)) (use (const_int 0))] ;;- Don't use operand 1 for most machines. @@ -6245,21 +6135,7 @@ return output_call (insn, operands[1], 0); }" [(set_attr "type" "call") - (set (attr "length") -;; If we're sure that we can either reach the target or that the -;; linker can use a long-branch stub, then the length is at most -;; 8 bytes. -;; -;; For long-calls the length will be at most 68 bytes (non-pic) -;; or 84 bytes (pic). */ -;; Else we have to use a long-call; - (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (const_int 8) - (if_then_else (eq (symbol_ref "flag_pic") - (const_int 0)) - (const_int 68) - (const_int 84))))]) + (set (attr "length") (symbol_ref "attr_length_call (insn, 0)"))]) (define_insn "call_value_internal_reg_64bit" [(set (match_operand 0 "" "=rf") @@ -6271,16 +6147,17 @@ "* { /* ??? Needs more work. Length computation, split into multiple insns, - do not use %r22 directly, expose delay slot. */ - return \"ldd 16(%1),%%r2\;ldd 24(%1),%%r27\;bve,l (%%r2),%%r2\;nop\"; + expose delay slot. */ + return \"ldd 16(%1),%%r2\;bve,l (%%r2),%%r2\;ldd 24(%1),%%r27\"; }" [(set_attr "type" "dyncall") - (set (attr "length") (const_int 16))]) + (set (attr "length") (const_int 12))]) (define_insn "call_value_internal_reg" [(set (match_operand 0 "" "=rf") (call (mem:SI (reg:SI 22)) (match_operand 1 "" "i"))) + (clobber (reg:SI 1)) (clobber (reg:SI 2)) (use (const_int 1))] "" @@ -6389,10 +6266,9 @@ }") (define_expand "sibcall" - [(parallel [(call (match_operand:SI 0 "" "") - (match_operand 1 "" "")) - (clobber (reg:SI 0))])] - "! TARGET_PORTABLE_RUNTIME" + [(call (match_operand:SI 0 "" "") + (match_operand 1 "" ""))] + "!TARGET_PORTABLE_RUNTIME" " { rtx op; @@ -6400,8 +6276,21 @@ op = XEXP (operands[0], 0); - /* We do not allow indirect sibling calls. 
*/ - call_insn = emit_call_insn (gen_sibcall_internal_symref (op, operands[1])); + if (TARGET_64BIT) + emit_move_insn (arg_pointer_rtx, + gen_rtx_PLUS (word_mode, virtual_outgoing_args_rtx, + GEN_INT (64))); + + /* Indirect sibling calls are not allowed. */ + if (TARGET_64BIT) + call_insn = gen_sibcall_internal_symref_64bit (op, operands[1]); + else + call_insn = gen_sibcall_internal_symref (op, operands[1]); + + call_insn = emit_call_insn (call_insn); + + if (TARGET_64BIT) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); if (flag_pic) { @@ -6417,38 +6306,39 @@ (define_insn "sibcall_internal_symref" [(call (mem:SI (match_operand 0 "call_operand_address" "")) (match_operand 1 "" "i")) - (clobber (reg:SI 0)) + (clobber (reg:SI 1)) (use (reg:SI 2)) (use (const_int 0))] - "! TARGET_PORTABLE_RUNTIME" + "!TARGET_PORTABLE_RUNTIME && !TARGET_64BIT" "* { output_arg_descriptor (insn); return output_call (insn, operands[0], 1); }" [(set_attr "type" "call") - (set (attr "length") -;; If we're sure that we can either reach the target or that the -;; linker can use a long-branch stub, then the length is at most -;; 8 bytes. -;; -;; For long-calls the length will be at most 68 bytes (non-pic) -;; or 84 bytes (pic). 
*/ -;; Else we have to use a long-call; - (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (const_int 8) - (if_then_else (eq (symbol_ref "flag_pic") - (const_int 0)) - (const_int 68) - (const_int 84))))]) + (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))]) + +(define_insn "sibcall_internal_symref_64bit" + [(call (mem:SI (match_operand 0 "call_operand_address" "")) + (match_operand 1 "" "i")) + (clobber (reg:SI 1)) + (clobber (reg:SI 27)) + (use (reg:SI 2)) + (use (const_int 0))] + "TARGET_64BIT" + "* +{ + output_arg_descriptor (insn); + return output_call (insn, operands[0], 1); +}" + [(set_attr "type" "call") + (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))]) (define_expand "sibcall_value" - [(parallel [(set (match_operand 0 "" "") + [(set (match_operand 0 "" "") (call (match_operand:SI 1 "" "") - (match_operand 2 "" ""))) - (clobber (reg:SI 0))])] - "! TARGET_PORTABLE_RUNTIME" + (match_operand 2 "" "")))] + "!TARGET_PORTABLE_RUNTIME" " { rtx op; @@ -6456,10 +6346,24 @@ op = XEXP (operands[1], 0); - /* We do not allow indirect sibling calls. */ - call_insn = emit_call_insn (gen_sibcall_value_internal_symref (operands[0], - op, - operands[2])); + if (TARGET_64BIT) + emit_move_insn (arg_pointer_rtx, + gen_rtx_PLUS (word_mode, virtual_outgoing_args_rtx, + GEN_INT (64))); + + /* Indirect sibling calls are not allowed. 
*/ + if (TARGET_64BIT) + call_insn + = gen_sibcall_value_internal_symref_64bit (operands[0], op, operands[2]); + else + call_insn + = gen_sibcall_value_internal_symref (operands[0], op, operands[2]); + + call_insn = emit_call_insn (call_insn); + + if (TARGET_64BIT) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); + if (flag_pic) { use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), pic_offset_table_rtx); @@ -6475,32 +6379,34 @@ [(set (match_operand 0 "" "=rf") (call (mem:SI (match_operand 1 "call_operand_address" "")) (match_operand 2 "" "i"))) - (clobber (reg:SI 0)) + (clobber (reg:SI 1)) (use (reg:SI 2)) (use (const_int 0))] - ;;- Don't use operand 1 for most machines. - "! TARGET_PORTABLE_RUNTIME" + "!TARGET_PORTABLE_RUNTIME && !TARGET_64BIT" "* { output_arg_descriptor (insn); return output_call (insn, operands[1], 1); }" [(set_attr "type" "call") - (set (attr "length") -;; If we're sure that we can either reach the target or that the -;; linker can use a long-branch stub, then the length is at most -;; 8 bytes. -;; -;; For long-calls the length will be at most 68 bytes (non-pic) -;; or 84 bytes (pic). 
*/ -;; Else we have to use a long-call; - (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (const_int 8) - (if_then_else (eq (symbol_ref "flag_pic") - (const_int 0)) - (const_int 68) - (const_int 84))))]) + (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))]) + +(define_insn "sibcall_value_internal_symref_64bit" + [(set (match_operand 0 "" "=rf") + (call (mem:SI (match_operand 1 "call_operand_address" "")) + (match_operand 2 "" "i"))) + (clobber (reg:SI 1)) + (clobber (reg:SI 27)) + (use (reg:SI 2)) + (use (const_int 0))] + "TARGET_64BIT" + "* +{ + output_arg_descriptor (insn); + return output_call (insn, operands[1], 1); +}" + [(set_attr "type" "call") + (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))]) (define_insn "nop" [(const_int 0)] @@ -7392,6 +7298,12 @@ "!TARGET_64BIT" "* { + int length = get_attr_length (insn); + rtx xoperands[2]; + + xoperands[0] = GEN_INT (length - 8); + xoperands[1] = GEN_INT (length - 16); + /* Must import the magic millicode routine. */ output_asm_insn (\".IMPORT $$sh_func_adrs,MILLICODE\", NULL); @@ -7400,60 +7312,24 @@ First, copy our input parameter into %r29 just in case we don't need to call $$sh_func_adrs. */ output_asm_insn (\"copy %%r26,%%r29\", NULL); + output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\", NULL); /* Next, examine the low two bits in %r26, if they aren't 0x2, then we use %r26 unchanged. 
*/ - if (get_attr_length (insn) == 32) - output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+24\", NULL); - else if (get_attr_length (insn) == 40) - output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+32\", NULL); - else if (get_attr_length (insn) == 44) - output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+36\", NULL); - else - output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+20\", NULL); + output_asm_insn (\"{comib|cmpib},<>,n 2,%%r31,.+%0\", xoperands); + output_asm_insn (\"ldi 4096,%%r31\", NULL); /* Next, compare %r26 with 4096, if %r26 is less than or equal to - 4096, then we use %r26 unchanged. */ - if (get_attr_length (insn) == 32) - output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+16\", - NULL); - else if (get_attr_length (insn) == 40) - output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+24\", - NULL); - else if (get_attr_length (insn) == 44) - output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+28\", - NULL); - else - output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+12\", - NULL); + 4096, then again we use %r26 unchanged. */ + output_asm_insn (\"{comb|cmpb},<<,n %%r26,%%r31,.+%1\", xoperands); - /* Else call $$sh_func_adrs to extract the function's real add24. */ + /* Finally, call $$sh_func_adrs to extract the function's real add24. 
*/ return output_millicode_call (insn, gen_rtx_SYMBOL_REF (SImode, - \"$$sh_func_adrs\")); + \"$$sh_func_adrs\")); }" [(set_attr "type" "multi") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 28) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 44) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 40)] - -;; Out of reach, can use ble - (const_int 32)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 20)"))]) ;; On the PA, the PIC register is call clobbered, so it must ;; be saved & restored around calls by the caller. If the call Index: config/pa/som.h =================================================================== RCS file: /cvsroot/gcc/gcc/gcc/config/pa/som.h,v retrieving revision 1.38 diff -u -3 -p -r1.38 som.h --- config/pa/som.h 29 Aug 2002 21:16:35 -0000 1.38 +++ config/pa/som.h 30 Oct 2002 17:06:46 -0000 @@ -371,3 +371,7 @@ do { \ on the location of the GCC tool directory. The downside is GCC cannot be moved after installation using a symlink. */ #define ALWAYS_STRIP_DOTDOT 1 + +/* Aggregates with a single float or double field should be passed and + returned in the general registers. 
*/ +#define MEMBER_TYPE_FORCES_BLK(FIELD, MODE) (MODE==SFmode || MODE==DFmode) Index: config/pa/t-pa64 =================================================================== RCS file: /cvsroot/gcc/gcc/gcc/config/pa/t-pa64,v retrieving revision 1.6 diff -u -3 -p -r1.6 t-pa64 --- config/pa/t-pa64 30 Apr 2002 19:47:38 -0000 1.6 +++ config/pa/t-pa64 30 Oct 2002 17:06:46 -0000 @@ -1,4 +1,4 @@ -TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1 +TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1 -mlong-calls LIB2FUNCS_EXTRA=quadlib.c Index: doc/invoke.texi =================================================================== RCS file: /cvsroot/gcc/gcc/gcc/doc/invoke.texi,v retrieving revision 1.196 diff -u -3 -p -r1.196 invoke.texi --- doc/invoke.texi 20 Oct 2002 19:18:30 -0000 1.196 +++ doc/invoke.texi 30 Oct 2002 17:06:48 -0000 @@ -508,7 +508,7 @@ in the following sections. -march=@var{architecture-type} @gol -mbig-switch -mdisable-fpregs -mdisable-indexing @gol -mfast-indirect-calls -mgas -mgnu-ld -mhp-ld @gol --mjump-in-delay -mlinker-opt @gol +-mjump-in-delay -mlinker-opt -mlong-calls @gol -mlong-load-store -mno-big-switch -mno-disable-fpregs @gol -mno-disable-indexing -mno-fast-indirect-calls -mno-gas @gol -mno-jump-in-delay -mno-long-load-store @gol @@ -8093,6 +8093,33 @@ ld. The ld that is called is determined configure option, gcc's program search path, and finally by the user's @env{PATH}. The linker used by GCC can be printed using @samp{which `gcc -print-prog-name=ld`}. + +@item -mlong-calls +@opindex mno-long-calls +Generate code that uses long call sequences. This ensures that a call +is always able to reach linker generated stubs. The default is to generate +long calls only when the distance from the call site to the beginning +of the function or translation unit, as the case may be, exceeds a +predefined limit set by the branch type being used. The limits for +normal calls are 7,600,000 and 240,000 bytes, respectively for the +PA 2.0 and PA 1.X architectures. 
Sibcalls are always limited to +240,000 bytes. + +Distances are measured from the beginning of functions when using the +@option{-ffunction-sections} option, or when using the @option{-mgas} +and @option{-mno-portable-runtime} options together under HP-UX with +the SOM linker. + +It is normally not desirable to use this option as it will degrade +performance. However, it may be useful in large applications, +particularly when partial linking is used to build the application. + +The types of long calls used depend on the capabilities of the +assembler and linker, and the type of code being generated. The +impact on systems that support long absolute calls, and long pic +symbol-difference or pc-relative calls should be relatively small. +However, an indirect call is used on 32-bit ELF systems in pic code +and it is quite long. @end table
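For readers who want the reach rule from the -mlong-calls documentation in code form, here is a rough sketch. It is not taken from the patch and it is not how attr_length_call is written; the names reach_limit and needs_long_call are hypothetical, and the only inputs assumed are the limits quoted in the texinfo text (7,600,000 bytes for PA 2.0, 240,000 bytes for PA 1.X and for all sibcalls).

```c
#include <stdbool.h>

/* Hypothetical helper, not part of the patch: the reach limit in bytes
   for a call, using the figures quoted in the -mlong-calls text.
   Sibcalls get the smaller limit regardless of architecture.  */
static unsigned long
reach_limit (bool pa_20, bool sibcall)
{
  if (sibcall || !pa_20)
    return 240000UL;
  return 7600000UL;
}

/* Hypothetical helper, not part of the patch: decide whether a long call
   sequence is needed.  DISTANCE is the distance in bytes from the call
   site to the start of the function or translation unit.
   FORCE_LONG_CALLS models -mlong-calls.  */
static bool
needs_long_call (unsigned long distance, bool pa_20, bool sibcall,
                 bool force_long_calls)
{
  return force_long_calls || distance > reach_limit (pa_20, sibcall);
}
```

With these assumptions, a normal call 250,000 bytes from the start of the translation unit would need a long sequence on PA 1.X but not on PA 2.0, while a sibcall at the same distance would need one on either.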