* [PATCH] Sparc rtx costs @ 2004-07-09 23:39 David S. Miller 2004-07-10 2:22 ` Richard Henderson 2004-07-15 6:53 ` Eric Botcazou 0 siblings, 2 replies; 8+ messages in thread From: David S. Miller @ 2004-07-09 23:39 UTC (permalink / raw) To: gcc-patches This makes the sparc backend just like i386 and others by using a processor_costs struct instead of a pile of processor type tests. This cleans up the sparc_rtx_costs implementation and makes the sparc backend easier to maintain and tweak. Applied to mainline. 2004-07-02 David S. Miller <davem@nuts.davemloft.net> * config/sparc/sparc.h (processor_costs): Define. (sparc_costs): Declare. * config/sparc/sparc.c (cypress_costs, supersparc_costs, hypersparc_costs, sparclet_costs, ultrasparc_costs, ultrasparc3_costs): New. (sparc_override_options): Set sparc_costs as appropriate. (sparc_rtx_costs): Use sparc_costs instead of messy conditionals. Index: config/sparc/sparc.c =================================================================== RCS file: /cvs/gcc/gcc/gcc/config/sparc/sparc.c,v retrieving revision 1.316 diff -c -r1.316 sparc.c *** config/sparc/sparc.c 9 Jul 2004 10:04:34 -0000 1.316 --- config/sparc/sparc.c 9 Jul 2004 22:52:26 -0000 *************** *** 49,54 **** --- 49,201 ---- #include "cfglayout.h" #include "tree-gimple.h" + /* Processor costs */ + static const + struct processor_costs cypress_costs = { + 2, /* int load */ + 2, /* int signed load */ + 2, /* int zeroed load */ + 2, /* float load */ + 5, /* fmov, fneg, fabs */ + 5, /* fadd, fsub */ + 1, /* fcmp */ + 1, /* fmov, fmovr */ + 7, /* fmul */ + 37, /* fdivs */ + 37, /* fdivd */ + 63, /* fsqrts */ + 63, /* fsqrtd */ + 1, /* imul */ + 1, /* imulX */ + 0, /* imul bit factor */ + 1, /* idiv */ + 1, /* idivX */ + 1, /* movcc/movr */ + 0, /* shift penalty */ + }; + + static const + struct processor_costs supersparc_costs = { + 1, /* int load */ + 1, /* int signed load */ + 1, /* int zeroed load */ + 0, /* float load */ + 3, /* fmov, fneg, fabs */ + 3, /* 
fadd, fsub */ + 3, /* fcmp */ + 1, /* fmov, fmovr */ + 3, /* fmul */ + 6, /* fdivs */ + 9, /* fdivd */ + 12, /* fsqrts */ + 12, /* fsqrtd */ + 4, /* imul */ + 4, /* imulX */ + 0, /* imul bit factor */ + 4, /* idiv */ + 4, /* idivX */ + 1, /* movcc/movr */ + 1, /* shift penalty */ + }; + + static const + struct processor_costs hypersparc_costs = { + 1, /* int load */ + 1, /* int signed load */ + 1, /* int zeroed load */ + 1, /* float load */ + 1, /* fmov, fneg, fabs */ + 1, /* fadd, fsub */ + 1, /* fcmp */ + 1, /* fmov, fmovr */ + 1, /* fmul */ + 8, /* fdivs */ + 12, /* fdivd */ + 17, /* fsqrts */ + 17, /* fsqrtd */ + 17, /* imul */ + 17, /* imulX */ + 0, /* imul bit factor */ + 17, /* idiv */ + 17, /* idivX */ + 1, /* movcc/movr */ + 0, /* shift penalty */ + }; + + static const + struct processor_costs sparclet_costs = { + 3, /* int load */ + 3, /* int signed load */ + 1, /* int zeroed load */ + 1, /* float load */ + 1, /* fmov, fneg, fabs */ + 1, /* fadd, fsub */ + 1, /* fcmp */ + 1, /* fmov, fmovr */ + 1, /* fmul */ + 1, /* fdivs */ + 1, /* fdivd */ + 1, /* fsqrts */ + 1, /* fsqrtd */ + 5, /* imul */ + 5, /* imulX */ + 0, /* imul bit factor */ + 5, /* idiv */ + 5, /* idivX */ + 1, /* movcc/movr */ + 0, /* shift penalty */ + }; + + static const + struct processor_costs ultrasparc_costs = { + 2, /* int load */ + 3, /* int signed load */ + 2, /* int zeroed load */ + 2, /* float load */ + 1, /* fmov, fneg, fabs */ + 4, /* fadd, fsub */ + 1, /* fcmp */ + 2, /* fmov, fmovr */ + 4, /* fmul */ + 13, /* fdivs */ + 23, /* fdivd */ + 13, /* fsqrts */ + 23, /* fsqrtd */ + 4, /* imul */ + 4, /* imulX */ + 2, /* imul bit factor */ + 37, /* idiv */ + 68, /* idivX */ + 2, /* movcc/movr */ + 2, /* shift penalty */ + }; + + static const + struct processor_costs ultrasparc3_costs = { + 2, /* int load */ + 3, /* int signed load */ + 3, /* int zeroed load */ + 2, /* float load */ + 3, /* fmov, fneg, fabs */ + 4, /* fadd, fsub */ + 5, /* fcmp */ + 3, /* fmov, fmovr */ + 4, /* fmul */ 
+ 17, /* fdivs */ + 20, /* fdivd */ + 20, /* fsqrts */ + 29, /* fsqrtd */ + 6, /* imul */ + 6, /* imulX */ + 0, /* imul bit factor */ + 40, /* idiv */ + 71, /* idivX */ + 2, /* movcc/movr */ + 0, /* shift penalty */ + }; + + const struct processor_costs *sparc_costs = &cypress_costs; + #ifdef HAVE_AS_RELAX_OPTION /* If 'as' and 'ld' are relaxing tail call insns into branch always, use "or %o7,%g0,X; call Y; or X,%g0,%o7" always, so that it can be optimized. *************** *** 503,508 **** --- 650,685 ---- /* Set up function hooks. */ init_machine_status = sparc_init_machine_status; + + switch (sparc_cpu) + { + case PROCESSOR_V7: + case PROCESSOR_CYPRESS: + sparc_costs = &cypress_costs; + break; + case PROCESSOR_V8: + case PROCESSOR_SPARCLITE: + case PROCESSOR_SUPERSPARC: + sparc_costs = &supersparc_costs; + break; + case PROCESSOR_F930: + case PROCESSOR_F934: + case PROCESSOR_HYPERSPARC: + case PROCESSOR_SPARCLITE86X: + sparc_costs = &hypersparc_costs; + break; + case PROCESSOR_SPARCLET: + case PROCESSOR_TSC701: + sparc_costs = &sparclet_costs; + break; + case PROCESSOR_V9: + case PROCESSOR_ULTRASPARC: + sparc_costs = &ultrasparc_costs; + break; + case PROCESSOR_ULTRASPARC3: + sparc_costs = &ultrasparc3_costs; + break; + }; } \f /* Miscellaneous utilities. */ *************** *** 8071,8432 **** static bool sparc_rtx_costs (rtx x, int code, int outer_code, int *total) { switch (code) { ! case PLUS: case MINUS: case ABS: case NEG: ! case FLOAT: case UNSIGNED_FLOAT: ! case FIX: case UNSIGNED_FIX: ! case FLOAT_EXTEND: case FLOAT_TRUNCATE: ! if (FLOAT_MODE_P (GET_MODE (x))) ! { ! switch (sparc_cpu) ! { ! case PROCESSOR_ULTRASPARC: ! case PROCESSOR_ULTRASPARC3: ! *total = COSTS_N_INSNS (4); ! return true; ! ! case PROCESSOR_SUPERSPARC: ! *total = COSTS_N_INSNS (3); ! return true; ! ! case PROCESSOR_CYPRESS: ! *total = COSTS_N_INSNS (5); ! return true; ! ! case PROCESSOR_HYPERSPARC: ! case PROCESSOR_SPARCLITE86X: ! default: ! *total = COSTS_N_INSNS (1); ! return true; ! 
} ! } ! ! *total = COSTS_N_INSNS (1); ! return true; ! ! case SQRT: ! switch (sparc_cpu) { ! case PROCESSOR_ULTRASPARC: ! if (GET_MODE (x) == SFmode) ! *total = COSTS_N_INSNS (13); ! else ! *total = COSTS_N_INSNS (23); ! return true; ! ! case PROCESSOR_ULTRASPARC3: ! if (GET_MODE (x) == SFmode) ! *total = COSTS_N_INSNS (20); ! else ! *total = COSTS_N_INSNS (29); ! return true; ! ! case PROCESSOR_SUPERSPARC: ! *total = COSTS_N_INSNS (12); ! return true; ! ! case PROCESSOR_CYPRESS: ! *total = COSTS_N_INSNS (63); ! return true; ! ! case PROCESSOR_HYPERSPARC: ! case PROCESSOR_SPARCLITE86X: ! *total = COSTS_N_INSNS (17); ! return true; ! ! default: ! *total = COSTS_N_INSNS (30); return true; } ! case COMPARE: ! if (FLOAT_MODE_P (GET_MODE (x))) ! { ! switch (sparc_cpu) ! { ! case PROCESSOR_ULTRASPARC: ! case PROCESSOR_ULTRASPARC3: ! *total = COSTS_N_INSNS (1); ! return true; ! ! case PROCESSOR_SUPERSPARC: ! *total = COSTS_N_INSNS (3); ! return true; ! ! case PROCESSOR_CYPRESS: ! *total = COSTS_N_INSNS (5); ! return true; ! ! case PROCESSOR_HYPERSPARC: ! case PROCESSOR_SPARCLITE86X: ! default: ! *total = COSTS_N_INSNS (1); ! return true; ! } ! } ! /* ??? Maybe mark integer compares as zero cost on ! ??? all UltraSPARC processors because the result ! ??? can be bypassed to a branch in the same group. */ ! *total = COSTS_N_INSNS (1); return true; ! case MULT: ! if (FLOAT_MODE_P (GET_MODE (x))) { ! switch (sparc_cpu) ! { ! case PROCESSOR_ULTRASPARC: ! case PROCESSOR_ULTRASPARC3: ! *total = COSTS_N_INSNS (4); ! return true; ! ! case PROCESSOR_SUPERSPARC: ! *total = COSTS_N_INSNS (3); ! return true; ! ! case PROCESSOR_CYPRESS: ! *total = COSTS_N_INSNS (7); ! return true; ! ! case PROCESSOR_HYPERSPARC: ! case PROCESSOR_SPARCLITE86X: ! *total = COSTS_N_INSNS (1); ! return true; ! ! default: ! *total = COSTS_N_INSNS (5); ! return true; ! } } ! ! /* The latency is actually variable for Ultra-I/II ! And if one of the inputs have a known constant ! 
value, we could calculate this precisely. ! ! However, for that to be useful we would need to ! add some machine description changes which would ! make sure small constants ended up in rs1 of the ! multiply instruction. This is because the multiply ! latency is determined by the number of clear (or ! set if the value is negative) bits starting from ! the most significant bit of the first input. ! ! The algorithm for computing num_cycles of a multiply ! on Ultra-I/II is: ! ! if (rs1 < 0) ! highest_bit = highest_clear_bit(rs1); ! else ! highest_bit = highest_set_bit(rs1); ! if (num_bits < 3) ! highest_bit = 3; ! num_cycles = 4 + ((highest_bit - 3) / 2); ! ! If we did that we would have to also consider register ! allocation issues that would result from forcing such ! a value into a register. ! ! There are other similar tricks we could play if we ! knew, for example, that one input was an array index. ! ! Since we do not play any such tricks currently the ! safest thing to do is report the worst case latency. */ ! if (sparc_cpu == PROCESSOR_ULTRASPARC) { ! *total = (GET_MODE (x) == DImode ! ? COSTS_N_INSNS (34) : COSTS_N_INSNS (19)); ! return true; } ! ! /* Multiply latency on Ultra-III, fortunately, is constant. */ ! if (sparc_cpu == PROCESSOR_ULTRASPARC3) { ! *total = COSTS_N_INSNS (6); ! return true; } ! ! if (sparc_cpu == PROCESSOR_HYPERSPARC ! || sparc_cpu == PROCESSOR_SPARCLITE86X) { ! *total = COSTS_N_INSNS (17); ! return true; } - *total = (TARGET_HARD_MUL ? COSTS_N_INSNS (5) : COSTS_N_INSNS (25)); return true; ! case DIV: ! case UDIV: ! case MOD: ! case UMOD: ! if (FLOAT_MODE_P (GET_MODE (x))) ! { ! switch (sparc_cpu) ! { ! case PROCESSOR_ULTRASPARC: ! if (GET_MODE (x) == SFmode) ! *total = COSTS_N_INSNS (13); ! else ! *total = COSTS_N_INSNS (23); ! return true; ! case PROCESSOR_ULTRASPARC3: ! if (GET_MODE (x) == SFmode) ! *total = COSTS_N_INSNS (17); ! else ! *total = COSTS_N_INSNS (20); ! return true; ! case PROCESSOR_SUPERSPARC: ! 
if (GET_MODE (x) == SFmode) ! *total = COSTS_N_INSNS (6); ! else ! *total = COSTS_N_INSNS (9); ! return true; ! case PROCESSOR_HYPERSPARC: ! case PROCESSOR_SPARCLITE86X: ! if (GET_MODE (x) == SFmode) ! *total = COSTS_N_INSNS (8); else ! *total = COSTS_N_INSNS (12); ! return true; ! default: ! *total = COSTS_N_INSNS (7); ! return true; } - } - - if (sparc_cpu == PROCESSOR_ULTRASPARC) - *total = (GET_MODE (x) == DImode - ? COSTS_N_INSNS (68) : COSTS_N_INSNS (37)); - else if (sparc_cpu == PROCESSOR_ULTRASPARC3) - *total = (GET_MODE (x) == DImode - ? COSTS_N_INSNS (71) : COSTS_N_INSNS (40)); - else - *total = COSTS_N_INSNS (25); - return true; - - case IF_THEN_ELSE: - /* Conditional moves. */ - switch (sparc_cpu) - { - case PROCESSOR_ULTRASPARC: - *total = COSTS_N_INSNS (2); - return true; ! case PROCESSOR_ULTRASPARC3: ! if (FLOAT_MODE_P (GET_MODE (x))) ! *total = COSTS_N_INSNS (3); else ! *total = COSTS_N_INSNS (2); ! return true; ! ! default: ! *total = COSTS_N_INSNS (1); ! return true; } ! case MEM: ! /* If outer-code is SIGN/ZERO extension we have to subtract ! out COSTS_N_INSNS (1) from whatever we return in determining ! the cost. */ ! switch (sparc_cpu) ! { ! case PROCESSOR_ULTRASPARC: ! if (outer_code == ZERO_EXTEND) ! *total = COSTS_N_INSNS (1); ! else ! *total = COSTS_N_INSNS (2); ! return true; ! ! case PROCESSOR_ULTRASPARC3: ! if (outer_code == ZERO_EXTEND) ! { ! if (GET_MODE (x) == QImode ! || GET_MODE (x) == HImode ! || outer_code == SIGN_EXTEND) ! *total = COSTS_N_INSNS (2); ! else ! *total = COSTS_N_INSNS (1); ! } ! else ! { ! /* This handles sign extension (3 cycles) ! and everything else (2 cycles). */ ! *total = COSTS_N_INSNS (2); ! } ! return true; ! ! case PROCESSOR_SUPERSPARC: ! if (FLOAT_MODE_P (GET_MODE (x)) ! || outer_code == ZERO_EXTEND ! || outer_code == SIGN_EXTEND) ! *total = COSTS_N_INSNS (0); ! else ! *total = COSTS_N_INSNS (1); ! return true; ! case PROCESSOR_TSC701: ! if (outer_code == ZERO_EXTEND ! || outer_code == SIGN_EXTEND) ! 
*total = COSTS_N_INSNS (2); ! else ! *total = COSTS_N_INSNS (3); ! return true; ! ! case PROCESSOR_CYPRESS: ! if (outer_code == ZERO_EXTEND ! || outer_code == SIGN_EXTEND) ! *total = COSTS_N_INSNS (1); else ! *total = COSTS_N_INSNS (2); ! return true; ! ! case PROCESSOR_HYPERSPARC: ! case PROCESSOR_SPARCLITE86X: ! default: ! if (outer_code == ZERO_EXTEND ! || outer_code == SIGN_EXTEND) ! *total = COSTS_N_INSNS (0); else ! *total = COSTS_N_INSNS (1); ! return true; } ! case CONST_INT: ! if (INTVAL (x) < 0x1000 && INTVAL (x) >= -0x1000) { ! *total = 0; ! return true; } /* FALLTHRU */ ! case HIGH: ! *total = 2; ! return true; ! case CONST: ! case LABEL_REF: ! case SYMBOL_REF: ! *total = 4; ! return true; ! case CONST_DOUBLE: ! if (GET_MODE (x) == DImode ! && ((XINT (x, 3) == 0 ! && (unsigned HOST_WIDE_INT) XINT (x, 2) < 0x1000) ! || (XINT (x, 3) == -1 ! && XINT (x, 2) < 0 ! && XINT (x, 2) >= -0x1000))) ! *total = 0; else ! *total = 8; ! return true; default: return false; --- 8248,8428 ---- static bool sparc_rtx_costs (rtx x, int code, int outer_code, int *total) { + enum machine_mode mode = GET_MODE (x); + bool float_mode_p = FLOAT_MODE_P (mode); + switch (code) { ! case CONST_INT: ! if (INTVAL (x) < 0x1000 && INTVAL (x) >= -0x1000) { ! *total = 0; return true; } + /* FALLTHRU */ ! case HIGH: ! *total = 2; ! return true; ! case CONST: ! case LABEL_REF: ! case SYMBOL_REF: ! *total = 4; ! return true; ! case CONST_DOUBLE: ! if (GET_MODE (x) == DImode ! && ((XINT (x, 3) == 0 ! && (unsigned HOST_WIDE_INT) XINT (x, 2) < 0x1000) ! || (XINT (x, 3) == -1 ! && XINT (x, 2) < 0 ! && XINT (x, 2) >= -0x1000))) ! *total = 0; ! else ! *total = 8; return true; ! case MEM: ! /* If outer-code was a sign or zero extension, a cost ! of COSTS_N_INSNS (1) was already added in. This is ! why we are subtracting it back out. */ ! if (outer_code == ZERO_EXTEND) { ! *total = sparc_costs->int_zload - COSTS_N_INSNS (1); } ! else if (outer_code == SIGN_EXTEND) { ! 
*total = sparc_costs->int_sload - COSTS_N_INSNS (1); } ! else if (float_mode_p) { ! *total = sparc_costs->float_load; } ! else { ! *total = sparc_costs->int_load; } return true; ! case PLUS: ! case MINUS: ! if (float_mode_p) ! *total = sparc_costs->float_plusminus; ! else ! *total = COSTS_N_INSNS (1); ! return false; ! case MULT: ! if (float_mode_p) ! *total = sparc_costs->float_mul; ! else ! { ! int bit_cost; ! bit_cost = 0; ! if (sparc_costs->int_mul_bit_factor) ! { ! int nbits; ! if (GET_CODE (XEXP (x, 1)) == CONST_INT) ! { ! unsigned HOST_WIDE_INT value = INTVAL (XEXP (x, 1)); ! for (nbits = 0; value != 0; value &= value - 1) ! nbits++; ! } ! else if (GET_CODE (XEXP (x, 1)) == CONST_DOUBLE ! && GET_MODE (XEXP (x, 1)) == DImode) ! { ! rtx x1 = XEXP (x, 1); ! unsigned HOST_WIDE_INT value1 = XINT (x1, 2); ! unsigned HOST_WIDE_INT value2 = XINT (x1, 3); ! ! for (nbits = 0; value1 != 0; value1 &= value1 - 1) ! nbits++; ! for (; value2 != 0; value2 &= value2 - 1) ! nbits++; ! } else ! nbits = 7; ! if (nbits < 3) ! nbits = 3; ! bit_cost = (nbits - 3) / sparc_costs->int_mul_bit_factor; } ! if (mode == DImode) ! *total = COSTS_N_INSNS (sparc_costs->int_mulX) + bit_cost; else ! *total = COSTS_N_INSNS (sparc_costs->int_mul) + bit_cost; } + return false; ! case ASHIFT: ! case ASHIFTRT: ! case LSHIFTRT: ! *total = COSTS_N_INSNS (1) + sparc_costs->shift_penalty; ! return false; ! case DIV: ! case UDIV: ! case MOD: ! case UMOD: ! if (float_mode_p) ! { ! if (mode == DFmode) ! *total = sparc_costs->float_div_df; else ! *total = sparc_costs->float_div_sf; ! } ! else ! { ! if (mode == DImode) ! *total = sparc_costs->int_divX; else ! *total = sparc_costs->int_div; } + return false; ! case NEG: ! if (! float_mode_p) { ! *total = COSTS_N_INSNS (1); ! return false; } /* FALLTHRU */ ! case ABS: ! case FLOAT: ! case UNSIGNED_FLOAT: ! case FIX: ! case UNSIGNED_FIX: ! case FLOAT_EXTEND: ! case FLOAT_TRUNCATE: ! *total = sparc_costs->float_move; ! return false; ! case SQRT: ! 
if (mode == DFmode) ! *total = sparc_costs->float_sqrt_df; ! else ! *total = sparc_costs->float_sqrt_sf; ! return false; ! case COMPARE: ! if (float_mode_p) ! *total = sparc_costs->float_cmp; else ! *total = COSTS_N_INSNS (1); ! return false; ! ! case IF_THEN_ELSE: ! if (float_mode_p) ! *total = sparc_costs->float_cmove; ! else ! *total = sparc_costs->int_cmove; ! return false; default: return false; Index: config/sparc/sparc.h =================================================================== RCS file: /cvs/gcc/gcc/gcc/config/sparc/sparc.h,v retrieving revision 1.260 diff -c -r1.260 sparc.h *** config/sparc/sparc.h 7 Jul 2004 19:24:53 -0000 1.260 --- config/sparc/sparc.h 9 Jul 2004 22:52:27 -0000 *************** *** 25,30 **** --- 25,108 ---- /* Note that some other tm.h files include this one and then override whatever definitions are necessary. */ + /* Define the specific costs for a given cpu */ + + struct processor_costs { + /* Integer load */ + const int int_load; + + /* Integer signed load */ + const int int_sload; + + /* Integer zeroed load */ + const int int_zload; + + /* Float load */ + const int float_load; + + /* fmov, fneg, fabs */ + const int float_move; + + /* fadd, fsub */ + const int float_plusminus; + + /* fcmp */ + const int float_cmp; + + /* fmov, fmovr */ + const int float_cmove; + + /* fmul */ + const int float_mul; + + /* fdivs */ + const int float_div_sf; + + /* fdivd */ + const int float_div_df; + + /* fsqrts */ + const int float_sqrt_sf; + + /* fsqrtd */ + const int float_sqrt_df; + + /* umul/smul */ + const int int_mul; + + /* mulX */ + const int int_mulX; + + /* integer multiply cost for each bit set past the most + significant 3, so the formula for multiply cost becomes: + + if (rs1 < 0) + highest_bit = highest_clear_bit(rs1); + else + highest_bit = highest_set_bit(rs1); + if (highest_bit < 3) + highest_bit = 3; + cost = int_mul{,X} + ((highest_bit - 3) / int_mul_bit_factor); + + A value of zero indicates that the multiply costs is 
fixed, + and not variable. */ + const int int_mul_bit_factor; + + /* udiv/sdiv */ + const int int_div; + + /* divX */ + const int int_divX; + + /* movcc, movr */ + const int int_cmove; + + /* penalty for shifts, due to scheduling rules etc. */ + const int shift_penalty; + }; + + extern const struct processor_costs *sparc_costs; + /* Target CPU builtins. FIXME: Defining sparc is for the benefit of Solaris only; otherwise just define __sparc__. Sadly the headers are such a mess there is no Solaris-specific header. */ ^ permalink raw reply [flat|nested] 8+ messages in thread
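The table-driven approach in this patch can be sketched in isolation. The following is a minimal standalone illustration, not the GCC code: the `toy_cpu` enum, the two-field `toy_costs` struct, and every `toy_` function name are hypothetical stand-ins. Only the cost numbers are copied from the tables in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the processor enumeration.  */
enum toy_cpu { TOY_CYPRESS, TOY_SUPERSPARC, TOY_ULTRASPARC3 };

/* Hypothetical two-field stand-in for struct processor_costs.  */
struct toy_costs
{
  int float_plusminus;   /* fadd, fsub */
  int float_mul;         /* fmul */
};

/* Values taken from the corresponding rows of the patch.  */
static const struct toy_costs toy_cypress = { 5, 7 };
static const struct toy_costs toy_supersparc = { 3, 3 };
static const struct toy_costs toy_ultrasparc3 = { 4, 4 };

/* Pick the table once, the way sparc_override_options sets sparc_costs.
   Unknown processors fall back to the Cypress table, matching the
   default initializer of the pointer in the patch.  */
const struct toy_costs *
toy_select_costs (enum toy_cpu cpu)
{
  switch (cpu)
    {
    case TOY_SUPERSPARC:
      return &toy_supersparc;
    case TOY_ULTRASPARC3:
      return &toy_ultrasparc3;
    case TOY_CYPRESS:
    default:
      return &toy_cypress;
    }
}

/* A use site then reads a field instead of switching on the CPU.  */
int
toy_fmul_cost (enum toy_cpu cpu)
{
  const struct toy_costs *costs = toy_select_costs (cpu);

  assert (costs != NULL);
  return costs->float_mul;
}
```

With this layout, adding a processor means adding one table and one `case` label, while the use sites stay unchanged.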
* Re: [PATCH] Sparc rtx costs
From: Richard Henderson @ 2004-07-10 2:22 UTC
To: David S. Miller; +Cc: gcc-patches

On Fri, Jul 09, 2004 at 04:01:20PM -0700, David S. Miller wrote:
> + struct processor_costs cypress_costs = {
> +   2, /* int load */
...
> !       *total = sparc_costs->int_load;

At some point all of these need to be scaled by COSTS_N_INSNS.
Personally, I recommend in the processor_costs struct.

r~
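The scaling concern can be illustrated with a small sketch. It assumes that `COSTS_N_INSNS (N)` expands to `(N) * 4`, which is an assumption about GCC's rtl.h of that era rather than something shown in this thread. The one-field `toy_costs` struct and the `toy_` functions are hypothetical. The arithmetic mirrors the MEM/ZERO_EXTEND case of the first patch, which subtracts `COSTS_N_INSNS (1)` from a table entry.

```c
#include <assert.h>

/* Assumption: the macro multiplies an instruction count by 4.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical one-field stand-in for struct processor_costs.  */
struct toy_costs
{
  int int_zload;   /* int zeroed load */
};

/* Table as in the first patch: a bare instruction count.  */
static const struct toy_costs toy_cypress_unscaled = { 2 };

/* Table as suggested in the reply: scaled once, in the initializer.  */
static const struct toy_costs toy_cypress_scaled = { COSTS_N_INSNS (2) };

/* Load cost minus the one instruction already charged for the
   extension, as in the MEM case of sparc_rtx_costs.  */
int
toy_zload_cost (const struct toy_costs *costs)
{
  assert (costs != 0);
  return costs->int_zload - COSTS_N_INSNS (1);
}

int
toy_zload_cost_unscaled (void)
{
  return toy_zload_cost (&toy_cypress_unscaled);
}

int
toy_zload_cost_scaled (void)
{
  return toy_zload_cost (&toy_cypress_scaled);
}
```

Under the stated assumption, the unscaled table mixes two units and the subtraction goes negative, while the scaled table yields the cost of one instruction. Scaling in the initializers keeps every use site in a single unit.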
* Re: [PATCH] Sparc rtx costs 2004-07-10 2:22 ` Richard Henderson @ 2004-07-10 3:46 ` David S. Miller 0 siblings, 0 replies; 8+ messages in thread From: David S. Miller @ 2004-07-10 3:46 UTC (permalink / raw) To: Richard Henderson; +Cc: gcc-patches On Fri, 9 Jul 2004 18:03:54 -0700 Richard Henderson <rth@redhat.com> wrote: > At some point all of these need to be scaled by COSTS_N_INSNS. > Personally, I recommend in the processor_costs struct. Works for me, applied to mainline. 2004-07-09 David S. Miller <davem@nuts.davemloft.net> * config/sparc/sparc.c (*_costs): Scale instruction costs by COSTS_N_INSNS. (sparc_rtx_costs): Adjust as appropriate. Index: config/sparc/sparc.c =================================================================== RCS file: /cvs/gcc/gcc/gcc/config/sparc/sparc.c,v retrieving revision 1.317 diff -c -r1.317 sparc.c *** config/sparc/sparc.c 9 Jul 2004 22:59:32 -0000 1.317 --- config/sparc/sparc.c 10 Jul 2004 01:46:15 -0000 *************** *** 52,196 **** /* Processor costs */ static const struct processor_costs cypress_costs = { ! 2, /* int load */ ! 2, /* int signed load */ ! 2, /* int zeroed load */ ! 2, /* float load */ ! 5, /* fmov, fneg, fabs */ ! 5, /* fadd, fsub */ ! 1, /* fcmp */ ! 1, /* fmov, fmovr */ ! 7, /* fmul */ ! 37, /* fdivs */ ! 37, /* fdivd */ ! 63, /* fsqrts */ ! 63, /* fsqrtd */ ! 1, /* imul */ ! 1, /* imulX */ 0, /* imul bit factor */ ! 1, /* idiv */ ! 1, /* idivX */ ! 1, /* movcc/movr */ 0, /* shift penalty */ }; static const struct processor_costs supersparc_costs = { ! 1, /* int load */ ! 1, /* int signed load */ ! 1, /* int zeroed load */ ! 0, /* float load */ ! 3, /* fmov, fneg, fabs */ ! 3, /* fadd, fsub */ ! 3, /* fcmp */ ! 1, /* fmov, fmovr */ ! 3, /* fmul */ ! 6, /* fdivs */ ! 9, /* fdivd */ ! 12, /* fsqrts */ ! 12, /* fsqrtd */ ! 4, /* imul */ ! 4, /* imulX */ 0, /* imul bit factor */ ! 4, /* idiv */ ! 4, /* idivX */ ! 
1, /* movcc/movr */ 1, /* shift penalty */ }; static const struct processor_costs hypersparc_costs = { ! 1, /* int load */ ! 1, /* int signed load */ ! 1, /* int zeroed load */ ! 1, /* float load */ ! 1, /* fmov, fneg, fabs */ ! 1, /* fadd, fsub */ ! 1, /* fcmp */ ! 1, /* fmov, fmovr */ ! 1, /* fmul */ ! 8, /* fdivs */ ! 12, /* fdivd */ ! 17, /* fsqrts */ ! 17, /* fsqrtd */ ! 17, /* imul */ ! 17, /* imulX */ 0, /* imul bit factor */ ! 17, /* idiv */ ! 17, /* idivX */ ! 1, /* movcc/movr */ 0, /* shift penalty */ }; static const struct processor_costs sparclet_costs = { ! 3, /* int load */ ! 3, /* int signed load */ ! 1, /* int zeroed load */ ! 1, /* float load */ ! 1, /* fmov, fneg, fabs */ ! 1, /* fadd, fsub */ ! 1, /* fcmp */ ! 1, /* fmov, fmovr */ ! 1, /* fmul */ ! 1, /* fdivs */ ! 1, /* fdivd */ ! 1, /* fsqrts */ ! 1, /* fsqrtd */ ! 5, /* imul */ ! 5, /* imulX */ 0, /* imul bit factor */ ! 5, /* idiv */ ! 5, /* idivX */ ! 1, /* movcc/movr */ 0, /* shift penalty */ }; static const struct processor_costs ultrasparc_costs = { ! 2, /* int load */ ! 3, /* int signed load */ ! 2, /* int zeroed load */ ! 2, /* float load */ ! 1, /* fmov, fneg, fabs */ ! 4, /* fadd, fsub */ ! 1, /* fcmp */ ! 2, /* fmov, fmovr */ ! 4, /* fmul */ ! 13, /* fdivs */ ! 23, /* fdivd */ ! 13, /* fsqrts */ ! 23, /* fsqrtd */ ! 4, /* imul */ ! 4, /* imulX */ 2, /* imul bit factor */ ! 37, /* idiv */ ! 68, /* idivX */ ! 2, /* movcc/movr */ 2, /* shift penalty */ }; static const struct processor_costs ultrasparc3_costs = { ! 2, /* int load */ ! 3, /* int signed load */ ! 3, /* int zeroed load */ ! 2, /* float load */ ! 3, /* fmov, fneg, fabs */ ! 4, /* fadd, fsub */ ! 5, /* fcmp */ ! 3, /* fmov, fmovr */ ! 4, /* fmul */ ! 17, /* fdivs */ ! 20, /* fdivd */ ! 20, /* fsqrts */ ! 29, /* fsqrtd */ ! 6, /* imul */ ! 6, /* imulX */ 0, /* imul bit factor */ ! 40, /* idiv */ ! 71, /* idivX */ ! 
2, /* movcc/movr */ 0, /* shift penalty */ }; --- 52,196 ---- /* Processor costs */ static const struct processor_costs cypress_costs = { ! COSTS_N_INSNS (2), /* int load */ ! COSTS_N_INSNS (2), /* int signed load */ ! COSTS_N_INSNS (2), /* int zeroed load */ ! COSTS_N_INSNS (2), /* float load */ ! COSTS_N_INSNS (5), /* fmov, fneg, fabs */ ! COSTS_N_INSNS (5), /* fadd, fsub */ ! COSTS_N_INSNS (1), /* fcmp */ ! COSTS_N_INSNS (1), /* fmov, fmovr */ ! COSTS_N_INSNS (7), /* fmul */ ! COSTS_N_INSNS (37), /* fdivs */ ! COSTS_N_INSNS (37), /* fdivd */ ! COSTS_N_INSNS (63), /* fsqrts */ ! COSTS_N_INSNS (63), /* fsqrtd */ ! COSTS_N_INSNS (1), /* imul */ ! COSTS_N_INSNS (1), /* imulX */ 0, /* imul bit factor */ ! COSTS_N_INSNS (1), /* idiv */ ! COSTS_N_INSNS (1), /* idivX */ ! COSTS_N_INSNS (1), /* movcc/movr */ 0, /* shift penalty */ }; static const struct processor_costs supersparc_costs = { ! COSTS_N_INSNS (1), /* int load */ ! COSTS_N_INSNS (1), /* int signed load */ ! COSTS_N_INSNS (1), /* int zeroed load */ ! COSTS_N_INSNS (0), /* float load */ ! COSTS_N_INSNS (3), /* fmov, fneg, fabs */ ! COSTS_N_INSNS (3), /* fadd, fsub */ ! COSTS_N_INSNS (3), /* fcmp */ ! COSTS_N_INSNS (1), /* fmov, fmovr */ ! COSTS_N_INSNS (3), /* fmul */ ! COSTS_N_INSNS (6), /* fdivs */ ! COSTS_N_INSNS (9), /* fdivd */ ! COSTS_N_INSNS (12), /* fsqrts */ ! COSTS_N_INSNS (12), /* fsqrtd */ ! COSTS_N_INSNS (4), /* imul */ ! COSTS_N_INSNS (4), /* imulX */ 0, /* imul bit factor */ ! COSTS_N_INSNS (4), /* idiv */ ! COSTS_N_INSNS (4), /* idivX */ ! COSTS_N_INSNS (1), /* movcc/movr */ 1, /* shift penalty */ }; static const struct processor_costs hypersparc_costs = { ! COSTS_N_INSNS (1), /* int load */ ! COSTS_N_INSNS (1), /* int signed load */ ! COSTS_N_INSNS (1), /* int zeroed load */ ! COSTS_N_INSNS (1), /* float load */ ! COSTS_N_INSNS (1), /* fmov, fneg, fabs */ ! COSTS_N_INSNS (1), /* fadd, fsub */ ! COSTS_N_INSNS (1), /* fcmp */ ! COSTS_N_INSNS (1), /* fmov, fmovr */ ! 
COSTS_N_INSNS (1), /* fmul */ ! COSTS_N_INSNS (8), /* fdivs */ ! COSTS_N_INSNS (12), /* fdivd */ ! COSTS_N_INSNS (17), /* fsqrts */ ! COSTS_N_INSNS (17), /* fsqrtd */ ! COSTS_N_INSNS (17), /* imul */ ! COSTS_N_INSNS (17), /* imulX */ 0, /* imul bit factor */ ! COSTS_N_INSNS (17), /* idiv */ ! COSTS_N_INSNS (17), /* idivX */ ! COSTS_N_INSNS (1), /* movcc/movr */ 0, /* shift penalty */ }; static const struct processor_costs sparclet_costs = { ! COSTS_N_INSNS (3), /* int load */ ! COSTS_N_INSNS (3), /* int signed load */ ! COSTS_N_INSNS (1), /* int zeroed load */ ! COSTS_N_INSNS (1), /* float load */ ! COSTS_N_INSNS (1), /* fmov, fneg, fabs */ ! COSTS_N_INSNS (1), /* fadd, fsub */ ! COSTS_N_INSNS (1), /* fcmp */ ! COSTS_N_INSNS (1), /* fmov, fmovr */ ! COSTS_N_INSNS (1), /* fmul */ ! COSTS_N_INSNS (1), /* fdivs */ ! COSTS_N_INSNS (1), /* fdivd */ ! COSTS_N_INSNS (1), /* fsqrts */ ! COSTS_N_INSNS (1), /* fsqrtd */ ! COSTS_N_INSNS (5), /* imul */ ! COSTS_N_INSNS (5), /* imulX */ 0, /* imul bit factor */ ! COSTS_N_INSNS (5), /* idiv */ ! COSTS_N_INSNS (5), /* idivX */ ! COSTS_N_INSNS (1), /* movcc/movr */ 0, /* shift penalty */ }; static const struct processor_costs ultrasparc_costs = { ! COSTS_N_INSNS (2), /* int load */ ! COSTS_N_INSNS (3), /* int signed load */ ! COSTS_N_INSNS (2), /* int zeroed load */ ! COSTS_N_INSNS (2), /* float load */ ! COSTS_N_INSNS (1), /* fmov, fneg, fabs */ ! COSTS_N_INSNS (4), /* fadd, fsub */ ! COSTS_N_INSNS (1), /* fcmp */ ! COSTS_N_INSNS (2), /* fmov, fmovr */ ! COSTS_N_INSNS (4), /* fmul */ ! COSTS_N_INSNS (13), /* fdivs */ ! COSTS_N_INSNS (23), /* fdivd */ ! COSTS_N_INSNS (13), /* fsqrts */ ! COSTS_N_INSNS (23), /* fsqrtd */ ! COSTS_N_INSNS (4), /* imul */ ! COSTS_N_INSNS (4), /* imulX */ 2, /* imul bit factor */ ! COSTS_N_INSNS (37), /* idiv */ ! COSTS_N_INSNS (68), /* idivX */ ! COSTS_N_INSNS (2), /* movcc/movr */ 2, /* shift penalty */ }; static const struct processor_costs ultrasparc3_costs = { ! COSTS_N_INSNS (2), /* int load */ ! 
COSTS_N_INSNS (3), /* int signed load */ ! COSTS_N_INSNS (3), /* int zeroed load */ ! COSTS_N_INSNS (2), /* float load */ ! COSTS_N_INSNS (3), /* fmov, fneg, fabs */ ! COSTS_N_INSNS (4), /* fadd, fsub */ ! COSTS_N_INSNS (5), /* fcmp */ ! COSTS_N_INSNS (3), /* fmov, fmovr */ ! COSTS_N_INSNS (4), /* fmul */ ! COSTS_N_INSNS (17), /* fdivs */ ! COSTS_N_INSNS (20), /* fdivd */ ! COSTS_N_INSNS (20), /* fsqrts */ ! COSTS_N_INSNS (29), /* fsqrtd */ ! COSTS_N_INSNS (6), /* imul */ ! COSTS_N_INSNS (6), /* imulX */ 0, /* imul bit factor */ ! COSTS_N_INSNS (40), /* idiv */ ! COSTS_N_INSNS (71), /* idivX */ ! COSTS_N_INSNS (2), /* movcc/movr */ 0, /* shift penalty */ }; *************** *** 8350,8361 **** if (nbits < 3) nbits = 3; bit_cost = (nbits - 3) / sparc_costs->int_mul_bit_factor; } if (mode == DImode) ! *total = COSTS_N_INSNS (sparc_costs->int_mulX) + bit_cost; else ! *total = COSTS_N_INSNS (sparc_costs->int_mul) + bit_cost; } return false; --- 8350,8362 ---- if (nbits < 3) nbits = 3; bit_cost = (nbits - 3) / sparc_costs->int_mul_bit_factor; + bit_cost = COSTS_N_INSNS (bit_cost); } if (mode == DImode) ! *total = sparc_costs->int_mulX + bit_cost; else ! *total = sparc_costs->int_mul + bit_cost; } return false; ^ permalink raw reply [flat|nested] 8+ messages in thread
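The integer multiply arithmetic after this follow-up can be sketched as a standalone function. This is an illustration under assumptions, not the GCC code: `COSTS_N_INSNS (N)` is assumed to expand to `(N) * 4`, the function names and the `have_constant` flag are hypothetical, and only the CONST_INT path of the patch is modelled. As in the patch, the sketch counts the set bits of the constant operand and falls back to 7 when the operand is not a constant.

```c
#include <assert.h>

/* Assumption: the macro multiplies an instruction count by 4.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Count set bits by clearing the lowest set bit on each iteration,
   the same loop shape the patch uses.  */
int
toy_count_set_bits (unsigned long value)
{
  int nbits = 0;

  for (; value != 0; value &= value - 1)
    nbits++;
  return nbits;
}

/* BASE_COST is an already scaled int_mul or int_mulX table entry.
   BIT_FACTOR is the unscaled int_mul_bit_factor; zero means the
   multiply latency is fixed.  */
int
toy_int_mul_cost (int base_cost, int bit_factor,
                  int have_constant, unsigned long constant)
{
  int bit_cost = 0;

  assert (bit_factor >= 0);
  if (bit_factor)
    {
      int nbits = have_constant ? toy_count_set_bits (constant) : 7;

      if (nbits < 3)
        nbits = 3;
      /* Divide first, then scale, as the follow-up patch does.  */
      bit_cost = COSTS_N_INSNS ((nbits - 3) / bit_factor);
    }
  return base_cost + bit_cost;
}
```

With the UltraSPARC row (`COSTS_N_INSNS (4)`, bit factor 2), a non-constant operand adds two instructions to the base cost. With a bit factor of 0, as in the UltraSPARC-III row, the base cost is returned unchanged.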
* Re: [PATCH] Sparc rtx costs
From: Eric Botcazou @ 2004-07-15 6:53 UTC
To: David S. Miller; +Cc: gcc-patches

> This cleans up the sparc_rtx_costs implementation and
> makes the sparc backend easier to maintain and tweak.

I need your help here because the modifications introduced are not crystal
clear to me:

1. false/true mutation. For example:

    case IF_THEN_ELSE:
      /* Conditional moves. */
      switch (sparc_cpu)
        {
        case PROCESSOR_ULTRASPARC:
          *total = COSTS_N_INSNS (2);
          return true;

        case PROCESSOR_ULTRASPARC3:
          if (FLOAT_MODE_P (GET_MODE (x)))
            *total = COSTS_N_INSNS (3);
          else
            *total = COSTS_N_INSNS (2);
          return true;

        default:
          *total = COSTS_N_INSNS (1);
          return true;
        }

was turned into:

    case IF_THEN_ELSE:
      if (float_mode_p)
        *total = sparc_costs->float_cmove;
      else
        *total = sparc_costs->int_cmove;
      return false;

2. Changes in the costs. For example:

    case COMPARE:
      if (FLOAT_MODE_P (GET_MODE (x)))
        {
          switch (sparc_cpu)
            {
            case PROCESSOR_ULTRASPARC:
            case PROCESSOR_ULTRASPARC3:
              *total = COSTS_N_INSNS (1);
              return true;

            case PROCESSOR_SUPERSPARC:
              *total = COSTS_N_INSNS (3);
              return true;

            case PROCESSOR_CYPRESS:
              *total = COSTS_N_INSNS (5);
              return true;

            case PROCESSOR_HYPERSPARC:
            case PROCESSOR_SPARCLITE86X:
            default:
              *total = COSTS_N_INSNS (1);
              return true;
            }
        }

was turned into:

    struct processor_costs cypress_costs = {
      COSTS_N_INSNS (1), /* fcmp */

    struct processor_costs ultrasparc3_costs = {
      COSTS_N_INSNS (5), /* fcmp */

3. Integer multiplication costs. Originally:

      /* The latency is actually variable for Ultra-I/II
         And if one of the inputs have a known constant
         value, we could calculate this precisely.
      [...]
      if (sparc_cpu == PROCESSOR_ULTRASPARC)
        {
          *total = (GET_MODE (x) == DImode
                    ? COSTS_N_INSNS (34) : COSTS_N_INSNS (19));
          return true;
        }

      /* Multiply latency on Ultra-III, fortunately, is constant. */
      if (sparc_cpu == PROCESSOR_ULTRASPARC3)
        {
          *total = COSTS_N_INSNS (6);
          return true;
        }

      if (sparc_cpu == PROCESSOR_HYPERSPARC
          || sparc_cpu == PROCESSOR_SPARCLITE86X)
        {
          *total = COSTS_N_INSNS (17);
          return true;
        }

      *total = (TARGET_HARD_MUL ? COSTS_N_INSNS (5) : COSTS_N_INSNS (25));
      return true;

The cost algorithm appears to apply only to Ultra I and II.

Note also the final check against TARGET_HARD_MUL, which predicates
the availability of the "smul" instruction.

The check was lifted by your patch and, as a consequence, we emit libcalls
all over the place in default (V7) 32-bit code. For example, in reload.c:

    for (i = 0; i < n_reloads; i++)
      if (rld[i].in && ! rld[i].optional && ! rld[i].nocombine
          && rld[i].when_needed != RELOAD_FOR_OUTPUT_ADDRESS
          && rld[i].when_needed != RELOAD_FOR_OUTADDR_ADDRESS
          && rld[i].when_needed != RELOAD_OTHER
          && ((((rld[i].class) == FP_REGS || (rld[i].class) == EXTRA_FP_REGS)
               ? (((unsigned short) mode_size[rld[i].inmode]) + 3) / 4
               : (((unsigned short) mode_size[rld[i].inmode])
                  + ((! (! (target_flags & 0x10000))) ? 8 : 4) - 1)
                 / ((! (! (target_flags & 0x10000))) ? 8 : 4))
              == (((rld[output_reload].class) == FP_REGS
                   || (rld[output_reload].class) == EXTRA_FP_REGS)
                  ? (((unsigned short) mode_size[rld[output_reload].outmode]) + 3) / 4
                  : (((unsigned short) mode_size[rld[output_reload].outmode])
                     + ((! (! (target_flags & 0x10000))) ? 8 : 4) - 1)
                    / ((! (! (target_flags & 0x10000))) ? 8 : 4)))
          && rld[i].inc == 0
          && rld[i].reg_rtx == 0

Every single reference to rld[something] that still exists when the RTL
expanders are invoked is emitted as a libcall.

--
Eric Botcazou
* Re: [PATCH] Sparc rtx costs
From: David S. Miller @ 2004-07-21 10:42 UTC
To: Eric Botcazou; +Cc: gcc-patches

On Wed, 14 Jul 2004 23:42:12 +0200
Eric Botcazou <ebotcazou@libertysurf.fr> wrote:

> For example:
>
>     case IF_THEN_ELSE:
 ...
> was turned into:
>
>     case IF_THEN_ELSE:

Point out what was different effectively please. :-)

All the UltraSPARC costs were tweaked to be accurate as
per the foo.md DFA scheduler descriptions.

All the other processors were given "1" for both float and non-float
cases which matches the default: in that case statement we
used to have there.

> 2. Changes in the costs.
 ...
> was turned into:
>
> struct processor_costs cypress_costs = {
>   COSTS_N_INSNS (1), /* fcmp */
>
> struct processor_costs ultrasparc3_costs = {
>   COSTS_N_INSNS (5), /* fcmp */

More accurate, and matches ultra*.md files.

> The cost algorithm appears to apply only to Ultra I and II.

Correct, because only Ultra I and II actually have variable
cost multiply. Ultra3 is fixed cost.

> Note also the final check against TARGET_HARD_MUL, which predicates
> the availability of the "smul" instruction.
>
> The check was lifted by your patch and, as a consequence, we emit libcalls
> all over the place in default (V7) 32-bit code. For example, in reload.c:

That's a bug, I'll fix it up.
Thanks.
* Re: [PATCH] Sparc rtx costs
  2004-07-21 10:42 ` David S. Miller
@ 2004-07-21 14:19 ` David S. Miller
  2004-09-03 6:47 ` Eric Botcazou
  1 sibling, 0 replies; 8+ messages in thread
From: David S. Miller @ 2004-07-21 14:19 UTC (permalink / raw)
  To: David S. Miller; +Cc: ebotcazou, gcc-patches

On Tue, 20 Jul 2004 17:20:06 -0700
"David S. Miller" <davem@redhat.com> wrote:

> > Note also the final check against TARGET_HARD_MUL, which predicates
> > the availability of the "smul" instruction.
> >
> > The check was lifted by your patch and, as a consequence, we emit libcalls
> > all over the place in default (V7) 32-bit code. For example, in reload.c:
>
> That's a bug, I'll fix it up.
> Thanks.

This should fix it. Applied to mainline.

2004-07-20 David S. Miller <davem@nuts.davemloft.net>

        * config/sparc/sparc.c (sparc_rtx_costs case MULT): Emit
        enormous cost if not TARGET_HARD_MUL.

Index: config/sparc/sparc.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/sparc/sparc.c,v
retrieving revision 1.324
diff -c -r1.324 sparc.c
*** config/sparc/sparc.c 20 Jul 2004 07:27:16 -0000 1.324
--- config/sparc/sparc.c 21 Jul 2004 00:26:28 -0000
***************
*** 8383,8388 ****
--- 8383,8390 ----
      case MULT:
        if (float_mode_p)
          *total = sparc_costs->float_mul;
+       else if (! TARGET_HARD_MUL)
+         return COSTS_N_INSNS (25);
        else
          {
            int bit_cost;

^ permalink raw reply [flat|nested] 8+ messages in thread
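[Editorial sketch, not part of the original message.] The intent of the hunk above, a large multiply cost when no hardware multiply is available, might look like the following in isolation. This is only a sketch: `mult_cost_sketch` and its parameters are hypothetical, the variable-latency bit-cost branch is left out, and the scaling factor in `COSTS_N_INSNS` is assumed. Note that the posted hunk uses `return COSTS_N_INSNS (25);` whereas the neighbouring branch assigns to `*total`; the sketch assumes the assignment form was intended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scaling macro in the spirit of GCC's COSTS_N_INSNS; the factor is an
   assumption made for this sketch.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical stand-in for the MULT case of a cost hook.  FLOAT_MUL and
   INT_MUL play the role of per-CPU table entries; HARD_MUL plays the role
   of TARGET_HARD_MUL.  The cost is stored through TOTAL; the return value
   only says whether the caller should still cost the operands.  */
static bool
mult_cost_sketch (bool float_mode_p, bool hard_mul,
                  int float_mul, int int_mul, int *total)
{
  assert (total != NULL);

  if (float_mode_p)
    *total = float_mul;
  else if (!hard_mul)
    /* No multiply instruction: make the libcall look expensive.  */
    *total = COSTS_N_INSNS (25);
  else
    *total = int_mul;

  /* Let the caller add the operand costs.  */
  return false;
}
```

A caller without hardware multiply would then see the large figure regardless of the per-CPU integer multiply entry.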
* Re: [PATCH] Sparc rtx costs
  2004-07-21 10:42 ` David S. Miller
  2004-07-21 14:19 ` David S. Miller
@ 2004-09-03 6:47 ` Eric Botcazou
  2004-09-03 6:53 ` David S. Miller
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2004-09-03 6:47 UTC (permalink / raw)
  To: David S. Miller; +Cc: gcc-patches

[Sorry for the big delay]

> > For example:
> >
> >     case IF_THEN_ELSE:
>
> ...
>
> > was turned into:
> >
> >     case IF_THEN_ELSE:
>
> Point out what was different effectively please. :-)

You changed the return value for all the operators from 'true' to 'false',
right? While this sounds theoretically nice, I think this implicitly assumes
a near-perfect uniformity in the cost distribution for every insn on every
cpu. Are you sure this is effectively the case, especially for older
processors?

> > struct processor_costs cypress_costs = {
> >   COSTS_N_INSNS (1), /* fcmp */
> >
> > struct processor_costs ultrasparc3_costs = {
> >   COSTS_N_INSNS (5), /* fcmp */
>
> More accurate, and matches ultra*.md files.

So the previous costs were all that broken? x5 in cost is quite a big change.

> > The cost algorithm appears to apply only to Ultra I and II.
>
> Correct, because only Ultra I and II actually have variable
> cost multiply. Ultra3 is fixed cost.

Ok. I think I only wondered why you had scrapped the comment.

> That's a bug, I'll fix it up. Thanks.

This restored bootstrap with BOOT_CFLAGS=-O in the process (broken by an
obscure bug involving libcalls, CSE and loop IM that I was too lazy to fix).

--
Eric Botcazou

^ permalink raw reply [flat|nested] 8+ messages in thread
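[Editorial sketch, not part of the original message.] The 'true' versus 'false' question above concerns how a cost hook's return value is interpreted: returning true means the stored total is final, returning false means the caller goes on to add the cost of the operands. The toy walker below illustrates that difference. Everything in it is hypothetical (`expr_sketch`, `hook_sketch`, `cost_sketch`, the cost figures and the scaling factor); it is not GCC's rtx_cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scaling macro in the spirit of GCC's COSTS_N_INSNS; the factor is an
   assumption made for this sketch.  */
#define COSTS_N_INSNS(N) ((N) * 4)

enum code_sketch { SK_REG, SK_PLUS, SK_MULT };

/* Toy expression node with at most two operands.  */
struct expr_sketch
{
  enum code_sketch code;
  const struct expr_sketch *op[2];
};

/* Hypothetical hook: stores the cost of X itself in *TOTAL and returns
   true when that figure is final.  FINAL_FOR_MULT selects the old style
   (true) or the new style (false) for multiplies.  */
static bool
hook_sketch (const struct expr_sketch *x, int *total, bool final_for_mult)
{
  switch (x->code)
    {
    case SK_REG:
      *total = 0;
      return true;
    case SK_MULT:
      *total = COSTS_N_INSNS (5);
      return final_for_mult;
    default:
      *total = COSTS_N_INSNS (1);
      return false;
    }
}

/* Generic walker: recurse into the operands only when the hook says the
   figure is not final.  */
static int
cost_sketch (const struct expr_sketch *x, bool final_for_mult)
{
  int total = 0;

  assert (x != NULL);
  if (hook_sketch (x, &total, final_for_mult))
    return total;

  for (size_t i = 0; i < 2; i++)
    if (x->op[i] != NULL)
      total += cost_sketch (x->op[i], final_for_mult);
  return total;
}
```

For a multiply whose first operand is an addition, the 'true' style reports only the multiply cost, while the 'false' style also charges for the addition underneath it.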
* Re: [PATCH] Sparc rtx costs
  2004-09-03 6:47 ` Eric Botcazou
@ 2004-09-03 6:53 ` David S. Miller
  0 siblings, 0 replies; 8+ messages in thread
From: David S. Miller @ 2004-09-03 6:53 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Fri, 3 Sep 2004 08:48:47 +0200
Eric Botcazou <ebotcazou@libertysurf.fr> wrote:

> [Sorry for the big delay]
>
> > > For example:
> > >
> > >     case IF_THEN_ELSE:
> >
> > ...
> >
> > > was turned into:
> > >
> > >     case IF_THEN_ELSE:
> >
> > Point out what was different effectively please. :-)
>
> You changed the return value for all the operators from 'true' to 'false',
> right?

Note that IF_THEN_ELSE will only appear for v9 output (integer and
floating point conditional moves) which effectively means UltraSPARC
variants.

^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2004-09-03 6:50 UTC | newest]

Thread overview: 8+ messages
2004-07-09 23:39 [PATCH] Sparc rtx costs David S. Miller
2004-07-10 2:22 ` Richard Henderson
2004-07-10 3:46 ` David S. Miller
2004-07-15 6:53 ` Eric Botcazou
2004-07-21 10:42 ` David S. Miller
2004-07-21 14:19 ` David S. Miller
2004-09-03 6:47 ` Eric Botcazou
2004-09-03 6:53 ` David S. Miller