From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]
To: GCC Patches
Cc: Vladimir Makarov, bergner@linux.ibm.com, Bill Schmidt, Segher Boessenkool, Richard Sandiford, crazylht@gmail.com
From: "Kewen.Lin"
Message-ID: <8a5fd52a-1cc9-6563-ee6c-f345b489654c@linux.ibm.com>
Date: Mon, 28 Jun 2021 14:26:18 +0800
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------974AE30CFB6EE08B46689DC1"
Content-Language: en-US
List-Id: Gcc-patches mailing list

This is a multi-part message in MIME format.
--------------974AE30CFB6EE08B46689DC1
Content-Type: text/plain; charset=gbk
Content-Transfer-Encoding: 8bit

Hi!

on 2021/6/9 1:18 PM, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> PR100328 has some details about this issue; I am trying to
> summarize it here.  In the hottest function LBM_performStreamCollideTRT
> of SPEC2017 bmk 519.lbm_r, there are many FMA style expressions
> (27 FMA, 19 FMS, 11 FNMA).  On rs6000, this kind of FMA style
> insn has two flavors: FLOAT_REG and VSX_REG.  The VSX_REG reg
> class has 64 registers, the first 32 of which make up the
> whole FLOAT_REG.  There are some differences between these two
> flavors, taking "*fma<mode>4_fpr" as an example:
>
> (define_insn "*fma<mode>4_fpr"
>   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa,wa")
>         (fma:SFDF
>           (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,wa,wa")
>           (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,wa,0")
>           (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,wa")))]
>
> // wa => A VSX register (VSR), vs0…vs63, aka. VSX_REG.
> // <Ff> (f/d) => A floating point register, aka. FLOAT_REG.
>
> So for VSX_REG, we only have the destructive form: when a VSX_REG
> alternative is used, operand 2 or operand 3 is required
> to be the same as operand 0.  reload has to take care of this
> constraint and create some non-free register copies if required.
>
> Assuming one fma insn looks like:
>   op0 = FMA (op1, op2, op3)
>
> The best regclass of them is VSX_REG.  When op1, op2, op3 are all dead,
> IRA simply creates three shuffle copies for them (here the operand
> order matters, since with the same freq, the one with the smaller
> number takes preference), but IMO both op2 and op3 should take higher
> priority in the copy queue due to the matching constraint.
>
> I noticed that there is one function ira_get_dup_out_num, which is
> meant to create this kind of constraint copy, but the code below seems
> to refuse to create one if there is an alternative which has a valid
> regclass without needing a spill.
>
>       default:
>         {
>           enum constraint_num cn = lookup_constraint (str);
>           enum reg_class cl = reg_class_for_constraint (cn);
>           if (cl != NO_REGS
>               && !targetm.class_likely_spilled_p (cl))
>             goto fail;
>
>          ...
>
> I cooked up the attached patch to make ira respect this kind of
> matching constraint, guarded with one parameter.  As I stated in the
> PR, I was not sure this is on the right track.  The RFC patch checks
> the matching constraint in all alternatives: if there is one
> alternative with a matching constraint that matches the current
> preferred regclass (or best of allocno?), it will record the output
> operand number and further create one constraint copy for it.
> Normally this copy takes priority over shuffle copies, so the matching
> constraint is more likely to be satisfied, and reload doesn't have to
> create extra copies to meet the matching constraint or the desirable
> register class.
>
> For FMA A,B,C,D, I think ideally copies A/B, A/C, A/D can first stay
> as shuffle copies.  Later, once any of A, B, C, D gets assigned a
> hardware register which is a VSX register (VSX_REG) but not an FP
> register (FLOAT_REG), we have to pay costs if we can NOT go with the
> VSX alternatives, so at that time it's important to respect the
> matching constraint, and we can then increase the freq for the
> remaining copies related to this (A/B, A/C, A/D).  This idea requires
> some side tables to record some information and seems a bit
> complicated in the current framework, so the proposed patch
> aggressively emphasizes the matching constraint at the time of
> creating copies.
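
As a rough illustration of why a constraint copy tends to get ahead of
shuffle copies, here is a minimal sketch of the frequencies involved.
get_freq_for_shuffle_copy is the helper added by the attached patch;
copy_kind and copy_freq are hypothetical names that exist only in this
example and are not part of GCC.

```cpp
#include <cassert>

// Same computation as get_freq_for_shuffle_copy in the attached patch:
// a shuffle copy is not a real move insn, so its frequency is scaled down.
int
get_freq_for_shuffle_copy (int freq)
{
  return freq < 8 ? 1 : freq / 8;
}

// Hypothetical tag, used only for this illustration.
enum class copy_kind { constraint, shuffle };

// Frequency a copy is created with, for an insn with frequency INSN_FREQ.
// A constraint copy keeps the full insn frequency; a shuffle copy gets the
// reduced one.
int
copy_freq (copy_kind kind, int insn_freq)
{
  assert (insn_freq >= 0);
  return kind == copy_kind::constraint
         ? insn_freq
         : get_freq_for_shuffle_copy (insn_freq);
}
```

For an insn with freq 1000, copy_freq gives 1000 for the constraint copy
and 125 for each shuffle copy, so under this simplified model the
constraint copy is the one with the larger freq.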

Compared with the original patch (v1), this patch v3 considers the
following (this should be v2 for this mailing list, but it is bumped
to be consistent with the PR's numbering):

  - Excluding the case where, for one preferred register class, there
    are two or more alternatives, one of which has the matching
    constraint while another doesn't.  For the given operand, even if
    it's assigned a hardware reg which doesn't meet the matching
    constraint, it can simply use the alternative which doesn't have
    the matching constraint, so no register move is needed.  One
    typical case is define_insn *mov<mode>_internal2 on rs6000.  So we
    shouldn't create a constraint copy for it.

  - The possible free register move in the same register class:
    disable this if so, since the register move to meet the constraint
    is considered free.

  - Making it on by default, as suggested by Segher & Vladimir; we
    hope to get rid of the parameter if the benchmarking results look
    good on major targets.

  - Tweaking the cost when either side of the matching constraint is a
    hardware register.  Before this patch, the constraint copy is
    simply taken as a real move insn for pref and conflict cost with
    one hardware register.  After this patch, several input operands
    are allowed to respect the same matching constraint (but in
    different alternatives), so we should treat it like a shuffle copy
    in some cases to avoid over preferring/disparaging.

Please check the PR comments for more details.

This patch was bootstrapped & regtested on powerpc64le-linux-gnu P9
and x86_64-redhat-linux, but has some "XFAIL->XPASS" failures on
aarch64-linux-gnu.  The failure list is attached in the PR, and I think
the new assembly looks improved (expected).

With option Ofast unroll, this patch can help to improve SPEC2017 bmk
508.namd_r +2.42% and 519.lbm_r +2.43% on Power8 while 508.namd_r
+3.02% and 519.lbm_r +3.85% on Power9 without any remarkable
degradations.
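
To make the first item in the list above more concrete, here is a
minimal sketch of the check over the recorded alternatives.  It is a
simplified model, not the code in the patch: operand_alt, alternative
and constraint_copy_wanted_p are hypothetical names for this example
only, and the real code walks recog_op_alt in ira_get_dup_out_num.

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical stand-in for GCC's operand_alternative; it
// keeps only the fields needed for this illustration.
struct operand_alt
{
  bool is_input; // true for an input operand (OP_IN in GCC terms)
  int matches;   // number of the operand it must match, or -1 if none
};

// One alternative of an insn pattern: one entry per operand.
using alternative = std::vector<operand_alt>;

// PREF_CL_ALTS holds the alternatives in which operand OP_NUM can use a
// plain register of its preferred class, i.e. without a matching
// constraint.  Return true if a constraint copy between OP_NUM and the
// output operand OUT is still wanted.  It is given up as soon as one
// recorded alternative has no other input operand matching OUT, since
// that alternative needs no register move at all.
bool
constraint_copy_wanted_p (const std::vector<alternative> &pref_cl_alts,
                          int op_num, int out,
                          bool &single_input_op_has_cstr_p)
{
  assert (op_num >= 0 && out >= 0);
  single_input_op_has_cstr_p = true;
  if (pref_cl_alts.empty ())
    return true;

  for (const alternative &alt : pref_cl_alts)
    {
      bool dup_in_other = false;
      for (int nop = 0; nop < static_cast<int> (alt.size ()); nop++)
        if (alt[nop].is_input && nop != op_num && alt[nop].matches == out)
          {
            dup_in_other = true;
            break;
          }
      if (!dup_in_other)
        return false;
    }

  // Every recorded alternative ties some other input to OUT, so several
  // inputs share the matching constraint.
  single_input_op_has_cstr_p = false;
  return true;
}
```

For the FMA pattern, operand 2 has the alternative (wa, wa, wa, 0) in
which operand 3 matches operand 0, so the copy is kept and the flag is
cleared.  For a mov-like pattern with a plain register alternative and
no other matching input, the function returns false and no constraint
copy is created.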

Since this patch likely benefits x86_64 and aarch64, but I don't have
performance machines with these arches at hand, could someone kindly
help to benchmark it if possible?

Many thanks in advance!

btw, you can simply ignore the part about parameter
ira-consider-dup-in-all-alts (its name/description); it's sort of
stale, and I left it as is for now since we will likely get rid of it.

BR,
Kewen
-----
gcc/ChangeLog:

        * doc/invoke.texi (ira-consider-dup-in-all-alts): Document new
        parameter.
        * ira.c (ira_get_dup_out_num): Adjust as parameter
        param_ira_consider_dup_in_all_alts.
        * params.opt (ira-consider-dup-in-all-alts): New.
        * ira-conflicts.c (process_regs_for_copy): Add one parameter
        single_input_op_has_cstr_p.
        (get_freq_for_shuffle_copy): New function.
        (add_insn_allocno_copies): Adjust as single_input_op_has_cstr_p.
        * ira-int.h (ira_get_dup_out_num): Add one bool parameter.

--------------974AE30CFB6EE08B46689DC1
Content-Type: text/plain; charset=UTF-8; x-mac-type="0"; x-mac-creator="0"; name="ira-v3.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="ira-v3.diff"

---
 gcc/doc/invoke.texi |   6 +++
 gcc/ira-conflicts.c |  91 +++++++++++++++++++++++++++-------
 gcc/ira-int.h       |   2 +-
 gcc/ira.c           | 118 ++++++++++++++++++++++++++++++++++++++++----
 gcc/params.opt      |   4 ++
 5 files changed, 194 insertions(+), 27 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 510f24e55ab..d54cc991d18 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13845,6 +13845,12 @@ of available registers reserved for some other purposes is given
 by this parameter.  Default of the parameter
 is the best found from numerous experiments.
 
+@item ira-consider-dup-in-all-alts
+Make IRA to consider matching constraint (duplicated operand number)
+heavily if that one is with preferred register class, even if there
+is some other choice with an appropriate register class no matter
+which is preferred or not.
+
 @item lra-inheritance-ebb-probability-cutoff
 LRA tries to reuse values reloaded in registers in subsequent insns.
 This optimization is called inheritance.  EBB is used as a region to
diff --git a/gcc/ira-conflicts.c b/gcc/ira-conflicts.c
index d83cfc1c1a7..67c4cdcbc8d 100644
--- a/gcc/ira-conflicts.c
+++ b/gcc/ira-conflicts.c
@@ -233,6 +233,15 @@ go_through_subreg (rtx x, int *offset)
   return reg;
 }
 
+/* Return the recomputed frequency for this shuffle copy or its similar
+   case, since it's not for a real move insn, make it smaller.  */
+
+static int
+get_freq_for_shuffle_copy (int freq)
+{
+  return freq < 8 ? 1 : freq / 8;
+}
+
 /* Process registers REG1 and REG2 in move INSN with execution
    frequency FREQ.  The function also processes the registers in a
    potential move insn (INSN == NULL in this case) with frequency
@@ -240,12 +249,14 @@ go_through_subreg (rtx x, int *offset)
    corresponding allocnos or create a copy involving the corresponding
    allocnos.  The function does nothing if the both registers are hard
    registers.  When nothing is changed, the function returns
-   FALSE.  */
+   FALSE.  SINGLE_INPUT_OP_HAS_CSTR_P is only meaningful when constraint_p
+   is true, see function ira_get_dup_out_num for details.  */
 static bool
-process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p,
-                       rtx_insn *insn, int freq)
+process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p, rtx_insn *insn,
+                       int freq, bool single_input_op_has_cstr_p = true)
 {
-  int allocno_preferenced_hard_regno, cost, index, offset1, offset2;
+  int allocno_preferenced_hard_regno, index, offset1, offset2;
+  int cost, conflict_cost, move_cost;
   bool only_regs_p;
   ira_allocno_t a;
   reg_class_t rclass, aclass;
@@ -306,9 +317,52 @@ process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p,
         return false;
       ira_init_register_move_cost_if_necessary (mode);
       if (HARD_REGISTER_P (reg1))
-        cost = ira_register_move_cost[mode][aclass][rclass] * freq;
+        move_cost = ira_register_move_cost[mode][aclass][rclass];
+      else
+        move_cost = ira_register_move_cost[mode][rclass][aclass];
+
+      if (!single_input_op_has_cstr_p)
+        {
+          /* When this is a constraint copy and the matching constraint
+             doesn't only exist for this given operand but also for some
+             other operand(s), it means saving the possible move cost is
+             not necessary to have reg1 and reg2 use the same hardware
+             register, this hardware preference isn't required to be
+             fixed.  To avoid it to over prefer this hardware register,
+             and over disparage this hardware register on conflicted
+             objects, we need some cost tweaking here, similar to what
+             we do for shuffle copy.  */
+          gcc_assert (constraint_p);
+          int reduced_freq = get_freq_for_shuffle_copy (freq);
+          if (HARD_REGISTER_P (reg1))
+            /* For reg2 = opcode(reg1, reg3 ...), assume that reg3 is a
+               pseudo register which has matching constraint on reg2,
+               even if reg2 isn't assigned by reg1, it's still possible
+               to have no register moves if reg2 and reg3 use the same
+               hardware register.  So to avoid the allocation over
+               prefers reg1, we can just take it as a shuffle copy.  */
+            cost = conflict_cost = move_cost * reduced_freq;
+          else
+            {
+              /* For reg1 = opcode(reg2, reg3 ...), assume that reg3 is a
+                 pseudo register which has matching constraint on reg2,
+                 to save the register move, it's better to assign reg1
+                 to either of reg2 and reg3 (or one of other pseudos like
+                 reg3), it's reasonable to use freq for the cost.  But
+                 for conflict_cost, since reg2 and reg3 conflicts with
+                 each other, both of them has the chance to be assigned
+                 by reg1, assume reg3 has one copy which also conflicts
+                 with reg2, we shouldn't make it less preferred on reg1
+                 since reg3 has the same chance to be assigned by reg1.
+                 So it adjusts the conflic_cost to make it same as what
+                 we use for shuffle copy.  */
+              cost = move_cost * freq;
+              conflict_cost = move_cost * reduced_freq;
+            }
+        }
       else
-        cost = ira_register_move_cost[mode][rclass][aclass] * freq;
+        cost = conflict_cost = move_cost * freq;
+
       do
         {
           ira_allocate_and_set_costs
@@ -317,7 +371,7 @@ process_regs_for_copy (rtx reg1, rtx reg2, bool constraint_p,
           ira_allocate_and_set_costs
             (&ALLOCNO_CONFLICT_HARD_REG_COSTS (a), aclass, 0);
           ALLOCNO_HARD_REG_COSTS (a)[index] -= cost;
-          ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[index] -= cost;
+          ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[index] -= conflict_cost;
           if (ALLOCNO_HARD_REG_COSTS (a)[index] < ALLOCNO_CLASS_COST (a))
             ALLOCNO_CLASS_COST (a) = ALLOCNO_HARD_REG_COSTS (a)[index];
           ira_add_allocno_pref (a, allocno_preferenced_hard_regno, freq);
@@ -420,7 +474,8 @@ add_insn_allocno_copies (rtx_insn *insn)
       operand = recog_data.operand[i];
       if (! REG_SUBREG_P (operand))
         continue;
-      if ((n = ira_get_dup_out_num (i, alts)) >= 0)
+      bool single_input_op_has_cstr_p;
+      if ((n = ira_get_dup_out_num (i, alts, single_input_op_has_cstr_p)) >= 0)
         {
           bound_p[n] = true;
           dup = recog_data.operand[n];
@@ -429,8 +484,8 @@ add_insn_allocno_copies (rtx_insn *insn)
                                 REG_P (operand)
                                 ? operand
                                 : SUBREG_REG (operand)) != NULL_RTX)
-            process_regs_for_copy (operand, dup, true, NULL,
-                                   freq);
+            process_regs_for_copy (operand, dup, true, NULL, freq,
+                                   single_input_op_has_cstr_p);
         }
     }
   for (i = 0; i < recog_data.n_operands; i++)
@@ -440,13 +495,15 @@ add_insn_allocno_copies (rtx_insn *insn)
           && find_reg_note (insn, REG_DEAD,
                             REG_P (operand)
                             ? operand : SUBREG_REG (operand)) != NULL_RTX)
-        /* If an operand dies, prefer its hard register for the output
-           operands by decreasing the hard register cost or creating
-           the corresponding allocno copies.  The cost will not
-           correspond to a real move insn cost, so make the frequency
-           smaller.  */
-        process_reg_shuffles (insn, operand, i, freq < 8 ? 1 : freq / 8,
-                              bound_p);
+        {
+          /* If an operand dies, prefer its hard register for the output
+             operands by decreasing the hard register cost or creating
+             the corresponding allocno copies.  The cost will not
+             correspond to a real move insn cost, so make the frequency
+             smaller.  */
+          int new_freq = get_freq_for_shuffle_copy (freq);
+          process_reg_shuffles (insn, operand, i, new_freq, bound_p);
+        }
     }
 }
 
diff --git a/gcc/ira-int.h b/gcc/ira-int.h
index 31e013b0461..da748626e31 100644
--- a/gcc/ira-int.h
+++ b/gcc/ira-int.h
@@ -971,7 +971,7 @@ extern void ira_debug_disposition (void);
 extern void ira_debug_allocno_classes (void);
 extern void ira_init_register_move_cost (machine_mode);
 extern alternative_mask ira_setup_alts (rtx_insn *);
-extern int ira_get_dup_out_num (int, alternative_mask);
+extern int ira_get_dup_out_num (int, alternative_mask, bool &);
 
 /* ira-build.c */
 
diff --git a/gcc/ira.c b/gcc/ira.c
index 638ef4ea17e..75033a45963 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -1922,9 +1922,15 @@ ira_setup_alts (rtx_insn *insn)
 /* Return the number of the output non-early clobber operand which
    should be the same in any case as operand with number OP_NUM (or
    negative value if there is no such operand).  ALTS is the mask
-   of alternatives that we should consider.  */
+   of alternatives that we should consider.  SINGLE_INPUT_OP_HAS_CSTR_P
+   should be set in this function, it indicates whether there is only
+   a single input operand which has the matching constraint on the
+   output operand with returned position.  If the pattern allows any
+   one of several input operands holds the matching constraint, it's
+   set as false.  One typical case is FMA insn on rs6000.  */
 int
-ira_get_dup_out_num (int op_num, alternative_mask alts)
+ira_get_dup_out_num (int op_num, alternative_mask alts,
+                     bool &single_input_op_has_cstr_p)
 {
   int curr_alt, c, original;
   bool ignore_p, use_commut_op_p;
@@ -1937,10 +1943,42 @@ ira_get_dup_out_num (int op_num, alternative_mask alts)
     return -1;
   str = recog_data.constraints[op_num];
   use_commut_op_p = false;
+  single_input_op_has_cstr_p = true;
+
+  rtx op = recog_data.operand[op_num];
+  int op_no = reg_or_subregno (op);
+  enum reg_class op_pref_cl = reg_preferred_class (op_no);
+  machine_mode op_mode = GET_MODE (op);
+
+  ira_init_register_move_cost_if_necessary (op_mode);
+  /* If the preferred regclass isn't NO_REG, continue to find the matching
+     constraint in all available alternatives with preferred regclass, even
+     if we have found or will find one alternative whose constraint stands
+     for a REG (!NO_REG) regclass.  Note that it would be fine not to
+     respect matching constraint if the register copy is free, so exclude
+     it.  */
+  bool respect_dup_despite_reg_cstr
+    = param_ira_consider_dup_in_all_alts
+      && op_pref_cl != NO_REGS
+      && ira_register_move_cost[op_mode][op_pref_cl][op_pref_cl] > 0;
+
+  /* Record the alternative whose constraint uses the same regclass as the
+     preferred regclass, later if we find one matching constraint for this
+     operand with preferred reclass, we will visit these recorded
+     alternatives to check whether if there is one alternative in which no
+     any INPUT operands have one matching constraint same as our candidate.
+     If yes, it means there is one alternative which is perfectly fine
+     without satisfying this matching constraint.  If no, it means in any
+     alternatives there is one other INPUT operand holding this matching
+     constraint, it's fine to respect this matching constraint and further
+     create this constraint copy since it would become harmless once some
+     other takes preference and it's interfered.  */
+  alternative_mask pref_cl_alts;
+
   for (;;)
     {
-      rtx op = recog_data.operand[op_num];
-
+      pref_cl_alts = 0;
+
       for (curr_alt = 0, ignore_p = !TEST_BIT (alts, curr_alt),
            original = -1;;)
         {
@@ -1963,9 +2001,25 @@ ira_get_dup_out_num (int op_num, alternative_mask alts)
                 {
                   enum constraint_num cn = lookup_constraint (str);
                   enum reg_class cl = reg_class_for_constraint (cn);
-                  if (cl != NO_REGS
-                      && !targetm.class_likely_spilled_p (cl))
-                    goto fail;
+                  if (cl != NO_REGS && !targetm.class_likely_spilled_p (cl))
+                    {
+                      if (respect_dup_despite_reg_cstr)
+                        {
+                          /* If it's free to move from one preferred class to
+                             the one without matching constraint, it doesn't
+                             have to respect this constraint with costs.  */
+                          if (cl != op_pref_cl
+                              && (ira_reg_class_intersect[cl][op_pref_cl]
+                                  != NO_REGS)
+                              && (ira_may_move_in_cost[op_mode][op_pref_cl][cl]
+                                  == 0))
+                            goto fail;
+                          else if (cl == op_pref_cl)
+                            pref_cl_alts |= ALTERNATIVE_BIT (curr_alt);
+                        }
+                      else
+                        goto fail;
+                    }
                   if (constraint_satisfied_p (op, cn))
                     goto fail;
                   break;
@@ -1979,7 +2033,21 @@ ira_get_dup_out_num (int op_num, alternative_mask alts)
                   str = end;
                   if (original != -1 && original != n)
                     goto fail;
-                  original = n;
+                  gcc_assert (n < recog_data.n_operands);
+                  if (respect_dup_despite_reg_cstr)
+                    {
+                      const operand_alternative *op_alt
+                        = &recog_op_alt[curr_alt * recog_data.n_operands];
+                      /* Only respect the one with preferred rclass, without
+                         respect_dup_despite_reg_cstr, it's possible to get
+                         one whose regclass isn't preferred first before,
+                         but it would fail since there should be other
+                         alternatives with preferred regclass.  */
+                      if (op_alt[n].cl == op_pref_cl)
+                        original = n;
+                    }
+                  else
+                    original = n;
                   continue;
                 }
               }
@@ -1988,7 +2056,39 @@ ira_get_dup_out_num (int op_num, alternative_mask alts)
       if (original == -1)
         goto fail;
       if (recog_data.operand_type[original] == OP_OUT)
-        return original;
+        {
+          if (pref_cl_alts == 0)
+            return original;
+          /* Visit these recorded alternatives to check whether if
+             there is one alternative in which no any INPUT operands
+             have one matching constraint same as our candidate.
+             Give up this candidate if so.  */
+          int nop, nalt;
+          for (nalt = 0; nalt < recog_data.n_alternatives; nalt++)
+            {
+              if (!TEST_BIT (pref_cl_alts, nalt))
+                continue;
+              const operand_alternative *op_alt
+                = &recog_op_alt[nalt * recog_data.n_operands];
+              bool dup_in_other = false;
+              for (nop = 0; nop < recog_data.n_operands; nop++)
+                {
+                  if (recog_data.operand_type[nop] != OP_IN)
+                    continue;
+                  if (nop == op_num)
+                    continue;
+                  if (op_alt[nop].matches == original)
+                    {
+                      dup_in_other = true;
+                      break;
+                    }
+                }
+              if (!dup_in_other)
+                return -1;
+            }
+          single_input_op_has_cstr_p = false;
+          return original;
+        }
     fail:
       if (use_commut_op_p)
         break;
diff --git a/gcc/params.opt b/gcc/params.opt
index 18e6036c4f4..5121e3ddc80 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -330,6 +330,10 @@ Max size of conflict table in MB.
 Common Joined UInteger Var(param_ira_max_loops_num) Init(100) Param Optimization
 Max loops number for regional RA.
 
+-param=ira-consider-dup-in-all-alts=
+Common Joined UInteger Var(param_ira_consider_dup_in_all_alts) Init(1) IntegerRange(0, 1) Param Optimization
+Control ira to consider matching constraint (duplicated operand number) heavily if that one is with preferred register class, even if there is some other choice with an appropriate register class no matter which is preferred or not.
+
 -param=iv-always-prune-cand-set-bound=
 Common Joined UInteger Var(param_iv_always_prune_cand_set_bound) Init(10) Param Optimization
 If number of candidates in the set is smaller, we always try to remove unused ivs during its optimization.
-- 
2.17.1


--------------974AE30CFB6EE08B46689DC1--