From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 442463858D39 for ; Wed, 15 Sep 2021 08:40:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 442463858D39 Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.1.2/8.16.0.43) with SMTP id 18F85hRO015555; Wed, 15 Sep 2021 04:39:59 -0400 Received: from ppma06fra.de.ibm.com (48.49.7a9f.ip4.static.sl-reverse.com [159.122.73.72]) by mx0b-001b2d01.pphosted.com with ESMTP id 3b3d150nud-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 15 Sep 2021 04:39:59 -0400 Received: from pps.filterd (ppma06fra.de.ibm.com [127.0.0.1]) by ppma06fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 18F8bC6W002459; Wed, 15 Sep 2021 08:39:57 GMT Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by ppma06fra.de.ibm.com with ESMTP id 3b0kqjudvu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 15 Sep 2021 08:39:57 +0000 Received: from d06av24.portsmouth.uk.ibm.com (d06av24.portsmouth.uk.ibm.com [9.149.105.60]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 18F8dtDx50725316 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 15 Sep 2021 08:39:55 GMT Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2396842047; Wed, 15 Sep 2021 08:39:55 +0000 (GMT) Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E222442042; Wed, 15 Sep 2021 08:39:54 +0000 (GMT) Received: from li-926bd7cc-2dd1-11b2-a85c-f6adc0f5efec.ibm.com (unknown [9.171.48.90]) by d06av24.portsmouth.uk.ibm.com (Postfix) with ESMTPS; Wed, 15 Sep 2021 08:39:54 +0000 (GMT) Subject: Re: [PATCH 1/7] ifcvt: Check if cmovs are needed. To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com References: <20210625160905.23786-1-rdapp@linux.ibm.com> <20210625160905.23786-2-rdapp@linux.ibm.com> <399e9ada-8a0a-e95b-f037-b3f3cb8a6c48@linux.ibm.com> From: Robin Dapp Message-ID: Date: Wed, 15 Sep 2021 10:39:54 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------CB9C5EB28DA7AB2F27234602" Content-Language: en-US X-TM-AS-GCONF: 00 X-Proofpoint-GUID: wctTttB3G8EsMuYS_THc9XdXORBe2Trw X-Proofpoint-ORIG-GUID: wctTttB3G8EsMuYS_THc9XdXORBe2Trw X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.687,Hydra:6.0.235,FMLib:17.0.607.475 definitions=2020-10-13_15,2020-10-13_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 clxscore=1011 adultscore=0 priorityscore=1501 mlxscore=0 mlxlogscore=999 phishscore=0 impostorscore=0 bulkscore=0 suspectscore=0 lowpriorityscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109030001 definitions=main-2109150040 X-Spam-Status: No, score=-13.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Sep 2021 08:40:04 -0000 This is a multi-part message in MIME format. --------------CB9C5EB28DA7AB2F27234602 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Hi Richard, > Don't we still need this code (without the REG_DEAD handling) for the > case in which… > >> + /* As we are transforming >> + if (x > y) >> + { >> + a = b; >> + c = d; >> + } >> + into >> + a = (x > y) ... >> + c = (x > y) ... >> + >> + we potentially check x > y before every set. >> + Even though the check might be removed by subsequent passes, this means >> + that we cannot transform >> + if (x > y) >> + { >> + x = y; >> + ... >> + } >> + into >> + x = (x > y) ... >> + ... >> + since this would invalidate x and the following to-be-removed checks. >> + Therefore we introduce a temporary every time we are about to >> + overwrite a variable used in the check. Costing of a sequence with >> + these is going to be inaccurate so only use temporaries when >> + needed. */ >> + if (reg_overlap_mentioned_p (target, cond)) >> + temp = gen_reg_rtx (GET_MODE (target)); > > …this code triggers? I don't see otherwise how later uses of x would > pick up “temp” instead of the original target. E.g. suppose we had: > > if (x > y) > { > x = …; > z = x; // x does not die here > } > > Without the loop, it looks like z would pick up the old value of x > (used in the comparison) instead of the new one. getting back to this now. I re-added handling of the situation you mentioned (even though I didn't manage to trigger it myself). Regards Robin --------------CB9C5EB28DA7AB2F27234602 Content-Type: text/x-patch; charset=UTF-8; name="check-need-cmovs.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="check-need-cmovs.diff" commit 2d909ee93ee1eb0f7474ed57581713367c22ba6c Author: Robin Dapp Date: Thu Jun 24 16:40:04 2021 +0200 ifcvt: Check if cmovs are needed. When if-converting multiple SETs and we encounter a swap-style idiom if (a > b) { tmp = c; // [1] c = d; d = tmp; } ifcvt should not generate a conditional move for the instruction at [1]. In order to achieve that, this patch goes through all relevant SETs and marks the relevant instructions. This helps to evaluate costs. On top, only generate temporaries if the current cmov is going to overwrite one of the comparands of the initial compare. diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index 017944f4f79..f1448667732 100644 --- a/gcc/ifcvt.c +++ b/gcc/ifcvt.c @@ -98,6 +98,8 @@ static int dead_or_predicable (basic_block, basic_block, basic_block, edge, int); static void noce_emit_move_insn (rtx, rtx); static rtx_insn *block_has_only_trap (basic_block); +static void need_cmov_or_rewire (basic_block, hash_set *, + hash_map *); /* Count the number of non-jump active insns in BB. */ @@ -3203,6 +3205,11 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) auto_vec unmodified_insns; int count = 0; + hash_set need_no_cmov; + hash_map rewired_src; + + need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src); + FOR_BB_INSNS (then_bb, insn) { /* Skip over non-insns. */ @@ -3213,26 +3220,47 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) gcc_checking_assert (set); rtx target = SET_DEST (set); - rtx temp = gen_reg_rtx (GET_MODE (target)); - rtx new_val = SET_SRC (set); + rtx temp; + + int *ii = rewired_src.get (insn); + rtx new_val = ii == NULL ? SET_SRC (set) : temporaries[*ii]; rtx old_val = target; - /* If we were supposed to read from an earlier write in this block, - we've changed the register allocation. Rewire the read. While - we are looking, also try to catch a swap idiom. */ - for (int i = count - 1; i >= 0; --i) - if (reg_overlap_mentioned_p (new_val, targets[i])) - { - /* Catch a "swap" style idiom. */ - if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX) - /* The write to targets[i] is only live until the read - here. As the condition codes match, we can propagate - the set to here. */ - new_val = SET_SRC (single_set (unmodified_insns[i])); - else - new_val = temporaries[i]; - break; - } + /* As we are transforming + if (x > y) + { + a = b; + c = d; + } + into + a = (x > y) ... + c = (x > y) ... + + we potentially check x > y before every set. + Even though the check might be removed by subsequent passes, this means + that we cannot transform + if (x > y) + { + x = y; + ... + } + into + x = (x > y) ... + ... + since this would invalidate x and the following to-be-removed checks. + Therefore we introduce a temporary every time we are about to + overwrite a variable used in the check. Costing of a sequence with + these is going to be inaccurate so only use temporaries when + needed. */ + if (reg_overlap_mentioned_p (target, cond)) + temp = gen_reg_rtx (GET_MODE (target)); + else + temp = target; + + /* We have identified swap-style idioms in check_need_cmovs. A normal + set will need to be a cmov while the first instruction of a swap-style + idiom can be a regular move. This helps with costing. */ + bool need_cmov = !need_no_cmov.contains (insn); /* If we had a non-canonical conditional jump (i.e. one where the fallthrough is to the "else" case) we need to reverse @@ -3275,16 +3303,29 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) old_val = lowpart_subreg (dst_mode, old_val, src_mode); } - /* Actually emit the conditional move. */ - rtx temp_dest = noce_emit_cmove (if_info, temp, cond_code, - x, y, new_val, old_val); + rtx temp_dest = NULL_RTX; - /* If we failed to expand the conditional move, drop out and don't - try to continue. */ - if (temp_dest == NULL_RTX) + if (need_cmov) { - end_sequence (); - return FALSE; + /* Actually emit the conditional move. */ + temp_dest = noce_emit_cmove (if_info, temp, cond_code, + x, y, new_val, old_val); + + /* If we failed to expand the conditional move, drop out and don't + try to continue. */ + if (temp_dest == NULL_RTX) + { + end_sequence (); + return FALSE; + } + } + else + { + if (if_info->then_else_reversed) + noce_emit_move_insn (temp, old_val); + else + noce_emit_move_insn (temp, new_val); + temp_dest = temp; } /* Bookkeeping. */ @@ -3808,6 +3849,87 @@ check_cond_move_block (basic_block bb, return TRUE; } +/* Find local swap-style idioms in BB and mark the first insn (1) + that is only a temporary as not needing a conditional move as + it is going to be dead afterwards anyway. + + (1) int tmp = a; + a = b; + b = tmp; + + ifcvt + --> + + tmp = a; + a = cond ? b : a_old; + b = cond ? tmp : b_old; + + Additionally, store the index of insns like (2) when a subsequent + SET reads from their destination. + + (2) int c = a; + int d = c; + + ifcvt + --> + + c = cond ? a : c_old; + d = cond ? d : c; // Need to use c rather than c_old here. +*/ + +static void +need_cmov_or_rewire (basic_block bb, + hash_set *need_no_cmov, + hash_map *rewired_src) +{ + rtx_insn *insn; + int count = 0; + auto_vec insns; + auto_vec dests; + + /* Iterate over all SETs, storing the destinations + in DEST. + - If we hit a SET that reads from a destination + that we have seen before and the corresponding register + is dead afterwards, the register does not need to be + moved conditionally. + - If we encounter a previously changed register, + rewire the read to the original source. */ + FOR_BB_INSNS (bb, insn) + { + rtx set, src, dest; + + if (!active_insn_p (insn)) + continue; + + set = single_set (insn); + if (set == NULL_RTX) + continue; + + src = SET_SRC (set); + dest = SET_DEST (set); + + /* Check if the current SET's source is the same + as any previously seen destination. + This is quadratic but the number of insns in BB + is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ + if (REG_P (src)) + for (int i = count - 1; i >= 0; --i) + if (reg_overlap_mentioned_p (src, dests[i])) + { + if (find_reg_note (insn, REG_DEAD, src) != NULL_RTX) + need_no_cmov->add (insns[i]); + else + rewired_src->put (insn, i); + } + + insns.safe_push (insn); + dests.safe_push (dest); + + count++; + } +} + /* Given a basic block BB suitable for conditional move conversion, a condition COND, and pointer maps THEN_VALS and ELSE_VALS containing the register values depending on COND, emit the insns in the block as --------------CB9C5EB28DA7AB2F27234602--