From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <3b0984ef-c532-c29c-732a-1c9b569e134c@linux.ibm.com>
Date: Wed, 7 Sep 2022 16:40:02 +0200
MIME-Version: 1.0
To: GCC Patches
From: Robin Dapp
Subject: [RFC] postreload cse'ing vector constants
Content-Type: text/plain; charset=UTF-8

Hi,

I recently looked into a sequence like

  vzero %v0
  vlr %v2, %v0
  vlr %v3, %v0

Ideally we would like to use vzero for all of these sets in order to
not create dependencies.

For some instances of this problem I found the offending snippet to be
the postreload cse pass. If an operand that is not a hard reg has a
value equivalent to that of an existing hard reg, the pass replaces the
operand with that hard reg. The costs are only compared if the
respective operand is a CONST_INT_P, otherwise we always replace. The
comment before says:

  /* See if REGNO fits this alternative, and set it up as the
     replacement register if we don't have one for this alternative
     yet and the operand being replaced is not a cheap CONST_INT.  */

Now, in my case we have a CONST_VECTOR consisting of CONST_INTs
(zeros). This is obviously not a CONST_INT, so the substitution takes
place, resulting in a "vlr" instead of a "vzero".

Would it not make sense to always compare costs here? Some backends
have instructions for loading vector constants and there could also be
backends that are able to load floating-point constants directly. For
my snippet, getting rid of the CONST_INT check suffices because the
costs are similar and no replacement happens.

Was this originally a shortcut for performance reasons? I thought we
were not checking that many alternatives, and only locally, at this
point anymore.

Any comments or ideas?
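
To make the difference between the two conditions concrete, here is a
small standalone sketch. It is not GCC code: OperandKind, Operand,
replaces_old and replaces_new are hypothetical names, and the cost
field is a made-up stand-in for whatever set_src_cost would return on a
given target.

```cpp
#include <cassert>

// Hypothetical stand-ins for the kinds of operands involved;
// these are not GCC's real RTX types.
enum class OperandKind { ConstInt, ConstVector, Reg };

struct Operand {
  OperandKind kind;
  int cost;  // assumed cost, standing in for set_src_cost's result
};

// Current condition: costs are only consulted for CONST_INT operands,
// everything else is always replaced by the equivalent hard reg.
bool replaces_old(const Operand &op, const Operand &testreg) {
  return op.kind != OperandKind::ConstInt || op.cost > testreg.cost;
}

// Proposed condition: always compare costs, regardless of operand kind.
bool replaces_new(const Operand &op, const Operand &testreg) {
  return op.cost > testreg.cost;
}

// Walks through the zero-vector case with equal (made-up) costs.
void demo() {
  Operand testreg{OperandKind::Reg, 4};
  Operand zero_vec{OperandKind::ConstVector, 4};

  // Old: not a CONST_INT, so the constant is replaced (vlr).
  assert(replaces_old(zero_vec, testreg));
  // New: the constant is not more expensive, so it is kept (vzero).
  assert(!replaces_new(zero_vec, testreg));
}
```

With equal costs the strict ">" keeps the constant, which matches what
I see for my snippet. A constant that is more expensive than the
register copy would still be replaced under both conditions.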

Regards
 Robin

--

diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index 41f61d326482..934439733d52 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -558,13 +558,12 @@ reload_cse_simplify_operands (rtx_insn *insn, rtx testreg)
               if (op_alt_regno[i][j] == -1
                   && TEST_BIT (preferred, j)
                   && reg_fits_class_p (testreg, rclass, 0, mode)
-                  && (!CONST_INT_P (recog_data.operand[i])
-                      || (set_src_cost (recog_data.operand[i], mode,
-                                        optimize_bb_for_speed_p
-                                         (BLOCK_FOR_INSN (insn)))
-                          > set_src_cost (testreg, mode,
-                                          optimize_bb_for_speed_p
-                                           (BLOCK_FOR_INSN (insn))))))
+                  && (set_src_cost (recog_data.operand[i], mode,
+                                    optimize_bb_for_speed_p
+                                     (BLOCK_FOR_INSN (insn)))
+                      > set_src_cost (testreg, mode,
+                                      optimize_bb_for_speed_p
+                                       (BLOCK_FOR_INSN (insn)))))
                 {
                   alternative_nregs[j]++;
                   op_alt_regno[i][j] = regno;