From: Ajit Agarwal
To: Jeff Law , Richard Biener
Cc: gcc-patches , Segher Boessenkool , bergner@linux.ibm.com
Subject: Re: [PATCH] rs6000: suboptimal code for returning bool value on target ppc
Date: Fri, 17 Mar 2023 17:19:56 +0530
Message-ID: <6703cf8b-eec2-e5f6-0614-48f91192aca0@linux.ibm.com>
In-Reply-To: <162d0f3f-0515-9e26-cbf7-7564537f8783@gmail.com>
References: <86cf8475-4353-52ca-869c-75f40bd7d06f@linux.ibm.com> <55b2d830-e71b-8b8a-948d-103b75aea1df@linux.ibm.com> <46a7e308-773d-fc27-5905-41ce3d531653@linux.ibm.com> <68ae93ab-ecb9-332b-dba8-bdc7b0d6b3c9@linux.ibm.com> <162d0f3f-0515-9e26-cbf7-7564537f8783@gmail.com>

Hello Jeff:

On 16/03/23 8:18 pm, Jeff Law wrote:
>
> On 3/16/23 04:11, Ajit Agarwal via Gcc-patches wrote:
>>
>> Hello Richard:
>>
>> On 16/03/23 3:22 pm, Richard Biener wrote:
>>> On Thu, Mar 16, 2023 at 9:19 AM Ajit Agarwal wrote:
>>>>
>>>> On 16/03/23 1:44 pm, Richard Biener wrote:
>>>>> On Thu, Mar 16, 2023 at 9:11 AM Ajit Agarwal wrote:
>>>>>>
>>>>>> Hello Richard:
>>>>>>
>>>>>> On 16/03/23 1:10 pm, Richard Biener wrote:
>>>>>>> On Thu, Mar 16, 2023 at 6:21 AM Ajit Agarwal via Gcc-patches
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hello All:
>>>>>>>>
>>>>>>>> This patch eliminates an unnecessary zero-extension instruction from the assembly generated for Power.
>>>>>>>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>>>>>>
>>>>>>> What makes this so special that we cannot deal with it from generic code?
>>>>>>> In particular we do have the REE pass, why is target specific
>>>>>>> knowledge necessary
>>>>>>> to eliminate the extension?
>>>>>>>
>>>>>>
>>>>>> Returning a bool value after comparisons with integers produces the following sequence, which survives all the RTL passes.
>>>>>>
>>>>>> set compare (subreg)
>>>>>> set if_then_else
>>>>>> Convert SImode -> QImode
>>>>>> set zero_extend to DImode from QImode
>>>>>> set return value 0 in one path of cfg.
>>>>>> set return value 1 in other path of cfg.
>>>>>>
>>>>>> This pass replaces the above zero extension and the conversion from QImode to DImode with a copy operation, so that the QImode value is kept in a 64-bit register on the powerpc target.
>>>>>
>>>>> Sorry, I can't parse that - as there's no testcase with the patch I
>>>>> cannot even try to see what the actual RTL
>>>>> looks like (without the pass).
>>>>>
>>>>
>>>> Here is the PR in Bugzilla.
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784
>>>>
>>>> I can add the testcase attached to this PR to the patch.
>>>
>>> I don't see any zero-extends there.
>>>
>>
>> Here is the testcase.
>>
>> bool (int a, int b)
>> {
>>   if (a > 2)
>>     return false;
>>   if (b < 10)
>>     return true;
>>   return false;
>> }
>>
>> compiled with gcc -O3 -m64 testcase.cc -mcpu=power9 -save-temps.
>>
>> Here is the RTL after cse.
>> (note 12 11 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
>> (insn 15 12 16 3 (set (reg:CC 123)
>>          (compare:CC (subreg/s/u:SI (reg/v:DI 120 [ b ]) 0)
>>              (const_int 9 [0x9]))) "ext.cc":5:5 796 {*cmpsi_signed}
>>       (expr_list:REG_DEAD (reg/v:DI 120 [ b ])
>>          (nil)))
>> (insn 16 15 17 3 (set (reg:SI 124)
>>          (const_int 1 [0x1])) "ext.cc":5:5 555 {*movsi_internal1}
>>       (nil))
>> (insn 17 16 18 3 (set (reg:SI 122)
>>          (if_then_else:SI (gt (reg:CC 123)
>>                  (const_int 0 [0]))
>>              (const_int 0 [0])
>>              (reg:SI 124))) "ext.cc":5:5 344 {isel_cc_si}
>>       (expr_list:REG_DEAD (reg:SI 124)
>>          (expr_list:REG_DEAD (reg:CC 123)
>>              (nil))))
>> (insn 18 17 32 3 (set (reg:QI 117 [ _1 ])
>>          (subreg:QI (reg:SI 122) 0)) "ext.cc":5:5 562 {*movqi_internal}
>>       (expr_list:REG_DEAD (reg:SI 122)
>>          (nil)))
>>        ; pc falls through to BB 5
>> (code_label 32 18 31 4 3 (nil) [1 uses])
>> (note 31 32 5 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
>> (insn 5 31 19 4 (set (reg:QI 117 [ _1 ])
>>          (const_int 0 [0])) "ext.cc":4:16 562 {*movqi_internal}
>>       (nil))
>> (code_label 19 5 20 5 2 (nil) [0 uses])
>> (note 20 19 21 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
>> (insn 21 20 22 5 (set (reg:DI 126 [ _1 ])
>>          (zero_extend:DI (reg:QI 117 [ _1 ]))) "ext.cc":8:1 5 {zero_extendqidi2}
>>       (expr_list:REG_DEAD (reg:QI 117 [ _1 ])
>>          (nil)))
>> (insn 22 21 26 5 (set (reg:DI 118 [ ])
>>          (reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64}
>>       (expr_list:REG_DEAD (reg:DI 126 [ _1 ])
>>          (nil)))
>> (insn 26 22 27 5 (set (reg/i:DI 3 3)
>>          (reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64}
>>       (expr_list:REG_DEAD (reg:DI 118 [ ])
>>          (nil)))
>> (insn 27 26 0 5 (use (reg/i:DI 3 3)) "ext.cc":8:1 -1
>>       (nil))
> This looks like it'd be better addressed in REE.
>
> We've got two paths to the zero_extend.  One sets (reg 117) from a constant.  The other sets (reg 117) from a (subreg:QI (reg:SI)).
>
> Handling the constant is trivial.  For the other set, we can replace the subreg with the zero_extend.  Presumably we'd then proceed to try and eliminate the zero-extend by realizing both arms of the conditional move are constants and thus trivially handled.
>
> While I don't think REE would handle all this today, fixing it to handle this case seems like it'd be better than doing a specialized pass in the ppc backend.
>
> jeff
>

Thanks for your advice.

At the input to the REE pass, the zero_extend of (reg:QI 117) has already been converted to (and:DI (subreg:DI (reg:QI 117) 0) (const_int 1)), as the RTL below shows. REE needs to handle this form as well. I am working on handling this in the REE pass.

(insn 44 43 18 3 (set (reg:SI 122)
        (if_then_else:SI (le:SI (reg:CC 130)
                (const_int 0 [0]))
            (reg:SI 129)
            (const_int 0 [0]))) "ext.cc":5:5 -1
     (nil))
(insn 18 44 40 3 (set (reg:QI 117 [ _1 ])
        (subreg:QI (reg:SI 122) 0)) "ext.cc":5:5 562 {*movqi_internal}
     (expr_list:REG_DEAD (reg:SI 122)
        (nil)))
(jump_insn 40 18 41 3 (set (pc)
        (label_ref 19)) -1
     (nil)
 -> 19)
(barrier 41 40 32)
(code_label 32 41 31 4 3 (nil) [1 uses])
(note 31 32 5 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 5 31 19 4 (set (reg:QI 117 [ _1 ])
        (const_int 0 [0])) "ext.cc":4:16 562 {*movqi_internal}
     (nil))
(code_label 19 5 20 5 2 (nil) [1 uses])
(note 20 19 21 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(note 21 20 26 5 NOTE_INSN_DELETED)
(insn 26 21 27 5 (set (reg/i:DI 3 %r3)
        (and:DI (subreg:DI (reg:QI 117 [ _1 ]) 0)
            (const_int 1 [0x1]))) "ext.cc":8:1 207 {anddi3_mask}
     (expr_list:REG_DEAD (reg:QI 117 [ _1 ])
        (nil)))
(insn 27 26 0 5 (use (reg/i:DI 3 %r3)) "ext.cc":8:1 -1
     (nil))
"a-ext.cc.292r.split1" 92L, 4727C

Thanks & Regards
Ajit
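
Jeff's point that both arms of the conditional move are constants can be checked at the source level. The following is only a minimal C++ sketch, not part of the patch or of REE. The function name in_range is hypothetical (the testcase in the thread omits the name), and the helpers truncate_then_zero_extend and mask_low_bit are illustrative models of the RTL operations, assuming the QImode subreg selects the low byte. It shows that when the value is already 0 or 1, both the QImode truncation followed by zero_extend:DI and the and:DI with 1 leave it unchanged.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name: the testcase quoted in the thread has no function name.
bool in_range(int a, int b)
{
    if (a > 2)
        return false;
    if (b < 10)
        return true;
    return false;
}

// Illustrative model of insn 18 followed by insn 21 in the post-cse dump:
// keep only the low byte (QImode), then zero-extend it to 64 bits (DImode).
// Assumption: the QImode subreg is the low-order byte of the SImode value.
std::uint64_t truncate_then_zero_extend(std::uint32_t si_value)
{
    std::uint8_t qi = static_cast<std::uint8_t>(si_value);
    return static_cast<std::uint64_t>(qi);
}

// Illustrative model of the split1 form:
// (and:DI (subreg:DI (reg:QI 117) 0) (const_int 1)).
std::uint64_t mask_low_bit(std::uint64_t di_value)
{
    return di_value & 1u;
}

// Returns true if, over a small grid of inputs around the two thresholds,
// neither the extension nor the mask changes the 0/1 value selected by
// the conditional move.
bool extension_is_noop_on_samples()
{
    for (int a = -4; a <= 6; ++a) {
        for (int b = 5; b <= 14; ++b) {
            std::uint32_t si = in_range(a, b) ? 1u : 0u;
            if (truncate_then_zero_extend(si) != si)
                return false;
            if (mask_low_bit(si) != si)
                return false;
        }
    }
    return true;
}
```

Calling extension_is_noop_on_samples() from a small driver is enough to exercise the sketch. It only samples a grid of inputs, so it illustrates the reasoning rather than proving it; the actual elimination still has to be justified on the RTL inside REE.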