From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <68ae93ab-ecb9-332b-dba8-bdc7b0d6b3c9@linux.ibm.com>
Date: Thu, 16 Mar 2023 15:41:48 +0530
Subject: Re: [PATCH] rs6000: suboptimal code for returning bool value on target ppc
To: Richard Biener
Cc: gcc-patches, Segher Boessenkool, bergner@linux.ibm.com
References: <86cf8475-4353-52ca-869c-75f40bd7d06f@linux.ibm.com> <55b2d830-e71b-8b8a-948d-103b75aea1df@linux.ibm.com> <46a7e308-773d-fc27-5905-41ce3d531653@linux.ibm.com>
From: Ajit Agarwal
Content-Type: text/plain; charset=UTF-8

Hello Richard:

On 16/03/23 3:22 pm, Richard Biener wrote:
> On Thu, Mar 16, 2023 at 9:19 AM Ajit Agarwal wrote:
>>
>>
>>
>> On 16/03/23 1:44 pm, Richard Biener wrote:
>>> On Thu, Mar 16, 2023 at 9:11 AM Ajit Agarwal wrote:
>>>>
>>>> Hello Richard:
>>>>
>>>> On 16/03/23 1:10 pm, Richard Biener wrote:
>>>>> On Thu, Mar 16, 2023 at 6:21 AM Ajit Agarwal via Gcc-patches
>>>>> wrote:
>>>>>>
>>>>>> Hello All:
>>>>>>
>>>>>>
>>>>>> This patch eliminates an unnecessary zero-extension instruction from the assembly generated for Power.
>>>>>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>>>>
>>>>> What makes this so special that we cannot deal with it from generic code?
>>>>> In particular we do have the REE pass, why is target specific
>>>>> knowledge necessary
>>>>> to eliminate the extension?
>>>>>
>>>>
>>>> When a function returns a bool value computed by comparison with integers, the RTL passes generate the following sequence.
>>>>
>>>> set compare (subreg)
>>>> set if_then_else
>>>> Convert SImode -> QImode
>>>> set zero_extend to DImode from QImode
>>>> set return value 0 in one path of cfg.
>>>> set return value 1 in other path of cfg.
>>>>
>>>> This pass replaces the above zero extension and the conversion from QImode to DImode with a copy operation, keeping the QImode value in a 64-bit register on the powerpc target.
>>>
>>> Sorry, I can't parse that - as there's no testcase with the patch I
>>> cannot even try to see what the actual RTL
>>> looks like (without the pass).
>>>
>>
>> Here is the PR in Bugzilla.
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784
>>
>> I can add the testcase attached to this PR to the patch.
>
> I don't see any zero-extends there.
>

Here is the testcase.

bool foo (int a, int b)  /* name missing in the mail as sent; "foo" is a placeholder */
{
  if (a > 2)
    return false;
  if (b < 10)
    return true;
  return false;
}

Compiled with gcc -O3 -m64 testcase.cc -mcpu=power9 -save-temps.

Here is the RTL after cse.

(note 12 11 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 15 12 16 3 (set (reg:CC 123)
        (compare:CC (subreg/s/u:SI (reg/v:DI 120 [ b ]) 0)
            (const_int 9 [0x9]))) "ext.cc":5:5 796 {*cmpsi_signed}
     (expr_list:REG_DEAD (reg/v:DI 120 [ b ])
        (nil)))
(insn 16 15 17 3 (set (reg:SI 124)
        (const_int 1 [0x1])) "ext.cc":5:5 555 {*movsi_internal1}
     (nil))
(insn 17 16 18 3 (set (reg:SI 122)
        (if_then_else:SI (gt (reg:CC 123)
                (const_int 0 [0]))
            (const_int 0 [0])
            (reg:SI 124))) "ext.cc":5:5 344 {isel_cc_si}
     (expr_list:REG_DEAD (reg:SI 124)
        (expr_list:REG_DEAD (reg:CC 123)
            (nil))))
(insn 18 17 32 3 (set (reg:QI 117 [ _1 ])
        (subreg:QI (reg:SI 122) 0)) "ext.cc":5:5 562 {*movqi_internal}
     (expr_list:REG_DEAD (reg:SI 122)
        (nil)))
; pc falls through to BB 5
(code_label 32 18 31 4 3 (nil) [1 uses])
(note 31 32 5 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 5 31 19 4 (set (reg:QI 117 [ _1 ])
        (const_int 0 [0])) "ext.cc":4:16 562 {*movqi_internal}
     (nil))
(code_label 19 5 20 5 2 (nil) [0 uses])
(note 20 19 21 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 21 20 22 5 (set (reg:DI 126 [ _1 ])
        (zero_extend:DI (reg:QI 117 [ _1 ]))) "ext.cc":8:1 5 {zero_extendqidi2}
     (expr_list:REG_DEAD (reg:QI 117 [ _1 ])
        (nil)))
(insn 22 21 26 5 (set (reg:DI 118 [ ])
        (reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64}
     (expr_list:REG_DEAD (reg:DI 126 [ _1 ])
        (nil)))
(insn 26 22 27 5 (set (reg/i:DI 3 3)
        (reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64}
     (expr_list:REG_DEAD (reg:DI 118 [ ])
        (nil)))
(insn 27 26 0 5 (use (reg/i:DI 3 3)) "ext.cc":8:1 -1
     (nil))

Thanks & Regards
Ajit
>> Thanks & Regards
>> Ajit
>>> Richard.
>>>
>>>> Thanks & Regards
>>>> Ajit
>>>>>> + In cfgexpand pass QImode is generated with
>>>>>> + bool register value and this pass uses QI
>>>>>> + as 64 bit registers.
>>>>>> +
>>>>
>>>>>> rs6000: suboptimal code for returning bool value on target ppc.
>>>>>>
>>>>>> New pass to eliminate unnecessary zero extension. This pass
>>>>>> is registered after cse rtl pass.
>>>>>>
>>>>>> 2023-03-16 Ajit Kumar Agarwal
>>>>>>
>>>>>> gcc/ChangeLog:
>>>>>>
>>>>>>         * config/rs6000/rs6000-passes.def: Registered zero elimination
>>>>>>         pass.
>>>>>>         * config/rs6000/rs6000-zext-elim.cc: Add new pass.
>>>>>>         * config.gcc: Add new executable.
>>>>>>         * config/rs6000/rs6000-protos.h: Add new prototype for zero
>>>>>>         elimination pass.
>>>>>>         * config/rs6000/rs6000.cc: Add new prototype for zero
>>>>>>         elimination pass.
>>>>>>         * config/rs6000/t-rs6000: Add new rule.
>>>>>>         * expr.cc: Modified gcc assert.
>>>>>>         * explow.cc: Modified gcc assert.
>>>>>>         * optabs.cc: Modified gcc assert.
>>>>>> ---
>>>>>>  gcc/config.gcc                        |   4 +-
>>>>>>  gcc/config/rs6000/rs6000-passes.def   |   2 +
>>>>>>  gcc/config/rs6000/rs6000-protos.h     |   1 +
>>>>>>  gcc/config/rs6000/rs6000-zext-elim.cc | 361 ++++++++++++++++++++++++++
>>>>>>  gcc/config/rs6000/rs6000.cc           |   2 +
>>>>>>  gcc/config/rs6000/t-rs6000            |   5 +
>>>>>>  gcc/explow.cc                         |   3 +-
>>>>>>  gcc/expr.cc                           |   4 +-
>>>>>>  gcc/optabs.cc                         |   3 +-
>>>>>>  9 files changed, 379 insertions(+), 6 deletions(-)
>>>>>>  create mode 100644 gcc/config/rs6000/rs6000-zext-elim.cc
>>>>>>
>>>>>> diff --git a/gcc/config.gcc b/gcc/config.gcc
>>>>>> index da3a6d3ba1f..e8ac9d882f0 100644
>>>>>> --- a/gcc/config.gcc
>>>>>> +++ b/gcc/config.gcc
>>>>>> @@ -503,7 +503,7 @@ or1k*-*-*)
>>>>>>         ;;
>>>>>>  powerpc*-*-*)
>>>>>>         cpu_type=rs6000
>>>>>> -       extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
>>>>>> +       extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-zext-elim.o rs6000-logue.o"
>>>>>>         extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
>>>>>>         extra_objs="${extra_objs} rs6000-builtins.o rs6000-builtin.o"
>>>>>>         extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
>>>>>> @@ -538,7 +538,7 @@ riscv*)
>>>>>>         ;;
>>>>>>  rs6000*-*-*)
>>>>>>         extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
>>>>>> -       extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
>>>>>> +       extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-zext-elim.o rs6000-logue.o"
>>>>>>         extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
>>>>>>         target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.cc \$(srcdir)/config/rs6000/rs6000-call.cc"
>>>>>>         target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel-opt.cc"
>>>>>> diff --git a/gcc/config/rs6000/rs6000-passes.def b/gcc/config/rs6000/rs6000-passes.def
>>>>>> index ca899d5f7af..d7500feddf1 100644
>>>>>> --- a/gcc/config/rs6000/rs6000-passes.def
>>>>>> +++ b/gcc/config/rs6000/rs6000-passes.def
>>>>>> @@ -28,6 +28,8 @@ along with GCC; see the file COPYING3. If not see
>>>>>>     The power8 does not have instructions that automaticaly do the byte swaps
>>>>>>     for loads and stores. */
>>>>>>    INSERT_PASS_BEFORE (pass_cse, 1, pass_analyze_swaps);
>>>>>> +  INSERT_PASS_AFTER (pass_cse, 1, pass_analyze_zext);
>>>>>> +
>>>>>>
>>>>>>    /* Pass to do the PCREL_OPT optimization that combines the load of an
>>>>>>       external symbol's address along with a single load or store using that
>>>>>> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
>>>>>> index 1a4fc1df668..f6cf2d673d4 100644
>>>>>> --- a/gcc/config/rs6000/rs6000-protos.h
>>>>>> +++ b/gcc/config/rs6000/rs6000-protos.h
>>>>>> @@ -340,6 +340,7 @@ namespace gcc { class context; }
>>>>>>  class rtl_opt_pass;
>>>>>>
>>>>>>  extern rtl_opt_pass *make_pass_analyze_swaps (gcc::context *);
>>>>>> +extern rtl_opt_pass *make_pass_analyze_zext (gcc::context *);
>>>>>>  extern rtl_opt_pass *make_pass_pcrel_opt (gcc::context *);
>>>>>>  extern bool rs6000_sum_of_two_registers_p (const_rtx expr);
>>>>>>  extern bool rs6000_quadword_masked_address_p (const_rtx exp);
>>>>>> diff --git a/gcc/config/rs6000/rs6000-zext-elim.cc b/gcc/config/rs6000/rs6000-zext-elim.cc
>>>>>> new file mode 100644
>>>>>> index 00000000000..777c7a5a387
>>>>>> --- /dev/null
>>>>>> +++ b/gcc/config/rs6000/rs6000-zext-elim.cc
>>>>>> @@ -0,0 +1,361 @@
>>>>>> +/* Subroutine to eliminate redundant zero extend for power architecture.
>>>>>> +   Copyright (C) 1991-2023 Free Software Foundation, Inc.
>>>>>> +
>>>>>> +   This file is part of GCC.
>>>>>> +
>>>>>> +   GCC is free software; you can redistribute it and/or modify it
>>>>>> +   under the terms of the GNU General Public License as published
>>>>>> +   by the Free Software Foundation; either version 3, or (at your
>>>>>> +   option) any later version.
>>>>>> +
>>>>>> +   GCC is distributed in the hope that it will be useful, but WITHOUT
>>>>>> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>>>>>> +   or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
>>>>>> +   License for more details.
>>>>>> +
>>>>>> +   You should have received a copy of the GNU General Public License
>>>>>> +   along with GCC; see the file COPYING3. If not see
>>>>>> +   . */
>>>>>> +
>>>>>> +/* This pass remove unnecessary zero extension instruction from
>>>>>> +   power generated assembly. This pass is register after cse
>>>>>> +   pass.
>>>>>> +   Identifies the following sequence of instruction after cse
>>>>>> +   rtl pass.
>>>>>> +
>>>>>> +   set compare (subreg)
>>>>>> +   set if_then_else
>>>>>> +   set SImode -> QImode
>>>>>> +   set zero_extend to DImode from QImode
>>>>>> +   set return value 0 in one path of cfg.
>>>>>> +   set return value 1 in other path of cfg.
>>>>>> +
>>>>>> +   In cfgexpand pass QImode is generated with
>>>>>> +   bool register value and this pass uses QI
>>>>>> +   as 64 bit registers.
>>>>>> +
>>>>>> +   This pass replace copy operation from QImode to DImode
>>>>>> +   and return appropriate return values.*/
>>>>>> +
>>>>>> +#define IN_TARGET_CODE 1
>>>>>> +
>>>>>> +#include "config.h"
>>>>>> +#include "system.h"
>>>>>> +#include "coretypes.h"
>>>>>> +#include "backend.h"
>>>>>> +#include "rtl.h"
>>>>>> +#include "tree.h"
>>>>>> +#include "memmodel.h"
>>>>>> +#include "df.h"
>>>>>> +#include "tm_p.h"
>>>>>> +#include "ira.h"
>>>>>> +#include "print-tree.h"
>>>>>> +#include "varasm.h"
>>>>>> +#include "explow.h"
>>>>>> +#include "expr.h"
>>>>>> +#include "output.h"
>>>>>> +#include "tree-pass.h"
>>>>>> +
>>>>>> +/* This is based on the union-find logic in web.cc. web_entry_base is
>>>>>> +   defined in df.h. */
>>>>>> +class zext_web_entry : public web_entry_base
>>>>>> +{
>>>>>> + public:
>>>>>> +  /* Pointer to the insn. */
>>>>>> +  rtx_insn *insn;
>>>>>> +  unsigned int is_relevant : 1;
>>>>>> +  /* Set if insn is a load. */
>>>>>> +  unsigned int is_load : 1;
>>>>>> +  /* Set if insn is a store. */
>>>>>> +  unsigned int is_store : 1;
>>>>>> +  unsigned int is_zext :1 ;
>>>>>> +  unsigned int is_move :1;
>>>>>> +  unsigned int is_delete_move :1;
>>>>>> +  /* Set if this insn should be deleted. */
>>>>>> +  unsigned int will_delete : 1;
>>>>>> +  unsigned int will_delete_chances : 1;
>>>>>> +};
>>>>>> +
>>>>>> +/* Checks if instruction is zero extension
>>>>>> + * with QIMode to DImode.*/
>>>>>> +static unsigned int
>>>>>> +insn_is_zext_p(rtx insn)
>>>>>> +{
>>>>>> +  rtx body = PATTERN (insn);
>>>>>> +
>>>>>> +  if (GET_CODE (body) == SET
>>>>>> +      && GET_MODE(SET_DEST (body)) == DImode
>>>>>> +      && GET_CODE(SET_SRC (body)) == ZERO_EXTEND)
>>>>>> +    {
>>>>>> +      rtx set = XEXP (SET_SRC (body), 0);
>>>>>> +
>>>>>> +      if (REG_P (set))
>>>>>> +        {
>>>>>> +          if (GET_MODE (set) == QImode) return 1;
>>>>>> +        }
>>>>>> +      else
>>>>>> +        return 0;
>>>>>> +    }
>>>>>> +  return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/* Checks if instruction is SET operation with QImode.*/
>>>>>> +static unsigned int
>>>>>> +insn_is_store_p (rtx insn)
>>>>>> +{
>>>>>> +  rtx body = PATTERN (insn);
>>>>>> +  if (GET_CODE (body) == SET
>>>>>> +      && SUBREG_P(SET_SRC (body))
>>>>>> +      && !CONST_INT_P(SET_SRC (body))
>>>>>> +      && GET_MODE(XEXP (SET_SRC (body), 0)) == SImode
>>>>>> +      && GET_MODE(SET_SRC (body)) == QImode)
>>>>>> +    return 1;
>>>>>> +
>>>>>> +  return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/* Find out zero extension removal candidate with use-def web.*/
>>>>>> +static void
>>>>>> +find_zero_ext_elimination_candidate (zext_web_entry *insn_entry,
>>>>>> +                                     rtx insn, df_ref def)
>>>>>> +{
>>>>>> +  struct df_link *link = DF_REF_CHAIN (def);
>>>>>> +
>>>>>> +  rtx move_insn = NULL_RTX;
>>>>>> +  rtx compare_insn = NULL_RTX;
>>>>>> +
>>>>>> +  while (link)
>>>>>> +    {
>>>>>> +      if (!DF_REF_INSN_INFO (link->ref))
>>>>>> +        insn_entry[INSN_UID(insn)].will_delete_chances = 0;
>>>>>> +
>>>>>> +      if (DF_REF_INSN_INFO (link->ref))
>>>>>> +        {
>>>>>> +          rtx use_insn = DF_REF_INSN (link->ref);
>>>>>> +
>>>>>> +          if (GET_CODE (PATTERN (use_insn)) == SET
>>>>>> +              && (GET_CODE (SET_SRC (PATTERN (use_insn))) == IF_THEN_ELSE))
>>>>>> +            {
>>>>>> +              if (GET_CODE (PATTERN (insn)) == SET
>>>>>> +                  && GET_CODE (SET_SRC (PATTERN (insn))) == COMPARE)
>>>>>> +                {
>>>>>> +                  rtx body = XEXP (SET_SRC (PATTERN (insn)), 0);
>>>>>> +
>>>>>> +                  if (SUBREG_P (body))
>>>>>> +                    {
>>>>>> +                      compare_insn = use_insn;
>>>>>> +                      rtx compare_body = XEXP (SET_SRC (PATTERN (compare_insn)), 0);
>>>>>> +
>>>>>> +                      if (compare_insn
>>>>>> +                          && ((REGNO (XEXP (compare_body, 0)))
>>>>>> +                              == REGNO (SET_DEST (PATTERN (insn)))))
>>>>>> +                        insn_entry[INSN_UID(use_insn)].will_delete_chances = 1;
>>>>>> +                    }
>>>>>> +                }
>>>>>> +            }
>>>>>> +
>>>>>> +          if (insn_is_store_p(use_insn)
>>>>>> +              && GET_CODE (PATTERN (insn)) == SET
>>>>>> +              && (GET_CODE (SET_SRC (PATTERN(insn))) == IF_THEN_ELSE))
>>>>>> +            {
>>>>>> +              if (GET_MODE (SET_DEST (PATTERN (insn))) == SImode)
>>>>>> +                {
>>>>>> +                  if (insn_entry[INSN_UID(insn)].will_delete_chances)
>>>>>> +                    insn_entry[INSN_UID(use_insn)].will_delete_chances = 1;
>>>>>> +                }
>>>>>> +            }
>>>>>> +
>>>>>> +          if (insn_is_zext_p (insn))
>>>>>> +            {
>>>>>> +              if (GET_CODE (PATTERN (use_insn)) == SET
>>>>>> +                  && REG_P (SET_SRC (PATTERN (use_insn))))
>>>>>> +                {
>>>>>> +                  if (move_insn
>>>>>> +                      && REGNO (SET_SRC (PATTERN (use_insn)))
>>>>>> +                         == REGNO (SET_SRC (PATTERN (move_insn)))
>>>>>> +                      && insn_entry[INSN_UID(insn)].is_delete_move)
>>>>>> +                    {
>>>>>> +                      insn_entry[INSN_UID (insn)].is_move = 1;
>>>>>> +                      break;
>>>>>> +                    }
>>>>>> +                  else if (insn_entry[INSN_UID (insn)].will_delete)
>>>>>> +                    {
>>>>>> +                      move_insn = use_insn;
>>>>>> +                      insn_entry[INSN_UID(insn)].is_delete_move= 1;
>>>>>> +                    }
>>>>>> +                }
>>>>>> +            }
>>>>>> +
>>>>>> +          if (insn_is_zext_p (use_insn))
>>>>>> +            {
>>>>>> +              insn_entry[INSN_UID (use_insn)].is_zext = 1;
>>>>>> +              insn_entry[INSN_UID(use_insn)].is_relevant = 1;
>>>>>> +
>>>>>> +              if (insn_is_store_p (insn)
>>>>>> +                  && insn_entry[INSN_UID (insn)].will_delete_chances)
>>>>>> +                {
>>>>>> +                  insn_entry[INSN_UID (use_insn)].will_delete = 1;
>>>>>> +                  insn_entry[INSN_UID (insn)].will_delete = 1;
>>>>>> +                  insn_entry[INSN_UID( insn)].is_store = 1;
>>>>>> +                }
>>>>>> +
>>>>>> +              if (NONDEBUG_INSN_P (use_insn))
>>>>>> +                unionfind_union (insn_entry + INSN_UID (insn),
>>>>>> +                                 insn_entry + INSN_UID (use_insn));
>>>>>> +            }
>>>>>> +        }
>>>>>> +
>>>>>> +      link = link->next;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +/* Replace QImode extensions with copy operations.*/
>>>>>> +static void
>>>>>> +replace_marked_insns (zext_web_entry *insn_entry, unsigned i)
>>>>>> +{
>>>>>> +  rtx_insn *insn = insn_entry[i].insn;
>>>>>> +  rtx body = PATTERN (insn);
>>>>>> +  rtx src_reg;
>>>>>> +  src_reg = XEXP (SET_SRC (body), 0);
>>>>>> +  set_mode_and_regno (src_reg, DImode, REGNO(src_reg));
>>>>>> +
>>>>>> +  if (GET_MODE(SET_DEST(body)) != DImode)
>>>>>> +    set_mode_and_regno (SET_DEST(body), DImode, REGNO (SET_DEST (body)));
>>>>>> +
>>>>>> +  rtx copy = gen_rtx_SET (SET_DEST (body), src_reg);
>>>>>> +  rtx_insn *new_insn = emit_insn_before (copy, insn);
>>>>>> +  set_block_for_insn (new_insn, BLOCK_FOR_INSN (insn));
>>>>>> +  df_insn_rescan (new_insn);
>>>>>> +
>>>>>> +  df_insn_delete (insn);
>>>>>> +  remove_insn (insn);
>>>>>> +  insn->set_deleted ();
>>>>>> +}
>>>>>> +
>>>>>> +/* Main entry point for this pass. */
>>>>>> +unsigned int
>>>>>> +rs6000_analyze_zext (function *fun)
>>>>>> +{
>>>>>> +  zext_web_entry *insn_entry;
>>>>>> +  basic_block bb;
>>>>>> +  rtx_insn *insn, *curr_insn = 0;
>>>>>> +
>>>>>> +  /* Dataflow analysis for use-def chains. */
>>>>>> +  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
>>>>>> +  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
>>>>>> +  df_analyze ();
>>>>>> +  df_set_flags (DF_DEFER_INSN_RESCAN);
>>>>>> +
>>>>>> +  /* Rebuild ud- and du-chains. */
>>>>>> +  df_remove_problem (df_chain);
>>>>>> +  df_process_deferred_rescans ();
>>>>>> +  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
>>>>>> +  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
>>>>>> +  df_analyze ();
>>>>>> +  df_set_flags (DF_DEFER_INSN_RESCAN);
>>>>>> +
>>>>>> +  /* Allocate structure to represent webs of insns. */
>>>>>> +  insn_entry = XCNEWVEC (zext_web_entry, get_max_uid ());
>>>>>> +
>>>>>> +  /* Walk the insns to gather basic data. */
>>>>>> +  FOR_ALL_BB_FN (bb, fun)
>>>>>> +    FOR_BB_INSNS_SAFE (bb, insn, curr_insn)
>>>>>> +    {
>>>>>> +      unsigned int uid = INSN_UID (insn);
>>>>>> +      if (NONDEBUG_INSN_P (insn))
>>>>>> +        {
>>>>>> +          insn_entry[uid].insn = insn;
>>>>>> +
>>>>>> +          if (GET_CODE (insn) == insn_is_store_p (insn))
>>>>>> +            {
>>>>>> +              insn_entry[uid].is_store = 1;
>>>>>> +              insn_entry[uid].is_relevant = 1;
>>>>>> +            }
>>>>>> +
>>>>>> +          /* Walk the uses and defs to identify the optimization
>>>>>> +             candidates.*/
>>>>>> +          struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
>>>>>> +          df_ref mention;
>>>>>> +
>>>>>> +          FOR_EACH_INSN_INFO_DEF (mention, insn_info)
>>>>>> +            {
>>>>>> +              insn_entry[uid].is_relevant = 1;
>>>>>> +              insn_entry[uid].is_store = insn_is_store_p (insn);
>>>>>> +              find_zero_ext_elimination_candidate (insn_entry, insn, mention);
>>>>>> +            }
>>>>>> +
>>>>>> +          if (insn_entry[uid].is_relevant)
>>>>>> +            {
>>>>>> +              /* Determine if this is a store. */
>>>>>> +              insn_entry[uid].is_store = insn_is_store_p (insn);
>>>>>> +            }
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +  unsigned e = get_max_uid (), i;
>>>>>> +
>>>>>> +  int store_index = -1;
>>>>>> +
>>>>>> +  /* Replace with copy operation.*/
>>>>>> +  for (i = 0; i < e; ++i)
>>>>>> +    {
>>>>>> +      if (insn_entry[i].is_store && insn_entry[i].will_delete)
>>>>>> +        store_index = i;
>>>>>> +
>>>>>> +      if ((store_index != -1)
>>>>>> +          && insn_entry[i].is_move && insn_entry[i].will_delete)
>>>>>> +        {
>>>>>> +          replace_marked_insns (insn_entry, store_index);
>>>>>> +          replace_marked_insns (insn_entry, i);
>>>>>> +        }
>>>>>> +    }
>>>>>> +  /* Clean up. */
>>>>>> +  free (insn_entry);
>>>>>> +
>>>>>> +  return 0;
>>>>>> +}
>>>>>> +
>>>>>> +const pass_data pass_data_analyze_zext =
>>>>>> +{
>>>>>> +  RTL_PASS, /* type */
>>>>>> +  "zext", /* name */
>>>>>> +  OPTGROUP_NONE, /* optinfo_flags */
>>>>>> +  TV_NONE, /* tv_id */
>>>>>> +  0, /* properties_required */
>>>>>> +  0, /* properties_provided */
>>>>>> +  0, /* properties_destroyed */
>>>>>> +  0, /* todo_flags_start */
>>>>>> +  TODO_df_finish, /* todo_flags_finish */
>>>>>> +};
>>>>>> +
>>>>>> +class pass_analyze_zext : public rtl_opt_pass
>>>>>> +{
>>>>>> +public:
>>>>>> +  pass_analyze_zext(gcc::context *ctxt)
>>>>>> +    : rtl_opt_pass(pass_data_analyze_zext, ctxt)
>>>>>> +  {}
>>>>>> +
>>>>>> +  /* opt_pass methods: */
>>>>>> +  virtual bool gate (function *)
>>>>>> +  {
>>>>>> +    return (optimize > 0 );
>>>>>> +  }
>>>>>> +
>>>>>> +  virtual unsigned int execute (function *fun)
>>>>>> +  {
>>>>>> +    return rs6000_analyze_zext (fun);
>>>>>> +  }
>>>>>> +
>>>>>> +  opt_pass *clone ()
>>>>>> +  {
>>>>>> +    return new pass_analyze_zext (m_ctxt);
>>>>>> +  }
>>>>>> +
>>>>>> +}; // class pass_analyze_zext
>>>>>> +
>>>>>> +rtl_opt_pass *
>>>>>> +make_pass_analyze_zext (gcc::context *ctxt)
>>>>>> +{
>>>>>> +  return new pass_analyze_zext (ctxt);
>>>>>> +}
>>>>>> +
>>>>>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>>>>>> index 8e0b0d022db..6541334bf2d 100644
>>>>>> --- a/gcc/config/rs6000/rs6000.cc
>>>>>> +++ b/gcc/config/rs6000/rs6000.cc
>>>>>> @@ -1178,6 +1178,8 @@ static bool rs6000_secondary_reload_move (enum rs6000_reg_type,
>>>>>>                                            bool);
>>>>>>  rtl_opt_pass *make_pass_analyze_swaps (gcc::context*);
>>>>>>
>>>>>> +rtl_opt_pass *make_pass_analyze_zext (gcc::context*);
>>>>>> +
>>>>>>  /* Hash table stuff for keeping track of TOC entries. */
>>>>>>
>>>>>>  struct GTY((for_user)) toc_hash_struct
>>>>>> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
>>>>>> index f183b42ce1d..c1f61591d2f 100644
>>>>>> --- a/gcc/config/rs6000/t-rs6000
>>>>>> +++ b/gcc/config/rs6000/t-rs6000
>>>>>> @@ -35,6 +35,11 @@ rs6000-p8swap.o: $(srcdir)/config/rs6000/rs6000-p8swap.cc
>>>>>>         $(COMPILE) $<
>>>>>>         $(POSTCOMPILE)
>>>>>>
>>>>>> +rs6000-zext-elim.o: $(srcdir)/config/rs6000/rs6000-zext-elim.cc
>>>>>> +       $(COMPILE) $<
>>>>>> +       $(POSTCOMPILE)
>>>>>> +
>>>>>> +
>>>>>>  rs6000-d.o: $(srcdir)/config/rs6000/rs6000-d.cc
>>>>>>         $(COMPILE) $<
>>>>>>         $(POSTCOMPILE)
>>>>>> diff --git a/gcc/explow.cc b/gcc/explow.cc
>>>>>> index 32e9498ee07..316aa975e40 100644
>>>>>> --- a/gcc/explow.cc
>>>>>> +++ b/gcc/explow.cc
>>>>>> @@ -654,7 +654,8 @@ copy_to_mode_reg (machine_mode mode, rtx x)
>>>>>>    if (! general_operand (x, VOIDmode))
>>>>>>      x = force_operand (x, temp);
>>>>>>
>>>>>> -  gcc_assert (GET_MODE (x) == mode || GET_MODE (x) == VOIDmode);
>>>>>> +  gcc_assert (mode == DImode || GET_MODE (x) == mode
>>>>>> +              || GET_MODE (x) == VOIDmode);
>>>>>>    if (x != temp)
>>>>>>      emit_move_insn (temp, x);
>>>>>>    return temp;
>>>>>> diff --git a/gcc/expr.cc b/gcc/expr.cc
>>>>>> index 15be1c8db99..6162ef92b88 100644
>>>>>> --- a/gcc/expr.cc
>>>>>> +++ b/gcc/expr.cc
>>>>>> @@ -4223,9 +4223,9 @@ emit_move_insn (rtx x, rtx y)
>>>>>>    rtx y_cst = NULL_RTX;
>>>>>>    rtx_insn *last_insn;
>>>>>>    rtx set;
>>>>>> -
>>>>>>    gcc_assert (mode != BLKmode
>>>>>> -              && (GET_MODE (y) == mode || GET_MODE (y) == VOIDmode));
>>>>>> +              && (mode == DImode || GET_MODE (y) == mode
>>>>>> +                  || GET_MODE (y) == VOIDmode));
>>>>>>
>>>>>>    /* If we have a copy that looks like one of the following patterns:
>>>>>>       (set (subreg:M1 (reg:M2 ...)) (subreg:M1 (reg:M2 ...)))
>>>>>> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
>>>>>> index 4c641cab192..9d22fadc7ef 100644
>>>>>> --- a/gcc/optabs.cc
>>>>>> +++ b/gcc/optabs.cc
>>>>>> @@ -7902,7 +7902,8 @@ maybe_legitimize_operand (enum insn_code icode, unsigned int opno,
>>>>>>      input:
>>>>>>        gcc_assert (mode != VOIDmode);
>>>>>>        gcc_assert (GET_MODE (op->value) == VOIDmode
>>>>>> -                  || GET_MODE (op->value) == mode);
>>>>>> +                  || GET_MODE (op->value) == mode
>>>>>> +                  || mode == DImode);
>>>>>>        if (maybe_legitimize_operand_same_code (icode, opno, op))
>>>>>>          return true;
>>>>>>
>>>>>> --
>>>>>> 2.31.1
>>>>>>
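
The idea discussed in this thread, replacing a zero extension by a plain copy when the narrow value can only be 0 or 1, can be modelled outside GCC. What follows is only a toy sketch under stated assumptions, not the patch and not GCC's RTL or df API: the types `Op` and `Insn` and the functions `all_defs_are_bool` and `eliminate_zext` are hypothetical names, and the sketch assumes an extension is redundant exactly when every definition of its source register stores the constant 0 or 1.

```cpp
#include <cstdint>
#include <vector>

// Toy instruction set; these names are hypothetical and unrelated to GCC's RTL.
enum class Op { SetConst, ZeroExtend, Copy };

struct Insn {
  Op op;
  int dest;          // destination register number
  int src;           // source register for ZeroExtend/Copy, -1 otherwise
  std::int64_t imm;  // constant stored by SetConst, 0 otherwise
};

// Returns true when `reg` has at least one definition and every definition
// stores the constant 0 or 1 (the assumption this sketch relies on).
bool all_defs_are_bool(const std::vector<Insn>& insns, int reg) {
  bool seen_def = false;
  for (const Insn& insn : insns) {
    if (insn.dest != reg)
      continue;
    if (insn.op != Op::SetConst || (insn.imm != 0 && insn.imm != 1))
      return false;
    seen_def = true;
  }
  return seen_def;
}

// Rewrites each zero extension of a known-0/1 register into a plain copy
// and returns how many instructions were rewritten.
int eliminate_zext(std::vector<Insn>& insns) {
  int replaced = 0;
  for (Insn& insn : insns) {
    if (insn.op == Op::ZeroExtend && all_defs_are_bool(insns, insn.src)) {
      insn.op = Op::Copy;
      ++replaced;
    }
  }
  return replaced;
}
```

Given two `SetConst` definitions of one register (values 1 and 0) followed by a `ZeroExtend` from it, `eliminate_zext` turns the extension into a `Copy`; an extension whose source is defined with any other constant is left alone. The sketch ignores control flow and modes entirely, which the real pass has to deal with.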