From: Richard Biener
Date: Thu, 16 Mar 2023 11:30:02 +0100
Subject: Re: [PATCH] rs6000: suboptimal code for returning bool value on target ppc
To: Ajit Agarwal
Cc: gcc-patches, Segher Boessenkool, bergner@linux.ibm.com
In-Reply-To: <68ae93ab-ecb9-332b-dba8-bdc7b0d6b3c9@linux.ibm.com>

On Thu, Mar 16, 2023 at 11:12 AM Ajit Agarwal wrote:
>
>
> Hello Richard:
>
> On 16/03/23 3:22 pm, Richard Biener wrote:
> > On Thu, Mar 16, 2023 at 9:19 AM Ajit Agarwal wrote:
> >>
> >>
> >>
> >> On 16/03/23 1:44 pm, Richard Biener wrote:
> >>> On Thu, Mar 16, 2023 at 9:11 AM Ajit Agarwal wrote:
> >>>>
> >>>> Hello Richard:
> >>>>
> >>>> On 16/03/23 1:10 pm, Richard Biener wrote:
> >>>>> On Thu, Mar 16, 2023 at 6:21 AM Ajit Agarwal via Gcc-patches
> >>>>> wrote:
> >>>>>>
> >>>>>> Hello All:
> >>>>>>
> >>>>>>
> >>>>>> This patch eliminates unnecessary zero extension instruction from power generated assembly.
> >>>>>> Bootstrapped and regtested on powerpc64-linux-gnu.
> >>>>>
> >>>>> What makes this so special that we cannot deal with it from generic code?
> >>>>> In particular we do have the REE pass, why is target specific
> >>>>> knowledge necessary
> >>>>> to eliminate the extension?
> >>>>>
> >>>>
> >>>> For returning bool values and comparison with integers generates the following by all the rtl passes.
> >>>>
> >>>> set compare (subreg)
> >>>> set if_then_else
> >>>> Convert SImode -> QImode
> >>>> set zero_extend to SImode from QImode
> >>>> set return value 0 in one path of cfg.
> >>>> set return value 1 in other path of cfg.
> >>>>
> >>>> This pass replaces the above zero extension and conversion from QImode to DImode with copy operation to keep QImode in 64 bit registers in powerpc target.
> >>>
> >>> Sorry, I can't parse that - as there's no testcase with the patch I
> >>> cannot even try to see what the actual RTL
> >>> looks like (without the pass).
> >>>
> >>
> >> Here is the PR with bugzilla.
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784
> >>
> >> I can add the attached testcase with this PR in the patch.
> >
> > I don't see any zero-extends there.
> >
>
> Here is the testcase.
>
>
> bool (int a, int b)
> {
>   if (a > 2)
>     return false;
>   if (b < 10)
>     return true;
>   return false;
> }
>
> compiled with gcc -O3 -m64 testcase.cc -mcpu=power9 -save-temps.
>
> Here is the rtl after cse.
> (note 12 11 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 15 12 16 3 (set (reg:CC 123)
>         (compare:CC (subreg/s/u:SI (reg/v:DI 120 [ b ]) 0)
>             (const_int 9 [0x9]))) "ext.cc":5:5 796 {*cmpsi_signed}
>      (expr_list:REG_DEAD (reg/v:DI 120 [ b ])
>         (nil)))
> (insn 16 15 17 3 (set (reg:SI 124)
>         (const_int 1 [0x1])) "ext.cc":5:5 555 {*movsi_internal1}
>      (nil))
> (insn 17 16 18 3 (set (reg:SI 122)
>         (if_then_else:SI (gt (reg:CC 123)
>                 (const_int 0 [0]))
>             (const_int 0 [0])
>             (reg:SI 124))) "ext.cc":5:5 344 {isel_cc_si}
>      (expr_list:REG_DEAD (reg:SI 124)
>         (expr_list:REG_DEAD (reg:CC 123)
>             (nil))))
> (insn 18 17 32 3 (set (reg:QI 117 [ _1 ])
>         (subreg:QI (reg:SI 122) 0)) "ext.cc":5:5 562 {*movqi_internal}
>      (expr_list:REG_DEAD (reg:SI 122)
>         (nil)))
> ; pc falls through to BB 5
> (code_label 32 18 31 4 3 (nil) [1 uses])
> (note 31 32 5 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> (insn 5 31 19 4 (set (reg:QI 117 [ _1 ])
>         (const_int 0 [0])) "ext.cc":4:16 562 {*movqi_internal}
>      (nil))
> (code_label 19 5 20 5 2 (nil) [0 uses])
> (note 20 19 21 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
> (insn 21 20 22 5 (set (reg:DI 126 [ _1 ])
>         (zero_extend:DI (reg:QI 117 [ _1 ]))) "ext.cc":8:1 5 {zero_extendqidi2}
>      (expr_list:REG_DEAD (reg:QI 117 [ _1 ])
>         (nil)))
> (insn 22 21 26 5 (set (reg:DI 118 [ ])
>         (reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64}
>      (expr_list:REG_DEAD (reg:DI 126 [ _1 ])
>         (nil)))
> (insn 26 22 27 5 (set (reg/i:DI 3 3)
>         (reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64}
>      (expr_list:REG_DEAD (reg:DI 118 [ ])
>         (nil)))
> (insn 27 26 0 5 (use (reg/i:DI 3 3)) "ext.cc":8:1 -1
>      (nil))

But after combine there's just

(note 6 0 38 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 38 6 2 2 (set (reg:DI 126)
        (reg:DI 3 3 [ a ])) "t.c":3:1 634 {*movdi_internal64}
     (expr_list:REG_DEAD (reg:DI 3 3 [ a ])
        (nil)))
(note 2 38 39 2 NOTE_INSN_DELETED)
(insn 39 2 3 2 (set (reg:DI 127)
        (reg:DI 4 4 [ b ])) "t.c":3:1 634 {*movdi_internal64}
     (expr_list:REG_DEAD (reg:DI 4 4 [ b ])
        (nil)))
(insn 3 39 4 2 (set (reg/v:DI 119 [ b ])
        (reg:DI 127)) "t.c":3:1 634 {*movdi_internal64}
     (expr_list:REG_DEAD (reg:DI 127)
        (nil)))
(note 4 3 10 2 NOTE_INSN_FUNCTION_BEG)
(insn 10 4 11 2 (set (reg:CC 120)
        (compare:CC (subreg/s/u:SI (reg:DI 126) 0)
            (const_int 2 [0x2]))) "t.c":4:6 755 {*cmpsi_signed}
     (expr_list:REG_DEAD (reg:DI 126)
        (nil)))
(jump_insn 11 10 12 2 (set (pc)
        (if_then_else (gt (reg:CC 120)
                (const_int 0 [0]))
            (label_ref:DI 32)
            (pc))) "t.c":4:6 838 {*cbranch}
     (expr_list:REG_DEAD (reg:CC 120)
        (int_list:REG_BR_PROB 365072228 (nil)))
 -> 32)
(note 12 11 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(note 15 12 16 3 NOTE_INSN_DELETED)
(note 16 15 17 3 NOTE_INSN_DELETED)
(note 17 16 19 3 NOTE_INSN_DELETED)
(insn 19 17 32 3 (parallel [
            (set (reg:DI 117 [ ])
                (le:DI (subreg/s/u:SI (reg/v:DI 119 [ b ]) 0)
                    (const_int 9 [0x9])))
            (clobber (scratch:DI))
            (clobber (scratch:DI))
            (clobber (scratch:CC))
        ]) "t.c":6:6 783 {ledisi2_isel}
     (expr_list:REG_DEAD (reg/v:DI 119 [ b ])
        (nil)))
; pc falls through to BB 5
(code_label 32 19 31 4 3 (nil) [1 uses])
(note 31 32 5 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 5 31 20 4 (set (reg:DI 117 [ ])
        (const_int 0 [0])) "t.c":5:12 634 {*movdi_internal64}
     (nil))
(code_label 20 5 21 5 2 (nil) [0 uses])
(note 21 20 26 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 26 21 27 5 (set (reg/i:DI 3 3)
        (reg:DI 117 [ ])) "t.c":9:1 634 {*movdi_internal64}
     (expr_list:REG_DEAD (reg:DI 117 [ ])
        (nil)))
(insn 27 26 0 5 (use (reg/i:DI 3 3)) "t.c":9:1 -1
     (nil))

and we get

foo:
.LFB0:
        .cfi_startproc
        cmpwi 0,3,2
        bgt 0,.L3
        cmpwi 0,4,9
        li 3,1
        isel 3,0,3,1
        blr
        .p2align 4,,15
.L3:
        li 3,0
        blr

where I don't see what we can do better (ok, not knowing ppc very much)

>
> Thanks & Regards
> Ajit
>
> >> Thanks & Regards
> >> Ajit
> >>> Richard.
> >>>
> >>>> Thanks & Regards
> >>>> Ajit
> >>>>>> +   In cfgexpand pass QImode is generated with
> >>>>>> +   bool register value and this pass uses QI
> >>>>>> +   as 64 bit registers.
> >>>>>> +
> >>>>
> >>>>>>     rs6000: suboptimal code for returning bool value on target ppc.
> >>>>>>
> >>>>>>     New pass to eliminate unnecessary zero extension. This pass
> >>>>>>     is registered after cse rtl pass.
> >>>>>>
> >>>>>>     2023-03-16  Ajit Kumar Agarwal
> >>>>>>
> >>>>>>     gcc/ChangeLog:
> >>>>>>
> >>>>>>         * config/rs6000/rs6000-passes.def: Registered zero elimination
> >>>>>>         pass.
> >>>>>>         * config/rs6000/rs6000-zext-elim.cc: Add new pass.
> >>>>>>         * config.gcc: Add new executable.
> >>>>>>         * config/rs6000/rs6000-protos.h: Add new prototype for zero
> >>>>>>         elimination pass.
> >>>>>>         * config/rs6000/rs6000.cc: Add new prototype for zero
> >>>>>>         elimination pass.
> >>>>>>         * config/rs6000/t-rs6000: Add new rule.
> >>>>>>         * expr.cc: Modified gcc assert.
> >>>>>>         * explow.cc: Modified gcc assert.
> >>>>>>         * optabs.cc: Modified gcc assert.
> >>>>>> ---
> >>>>>>  gcc/config.gcc                        |   4 +-
> >>>>>>  gcc/config/rs6000/rs6000-passes.def   |   2 +
> >>>>>>  gcc/config/rs6000/rs6000-protos.h     |   1 +
> >>>>>>  gcc/config/rs6000/rs6000-zext-elim.cc | 361 ++++++++++++++++++++++++++
> >>>>>>  gcc/config/rs6000/rs6000.cc           |   2 +
> >>>>>>  gcc/config/rs6000/t-rs6000            |   5 +
> >>>>>>  gcc/explow.cc                         |   3 +-
> >>>>>>  gcc/expr.cc                           |   4 +-
> >>>>>>  gcc/optabs.cc                         |   3 +-
> >>>>>>  9 files changed, 379 insertions(+), 6 deletions(-)
> >>>>>>  create mode 100644 gcc/config/rs6000/rs6000-zext-elim.cc
> >>>>>>
> >>>>>> diff --git a/gcc/config.gcc b/gcc/config.gcc
> >>>>>> index da3a6d3ba1f..e8ac9d882f0 100644
> >>>>>> --- a/gcc/config.gcc
> >>>>>> +++ b/gcc/config.gcc
> >>>>>> @@ -503,7 +503,7 @@ or1k*-*-*)
> >>>>>>          ;;
> >>>>>>  powerpc*-*-*)
> >>>>>>          cpu_type=rs6000
> >>>>>> -        extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
> >>>>>> +        extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-zext-elim.o rs6000-logue.o"
> >>>>>>          extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
> >>>>>>          extra_objs="${extra_objs} rs6000-builtins.o rs6000-builtin.o"
> >>>>>>          extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
> >>>>>> @@ -538,7 +538,7 @@ riscv*)
> >>>>>>          ;;
> >>>>>>  rs6000*-*-*)
> >>>>>>          extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
> >>>>>> -        extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
> >>>>>> +        extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-zext-elim.o rs6000-logue.o"
> >>>>>>          extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
> >>>>>>          target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.cc \$(srcdir)/config/rs6000/rs6000-call.cc"
> >>>>>>          target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel-opt.cc"
> >>>>>> diff --git a/gcc/config/rs6000/rs6000-passes.def b/gcc/config/rs6000/rs6000-passes.def
> >>>>>> index ca899d5f7af..d7500feddf1 100644
> >>>>>> --- a/gcc/config/rs6000/rs6000-passes.def
> >>>>>> +++ b/gcc/config/rs6000/rs6000-passes.def
> >>>>>> @@ -28,6 +28,8 @@ along with GCC; see the file COPYING3. If not see
> >>>>>>       The power8 does not have instructions that automaticaly do the byte swaps
> >>>>>>       for loads and stores. */
> >>>>>>    INSERT_PASS_BEFORE (pass_cse, 1, pass_analyze_swaps);
> >>>>>> +  INSERT_PASS_AFTER (pass_cse, 1, pass_analyze_zext);
> >>>>>> +
> >>>>>>
> >>>>>>    /* Pass to do the PCREL_OPT optimization that combines the load of an
> >>>>>>       external symbol's address along with a single load or store using that
> >>>>>> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
> >>>>>> index 1a4fc1df668..f6cf2d673d4 100644
> >>>>>> --- a/gcc/config/rs6000/rs6000-protos.h
> >>>>>> +++ b/gcc/config/rs6000/rs6000-protos.h
> >>>>>> @@ -340,6 +340,7 @@ namespace gcc { class context; }
> >>>>>>  class rtl_opt_pass;
> >>>>>>
> >>>>>>  extern rtl_opt_pass *make_pass_analyze_swaps (gcc::context *);
> >>>>>> +extern rtl_opt_pass *make_pass_analyze_zext (gcc::context *);
> >>>>>>  extern rtl_opt_pass *make_pass_pcrel_opt (gcc::context *);
> >>>>>>  extern bool rs6000_sum_of_two_registers_p (const_rtx expr);
> >>>>>>  extern bool rs6000_quadword_masked_address_p (const_rtx exp);
> >>>>>> diff --git a/gcc/config/rs6000/rs6000-zext-elim.cc b/gcc/config/rs6000/rs6000-zext-elim.cc
> >>>>>> new file mode 100644
> >>>>>> index 00000000000..777c7a5a387
> >>>>>> --- /dev/null
> >>>>>> +++ b/gcc/config/rs6000/rs6000-zext-elim.cc
> >>>>>> @@ -0,0 +1,361 @@
> >>>>>> +/* Subroutine to eliminate redundant zero extend for power architecture.
> >>>>>> +   Copyright (C) 1991-2023 Free Software Foundation, Inc.
> >>>>>> +
> >>>>>> +   This file is part of GCC.
> >>>>>> +
> >>>>>> +   GCC is free software; you can redistribute it and/or modify it
> >>>>>> +   under the terms of the GNU General Public License as published
> >>>>>> +   by the Free Software Foundation; either version 3, or (at your
> >>>>>> +   option) any later version.
> >>>>>> +
> >>>>>> +   GCC is distributed in the hope that it will be useful, but WITHOUT
> >>>>>> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> >>>>>> +   or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
> >>>>>> +   License for more details.
> >>>>>> +
> >>>>>> +   You should have received a copy of the GNU General Public License
> >>>>>> +   along with GCC; see the file COPYING3. If not see
> >>>>>> +   . */
> >>>>>> +
> >>>>>> +/* This pass remove unnecessary zero extension instruction from
> >>>>>> +   power generated assembly. This pass is register after cse
> >>>>>> +   pass.
> >>>>>> +   Identifies the following sequence of instruction after cse
> >>>>>> +   rtl pass.
> >>>>>> +
> >>>>>> +   set compare (subreg)
> >>>>>> +   set if_then_else
> >>>>>> +   set SImode -> QImode
> >>>>>> +   set zero_extend to DImode from QImode
> >>>>>> +   set return value 0 in one path of cfg.
> >>>>>> +   set return value 1 in other path of cfg.
> >>>>>> +
> >>>>>> +   In cfgexpand pass QImode is generated with
> >>>>>> +   bool register value and this pass uses QI
> >>>>>> +   as 64 bit registers.
> >>>>>> +
> >>>>>> +   This pass replace copy operation from QImode to DImode
> >>>>>> +   and return appropriate return values.*/
> >>>>>> +
> >>>>>> +#define IN_TARGET_CODE 1
> >>>>>> +
> >>>>>> +#include "config.h"
> >>>>>> +#include "system.h"
> >>>>>> +#include "coretypes.h"
> >>>>>> +#include "backend.h"
> >>>>>> +#include "rtl.h"
> >>>>>> +#include "tree.h"
> >>>>>> +#include "memmodel.h"
> >>>>>> +#include "df.h"
> >>>>>> +#include "tm_p.h"
> >>>>>> +#include "ira.h"
> >>>>>> +#include "print-tree.h"
> >>>>>> +#include "varasm.h"
> >>>>>> +#include "explow.h"
> >>>>>> +#include "expr.h"
> >>>>>> +#include "output.h"
> >>>>>> +#include "tree-pass.h"
> >>>>>> +
> >>>>>> +/* This is based on the union-find logic in web.cc. web_entry_base is
> >>>>>> +   defined in df.h. */
> >>>>>> +class zext_web_entry : public web_entry_base
> >>>>>> +{
> >>>>>> + public:
> >>>>>> +  /* Pointer to the insn. */
> >>>>>> +  rtx_insn *insn;
> >>>>>> +  unsigned int is_relevant : 1;
> >>>>>> +  /* Set if insn is a load. */
> >>>>>> +  unsigned int is_load : 1;
> >>>>>> +  /* Set if insn is a store. */
> >>>>>> +  unsigned int is_store : 1;
> >>>>>> +  unsigned int is_zext :1 ;
> >>>>>> +  unsigned int is_move :1;
> >>>>>> +  unsigned int is_delete_move :1;
> >>>>>> +  /* Set if this insn should be deleted. */
> >>>>>> +  unsigned int will_delete : 1;
> >>>>>> +  unsigned int will_delete_chances : 1;
> >>>>>> +};
> >>>>>> +
> >>>>>> +/* Checks if instruction is zero extension
> >>>>>> + * with QIMode to DImode.*/
> >>>>>> +static unsigned int
> >>>>>> +insn_is_zext_p(rtx insn)
> >>>>>> +{
> >>>>>> +  rtx body = PATTERN (insn);
> >>>>>> +
> >>>>>> +  if (GET_CODE (body) == SET
> >>>>>> +      && GET_MODE(SET_DEST (body)) == DImode
> >>>>>> +      && GET_CODE(SET_SRC (body)) == ZERO_EXTEND)
> >>>>>> +    {
> >>>>>> +      rtx set = XEXP (SET_SRC (body), 0);
> >>>>>> +
> >>>>>> +      if (REG_P (set))
> >>>>>> +        {
> >>>>>> +          if (GET_MODE (set) == QImode) return 1;
> >>>>>> +        }
> >>>>>> +      else
> >>>>>> +        return 0;
> >>>>>> +    }
> >>>>>> +  return 0;
> >>>>>> +}
> >>>>>> +
> >>>>>> +/* Checks if instruction is SET operation with QImode.*/
> >>>>>> +static unsigned int
> >>>>>> +insn_is_store_p (rtx insn)
> >>>>>> +{
> >>>>>> +  rtx body = PATTERN (insn);
> >>>>>> +  if (GET_CODE (body) == SET
> >>>>>> +      && SUBREG_P(SET_SRC (body))
> >>>>>> +      && !CONST_INT_P(SET_SRC (body))
> >>>>>> +      && GET_MODE(XEXP (SET_SRC (body), 0)) == SImode
> >>>>>> +      && GET_MODE(SET_SRC (body)) == QImode)
> >>>>>> +    return 1;
> >>>>>> +
> >>>>>> +  return 0;
> >>>>>> +}
> >>>>>> +
> >>>>>> +/* Find out zero extension removal candidate with use-def web.*/
> >>>>>> +static void
> >>>>>> +find_zero_ext_elimination_candidate (zext_web_entry *insn_entry,
> >>>>>> +                                     rtx insn, df_ref def)
> >>>>>> +{
> >>>>>> +  struct df_link *link = DF_REF_CHAIN (def);
> >>>>>> +
> >>>>>> +  rtx move_insn = NULL_RTX;
> >>>>>> +  rtx compare_insn = NULL_RTX;
> >>>>>> +
> >>>>>> +  while (link)
> >>>>>> +    {
> >>>>>> +      if (!DF_REF_INSN_INFO (link->ref))
> >>>>>> +        insn_entry[INSN_UID(insn)].will_delete_chances = 0;
> >>>>>> +
> >>>>>> +      if (DF_REF_INSN_INFO (link->ref))
> >>>>>> +        {
> >>>>>> +          rtx use_insn = DF_REF_INSN (link->ref);
> >>>>>> +
> >>>>>> +          if (GET_CODE (PATTERN (use_insn)) == SET
> >>>>>> +              && (GET_CODE (SET_SRC (PATTERN (use_insn))) == IF_THEN_ELSE))
> >>>>>> +            {
> >>>>>> +              if (GET_CODE (PATTERN (insn)) == SET
> >>>>>> +                  && GET_CODE (SET_SRC (PATTERN (insn))) == COMPARE)
> >>>>>> +                {
> >>>>>> +                  rtx body = XEXP (SET_SRC (PATTERN (insn)), 0);
> >>>>>> +
> >>>>>> +                  if (SUBREG_P (body))
> >>>>>> +                    {
> >>>>>> +                      compare_insn = use_insn;
> >>>>>> +                      rtx compare_body = XEXP (SET_SRC (PATTERN (compare_insn)), 0);
> >>>>>> +
> >>>>>> +                      if (compare_insn
> >>>>>> +                          && ((REGNO (XEXP (compare_body, 0)))
> >>>>>> +                              == REGNO (SET_DEST (PATTERN (insn)))))
> >>>>>> +                        insn_entry[INSN_UID(use_insn)].will_delete_chances = 1;
> >>>>>> +                    }
> >>>>>> +                }
> >>>>>> +            }
> >>>>>> +
> >>>>>> +          if (insn_is_store_p(use_insn)
> >>>>>> +              && GET_CODE (PATTERN (insn)) == SET
> >>>>>> +              && (GET_CODE (SET_SRC (PATTERN(insn))) == IF_THEN_ELSE))
> >>>>>> +            {
> >>>>>> +              if (GET_MODE (SET_DEST (PATTERN (insn))) == SImode)
> >>>>>> +                {
> >>>>>> +                  if (insn_entry[INSN_UID(insn)].will_delete_chances)
> >>>>>> +                    insn_entry[INSN_UID(use_insn)].will_delete_chances = 1;
> >>>>>> +                }
> >>>>>> +            }
> >>>>>> +
> >>>>>> +          if (insn_is_zext_p (insn))
> >>>>>> +            {
> >>>>>> +              if (GET_CODE (PATTERN (use_insn)) == SET
> >>>>>> +                  && REG_P (SET_SRC (PATTERN (use_insn))))
> >>>>>> +                {
> >>>>>> +                  if (move_insn
> >>>>>> +                      && REGNO (SET_SRC (PATTERN (use_insn)))
> >>>>>> +                         == REGNO (SET_SRC (PATTERN (move_insn)))
> >>>>>> +                      && insn_entry[INSN_UID(insn)].is_delete_move)
> >>>>>> +                    {
> >>>>>> +                      insn_entry[INSN_UID (insn)].is_move = 1;
> >>>>>> +                      break;
> >>>>>> +                    }
> >>>>>> +                  else if (insn_entry[INSN_UID (insn)].will_delete)
> >>>>>> +                    {
> >>>>>> +                      move_insn = use_insn;
> >>>>>> +                      insn_entry[INSN_UID(insn)].is_delete_move=1;
> >>>>>> +                    }
> >>>>>> +                }
> >>>>>> +            }
> >>>>>> +
> >>>>>> +          if (insn_is_zext_p (use_insn))
> >>>>>> +            {
> >>>>>> +              insn_entry[INSN_UID (use_insn)].is_zext = 1;
> >>>>>> +              insn_entry[INSN_UID(use_insn)].is_relevant = 1;
> >>>>>> +
> >>>>>> +              if (insn_is_store_p (insn)
> >>>>>> +                  && insn_entry[INSN_UID (insn)].will_delete_chances)
> >>>>>> +                {
> >>>>>> +                  insn_entry[INSN_UID (use_insn)].will_delete = 1;
> >>>>>> +                  insn_entry[INSN_UID (insn)].will_delete = 1;
> >>>>>> +                  insn_entry[INSN_UID( insn)].is_store = 1;
> >>>>>> +                }
> >>>>>> +
> >>>>>> +              if (NONDEBUG_INSN_P (use_insn))
> >>>>>> +                unionfind_union (insn_entry + INSN_UID (insn),
> >>>>>> +                                 insn_entry + INSN_UID (use_insn));
> >>>>>> +            }
> >>>>>> +        }
> >>>>>> +
> >>>>>> +      link = link->next;
> >>>>>> +    }
> >>>>>> +}
> >>>>>> +
> >>>>>> +/* Replace QImode extensions with copy operations.*/
> >>>>>> +static void
> >>>>>> +replace_marked_insns (zext_web_entry *insn_entry, unsigned i)
> >>>>>> +{
> >>>>>> +  rtx_insn *insn = insn_entry[i].insn;
> >>>>>> +  rtx body = PATTERN (insn);
> >>>>>> +  rtx src_reg;
> >>>>>> +  src_reg = XEXP (SET_SRC (body), 0);
> >>>>>> +  set_mode_and_regno (src_reg, DImode, REGNO(src_reg));
> >>>>>> +
> >>>>>> +  if (GET_MODE(SET_DEST(body)) != DImode)
> >>>>>> +    set_mode_and_regno (SET_DEST(body), DImode, REGNO (SET_DEST (body)));
> >>>>>> +
> >>>>>> +  rtx copy = gen_rtx_SET (SET_DEST (body), src_reg);
> >>>>>> +  rtx_insn *new_insn = emit_insn_before (copy, insn);
> >>>>>> +  set_block_for_insn (new_insn, BLOCK_FOR_INSN (insn));
> >>>>>> +  df_insn_rescan (new_insn);
> >>>>>> +
> >>>>>> +  df_insn_delete (insn);
> >>>>>> +  remove_insn (insn);
> >>>>>> +  insn->set_deleted ();
> >>>>>> +}
> >>>>>> +
> >>>>>> +/* Main entry point for this pass. */
> >>>>>> +unsigned int
> >>>>>> +rs6000_analyze_zext (function *fun)
> >>>>>> +{
> >>>>>> +  zext_web_entry *insn_entry;
> >>>>>> +  basic_block bb;
> >>>>>> +  rtx_insn *insn, *curr_insn = 0;
> >>>>>> +
> >>>>>> +  /* Dataflow analysis for use-def chains. */
> >>>>>> +  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
> >>>>>> +  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
> >>>>>> +  df_analyze ();
> >>>>>> +  df_set_flags (DF_DEFER_INSN_RESCAN);
> >>>>>> +
> >>>>>> +  /* Rebuild ud- and du-chains. */
> >>>>>> +  df_remove_problem (df_chain);
> >>>>>> +  df_process_deferred_rescans ();
> >>>>>> +  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
> >>>>>> +  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
> >>>>>> +  df_analyze ();
> >>>>>> +  df_set_flags (DF_DEFER_INSN_RESCAN);
> >>>>>> +
> >>>>>> +  /* Allocate structure to represent webs of insns. */
> >>>>>> +  insn_entry = XCNEWVEC (zext_web_entry, get_max_uid ());
> >>>>>> +
> >>>>>> +  /* Walk the insns to gather basic data. */
> >>>>>> +  FOR_ALL_BB_FN (bb, fun)
> >>>>>> +    FOR_BB_INSNS_SAFE (bb, insn, curr_insn)
> >>>>>> +    {
> >>>>>> +      unsigned int uid = INSN_UID (insn);
> >>>>>> +      if (NONDEBUG_INSN_P (insn))
> >>>>>> +        {
> >>>>>> +          insn_entry[uid].insn = insn;
> >>>>>> +
> >>>>>> +          if (GET_CODE (insn) == insn_is_store_p (insn))
> >>>>>> +            {
> >>>>>> +              insn_entry[uid].is_store = 1;
> >>>>>> +              insn_entry[uid].is_relevant = 1;
> >>>>>> +            }
> >>>>>> +
> >>>>>> +          /* Walk the uses and defs to identify the optimization
> >>>>>> +             candidates.*/
> >>>>>> +          struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
> >>>>>> +          df_ref mention;
> >>>>>> +
> >>>>>> +          FOR_EACH_INSN_INFO_DEF (mention, insn_info)
> >>>>>> +            {
> >>>>>> +              insn_entry[uid].is_relevant = 1;
> >>>>>> +              insn_entry[uid].is_store = insn_is_store_p (insn);
> >>>>>> +              find_zero_ext_elimination_candidate (insn_entry, insn, mention);
> >>>>>> +            }
> >>>>>> +
> >>>>>> +          if (insn_entry[uid].is_relevant)
> >>>>>> +            {
> >>>>>> +              /* Determine if this is a store. */
> >>>>>> +              insn_entry[uid].is_store = insn_is_store_p (insn);
> >>>>>> +            }
> >>>>>> +        }
> >>>>>> +    }
> >>>>>> +
> >>>>>> +  unsigned e = get_max_uid (), i;
> >>>>>> +
> >>>>>> +  int store_index = -1;
> >>>>>> +
> >>>>>> +  /* Replace with copy operation.*/
> >>>>>> +  for (i = 0; i < e; ++i)
> >>>>>> +    {
> >>>>>> +      if (insn_entry[i].is_store && insn_entry[i].will_delete)
> >>>>>> +        store_index = i;
> >>>>>> +
> >>>>>> +      if ((store_index != -1)
> >>>>>> +          && insn_entry[i].is_move && insn_entry[i].will_delete)
> >>>>>> +        {
> >>>>>> +          replace_marked_insns (insn_entry, store_index);
> >>>>>> +          replace_marked_insns (insn_entry, i);
> >>>>>> +        }
> >>>>>> +    }
> >>>>>> +  /* Clean up. */
> >>>>>> +  free (insn_entry);
> >>>>>> +
> >>>>>> +  return 0;
> >>>>>> +}
> >>>>>> +
> >>>>>> +const pass_data pass_data_analyze_zext =
> >>>>>> +{
> >>>>>> +  RTL_PASS, /* type */
> >>>>>> +  "zext", /* name */
> >>>>>> +  OPTGROUP_NONE, /* optinfo_flags */
> >>>>>> +  TV_NONE, /* tv_id */
> >>>>>> +  0, /* properties_required */
> >>>>>> +  0, /* properties_provided */
> >>>>>> +  0, /* properties_destroyed */
> >>>>>> +  0, /* todo_flags_start */
> >>>>>> +  TODO_df_finish, /* todo_flags_finish */
> >>>>>> +};
> >>>>>> +
> >>>>>> +class pass_analyze_zext : public rtl_opt_pass
> >>>>>> +{
> >>>>>> +public:
> >>>>>> +  pass_analyze_zext(gcc::context *ctxt)
> >>>>>> +    : rtl_opt_pass(pass_data_analyze_zext, ctxt)
> >>>>>> +  {}
> >>>>>> +
> >>>>>> +  /* opt_pass methods: */
> >>>>>> +  virtual bool gate (function *)
> >>>>>> +    {
> >>>>>> +      return (optimize > 0 );
> >>>>>> +    }
> >>>>>> +
> >>>>>> +  virtual unsigned int execute (function *fun)
> >>>>>> +    {
> >>>>>> +      return rs6000_analyze_zext (fun);
> >>>>>> +    }
> >>>>>> +
> >>>>>> +  opt_pass *clone ()
> >>>>>> +    {
> >>>>>> +      return new pass_analyze_zext (m_ctxt);
> >>>>>> +    }
> >>>>>> +
> >>>>>> +}; // class pass_analyze_zext
> >>>>>> +
> >>>>>> +rtl_opt_pass *
> >>>>>> +make_pass_analyze_zext (gcc::context *ctxt)
> >>>>>> +{
> >>>>>> +  return new pass_analyze_zext (ctxt);
> >>>>>> +}
> >>>>>> +
> >>>>>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> >>>>>> index 8e0b0d022db..6541334bf2d 100644
> >>>>>> --- a/gcc/config/rs6000/rs6000.cc
> >>>>>> +++ b/gcc/config/rs6000/rs6000.cc
> >>>>>> @@ -1178,6 +1178,8 @@ static bool rs6000_secondary_reload_move (enum rs6000_reg_type,
> >>>>>>                                            bool);
> >>>>>>  rtl_opt_pass *make_pass_analyze_swaps (gcc::context*);
> >>>>>>
> >>>>>> +rtl_opt_pass *make_pass_analyze_zext (gcc::context*);
> >>>>>> +
> >>>>>>  /* Hash table stuff for keeping track of TOC entries. */
> >>>>>>
> >>>>>>  struct GTY((for_user)) toc_hash_struct
> >>>>>> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
> >>>>>> index f183b42ce1d..c1f61591d2f 100644
> >>>>>> --- a/gcc/config/rs6000/t-rs6000
> >>>>>> +++ b/gcc/config/rs6000/t-rs6000
> >>>>>> @@ -35,6 +35,11 @@ rs6000-p8swap.o: $(srcdir)/config/rs6000/rs6000-p8swap.cc
> >>>>>>          $(COMPILE) $<
> >>>>>>          $(POSTCOMPILE)
> >>>>>>
> >>>>>> +rs6000-zext-elim.o: $(srcdir)/config/rs6000/rs6000-zext-elim.cc
> >>>>>> +        $(COMPILE) $<
> >>>>>> +        $(POSTCOMPILE)
> >>>>>> +
> >>>>>> +
> >>>>>>  rs6000-d.o: $(srcdir)/config/rs6000/rs6000-d.cc
> >>>>>>          $(COMPILE) $<
> >>>>>>          $(POSTCOMPILE)
> >>>>>> diff --git a/gcc/explow.cc b/gcc/explow.cc
> >>>>>> index 32e9498ee07..316aa975e40 100644
> >>>>>> --- a/gcc/explow.cc
> >>>>>> +++ b/gcc/explow.cc
> >>>>>> @@ -654,7 +654,8 @@ copy_to_mode_reg (machine_mode mode, rtx x)
> >>>>>>    if (! general_operand (x, VOIDmode))
> >>>>>>      x = force_operand (x, temp);
> >>>>>>
> >>>>>> -  gcc_assert (GET_MODE (x) == mode || GET_MODE (x) == VOIDmode);
> >>>>>> +  gcc_assert (mode == DImode || GET_MODE (x) == mode
> >>>>>> +              || GET_MODE (x) == VOIDmode);
> >>>>>>    if (x != temp)
> >>>>>>      emit_move_insn (temp, x);
> >>>>>>    return temp;
> >>>>>> diff --git a/gcc/expr.cc b/gcc/expr.cc
> >>>>>> index 15be1c8db99..6162ef92b88 100644
> >>>>>> --- a/gcc/expr.cc
> >>>>>> +++ b/gcc/expr.cc
> >>>>>> @@ -4223,9 +4223,9 @@ emit_move_insn (rtx x, rtx y)
> >>>>>>    rtx y_cst = NULL_RTX;
> >>>>>>    rtx_insn *last_insn;
> >>>>>>    rtx set;
> >>>>>> -
> >>>>>>    gcc_assert (mode != BLKmode
> >>>>>> -              && (GET_MODE (y) == mode || GET_MODE (y) == VOIDmode));
> >>>>>> +              && (mode == DImode || GET_MODE (y) == mode
> >>>>>> +                  || GET_MODE (y) == VOIDmode));
> >>>>>>
> >>>>>>    /* If we have a copy that looks like one of the following patterns:
> >>>>>>         (set (subreg:M1 (reg:M2 ...)) (subreg:M1 (reg:M2 ...)))
> >>>>>> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> >>>>>> index 4c641cab192..9d22fadc7ef 100644
> >>>>>> --- a/gcc/optabs.cc
> >>>>>> +++ b/gcc/optabs.cc
> >>>>>> @@ -7902,7 +7902,8 @@ maybe_legitimize_operand (enum insn_code icode, unsigned int opno,
> >>>>>>      input:
> >>>>>>        gcc_assert (mode != VOIDmode);
> >>>>>>        gcc_assert (GET_MODE (op->value) == VOIDmode
> >>>>>> -                  || GET_MODE (op->value) == mode);
> >>>>>> +                  || GET_MODE (op->value) == mode
> >>>>>> +                  || mode == DImode);
> >>>>>>        if (maybe_legitimize_operand_same_code (icode, opno, op))
> >>>>>>          return true;
> >>>>>>
> >>>>>> --
> >>>>>> 2.31.1
> >>>>>>
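
For reference, the testcase discussed in this thread can be written as a self-contained C++ sketch. The function name `foo` is an assumption taken from the `foo:` label in the assembly; `constexpr`, the helper `widened` and the checks are hypothetical additions for illustration and are not part of the original mails or the patch. The sketch only checks source-level semantics: a bool result is already 0 or 1, so widening it to 64 bits leaves the value unchanged. It says nothing about which instructions GCC emits.

```cpp
#include <cstdint>

// Testcase from the thread. The name `foo` is an assumption (taken from
// the `foo:` assembly label); `constexpr` is added only so that the
// checks below can run at compile time (needs C++14 or later).
constexpr bool foo (int a, int b)
{
  if (a > 2)
    return false;
  if (b < 10)
    return true;
  return false;
}

// Hypothetical helper: widen the bool result to a 64-bit integer,
// modelling a bool value held in a 64-bit register.
constexpr std::uint64_t widened (int a, int b)
{
  return static_cast<std::uint64_t> (foo (a, b));
}

// The widened value is always exactly 0 or 1.
static_assert (widened (3, 0) == 0, "a > 2 yields false");
static_assert (widened (2, 9) == 1, "a <= 2 and b < 10 yields true");
static_assert (widened (2, 10) == 0, "b >= 10 yields false");
```

Compiling a file like this with the options quoted earlier (gcc -O3 -m64 -mcpu=power9 -save-temps) on a powerpc64 target is one way to inspect the RTL dumps and assembly being compared in this thread.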