From: Ajit Agarwal <aagarwa1@linux.ibm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
Segher Boessenkool <segher@kernel.crashing.org>,
bergner@linux.ibm.com
Subject: Re: [PATCH] rs6000: suboptimal code for returning bool value on target ppc
Date: Thu, 16 Mar 2023 15:41:48 +0530 [thread overview]
Message-ID: <68ae93ab-ecb9-332b-dba8-bdc7b0d6b3c9@linux.ibm.com> (raw)
In-Reply-To: <CAFiYyc2GDvx2iWi2tLq=YrMyQ33oBrMFs5WtgbnOkm-0UtUxDA@mail.gmail.com>
Hello Richard:
On 16/03/23 3:22 pm, Richard Biener wrote:
> On Thu, Mar 16, 2023 at 9:19 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>>
>>
>>
>> On 16/03/23 1:44 pm, Richard Biener wrote:
>>> On Thu, Mar 16, 2023 at 9:11 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>>>>
>>>> Hello Richard:
>>>>
>>>> On 16/03/23 1:10 pm, Richard Biener wrote:
>>>>> On Thu, Mar 16, 2023 at 6:21 AM Ajit Agarwal via Gcc-patches
>>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>>>
>>>>>> Hello All:
>>>>>>
>>>>>>
>>>>>> This patch eliminates unnecessary zero extension instruction from power generated assembly.
>>>>>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>>>>
>>>>> What makes this so special that we cannot deal with it from generic code?
>>>>> In particular we do have the REE pass, why is target specific
>>>>> knowledge neccessary
>>>>> to eliminate the extension?
>>>>>
>>>>
>>>> For returning bool values and comparision with integers generates the following by all the rtl passes.
>>>>
>>>> set compare (subreg)
>>>> set if_then_else
>>>> Convert SImode -> QImode
>>>> set zero_extend to SImode from QImode
>>>> set return value 0 in one path of cfg.
>>>> set return value 1 in other path of cfg.
>>>>
>>>> This pass replaces the above zero extension and conversion from QImode to DImode with copy operation to keep QImode in 64 bit registers in powerpc target.
>>>
>>> Sorry, I can't parse that - as there's no testcase with the patch I
>>> cannot even try to see what the actual RTL
>>> looks like (without the pass).
>>>
>>
>> Here is the PR with bugzilla.
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784
>>
>> I can add the attached testcase with this PR in the patch.
>
> I don't see any zero-extends there.
>
Here is the testcase.
bool (int a, int b)
{
if (a > 2)
return false;
if (b < 10)
return true;
return false;
}
compiled with gcc -O3 -m64 testcase.cc -mcpu=power9 -save-temps.
Here is the rtl after cse.
(note 12 11 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 15 12 16 3 (set (reg:CC 123)
(compare:CC (subreg/s/u:SI (reg/v:DI 120 [ b ]) 0)
(const_int 9 [0x9]))) "ext.cc":5:5 796 {*cmpsi_signed}
(expr_list:REG_DEAD (reg/v:DI 120 [ b ])
(nil)))
(insn 16 15 17 3 (set (reg:SI 124)
(const_int 1 [0x1])) "ext.cc":5:5 555 {*movsi_internal1}
(nil))
(insn 17 16 18 3 (set (reg:SI 122)
(if_then_else:SI (gt (reg:CC 123)
(const_int 0 [0]))
(const_int 0 [0])
(reg:SI 124))) "ext.cc":5:5 344 {isel_cc_si}
(expr_list:REG_DEAD (reg:SI 124)
(expr_list:REG_DEAD (reg:CC 123)
(nil))))
(insn 18 17 32 3 (set (reg:QI 117 [ _1 ])
(subreg:QI (reg:SI 122) 0)) "ext.cc":5:5 562 {*movqi_internal}
(expr_list:REG_DEAD (reg:SI 122)
(nil)))
; pc falls through to BB 5
(code_label 32 18 31 4 3 (nil) [1 uses])
(note 31 32 5 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 5 31 19 4 (set (reg:QI 117 [ _1 ])
(const_int 0 [0])) "ext.cc":4:16 562 {*movqi_internal}
(nil))
(code_label 19 5 20 5 2 (nil) [0 uses])
(note 20 19 21 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 21 20 22 5 (set (reg:DI 126 [ _1 ])
(zero_extend:DI (reg:QI 117 [ _1 ]))) "ext.cc":8:1 5 {zero_extendqidi2}
(expr_list:REG_DEAD (reg:QI 117 [ _1 ])
(nil)))
(insn 22 21 26 5 (set (reg:DI 118 [ <retval> ])
(reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64}
(expr_list:REG_DEAD (reg:DI 126 [ _1 ])
(nil)))
(insn 26 22 27 5 (set (reg/i:DI 3 3)
(reg:DI 126 [ _1 ])) "ext.cc":8:1 681 {*movdi_internal64}
(expr_list:REG_DEAD (reg:DI 118 [ <retval> ])
(nil)))
(insn 27 26 0 5 (use (reg/i:DI 3 3)) "ext.cc":8:1 -1
(nil))
Thanks & Regards
Ajit
>> Thanks & Regards
>> Ajit
>>> Richard.
>>>
>>>> Thanks & Regards
>>>> Ajit
>>>>>> + In cfgexpand pass QImode is generated with
>>>>>> + bool register value and this pass uses QI
>>>>>> + as 64 bit registers.
>>>>>> +
>>>>
>>>>>> rs6000: suboptimal code for returning bool value on target ppc.
>>>>>>
>>>>>> New pass to eliminate unnecessary zero extension. This pass
>>>>>> is registered after cse rtl pass.
>>>>>>
>>>>>> 2023-03-16 Ajit Kumar Agarwal <aagarwa1@linux.ibm.com>
>>>>>>
>>>>>> gcc/ChangeLog:
>>>>>>
>>>>>> * config/rs6000/rs6000-passes.def: Registered zero elimination
>>>>>> pass.
>>>>>> * config/rs6000/rs6000-zext-elim.cc: Add new pass.
>>>>>> * config.gcc: Add new executable.
>>>>>> * config/rs6000/rs6000-protos.h: Add new prototype for zero
>>>>>> elimination pass.
>>>>>> * config/rs6000/rs6000.cc: Add new prototype for zero
>>>>>> elimination pass.
>>>>>> * config/rs6000/t-rs6000: Add new rule.
>>>>>> * expr.cc: Modified gcc assert.
>>>>>> * explow.cc: Modified gcc assert.
>>>>>> * optabs.cc: Modified gcc assert.
>>>>>> ---
>>>>>> gcc/config.gcc | 4 +-
>>>>>> gcc/config/rs6000/rs6000-passes.def | 2 +
>>>>>> gcc/config/rs6000/rs6000-protos.h | 1 +
>>>>>> gcc/config/rs6000/rs6000-zext-elim.cc | 361 ++++++++++++++++++++++++++
>>>>>> gcc/config/rs6000/rs6000.cc | 2 +
>>>>>> gcc/config/rs6000/t-rs6000 | 5 +
>>>>>> gcc/explow.cc | 3 +-
>>>>>> gcc/expr.cc | 4 +-
>>>>>> gcc/optabs.cc | 3 +-
>>>>>> 9 files changed, 379 insertions(+), 6 deletions(-)
>>>>>> create mode 100644 gcc/config/rs6000/rs6000-zext-elim.cc
>>>>>>
>>>>>> diff --git a/gcc/config.gcc b/gcc/config.gcc
>>>>>> index da3a6d3ba1f..e8ac9d882f0 100644
>>>>>> --- a/gcc/config.gcc
>>>>>> +++ b/gcc/config.gcc
>>>>>> @@ -503,7 +503,7 @@ or1k*-*-*)
>>>>>> ;;
>>>>>> powerpc*-*-*)
>>>>>> cpu_type=rs6000
>>>>>> - extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
>>>>>> + extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-zext-elim.o rs6000-logue.o"
>>>>>> extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
>>>>>> extra_objs="${extra_objs} rs6000-builtins.o rs6000-builtin.o"
>>>>>> extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
>>>>>> @@ -538,7 +538,7 @@ riscv*)
>>>>>> ;;
>>>>>> rs6000*-*-*)
>>>>>> extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
>>>>>> - extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
>>>>>> + extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-zext-elim.o rs6000-logue.o"
>>>>>> extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
>>>>>> target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.cc \$(srcdir)/config/rs6000/rs6000-call.cc"
>>>>>> target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel-opt.cc"
>>>>>> diff --git a/gcc/config/rs6000/rs6000-passes.def b/gcc/config/rs6000/rs6000-passes.def
>>>>>> index ca899d5f7af..d7500feddf1 100644
>>>>>> --- a/gcc/config/rs6000/rs6000-passes.def
>>>>>> +++ b/gcc/config/rs6000/rs6000-passes.def
>>>>>> @@ -28,6 +28,8 @@ along with GCC; see the file COPYING3. If not see
>>>>>> The power8 does not have instructions that automaticaly do the byte swaps
>>>>>> for loads and stores. */
>>>>>> INSERT_PASS_BEFORE (pass_cse, 1, pass_analyze_swaps);
>>>>>> + INSERT_PASS_AFTER (pass_cse, 1, pass_analyze_zext);
>>>>>> +
>>>>>>
>>>>>> /* Pass to do the PCREL_OPT optimization that combines the load of an
>>>>>> external symbol's address along with a single load or store using that
>>>>>> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
>>>>>> index 1a4fc1df668..f6cf2d673d4 100644
>>>>>> --- a/gcc/config/rs6000/rs6000-protos.h
>>>>>> +++ b/gcc/config/rs6000/rs6000-protos.h
>>>>>> @@ -340,6 +340,7 @@ namespace gcc { class context; }
>>>>>> class rtl_opt_pass;
>>>>>>
>>>>>> extern rtl_opt_pass *make_pass_analyze_swaps (gcc::context *);
>>>>>> +extern rtl_opt_pass *make_pass_analyze_zext (gcc::context *);
>>>>>> extern rtl_opt_pass *make_pass_pcrel_opt (gcc::context *);
>>>>>> extern bool rs6000_sum_of_two_registers_p (const_rtx expr);
>>>>>> extern bool rs6000_quadword_masked_address_p (const_rtx exp);
>>>>>> diff --git a/gcc/config/rs6000/rs6000-zext-elim.cc b/gcc/config/rs6000/rs6000-zext-elim.cc
>>>>>> new file mode 100644
>>>>>> index 00000000000..777c7a5a387
>>>>>> --- /dev/null
>>>>>> +++ b/gcc/config/rs6000/rs6000-zext-elim.cc
>>>>>> @@ -0,0 +1,361 @@
>>>>>> +/* Subroutine to eliminate redundant zero extend for power architecture.
>>>>>> + Copyright (C) 1991-2023 Free Software Foundation, Inc.
>>>>>> +
>>>>>> + This file is part of GCC.
>>>>>> +
>>>>>> + GCC is free software; you can redistribute it and/or modify it
>>>>>> + under the terms of the GNU General Public License as published
>>>>>> + by the Free Software Foundation; either version 3, or (at your
>>>>>> + option) any later version.
>>>>>> +
>>>>>> + GCC is distributed in the hope that it will be useful, but WITHOUT
>>>>>> + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>>>>>> + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
>>>>>> + License for more details.
>>>>>> +
>>>>>> + You should have received a copy of the GNU General Public License
>>>>>> + along with GCC; see the file COPYING3. If not see
>>>>>> + <http://www.gnu.org/licenses/>. */
>>>>>> +
>>>>>> +/* This pass remove unnecessary zero extension instruction from
>>>>>> + power generated assembly. This pass is register after cse
>>>>>> + pass.
>>>>>> + Identifies the following sequence of instruction after cse
>>>>>> + rtl pass.
>>>>>> +
>>>>>> + set compare (subreg)
>>>>>> + set if_then_else
>>>>>> + set SImode -> QImode
>>>>>> + set zero_extend to DImode from QImode
>>>>>> + set return value 0 in one path of cfg.
>>>>>> + set return value 1 in other path of cfg.
>>>>>> +
>>>>>> + In cfgexpand pass QImode is generated with
>>>>>> + bool register value and this pass uses QI
>>>>>> + as 64 bit registers.
>>>>>> +
>>>>>> + This pass replace copy operation from QImode to DImode
>>>>>> + and return appropriate return values.*/
>>>>>> +
>>>>>> +#define IN_TARGET_CODE 1
>>>>>> +
>>>>>> +#include "config.h"
>>>>>> +#include "system.h"
>>>>>> +#include "coretypes.h"
>>>>>> +#include "backend.h"
>>>>>> +#include "rtl.h"
>>>>>> +#include "tree.h"
>>>>>> +#include "memmodel.h"
>>>>>> +#include "df.h"
>>>>>> +#include "tm_p.h"
>>>>>> +#include "ira.h"
>>>>>> +#include "print-tree.h"
>>>>>> +#include "varasm.h"
>>>>>> +#include "explow.h"
>>>>>> +#include "expr.h"
>>>>>> +#include "output.h"
>>>>>> +#include "tree-pass.h"
>>>>>> +
>>>>>> +/* This is based on the union-find logic in web.cc. web_entry_base is
>>>>>> + defined in df.h. */
>>>>>> +class zext_web_entry : public web_entry_base
>>>>>> +{
>>>>>> + public:
>>>>>> + /* Pointer to the insn. */
>>>>>> + rtx_insn *insn;
>>>>>> + unsigned int is_relevant : 1;
>>>>>> + /* Set if insn is a load. */
>>>>>> + unsigned int is_load : 1;
>>>>>> + /* Set if insn is a store. */
>>>>>> + unsigned int is_store : 1;
>>>>>> + unsigned int is_zext :1 ;
>>>>>> + unsigned int is_move :1;
>>>>>> + unsigned int is_delete_move :1;
>>>>>> + /* Set if this insn should be deleted. */
>>>>>> + unsigned int will_delete : 1;
>>>>>> + unsigned int will_delete_chances : 1;
>>>>>> +};
>>>>>> +
>>>>>> +/* Checks if instruction is zero extension
>>>>>> + * with QIMode to DImode.*/
>>>>>> +static unsigned int
>>>>>> +insn_is_zext_p(rtx insn)
>>>>>> +{
>>>>>> + rtx body = PATTERN (insn);
>>>>>> +
>>>>>> + if (GET_CODE (body) == SET
>>>>>> + && GET_MODE(SET_DEST (body)) == DImode
>>>>>> + && GET_CODE(SET_SRC (body)) == ZERO_EXTEND)
>>>>>> + {
>>>>>> + rtx set = XEXP (SET_SRC (body), 0);
>>>>>> +
>>>>>> + if (REG_P (set))
>>>>>> + {
>>>>>> + if (GET_MODE (set) == QImode) return 1;
>>>>>> + }
>>>>>> + else
>>>>>> + return 0;
>>>>>> + }
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/* Checks if instruction is SET operation with QImode.*/
>>>>>> +static unsigned int
>>>>>> +insn_is_store_p (rtx insn)
>>>>>> +{
>>>>>> + rtx body = PATTERN (insn);
>>>>>> + if (GET_CODE (body) == SET
>>>>>> + && SUBREG_P(SET_SRC (body))
>>>>>> + && !CONST_INT_P(SET_SRC (body))
>>>>>> + && GET_MODE(XEXP (SET_SRC (body), 0)) == SImode
>>>>>> + && GET_MODE(SET_SRC (body)) == QImode)
>>>>>> + return 1;
>>>>>> +
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/* Find out zero extension removal candidate with use-def web.*/
>>>>>> +static void
>>>>>> +find_zero_ext_elimination_candidate (zext_web_entry *insn_entry,
>>>>>> + rtx insn, df_ref def)
>>>>>> +{
>>>>>> + struct df_link *link = DF_REF_CHAIN (def);
>>>>>> +
>>>>>> + rtx move_insn = NULL_RTX;
>>>>>> + rtx compare_insn = NULL_RTX;
>>>>>> +
>>>>>> + while (link)
>>>>>> + {
>>>>>> + if (!DF_REF_INSN_INFO (link->ref))
>>>>>> + insn_entry[INSN_UID(insn)].will_delete_chances = 0;
>>>>>> +
>>>>>> + if (DF_REF_INSN_INFO (link->ref))
>>>>>> + {
>>>>>> + rtx use_insn = DF_REF_INSN (link->ref);
>>>>>> +
>>>>>> + if (GET_CODE (PATTERN (use_insn)) == SET
>>>>>> + && (GET_CODE (SET_SRC (PATTERN (use_insn))) == IF_THEN_ELSE))
>>>>>> + {
>>>>>> + if (GET_CODE (PATTERN (insn)) == SET
>>>>>> + && GET_CODE (SET_SRC (PATTERN (insn))) == COMPARE)
>>>>>> + {
>>>>>> + rtx body = XEXP (SET_SRC (PATTERN (insn)), 0);
>>>>>> +
>>>>>> + if (SUBREG_P (body))
>>>>>> + {
>>>>>> + compare_insn = use_insn;
>>>>>> + rtx compare_body = XEXP (SET_SRC (PATTERN (compare_insn)), 0);
>>>>>> +
>>>>>> + if (compare_insn
>>>>>> + && ((REGNO (XEXP (compare_body, 0)))
>>>>>> + == REGNO (SET_DEST (PATTERN (insn)))))
>>>>>> + insn_entry[INSN_UID(use_insn)].will_delete_chances = 1;
>>>>>> + }
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + if (insn_is_store_p(use_insn)
>>>>>> + && GET_CODE (PATTERN (insn)) == SET
>>>>>> + && (GET_CODE (SET_SRC (PATTERN(insn))) == IF_THEN_ELSE))
>>>>>> + {
>>>>>> + if (GET_MODE (SET_DEST (PATTERN (insn))) == SImode)
>>>>>> + {
>>>>>> + if (insn_entry[INSN_UID(insn)].will_delete_chances)
>>>>>> + insn_entry[INSN_UID(use_insn)].will_delete_chances = 1;
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + if (insn_is_zext_p (insn))
>>>>>> + {
>>>>>> + if (GET_CODE (PATTERN (use_insn)) == SET
>>>>>> + && REG_P (SET_SRC (PATTERN (use_insn))))
>>>>>> + {
>>>>>> + if (move_insn
>>>>>> + && REGNO (SET_SRC (PATTERN (use_insn)))
>>>>>> + == REGNO (SET_SRC (PATTERN (move_insn)))
>>>>>> + && insn_entry[INSN_UID(insn)].is_delete_move)
>>>>>> + {
>>>>>> + insn_entry[INSN_UID (insn)].is_move = 1;
>>>>>> + break;
>>>>>> + }
>>>>>> + else if (insn_entry[INSN_UID (insn)].will_delete)
>>>>>> + {
>>>>>> + move_insn = use_insn;
>>>>>> + insn_entry[INSN_UID(insn)].is_delete_move= 1;
>>>>>> + }
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + if (insn_is_zext_p (use_insn))
>>>>>> + {
>>>>>> + insn_entry[INSN_UID (use_insn)].is_zext = 1;
>>>>>> + insn_entry[INSN_UID(use_insn)].is_relevant = 1;
>>>>>> +
>>>>>> + if (insn_is_store_p (insn)
>>>>>> + && insn_entry[INSN_UID (insn)].will_delete_chances)
>>>>>> + {
>>>>>> + insn_entry[INSN_UID (use_insn)].will_delete = 1;
>>>>>> + insn_entry[INSN_UID (insn)].will_delete = 1;
>>>>>> + insn_entry[INSN_UID( insn)].is_store = 1;
>>>>>> + }
>>>>>> +
>>>>>> + if (NONDEBUG_INSN_P (use_insn))
>>>>>> + unionfind_union (insn_entry + INSN_UID (insn),
>>>>>> + insn_entry + INSN_UID (use_insn));
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + link = link->next;
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +/* Replace QImode extensions with copy operations.*/
>>>>>> +static void
>>>>>> +replace_marked_insns (zext_web_entry *insn_entry, unsigned i)
>>>>>> +{
>>>>>> + rtx_insn *insn = insn_entry[i].insn;
>>>>>> + rtx body = PATTERN (insn);
>>>>>> + rtx src_reg;
>>>>>> + src_reg = XEXP (SET_SRC (body), 0);
>>>>>> + set_mode_and_regno (src_reg, DImode, REGNO(src_reg));
>>>>>> +
>>>>>> + if (GET_MODE(SET_DEST(body)) != DImode)
>>>>>> + set_mode_and_regno (SET_DEST(body), DImode, REGNO (SET_DEST (body)));
>>>>>> +
>>>>>> + rtx copy = gen_rtx_SET (SET_DEST (body), src_reg);
>>>>>> + rtx_insn *new_insn = emit_insn_before (copy, insn);
>>>>>> + set_block_for_insn (new_insn, BLOCK_FOR_INSN (insn));
>>>>>> + df_insn_rescan (new_insn);
>>>>>> +
>>>>>> + df_insn_delete (insn);
>>>>>> + remove_insn (insn);
>>>>>> + insn->set_deleted ();
>>>>>> +}
>>>>>> +
>>>>>> +/* Main entry point for this pass. */
>>>>>> +unsigned int
>>>>>> +rs6000_analyze_zext (function *fun)
>>>>>> +{
>>>>>> + zext_web_entry *insn_entry;
>>>>>> + basic_block bb;
>>>>>> + rtx_insn *insn, *curr_insn = 0;
>>>>>> +
>>>>>> + /* Dataflow analysis for use-def chains. */
>>>>>> + df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
>>>>>> + df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
>>>>>> + df_analyze ();
>>>>>> + df_set_flags (DF_DEFER_INSN_RESCAN);
>>>>>> +
>>>>>> + /* Rebuild ud- and du-chains. */
>>>>>> + df_remove_problem (df_chain);
>>>>>> + df_process_deferred_rescans ();
>>>>>> + df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
>>>>>> + df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
>>>>>> + df_analyze ();
>>>>>> + df_set_flags (DF_DEFER_INSN_RESCAN);
>>>>>> +
>>>>>> + /* Allocate structure to represent webs of insns. */
>>>>>> + insn_entry = XCNEWVEC (zext_web_entry, get_max_uid ());
>>>>>> +
>>>>>> + /* Walk the insns to gather basic data. */
>>>>>> + FOR_ALL_BB_FN (bb, fun)
>>>>>> + FOR_BB_INSNS_SAFE (bb, insn, curr_insn)
>>>>>> + {
>>>>>> + unsigned int uid = INSN_UID (insn);
>>>>>> + if (NONDEBUG_INSN_P (insn))
>>>>>> + {
>>>>>> + insn_entry[uid].insn = insn;
>>>>>> +
>>>>>> + if (GET_CODE (insn) == insn_is_store_p (insn))
>>>>>> + {
>>>>>> + insn_entry[uid].is_store = 1;
>>>>>> + insn_entry[uid].is_relevant = 1;
>>>>>> + }
>>>>>> +
>>>>>> + /* Walk the uses and defs to identify the optimization
>>>>>> + candidates.*/
>>>>>> + struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
>>>>>> + df_ref mention;
>>>>>> +
>>>>>> + FOR_EACH_INSN_INFO_DEF (mention, insn_info)
>>>>>> + {
>>>>>> + insn_entry[uid].is_relevant = 1;
>>>>>> + insn_entry[uid].is_store = insn_is_store_p (insn);
>>>>>> + find_zero_ext_elimination_candidate (insn_entry, insn, mention);
>>>>>> + }
>>>>>> +
>>>>>> + if (insn_entry[uid].is_relevant)
>>>>>> + {
>>>>>> + /* Determine if this is a store. */
>>>>>> + insn_entry[uid].is_store = insn_is_store_p (insn);
>>>>>> + }
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + unsigned e = get_max_uid (), i;
>>>>>> +
>>>>>> + int store_index = -1;
>>>>>> +
>>>>>> + /* Replace with copy operation.*/
>>>>>> + for (i = 0; i < e; ++i)
>>>>>> + {
>>>>>> + if (insn_entry[i].is_store && insn_entry[i].will_delete)
>>>>>> + store_index = i;
>>>>>> +
>>>>>> + if ((store_index != -1)
>>>>>> + && insn_entry[i].is_move && insn_entry[i].will_delete)
>>>>>> + {
>>>>>> + replace_marked_insns (insn_entry, store_index);
>>>>>> + replace_marked_insns (insn_entry, i);
>>>>>> + }
>>>>>> + }
>>>>>> + /* Clean up. */
>>>>>> + free (insn_entry);
>>>>>> +
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> +const pass_data pass_data_analyze_zext =
>>>>>> +{
>>>>>> + RTL_PASS, /* type */
>>>>>> + "zext", /* name */
>>>>>> + OPTGROUP_NONE, /* optinfo_flags */
>>>>>> + TV_NONE, /* tv_id */
>>>>>> + 0, /* properties_required */
>>>>>> + 0, /* properties_provided */
>>>>>> + 0, /* properties_destroyed */
>>>>>> + 0, /* todo_flags_start */
>>>>>> + TODO_df_finish, /* todo_flags_finish */
>>>>>> +};
>>>>>> +
>>>>>> +class pass_analyze_zext : public rtl_opt_pass
>>>>>> +{
>>>>>> +public:
>>>>>> + pass_analyze_zext(gcc::context *ctxt)
>>>>>> + : rtl_opt_pass(pass_data_analyze_zext, ctxt)
>>>>>> + {}
>>>>>> +
>>>>>> + /* opt_pass methods: */
>>>>>> + virtual bool gate (function *)
>>>>>> + {
>>>>>> + return (optimize > 0 );
>>>>>> + }
>>>>>> +
>>>>>> + virtual unsigned int execute (function *fun)
>>>>>> + {
>>>>>> + return rs6000_analyze_zext (fun);
>>>>>> + }
>>>>>> +
>>>>>> + opt_pass *clone ()
>>>>>> + {
>>>>>> + return new pass_analyze_zext (m_ctxt);
>>>>>> + }
>>>>>> +
>>>>>> +}; // class pass_analyze_zext
>>>>>> +
>>>>>> +rtl_opt_pass *
>>>>>> +make_pass_analyze_zext (gcc::context *ctxt)
>>>>>> +{
>>>>>> + return new pass_analyze_zext (ctxt);
>>>>>> +}
>>>>>> +
>>>>>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>>>>>> index 8e0b0d022db..6541334bf2d 100644
>>>>>> --- a/gcc/config/rs6000/rs6000.cc
>>>>>> +++ b/gcc/config/rs6000/rs6000.cc
>>>>>> @@ -1178,6 +1178,8 @@ static bool rs6000_secondary_reload_move (enum rs6000_reg_type,
>>>>>> bool);
>>>>>> rtl_opt_pass *make_pass_analyze_swaps (gcc::context*);
>>>>>>
>>>>>> +rtl_opt_pass *make_pass_analyze_zext (gcc::context*);
>>>>>> +
>>>>>> /* Hash table stuff for keeping track of TOC entries. */
>>>>>>
>>>>>> struct GTY((for_user)) toc_hash_struct
>>>>>> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
>>>>>> index f183b42ce1d..c1f61591d2f 100644
>>>>>> --- a/gcc/config/rs6000/t-rs6000
>>>>>> +++ b/gcc/config/rs6000/t-rs6000
>>>>>> @@ -35,6 +35,11 @@ rs6000-p8swap.o: $(srcdir)/config/rs6000/rs6000-p8swap.cc
>>>>>> $(COMPILE) $<
>>>>>> $(POSTCOMPILE)
>>>>>>
>>>>>> +rs6000-zext-elim.o: $(srcdir)/config/rs6000/rs6000-zext-elim.cc
>>>>>> + $(COMPILE) $<
>>>>>> + $(POSTCOMPILE)
>>>>>> +
>>>>>> +
>>>>>> rs6000-d.o: $(srcdir)/config/rs6000/rs6000-d.cc
>>>>>> $(COMPILE) $<
>>>>>> $(POSTCOMPILE)
>>>>>> diff --git a/gcc/explow.cc b/gcc/explow.cc
>>>>>> index 32e9498ee07..316aa975e40 100644
>>>>>> --- a/gcc/explow.cc
>>>>>> +++ b/gcc/explow.cc
>>>>>> @@ -654,7 +654,8 @@ copy_to_mode_reg (machine_mode mode, rtx x)
>>>>>> if (! general_operand (x, VOIDmode))
>>>>>> x = force_operand (x, temp);
>>>>>>
>>>>>> - gcc_assert (GET_MODE (x) == mode || GET_MODE (x) == VOIDmode);
>>>>>> + gcc_assert (mode == DImode || GET_MODE (x) == mode
>>>>>> + || GET_MODE (x) == VOIDmode);
>>>>>> if (x != temp)
>>>>>> emit_move_insn (temp, x);
>>>>>> return temp;
>>>>>> diff --git a/gcc/expr.cc b/gcc/expr.cc
>>>>>> index 15be1c8db99..6162ef92b88 100644
>>>>>> --- a/gcc/expr.cc
>>>>>> +++ b/gcc/expr.cc
>>>>>> @@ -4223,9 +4223,9 @@ emit_move_insn (rtx x, rtx y)
>>>>>> rtx y_cst = NULL_RTX;
>>>>>> rtx_insn *last_insn;
>>>>>> rtx set;
>>>>>> -
>>>>>> gcc_assert (mode != BLKmode
>>>>>> - && (GET_MODE (y) == mode || GET_MODE (y) == VOIDmode));
>>>>>> + && (mode == DImode || GET_MODE (y) == mode
>>>>>> + || GET_MODE (y) == VOIDmode));
>>>>>>
>>>>>> /* If we have a copy that looks like one of the following patterns:
>>>>>> (set (subreg:M1 (reg:M2 ...)) (subreg:M1 (reg:M2 ...)))
>>>>>> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
>>>>>> index 4c641cab192..9d22fadc7ef 100644
>>>>>> --- a/gcc/optabs.cc
>>>>>> +++ b/gcc/optabs.cc
>>>>>> @@ -7902,7 +7902,8 @@ maybe_legitimize_operand (enum insn_code icode, unsigned int opno,
>>>>>> input:
>>>>>> gcc_assert (mode != VOIDmode);
>>>>>> gcc_assert (GET_MODE (op->value) == VOIDmode
>>>>>> - || GET_MODE (op->value) == mode);
>>>>>> + || GET_MODE (op->value) == mode
>>>>>> + || mode == DImode);
>>>>>> if (maybe_legitimize_operand_same_code (icode, opno, op))
>>>>>> return true;
>>>>>>
>>>>>> --
>>>>>> 2.31.1
>>>>>>
next prev parent reply other threads:[~2023-03-16 10:12 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-16 5:20 Ajit Agarwal
2023-03-16 7:40 ` Richard Biener
2023-03-16 8:11 ` Ajit Agarwal
2023-03-16 8:14 ` Richard Biener
2023-03-16 8:19 ` Ajit Agarwal
2023-03-16 9:52 ` Richard Biener
2023-03-16 10:11 ` Ajit Agarwal [this message]
2023-03-16 10:30 ` Richard Biener
2023-03-16 10:43 ` Ajit Agarwal
2023-03-16 10:56 ` Richard Biener
2023-03-16 11:43 ` Ajit Agarwal
2023-03-16 14:48 ` Jeff Law
2023-03-17 11:49 ` Ajit Agarwal
2023-03-17 3:37 ` Surya Kumari Jangala
2023-03-17 21:20 ` Peter Bergner
2023-03-18 3:53 ` Peter Bergner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=68ae93ab-ecb9-332b-dba8-bdc7b0d6b3c9@linux.ibm.com \
--to=aagarwa1@linux.ibm.com \
--cc=bergner@linux.ibm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=richard.guenther@gmail.com \
--cc=segher@kernel.crashing.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).