Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Ajit Agarwal <aagarwa1@linux.ibm.com>
To: Sam James <sam@gentoo.org>
Cc: Alex Coplan <alex.coplan@arm.com>,
	"Kewen.Lin" <linkw@linux.ibm.com>,
	Peter Bergner <bergner@linux.ibm.com>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	David Edelsohn <dje.gcc@gmail.com>,
	Michael Meissner <meissner@linux.ibm.com>,
	richard.sandiford@arm.com, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target
Date: Wed, 14 Feb 2024 23:26:34 +0530	[thread overview]
Message-ID: <60ffc1f6-1282-4cc2-8dba-07c529cf8e38@linux.ibm.com> (raw)
In-Reply-To: <87v86rneq6.fsf@gentoo.org>

Hello Sam:

On 14/02/24 10:50 pm, Sam James wrote:
> 
> Ajit Agarwal <aagarwa1@linux.ibm.com> writes:
> 
>> Hello Richard:
>>
>>
>> On 14/02/24 4:03 pm, Richard Sandiford wrote:
>>> Hi,
>>>
>>> Thanks for working on this.
>>>
>>> You posted a version of this patch on Sunday too.  If you need to repost
>>> to fix bugs or make other improvements, could you describe the changes
>>> that you've made since the previous version?  It makes things easier
>>> to follow.
>>
>> Sure. Sorry for that I forgot to add that.
>>
>>>
>>> Also, sorry for starting with a meta discussion about reviews, but
>>> there are multiple types of review comment, including:
>>>
>>> (1) Suggestions for changes that are worded as suggestions.
>>>
>>> (2) Suggestions for changes that are worded as questions ("Wouldn't it be
>>>     better to do X?", etc).
>>>
>>> (3) Questions asking for an explanation or for more information.
>>>
>>> Just sending a new patch makes sense when the previous review comments
>>> were all like (1), and arguably also (1)+(2).  But Alex's previous review
>>> included (3) as well.  Could you go back and respond to his questions there?
>>> It would help understand some of the design choices.
>>>
>>
>> I have responded to Alex comments for the previous patches.
>> I have incorporated all of his comments in this patch.
>>
>>  
>>> A natural starting point when reviewing a patch like this is to diff
>>> the current aarch64-ldp-fusion.cc with the new pair-fusion.cc.  This shows
>>> many of the kind of changes that I'd expect.  But it also seems to include
>>> some code reordering, such as putting fuse_pair after try_fuse_pair.
>>> If some reordering is necessary, could you try to organise the patch as
>>> a series in which the reordering is a separate step?  It's a bit hard
>>> to review at the moment.  (Reordering for cosmetic reasons is also OK,
>>> but again please separate it out for ease of review.)
>>>
>>> Maybe one way of making the review easier would be to split the aarch64
>>> pass into the "target-dependent" and "target-independent" pieces
>>> in-place, i.e. keeping everything within aarch64-ldp-fusion.cc, and then
>>> (as separate patches) move the target-independent pieces outside
>>> config/aarch64.
>>>
>> Sure I will do that.
>>
>>> The patch includes:
>>>
>>>> 	* emit-rtl.cc: Modify ge with gt on PolyINT data structure.
>>>> 	* dce.cc: Add changes not to delete the load store pair.
>>>> 	* rtl-ssa/changes.cc: Modified assert code.
>>>> 	* var-tracking.cc: Modified assert code.
>>>> 	* df-problems.cc: Not to generate REG_UNUSED for multi
>>>> 	word registers that is requied for rs6000 target.
>>>
>>> Please submit these separately, as independent preparatory patches,
>>> with an explanation for why they're needed & correct.  But:
>>>
>> Sure I will do that.
>>
>>>> diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
>>>> index 88ee0dd67fc..a8d0ee7c4db 100644
>>>> --- a/gcc/df-problems.cc
>>>> +++ b/gcc/df-problems.cc
>>>> @@ -3360,7 +3360,7 @@ df_set_unused_notes_for_mw (rtx_insn *insn, struct df_mw_hardreg *mws,
>>>>    if (df_whole_mw_reg_unused_p (mws, live, artificial_uses))
>>>>      {
>>>>        unsigned int regno = mws->start_regno;
>>>> -      df_set_note (REG_UNUSED, insn, mws->mw_reg);
>>>> +      //df_set_note (REG_UNUSED, insn, mws->mw_reg);
>>>>        dead_debug_insert_temp (debug, regno, insn, DEBUG_TEMP_AFTER_WITH_REG);
>>>>  
>>>>        if (REG_DEAD_DEBUGGING)
>>>> @@ -3375,7 +3375,7 @@ df_set_unused_notes_for_mw (rtx_insn *insn, struct df_mw_hardreg *mws,
>>>>  	if (!bitmap_bit_p (live, r)
>>>>  	    && !bitmap_bit_p (artificial_uses, r))
>>>>  	  {
>>>> -	    df_set_note (REG_UNUSED, insn, regno_reg_rtx[r]);
>>>> +	   // df_set_note (REG_UNUSED, insn, regno_reg_rtx[r]);
> 
> I just want to emphasise here:
> a) adding out commented code is very unusual (I know a reviewer picked
> up on that already);
> 
> b) if you are going to comment something out as a hack / you need help,
> please *clearly flag that* (apologies if I missed it), and possibly add
> a comment above it saying "// TODO: Need to figure out ...." or similar,
> otherwise it just looks like it was forgotten about.
> 
> In this case, your question about how to handle REG_UNUSED should've
> been made clear in a summary at the top where you mention the
> outstanding items. Again, sorry if I missed it.
> 

Question is not about how to handle REG_UNUSED,

I am afraid this is not what I meant, I wanted to
convey the following.

REG_UNSED generated by ira with the above code used by cprop_hardreg to remove the code with REG_UNUSED notes.

We can modify these passes to handle REG_UNUSED 
differently or not to generate REG_UNUSED for
multi-word case as above. 

We use similar to we do as follows:

We generate sequential registers using (subreg V16QI (reg 00mode) 16) and (reg OOmode 0) where OOmode is 256 bit and V16QI is 128 bits in order to generate sequential register pair. If I keep the above REG_UNUSED notes ira generates REG_UNUSED and in cprop_harreg pass and dce pass deletes store pairs and we get incorrect code.

I choose not to generate REG_UNUSED for multi-word case instead of handling in different passes lower the pipeline.

Thanks & Regards
Ajit
>>>>  	    dead_debug_insert_temp (debug, r, insn, DEBUG_TEMP_AFTER_WITH_REG);
>>>>  	    if (REG_DEAD_DEBUGGING)
>>>>  	      df_print_note ("adding 2: ", insn, REG_NOTES (insn));
>>>> @@ -3493,9 +3493,9 @@ df_create_unused_note (rtx_insn *insn, df_ref def,
>>>>  	|| bitmap_bit_p (artificial_uses, dregno)
>>>>  	|| df_ignore_stack_reg (dregno)))
>>>>      {
>>>> -      rtx reg = (DF_REF_LOC (def))
>>>> -                ? *DF_REF_REAL_LOC (def): DF_REF_REG (def);
>>>> -      df_set_note (REG_UNUSED, insn, reg);
>>>> +      //rtx reg = (DF_REF_LOC (def))
>>>> +      //          ? *DF_REF_REAL_LOC (def): DF_REF_REG (def);
>>>> +      //df_set_note (REG_UNUSED, insn, reg);
>>>>        dead_debug_insert_temp (debug, dregno, insn, DEBUG_TEMP_AFTER_WITH_REG);
>>>>        if (REG_DEAD_DEBUGGING)
>>>>  	df_print_note ("adding 3: ", insn, REG_NOTES (insn));
>>>
>>> I don't think this can be right.  The last hunk of the var-tracking.cc
>>> patch also seems to be reverting a correct change.
>>>
>>
>> We generate sequential registers using (subreg V16QI (reg 00mode) 16)
>> and (reg OOmode 0)
>> where OOmode is 256 bit and V16QI is 128 bits in order to generate
>> sequential register pair. If I keep the above REG_UNUSED notes ira generates
>> REG_UNUSED and in cprop_harreg pass and dce pass deletes store pairs and
>> we get incorrect code.
>>
>> By commenting REG_UNUSED notes it is not generated and we get the correct store
>> pair fusion and cprop_hardreg and dce doesn't deletes them.
>>
>> Please let me know is there are better ways to address this instead of commenting
>> above generation of REG_UNUSED notes.
>>
>> Thanks & Regards
>> Ajit
>>> Thanks,
>>> Richard
>

next prev parent reply	other threads:[~2024-02-14 17:56 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-14  6:16 Ajit Agarwal
2024-02-14 10:33 ` Richard Sandiford
2024-02-14 13:52   ` Ajit Agarwal
2024-02-14 14:15     ` Ajit Agarwal
2024-02-14 17:20     ` Sam James
2024-02-14 17:56       ` Ajit Agarwal [this message]
2024-02-14 17:26     ` Richard Sandiford
2024-02-14 19:04       ` Ajit Agarwal
2024-02-14 19:44         ` Richard Sandiford
2024-02-14 21:08           ` Ajit Agarwal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=60ffc1f6-1282-4cc2-8dba-07c529cf8e38@linux.ibm.com \
    --to=aagarwa1@linux.ibm.com \
    --cc=alex.coplan@arm.com \
    --cc=bergner@linux.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=linkw@linux.ibm.com \
    --cc=meissner@linux.ibm.com \
    --cc=richard.sandiford@arm.com \
    --cc=sam@gentoo.org \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).