public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* RE: RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw
@ 2020-07-22 14:58 Zhongyunde
  2020-07-22 20:47 ` Richard Sandiford
  0 siblings, 1 reply; 12+ messages in thread
From: Zhongyunde @ 2020-07-22 14:58 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Yangfei (Felix)

[-- Attachment #1: Type: text/plain, Size: 2871 bytes --]


> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
> Sent: Wednesday, July 22, 2020 12:12 AM
> To: Zhongyunde <zhongyunde@huawei.com>
> Cc: gcc-patches@gcc.gnu.org; Yangfei (A) <lancelo.yang@huawei.com>
> Subject: Re: 答复: [PATCH PR95696] regrename creates overlapping
> register allocations for vliw
> 
> Zhongyunde <zhongyunde@huawei.com> writes:
> > Thanks for your review.
> >
> > First of all, this is an optimization.
> 
> OK, good.
> 
> >    gcc do sms before reload, and here each insn use pseudo-register.
> After reload, they are allocated hard-register, then the regrename pass try
> to adjust the register number with def/use chain created by
> build_def_use.
> >  As now gcc doesn't consider the VLIW bundles, so regrename pass may
> updated a reg which may not really unused, which will bring in invalid
> VLIW bundles.
> >    Before the final schedule, we usually recheck the validation of VLIW
> bundles, and reschedule the conflicted insns into two VLIW to make them
> validation to avoid above issue, so this is not a correctness issue.
> >  Certainly, reschedule the conflicted insns into two VLIW will destroy
> the kernel loop's sms schedule result, and usually it will be harmful to the
> performance.
> 
> Yeah.  The reason I was worried about the TI markers being stale is that, in
> general, register allocation can introduce new spills and reloads, can add
> and remove instructions, and can convert instructions into different forms
> (e.g. as a result of register elimination).
> There are then post-reload optimisers that can change the code further.
> All these things could invalidate the VLIW bundling done by the first
> scheduler.
> 
> It sounds like that's not happening in your motivating testcase, and the
> VLIW bundling is still correct (for this loop) by the time that regrename
> runs.  Is that right?

Yes, it is right.

> It's interesting that this is for a testcase using SMS.  One of the traditional
> problems with the GCC implementation of SMS has been ensuring that
> later passes don't mess up the scheduled loop.  So in your testcase, does
> register allocation succeed for the SMS loop without invalidating the
> bundling decisions?

Yes.

> If so, then it's probably better to avoid running regrename on it at all.
> It mostly exists to help the second scheduling pass, but the second
> scheduling pass shouldn't be messing with an SMS loop anyway.  Also,
> although the patch deals with one case in which regrename could disrupt
> the bundling, there are others too.
> 
> So maybe one option would be to make regrename ignore blocks that
> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
> discounted
> already.)

ok, according your advice, I make a new patch attached.

> Thanks,
> Richard

[-- Attachment #2: PR95696_1.patch --]
[-- Type: application/octet-stream, Size: 2274 bytes --]

diff --git a/gcc/regrename.c b/gcc/regrename.c
index c38173a77..0ccc5e18c 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -284,6 +284,48 @@ create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   return head;
 }
 
+/* For the def-use chain HEAD of SMS block, find the registers with REG_UNUSED
+   conflict its VLIW constraint and set the corresponding bits in *PSET.  */
+
+static void
+merge_vliw_overlapping_regs (HARD_REG_SET *pset, struct du_head *head)
+{
+  rtx_insn *insn;
+  rtx_insn *cur_insn = head->first->insn;
+  basic_block bb = BLOCK_FOR_INSN (cur_insn);
+
+  /* Just run regrename for no-SMS related basic block.  */
+  if ((bb->flags & BB_DISABLE_SCHEDULE) == 0 || (bb->flags & BB_MODIFIED) != 0)
+    return;
+
+  /* SMS related basic block generate VLIW before reload, and avoid disrupting
+     the bundling if the new rename reg is valid for bundling.  */
+  for (insn = prev_active_insn (cur_insn); insn && BLOCK_FOR_INSN (insn) == bb;
+       insn = prev_active_insn (insn))
+    {
+      rtx note;
+
+      for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
+        {
+          if (REG_NOTE_KIND (note) == REG_UNUSED)
+          {
+            int regno = REGNO (XEXP (note, 0));
+            SET_HARD_REG_BIT (*pset, regno);
+            if (dump_file)
+              {
+                fprintf (dump_file, "Register %s may confilct with REG_UNUSED \
+                                     in insn %d\n",
+                         reg_names[regno], INSN_UID (insn));
+              }
+          }
+        }
+
+      /* Walk until the first insn of current vliw.  */
+      if (GET_MODE (insn) == TImode)
+        break;
+    }
+}
+
 /* For a def-use chain HEAD, find which registers overlap its lifetime and
    set the corresponding bits in *PSET.  */
 
@@ -372,6 +414,9 @@ find_rename_reg (du_head_p this_head, enum reg_class super_class,
   /* Mark registers that overlap this chain's lifetime as unavailable.  */
   merge_overlapping_regs (unavailable, this_head);
 
+  /* Mark registers that conflict vliw constraint as unavailable.  */
+  merge_vliw_overlapping_regs (unavailable, this_head);
+
   /* Compute preferred rename class of super union of all the classes
      in the chain.  */
   preferred_class

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-07-22 14:58 RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw Zhongyunde
@ 2020-07-22 20:47 ` Richard Sandiford
  2020-07-26  7:28   ` Zhongyunde
  2020-07-27  4:53   ` Zhongyunde
  0 siblings, 2 replies; 12+ messages in thread
From: Richard Sandiford @ 2020-07-22 20:47 UTC (permalink / raw)
  To: Zhongyunde; +Cc: gcc-patches, Yangfei (Felix)

Zhongyunde <zhongyunde@huawei.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
>> Sent: Wednesday, July 22, 2020 12:12 AM
>> To: Zhongyunde <zhongyunde@huawei.com>
>> Cc: gcc-patches@gcc.gnu.org; Yangfei (A) <lancelo.yang@huawei.com>
>> Subject: Re: 答复: [PATCH PR95696] regrename creates overlapping
>> register allocations for vliw
>> 
>> Zhongyunde <zhongyunde@huawei.com> writes:
>> > Thanks for your review.
>> >
>> > First of all, this is an optimization.
>> 
>> OK, good.
>> 
>> >    gcc do sms before reload, and here each insn use pseudo-register.
>> After reload, they are allocated hard-register, then the regrename pass try
>> to adjust the register number with def/use chain created by
>> build_def_use.
>> >  As now gcc doesn't consider the VLIW bundles, so regrename pass may
>> updated a reg which may not really unused, which will bring in invalid
>> VLIW bundles.
>> >    Before the final schedule, we usually recheck the validation of VLIW
>> bundles, and reschedule the conflicted insns into two VLIW to make them
>> validation to avoid above issue, so this is not a correctness issue.
>> >  Certainly, reschedule the conflicted insns into two VLIW will destroy
>> the kernel loop's sms schedule result, and usually it will be harmful to the
>> performance.
>> 
>> Yeah.  The reason I was worried about the TI markers being stale is that, in
>> general, register allocation can introduce new spills and reloads, can add
>> and remove instructions, and can convert instructions into different forms
>> (e.g. as a result of register elimination).
>> There are then post-reload optimisers that can change the code further.
>> All these things could invalidate the VLIW bundling done by the first
>> scheduler.
>> 
>> It sounds like that's not happening in your motivating testcase, and the
>> VLIW bundling is still correct (for this loop) by the time that regrename
>> runs.  Is that right?
>
> Yes, it is right.
>
>> It's interesting that this is for a testcase using SMS.  One of the traditional
>> problems with the GCC implementation of SMS has been ensuring that
>> later passes don't mess up the scheduled loop.  So in your testcase, does
>> register allocation succeed for the SMS loop without invalidating the
>> bundling decisions?
>
> Yes.
>
>> If so, then it's probably better to avoid running regrename on it at all.
>> It mostly exists to help the second scheduling pass, but the second
>> scheduling pass shouldn't be messing with an SMS loop anyway.  Also,
>> although the patch deals with one case in which regrename could disrupt
>> the bundling, there are others too.
>> 
>> So maybe one option would be to make regrename ignore blocks that
>> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
>> discounted
>> already.)
>
> ok, according your advice, I make a new patch attached.

Thanks.  I think we should treat the SMS and the REG_UNUSED stuff as
separate patches though.

For the SMS part, I think a better place to enforce the rule
is in build_def_use.  If that function returns false early for
BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
block without wasting too much compile time on it, and we'll still
keep the pass structures internally correct.  (It would also be good
to have a dump_file message to say that that's what we're doing.)

Do you still need the REG_UNUSED stuff with the SMS patch?  If so,
could you describe the (presumably non-SMS) cases that are affected?

TBH, given that the bundling information is so uncertain at this stage,
I think it would be better to have a mode in which regrename ignores
REG_UNUSED notes altogether.  Perhaps we could put it under a --param,
which targets could then set to whichever default they prefer.
The default should be the current behaviour though.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-07-22 20:47 ` Richard Sandiford
@ 2020-07-26  7:28   ` Zhongyunde
  2020-07-27  4:53   ` Zhongyunde
  1 sibling, 0 replies; 12+ messages in thread
From: Zhongyunde @ 2020-07-26  7:28 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Yangfei (Felix)

[-- Attachment #1: Type: text/plain, Size: 2338 bytes --]


> >> It's interesting that this is for a testcase using SMS.  One of the
> >> traditional problems with the GCC implementation of SMS has been
> >> ensuring that later passes don't mess up the scheduled loop.  So in
> >> your testcase, does register allocation succeed for the SMS loop
> >> without invalidating the bundling decisions?
> >
> > Yes.
> >
> >> If so, then it's probably better to avoid running regrename on it at all.
> >> It mostly exists to help the second scheduling pass, but the second
> >> scheduling pass shouldn't be messing with an SMS loop anyway.  Also,
> >> although the patch deals with one case in which regrename could
> >> disrupt the bundling, there are others too.
> >>
> >> So maybe one option would be to make regrename ignore blocks that
> >> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
> >> discounted
> >> already.)
> >
> > ok, according your advice, I make a new patch attached.
> 
> Thanks.  I think we should treat the SMS and the REG_UNUSED stuff as
> separate patches though.
> 
> For the SMS part, I think a better place to enforce the rule is in
> build_def_use.  If that function returns false early for
> BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
> block without wasting too much compile time on it, and we'll still keep the
> pass structures internally correct.  (It would also be good to have a
> dump_file message to say that that's what we're doing.)

> Do you still need the REG_UNUSED stuff with the SMS patch?  If so, could
> you describe the (presumably non-SMS) cases that are affected?

Yes, the non-SMS basic block should not be affected. 
An alternate method attached can avoid use REG_UNUSED stuff for BB with BB_DISABLE_SCHEDUL.

I don't change build_def_use to return false early as I find some other optimization reuse the function
regrename_analyze to creat def/use chain info of the kernel loop body in our target.

> TBH, given that the bundling information is so uncertain at this stage, I
> think it would be better to have a mode in which regrename ignores
> REG_UNUSED notes altogether.  Perhaps we could put it under a --param,
> which targets could then set to whichever default they prefer.
> The default should be the current behaviour though.


> Thanks,
> Richard

[-- Attachment #2: PR95696_2.patch --]
[-- Type: application/octet-stream, Size: 777 bytes --]

diff --git a/gcc/regrename.c b/gcc/regrename.c
index c38173a77..2683629db 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -475,6 +475,7 @@ rename_chains (void)
       int n_uses;
       HARD_REG_SET this_unavailable;
       int reg = this_head->regno;
+      basic_block bb;
 
       if (this_head->cannot_rename)
 	continue;
@@ -493,6 +494,15 @@ rename_chains (void)
       if (n_uses < 2)
 	continue;
 
+      bb = BLOCK_FOR_INSN (this_head->first->insn);
+      if ((bb->flags & BB_DISABLE_SCHEDULE) != 0
+	  && (bb->flags & BB_MODIFIED) == 0)
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "Skip to avoid disrupting the sms schedule\n");
+	  continue;
+	}
+
       best_new_reg = find_rename_reg (this_head, super_class,
 				      &this_unavailable, reg, true);
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-07-22 20:47 ` Richard Sandiford
  2020-07-26  7:28   ` Zhongyunde
@ 2020-07-27  4:53   ` Zhongyunde
  2020-07-27 17:32     ` Richard Sandiford
  1 sibling, 1 reply; 12+ messages in thread
From: Zhongyunde @ 2020-07-27  4:53 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Yangfei (Felix)

[-- Attachment #1: Type: text/plain, Size: 2931 bytes --]

I reconsider the issue and update patch attached.
Yes, If the kernel loop bb's information doesn't use in regrename, it also need not be collected to save compile time.

> -----Original Message-----
> From: Zhongyunde
> Sent: Sunday, July 26, 2020 3:29 PM
> To: 'Richard Sandiford' <richard.sandiford@arm.com>
> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) <felix.yang@huawei.com>
> Subject: RE: [PATCH PR95696] regrename creates overlapping register
> allocations for vliw
> 
> 
> > >> It's interesting that this is for a testcase using SMS.  One of the
> > >> traditional problems with the GCC implementation of SMS has been
> > >> ensuring that later passes don't mess up the scheduled loop.  So in
> > >> your testcase, does register allocation succeed for the SMS loop
> > >> without invalidating the bundling decisions?
> > >
> > > Yes.
> > >
> > >> If so, then it's probably better to avoid running regrename on it at all.
> > >> It mostly exists to help the second scheduling pass, but the second
> > >> scheduling pass shouldn't be messing with an SMS loop anyway.
> > >> Also, although the patch deals with one case in which regrename
> > >> could disrupt the bundling, there are others too.
> > >>
> > >> So maybe one option would be to make regrename ignore blocks that
> > >> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
> > >> discounted
> > >> already.)
> > >
> > > ok, according your advice, I make a new patch attached.
> >
> > Thanks.  I think we should treat the SMS and the REG_UNUSED stuff as
> > separate patches though.
> >
> > For the SMS part, I think a better place to enforce the rule is in
> > build_def_use.  If that function returns false early for
> > BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
> > block without wasting too much compile time on it, and we'll still
> > keep the pass structures internally correct.  (It would also be good
> > to have a dump_file message to say that that's what we're doing.)
> 
> > Do you still need the REG_UNUSED stuff with the SMS patch?  If so,
> > could you describe the (presumably non-SMS) cases that are affected?
> 
> Yes, the non-SMS basic block should not be affected.
> An alternate method attached can avoid use REG_UNUSED stuff for BB with
> BB_DISABLE_SCHEDUL.
> 
> I don't change build_def_use to return false early as I find some other
> optimization reuse the function regrename_analyze to creat def/use chain
> info of the kernel loop body in our target.
> 
> > TBH, given that the bundling information is so uncertain at this
> > stage, I think it would be better to have a mode in which regrename
> > ignores REG_UNUSED notes altogether.  Perhaps we could put it under
> a
> > --param, which targets could then set to whichever default they prefer.
> > The default should be the current behaviour though.
> 
> 
> > Thanks,
> > Richard

[-- Attachment #2: PR95696_3.patch --]
[-- Type: application/octet-stream, Size: 558 bytes --]

diff --git a/gcc/regrename.c b/gcc/regrename.c
index c38173a77..1df58e52d 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -737,6 +737,14 @@ regrename_analyze (bitmap bb_mask)
       if (dump_file)
 	fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
 
+      if ((bb1->flags & BB_DISABLE_SCHEDULE) != 0
+	  && (bb1->flags & BB_MODIFIED) == 0)
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "skip to avoid disrupting the sms schedule\n");
+	  continue;
+	}
+
       init_rename_info (this_info, bb1);
 
       success = build_def_use (bb1);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-07-27  4:53   ` Zhongyunde
@ 2020-07-27 17:32     ` Richard Sandiford
  2020-07-31  6:09       ` Zhongyunde
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Sandiford @ 2020-07-27 17:32 UTC (permalink / raw)
  To: Zhongyunde; +Cc: gcc-patches, Yangfei (Felix)

Zhongyunde <zhongyunde@huawei.com> writes:
> I reconsider the issue and update patch attached.
> Yes, If the kernel loop bb's information doesn't use in regrename, it
> also need not be collected to save compile time.

Thanks.  The point about other callers was a good one though.
I think it would be OK to add a new parameter to regrename_analyze
that says whether it should process all blocks, ignoring
BB_DISABLE_SCHEDULE.  The default value could be true, with only
regrename itself passing false.

Why do you need the BB_MODIFIED test?  The point in time that that
flag is measured from is somewhat arbitrary.  Also, the modification
that caused the flag to be set might not have invalidated the schedule.

Richard

>
>> -----Original Message-----
>> From: Zhongyunde
>> Sent: Sunday, July 26, 2020 3:29 PM
>> To: 'Richard Sandiford' <richard.sandiford@arm.com>
>> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) <felix.yang@huawei.com>
>> Subject: RE: [PATCH PR95696] regrename creates overlapping register
>> allocations for vliw
>> 
>> 
>> > >> It's interesting that this is for a testcase using SMS.  One of the
>> > >> traditional problems with the GCC implementation of SMS has been
>> > >> ensuring that later passes don't mess up the scheduled loop.  So in
>> > >> your testcase, does register allocation succeed for the SMS loop
>> > >> without invalidating the bundling decisions?
>> > >
>> > > Yes.
>> > >
>> > >> If so, then it's probably better to avoid running regrename on it at all.
>> > >> It mostly exists to help the second scheduling pass, but the second
>> > >> scheduling pass shouldn't be messing with an SMS loop anyway.
>> > >> Also, although the patch deals with one case in which regrename
>> > >> could disrupt the bundling, there are others too.
>> > >>
>> > >> So maybe one option would be to make regrename ignore blocks that
>> > >> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
>> > >> discounted
>> > >> already.)
>> > >
>> > > ok, according your advice, I make a new patch attached.
>> >
>> > Thanks.  I think we should treat the SMS and the REG_UNUSED stuff as
>> > separate patches though.
>> >
>> > For the SMS part, I think a better place to enforce the rule is in
>> > build_def_use.  If that function returns false early for
>> > BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
>> > block without wasting too much compile time on it, and we'll still
>> > keep the pass structures internally correct.  (It would also be good
>> > to have a dump_file message to say that that's what we're doing.)
>> 
>> > Do you still need the REG_UNUSED stuff with the SMS patch?  If so,
>> > could you describe the (presumably non-SMS) cases that are affected?
>> 
>> Yes, the non-SMS basic block should not be affected.
>> An alternate method attached can avoid use REG_UNUSED stuff for BB with
>> BB_DISABLE_SCHEDUL.
>> 
>> I don't change build_def_use to return false early as I find some other
>> optimization reuse the function regrename_analyze to creat def/use chain
>> info of the kernel loop body in our target.
>> 
>> > TBH, given that the bundling information is so uncertain at this
>> > stage, I think it would be better to have a mode in which regrename
>> > ignores REG_UNUSED notes altogether.  Perhaps we could put it under
>> a
>> > --param, which targets could then set to whichever default they prefer.
>> > The default should be the current behaviour though.
>> 
>> 
>> > Thanks,
>> > Richard
>
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> index c38173a77..1df58e52d 100644
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -737,6 +737,14 @@ regrename_analyze (bitmap bb_mask)
>        if (dump_file)
>  	fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
>  
> +      if ((bb1->flags & BB_DISABLE_SCHEDULE) != 0
> +	  && (bb1->flags & BB_MODIFIED) == 0)
> +	{
> +	  if (dump_file)
> +	    fprintf (dump_file, "skip to avoid disrupting the sms schedule\n");
> +	  continue;
> +	}
> +
>        init_rename_info (this_info, bb1);
>  
>        success = build_def_use (bb1);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-07-27 17:32     ` Richard Sandiford
@ 2020-07-31  6:09       ` Zhongyunde
  2020-07-31  9:32         ` Richard Sandiford
  0 siblings, 1 reply; 12+ messages in thread
From: Zhongyunde @ 2020-07-31  6:09 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Yangfei (Felix)

[-- Attachment #1: Type: text/plain, Size: 4858 bytes --]


> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
> Sent: Tuesday, July 28, 2020 1:33 AM
> To: Zhongyunde <zhongyunde@huawei.com>
> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) <felix.yang@huawei.com>
> Subject: Re: [PATCH PR95696] regrename creates overlapping register
> allocations for vliw
> 
> Zhongyunde <zhongyunde@huawei.com> writes:
> > I reconsider the issue and update patch attached.
> > Yes, If the kernel loop bb's information doesn't use in regrename, it
> > also need not be collected to save compile time.
> 
> Thanks.  The point about other callers was a good one though.
> I think it would be OK to add a new parameter to regrename_analyze that
> says whether it should process all blocks, ignoring BB_DISABLE_SCHEDULE.
> The default value could be true, with only regrename itself passing false.

Thanks for your detail advice, and I add a new parameter to regrename_analyze.

> Why do you need the BB_MODIFIED test?  The point in time that that flag
> is measured from is somewhat arbitrary.  Also, the modification that
> caused the flag to be set might not have invalidated the schedule.

Yes, in fact the BB_MODIFIED test is not need.

> Richard
> 
> >
> >> -----Original Message-----
> >> From: Zhongyunde
> >> Sent: Sunday, July 26, 2020 3:29 PM
> >> To: 'Richard Sandiford' <richard.sandiford@arm.com>
> >> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) <felix.yang@huawei.com>
> >> Subject: RE: [PATCH PR95696] regrename creates overlapping register
> >> allocations for vliw
> >>
> >>
> >> > >> It's interesting that this is for a testcase using SMS.  One of
> >> > >> the traditional problems with the GCC implementation of SMS has
> >> > >> been ensuring that later passes don't mess up the scheduled
> >> > >> loop.  So in your testcase, does register allocation succeed for
> >> > >> the SMS loop without invalidating the bundling decisions?
> >> > >
> >> > > Yes.
> >> > >
> >> > >> If so, then it's probably better to avoid running regrename on it at
> all.
> >> > >> It mostly exists to help the second scheduling pass, but the
> >> > >> second scheduling pass shouldn't be messing with an SMS loop
> anyway.
> >> > >> Also, although the patch deals with one case in which regrename
> >> > >> could disrupt the bundling, there are others too.
> >> > >>
> >> > >> So maybe one option would be to make regrename ignore blocks
> >> > >> that have BB_DISABLE_SCHEDULE set.  (Sorry if that's been
> >> > >> discussed and discounted
> >> > >> already.)
> >> > >
> >> > > ok, according your advice, I make a new patch attached.
> >> >
> >> > Thanks.  I think we should treat the SMS and the REG_UNUSED stuff
> >> > as separate patches though.
> >> >
> >> > For the SMS part, I think a better place to enforce the rule is in
> >> > build_def_use.  If that function returns false early for
> >> > BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
> >> > block without wasting too much compile time on it, and we'll still
> >> > keep the pass structures internally correct.  (It would also be
> >> > good to have a dump_file message to say that that's what we're
> >> > doing.)
> >>
> >> > Do you still need the REG_UNUSED stuff with the SMS patch?  If so,
> >> > could you describe the (presumably non-SMS) cases that are
> affected?
> >>
> >> Yes, the non-SMS basic block should not be affected.
> >> An alternate method attached can avoid use REG_UNUSED stuff for BB
> >> with BB_DISABLE_SCHEDUL.
> >>
> >> I don't change build_def_use to return false early as I find some
> >> other optimization reuse the function regrename_analyze to creat
> >> def/use chain info of the kernel loop body in our target.
> >>
> >> > TBH, given that the bundling information is so uncertain at this
> >> > stage, I think it would be better to have a mode in which regrename
> >> > ignores REG_UNUSED notes altogether.  Perhaps we could put it
> under
> >> a
> >> > --param, which targets could then set to whichever default they
> prefer.
> >> > The default should be the current behaviour though.
> >>
> >>
> >> > Thanks,
> >> > Richard
> >
> > diff --git a/gcc/regrename.c b/gcc/regrename.c index
> > c38173a77..1df58e52d 100644
> > --- a/gcc/regrename.c
> > +++ b/gcc/regrename.c
> > @@ -737,6 +737,14 @@ regrename_analyze (bitmap bb_mask)
> >        if (dump_file)
> >  	fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
> >
> > +      if ((bb1->flags & BB_DISABLE_SCHEDULE) != 0
> > +	  && (bb1->flags & BB_MODIFIED) == 0)
> > +	{
> > +	  if (dump_file)
> > +	    fprintf (dump_file, "skip to avoid disrupting the sms schedule\n");
> > +	  continue;
> > +	}
> > +
> >        init_rename_info (this_info, bb1);
> >
> >        success = build_def_use (bb1);

[-- Attachment #2: PR95696_5.patch --]
[-- Type: application/octet-stream, Size: 1806 bytes --]

diff --git a/gcc/regrename.c b/gcc/regrename.c
old mode 100644
new mode 100755
index 637b3cbe6d7..815ed22805d
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -684,10 +684,12 @@ merge_chains (du_head_p c1, du_head_p c2)
   c1->cannot_rename |= c2->cannot_rename;
 }
 
-/* Analyze the current function and build chains for renaming.  */
+/* Analyze the current function and build chains for renaming.
+   If INCLUDE_ALL_BLOCKS_P is set to true, should process all blocks,
+   ignoring BB_DISABLE_SCHEDULE.  The default value is true.  */
 
 void
-regrename_analyze (bitmap bb_mask)
+regrename_analyze (bitmap bb_mask, bool include_all_block_p)
 {
   struct bb_rename_info *rename_info;
   int i;
@@ -737,6 +739,14 @@ regrename_analyze (bitmap bb_mask)
       if (dump_file)
 	fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
 
+      if (!include_all_block_p && (bb1->flags & BB_DISABLE_SCHEDULE) != 0)
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "avoid disrupting the sms schedule of bb %d\n",
+	         bb1->index);
+	  continue;
+	}
+
       init_rename_info (this_info, bb1);
 
       success = build_def_use (bb1);
@@ -1946,7 +1956,7 @@ regrename_optimize (void)
 
   regrename_init (false);
 
-  regrename_analyze (NULL);
+  regrename_analyze (NULL, false);
 
   rename_chains ();
 
diff --git a/gcc/regrename.h b/gcc/regrename.h
index 37f5e398ded..2bdb0338186 100644
--- a/gcc/regrename.h
+++ b/gcc/regrename.h
@@ -96,7 +96,7 @@ extern vec<insn_rr_info> insn_rr;
 
 extern void regrename_init (bool);
 extern void regrename_finish (void);
-extern void regrename_analyze (bitmap);
+extern void regrename_analyze (bitmap, bool = true);
 extern du_head_p regrename_chain_from_id (unsigned int);
 extern int find_rename_reg (du_head_p, enum reg_class, HARD_REG_SET *, int,
 			    bool);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-07-31  6:09       ` Zhongyunde
@ 2020-07-31  9:32         ` Richard Sandiford
  2020-08-03  7:40           ` Yangfei (Felix)
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Sandiford @ 2020-07-31  9:32 UTC (permalink / raw)
  To: Zhongyunde; +Cc: gcc-patches, Yangfei (Felix)

Thanks for the update, looks good.  Could you post a changelog too
so that I can use it when committing?

The changelog was the only reason I didn't just push the patch,
but FWIW, a couple of very minor things…

Zhongyunde <zhongyunde@huawei.com> writes:
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> old mode 100644
> new mode 100755
> index 637b3cbe6d7..815ed22805d
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -684,10 +684,12 @@ merge_chains (du_head_p c1, du_head_p c2)
>    c1->cannot_rename |= c2->cannot_rename;
>  }
>  
> -/* Analyze the current function and build chains for renaming.  */
> +/* Analyze the current function and build chains for renaming.
> +   If INCLUDE_ALL_BLOCKS_P is set to true, should process all blocks,
> +   ignoring BB_DISABLE_SCHEDULE.  The default value is true.  */

I think s/should// here, since GCC comments usually use an imperative
style.

> @@ -737,6 +739,14 @@ regrename_analyze (bitmap bb_mask)
>        if (dump_file)
>  	fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
>  
> +      if (!include_all_block_p && (bb1->flags & BB_DISABLE_SCHEDULE) != 0)
> +	{
> +	  if (dump_file)
> +	    fprintf (dump_file, "avoid disrupting the sms schedule of bb %d\n",
> +	         bb1->index);

bb1->index should be indented below “dump_file”.

Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-07-31  9:32         ` Richard Sandiford
@ 2020-08-03  7:40           ` Yangfei (Felix)
  2020-08-03 14:05             ` Richard Sandiford
  0 siblings, 1 reply; 12+ messages in thread
From: Yangfei (Felix) @ 2020-08-03  7:40 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Zhongyunde

[-- Attachment #1: Type: text/plain, Size: 2094 bytes --]

Hi Richard,

    Thanks for reviewing this fix and the detailed suggestions : -)
    Looks like my colleague Yunde was having some issue setting up his local repo.
    I have prepared one for him.  Attached please find the patch.
    Bootstrapped and tested on aarch64-linux-gnu.  Please help install if it's good to go.

Felix

> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
> Sent: Friday, July 31, 2020 5:33 PM
> To: Zhongyunde <zhongyunde@huawei.com>
> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) <felix.yang@huawei.com>
> Subject: Re: [PATCH PR95696] regrename creates overlapping register
> allocations for vliw
> 
> Thanks for the update, looks good.  Could you post a changelog too so that I
> can use it when committing?
> 
> The changelog was the only reason I didn't just push the patch, but FWIW, a
> couple of very minor things…
> 
> Zhongyunde <zhongyunde@huawei.com> writes:
> > diff --git a/gcc/regrename.c b/gcc/regrename.c old mode 100644 new
> > mode 100755 index 637b3cbe6d7..815ed22805d
> > --- a/gcc/regrename.c
> > +++ b/gcc/regrename.c
> > @@ -684,10 +684,12 @@ merge_chains (du_head_p c1, du_head_p c2)
> >    c1->cannot_rename |= c2->cannot_rename;  }
> >
> > -/* Analyze the current function and build chains for renaming.  */
> > +/* Analyze the current function and build chains for renaming.
> > +   If INCLUDE_ALL_BLOCKS_P is set to true, should process all blocks,
> > +   ignoring BB_DISABLE_SCHEDULE.  The default value is true.  */
> 
> I think s/should// here, since GCC comments usually use an imperative style.
> 
> > @@ -737,6 +739,14 @@ regrename_analyze (bitmap bb_mask)
> >        if (dump_file)
> >  	fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
> >
> > +      if (!include_all_block_p && (bb1->flags & BB_DISABLE_SCHEDULE) != 0)
> > +	{
> > +	  if (dump_file)
> > +	    fprintf (dump_file, "avoid disrupting the sms schedule of bb %d\n",
> > +	         bb1->index);
> 
> bb1->index should be indented below “dump_file”.
> 
> Richard

[-- Attachment #2: pr95696-v0.diff --]
[-- Type: application/octet-stream, Size: 3012 bytes --]

From 471c7c45a923baf1503ac9699b46e178cad3ccb1 Mon Sep 17 00:00:00 2001
From: Yunde Zhong <zhongyunde@huawei.com>
Date: Mon, 3 Aug 2020 15:31:42 +0800
Subject: [PATCH] regrename: Avoid disrupting SMS schedule [PR95696]

SMS is performed before reload, and each insn in SMS schedule uses
pseudo-register.  After reload, regrename pass try to adjust the hard
registers with def/use chain created by build_def_use.  For now, regrename
pass isn't aware of VLIW bundles created by SMS, it may updated a register
which may not be really unused, which will causes invalid VLIW bundles.
Before the final schedule, we recheck the validation of VLIW bundles and
reschedule the conflicted insns to avoid the above issue.  Rescheduling
the conflicted insns will destroy SMS schedule of the kernel loop, which
would be harmful to performance.

2020-08-03  Yunde Zhong  <zhongyunde@huawei.com>

gcc/
	PR rtl-optimization/95696
	* regrename.c (regrename_analyze): New param include_all_block_p
	with default value TRUE.  If set to false, avoid disrupting SMS
	schedule.
	* regrename.h (regrename_analyze): Adjust prototype.
---
 gcc/regrename.c | 16 +++++++++++++---
 gcc/regrename.h |  2 +-
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/gcc/regrename.c b/gcc/regrename.c
index 669a6ead705..ebe74c50d0a 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -695,10 +695,12 @@ merge_chains (du_head_p c1, du_head_p c2)
   c1->cannot_rename |= c2->cannot_rename;
 }
 
-/* Analyze the current function and build chains for renaming.  */
+/* Analyze the current function and build chains for renaming.
+   If INCLUDE_ALL_BLOCKS_P is set to true, process all blocks,
+   ignoring BB_DISABLE_SCHEDULE.  The default value is true.  */
 
 void
-regrename_analyze (bitmap bb_mask)
+regrename_analyze (bitmap bb_mask, bool include_all_block_p)
 {
   class bb_rename_info *rename_info;
   int i;
@@ -748,6 +750,14 @@ regrename_analyze (bitmap bb_mask)
       if (dump_file)
 	fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
 
+      if (!include_all_block_p && (bb1->flags & BB_DISABLE_SCHEDULE) != 0)
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "avoid disrupting the sms schedule of bb %d\n",
+		     bb1->index);
+	  continue;
+	}
+
       init_rename_info (this_info, bb1);
 
       success = build_def_use (bb1);
@@ -1962,7 +1972,7 @@ regrename_optimize (void)
 
   regrename_init (false);
 
-  regrename_analyze (NULL);
+  regrename_analyze (NULL, false);
 
   rename_chains ();
 
diff --git a/gcc/regrename.h b/gcc/regrename.h
index f2af04bab4a..10a3aa14e5d 100644
--- a/gcc/regrename.h
+++ b/gcc/regrename.h
@@ -100,7 +100,7 @@ extern vec<insn_rr_info> insn_rr;
 
 extern void regrename_init (bool);
 extern void regrename_finish (void);
-extern void regrename_analyze (bitmap);
+extern void regrename_analyze (bitmap, bool = true);
 extern du_head_p regrename_chain_from_id (unsigned int);
 extern int find_rename_reg (du_head_p, enum reg_class, HARD_REG_SET *, int,
 			    bool);
-- 
2.19.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-08-03  7:40           ` Yangfei (Felix)
@ 2020-08-03 14:05             ` Richard Sandiford
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Sandiford @ 2020-08-03 14:05 UTC (permalink / raw)
  To: Yangfei (Felix); +Cc: gcc-patches, Zhongyunde

"Yangfei (Felix)" <felix.yang@huawei.com> writes:
> Hi Richard,
>
>     Thanks for reviewing this fix and the detailed suggestions : -)
>     Looks like my colleague Yunde was having some issue setting up his local repo.
>     I have prepared one for him.  Attached please find the patch.
>     Bootstrapped and tested on aarch64-linux-gnu.  Please help install if it's good to go.

Thanks, pushed to trunk.

Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-07-20  0:59 ` Zhongyunde
@ 2020-07-20 16:05   ` Richard Sandiford
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Sandiford @ 2020-07-20 16:05 UTC (permalink / raw)
  To: Zhongyunde; +Cc: gcc-patches, Yangfei (A)

Hi,

Zhongyunde <zhongyunde@huawei.com> writes:
> Hi,
>
> In most target, it is limited to issue two insns with change the same register. So a register is not realy unused if there is another insn, which set the register in the save VLIW.
>
> For example, The insn 73 start with insn:TI, so it will be issued together with others insns until a new insn start with insn:TI, here is insn 243.
>
> The regrename pass known the mode V2VF in insn 73 need two successive registers, i.e. v2 and v3, here is dump snippet before the regrename:
>
>
>
> ==========================================================================================
>
> (insn:TI 73 76 71 4 (set (reg/v:V2VF 37 v2 [orig:180 _62 ] [180])
>
>         (unspec:V2VF [
>
>                 (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
>
>                 (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
>
>             ] UNSPEC_HFSQMAG_32X32)) "../test_modify.c":57 710 {hfsqmag_v2vf}
>
>      (expr_list:REG_DEAD (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
>
>         (expr_list:REG_UNUSED (reg:VHF 38 v3)
>
>             (expr_list:REG_STAGE (const_int 2 [0x2])
>
>                 (expr_list:REG_CYCLE (const_int 2 [0x2])
>
>                     (expr_list:REG_UNITS (const_int 256 [0x100])
>
>                         (nil)))))))
>
>
>
> (insn 71 73 243 4 (set (reg:VHF 43 v8 [orig:265 MEM[(const vfloat32x16 *)Src_base_134] ] [265])
>
>         (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 {movvhf_internal}
>
>      (expr_list:REG_STAGE (const_int 1 [0x1])
>
>         (expr_list:REG_CYCLE (const_int 2 [0x2])
>
>             (nil))))
>
>
>
> (insn:TI 243 …
>
>
>
> Then, in the regrename, the insn 71 will be transformed as following code with register v3, and there is an conflict between insn 73 and insn 71 (as both of them set the v3 register).
>
>
>
> Register v2 (2): 73 [SVEC_REGS]
>
> Register v8 (1): 71 [VEC_ALL_REGS]
>
>
>
> ....
>
>
>
> (insn 71 73 243 4 (set (reg:VHF 38 v3 [orig:265 MEM[(const vfloat32x16 *)Src_base_134] ] [265])
>
>         (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 {movvhf_internal}
>
>      (expr_list:REG_STAGE (const_int 1 [0x1])
>
>         (expr_list:REG_CYCLE (const_int 2 [0x2])

Do you need this for correctness, or is it “just” an optimisation?

The reason for asking is that regrename runs quite a long time after
the first scheduling pass, with things like register allocation
happening in between.  The :TI markers on the instructions are
therefore going to be very stale by the time that regrename runs.

If it's just a heuristic then it might be OK to use them anyway,
but it would be worth having a comment to say what's going on.

One alternative would be to run the DFA over the instructions
to recompute the VLIW bundles, but that isn't simple.  It also isn't
necessarily a better heuristic, since the second scheduling pass is
likely to change things anyway.

There's also the problem that :TI markers are used on non-VLIW
targets too, whereas the problem is really specific to VLIW targets.

Unfortunately, I can't think of any good counter-suggestions
at the moment…

Thanks,
Richard

>
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> index c38173a77..e54794413 100644
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -1614,12 +1614,26 @@ record_out_operands (rtx_insn *insn, bool earlyclobber, insn_rr_info *insn_info)
>    cur_operand = NULL;
>  }
>  
> +/* Get the first real insn of next vliw in current BB.  */
> +static rtx_insn *
> +get_next_vliw_first_insn (rtx_insn *cur_insn, basic_block bb)
> +{
> +  rtx_insn *insn = next_real_insn (cur_insn);
> +
> +  for (; insn && insn != BB_END (bb); insn = next_real_insn (insn))
> +    if (GET_MODE (insn) == TImode)
> +      return insn;
> +
> +  return cur_insn;
> +}
> +
>  /* Build def/use chain.  */
>  
>  static bool
>  build_def_use (basic_block bb)
>  {
>    rtx_insn *insn;
> +  rtx_insn *vliw_start_insn = NULL;
>    unsigned HOST_WIDE_INT untracked_operands;
>  
>    fail_current_block = false;
> @@ -1663,6 +1677,9 @@ build_def_use (basic_block bb)
>  	     to be marked unrenamable or even cause us to abort the entire
>  	     basic block.  */
>  
> +	  if (GET_MODE (insn) == TImode)
> +	    vliw_start_insn = insn;
> +
>  	  extract_constrain_insn (insn);
>  	  preprocess_constraints (insn);
>  	  const operand_alternative *op_alt = which_op_alt ();
> @@ -1858,17 +1875,26 @@ build_def_use (basic_block bb)
>  	      scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_access,
>  			OP_INOUT);
>  
> -	  /* Step 7: Close chains for registers that were never
> -	     really used here.  */
> -	  for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
> -	    if (REG_NOTE_KIND (note) == REG_UNUSED)
> -	      {
> -		remove_from_hard_reg_set (&live_hard_regs,
> -					  GET_MODE (XEXP (note, 0)),
> -					  REGNO (XEXP (note, 0)));
> -		scan_rtx (insn, &XEXP (note, 0), NO_REGS, terminate_dead,
> -			  OP_IN);
> -	      }
> +	  /* Step 7: Close chains for registers that were never
> +	     really used delayed at the end of vliw.  */
> +	  if (vliw_start_insn
> +	      && next_real_insn (insn) == get_next_vliw_first_insn (insn, bb))
> +	    {
> +	      rtx_insn *member;
> +
> +	      for (member = vliw_start_insn;
> +	           member != next_active_insn (insn);
> +	           member = next_active_insn (member))
> +	        for (note = REG_NOTES (member); note; note = XEXP (note, 1))
> +	          if (REG_NOTE_KIND (note) == REG_UNUSED)
> +	            {
> +	              remove_from_hard_reg_set (&live_hard_regs,
> +	                                        GET_MODE (XEXP (note, 0)),
> +	                                        REGNO (XEXP (note, 0)));
> +	              scan_rtx (member, &XEXP (note, 0), NO_REGS, terminate_dead,
> +	                        OP_IN);
> +	            }
> +	    }
>  
>  	  /* Step 8: Kill the chains involving register restores.  Those
>  	     should restore _that_ register.  */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH PR95696] regrename creates overlapping register allocations for vliw
  2020-07-16 16:18 zhongyunde
@ 2020-07-20  0:59 ` Zhongyunde
  2020-07-20 16:05   ` Richard Sandiford
  0 siblings, 1 reply; 12+ messages in thread
From: Zhongyunde @ 2020-07-20  0:59 UTC (permalink / raw)
  To: gcc-patches; +Cc: Zhanghaijian (A), Yangfei (A)

[-- Attachment #1: Type: text/plain, Size: 2312 bytes --]

Hi,

In most target, it is limited to issue two insns with change the same register. So a register is not realy unused if there is another insn, which set the register in the save VLIW.

For example, The insn 73 start with insn:TI, so it will be issued together with others insns until a new insn start with insn:TI, here is insn 243.

The regrename pass known the mode V2VF in insn 73 need two successive registers, i.e. v2 and v3, here is dump snippet before the regrename:



==========================================================================================

(insn:TI 73 76 71 4 (set (reg/v:V2VF 37 v2 [orig:180 _62 ] [180])

        (unspec:V2VF [

                (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])

                (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])

            ] UNSPEC_HFSQMAG_32X32)) "../test_modify.c":57 710 {hfsqmag_v2vf}

     (expr_list:REG_DEAD (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])

        (expr_list:REG_UNUSED (reg:VHF 38 v3)

            (expr_list:REG_STAGE (const_int 2 [0x2])

                (expr_list:REG_CYCLE (const_int 2 [0x2])

                    (expr_list:REG_UNITS (const_int 256 [0x100])

                        (nil)))))))



(insn 71 73 243 4 (set (reg:VHF 43 v8 [orig:265 MEM[(const vfloat32x16 *)Src_base_134] ] [265])

        (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 {movvhf_internal}

     (expr_list:REG_STAGE (const_int 1 [0x1])

        (expr_list:REG_CYCLE (const_int 2 [0x2])

            (nil))))



(insn:TI 243 …



Then, in the regrename, the insn 71 will be transformed as following code with register v3, and there is an conflict between insn 73 and insn 71 (as both of them set the v3 register).



Register v2 (2): 73 [SVEC_REGS]

Register v8 (1): 71 [VEC_ALL_REGS]



....



(insn 71 73 243 4 (set (reg:VHF 38 v3 [orig:265 MEM[(const vfloat32x16 *)Src_base_134] ] [265])

        (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 {movvhf_internal}

     (expr_list:REG_STAGE (const_int 1 [0x1])

        (expr_list:REG_CYCLE (const_int 2 [0x2])

[-- Attachment #2: PR95696.patch --]
[-- Type: application/octet-stream, Size: 2736 bytes --]

diff --git a/gcc/regrename.c b/gcc/regrename.c
index c38173a77..e54794413 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -1614,12 +1614,26 @@ record_out_operands (rtx_insn *insn, bool earlyclobber, insn_rr_info *insn_info)
   cur_operand = NULL;
 }
 
+/* Get the first real insn of next vliw in current BB.  */
+static rtx_insn *
+get_next_vliw_first_insn (rtx_insn *cur_insn, basic_block bb)
+{
+  rtx_insn *insn = next_real_insn (cur_insn);
+
+  for (; insn && insn != BB_END (bb); insn = next_real_insn (insn))
+    if (GET_MODE (insn) == TImode)
+      return insn;
+
+  return cur_insn;
+}
+
 /* Build def/use chain.  */
 
 static bool
 build_def_use (basic_block bb)
 {
   rtx_insn *insn;
+  rtx_insn *vliw_start_insn = NULL;
   unsigned HOST_WIDE_INT untracked_operands;
 
   fail_current_block = false;
@@ -1663,6 +1677,9 @@ build_def_use (basic_block bb)
 	     to be marked unrenamable or even cause us to abort the entire
 	     basic block.  */
 
+	  if (GET_MODE (insn) == TImode)
+	    vliw_start_insn = insn;
+
 	  extract_constrain_insn (insn);
 	  preprocess_constraints (insn);
 	  const operand_alternative *op_alt = which_op_alt ();
@@ -1858,17 +1875,26 @@ build_def_use (basic_block bb)
 	      scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_access,
 			OP_INOUT);
 
-	  /* Step 7: Close chains for registers that were never
-	     really used here.  */
-	  for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
-	    if (REG_NOTE_KIND (note) == REG_UNUSED)
-	      {
-		remove_from_hard_reg_set (&live_hard_regs,
-					  GET_MODE (XEXP (note, 0)),
-					  REGNO (XEXP (note, 0)));
-		scan_rtx (insn, &XEXP (note, 0), NO_REGS, terminate_dead,
-			  OP_IN);
-	      }
+	  /* Step 7: Close chains for registers that were never
+	     really used delayed at the end of vliw.  */
+	  if (vliw_start_insn
+	      && next_real_insn (insn) == get_next_vliw_first_insn (insn, bb))
+	    {
+	      rtx_insn *member;
+
+	      for (member = vliw_start_insn;
+	           member != next_active_insn (insn);
+	           member = next_active_insn (member))
+	        for (note = REG_NOTES (member); note; note = XEXP (note, 1))
+	          if (REG_NOTE_KIND (note) == REG_UNUSED)
+	            {
+	              remove_from_hard_reg_set (&live_hard_regs,
+	                                        GET_MODE (XEXP (note, 0)),
+	                                        REGNO (XEXP (note, 0)));
+	              scan_rtx (member, &XEXP (note, 0), NO_REGS, terminate_dead,
+	                        OP_IN);
+	            }
+	    }
 
 	  /* Step 8: Kill the chains involving register restores.  Those
 	     should restore _that_ register.  */

^ permalink raw reply	[flat|nested] 12+ messages in thread

*  [PATCH PR95696] regrename creates overlapping register allocations for vliw
@ 2020-07-16 16:18 zhongyunde
  2020-07-20  0:59 ` Zhongyunde
  0 siblings, 1 reply; 12+ messages in thread
From: zhongyunde @ 2020-07-16 16:18 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1715 bytes --]

hi,&nbsp;&nbsp;&nbsp;Insometarget,itislimitedtoissuetwoinsnstogetherwithchangethesameregister,so&nbsp;Imakeapatchtoextendtheliverangeuntiltheendofvliwtoavoidit.(Theinsn73startwithinsn:TI,soitwillbeissuedtogetherwithothersinsnsuntilanewinsnstartwithinsn:TI,suchasinsn71)TheregrenamecanknownthemodeV2VFininsn73needtwosuccessiveregisters,i.e.v2andv3,hereisdumpsnippetbeforetheregrename.(insn:TI7376714(set(reg/v:V2VF37v2[orig:180_62][180])(unspec:V2VF[(reg/v:VHF43v8[orig:210Dest_value][210])(reg/v:VHF43v8[orig:210Dest_value][210])]UNSPEC_HFSQMAG_32X32))&quot;../test_modify.c&quot;:57710{hfsqmag_v2vf}(expr_list:REG_DEAD(reg/v:VHF43v8[orig:210Dest_value][210])(expr_list:REG_UNUSED(reg:VHF38v3)(expr_list:REG_STAGE(const_int2[0x2])(expr_list:REG_CYCLE(const_int2[0x2])(expr_list:REG_UNITS(const_int256[0x100])(nil)))))))(insn71732434(set(reg:VHF43v8[orig:265MEM[(constvfloat32x16*)Src_base_134]][265])(mem:VHF(reg/v/f:DI13a13[orig:207Src_base][207])[1MEM[(constvfloat32x16*)Src_base_134]+0S64A512]))&quot;../test_modify.c&quot;:56450{movvhf_internal}(expr_list:REG_STAGE(const_int1[0x1])(expr_list:REG_CYCLE(const_int2[0x2])(nil))))Then,intheregrename,theinsn71willbetransformedintofollowingcodewithregisterv3,sothereisanconflictbetweeninsn73andinsn71,asbothofthemsetthev3register.Registerv2(2):73[SVEC_REGS]Registerv8(1):71[VEC_ALL_REGS]....(insn71732434(set(reg:VHF38v3[orig:265MEM[(constvfloat32x16*)Src_base_134]][265])(mem:VHF(reg/v/f:DI13a13[orig:207Src_base][207])[1MEM[(constvfloat32x16*)Src_base_134]+0S64A512]))&quot;../test_modify.c&quot;:56450{movvhf_internal}(expr_list:REG_STAGE(const_int1[0x1])&nbsp;(expr_list:REG_CYCLE(const_int2[0x2])&nbsp;随心邮-在微信里收发邮件,及时省电又安心

[-- Attachment #2: PR95696.patch --]
[-- Type: application/octet-stream, Size: 2658 bytes --]

diff --git a/gcc/regrename.c b/gcc/regrename.c
index c38173a77..e54794413 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -1614,12 +1614,26 @@ record_out_operands (rtx_insn *insn, bool earlyclobber, insn_rr_info *insn_info)
   cur_operand = NULL;
 }
 
+/* Get the first real insn of next vliw in current BB.  */
+static rtx_insn *
+get_next_vliw_first_insn (rtx_insn *cur_insn, basic_block bb)
+{
+  rtx_insn *insn = next_real_insn (cur_insn);
+
+  for (; insn && insn != BB_END (bb); insn = next_real_insn (insn))
+    if (GET_MODE (insn) == TImode)
+      return insn;
+
+  return cur_insn;
+}
+
 /* Build def/use chain.  */
 
 static bool
 build_def_use (basic_block bb)
 {
   rtx_insn *insn;
+  rtx_insn *vliw_start_insn = NULL;
   unsigned HOST_WIDE_INT untracked_operands;
 
   fail_current_block = false;
@@ -1663,6 +1677,9 @@ build_def_use (basic_block bb)
 	     to be marked unrenamable or even cause us to abort the entire
 	     basic block.  */
 
+	  if (GET_MODE (insn) == TImode)
+	    vliw_start_insn = insn;
+
 	  extract_constrain_insn (insn);
 	  preprocess_constraints (insn);
 	  const operand_alternative *op_alt = which_op_alt ();
@@ -1858,17 +1875,26 @@ build_def_use (basic_block bb)
 	      scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_access,
 			OP_INOUT);
 
-	  /* Step 7: Close chains for registers that were never
-	     really used here.  */
-	  for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
-	    if (REG_NOTE_KIND (note) == REG_UNUSED)
-	      {
-		remove_from_hard_reg_set (&live_hard_regs,
-					  GET_MODE (XEXP (note, 0)),
-					  REGNO (XEXP (note, 0)));
-		scan_rtx (insn, &XEXP (note, 0), NO_REGS, terminate_dead,
-			  OP_IN);
-	      }
+	  /* Step 7: Close chains for registers that were never
+	     really used delayed at the end of vliw.  */
+	  if (vliw_start_insn
+	      && next_real_insn (insn) == get_next_vliw_first_insn (insn, bb))
+	    {
+	      rtx_insn *member;
+
+	      for (member = vliw_start_insn;
+	           member != next_active_insn (insn);
+	           member = next_active_insn (member))
+	        for (note = REG_NOTES (member); note; note = XEXP (note, 1))
+	          if (REG_NOTE_KIND (note) == REG_UNUSED)
+	            {
+	              remove_from_hard_reg_set (&live_hard_regs,
+	                                        GET_MODE (XEXP (note, 0)),
+	                                        REGNO (XEXP (note, 0)));
+	              scan_rtx (member, &XEXP (note, 0), NO_REGS, terminate_dead,
+	                        OP_IN);
+	            }
+	    }
 
 	  /* Step 8: Kill the chains involving register restores.  Those
 	     should restore _that_ register.  */

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-08-03 14:05 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-22 14:58 RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw Zhongyunde
2020-07-22 20:47 ` Richard Sandiford
2020-07-26  7:28   ` Zhongyunde
2020-07-27  4:53   ` Zhongyunde
2020-07-27 17:32     ` Richard Sandiford
2020-07-31  6:09       ` Zhongyunde
2020-07-31  9:32         ` Richard Sandiford
2020-08-03  7:40           ` Yangfei (Felix)
2020-08-03 14:05             ` Richard Sandiford
  -- strict thread matches above, loose matches on Subject: below --
2020-07-16 16:18 zhongyunde
2020-07-20  0:59 ` Zhongyunde
2020-07-20 16:05   ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).