public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH/RFC] combine: Tweak the condition of last_set invalidation
@ 2020-12-16  8:49 Kewen.Lin
  2021-01-14  2:29 ` PING^1 " Kewen.Lin
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Kewen.Lin @ 2020-12-16  8:49 UTC (permalink / raw)
  To: GCC Patches; +Cc: Segher Boessenkool, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 2076 bytes --]

Hi,

When I was investigating unsigned int vec_init issue on Power,
I happened to find there seems something we can enhance in how
combine pass invalidate last_set (set last_set_invalid nonzero).

Currently we have the check:

      if (!insn
	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
	rsp->last_set_invalid = 1; 

which means if we want to record some value for some reg and
this reg got refered before in a valid scope, we invalidate the
set of reg (last_set_invalid to 1).  It avoids to find the wrong
set for one reg reference, such as the case like:

   ... op regX  // this regX could find wrong last_set below
   regX = ...   // if we think this set is valid
   ... op regX

But because of retry's existence, the last_set_table_tick could
be set by some later reference insns, but we see it's set due
to retry on the set (for that reg) insn again, such as:

   insn 1
   insn 2

   regX = ...     --> (a)
   ... op regX    --> (b)
   
   insn 3

   // assume all in the same BB.

Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
(3 insns -> 2 insns), retrying from insn1 or insn2 again:
it will scan insn (a) again, the below condition holds for regX:

  (value && rsp->last_set_table_tick >= label_tick_ebb_start)

it will mark this set as invalid set.  But actually the
last_set_table_tick here is set by insn (b) before retrying, so it
should be safe to be taken as valid set.

This proposal is to check whether the last_set_table safely happens
after the current set, make the set still valid if so.

Bootstrapped/regtested on powerpc64le-linux-gnu (P9),
aarch64-linux-gnu and x86_64-pc-linux-gnu.

Full SPEC2017 building shows this patch gets more sucessful combines
from 1902208 to 1902243 (trivial though).

Any comments are highly appreciated!

BR,
Kewen
-----

gcc/ChangeLog:

	* combine.c (struct reg_stat_type): New member
	last_set_table_luid.
	(update_table_tick): Add one argument for insn luid and
	set last_set_table_luid with it.
	(record_value_for_reg): Adjust the condition to set
	last_set_invalid nonzero.

[-- Attachment #2: last_set_luid.diff --]
[-- Type: text/plain, Size: 3570 bytes --]

diff --git a/gcc/combine.c b/gcc/combine.c
index 6fb2fa82c3f..2f45a0ad733 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -202,6 +202,10 @@ struct reg_stat_type {
 
   int				last_set_table_tick;
 
+  /* Record the luid of the insn whose expression involving register n.  */
+
+  int				last_set_table_luid;
+
   /* Record the value of label_tick when the value for register n is placed in
      last_set_value.  */
 
@@ -480,7 +484,7 @@ static rtx gen_lowpart_for_combine (machine_mode, rtx);
 static enum rtx_code simplify_compare_const (enum rtx_code, machine_mode,
 					     rtx, rtx *);
 static enum rtx_code simplify_comparison (enum rtx_code, rtx *, rtx *);
-static void update_table_tick (rtx);
+static void update_table_tick (rtx, int);
 static void record_value_for_reg (rtx, rtx_insn *, rtx);
 static void check_promoted_subreg (rtx_insn *, rtx);
 static void record_dead_and_set_regs_1 (rtx, const_rtx, void *);
@@ -13228,7 +13232,7 @@ count_rtxs (rtx x)
    for each register mentioned.  Similar to mention_regs in cse.c  */
 
 static void
-update_table_tick (rtx x)
+update_table_tick (rtx x, int insn_luid)
 {
   enum rtx_code code = GET_CODE (x);
   const char *fmt = GET_RTX_FORMAT (code);
@@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
       for (r = regno; r < endregno; r++)
 	{
 	  reg_stat_type *rsp = &reg_stat[r];
-	  rsp->last_set_table_tick = label_tick;
+	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
+	    {
+	      /* Later references should not have lower ticks.  */
+	      gcc_assert (label_tick >= rsp->last_set_table_tick);
+	      /* Should pick up the lowest luid if the references
+		 are in the same block.  */
+	      if (label_tick == rsp->last_set_table_tick
+		  && rsp->last_set_table_luid > insn_luid)
+		rsp->last_set_table_luid = insn_luid;
+	    }
+	  else
+	    {
+	      rsp->last_set_table_tick = label_tick;
+	      rsp->last_set_table_luid = insn_luid;
+	    }
 	}
 
       return;
@@ -13279,16 +13297,17 @@ update_table_tick (rtx x)
 	    if (ARITHMETIC_P (x0)
 		&& (x1 == XEXP (x0, 0) || x1 == XEXP (x0, 1)))
 	      {
-		update_table_tick (XEXP (x0, x1 == XEXP (x0, 0) ? 1 : 0));
+		update_table_tick (XEXP (x0, x1 == XEXP (x0, 0) ? 1 : 0),
+				   insn_luid);
 		break;
 	      }
 	  }
 
-	update_table_tick (XEXP (x, i));
+	update_table_tick (XEXP (x, i), insn_luid);
       }
     else if (fmt[i] == 'E')
       for (j = 0; j < XVECLEN (x, i); j++)
-	update_table_tick (XVECEXP (x, i, j));
+	update_table_tick (XVECEXP (x, i, j), insn_luid);
 }
 
 /* Record that REG is set to VALUE in insn INSN.  If VALUE is zero, we
@@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
 
   /* Mark registers that are being referenced in this value.  */
   if (value)
-    update_table_tick (value);
+    {
+      gcc_assert (insn);
+      update_table_tick (value, DF_INSN_LUID (insn));
+    }
 
   /* Now update the status of each register being set.
      If someone is using this register in this block, set this register
@@ -13372,8 +13394,11 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
     {
       rsp = &reg_stat[i];
       rsp->last_set_label = label_tick;
+      gcc_assert (label_tick >= rsp->last_set_table_tick);
       if (!insn
-	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
+	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start
+	      && !(label_tick == rsp->last_set_table_tick
+		   && DF_INSN_LUID (insn) < rsp->last_set_table_luid)))
 	rsp->last_set_invalid = 1;
       else
 	rsp->last_set_invalid = 0;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^1 [PATCH/RFC] combine: Tweak the condition of last_set invalidation
  2020-12-16  8:49 [PATCH/RFC] combine: Tweak the condition of last_set invalidation Kewen.Lin
@ 2021-01-14  2:29 ` Kewen.Lin
  2021-01-15  0:22 ` Segher Boessenkool
  2021-06-09 20:17 ` Segher Boessenkool
  2 siblings, 0 replies; 20+ messages in thread
From: Kewen.Lin @ 2021-01-14  2:29 UTC (permalink / raw)
  To: GCC Patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi,

I'd like to gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html


BR,
Kewen

on 2020/12/16 下午4:49, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> When I was investigating unsigned int vec_init issue on Power,
> I happened to find there seems something we can enhance in how
> combine pass invalidate last_set (set last_set_invalid nonzero).
> 
> Currently we have the check:
> 
>       if (!insn
> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
> 	rsp->last_set_invalid = 1; 
> 
> which means if we want to record some value for some reg and
> this reg got refered before in a valid scope, we invalidate the
> set of reg (last_set_invalid to 1).  It avoids to find the wrong
> set for one reg reference, such as the case like:
> 
>    ... op regX  // this regX could find wrong last_set below
>    regX = ...   // if we think this set is valid
>    ... op regX
> 
> But because of retry's existence, the last_set_table_tick could
> be set by some later reference insns, but we see it's set due
> to retry on the set (for that reg) insn again, such as:
> 
>    insn 1
>    insn 2
> 
>    regX = ...     --> (a)
>    ... op regX    --> (b)
>    
>    insn 3
> 
>    // assume all in the same BB.
> 
> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
> (3 insns -> 2 insns), retrying from insn1 or insn2 again:
> it will scan insn (a) again, the below condition holds for regX:
> 
>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
> 
> it will mark this set as invalid set.  But actually the
> last_set_table_tick here is set by insn (b) before retrying, so it
> should be safe to be taken as valid set.
> 
> This proposal is to check whether the last_set_table safely happens
> after the current set, make the set still valid if so.
> 
> Bootstrapped/regtested on powerpc64le-linux-gnu (P9),
> aarch64-linux-gnu and x86_64-pc-linux-gnu.
> 
> Full SPEC2017 building shows this patch gets more sucessful combines
> from 1902208 to 1902243 (trivial though).
> 
> Any comments are highly appreciated!
> 
> BR,
> Kewen
> -----
> 
> gcc/ChangeLog:
> 
> 	* combine.c (struct reg_stat_type): New member
> 	last_set_table_luid.
> 	(update_table_tick): Add one argument for insn luid and
> 	set last_set_table_luid with it.
> 	(record_value_for_reg): Adjust the condition to set
> 	last_set_invalid nonzero.
> 


BR,
Kewen

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/RFC] combine: Tweak the condition of last_set invalidation
  2020-12-16  8:49 [PATCH/RFC] combine: Tweak the condition of last_set invalidation Kewen.Lin
  2021-01-14  2:29 ` PING^1 " Kewen.Lin
@ 2021-01-15  0:22 ` Segher Boessenkool
  2021-01-15  8:06   ` Kewen.Lin
  2021-06-09 20:17 ` Segher Boessenkool
  2 siblings, 1 reply; 20+ messages in thread
From: Segher Boessenkool @ 2021-01-15  0:22 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, Bill Schmidt

Hi!

On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
> When I was investigating unsigned int vec_init issue on Power,
> I happened to find there seems something we can enhance in how
> combine pass invalidate last_set (set last_set_invalid nonzero).
> 
> Currently we have the check:
> 
>       if (!insn
> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
> 	rsp->last_set_invalid = 1; 
> 
> which means if we want to record some value for some reg and
> this reg got refered before in a valid scope, we invalidate the
> set of reg (last_set_invalid to 1).  It avoids to find the wrong
> set for one reg reference, such as the case like:
> 
>    ... op regX  // this regX could find wrong last_set below
>    regX = ...   // if we think this set is valid
>    ... op regX
> 
> But because of retry's existence, the last_set_table_tick could

It is not just because of retry: combine can change other insns than
just i2 and i3, too.  And even changing i2 requires this!

The whole reg_stat stuff is an ugly hack that does not work well.  For
example, as in your example, some "known" value can be invalidated
before the combination that wants to know that value is tried.

We need to have this outside of combine, in a dataflow(-like) thing
for example.  This could take the place of REG_EQ* as well probably
(which is good, there are various problems with that as well).

> This proposal is to check whether the last_set_table safely happens
> after the current set, make the set still valid if so.

I don't think this is safe to do like this, unfortunately.  There are
more places that set last_set_invalid (well, one more), so at the very
minimum this needs a lot more justification.


Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/RFC] combine: Tweak the condition of last_set invalidation
  2021-01-15  0:22 ` Segher Boessenkool
@ 2021-01-15  8:06   ` Kewen.Lin
  2021-01-22  0:30     ` Segher Boessenkool
  0 siblings, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-01-15  8:06 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, Bill Schmidt

Hi Segher,

Thanks for the comments!

on 2021/1/15 上午8:22, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>> When I was investigating unsigned int vec_init issue on Power,
>> I happened to find there seems something we can enhance in how
>> combine pass invalidate last_set (set last_set_invalid nonzero).
>>
>> Currently we have the check:
>>
>>       if (!insn
>> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
>> 	rsp->last_set_invalid = 1; 
>>
>> which means if we want to record some value for some reg and
>> this reg got refered before in a valid scope, we invalidate the
>> set of reg (last_set_invalid to 1).  It avoids to find the wrong
>> set for one reg reference, such as the case like:
>>
>>    ... op regX  // this regX could find wrong last_set below
>>    regX = ...   // if we think this set is valid
>>    ... op regX
>>
>> But because of retry's existence, the last_set_table_tick could
> 
> It is not just because of retry: combine can change other insns than
> just i2 and i3, too.  And even changing i2 requires this!
> 

Ah, thanks for the information!  Here retry is one example for that
we can revisit one instruction but meanwhile the stored information
for reg reference can be from that instruction after the current
one but visited before.

> The whole reg_stat stuff is an ugly hack that does not work well.  For
> example, as in your example, some "known" value can be invalidated
> before the combination that wants to know that value is tried.
> 
> We need to have this outside of combine, in a dataflow(-like) thing
> for example.  This could take the place of REG_EQ* as well probably
> (which is good, there are various problems with that as well).
> 

Good point, but IIUC we still need to keep updating(tracking)
information like what we put into reg_stat stuff, it's not static
since as you pointed out above, combine can change i2/i3 etc,
we need to update the information for the changes.  Anyway, it's not
what this patch tries to solve.  :-P

>> This proposal is to check whether the last_set_table safely happens
>> after the current set, make the set still valid if so.
> 
> I don't think this is safe to do like this, unfortunately.  There are
> more places that set last_set_invalid (well, one more), so at the very
> minimum this needs a lot more justification.
> 

Let me try to explain it more.
* Background *

There are two places which set last_set_invalid to 1. 

CASE 1:

```
  if (CALL_P (insn))
    {
      HARD_REG_SET callee_clobbers
	= insn_callee_abi (insn).full_and_partial_reg_clobbers ();
      hard_reg_set_iterator hrsi;
      EXECUTE_IF_SET_IN_HARD_REG_SET (callee_clobbers, 0, i, hrsi)
	{
	  reg_stat_type *rsp;

	  /* ??? We could try to preserve some information from the last
	     set of register I if the call doesn't actually clobber
	     (reg:last_set_mode I), which might be true for ABIs with
	     partial clobbers.  However, it would be difficult to
	     update last_set_nonzero_bits and last_sign_bit_copies
	     to account for the part of I that actually was clobbered.
	     It wouldn't help much anyway, since we rarely see this
	     situation before RA.  */
	  rsp = &reg_stat[i];
	  rsp->last_set_invalid = 1;
```

The justification for this part is that if ABI defines some callee
clobberred registers, we need to invalidate their sets to avoid
later references to use the unexpected values.

CASE 2:

```
  for (i = regno; i < endregno; i++)
    {
      rsp = &reg_stat[i];
      rsp->last_set_label = label_tick;
      if (!insn
	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
	rsp->last_set_invalid = 1;
      else
	rsp->last_set_invalid = 0;
    }
```

The justification here is that: if the insn is NULL, it's simply to
invalidate this reg set, go for it; if the value is NULL, it's simply
to say the reg clobberred, invalidate it; if the reference of this
reg being set have been seen, let's invalidate it.

The last part follows the comments:

  /* Now update the status of each register being set.
     If someone is using this register in this block, set this register
     to invalid since we will get confused between the two lives in this
     basic block.  This makes using this register always invalid.  In cse, we
     scan the table to invalidate all entries using this register, but this
     is too much work for us.  */

It's understandable to invalidate it to avoid the case:

   ... op regX    // (a)
   regX = ...     // (b)
   ... op regX    // (c)
 
When we are revisiting (a), it avoids to use the reg set in (b).

* Problem *

Firstly, the problem that this patch is trying to solve is mainly related
to case 2, so it doesn't touch CASE 1.

In the context of CASE 2, the current condition

  (rsp->last_set_table_tick >= label_tick_ebb_start)

is completely safe but too conservative to over-kill some case as below:

   regX = ...     // (d)
   ... op regX    // (e)

At the beginning, the visiting sequence looks like: (d) -> (e),
assuming (d) joins in some successful combination, then re-visiting (d). 
At that time of re-visiting (d), the last_set_table_tick has been set
according to (e) since we found regX's reference when visiting (e) before.
Then it will invalidate the set for regX.

This set (d) is actually safe since the recorded last_set_table_tick is
from one later reference (e) after this set.

* Solution *

The patch is trying to distinguish some safe cases like (d)(e)(d),
the basic points are:

  1) obtain the earliest reference of set reg.
     
     We have one assertion that the later references should have
     the same or higher label_tick, so if we already have one
     lower label_tick, we won't update the last_set_table_tick.

  2) record the luid of the earliest reference.
     
     When we meet one reference, if the label_tick are the same
     as the kept one, check the luid and choose the earliest one.

     If both label_ticks of reg set and reg ref are different, we
     have to take it as unsafe conservatively.  If they are the
     same, we can simply compare the luid to ensure the reg set
     happens before reg ref.

BR,
Kewen

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/RFC] combine: Tweak the condition of last_set invalidation
  2021-01-15  8:06   ` Kewen.Lin
@ 2021-01-22  0:30     ` Segher Boessenkool
  2021-01-22  2:21       ` Kewen.Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Segher Boessenkool @ 2021-01-22  0:30 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, Bill Schmidt

Hi Ke Wen,

On Fri, Jan 15, 2021 at 04:06:17PM +0800, Kewen.Lin wrote:
> on 2021/1/15 上午8:22, Segher Boessenkool wrote:
> > On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
> >>    ... op regX  // this regX could find wrong last_set below
> >>    regX = ...   // if we think this set is valid
> >>    ... op regX
> >>
> >> But because of retry's existence, the last_set_table_tick could
> > 
> > It is not just because of retry: combine can change other insns than
> > just i2 and i3, too.  And even changing i2 requires this!
> 
> Ah, thanks for the information!  Here retry is one example for that
> we can revisit one instruction but meanwhile the stored information
> for reg reference can be from that instruction after the current
> one but visited before.

Yes; and we do not usually revisit just one insn, but everything after
it as well.  We only need to revisit thos insns that are fed by what
has changed, but this is a good enough approximation (we never revisit
very far back).

> > The whole reg_stat stuff is an ugly hack that does not work well.  For
> > example, as in your example, some "known" value can be invalidated
> > before the combination that wants to know that value is tried.
> > 
> > We need to have this outside of combine, in a dataflow(-like) thing
> > for example.  This could take the place of REG_EQ* as well probably
> > (which is good, there are various problems with that as well).
> 
> Good point, but IIUC we still need to keep updating(tracking)
> information like what we put into reg_stat stuff, it's not static
> since as you pointed out above, combine can change i2/i3 etc,
> we need to update the information for the changes.

Yes, we should keep it correct all the time, and for any point in the
code.  It also can be used by other passes, e.g. it can replace all
REG_EQ* notes, all nonzero_bits and num_sign_bit_copies, and no doubt
even more things.

> Anyway, it's not what this patch tries to solve.  :-P

:-)

> >> This proposal is to check whether the last_set_table safely happens
> >> after the current set, make the set still valid if so.
> > 
> > I don't think this is safe to do like this, unfortunately.  There are
> > more places that set last_set_invalid (well, one more), so at the very
> > minimum this needs a lot more justification.
> 
> Let me try to explain it more.
> * Background *
> 
> There are two places which set last_set_invalid to 1. 
> 
> CASE 1:

<SNIP>

Thanks for the in-depth explanation!

I think this should be postponed to stage 1 though?  Or is there
anything very urgent in it?


Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/RFC] combine: Tweak the condition of last_set invalidation
  2021-01-22  0:30     ` Segher Boessenkool
@ 2021-01-22  2:21       ` Kewen.Lin
  2021-05-07  2:45         ` Kewen.Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-01-22  2:21 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, Bill Schmidt

Hi Segher,

on 2021/1/22 上午8:30, Segher Boessenkool wrote:
> Hi Ke Wen,
> 
> On Fri, Jan 15, 2021 at 04:06:17PM +0800, Kewen.Lin wrote:
>> on 2021/1/15 上午8:22, Segher Boessenkool wrote:
>>> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>>>>    ... op regX  // this regX could find wrong last_set below
>>>>    regX = ...   // if we think this set is valid
>>>>    ... op regX
>>>>
>>>> But because of retry's existence, the last_set_table_tick could
>>>
>>> It is not just because of retry: combine can change other insns than
>>> just i2 and i3, too.  And even changing i2 requires this!
>>
>> Ah, thanks for the information!  Here retry is one example for that
>> we can revisit one instruction but meanwhile the stored information
>> for reg reference can be from that instruction after the current
>> one but visited before.
> 
> Yes; and we do not usually revisit just one insn, but everything after
> it as well.  We only need to revisit thos insns that are fed by what
> has changed, but this is a good enough approximation (we never revisit
> very far back).
> 
>>> The whole reg_stat stuff is an ugly hack that does not work well.  For
>>> example, as in your example, some "known" value can be invalidated
>>> before the combination that wants to know that value is tried.
>>>
>>> We need to have this outside of combine, in a dataflow(-like) thing
>>> for example.  This could take the place of REG_EQ* as well probably
>>> (which is good, there are various problems with that as well).
>>
>> Good point, but IIUC we still need to keep updating(tracking)
>> information like what we put into reg_stat stuff, it's not static
>> since as you pointed out above, combine can change i2/i3 etc,
>> we need to update the information for the changes.
> 
> Yes, we should keep it correct all the time, and for any point in the
> code.  It also can be used by other passes, e.g. it can replace all
> REG_EQ* notes, all nonzero_bits and num_sign_bit_copies, and no doubt
> even more things.
> 
>> Anyway, it's not what this patch tries to solve.  :-P
> 
> :-)
> 
>>>> This proposal is to check whether the last_set_table safely happens
>>>> after the current set, make the set still valid if so.
>>>
>>> I don't think this is safe to do like this, unfortunately.  There are
>>> more places that set last_set_invalid (well, one more), so at the very
>>> minimum this needs a lot more justification.
>>
>> Let me try to explain it more.
>> * Background *
>>
>> There are two places which set last_set_invalid to 1. 
>>
>> CASE 1:
> 
> <SNIP>
> 
> Thanks for the in-depth explanation!
> 
> I think this should be postponed to stage 1 though?  Or is there
> anything very urgent in it?
> 

Yeah, I agree that this belongs to stage1, and there isn't anything
urgent about it.  Thanks for all further comments above!


BR,
Kewen

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/RFC] combine: Tweak the condition of last_set invalidation
  2021-01-22  2:21       ` Kewen.Lin
@ 2021-05-07  2:45         ` Kewen.Lin
  2021-05-26  3:04           ` PING^2 " Kewen.Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-05-07  2:45 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, Bill Schmidt

Hi Segher,

>>
>> I think this should be postponed to stage 1 though?  Or is there
>> anything very urgent in it?
>>
> 
> Yeah, I agree that this belongs to stage1, and there isn't anything
> urgent about it.  Thanks for all further comments above!
> 

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html

BR,
Kewen

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^2 [PATCH/RFC] combine: Tweak the condition of last_set invalidation
  2021-05-07  2:45         ` Kewen.Lin
@ 2021-05-26  3:04           ` Kewen.Lin
  2021-06-09  2:32             ` PING^3 " Kewen.Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-05-26  3:04 UTC (permalink / raw)
  To: gcc-patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html

BR,
Kewen

on 2021/5/7 上午10:45, Kewen.Lin via Gcc-patches wrote:
> Hi Segher,
> 
>>>
>>> I think this should be postponed to stage 1 though?  Or is there
>>> anything very urgent in it?
>>>
>>
>> Yeah, I agree that this belongs to stage1, and there isn't anything
>> urgent about it.  Thanks for all further comments above!
>>
> 
> Gentle ping this:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html
> 
> BR,
> Kewen
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^3 [PATCH/RFC] combine: Tweak the condition of last_set invalidation
  2021-05-26  3:04           ` PING^2 " Kewen.Lin
@ 2021-06-09  2:32             ` Kewen.Lin
  0 siblings, 0 replies; 20+ messages in thread
From: Kewen.Lin @ 2021-06-09  2:32 UTC (permalink / raw)
  To: gcc-patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html

BR,
Kewen

on 2021/5/26 上午11:04, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> Gentle ping this:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html
> 
> BR,
> Kewen
> 
> on 2021/5/7 上午10:45, Kewen.Lin via Gcc-patches wrote:
>> Hi Segher,
>>
>>>>
>>>> I think this should be postponed to stage 1 though?  Or is there
>>>> anything very urgent in it?
>>>>
>>>
>>> Yeah, I agree that this belongs to stage1, and there isn't anything
>>> urgent about it.  Thanks for all further comments above!
>>>
>>
>> Gentle ping this:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html
>>
>> BR,
>> Kewen
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH/RFC] combine: Tweak the condition of last_set invalidation
  2020-12-16  8:49 [PATCH/RFC] combine: Tweak the condition of last_set invalidation Kewen.Lin
  2021-01-14  2:29 ` PING^1 " Kewen.Lin
  2021-01-15  0:22 ` Segher Boessenkool
@ 2021-06-09 20:17 ` Segher Boessenkool
  2021-06-11 13:16   ` [PATCH v2] " Kewen.Lin
  2 siblings, 1 reply; 20+ messages in thread
From: Segher Boessenkool @ 2021-06-09 20:17 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, Bill Schmidt

Hi!

On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
> Currently we have the check:
> 
>       if (!insn
> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
> 	rsp->last_set_invalid = 1; 
> 
> which means if we want to record some value for some reg and
> this reg got refered before in a valid scope,

If we already know it is *set* in this same extended basic block.
Possibly by the same instruction btw.

> we invalidate the
> set of reg (last_set_invalid to 1).  It avoids to find the wrong
> set for one reg reference, such as the case like:
> 
>    ... op regX  // this regX could find wrong last_set below
>    regX = ...   // if we think this set is valid
>    ... op regX

Yup, exactly.

> But because of retry's existence, the last_set_table_tick could
> be set by some later reference insns, but we see it's set due
> to retry on the set (for that reg) insn again, such as:
> 
>    insn 1
>    insn 2
> 
>    regX = ...     --> (a)
>    ... op regX    --> (b)
>    
>    insn 3
> 
>    // assume all in the same BB.
> 
> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
> (3 insns -> 2 insns),

This will delete insn 1 and write the combined result to insns 2 and 3.

> retrying from insn1 or insn2 again:

Always 2, but your point remains valid.

> it will scan insn (a) again, the below condition holds for regX:
> 
>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
> 
> it will mark this set as invalid set.  But actually the
> last_set_table_tick here is set by insn (b) before retrying, so it
> should be safe to be taken as valid set.

Yup.

> This proposal is to check whether the last_set_table safely happens
> after the current set, make the set still valid if so.

> Full SPEC2017 building shows this patch gets more sucessful combines
> from 1902208 to 1902243 (trivial though).

Do you have some example, or maybe even a testcase?  :-)

> +  /* Record the luid of the insn whose expression involving register n.  */
> +
> +  int				last_set_table_luid;

"Record the luid of the insn for which last_set_table_tick was set",
right?

> -static void update_table_tick (rtx);
> +static void update_table_tick (rtx, int);

Please remove this declaration instead, the function is not used until
after its actual definition :-)

> @@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
>        for (r = regno; r < endregno; r++)
>  	{
>  	  reg_stat_type *rsp = &reg_stat[r];
> -	  rsp->last_set_table_tick = label_tick;
> +	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
> +	    {
> +	      /* Later references should not have lower ticks.  */
> +	      gcc_assert (label_tick >= rsp->last_set_table_tick);

This should be obvious, but checking it won't hurt, okay.

> +	      /* Should pick up the lowest luid if the references
> +		 are in the same block.  */
> +	      if (label_tick == rsp->last_set_table_tick
> +		  && rsp->last_set_table_luid > insn_luid)
> +		rsp->last_set_table_luid = insn_luid;

Why?  Is it conservative for the check you will do later?  Please spell
this out, it is crucial!

> @@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
>  
>    /* Mark registers that are being referenced in this value.  */
>    if (value)
> -    update_table_tick (value);
> +    {
> +      gcc_assert (insn);
> +      update_table_tick (value, DF_INSN_LUID (insn));
> +    }

Don't add that assert please.  If you really want one it should come
right at the start of the function, not 60 lines later :-)

Looks good if I understood this correctly :-)


Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-06-09 20:17 ` Segher Boessenkool
@ 2021-06-11 13:16   ` Kewen.Lin
  2021-06-28  7:00     ` PING^1 " Kewen.Lin
  2021-11-29 22:28     ` Segher Boessenkool
  0 siblings, 2 replies; 20+ messages in thread
From: Kewen.Lin @ 2021-06-11 13:16 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 6139 bytes --]

Hi Segher,

Thanks for the review!

on 2021/6/10 上午4:17, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>> Currently we have the check:
>>
>>       if (!insn
>> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
>> 	rsp->last_set_invalid = 1; 
>>
>> which means if we want to record some value for some reg and
>> this reg got refered before in a valid scope,
> 
> If we already know it is *set* in this same extended basic block.
> Possibly by the same instruction btw.
> 
>> we invalidate the
>> set of reg (last_set_invalid to 1).  It avoids to find the wrong
>> set for one reg reference, such as the case like:
>>
>>    ... op regX  // this regX could find wrong last_set below
>>    regX = ...   // if we think this set is valid
>>    ... op regX
> 
> Yup, exactly.
> 
>> But because of retry's existence, the last_set_table_tick could
>> be set by some later reference insns, but we see it's set due
>> to retry on the set (for that reg) insn again, such as:
>>
>>    insn 1
>>    insn 2
>>
>>    regX = ...     --> (a)
>>    ... op regX    --> (b)
>>    
>>    insn 3
>>
>>    // assume all in the same BB.
>>
>> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
>> (3 insns -> 2 insns),
> 
> This will delete insn 1 and write the combined result to insns 2 and 3.
> 
>> retrying from insn1 or insn2 again:
> 
> Always 2, but your point remains valid.
> 
>> it will scan insn (a) again, the below condition holds for regX:
>>
>>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
>>
>> it will mark this set as invalid set.  But actually the
>> last_set_table_tick here is set by insn (b) before retrying, so it
>> should be safe to be taken as valid set.
> 
> Yup.
> 
>> This proposal is to check whether the last_set_table safely happens
>> after the current set, make the set still valid if so.
> 
>> Full SPEC2017 building shows this patch gets more sucessful combines
>> from 1902208 to 1902243 (trivial though).
> 
> Do you have some example, or maybe even a testcase?  :-)
> 

Sorry for the late reply, it took some time to get one reduced case.

typedef struct SA *pa_t;

struct SC {
  int h;
  pa_t elem[];
};

struct SD {
  struct SC *e;
};

struct SA {
  struct {
    struct SD f[1];
  } g;
};

void foo(pa_t *k, char **m) {
  int l, i;
  pa_t a;
  l = (int)a->g.f[5].e;
  i = 0;
  for (; i < l; i++) {
    k[i] = a->g.f[5].e->elem[i];
    m[i] = "";
  }
}

Baseline is r12-0 and the option is "-O3 -mcpu=power9 -fno-strict-aliasing",
with this patch, the generated assembly can save two rlwinm s.

>> +  /* Record the luid of the insn whose expression involving register n.  */
>> +
>> +  int				last_set_table_luid;
> 
> "Record the luid of the insn for which last_set_table_tick was set",
> right?
> 

But it can be updated later to one smaller luid, how about the wording like:


+  /* Record the luid of the insn which uses register n, the insn should
+     be the first one using register n in that block of the insn which
+     last_set_table_tick was set for.  */


>> -static void update_table_tick (rtx);
>> +static void update_table_tick (rtx, int);
> 
> Please remove this declaration instead, the function is not used until
> after its actual definition :-)
> 

Done.

>> @@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
>>        for (r = regno; r < endregno; r++)
>>  	{
>>  	  reg_stat_type *rsp = &reg_stat[r];
>> -	  rsp->last_set_table_tick = label_tick;
>> +	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
>> +	    {
>> +	      /* Later references should not have lower ticks.  */
>> +	      gcc_assert (label_tick >= rsp->last_set_table_tick);
> 
> This should be obvious, but checking it won't hurt, okay.
> 
>> +	      /* Should pick up the lowest luid if the references
>> +		 are in the same block.  */
>> +	      if (label_tick == rsp->last_set_table_tick
>> +		  && rsp->last_set_table_luid > insn_luid)
>> +		rsp->last_set_table_luid = insn_luid;
> 
> Why?  Is it conservative for the check you will do later?  Please spell
> this out, it is crucial!
> 

Since later the combinations involving this insn probably make the
register be used in one insn sitting ahead (which has smaller luid than
the one which was recorded before).  Yes, it's very conservative, this
ensure that we always use the luid of the insn which is the first insn
using this register in the block.  The last_set invalidation is going
to catch the case like:

   ... regX  // avoid the set used here ...
   regX = ...
   ...

Once we have the smallest luid one of all insns which use register X,
any unsafe regX sets should be caught.

I updated the comments to:

+              /* Since combination may generate some instructions
+                 to replace some foregoing instructions with the
+                 references to register r (using register r), we
+                 need to make sure we record the first instruction
+                 which is using register r, so always update with
+                 the lowest luid here.  If the given set happens
+                 before this recorded earliest reference, the set
+                 value should be safe to be used.  */

>> @@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
>>  
>>    /* Mark registers that are being referenced in this value.  */
>>    if (value)
>> -    update_table_tick (value);
>> +    {
>> +      gcc_assert (insn);
>> +      update_table_tick (value, DF_INSN_LUID (insn));
>> +    }
> 
> Don't add that assert please.  If you really want one it should come
> right at the start of the function, not 60 lines later :-)
> 

Exactly, fixed.

> Looks good if I understood this correctly :-)
> 
> 

Thanks again, I also updated the comments in func record_value_for_reg,
the new version is attached.

BR,
Kewen
-----
gcc/ChangeLog:

	* combine.c (struct reg_stat_type): New member
	last_set_table_luid.
	(update_table_tick): Add one argument for insn luid and
	set last_set_table_luid with it, remove its declaration.
	(record_value_for_reg): Adjust the condition to set
	last_set_invalid nonzero.

[-- Attachment #2: last_set_luid_v2.diff --]
[-- Type: text/plain, Size: 4345 bytes --]

diff --git a/gcc/combine.c b/gcc/combine.c
index 62bf4aeaaba..6ef6a08d54f 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -202,6 +202,12 @@ struct reg_stat_type {
 
   int				last_set_table_tick;
 
+  /* Record the luid of the insn which uses register n, the insn should
+     be the first one using register n in that block of the insn which
+     last_set_table_tick was set for.  */
+
+  int				last_set_table_luid;
+
   /* Record the value of label_tick when the value for register n is placed in
      last_set_value.  */
 
@@ -476,7 +482,6 @@ static rtx gen_lowpart_for_combine (machine_mode, rtx);
 static enum rtx_code simplify_compare_const (enum rtx_code, machine_mode,
 					     rtx, rtx *);
 static enum rtx_code simplify_comparison (enum rtx_code, rtx *, rtx *);
-static void update_table_tick (rtx);
 static void record_value_for_reg (rtx, rtx_insn *, rtx);
 static void check_promoted_subreg (rtx_insn *, rtx);
 static void record_dead_and_set_regs_1 (rtx, const_rtx, void *);
@@ -13179,7 +13184,7 @@ count_rtxs (rtx x)
    for each register mentioned.  Similar to mention_regs in cse.c  */
 
 static void
-update_table_tick (rtx x)
+update_table_tick (rtx x, int insn_luid)
 {
   enum rtx_code code = GET_CODE (x);
   const char *fmt = GET_RTX_FORMAT (code);
@@ -13194,7 +13199,27 @@ update_table_tick (rtx x)
       for (r = regno; r < endregno; r++)
 	{
 	  reg_stat_type *rsp = &reg_stat[r];
-	  rsp->last_set_table_tick = label_tick;
+	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
+	    {
+	      /* Later references should not have lower ticks.  */
+	      gcc_assert (label_tick >= rsp->last_set_table_tick);
+	      /* Since combination may generate some instructions
+		 to replace some foregoing instructions with the
+		 references to register r (using register r), we
+		 need to make sure we record the first instruction
+		 which is using register r, so always update with
+		 the lowest luid here.  If the given set happens
+		 before this recorded earliest reference, the set
+		 value should be safe to be used.  */
+	      if (label_tick == rsp->last_set_table_tick
+		  && rsp->last_set_table_luid > insn_luid)
+		rsp->last_set_table_luid = insn_luid;
+	    }
+	  else
+	    {
+	      rsp->last_set_table_tick = label_tick;
+	      rsp->last_set_table_luid = insn_luid;
+	    }
 	}
 
       return;
@@ -13230,16 +13255,17 @@ update_table_tick (rtx x)
 	    if (ARITHMETIC_P (x0)
 		&& (x1 == XEXP (x0, 0) || x1 == XEXP (x0, 1)))
 	      {
-		update_table_tick (XEXP (x0, x1 == XEXP (x0, 0) ? 1 : 0));
+		update_table_tick (XEXP (x0, x1 == XEXP (x0, 0) ? 1 : 0),
+				   insn_luid);
 		break;
 	      }
 	  }
 
-	update_table_tick (XEXP (x, i));
+	update_table_tick (XEXP (x, i), insn_luid);
       }
     else if (fmt[i] == 'E')
       for (j = 0; j < XVECLEN (x, i); j++)
-	update_table_tick (XVECEXP (x, i, j));
+	update_table_tick (XVECEXP (x, i, j), insn_luid);
 }
 
 /* Record that REG is set to VALUE in insn INSN.  If VALUE is zero, we
@@ -13310,21 +13336,26 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
 
   /* Mark registers that are being referenced in this value.  */
   if (value)
-    update_table_tick (value);
+    update_table_tick (value, DF_INSN_LUID (insn));
 
   /* Now update the status of each register being set.
      If someone is using this register in this block, set this register
      to invalid since we will get confused between the two lives in this
      basic block.  This makes using this register always invalid.  In cse, we
      scan the table to invalidate all entries using this register, but this
-     is too much work for us.  */
+     is too much work for us.  If we know this register set and its register
+     uses are in the same block, and the set always happens before any uses,
+     we don't need to make it invalid.  */
 
   for (i = regno; i < endregno; i++)
     {
       rsp = &reg_stat[i];
       rsp->last_set_label = label_tick;
+      gcc_assert (label_tick >= rsp->last_set_table_tick);
       if (!insn
-	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
+	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start
+	      && !(label_tick == rsp->last_set_table_tick
+		   && DF_INSN_LUID (insn) < rsp->last_set_table_luid)))
 	rsp->last_set_invalid = 1;
       else
 	rsp->last_set_invalid = 0;

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^1 [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-06-11 13:16   ` [PATCH v2] " Kewen.Lin
@ 2021-06-28  7:00     ` Kewen.Lin
  2021-07-15  2:00       ` PING^2 " Kewen.Lin
  2021-11-29 22:28     ` Segher Boessenkool
  1 sibling, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-06-28  7:00 UTC (permalink / raw)
  To: GCC Patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi!

I'd like to gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html


BR,
Kewen

on 2021/6/11 下午9:16, Kewen.Lin via Gcc-patches wrote:
> Hi Segher,
> 
> Thanks for the review!
> 
> on 2021/6/10 上午4:17, Segher Boessenkool wrote:
>> Hi!
>>
>> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>>> Currently we have the check:
>>>
>>>       if (!insn
>>> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
>>> 	rsp->last_set_invalid = 1; 
>>>
>>> which means if we want to record some value for some reg and
>>> this reg got refered before in a valid scope,
>>
>> If we already know it is *set* in this same extended basic block.
>> Possibly by the same instruction btw.
>>
>>> we invalidate the
>>> set of reg (last_set_invalid to 1).  It avoids to find the wrong
>>> set for one reg reference, such as the case like:
>>>
>>>    ... op regX  // this regX could find wrong last_set below
>>>    regX = ...   // if we think this set is valid
>>>    ... op regX
>>
>> Yup, exactly.
>>
>>> But because of retry's existence, the last_set_table_tick could
>>> be set by some later reference insns, but we see it's set due
>>> to retry on the set (for that reg) insn again, such as:
>>>
>>>    insn 1
>>>    insn 2
>>>
>>>    regX = ...     --> (a)
>>>    ... op regX    --> (b)
>>>    
>>>    insn 3
>>>
>>>    // assume all in the same BB.
>>>
>>> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
>>> (3 insns -> 2 insns),
>>
>> This will delete insn 1 and write the combined result to insns 2 and 3.
>>
>>> retrying from insn1 or insn2 again:
>>
>> Always 2, but your point remains valid.
>>
>>> it will scan insn (a) again, the below condition holds for regX:
>>>
>>>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
>>>
>>> it will mark this set as invalid set.  But actually the
>>> last_set_table_tick here is set by insn (b) before retrying, so it
>>> should be safe to be taken as valid set.
>>
>> Yup.
>>
>>> This proposal is to check whether the last_set_table safely happens
>>> after the current set, make the set still valid if so.
>>
>>> Full SPEC2017 building shows this patch gets more sucessful combines
>>> from 1902208 to 1902243 (trivial though).
>>
>> Do you have some example, or maybe even a testcase?  :-)
>>
> 
> Sorry for the late reply, it took some time to get one reduced case.
> 
> typedef struct SA *pa_t;
> 
> struct SC {
>   int h;
>   pa_t elem[];
> };
> 
> struct SD {
>   struct SC *e;
> };
> 
> struct SA {
>   struct {
>     struct SD f[1];
>   } g;
> };
> 
> void foo(pa_t *k, char **m) {
>   int l, i;
>   pa_t a;
>   l = (int)a->g.f[5].e;
>   i = 0;
>   for (; i < l; i++) {
>     k[i] = a->g.f[5].e->elem[i];
>     m[i] = "";
>   }
> }
> 
> Baseline is r12-0 and the option is "-O3 -mcpu=power9 -fno-strict-aliasing",
> with this patch, the generated assembly can save two rlwinm s.
> 
>>> +  /* Record the luid of the insn whose expression involving register n.  */
>>> +
>>> +  int				last_set_table_luid;
>>
>> "Record the luid of the insn for which last_set_table_tick was set",
>> right?
>>
> 
> But it can be updated later to one smaller luid, how about the wording like:
> 
> 
> +  /* Record the luid of the insn which uses register n, the insn should
> +     be the first one using register n in that block of the insn which
> +     last_set_table_tick was set for.  */
> 
> 
>>> -static void update_table_tick (rtx);
>>> +static void update_table_tick (rtx, int);
>>
>> Please remove this declaration instead, the function is not used until
>> after its actual definition :-)
>>
> 
> Done.
> 
>>> @@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
>>>        for (r = regno; r < endregno; r++)
>>>  	{
>>>  	  reg_stat_type *rsp = &reg_stat[r];
>>> -	  rsp->last_set_table_tick = label_tick;
>>> +	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
>>> +	    {
>>> +	      /* Later references should not have lower ticks.  */
>>> +	      gcc_assert (label_tick >= rsp->last_set_table_tick);
>>
>> This should be obvious, but checking it won't hurt, okay.
>>
>>> +	      /* Should pick up the lowest luid if the references
>>> +		 are in the same block.  */
>>> +	      if (label_tick == rsp->last_set_table_tick
>>> +		  && rsp->last_set_table_luid > insn_luid)
>>> +		rsp->last_set_table_luid = insn_luid;
>>
>> Why?  Is it conservative for the check you will do later?  Please spell
>> this out, it is crucial!
>>
> 
> Since later the combinations involving this insn probably make the
> register be used in one insn sitting ahead (which has smaller luid than
> the one which was recorded before).  Yes, it's very conservative, this
> ensure that we always use the luid of the insn which is the first insn
> using this register in the block.  The last_set invalidation is going
> to catch the case like:
> 
>    ... regX  // avoid the set used here ...
>    regX = ...
>    ...
> 
> Once we have the smallest luid one of all insns which use register X,
> any unsafe regX sets should be caught.
> 
> I updated the comments to:
> 
> +              /* Since combination may generate some instructions
> +                 to replace some foregoing instructions with the
> +                 references to register r (using register r), we
> +                 need to make sure we record the first instruction
> +                 which is using register r, so always update with
> +                 the lowest luid here.  If the given set happens
> +                 before this recorded earliest reference, the set
> +                 value should be safe to be used.  */
> 
>>> @@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
>>>  
>>>    /* Mark registers that are being referenced in this value.  */
>>>    if (value)
>>> -    update_table_tick (value);
>>> +    {
>>> +      gcc_assert (insn);
>>> +      update_table_tick (value, DF_INSN_LUID (insn));
>>> +    }
>>
>> Don't add that assert please.  If you really want one it should come
>> right at the start of the function, not 60 lines later :-)
>>
> 
> Exactly, fixed.
> 
>> Looks good if I understood this correctly :-)
>>
>>
> 
> Thanks again, I also updated the comments in func record_value_for_reg,
> the new version is attached.
> 
> BR,
> Kewen
> -----
> gcc/ChangeLog:
> 
> 	* combine.c (struct reg_stat_type): New member
> 	last_set_table_luid.
> 	(update_table_tick): Add one argument for insn luid and
> 	set last_set_table_luid with it, remove its declaration.
> 	(record_value_for_reg): Adjust the condition to set
> 	last_set_invalid nonzero.
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^2 [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-06-28  7:00     ` PING^1 " Kewen.Lin
@ 2021-07-15  2:00       ` Kewen.Lin
  2021-09-08  7:03         ` PING^3 " Kewen.Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-07-15  2:00 UTC (permalink / raw)
  To: GCC Patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html

BR,
Kewen

on 2021/6/28 下午3:00, Kewen.Lin via Gcc-patches wrote:
> Hi!
> 
> I'd like to gentle ping this:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html
> 
> 
> BR,
> Kewen
> 
> on 2021/6/11 下午9:16, Kewen.Lin via Gcc-patches wrote:
>> Hi Segher,
>>
>> Thanks for the review!
>>
>> on 2021/6/10 上午4:17, Segher Boessenkool wrote:
>>> Hi!
>>>
>>> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>>>> Currently we have the check:
>>>>
>>>>       if (!insn
>>>> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
>>>> 	rsp->last_set_invalid = 1; 
>>>>
>>>> which means if we want to record some value for some reg and
>>>> this reg got refered before in a valid scope,
>>>
>>> If we already know it is *set* in this same extended basic block.
>>> Possibly by the same instruction btw.
>>>
>>>> we invalidate the
>>>> set of reg (last_set_invalid to 1).  It avoids to find the wrong
>>>> set for one reg reference, such as the case like:
>>>>
>>>>    ... op regX  // this regX could find wrong last_set below
>>>>    regX = ...   // if we think this set is valid
>>>>    ... op regX
>>>
>>> Yup, exactly.
>>>
>>>> But because of retry's existence, the last_set_table_tick could
>>>> be set by some later reference insns, but we see it's set due
>>>> to retry on the set (for that reg) insn again, such as:
>>>>
>>>>    insn 1
>>>>    insn 2
>>>>
>>>>    regX = ...     --> (a)
>>>>    ... op regX    --> (b)
>>>>    
>>>>    insn 3
>>>>
>>>>    // assume all in the same BB.
>>>>
>>>> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
>>>> (3 insns -> 2 insns),
>>>
>>> This will delete insn 1 and write the combined result to insns 2 and 3.
>>>
>>>> retrying from insn1 or insn2 again:
>>>
>>> Always 2, but your point remains valid.
>>>
>>>> it will scan insn (a) again, the below condition holds for regX:
>>>>
>>>>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>
>>>> it will mark this set as invalid set.  But actually the
>>>> last_set_table_tick here is set by insn (b) before retrying, so it
>>>> should be safe to be taken as valid set.
>>>
>>> Yup.
>>>
>>>> This proposal is to check whether the last_set_table safely happens
>>>> after the current set, make the set still valid if so.
>>>
>>>> Full SPEC2017 building shows this patch gets more sucessful combines
>>>> from 1902208 to 1902243 (trivial though).
>>>
>>> Do you have some example, or maybe even a testcase?  :-)
>>>
>>
>> Sorry for the late reply, it took some time to get one reduced case.
>>
>> typedef struct SA *pa_t;
>>
>> struct SC {
>>   int h;
>>   pa_t elem[];
>> };
>>
>> struct SD {
>>   struct SC *e;
>> };
>>
>> struct SA {
>>   struct {
>>     struct SD f[1];
>>   } g;
>> };
>>
>> void foo(pa_t *k, char **m) {
>>   int l, i;
>>   pa_t a;
>>   l = (int)a->g.f[5].e;
>>   i = 0;
>>   for (; i < l; i++) {
>>     k[i] = a->g.f[5].e->elem[i];
>>     m[i] = "";
>>   }
>> }
>>
>> Baseline is r12-0 and the option is "-O3 -mcpu=power9 -fno-strict-aliasing",
>> with this patch, the generated assembly can save two rlwinm s.
>>
>>>> +  /* Record the luid of the insn whose expression involving register n.  */
>>>> +
>>>> +  int				last_set_table_luid;
>>>
>>> "Record the luid of the insn for which last_set_table_tick was set",
>>> right?
>>>
>>
>> But it can be updated later to one smaller luid, how about the wording like:
>>
>>
>> +  /* Record the luid of the insn which uses register n, the insn should
>> +     be the first one using register n in that block of the insn which
>> +     last_set_table_tick was set for.  */
>>
>>
>>>> -static void update_table_tick (rtx);
>>>> +static void update_table_tick (rtx, int);
>>>
>>> Please remove this declaration instead, the function is not used until
>>> after its actual definition :-)
>>>
>>
>> Done.
>>
>>>> @@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
>>>>        for (r = regno; r < endregno; r++)
>>>>  	{
>>>>  	  reg_stat_type *rsp = &reg_stat[r];
>>>> -	  rsp->last_set_table_tick = label_tick;
>>>> +	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
>>>> +	    {
>>>> +	      /* Later references should not have lower ticks.  */
>>>> +	      gcc_assert (label_tick >= rsp->last_set_table_tick);
>>>
>>> This should be obvious, but checking it won't hurt, okay.
>>>
>>>> +	      /* Should pick up the lowest luid if the references
>>>> +		 are in the same block.  */
>>>> +	      if (label_tick == rsp->last_set_table_tick
>>>> +		  && rsp->last_set_table_luid > insn_luid)
>>>> +		rsp->last_set_table_luid = insn_luid;
>>>
>>> Why?  Is it conservative for the check you will do later?  Please spell
>>> this out, it is crucial!
>>>
>>
>> Since later the combinations involving this insn probably make the
>> register be used in one insn sitting ahead (which has smaller luid than
>> the one which was recorded before).  Yes, it's very conservative, this
>> ensure that we always use the luid of the insn which is the first insn
>> using this register in the block.  The last_set invalidation is going
>> to catch the case like:
>>
>>    ... regX  // avoid the set used here ...
>>    regX = ...
>>    ...
>>
>> Once we have the smallest luid one of all insns which use register X,
>> any unsafe regX sets should be caught.
>>
>> I updated the comments to:
>>
>> +              /* Since combination may generate some instructions
>> +                 to replace some foregoing instructions with the
>> +                 references to register r (using register r), we
>> +                 need to make sure we record the first instruction
>> +                 which is using register r, so always update with
>> +                 the lowest luid here.  If the given set happens
>> +                 before this recorded earliest reference, the set
>> +                 value should be safe to be used.  */
>>
>>>> @@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
>>>>  
>>>>    /* Mark registers that are being referenced in this value.  */
>>>>    if (value)
>>>> -    update_table_tick (value);
>>>> +    {
>>>> +      gcc_assert (insn);
>>>> +      update_table_tick (value, DF_INSN_LUID (insn));
>>>> +    }
>>>
>>> Don't add that assert please.  If you really want one it should come
>>> right at the start of the function, not 60 lines later :-)
>>>
>>
>> Exactly, fixed.
>>
>>> Looks good if I understood this correctly :-)
>>>
>>>
>>
>> Thanks again, I also updated the comments in func record_value_for_reg,
>> the new version is attached.
>>
>> BR,
>> Kewen
>> -----
>> gcc/ChangeLog:
>>
>> 	* combine.c (struct reg_stat_type): New member
>> 	last_set_table_luid.
>> 	(update_table_tick): Add one argument for insn luid and
>> 	set last_set_table_luid with it, remove its declaration.
>> 	(record_value_for_reg): Adjust the condition to set
>> 	last_set_invalid nonzero.
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^3 [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-07-15  2:00       ` PING^2 " Kewen.Lin
@ 2021-09-08  7:03         ` Kewen.Lin
  2021-10-13  2:27           ` PING^4 " Kewen.Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-09-08  7:03 UTC (permalink / raw)
  To: GCC Patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html

BR,
Kewen

on 2021/7/15 上午10:00, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> Gentle ping this:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html
> 
> BR,
> Kewen
> 
> on 2021/6/28 下午3:00, Kewen.Lin via Gcc-patches wrote:
>> Hi!
>>
>> I'd like to gentle ping this:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html
>>
>>
>> BR,
>> Kewen
>>
>> on 2021/6/11 下午9:16, Kewen.Lin via Gcc-patches wrote:
>>> Hi Segher,
>>>
>>> Thanks for the review!
>>>
>>> on 2021/6/10 上午4:17, Segher Boessenkool wrote:
>>>> Hi!
>>>>
>>>> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>>>>> Currently we have the check:
>>>>>
>>>>>       if (!insn
>>>>> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
>>>>> 	rsp->last_set_invalid = 1; 
>>>>>
>>>>> which means if we want to record some value for some reg and
>>>>> this reg got refered before in a valid scope,
>>>>
>>>> If we already know it is *set* in this same extended basic block.
>>>> Possibly by the same instruction btw.
>>>>
>>>>> we invalidate the
>>>>> set of reg (last_set_invalid to 1).  It avoids to find the wrong
>>>>> set for one reg reference, such as the case like:
>>>>>
>>>>>    ... op regX  // this regX could find wrong last_set below
>>>>>    regX = ...   // if we think this set is valid
>>>>>    ... op regX
>>>>
>>>> Yup, exactly.
>>>>
>>>>> But because of retry's existence, the last_set_table_tick could
>>>>> be set by some later reference insns, but we see it's set due
>>>>> to retry on the set (for that reg) insn again, such as:
>>>>>
>>>>>    insn 1
>>>>>    insn 2
>>>>>
>>>>>    regX = ...     --> (a)
>>>>>    ... op regX    --> (b)
>>>>>    
>>>>>    insn 3
>>>>>
>>>>>    // assume all in the same BB.
>>>>>
>>>>> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
>>>>> (3 insns -> 2 insns),
>>>>
>>>> This will delete insn 1 and write the combined result to insns 2 and 3.
>>>>
>>>>> retrying from insn1 or insn2 again:
>>>>
>>>> Always 2, but your point remains valid.
>>>>
>>>>> it will scan insn (a) again, the below condition holds for regX:
>>>>>
>>>>>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>>
>>>>> it will mark this set as invalid set.  But actually the
>>>>> last_set_table_tick here is set by insn (b) before retrying, so it
>>>>> should be safe to be taken as valid set.
>>>>
>>>> Yup.
>>>>
>>>>> This proposal is to check whether the last_set_table safely happens
>>>>> after the current set, make the set still valid if so.
>>>>
>>>>> Full SPEC2017 building shows this patch gets more sucessful combines
>>>>> from 1902208 to 1902243 (trivial though).
>>>>
>>>> Do you have some example, or maybe even a testcase?  :-)
>>>>
>>>
>>> Sorry for the late reply, it took some time to get one reduced case.
>>>
>>> typedef struct SA *pa_t;
>>>
>>> struct SC {
>>>   int h;
>>>   pa_t elem[];
>>> };
>>>
>>> struct SD {
>>>   struct SC *e;
>>> };
>>>
>>> struct SA {
>>>   struct {
>>>     struct SD f[1];
>>>   } g;
>>> };
>>>
>>> void foo(pa_t *k, char **m) {
>>>   int l, i;
>>>   pa_t a;
>>>   l = (int)a->g.f[5].e;
>>>   i = 0;
>>>   for (; i < l; i++) {
>>>     k[i] = a->g.f[5].e->elem[i];
>>>     m[i] = "";
>>>   }
>>> }
>>>
>>> Baseline is r12-0 and the option is "-O3 -mcpu=power9 -fno-strict-aliasing",
>>> with this patch, the generated assembly can save two rlwinm s.
>>>
>>>>> +  /* Record the luid of the insn whose expression involving register n.  */
>>>>> +
>>>>> +  int				last_set_table_luid;
>>>>
>>>> "Record the luid of the insn for which last_set_table_tick was set",
>>>> right?
>>>>
>>>
>>> But it can be updated later to one smaller luid, how about the wording like:
>>>
>>>
>>> +  /* Record the luid of the insn which uses register n, the insn should
>>> +     be the first one using register n in that block of the insn which
>>> +     last_set_table_tick was set for.  */
>>>
>>>
>>>>> -static void update_table_tick (rtx);
>>>>> +static void update_table_tick (rtx, int);
>>>>
>>>> Please remove this declaration instead, the function is not used until
>>>> after its actual definition :-)
>>>>
>>>
>>> Done.
>>>
>>>>> @@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
>>>>>        for (r = regno; r < endregno; r++)
>>>>>  	{
>>>>>  	  reg_stat_type *rsp = &reg_stat[r];
>>>>> -	  rsp->last_set_table_tick = label_tick;
>>>>> +	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>> +	    {
>>>>> +	      /* Later references should not have lower ticks.  */
>>>>> +	      gcc_assert (label_tick >= rsp->last_set_table_tick);
>>>>
>>>> This should be obvious, but checking it won't hurt, okay.
>>>>
>>>>> +	      /* Should pick up the lowest luid if the references
>>>>> +		 are in the same block.  */
>>>>> +	      if (label_tick == rsp->last_set_table_tick
>>>>> +		  && rsp->last_set_table_luid > insn_luid)
>>>>> +		rsp->last_set_table_luid = insn_luid;
>>>>
>>>> Why?  Is it conservative for the check you will do later?  Please spell
>>>> this out, it is crucial!
>>>>
>>>
>>> Since later the combinations involving this insn probably make the
>>> register be used in one insn sitting ahead (which has smaller luid than
>>> the one which was recorded before).  Yes, it's very conservative, this
>>> ensure that we always use the luid of the insn which is the first insn
>>> using this register in the block.  The last_set invalidation is going
>>> to catch the case like:
>>>
>>>    ... regX  // avoid the set used here ...
>>>    regX = ...
>>>    ...
>>>
>>> Once we have the smallest luid one of all insns which use register X,
>>> any unsafe regX sets should be caught.
>>>
>>> I updated the comments to:
>>>
>>> +              /* Since combination may generate some instructions
>>> +                 to replace some foregoing instructions with the
>>> +                 references to register r (using register r), we
>>> +                 need to make sure we record the first instruction
>>> +                 which is using register r, so always update with
>>> +                 the lowest luid here.  If the given set happens
>>> +                 before this recorded earliest reference, the set
>>> +                 value should be safe to be used.  */
>>>
>>>>> @@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
>>>>>  
>>>>>    /* Mark registers that are being referenced in this value.  */
>>>>>    if (value)
>>>>> -    update_table_tick (value);
>>>>> +    {
>>>>> +      gcc_assert (insn);
>>>>> +      update_table_tick (value, DF_INSN_LUID (insn));
>>>>> +    }
>>>>
>>>> Don't add that assert please.  If you really want one it should come
>>>> right at the start of the function, not 60 lines later :-)
>>>>
>>>
>>> Exactly, fixed.
>>>
>>>> Looks good if I understood this correctly :-)
>>>>
>>>>
>>>
>>> Thanks again, I also updated the comments in func record_value_for_reg,
>>> the new version is attached.
>>>
>>> BR,
>>> Kewen
>>> -----
>>> gcc/ChangeLog:
>>>
>>> 	* combine.c (struct reg_stat_type): New member
>>> 	last_set_table_luid.
>>> 	(update_table_tick): Add one argument for insn luid and
>>> 	set last_set_table_luid with it, remove its declaration.
>>> 	(record_value_for_reg): Adjust the condition to set
>>> 	last_set_invalid nonzero.
>>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^4 [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-09-08  7:03         ` PING^3 " Kewen.Lin
@ 2021-10-13  2:27           ` Kewen.Lin
  2021-10-20  9:28             ` PING^5 " Kewen.Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-10-13  2:27 UTC (permalink / raw)
  To: GCC Patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html

BR,
Kewen

>>> on 2021/6/11 下午9:16, Kewen.Lin via Gcc-patches wrote:
>>>> Hi Segher,
>>>>
>>>> Thanks for the review!
>>>>
>>>> on 2021/6/10 上午4:17, Segher Boessenkool wrote:
>>>>> Hi!
>>>>>
>>>>> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>>>>>> Currently we have the check:
>>>>>>
>>>>>>       if (!insn
>>>>>> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
>>>>>> 	rsp->last_set_invalid = 1; 
>>>>>>
>>>>>> which means if we want to record some value for some reg and
>>>>>> this reg got refered before in a valid scope,
>>>>>
>>>>> If we already know it is *set* in this same extended basic block.
>>>>> Possibly by the same instruction btw.
>>>>>
>>>>>> we invalidate the
>>>>>> set of reg (last_set_invalid to 1).  It avoids to find the wrong
>>>>>> set for one reg reference, such as the case like:
>>>>>>
>>>>>>    ... op regX  // this regX could find wrong last_set below
>>>>>>    regX = ...   // if we think this set is valid
>>>>>>    ... op regX
>>>>>
>>>>> Yup, exactly.
>>>>>
>>>>>> But because of retry's existence, the last_set_table_tick could
>>>>>> be set by some later reference insns, but we see it's set due
>>>>>> to retry on the set (for that reg) insn again, such as:
>>>>>>
>>>>>>    insn 1
>>>>>>    insn 2
>>>>>>
>>>>>>    regX = ...     --> (a)
>>>>>>    ... op regX    --> (b)
>>>>>>    
>>>>>>    insn 3
>>>>>>
>>>>>>    // assume all in the same BB.
>>>>>>
>>>>>> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
>>>>>> (3 insns -> 2 insns),
>>>>>
>>>>> This will delete insn 1 and write the combined result to insns 2 and 3.
>>>>>
>>>>>> retrying from insn1 or insn2 again:
>>>>>
>>>>> Always 2, but your point remains valid.
>>>>>
>>>>>> it will scan insn (a) again, the below condition holds for regX:
>>>>>>
>>>>>>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>>>
>>>>>> it will mark this set as invalid set.  But actually the
>>>>>> last_set_table_tick here is set by insn (b) before retrying, so it
>>>>>> should be safe to be taken as valid set.
>>>>>
>>>>> Yup.
>>>>>
>>>>>> This proposal is to check whether the last_set_table safely happens
>>>>>> after the current set, make the set still valid if so.
>>>>>
>>>>>> Full SPEC2017 building shows this patch gets more sucessful combines
>>>>>> from 1902208 to 1902243 (trivial though).
>>>>>
>>>>> Do you have some example, or maybe even a testcase?  :-)
>>>>>
>>>>
>>>> Sorry for the late reply, it took some time to get one reduced case.
>>>>
>>>> typedef struct SA *pa_t;
>>>>
>>>> struct SC {
>>>>   int h;
>>>>   pa_t elem[];
>>>> };
>>>>
>>>> struct SD {
>>>>   struct SC *e;
>>>> };
>>>>
>>>> struct SA {
>>>>   struct {
>>>>     struct SD f[1];
>>>>   } g;
>>>> };
>>>>
>>>> void foo(pa_t *k, char **m) {
>>>>   int l, i;
>>>>   pa_t a;
>>>>   l = (int)a->g.f[5].e;
>>>>   i = 0;
>>>>   for (; i < l; i++) {
>>>>     k[i] = a->g.f[5].e->elem[i];
>>>>     m[i] = "";
>>>>   }
>>>> }
>>>>
>>>> Baseline is r12-0 and the option is "-O3 -mcpu=power9 -fno-strict-aliasing",
>>>> with this patch, the generated assembly can save two rlwinm s.
>>>>
>>>>>> +  /* Record the luid of the insn whose expression involving register n.  */
>>>>>> +
>>>>>> +  int				last_set_table_luid;
>>>>>
>>>>> "Record the luid of the insn for which last_set_table_tick was set",
>>>>> right?
>>>>>
>>>>
>>>> But it can be updated later to one smaller luid, how about the wording like:
>>>>
>>>>
>>>> +  /* Record the luid of the insn which uses register n, the insn should
>>>> +     be the first one using register n in that block of the insn which
>>>> +     last_set_table_tick was set for.  */
>>>>
>>>>
>>>>>> -static void update_table_tick (rtx);
>>>>>> +static void update_table_tick (rtx, int);
>>>>>
>>>>> Please remove this declaration instead, the function is not used until
>>>>> after its actual definition :-)
>>>>>
>>>>
>>>> Done.
>>>>
>>>>>> @@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
>>>>>>        for (r = regno; r < endregno; r++)
>>>>>>  	{
>>>>>>  	  reg_stat_type *rsp = &reg_stat[r];
>>>>>> -	  rsp->last_set_table_tick = label_tick;
>>>>>> +	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>>> +	    {
>>>>>> +	      /* Later references should not have lower ticks.  */
>>>>>> +	      gcc_assert (label_tick >= rsp->last_set_table_tick);
>>>>>
>>>>> This should be obvious, but checking it won't hurt, okay.
>>>>>
>>>>>> +	      /* Should pick up the lowest luid if the references
>>>>>> +		 are in the same block.  */
>>>>>> +	      if (label_tick == rsp->last_set_table_tick
>>>>>> +		  && rsp->last_set_table_luid > insn_luid)
>>>>>> +		rsp->last_set_table_luid = insn_luid;
>>>>>
>>>>> Why?  Is it conservative for the check you will do later?  Please spell
>>>>> this out, it is crucial!
>>>>>
>>>>
>>>> Since later the combinations involving this insn probably make the
>>>> register be used in one insn sitting ahead (which has smaller luid than
>>>> the one which was recorded before).  Yes, it's very conservative, this
>>>> ensure that we always use the luid of the insn which is the first insn
>>>> using this register in the block.  The last_set invalidation is going
>>>> to catch the case like:
>>>>
>>>>    ... regX  // avoid the set used here ...
>>>>    regX = ...
>>>>    ...
>>>>
>>>> Once we have the smallest luid one of all insns which use register X,
>>>> any unsafe regX sets should be caught.
>>>>
>>>> I updated the comments to:
>>>>
>>>> +              /* Since combination may generate some instructions
>>>> +                 to replace some foregoing instructions with the
>>>> +                 references to register r (using register r), we
>>>> +                 need to make sure we record the first instruction
>>>> +                 which is using register r, so always update with
>>>> +                 the lowest luid here.  If the given set happens
>>>> +                 before this recorded earliest reference, the set
>>>> +                 value should be safe to be used.  */
>>>>
>>>>>> @@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
>>>>>>  
>>>>>>    /* Mark registers that are being referenced in this value.  */
>>>>>>    if (value)
>>>>>> -    update_table_tick (value);
>>>>>> +    {
>>>>>> +      gcc_assert (insn);
>>>>>> +      update_table_tick (value, DF_INSN_LUID (insn));
>>>>>> +    }
>>>>>
>>>>> Don't add that assert please.  If you really want one it should come
>>>>> right at the start of the function, not 60 lines later :-)
>>>>>
>>>>
>>>> Exactly, fixed.
>>>>
>>>>> Looks good if I understood this correctly :-)
>>>>>
>>>>>
>>>>
>>>> Thanks again, I also updated the comments in func record_value_for_reg,
>>>> the new version is attached.
>>>>
>>>> BR,
>>>> Kewen
>>>> -----
>>>> gcc/ChangeLog:
>>>>
>>>> 	* combine.c (struct reg_stat_type): New member
>>>> 	last_set_table_luid.
>>>> 	(update_table_tick): Add one argument for insn luid and
>>>> 	set last_set_table_luid with it, remove its declaration.
>>>> 	(record_value_for_reg): Adjust the condition to set
>>>> 	last_set_invalid nonzero.
>>>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^5 [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-10-13  2:27           ` PING^4 " Kewen.Lin
@ 2021-10-20  9:28             ` Kewen.Lin
  2021-11-04 10:56               ` PING^6 " Kewen.Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-10-20  9:28 UTC (permalink / raw)
  To: GCC Patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html

BR,
Kewen

> 
>>>> on 2021/6/11 下午9:16, Kewen.Lin via Gcc-patches wrote:
>>>>> Hi Segher,
>>>>>
>>>>> Thanks for the review!
>>>>>
>>>>> on 2021/6/10 上午4:17, Segher Boessenkool wrote:
>>>>>> Hi!
>>>>>>
>>>>>> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>>>>>>> Currently we have the check:
>>>>>>>
>>>>>>>       if (!insn
>>>>>>> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
>>>>>>> 	rsp->last_set_invalid = 1; 
>>>>>>>
>>>>>>> which means if we want to record some value for some reg and
>>>>>>> this reg got refered before in a valid scope,
>>>>>>
>>>>>> If we already know it is *set* in this same extended basic block.
>>>>>> Possibly by the same instruction btw.
>>>>>>
>>>>>>> we invalidate the
>>>>>>> set of reg (last_set_invalid to 1).  It avoids to find the wrong
>>>>>>> set for one reg reference, such as the case like:
>>>>>>>
>>>>>>>    ... op regX  // this regX could find wrong last_set below
>>>>>>>    regX = ...   // if we think this set is valid
>>>>>>>    ... op regX
>>>>>>
>>>>>> Yup, exactly.
>>>>>>
>>>>>>> But because of retry's existence, the last_set_table_tick could
>>>>>>> be set by some later reference insns, but we see it's set due
>>>>>>> to retry on the set (for that reg) insn again, such as:
>>>>>>>
>>>>>>>    insn 1
>>>>>>>    insn 2
>>>>>>>
>>>>>>>    regX = ...     --> (a)
>>>>>>>    ... op regX    --> (b)
>>>>>>>    
>>>>>>>    insn 3
>>>>>>>
>>>>>>>    // assume all in the same BB.
>>>>>>>
>>>>>>> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
>>>>>>> (3 insns -> 2 insns),
>>>>>>
>>>>>> This will delete insn 1 and write the combined result to insns 2 and 3.
>>>>>>
>>>>>>> retrying from insn1 or insn2 again:
>>>>>>
>>>>>> Always 2, but your point remains valid.
>>>>>>
>>>>>>> it will scan insn (a) again, the below condition holds for regX:
>>>>>>>
>>>>>>>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>>>>
>>>>>>> it will mark this set as invalid set.  But actually the
>>>>>>> last_set_table_tick here is set by insn (b) before retrying, so it
>>>>>>> should be safe to be taken as valid set.
>>>>>>
>>>>>> Yup.
>>>>>>
>>>>>>> This proposal is to check whether the last_set_table safely happens
>>>>>>> after the current set, make the set still valid if so.
>>>>>>
>>>>>>> Full SPEC2017 building shows this patch gets more sucessful combines
>>>>>>> from 1902208 to 1902243 (trivial though).
>>>>>>
>>>>>> Do you have some example, or maybe even a testcase?  :-)
>>>>>>
>>>>>
>>>>> Sorry for the late reply, it took some time to get one reduced case.
>>>>>
>>>>> typedef struct SA *pa_t;
>>>>>
>>>>> struct SC {
>>>>>   int h;
>>>>>   pa_t elem[];
>>>>> };
>>>>>
>>>>> struct SD {
>>>>>   struct SC *e;
>>>>> };
>>>>>
>>>>> struct SA {
>>>>>   struct {
>>>>>     struct SD f[1];
>>>>>   } g;
>>>>> };
>>>>>
>>>>> void foo(pa_t *k, char **m) {
>>>>>   int l, i;
>>>>>   pa_t a;
>>>>>   l = (int)a->g.f[5].e;
>>>>>   i = 0;
>>>>>   for (; i < l; i++) {
>>>>>     k[i] = a->g.f[5].e->elem[i];
>>>>>     m[i] = "";
>>>>>   }
>>>>> }
>>>>>
>>>>> Baseline is r12-0 and the option is "-O3 -mcpu=power9 -fno-strict-aliasing",
>>>>> with this patch, the generated assembly can save two rlwinm s.
>>>>>
>>>>>>> +  /* Record the luid of the insn whose expression involving register n.  */
>>>>>>> +
>>>>>>> +  int				last_set_table_luid;
>>>>>>
>>>>>> "Record the luid of the insn for which last_set_table_tick was set",
>>>>>> right?
>>>>>>
>>>>>
>>>>> But it can be updated later to one smaller luid, how about the wording like:
>>>>>
>>>>>
>>>>> +  /* Record the luid of the insn which uses register n, the insn should
>>>>> +     be the first one using register n in that block of the insn which
>>>>> +     last_set_table_tick was set for.  */
>>>>>
>>>>>
>>>>>>> -static void update_table_tick (rtx);
>>>>>>> +static void update_table_tick (rtx, int);
>>>>>>
>>>>>> Please remove this declaration instead, the function is not used until
>>>>>> after its actual definition :-)
>>>>>>
>>>>>
>>>>> Done.
>>>>>
>>>>>>> @@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
>>>>>>>        for (r = regno; r < endregno; r++)
>>>>>>>  	{
>>>>>>>  	  reg_stat_type *rsp = &reg_stat[r];
>>>>>>> -	  rsp->last_set_table_tick = label_tick;
>>>>>>> +	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>>>> +	    {
>>>>>>> +	      /* Later references should not have lower ticks.  */
>>>>>>> +	      gcc_assert (label_tick >= rsp->last_set_table_tick);
>>>>>>
>>>>>> This should be obvious, but checking it won't hurt, okay.
>>>>>>
>>>>>>> +	      /* Should pick up the lowest luid if the references
>>>>>>> +		 are in the same block.  */
>>>>>>> +	      if (label_tick == rsp->last_set_table_tick
>>>>>>> +		  && rsp->last_set_table_luid > insn_luid)
>>>>>>> +		rsp->last_set_table_luid = insn_luid;
>>>>>>
>>>>>> Why?  Is it conservative for the check you will do later?  Please spell
>>>>>> this out, it is crucial!
>>>>>>
>>>>>
>>>>> Since later the combinations involving this insn probably make the
>>>>> register be used in one insn sitting ahead (which has smaller luid than
>>>>> the one which was recorded before).  Yes, it's very conservative, this
>>>>> ensure that we always use the luid of the insn which is the first insn
>>>>> using this register in the block.  The last_set invalidation is going
>>>>> to catch the case like:
>>>>>
>>>>>    ... regX  // avoid the set used here ...
>>>>>    regX = ...
>>>>>    ...
>>>>>
>>>>> Once we have the smallest luid one of all insns which use register X,
>>>>> any unsafe regX sets should be caught.
>>>>>
>>>>> I updated the comments to:
>>>>>
>>>>> +              /* Since combination may generate some instructions
>>>>> +                 to replace some foregoing instructions with the
>>>>> +                 references to register r (using register r), we
>>>>> +                 need to make sure we record the first instruction
>>>>> +                 which is using register r, so always update with
>>>>> +                 the lowest luid here.  If the given set happens
>>>>> +                 before this recorded earliest reference, the set
>>>>> +                 value should be safe to be used.  */
>>>>>
>>>>>>> @@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
>>>>>>>  
>>>>>>>    /* Mark registers that are being referenced in this value.  */
>>>>>>>    if (value)
>>>>>>> -    update_table_tick (value);
>>>>>>> +    {
>>>>>>> +      gcc_assert (insn);
>>>>>>> +      update_table_tick (value, DF_INSN_LUID (insn));
>>>>>>> +    }
>>>>>>
>>>>>> Don't add that assert please.  If you really want one it should come
>>>>>> right at the start of the function, not 60 lines later :-)
>>>>>>
>>>>>
>>>>> Exactly, fixed.
>>>>>
>>>>>> Looks good if I understood this correctly :-)
>>>>>>
>>>>>>
>>>>>
>>>>> Thanks again, I also updated the comments in func record_value_for_reg,
>>>>> the new version is attached.
>>>>>
>>>>> BR,
>>>>> Kewen
>>>>> -----
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 	* combine.c (struct reg_stat_type): New member
>>>>> 	last_set_table_luid.
>>>>> 	(update_table_tick): Add one argument for insn luid and
>>>>> 	set last_set_table_luid with it, remove its declaration.
>>>>> 	(record_value_for_reg): Adjust the condition to set
>>>>> 	last_set_invalid nonzero.
>>>>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^6 [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-10-20  9:28             ` PING^5 " Kewen.Lin
@ 2021-11-04 10:56               ` Kewen.Lin
  2021-11-22  2:22                 ` PING^7 " Kewen.Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Kewen.Lin @ 2021-11-04 10:56 UTC (permalink / raw)
  To: GCC Patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html

BR,
Kewen

>>>>> on 2021/6/11 下午9:16, Kewen.Lin via Gcc-patches wrote:
>>>>>> Hi Segher,
>>>>>>
>>>>>> Thanks for the review!
>>>>>>
>>>>>> on 2021/6/10 上午4:17, Segher Boessenkool wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>>>>>>>> Currently we have the check:
>>>>>>>>
>>>>>>>>       if (!insn
>>>>>>>> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
>>>>>>>> 	rsp->last_set_invalid = 1; 
>>>>>>>>
>>>>>>>> which means if we want to record some value for some reg and
>>>>>>>> this reg got refered before in a valid scope,
>>>>>>>
>>>>>>> If we already know it is *set* in this same extended basic block.
>>>>>>> Possibly by the same instruction btw.
>>>>>>>
>>>>>>>> we invalidate the
>>>>>>>> set of reg (last_set_invalid to 1).  It avoids to find the wrong
>>>>>>>> set for one reg reference, such as the case like:
>>>>>>>>
>>>>>>>>    ... op regX  // this regX could find wrong last_set below
>>>>>>>>    regX = ...   // if we think this set is valid
>>>>>>>>    ... op regX
>>>>>>>
>>>>>>> Yup, exactly.
>>>>>>>
>>>>>>>> But because of retry's existence, the last_set_table_tick could
>>>>>>>> be set by some later reference insns, but we see it's set due
>>>>>>>> to retry on the set (for that reg) insn again, such as:
>>>>>>>>
>>>>>>>>    insn 1
>>>>>>>>    insn 2
>>>>>>>>
>>>>>>>>    regX = ...     --> (a)
>>>>>>>>    ... op regX    --> (b)
>>>>>>>>    
>>>>>>>>    insn 3
>>>>>>>>
>>>>>>>>    // assume all in the same BB.
>>>>>>>>
>>>>>>>> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
>>>>>>>> (3 insns -> 2 insns),
>>>>>>>
>>>>>>> This will delete insn 1 and write the combined result to insns 2 and 3.
>>>>>>>
>>>>>>>> retrying from insn1 or insn2 again:
>>>>>>>
>>>>>>> Always 2, but your point remains valid.
>>>>>>>
>>>>>>>> it will scan insn (a) again, the below condition holds for regX:
>>>>>>>>
>>>>>>>>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>>>>>
>>>>>>>> it will mark this set as invalid set.  But actually the
>>>>>>>> last_set_table_tick here is set by insn (b) before retrying, so it
>>>>>>>> should be safe to be taken as valid set.
>>>>>>>
>>>>>>> Yup.
>>>>>>>
>>>>>>>> This proposal is to check whether the last_set_table safely happens
>>>>>>>> after the current set, make the set still valid if so.
>>>>>>>
>>>>>>>> Full SPEC2017 building shows this patch gets more sucessful combines
>>>>>>>> from 1902208 to 1902243 (trivial though).
>>>>>>>
>>>>>>> Do you have some example, or maybe even a testcase?  :-)
>>>>>>>
>>>>>>
>>>>>> Sorry for the late reply, it took some time to get one reduced case.
>>>>>>
>>>>>> typedef struct SA *pa_t;
>>>>>>
>>>>>> struct SC {
>>>>>>   int h;
>>>>>>   pa_t elem[];
>>>>>> };
>>>>>>
>>>>>> struct SD {
>>>>>>   struct SC *e;
>>>>>> };
>>>>>>
>>>>>> struct SA {
>>>>>>   struct {
>>>>>>     struct SD f[1];
>>>>>>   } g;
>>>>>> };
>>>>>>
>>>>>> void foo(pa_t *k, char **m) {
>>>>>>   int l, i;
>>>>>>   pa_t a;
>>>>>>   l = (int)a->g.f[5].e;
>>>>>>   i = 0;
>>>>>>   for (; i < l; i++) {
>>>>>>     k[i] = a->g.f[5].e->elem[i];
>>>>>>     m[i] = "";
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> Baseline is r12-0 and the option is "-O3 -mcpu=power9 -fno-strict-aliasing",
>>>>>> with this patch, the generated assembly can save two rlwinm s.
>>>>>>
>>>>>>>> +  /* Record the luid of the insn whose expression involving register n.  */
>>>>>>>> +
>>>>>>>> +  int				last_set_table_luid;
>>>>>>>
>>>>>>> "Record the luid of the insn for which last_set_table_tick was set",
>>>>>>> right?
>>>>>>>
>>>>>>
>>>>>> But it can be updated later to one smaller luid, how about the wording like:
>>>>>>
>>>>>>
>>>>>> +  /* Record the luid of the insn which uses register n, the insn should
>>>>>> +     be the first one using register n in that block of the insn which
>>>>>> +     last_set_table_tick was set for.  */
>>>>>>
>>>>>>
>>>>>>>> -static void update_table_tick (rtx);
>>>>>>>> +static void update_table_tick (rtx, int);
>>>>>>>
>>>>>>> Please remove this declaration instead, the function is not used until
>>>>>>> after its actual definition :-)
>>>>>>>
>>>>>>
>>>>>> Done.
>>>>>>
>>>>>>>> @@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
>>>>>>>>        for (r = regno; r < endregno; r++)
>>>>>>>>  	{
>>>>>>>>  	  reg_stat_type *rsp = &reg_stat[r];
>>>>>>>> -	  rsp->last_set_table_tick = label_tick;
>>>>>>>> +	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>>>>> +	    {
>>>>>>>> +	      /* Later references should not have lower ticks.  */
>>>>>>>> +	      gcc_assert (label_tick >= rsp->last_set_table_tick);
>>>>>>>
>>>>>>> This should be obvious, but checking it won't hurt, okay.
>>>>>>>
>>>>>>>> +	      /* Should pick up the lowest luid if the references
>>>>>>>> +		 are in the same block.  */
>>>>>>>> +	      if (label_tick == rsp->last_set_table_tick
>>>>>>>> +		  && rsp->last_set_table_luid > insn_luid)
>>>>>>>> +		rsp->last_set_table_luid = insn_luid;
>>>>>>>
>>>>>>> Why?  Is it conservative for the check you will do later?  Please spell
>>>>>>> this out, it is crucial!
>>>>>>>
>>>>>>
>>>>>> Since later the combinations involving this insn probably make the
>>>>>> register be used in one insn sitting ahead (which has smaller luid than
>>>>>> the one which was recorded before).  Yes, it's very conservative, this
>>>>>> ensure that we always use the luid of the insn which is the first insn
>>>>>> using this register in the block.  The last_set invalidation is going
>>>>>> to catch the case like:
>>>>>>
>>>>>>    ... regX  // avoid the set used here ...
>>>>>>    regX = ...
>>>>>>    ...
>>>>>>
>>>>>> Once we have the smallest luid one of all insns which use register X,
>>>>>> any unsafe regX sets should be caught.
>>>>>>
>>>>>> I updated the comments to:
>>>>>>
>>>>>> +              /* Since combination may generate some instructions
>>>>>> +                 to replace some foregoing instructions with the
>>>>>> +                 references to register r (using register r), we
>>>>>> +                 need to make sure we record the first instruction
>>>>>> +                 which is using register r, so always update with
>>>>>> +                 the lowest luid here.  If the given set happens
>>>>>> +                 before this recorded earliest reference, the set
>>>>>> +                 value should be safe to be used.  */
>>>>>>
>>>>>>>> @@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
>>>>>>>>  
>>>>>>>>    /* Mark registers that are being referenced in this value.  */
>>>>>>>>    if (value)
>>>>>>>> -    update_table_tick (value);
>>>>>>>> +    {
>>>>>>>> +      gcc_assert (insn);
>>>>>>>> +      update_table_tick (value, DF_INSN_LUID (insn));
>>>>>>>> +    }
>>>>>>>
>>>>>>> Don't add that assert please.  If you really want one it should come
>>>>>>> right at the start of the function, not 60 lines later :-)
>>>>>>>
>>>>>>
>>>>>> Exactly, fixed.
>>>>>>
>>>>>>> Looks good if I understood this correctly :-)
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Thanks again, I also updated the comments in func record_value_for_reg,
>>>>>> the new version is attached.
>>>>>>
>>>>>> BR,
>>>>>> Kewen
>>>>>> -----
>>>>>> gcc/ChangeLog:
>>>>>>
>>>>>> 	* combine.c (struct reg_stat_type): New member
>>>>>> 	last_set_table_luid.
>>>>>> 	(update_table_tick): Add one argument for insn luid and
>>>>>> 	set last_set_table_luid with it, remove its declaration.
>>>>>> 	(record_value_for_reg): Adjust the condition to set
>>>>>> 	last_set_invalid nonzero.
>>>>>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* PING^7 [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-11-04 10:56               ` PING^6 " Kewen.Lin
@ 2021-11-22  2:22                 ` Kewen.Lin
  0 siblings, 0 replies; 20+ messages in thread
From: Kewen.Lin @ 2021-11-22  2:22 UTC (permalink / raw)
  To: GCC Patches; +Cc: Bill Schmidt, Segher Boessenkool

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572555.html

BR,
Kewen

>>>>>> on 2021/6/11 下午9:16, Kewen.Lin via Gcc-patches wrote:
>>>>>>> Hi Segher,
>>>>>>>
>>>>>>> Thanks for the review!
>>>>>>>
>>>>>>> on 2021/6/10 上午4:17, Segher Boessenkool wrote:
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote:
>>>>>>>>> Currently we have the check:
>>>>>>>>>
>>>>>>>>>       if (!insn
>>>>>>>>> 	  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
>>>>>>>>> 	rsp->last_set_invalid = 1; 
>>>>>>>>>
>>>>>>>>> which means if we want to record some value for some reg and
>>>>>>>>> this reg got refered before in a valid scope,
>>>>>>>>
>>>>>>>> If we already know it is *set* in this same extended basic block.
>>>>>>>> Possibly by the same instruction btw.
>>>>>>>>
>>>>>>>>> we invalidate the
>>>>>>>>> set of reg (last_set_invalid to 1).  It avoids to find the wrong
>>>>>>>>> set for one reg reference, such as the case like:
>>>>>>>>>
>>>>>>>>>    ... op regX  // this regX could find wrong last_set below
>>>>>>>>>    regX = ...   // if we think this set is valid
>>>>>>>>>    ... op regX
>>>>>>>>
>>>>>>>> Yup, exactly.
>>>>>>>>
>>>>>>>>> But because of retry's existence, the last_set_table_tick could
>>>>>>>>> be set by some later reference insns, but we see it's set due
>>>>>>>>> to retry on the set (for that reg) insn again, such as:
>>>>>>>>>
>>>>>>>>>    insn 1
>>>>>>>>>    insn 2
>>>>>>>>>
>>>>>>>>>    regX = ...     --> (a)
>>>>>>>>>    ... op regX    --> (b)
>>>>>>>>>    
>>>>>>>>>    insn 3
>>>>>>>>>
>>>>>>>>>    // assume all in the same BB.
>>>>>>>>>
>>>>>>>>> Assuming we combine 1, 2 -> 3 sucessfully and replace them as two
>>>>>>>>> (3 insns -> 2 insns),
>>>>>>>>
>>>>>>>> This will delete insn 1 and write the combined result to insns 2 and 3.
>>>>>>>>
>>>>>>>>> retrying from insn1 or insn2 again:
>>>>>>>>
>>>>>>>> Always 2, but your point remains valid.
>>>>>>>>
>>>>>>>>> it will scan insn (a) again, the below condition holds for regX:
>>>>>>>>>
>>>>>>>>>   (value && rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>>>>>>
>>>>>>>>> it will mark this set as invalid set.  But actually the
>>>>>>>>> last_set_table_tick here is set by insn (b) before retrying, so it
>>>>>>>>> should be safe to be taken as valid set.
>>>>>>>>
>>>>>>>> Yup.
>>>>>>>>
>>>>>>>>> This proposal is to check whether the last_set_table safely happens
>>>>>>>>> after the current set, make the set still valid if so.
>>>>>>>>
>>>>>>>>> Full SPEC2017 building shows this patch gets more sucessful combines
>>>>>>>>> from 1902208 to 1902243 (trivial though).
>>>>>>>>
>>>>>>>> Do you have some example, or maybe even a testcase?  :-)
>>>>>>>>
>>>>>>>
>>>>>>> Sorry for the late reply, it took some time to get one reduced case.
>>>>>>>
>>>>>>> typedef struct SA *pa_t;
>>>>>>>
>>>>>>> struct SC {
>>>>>>>   int h;
>>>>>>>   pa_t elem[];
>>>>>>> };
>>>>>>>
>>>>>>> struct SD {
>>>>>>>   struct SC *e;
>>>>>>> };
>>>>>>>
>>>>>>> struct SA {
>>>>>>>   struct {
>>>>>>>     struct SD f[1];
>>>>>>>   } g;
>>>>>>> };
>>>>>>>
>>>>>>> void foo(pa_t *k, char **m) {
>>>>>>>   int l, i;
>>>>>>>   pa_t a;
>>>>>>>   l = (int)a->g.f[5].e;
>>>>>>>   i = 0;
>>>>>>>   for (; i < l; i++) {
>>>>>>>     k[i] = a->g.f[5].e->elem[i];
>>>>>>>     m[i] = "";
>>>>>>>   }
>>>>>>> }
>>>>>>>
>>>>>>> Baseline is r12-0 and the option is "-O3 -mcpu=power9 -fno-strict-aliasing",
>>>>>>> with this patch, the generated assembly can save two rlwinm s.
>>>>>>>
>>>>>>>>> +  /* Record the luid of the insn whose expression involving register n.  */
>>>>>>>>> +
>>>>>>>>> +  int				last_set_table_luid;
>>>>>>>>
>>>>>>>> "Record the luid of the insn for which last_set_table_tick was set",
>>>>>>>> right?
>>>>>>>>
>>>>>>>
>>>>>>> But it can be updated later to one smaller luid, how about the wording like:
>>>>>>>
>>>>>>>
>>>>>>> +  /* Record the luid of the insn which uses register n, the insn should
>>>>>>> +     be the first one using register n in that block of the insn which
>>>>>>> +     last_set_table_tick was set for.  */
>>>>>>>
>>>>>>>
>>>>>>>>> -static void update_table_tick (rtx);
>>>>>>>>> +static void update_table_tick (rtx, int);
>>>>>>>>
>>>>>>>> Please remove this declaration instead, the function is not used until
>>>>>>>> after its actual definition :-)
>>>>>>>>
>>>>>>>
>>>>>>> Done.
>>>>>>>
>>>>>>>>> @@ -13243,7 +13247,21 @@ update_table_tick (rtx x)
>>>>>>>>>        for (r = regno; r < endregno; r++)
>>>>>>>>>  	{
>>>>>>>>>  	  reg_stat_type *rsp = &reg_stat[r];
>>>>>>>>> -	  rsp->last_set_table_tick = label_tick;
>>>>>>>>> +	  if (rsp->last_set_table_tick >= label_tick_ebb_start)
>>>>>>>>> +	    {
>>>>>>>>> +	      /* Later references should not have lower ticks.  */
>>>>>>>>> +	      gcc_assert (label_tick >= rsp->last_set_table_tick);
>>>>>>>>
>>>>>>>> This should be obvious, but checking it won't hurt, okay.
>>>>>>>>
>>>>>>>>> +	      /* Should pick up the lowest luid if the references
>>>>>>>>> +		 are in the same block.  */
>>>>>>>>> +	      if (label_tick == rsp->last_set_table_tick
>>>>>>>>> +		  && rsp->last_set_table_luid > insn_luid)
>>>>>>>>> +		rsp->last_set_table_luid = insn_luid;
>>>>>>>>
>>>>>>>> Why?  Is it conservative for the check you will do later?  Please spell
>>>>>>>> this out, it is crucial!
>>>>>>>>
>>>>>>>
>>>>>>> Since later the combinations involving this insn probably make the
>>>>>>> register be used in one insn sitting ahead (which has smaller luid than
>>>>>>> the one which was recorded before).  Yes, it's very conservative, this
>>>>>>> ensure that we always use the luid of the insn which is the first insn
>>>>>>> using this register in the block.  The last_set invalidation is going
>>>>>>> to catch the case like:
>>>>>>>
>>>>>>>    ... regX  // avoid the set used here ...
>>>>>>>    regX = ...
>>>>>>>    ...
>>>>>>>
>>>>>>> Once we have the smallest luid one of all insns which use register X,
>>>>>>> any unsafe regX sets should be caught.
>>>>>>>
>>>>>>> I updated the comments to:
>>>>>>>
>>>>>>> +              /* Since combination may generate some instructions
>>>>>>> +                 to replace some foregoing instructions with the
>>>>>>> +                 references to register r (using register r), we
>>>>>>> +                 need to make sure we record the first instruction
>>>>>>> +                 which is using register r, so always update with
>>>>>>> +                 the lowest luid here.  If the given set happens
>>>>>>> +                 before this recorded earliest reference, the set
>>>>>>> +                 value should be safe to be used.  */
>>>>>>>
>>>>>>>>> @@ -13359,7 +13378,10 @@ record_value_for_reg (rtx reg, rtx_insn *insn, rtx value)
>>>>>>>>>  
>>>>>>>>>    /* Mark registers that are being referenced in this value.  */
>>>>>>>>>    if (value)
>>>>>>>>> -    update_table_tick (value);
>>>>>>>>> +    {
>>>>>>>>> +      gcc_assert (insn);
>>>>>>>>> +      update_table_tick (value, DF_INSN_LUID (insn));
>>>>>>>>> +    }
>>>>>>>>
>>>>>>>> Don't add that assert please.  If you really want one it should come
>>>>>>>> right at the start of the function, not 60 lines later :-)
>>>>>>>>
>>>>>>>
>>>>>>> Exactly, fixed.
>>>>>>>
>>>>>>>> Looks good if I understood this correctly :-)
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> Thanks again, I also updated the comments in func record_value_for_reg,
>>>>>>> the new version is attached.
>>>>>>>
>>>>>>> BR,
>>>>>>> Kewen
>>>>>>> -----
>>>>>>> gcc/ChangeLog:
>>>>>>>
>>>>>>> 	* combine.c (struct reg_stat_type): New member
>>>>>>> 	last_set_table_luid.
>>>>>>> 	(update_table_tick): Add one argument for insn luid and
>>>>>>> 	set last_set_table_luid with it, remove its declaration.
>>>>>>> 	(record_value_for_reg): Adjust the condition to set
>>>>>>> 	last_set_invalid nonzero.
>>>>>>>


BR,
Kewen

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-06-11 13:16   ` [PATCH v2] " Kewen.Lin
  2021-06-28  7:00     ` PING^1 " Kewen.Lin
@ 2021-11-29 22:28     ` Segher Boessenkool
  2021-11-30  8:47       ` Kewen.Lin
  1 sibling, 1 reply; 20+ messages in thread
From: Segher Boessenkool @ 2021-11-29 22:28 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, Bill Schmidt

Hi!

On Fri, Jun 11, 2021 at 09:16:21PM +0800, Kewen.Lin wrote:
> >> +	      /* Should pick up the lowest luid if the references
> >> +		 are in the same block.  */
> >> +	      if (label_tick == rsp->last_set_table_tick
> >> +		  && rsp->last_set_table_luid > insn_luid)
> >> +		rsp->last_set_table_luid = insn_luid;
> > 
> > Why?  Is it conservative for the check you will do later?  Please spell
> > this out, it is crucial!
> 
> Since later the combinations involving this insn probably make the
> register be used in one insn sitting ahead (which has smaller luid than
> the one which was recorded before).  Yes, it's very conservative, this
> ensure that we always use the luid of the insn which is the first insn
> using this register in the block.

Why would that be correct?!

> The last_set invalidation is going
> to catch the case like:
> 
>    ... regX  // avoid the set used here ...
>    regX = ...
>    ...
> 
> Once we have the smallest luid one of all insns which use register X,
> any unsafe regX sets should be caught.

Yes, you invalidate more, but because you put lies in the table :-(

> 	* combine.c (struct reg_stat_type): New member
> 	last_set_table_luid.

This fits on one line.

> 	(update_table_tick): Add one argument for insn luid and
> 	set last_set_table_luid with it, remove its declaration.
> 	(record_value_for_reg): Adjust the condition to set
> 	last_set_invalid nonzero.

These lines break earlier than they should as well.

> +  /* Record the luid of the insn which uses register n, the insn should
> +     be the first one using register n in that block of the insn which
> +     last_set_table_tick was set for.  */
> +
> +  int				last_set_table_luid;

I'm not sure what this variable is for.  The comment says something
else than the variable name does, and now I don't know what to
believe :-)

The name says it is for a SET, the explanation says it is for a USE.


Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] combine: Tweak the condition of last_set invalidation
  2021-11-29 22:28     ` Segher Boessenkool
@ 2021-11-30  8:47       ` Kewen.Lin
  0 siblings, 0 replies; 20+ messages in thread
From: Kewen.Lin @ 2021-11-30  8:47 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, Bill Schmidt

Hi Segher,

Thanks for the review!

on 2021/11/30 上午6:28, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Jun 11, 2021 at 09:16:21PM +0800, Kewen.Lin wrote:
>>>> +	      /* Should pick up the lowest luid if the references
>>>> +		 are in the same block.  */
>>>> +	      if (label_tick == rsp->last_set_table_tick
>>>> +		  && rsp->last_set_table_luid > insn_luid)
>>>> +		rsp->last_set_table_luid = insn_luid;
>>>
>>> Why?  Is it conservative for the check you will do later?  Please spell
>>> this out, it is crucial!
>>
>> Since later the combinations involving this insn probably make the
>> register be used in one insn sitting ahead (which has smaller luid than
>> the one which was recorded before).  Yes, it's very conservative, this
>> ensure that we always use the luid of the insn which is the first insn
>> using this register in the block.
> 
> Why would that be correct?!
> 

The later check has:

       if (!insn
-          || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
+          || (value && rsp->last_set_table_tick >= label_tick_ebb_start
+              && !(label_tick == rsp->last_set_table_tick
+                   && DF_INSN_LUID (insn) < rsp->last_set_table_luid)))
         rsp->last_set_invalid = 1;

For "label_tick != rsp->last_set_table_tick", it's the same as before.

For "label_tick == rsp->last_set_table_tick", we have the below:

+              if (label_tick == rsp->last_set_table_tick
+                  && rsp->last_set_table_luid > insn_luid)
+                rsp->last_set_table_luid = insn_luid;

It keeps checking and updating with the smallest LUID of the insns which
have the expression involving register n are placed in last_set_value.

The updating here aims to ensure we always the LUID of first INSN which
uses register n (or saying that having one expression involving register n
is placed in last_set_value).

For the first time we set last_set_table_tick for register n, we will also
set last_set_table_luid.  For below case, we record x for LUID.  Assuming
we combining 1,2,x to 2,x and regX is updated to be used in insn2.  Then
the first INSN using regX has become to insn 2.

  ... reg1 // insn 1
  ...
  ... reg2 // insn 2
  ...
  ... regX // insn x
  ...
  regX     // insn y
  ...

Later whether combining moves regX setting upward or not, the LUID which it
compares with is always the updated smallest one (insn 2 here), not the one
which is set at the beginning.  So I think it's conservative.

>> The last_set invalidation is going
>> to catch the case like:
>>
>>    ... regX  // avoid the set used here ...
>>    regX = ...
>>    ...
>>
>> Once we have the smallest luid one of all insns which use register X,
>> any unsafe regX sets should be caught.
> 
> Yes, you invalidate more, but because you put lies in the table :-(
> 

This patch tries to relax some restrictions, it seems there are no lies.  :)
Could you help to explain this comment more?

>> 	* combine.c (struct reg_stat_type): New member
>> 	last_set_table_luid.
> 
> This fits on one line.
> 
>> 	(update_table_tick): Add one argument for insn luid and
>> 	set last_set_table_luid with it, remove its declaration.
>> 	(record_value_for_reg): Adjust the condition to set
>> 	last_set_invalid nonzero.
> 
> These lines break earlier than they should as well.
> 
>> +  /* Record the luid of the insn which uses register n, the insn should
>> +     be the first one using register n in that block of the insn which
>> +     last_set_table_tick was set for.  */
>> +
>> +  int				last_set_table_luid;
> 
> I'm not sure what this variable is for.  The comment says something
> else than the variable name does, and now I don't know what to
> believe :-)
> 
> The name says it is for a SET, the explanation says it is for a USE.
> 

Good point.  :)  For the existing last_set_table_tick,
  /* Record the value of label_tick when an expression involving register n
     is placed in last_set_value.  */

  int				last_set_table_tick;

it seems it has the set in the name too, but "an expression involving
register n " is actually a reference (use) to register n?

How about the below one referring to "last_set_table_tick"?

/* Record the smallest luid of the insns whose expressions involving
   register n are placed in last_set_value, meanwhile the insns locate
   in the same block of the insn which last_set_table_tick was set for.  */


BR,
Kewen

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-11-30  8:47 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-16  8:49 [PATCH/RFC] combine: Tweak the condition of last_set invalidation Kewen.Lin
2021-01-14  2:29 ` PING^1 " Kewen.Lin
2021-01-15  0:22 ` Segher Boessenkool
2021-01-15  8:06   ` Kewen.Lin
2021-01-22  0:30     ` Segher Boessenkool
2021-01-22  2:21       ` Kewen.Lin
2021-05-07  2:45         ` Kewen.Lin
2021-05-26  3:04           ` PING^2 " Kewen.Lin
2021-06-09  2:32             ` PING^3 " Kewen.Lin
2021-06-09 20:17 ` Segher Boessenkool
2021-06-11 13:16   ` [PATCH v2] " Kewen.Lin
2021-06-28  7:00     ` PING^1 " Kewen.Lin
2021-07-15  2:00       ` PING^2 " Kewen.Lin
2021-09-08  7:03         ` PING^3 " Kewen.Lin
2021-10-13  2:27           ` PING^4 " Kewen.Lin
2021-10-20  9:28             ` PING^5 " Kewen.Lin
2021-11-04 10:56               ` PING^6 " Kewen.Lin
2021-11-22  2:22                 ` PING^7 " Kewen.Lin
2021-11-29 22:28     ` Segher Boessenkool
2021-11-30  8:47       ` Kewen.Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).