public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
@ 2020-03-27 22:41 Peter Bergner
  2020-03-27 23:26 ` Ian Lance Taylor
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Peter Bergner @ 2020-03-27 22:41 UTC (permalink / raw)
  To: GCC Patches; +Cc: ian, Segher Boessenkool, Richard Biener

The pr87507.c test case regressed due to Segher's commit that added
-fsplit-wide-types-early.  The issue is that the lower-subreg pass only
decomposes the TImode code in the example if there is a pseudo reg to pseudo
reg copy.  When the lower-subreg pass is called late (its old location),
then combine changes the generated code by adding a TImode pseudo reg to
pseudo reg copy and lower-subreg successfully decomposes it.

When we run lower-subreg before combine, that copy isn't there so we
do not decompose our TImode uses.  I'm not sure why we require a pseudo
to pseudo copy before we decompose things, but changing find_pseudo_copy
to allow pseudo and hard regs to be copied into a pseudo like below fixes
the issue.

Ian, do you remember why we don't just decompose all wide types and instead
require a pseudo to pseudo copy to exist?

This patch survived bootstrap and regtesting on powerpc64le-linux with
no regressions.

	* lower-subreg.c (find_pseudo_copy): Allow copies from hard registers
	too.

diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index 4c8bc835f93..c6816a34489 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -419,7 +419,7 @@ find_pseudo_copy (rtx set)
 
   rd = REGNO (dest);
   rs = REGNO (src);
-  if (HARD_REGISTER_NUM_P (rd) || HARD_REGISTER_NUM_P (rs))
+  if (HARD_REGISTER_NUM_P (rd))
     return false;
 
   b = reg_copy_graph[rs];

Given we're late in stage4, the above patch (assuming it's ok) probably
shouldn't go in until stage1, since it is changing lower-subreg's behaviour
slightly.

A different (ie, safer) approach would be to just rerun lower-subreg at
its old location, regardless of whether we used -fsplit-wide-types-early.
This way, we are not changing lower-subreg's behaviour, just running it an
extra time (3 times instead of twice when using -fsplit-wide-types-early).
I don't think lower-subreg is too expensive to run an extra time and we'd
only do it when using -fsplit-wide-types-early.  The only downside (if any)
is that we don't decompose these TImode uses early like the patch above does,
so combine, etc. can't see what they will eventually become.  This does fix
the bug and also survives bootstrap and regtesting on powerpc64le-linux
with no regressions.

	* lower-subreg.c (pass_lower_subreg3::gate): Remove test for
	flag_split_wide_types_early.

diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index 4c8bc835f93..807ad398b64 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -1844,8 +1844,7 @@ public:
   {}
 
   /* opt_pass methods: */
-  virtual bool gate (function *) { return flag_split_wide_types
-					  && !flag_split_wide_types_early; }
+  virtual bool gate (function *) { return flag_split_wide_types != 0; }
   virtual unsigned int execute (function *)
     {
       decompose_multiword_subregs (true);


Does anyone have any preferences on the patches above or comments?
If we go with the first patch for stage1, I'll add -fno-split-wide-types-early
to pr87507.c so it doesn't FAIL until the patch goes in.

Peter


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-27 22:41 [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail Peter Bergner
@ 2020-03-27 23:26 ` Ian Lance Taylor
  2020-03-28 19:22 ` Segher Boessenkool
  2020-03-30  8:50 ` Richard Sandiford
  2 siblings, 0 replies; 14+ messages in thread
From: Ian Lance Taylor @ 2020-03-27 23:26 UTC (permalink / raw)
  To: Peter Bergner; +Cc: GCC Patches, Segher Boessenkool, Richard Biener

Peter Bergner <bergner@linux.ibm.com> writes:

> Ian, do you remember why we don't just decompose all wide types and instead
> require a pseudo to pseudo copy to exist?

No, sorry, I don't remember this at all.

Ian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-27 22:41 [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail Peter Bergner
  2020-03-27 23:26 ` Ian Lance Taylor
@ 2020-03-28 19:22 ` Segher Boessenkool
  2020-03-28 23:39   ` Peter Bergner
  2020-03-30  8:50 ` Richard Sandiford
  2 siblings, 1 reply; 14+ messages in thread
From: Segher Boessenkool @ 2020-03-28 19:22 UTC (permalink / raw)
  To: Peter Bergner; +Cc: GCC Patches, ian, Richard Biener

On Fri, Mar 27, 2020 at 05:41:36PM -0500, Peter Bergner wrote:
> Given we're late in stage4, the above patch (assuming it's ok) probably
> shouldn't go in until stage1, since it is changing lower-subreg's behaviour
> slightly.
> 
> A different (ie, safer) approach would be to just rerun lower-subreg at
> its old location, regardless of whether we used -fsplit-wide-types-early.

That is my preference, for a simpler reason even: when I added the new
pass I disabled the old one, thinking it wouldn't do anything useful
anymore.  Here you show that isn't true.

> This way, we are not changing lower-subreg's behaviour, just running it an
> extra time (3 times instead of twice when using -fsplit-wide-types-early).
> I don't think lower-subreg is too expensive to run an extra time

Yes, I think so too.

> and we'd only do it when using -fsplit-wide-types-early.

But note that is true by default for some targets (rs6000 and avr
currently, I think).

>    /* opt_pass methods: */
> -  virtual bool gate (function *) { return flag_split_wide_types
> -					  && !flag_split_wide_types_early; }
> +  virtual bool gate (function *) { return flag_split_wide_types != 0; }

I think you mean
  return flag_split_wide_types != 0 != 0 != 0;

:-P


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-28 19:22 ` Segher Boessenkool
@ 2020-03-28 23:39   ` Peter Bergner
  2020-04-02 21:56     ` Segher Boessenkool
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Bergner @ 2020-03-28 23:39 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, ian, Richard Biener

On 3/28/20 2:22 PM, Segher Boessenkool wrote:
> On Fri, Mar 27, 2020 at 05:41:36PM -0500, Peter Bergner wrote:
>> A different (ie, safer) approach would be to just rerun lower-subreg at
>> its old location, regardless of whether we used -fsplit-wide-types-early.
> 
> That is my preference, for a simpler reason even: when I added the new
> pass I disabled the old one, thinking it wouldn't do anything useful
> anymore.  Here you show that isn't true.
>
>> This way, we are not changing lower-subreg's behaviour, just running it an
>> extra time (3 times instead of twice when using -fsplit-wide-types-early).
>> I don't think lower-subreg is too expensive to run an extra time
> 
> Yes, I think so too.

Right.  However, like I said though, the downside is that we don't expose
the decomposed uses to passes in between subreg2 and subreg3, like combine,
etc.  Isn't that why you moved it early in the first place?  Then again,
maybe you're getting the important cases now and subreg3 is just cleanup?

That said, I'm fine with whatever you, richi and others want.



>>    /* opt_pass methods: */
>> -  virtual bool gate (function *) { return flag_split_wide_types
>> -					  && !flag_split_wide_types_early; }
>> +  virtual bool gate (function *) { return flag_split_wide_types != 0; }
> 
> I think you mean
>   return flag_split_wide_types != 0 != 0 != 0;

Heh, I was just reverting it to the code prior to your change.  I can make
that just "return flag_split_wide_types;" if you like and we end up going
with this version.

Peter


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-27 22:41 [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail Peter Bergner
  2020-03-27 23:26 ` Ian Lance Taylor
  2020-03-28 19:22 ` Segher Boessenkool
@ 2020-03-30  8:50 ` Richard Sandiford
  2020-03-30 11:26   ` Segher Boessenkool
                     ` (2 more replies)
  2 siblings, 3 replies; 14+ messages in thread
From: Richard Sandiford @ 2020-03-30  8:50 UTC (permalink / raw)
  To: Peter Bergner via Gcc-patches; +Cc: Peter Bergner, ian, Segher Boessenkool

Peter Bergner via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> The pr87507.c test case regressed due to Segher's commit that added
> -fsplit-wide-types-early.  The issue is that the lower-subreg pass only
> decomposes the TImode code in the example if there is a pseudo reg to pseudo
> reg copy.  When the lower-subreg pass is called late (its old location),
> then combine changes the generated code by adding a TImode pseudo reg to
> pseudo reg copy and lower-subreg successfully decomposes it.
>
> When we run lower-subreg before combine, that copy isn't there so we
> do not decompose our TImode uses.  I'm not sure why we require a pseudo
> to pseudo copy before we decompose things, but changing find_pseudo_copy
> to allow pseudo and hard regs to be copied into a pseudo like below fixes
> the issue.
>
> Ian, do you remember why we don't just decompose all wide types and instead
> require a pseudo to pseudo copy to exist?
>
> This patch survived bootstrap and regtesting on powerpc64le-linux with
> no regressions.
>
> 	* lower-subreg.c (find_pseudo_copy): Allow copies from hard registers
> 	too.
>
> diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
> index 4c8bc835f93..c6816a34489 100644
> --- a/gcc/lower-subreg.c
> +++ b/gcc/lower-subreg.c
> @@ -419,7 +419,7 @@ find_pseudo_copy (rtx set)
>  
>    rd = REGNO (dest);
>    rs = REGNO (src);
> -  if (HARD_REGISTER_NUM_P (rd) || HARD_REGISTER_NUM_P (rs))
> +  if (HARD_REGISTER_NUM_P (rd))
>      return false;
>  
>    b = reg_copy_graph[rs];

I guess this would also work if we dropped the rd check instead.
So how about s/||/&&/ instead, to avoid the assymetry?

I agree something like this is a better fix long-term, since we
shouldn't be relying on make_more_copies outside combine.

> Given we're late in stage4, the above patch (assuming it's ok) probably
> shouldn't go in until stage1, since it is changing lower-subreg's behaviour
> slightly.
>
> A different (ie, safer) approach would be to just rerun lower-subreg at
> its old location, regardless of whether we used -fsplit-wide-types-early.
> This way, we are not changing lower-subreg's behaviour, just running it an
> extra time (3 times instead of twice when using -fsplit-wide-types-early).
> I don't think lower-subreg is too expensive to run an extra time and we'd
> only do it when using -fsplit-wide-types-early.  The only downside (if any)
> is that we don't decompose these TImode uses early like the patch above does,
> so combine, etc. can't see what they will eventually become.  This does fix
> the bug and also survives bootstrap and regtesting on powerpc64le-linux
> with no regressions.
>
> 	* lower-subreg.c (pass_lower_subreg3::gate): Remove test for
> 	flag_split_wide_types_early.
>
> diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
> index 4c8bc835f93..807ad398b64 100644
> --- a/gcc/lower-subreg.c
> +++ b/gcc/lower-subreg.c
> @@ -1844,8 +1844,7 @@ public:
>    {}
>  
>    /* opt_pass methods: */
> -  virtual bool gate (function *) { return flag_split_wide_types
> -					  && !flag_split_wide_types_early; }
> +  virtual bool gate (function *) { return flag_split_wide_types != 0; }
>    virtual unsigned int execute (function *)
>      {
>        decompose_multiword_subregs (true);

Looks good to me with the s/ != 0// that Segher mentioned.

With this change, the only remaining function of -fsplit-wide-types-early
is to act as a double lock on one pass.  IMO it'd make more sense to remove
that double lock and make -fsplit-wide-types-early and -fsplit-wide-types
act as independent options, a bit like -fschedule-insns{,2}.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-30  8:50 ` Richard Sandiford
@ 2020-03-30 11:26   ` Segher Boessenkool
  2020-03-30 16:23     ` Peter Bergner
  2020-03-30 16:06   ` Segher Boessenkool
  2020-04-01 17:48   ` Peter Bergner
  2 siblings, 1 reply; 14+ messages in thread
From: Segher Boessenkool @ 2020-03-30 11:26 UTC (permalink / raw)
  To: Peter Bergner via Gcc-patches, Peter Bergner, ian, richard.sandiford

Hi!

On Mon, Mar 30, 2020 at 09:50:05AM +0100, Richard Sandiford wrote:
> Peter Bergner via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > -  if (HARD_REGISTER_NUM_P (rd) || HARD_REGISTER_NUM_P (rs))
> > +  if (HARD_REGISTER_NUM_P (rd))
> >      return false;
> >  
> >    b = reg_copy_graph[rs];
> 
> I guess this would also work if we dropped the rd check instead.
> So how about s/||/&&/ instead, to avoid the assymetry?
> 
> I agree something like this is a better fix long-term, since we
> shouldn't be relying on make_more_copies outside combine.

Yes; on the other hand, most RTL passes should do something to not have
hard registers forwarded into non-move instructions (where they can
cause problems later).  (make_more_copies itself is a technicality
specific to how combine works, and we might be able to drop it in the
future).

> With this change, the only remaining function of -fsplit-wide-types-early
> is to act as a double lock on one pass.  IMO it'd make more sense to remove
> that double lock and make -fsplit-wide-types-early and -fsplit-wide-types
> act as independent options, a bit like -fschedule-insns{,2}.

Sure, that would simplify things a bit (at least conceptually).

With or without that change, the documentation could use some tweaking
as well, after this patch.


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-30  8:50 ` Richard Sandiford
  2020-03-30 11:26   ` Segher Boessenkool
@ 2020-03-30 16:06   ` Segher Boessenkool
  2020-04-01 17:48   ` Peter Bergner
  2 siblings, 0 replies; 14+ messages in thread
From: Segher Boessenkool @ 2020-03-30 16:06 UTC (permalink / raw)
  To: Peter Bergner via Gcc-patches, Peter Bergner, ian, richard.sandiford

Hi!

On Mon, Mar 30, 2020 at 09:50:05AM +0100, Richard Sandiford wrote:
> With this change, the only remaining function of -fsplit-wide-types-early
> is to act as a double lock on one pass.  IMO it'd make more sense to remove
> that double lock and make -fsplit-wide-types-early and -fsplit-wide-types
> act as independent options, a bit like -fschedule-insns{,2}.

But then, -fsplit-wide-types would control two passes, one of which is
identical to -fsplit-wide-types-early, just in a different spot in the
pass pipeline.  That is bad enough on its own, already :-/

I still think all three passes should be controlled by -fsplit-wide-types,
just as they already are (and the "old" two have been for years).


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-30 11:26   ` Segher Boessenkool
@ 2020-03-30 16:23     ` Peter Bergner
  2020-03-30 16:26       ` Peter Bergner
  2020-03-30 16:39       ` Segher Boessenkool
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Bergner @ 2020-03-30 16:23 UTC (permalink / raw)
  To: Segher Boessenkool, richard.sandiford; +Cc: Gcc-patches, ian

On 3/30/20 6:26 AM, Segher Boessenkool wrote:
> On Mon, Mar 30, 2020 at 09:50:05AM +0100, Richard Sandiford wrote:
>> Peter Bergner via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>> -  if (HARD_REGISTER_NUM_P (rd) || HARD_REGISTER_NUM_P (rs))
>>> +  if (HARD_REGISTER_NUM_P (rd))
>>>      return false;
>>>  
>>>    b = reg_copy_graph[rs];
>>
>> I guess this would also work if we dropped the rd check instead.
>> So how about s/||/&&/ instead, to avoid the assymetry?
>>
>> I agree something like this is a better fix long-term, since we
>> shouldn't be relying on make_more_copies outside combine.
> 
> Yes; on the other hand, most RTL passes should do something to not have
> hard registers forwarded into non-move instructions (where they can
> cause problems later).  (make_more_copies itself is a technicality
> specific to how combine works, and we might be able to drop it in the
> future).

I kind of agree with Richard above on making it more applicable/symmetric,
but why can't we just remove the HARD_REGISTER_NUM_P() tests altogether?
It's not like lower-subreg is extending hard register lifetime usage than
what is already there in the rtl.  We're just decomposing what's already
there into smaller register sized chunks.  Is there a problem with that
I'm not aware of?


Peter


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-30 16:23     ` Peter Bergner
@ 2020-03-30 16:26       ` Peter Bergner
  2020-03-30 16:39       ` Segher Boessenkool
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Bergner @ 2020-03-30 16:26 UTC (permalink / raw)
  To: Segher Boessenkool, richard.sandiford; +Cc: Gcc-patches, ian

On 3/30/20 11:23 AM, Peter Bergner wrote:
> I kind of agree with Richard above on making it more applicable/symmetric,
> but why can't we just remove the HARD_REGISTER_NUM_P() tests altogether?
> It's not like lower-subreg is extending hard register lifetime usage than
> what is already there in the rtl.  We're just decomposing what's already
> there into smaller register sized chunks.  Is there a problem with that
> I'm not aware of?

...or maybe there was an issue when combine used to extend hard register
lifetimes (which it doesn't anymore) and the test above was just a workaround?

Peter



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-30 16:23     ` Peter Bergner
  2020-03-30 16:26       ` Peter Bergner
@ 2020-03-30 16:39       ` Segher Boessenkool
  1 sibling, 0 replies; 14+ messages in thread
From: Segher Boessenkool @ 2020-03-30 16:39 UTC (permalink / raw)
  To: Peter Bergner; +Cc: richard.sandiford, Gcc-patches, ian

On Mon, Mar 30, 2020 at 11:23:12AM -0500, Peter Bergner wrote:
> On 3/30/20 6:26 AM, Segher Boessenkool wrote:
> > On Mon, Mar 30, 2020 at 09:50:05AM +0100, Richard Sandiford wrote:
> >> Peter Bergner via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> >>> -  if (HARD_REGISTER_NUM_P (rd) || HARD_REGISTER_NUM_P (rs))
> >>> +  if (HARD_REGISTER_NUM_P (rd))
> >>>      return false;
> >>>  
> >>>    b = reg_copy_graph[rs];
> >>
> >> I guess this would also work if we dropped the rd check instead.
> >> So how about s/||/&&/ instead, to avoid the assymetry?
> >>
> >> I agree something like this is a better fix long-term, since we
> >> shouldn't be relying on make_more_copies outside combine.
> > 
> > Yes; on the other hand, most RTL passes should do something to not have
> > hard registers forwarded into non-move instructions (where they can
> > cause problems later).  (make_more_copies itself is a technicality
> > specific to how combine works, and we might be able to drop it in the
> > future).
> 
> I kind of agree with Richard above on making it more applicable/symmetric,
> but why can't we just remove the HARD_REGISTER_NUM_P() tests altogether?
> It's not like lower-subreg is extending hard register lifetime usage than
> what is already there in the rtl.  We're just decomposing what's already
> there into smaller register sized chunks.  Is there a problem with that
> I'm not aware of?

The function comment is (since day 1):
  /* If SET is a copy from one multi-word pseudo-register to another,
     record that in reg_copy_graph.  Return whether it is such a
     copy.  */
so a) that needs fixing; and b) what does this change for the (single)
caller of find_pseudo_copy?  The comment there isn't very enlightening:
  /* We mark pseudo-to-pseudo copies as decomposable during the
     second pass only.  The first pass is so early that there is
     good chance such moves will be optimized away completely by
     subsequent optimizations anyway.

     However, we call find_pseudo_copy even during the first pass
     so as to properly set up the reg_copy_graph.  */

(The function *name* should change as well, if you make this change).


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-30  8:50 ` Richard Sandiford
  2020-03-30 11:26   ` Segher Boessenkool
  2020-03-30 16:06   ` Segher Boessenkool
@ 2020-04-01 17:48   ` Peter Bergner
  2020-04-01 18:32     ` Richard Sandiford
  2 siblings, 1 reply; 14+ messages in thread
From: Peter Bergner @ 2020-04-01 17:48 UTC (permalink / raw)
  To: Segher Boessenkool, richard.sandiford; +Cc: gcc-patches, ian

On 3/30/20 3:50 AM, Richard Sandiford wrote:
> Peter Bergner via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> 	* lower-subreg.c (pass_lower_subreg3::gate): Remove test for
>> 	flag_split_wide_types_early.
>>
>> diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
>> index 4c8bc835f93..807ad398b64 100644
>> --- a/gcc/lower-subreg.c
>> +++ b/gcc/lower-subreg.c
>> @@ -1844,8 +1844,7 @@ public:
>>    {}
>>  
>>    /* opt_pass methods: */
>> -  virtual bool gate (function *) { return flag_split_wide_types
>> -					  && !flag_split_wide_types_early; }
>> +  virtual bool gate (function *) { return flag_split_wide_types != 0; }
>>    virtual unsigned int execute (function *)
>>      {
>>        decompose_multiword_subregs (true);
> 
> Looks good to me with the s/ != 0// that Segher mentioned.
> 
> With this change, the only remaining function of -fsplit-wide-types-early
> is to act as a double lock on one pass.  IMO it'd make more sense to remove
> that double lock and make -fsplit-wide-types-early and -fsplit-wide-types
> act as independent options, a bit like -fschedule-insns{,2}.

Have we come to consensus on whether to split the options or not?
I think Segher is against it given we actually have 3 passes of
lower-subreg and -fsplit-wide-types would control the 1st and 3rd
passes and -fsplit-wide-types-early would control the second.
That does seem strange to me too.

Peter



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-04-01 17:48   ` Peter Bergner
@ 2020-04-01 18:32     ` Richard Sandiford
  2020-04-01 19:43       ` Peter Bergner
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Sandiford @ 2020-04-01 18:32 UTC (permalink / raw)
  To: Peter Bergner; +Cc: Segher Boessenkool, gcc-patches, ian

Peter Bergner <bergner@linux.ibm.com> writes:
> On 3/30/20 3:50 AM, Richard Sandiford wrote:
>> Peter Bergner via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>> 	* lower-subreg.c (pass_lower_subreg3::gate): Remove test for
>>> 	flag_split_wide_types_early.
>>>
>>> diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
>>> index 4c8bc835f93..807ad398b64 100644
>>> --- a/gcc/lower-subreg.c
>>> +++ b/gcc/lower-subreg.c
>>> @@ -1844,8 +1844,7 @@ public:
>>>    {}
>>>  
>>>    /* opt_pass methods: */
>>> -  virtual bool gate (function *) { return flag_split_wide_types
>>> -					  && !flag_split_wide_types_early; }
>>> +  virtual bool gate (function *) { return flag_split_wide_types != 0; }
>>>    virtual unsigned int execute (function *)
>>>      {
>>>        decompose_multiword_subregs (true);
>> 
>> Looks good to me with the s/ != 0// that Segher mentioned.
>> 
>> With this change, the only remaining function of -fsplit-wide-types-early
>> is to act as a double lock on one pass.  IMO it'd make more sense to remove
>> that double lock and make -fsplit-wide-types-early and -fsplit-wide-types
>> act as independent options, a bit like -fschedule-insns{,2}.
>
> Have we come to consensus on whether to split the options or not?
> I think Segher is against it given we actually have 3 passes of
> lower-subreg and -fsplit-wide-types would control the 1st and 3rd
> passes and -fsplit-wide-types-early would control the second.
> That does seem strange to me too.

I guess the name of the option is a bit weird, since it'll control
the middle pass of three.  That's going to be true either way though.

We're talking about having independent options controlling independent
passes, which seems like a Good Thing in general and doesn't seem that
strange to me in this case.  But I'm certainly happy to yield given the
strong opinions the other way.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-04-01 18:32     ` Richard Sandiford
@ 2020-04-01 19:43       ` Peter Bergner
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Bergner @ 2020-04-01 19:43 UTC (permalink / raw)
  To: richard.sandiford; +Cc: Segher Boessenkool, gcc-patches, ian

On 4/1/20 1:32 PM, Richard Sandiford wrote:
> Peter Bergner <bergner@linux.ibm.com> writes:
>> Have we come to consensus on whether to split the options or not?
>> I think Segher is against it given we actually have 3 passes of
>> lower-subreg and -fsplit-wide-types would control the 1st and 3rd
>> passes and -fsplit-wide-types-early would control the second.
>> That does seem strange to me too.
> 
> I guess the name of the option is a bit weird, since it'll control
> the middle pass of three.  That's going to be true either way though.
> 
> We're talking about having independent options controlling independent
> passes, which seems like a Good Thing in general and doesn't seem that
> strange to me in this case.  But I'm certainly happy to yield given the
> strong opinions the other way.

Ok, I pushed the patch without breaking them apart.  We can maybe revisit
the issue in stage1, when I'll start testing the first patch that allows
hard registers to be decomposed.

Thanks!

Peter

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail
  2020-03-28 23:39   ` Peter Bergner
@ 2020-04-02 21:56     ` Segher Boessenkool
  0 siblings, 0 replies; 14+ messages in thread
From: Segher Boessenkool @ 2020-04-02 21:56 UTC (permalink / raw)
  To: Peter Bergner; +Cc: GCC Patches, ian, Richard Biener

On Sat, Mar 28, 2020 at 06:39:56PM -0500, Peter Bergner wrote:
> On 3/28/20 2:22 PM, Segher Boessenkool wrote:
> > On Fri, Mar 27, 2020 at 05:41:36PM -0500, Peter Bergner wrote:
> >> A different (ie, safer) approach would be to just rerun lower-subreg at
> >> its old location, regardless of whether we used -fsplit-wide-types-early.
> > 
> > That is my preference, for a simpler reason even: when I added the new
> > pass I disabled the old one, thinking it wouldn't do anything useful
> > anymore.  Here you show that isn't true.
> >
> >> This way, we are not changing lower-subreg's behaviour, just running it an
> >> extra time (3 times instead of twice when using -fsplit-wide-types-early).
> >> I don't think lower-subreg is too expensive to run an extra time
> > 
> > Yes, I think so too.
> 
> Right.  However, like I said though, the downside is that we don't expose
> the decomposed uses to passes in between subreg2 and subreg3, like combine,
> etc.  Isn't that why you moved it early in the first place?  Then again,
> maybe you're getting the important cases now and subreg3 is just cleanup?

Yeah.  subreg1 is the limited one; subreg2 and subreg3 are the "full"
one.  subreg2 is the "early" one that only some archs have by default.
subreg3 will not do much (if subreg2 is enabled), except all the usual
RTL passes (CSE, *prop, combine, etc.) can expose more possibilities
for lower-subreg in some cases.


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-04-02 21:56 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-27 22:41 [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail Peter Bergner
2020-03-27 23:26 ` Ian Lance Taylor
2020-03-28 19:22 ` Segher Boessenkool
2020-03-28 23:39   ` Peter Bergner
2020-04-02 21:56     ` Segher Boessenkool
2020-03-30  8:50 ` Richard Sandiford
2020-03-30 11:26   ` Segher Boessenkool
2020-03-30 16:23     ` Peter Bergner
2020-03-30 16:26       ` Peter Bergner
2020-03-30 16:39       ` Segher Boessenkool
2020-03-30 16:06   ` Segher Boessenkool
2020-04-01 17:48   ` Peter Bergner
2020-04-01 18:32     ` Richard Sandiford
2020-04-01 19:43       ` Peter Bergner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).