Re: [PATCH v2 2/2] combine: Try harder to form zero_extends [PR106594]

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Segher Boessenkool <segher@kernel.crashing.org>
To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
Subject: Re: [PATCH v2 2/2] combine: Try harder to form zero_extends [PR106594]
Date: Fri, 31 Mar 2023 09:35:57 -0500	[thread overview]
Message-ID: <20230331143556.GW25951@gate.crashing.org> (raw)
In-Reply-To: <mpty1ndrp5a.fsf@arm.com>

On Fri, Mar 31, 2023 at 03:06:41PM +0100, Richard Sandiford wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > On Thu, Mar 09, 2023 at 12:10:51PM +0000, Richard Sandiford wrote:
> >>   (and:DI (subreg:DI (reg:SI r115) 0)
> >>           (const_int 63))
> >
> > This is more expensive already?!  An "and" usually costs a real machine
> > instruction, while subregs are usually free (just notational).  The
> > "and" can not easily be optimised away either (after combine we only
> > have less accurate nonzero_bits, to start with).
> 
> Well, it's an "and" in place of a zero_extend rather than an "and"
> in place of a subreg.  So it is one real instruction for another
> real instruction.  But yeah, it's not exactly a simplification.

We should be able to optimise away the zero_extend in many cases, and we
cannot do that to an "and" usually.

> The "we" here is expand_compound_operation, via simplify_and_const_int.
> So it's the usual thing: expand_compound_operation creates something
> that is perhaps more expensive, then after simplification,
> make_compound_operation is supposed to collapse these "expanded"
> perhaps-more-expensive forms back.  And it's the last bit that
> goes wrong.

Yes.  And that is even unavoidable.  The whole concept is a bad idea.
We should not create worse things hoping that we can make it better
things later again; we usually cannot make it as good anymore, or even
reasonably good :-(

Totally skipping this dance of course results in worse code as well, so
more work is needed.

> > This looks pretty good (thanks!), but as always it needs testing on more
> > architectures, showing it doesn't hurt things.  It should be beneficial,
> > but it is not unlikely to hurt other existing backends, and we are not
> > in stage 1 (we are in stage 4 even!)
> 
> Yeah, but it's fixing a GCC 13 regression (and an important one on aarch64).

That is not an excuse for (potentially) causing hundreds of new
regressions, some maybe even worse.

> > Do you have any such proof / indication / anything?  I can start
> > some run but it takes a day (or two), and I cannot start it until next
> > week.
> 
> This is an alternative presentation of the change that we discussed
> a few weeks ago, and that you already tested:
> 
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613486.html
> 
> The results seem to indicate that the patch had no effect on targets
> other than aarch64.  [A good thing, IMO. :-)  The purpose of the
> patch is to fix the aarch64 regression in a minimally invasive way.]

If it is the same (I don't agree at all fwiw) it has no effect on
aarch64 either, other than on your single testcase.  So that is a good
reason to NAK this patch outside of stage 1 already.

> I tried building an aarch64 linux kernel locally with -Os
> -fno-schedule-insns{,2}.

Completely unrealistic.  No one builds anything they care for speed fo
with -Os (and if people care for size, you still get better results
with -O2 + some other tunings), and disabling scheduling is disastrous.

> I saw code size improvements in 182 .os and a
> regression in only one .o.  (I was comparing individual .os because it
> makes isolation easier.)


> The one regression was because the optimised version had fewer pseudos,
> and so something that was previously allocated to x3 was allocated to x2.
> This pseudo was initialised to 0, and a preceding stack protect
> instruction had the known side effect (both before and after the patch)
> of setting x3 to 0.  So with the x3 allocation, postreload was able to
> remove the separate initialisation of x3 with 0, but with the x2
> allocation it couldn't.
> 
> So for my kernel build it was a 182:1 split in favour of improvements,
> with the one regression being a quirk rather than a true regression.

For a default configuration it gave 100.033% size, a non-trivial
regression.  But again, I don't think this code has the same effect.

> > Do you have plans to make combine not do this insane "and" thing at all?
> > Or to attack the compound operation stuff head on?
> 
> I don't have any plans to do that myself, but I agree it would be
> better to get rid of the compound operation stuff if we can.

Getting rid of it is easy enough, but that also loses wanted
functionality (even for archs that do not have any extract
functionality; these functions do more than that :-( ).

> I can see why the expand/simplify/remake process seemed like
> a nice idea, but in practice, there's just too much that can
> go wrong.

Yup.

> And removing the compound stuff would go a long way
> to making combine simplification more like fwprop simpliification
> (unlike now, where we effectively have two separate simplification
> processes for things involving *_extends and *_extracts).

This has nothing to do with simplification?  We just use
simplify-rtx (which started out as part of combine exclusively), just
like everything else.

All the other transforms that are combine-only are very useful to have
as well, as shown a year or so ago.  But those are not
"simplifications" :-)


Segher

next prev parent reply	other threads:[~2023-03-31 14:36 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-09 12:08 [PATCH v2 0/2] Series of patch to fix PR106594 Richard Sandiford
2023-03-09 12:09 ` [PATCH v2 1/2] combine: Split code out of make_compound_operation_int Richard Sandiford
2023-03-31 11:06   ` Segher Boessenkool
2023-03-31 12:13     ` Richard Sandiford
2023-03-09 12:10 ` [PATCH v2 2/2] combine: Try harder to form zero_extends [PR106594] Richard Sandiford
2023-03-31 12:16   ` Segher Boessenkool
2023-03-31 14:06     ` Richard Sandiford
2023-03-31 14:35       ` Segher Boessenkool [this message]
2023-03-31 14:55         ` Richard Sandiford
2023-03-27 12:14 ` [PATCH v2 0/2] Series of patch to fix PR106594 Richard Sandiford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230331143556.GW25951@gate.crashing.org \
    --to=segher@kernel.crashing.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).