public inbox for gcc-patches@gcc.gnu.org
From: Manolis Tsamis <manolis.tsamis@vrull.eu>
To: Manolis Tsamis <manolis.tsamis@vrull.eu>,
	gcc-patches@gcc.gnu.org,
	 Philipp Tomsich <philipp.tomsich@vrull.eu>,
	Jakub Jelinek <jakub@redhat.com>,
	 Andrew Pinski <apinski@marvell.com>,
	Robin Dapp <rdapp@linux.ibm.com>,
	richard.sandiford@arm.com
Subject: Re: [PATCH v2 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets
Date: Tue, 18 Jul 2023 20:03:38 +0300
Message-ID: <CAM3yNXoSVJwMFxDMs4WO51V5f8rWWQ5iz3NNFkqyUzfGUKAY7w@mail.gmail.com>
In-Reply-To: <mpty1jemblk.fsf@arm.com>

Hi Richard,

Thanks for your insightful reply.

On Tue, Jul 18, 2023 at 1:12 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Manolis Tsamis <manolis.tsamis@vrull.eu> writes:
> > noce_convert_multiple_sets has been introduced and extended over time to handle
> > if conversion for blocks with multiple sets. Currently this is focused on
> > register moves and rejects any sort of arithmetic operations.
> >
> > This series is an extension to allow more sequences to take part in if
> > conversion. The first patch is a required change to emit correct code and the
> > second patch whitelists a larger number of operations through
> > bb_ok_for_noce_convert_multiple_sets.
> >
> > For targets that have a rich selection of conditional instructions,
> > like aarch64, I have seen an ~5x increase of profitable if conversions for
> > multiple set blocks in SPEC benchmarks. Also tested with a wide variety of
> > benchmarks and I have not seen performance regressions on either x64 / aarch64.
>
> Interesting results.  Are you free to say which target you used for aarch64?
>
> If I've understood the cost heuristics correctly, we'll allow a "predictable"
> branch to be replaced by up to 5 simple conditional instructions and an
> "unpredictable" branch to be replaced by up to 10 simple conditional
> instructions.  That seems pretty high.  And I'm not sure how well we
> guess predictability in the absence of real profile information.
>
> So my gut instinct was that the limitations of the current code might
> be saving us from overly generous limits.  It sounds from your results
> like that might not be the case though.
>
> Still, if it does turn out to be the case in future, I agree we should
> fix the costs rather than hamstring the code.
>

My wording may have been confusing: by "~5x increase of
profitable if conversions" I only meant that ifcvt considers these
conversions profitable, not that they actually are when executed on
particular hardware.
That said, I haven't yet seen any obvious performance
regressions in the benchmarks that I have run.
In any case, it could be interesting to microbenchmark branches against
conditional instructions and see how sane these numbers are.

> > Some samples that previously resulted in a branch but now better use these
> > instructions can be seen in the provided test case.
> >
> > Tested on aarch64 and x64; On x64 some tests that use __builtin_rint are
> > failing with an ICE but I believe that it's not an issue of this change.
> > force_operand crashes when (and:DF (not:DF (reg:DF 88)) (reg/v:DF 83 [ x ]))
> > is provided through emit_conditional_move.
>
> I guess that needs to be fixed first though.  (Thanks for checking both
> targets.)
>

I have a feeling this may be fixed once I properly address your
points (1) and (2) below. I will report back on that.

> My main comments on the series are:
>
> (1) It isn't obvious which operations should be included in the list
>     in patch 2 and which shouldn't.  Also, the patch only checks the
>     outermost operation, and so it allows the inner rtxes to be
>     arbitrarily complex.
>
>     Because of that, it might be better to remove the condition
>     altogether and just rely on the other routines to do costing and
>     correctness checks.
>

That is true; I wanted to allow only "normal arithmetic
operations" and avoid generating sequences with stranger codes. I will
try removing the condition altogether and see what happens. I also
completely missed that I was allowing arbitrarily complex inner
rtxes, so thanks for pointing that out.
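
As a rough illustration of the "outermost operation only" problem, here is
a toy model in C. It is not GCC code: the toy_* types and functions are
hypothetical stand-ins for rtx codes and for the check in
bb_ok_for_noce_convert_multiple_sets, and the allowed set is an arbitrary
assumption. It only shows why a check on the top-level code accepts an
expression whose operand uses a code that was never meant to be allowed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for rtx codes; not the real GCC enum.  */
enum toy_code { TOY_REG, TOY_PLUS, TOY_MINUS, TOY_DIV };

/* A toy binary expression tree; op0/op1 are NULL for TOY_REG.  */
struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op0;
  const struct toy_expr *op1;
};

/* Assumed allowed set: registers, plus and minus, but not division.  */
int toy_code_allowed (enum toy_code code)
{
  return code == TOY_REG || code == TOY_PLUS || code == TOY_MINUS;
}

/* Checks only the outermost code, like the check under discussion.  */
int toy_outer_ok (const struct toy_expr *e)
{
  return toy_code_allowed (e->code);
}

/* Checks every node, so a disallowed inner code is rejected too.  */
int toy_deep_ok (const struct toy_expr *e)
{
  if (e == NULL)
    return 1;
  return toy_code_allowed (e->code)
         && toy_deep_ok (e->op0)
         && toy_deep_ok (e->op1);
}

/* (plus (div a b) c): accepted by the outer check, rejected by the deep one.  */
void toy_check (void)
{
  struct toy_expr a = { TOY_REG, NULL, NULL };
  struct toy_expr b = { TOY_REG, NULL, NULL };
  struct toy_expr c = { TOY_REG, NULL, NULL };
  struct toy_expr quot = { TOY_DIV, &a, &b };
  struct toy_expr sum = { TOY_PLUS, &quot, &c };

  assert (toy_outer_ok (&sum));
  assert (!toy_deep_ok (&sum));
}
```

Dropping the check and relying on the costing and correctness routines, as
suggested, avoids having to choose between these two variants at all.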

> (2) Don't you also need to update the "rewiring" mechanism, to cope
>     with cases where the then block has something like:
>
>       if (a == 0) {
>         a = b op c;       ->    a' = a == 0 ? b op c : a;
>         d = a op b;       ->    d = a == 0 ? a' op b : d;
>       }                         a = a'
>
>     At the moment the code only handles regs and subregs, but IIUC
>     it should now iterate over all the regs in the SET_SRC.  And I suppose
>     that creates the need for multiple possible rewirings in the same insn,
>     so that it isn't a simple insn -> index mapping any more.
>

Indeed, I believe the current patch cannot handle these cases properly. I
will create test cases for them and see what changes are needed in
the next iteration so that correct code is generated.
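
For reference, the example from point (2) can be written out in plain C.
This is only a sketch of the intended semantics, not of what ifcvt emits:
"op" is assumed to be integer addition, the function names are made up for
illustration, and the ?: expressions stand in for conditional instructions.
The third function shows what goes wrong if the second conditional keeps
reading the original a instead of the temporary a'.

```c
#include <assert.h>

/* Values of a and d after the block.  */
struct regs { int a; int d; };

/* Original, branchy form of the then block ("op" assumed to be +).  */
struct regs branchy (int a, int b, int c, int d)
{
  if (a == 0)
    {
      a = b + c;
      d = a + b;                /* reads the a that was just written */
    }
  struct regs r = { a, d };
  return r;
}

/* If-converted form with rewiring: later uses of a read the temporary.  */
struct regs converted (int a, int b, int c, int d)
{
  int cond = (a == 0);          /* condition evaluated once, on the old a */
  int a_tmp = cond ? b + c : a; /* a' = a == 0 ? b op c : a */
  d = cond ? a_tmp + b : d;     /* d  = a == 0 ? a' op b : d */
  a = a_tmp;                    /* a  = a' */
  struct regs r = { a, d };
  return r;
}

/* Same conversion without rewiring the use of a inside the SET_SRC.  */
struct regs converted_no_rewiring (int a, int b, int c, int d)
{
  int cond = (a == 0);
  int a_tmp = cond ? b + c : a;
  d = cond ? a + b : d;         /* wrong: still reads the old a */
  a = a_tmp;
  struct regs r = { a, d };
  return r;
}

/* Compares the three variants on a taken and a not-taken input.  */
void check_rewiring_examples (void)
{
  struct regs want = branchy (0, 2, 3, 100);
  struct regs good = converted (0, 2, 3, 100);
  struct regs bad = converted_no_rewiring (0, 2, 3, 100);
  assert (good.a == want.a && good.d == want.d);
  assert (bad.a == want.a && bad.d != want.d);

  want = branchy (1, 2, 3, 100);
  good = converted (1, 2, 3, 100);
  assert (good.a == want.a && good.d == want.d);
}
```

A test case along these lines, with the second set using the result of the
first inside an arithmetic expression, should catch the missing rewiring.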

Thanks,
Manolis

> Thanks,
> Richard
>
> >
> >
> > Changes in v2:
> >         - Change "conditional moves" to "conditional instructions"
> >         in bb_ok_for_noce_convert_multiple_sets's comment.
> >
> > Manolis Tsamis (2):
> >   ifcvt: handle sequences that clobber flags in
> >     noce_convert_multiple_sets
> >   ifcvt: Allow more operations in multiple set if conversion
> >
> >  gcc/ifcvt.cc                                  | 109 ++++++++++--------
> >  .../aarch64/ifcvt_multiple_sets_arithm.c      |  67 +++++++++++
> >  2 files changed, 127 insertions(+), 49 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c

Thread overview: 7+ messages
2023-07-13 14:09 Manolis Tsamis
2023-07-13 14:09 ` [PATCH v2 1/2] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets Manolis Tsamis
2023-07-13 14:09 ` [PATCH v2 2/2] ifcvt: Allow more operations in multiple set if conversion Manolis Tsamis
2023-07-17 22:12 ` [PATCH v2 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets Richard Sandiford
2023-07-18 17:03   ` Manolis Tsamis [this message]
2023-07-18 18:38     ` Richard Sandiford
2023-08-30 10:30       ` Manolis Tsamis