From: Manolis Tsamis <manolis.tsamis@vrull.eu>
To: Manolis Tsamis <manolis.tsamis@vrull.eu>,
gcc-patches@gcc.gnu.org,
Philipp Tomsich <philipp.tomsich@vrull.eu>,
Jakub Jelinek <jakub@redhat.com>,
Andrew Pinski <apinski@marvell.com>,
Robin Dapp <rdapp@linux.ibm.com>,
richard.sandiford@arm.com
Subject: Re: [PATCH v2 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets
Date: Tue, 18 Jul 2023 20:03:38 +0300 [thread overview]
Message-ID: <CAM3yNXoSVJwMFxDMs4WO51V5f8rWWQ5iz3NNFkqyUzfGUKAY7w@mail.gmail.com> (raw)
In-Reply-To: <mpty1jemblk.fsf@arm.com>
Hi Richard,
Thanks for your insightful reply.
On Tue, Jul 18, 2023 at 1:12 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Manolis Tsamis <manolis.tsamis@vrull.eu> writes:
> > noce_convert_multiple_sets has been introduced and extended over time to handle
> > if conversion for blocks with multiple sets. Currently this is focused on
> > register moves and rejects any sort of arithmetic operations.
> >
> > This series is an extension to allow more sequences to take part in if
> > conversion. The first patch is a required change to emit correct code and the
> > second patch whitelists a larger number of operations through
> > bb_ok_for_noce_convert_multiple_sets.
> >
> > For targets that have a rich selection of conditional instructions,
> > like aarch64, I have seen an ~5x increase of profitable if conversions for
> > multiple set blocks in SPEC benchmarks. Also tested with a wide variety of
> > benchmarks and I have not seen performance regressions on either x64 / aarch64.
>
> Interesting results. Are you free to say which target you used for aarch64?
>
> If I've understood the cost heuristics correctly, we'll allow a "predictable"
> branch to be replaced by up to 5 simple conditional instructions and an
> "unpredictable" branch to be replaced by up to 10 simple conditional
> instructions. That seems pretty high. And I'm not sure how well we
> guess predictability in the absence of real profile information.
>
> So my gut instinct was that the limitations of the current code might
> be saving us from overly generous limits. It sounds from your results
> like that might not be the case though.
>
> Still, if it does turn out to be the case in future, I agree we should
> fix the costs rather than hamstring the code.
>
My writing may have been confusing, but with "~5x increase of
profitable if conversions" I just meant that ifcvt considers these
profitable, not that they actually are when executed in particular
hardware.
But at the same time I haven't yet seen any obvious performance
regressions in some benchmarks that I have ran.
In any case it could be interesting to microbenchmark branches vs
conditional instructions and see how sane these numbers are.
> > Some samples that previously resulted in a branch but now better use these
> > instructions can be seen in the provided test case.
> >
> > Tested on aarch64 and x64; On x64 some tests that use __builtin_rint are
> > failing with an ICE but I believe that it's not an issue of this change.
> > force_operand crashes when (and:DF (not:DF (reg:DF 88)) (reg/v:DF 83 [ x ]))
> > is provided through emit_conditional_move.
>
> I guess that needs to be fixed first though. (Thanks for checking both
> targets.)
>
I get a feeling this may be fixed if I properly take care of your
points 1 & 2 below. I will report on that.
> My main comments on the series are:
>
> (1) It isn't obvious which operations should be included in the list
> in patch 2 and which shouldn't. Also, the patch only checks the
> outermost operation, and so it allows the inner rtxes to be
> arbitrarily complex.
>
> Because of that, it might be better to remove the condition
> altogether and just rely on the other routines to do costing and
> correctness checks.
>
That is true; I wanted to somehow only allow "normal arithmetic
operations" and avoid generating sequences with stranger codes. I will
try and see what happens if I remove the condition altogether. I also
totally missed the fact that I was allowing arbitrarily complex inner
rtxes so thanks for pointing that out.
> (2) Don't you also need to update the "rewiring" mechanism, to cope
> with cases where the then block has something like:
>
> if (a == 0) {
> a = b op c; -> a' = a == 0 ? b op c : a;
> d = a op b; -> d = a == 0 ? a' op b : d;
> } a = a'
>
> At the moment the code only handles regs and subregs, whereas but IIUC
> it should now iterate over all the regs in the SET_SRC. And I suppose
> that creates the need for multiple possible rewirings in the same insn,
> so that it isn't a simple insn -> index mapping any more.
>
Indeed, I believe this current patch cannot properly handle these. I
will create testcases for this and see what changes need to be done in
the next iteration so that correct code is generated.
Thanks,
Manolis
> Thanks,
> Richard
>
> >
> >
> > Changes in v2:
> > - Change "conditional moves" to "conditional instructions"
> > in bb_ok_for_noce_convert_multiple_sets's comment.
> >
> > Manolis Tsamis (2):
> > ifcvt: handle sequences that clobber flags in
> > noce_convert_multiple_sets
> > ifcvt: Allow more operations in multiple set if conversion
> >
> > gcc/ifcvt.cc | 109 ++++++++++--------
> > .../aarch64/ifcvt_multiple_sets_arithm.c | 67 +++++++++++
> > 2 files changed, 127 insertions(+), 49 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c
next prev parent reply other threads:[~2023-07-18 17:04 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-13 14:09 Manolis Tsamis
2023-07-13 14:09 ` [PATCH v2 1/2] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets Manolis Tsamis
2023-07-13 14:09 ` [PATCH v2 2/2] ifcvt: Allow more operations in multiple set if conversion Manolis Tsamis
2023-07-17 22:12 ` [PATCH v2 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets Richard Sandiford
2023-07-18 17:03 ` Manolis Tsamis [this message]
2023-07-18 18:38 ` Richard Sandiford
2023-08-30 10:30 ` Manolis Tsamis
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAM3yNXoSVJwMFxDMs4WO51V5f8rWWQ5iz3NNFkqyUzfGUKAY7w@mail.gmail.com \
--to=manolis.tsamis@vrull.eu \
--cc=apinski@marvell.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=philipp.tomsich@vrull.eu \
--cc=rdapp@linux.ibm.com \
--cc=richard.sandiford@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).