public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Tamar Christina <Tamar.Christina@arm.com>,  nd <nd@arm.com>,
	 "rguenther\@suse.de" <rguenther@suse.de>,
	 "jlaw\@ventanamicro.com" <jlaw@ventanamicro.com>
Subject: Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]
Date: Mon, 27 Feb 2023 21:33:02 +0000	[thread overview]
Message-ID: <mptcz5upze9.fsf@arm.com> (raw)
In-Reply-To: <VI1PR08MB5325BCA53FA5F1C80A003843FFAF9@VI1PR08MB5325.eurprd08.prod.outlook.com> (Tamar Christina via Gcc-patches's message of "Mon, 27 Feb 2023 12:14:05 +0000")

Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> -----Original Message-----
>> From: Richard Sandiford <richard.sandiford@arm.com>
>> Sent: Monday, February 27, 2023 12:12 PM
>> To: Tamar Christina <Tamar.Christina@arm.com>
>> Cc: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>; nd
>> <nd@arm.com>; rguenther@suse.de; jlaw@ventanamicro.com
>> Subject: Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask
>> by using new optabs [PR108583]
>> 
>> Tamar Christina <Tamar.Christina@arm.com> writes:
>> > Hi,
>> >
>> >> > I avoided open coding it with add and shift because it creates a 4
>> >> > instructions (and shifts which are typically slow) dependency chain
>> >> > instead of a load and multiply.  This change, unless the target is
>> >> > known to optimize it further is unlikely to be beneficial.  And by
>> >> > the time we get to costing the only alternative is to undo the
>> >> > existing pattern and
>> >> so you lose the general shift optimization.
>> >> >
>> >> > So it seemed unwise to open code as shifts, given the codegen out
>> >> > of the vectorizer would be degenerate for most targets or one needs
>> >> > the more complicated route of costing during pattern matching already.
>> >>
>> >> Hmm, OK.  That seems like a cost-model thing though, rather than
>> >> something that should be exposed through optabs.  And I imagine the
>> >> open-coded version would still be better than nothing on targets without
>> highpart multiply.
>> >>
>> >> So how about replacing the hook with one that simply asks whether
>> >> division through highpart multiplication is preferred over the add/shift
>> sequence?
>> >> (Unfortunately it's not going to be possible to work that out from
>> >> existing
>> >> information.)
>> >
>> > So this doesn't work for SVE.  For SVE the multiplication widening
>> > pass introduces FMAs at gimple level.  So in the cases where the
>> > operation is fed from a widening multiplication we end up generating FMA.
>> If that was it I could have matched FMA.
>> >
>> > But it also pushes the multiplication in the second operand because it
>> > no longer has a mul to share the results with.
>> >
>> > In any case, the gimple code is transformed into
>> >
>> > vect__3.8_122 = .MASK_LOAD (_29, 8B, loop_mask_121);
>> > vect_patt_57.9_123 = (vector([8,8]) unsigned short) vect__3.8_122;
>> > vect_patt_64.11_127 = .FMA (vect_patt_57.9_123, vect_cst__124, { 257,
>> > ... });
>> > vect_patt_65.12_128 = vect_patt_64.11_127 >> 8;
>> > vect_patt_66.13_129 = .FMA (vect_patt_57.9_123, vect_cst__124,
>> > vect_patt_65.12_128);
>> > vect_patt_62.14_130 = vect_patt_66.13_129 >> 8;
>> > vect_patt_68.15_131 = (vector([8,8]) unsigned charD.21)
>> > vect_patt_62.14_130;
>> >
>> > This transformation is much worse than the original code, it extended
>> > the dependency chain with another expensive instruction. I can try to
>> > correct this in RTL by matching FMA and shift and splitting into MUL +
>> ADDHNB and hope CSE takes care of the extra mul.
>> >
>> > But this seems like a hack, and it's basically undoing the earlier
>> > transformation.  It seems to me that the open coding is a bad idea.
>> 
>> Could you post the patch that gives this result?  I'll have a poke around.
>
> Sure, I'll post the new series, it needs all of them.

Thanks.  Which testcase did you use to get the above?

But since SVE does have highpart multiply, and since the assumption for
SVE is that MULH+shift is better than ADD*3+shift*2, shouldn't SVE just
be one of the targets for which the hook that "asks whether division
through highpart multiplication is preferred over the add/shift
sequence" returns true?

For extra conservativeness, we could make the hook default to true
and explicitly return false for Advanced SIMD and for SVE2.

Richard

  reply	other threads:[~2023-02-27 21:33 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-09 17:16 Tamar Christina
2023-02-09 17:22 ` [PATCH 2/2]AArch64 Update div-bitmask to implement new optab instead of target hook [PR108583] Tamar Christina
2023-02-10 10:35   ` Tamar Christina
2023-02-10 14:10   ` Richard Sandiford
2023-02-10 10:34 ` [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583] Tamar Christina
2023-02-10 13:13 ` Richard Biener
2023-02-10 13:36 ` Richard Sandiford
2023-02-10 13:52   ` Richard Biener
2023-02-10 14:13   ` Tamar Christina
2023-02-10 14:30     ` Richard Sandiford
2023-02-10 14:54       ` Tamar Christina
2023-02-27 11:09       ` Tamar Christina
2023-02-27 12:11         ` Richard Sandiford
2023-02-27 12:14           ` Tamar Christina
2023-02-27 21:33             ` Richard Sandiford [this message]
2023-02-27 22:10               ` Tamar Christina
2023-02-28 11:08                 ` Richard Sandiford
2023-02-28 11:12                   ` Tamar Christina
2023-02-28 12:03                     ` Richard Sandiford
2023-03-01 11:30                       ` Richard Biener
2023-02-10 15:56     ` Richard Sandiford
2023-02-10 16:09       ` Tamar Christina
2023-02-10 16:25         ` Richard Sandiford
2023-02-10 16:33           ` Tamar Christina
2023-02-10 16:57             ` Richard Sandiford
2023-02-10 17:01               ` Richard Sandiford
2023-02-10 17:14               ` Tamar Christina
2023-02-10 18:12                 ` Richard Sandiford
2023-02-10 18:34                   ` Richard Biener
2023-02-10 20:58                     ` Andrew MacLeod
2023-02-13  9:54                       ` Tamar Christina
2023-02-15 12:51                         ` Tamar Christina
2023-02-15 16:05                           ` Andrew MacLeod
2023-02-15 17:13                             ` Tamar Christina
2023-02-15 17:50                               ` Andrew MacLeod
2023-02-15 18:42                                 ` Andrew MacLeod
2023-02-22 12:51                                   ` Tamar Christina
2023-02-22 16:41                                   ` Andrew MacLeod
2023-02-22 18:03                                     ` Tamar Christina
2023-02-22 18:33                                       ` Andrew MacLeod
2023-02-23  8:36                                         ` Tamar Christina
2023-02-23 16:39                                           ` Andrew MacLeod
2023-02-23 16:56                                             ` Tamar Christina
2023-03-01 16:57                                             ` Andrew Carlotti
2023-03-01 18:16                                               ` Tamar Christina
2023-02-22 13:06                                 ` Tamar Christina
2023-02-22 15:19                                   ` Andrew MacLeod

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptcz5upze9.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=Tamar.Christina@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jlaw@ventanamicro.com \
    --cc=nd@arm.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).