public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Hongtao Liu <crazylht@gmail.com>
Cc: Tamar Christina <Tamar.Christina@arm.com>,
	 Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>,
	 nd <nd@arm.com>,  "rguenther\@suse.de" <rguenther@suse.de>
Subject: Re: [PATCH 3/8]middle-end: Support extractions of subvectors from arbitrary element position inside a vector
Date: Thu, 17 Nov 2022 13:59:34 +0000	[thread overview]
Message-ID: <mptr0y1fzqx.fsf@arm.com> (raw)
In-Reply-To: <CAMZc-bwfCMpgAc1ta8t=4WQLje8y0sX3Fhd-SY4=GD1swgUGUg@mail.gmail.com> (Hongtao Liu's message of "Thu, 17 Nov 2022 18:20:29 +0800")

Hongtao Liu <crazylht@gmail.com> writes:
> On Thu, Nov 17, 2022 at 5:39 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Hongtao Liu <crazylht@gmail.com> writes:
>> > On Wed, Nov 16, 2022 at 1:39 AM Richard Sandiford
>> > <richard.sandiford@arm.com> wrote:
>> >>
>> >> Tamar Christina <Tamar.Christina@arm.com> writes:
>> >> >> -----Original Message-----
>> >> >> From: Hongtao Liu <crazylht@gmail.com>
>> >> >> Sent: Tuesday, November 15, 2022 9:37 AM
>> >> >> To: Tamar Christina <Tamar.Christina@arm.com>
>> >> >> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina via
>> >> >> Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>;
>> >> >> rguenther@suse.de
>> >> >> Subject: Re: [PATCH 3/8]middle-end: Support extractions of subvectors from
>> >> >> arbitrary element position inside a vector
>> >> >>
>> >> >> On Tue, Nov 15, 2022 at 4:51 PM Tamar Christina
>> >> >> <Tamar.Christina@arm.com> wrote:
>> >> >> >
>> >> >> > > -----Original Message-----
>> >> >> > > From: Hongtao Liu <crazylht@gmail.com>
>> >> >> > > Sent: Tuesday, November 15, 2022 8:36 AM
>> >> >> > > To: Tamar Christina <Tamar.Christina@arm.com>
>> >> >> > > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
>> >> >> > > via Gcc-patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>;
>> >> >> > > rguenther@suse.de
>> >> >> > > Subject: Re: [PATCH 3/8]middle-end: Support extractions of
>> >> >> > > subvectors from arbitrary element position inside a vector
>> >> >> > >
>> >> >> > > Hi:
>> >> >> > >   I'm from https://gcc.gnu.org/pipermail/gcc-patches/2022-
>> >> >> > > November/606040.html.
>> >> >> > > >      }
>> >> >> > > >
>> >> >> > > >    /* See if we can get a better vector mode before extracting.
>> >> >> > > > */ diff --git a/gcc/optabs.cc b/gcc/optabs.cc index
>> >> >> > > >
>> >> >> > >
>> >> >> cff37ccb0dfc3dd79b97d0abfd872f340855dc96..f338df410265dfe55b68961600
>> >> >> > > 9
>> >> >> > > 0
>> >> >> > > > a453cc6a28d9 100644
>> >> >> > > > --- a/gcc/optabs.cc
>> >> >> > > > +++ b/gcc/optabs.cc
>> >> >> > > > @@ -6267,6 +6267,7 @@ expand_vec_perm_const (machine_mode
>> >> >> mode,
>> >> >> > > rtx v0, rtx v1,
>> >> >> > > >        v0_qi = gen_lowpart (qimode, v0);
>> >> >> > > >        v1_qi = gen_lowpart (qimode, v1);
>> >> >> > > >        if (targetm.vectorize.vec_perm_const != NULL
>> >> >> > > > +         && targetm.can_change_mode_class (mode, qimode,
>> >> >> > > > + ALL_REGS)
>> >> >> > > It looks like you want to guard gen_lowpart, shouldn't it be better
>> >> >> > > to use validate_subreg  or (tmp = gen_lowpart_if_possible (mode,
>> >> >> target_qi)).
>> >> >> > > IMHO, targetm.can_change_mode_class is mostly used for RA, but not
>> >> >> > > to guard gen_lowpart.
>> >> >> >
>> >> >> > Hmm I don't think this is quite true, there are existing usages in
>> >> >> > expr.cc and rtanal.cc That do this and aren't part of RA.  As I
>> >> >> > mentioned before for instance the canoncalization of vec_select to subreg
>> >> >> in rtlanal for instances uses this.
>> >> >> In theory, we need to iterate through all reg classes that can be assigned for
>> >> >> both qimode and mode, if any regclass returns true for
>> >> >> targetm.can_change_mode_class, the bitcast(validate_subreg) should be ok.
>> >> >> Here we just passed ALL_REGS.
>> >> >
>> >> > Yes, and most targets where this transformation is valid return true here.
>> >> >
>> >> > I've checked:
>> >> >  * alpha
>> >> >  * arm
>> >> >  * aarch64
>> >> >  * rs6000
>> >> >  * s390
>> >> >  * sparc
>> >> >  * pa
>> >> >  * mips
>> >> >
>> >> > And even the default example that other targets use from the documentation
>> >> > would return true as the size of the modes are the same.
>> >> >
>> >> > X86 and RISCV are the only two targets that I found (but didn't check all) that
>> >> > blankly return a result based on just the register classes.
>> >> >
>> >> > That is to say, there are more targets that adhere to the interpretation that
>> >> > rclass here means "should be possible in some class in rclass" rather than
>> >> > "should be possible in ALL classes of rclass".
>> >>
>> >> Yeah, I agree.  A query "can something stored in ALL_REGS change from
>> >> mode M1 to mode M2?" is meaningful if at least one register R in ALL_REGS
>> >> can hold both M1 and M2.  It's then the target's job to answer
>> >> conservatively so that the result covers all such R.
>> >>
>> >> In principle it's OK for a target to err on the side of caution and forbid
>> >> things that are actually OK.  But that's going to risk losing performance
>> >> in some cases, and sometimes that loss of performance will be unacceptable.
>> >> IMO that's what's happening here.  The target is applying x87 rules to
>> >> things that (AIUI) are never stored in x87 registers, and so losing
>> > Yes, it can be optimized since some mode will never assigned to x87 registers.
>> >> performance as a result.
>> >>
>> >> Note that the RA also uses ALL_REGS for some things, so this usage
>> >> isn't specific to non-RA code.
>> > RA passes the minimal reg class(REGNO_REG_CLASS) which contains REGN
>> > to decide if can_change_mode_class, not ALL_REGS.
>> > 511/* Given a hard REGN a FROM mode and a TO mode, return true if
>> > 512   REGN can change from mode FROM to mode TO.  */
>> > 513#define REG_CAN_CHANGE_MODE_P(REGN, FROM, TO)                          \
>> > 514  (targetm.can_change_mode_class (FROM, TO, REGNO_REG_CLASS (REGN)))
>> > 515
>> >
>> > So I still think using can_change_mode_class outside of RA with
>> > ALL_REGS passed to decide whether it's ok to generate subreg is not a
>> > good idea.
>>
>> But if the argument is that the only valid uses of can_change_mode_class
>> are through this macro, the hook isn't describing a class property,
>> it's describing the property of individual registers.  If we say that
>> querying individual registers is the only valid thing to do them
>> we should change the hook to take a register number rather than
>> a class enum.
>>
>> The reason we have a class-based rather than register-based interface
>> is because it is useful to query classes before you've picked a
>> specific register.
> For individual registers in the minimal reg class, we assume they are
> not different from each other, I guess that's why we have
> REGNO_REG_CLASS and class-based interfaces other than register-based
> interfaces.

I don't think that's necessarily true.  We have things like
hard_regno_nregs that operate on individual registers.  And even
the x86 implementation of the hook uses subset operations rather
than comparing the class for equality, which suggests some attempt
to handle classes other than those returned by REGNO_REG_CLASS.

> But for ALL_REGS, it's not the minimal reg class, it's the largest.
> Using it It's not that suitable.

But I think it is suitable for the gimple level, where we have no
information that would narrow the choice to a specific register class.

> If the argument is if some r in rclass is ok for mode change, the hook
> would return true, then why would RA use REGNO_REG_CLASS other than
> ALL_REGS.

If you know which hard register something is stored in, it makes
sense to ask about the closest enclosing class rather than something
more general.  If a particular mode can be stored in both general
registers and floating-point registers, and if floating-point registers
have restrictions that general registers don't, ALL_REGS should honour
the floating-point constraints.  But it wouldn't make sense to use the
floating-point constraints for something that is known to be in a
general register.

> Another spot is in validate_subreg, we're using the minimal reg class
> instead of ALL_REGS.
>  973  /* This is a normal subreg.  Verify that the offset is representable.  */
>  974
>  975  /* For hard registers, we already have most of these rules collected in
>  976     subreg_offset_representable_p.  */
>  977  if (reg && REG_P (reg) && HARD_REGISTER_P (reg))
>  978    {
>  979      unsigned int regno = REGNO (reg);
>  980
>  981      if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
>  982          && GET_MODE_INNER (imode) == omode)
>  983        ;
>  984      else if (!REG_CAN_CHANGE_MODE_P (regno, imode, omode))
>  985        return false;

But here too, we're using REG_CAN_CHANGE_MODE_P because we know
the register number.  It wouldn't make sense to ask about a more
general class than necessary.

REG_CANNOT_CHANGE_MODE_P (as it was then) was added in 2002 as a
convenient interface to CANNOT_CHANGE_MODE_CLASS.  CANNOT_CHANGE_MODE_CLASS
in turn replaced CLASS_CANNOT_CHANGE_MODE_P, which only took two modes,
and wasn't given any class information.  So this interface was
originally a query about modes, not a query about classes.  The class
information was added later since (understandably) modes weren't always
enough on their own.  But I think it is still fundamentally a query
about modes, with the class provided for context, rather than a query
about classes, with modes provided by context.

> I think we do need some hook in the middle end to query things like if
> some r in rclass is ok for mode change?  but not reusing
> can_change_mode_class.

But if we add a hook to say "are mode changes from mode M1 to mode M2 OK?",
which is what Tamar's patch and some other existing code wants to know,
I fear we'll just reintroduce the old CLASS_CANNOT_CHANGE_MODE_P (but
hopefully without embedding the negative sense).  I don't think it makes
sense to have that hook alongside the existing one.  It would require
targets to duplicate information and would run the risk on conflicting
information for corner cases.  IMO it would repeat the mistake of having
both hard_regno_nregs and class_max_nregs; really, the latter ought
to be calculated from the former.

Thanks,
Richard

>> Thanks,
>> Richard
>>
>> >> IMO it's not the job of target-independent code to iterate through
>> >> individual classes and aggregate the result.  One of the reasons for
>> >> having union classes is to avoid the need to do that.  And ALL_REGS
>> >> is the ultimate union class. :-)
>> >>
>> >> The patch looks correct to me.
>> >>
>> >> Thanks,
>> >> Richard
>> >>
>> >> >> >
>> >> >> > So there are already existing precedence for this.  And the
>> >> >> > documentation for the hook says:
>> >> >> >
>> >> >> > "This hook returns true if it is possible to bitcast values held in registers of
>> >> >> class rclass from mode from to mode to and if doing so preserves the low-
>> >> >> order bits that are common to both modes. The result is only meaningful if
>> >> >> rclass has registers that can hold both from and to. The default
>> >> >> implementation returns true"
>> >> >> >
>> >> >> > So it looks like it's use outside of RA is perfectly valid.. and the
>> >> >> > documentation also mentions in the example the use from the mid-end as
>> >> >> an example.
>> >> >> >
>> >> >> > But if the mid-end maintainers are happy I'll use something else.
>> >> >> >
>> >> >> > Tamar
>> >> >> >
>> >> >> > > I did similar things in
>> >> >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579296.html
>> >> >> > > (and ALL_REGS doesn't cover all cases for registers which are both
>> >> >> > > available for qimode and mode, ALL_REGS fail doesn't mean it can't
>> >> >> > > be subreg, it just means parts of ALL_REGS can't be subreg. but with
>> >> >> > > a subset of ALL_REGS, there could be a reg class which return true
>> >> >> > > for
>> >> >> > > targetm.can_change_mode_class)
>> >> >> > > >           && targetm.vectorize.vec_perm_const (qimode, qimode,
>> >> >> > > > target_qi,
>> >> >> > > v0_qi,
>> >> >> > > >                                                v1_qi, qimode_indices))
>> >> >> > > >         return gen_lowpart (mode, target_qi); @@ -6311,7 +6312,8
>> >> >> > > > @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1,
>> >> >> > > >      }
>> >> >> > > >
>> >> >> > > >    if (qimode != VOIDmode
>> >> >> > > > -      && selector_fits_mode_p (qimode, qimode_indices))
>> >> >> > > > +      && selector_fits_mode_p (qimode, qimode_indices)
>> >> >> > > > +      && targetm.can_change_mode_class (mode, qimode, ALL_REGS))
>> >> >> > > >      {
>> >> >> > > >        icode = direct_optab_handler (vec_perm_optab, qimode);
>> >> >> > > >        if (icode != CODE_FOR_nothing) diff --git
>> >> >> > > > a/gcc/testsuite/gcc.target/aarch64/ext_1.c
>> >> >> > > > b/gcc/testsuite/gcc.target/aarch64/ext_1.c
>> >> >> > > > new file mode 100644
>> >> >> > > > index
>> >> >> > > >
>> >> >> > >
>> >> >> 0000000000000000000000000000000000000000..18a10a14f1161584267a8472e5
>> >> >> > > 71
>> >> >> > > > b3bc2ddf887a
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > > --
>> >> >> > > BR,
>> >> >> > > Hongtao
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> BR,
>> >> >> Hongtao

  reply	other threads:[~2022-11-17 13:59 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-31 11:56 [PATCH 1/8]middle-end: Recognize scalar reductions from bitfields and array_refs Tamar Christina
2022-10-31 11:57 ` [PATCH 2/8]middle-end: Recognize scalar widening reductions Tamar Christina
2022-10-31 21:42   ` Jeff Law
2022-11-07 13:21   ` Richard Biener
2022-10-31 11:57 ` [PATCH 3/8]middle-end: Support extractions of subvectors from arbitrary element position inside a vector Tamar Christina
2022-10-31 21:44   ` Jeff Law
2022-11-01 14:25   ` Richard Sandiford
2022-11-11 14:33     ` Tamar Christina
2022-11-15  8:35       ` Hongtao Liu
2022-11-15  8:51         ` Tamar Christina
2022-11-15  9:37           ` Hongtao Liu
2022-11-15 10:00             ` Tamar Christina
2022-11-15 17:39               ` Richard Sandiford
2022-11-17  8:04                 ` Hongtao Liu
2022-11-17  9:39                   ` Richard Sandiford
2022-11-17 10:20                     ` Hongtao Liu
2022-11-17 13:59                       ` Richard Sandiford [this message]
2022-11-18  2:31                         ` Hongtao Liu
2022-11-18  9:16                           ` Richard Sandiford
2022-10-31 11:58 ` [PATCH 4/8]AArch64 aarch64: Implement widening reduction patterns Tamar Christina
2022-11-01 14:41   ` Richard Sandiford
2022-10-31 11:58 ` [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable Tamar Christina
2022-11-01 14:58   ` Richard Sandiford
2022-11-01 15:11     ` Tamar Christina
2022-11-11 14:39     ` Tamar Christina
2022-11-22 16:01       ` Tamar Christina
2022-11-30  4:26         ` Tamar Christina
2022-12-06 10:28       ` Richard Sandiford
2022-12-06 10:58         ` Tamar Christina
2022-12-06 11:05           ` Richard Sandiford
2022-10-31 11:59 ` [PATCH 6/8]AArch64: Add peephole and scheduling logic for pairwise operations that appear late in RTL Tamar Christina
2022-10-31 11:59 ` [PATCH 7/8]AArch64: Consolidate zero and sign extension patterns and add missing ones Tamar Christina
2022-11-30  4:28   ` Tamar Christina
2022-12-06 15:59   ` Richard Sandiford
2022-10-31 12:00 ` [PATCH 8/8]AArch64: Have reload not choose to do add on the scalar side if both values exist on the SIMD side Tamar Christina
2022-11-01 15:04   ` Richard Sandiford
2022-11-01 15:20     ` Tamar Christina
2022-10-31 21:41 ` [PATCH 1/8]middle-end: Recognize scalar reductions from bitfields and array_refs Jeff Law
2022-11-05 11:32 ` Richard Biener
2022-11-07  7:16   ` Tamar Christina
2022-11-07 10:17     ` Richard Biener
2022-11-07 11:00       ` Tamar Christina
2022-11-07 11:22         ` Richard Biener
2022-11-07 11:56           ` Tamar Christina
2022-11-22 10:36             ` Richard Sandiford
2022-11-22 10:58               ` Richard Biener
2022-11-22 11:02                 ` Tamar Christina
2022-11-22 11:06                   ` Richard Sandiford
2022-11-22 11:08                     ` Richard Biener
2022-11-22 14:33                       ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptr0y1fzqx.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=Tamar.Christina@arm.com \
    --cc=crazylht@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=nd@arm.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).