public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Tamar Christina <Tamar.Christina@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>,  "ook@ucw.cz" <ook@ucw.cz>
Subject: RE: [PATCH v2 5/16]middle-end: Add shared machinery for matching patterns involving complex numbers.
Date: Tue, 29 Sep 2020 08:32:41 +0200 (CEST)	[thread overview]
Message-ID: <nycvar.YFH.7.76.2009290811400.10073@p653.nepu.fhfr.qr> (raw)
In-Reply-To: <VI1PR08MB532533E72CE4F57296C03C80FF350@VI1PR08MB5325.eurprd08.prod.outlook.com>

On Mon, 28 Sep 2020, Tamar Christina wrote:

> Hi Richi,
> 
> > -----Original Message-----
> > From: rguenther@c653.arch.suse.de <rguenther@c653.arch.suse.de> On
> > Behalf Of Richard Biener
> > Sent: Monday, September 28, 2020 2:22 PM
> > To: Tamar Christina <Tamar.Christina@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; ook@ucw.cz
> > Subject: Re: [PATCH v2 5/16]middle-end: Add shared machinery for matching
> > patterns involving complex numbers.
> > 
> > On Fri, 25 Sep 2020, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > This patch adds shared machinery for detecting patterns having to do
> > > with complex number operations.  The class ComplexPattern provides
> > > helpers for matching and ultimately undoing the permutation in the
> > > tree by rebuilding the graph.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > 
> > I think you want to change all this to not look at individual
> > stmts:
> > 
> > +    vect_match_expression_p (slp_tree node, tree_code code, int base,
> > + int
> > idx,
> > +                            stmt_vec_info *op1, stmt_vec_info *op2)
> > +    {
> > +
> > +      vec<stmt_vec_info> scalar_stmts = SLP_TREE_SCALAR_STMTS (node);
> > +
> > 
> > _all_ lanes are supposed to match the operation in
> > SLP_TREE_REPRESENTATIVE there's no need to do any operation-matching
> > on lane granularity.
> > 
> 
> That's fair, that flexibility seems like it indeed won't work since the statements are vectorized based on
> SLP_TREE_REPRESENTATIVE anyway. So I'll simplify it.
> 
> > The only thing making a difference here is VEC_PERM_EXPR nodes where
> > again - there's no need to look at (eventually non-existant!)
> > SLP_TREE_SCALAR_STMTS but its SLP_TREE_REPRESENTATIVE.
> > 
> > +      vec<slp_tree> children = SLP_TREE_CHILDREN (node);
> > +
> > +      /* If it's a VEC_PERM_EXPR we need to look one deeper.
> > VEC_PERM_EXPR
> > +        only have one entry.  So pick on.  */
> > +      if (node->code == VEC_PERM_EXPR)
> > +       children = SLP_TREE_CHILDREN (children.last ());
> > 
> > that's too simplistic ;)
> > 
> > +         *op1 = SLP_TREE_SCALAR_STMTS (children[0])[n];
> > 
> > please make op1,op2 slp_tree, not stmt_vec_info.
> > 
> > Where do you look at SLP_TREE_LANE_PERMUTATION at all?  I think the
> > result of
> > 
> 
> Here I have to admit that I have/am a bit confused as to the relation 
> between the different permute fields. LOAD permute is quite straight 
> forward, LANE permute I think are shuffles/explicit permutes.
> 
> But then I am lost as to the purpose of a VEC_PERM_EXPR nodes. Is it 
> just a marker to indicate that some node below has a load permute 
> somewhere?

I've introduced VEC_PERM_EXPR for the lack of an isolated way to
perform a permute operation.  Before we've only had the chance
to specify a permutation at loads.  I'm still working on the
load side - one common optimization there for example is to
move a common load permutation across isomorphic operations,
like a plus of two same permuted loads can be done as a
permute of the plus of two unpermuted loads.

Now, the way VEC_PERM SLP nodes are designed also allows them
to be used to mediate between SLP sub-graphs using for example
different vector sizes.  That's because they semantically represent
a concat of the VEC_PERM children lanes and then applying a
selection mask as to which lanes to output.  The comment of
vectorizable_slp_permutation has the most detail here
(it doesn't yet fully implement all cases since VEC_PERM is
currently only used for the two_operators case).

The SLP_TREE_LANE_PERMUTATION specifies { child-nr, lane-nr }
pairs (so it enumerates the 'scalar' output lanes).

I think for all complex ops you are looking for
{ { 0, 2*i + 1 }, { 0, 2*i }, ... } or { { 0, 0 }, { 0, 1 }, .. }
kind of vectors, blended according to mix plus/minus appropriately?

> > +    vect_detect_pair_op (int base, slp_tree node1, int offset1,
> > + slp_tree
> > node2,
> > +                        int offset2, vec<stmt_vec_info> *ops)
> > 
> > could be simply the SLP_TREE_LANE_PERMUTATION? plus its two child
> > nodes?
> 
> Right, if I understood correctly, on the two_operands case the lane 
> permute would tell me whether it's + or -, and in the case of the non- 
> two_operands cases I probably want to check that it's vNULL since any 
> permute in the order changes how the instruction accepts the inputs?

In the vectorized code you can think of the permute nodes producing
a vector permute instruction, shuffling lanes around.  If that changes
any of the semantics then you need to see how to deal with it.

> > 
> > In the ComplexAddPattern patch I see
> > 
> > +      /* Correct the arguments after matching.  */
> > +      std::swap (this->m_vects[1], this->m_vects[3]);
> > 
> > how's that necessary?  The replacement SLP node should end up with just a
> > SLP_TREE_REPRESENTATIVE (the IFN function call).
> > That is, the only thing necessary is verification / matching of the appropriate
> > "interleaving" scheme imposed by SLP_TREE_LANE_PERMUTATION.
> 
> But the order or the statements are important as those decide the 
> LOAD_PERMUTATION that build_slp_tree creates.
> 
> So in this case the original statement is
> 
>        stmt 0 _39 = _37 + _12;
>        stmt 1 _6 = _38 - _36;
> 
> {_12, _36} result in a LOAD_PERMUTATION of {1, 0} because of how the 
> addition is done. So to undo the LOAD_PERMUTE it has to build the new 
> children using
> 
>        stmt 0 _39 = {_37, _36};
>        stmt 1 _6 = {_38, _12};
> 
> So the scalar statements are purely a re-ordering to get it to rebuild 
> the children correctly.

Hmm, but you're looking at the permute node here which has the
add and the subtract as children.  Only that eventually has
{_12, _36} as child which may be a load containing a permute.
If you need {_12, _36} swapped you shoulnd't change the scalar
stmts array in some (which?) SLP node - instead you could
insert a VEC_PERM SLP node applying the permute (the not yet
written phase to optimize permutes should then eventually merge
this with a LOAD_PERMUTATION).

> Now ADD is simplistic, the more complicated cases like MUL and FMA use 
> this property to Also rebuild the tree removing unneeded statements.  
> This method returns these and stores them in the PatternMatch class, so 
> I don't have to ask for them again later when replacing the nodes.  
> Even for SLP_TREE_REPRESENTATIVE don't the arguments in the IFN call 
> need to be in such way that they both in the same place in the load 
> chain?

The GIMPLE stmts just serve as a "pattern" for the actual operation,
the vectorized stmts are now constructed entirely from the SLP
node where for the case of the IFN call the first argument for the 
vectorized IFN is picked from its SLP node first childs vectorized
defs.  That's as opposed to what we needed to do in the past where
we tried to match up the scalar stmts operand with the appropriate
scalar lane def.

> > I guess things would go wrong if instead of effectively blending two 
> > vectors we'd interleave two smaller vector type vectors?  Say a 4-way 
> > add and a 4- way sub interleaved into a 8-way addsub, using V4SF and 
> > V8SF vectors?
> > 
> 
> At this stage yes, most likely, but it should be rejected at the 
> validate level.
> 
> What is also currently rejected is when some of the definitions are 
> external, which I think I should be able to handle. I'll fix that up 
> with the rest of the changes.
> 
> 
> Thanks for the feedback!

You're welcome.

Richard.

> Regards,
> Tamar
> 
> > Going to stop looking at the series at this point, one further note is that all of
> > the Complex*Patterns seem to share the initial match and thus redundantly
> > do a vect_detect_pair_op repeatedly on each node for each pattern?  I
> > wonder if it could be commoned into a single pattern, they all seem to share
> > the initial mixed plus/minus, then we have the multiplication on one or both
> > operand cases.
> > 
> > Richard.
> > 
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > 	* tree-vect-slp-patterns.c (complex_operation_t,class
> > ComplexPattern):
> > > 	New.
> > >
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > Nuernberg, Germany; GF: Felix Imend
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

  reply	other threads:[~2020-09-29  6:32 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-25 14:28 Tamar Christina
2020-09-28 13:21 ` Richard Biener
2020-09-28 16:06   ` Tamar Christina
2020-09-29  6:32     ` Richard Biener [this message]
2020-11-03 15:06     ` Tamar Christina

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.76.2009290811400.10073@p653.nepu.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=Tamar.Christina@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=nd@arm.com \
    --cc=ook@ucw.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).