public inbox for gcc-patches@gcc.gnu.org
From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: Richard Biener <richard.guenther@gmail.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>
Subject: RE: [PATCH 1/8]middle-end: Recognize scalar reductions from bitfields and array_refs
Date: Mon, 7 Nov 2022 11:00:36 +0000	[thread overview]
Message-ID: <VI1PR08MB5325DE188F22EC1611452483FF3C9@VI1PR08MB5325.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2211071002260.4294@jbgna.fhfr.qr>

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Monday, November 7, 2022 10:18 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Biener <richard.guenther@gmail.com>; gcc-
> patches@gcc.gnu.org; nd <nd@arm.com>
> Subject: RE: [PATCH 1/8]middle-end: Recognize scalar reductions from
> bitfields and array_refs
> 
> On Mon, 7 Nov 2022, Tamar Christina wrote:
> 
> > > -----Original Message-----
> > > From: Richard Biener <richard.guenther@gmail.com>
> > > Sent: Saturday, November 5, 2022 11:33 AM
> > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; rguenther@suse.de
> > > Subject: Re: [PATCH 1/8]middle-end: Recognize scalar reductions from
> > > bitfields and array_refs
> > >
> > > On Mon, Oct 31, 2022 at 1:00 PM Tamar Christina via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > >
> > > > Hi All,
> > > >
> > > > This patch series is to add recognition of pairwise operations
> > > > (reductions) in match.pd such that we can benefit from them even
> > > > at
> > > > -O1 when the vectorizer isn't enabled.
> > > >
> > > > The use of these allows for much simpler codegen on AArch64 and
> > > > lets us avoid quite a lot of codegen warts.
> > > >
> > > > As an example a simple:
> > > >
> > > > typedef float v4sf __attribute__((vector_size (16)));
> > > >
> > > > float
> > > > foo3 (v4sf x)
> > > > {
> > > >   return x[1] + x[2];
> > > > }
> > > >
> > > > currently generates:
> > > >
> > > > foo3:
> > > >         dup     s1, v0.s[1]
> > > >         dup     s0, v0.s[2]
> > > >         fadd    s0, s1, s0
> > > >         ret
> > > >
> > > > while with this patch series now generates:
> > > >
> > > > foo3:
> > > >         ext     v0.16b, v0.16b, v0.16b, #4
> > > >         faddp   s0, v0.2s
> > > >         ret
> > > >
> > > > This patch will not perform the operation if the source is not a
> > > > gimple register and leaves memory sources to the vectorizer as
> > > > it's able to deal correctly with clobbers.
> > >
> > > But the vectorizer should also be able to cope with the above.
> >
> > There are several problems with leaving it up to the vectorizer to do:
> >
> > 1. We only get it at -O2 and higher.
> > 2. The way the vectorizer costs the reduction makes the resulting cost
> always too high for AArch64.
> >
> > As an example the following:
> >
> > typedef unsigned int u32v4 __attribute__((vector_size(16)));
> >
> > unsigned int f (u32v4 a, u32v4 b)
> > {
> >     return a[0] + a[1];
> > }
> >
> > Doesn't get SLP'ed because the vectorizer costs it as:
> >
> > node 0x485eb30 0 times vec_perm costs 0 in body
> > _1 + _2 1 times vector_stmt costs 1 in body
> > _1 + _2 1 times vec_perm costs 2 in body
> > _1 + _2 1 times vec_to_scalar costs 2 in body
> >
> > And so ultimately you fail because:
> >
> > /app/example.c:8:17: note: Cost model analysis for part in loop 0:
> >   Vector cost: 5
> >   Scalar cost: 3
> >
> > This looks like it's because the vectorizer costs the operation that
> > creates the BIT_FIELD_REF <a_3(D), 64, 0> for the reduction as
> > requiring two scalar extracts and a permute.  While it ultimately does
> > produce a BIT_FIELD_REF <a_3(D), 64, 0>, that's not what it costs.
> >
> > This causes the reduction to almost always be more expensive, so
> > unless the rest of the SLP tree amortizes the cost we never generate them.
> 
> On x86 for example the hadds are prohibitively expensive here.  Are you sure
> the horizontal add is actually profitable on arm?  Your pattern-matching has
> no cost modeling at all?

Yes, they are dirt cheap; that's why we use them for a lot of our codegen,
e.g. for compressing values.
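
As a rough illustration (not a claim about the exact assembly under any
particular set of flags), this is the kind of whole-vector reduction where we
rely on the pairwise/across-lane adds being cheap:

/* Illustrative only: a full horizontal sum; the exact instruction
   selection depends on the target and options used.  */
typedef unsigned int u32v4 __attribute__((vector_size (16)));

unsigned int
sum4 (u32v4 x)
{
  return (x[0] + x[1]) + (x[2] + x[3]);
}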

> 
> > 3. The SLP only happens on operations that are SLP shaped and where SLP
> > didn't fail.
> >
> > As a simple example, the vectorizer can't SLP the following:
> >
> > unsigned int f (u32v4 a, u32v4 b)
> > {
> >     a[0] += b[0];
> >     return a[0] + a[1];
> > }
> >
> > Because there's not enough VF here and it can't unroll. This and many
> > others fail because they're not SLP-able operations, or SLP build fails.
> 
> That's of course because the pattern matching for reductions is too simple
> here, getting us a group size of three.  Bad association would make your
> simple pattern matching fail as well.
> 
> > This causes us to generate for e.g. this example:
> >
> > f:
> >         dup     s2, v0.s[1]
> >         fmov    w1, s1
> >         add     v0.2s, v2.2s, v0.2s
> >         fmov    w0, s0
> >         add     w0, w0, w1
> >         ret
> >
> > instead of with my patch:
> >
> > f:
> >         addp    v0.2s, v0.2s, v0.2s
> >         add     v0.2s, v0.2s, v1.2s
> >         fmov    w0, s0
> >         ret
> >
> > which is significantly better code.  So I don't think the vectorizer is the right
> solution for this.
> 
> Simple pattern matching isn't either.  In fact basic-block SLP is supposed to be
> the advanced pattern matching including a cost model.

The cost model seems a bit moot here, at least on AArch64.  There is no sequence
of events that would make these pairwise operations more expensive than the
alternative, which is to do a vector extraction and cross the register file to do a
simple scalar addition.

And in fact the ISA classifies these instructions as scalar, not vector, and it doesn't
seem right to need the vectorizer for something that's basic codegen.
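
For reference, what the patch does for the foo3 example above at the GIMPLE
level is roughly the following (hand-written sketch, not actual dump output;
SSA names are made up):

  /* Before: two single-lane extracts feeding a scalar add.  */
  _1 = BIT_FIELD_REF <x_2(D), 32, 32>;
  _3 = BIT_FIELD_REF <x_2(D), 32, 64>;
  _4 = _1 + _3;

  /* After: one two-lane extract feeding a pairwise reduction.  */
  _5 = BIT_FIELD_REF <x_2(D), 64, 32>;
  _4 = .REDUC_PLUS (_5);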

It seems like the problem here is that the current reductions are designed around
x86-specific limitations.  So perhaps the solution here is to just have an
AArch64-specific Gimple pass, or to gate this transform on a target hook (rough
sketch below), or to add new cheap reduction codes.
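
If gating on a target hook is the preferred direction, a minimal sketch of what
I have in mind would be something like the following (the hook name, wording and
default are invented here purely for illustration):

/* Hypothetical hook, for discussion only: lets a target state that a
   two-element reduction of a given mode is as cheap as scalar code, so
   generic code could gate the pairwise-reduction transform on it.  */
DEFHOOK
(cheap_vector_reduction_p,
 "Return true if a two-element vector reduction with mode @var{mode} is\n\
no more expensive than the equivalent scalar operation sequence.",
 bool, (machine_mode mode),
 hook_bool_mode_false)

The match.pd patterns (or wherever the transform ends up) could then simply
add targetm.cheap_vector_reduction_p (TYPE_MODE (ntype)) to their condition.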

> 
> IMHO the correct approach is to improve that,
> vect_slp_check_for_constructors plus how we handle/recover from SLP
> discovery fails as in your second example above.

Is this feasible in the general sense? SLP tree decomposition would then require
you to cost every possible subtree that gets built.  That seems quite expensive...

The bigger the tree, the more you have to decompose.

> 
> > > I don't think
> > > we want to do this as part of general folding.  Iff, then this
> > > belongs in specific points of the pass pipeline, no?
> >
> > The reason I currently have it as such is that in general the
> > compiler doesn't really deal with horizontal reductions at all.  Also,
> > since the vectorizer itself can introduce reductions, I figured it's
> > better to have one representation for this.  So admittedly perhaps this
> > should only be done after vectorization, as that's when today we expect
> > reductions to be in Gimple.
> >
> > As for having it at a specific point in the pass pipeline, I have it
> > as a general one since a number of passes could create the form for
> > the reduction; for instance, vec_lower could break up an operation to allow
> > this to match.  The bigger BIT_FIELD_REF it creates could also lead to other
> > optimizations.
> >
> > Additionally, you had mentioned last time that Andrew was trying to
> > move min/max detection to match.pd, so I had figured this was the correct
> > place for it.
> 
> That's mostly because we have fold-const patterns for ?: min/max and CFG
> patterns for min/max in phiopt and it's possible to unify both.
> 
> > That said, I have no intuition for what would be better here, since the
> > check is quite cheap.  But do you have a particular place you want
> > this moved to then?  Ideally I'd want it before the last FRE pass, but perhaps
> > isel?
> 
> As said, I think it belongs where we can do costing, which means the
> vectorizer.  Iff there are two/three instruction sequences that can be
> peepholed, do it in the target's machine description instead.

We can't do it in RTL, because we don't know whether things are sequential
until after register allocation, and for integer modes the values would by then
have been assigned to scalar hard registers.  So this is unrecoverable.

So quite literally, you cannot peephole this.  You also cannot use combine,
because there's no way to ensure that reload generates the same register
from subregs.

Thanks,
Tamar

> Richard.
> 
> > Thanks,
> > Tamar
> >
> > >
> > > > The use of these instructions makes a significant difference in
> > > > codegen quality for AArch64 and Arm.
> > > >
> > > > NOTE: The last entry in the series contains tests for all of the
> > > > previous patches as it's a bit of an all or nothing thing.
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > > x86_64-pc-linux-gnu and no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         * match.pd (adjacent_data_access_p): Import.
> > > >         Add new pattern for bitwise plus, min, max, fmax, fmin.
> > > >         * tree-cfg.cc (verify_gimple_call): Allow function arguments in IFNs.
> > > >         * tree.cc (adjacent_data_access_p): New.
> > > >         * tree.h (adjacent_data_access_p): New.
> > > >
> > > > --- inline copy of patch --
> > > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > > index 2617d56091dfbd41ae49f980ee0af3757f5ec1cf..aecaa3520b36e770d11ea9a10eb18db23c0cd9f7 100644
> > > > --- a/gcc/match.pd
> > > > +++ b/gcc/match.pd
> > > > @@ -39,7 +39,8 @@ along with GCC; see the file COPYING3.  If not see
> > > >     HONOR_NANS
> > > >     uniform_vector_p
> > > >     expand_vec_cmp_expr_p
> > > > -   bitmask_inv_cst_vector_p)
> > > > +   bitmask_inv_cst_vector_p
> > > > +   adjacent_data_access_p)
> > > >
> > > >  /* Operator lists.  */
> > > >  (define_operator_list tcc_comparison
> > > > @@ -7195,6 +7196,47 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > >
> > > >  /* Canonicalizations of BIT_FIELD_REFs.  */
> > > >
> > > > +/* Canonicalize BIT_FIELD_REFS to pairwise operations. */
> > > > +(for op (plus min max FMIN_ALL FMAX_ALL)
> > > > +     ifn (IFN_REDUC_PLUS IFN_REDUC_MIN IFN_REDUC_MAX
> > > > +         IFN_REDUC_FMIN IFN_REDUC_FMAX)
> > > > + (simplify
> > > > +  (op @0 @1)
> > > > +   (if (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type))
> > > > +    (with { poly_uint64 nloc = 0;
> > > > +           tree src = adjacent_data_access_p (@0, @1, &nloc, true);
> > > > +           tree ntype = build_vector_type (type, 2);
> > > > +           tree size = TYPE_SIZE (ntype);
> > > > +           tree pos = build_int_cst (TREE_TYPE (size), nloc);
> > > > +           poly_uint64 _sz;
> > > > +           poly_uint64 _total; }
> > > > +     (if (src && is_gimple_reg (src) && ntype
> > > > +         && poly_int_tree_p (size, &_sz)
> > > > +         && poly_int_tree_p (TYPE_SIZE (TREE_TYPE (src)), &_total)
> > > > +         && known_ge (_total, _sz + nloc))
> > > > +      (ifn (BIT_FIELD_REF:ntype { src; } { size; } { pos; })))))))
> > > > +
> > > > +(for op (lt gt)
> > > > +     ifni (IFN_REDUC_MIN IFN_REDUC_MAX)
> > > > +     ifnf (IFN_REDUC_FMIN IFN_REDUC_FMAX)
> > > > + (simplify
> > > > +  (cond (op @0 @1) @0 @1)
> > > > +   (if (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type))
> > > > +    (with { poly_uint64 nloc = 0;
> > > > +           tree src = adjacent_data_access_p (@0, @1, &nloc, false);
> > > > +           tree ntype = build_vector_type (type, 2);
> > > > +           tree size = TYPE_SIZE (ntype);
> > > > +           tree pos = build_int_cst (TREE_TYPE (size), nloc);
> > > > +           poly_uint64 _sz;
> > > > +           poly_uint64 _total; }
> > > > +     (if (src && is_gimple_reg (src) && ntype
> > > > +         && poly_int_tree_p (size, &_sz)
> > > > +         && poly_int_tree_p (TYPE_SIZE (TREE_TYPE (src)), &_total)
> > > > +         && known_ge (_total, _sz + nloc))
> > > > +      (if (SCALAR_FLOAT_MODE_P (TYPE_MODE (type)))
> > > > +       (ifnf (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))
> > > > +       (ifni (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))))))))
> > > > +
> > > >  (simplify
> > > >   (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
> > > >  (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); }))
> > > > diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
> > > > index 91ec33c80a41e1e0cc6224e137dd42144724a168..b19710392940cf469de52d006603ae1e3deb6b76 100644
> > > > --- a/gcc/tree-cfg.cc
> > > > +++ b/gcc/tree-cfg.cc
> > > > @@ -3492,6 +3492,7 @@ verify_gimple_call (gcall *stmt)
> > > >      {
> > > >        tree arg = gimple_call_arg (stmt, i);
> > > >        if ((is_gimple_reg_type (TREE_TYPE (arg))
> > > > +          && !is_gimple_variable (arg)
> > > >            && !is_gimple_val (arg))
> > > >           || (!is_gimple_reg_type (TREE_TYPE (arg))
> > > >               && !is_gimple_lvalue (arg)))
> > > > diff --git a/gcc/tree.h b/gcc/tree.h
> > > > index e6564aaccb7b69cd938ff60b6121aec41b7e8a59..8f8a9660c9e0605eb516de194640b8c1b531b798 100644
> > > > --- a/gcc/tree.h
> > > > +++ b/gcc/tree.h
> > > > @@ -5006,6 +5006,11 @@ extern bool integer_pow2p (const_tree);
> > > >
> > > >  extern tree bitmask_inv_cst_vector_p (tree);
> > > >
> > > > +/* TRUE if the two operands represent adjacent access of data such that a
> > > > +   pairwise operation can be used.  */
> > > > +
> > > > +extern tree adjacent_data_access_p (tree, tree, poly_uint64*, bool);
> > > > +
> > > >  /* integer_nonzerop (tree x) is nonzero if X is an integer constant
> > > >     with a nonzero value.  */
> > > >
> > > > diff --git a/gcc/tree.cc b/gcc/tree.cc
> > > > index 007c9325b17076f474e6681c49966c59cf6b91c7..5315af38a1ead89ca5f75dc4b19de9841e29d311 100644
> > > > --- a/gcc/tree.cc
> > > > +++ b/gcc/tree.cc
> > > > @@ -10457,6 +10457,90 @@ bitmask_inv_cst_vector_p (tree t)
> > > >    return builder.build ();
> > > >  }
> > > >
> > > > +/* Returns base address if the two operands represent adjacent access of data
> > > > +   such that a pairwise operation can be used.  OP1 must be a lower subpart
> > > > +   than OP2.  If POS is not NULL then on return if a value is returned POS
> > > > +   will indicate the position of the lower address.  If COMMUTATIVE_P then
> > > > +   the operation is also tried by flipping op1 and op2.  */
> > > > +
> > > > +tree adjacent_data_access_p (tree op1, tree op2, poly_uint64 *pos,
> > > > +                            bool commutative_p)
> > > > +{
> > > > +  gcc_assert (op1);
> > > > +  gcc_assert (op2);
> > > > +  if (TREE_CODE (op1) != TREE_CODE (op2)
> > > > +      || TREE_TYPE (op1) != TREE_TYPE (op2))
> > > > +    return NULL;
> > > > +
> > > > +  tree type = TREE_TYPE (op1);
> > > > +  gimple *stmt1 = NULL, *stmt2 = NULL;
> > > > +  unsigned int bits = GET_MODE_BITSIZE (GET_MODE_INNER (TYPE_MODE (type)));
> > > > +
> > > > +  if (TREE_CODE (op1) == BIT_FIELD_REF
> > > > +      && operand_equal_p (TREE_OPERAND (op1, 0), TREE_OPERAND (op2, 0), 0)
> > > > +      && operand_equal_p (TREE_OPERAND (op1, 1), TREE_OPERAND (op2, 1), 0)
> > > > +      && known_eq (bit_field_size (op1), bits))
> > > > +    {
> > > > +      poly_uint64 offset1 = bit_field_offset (op1);
> > > > +      poly_uint64 offset2 = bit_field_offset (op2);
> > > > +      if (known_eq (offset2 - offset1, bits))
> > > > +       {
> > > > +         if (pos)
> > > > +           *pos = offset1;
> > > > +         return TREE_OPERAND (op1, 0);
> > > > +       }
> > > > +      else if (commutative_p && known_eq (offset1 - offset2, bits))
> > > > +       {
> > > > +         if (pos)
> > > > +           *pos = offset2;
> > > > +         return TREE_OPERAND (op1, 0);
> > > > +       }
> > > > +    }
> > > > +  else if (TREE_CODE (op1) == ARRAY_REF
> > > > +          && operand_equal_p (get_base_address (op1), get_base_address (op2)))
> > > > +    {
> > > > +      wide_int size1 = wi::to_wide (array_ref_element_size (op1));
> > > > +      wide_int size2 = wi::to_wide (array_ref_element_size (op2));
> > > > +      if (wi::ne_p (size1, size2) || wi::ne_p (size1, bits / 8)
> > > > +         || !tree_fits_poly_uint64_p (TREE_OPERAND (op1, 1))
> > > > +         || !tree_fits_poly_uint64_p (TREE_OPERAND (op2, 1)))
> > > > +       return NULL;
> > > > +
> > > > +      poly_uint64 offset1 = tree_to_poly_uint64 (TREE_OPERAND (op1, 1));
> > > > +      poly_uint64 offset2 = tree_to_poly_uint64 (TREE_OPERAND (op2, 1));
> > > > +      if (known_eq (offset2 - offset1, 1UL))
> > > > +       {
> > > > +         if (pos)
> > > > +           *pos = offset1 * bits;
> > > > +         return TREE_OPERAND (op1, 0);
> > > > +       }
> > > > +      else if (commutative_p && known_eq (offset1 - offset2, 1UL))
> > > > +       {
> > > > +         if (pos)
> > > > +           *pos = offset2 * bits;
> > > > +         return TREE_OPERAND (op1, 0);
> > > > +       }
> > > > +    }
> > > > +  else if (TREE_CODE (op1) == SSA_NAME
> > > > +          && (stmt1 = SSA_NAME_DEF_STMT (op1)) != NULL
> > > > +          && (stmt2 = SSA_NAME_DEF_STMT (op2)) != NULL
> > > > +          && is_gimple_assign (stmt1)
> > > > +          && is_gimple_assign (stmt2))
> > > > +    {
> > > > +      if (gimple_assign_rhs_code (stmt1) != ARRAY_REF
> > > > +         && gimple_assign_rhs_code (stmt1) != BIT_FIELD_REF
> > > > +         && gimple_assign_rhs_code (stmt2) != ARRAY_REF
> > > > +         && gimple_assign_rhs_code (stmt2) != BIT_FIELD_REF)
> > > > +       return NULL;
> > > > +
> > > > +      return adjacent_data_access_p (gimple_assign_rhs1 (stmt1),
> > > > +                                    gimple_assign_rhs1 (stmt2), pos,
> > > > +                                    commutative_p);
> > > > +    }
> > > > +
> > > > +  return NULL;
> > > > +}
> > > > +
> > > >  /* If VECTOR_CST T has a single nonzero element, return the index of that
> > > >     element, otherwise return -1.  */
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > --
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> Boudien Moerman; HRB 36809 (AG Nuernberg)


Thread overview: 50+ messages
2022-10-31 11:56 Tamar Christina
2022-10-31 11:57 ` [PATCH 2/8]middle-end: Recognize scalar widening reductions Tamar Christina
2022-10-31 21:42   ` Jeff Law
2022-11-07 13:21   ` Richard Biener
2022-10-31 11:57 ` [PATCH 3/8]middle-end: Support extractions of subvectors from arbitrary element position inside a vector Tamar Christina
2022-10-31 21:44   ` Jeff Law
2022-11-01 14:25   ` Richard Sandiford
2022-11-11 14:33     ` Tamar Christina
2022-11-15  8:35       ` Hongtao Liu
2022-11-15  8:51         ` Tamar Christina
2022-11-15  9:37           ` Hongtao Liu
2022-11-15 10:00             ` Tamar Christina
2022-11-15 17:39               ` Richard Sandiford
2022-11-17  8:04                 ` Hongtao Liu
2022-11-17  9:39                   ` Richard Sandiford
2022-11-17 10:20                     ` Hongtao Liu
2022-11-17 13:59                       ` Richard Sandiford
2022-11-18  2:31                         ` Hongtao Liu
2022-11-18  9:16                           ` Richard Sandiford
2022-10-31 11:58 ` [PATCH 4/8]AArch64 aarch64: Implement widening reduction patterns Tamar Christina
2022-11-01 14:41   ` Richard Sandiford
2022-10-31 11:58 ` [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable Tamar Christina
2022-11-01 14:58   ` Richard Sandiford
2022-11-01 15:11     ` Tamar Christina
2022-11-11 14:39     ` Tamar Christina
2022-11-22 16:01       ` Tamar Christina
2022-11-30  4:26         ` Tamar Christina
2022-12-06 10:28       ` Richard Sandiford
2022-12-06 10:58         ` Tamar Christina
2022-12-06 11:05           ` Richard Sandiford
2022-10-31 11:59 ` [PATCH 6/8]AArch64: Add peephole and scheduling logic for pairwise operations that appear late in RTL Tamar Christina
2022-10-31 11:59 ` [PATCH 7/8]AArch64: Consolidate zero and sign extension patterns and add missing ones Tamar Christina
2022-11-30  4:28   ` Tamar Christina
2022-12-06 15:59   ` Richard Sandiford
2022-10-31 12:00 ` [PATCH 8/8]AArch64: Have reload not choose to do add on the scalar side if both values exist on the SIMD side Tamar Christina
2022-11-01 15:04   ` Richard Sandiford
2022-11-01 15:20     ` Tamar Christina
2022-10-31 21:41 ` [PATCH 1/8]middle-end: Recognize scalar reductions from bitfields and array_refs Jeff Law
2022-11-05 11:32 ` Richard Biener
2022-11-07  7:16   ` Tamar Christina
2022-11-07 10:17     ` Richard Biener
2022-11-07 11:00       ` Tamar Christina [this message]
2022-11-07 11:22         ` Richard Biener
2022-11-07 11:56           ` Tamar Christina
2022-11-22 10:36             ` Richard Sandiford
2022-11-22 10:58               ` Richard Biener
2022-11-22 11:02                 ` Tamar Christina
2022-11-22 11:06                   ` Richard Sandiford
2022-11-22 11:08                     ` Richard Biener
2022-11-22 14:33                       ` Jeff Law
