Re: [1/2] PR96463 - aarch64 specific changes

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Richard Sandiford <richard.sandiford@arm.com>
To: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
Cc: gcc Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [1/2] PR96463 - aarch64 specific changes
Date: Thu, 12 May 2022 11:44:59 +0100	[thread overview]
Message-ID: <mpttu9v3u8k.fsf@arm.com> (raw)
In-Reply-To: <CAAgBjM=pyXy4szROys3_KBY69fF-1sPtrsKcurh2eiwFsUNtew@mail.gmail.com> (Prathamesh Kulkarni's message of "Thu, 12 May 2022 14:42:29 +0530")

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> On Wed, 11 May 2022 at 12:44, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
>> > On Fri, 6 May 2022 at 16:00, Richard Sandiford
>> > <richard.sandiford@arm.com> wrote:
>> >>
>> >> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
>> >> > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> >> > index c24c0548724..1ef4ea2087b 100644
>> >> > --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> >> > +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> >> > @@ -44,6 +44,14 @@
>> >> >  #include "aarch64-sve-builtins-shapes.h"
>> >> >  #include "aarch64-sve-builtins-base.h"
>> >> >  #include "aarch64-sve-builtins-functions.h"
>> >> > +#include "aarch64-builtins.h"
>> >> > +#include "gimple-ssa.h"
>> >> > +#include "tree-phinodes.h"
>> >> > +#include "tree-ssa-operands.h"
>> >> > +#include "ssa-iterators.h"
>> >> > +#include "stringpool.h"
>> >> > +#include "value-range.h"
>> >> > +#include "tree-ssanames.h"
>> >>
>> >> Minor, but: I think the preferred approach is to include "ssa.h"
>> >> rather than include some of these headers directly.
>> >>
>> >> >
>> >> >  using namespace aarch64_sve;
>> >> >
>> >> > @@ -1207,6 +1215,56 @@ public:
>> >> >      insn_code icode = code_for_aarch64_sve_ld1rq (e.vector_mode (0));
>> >> >      return e.use_contiguous_load_insn (icode);
>> >> >    }
>> >> > +
>> >> > +  gimple *
>> >> > +  fold (gimple_folder &f) const OVERRIDE
>> >> > +  {
>> >> > +    tree arg0 = gimple_call_arg (f.call, 0);
>> >> > +    tree arg1 = gimple_call_arg (f.call, 1);
>> >> > +
>> >> > +    /* Transform:
>> >> > +       lhs = svld1rq ({-1, -1, ... }, arg1)
>> >> > +       into:
>> >> > +       tmp = mem_ref<int32x4_t> [(int * {ref-all}) arg1]
>> >> > +       lhs = vec_perm_expr<tmp, tmp, {0, 1, 2, 3, ...}>.
>> >> > +       on little endian target.  */
>> >> > +
>> >> > +    if (!BYTES_BIG_ENDIAN
>> >> > +     && integer_all_onesp (arg0))
>> >> > +      {
>> >> > +     tree lhs = gimple_call_lhs (f.call);
>> >> > +     auto simd_type = aarch64_get_simd_info_for_type (Int32x4_t);
>> >>
>> >> Does this work for other element sizes?  I would have expected it
>> >> to be the (128-bit) Advanced SIMD vector associated with the same
>> >> element type as the SVE vector.
>> >>
>> >> The testcase should cover more than just int32x4_t -> svint32_t,
>> >> just to be sure.
>> > In the attached patch, it obtains corresponding advsimd type with:
>> >
>> > tree eltype = TREE_TYPE (lhs_type);
>> > unsigned nunits = 128 / TREE_INT_CST_LOW (TYPE_SIZE (eltype));
>> > tree vectype = build_vector_type (eltype, nunits);
>> >
>> > While this seems to work with different element sizes, I am not sure if it's
>> > the correct approach ?
>>
>> Yeah, that looks correct.  Other SVE code uses aarch64_vq_mode
>> to get the vector mode associated with a .Q “element”, so an
>> alternative would be:
>>
>>     machine_mode vq_mode = aarch64_vq_mode (TYPE_MODE (eltype)).require ();
>>     tree vectype = build_vector_type_for_mode (eltype, vq_mode);
>>
>> which is more explicit about wanting an Advanced SIMD vector.
>>
>> >> > +
>> >> > +     tree elt_ptr_type
>> >> > +       = build_pointer_type_for_mode (simd_type.eltype, VOIDmode, true);
>> >> > +     tree zero = build_zero_cst (elt_ptr_type);
>> >> > +
>> >> > +     /* Use element type alignment.  */
>> >> > +     tree access_type
>> >> > +       = build_aligned_type (simd_type.itype, TYPE_ALIGN (simd_type.eltype));
>> >> > +
>> >> > +     tree tmp = make_ssa_name_fn (cfun, access_type, 0);
>> >> > +     gimple *mem_ref_stmt
>> >> > +       = gimple_build_assign (tmp, fold_build2 (MEM_REF, access_type, arg1, zero));
>> >>
>> >> Long line.  Might be easier to format by assigning the fold_build2 result
>> >> to a temporary variable.
>> >>
>> >> > +     gsi_insert_before (f.gsi, mem_ref_stmt, GSI_SAME_STMT);
>> >> > +
>> >> > +     tree mem_ref_lhs = gimple_get_lhs (mem_ref_stmt);
>> >> > +     tree vectype = TREE_TYPE (mem_ref_lhs);
>> >> > +     tree lhs_type = TREE_TYPE (lhs);
>> >>
>> >> Is this necessary?  The code above supplied the types and I wouldn't
>> >> have expected them to change during the build process.
>> >>
>> >> > +
>> >> > +     int source_nelts = TYPE_VECTOR_SUBPARTS (vectype).to_constant ();
>> >> > +     vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (lhs_type), source_nelts, 1);
>> >> > +     for (int i = 0; i < source_nelts; i++)
>> >> > +       sel.quick_push (i);
>> >> > +
>> >> > +     vec_perm_indices indices (sel, 1, source_nelts);
>> >> > +     gcc_checking_assert (can_vec_perm_const_p (TYPE_MODE (lhs_type), indices));
>> >> > +     tree mask = vec_perm_indices_to_tree (lhs_type, indices);
>> >> > +     return gimple_build_assign (lhs, VEC_PERM_EXPR, mem_ref_lhs, mem_ref_lhs, mask);
>> >>
>> >> Nit: long line.
>> >>
>> >> > +      }
>> >> > +
>> >> > +    return NULL;
>> >> > +  }
>> >> >  };
>> >> >
>> >> >  class svld1ro_impl : public load_replicate
>> >> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
>> >> > index f650abbc4ce..47810fec804 100644
>> >> > --- a/gcc/config/aarch64/aarch64.cc
>> >> > +++ b/gcc/config/aarch64/aarch64.cc
>> >> > @@ -23969,6 +23969,35 @@ aarch64_evpc_sve_tbl (struct expand_vec_perm_d *d)
>> >> >    return true;
>> >> >  }
>> >> >
>> >> > +/* Try to implement D using SVE dup instruction.  */
>> >> > +
>> >> > +static bool
>> >> > +aarch64_evpc_sve_dup (struct expand_vec_perm_d *d)
>> >> > +{
>> >> > +  if (BYTES_BIG_ENDIAN
>> >> > +      || d->perm.length ().is_constant ()
>> >> > +      || !d->one_vector_p
>> >> > +      || d->target == NULL
>> >> > +      || d->op0 == NULL
>> >>
>> >> These last two lines mean that we always return false for d->testing.
>> >> The idea instead is that the return value should be the same for both
>> >> d->testing and !d->testing.  The difference is that for !d->testing we
>> >> also emit code to do the permute.
>>
>> It doesn't look like the new patch addresses this.  There should be
>> no checks for/uses of “d->target” and “d->op0” until after:
>>
>>   if (d->testing_p)
>>     return true;
>>
>> This...
>>
>> >> > +      || GET_MODE_NUNITS (GET_MODE (d->target)).is_constant ()
>> >>
>> >> Sorry, I've forgotten the context now, but: these positive tests
>> >> for is_constant surprised me.  Do we really only want to do this
>> >> for variable-length SVE code generation, rather than fixed-length?
>> >>
>> >> > +      || !GET_MODE_NUNITS (GET_MODE (d->op0)).is_constant ())
>> >> > +    return false;
>> >> > +
>> >> > +  if (d->testing_p)
>> >> > +    return true;
>> >>
>> >> This should happen after the later tests, once we're sure that the
>> >> permute vector has the right form.  If the issue is that op0 isn't
>> >> provided for testing then I think the hook needs to be passed the
>> >> input mode alongside the result mode.
>>
>> ...was my guess about why the checks were there.
> Ah right sorry. IIUC, if d->testing is true, then d->op0 could be NULL ?
> In that case, how do we obtain input mode ?

Well, like I say, I think we might need to extend the vec_perm_const
hook interface so that it gets passed the input mode, now that that
isn't necessarily the same as the output mode.

It would be good to do that as a separate prepatch, since it would
affect other targets too.  And for safety, that patch should make all
existing implementations of the hook return false if the modes aren't
equal, including for aarch64.  The current patch can then make the
aarch64 hook treat the dup case as an exception.

Thanks,
Richard

next prev parent reply	other threads:[~2022-05-12 10:45 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-17 10:04 Prathamesh Kulkarni
2021-12-17 11:33 ` Richard Sandiford
2021-12-27 10:24   ` Prathamesh Kulkarni
2022-05-03 10:40     ` Prathamesh Kulkarni
2022-05-06 10:30       ` Richard Sandiford
2022-05-11  6:24         ` Prathamesh Kulkarni
2022-05-11  7:14           ` Richard Sandiford
2022-05-12  9:12             ` Prathamesh Kulkarni
2022-05-12 10:44               ` Richard Sandiford [this message]
2022-05-31 11:32                 ` Prathamesh Kulkarni
2022-06-01  8:42                   ` Richard Sandiford
2022-06-05 10:15                     ` Prathamesh Kulkarni
2022-06-06 10:59                       ` Richard Sandiford
2022-06-07 10:47                         ` Prathamesh Kulkarni
2022-06-07 11:02                           ` Richard Sandiford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mpttu9v3u8k.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=prathamesh.kulkarni@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).