public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [Arm] Implement CDE predicated intrinsics for MVE registers
@ 2020-04-07  9:16 Matthew Malcomson
  2020-04-08  8:31 ` Matthew Malcomson
  0 siblings, 1 reply; 3+ messages in thread
From: Matthew Malcomson @ 2020-04-07  9:16 UTC (permalink / raw)
  To: gcc-patches
  Cc: Ramana.Radhakrishnan, Kyrylo.Tkachov, nd, Richard.Earnshaw, nickc

[-- Attachment #1: Type: text/plain, Size: 4999 bytes --]

These intrinsics are the predicated version of the intrinsics inroduced
in https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542725.html.

These are not yet public on developer.arm.com but we have reached
internal consensus on them.

The approach follows the same method as for the CDE intrinsics for MVE
registers, most notably using the same arm_resolve_overloaded_builtin
function with minor modifications.

The resolver hook has been moved from arm-builtins.c to arm-c.c so it
can access the c-common function build_function_call_vec.  This function
is needed to perform the same checks on arguments as a normal C or C++
function would perform.
It is fine to put this resolver in arm-c.c since it's only use is for
the ACLE functions, and these are only available in C/C++.
So that the resolver function has access to information it needs from
the builtins, we put two query functions into arm-builtins.c and use
them from arm-c.c.

We rely on the order that the builtins are defined in
gcc/config/arm/arm_cde_builtins.def, knowing that the predicated
versions come after the non-predicated versions.

The machine description patterns for these builtins are simpler than
those for the non-predicated versions, since the accumulator versions
*and* non-accumulator versions both need an input vector now.
The input vector is needed for the non-accumulator version to describe
the original values for those lanes that are not updated during the
merge operation.

We additionally need to introduce qualifiers for these new builtins,
which follow the same pattern as the non-predicated versions but with an
extra argument to describe the predicate.

Error message changes:
- We directly mention the builtin argument when complaining that an
  argument is not in the correct range.
  This more closely matches the C error messages.
- We ensure the resolver complains about *all* invalid arguments to a
  function instead of just the first one.
- The resolver error messages index arguments from 1 instead of 0 to
  match the arguments coming from the C/C++ frontend.

In order to allow the user to give an argument for the merging predicate
when they don't care what data is stored in the 'false' lanes, we also
move the __arm_vuninitializedq* intrinsics from arm_mve.h to
arm_mve_types.h which is shared with arm_cde.h.

We only move the fully type-specified `__arm_vuninitializedq*`
intrinsics and not the polymorphic versions, since moving the
polymorphic versions requires moving the _Generic framework as well as
just the intrinsics we're interested in.  This matches the approach taken
for the `__arm_vreinterpret*` functions in this include file.

This patch also contains a slight change in spacing of an existing
assembly instruction to be emitted.
This is just to help writing tests -- vmsr usually has a tab and a space
between the mnemonic and the first argument, but in one case it just has
a tab -- making all the same helps make test regexps simpler.

Testing Done:
    Bootstrap and full regtest on arm-none-linux-gnueabihf
    Full regtest on arm-none-eabi

    All testing done with a local fix for the bugzilla PR below.
    That bugzilla currently causes multiple ICE's on the tests added in
    this patch.
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94341

gcc/ChangeLog:

2020-04-07  Matthew Malcomson  <matthew.malcomson@arm.com>

	* config/arm/arm-builtins.c (CX_UNARY_UNONE_QUALIFIERS): New.
	(CX_BINARY_UNONE_QUALIFIERS): New.
	(CX_TERNARY_UNONE_QUALIFIERS): New.
	(arm_resolve_overloaded_builtin): Move to arm-c.c.
	(arm_expand_builtin_args): Update error message.
	(enum resolver_ident): New.
	(arm_describe_resolver): New.
	(arm_cde_end_args): New.
	* config/arm/arm-builtins.h: New file.
	* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New.
	(arm_resolve_cde_builtin): Moved from arm-builtins.c.
	* config/arm/arm_cde.h (__arm_vcx1q_m, __arm_vcx1qa_m,
	__arm_vcx2q_m, __arm_vcx2qa_m, __arm_vcx3q_m, __arm_vcx3qa_m):
	New.
	* config/arm/arm_cde_builtins.def (vcx1q_p_, vcx1qa_p_,
	vcx2q_p_, vcx2qa_p_, vcx3q_p_, vcx3qa_p_): New builtin defs.
	* config/arm/iterators.md (CDE_VCX): New int iterator.
	(a) New int attribute.
	* config/arm/mve.md (arm_vcx1q<a>_p_v16qi, arm_vcx2q<a>_p_v16qi,
	arm_vcx3q<a>_p_v16qi): New patterns.
	* config/arm/vfp.md (thumb2_movhi_fp16): Extra space in assembly.

gcc/testsuite/ChangeLog:

2020-04-07  Matthew Malcomson  <matthew.malcomson@arm.com>

	* gcc.target/arm/acle/cde-errors.c: Add predicated forms.
	* gcc.target/arm/acle/cde-mve-error-1.c: Add predicated forms.
	* gcc.target/arm/acle/cde-mve-error-2.c: Add predicated forms.
	* gcc.target/arm/acle/cde-mve-error-3.c: Add predicated forms.
	* gcc.target/arm/acle/cde-mve-full-assembly.c: Add predicated
	forms.
	* gcc.target/arm/acle/cde-mve-tests.c: Add predicated forms.
	* gcc.target/arm/acle/cde_v_1_err.c (test_imm_range): Update for
	error message format change.
	* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c:
	Update scan-assembler regexp.

[-- Attachment #2: cde-mve-predicated.patch.gz --]
[-- Type: application/gzip, Size: 13582 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Arm] Implement CDE predicated intrinsics for MVE registers
  2020-04-07  9:16 [Arm] Implement CDE predicated intrinsics for MVE registers Matthew Malcomson
@ 2020-04-08  8:31 ` Matthew Malcomson
  2020-04-08  9:08   ` Kyrylo Tkachov
  0 siblings, 1 reply; 3+ messages in thread
From: Matthew Malcomson @ 2020-04-08  8:31 UTC (permalink / raw)
  To: gcc-patches
  Cc: Kyrylo.Tkachov, Richard.Earnshaw, Ramana.Radhakrishnan, nickc, nd

[-- Attachment #1: Type: text/plain, Size: 4933 bytes --]

This is an update of the previous patch but rebased onto recent MVE patches.
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543414.html

These intrinsics are the predicated version of the intrinsics inroduced
in https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543527.html.

These are not yet public on developer.arm.com but we have reached
internal consensus on them.

The approach follows the same method as for the CDE intrinsics for MVE
registers, most notably using the same arm_resolve_overloaded_builtin
function with minor modifications.

The resolver hook has been moved from arm-builtins.c to arm-c.c so it
can access the c-common function build_function_call_vec.  This function
is needed to perform the same checks on arguments as a normal C or C++
function would perform.
It is fine to put this resolver in arm-c.c since it's only use is for
the ACLE functions, and these are only available in C/C++.
So that the resolver function has access to information it needs from
the builtins, we put two query functions into arm-builtins.c and use
them from arm-c.c.

We rely on the order that the builtins are defined in
gcc/config/arm/arm_cde_builtins.def, knowing that the predicated
versions come after the non-predicated versions.

The machine description patterns for these builtins are simpler than
those for the non-predicated versions, since the accumulator versions
*and* non-accumulator versions both need an input vector now.
The input vector is needed for the non-accumulator version to describe
the original values for those lanes that are not updated during the
merge operation.

We additionally need to introduce qualifiers for these new builtins,
which follow the same pattern as the non-predicated versions but with an
extra argument to describe the predicate.

Error message changes:
- We directly mention the builtin argument when complaining that an
  argument is not in the correct range.
  This more closely matches the C error messages.
- We ensure the resolver complains about *all* invalid arguments to a
  function instead of just the first one.
- The resolver error messages index arguments from 1 instead of 0 to
  match the arguments coming from the C/C++ frontend.

In order to allow the user to give an argument for the merging predicate
when they don't care what data is stored in the 'false' lanes, we also
move the __arm_vuninitializedq* intrinsics from arm_mve.h to
arm_mve_types.h which is shared with arm_cde.h.

We only move the fully type-specified `__arm_vuninitializedq*`
intrinsics and not the polymorphic versions, since moving the
polymorphic versions requires moving the _Generic framework as well as
just the intrinsics we're interested in.  This matches the approach taken
for the `__arm_vreinterpret*` functions in this include file.

This patch also contains a slight change in spacing of an existing
assembly instruction to be emitted.
This is just to help writing tests -- vmsr usually has a tab and a space
between the mnemonic and the first argument, but in one case it just has
a tab -- making all the same helps make test regexps simpler.

Testing Done:
    Bootstrap and full regtest on arm-none-linux-gnueabihf
    Full regtest on arm-none-eabi

gcc/ChangeLog:

2020-04-08  Matthew Malcomson  <matthew.malcomson@arm.com>

	* config/arm/arm-builtins.c (CX_UNARY_UNONE_QUALIFIERS): New.
	(CX_BINARY_UNONE_QUALIFIERS): New.
	(CX_TERNARY_UNONE_QUALIFIERS): New.
	(arm_resolve_overloaded_builtin): Move to arm-c.c.
	(arm_expand_builtin_args): Update error message.
	(enum resolver_ident): New.
	(arm_describe_resolver): New.
	(arm_cde_end_args): New.
	* config/arm/arm-builtins.h: New file.
	* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New.
	(arm_resolve_cde_builtin): Moved from arm-builtins.c.
	* config/arm/arm_cde.h (__arm_vcx1q_m, __arm_vcx1qa_m,
	__arm_vcx2q_m, __arm_vcx2qa_m, __arm_vcx3q_m, __arm_vcx3qa_m):
	New.
	* config/arm/arm_cde_builtins.def (vcx1q_p_, vcx1qa_p_,
	vcx2q_p_, vcx2qa_p_, vcx3q_p_, vcx3qa_p_): New builtin defs.
	* config/arm/iterators.md (CDE_VCX): New int iterator.
	(a) New int attribute.
	* config/arm/mve.md (arm_vcx1q<a>_p_v16qi, arm_vcx2q<a>_p_v16qi,
	arm_vcx3q<a>_p_v16qi): New patterns.
	* config/arm/vfp.md (thumb2_movhi_fp16): Extra space in assembly.

gcc/testsuite/ChangeLog:

2020-04-08  Matthew Malcomson  <matthew.malcomson@arm.com>

	* gcc.target/arm/acle/cde-errors.c: Add predicated forms.
	* gcc.target/arm/acle/cde-mve-error-1.c: Add predicated forms.
	* gcc.target/arm/acle/cde-mve-error-2.c: Add predicated forms.
	* gcc.target/arm/acle/cde-mve-error-3.c: Add predicated forms.
	* gcc.target/arm/acle/cde-mve-full-assembly.c: Add predicated
	forms.
	* gcc.target/arm/acle/cde-mve-tests.c: Add predicated forms.
	* gcc.target/arm/acle/cde_v_1_err.c (test_imm_range): Update for
	error message format change.
	* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c:
	Update scan-assembler regexp.

[-- Attachment #2: cde-predicated-intrinsics.patch.gz --]
[-- Type: application/gzip, Size: 14224 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [Arm] Implement CDE predicated intrinsics for MVE registers
  2020-04-08  8:31 ` Matthew Malcomson
@ 2020-04-08  9:08   ` Kyrylo Tkachov
  0 siblings, 0 replies; 3+ messages in thread
From: Kyrylo Tkachov @ 2020-04-08  9:08 UTC (permalink / raw)
  To: Matthew Malcomson, gcc-patches
  Cc: Richard Earnshaw, Ramana Radhakrishnan, nickc, nd

Hi Matthew,

> -----Original Message-----
> From: Matthew Malcomson <Matthew.Malcomson@arm.com>
> Sent: 08 April 2020 09:32
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; Ramana Radhakrishnan
> <Ramana.Radhakrishnan@arm.com>; nickc@redhat.com; nd <nd@arm.com>
> Subject: [Arm] Implement CDE predicated intrinsics for MVE registers
> 
> This is an update of the previous patch but rebased onto recent MVE patches.
> https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543414.html
> 
> These intrinsics are the predicated version of the intrinsics inroduced
> in https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543527.html.
> 
> These are not yet public on developer.arm.com but we have reached
> internal consensus on them.
> 
> The approach follows the same method as for the CDE intrinsics for MVE
> registers, most notably using the same arm_resolve_overloaded_builtin
> function with minor modifications.
> 
> The resolver hook has been moved from arm-builtins.c to arm-c.c so it
> can access the c-common function build_function_call_vec.  This function
> is needed to perform the same checks on arguments as a normal C or C++
> function would perform.
> It is fine to put this resolver in arm-c.c since it's only use is for
> the ACLE functions, and these are only available in C/C++.
> So that the resolver function has access to information it needs from
> the builtins, we put two query functions into arm-builtins.c and use
> them from arm-c.c.
> 
> We rely on the order that the builtins are defined in
> gcc/config/arm/arm_cde_builtins.def, knowing that the predicated
> versions come after the non-predicated versions.
> 
> The machine description patterns for these builtins are simpler than
> those for the non-predicated versions, since the accumulator versions
> *and* non-accumulator versions both need an input vector now.
> The input vector is needed for the non-accumulator version to describe
> the original values for those lanes that are not updated during the
> merge operation.
> 
> We additionally need to introduce qualifiers for these new builtins,
> which follow the same pattern as the non-predicated versions but with an
> extra argument to describe the predicate.
> 
> Error message changes:
> - We directly mention the builtin argument when complaining that an
>   argument is not in the correct range.
>   This more closely matches the C error messages.
> - We ensure the resolver complains about *all* invalid arguments to a
>   function instead of just the first one.
> - The resolver error messages index arguments from 1 instead of 0 to
>   match the arguments coming from the C/C++ frontend.
> 
> In order to allow the user to give an argument for the merging predicate
> when they don't care what data is stored in the 'false' lanes, we also
> move the __arm_vuninitializedq* intrinsics from arm_mve.h to
> arm_mve_types.h which is shared with arm_cde.h.
> 
> We only move the fully type-specified `__arm_vuninitializedq*`
> intrinsics and not the polymorphic versions, since moving the
> polymorphic versions requires moving the _Generic framework as well as
> just the intrinsics we're interested in.  This matches the approach taken
> for the `__arm_vreinterpret*` functions in this include file.
> 
> This patch also contains a slight change in spacing of an existing
> assembly instruction to be emitted.
> This is just to help writing tests -- vmsr usually has a tab and a space
> between the mnemonic and the first argument, but in one case it just has
> a tab -- making all the same helps make test regexps simpler.
> 
> Testing Done:
>     Bootstrap and full regtest on arm-none-linux-gnueabihf
>     Full regtest on arm-none-eabi

Ok once the prerequisites have gone in.
Thanks,
Kyrill

> 
> gcc/ChangeLog:
> 
> 2020-04-08  Matthew Malcomson  <matthew.malcomson@arm.com>
> 
> 	* config/arm/arm-builtins.c (CX_UNARY_UNONE_QUALIFIERS): New.
> 	(CX_BINARY_UNONE_QUALIFIERS): New.
> 	(CX_TERNARY_UNONE_QUALIFIERS): New.
> 	(arm_resolve_overloaded_builtin): Move to arm-c.c.
> 	(arm_expand_builtin_args): Update error message.
> 	(enum resolver_ident): New.
> 	(arm_describe_resolver): New.
> 	(arm_cde_end_args): New.
> 	* config/arm/arm-builtins.h: New file.
> 	* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New.
> 	(arm_resolve_cde_builtin): Moved from arm-builtins.c.
> 	* config/arm/arm_cde.h (__arm_vcx1q_m, __arm_vcx1qa_m,
> 	__arm_vcx2q_m, __arm_vcx2qa_m, __arm_vcx3q_m,
> __arm_vcx3qa_m):
> 	New.
> 	* config/arm/arm_cde_builtins.def (vcx1q_p_, vcx1qa_p_,
> 	vcx2q_p_, vcx2qa_p_, vcx3q_p_, vcx3qa_p_): New builtin defs.
> 	* config/arm/iterators.md (CDE_VCX): New int iterator.
> 	(a) New int attribute.
> 	* config/arm/mve.md (arm_vcx1q<a>_p_v16qi,
> arm_vcx2q<a>_p_v16qi,
> 	arm_vcx3q<a>_p_v16qi): New patterns.
> 	* config/arm/vfp.md (thumb2_movhi_fp16): Extra space in assembly.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-04-08  Matthew Malcomson  <matthew.malcomson@arm.com>
> 
> 	* gcc.target/arm/acle/cde-errors.c: Add predicated forms.
> 	* gcc.target/arm/acle/cde-mve-error-1.c: Add predicated forms.
> 	* gcc.target/arm/acle/cde-mve-error-2.c: Add predicated forms.
> 	* gcc.target/arm/acle/cde-mve-error-3.c: Add predicated forms.
> 	* gcc.target/arm/acle/cde-mve-full-assembly.c: Add predicated
> 	forms.
> 	* gcc.target/arm/acle/cde-mve-tests.c: Add predicated forms.
> 	* gcc.target/arm/acle/cde_v_1_err.c (test_imm_range): Update for
> 	error message format change.
> 	* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c:
> 	Update scan-assembler regexp.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-04-08  9:08 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-07  9:16 [Arm] Implement CDE predicated intrinsics for MVE registers Matthew Malcomson
2020-04-08  8:31 ` Matthew Malcomson
2020-04-08  9:08   ` Kyrylo Tkachov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).