From: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
To: Christophe Lyon <Christophe.Lyon@arm.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
Richard Earnshaw <Richard.Earnshaw@arm.com>,
Richard Sandiford <Richard.Sandiford@arm.com>
Cc: Christophe Lyon <Christophe.Lyon@arm.com>
Subject: RE: [PATCH 02/22] arm: [MVE intrinsics] Add new framework
Date: Tue, 2 May 2023 10:17:04 +0000 [thread overview]
Message-ID: <PAXPR08MB6926816B6C8756D491C51F63936F9@PAXPR08MB6926.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <20230418134608.244751-3-christophe.lyon@arm.com>
> -----Original Message-----
> From: Christophe Lyon <christophe.lyon@arm.com>
> Sent: Tuesday, April 18, 2023 2:46 PM
> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>;
> Richard Earnshaw <Richard.Earnshaw@arm.com>; Richard Sandiford
> <Richard.Sandiford@arm.com>
> Cc: Christophe Lyon <Christophe.Lyon@arm.com>
> Subject: [PATCH 02/22] arm: [MVE intrinsics] Add new framework
>
> This patch introduces the new MVE intrinsics framework, heavily
> inspired by the SVE one in the aarch64 port.
>
> Like the MVE intrinsic types implementation, the intrinsics framework
> defines functions via a new pragma in arm_mve.h. A boolean parameter
> is used to pass true when __ARM_MVE_PRESERVE_USER_NAMESPACE is
> defined, and false when it is not, allowing for non-prefixed intrinsic
> functions to be conditionally defined.
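
For readers following along, the mechanism looks roughly like this (a sketch only; the exact header contents are in the arm_mve.h hunk of the patch, and the identifier names here are taken from the description above):

```c
/* Illustrative sketch, not the verbatim header: arm_mve.h passes the
   boolean down to the pragma handler, which registers either only the
   __arm_-prefixed intrinsics (true) or the unprefixed ones too (false).  */
#ifdef __ARM_MVE_PRESERVE_USER_NAMESPACE
#pragma GCC arm "arm_mve.h" true
#else
#pragma GCC arm "arm_mve.h" false
#endif
```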
>
> Future patches will build on this framework by adding new intrinsic
> functions and adding the features needed to support them.
>
> Differences compared to the aarch64/SVE port include:
> - when present, the predicate argument is the last one with MVE (the
> first one with SVE)
> - when using merging predicates ("_m" suffix), the "inactive" argument
> (if any) is inserted in the first position
> - when using merging predicates ("_m" suffix), some functions do not
> have the "inactive" argument, so we maintain an exception list
> - MVE intrinsics dealing with floating-point require the FP extension,
> while SVE may support different extensions
> - regarding global state, MVE does not have any prefetch intrinsic, so
> we do not need a flag for this
> - intrinsic names can be prefixed with "__arm", depending on whether
> preserve_user_namespace is true or false
> - parse_signature: the maximum number of arguments is now a parameter,
> which helps detect an overflow via a new assert.
> - suffixes and overloading can be controlled using
> explicit_mode_suffix_p and skip_overload_p in addition to
> explicit_type_suffix_p
Ok.
Thanks,
Kyrill
>
> At this implementation stage, there are some limitations compared
> to aarch64/SVE, which are removed later in the series:
> - "offset" mode is not supported yet
> - gimple folding is not implemented
>
> 2022-09-08 Murray Steele <murray.steele@arm.com>
> Christophe Lyon <christophe.lyon@arm.com>
>
> gcc/ChangeLog:
>
> * config.gcc: Add arm-mve-builtins-base.o and
> arm-mve-builtins-shapes.o to extra_objs.
> * config/arm/arm-builtins.cc (arm_builtin_decl): Handle MVE builtin
> numberspace.
> (arm_expand_builtin): Likewise.
> (arm_check_builtin_call): Likewise.
> (arm_describe_resolver): Likewise.
> * config/arm/arm-builtins.h (enum resolver_ident): Add
> arm_mve_resolver.
> * config/arm/arm-c.cc (arm_pragma_arm): Handle new pragma.
> (arm_resolve_overloaded_builtin): Handle MVE builtins.
> (arm_register_target_pragmas): Register arm_check_builtin_call.
> * config/arm/arm-mve-builtins.cc (class registered_function): New
> class.
> (struct registered_function_hasher): New struct.
> (pred_suffixes): New table.
> (mode_suffixes): New table.
> (type_suffix_info): New table.
> (TYPES_float16): New.
> (TYPES_all_float): New.
> (TYPES_integer_8): New.
> (TYPES_integer_8_16): New.
> (TYPES_integer_16_32): New.
> (TYPES_integer_32): New.
> (TYPES_signed_16_32): New.
> (TYPES_signed_32): New.
> (TYPES_all_signed): New.
> (TYPES_all_unsigned): New.
> (TYPES_all_integer): New.
> (TYPES_all_integer_with_64): New.
> (DEF_VECTOR_TYPE): New.
> (DEF_DOUBLE_TYPE): New.
> (DEF_MVE_TYPES_ARRAY): New.
> (all_integer): New.
> (all_integer_with_64): New.
> (float16): New.
> (all_float): New.
> (all_signed): New.
> (all_unsigned): New.
> (integer_8): New.
> (integer_8_16): New.
> (integer_16_32): New.
> (integer_32): New.
> (signed_16_32): New.
> (signed_32): New.
> (register_vector_type): Use void_type_node for mve.fp-only types
> when mve.fp is not enabled.
> (register_builtin_tuple_types): Likewise.
> (handle_arm_mve_h): New function.
> (matches_type_p): Likewise.
> (report_out_of_range): Likewise.
> (report_not_enum): Likewise.
> (report_missing_float): Likewise.
> (report_non_ice): Likewise.
> (check_requires_float): Likewise.
> (function_instance::hash): Likewise.
> (function_instance::call_properties): Likewise.
> (function_instance::reads_global_state_p): Likewise.
> (function_instance::modifies_global_state_p): Likewise.
> (function_instance::could_trap_p): Likewise.
> (function_instance::has_inactive_argument): Likewise.
> (registered_function_hasher::hash): Likewise.
> (registered_function_hasher::equal): Likewise.
> (function_builder::function_builder): Likewise.
> (function_builder::~function_builder): Likewise.
> (function_builder::append_name): Likewise.
> (function_builder::finish_name): Likewise.
> (function_builder::get_name): Likewise.
> (add_attribute): Likewise.
> (function_builder::get_attributes): Likewise.
> (function_builder::add_function): Likewise.
> (function_builder::add_unique_function): Likewise.
> (function_builder::add_overloaded_function): Likewise.
> (function_builder::add_overloaded_functions): Likewise.
> (function_builder::register_function_group): Likewise.
> (function_call_info::function_call_info): Likewise.
> (function_resolver::function_resolver): Likewise.
> (function_resolver::get_vector_type): Likewise.
> (function_resolver::get_scalar_type_name): Likewise.
> (function_resolver::get_argument_type): Likewise.
> (function_resolver::scalar_argument_p): Likewise.
> (function_resolver::report_no_such_form): Likewise.
> (function_resolver::lookup_form): Likewise.
> (function_resolver::resolve_to): Likewise.
> (function_resolver::infer_vector_or_tuple_type): Likewise.
> (function_resolver::infer_vector_type): Likewise.
> (function_resolver::require_vector_or_scalar_type): Likewise.
> (function_resolver::require_vector_type): Likewise.
> (function_resolver::require_matching_vector_type): Likewise.
> (function_resolver::require_derived_vector_type): Likewise.
> (function_resolver::require_derived_scalar_type): Likewise.
> (function_resolver::require_integer_immediate): Likewise.
> (function_resolver::require_scalar_type): Likewise.
> (function_resolver::check_num_arguments): Likewise.
> (function_resolver::check_gp_argument): Likewise.
> (function_resolver::finish_opt_n_resolution): Likewise.
> (function_resolver::resolve_unary): Likewise.
> (function_resolver::resolve_unary_n): Likewise.
> (function_resolver::resolve_uniform): Likewise.
> (function_resolver::resolve_uniform_opt_n): Likewise.
> (function_resolver::resolve): Likewise.
> (function_checker::function_checker): Likewise.
> (function_checker::argument_exists_p): Likewise.
> (function_checker::require_immediate): Likewise.
> (function_checker::require_immediate_enum): Likewise.
> (function_checker::require_immediate_range): Likewise.
> (function_checker::check): Likewise.
> (gimple_folder::gimple_folder): Likewise.
> (gimple_folder::fold): Likewise.
> (function_expander::function_expander): Likewise.
> (function_expander::direct_optab_handler): Likewise.
> (function_expander::get_fallback_value): Likewise.
> (function_expander::get_reg_target): Likewise.
> (function_expander::add_output_operand): Likewise.
> (function_expander::add_input_operand): Likewise.
> (function_expander::add_integer_operand): Likewise.
> (function_expander::generate_insn): Likewise.
> (function_expander::use_exact_insn): Likewise.
> (function_expander::use_unpred_insn): Likewise.
> (function_expander::use_pred_x_insn): Likewise.
> (function_expander::use_cond_insn): Likewise.
> (function_expander::map_to_rtx_codes): Likewise.
> (function_expander::expand): Likewise.
> (resolve_overloaded_builtin): Likewise.
> (check_builtin_call): Likewise.
> (gimple_fold_builtin): Likewise.
> (expand_builtin): Likewise.
> (gt_ggc_mx): Likewise.
> (gt_pch_nx): Likewise.
> (gt_pch_nx): Likewise.
> * config/arm/arm-mve-builtins.def (s8): Define new type suffix.
> (s16): Likewise.
> (s32): Likewise.
> (s64): Likewise.
> (u8): Likewise.
> (u16): Likewise.
> (u32): Likewise.
> (u64): Likewise.
> (f16): Likewise.
> (f32): Likewise.
> (n): New mode.
> (offset): New mode.
> * config/arm/arm-mve-builtins.h (MAX_TUPLE_SIZE): New constant.
> (CP_READ_FPCR): Likewise.
> (CP_RAISE_FP_EXCEPTIONS): Likewise.
> (CP_READ_MEMORY): Likewise.
> (CP_WRITE_MEMORY): Likewise.
> (enum units_index): New enum.
> (enum predication_index): New.
> (enum type_class_index): New.
> (enum mode_suffix_index): New enum.
> (enum type_suffix_index): New.
> (struct mode_suffix_info): New struct.
> (struct type_suffix_info): New.
> (struct function_group_info): Likewise.
> (class function_instance): Likewise.
> (class registered_function): Likewise.
> (class function_builder): Likewise.
> (class function_call_info): Likewise.
> (class function_resolver): Likewise.
> (class function_checker): Likewise.
> (class gimple_folder): Likewise.
> (class function_expander): Likewise.
> (get_mve_pred16_t): Likewise.
> (find_mode_suffix): New function.
> (class function_base): Likewise.
> (class function_shape): Likewise.
> (function_instance::operator==): New function.
> (function_instance::operator!=): Likewise.
> (function_instance::vectors_per_tuple): Likewise.
> (function_instance::mode_suffix): Likewise.
> (function_instance::type_suffix): Likewise.
> (function_instance::scalar_type): Likewise.
> (function_instance::vector_type): Likewise.
> (function_instance::tuple_type): Likewise.
> (function_instance::vector_mode): Likewise.
> (function_call_info::function_returns_void_p): Likewise.
> (function_base::call_properties): Likewise.
> * config/arm/arm-protos.h (enum arm_builtin_class): Add
> ARM_BUILTIN_MVE.
> (handle_arm_mve_h): New.
> (resolve_overloaded_builtin): New.
> (check_builtin_call): New.
> (gimple_fold_builtin): New.
> (expand_builtin): New.
> * config/arm/arm.cc (TARGET_GIMPLE_FOLD_BUILTIN): Define as
> arm_gimple_fold_builtin.
> (arm_gimple_fold_builtin): New function.
> * config/arm/arm_mve.h: Use new arm_mve.h pragma.
> * config/arm/predicates.md (arm_any_register_operand): New
> predicate.
> * config/arm/t-arm (arm-mve-builtins.o): Add includes.
> (arm-mve-builtins-shapes.o): New target.
> (arm-mve-builtins-base.o): New target.
> * config/arm/arm-mve-builtins-base.cc: New file.
> * config/arm/arm-mve-builtins-base.def: New file.
> * config/arm/arm-mve-builtins-base.h: New file.
> * config/arm/arm-mve-builtins-functions.h: New file.
> * config/arm/arm-mve-builtins-shapes.cc: New file.
> * config/arm/arm-mve-builtins-shapes.h: New file.
>
> Co-authored-by: Christophe Lyon <christophe.lyon@arm.com>
> ---
> gcc/config.gcc | 2 +-
> gcc/config/arm/arm-builtins.cc | 15 +-
> gcc/config/arm/arm-builtins.h | 1 +
> gcc/config/arm/arm-c.cc | 42 +-
> gcc/config/arm/arm-mve-builtins-base.cc | 45 +
> gcc/config/arm/arm-mve-builtins-base.def | 24 +
> gcc/config/arm/arm-mve-builtins-base.h | 29 +
> gcc/config/arm/arm-mve-builtins-functions.h | 50 +
> gcc/config/arm/arm-mve-builtins-shapes.cc | 343 ++++
> gcc/config/arm/arm-mve-builtins-shapes.h | 30 +
> gcc/config/arm/arm-mve-builtins.cc | 1950 ++++++++++++++++++-
> gcc/config/arm/arm-mve-builtins.def | 40 +-
> gcc/config/arm/arm-mve-builtins.h | 669 ++++++-
> gcc/config/arm/arm-protos.h | 10 +-
> gcc/config/arm/arm.cc | 27 +
> gcc/config/arm/arm_mve.h | 6 +
> gcc/config/arm/predicates.md | 4 +
> gcc/config/arm/t-arm | 32 +-
> 18 files changed, 3292 insertions(+), 27 deletions(-)
> create mode 100644 gcc/config/arm/arm-mve-builtins-base.cc
> create mode 100644 gcc/config/arm/arm-mve-builtins-base.def
> create mode 100644 gcc/config/arm/arm-mve-builtins-base.h
> create mode 100644 gcc/config/arm/arm-mve-builtins-functions.h
> create mode 100644 gcc/config/arm/arm-mve-builtins-shapes.cc
> create mode 100644 gcc/config/arm/arm-mve-builtins-shapes.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 6fd1594480a..5d49f5890ab 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -362,7 +362,7 @@ arc*-*-*)
> ;;
> arm*-*-*)
> cpu_type=arm
> - extra_objs="arm-builtins.o arm-mve-builtins.o aarch-common.o
> aarch-bti-insert.o"
> + extra_objs="arm-builtins.o arm-mve-builtins.o arm-mve-builtins-
> shapes.o arm-mve-builtins-base.o aarch-common.o aarch-bti-insert.o"
> extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_fp16.h
> arm_cmse.h arm_bf16.h arm_mve_types.h arm_mve.h arm_cde.h"
> target_type_format_char='%'
> c_target_objs="arm-c.o"
> diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc
> index adcb50d2185..d0c57409b4c 100644
> --- a/gcc/config/arm/arm-builtins.cc
> +++ b/gcc/config/arm/arm-builtins.cc
> @@ -2712,6 +2712,7 @@ arm_general_builtin_decl (unsigned code)
> return arm_builtin_decls[code];
> }
>
> +/* Implement TARGET_BUILTIN_DECL. */
> /* Return the ARM builtin for CODE. */
> tree
> arm_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
> @@ -2721,6 +2722,8 @@ arm_builtin_decl (unsigned code, bool initialize_p
> ATTRIBUTE_UNUSED)
> {
> case ARM_BUILTIN_GENERAL:
> return arm_general_builtin_decl (subcode);
> + case ARM_BUILTIN_MVE:
> + return error_mark_node;
> default:
> gcc_unreachable ();
> }
> @@ -4087,6 +4090,8 @@ arm_expand_builtin (tree exp,
> {
> case ARM_BUILTIN_GENERAL:
> return arm_general_expand_builtin (subcode, exp, target, ignore);
> + case ARM_BUILTIN_MVE:
> + return arm_mve::expand_builtin (subcode, exp, target);
> default:
> gcc_unreachable ();
> }
> @@ -4188,8 +4193,9 @@ arm_general_check_builtin_call (unsigned int code)
>
> /* Implement TARGET_CHECK_BUILTIN_CALL. */
> bool
> -arm_check_builtin_call (location_t, vec<location_t>, tree fndecl, tree,
> - unsigned int, tree *)
> +arm_check_builtin_call (location_t loc, vec<location_t> arg_loc,
> + tree fndecl, tree orig_fndecl,
> + unsigned int nargs, tree *args)
> {
> unsigned int code = DECL_MD_FUNCTION_CODE (fndecl);
> unsigned int subcode = code >> ARM_BUILTIN_SHIFT;
> @@ -4197,6 +4203,9 @@ arm_check_builtin_call (location_t,
> vec<location_t>, tree fndecl, tree,
> {
> case ARM_BUILTIN_GENERAL:
> return arm_general_check_builtin_call (subcode);
> + case ARM_BUILTIN_MVE:
> + return arm_mve::check_builtin_call (loc, arg_loc, subcode,
> + orig_fndecl, nargs, args);
> default:
> gcc_unreachable ();
> }
> @@ -4215,6 +4224,8 @@ arm_describe_resolver (tree fndecl)
> && subcode < ARM_BUILTIN_MVE_BASE)
> return arm_cde_resolver;
> return arm_no_resolver;
> + case ARM_BUILTIN_MVE:
> + return arm_mve_resolver;
> default:
> gcc_unreachable ();
> }
> diff --git a/gcc/config/arm/arm-builtins.h b/gcc/config/arm/arm-builtins.h
> index 8c94b6bc40b..494dcd09411 100644
> --- a/gcc/config/arm/arm-builtins.h
> +++ b/gcc/config/arm/arm-builtins.h
> @@ -27,6 +27,7 @@
>
> enum resolver_ident {
> arm_cde_resolver,
> + arm_mve_resolver,
> arm_no_resolver
> };
> enum resolver_ident arm_describe_resolver (tree);
> diff --git a/gcc/config/arm/arm-c.cc b/gcc/config/arm/arm-c.cc
> index 59c0d8ce747..d3d93ceba00 100644
> --- a/gcc/config/arm/arm-c.cc
> +++ b/gcc/config/arm/arm-c.cc
> @@ -144,20 +144,44 @@ arm_pragma_arm (cpp_reader *)
> const char *name = TREE_STRING_POINTER (x);
> if (strcmp (name, "arm_mve_types.h") == 0)
> arm_mve::handle_arm_mve_types_h ();
> + else if (strcmp (name, "arm_mve.h") == 0)
> + {
> + if (pragma_lex (&x) == CPP_NAME)
> + {
> + if (strcmp (IDENTIFIER_POINTER (x), "true") == 0)
> + arm_mve::handle_arm_mve_h (true);
> + else if (strcmp (IDENTIFIER_POINTER (x), "false") == 0)
> + arm_mve::handle_arm_mve_h (false);
> + else
> + error ("%<#pragma GCC arm \"arm_mve.h\"%> requires a boolean
> parameter");
> + }
> + }
> else
> error ("unknown %<#pragma GCC arm%> option %qs", name);
> }
>
> -/* Implement TARGET_RESOLVE_OVERLOADED_BUILTIN. This is currently
> only
> - used for the MVE related builtins for the CDE extension.
> - Here we ensure the type of arguments is such that the size is correct, and
> - then return a tree that describes the same function call but with the
> - relevant types cast as necessary. */
> +/* Implement TARGET_RESOLVE_OVERLOADED_BUILTIN. */
> tree
> -arm_resolve_overloaded_builtin (location_t loc, tree fndecl, void *arglist)
> +arm_resolve_overloaded_builtin (location_t loc, tree fndecl,
> + void *uncast_arglist)
> {
> - if (arm_describe_resolver (fndecl) == arm_cde_resolver)
> - return arm_resolve_cde_builtin (loc, fndecl, arglist);
> + enum resolver_ident resolver = arm_describe_resolver (fndecl);
> + if (resolver == arm_cde_resolver)
> + return arm_resolve_cde_builtin (loc, fndecl, uncast_arglist);
> + if (resolver == arm_mve_resolver)
> + {
> + vec<tree, va_gc> empty = {};
> + vec<tree, va_gc> *arglist = (uncast_arglist
> + ? (vec<tree, va_gc> *) uncast_arglist
> + : &empty);
> + unsigned int code = DECL_MD_FUNCTION_CODE (fndecl);
> + unsigned int subcode = code >> ARM_BUILTIN_SHIFT;
> + tree new_fndecl = arm_mve::resolve_overloaded_builtin (loc, subcode,
> arglist);
> + if (new_fndecl == NULL_TREE || new_fndecl == error_mark_node)
> + return new_fndecl;
> + return build_function_call_vec (loc, vNULL, new_fndecl, arglist,
> + NULL, fndecl);
> + }
> return NULL_TREE;
> }
>
> @@ -519,7 +543,9 @@ arm_register_target_pragmas (void)
> {
> /* Update pragma hook to allow parsing #pragma GCC target. */
> targetm.target_option.pragma_parse = arm_pragma_target_parse;
> +
> targetm.resolve_overloaded_builtin = arm_resolve_overloaded_builtin;
> + targetm.check_builtin_call = arm_check_builtin_call;
>
> c_register_pragma ("GCC", "arm", arm_pragma_arm);
>
> diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-
> mve-builtins-base.cc
> new file mode 100644
> index 00000000000..e9f285faf2b
> --- /dev/null
> +++ b/gcc/config/arm/arm-mve-builtins-base.cc
> @@ -0,0 +1,45 @@
> +/* ACLE support for Arm MVE (__ARM_FEATURE_MVE intrinsics)
> + Copyright (C) 2023 Free Software Foundation, Inc.
> +
> + This file is part of GCC.
> +
> + GCC is free software; you can redistribute it and/or modify it
> + under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3, or (at your option)
> + any later version.
> +
> + GCC is distributed in the hope that it will be useful, but
> + WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with GCC; see the file COPYING3. If not see
> + <http://www.gnu.org/licenses/>. */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tm.h"
> +#include "tree.h"
> +#include "rtl.h"
> +#include "memmodel.h"
> +#include "insn-codes.h"
> +#include "optabs.h"
> +#include "basic-block.h"
> +#include "function.h"
> +#include "gimple.h"
> +#include "arm-mve-builtins.h"
> +#include "arm-mve-builtins-shapes.h"
> +#include "arm-mve-builtins-base.h"
> +#include "arm-mve-builtins-functions.h"
> +
> +using namespace arm_mve;
> +
> +namespace {
> +
> +} /* end anonymous namespace */
> +
> +namespace arm_mve {
> +
> +} /* end namespace arm_mve */
> diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-
> mve-builtins-base.def
> new file mode 100644
> index 00000000000..d15ba2e23e8
> --- /dev/null
> +++ b/gcc/config/arm/arm-mve-builtins-base.def
> @@ -0,0 +1,24 @@
> +/* ACLE support for Arm MVE (__ARM_FEATURE_MVE intrinsics)
> + Copyright (C) 2023 Free Software Foundation, Inc.
> +
> + This file is part of GCC.
> +
> + GCC is free software; you can redistribute it and/or modify it
> + under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3, or (at your option)
> + any later version.
> +
> + GCC is distributed in the hope that it will be useful, but
> + WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with GCC; see the file COPYING3. If not see
> + <http://www.gnu.org/licenses/>. */
> +
> +#define REQUIRES_FLOAT false
> +#undef REQUIRES_FLOAT
> +
> +#define REQUIRES_FLOAT true
> +#undef REQUIRES_FLOAT
> diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-
> mve-builtins-base.h
> new file mode 100644
> index 00000000000..c4d7b750cd5
> --- /dev/null
> +++ b/gcc/config/arm/arm-mve-builtins-base.h
> @@ -0,0 +1,29 @@
> +/* ACLE support for Arm MVE (__ARM_FEATURE_MVE intrinsics)
> + Copyright (C) 2023 Free Software Foundation, Inc.
> +
> + This file is part of GCC.
> +
> + GCC is free software; you can redistribute it and/or modify it
> + under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3, or (at your option)
> + any later version.
> +
> + GCC is distributed in the hope that it will be useful, but
> + WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with GCC; see the file COPYING3. If not see
> + <http://www.gnu.org/licenses/>. */
> +
> +#ifndef GCC_ARM_MVE_BUILTINS_BASE_H
> +#define GCC_ARM_MVE_BUILTINS_BASE_H
> +
> +namespace arm_mve {
> +namespace functions {
> +
> +} /* end namespace arm_mve::functions */
> +} /* end namespace arm_mve */
> +
> +#endif
> diff --git a/gcc/config/arm/arm-mve-builtins-functions.h
> b/gcc/config/arm/arm-mve-builtins-functions.h
> new file mode 100644
> index 00000000000..dff01999bcd
> --- /dev/null
> +++ b/gcc/config/arm/arm-mve-builtins-functions.h
> @@ -0,0 +1,50 @@
> +/* ACLE support for Arm MVE (function_base classes)
> + Copyright (C) 2023 Free Software Foundation, Inc.
> +
> + This file is part of GCC.
> +
> + GCC is free software; you can redistribute it and/or modify it
> + under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3, or (at your option)
> + any later version.
> +
> + GCC is distributed in the hope that it will be useful, but
> + WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with GCC; see the file COPYING3. If not see
> + <http://www.gnu.org/licenses/>. */
> +
> +#ifndef GCC_ARM_MVE_BUILTINS_FUNCTIONS_H
> +#define GCC_ARM_MVE_BUILTINS_FUNCTIONS_H
> +
> +namespace arm_mve {
> +
> +/* Wrap T, which is derived from function_base, and indicate that the
> + function never has side effects. It is only necessary to use this
> + wrapper on functions that might have floating-point suffixes, since
> + otherwise we assume by default that the function has no side effects. */
> +template<typename T>
> +class quiet : public T
> +{
> +public:
> + CONSTEXPR quiet () : T () {}
> +
> + unsigned int
> + call_properties (const function_instance &) const override
> + {
> + return 0;
> + }
> +};
> +
> +} /* end namespace arm_mve */
> +
> +/* Declare the global function base NAME, creating it from an instance
> + of class CLASS with constructor arguments ARGS. */
> +#define FUNCTION(NAME, CLASS, ARGS) \
> + namespace { static CONSTEXPR const CLASS NAME##_obj ARGS; } \
> + namespace functions { const function_base *const NAME = &NAME##_obj;
> }
> +
> +#endif
> diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-
> mve-builtins-shapes.cc
> new file mode 100644
> index 00000000000..f20660d8319
> --- /dev/null
> +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
> @@ -0,0 +1,343 @@
> +/* ACLE support for Arm MVE (function shapes)
> + Copyright (C) 2023 Free Software Foundation, Inc.
> +
> + This file is part of GCC.
> +
> + GCC is free software; you can redistribute it and/or modify it
> + under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3, or (at your option)
> + any later version.
> +
> + GCC is distributed in the hope that it will be useful, but
> + WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with GCC; see the file COPYING3. If not see
> + <http://www.gnu.org/licenses/>. */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tm.h"
> +#include "tree.h"
> +#include "rtl.h"
> +#include "memmodel.h"
> +#include "insn-codes.h"
> +#include "optabs.h"
> +#include "arm-mve-builtins.h"
> +#include "arm-mve-builtins-shapes.h"
> +
> +/* In the comments below, _t0 represents the first type suffix
> + (e.g. "_s8") and _t1 represents the second. T0/T1 represent the
> + type full names (e.g. int8x16_t). Square brackets enclose
> + characters that are present in only the full name, not the
> + overloaded name. Governing predicate arguments and predicate
> + suffixes are not shown, since they depend on the predication type,
> + which is a separate piece of information from the shape. */
> +
> +namespace arm_mve {
> +
> +/* If INSTANCE has a predicate, add it to the list of argument types
> + in ARGUMENT_TYPES. RETURN_TYPE is the type returned by the
> + function. */
> +static void
> +apply_predication (const function_instance &instance, tree return_type,
> + vec<tree> &argument_types)
> +{
> + if (instance.pred != PRED_none)
> + {
> + /* When predicate is PRED_m, insert a first argument
> + ("inactive") with the same type as return_type. */
> + if (instance.has_inactive_argument ())
> + argument_types.quick_insert (0, return_type);
> + argument_types.quick_push (get_mve_pred16_t ());
> + }
> +}
> +
> +/* Parse and move past an element type in FORMAT and return it as a type
> + suffix. The format is:
> +
> + [01] - the element type in type suffix 0 or 1 of INSTANCE.
> + h<elt> - a half-sized version of <elt>
> + s<bits> - a signed type with the given number of bits
> + s[01] - a signed type with the same width as type suffix 0 or 1
> + u<bits> - an unsigned type with the given number of bits
> + u[01] - an unsigned type with the same width as type suffix 0 or 1
> + w<elt> - a double-sized version of <elt>
> + x<bits> - a type with the given number of bits and same signedness
> + as the next argument.
> +
> + Future intrinsics will extend this format. */
> +static type_suffix_index
> +parse_element_type (const function_instance &instance, const char
> *&format)
> +{
> + int ch = *format++;
> +
> +
> + if (ch == 's' || ch == 'u')
> + {
> + type_class_index tclass = (ch == 'f' ? TYPE_float
> + : ch == 's' ? TYPE_signed
> + : TYPE_unsigned);
> + char *end;
> + unsigned int bits = strtol (format, &end, 10);
> + format = end;
> + if (bits == 0 || bits == 1)
> + bits = instance.type_suffix (bits).element_bits;
> + return find_type_suffix (tclass, bits);
> + }
> +
> + if (ch == 'h')
> + {
> + type_suffix_index suffix = parse_element_type (instance, format);
> + return find_type_suffix (type_suffixes[suffix].tclass,
> + type_suffixes[suffix].element_bits / 2);
> + }
> +
> + if (ch == 'w')
> + {
> + type_suffix_index suffix = parse_element_type (instance, format);
> + return find_type_suffix (type_suffixes[suffix].tclass,
> + type_suffixes[suffix].element_bits * 2);
> + }
> +
> + if (ch == 'x')
> + {
> + const char *next = format;
> + next = strstr (format, ",");
> + next+=2;
> + type_suffix_index suffix = parse_element_type (instance, next);
> + type_class_index tclass = type_suffixes[suffix].tclass;
> + char *end;
> + unsigned int bits = strtol (format, &end, 10);
> + format = end;
> + return find_type_suffix (tclass, bits);
> + }
> +
> + if (ch == '0' || ch == '1')
> + return instance.type_suffix_ids[ch - '0'];
> +
> + gcc_unreachable ();
> +}
> +
> +/* Read and return a type from FORMAT for function INSTANCE. Advance
> + FORMAT beyond the type string. The format is:
> +
> + p - predicates with type mve_pred16_t
> + s<elt> - a scalar type with the given element suffix
> + t<elt> - a vector or tuple type with given element suffix [*1]
> + v<elt> - a vector with the given element suffix
> +
> + where <elt> has the format described above parse_element_type.
> +
> + Future intrinsics will extend this format.
> +
> + [*1] the vectors_per_tuple function indicates whether the type should
> + be a tuple, and if so, how many vectors it should contain. */
> +static tree
> +parse_type (const function_instance &instance, const char *&format)
> +{
> + int ch = *format++;
> +
> + if (ch == 'p')
> + return get_mve_pred16_t ();
> +
> + if (ch == 's')
> + {
> + type_suffix_index suffix = parse_element_type (instance, format);
> + return scalar_types[type_suffixes[suffix].vector_type];
> + }
> +
> + if (ch == 't')
> + {
> + type_suffix_index suffix = parse_element_type (instance, format);
> + vector_type_index vector_type = type_suffixes[suffix].vector_type;
> + unsigned int num_vectors = instance.vectors_per_tuple ();
> + return acle_vector_types[num_vectors - 1][vector_type];
> + }
> +
> + if (ch == 'v')
> + {
> + type_suffix_index suffix = parse_element_type (instance, format);
> + return acle_vector_types[0][type_suffixes[suffix].vector_type];
> + }
> +
> + gcc_unreachable ();
> +}
> +
> +/* Read a type signature for INSTANCE from FORMAT. Add the argument
> + types to ARGUMENT_TYPES and return the return type. Assert there
> + are no more than MAX_ARGS arguments.
> +
> + The format is a comma-separated list of types (as for parse_type),
> + with the first type being the return type and the rest being the
> + argument types. */
> +static tree
> +parse_signature (const function_instance &instance, const char *format,
> + vec<tree> &argument_types, unsigned int max_args)
> +{
> + tree return_type = parse_type (instance, format);
> + unsigned int args = 0;
> + while (format[0] == ',')
> + {
> + gcc_assert (args < max_args);
> + format += 1;
> + tree argument_type = parse_type (instance, format);
> + argument_types.quick_push (argument_type);
> + args += 1;
> + }
> + gcc_assert (format[0] == 0);
> + return return_type;
> +}
> +
> +/* Add one function instance for GROUP, using mode suffix
> MODE_SUFFIX_ID,
> + the type suffixes at index TI and the predication suffix at index PI.
> + The other arguments are as for build_all. */
> +static void
> +build_one (function_builder &b, const char *signature,
> + const function_group_info &group, mode_suffix_index
> mode_suffix_id,
> + unsigned int ti, unsigned int pi, bool preserve_user_namespace,
> + bool force_direct_overloads)
> +{
> + /* Current functions take at most five arguments. Match
> + parse_signature parameter below. */
> + auto_vec<tree, 5> argument_types;
> + function_instance instance (group.base_name, *group.base, *group.shape,
> + mode_suffix_id, group.types[ti],
> + group.preds[pi]);
> + tree return_type = parse_signature (instance, signature, argument_types,
> 5);
> + apply_predication (instance, return_type, argument_types);
> + b.add_unique_function (instance, return_type, argument_types,
> + preserve_user_namespace, group.requires_float,
> + force_direct_overloads);
> +}
> +
> +/* Add a function instance for every type and predicate combination in
> + GROUP, unless RESTRICT_TO_PREDS is non-null, in which case use only
> + the predicates it lists. Take the function base name from GROUP and the
> + mode suffix from MODE_SUFFIX_ID. Use SIGNATURE to construct the
> + function signature, then use apply_predication to add in the
> + predicate. */
> +static void
> +build_all (function_builder &b, const char *signature,
> + const function_group_info &group, mode_suffix_index mode_suffix_id,
> + bool preserve_user_namespace,
> + bool force_direct_overloads = false,
> + const predication_index *restrict_to_preds = NULL)
> +{
> + for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi)
> + {
> + unsigned int pi2 = 0;
> +
> + if (restrict_to_preds)
> + for (; restrict_to_preds[pi2] != NUM_PREDS; ++pi2)
> + if (restrict_to_preds[pi2] == group.preds[pi])
> + break;
> +
> + if (restrict_to_preds == NULL || restrict_to_preds[pi2] != NUM_PREDS)
> + for (unsigned int ti = 0;
> + ti == 0 || group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti)
> + build_one (b, signature, group, mode_suffix_id, ti, pi,
> + preserve_user_namespace, force_direct_overloads);
> + }
> +}
> +
> +/* Add a function instance for every type and predicate combination in
> + GROUP, unless RESTRICT_TO_PREDS is non-null, in which case use only
> + the predicates it lists, and only for 16-bit and 32-bit integers. Take
> + the function base name from GROUP and the mode suffix from
> + MODE_SUFFIX_ID. Use SIGNATURE to construct the function signature,
> + then use apply_predication to add in the predicate. */
> +static void
> +build_16_32 (function_builder &b, const char *signature,
> + const function_group_info &group, mode_suffix_index mode_suffix_id,
> + bool preserve_user_namespace,
> + bool force_direct_overloads = false,
> + const predication_index *restrict_to_preds = NULL)
> +{
> + for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi)
> + {
> + unsigned int pi2 = 0;
> +
> + if (restrict_to_preds)
> + for (; restrict_to_preds[pi2] != NUM_PREDS; ++pi2)
> + if (restrict_to_preds[pi2] == group.preds[pi])
> + break;
> +
> + if (restrict_to_preds == NULL || restrict_to_preds[pi2] != NUM_PREDS)
> + for (unsigned int ti = 0;
> + ti == 0 || group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti)
> + {
> + unsigned int element_bits = type_suffixes[group.types[ti][0]].element_bits;
> + type_class_index tclass = type_suffixes[group.types[ti][0]].tclass;
> + if ((tclass == TYPE_signed || tclass == TYPE_unsigned)
> + && (element_bits == 16 || element_bits == 32))
> + build_one (b, signature, group, mode_suffix_id, ti, pi,
> + preserve_user_namespace, force_direct_overloads);
> + }
> + }
> +}
> +
> +/* Declare the function shape NAME, pointing it to an instance
> + of class <NAME>_def. */
> +#define SHAPE(NAME) \
> + static CONSTEXPR const NAME##_def NAME##_obj; \
> + namespace shapes { const function_shape *const NAME = &NAME##_obj; }
> +
> +/* Base class for functions that are not overloaded. */
> +struct nonoverloaded_base : public function_shape
> +{
> + bool
> + explicit_type_suffix_p (unsigned int, enum predication_index, enum mode_suffix_index) const override
> + {
> + return true;
> + }
> +
> + bool
> + explicit_mode_suffix_p (enum predication_index, enum mode_suffix_index) const override
> + {
> + return true;
> + }
> +
> + bool
> + skip_overload_p (enum predication_index, enum mode_suffix_index) const override
> + {
> + return false;
> + }
> +
> + tree
> + resolve (function_resolver &) const override
> + {
> + gcc_unreachable ();
> + }
> +};
> +
> +/* Base class for overloaded functions. Bit N of EXPLICIT_MASK is true
> + if type suffix N appears in the overloaded name. */
> +template<unsigned int EXPLICIT_MASK>
> +struct overloaded_base : public function_shape
> +{
> + bool
> + explicit_type_suffix_p (unsigned int i, enum predication_index, enum mode_suffix_index) const override
> + {
> + return (EXPLICIT_MASK >> i) & 1;
> + }
> +
> + bool
> + explicit_mode_suffix_p (enum predication_index, enum mode_suffix_index) const override
> + {
> + return false;
> + }
> +
> + bool
> + skip_overload_p (enum predication_index, enum mode_suffix_index) const override
> + {
> + return false;
> + }
> +};
> +
> +} /* end namespace arm_mve */
> +
> +#undef SHAPE
> diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
> new file mode 100644
> index 00000000000..9e353b85a76
> --- /dev/null
> +++ b/gcc/config/arm/arm-mve-builtins-shapes.h
> @@ -0,0 +1,30 @@
> +/* ACLE support for Arm MVE (function shapes)
> + Copyright (C) 2023 Free Software Foundation, Inc.
> +
> + This file is part of GCC.
> +
> + GCC is free software; you can redistribute it and/or modify it
> + under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3, or (at your option)
> + any later version.
> +
> + GCC is distributed in the hope that it will be useful, but
> + WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with GCC; see the file COPYING3. If not see
> + <http://www.gnu.org/licenses/>. */
> +
> +#ifndef GCC_ARM_MVE_BUILTINS_SHAPES_H
> +#define GCC_ARM_MVE_BUILTINS_SHAPES_H
> +
> +namespace arm_mve
> +{
> + namespace shapes
> + {
> + } /* end namespace arm_mve::shapes */
> +} /* end namespace arm_mve */
> +
> +#endif
> diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc
> index 7586a82e3c1..b0cceb75ceb 100644
> --- a/gcc/config/arm/arm-mve-builtins.cc
> +++ b/gcc/config/arm/arm-mve-builtins.cc
> @@ -24,7 +24,19 @@
> #include "coretypes.h"
> #include "tm.h"
> #include "tree.h"
> +#include "rtl.h"
> +#include "tm_p.h"
> +#include "memmodel.h"
> +#include "insn-codes.h"
> +#include "optabs.h"
> +#include "recog.h"
> +#include "expr.h"
> +#include "basic-block.h"
> +#include "function.h"
> #include "fold-const.h"
> +#include "gimple.h"
> +#include "gimple-iterator.h"
> +#include "emit-rtl.h"
> #include "langhooks.h"
> #include "stringpool.h"
> #include "attribs.h"
> @@ -32,6 +44,8 @@
> #include "arm-protos.h"
> #include "arm-builtins.h"
> #include "arm-mve-builtins.h"
> +#include "arm-mve-builtins-base.h"
> +#include "arm-mve-builtins-shapes.h"
>
> namespace arm_mve {
>
> @@ -46,6 +60,33 @@ struct vector_type_info
> const bool requires_float;
> };
>
> +/* Describes a function decl. */
> +class GTY(()) registered_function
> +{
> +public:
> + /* The ACLE function that the decl represents. */
> + function_instance instance GTY ((skip));
> +
> + /* The decl itself. */
> + tree decl;
> +
> + /* Whether the function requires a floating-point ABI. */
> + bool requires_float;
> +
> + /* True if the decl represents an overloaded function that needs to be
> + resolved by function_resolver. */
> + bool overloaded_p;
> +};
> +
> +/* Hash traits for registered_function. */
> +struct registered_function_hasher : nofree_ptr_hash <registered_function>
> +{
> + typedef function_instance compare_type;
> +
> + static hashval_t hash (value_type);
> + static bool equal (value_type, const compare_type &);
> +};
> +
> /* Flag indicating whether the arm MVE types have been handled. */
> static bool handle_arm_mve_types_p;
>
> @@ -54,11 +95,167 @@ static CONSTEXPR const vector_type_info vector_types[] = {
> #define DEF_MVE_TYPE(ACLE_NAME, SCALAR_TYPE) \
> { #ACLE_NAME, REQUIRES_FLOAT },
> #include "arm-mve-builtins.def"
> -#undef DEF_MVE_TYPE
> +};
> +
> +/* The function name suffix associated with each predication type. */
> +static const char *const pred_suffixes[NUM_PREDS + 1] = {
> + "",
> + "_m",
> + "_p",
> + "_x",
> + "_z",
> + ""
> +};
> +
> +/* Static information about each mode_suffix_index. */
> +CONSTEXPR const mode_suffix_info mode_suffixes[] = {
> +#define VECTOR_TYPE_none NUM_VECTOR_TYPES
> +#define DEF_MVE_MODE(NAME, BASE, DISPLACEMENT, UNITS) \
> + { "_" #NAME, VECTOR_TYPE_##BASE, VECTOR_TYPE_##DISPLACEMENT, UNITS_##UNITS },
> +#include "arm-mve-builtins.def"
> +#undef VECTOR_TYPE_none
> + { "", NUM_VECTOR_TYPES, NUM_VECTOR_TYPES, UNITS_none }
> +};
> +
> +/* Static information about each type_suffix_index. */
> +CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = {
> +#define DEF_MVE_TYPE_SUFFIX(NAME, ACLE_TYPE, CLASS, BITS, MODE) \
> + { "_" #NAME, \
> + VECTOR_TYPE_##ACLE_TYPE, \
> + TYPE_##CLASS, \
> + BITS, \
> + BITS / BITS_PER_UNIT, \
> + TYPE_##CLASS == TYPE_signed || TYPE_##CLASS == TYPE_unsigned, \
> + TYPE_##CLASS == TYPE_unsigned, \
> + TYPE_##CLASS == TYPE_float, \
> + 0, \
> + MODE },
> +#include "arm-mve-builtins.def"
> + { "", NUM_VECTOR_TYPES, TYPE_bool, 0, 0, false, false, false,
> + 0, VOIDmode }
> +};
> +
> +/* Define a TYPES_<combination> macro for each combination of type
> + suffixes that an ACLE function can have, where <combination> is the
> + name used in DEF_MVE_FUNCTION entries.
> +
> + Use S (T) for single type suffix T and D (T1, T2) for a pair of type
> + suffixes T1 and T2. Use commas to separate the suffixes.
> +
> + Although the order shouldn't matter, the convention is to sort the
> + suffixes lexicographically after dividing suffixes into a type
> + class ("b", "f", etc.) and a numerical bit count. */
> +
> +/* _f16. */
> +#define TYPES_float16(S, D) \
> + S (f16)
> +
> +/* _f16 _f32. */
> +#define TYPES_all_float(S, D) \
> + S (f16), S (f32)
> +
> +/* _s8 _u8. */
> +#define TYPES_integer_8(S, D) \
> + S (s8), S (u8)
> +
> +/* _s8 _s16
> + _u8 _u16. */
> +#define TYPES_integer_8_16(S, D) \
> + S (s8), S (s16), S (u8), S (u16)
> +
> +/* _s16 _s32
> + _u16 _u32. */
> +#define TYPES_integer_16_32(S, D) \
> + S (s16), S (s32), \
> + S (u16), S (u32)
> +
> +/* _s16 _s32. */
> +#define TYPES_signed_16_32(S, D) \
> + S (s16), S (s32)
> +
> +/* _s8 _s16 _s32. */
> +#define TYPES_all_signed(S, D) \
> + S (s8), S (s16), S (s32)
> +
> +/* _u8 _u16 _u32. */
> +#define TYPES_all_unsigned(S, D) \
> + S (u8), S (u16), S (u32)
> +
> +/* _s8 _s16 _s32
> + _u8 _u16 _u32. */
> +#define TYPES_all_integer(S, D) \
> + TYPES_all_signed (S, D), TYPES_all_unsigned (S, D)
> +
> +/* _s8 _s16 _s32 _s64
> + _u8 _u16 _u32 _u64. */
> +#define TYPES_all_integer_with_64(S, D) \
> + TYPES_all_signed (S, D), S (s64), TYPES_all_unsigned (S, D), S (u64)
> +
> +/* _s32 _u32. */
> +#define TYPES_integer_32(S, D) \
> + S (s32), S (u32)
> +
> +/* _s32. */
> +#define TYPES_signed_32(S, D) \
> + S (s32)
> +
> +/* Describe a pair of type suffixes in which only the first is used. */
> +#define DEF_VECTOR_TYPE(X) { TYPE_SUFFIX_ ## X, NUM_TYPE_SUFFIXES }
> +
> +/* Describe a pair of type suffixes in which both are used. */
> +#define DEF_DOUBLE_TYPE(X, Y) { TYPE_SUFFIX_ ## X, TYPE_SUFFIX_ ## Y }
> +
> +/* Create an array that can be used in arm-mve-builtins.def to
> + select the type suffixes in TYPES_<NAME>. */
> +#define DEF_MVE_TYPES_ARRAY(NAME) \
> + static const type_suffix_pair types_##NAME[] = { \
> + TYPES_##NAME (DEF_VECTOR_TYPE, DEF_DOUBLE_TYPE), \
> + { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES } \
> + }
> +
> +/* For functions that don't take any type suffixes. */
> +static const type_suffix_pair types_none[] = {
> + { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES },
> + { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES }
> +};
> +
> +DEF_MVE_TYPES_ARRAY (all_integer);
> +DEF_MVE_TYPES_ARRAY (all_integer_with_64);
> +DEF_MVE_TYPES_ARRAY (float16);
> +DEF_MVE_TYPES_ARRAY (all_float);
> +DEF_MVE_TYPES_ARRAY (all_signed);
> +DEF_MVE_TYPES_ARRAY (all_unsigned);
> +DEF_MVE_TYPES_ARRAY (integer_8);
> +DEF_MVE_TYPES_ARRAY (integer_8_16);
> +DEF_MVE_TYPES_ARRAY (integer_16_32);
> +DEF_MVE_TYPES_ARRAY (integer_32);
> +DEF_MVE_TYPES_ARRAY (signed_16_32);
> +DEF_MVE_TYPES_ARRAY (signed_32);
> +
> +/* Used by functions that have no governing predicate. */
> +static const predication_index preds_none[] = { PRED_none, NUM_PREDS };
> +
> +/* Used by functions that have the m (merging) predicated form, and in
> + addition have an unpredicated form. */
> +static const predication_index preds_m_or_none[] = {
> + PRED_m, PRED_none, NUM_PREDS
> +};
> +
> +/* Used by functions that have the mx (merging and "don't care")
> + predicated forms, and in addition have an unpredicated form. */
> +static const predication_index preds_mx_or_none[] = {
> + PRED_m, PRED_x, PRED_none, NUM_PREDS
> +};
> +
> +/* Used by functions that have the p predicated form, in addition to
> + an unpredicated form. */
> +static const predication_index preds_p_or_none[] = {
> + PRED_p, PRED_none, NUM_PREDS
> };
>
> /* The scalar type associated with each vector type. */
> -GTY(()) tree scalar_types[NUM_VECTOR_TYPES];
> +extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES];
> +tree scalar_types[NUM_VECTOR_TYPES];
>
> /* The single-predicate and single-vector types, with their built-in
> "__simd128_..._t" name. Allow an index of NUM_VECTOR_TYPES, which
> always
> @@ -66,7 +263,20 @@ GTY(()) tree scalar_types[NUM_VECTOR_TYPES];
> static GTY(()) tree abi_vector_types[NUM_VECTOR_TYPES + 1];
>
> /* Same, but with the arm_mve.h names. */
> -GTY(()) tree acle_vector_types[3][NUM_VECTOR_TYPES + 1];
> +extern GTY(()) tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1];
> +tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1];
> +
> +/* The list of all registered function decls, indexed by code. */
> +static GTY(()) vec<registered_function *, va_gc> *registered_functions;
> +
> +/* All registered function decls, hashed on the function_instance
> + that they implement. This is used for looking up implementations of
> + overloaded functions. */
> +static hash_table<registered_function_hasher> *function_table;
> +
> +/* True if we've already complained about attempts to use functions
> + when the required extension is disabled. */
> +static bool reported_missing_float_p;
>
> /* Return the MVE abi type with element of type TYPE. */
> static tree
> @@ -87,7 +297,6 @@ register_builtin_types ()
> #define DEF_MVE_TYPE(ACLE_NAME, SCALAR_TYPE) \
> scalar_types[VECTOR_TYPE_ ## ACLE_NAME] = SCALAR_TYPE;
> #include "arm-mve-builtins.def"
> -#undef DEF_MVE_TYPE
> for (unsigned int i = 0; i < NUM_VECTOR_TYPES; ++i)
> {
> if (vector_types[i].requires_float && !TARGET_HAVE_MVE_FLOAT)
> @@ -113,8 +322,18 @@ register_builtin_types ()
> static void
> register_vector_type (vector_type_index type)
> {
> +
> + /* If the target does not have the mve.fp extension, but the type requires
> + it, then it needs to be assigned a non-dummy type so that functions
> + with those types in their signature can be registered. This allows for
> + diagnostics about the missing extension, rather than about a missing
> + function definition. */
> if (vector_types[type].requires_float && !TARGET_HAVE_MVE_FLOAT)
> - return;
> + {
> + acle_vector_types[0][type] = void_type_node;
> + return;
> + }
> +
> tree vectype = abi_vector_types[type];
> tree id = get_identifier (vector_types[type].acle_name);
> tree decl = build_decl (input_location, TYPE_DECL, id, vectype);
> @@ -133,15 +352,26 @@ register_vector_type (vector_type_index type)
> acle_vector_types[0][type] = vectype;
> }
>
> -/* Register tuple type TYPE with NUM_VECTORS arity under its
> - arm_mve_types.h name. */
> +/* Register tuple types of element type TYPE under their arm_mve_types.h
> + names. */
> static void
> register_builtin_tuple_types (vector_type_index type)
> {
> const vector_type_info* info = &vector_types[type];
> +
> + /* If the target does not have the mve.fp extension, but the type requires
> + it, then it needs to be assigned a non-dummy type so that functions
> + with those types in their signature can be registered. This allows for
> + diagnostics about the missing extension, rather than about a missing
> + function definition. */
> if (scalar_types[type] == boolean_type_node
> || (info->requires_float && !TARGET_HAVE_MVE_FLOAT))
> + {
> + for (unsigned int num_vectors = 2; num_vectors <= 4; num_vectors += 2)
> + acle_vector_types[num_vectors >> 1][type] = void_type_node;
> return;
> + }
> +
> const char *vector_type_name = info->acle_name;
> char buffer[sizeof ("float32x4x2_t")];
> for (unsigned int num_vectors = 2; num_vectors <= 4; num_vectors += 2)
> @@ -189,8 +419,1710 @@ handle_arm_mve_types_h ()
> }
> }
>
> -} /* end namespace arm_mve */
> +/* Implement #pragma GCC arm "arm_mve.h" <bool>. */
> +void
> +handle_arm_mve_h (bool preserve_user_namespace)
> +{
> + if (function_table)
> + {
> + error ("duplicate definition of %qs", "arm_mve.h");
> + return;
> + }
>
> -using namespace arm_mve;
> + /* Define MVE functions. */
> + function_table = new hash_table<registered_function_hasher> (1023);
> +}
> +
> +/* Return true if CANDIDATE is equivalent to MODEL_TYPE for overloading
> + purposes. */
> +static bool
> +matches_type_p (const_tree model_type, const_tree candidate)
> +{
> + if (VECTOR_TYPE_P (model_type))
> + {
> + if (!VECTOR_TYPE_P (candidate)
> + || maybe_ne (TYPE_VECTOR_SUBPARTS (model_type),
> + TYPE_VECTOR_SUBPARTS (candidate))
> + || TYPE_MODE (model_type) != TYPE_MODE (candidate))
> + return false;
> +
> + model_type = TREE_TYPE (model_type);
> + candidate = TREE_TYPE (candidate);
> + }
> + return (candidate != error_mark_node
> + && TYPE_MAIN_VARIANT (model_type) == TYPE_MAIN_VARIANT
> (candidate));
> +}
> +
> +/* Report an error against LOCATION that the user has tried to use
> + a floating point function when the mve.fp extension is disabled. */
> +static void
> +report_missing_float (location_t location, tree fndecl)
> +{
> + /* Avoid reporting a slew of messages for a single oversight. */
> + if (reported_missing_float_p)
> + return;
> +
> + error_at (location, "ACLE function %qD requires ISA extension %qs",
> + fndecl, "mve.fp");
> + inform (location, "you can enable mve.fp by using the command-line"
> + " option %<-march%>, or by using the %<target%>"
> + " attribute or pragma");
> + reported_missing_float_p = true;
> +}
> +
> +/* Report that LOCATION has a call to FNDECL in which argument ARGNO
> + was not an integer constant expression. ARGNO counts from zero. */
> +static void
> +report_non_ice (location_t location, tree fndecl, unsigned int argno)
> +{
> + error_at (location, "argument %d of %qE must be an integer constant"
> + " expression", argno + 1, fndecl);
> +}
> +
> +/* Report that LOCATION has a call to FNDECL in which argument ARGNO has
> + the value ACTUAL, whereas the function requires a value in the range
> + [MIN, MAX]. ARGNO counts from zero. */
> +static void
> +report_out_of_range (location_t location, tree fndecl, unsigned int argno,
> + HOST_WIDE_INT actual, HOST_WIDE_INT min,
> + HOST_WIDE_INT max)
> +{
> + error_at (location, "passing %wd to argument %d of %qE, which expects"
> + " a value in the range [%wd, %wd]", actual, argno + 1, fndecl,
> + min, max);
> +}
> +
> +/* Report that LOCATION has a call to FNDECL in which argument ARGNO has
> + the value ACTUAL, whereas the function requires a valid value of
> + enum type ENUMTYPE. ARGNO counts from zero. */
> +static void
> +report_not_enum (location_t location, tree fndecl, unsigned int argno,
> + HOST_WIDE_INT actual, tree enumtype)
> +{
> + error_at (location, "passing %wd to argument %d of %qE, which expects"
> + " a valid %qT value", actual, argno + 1, fndecl, enumtype);
> +}
> +
> +/* Checks that the mve.fp extension is enabled, given that REQUIRES_FLOAT
> + indicates whether it is required or not for function FNDECL.
> + Report an error against LOCATION if not. */
> +static bool
> +check_requires_float (location_t location, tree fndecl,
> + bool requires_float)
> +{
> + if (requires_float && !TARGET_HAVE_MVE_FLOAT)
> + {
> + report_missing_float (location, fndecl);
> + return false;
> + }
> +
> + return true;
> +}
> +
> +/* Return a hash code for a function_instance. */
> +hashval_t
> +function_instance::hash () const
> +{
> + inchash::hash h;
> + /* BASE uniquely determines BASE_NAME, so we don't need to hash both.
> */
> + h.add_ptr (base);
> + h.add_ptr (shape);
> + h.add_int (mode_suffix_id);
> + h.add_int (type_suffix_ids[0]);
> + h.add_int (type_suffix_ids[1]);
> + h.add_int (pred);
> + return h.end ();
> +}
> +
> +/* Return a set of CP_* flags that describe what the function could do,
> + taking the command-line flags into account. */
> +unsigned int
> +function_instance::call_properties () const
> +{
> + unsigned int flags = base->call_properties (*this);
> +
> + /* -fno-trapping-math means that we can assume any FP exceptions
> + are not user-visible. */
> + if (!flag_trapping_math)
> + flags &= ~CP_RAISE_FP_EXCEPTIONS;
> +
> + return flags;
> +}
> +
> +/* Return true if calls to the function could read some form of
> + global state. */
> +bool
> +function_instance::reads_global_state_p () const
> +{
> + unsigned int flags = call_properties ();
> +
> + /* Preserve any dependence on rounding mode, flush to zero mode, etc.
> + There is currently no way of turning this off; in particular,
> + -fno-rounding-math (which is the default) means that we should make
> + the usual assumptions about rounding mode, which for intrinsics means
> + acting as the instructions do. */
> + if (flags & CP_READ_FPCR)
> + return true;
> +
> + return false;
> +}
> +
> +/* Return true if calls to the function could modify some form of
> + global state. */
> +bool
> +function_instance::modifies_global_state_p () const
> +{
> + unsigned int flags = call_properties ();
> +
> + /* Preserve any exception state written back to the FPCR,
> + unless -fno-trapping-math says this is unnecessary. */
> + if (flags & CP_RAISE_FP_EXCEPTIONS)
> + return true;
> +
> + /* Handle direct modifications of global state. */
> + return flags & CP_WRITE_MEMORY;
> +}
> +
> +/* Return true if calls to the function could raise a signal. */
> +bool
> +function_instance::could_trap_p () const
> +{
> + unsigned int flags = call_properties ();
> +
> + /* Handle functions that could raise SIGFPE. */
> + if (flags & CP_RAISE_FP_EXCEPTIONS)
> + return true;
> +
> + /* Handle functions that could raise SIGBUS or SIGSEGV. */
> + if (flags & (CP_READ_MEMORY | CP_WRITE_MEMORY))
> + return true;
> +
> + return false;
> +}
> +
> +/* Return true if the function has an implicit "inactive" argument.
> + This is the case for most _m predicated functions, but not all.
> + The list will be updated as needed. */
> +bool
> +function_instance::has_inactive_argument () const
> +{
> + if (pred != PRED_m)
> + return false;
> +
> + return true;
> +}
> +
> +inline hashval_t
> +registered_function_hasher::hash (value_type value)
> +{
> + return value->instance.hash ();
> +}
> +
> +inline bool
> +registered_function_hasher::equal (value_type value, const compare_type &key)
> +{
> + return value->instance == key;
> +}
> +
> +function_builder::function_builder ()
> +{
> + m_overload_type = build_function_type (void_type_node, void_list_node);
> + m_direct_overloads = lang_GNU_CXX ();
> + gcc_obstack_init (&m_string_obstack);
> +}
> +
> +function_builder::~function_builder ()
> +{
> + obstack_free (&m_string_obstack, NULL);
> +}
> +
> +/* Add NAME to the end of the function name being built. */
> +void
> +function_builder::append_name (const char *name)
> +{
> + obstack_grow (&m_string_obstack, name, strlen (name));
> +}
> +
> +/* Zero-terminate and complete the function name being built. */
> +char *
> +function_builder::finish_name ()
> +{
> + obstack_1grow (&m_string_obstack, 0);
> + return (char *) obstack_finish (&m_string_obstack);
> +}
> +
> +/* Return the overloaded or full function name for INSTANCE, with optional
> + prefix; PRESERVE_USER_NAMESPACE selects the prefix, and OVERLOADED_P
> + selects between the overloaded and the full name. Allocate the string on
> + m_string_obstack; the caller must use obstack_free to free it after use. */
> +char *
> +function_builder::get_name (const function_instance &instance,
> + bool preserve_user_namespace,
> + bool overloaded_p)
> +{
> + if (preserve_user_namespace)
> + append_name ("__arm_");
> + append_name (instance.base_name);
> + append_name (pred_suffixes[instance.pred]);
> + if (!overloaded_p
> + || instance.shape->explicit_mode_suffix_p (instance.pred,
> + instance.mode_suffix_id))
> + append_name (instance.mode_suffix ().string);
> + for (unsigned int i = 0; i < 2; ++i)
> + if (!overloaded_p
> + || instance.shape->explicit_type_suffix_p (i, instance.pred,
> + instance.mode_suffix_id))
> + append_name (instance.type_suffix (i).string);
> + return finish_name ();
> +}
> +
> +/* Add attribute NAME to ATTRS. */
> +static tree
> +add_attribute (const char *name, tree attrs)
> +{
> + return tree_cons (get_identifier (name), NULL_TREE, attrs);
> +}
> +
> +/* Return the appropriate function attributes for INSTANCE. */
> +tree
> +function_builder::get_attributes (const function_instance &instance)
> +{
> + tree attrs = NULL_TREE;
> +
> + if (!instance.modifies_global_state_p ())
> + {
> + if (instance.reads_global_state_p ())
> + attrs = add_attribute ("pure", attrs);
> + else
> + attrs = add_attribute ("const", attrs);
> + }
> +
> + if (!flag_non_call_exceptions || !instance.could_trap_p ())
> + attrs = add_attribute ("nothrow", attrs);
> +
> + return add_attribute ("leaf", attrs);
> +}
> +
> +/* Add a function called NAME with type FNTYPE and attributes ATTRS.
> + INSTANCE describes what the function does and OVERLOADED_P indicates
> + whether it is overloaded. REQUIRES_FLOAT indicates whether the function
> + requires the mve.fp extension. */
> +registered_function &
> +function_builder::add_function (const function_instance &instance,
> + const char *name, tree fntype, tree attrs,
> + bool requires_float,
> + bool overloaded_p,
> + bool placeholder_p)
> +{
> + unsigned int code = vec_safe_length (registered_functions);
> + code = (code << ARM_BUILTIN_SHIFT) | ARM_BUILTIN_MVE;
> +
> + /* We need to be able to generate placeholders to ensure that we have a
> + consistent numbering scheme for function codes between the C and C++
> + frontends, so that everything ties up in LTO.
> +
> + Currently, tree-streamer-in.cc:unpack_ts_function_decl_value_fields
> + validates that tree nodes returned by TARGET_BUILTIN_DECL are non-NULL and
> + some node other than error_mark_node. This is a holdover from when builtin
> + decls were streamed by code rather than by value.
> +
> + Ultimately, we should be able to remove this validation of BUILT_IN_MD
> + nodes and remove the target hook. For now, however, we need to appease the
> + validation and return a non-NULL, non-error_mark_node node, so we
> + arbitrarily choose integer_zero_node. */
> + tree decl = placeholder_p
> + ? integer_zero_node
> + : simulate_builtin_function_decl (input_location, name, fntype,
> + code, NULL, attrs);
> +
> + registered_function &rfn = *ggc_alloc <registered_function> ();
> + rfn.instance = instance;
> + rfn.decl = decl;
> + rfn.requires_float = requires_float;
> + rfn.overloaded_p = overloaded_p;
> + vec_safe_push (registered_functions, &rfn);
> +
> + return rfn;
> +}
> +
> +/* Add a built-in function for INSTANCE, with the argument types given
> + by ARGUMENT_TYPES and the return type given by RETURN_TYPE.
> + REQUIRES_FLOAT indicates whether the function requires the mve.fp extension,
> + and PRESERVE_USER_NAMESPACE indicates whether the function should also be
> + registered under its non-prefixed name. */
> +void
> +function_builder::add_unique_function (const function_instance &instance,
> + tree return_type,
> + vec<tree> &argument_types,
> + bool preserve_user_namespace,
> + bool requires_float,
> + bool force_direct_overloads)
> +{
> + /* Add the function under its full (unique) name with prefix. */
> + char *name = get_name (instance, true, false);
> + tree fntype = build_function_type_array (return_type,
> + argument_types.length (),
> + argument_types.address ());
> + tree attrs = get_attributes (instance);
> + registered_function &rfn = add_function (instance, name, fntype, attrs,
> + requires_float, false, false);
> +
> + /* Enter the function into the hash table. */
> + hashval_t hash = instance.hash ();
> + registered_function **rfn_slot
> + = function_table->find_slot_with_hash (instance, hash, INSERT);
> + gcc_assert (!*rfn_slot);
> + *rfn_slot = &rfn;
> +
> + /* Also add the non-prefixed non-overloaded function, if the user namespace
> + does not need to be preserved. */
> + if (!preserve_user_namespace)
> + {
> + char *noprefix_name = get_name (instance, false, false);
> + tree attrs = get_attributes (instance);
> + add_function (instance, noprefix_name, fntype, attrs, requires_float,
> + false, false);
> + }
> +
> + /* Also add the function under its overloaded alias, if we want
> + a separate decl for each instance of an overloaded function. */
> + char *overload_name = get_name (instance, true, true);
> + if (strcmp (name, overload_name) != 0)
> + {
> + /* Attribute lists shouldn't be shared. */
> + tree attrs = get_attributes (instance);
> + bool placeholder_p = !(m_direct_overloads || force_direct_overloads);
> + add_function (instance, overload_name, fntype, attrs,
> + requires_float, false, placeholder_p);
> +
> + /* Also add the non-prefixed overloaded function, if the user namespace
> + does not need to be preserved. */
> + if (!preserve_user_namespace)
> + {
> + char *noprefix_overload_name = get_name (instance, false, true);
> + tree attrs = get_attributes (instance);
> + add_function (instance, noprefix_overload_name, fntype, attrs,
> + requires_float, false, placeholder_p);
> + }
> + }
> +
> + obstack_free (&m_string_obstack, name);
> +}
> +
> +/* Add one function decl for INSTANCE, to be used with manual overload
> + resolution. REQUIRES_FLOAT indicates whether the function requires the
> + mve.fp extension.
> +
> + For simplicity, partition functions by instance and required extensions,
> + and check whether the required extensions are available as part of resolving
> + the function to the relevant unique function. */
> +void
> +function_builder::add_overloaded_function (const function_instance &instance,
> + bool preserve_user_namespace,
> + bool requires_float)
> +{
> + char *name = get_name (instance, true, true);
> + if (registered_function **map_value = m_overload_names.get (name))
> + {
> + gcc_assert ((*map_value)->instance == instance);
> + obstack_free (&m_string_obstack, name);
> + }
> + else
> + {
> + registered_function &rfn
> + = add_function (instance, name, m_overload_type, NULL_TREE,
> + requires_float, true, m_direct_overloads);
> + m_overload_names.put (name, &rfn);
> + if (!preserve_user_namespace)
> + {
> + char *noprefix_name = get_name (instance, false, true);
> + registered_function &noprefix_rfn
> + = add_function (instance, noprefix_name, m_overload_type,
> + NULL_TREE, requires_float, true,
> + m_direct_overloads);
> + m_overload_names.put (noprefix_name, &noprefix_rfn);
> + }
> + }
> +}
> +
> +/* If we are using manual overload resolution, add one function decl
> + for each overloaded function in GROUP. Take the function base name
> + from GROUP and the mode from MODE. */
> +void
> +function_builder::add_overloaded_functions (const function_group_info &group,
> + mode_suffix_index mode,
> + bool preserve_user_namespace)
> +{
> + for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi)
> + {
> + unsigned int explicit_type0
> + = (*group.shape)->explicit_type_suffix_p (0, group.preds[pi], mode);
> + unsigned int explicit_type1
> + = (*group.shape)->explicit_type_suffix_p (1, group.preds[pi], mode);
> +
> + if ((*group.shape)->skip_overload_p (group.preds[pi], mode))
> + continue;
> +
> + if (!explicit_type0 && !explicit_type1)
> + {
> + /* Deal with the common case in which there is one overloaded
> + function for all type combinations. */
> + function_instance instance (group.base_name, *group.base,
> + *group.shape, mode, types_none[0],
> + group.preds[pi]);
> + add_overloaded_function (instance, preserve_user_namespace,
> + group.requires_float);
> + }
> + else
> + for (unsigned int ti = 0; group.types[ti][0] != NUM_TYPE_SUFFIXES;
> + ++ti)
> + {
> + /* Stub out the types that are determined by overload
> + resolution. */
> + type_suffix_pair types = {
> + explicit_type0 ? group.types[ti][0] : NUM_TYPE_SUFFIXES,
> + explicit_type1 ? group.types[ti][1] : NUM_TYPE_SUFFIXES
> + };
> + function_instance instance (group.base_name, *group.base,
> + *group.shape, mode, types,
> + group.preds[pi]);
> + add_overloaded_function (instance, preserve_user_namespace,
> + group.requires_float);
> + }
> + }
> +}
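
To make the naming rule above concrete: each overload is registered under its "__arm_"-prefixed name, and additionally under the short name when __ARM_MVE_PRESERVE_USER_NAMESPACE is not defined. A standalone sketch of that rule (the helper name is invented, not part of the patch):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of the dual registration in add_overloaded_function: the
// "__arm_"-prefixed name is always added; the unprefixed name is added
// only when the user namespace need not be preserved.
std::vector<std::string>
names_to_register (const std::string &base_name, bool preserve_user_namespace)
{
  std::vector<std::string> names;
  names.push_back ("__arm_" + base_name);
  if (!preserve_user_namespace)
    names.push_back (base_name);
  return names;
}
```
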
> +
> +/* Register all the functions in GROUP. */
> +void
> +function_builder::register_function_group (const function_group_info &group,
> + bool preserve_user_namespace)
> +{
> + (*group.shape)->build (*this, group, preserve_user_namespace);
> +}
> +
> +function_call_info::function_call_info (location_t location_in,
> + const function_instance &instance_in,
> + tree fndecl_in)
> + : function_instance (instance_in), location (location_in), fndecl (fndecl_in)
> +{
> +}
> +
> +function_resolver::function_resolver (location_t location,
> + const function_instance &instance,
> + tree fndecl, vec<tree, va_gc> &arglist)
> + : function_call_info (location, instance, fndecl), m_arglist (arglist)
> +{
> +}
> +
> +/* Return the vector type associated with type suffix TYPE. */
> +tree
> +function_resolver::get_vector_type (type_suffix_index type)
> +{
> + return acle_vector_types[0][type_suffixes[type].vector_type];
> +}
> +
> +/* Return the <stdint.h> name associated with TYPE. Using the <stdint.h>
> + name should be more user-friendly than the underlying canonical type,
> + since it makes the signedness and bitwidth explicit. */
> +const char *
> +function_resolver::get_scalar_type_name (type_suffix_index type)
> +{
> + return vector_types[type_suffixes[type].vector_type].acle_name + 2;
> +}
> +
> +/* Return the type of argument I, or error_mark_node if it isn't
> + well-formed. */
> +tree
> +function_resolver::get_argument_type (unsigned int i)
> +{
> + tree arg = m_arglist[i];
> + return arg == error_mark_node ? arg : TREE_TYPE (arg);
> +}
> +
> +/* Return true if argument I is some form of scalar value. */
> +bool
> +function_resolver::scalar_argument_p (unsigned int i)
> +{
> + tree type = get_argument_type (i);
> + return (INTEGRAL_TYPE_P (type)
> + /* Allow pointer types, leaving the frontend to warn where
> + necessary. */
> + || POINTER_TYPE_P (type)
> + || SCALAR_FLOAT_TYPE_P (type));
> +}
> +
> +/* Report that the function has no form that takes type suffix TYPE.
> + Return error_mark_node. */
> +tree
> +function_resolver::report_no_such_form (type_suffix_index type)
> +{
> + error_at (location, "%qE has no form that takes %qT arguments",
> + fndecl, get_vector_type (type));
> + return error_mark_node;
> +}
> +
> +/* Silently check whether there is an instance of the function with the
> + mode suffix given by MODE and the type suffixes given by TYPE0 and TYPE1.
> + Return its function decl if so, otherwise return null. */
> +tree
> +function_resolver::lookup_form (mode_suffix_index mode,
> + type_suffix_index type0,
> + type_suffix_index type1)
> +{
> + type_suffix_pair types = { type0, type1 };
> + function_instance instance (base_name, base, shape, mode, types, pred);
> + registered_function *rfn
> + = function_table->find_with_hash (instance, instance.hash ());
> + return rfn ? rfn->decl : NULL_TREE;
> +}
> +
> +/* Resolve the function to one with the mode suffix given by MODE and the
> + type suffixes given by TYPE0 and TYPE1. Return its function decl on
> + success, otherwise report an error and return error_mark_node. */
> +tree
> +function_resolver::resolve_to (mode_suffix_index mode,
> + type_suffix_index type0,
> + type_suffix_index type1)
> +{
> + tree res = lookup_form (mode, type0, type1);
> + if (!res)
> + {
> + if (type1 == NUM_TYPE_SUFFIXES)
> + return report_no_such_form (type0);
> + if (type0 == type_suffix_ids[0])
> + return report_no_such_form (type1);
> + /* To be filled in when we have other cases. */
> + gcc_unreachable ();
> + }
> + return res;
> +}
> +
> +/* Require argument ARGNO to be a single vector or a tuple of NUM_VECTORS
> + vectors; NUM_VECTORS is 1 for the former. Return the associated type
> + suffix on success, using TYPE_SUFFIX_b for predicates. Report an error
> + and return NUM_TYPE_SUFFIXES on failure. */
> +type_suffix_index
> +function_resolver::infer_vector_or_tuple_type (unsigned int argno,
> + unsigned int num_vectors)
> +{
> + tree actual = get_argument_type (argno);
> + if (actual == error_mark_node)
> + return NUM_TYPE_SUFFIXES;
> +
> + /* A linear search should be OK here, since the code isn't hot and
> + the number of types is only small. */
> + for (unsigned int size_i = 0; size_i < MAX_TUPLE_SIZE; ++size_i)
> + for (unsigned int suffix_i = 0; suffix_i < NUM_TYPE_SUFFIXES; ++suffix_i)
> + {
> + vector_type_index type_i = type_suffixes[suffix_i].vector_type;
> + tree type = acle_vector_types[size_i][type_i];
> + if (type && matches_type_p (type, actual))
> + {
> + if (size_i + 1 == num_vectors)
> + return type_suffix_index (suffix_i);
> +
> + if (num_vectors == 1)
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects a single MVE vector rather than a tuple",
> + actual, argno + 1, fndecl);
> + else if (size_i == 0 && type_i != VECTOR_TYPE_mve_pred16_t)
> + /* num_vectors is always != 1, so the singular isn't needed. */
> + error_n (location, num_vectors, "%qT%d%qE%d",
> + "passing single vector %qT to argument %d"
> + " of %qE, which expects a tuple of %d vectors",
> + actual, argno + 1, fndecl, num_vectors);
> + else
> + /* num_vectors is always != 1, so the singular isn't needed. */
> + error_n (location, num_vectors, "%qT%d%qE%d",
> + "passing %qT to argument %d of %qE, which"
> + " expects a tuple of %d vectors", actual, argno + 1,
> + fndecl, num_vectors);
> + return NUM_TYPE_SUFFIXES;
> + }
> + }
> +
> + if (num_vectors == 1)
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects an MVE vector type", actual, argno + 1, fndecl);
> + else
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects an MVE tuple type", actual, argno + 1, fndecl);
> + return NUM_TYPE_SUFFIXES;
> +}
> +
> +/* Require argument ARGNO to have some form of vector type. Return the
> + associated type suffix on success, using TYPE_SUFFIX_b for predicates.
> + Report an error and return NUM_TYPE_SUFFIXES on failure. */
> +type_suffix_index
> +function_resolver::infer_vector_type (unsigned int argno)
> +{
> + return infer_vector_or_tuple_type (argno, 1);
> +}
> +
> +/* Require argument ARGNO to be a vector or scalar argument. Return true
> + if it is, otherwise report an appropriate error. */
> +bool
> +function_resolver::require_vector_or_scalar_type (unsigned int argno)
> +{
> + tree actual = get_argument_type (argno);
> + if (actual == error_mark_node)
> + return false;
> +
> + if (!scalar_argument_p (argno) && !VECTOR_TYPE_P (actual))
> + {
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects a vector or scalar type", actual, argno + 1, fndecl);
> + return false;
> + }
> +
> + return true;
> +}
> +
> +/* Require argument ARGNO to have vector type TYPE, in cases where this
> + requirement holds for all uses of the function. Return true if the
> + argument has the right form, otherwise report an appropriate error. */
> +bool
> +function_resolver::require_vector_type (unsigned int argno,
> + vector_type_index type)
> +{
> + tree expected = acle_vector_types[0][type];
> + tree actual = get_argument_type (argno);
> + if (actual == error_mark_node)
> + return false;
> +
> + if (!matches_type_p (expected, actual))
> + {
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects %qT", actual, argno + 1, fndecl, expected);
> + return false;
> + }
> + return true;
> +}
> +
> +/* Like require_vector_type, but TYPE is inferred from previous arguments
> + rather than being a fixed part of the function signature. This changes
> + the nature of the error messages. */
> +bool
> +function_resolver::require_matching_vector_type (unsigned int argno,
> + type_suffix_index type)
> +{
> + type_suffix_index new_type = infer_vector_type (argno);
> + if (new_type == NUM_TYPE_SUFFIXES)
> + return false;
> +
> + if (type != new_type)
> + {
> + error_at (location, "passing %qT to argument %d of %qE, but"
> + " previous arguments had type %qT",
> + get_vector_type (new_type), argno + 1, fndecl,
> + get_vector_type (type));
> + return false;
> + }
> + return true;
> +}
> +
> +/* Require argument ARGNO to be a vector type with the following properties:
> +
> + - the type class must be the same as FIRST_TYPE's if EXPECTED_TCLASS
> + is SAME_TYPE_CLASS, otherwise it must be EXPECTED_TCLASS itself.
> +
> + - the element size must be:
> +
> + - the same as FIRST_TYPE's if EXPECTED_BITS == SAME_SIZE
> + - half of FIRST_TYPE's if EXPECTED_BITS == HALF_SIZE
> + - a quarter of FIRST_TYPE's if EXPECTED_BITS == QUARTER_SIZE
> + - EXPECTED_BITS itself otherwise
> +
> + Return true if the argument has the required type, otherwise report
> + an appropriate error.
> +
> + FIRST_ARGNO is the first argument that is known to have type FIRST_TYPE.
> + Usually it comes before ARGNO, but sometimes it is more natural to resolve
> + arguments out of order.
> +
> + If the required properties depend on FIRST_TYPE then both FIRST_ARGNO and
> + ARGNO contribute to the resolution process. If the required properties
> + are fixed, only FIRST_ARGNO contributes to the resolution process.
> +
> + This function is a bit of a Swiss army knife. The complication comes
> + from trying to give good error messages when FIRST_ARGNO and ARGNO are
> + inconsistent, since either of them might be wrong. */
> +bool function_resolver::
> +require_derived_vector_type (unsigned int argno,
> + unsigned int first_argno,
> + type_suffix_index first_type,
> + type_class_index expected_tclass,
> + unsigned int expected_bits)
> +{
> + /* If the type needs to match FIRST_ARGNO exactly, use the preferred
> + error message for that case. The VECTOR_TYPE_P test excludes tuple
> + types, which we handle below instead. */
> + bool both_vectors_p = VECTOR_TYPE_P (get_argument_type (first_argno));
> + if (both_vectors_p
> + && expected_tclass == SAME_TYPE_CLASS
> + && expected_bits == SAME_SIZE)
> + {
> + /* There's no need to resolve this case out of order. */
> + gcc_assert (argno > first_argno);
> + return require_matching_vector_type (argno, first_type);
> + }
> +
> + /* Use FIRST_TYPE to get the expected type class and element size. */
> + type_class_index orig_expected_tclass = expected_tclass;
> + if (expected_tclass == NUM_TYPE_CLASSES)
> + expected_tclass = type_suffixes[first_type].tclass;
> +
> + unsigned int orig_expected_bits = expected_bits;
> + if (expected_bits == SAME_SIZE)
> + expected_bits = type_suffixes[first_type].element_bits;
> + else if (expected_bits == HALF_SIZE)
> + expected_bits = type_suffixes[first_type].element_bits / 2;
> + else if (expected_bits == QUARTER_SIZE)
> + expected_bits = type_suffixes[first_type].element_bits / 4;
> +
> + /* If the expected type doesn't depend on FIRST_TYPE at all,
> + just check for the fixed choice of vector type. */
> + if (expected_tclass == orig_expected_tclass
> + && expected_bits == orig_expected_bits)
> + {
> + const type_suffix_info &expected_suffix
> + = type_suffixes[find_type_suffix (expected_tclass, expected_bits)];
> + return require_vector_type (argno, expected_suffix.vector_type);
> + }
> +
> + /* Require the argument to be some form of MVE vector type,
> + without being specific about the type of vector we want. */
> + type_suffix_index actual_type = infer_vector_type (argno);
> + if (actual_type == NUM_TYPE_SUFFIXES)
> + return false;
> +
> + /* Exit now if we got the right type. */
> + bool tclass_ok_p = (type_suffixes[actual_type].tclass == expected_tclass);
> + bool size_ok_p = (type_suffixes[actual_type].element_bits == expected_bits);
> + if (tclass_ok_p && size_ok_p)
> + return true;
> +
> + /* First look for cases in which the actual type contravenes a fixed
> + size requirement, without having to refer to FIRST_TYPE. */
> + if (!size_ok_p && expected_bits == orig_expected_bits)
> + {
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects a vector of %d-bit elements",
> + get_vector_type (actual_type), argno + 1, fndecl,
> + expected_bits);
> + return false;
> + }
> +
> + /* Likewise for a fixed type class requirement. This is only ever
> + needed for signed and unsigned types, so don't create unnecessary
> + translation work for other type classes. */
> + if (!tclass_ok_p && orig_expected_tclass == TYPE_signed)
> + {
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects a vector of signed integers",
> + get_vector_type (actual_type), argno + 1, fndecl);
> + return false;
> + }
> + if (!tclass_ok_p && orig_expected_tclass == TYPE_unsigned)
> + {
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects a vector of unsigned integers",
> + get_vector_type (actual_type), argno + 1, fndecl);
> + return false;
> + }
> +
> + /* Make sure that FIRST_TYPE itself is sensible before using it
> + as a basis for an error message. */
> + if (resolve_to (mode_suffix_id, first_type) == error_mark_node)
> + return false;
> +
> + /* If the arguments have consistent type classes, but a link between
> + the sizes has been broken, try to describe the error in those terms. */
> + if (both_vectors_p && tclass_ok_p && orig_expected_bits == SAME_SIZE)
> + {
> + if (argno < first_argno)
> + {
> + std::swap (argno, first_argno);
> + std::swap (actual_type, first_type);
> + }
> + error_at (location, "arguments %d and %d of %qE must have the"
> + " same element size, but the values passed here have type"
> + " %qT and %qT respectively", first_argno + 1, argno + 1,
> + fndecl, get_vector_type (first_type),
> + get_vector_type (actual_type));
> + return false;
> + }
> +
> + /* Likewise in reverse: look for cases in which the sizes are consistent
> + but a link between the type classes has been broken. */
> + if (both_vectors_p
> + && size_ok_p
> + && orig_expected_tclass == SAME_TYPE_CLASS
> + && type_suffixes[first_type].integer_p
> + && type_suffixes[actual_type].integer_p)
> + {
> + if (argno < first_argno)
> + {
> + std::swap (argno, first_argno);
> + std::swap (actual_type, first_type);
> + }
> + error_at (location, "arguments %d and %d of %qE must have the"
> + " same signedness, but the values passed here have type"
> + " %qT and %qT respectively", first_argno + 1, argno + 1,
> + fndecl, get_vector_type (first_type),
> + get_vector_type (actual_type));
> + return false;
> + }
> +
> + /* The two arguments are wildly inconsistent. */
> + type_suffix_index expected_type
> + = find_type_suffix (expected_tclass, expected_bits);
> + error_at (location, "passing %qT instead of the expected %qT to argument"
> + " %d of %qE, after passing %qT to argument %d",
> + get_vector_type (actual_type), get_vector_type (expected_type),
> + argno + 1, fndecl, get_argument_type (first_argno),
> + first_argno + 1);
> + return false;
> +}
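
The SAME_SIZE/HALF_SIZE/QUARTER_SIZE handling above is easy to misread when following the diagnostics, so here is the size rule in isolation (the sentinel values are invented for the sketch; the real markers live elsewhere in the port):

```cpp
#include <cassert>

// Invented sentinels standing in for the real SAME_SIZE/HALF_SIZE/
// QUARTER_SIZE markers; any other value is taken as a literal bit count.
const unsigned SAME_SIZE = ~0u;
const unsigned HALF_SIZE = ~0u - 1;
const unsigned QUARTER_SIZE = ~0u - 2;

// Expected element size of the derived argument, given the element
// size of FIRST_TYPE and the EXPECTED_BITS requirement.
unsigned
expected_element_bits (unsigned first_type_bits, unsigned expected)
{
  if (expected == SAME_SIZE)
    return first_type_bits;
  if (expected == HALF_SIZE)
    return first_type_bits / 2;
  if (expected == QUARTER_SIZE)
    return first_type_bits / 4;
  return expected;   // fixed requirement, e.g. 16-bit elements
}
```
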
> +
> +/* Require argument ARGNO to be a (possibly variable) scalar, expecting it
> + to have the following properties:
> +
> + - the type class must be the same as for type suffix 0 if EXPECTED_TCLASS
> + is SAME_TYPE_CLASS, otherwise it must be EXPECTED_TCLASS itself.
> +
> + - the element size must be the same as for type suffix 0 if EXPECTED_BITS
> + is SAME_TYPE_SIZE, otherwise it must be EXPECTED_BITS itself.
> +
> + Return true if the argument is valid, otherwise report an appropriate error.
> +
> + Note that we don't check whether the scalar type actually has the required
> + properties, since that's subject to implicit promotions and conversions.
> + Instead we just use the expected properties to tune the error message. */
> +bool function_resolver::
> +require_derived_scalar_type (unsigned int argno,
> + type_class_index expected_tclass,
> + unsigned int expected_bits)
> +{
> + gcc_assert (expected_tclass == SAME_TYPE_CLASS
> + || expected_tclass == TYPE_signed
> + || expected_tclass == TYPE_unsigned);
> +
> + /* If the expected type doesn't depend on the type suffix at all,
> + just check for the fixed choice of scalar type. */
> + if (expected_tclass != SAME_TYPE_CLASS && expected_bits != SAME_SIZE)
> + {
> + type_suffix_index expected_type
> + = find_type_suffix (expected_tclass, expected_bits);
> + return require_scalar_type (argno, get_scalar_type_name (expected_type));
> + }
> +
> + if (scalar_argument_p (argno))
> + return true;
> +
> + if (expected_tclass == SAME_TYPE_CLASS)
> + /* It doesn't really matter whether the element is expected to be
> + the same size as type suffix 0. */
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects a scalar element", get_argument_type (argno),
> + argno + 1, fndecl);
> + else
> + /* It doesn't seem useful to distinguish between signed and unsigned
> + scalars here. */
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects a scalar integer", get_argument_type (argno),
> + argno + 1, fndecl);
> + return false;
> +}
> +
> +/* Require argument ARGNO to be suitable for an integer constant expression.
> + Return true if it is, otherwise report an appropriate error.
> +
> + function_checker checks whether the argument is actually constant and
> + has a suitable range. The reason for distinguishing immediate arguments
> + here is because it provides more consistent error messages than
> + require_scalar_type would. */
> +bool
> +function_resolver::require_integer_immediate (unsigned int argno)
> +{
> + if (!scalar_argument_p (argno))
> + {
> + report_non_ice (location, fndecl, argno);
> + return false;
> + }
> + return true;
> +}
> +
> +/* Require argument ARGNO to be a (possibly variable) scalar, using EXPECTED
> + as the name of its expected type. Return true if the argument has the
> + right form, otherwise report an appropriate error. */
> +bool
> +function_resolver::require_scalar_type (unsigned int argno,
> + const char *expected)
> +{
> + if (!scalar_argument_p (argno))
> + {
> + error_at (location, "passing %qT to argument %d of %qE, which"
> + " expects %qs", get_argument_type (argno), argno + 1,
> + fndecl, expected);
> + return false;
> + }
> + return true;
> +}
> +
> +/* Require the function to have exactly EXPECTED arguments. Return true
> + if it does, otherwise report an appropriate error. */
> +bool
> +function_resolver::check_num_arguments (unsigned int expected)
> +{
> + if (m_arglist.length () < expected)
> + error_at (location, "too few arguments to function %qE", fndecl);
> + else if (m_arglist.length () > expected)
> + error_at (location, "too many arguments to function %qE", fndecl);
> + return m_arglist.length () == expected;
> +}
> +
> +/* If the function is predicated, check that the last argument is a
> + suitable predicate. Also check that there are NOPS further
> + arguments before any predicate, but don't check what they are.
> +
> + Return true on success, otherwise report a suitable error.
> + When returning true:
> +
> + - set I to the number of the last unchecked argument.
> + - set NARGS to the total number of arguments. */
> +bool
> +function_resolver::check_gp_argument (unsigned int nops,
> + unsigned int &i, unsigned int &nargs)
> +{
> + i = nops - 1;
> + if (pred != PRED_none)
> + {
> + switch (pred)
> + {
> + case PRED_m:
> + /* Add first inactive argument if needed, and final predicate. */
> + if (has_inactive_argument ())
> + nargs = nops + 2;
> + else
> + nargs = nops + 1;
> + break;
> +
> + case PRED_p:
> + case PRED_x:
> + /* Add final predicate. */
> + nargs = nops + 1;
> + break;
> +
> + default:
> + gcc_unreachable ();
> + }
> +
> + if (!check_num_arguments (nargs)
> + || !require_vector_type (nargs - 1, VECTOR_TYPE_mve_pred16_t))
> + return false;
> +
> + i = nargs - 2;
> + }
> + else
> + {
> + nargs = nops;
> + if (!check_num_arguments (nargs))
> + return false;
> + }
> +
> + return true;
> +}
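
Since the predicate position is one of the headline differences from SVE, it may be worth spelling out the argument-count rule implemented above. A condensed model (enum values invented for the sketch):

```cpp
#include <cassert>

enum pred_kind { PRED_none, PRED_m, PRED_p, PRED_x };

// Total number of arguments expected by check_gp_argument: merging
// predication ("_m") may add a leading "inactive" value, and any
// predication adds a trailing predicate.
unsigned
total_nargs (unsigned nops, pred_kind pred, bool has_inactive)
{
  if (pred == PRED_m)
    return nops + (has_inactive ? 2 : 1);
  if (pred == PRED_p || pred == PRED_x)
    return nops + 1;
  return nops;
}
```
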
> +
> +/* Finish resolving a function whose final argument can be a vector
> + or a scalar, with the function having an implicit "_n" suffix
> + in the latter case. This "_n" form might only exist for certain
> + type suffixes.
> +
> + ARGNO is the index of the final argument. The inferred type suffix
> + was obtained from argument FIRST_ARGNO, which has type FIRST_TYPE.
> + EXPECTED_TCLASS and EXPECTED_BITS describe the expected properties
> + of the final vector or scalar argument, in the same way as for
> + require_derived_vector_type. INFERRED_TYPE is the inferred type
> + suffix itself, or NUM_TYPE_SUFFIXES if it's the same as FIRST_TYPE.
> +
> + Return the function decl of the resolved function on success,
> + otherwise report a suitable error and return error_mark_node. */
> +tree function_resolver::
> +finish_opt_n_resolution (unsigned int argno, unsigned int first_argno,
> + type_suffix_index first_type,
> + type_class_index expected_tclass,
> + unsigned int expected_bits,
> + type_suffix_index inferred_type)
> +{
> + if (inferred_type == NUM_TYPE_SUFFIXES)
> + inferred_type = first_type;
> + tree scalar_form = lookup_form (MODE_n, inferred_type);
> +
> + /* Allow the final argument to be scalar, if an _n form exists. */
> + if (scalar_argument_p (argno))
> + {
> + if (scalar_form)
> + return scalar_form;
> +
> + /* Check the vector form normally. If that succeeds, raise an
> + error about having no corresponding _n form. */
> + tree res = resolve_to (mode_suffix_id, inferred_type);
> + if (res != error_mark_node)
> + error_at (location, "passing %qT to argument %d of %qE, but its"
> + " %qT form does not accept scalars",
> + get_argument_type (argno), argno + 1, fndecl,
> + get_vector_type (first_type));
> + return error_mark_node;
> + }
> +
> + /* If an _n form does exist, provide a more accurate message than
> + require_derived_vector_type would for arguments that are neither
> + vectors nor scalars. */
> + if (scalar_form && !require_vector_or_scalar_type (argno))
> + return error_mark_node;
> +
> + /* Check for the correct vector type. */
> + if (!require_derived_vector_type (argno, first_argno, first_type,
> + expected_tclass, expected_bits))
> + return error_mark_node;
> +
> + return resolve_to (mode_suffix_id, inferred_type);
> +}
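
The fall-through logic above reduces to a small decision tree: a scalar final argument picks the "_n" form if one exists and is otherwise an error; a vector final argument goes through the normal derived-type check. A string-based stand-in (names invented) just to pin that down:

```cpp
#include <cassert>
#include <string>

// Stand-in for the branches of finish_opt_n_resolution.
std::string
resolve_opt_n (bool last_is_scalar, bool has_n_form)
{
  if (last_is_scalar)
    return has_n_form ? "_n form" : "error";
  return "vector form";
}
```
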
> +
> +/* Resolve a (possibly predicated) unary function. If the function uses
> + merge predication or if TREAT_AS_MERGE_P is true, there is an extra
> + vector argument before the governing predicate that specifies the
> + values of inactive elements. This argument has the following
> + properties:
> +
> + - the type class must be the same as for active elements if MERGE_TCLASS
> + is SAME_TYPE_CLASS, otherwise it must be MERGE_TCLASS itself.
> +
> + - the element size must be the same as for active elements if MERGE_BITS
> + is SAME_TYPE_SIZE, otherwise it must be MERGE_BITS itself.
> +
> + Return the function decl of the resolved function on success,
> + otherwise report a suitable error and return error_mark_node. */
> +tree
> +function_resolver::resolve_unary (type_class_index merge_tclass,
> + unsigned int merge_bits,
> + bool treat_as_merge_p)
> +{
> + type_suffix_index type;
> + if (pred == PRED_m || treat_as_merge_p)
> + {
> + if (!check_num_arguments (3))
> + return error_mark_node;
> + if (merge_tclass == SAME_TYPE_CLASS && merge_bits == SAME_SIZE)
> + {
> + /* The inactive elements are the same as the active elements,
> + so we can use normal left-to-right resolution. */
> + if ((type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES
> + /* Predicates are the last argument. */
> + || !require_vector_type (2, VECTOR_TYPE_mve_pred16_t)
> + || !require_matching_vector_type (1, type))
> + return error_mark_node;
> + }
> + else
> + {
> + /* The inactive element type is a function of the active one,
> + so resolve the active one first. */
> + if (!require_vector_type (1, VECTOR_TYPE_mve_pred16_t)
> + || (type = infer_vector_type (2)) == NUM_TYPE_SUFFIXES
> + || !require_derived_vector_type (0, 2, type, merge_tclass,
> + merge_bits))
> + return error_mark_node;
> + }
> + }
> + else
> + {
> + /* We just need to check the predicate (if any) and the single
> + vector argument. */
> + unsigned int i, nargs;
> + if (!check_gp_argument (1, i, nargs)
> + || (type = infer_vector_type (i)) == NUM_TYPE_SUFFIXES)
> + return error_mark_node;
> + }
> +
> + /* Handle convert-like functions in which the first type suffix is
> + explicit. */
> + if (type_suffix_ids[0] != NUM_TYPE_SUFFIXES)
> + return resolve_to (mode_suffix_id, type_suffix_ids[0], type);
> +
> + return resolve_to (mode_suffix_id, type);
> +}
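
One thing worth flagging for anyone diffing this against the SVE resolver: in the merging case the arguments are laid out inactive-first, predicate-last, the mirror image of SVE's predicate-first convention. Purely illustrative index layout for a 3-argument merging unary:

```cpp
#include <cassert>

// Argument indices for a merging ("_m") unary MVE intrinsic such as
// vabsq_m: inactive value first, governing predicate last.  (SVE puts
// the predicate first instead.)
enum mve_unary_m_layout { ARG_INACTIVE = 0, ARG_INPUT = 1, ARG_PRED = 2 };
```
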
> +
> +/* Resolve a (possibly predicated) unary function taking a scalar
> + argument (_n suffix). If the function uses merge predication,
> + there is an extra vector argument in the first position, and the
> + final governing predicate that specifies the values of inactive
> + elements.
> +
> + Return the function decl of the resolved function on success,
> + otherwise report a suitable error and return error_mark_node. */
> +tree
> +function_resolver::resolve_unary_n ()
> +{
> + type_suffix_index type;
> +
> + /* Currently only support overrides for _m (vdupq). */
> + if (pred != PRED_m)
> + return error_mark_node;
> +
> + if (pred == PRED_m)
> + {
> + if (!check_num_arguments (3))
> + return error_mark_node;
> +
> + /* The inactive elements are the same as the active elements,
> + so we can use normal left-to-right resolution. */
> + if ((type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES
> + /* Predicates are the last argument. */
> + || !require_vector_type (2, VECTOR_TYPE_mve_pred16_t))
> + return error_mark_node;
> + }
> +
> + /* Make sure the argument is scalar. */
> + tree scalar_form = lookup_form (MODE_n, type);
> +
> + if (scalar_argument_p (1) && scalar_form)
> + return scalar_form;
> +
> + return error_mark_node;
> +}
> +
> +/* Resolve a (possibly predicated) function that takes NOPS like-typed
> + vector arguments followed by NIMM integer immediates. Return the
> + function decl of the resolved function on success, otherwise report
> + a suitable error and return error_mark_node. */
> +tree
> +function_resolver::resolve_uniform (unsigned int nops, unsigned int nimm)
> +{
> + unsigned int i, nargs;
> + type_suffix_index type;
> + if (!check_gp_argument (nops + nimm, i, nargs)
> + || (type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES)
> + return error_mark_node;
> +
> + unsigned int last_arg = i + 1 - nimm;
> + for (i = 0; i < last_arg; i++)
> + if (!require_matching_vector_type (i, type))
> + return error_mark_node;
> +
> + for (i = last_arg; i < nargs; ++i)
> + if (!require_integer_immediate (i))
> + return error_mark_node;
> +
> + return resolve_to (mode_suffix_id, type);
> +}
> +
> +/* Resolve a (possibly predicated) function that offers a choice between
> + taking:
> +
> + - NOPS like-typed vector arguments or
> + - NOPS - 1 like-typed vector arguments followed by a scalar argument
> +
> + Return the function decl of the resolved function on success,
> + otherwise report a suitable error and return error_mark_node. */
> +tree
> +function_resolver::resolve_uniform_opt_n (unsigned int nops)
> +{
> + unsigned int i, nargs;
> + type_suffix_index type;
> + if (!check_gp_argument (nops, i, nargs)
> + /* Unary operators should use resolve_unary, so using i - 1 is
> + safe. */
> + || (type = infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES)
> + return error_mark_node;
> +
> + /* Skip the last argument, which may be a scalar. */
> + unsigned int last_arg = i;
> + for (i = 0; i < last_arg; i++)
> + if (!require_matching_vector_type (i, type))
> + return error_mark_node;
> +
> + return finish_opt_n_resolution (last_arg, 0, type);
> +}
> +
> +/* If the call is erroneous, report an appropriate error and return
> + error_mark_node. Otherwise, if the function is overloaded, return
> + the decl of the non-overloaded function. Return NULL_TREE otherwise,
> + indicating that the call should be processed in the normal way. */
> +tree
> +function_resolver::resolve ()
> +{
> + return shape->resolve (*this);
> +}
> +
> +function_checker::function_checker (location_t location,
> + const function_instance &instance,
> + tree fndecl, tree fntype,
> + unsigned int nargs, tree *args)
> + : function_call_info (location, instance, fndecl),
> + m_fntype (fntype), m_nargs (nargs), m_args (args)
> +{
> + if (instance.has_inactive_argument ())
> + m_base_arg = 1;
> + else
> + m_base_arg = 0;
> +}
> +
> +/* Return true if argument ARGNO exists, which it might not for
> + erroneous calls. It is safe to wave through checks if this
> + function returns false. */
> +bool
> +function_checker::argument_exists_p (unsigned int argno)
> +{
> + gcc_assert (argno < (unsigned int) type_num_arguments (m_fntype));
> + return argno < m_nargs;
> +}
> +
> +/* Check that argument ARGNO is an integer constant expression and
> + store its value in VALUE_OUT if so. The caller should first
> + check that argument ARGNO exists. */
> +bool
> +function_checker::require_immediate (unsigned int argno,
> + HOST_WIDE_INT &value_out)
> +{
> + gcc_assert (argno < m_nargs);
> + tree arg = m_args[argno];
> +
> + /* The type and range are unsigned, so read the argument as an
> + unsigned rather than signed HWI. */
> + if (!tree_fits_uhwi_p (arg))
> + {
> + report_non_ice (location, fndecl, argno);
> + return false;
> + }
> +
> + /* ...but treat VALUE_OUT as signed for error reporting, since printing
> + -1 is more user-friendly than the maximum uint64_t value. */
> + value_out = tree_to_uhwi (arg);
> + return true;
> +}
> +
> +/* Check that argument REL_ARGNO is an integer constant expression that has
> + a valid value for enumeration type TYPE. REL_ARGNO counts from the end
> + of the predication arguments. */
> +bool
> +function_checker::require_immediate_enum (unsigned int rel_argno, tree type)
> +{
> + unsigned int argno = m_base_arg + rel_argno;
> + if (!argument_exists_p (argno))
> + return true;
> +
> + HOST_WIDE_INT actual;
> + if (!require_immediate (argno, actual))
> + return false;
> +
> + for (tree entry = TYPE_VALUES (type); entry; entry = TREE_CHAIN (entry))
> + {
> + /* The value is an INTEGER_CST for C and a CONST_DECL wrapper
> + around an INTEGER_CST for C++. */
> + tree value = TREE_VALUE (entry);
> + if (TREE_CODE (value) == CONST_DECL)
> + value = DECL_INITIAL (value);
> + if (wi::to_widest (value) == actual)
> + return true;
> + }
> +
> + report_not_enum (location, fndecl, argno, actual, type);
> + return false;
> +}
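
The TYPE_VALUES walk above accepts the immediate only if it matches one of the enumerators, after unwrapping the C++ CONST_DECL. Stripped of tree machinery, the check is just a membership test (sketch, not the patch's code):

```cpp
#include <cassert>
#include <initializer_list>

// Model of the TYPE_VALUES scan in require_immediate_enum: the
// immediate is valid only if it equals one of the enumerators' values.
bool
valid_enum_value (std::initializer_list<long> values, long actual)
{
  for (long v : values)
    if (v == actual)
      return true;
  return false;
}
```
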
> +
> +/* Check that argument REL_ARGNO is an integer constant expression in the
> + range [MIN, MAX]. REL_ARGNO counts from the end of the predication
> + arguments. */
> +bool
> +function_checker::require_immediate_range (unsigned int rel_argno,
> + HOST_WIDE_INT min,
> + HOST_WIDE_INT max)
> +{
> + unsigned int argno = m_base_arg + rel_argno;
> + if (!argument_exists_p (argno))
> + return true;
> +
> + /* Required because of the tree_to_uhwi -> HOST_WIDE_INT conversion
> + in require_immediate. */
> + gcc_assert (min >= 0 && min <= max);
> + HOST_WIDE_INT actual;
> + if (!require_immediate (argno, actual))
> + return false;
> +
> + if (!IN_RANGE (actual, min, max))
> + {
> + report_out_of_range (location, fndecl, argno, actual, min, max);
> + return false;
> + }
> +
> + return true;
> +}
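
The unsigned-read/signed-report split in require_immediate and the "min >= 0" assertion here work together: because the range is non-negative, converting the unsigned value to a signed one for diagnostics cannot turn an out-of-range value into an in-range one. A condensed model of the combined check (types simplified):

```cpp
#include <cassert>
#include <cstdint>

// Condensed model of require_immediate + require_immediate_range: the
// constant is read as unsigned, viewed as signed for friendlier
// diagnostics (-1 rather than the maximum uint64_t value), and the
// range itself must be non-negative.
bool
immediate_in_range (uint64_t raw, int64_t min, int64_t max)
{
  assert (min >= 0 && min <= max);
  int64_t actual = (int64_t) raw;   // signed view used for reporting
  return actual >= min && actual <= max;
}
```
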
> +
> +/* Perform semantic checks on the call. Return true if the call is valid,
> + otherwise report a suitable error. */
> +bool
> +function_checker::check ()
> +{
> + function_args_iterator iter;
> + tree type;
> + unsigned int i = 0;
> + FOREACH_FUNCTION_ARGS (m_fntype, type, iter)
> + {
> + if (type == void_type_node || i >= m_nargs)
> + break;
> +
> + if (i >= m_base_arg
> + && TREE_CODE (type) == ENUMERAL_TYPE
> + && !require_immediate_enum (i - m_base_arg, type))
> + return false;
> +
> + i += 1;
> + }
> +
> + return shape->check (*this);
> +}
> +
> +gimple_folder::gimple_folder (const function_instance &instance, tree fndecl,
> + gcall *call_in)
> + : function_call_info (gimple_location (call_in), instance, fndecl),
> + call (call_in), lhs (gimple_call_lhs (call_in))
> +{
> +}
> +
> +/* Try to fold the call. Return the new statement on success and null
> + on failure. */
> +gimple *
> +gimple_folder::fold ()
> +{
> + /* Don't fold anything when MVE is disabled; emit an error during
> + expansion instead. */
> + if (!TARGET_HAVE_MVE)
> + return NULL;
> +
> + /* Punt if the function has a return type and no result location is
> + provided. The attributes should allow target-independent code to
> + remove the calls if appropriate. */
> + if (!lhs && TREE_TYPE (gimple_call_fntype (call)) != void_type_node)
> + return NULL;
> +
> + return base->fold (*this);
> +}
> +
> +function_expander::function_expander (const function_instance &instance,
> + tree fndecl, tree call_expr_in,
> + rtx possible_target_in)
> + : function_call_info (EXPR_LOCATION (call_expr_in), instance, fndecl),
> + call_expr (call_expr_in), possible_target (possible_target_in)
> +{
> +}
> +
> +/* Return the handler of direct optab OP for type suffix SUFFIX_I. */
> +insn_code
> +function_expander::direct_optab_handler (optab op, unsigned int suffix_i)
> +{
> + return ::direct_optab_handler (op, vector_mode (suffix_i));
> +}
> +
> +/* For a function that does the equivalent of:
> +
> + OUTPUT = COND ? FN (INPUTS) : FALLBACK;
> +
> + return the value of FALLBACK.
> +
> + MODE is the mode of OUTPUT.
> + MERGE_ARGNO is the argument that provides FALLBACK for _m functions,
> + or DEFAULT_MERGE_ARGNO if we should apply the usual rules.
> +
> + ARGNO is the caller's index into args. If the returned value is
> + argument 0 (as for unary _m operations), increment ARGNO past the
> + returned argument. */
> +rtx
> +function_expander::get_fallback_value (machine_mode mode,
> + unsigned int merge_argno,
> + unsigned int &argno)
> +{
> + if (pred == PRED_z)
> + return CONST0_RTX (mode);
> +
> + gcc_assert (pred == PRED_m || pred == PRED_x);
> +
> + if (merge_argno == 0)
> + return args[argno++];
> +
> + return args[merge_argno];
> +}
> +
> +/* Return a REG rtx that can be used for the result of the function,
> + using the preferred target if suitable. */
> +rtx
> +function_expander::get_reg_target ()
> +{
> + machine_mode target_mode = TYPE_MODE (TREE_TYPE (TREE_TYPE (fndecl)));
> + if (!possible_target || GET_MODE (possible_target) != target_mode)
> + possible_target = gen_reg_rtx (target_mode);
> + return possible_target;
> +}
> +
> +/* Add an output operand to the instruction we're building, which has
> + code ICODE. Bind the output to the preferred target rtx if possible. */
> +void
> +function_expander::add_output_operand (insn_code icode)
> +{
> + unsigned int opno = m_ops.length ();
> + machine_mode mode = insn_data[icode].operand[opno].mode;
> + m_ops.safe_grow (opno + 1, true);
> + create_output_operand (&m_ops.last (), possible_target, mode);
> +}
> +
> +/* Add an input operand to the instruction we're building, which has
> + code ICODE. Calculate the value of the operand as follows:
> +
> + - If the operand is a predicate, coerce X to have the
> + mode that the instruction expects.
> +
> + - Otherwise use X directly. The expand machinery checks that X has
> + the right mode for the instruction. */
> +void
> +function_expander::add_input_operand (insn_code icode, rtx x)
> +{
> + unsigned int opno = m_ops.length ();
> + const insn_operand_data &operand = insn_data[icode].operand[opno];
> + machine_mode mode = operand.mode;
> + if (mode == VOIDmode)
> + {
> + /* The only allowable use of VOIDmode is the wildcard
> + arm_any_register_operand, which is used to avoid
> + combinatorial explosion in the reinterpret patterns. */
> + gcc_assert (operand.predicate == arm_any_register_operand);
> + mode = GET_MODE (x);
> + }
> + else if (VALID_MVE_PRED_MODE (mode))
> + x = gen_lowpart (mode, x);
> +
> + m_ops.safe_grow (m_ops.length () + 1, true);
> + create_input_operand (&m_ops.last (), x, mode);
> +}
> +
> +/* Add an integer operand with value X to the instruction. */
> +void
> +function_expander::add_integer_operand (HOST_WIDE_INT x)
> +{
> + m_ops.safe_grow (m_ops.length () + 1, true);
> + create_integer_operand (&m_ops.last (), x);
> +}
> +
> +/* Generate instruction ICODE, given that its operands have already
> + been added to M_OPS. Return the value of the first operand. */
> +rtx
> +function_expander::generate_insn (insn_code icode)
> +{
> + expand_insn (icode, m_ops.length (), m_ops.address ());
> + return function_returns_void_p () ? const0_rtx : m_ops[0].value;
> +}
> +
> +/* Implement the call using instruction ICODE, with a 1:1 mapping between
> + arguments and input operands. */
> +rtx
> +function_expander::use_exact_insn (insn_code icode)
> +{
> + unsigned int nops = insn_data[icode].n_operands;
> + if (!function_returns_void_p ())
> + {
> + add_output_operand (icode);
> + nops -= 1;
> + }
> + for (unsigned int i = 0; i < nops; ++i)
> + add_input_operand (icode, args[i]);
> + return generate_insn (icode);
> +}
> +
> +/* Implement the call using instruction ICODE, which does not use a
> + predicate. */
> +rtx
> +function_expander::use_unpred_insn (insn_code icode)
> +{
> + gcc_assert (pred == PRED_none);
> + /* Discount the output operand. */
> + unsigned int nops = insn_data[icode].n_operands - 1;
> + unsigned int i = 0;
> +
> + add_output_operand (icode);
> + for (; i < nops; ++i)
> + add_input_operand (icode, args[i]);
> +
> + return generate_insn (icode);
> +}
> +
> +/* Implement the call using instruction ICODE, which is a predicated
> + operation that returns arbitrary values for inactive lanes. */
> +rtx
> +function_expander::use_pred_x_insn (insn_code icode)
> +{
> + gcc_assert (pred == PRED_x);
> + unsigned int nops = args.length ();
> +
> + add_output_operand (icode);
> + /* Use first operand as arbitrary inactive input. */
> + add_input_operand (icode, possible_target);
> + emit_clobber (possible_target);
> + /* Copy remaining arguments, including the final predicate. */
> + for (unsigned int i = 0; i < nops; ++i)
> + add_input_operand (icode, args[i]);
> +
> + return generate_insn (icode);
> +}
> +
> +/* Implement the call using instruction ICODE, which does the equivalent of:
> +
> + OUTPUT = COND ? FN (INPUTS) : FALLBACK;
> +
> + The instruction operands are in the order above: OUTPUT, COND, INPUTS
> + and FALLBACK. MERGE_ARGNO is the argument that provides FALLBACK for _m
> + functions, or DEFAULT_MERGE_ARGNO if we should apply the usual rules. */
> +rtx
> +function_expander::use_cond_insn (insn_code icode, unsigned int merge_argno)
> +{
> + /* At present we never need to handle PRED_none, which would involve
> + creating a new predicate rather than using one supplied by the user. */
> + gcc_assert (pred != PRED_none);
> + /* For MVE, we only handle PRED_m at present. */
> + gcc_assert (pred == PRED_m);
> +
> + /* Discount the output, predicate and fallback value. */
> + unsigned int nops = insn_data[icode].n_operands - 3;
> + machine_mode mode = insn_data[icode].operand[0].mode;
> +
> + unsigned int opno = 0;
> + rtx fallback_arg = NULL_RTX;
> + fallback_arg = get_fallback_value (mode, merge_argno, opno);
> + rtx pred_arg = args[nops + 1];
> +
> + add_output_operand (icode);
> + add_input_operand (icode, fallback_arg);
> + for (unsigned int i = 0; i < nops; ++i)
> + add_input_operand (icode, args[opno + i]);
> + add_input_operand (icode, pred_arg);
> + return generate_insn (icode);
> +}
> +
> +/* Implement the call using a normal unpredicated optab for PRED_none.
> +
> + The rtx code, from which the optab is derived, corresponds to:
> +
> + - CODE_FOR_SINT for signed integers
> + - CODE_FOR_UINT for unsigned integers
> + - CODE_FOR_FP for floating-point values */
> +rtx
> +function_expander::map_to_rtx_codes (rtx_code code_for_sint,
> + rtx_code code_for_uint,
> + rtx_code code_for_fp)
> +{
> + gcc_assert (pred == PRED_none);
> + rtx_code code = type_suffix (0).integer_p ?
> + (type_suffix (0).unsigned_p ? code_for_uint : code_for_sint)
> + : code_for_fp;
> + insn_code icode = direct_optab_handler (code_to_optab (code), 0);
> + if (icode == CODE_FOR_nothing)
> + gcc_unreachable ();
> +
> + return use_unpred_insn (icode);
> +}
> +
> +/* Expand the call and return its lhs. */
> +rtx
> +function_expander::expand ()
> +{
> + unsigned int nargs = call_expr_nargs (call_expr);
> + args.reserve (nargs);
> + for (unsigned int i = 0; i < nargs; ++i)
> + args.quick_push (expand_normal (CALL_EXPR_ARG (call_expr, i)));
> +
> + return base->expand (*this);
> +}
> +
> +/* If we're implementing manual overloading, check whether the MVE
> + function with subcode CODE is overloaded, and if so attempt to
> + determine the corresponding non-overloaded function. The call
> + occurs at location LOCATION and has the arguments given by ARGLIST.
> +
> + If the call is erroneous, report an appropriate error and return
> + error_mark_node. Otherwise, if the function is overloaded, return
> + the decl of the non-overloaded function. Return NULL_TREE otherwise,
> + indicating that the call should be processed in the normal way. */
> +tree
> +resolve_overloaded_builtin (location_t location, unsigned int code,
> + vec<tree, va_gc> *arglist)
> +{
> + if (code >= vec_safe_length (registered_functions))
> + return NULL_TREE;
> +
> + registered_function &rfn = *(*registered_functions)[code];
> + if (rfn.overloaded_p)
> + return function_resolver (location, rfn.instance, rfn.decl,
> + *arglist).resolve ();
> + return NULL_TREE;
> +}
> +
> +/* Perform any semantic checks needed for a call to the MVE function
> + with subcode CODE, such as testing for integer constant expressions.
> + The call occurs at location LOCATION and has NARGS arguments,
> + given by ARGS. FNDECL is the original function decl, before
> + overload resolution.
> +
> + Return true if the call is valid, otherwise report a suitable error. */
> +bool
> +check_builtin_call (location_t location, vec<location_t>, unsigned int code,
> + tree fndecl, unsigned int nargs, tree *args)
> +{
> + const registered_function &rfn = *(*registered_functions)[code];
> + if (!check_requires_float (location, rfn.decl, rfn.requires_float))
> + return false;
> +
> + return function_checker (location, rfn.instance, fndecl,
> + TREE_TYPE (rfn.decl), nargs, args).check ();
> +}
> +
> +/* Attempt to fold STMT, given that it's a call to the MVE function
> + with subcode CODE. Return the new statement on success and null
> + on failure. Insert any other new statements at GSI. */
> +gimple *
> +gimple_fold_builtin (unsigned int code, gcall *stmt)
> +{
> + registered_function &rfn = *(*registered_functions)[code];
> + return gimple_folder (rfn.instance, rfn.decl, stmt).fold ();
> +}
> +
> +/* Expand a call to the MVE function with subcode CODE. EXP is the call
> + expression and TARGET is the preferred location for the result.
> + Return the value of the lhs. */
> +rtx
> +expand_builtin (unsigned int code, tree exp, rtx target)
> +{
> + registered_function &rfn = *(*registered_functions)[code];
> + if (!check_requires_float (EXPR_LOCATION (exp), rfn.decl,
> + rfn.requires_float))
> + return target;
> + return function_expander (rfn.instance, rfn.decl, exp, target).expand ();
> +}
> +
> +} /* end namespace arm_mve */
> +
> +using namespace arm_mve;
> +
> +inline void
> +gt_ggc_mx (function_instance *)
> +{
> +}
> +
> +inline void
> +gt_pch_nx (function_instance *)
> +{
> +}
> +
> +inline void
> +gt_pch_nx (function_instance *, gt_pointer_operator, void *)
> +{
> +}
>
> #include "gt-arm-mve-builtins.h"
> diff --git a/gcc/config/arm/arm-mve-builtins.def b/gcc/config/arm/arm-mve-builtins.def
> index 69f3f81b473..49d07364fa2 100644
> --- a/gcc/config/arm/arm-mve-builtins.def
> +++ b/gcc/config/arm/arm-mve-builtins.def
> @@ -17,10 +17,25 @@
> along with GCC; see the file COPYING3. If not see
> <http://www.gnu.org/licenses/>. */
>
> +#ifndef DEF_MVE_MODE
> +#define DEF_MVE_MODE(A, B, C, D)
> +#endif
> +
> #ifndef DEF_MVE_TYPE
> -#error "arm-mve-builtins.def included without defining DEF_MVE_TYPE"
> +#define DEF_MVE_TYPE(A, B)
> +#endif
> +
> +#ifndef DEF_MVE_TYPE_SUFFIX
> +#define DEF_MVE_TYPE_SUFFIX(A, B, C, D, E)
> #endif
>
> +#ifndef DEF_MVE_FUNCTION
> +#define DEF_MVE_FUNCTION(A, B, C, D)
> +#endif
> +
> +DEF_MVE_MODE (n, none, none, none)
> +DEF_MVE_MODE (offset, none, none, bytes)
> +
> #define REQUIRES_FLOAT false
> DEF_MVE_TYPE (mve_pred16_t, boolean_type_node)
> DEF_MVE_TYPE (uint8x16_t, unsigned_intQI_type_node)
> @@ -37,3 +52,26 @@ DEF_MVE_TYPE (int64x2_t, intDI_type_node)
> DEF_MVE_TYPE (float16x8_t, arm_fp16_type_node)
> DEF_MVE_TYPE (float32x4_t, float_type_node)
> #undef REQUIRES_FLOAT
> +
> +#define REQUIRES_FLOAT false
> +DEF_MVE_TYPE_SUFFIX (s8, int8x16_t, signed, 8, V16QImode)
> +DEF_MVE_TYPE_SUFFIX (s16, int16x8_t, signed, 16, V8HImode)
> +DEF_MVE_TYPE_SUFFIX (s32, int32x4_t, signed, 32, V4SImode)
> +DEF_MVE_TYPE_SUFFIX (s64, int64x2_t, signed, 64, V2DImode)
> +DEF_MVE_TYPE_SUFFIX (u8, uint8x16_t, unsigned, 8, V16QImode)
> +DEF_MVE_TYPE_SUFFIX (u16, uint16x8_t, unsigned, 16, V8HImode)
> +DEF_MVE_TYPE_SUFFIX (u32, uint32x4_t, unsigned, 32, V4SImode)
> +DEF_MVE_TYPE_SUFFIX (u64, uint64x2_t, unsigned, 64, V2DImode)
> +#undef REQUIRES_FLOAT
> +
> +#define REQUIRES_FLOAT true
> +DEF_MVE_TYPE_SUFFIX (f16, float16x8_t, float, 16, V8HFmode)
> +DEF_MVE_TYPE_SUFFIX (f32, float32x4_t, float, 32, V4SFmode)
> +#undef REQUIRES_FLOAT
> +
> +#include "arm-mve-builtins-base.def"
> +
> +#undef DEF_MVE_TYPE
> +#undef DEF_MVE_TYPE_SUFFIX
> +#undef DEF_MVE_FUNCTION
> +#undef DEF_MVE_MODE
> diff --git a/gcc/config/arm/arm-mve-builtins.h b/gcc/config/arm/arm-mve-builtins.h
> index 290a118ec92..a20d2fb5d86 100644
> --- a/gcc/config/arm/arm-mve-builtins.h
> +++ b/gcc/config/arm/arm-mve-builtins.h
> @@ -20,7 +20,79 @@
> #ifndef GCC_ARM_MVE_BUILTINS_H
> #define GCC_ARM_MVE_BUILTINS_H
>
> +/* The full name of an MVE ACLE function is the concatenation of:
> +
> + - the base name ("vadd", etc.)
> + - the "mode" suffix ("_n", "_index", etc.)
> + - the type suffixes ("_s32", "_b8", etc.)
> + - the predication suffix ("_x", "_z", etc.)
> +
> + Each piece of information is individually useful, so we retain this
> + classification throughout:
> +
> + - function_base represents the base name
> +
> + - mode_suffix_index represents the mode suffix
> +
> + - type_suffix_index represents individual type suffixes, while
> + type_suffix_pair represents a pair of them
> +
> + - prediction_index extends the predication suffix with an additional
> + alternative: PRED_implicit for implicitly-predicated operations
> +
> + In addition to its unique full name, a function may have a shorter
> + overloaded alias. This alias removes pieces of the suffixes that
> + can be inferred from the arguments, such as by shortening the mode
> + suffix or dropping some of the type suffixes. The base name and the
> + predication suffix stay the same.
> +
> + The function_shape class describes what arguments a given function
> + takes and what its overloaded alias is called. In broad terms,
> + function_base describes how the underlying instruction behaves while
> + function_shape describes how that instruction has been presented at
> + the language level.
> +
> + The static list of functions uses function_group to describe a group
> + of related functions. The function_builder class is responsible for
> + expanding this static description into a list of individual functions
> + and registering the associated built-in functions. function_instance
> + describes one of these individual functions in terms of the properties
> + described above.
> +
> + The classes involved in compiling a function call are:
> +
> + - function_resolver, which resolves an overloaded function call to a
> + specific function_instance and its associated function decl
> +
> + - function_checker, which checks whether the values of the arguments
> + conform to the ACLE specification
> +
> + - gimple_folder, which tries to fold a function call at the gimple level
> +
> + - function_expander, which expands a function call into rtl instructions
> +
> + function_resolver and function_checker operate at the language level
> + and so are associated with the function_shape. gimple_folder and
> + function_expander are concerned with the behavior of the function
> + and so are associated with the function_base.
> +
> + Note that we've specifically chosen not to fold calls in the frontend,
> + since MVE intrinsics will hardly ever fold a useful language-level
> + constant. */
> namespace arm_mve {
> +/* The maximum number of vectors in an ACLE tuple type. */
> +const unsigned int MAX_TUPLE_SIZE = 3;
> +
> +/* Used to represent the default merge argument index for _m functions.
> + The actual index depends on how many arguments the function takes. */
> +const unsigned int DEFAULT_MERGE_ARGNO = 0;
> +
> +/* Flags that describe what a function might do, in addition to reading
> + its arguments and returning a result. */
> +const unsigned int CP_READ_FPCR = 1U << 0;
> +const unsigned int CP_RAISE_FP_EXCEPTIONS = 1U << 1;
> +const unsigned int CP_READ_MEMORY = 1U << 2;
> +const unsigned int CP_WRITE_MEMORY = 1U << 3;
>
> /* Enumerates the MVE predicate and (data) vector types, together called
> "vector types" for brevity. */
> @@ -30,11 +102,604 @@ enum vector_type_index
> VECTOR_TYPE_ ## ACLE_NAME,
> #include "arm-mve-builtins.def"
> NUM_VECTOR_TYPES
> -#undef DEF_MVE_TYPE
> };
>
> +/* Classifies the available measurement units for an address displacement. */
> +enum units_index
> +{
> + UNITS_none,
> + UNITS_bytes
> +};
> +
> +/* Describes the various uses of a governing predicate. */
> +enum predication_index
> +{
> + /* No governing predicate is present. */
> + PRED_none,
> +
> + /* Merging predication: copy inactive lanes from the first data argument
> + to the vector result. */
> + PRED_m,
> +
> + /* Plain predication: inactive lanes are not used to compute the
> + scalar result. */
> + PRED_p,
> +
> + /* "Don't care" predication: set inactive lanes of the vector result
> + to arbitrary values. */
> + PRED_x,
> +
> + /* Zero predication: set inactive lanes of the vector result to zero. */
> + PRED_z,
> +
> + NUM_PREDS
> +};
> +
> +/* Classifies element types, based on type suffixes with the bit count
> + removed. */
> +enum type_class_index
> +{
> + TYPE_bool,
> + TYPE_float,
> + TYPE_signed,
> + TYPE_unsigned,
> + NUM_TYPE_CLASSES
> +};
> +
> +/* Classifies an operation into "modes"; for example, to distinguish
> + vector-scalar operations from vector-vector operations, or to
> + distinguish between different addressing modes. This classification
> + accounts for the function suffixes that occur between the base name
> + and the first type suffix. */
> +enum mode_suffix_index
> +{
> +#define DEF_MVE_MODE(NAME, BASE, DISPLACEMENT, UNITS) MODE_##NAME,
> +#include "arm-mve-builtins.def"
> + MODE_none
> +};
> +
> +/* Enumerates the possible type suffixes. Each suffix is associated with
> + a vector type, but for predicates provides extra information about the
> + element size. */
> +enum type_suffix_index
> +{
> +#define DEF_MVE_TYPE_SUFFIX(NAME, ACLE_TYPE, CLASS, BITS, MODE) \
> + TYPE_SUFFIX_ ## NAME,
> +#include "arm-mve-builtins.def"
> + NUM_TYPE_SUFFIXES
> +};
> +
> +/* Combines two type suffixes. */
> +typedef enum type_suffix_index type_suffix_pair[2];
> +
> +class function_base;
> +class function_shape;
> +
> +/* Static information about a mode suffix. */
> +struct mode_suffix_info
> +{
> + /* The suffix string itself. */
> + const char *string;
> +
> + /* The type of the vector base address, or NUM_VECTOR_TYPES if the
> + mode does not include a vector base address. */
> + vector_type_index base_vector_type;
> +
> + /* The type of the vector displacement, or NUM_VECTOR_TYPES if the
> + mode does not include a vector displacement. (Note that scalar
> + displacements are always int64_t.) */
> + vector_type_index displacement_vector_type;
> +
> + /* The units in which the vector or scalar displacement is measured,
> + or UNITS_none if the mode doesn't take a displacement. */
> + units_index displacement_units;
> +};
> +
> +/* Static information about a type suffix. */
> +struct type_suffix_info
> +{
> + /* The suffix string itself. */
> + const char *string;
> +
> + /* The associated ACLE vector or predicate type. */
> + vector_type_index vector_type : 8;
> +
> + /* What kind of type the suffix represents. */
> + type_class_index tclass : 8;
> +
> + /* The number of bits and bytes in an element. For predicates this
> + measures the associated data elements. */
> + unsigned int element_bits : 8;
> + unsigned int element_bytes : 8;
> +
> + /* True if the suffix is for an integer type. */
> + unsigned int integer_p : 1;
> + /* True if the suffix is for an unsigned type. */
> + unsigned int unsigned_p : 1;
> + /* True if the suffix is for a floating-point type. */
> + unsigned int float_p : 1;
> + unsigned int spare : 13;
> +
> + /* The associated vector or predicate mode. */
> + machine_mode vector_mode : 16;
> +};
> +
> +/* Static information about a set of functions. */
> +struct function_group_info
> +{
> + /* The base name, as a string. */
> + const char *base_name;
> +
> + /* Describes the behavior associated with the function base name. */
> + const function_base *const *base;
> +
> + /* The shape of the functions, as described above the class definition.
> + It's possible to have entries with the same base name but different
> + shapes. */
> + const function_shape *const *shape;
> +
> + /* A list of the available type suffixes, and of the available predication
> + types. The function supports every combination of the two.
> +
> + The list of type suffixes is terminated by two NUM_TYPE_SUFFIXES
> + while the list of predication types is terminated by NUM_PREDS.
> + The list of type suffixes is lexicographically ordered based
> + on the index value. */
> + const type_suffix_pair *types;
> + const predication_index *preds;
> +
> + /* Whether the function group requires a floating-point ABI. */
> + bool requires_float;
> +};
> +
> +/* Describes a single fully-resolved function (i.e. one that has a
> + unique full name). */
> +class GTY((user)) function_instance
> +{
> +public:
> + function_instance (const char *, const function_base *,
> + const function_shape *, mode_suffix_index,
> + const type_suffix_pair &, predication_index);
> +
> + bool operator== (const function_instance &) const;
> + bool operator!= (const function_instance &) const;
> + hashval_t hash () const;
> +
> + unsigned int call_properties () const;
> + bool reads_global_state_p () const;
> + bool modifies_global_state_p () const;
> + bool could_trap_p () const;
> +
> + unsigned int vectors_per_tuple () const;
> +
> + const mode_suffix_info &mode_suffix () const;
> +
> + const type_suffix_info &type_suffix (unsigned int) const;
> + tree scalar_type (unsigned int) const;
> + tree vector_type (unsigned int) const;
> + tree tuple_type (unsigned int) const;
> + machine_mode vector_mode (unsigned int) const;
> + machine_mode gp_mode (unsigned int) const;
> +
> + bool has_inactive_argument () const;
> +
> + /* The properties of the function. (The explicit "enum"s are required
> + for gengtype.) */
> + const char *base_name;
> + const function_base *base;
> + const function_shape *shape;
> + enum mode_suffix_index mode_suffix_id;
> + type_suffix_pair type_suffix_ids;
> + enum predication_index pred;
> +};
> +
> +class registered_function;
> +
> +/* A class for building and registering function decls. */
> +class function_builder
> +{
> +public:
> + function_builder ();
> + ~function_builder ();
> +
> + void add_unique_function (const function_instance &, tree,
> + vec<tree> &, bool, bool, bool);
> + void add_overloaded_function (const function_instance &, bool, bool);
> + void add_overloaded_functions (const function_group_info &,
> + mode_suffix_index, bool);
> +
> + void register_function_group (const function_group_info &, bool);
> +
> +private:
> + void append_name (const char *);
> + char *finish_name ();
> +
> + char *get_name (const function_instance &, bool, bool);
> +
> + tree get_attributes (const function_instance &);
> +
> + registered_function &add_function (const function_instance &,
> + const char *, tree, tree,
> + bool, bool, bool);
> +
> + /* The function type to use for functions that are resolved by
> + function_resolver. */
> + tree m_overload_type;
> +
> + /* True if we should create a separate decl for each instance of an
> + overloaded function, instead of using function_resolver. */
> + bool m_direct_overloads;
> +
> + /* Used for building up function names. */
> + obstack m_string_obstack;
> +
> + /* Maps all overloaded function names that we've registered so far
> + to their associated function_instances. */
> + hash_map<nofree_string_hash, registered_function *> m_overload_names;
> +};
> +
> +/* A base class for handling calls to built-in functions. */
> +class function_call_info : public function_instance
> +{
> +public:
> + function_call_info (location_t, const function_instance &, tree);
> +
> + bool function_returns_void_p ();
> +
> + /* The location of the call. */
> + location_t location;
> +
> + /* The FUNCTION_DECL that is being called. */
> + tree fndecl;
> +};
> +
> +/* A class for resolving an overloaded function call. */
> +class function_resolver : public function_call_info
> +{
> +public:
> + enum { SAME_SIZE = 256, HALF_SIZE, QUARTER_SIZE };
> + static const type_class_index SAME_TYPE_CLASS = NUM_TYPE_CLASSES;
> +
> + function_resolver (location_t, const function_instance &, tree,
> + vec<tree, va_gc> &);
> +
> + tree get_vector_type (type_suffix_index);
> + const char *get_scalar_type_name (type_suffix_index);
> + tree get_argument_type (unsigned int);
> + bool scalar_argument_p (unsigned int);
> +
> + tree report_no_such_form (type_suffix_index);
> + tree lookup_form (mode_suffix_index,
> + type_suffix_index = NUM_TYPE_SUFFIXES,
> + type_suffix_index = NUM_TYPE_SUFFIXES);
> + tree resolve_to (mode_suffix_index,
> + type_suffix_index = NUM_TYPE_SUFFIXES,
> + type_suffix_index = NUM_TYPE_SUFFIXES);
> +
> + type_suffix_index infer_vector_or_tuple_type (unsigned int, unsigned int);
> + type_suffix_index infer_vector_type (unsigned int);
> +
> + bool require_vector_or_scalar_type (unsigned int);
> +
> + bool require_vector_type (unsigned int, vector_type_index);
> + bool require_matching_vector_type (unsigned int, type_suffix_index);
> + bool require_derived_vector_type (unsigned int, unsigned int,
> + type_suffix_index,
> + type_class_index = SAME_TYPE_CLASS,
> + unsigned int = SAME_SIZE);
> + bool require_integer_immediate (unsigned int);
> + bool require_scalar_type (unsigned int, const char *);
> + bool require_derived_scalar_type (unsigned int, type_class_index,
> + unsigned int = SAME_SIZE);
> +
> + bool check_num_arguments (unsigned int);
> + bool check_gp_argument (unsigned int, unsigned int &, unsigned int &);
> + tree resolve_unary (type_class_index = SAME_TYPE_CLASS,
> + unsigned int = SAME_SIZE, bool = false);
> + tree resolve_unary_n ();
> + tree resolve_uniform (unsigned int, unsigned int = 0);
> + tree resolve_uniform_opt_n (unsigned int);
> + tree finish_opt_n_resolution (unsigned int, unsigned int, type_suffix_index,
> + type_class_index = SAME_TYPE_CLASS,
> + unsigned int = SAME_SIZE,
> + type_suffix_index = NUM_TYPE_SUFFIXES);
> +
> + tree resolve ();
> +
> +private:
> + /* The arguments to the overloaded function. */
> + vec<tree, va_gc> &m_arglist;
> +};
> +
> +/* A class for checking that the semantic constraints on a function call are
> + satisfied, such as arguments being integer constant expressions with
> + a particular range. The parent class's FNDECL is the decl that was
> + called in the original source, before overload resolution. */
> +class function_checker : public function_call_info
> +{
> +public:
> + function_checker (location_t, const function_instance &, tree,
> + tree, unsigned int, tree *);
> +
> + bool require_immediate_enum (unsigned int, tree);
> + bool require_immediate_lane_index (unsigned int, unsigned int = 1);
> + bool require_immediate_range (unsigned int, HOST_WIDE_INT, HOST_WIDE_INT);
> +
> + bool check ();
> +
> +private:
> + bool argument_exists_p (unsigned int);
> +
> + bool require_immediate (unsigned int, HOST_WIDE_INT &);
> +
> + /* The type of the resolved function. */
> + tree m_fntype;
> +
> + /* The arguments to the function. */
> + unsigned int m_nargs;
> + tree *m_args;
> +
> + /* The first argument not associated with the function's predication
> + type. */
> + unsigned int m_base_arg;
> +};
> +
> +/* A class for folding a gimple function call. */
> +class gimple_folder : public function_call_info
> +{
> +public:
> + gimple_folder (const function_instance &, tree,
> + gcall *);
> +
> + gimple *fold ();
> +
> + /* The call we're folding. */
> + gcall *call;
> +
> + /* The result of the call, or null if none. */
> + tree lhs;
> +};
> +
> +/* A class for expanding a function call into RTL. */
> +class function_expander : public function_call_info
> +{
> +public:
> + function_expander (const function_instance &, tree, tree, rtx);
> + rtx expand ();
> +
> + insn_code direct_optab_handler (optab, unsigned int = 0);
> +
> + rtx get_fallback_value (machine_mode, unsigned int, unsigned int &);
> + rtx get_reg_target ();
> +
> + void add_output_operand (insn_code);
> + void add_input_operand (insn_code, rtx);
> + void add_integer_operand (HOST_WIDE_INT);
> + rtx generate_insn (insn_code);
> +
> + rtx use_exact_insn (insn_code);
> + rtx use_unpred_insn (insn_code);
> + rtx use_pred_x_insn (insn_code);
> + rtx use_cond_insn (insn_code, unsigned int = DEFAULT_MERGE_ARGNO);
> +
> + rtx map_to_rtx_codes (rtx_code, rtx_code, rtx_code);
> +
> + /* The function call expression. */
> + tree call_expr;
> +
> + /* For functions that return a value, this is the preferred location
> + of that value. It could be null or could have a different mode
> + from the function return type. */
> + rtx possible_target;
> +
> + /* The expanded arguments. */
> + auto_vec<rtx, 16> args;
> +
> +private:
> + /* Used to build up the operands to an instruction. */
> + auto_vec<expand_operand, 8> m_ops;
> +};
> +
> +/* Provides information about a particular function base name, and handles
> + tasks related to the base name. */
> +class function_base
> +{
> +public:
> + /* Return a set of CP_* flags that describe what the function might do,
> + in addition to reading its arguments and returning a result. */
> + virtual unsigned int call_properties (const function_instance &) const;
> +
> + /* If the function operates on tuples of vectors, return the number
> + of vectors in the tuples, otherwise return 1. */
> + virtual unsigned int vectors_per_tuple () const { return 1; }
> +
> + /* Try to fold the given gimple call. Return the new gimple statement
> + on success, otherwise return null. */
> + virtual gimple *fold (gimple_folder &) const { return NULL; }
> +
> + /* Expand the given call into rtl. Return the result of the function,
> + or an arbitrary value if the function doesn't return a result. */
> + virtual rtx expand (function_expander &) const = 0;
> +};
> +
> +/* Classifies functions into "shapes". The idea is to take all the
> + type signatures for a set of functions, and classify what's left
> + based on:
> +
> + - the number of arguments
> +
> + - the process of determining the types in the signature from the mode
> + and type suffixes in the function name (including types that are not
> + affected by the suffixes)
> +
> + - which arguments must be integer constant expressions, and what range
> + those arguments have
> +
> + - the process for mapping overloaded names to "full" names. */
> +class function_shape
> +{
> +public:
> + virtual bool explicit_type_suffix_p (unsigned int, enum predication_index,
> +                                      enum mode_suffix_index) const = 0;
> + virtual bool explicit_mode_suffix_p (enum predication_index,
> +                                      enum mode_suffix_index) const = 0;
> + virtual bool skip_overload_p (enum predication_index,
> +                               enum mode_suffix_index) const = 0;
> +
> + /* Define all functions associated with the given group. */
> + virtual void build (function_builder &,
> + const function_group_info &,
> + bool) const = 0;
> +
> + /* Try to resolve the overloaded call. Return the non-overloaded
> + function decl on success and error_mark_node on failure. */
> + virtual tree resolve (function_resolver &) const = 0;
> +
> + /* Check whether the given call is semantically valid. Return true
> + if it is, otherwise report an error and return false. */
> + virtual bool check (function_checker &) const { return true; }
> +};
> +
> +extern const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1];
> +extern const mode_suffix_info mode_suffixes[MODE_none + 1];
> +
> extern tree scalar_types[NUM_VECTOR_TYPES];
> -extern tree acle_vector_types[3][NUM_VECTOR_TYPES + 1];
> +extern tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1];
> +
> +/* Return the ACLE type mve_pred16_t. */
> +inline tree
> +get_mve_pred16_t (void)
> +{
> + return acle_vector_types[0][VECTOR_TYPE_mve_pred16_t];
> +}
> +
> +/* Try to find a mode with the given mode_suffix_info fields. Return the
> + mode on success or MODE_none on failure. */
> +inline mode_suffix_index
> +find_mode_suffix (vector_type_index base_vector_type,
> + vector_type_index displacement_vector_type,
> + units_index displacement_units)
> +{
> + for (unsigned int mode_i = 0; mode_i < ARRAY_SIZE (mode_suffixes); ++mode_i)
> + {
> + const mode_suffix_info &mode = mode_suffixes[mode_i];
> + if (mode.base_vector_type == base_vector_type
> + && mode.displacement_vector_type == displacement_vector_type
> + && mode.displacement_units == displacement_units)
> + return mode_suffix_index (mode_i);
> + }
> + return MODE_none;
> +}
> +
> +/* Return the type suffix associated with ELEMENT_BITS-bit elements of type
> + class TCLASS. */
> +inline type_suffix_index
> +find_type_suffix (type_class_index tclass, unsigned int element_bits)
> +{
> + for (unsigned int i = 0; i < NUM_TYPE_SUFFIXES; ++i)
> + if (type_suffixes[i].tclass == tclass
> + && type_suffixes[i].element_bits == element_bits)
> + return type_suffix_index (i);
> + gcc_unreachable ();
> +}
> +
> +inline function_instance::
> +function_instance (const char *base_name_in,
> + const function_base *base_in,
> + const function_shape *shape_in,
> + mode_suffix_index mode_suffix_id_in,
> + const type_suffix_pair &type_suffix_ids_in,
> + predication_index pred_in)
> + : base_name (base_name_in), base (base_in), shape (shape_in),
> + mode_suffix_id (mode_suffix_id_in), pred (pred_in)
> +{
> + memcpy (type_suffix_ids, type_suffix_ids_in, sizeof (type_suffix_ids));
> +}
> +
> +inline bool
> +function_instance::operator== (const function_instance &other) const
> +{
> + return (base == other.base
> + && shape == other.shape
> + && mode_suffix_id == other.mode_suffix_id
> + && pred == other.pred
> + && type_suffix_ids[0] == other.type_suffix_ids[0]
> + && type_suffix_ids[1] == other.type_suffix_ids[1]);
> +}
> +
> +inline bool
> +function_instance::operator!= (const function_instance &other) const
> +{
> + return !operator== (other);
> +}
> +
> +/* If the function operates on tuples of vectors, return the number
> + of vectors in the tuples, otherwise return 1. */
> +inline unsigned int
> +function_instance::vectors_per_tuple () const
> +{
> + return base->vectors_per_tuple ();
> +}
> +
> +/* Return information about the function's mode suffix. */
> +inline const mode_suffix_info &
> +function_instance::mode_suffix () const
> +{
> + return mode_suffixes[mode_suffix_id];
> +}
> +
> +/* Return information about type suffix I. */
> +inline const type_suffix_info &
> +function_instance::type_suffix (unsigned int i) const
> +{
> + return type_suffixes[type_suffix_ids[i]];
> +}
> +
> +/* Return the scalar type associated with type suffix I. */
> +inline tree
> +function_instance::scalar_type (unsigned int i) const
> +{
> + return scalar_types[type_suffix (i).vector_type];
> +}
> +
> +/* Return the vector type associated with type suffix I. */
> +inline tree
> +function_instance::vector_type (unsigned int i) const
> +{
> + return acle_vector_types[0][type_suffix (i).vector_type];
> +}
> +
> +/* If the function operates on tuples of vectors, return the tuple type
> + associated with type suffix I, otherwise return the vector type associated
> + with type suffix I. */
> +inline tree
> +function_instance::tuple_type (unsigned int i) const
> +{
> + unsigned int num_vectors = vectors_per_tuple ();
> + return acle_vector_types[num_vectors - 1][type_suffix (i).vector_type];
> +}
> +
> +/* Return the vector or predicate mode associated with type suffix I. */
> +inline machine_mode
> +function_instance::vector_mode (unsigned int i) const
> +{
> + return type_suffix (i).vector_mode;
> +}
> +
> +/* Return true if the function has no return value. */
> +inline bool
> +function_call_info::function_returns_void_p ()
> +{
> + return TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node;
> +}
> +
> +/* Default implementation of function_base::call_properties, with
> + conservatively correct behavior for floating-point instructions. */
> +inline unsigned int
> +function_base::call_properties (const function_instance &instance) const
> +{
> + unsigned int flags = 0;
> + if (instance.type_suffix (0).float_p || instance.type_suffix (1).float_p)
> + flags |= CP_READ_FPCR | CP_RAISE_FP_EXCEPTIONS;
> + return flags;
> +}
>
> } /* end namespace arm_mve */
>
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 1bdbd3b8ab3..61fcd671437 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -215,7 +215,8 @@ extern opt_machine_mode arm_get_mask_mode (machine_mode mode);
> those groups. */
> enum arm_builtin_class
> {
> - ARM_BUILTIN_GENERAL
> + ARM_BUILTIN_GENERAL,
> + ARM_BUILTIN_MVE
> };
>
> /* Built-in function codes are structured so that the low
> @@ -229,6 +230,13 @@ const unsigned int ARM_BUILTIN_CLASS = (1 << ARM_BUILTIN_SHIFT) - 1;
> /* MVE functions. */
> namespace arm_mve {
> void handle_arm_mve_types_h ();
> + void handle_arm_mve_h (bool);
> + tree resolve_overloaded_builtin (location_t, unsigned int,
> + vec<tree, va_gc> *);
> + bool check_builtin_call (location_t, vec<location_t>, unsigned int,
> + tree, unsigned int, tree *);
> + gimple *gimple_fold_builtin (unsigned int code, gcall *stmt);
> + rtx expand_builtin (unsigned int, tree, rtx);
> }
>
> /* Thumb functions. */
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index bf7ff9a9704..004e6c6194e 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -69,6 +69,7 @@
> #include "optabs-libfuncs.h"
> #include "gimplify.h"
> #include "gimple.h"
> +#include "gimple-iterator.h"
> #include "selftest.h"
> #include "tree-vectorizer.h"
> #include "opts.h"
> @@ -506,6 +507,9 @@ static const struct attribute_spec arm_attribute_table[] =
> #undef TARGET_FUNCTION_VALUE_REGNO_P
> #define TARGET_FUNCTION_VALUE_REGNO_P arm_function_value_regno_p
>
> +#undef TARGET_GIMPLE_FOLD_BUILTIN
> +#define TARGET_GIMPLE_FOLD_BUILTIN arm_gimple_fold_builtin
> +
> #undef TARGET_ASM_OUTPUT_MI_THUNK
> #define TARGET_ASM_OUTPUT_MI_THUNK arm_output_mi_thunk
> #undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
> @@ -2844,6 +2848,29 @@ arm_init_libfuncs (void)
> speculation_barrier_libfunc = init_one_libfunc ("__speculation_barrier");
> }
>
> +/* Implement TARGET_GIMPLE_FOLD_BUILTIN. */
> +static bool
> +arm_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> +{
> + gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
> + tree fndecl = gimple_call_fndecl (stmt);
> + unsigned int code = DECL_MD_FUNCTION_CODE (fndecl);
> + unsigned int subcode = code >> ARM_BUILTIN_SHIFT;
> + gimple *new_stmt = NULL;
> + switch (code & ARM_BUILTIN_CLASS)
> + {
> + case ARM_BUILTIN_GENERAL:
> + break;
> + case ARM_BUILTIN_MVE:
> + new_stmt = arm_mve::gimple_fold_builtin (subcode, stmt);
> + }
> + if (!new_stmt)
> + return false;
> +
> + gsi_replace (gsi, new_stmt, true);
> + return true;
> +}
> +
> /* On AAPCS systems, this is the "struct __va_list". */
> static GTY(()) tree va_list_type;
>
> diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
> index 1262d668121..0d2ba968fc0 100644
> --- a/gcc/config/arm/arm_mve.h
> +++ b/gcc/config/arm/arm_mve.h
> @@ -34,6 +34,12 @@
> #endif
> #include "arm_mve_types.h"
>
> +#ifdef __ARM_MVE_PRESERVE_USER_NAMESPACE
> +#pragma GCC arm "arm_mve.h" true
> +#else
> +#pragma GCC arm "arm_mve.h" false
> +#endif
> +
> #ifndef __ARM_MVE_PRESERVE_USER_NAMESPACE
> #define vst4q(__addr, __value) __arm_vst4q(__addr, __value)
> #define vdupq_n(__a) __arm_vdupq_n(__a)
> diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
> index 3139750c606..8e235f63ee6 100644
> --- a/gcc/config/arm/predicates.md
> +++ b/gcc/config/arm/predicates.md
> @@ -903,3 +903,7 @@ (define_predicate "call_insn_operand"
> (define_special_predicate "aligned_operand"
> (ior (not (match_code "mem"))
> (match_test "MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (mode)")))
> +
> +;; A special predicate that doesn't match a particular mode.
> +(define_special_predicate "arm_any_register_operand"
> + (match_code "reg"))
> diff --git a/gcc/config/arm/t-arm b/gcc/config/arm/t-arm
> index 637e72af5bb..9a1b06368a1 100644
> --- a/gcc/config/arm/t-arm
> +++ b/gcc/config/arm/t-arm
> @@ -154,15 +154,41 @@ arm-builtins.o: $(srcdir)/config/arm/arm-builtins.cc $(CONFIG_H) \
> $(srcdir)/config/arm/arm-builtins.cc
>
> arm-mve-builtins.o: $(srcdir)/config/arm/arm-mve-builtins.cc $(CONFIG_H) \
> - $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
> - fold-const.h langhooks.h stringpool.h attribs.h diagnostic.h \
> + $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) $(TM_P_H) \
> + memmodel.h insn-codes.h optabs.h recog.h expr.h basic-block.h \
> + function.h fold-const.h gimple.h gimple-fold.h emit-rtl.h langhooks.h \
> + stringpool.h attribs.h diagnostic.h \
> $(srcdir)/config/arm/arm-protos.h \
> $(srcdir)/config/arm/arm-builtins.h \
> $(srcdir)/config/arm/arm-mve-builtins.h \
> - $(srcdir)/config/arm/arm-mve-builtins.def
> + $(srcdir)/config/arm/arm-mve-builtins-base.h \
> + $(srcdir)/config/arm/arm-mve-builtins-shapes.h \
> + $(srcdir)/config/arm/arm-mve-builtins.def \
> + $(srcdir)/config/arm/arm-mve-builtins-base.def
> $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
> $(srcdir)/config/arm/arm-mve-builtins.cc
>
> +arm-mve-builtins-shapes.o: \
> + $(srcdir)/config/arm/arm-mve-builtins-shapes.cc \
> + $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
> + $(RTL_H) memmodel.h insn-codes.h optabs.h \
> + $(srcdir)/config/arm/arm-mve-builtins.h \
> + $(srcdir)/config/arm/arm-mve-builtins-shapes.h
> + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
> + $(srcdir)/config/arm/arm-mve-builtins-shapes.cc
> +
> +arm-mve-builtins-base.o: \
> + $(srcdir)/config/arm/arm-mve-builtins-base.cc \
> + $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \
> + memmodel.h insn-codes.h $(OPTABS_H) \
> + $(BASIC_BLOCK_H) $(FUNCTION_H) $(GIMPLE_H) \
> + $(srcdir)/config/arm/arm-mve-builtins.h \
> + $(srcdir)/config/arm/arm-mve-builtins-shapes.h \
> + $(srcdir)/config/arm/arm-mve-builtins-base.h \
> + $(srcdir)/config/arm/arm-mve-builtins-functions.h
> + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
> + $(srcdir)/config/arm/arm-mve-builtins-base.cc
> +
> arm-c.o: $(srcdir)/config/arm/arm-c.cc $(CONFIG_H) $(SYSTEM_H) \
> coretypes.h $(TM_H) $(TREE_H) output.h $(C_COMMON_H)
> $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
> --
> 2.34.1