From: Richard Biener <rguenther@suse.de>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Richard Sandiford <rdsandiford@googlemail.com>,
Jason Merrill <jason@redhat.com>,
"Joseph S. Myers" <joseph@codesourcery.com>,
Jan Hubicka <jh@suse.cz>,
gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Add __builtin_convertvector support (PR c++/85052)
Date: Thu, 03 Jan 2019 11:16:00 -0000 [thread overview]
Message-ID: <alpine.LSU.2.20.1901031205260.23386@zhemvz.fhfr.qr> (raw)
In-Reply-To: <20190103100640.GM30353@tucnak>
On Thu, 3 Jan 2019, Jakub Jelinek wrote:
> Hi!
>
> The following patch adds support for the __builtin_convertvector builtin.
> C casts on generic vectors are just a reinterpretation of the bits (i.e. a
> VCE); this builtin allows casting int/unsigned elements to float or vice
> versa, or promoting/demoting them.  The doc/ change is missing, I will
> write it soon.
>
> The builtin appeared in clang 3.4, I think, and is apparently in real-world
> use, as e.g. Honza reported.  The first argument is an expression with
> vector type, the second argument is a vector type (similarly e.g. to
> va_arg), to which the first argument should be converted.  Both vector
> types need to have the same number of elements.
>
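To make the interface concrete, a minimal usage sketch (my own illustration, not part of the patch; it assumes a compiler that already accepts the builtin, and the typedefs and function name are hypothetical):

```c
/* Minimal sketch of the __builtin_convertvector interface described
   above; both vector types have four elements, only the element type
   changes.  */
typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));

v4sf
int_to_float (v4si x)
{
  /* Element-wise int -> float conversion (a FLOAT_EXPR per lane);
     a plain C cast of the vector would merely reinterpret the bits.  */
  return __builtin_convertvector (x, v4sf);
}
```
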
> I've implemented the same-element-size (and thus same whole-vector-size)
> conversions efficiently: signed to unsigned and vice versa, or the same
> vector type, just using a VCE; e.g. int <-> float or long long <-> double
> using the appropriate optab, possibly repeated multiple times for very
> large vectors.
> For everything there is a fallback to lower __builtin_convertvector (x, t)
> as { (__typeof (t[0])) x[0], (__typeof (t[1])) x[1], ... }.
>
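Spelled out as scalar C (a hypothetical helper, not the patch's code), that fallback lowering amounts to:

```c
/* What the generic fallback lowering above does, element by element:
   __builtin_convertvector (x, v4si) becomes
   { (int) x[0], (int) x[1], (int) x[2], (int) x[3] }.  */
typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
typedef int v4si __attribute__((vector_size (4 * sizeof (int))));

v4si
float_to_int_lowered (v4sf x)
{
  /* Each lane is converted with an ordinary scalar cast
     (FIX_TRUNC_EXPR, i.e. truncation toward zero).  */
  return (v4si) { (int) x[0], (int) x[1], (int) x[2], (int) x[3] };
}
```
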
> What isn't implemented efficiently (yet) are the narrowing and widening
> conversions; the optabs we have are meant for same-size vectors, so
> for packing we have 2 arguments that we pack into 1, and for unpacking we
> have those lo/hi variants, but in this case, at least for the most common
> vectors, we have just one argument and want a result with the same number
> of elements.  The AVX* instructions that operate on different vector sizes
> are what does this most efficiently; of course, for large generic vectors
> we can easily use these optabs.  Shall we go for e.g. trying to pack the
> argument and an all-zeros dummy operand and pick the low half of the
> result vector, or pick the low and high halves of the argument and use
> half-sized vector operations, or both?
I guess it depends on target capabilities - I think
__builtin_convertvector is a bit "misdesigned" for pack/unpack.  You
also have to consider a v2di to v2qi conversion, which requires
several unpack steps.  Does the clang documentation give any
hints on how to "efficiently" use __builtin_convertvector for
packing/unpacking without exposing too much of the target architecture?
But yes, for unpacking you'd use a series of vec_unpack_*_lo_expr
with padded input (padded with "don't care" if we had that; on
RTL we'd use a paradoxical subreg, on GIMPLE we _might_ consider
allowing a VCE of a different size?).  Or simply allow half-size input
operands to vec_unpack_*_lo, where that expands to paradoxical
subregs (a bit difficult for the optab query, I guess).
For packing you'd use a series of vec_pack_* operations on the argument
split into two halves via BIT_FIELD_REFs.
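As a scalar C sketch of that halves approach (hypothetical code, just to show the shape of the lowering): take the low and high halves of the argument, narrow each half as a half-width operation would, and concatenate the results:

```c
/* Hypothetical scalar spelling of narrowing by halves: each half of
   the v4di argument corresponds to a BIT_FIELD_REF in GIMPLE and
   would feed one half-sized vec_pack_trunc-style operation.  */
typedef long long v4di __attribute__((vector_size (4 * sizeof (long long))));
typedef int v4si __attribute__((vector_size (4 * sizeof (int))));

v4si
narrow_by_halves (v4di x)
{
  /* Low half: elements 0..1; high half: elements 2..3.  */
  int lo0 = (int) x[0], lo1 = (int) x[1];
  int hi0 = (int) x[2], hi1 = (int) x[3];
  /* Concatenate the two narrowed halves into one full-width result.  */
  return (v4si) { lo0, lo1, hi0, hi1 };
}
```
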
What does clang do for testcases that request promotion/demotion?
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
do_vec_conversion needs a comment. Overall the patch (with its
existing features) looks OK to me.
As for Marc's comments, I agree that vector lowering happens quite late.
It might, for example, be useful to lower before vectorization (or
any loop optimization) so that unhandled generic vector code can
eventually be vectorized differently.  But that's something to
investigate for GCC 10.
Giving FE maintainers a chance to comment, so no overall ACK yet.
Thanks,
Richard.
> 2019-01-03 Jakub Jelinek <jakub@redhat.com>
>
> PR c++/85052
> * tree-vect-generic.c (expand_vector_piecewise): Add defaulted
> ret_type argument, if non-NULL, use that in preference to type
> for the result type.
> (expand_vector_parallel): Formatting fix.
> (do_vec_conversion, expand_vector_conversion): New functions.
> (expand_vector_operations_1): Call expand_vector_conversion
> for VEC_CONVERT ifn calls.
> * internal-fn.def (VEC_CONVERT): New internal function.
> * internal-fn.c (expand_VEC_CONVERT): New function.
> * fold-const-call.c (fold_const_vec_convert): New function.
> (fold_const_call): Use it for CFN_VEC_CONVERT.
> c-family/
> * c-common.h (enum rid): Add RID_BUILTIN_CONVERTVECTOR.
> (c_build_vec_convert): Declare.
> * c-common.c (c_build_vec_convert): New function.
> c/
> * c-parser.c (c_parser_postfix_expression): Parse
> __builtin_convertvector.
> cp/
> * cp-tree.h (cp_build_vec_convert): Declare.
> * parser.c (cp_parser_postfix_expression): Parse
> __builtin_convertvector.
> * constexpr.c: Include fold-const-call.h.
> (cxx_eval_internal_function): Handle IFN_VEC_CONVERT.
> (potential_constant_expression_1): Likewise.
> * semantics.c (cp_build_vec_convert): New function.
> * pt.c (tsubst_copy_and_build): Handle CALL_EXPR to
> IFN_VEC_CONVERT.
> testsuite/
> * c-c++-common/builtin-convertvector-1.c: New test.
> * c-c++-common/torture/builtin-convertvector-1.c: New test.
> * g++.dg/ext/builtin-convertvector-1.C: New test.
> * g++.dg/cpp0x/constexpr-builtin4.C: New test.
>
> --- gcc/tree-vect-generic.c.jj 2019-01-01 12:37:17.084976148 +0100
> +++ gcc/tree-vect-generic.c 2019-01-02 17:51:28.012876543 +0100
> @@ -267,7 +267,8 @@ do_negate (gimple_stmt_iterator *gsi, tr
> static tree
> expand_vector_piecewise (gimple_stmt_iterator *gsi, elem_op_func f,
> tree type, tree inner_type,
> - tree a, tree b, enum tree_code code)
> + tree a, tree b, enum tree_code code,
> + tree ret_type = NULL_TREE)
> {
> vec<constructor_elt, va_gc> *v;
> tree part_width = TYPE_SIZE (inner_type);
> @@ -278,23 +279,27 @@ expand_vector_piecewise (gimple_stmt_ite
> int i;
> location_t loc = gimple_location (gsi_stmt (*gsi));
>
> - if (types_compatible_p (gimple_expr_type (gsi_stmt (*gsi)), type))
> + if (ret_type
> + || types_compatible_p (gimple_expr_type (gsi_stmt (*gsi)), type))
> warning_at (loc, OPT_Wvector_operation_performance,
> "vector operation will be expanded piecewise");
> else
> warning_at (loc, OPT_Wvector_operation_performance,
> "vector operation will be expanded in parallel");
>
> + if (!ret_type)
> + ret_type = type;
> vec_alloc (v, (nunits + delta - 1) / delta);
> for (i = 0; i < nunits;
> i += delta, index = int_const_binop (PLUS_EXPR, index, part_width))
> {
> - tree result = f (gsi, inner_type, a, b, index, part_width, code, type);
> + tree result = f (gsi, inner_type, a, b, index, part_width, code,
> + ret_type);
> constructor_elt ce = {NULL_TREE, result};
> v->quick_push (ce);
> }
>
> - return build_constructor (type, v);
> + return build_constructor (ret_type, v);
> }
>
> /* Expand a vector operation to scalars with the freedom to use
> @@ -302,8 +307,7 @@ expand_vector_piecewise (gimple_stmt_ite
> in the vector type. */
> static tree
> expand_vector_parallel (gimple_stmt_iterator *gsi, elem_op_func f, tree type,
> - tree a, tree b,
> - enum tree_code code)
> + tree a, tree b, enum tree_code code)
> {
> tree result, compute_type;
> int n_words = tree_to_uhwi (TYPE_SIZE_UNIT (type)) / UNITS_PER_WORD;
> @@ -1547,6 +1551,147 @@ expand_vector_scalar_condition (gimple_s
> update_stmt (gsi_stmt (*gsi));
> }
>
> +static tree
> +do_vec_conversion (gimple_stmt_iterator *gsi, tree inner_type, tree a,
> + tree decl, tree bitpos, tree bitsize,
> + enum tree_code code, tree type)
> +{
> + a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
> + if (!VECTOR_TYPE_P (inner_type))
> + return gimplify_build1 (gsi, code, TREE_TYPE (type), a);
> + if (code == CALL_EXPR)
> + {
> + gimple *g = gimple_build_call (decl, 1, a);
> + tree lhs = make_ssa_name (TREE_TYPE (TREE_TYPE (decl)));
> + gimple_call_set_lhs (g, lhs);
> + gsi_insert_before (gsi, g, GSI_SAME_STMT);
> + return lhs;
> + }
> + else
> + {
> + tree outer_type = build_vector_type (TREE_TYPE (type),
> + TYPE_VECTOR_SUBPARTS (inner_type));
> + return gimplify_build1 (gsi, code, outer_type, a);
> + }
> +}
> +
> +/* Expand VEC_CONVERT ifn call. */
> +
> +static void
> +expand_vector_conversion (gimple_stmt_iterator *gsi)
> +{
> + gimple *stmt = gsi_stmt (*gsi);
> + gimple *g;
> + tree lhs = gimple_call_lhs (stmt);
> + tree arg = gimple_call_arg (stmt, 0);
> + tree decl = NULL_TREE;
> + tree ret_type = TREE_TYPE (lhs);
> + tree arg_type = TREE_TYPE (arg);
> + tree new_rhs, compute_type = TREE_TYPE (arg_type);
> + enum tree_code code = NOP_EXPR;
> + enum tree_code code1 = ERROR_MARK;
> + enum { NARROW, NONE, WIDEN } modifier = NONE;
> + optab optab1 = unknown_optab;
> +
> + gcc_checking_assert (VECTOR_TYPE_P (ret_type) && VECTOR_TYPE_P (arg_type));
> + gcc_checking_assert (tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (ret_type))));
> + gcc_checking_assert (tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (arg_type))));
> + if (INTEGRAL_TYPE_P (TREE_TYPE (ret_type))
> + && SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg_type)))
> + code = FIX_TRUNC_EXPR;
> + else if (INTEGRAL_TYPE_P (TREE_TYPE (arg_type))
> + && SCALAR_FLOAT_TYPE_P (TREE_TYPE (ret_type)))
> + code = FLOAT_EXPR;
> + if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (ret_type)))
> + < tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg_type))))
> + modifier = NARROW;
> + else if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (ret_type)))
> + > tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg_type))))
> + modifier = WIDEN;
> +
> + if (modifier == NONE && (code == FIX_TRUNC_EXPR || code == FLOAT_EXPR))
> + {
> + if (supportable_convert_operation (code, ret_type, arg_type, &decl,
> + &code1))
> + {
> + if (code1 == CALL_EXPR)
> + {
> + g = gimple_build_call (decl, 1, arg);
> + gimple_call_set_lhs (g, lhs);
> + }
> + else
> + g = gimple_build_assign (lhs, code1, arg);
> + gsi_replace (gsi, g, false);
> + return;
> + }
> + /* Can't use get_compute_type here, as supportable_convert_operation
> + doesn't necessarily use an optab and needs two arguments. */
> + tree vector_compute_type
> + = type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
> + unsigned HOST_WIDE_INT nelts;
> + if (vector_compute_type
> + && VECTOR_MODE_P (TYPE_MODE (vector_compute_type))
> + && subparts_gt (arg_type, vector_compute_type)
> + && TYPE_VECTOR_SUBPARTS (vector_compute_type).is_constant (&nelts))
> + {
> + while (nelts > 1)
> + {
> + tree ret1_type = build_vector_type (TREE_TYPE (ret_type), nelts);
> + tree arg1_type = build_vector_type (TREE_TYPE (arg_type), nelts);
> + if (supportable_convert_operation (code, ret1_type, arg1_type,
> + &decl, &code1))
> + {
> + new_rhs = expand_vector_piecewise (gsi, do_vec_conversion,
> + ret_type, arg1_type, arg,
> + decl, code1);
> + g = gimple_build_assign (lhs, new_rhs);
> + gsi_replace (gsi, g, false);
> + return;
> + }
> + nelts = nelts / 2;
> + }
> + }
> + }
> + /* FIXME: __builtin_convertvector argument and return vectors have the same
> + number of elements, so for both narrowing and widening we need to figure
> + out what is the best set of optabs to use. E.g. for NARROW
> + VEC_PACK_TRUNC_EXPR has 2 arguments, shall we prefer emitting that with
> + one argument of arg and another argument all zeros and extract first
> + half of the resulting vector, or extract lo and hi halves of the arg
> + vector and use VEC_PACK_TRUNC_EXPR on those? */
> + else if (0 && modifier == NARROW)
> + {
> + switch (code)
> + {
> + case NOP_EXPR:
> + code1 = VEC_PACK_TRUNC_EXPR;
> + optab1 = optab_for_tree_code (code1, arg_type, optab_default);
> + break;
> + case FIX_TRUNC_EXPR:
> + code1 = VEC_PACK_FIX_TRUNC_EXPR;
> + /* The signedness is determined from output operand. */
> + optab1 = optab_for_tree_code (code1, ret_type, optab_default);
> + break;
> + case FLOAT_EXPR:
> + code1 = VEC_PACK_FLOAT_EXPR;
> + optab1 = optab_for_tree_code (code1, arg_type, optab_default);
> + break;
> + default:
> + gcc_unreachable ();
> + }
> +
> + if (optab1)
> + compute_type = get_compute_type (code1, optab1, arg_type);
> + (void) compute_type;
> + }
> +
> + new_rhs = expand_vector_piecewise (gsi, do_vec_conversion, arg_type,
> + TREE_TYPE (arg_type), arg,
> + NULL_TREE, code, ret_type);
> + g = gimple_build_assign (lhs, new_rhs);
> + gsi_replace (gsi, g, false);
> +}
> +
> /* Process one statement. If we identify a vector operation, expand it. */
>
> static void
> @@ -1561,7 +1706,11 @@ expand_vector_operations_1 (gimple_stmt_
> /* Only consider code == GIMPLE_ASSIGN. */
> gassign *stmt = dyn_cast <gassign *> (gsi_stmt (*gsi));
> if (!stmt)
> - return;
> + {
> + if (gimple_call_internal_p (gsi_stmt (*gsi), IFN_VEC_CONVERT))
> + expand_vector_conversion (gsi);
> + return;
> + }
>
> code = gimple_assign_rhs_code (stmt);
> rhs_class = get_gimple_rhs_class (code);
> --- gcc/internal-fn.def.jj 2019-01-01 12:37:17.893962875 +0100
> +++ gcc/internal-fn.def 2019-01-02 11:24:24.307681792 +0100
> @@ -296,6 +296,7 @@ DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST
> DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
> DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
> DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW | ECF_LEAF, NULL)
> +DEF_INTERNAL_FN (VEC_CONVERT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>
> /* An unduplicable, uncombinable function. Generally used to preserve
> a CFG property in the face of jump threading, tail merging or
> --- gcc/internal-fn.c.jj 2019-01-01 12:37:19.567935410 +0100
> +++ gcc/internal-fn.c 2019-01-02 11:24:24.315681661 +0100
> @@ -2581,6 +2581,15 @@ expand_VA_ARG (internal_fn, gcall *)
> gcc_unreachable ();
> }
>
> +/* IFN_VEC_CONVERT is supposed to be expanded at pass_lower_vector. So this
> + dummy function should never be called. */
> +
> +static void
> +expand_VEC_CONVERT (internal_fn, gcall *)
> +{
> + gcc_unreachable ();
> +}
> +
> /* Expand the IFN_UNIQUE function according to its first argument. */
>
> static void
> --- gcc/fold-const-call.c.jj 2019-01-01 12:37:16.528985271 +0100
> +++ gcc/fold-const-call.c 2019-01-02 15:57:36.656449175 +0100
> @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
> #include "tm.h" /* For C[LT]Z_DEFINED_AT_ZERO. */
> #include "builtins.h"
> #include "gimple-expr.h"
> +#include "tree-vector-builder.h"
>
> /* Functions that test for certain constant types, abstracting away the
> decision about whether to check for overflow. */
> @@ -645,6 +646,40 @@ fold_const_reduction (tree type, tree ar
> return res;
> }
>
> +/* Fold a call to IFN_VEC_CONVERT (ARG) returning TYPE. */
> +
> +static tree
> +fold_const_vec_convert (tree ret_type, tree arg)
> +{
> + enum tree_code code = NOP_EXPR;
> + tree arg_type = TREE_TYPE (arg);
> + if (TREE_CODE (arg) != VECTOR_CST)
> + return NULL_TREE;
> +
> + gcc_checking_assert (VECTOR_TYPE_P (ret_type) && VECTOR_TYPE_P (arg_type));
> +
> + if (INTEGRAL_TYPE_P (TREE_TYPE (ret_type))
> + && SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg_type)))
> + code = FIX_TRUNC_EXPR;
> + else if (INTEGRAL_TYPE_P (TREE_TYPE (arg_type))
> + && SCALAR_FLOAT_TYPE_P (TREE_TYPE (ret_type)))
> + code = FLOAT_EXPR;
> +
> + tree_vector_builder elts;
> + elts.new_unary_operation (ret_type, arg, true);
> + unsigned int count = elts.encoded_nelts ();
> + for (unsigned int i = 0; i < count; ++i)
> + {
> + tree elt = fold_unary (code, TREE_TYPE (ret_type),
> + VECTOR_CST_ELT (arg, i));
> + if (elt == NULL_TREE || !CONSTANT_CLASS_P (elt))
> + return NULL_TREE;
> + elts.quick_push (elt);
> + }
> +
> + return elts.build ();
> +}
> +
> /* Try to evaluate:
>
> *RESULT = FN (*ARG)
> @@ -1232,6 +1267,9 @@ fold_const_call (combined_fn fn, tree ty
> case CFN_REDUC_XOR:
> return fold_const_reduction (type, arg, BIT_XOR_EXPR);
>
> + case CFN_VEC_CONVERT:
> + return fold_const_vec_convert (type, arg);
> +
> default:
> return fold_const_call_1 (fn, type, arg);
> }
> --- gcc/c-family/c-common.h.jj 2019-01-01 12:37:51.309414610 +0100
> +++ gcc/c-family/c-common.h 2019-01-02 11:24:24.314681677 +0100
> @@ -102,7 +102,7 @@ enum rid
> RID_ASM, RID_TYPEOF, RID_ALIGNOF, RID_ATTRIBUTE, RID_VA_ARG,
> RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL, RID_CHOOSE_EXPR,
> RID_TYPES_COMPATIBLE_P, RID_BUILTIN_COMPLEX, RID_BUILTIN_SHUFFLE,
> - RID_BUILTIN_TGMATH,
> + RID_BUILTIN_CONVERTVECTOR, RID_BUILTIN_TGMATH,
> RID_BUILTIN_HAS_ATTRIBUTE,
> RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
>
> @@ -1001,6 +1001,7 @@ extern bool lvalue_p (const_tree);
> extern bool vector_targets_convertible_p (const_tree t1, const_tree t2);
> extern bool vector_types_convertible_p (const_tree t1, const_tree t2, bool emit_lax_note);
> extern tree c_build_vec_perm_expr (location_t, tree, tree, tree, bool = true);
> +extern tree c_build_vec_convert (location_t, tree, location_t, tree, bool = true);
>
> extern void init_c_lex (void);
>
> --- gcc/c-family/c-common.c.jj 2019-01-01 12:37:51.366413675 +0100
> +++ gcc/c-family/c-common.c 2019-01-02 11:24:24.314681677 +0100
> @@ -376,6 +376,7 @@ const struct c_common_resword c_common_r
> RID_BUILTIN_CALL_WITH_STATIC_CHAIN, D_CONLY },
> { "__builtin_choose_expr", RID_CHOOSE_EXPR, D_CONLY },
> { "__builtin_complex", RID_BUILTIN_COMPLEX, D_CONLY },
> + { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
> { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
> { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
> { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
> @@ -1070,6 +1071,70 @@ c_build_vec_perm_expr (location_t loc, t
> ret = c_wrap_maybe_const (ret, true);
>
> return ret;
> +}
> +
> +/* Build a VEC_CONVERT ifn for __builtin_convertvector builtin. */
> +
> +tree
> +c_build_vec_convert (location_t loc1, tree expr, location_t loc2, tree type,
> + bool complain)
> +{
> + if (error_operand_p (type))
> + return error_mark_node;
> + if (error_operand_p (expr))
> + return error_mark_node;
> +
> + if (!VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
> + && !VECTOR_FLOAT_TYPE_P (TREE_TYPE (expr)))
> + {
> + if (complain)
> + error_at (loc1, "%<__builtin_convertvector%> first argument must "
> + "be an integer or floating vector");
> + return error_mark_node;
> + }
> +
> + if (!VECTOR_INTEGER_TYPE_P (type) && !VECTOR_FLOAT_TYPE_P (type))
> + {
> + if (complain)
> + error_at (loc2, "%<__builtin_convertvector%> second argument must "
> + "be an integer or floating vector type");
> + return error_mark_node;
> + }
> +
> + if (maybe_ne (TYPE_VECTOR_SUBPARTS (TREE_TYPE (expr)),
> + TYPE_VECTOR_SUBPARTS (type)))
> + {
> + if (complain)
> + error_at (loc1, "%<__builtin_convertvector%> number of elements "
> + "of the first argument vector and the second argument "
> + "vector type should be the same");
> + return error_mark_node;
> + }
> +
> + if ((TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (expr)))
> + == TYPE_MAIN_VARIANT (TREE_TYPE (type)))
> + || (VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
> + && VECTOR_INTEGER_TYPE_P (type)
> + && (TYPE_PRECISION (TREE_TYPE (TREE_TYPE (expr)))
> + == TYPE_PRECISION (TREE_TYPE (type)))))
> + return build1_loc (loc1, VIEW_CONVERT_EXPR, type, expr);
> +
> + bool wrap = true;
> + bool maybe_const = false;
> + tree ret;
> + if (!c_dialect_cxx ())
> + {
> + /* Avoid C_MAYBE_CONST_EXPRs inside of VEC_CONVERT argument. */
> + expr = c_fully_fold (expr, false, &maybe_const);
> + wrap &= maybe_const;
> + }
> +
> + ret = build_call_expr_internal_loc (loc1, IFN_VEC_CONVERT, type, 1, expr);
> +
> + if (!wrap)
> + ret = c_wrap_maybe_const (ret, true);
> +
> + return ret;
> }
>
> /* Like tree.c:get_narrower, but retain conversion from C++0x scoped enum
> --- gcc/c/c-parser.c.jj 2019-01-01 12:37:48.677457794 +0100
> +++ gcc/c/c-parser.c 2019-01-02 11:24:24.312681710 +0100
> @@ -8038,6 +8038,7 @@ enum tgmath_parm_kind
> __builtin_shuffle ( assignment-expression ,
> assignment-expression ,
> assignment-expression, )
> + __builtin_convertvector ( assignment-expression , type-name )
>
> offsetof-member-designator:
> identifier
> @@ -9113,17 +9114,14 @@ c_parser_postfix_expression (c_parser *p
> *p = convert_lvalue_to_rvalue (loc, *p, true, true);
>
> if (vec_safe_length (cexpr_list) == 2)
> - expr.value =
> - c_build_vec_perm_expr
> - (loc, (*cexpr_list)[0].value,
> - NULL_TREE, (*cexpr_list)[1].value);
> + expr.value = c_build_vec_perm_expr (loc, (*cexpr_list)[0].value,
> + NULL_TREE,
> + (*cexpr_list)[1].value);
>
> else if (vec_safe_length (cexpr_list) == 3)
> - expr.value =
> - c_build_vec_perm_expr
> - (loc, (*cexpr_list)[0].value,
> - (*cexpr_list)[1].value,
> - (*cexpr_list)[2].value);
> + expr.value = c_build_vec_perm_expr (loc, (*cexpr_list)[0].value,
> + (*cexpr_list)[1].value,
> + (*cexpr_list)[2].value);
> else
> {
> error_at (loc, "wrong number of arguments to "
> @@ -9133,6 +9131,41 @@ c_parser_postfix_expression (c_parser *p
> set_c_expr_source_range (&expr, loc, close_paren_loc);
> break;
> }
> + case RID_BUILTIN_CONVERTVECTOR:
> + {
> + location_t start_loc = loc;
> + c_parser_consume_token (parser);
> + matching_parens parens;
> + if (!parens.require_open (parser))
> + {
> + expr.set_error ();
> + break;
> + }
> + e1 = c_parser_expr_no_commas (parser, NULL);
> + mark_exp_read (e1.value);
> + if (!c_parser_require (parser, CPP_COMMA, "expected %<,%>"))
> + {
> + c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
> + expr.set_error ();
> + break;
> + }
> + loc = c_parser_peek_token (parser)->location;
> + t1 = c_parser_type_name (parser);
> + location_t end_loc = c_parser_peek_token (parser)->get_finish ();
> + c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> + "expected %<)%>");
> + if (t1 == NULL)
> + expr.set_error ();
> + else
> + {
> + tree type_expr = NULL_TREE;
> + expr.value = c_build_vec_convert (start_loc, e1.value, loc,
> + groktypename (t1, &type_expr,
> + NULL));
> + set_c_expr_source_range (&expr, start_loc, end_loc);
> + }
> + }
> + break;
> case RID_AT_SELECTOR:
> {
> gcc_assert (c_dialect_objc ());
> --- gcc/cp/cp-tree.h.jj 2019-01-01 12:37:46.884487212 +0100
> +++ gcc/cp/cp-tree.h 2019-01-02 16:43:35.480393140 +0100
> @@ -7142,6 +7142,8 @@ extern bool is_lambda_ignored_entity
> extern bool lambda_static_thunk_p (tree);
> extern tree finish_builtin_launder (location_t, tree,
> tsubst_flags_t);
> +extern tree cp_build_vec_convert (tree, location_t, tree,
> + tsubst_flags_t);
> extern void start_lambda_scope (tree);
> extern void record_lambda_scope (tree);
> extern void record_null_lambda_scope (tree);
> --- gcc/cp/parser.c.jj 2019-01-01 12:37:47.352479534 +0100
> +++ gcc/cp/parser.c 2019-01-02 16:19:44.765760167 +0100
> @@ -7031,6 +7031,32 @@ cp_parser_postfix_expression (cp_parser
> break;
> }
>
> + case RID_BUILTIN_CONVERTVECTOR:
> + {
> + tree expression;
> + tree type;
> + /* Consume the `__builtin_convertvector' token. */
> + cp_lexer_consume_token (parser->lexer);
> + /* Look for the opening `('. */
> + matching_parens parens;
> + parens.require_open (parser);
> + /* Now, parse the assignment-expression. */
> + expression = cp_parser_assignment_expression (parser);
> + /* Look for the `,'. */
> + cp_parser_require (parser, CPP_COMMA, RT_COMMA);
> + location_t type_location
> + = cp_lexer_peek_token (parser->lexer)->location;
> + /* Parse the type-id. */
> + {
> + type_id_in_expr_sentinel s (parser);
> + type = cp_parser_type_id (parser);
> + }
> + /* Look for the closing `)'. */
> + parens.require_close (parser);
> + return cp_build_vec_convert (expression, type_location, type,
> + tf_warning_or_error);
> + }
> +
> default:
> {
> tree type;
> --- gcc/cp/constexpr.c.jj 2019-01-01 12:37:47.282480682 +0100
> +++ gcc/cp/constexpr.c 2019-01-02 16:56:54.126359632 +0100
> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.
> #include "ubsan.h"
> #include "gimple-fold.h"
> #include "timevar.h"
> +#include "fold-const-call.h"
>
> static bool verify_constant (tree, bool, bool *, bool *);
> #define VERIFY_CONSTANT(X) \
> @@ -1449,6 +1450,20 @@ cxx_eval_internal_function (const conste
> return cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
> false, non_constant_p, overflow_p);
>
> + case IFN_VEC_CONVERT:
> + {
> + tree arg = cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
> + false, non_constant_p,
> + overflow_p);
> + if (TREE_CODE (arg) == VECTOR_CST)
> + return fold_const_call (CFN_VEC_CONVERT, TREE_TYPE (t), arg);
> + else
> + {
> + *non_constant_p = true;
> + return t;
> + }
> + }
> +
> default:
> if (!ctx->quiet)
> error_at (cp_expr_loc_or_loc (t, input_location),
> @@ -5623,7 +5638,9 @@ potential_constant_expression_1 (tree t,
> case IFN_SUB_OVERFLOW:
> case IFN_MUL_OVERFLOW:
> case IFN_LAUNDER:
> + case IFN_VEC_CONVERT:
> bail = false;
> + break;
>
> default:
> break;
> --- gcc/cp/semantics.c.jj 2019-01-01 12:37:46.976485703 +0100
> +++ gcc/cp/semantics.c 2019-01-02 18:15:42.844133048 +0100
> @@ -9933,4 +9933,26 @@ finish_builtin_launder (location_t loc,
> TREE_TYPE (arg), 1, arg);
> }
>
> +/* Finish __builtin_convertvector (arg, type). */
> +
> +tree
> +cp_build_vec_convert (tree arg, location_t loc, tree type,
> + tsubst_flags_t complain)
> +{
> + if (error_operand_p (type))
> + return error_mark_node;
> + if (error_operand_p (arg))
> + return error_mark_node;
> +
> + tree ret = NULL_TREE;
> + if (!type_dependent_expression_p (arg) && !dependent_type_p (type))
> + ret = c_build_vec_convert (cp_expr_loc_or_loc (arg, input_location), arg,
> + loc, type, (complain & tf_error) != 0);
> +
> + if (!processing_template_decl)
> + return ret;
> +
> + return build_call_expr_internal_loc (loc, IFN_VEC_CONVERT, type, 1, arg);
> +}
> +
> #include "gt-cp-semantics.h"
> --- gcc/cp/pt.c.jj 2019-01-01 12:37:47.081483980 +0100
> +++ gcc/cp/pt.c 2019-01-02 18:25:17.997778249 +0100
> @@ -18813,6 +18813,27 @@ tsubst_copy_and_build (tree t,
> (*call_args)[0], complain);
> break;
>
> + case IFN_VEC_CONVERT:
> + gcc_assert (nargs == 1);
> + if (vec_safe_length (call_args) != 1)
> + {
> + error_at (cp_expr_loc_or_loc (t, input_location),
> + "wrong number of arguments to "
> + "%<__builtin_convertvector%>");
> + ret = error_mark_node;
> + break;
> + }
> + ret = cp_build_vec_convert ((*call_args)[0], input_location,
> + tsubst (TREE_TYPE (t), args,
> + complain, in_decl),
> + complain);
> + if (TREE_CODE (ret) == VIEW_CONVERT_EXPR)
> + {
> + release_tree_vector (call_args);
> + RETURN (ret);
> + }
> + break;
> +
> default:
> /* Unsupported internal function with arguments. */
> gcc_unreachable ();
> --- gcc/testsuite/c-c++-common/builtin-convertvector-1.c.jj 2019-01-02 18:38:18.265090910 +0100
> +++ gcc/testsuite/c-c++-common/builtin-convertvector-1.c 2019-01-02 18:37:50.337544972 +0100
> @@ -0,0 +1,15 @@
> +typedef int v8si __attribute__((vector_size (8 * sizeof (int))));
> +typedef long long v4di __attribute__((vector_size (4 * sizeof (long long))));
> +
> +void
> +foo (v8si *x, v4di *y, int z)
> +{
> + __builtin_convertvector (*y, v8si); /* { dg-error "number of elements of the first argument vector and the second argument vector type should be the same" } */
> + __builtin_convertvector (*x, v4di); /* { dg-error "number of elements of the first argument vector and the second argument vector type should be the same" } */
> + __builtin_convertvector (*x, int); /* { dg-error "second argument must be an integer or floating vector type" } */
> + __builtin_convertvector (z, v4di); /* { dg-error "first argument must be an integer or floating vector" } */
> + __builtin_convertvector (); /* { dg-error "expected" } */
> + __builtin_convertvector (*x); /* { dg-error "expected" } */
> + __builtin_convertvector (*x, *y); /* { dg-error "expected" } */
> + __builtin_convertvector (*x, v8si, 1);/* { dg-error "expected" } */
> +}
> --- gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c.jj 2019-01-02 18:00:59.982534637 +0100
> +++ gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c 2019-01-02 18:00:32.871977360 +0100
> @@ -0,0 +1,131 @@
> +extern
> +#ifdef __cplusplus
> +"C"
> +#endif
> +void abort (void);
> +typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +typedef unsigned int v4usi __attribute__((vector_size (4 * sizeof (unsigned int))));
> +typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
> +typedef double v4df __attribute__((vector_size (4 * sizeof (double))));
> +typedef long long v256di __attribute__((vector_size (256 * sizeof (long long))));
> +typedef double v256df __attribute__((vector_size (256 * sizeof (double))));
> +
> +void
> +f1 (v4usi *x, v4si *y)
> +{
> + *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +void
> +f2 (v4sf *x, v4si *y)
> +{
> + *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +void
> +f3 (v4si *x, v4sf *y)
> +{
> + *y = __builtin_convertvector (*x, v4sf);
> +}
> +
> +void
> +f4 (v4df *x, v4si *y)
> +{
> + *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +void
> +f5 (v4si *x, v4df *y)
> +{
> + *y = __builtin_convertvector (*x, v4df);
> +}
> +
> +void
> +f6 (v256df *x, v256di *y)
> +{
> + *y = __builtin_convertvector (*x, v256di);
> +}
> +
> +void
> +f7 (v256di *x, v256df *y)
> +{
> + *y = __builtin_convertvector (*x, v256df);
> +}
> +
> +void
> +f8 (v4df *x)
> +{
> + v4si a = { 1, 2, -3, -4 };
> + *x = __builtin_convertvector (a, v4df);
> +}
> +
> +int
> +main ()
> +{
> + union U1 { v4si v; int a[4]; } u1;
> + union U2 { v4usi v; unsigned int a[4]; } u2;
> + union U3 { v4sf v; float a[4]; } u3;
> + union U4 { v4df v; double a[4]; } u4;
> + union U5 { v256di v; long long a[256]; } u5;
> + union U6 { v256df v; double a[256]; } u6;
> + int i;
> + for (i = 0; i < 4; i++)
> + u2.a[i] = i * 2;
> + f1 (&u2.v, &u1.v);
> + for (i = 0; i < 4; i++)
> + if (u1.a[i] != i * 2)
> + abort ();
> + else
> + u3.a[i] = i - 2.25f;
> + f2 (&u3.v, &u1.v);
> + for (i = 0; i < 4; i++)
> + if (u1.a[i] != (i == 3 ? 0 : i - 2))
> + abort ();
> + else
> + u3.a[i] = i + 0.75f;
> + f2 (&u3.v, &u1.v);
> + for (i = 0; i < 4; i++)
> + if (u1.a[i] != i)
> + abort ();
> + else
> + u1.a[i] = 7 * i - 5;
> + f3 (&u1.v, &u3.v);
> + for (i = 0; i < 4; i++)
> + if (u3.a[i] != 7 * i - 5)
> + abort ();
> + else
> + u4.a[i] = i - 2.25;
> + f4 (&u4.v, &u1.v);
> + for (i = 0; i < 4; i++)
> + if (u1.a[i] != (i == 3 ? 0 : i - 2))
> + abort ();
> + else
> + u4.a[i] = i + 0.75;
> + f4 (&u4.v, &u1.v);
> + for (i = 0; i < 4; i++)
> + if (u1.a[i] != i)
> + abort ();
> + else
> + u1.a[i] = 7 * i - 5;
> + f5 (&u1.v, &u4.v);
> + for (i = 0; i < 4; i++)
> + if (u4.a[i] != 7 * i - 5)
> + abort ();
> + for (i = 0; i < 256; i++)
> + u6.a[i] = i - 128.25;
> + f6 (&u6.v, &u5.v);
> + for (i = 0; i < 256; i++)
> + if (u5.a[i] != i - 128 - (i > 128))
> + abort ();
> + else
> + u5.a[i] = i - 128;
> + f7 (&u5.v, &u6.v);
> + for (i = 0; i < 256; i++)
> + if (u6.a[i] != i - 128)
> + abort ();
> + f8 (&u4.v);
> + for (i = 0; i < 4; i++)
> + if (u4.a[i] != (i >= 2 ? -1 - i : i + 1))
> + abort ();
> + return 0;
> +}
> --- gcc/testsuite/g++.dg/ext/builtin-convertvector-1.C.jj 2019-01-02 18:04:14.984350274 +0100
> +++ gcc/testsuite/g++.dg/ext/builtin-convertvector-1.C 2019-01-02 18:07:17.122375950 +0100
> @@ -0,0 +1,137 @@
> +// { dg-do run }
> +
> +extern "C" void abort ();
> +typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +typedef unsigned int v4usi __attribute__((vector_size (4 * sizeof (unsigned int))));
> +typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
> +typedef double v4df __attribute__((vector_size (4 * sizeof (double))));
> +typedef long long v256di __attribute__((vector_size (256 * sizeof (long long))));
> +typedef double v256df __attribute__((vector_size (256 * sizeof (double))));
> +
> +template <int N>
> +void
> +f1 (v4usi *x, v4si *y)
> +{
> + *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +template <typename T>
> +void
> +f2 (T *x, v4si *y)
> +{
> + *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +template <typename T>
> +void
> +f3 (v4si *x, T *y)
> +{
> + *y = __builtin_convertvector (*x, T);
> +}
> +
> +template <int N>
> +void
> +f4 (v4df *x, v4si *y)
> +{
> + *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +template <typename T, typename U>
> +void
> +f5 (T *x, U *y)
> +{
> +  *y = __builtin_convertvector (*x, U);
> +}
> +
> +template <typename T>
> +void
> +f6 (v256df *x, T *y)
> +{
> +  *y = __builtin_convertvector (*x, T);
> +}
> +
> +template <int N>
> +void
> +f7 (v256di *x, v256df *y)
> +{
> +  *y = __builtin_convertvector (*x, v256df);
> +}
> +
> +template <int N>
> +void
> +f8 (v4df *x)
> +{
> +  v4si a = { 1, 2, -3, -4 };
> +  *x = __builtin_convertvector (a, v4df);
> +}
> +
> +int
> +main ()
> +{
> +  union U1 { v4si v; int a[4]; } u1;
> +  union U2 { v4usi v; unsigned int a[4]; } u2;
> +  union U3 { v4sf v; float a[4]; } u3;
> +  union U4 { v4df v; double a[4]; } u4;
> +  union U5 { v256di v; long long a[256]; } u5;
> +  union U6 { v256df v; double a[256]; } u6;
> +  int i;
> +  for (i = 0; i < 4; i++)
> +    u2.a[i] = i * 2;
> +  f1<0> (&u2.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i * 2)
> +      abort ();
> +    else
> +      u3.a[i] = i - 2.25f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u3.a[i] = i + 0.75f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f3 (&u1.v, &u3.v);
> +  for (i = 0; i < 4; i++)
> +    if (u3.a[i] != 7 * i - 5)
> +      abort ();
> +    else
> +      u4.a[i] = i - 2.25;
> +  f4<12> (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u4.a[i] = i + 0.75;
> +  f4<13> (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f5 (&u1.v, &u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != 7 * i - 5)
> +      abort ();
> +  for (i = 0; i < 256; i++)
> +    u6.a[i] = i - 128.25;
> +  f6 (&u6.v, &u5.v);
> +  for (i = 0; i < 256; i++)
> +    if (u5.a[i] != i - 128 - (i > 128))
> +      abort ();
> +    else
> +      u5.a[i] = i - 128;
> +  f7<-1> (&u5.v, &u6.v);
> +  for (i = 0; i < 256; i++)
> +    if (u6.a[i] != i - 128)
> +      abort ();
> +  f8<5> (&u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != (i >= 2 ? -1 - i : i + 1))
> +      abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.dg/cpp0x/constexpr-builtin4.C.jj 2019-01-02 18:39:12.767204801 +0100
> +++ gcc/testsuite/g++.dg/cpp0x/constexpr-builtin4.C 2019-01-02 18:42:30.749985890 +0100
> @@ -0,0 +1,17 @@
> +// { dg-do compile { target c++11 } }
> +// { dg-additional-options "-Wno-psabi" }
> +
> +typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
> +constexpr v4sf a = __builtin_convertvector (v4si { 1, 2, -3, -4 }, v4sf);
> +
> +constexpr v4sf
> +foo (v4si x)
> +{
> +  return __builtin_convertvector (x, v4sf);
> +}
> +
> +constexpr v4sf b = foo (v4si { 3, 4, -1, -2 });
> +
> +static_assert (a[0] == 1.0f && a[1] == 2.0f && a[2] == -3.0f && a[3] == -4.0f, "");
> +static_assert (b[0] == 3.0f && b[1] == 4.0f && b[2] == -1.0f && b[3] == -2.0f, "");
>
> Jakub
>
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Thread overview: 10+ messages
2019-01-03 10:06 Jakub Jelinek
2019-01-03 10:48 ` Marc Glisse
2019-01-03 11:04 ` Jakub Jelinek
2019-01-03 17:32 ` Marc Glisse
2019-01-03 22:24 ` [PATCH] Add __builtin_convertvector support (PR c++/85052, take 2) Jakub Jelinek
2019-01-07 8:27 ` Richard Biener
2019-01-03 11:16 ` Richard Biener [this message]
2019-01-03 12:11 ` [PATCH] Add __builtin_convertvector support (PR c++/85052) Jakub Jelinek
2019-01-03 13:06 ` Richard Sandiford
2019-01-03 17:04 ` Martin Sebor