public inbox for gcc-patches@gcc.gnu.org
From: Richard Biener <rguenther@suse.de>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Richard Sandiford <rdsandiford@googlemail.com>,
	    Jason Merrill <jason@redhat.com>,
	    "Joseph S. Myers" <joseph@codesourcery.com>,
	Jan Hubicka <jh@suse.cz>,
	    gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Add __builtin_convertvector support (PR c++/85052)
Date: Thu, 03 Jan 2019 11:16:00 -0000
Message-ID: <alpine.LSU.2.20.1901031205260.23386@zhemvz.fhfr.qr>
In-Reply-To: <20190103100640.GM30353@tucnak>

On Thu, 3 Jan 2019, Jakub Jelinek wrote:

> Hi!
> 
> The following patch adds support for the __builtin_convertvector builtin.
> C casts on generic vectors are just a reinterpretation of the bits (i.e. a
> VCE); this builtin allows casting int/unsigned elements to float or vice
> versa, or promoting/demoting them.  The doc/ change is missing; I will write it soon.
> 
> The builtin appeared in clang 3.4, I think, and is apparently in real-world
> use, as e.g. Honza reported.  The first argument is an expression with vector
> type, the second argument is a vector type (similarly to e.g. va_arg), to
> which the first argument should be converted.  Both vector types need to
> have the same number of elements.
> 
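For context, a minimal usage sketch matching the semantics described
above (the typedefs are only illustrative, not taken from the patch):

  typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
  typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));

  /* Element-wise int -> float conversion; both vector types have
     four elements, as required.  */
  v4sf
  int_to_float (v4si x)
  {
    return __builtin_convertvector (x, v4sf);
    /* The result is (v4sf) { (float) x[0], (float) x[1],
                              (float) x[2], (float) x[3] }.  */
  }
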
> I've implemented the same-element-size (and thus also same whole-vector-size)
> conversions efficiently - signed to unsigned and vice versa, or the same
> vector type, just use a VCE, and e.g. int <-> float or long long <-> double
> use the appropriate optab, possibly repeated multiple times for very large
> vectors.  For everything there is a fallback that lowers
> __builtin_convertvector (x, t) as { (__typeof (t[0])) x[0], (__typeof (t[1])) x[1], ... }.
> 
> What isn't implemented efficiently (yet) are the narrowing or widening
> conversions; the optabs we have are meant for same-size vectors, so
> for packing we have 2 arguments that we pack into 1, and for unpacking we
> have those lo/hi variants, but in this case, at least for the most common
> vectors, we have just one argument and want a result with the same number of
> elements.  The AVX* different-vector-size instructions are what do this
> most efficiently; of course for large generic vectors we can easily use
> these optabs.  Shall we go for e.g. trying to pack the argument and an
> all-zeros dummy operand and pick the low half of the result vector,
> or pick the low and high halves of the argument and use half-sized vector
> operations, or both?

I guess it depends on target capabilities - I think
__builtin_convertvector is a bit "misdesigned" for pack/unpack.  You
also have to consider a v2di to v2qi conversion, which requires
several narrowing steps.  Does the clang documentation give any
hints on how to "efficiently" use __builtin_convertvector for
packing/unpacking without exposing too much of the target architecture?

But yes, for unpacking you'd use a series of vec_unpack_*_lo_expr
with padded input (padded with "don't care" if we had that; on
RTL we'd use a paradoxical subreg, and on GIMPLE we _might_ consider
allowing a VCE of different size).  Or simply allow half-size input
operands to vec_unpack_*_lo where that expands to paradoxical
subregs (a bit difficult for the optab query, I guess).
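
A rough source-level picture of that padded unpack-lo idea for a
same-element-count widening like v4si -> v4di (the memcpy stands in
for building the padded double-width operand and the loop for
vec_unpacks_lo; which half "_lo" selects is endian-dependent):

  typedef int v4si __attribute__((vector_size (16)));
  typedef int v8si __attribute__((vector_size (32)));
  typedef long long v4di __attribute__((vector_size (32)));

  v4di
  widen_lo_sketch (v4si x)
  {
    v8si padded = { 0 };               /* high half is a don't-care dummy */
    __builtin_memcpy (&padded, &x, sizeof (x));  /* argument in the low half */
    v4di res;
    for (int i = 0; i < 4; i++)        /* stands in for vec_unpacks_lo */
      res[i] = (long long) padded[i];
    return res;
  }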

For packing you'd use a series of vec_pack_* on the argument split
into two halves via BIT_FIELD_REFs.
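
Roughly, at the source level, for a same-element-count narrowing like
v4di -> v4si (the two memcpys stand in for the BIT_FIELD_REFs and the
loop for a single VEC_PACK_TRUNC_EXPR on the two halves):

  typedef long long v4di __attribute__((vector_size (32)));
  typedef long long v2di __attribute__((vector_size (16)));
  typedef int v4si __attribute__((vector_size (16)));

  v4si
  narrow_sketch (v4di x)
  {
    v2di lo, hi;
    __builtin_memcpy (&lo, &x, sizeof (lo));
    __builtin_memcpy (&hi, (char *) &x + sizeof (lo), sizeof (hi));
    v4si res;
    for (int i = 0; i < 2; i++)        /* VEC_PACK_TRUNC_EXPR (lo, hi) */
      {
        res[i] = (int) lo[i];
        res[i + 2] = (int) hi[i];
      }
    return res;
  }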

What does clang do for testcases that request promotion/demotion?

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

do_vec_conversion needs a comment.  Overall the patch (with its
existing features) looks OK to me.

As for Marc's comments, I agree that vector lowering happens quite late.
It might, for example, be useful to lower before vectorization (or
any loop optimization) so that unhandled generic vector code can
eventually be vectorized differently.  But that's something to
investigate for GCC 10.

Giving FE maintainers a chance to comment, so no overall ACK yet.

Thanks,
Richard.

> 2019-01-03  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c++/85052
> 	* tree-vect-generic.c (expand_vector_piecewise): Add defaulted
> 	ret_type argument, if non-NULL, use that in preference to type
> 	for the result type.
> 	(expand_vector_parallel): Formatting fix.
> 	(do_vec_conversion, expand_vector_conversion): New functions.
> 	(expand_vector_operations_1): Call expand_vector_conversion
> 	for VEC_CONVERT ifn calls.
> 	* internal-fn.def (VEC_CONVERT): New internal function.
> 	* internal-fn.c (expand_VEC_CONVERT): New function.
> 	* fold-const-call.c (fold_const_vec_convert): New function.
> 	(fold_const_call): Use it for CFN_VEC_CONVERT.
> c-family/
> 	* c-common.h (enum rid): Add RID_BUILTIN_CONVERTVECTOR.
> 	(c_build_vec_convert): Declare.
> 	* c-common.c (c_build_vec_convert): New function.
> c/
> 	* c-parser.c (c_parser_postfix_expression): Parse
> 	__builtin_convertvector.
> cp/
> 	* cp-tree.h (cp_build_vec_convert): Declare.
> 	* parser.c (cp_parser_postfix_expression): Parse
> 	__builtin_convertvector.
> 	* constexpr.c: Include fold-const-call.h.
> 	(cxx_eval_internal_function): Handle IFN_VEC_CONVERT.
> 	(potential_constant_expression_1): Likewise.
> 	* semantics.c (cp_build_vec_convert): New function.
> 	* pt.c (tsubst_copy_and_build): Handle CALL_EXPR to
> 	IFN_VEC_CONVERT.
> testsuite/
> 	* c-c++-common/builtin-convertvector-1.c: New test.
> 	* c-c++-common/torture/builtin-convertvector-1.c: New test.
> 	* g++.dg/ext/builtin-convertvector-1.C: New test.
> 	* g++.dg/cpp0x/constexpr-builtin4.C: New test.
> 
> --- gcc/tree-vect-generic.c.jj	2019-01-01 12:37:17.084976148 +0100
> +++ gcc/tree-vect-generic.c	2019-01-02 17:51:28.012876543 +0100
> @@ -267,7 +267,8 @@ do_negate (gimple_stmt_iterator *gsi, tr
>  static tree
>  expand_vector_piecewise (gimple_stmt_iterator *gsi, elem_op_func f,
>  			 tree type, tree inner_type,
> -			 tree a, tree b, enum tree_code code)
> +			 tree a, tree b, enum tree_code code,
> +			 tree ret_type = NULL_TREE)
>  {
>    vec<constructor_elt, va_gc> *v;
>    tree part_width = TYPE_SIZE (inner_type);
> @@ -278,23 +279,27 @@ expand_vector_piecewise (gimple_stmt_ite
>    int i;
>    location_t loc = gimple_location (gsi_stmt (*gsi));
>  
> -  if (types_compatible_p (gimple_expr_type (gsi_stmt (*gsi)), type))
> +  if (ret_type
> +      || types_compatible_p (gimple_expr_type (gsi_stmt (*gsi)), type))
>      warning_at (loc, OPT_Wvector_operation_performance,
>  		"vector operation will be expanded piecewise");
>    else
>      warning_at (loc, OPT_Wvector_operation_performance,
>  		"vector operation will be expanded in parallel");
>  
> +  if (!ret_type)
> +    ret_type = type;
>    vec_alloc (v, (nunits + delta - 1) / delta);
>    for (i = 0; i < nunits;
>         i += delta, index = int_const_binop (PLUS_EXPR, index, part_width))
>      {
> -      tree result = f (gsi, inner_type, a, b, index, part_width, code, type);
> +      tree result = f (gsi, inner_type, a, b, index, part_width, code,
> +		       ret_type);
>        constructor_elt ce = {NULL_TREE, result};
>        v->quick_push (ce);
>      }
>  
> -  return build_constructor (type, v);
> +  return build_constructor (ret_type, v);
>  }
>  
>  /* Expand a vector operation to scalars with the freedom to use
> @@ -302,8 +307,7 @@ expand_vector_piecewise (gimple_stmt_ite
>     in the vector type.  */
>  static tree
>  expand_vector_parallel (gimple_stmt_iterator *gsi, elem_op_func f, tree type,
> -			tree a, tree b,
> -			enum tree_code code)
> +			tree a, tree b, enum tree_code code)
>  {
>    tree result, compute_type;
>    int n_words = tree_to_uhwi (TYPE_SIZE_UNIT (type)) / UNITS_PER_WORD;
> @@ -1547,6 +1551,147 @@ expand_vector_scalar_condition (gimple_s
>    update_stmt (gsi_stmt (*gsi));
>  }
>  
> +static tree
> +do_vec_conversion (gimple_stmt_iterator *gsi, tree inner_type, tree a,
> +		   tree decl, tree bitpos, tree bitsize,
> +		   enum tree_code code, tree type)
> +{
> +  a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
> +  if (!VECTOR_TYPE_P (inner_type))
> +    return gimplify_build1 (gsi, code, TREE_TYPE (type), a);
> +  if (code == CALL_EXPR)
> +    {
> +      gimple *g = gimple_build_call (decl, 1, a);
> +      tree lhs = make_ssa_name (TREE_TYPE (TREE_TYPE (decl)));
> +      gimple_call_set_lhs (g, lhs);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +      return lhs;
> +    }
> +  else
> +    {
> +      tree outer_type = build_vector_type (TREE_TYPE (type),
> +					   TYPE_VECTOR_SUBPARTS (inner_type));
> +      return gimplify_build1 (gsi, code, outer_type, a);
> +    }
> +}
> +
> +/* Expand VEC_CONVERT ifn call.  */
> +
> +static void
> +expand_vector_conversion (gimple_stmt_iterator *gsi)
> +{
> +  gimple *stmt = gsi_stmt (*gsi);
> +  gimple *g;
> +  tree lhs = gimple_call_lhs (stmt);
> +  tree arg = gimple_call_arg (stmt, 0);
> +  tree decl = NULL_TREE;
> +  tree ret_type = TREE_TYPE (lhs);
> +  tree arg_type = TREE_TYPE (arg);
> +  tree new_rhs, compute_type = TREE_TYPE (arg_type);
> +  enum tree_code code = NOP_EXPR;
> +  enum tree_code code1 = ERROR_MARK;
> +  enum { NARROW, NONE, WIDEN } modifier = NONE;
> +  optab optab1 = unknown_optab;
> +
> +  gcc_checking_assert (VECTOR_TYPE_P (ret_type) && VECTOR_TYPE_P (arg_type));
> +  gcc_checking_assert (tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (ret_type))));
> +  gcc_checking_assert (tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (arg_type))));
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (ret_type))
> +      && SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg_type)))
> +    code = FIX_TRUNC_EXPR;
> +  else if (INTEGRAL_TYPE_P (TREE_TYPE (arg_type))
> +	   && SCALAR_FLOAT_TYPE_P (TREE_TYPE (ret_type)))
> +    code = FLOAT_EXPR;
> +  if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (ret_type)))
> +      < tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg_type))))
> +    modifier = NARROW;
> +  else if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (ret_type)))
> +	   > tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg_type))))
> +    modifier = WIDEN;
> +
> +  if (modifier == NONE && (code == FIX_TRUNC_EXPR || code == FLOAT_EXPR))
> +    {
> +      if (supportable_convert_operation (code, ret_type, arg_type, &decl,
> +					 &code1))
> +	{
> +	  if (code1 == CALL_EXPR)
> +	    {
> +	      g = gimple_build_call (decl, 1, arg);
> +	      gimple_call_set_lhs (g, lhs);
> +	    }
> +	  else
> +	    g = gimple_build_assign (lhs, code1, arg);
> +	  gsi_replace (gsi, g, false);
> +	  return;
> +	}
> +      /* Can't use get_compute_type here, as supportable_convert_operation
> +	 doesn't necessarily use an optab and needs two arguments.  */
> +      tree vector_compute_type
> +	= type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
> +      unsigned HOST_WIDE_INT nelts;
> +      if (vector_compute_type
> +	  && VECTOR_MODE_P (TYPE_MODE (vector_compute_type))
> +	  && subparts_gt (arg_type, vector_compute_type)
> +	  && TYPE_VECTOR_SUBPARTS (vector_compute_type).is_constant (&nelts))
> +	{
> +	  while (nelts > 1)
> +	    {
> +	      tree ret1_type = build_vector_type (TREE_TYPE (ret_type), nelts);
> +	      tree arg1_type = build_vector_type (TREE_TYPE (arg_type), nelts);
> +	      if (supportable_convert_operation (code, ret1_type, arg1_type,
> +						 &decl, &code1))
> +		{
> +		  new_rhs = expand_vector_piecewise (gsi, do_vec_conversion,
> +						     ret_type, arg1_type, arg,
> +						     decl, code1);
> +		  g = gimple_build_assign (lhs, new_rhs);
> +		  gsi_replace (gsi, g, false);
> +		  return;
> +		}
> +	      nelts = nelts / 2;
> +	    }
> +	}
> +    }
> +  /* FIXME: __builtin_convertvector argument and return vectors have the same
> +     number of elements, so for both narrowing and widening we need to figure
> +     out what is the best set of optabs to use.  E.g. for NARROW
> +     VEC_PACK_TRUNC_EXPR has 2 arguments, shall we prefer emitting that with
> +     one argument of arg and another argument all zeros and extract first
> +     half of the resulting vector, or extract lo and hi halves of the arg
> +     vector and use VEC_PACK_TRUNC_EXPR on those?  */
> +  else if (0 && modifier == NARROW)
> +    {
> +      switch (code)
> +	{
> +	case NOP_EXPR:
> +	  code1 = VEC_PACK_TRUNC_EXPR;
> +	  optab1 = optab_for_tree_code (code1, arg_type, optab_default);
> +	  break;
> +	case FIX_TRUNC_EXPR:
> +	  code1 = VEC_PACK_FIX_TRUNC_EXPR;
> +	  /* The signedness is determined from output operand.  */
> +	  optab1 = optab_for_tree_code (code1, ret_type, optab_default);
> +	  break;
> +	case FLOAT_EXPR:
> +	  code1 = VEC_PACK_FLOAT_EXPR;
> +	  optab1 = optab_for_tree_code (code1, arg_type, optab_default);
> +	  break;
> +	default:
> +	  gcc_unreachable ();
> +	}
> +
> +      if (optab1)
> +	compute_type = get_compute_type (code1, optab1, arg_type);
> +      (void) compute_type;
> +    }
> +
> +  new_rhs = expand_vector_piecewise (gsi, do_vec_conversion, arg_type,
> +				     TREE_TYPE (arg_type), arg,
> +				     NULL_TREE, code, ret_type);
> +  g = gimple_build_assign (lhs, new_rhs);
> +  gsi_replace (gsi, g, false);
> +}
> +
>  /* Process one statement.  If we identify a vector operation, expand it.  */
>  
>  static void
> @@ -1561,7 +1706,11 @@ expand_vector_operations_1 (gimple_stmt_
>    /* Only consider code == GIMPLE_ASSIGN. */
>    gassign *stmt = dyn_cast <gassign *> (gsi_stmt (*gsi));
>    if (!stmt)
> -    return;
> +    {
> +      if (gimple_call_internal_p (gsi_stmt (*gsi), IFN_VEC_CONVERT))
> +	expand_vector_conversion (gsi);
> +      return;
> +    }
>  
>    code = gimple_assign_rhs_code (stmt);
>    rhs_class = get_gimple_rhs_class (code);
> --- gcc/internal-fn.def.jj	2019-01-01 12:37:17.893962875 +0100
> +++ gcc/internal-fn.def	2019-01-02 11:24:24.307681792 +0100
> @@ -296,6 +296,7 @@ DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST
>  DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>  DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
>  DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW | ECF_LEAF, NULL)
> +DEF_INTERNAL_FN (VEC_CONVERT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>  
>  /* An unduplicable, uncombinable function.  Generally used to preserve
>     a CFG property in the face of jump threading, tail merging or
> --- gcc/internal-fn.c.jj	2019-01-01 12:37:19.567935410 +0100
> +++ gcc/internal-fn.c	2019-01-02 11:24:24.315681661 +0100
> @@ -2581,6 +2581,15 @@ expand_VA_ARG (internal_fn, gcall *)
>    gcc_unreachable ();
>  }
>  
> +/* IFN_VEC_CONVERT is supposed to be expanded at pass_lower_vector.  So this
> +   dummy function should never be called.  */
> +
> +static void
> +expand_VEC_CONVERT (internal_fn, gcall *)
> +{
> +  gcc_unreachable ();
> +}
> +
>  /* Expand the IFN_UNIQUE function according to its first argument.  */
>  
>  static void
> --- gcc/fold-const-call.c.jj	2019-01-01 12:37:16.528985271 +0100
> +++ gcc/fold-const-call.c	2019-01-02 15:57:36.656449175 +0100
> @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
>  #include "tm.h" /* For C[LT]Z_DEFINED_AT_ZERO.  */
>  #include "builtins.h"
>  #include "gimple-expr.h"
> +#include "tree-vector-builder.h"
>  
>  /* Functions that test for certain constant types, abstracting away the
>     decision about whether to check for overflow.  */
> @@ -645,6 +646,40 @@ fold_const_reduction (tree type, tree ar
>    return res;
>  }
>  
> +/* Fold a call to IFN_VEC_CONVERT (ARG) returning TYPE.  */
> +
> +static tree
> +fold_const_vec_convert (tree ret_type, tree arg)
> +{
> +  enum tree_code code = NOP_EXPR;
> +  tree arg_type = TREE_TYPE (arg);
> +  if (TREE_CODE (arg) != VECTOR_CST)
> +    return NULL_TREE;
> +
> +  gcc_checking_assert (VECTOR_TYPE_P (ret_type) && VECTOR_TYPE_P (arg_type));
> +
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (ret_type))
> +      && SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg_type)))
> +    code = FIX_TRUNC_EXPR;
> +  else if (INTEGRAL_TYPE_P (TREE_TYPE (arg_type))
> +	   && SCALAR_FLOAT_TYPE_P (TREE_TYPE (ret_type)))
> +    code = FLOAT_EXPR;
> +
> +  tree_vector_builder elts;
> +  elts.new_unary_operation (ret_type, arg, true);
> +  unsigned int count = elts.encoded_nelts ();
> +  for (unsigned int i = 0; i < count; ++i)
> +    {
> +      tree elt = fold_unary (code, TREE_TYPE (ret_type),
> +			     VECTOR_CST_ELT (arg, i));
> +      if (elt == NULL_TREE || !CONSTANT_CLASS_P (elt))
> +	return NULL_TREE;
> +      elts.quick_push (elt);
> +    }
> +
> +  return elts.build ();
> +}
> +
>  /* Try to evaluate:
>  
>        *RESULT = FN (*ARG)
> @@ -1232,6 +1267,9 @@ fold_const_call (combined_fn fn, tree ty
>      case CFN_REDUC_XOR:
>        return fold_const_reduction (type, arg, BIT_XOR_EXPR);
>  
> +    case CFN_VEC_CONVERT:
> +      return fold_const_vec_convert (type, arg);
> +
>      default:
>        return fold_const_call_1 (fn, type, arg);
>      }
> --- gcc/c-family/c-common.h.jj	2019-01-01 12:37:51.309414610 +0100
> +++ gcc/c-family/c-common.h	2019-01-02 11:24:24.314681677 +0100
> @@ -102,7 +102,7 @@ enum rid
>    RID_ASM,       RID_TYPEOF,   RID_ALIGNOF,  RID_ATTRIBUTE,  RID_VA_ARG,
>    RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,      RID_CHOOSE_EXPR,
>    RID_TYPES_COMPATIBLE_P,      RID_BUILTIN_COMPLEX,	     RID_BUILTIN_SHUFFLE,
> -  RID_BUILTIN_TGMATH,
> +  RID_BUILTIN_CONVERTVECTOR,   RID_BUILTIN_TGMATH,
>    RID_BUILTIN_HAS_ATTRIBUTE,
>    RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
>  
> @@ -1001,6 +1001,7 @@ extern bool lvalue_p (const_tree);
>  extern bool vector_targets_convertible_p (const_tree t1, const_tree t2);
>  extern bool vector_types_convertible_p (const_tree t1, const_tree t2, bool emit_lax_note);
>  extern tree c_build_vec_perm_expr (location_t, tree, tree, tree, bool = true);
> +extern tree c_build_vec_convert (location_t, tree, location_t, tree, bool = true);
>  
>  extern void init_c_lex (void);
>  
> --- gcc/c-family/c-common.c.jj	2019-01-01 12:37:51.366413675 +0100
> +++ gcc/c-family/c-common.c	2019-01-02 11:24:24.314681677 +0100
> @@ -376,6 +376,7 @@ const struct c_common_resword c_common_r
>      RID_BUILTIN_CALL_WITH_STATIC_CHAIN, D_CONLY },
>    { "__builtin_choose_expr", RID_CHOOSE_EXPR, D_CONLY },
>    { "__builtin_complex", RID_BUILTIN_COMPLEX, D_CONLY },
> +  { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
>    { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
>    { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
>    { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
> @@ -1070,6 +1071,70 @@ c_build_vec_perm_expr (location_t loc, t
>      ret = c_wrap_maybe_const (ret, true);
>  
>    return ret;
> +}
> +
> +/* Build a VEC_CONVERT ifn for __builtin_convertvector builtin.  */
> +
> +tree
> +c_build_vec_convert (location_t loc1, tree expr, location_t loc2, tree type,
> +		     bool complain)
> +{
> +  if (error_operand_p (type))
> +    return error_mark_node;
> +  if (error_operand_p (expr))
> +    return error_mark_node;
> +
> +  if (!VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
> +      && !VECTOR_FLOAT_TYPE_P (TREE_TYPE (expr)))
> +    {
> +      if (complain)
> +	error_at (loc1, "%<__builtin_convertvector%> first argument must "
> +			"be an integer or floating vector");
> +      return error_mark_node;
> +    }
> +
> +  if (!VECTOR_INTEGER_TYPE_P (type) && !VECTOR_FLOAT_TYPE_P (type))
> +    {
> +      if (complain)
> +	error_at (loc2, "%<__builtin_convertvector%> second argument must "
> +			"be an integer or floating vector type");
> +      return error_mark_node;
> +    }
> +
> +  if (maybe_ne (TYPE_VECTOR_SUBPARTS (TREE_TYPE (expr)),
> +		TYPE_VECTOR_SUBPARTS (type)))
> +    {
> +      if (complain)
> +	error_at (loc1, "%<__builtin_convertvector%> number of elements "
> +			"of the first argument vector and the second argument "
> +			"vector type should be the same");
> +      return error_mark_node;
> +    }
> +
> +  if ((TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (expr)))
> +       == TYPE_MAIN_VARIANT (TREE_TYPE (type)))
> +      || (VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
> +	  && VECTOR_INTEGER_TYPE_P (type)
> +	  && (TYPE_PRECISION (TREE_TYPE (TREE_TYPE (expr)))
> +	      == TYPE_PRECISION (TREE_TYPE (type)))))
> +    return build1_loc (loc1, VIEW_CONVERT_EXPR, type, expr);
> +
> +  bool wrap = true;
> +  bool maybe_const = false;
> +  tree ret;
> +  if (!c_dialect_cxx ())
> +    {
> +      /* Avoid C_MAYBE_CONST_EXPRs inside of VEC_CONVERT argument.  */
> +      expr = c_fully_fold (expr, false, &maybe_const);
> +      wrap &= maybe_const;
> +    }
> +
> +  ret = build_call_expr_internal_loc (loc1, IFN_VEC_CONVERT, type, 1, expr);
> +
> +  if (!wrap)
> +    ret = c_wrap_maybe_const (ret, true);
> +
> +  return ret;
>  }
>  
>  /* Like tree.c:get_narrower, but retain conversion from C++0x scoped enum
> --- gcc/c/c-parser.c.jj	2019-01-01 12:37:48.677457794 +0100
> +++ gcc/c/c-parser.c	2019-01-02 11:24:24.312681710 +0100
> @@ -8038,6 +8038,7 @@ enum tgmath_parm_kind
>       __builtin_shuffle ( assignment-expression ,
>  			 assignment-expression ,
>  			 assignment-expression, )
> +     __builtin_convertvector ( assignment-expression , type-name )
>  
>     offsetof-member-designator:
>       identifier
> @@ -9113,17 +9114,14 @@ c_parser_postfix_expression (c_parser *p
>  	      *p = convert_lvalue_to_rvalue (loc, *p, true, true);
>  
>  	    if (vec_safe_length (cexpr_list) == 2)
> -	      expr.value =
> -		c_build_vec_perm_expr
> -		  (loc, (*cexpr_list)[0].value,
> -		   NULL_TREE, (*cexpr_list)[1].value);
> +	      expr.value = c_build_vec_perm_expr (loc, (*cexpr_list)[0].value,
> +						  NULL_TREE,
> +						  (*cexpr_list)[1].value);
>  
>  	    else if (vec_safe_length (cexpr_list) == 3)
> -	      expr.value =
> -		c_build_vec_perm_expr
> -		  (loc, (*cexpr_list)[0].value,
> -		   (*cexpr_list)[1].value,
> -		   (*cexpr_list)[2].value);
> +	      expr.value = c_build_vec_perm_expr (loc, (*cexpr_list)[0].value,
> +						  (*cexpr_list)[1].value,
> +						  (*cexpr_list)[2].value);
>  	    else
>  	      {
>  		error_at (loc, "wrong number of arguments to "
> @@ -9133,6 +9131,41 @@ c_parser_postfix_expression (c_parser *p
>  	    set_c_expr_source_range (&expr, loc, close_paren_loc);
>  	    break;
>  	  }
> +	case RID_BUILTIN_CONVERTVECTOR:
> +	  {
> +	    location_t start_loc = loc;
> +	    c_parser_consume_token (parser);
> +	    matching_parens parens;
> +	    if (!parens.require_open (parser))
> +	      {
> +		expr.set_error ();
> +		break;
> +	      }
> +	    e1 = c_parser_expr_no_commas (parser, NULL);
> +	    mark_exp_read (e1.value);
> +	    if (!c_parser_require (parser, CPP_COMMA, "expected %<,%>"))
> +	      {
> +		c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
> +		expr.set_error ();
> +		break;
> +	      }
> +	    loc = c_parser_peek_token (parser)->location;
> +	    t1 = c_parser_type_name (parser);
> +	    location_t end_loc = c_parser_peek_token (parser)->get_finish ();
> +	    c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> +				       "expected %<)%>");
> +	    if (t1 == NULL)
> +	      expr.set_error ();
> +	    else
> +	      {
> +		tree type_expr = NULL_TREE;
> +		expr.value = c_build_vec_convert (start_loc, e1.value, loc,
> +						  groktypename (t1, &type_expr,
> +								NULL));
> +		set_c_expr_source_range (&expr, start_loc, end_loc);
> +	      }
> +	  }
> +	  break;
>  	case RID_AT_SELECTOR:
>  	  {
>  	    gcc_assert (c_dialect_objc ());
> --- gcc/cp/cp-tree.h.jj	2019-01-01 12:37:46.884487212 +0100
> +++ gcc/cp/cp-tree.h	2019-01-02 16:43:35.480393140 +0100
> @@ -7142,6 +7142,8 @@ extern bool is_lambda_ignored_entity
>  extern bool lambda_static_thunk_p		(tree);
>  extern tree finish_builtin_launder		(location_t, tree,
>  						 tsubst_flags_t);
> +extern tree cp_build_vec_convert		(tree, location_t, tree,
> +						 tsubst_flags_t);
>  extern void start_lambda_scope			(tree);
>  extern void record_lambda_scope			(tree);
>  extern void record_null_lambda_scope		(tree);
> --- gcc/cp/parser.c.jj	2019-01-01 12:37:47.352479534 +0100
> +++ gcc/cp/parser.c	2019-01-02 16:19:44.765760167 +0100
> @@ -7031,6 +7031,32 @@ cp_parser_postfix_expression (cp_parser
>  	break;
>        }
>  
> +    case RID_BUILTIN_CONVERTVECTOR:
> +      {
> +	tree expression;
> +	tree type;
> +	/* Consume the `__builtin_convertvector' token.  */
> +	cp_lexer_consume_token (parser->lexer);
> +	/* Look for the opening `('.  */
> +	matching_parens parens;
> +	parens.require_open (parser);
> +	/* Now, parse the assignment-expression.  */
> +	expression = cp_parser_assignment_expression (parser);
> +	/* Look for the `,'.  */
> +	cp_parser_require (parser, CPP_COMMA, RT_COMMA);
> +	location_t type_location
> +	  = cp_lexer_peek_token (parser->lexer)->location;
> +	/* Parse the type-id.  */
> +	{
> +	  type_id_in_expr_sentinel s (parser);
> +	  type = cp_parser_type_id (parser);
> +	}
> +	/* Look for the closing `)'.  */
> +	parens.require_close (parser);
> +	return cp_build_vec_convert (expression, type_location, type,
> +				     tf_warning_or_error);
> +      }
> +
>      default:
>        {
>  	tree type;
> --- gcc/cp/constexpr.c.jj	2019-01-01 12:37:47.282480682 +0100
> +++ gcc/cp/constexpr.c	2019-01-02 16:56:54.126359632 +0100
> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.
>  #include "ubsan.h"
>  #include "gimple-fold.h"
>  #include "timevar.h"
> +#include "fold-const-call.h"
>  
>  static bool verify_constant (tree, bool, bool *, bool *);
>  #define VERIFY_CONSTANT(X)						\
> @@ -1449,6 +1450,20 @@ cxx_eval_internal_function (const conste
>        return cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
>  					   false, non_constant_p, overflow_p);
>  
> +    case IFN_VEC_CONVERT:
> +      {
> +	tree arg = cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
> +						 false, non_constant_p,
> +						 overflow_p);
> +	if (TREE_CODE (arg) == VECTOR_CST)
> +	  return fold_const_call (CFN_VEC_CONVERT, TREE_TYPE (t), arg);
> +	else
> +	  {
> +	    *non_constant_p = true;
> +	    return t;
> +	  }
> +      }
> +
>      default:
>        if (!ctx->quiet)
>  	error_at (cp_expr_loc_or_loc (t, input_location),
> @@ -5623,7 +5638,9 @@ potential_constant_expression_1 (tree t,
>  		case IFN_SUB_OVERFLOW:
>  		case IFN_MUL_OVERFLOW:
>  		case IFN_LAUNDER:
> +		case IFN_VEC_CONVERT:
>  		  bail = false;
> +		  break;
>  
>  		default:
>  		  break;
> --- gcc/cp/semantics.c.jj	2019-01-01 12:37:46.976485703 +0100
> +++ gcc/cp/semantics.c	2019-01-02 18:15:42.844133048 +0100
> @@ -9933,4 +9933,26 @@ finish_builtin_launder (location_t loc,
>  				       TREE_TYPE (arg), 1, arg);
>  }
>  
> +/* Finish __builtin_convertvector (arg, type).  */
> +
> +tree
> +cp_build_vec_convert (tree arg, location_t loc, tree type,
> +		      tsubst_flags_t complain)
> +{
> +  if (error_operand_p (type))
> +    return error_mark_node;
> +  if (error_operand_p (arg))
> +    return error_mark_node;
> +
> +  tree ret = NULL_TREE;
> +  if (!type_dependent_expression_p (arg) && !dependent_type_p (type))
> +    ret = c_build_vec_convert (cp_expr_loc_or_loc (arg, input_location), arg,
> +			       loc, type, (complain & tf_error) != 0);
> +
> +  if (!processing_template_decl)
> +    return ret;
> +
> +  return build_call_expr_internal_loc (loc, IFN_VEC_CONVERT, type, 1, arg);
> +}
> +
>  #include "gt-cp-semantics.h"
> --- gcc/cp/pt.c.jj	2019-01-01 12:37:47.081483980 +0100
> +++ gcc/cp/pt.c	2019-01-02 18:25:17.997778249 +0100
> @@ -18813,6 +18813,27 @@ tsubst_copy_and_build (tree t,
>  					      (*call_args)[0], complain);
>  	      break;
>  
> +	    case IFN_VEC_CONVERT:
> +	      gcc_assert (nargs == 1);
> +	      if (vec_safe_length (call_args) != 1)
> +		{
> +		  error_at (cp_expr_loc_or_loc (t, input_location),
> +			    "wrong number of arguments to "
> +			    "%<__builtin_convertvector%>");
> +		  ret = error_mark_node;
> +		  break;
> +		}
> +	      ret = cp_build_vec_convert ((*call_args)[0], input_location,
> +					  tsubst (TREE_TYPE (t), args,
> +						  complain, in_decl),
> +					  complain);
> +	      if (TREE_CODE (ret) == VIEW_CONVERT_EXPR)
> +		{
> +		  release_tree_vector (call_args);
> +		  RETURN (ret);
> +		}
> +	      break;
> +
>  	    default:
>  	      /* Unsupported internal function with arguments.  */
>  	      gcc_unreachable ();
> --- gcc/testsuite/c-c++-common/builtin-convertvector-1.c.jj	2019-01-02 18:38:18.265090910 +0100
> +++ gcc/testsuite/c-c++-common/builtin-convertvector-1.c	2019-01-02 18:37:50.337544972 +0100
> @@ -0,0 +1,15 @@
> +typedef int v8si __attribute__((vector_size (8 * sizeof (int))));
> +typedef long long v4di __attribute__((vector_size (4 * sizeof (long long))));
> +
> +void
> +foo (v8si *x, v4di *y, int z)
> +{
> +  __builtin_convertvector (*y, v8si);	/* { dg-error "number of elements of the first argument vector and the second argument vector type should be the same" } */
> +  __builtin_convertvector (*x, v4di);	/* { dg-error "number of elements of the first argument vector and the second argument vector type should be the same" } */
> +  __builtin_convertvector (*x, int);	/* { dg-error "second argument must be an integer or floating vector type" } */
> +  __builtin_convertvector (z, v4di);	/* { dg-error "first argument must be an integer or floating vector" } */
> +  __builtin_convertvector ();		/* { dg-error "expected" } */
> +  __builtin_convertvector (*x);		/* { dg-error "expected" } */
> +  __builtin_convertvector (*x, *y);	/* { dg-error "expected" } */
> +  __builtin_convertvector (*x, v8si, 1);/* { dg-error "expected" } */
> +}
> --- gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c.jj	2019-01-02 18:00:59.982534637 +0100
> +++ gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c	2019-01-02 18:00:32.871977360 +0100
> @@ -0,0 +1,131 @@
> +extern
> +#ifdef __cplusplus
> +"C"
> +#endif
> +void abort (void);
> +typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +typedef unsigned int v4usi __attribute__((vector_size (4 * sizeof (unsigned int))));
> +typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
> +typedef double v4df __attribute__((vector_size (4 * sizeof (double))));
> +typedef long long v256di __attribute__((vector_size (256 * sizeof (long long))));
> +typedef double v256df __attribute__((vector_size (256 * sizeof (double))));
> +
> +void
> +f1 (v4usi *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +void
> +f2 (v4sf *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +void
> +f3 (v4si *x, v4sf *y)
> +{
> +  *y = __builtin_convertvector (*x, v4sf);
> +}
> +
> +void
> +f4 (v4df *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +void
> +f5 (v4si *x, v4df *y)
> +{
> +  *y = __builtin_convertvector (*x, v4df);
> +}
> +
> +void
> +f6 (v256df *x, v256di *y)
> +{
> +  *y = __builtin_convertvector (*x, v256di);
> +}
> +
> +void
> +f7 (v256di *x, v256df *y)
> +{
> +  *y = __builtin_convertvector (*x, v256df);
> +}
> +
> +void
> +f8 (v4df *x)
> +{
> +  v4si a = { 1, 2, -3, -4 };
> +  *x = __builtin_convertvector (a, v4df);
> +}
> +
> +int
> +main ()
> +{
> +  union U1 { v4si v; int a[4]; } u1;
> +  union U2 { v4usi v; unsigned int a[4]; } u2;
> +  union U3 { v4sf v; float a[4]; } u3;
> +  union U4 { v4df v; double a[4]; } u4;
> +  union U5 { v256di v; long long a[256]; } u5;
> +  union U6 { v256df v; double a[256]; } u6;
> +  int i;
> +  for (i = 0; i < 4; i++)
> +    u2.a[i] = i * 2;
> +  f1 (&u2.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i * 2)
> +      abort ();
> +    else
> +      u3.a[i] = i - 2.25f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u3.a[i] = i + 0.75f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f3 (&u1.v, &u3.v);
> +  for (i = 0; i < 4; i++)
> +    if (u3.a[i] != 7 * i - 5)
> +      abort ();
> +    else
> +      u4.a[i] = i - 2.25;
> +  f4 (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u4.a[i] = i + 0.75;
> +  f4 (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f5 (&u1.v, &u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != 7 * i - 5)
> +      abort ();
> +  for (i = 0; i < 256; i++)
> +    u6.a[i] = i - 128.25;
> +  f6 (&u6.v, &u5.v);
> +  for (i = 0; i < 256; i++)
> +    if (u5.a[i] != i - 128 - (i > 128))
> +      abort ();
> +    else
> +      u5.a[i] = i - 128;
> +  f7 (&u5.v, &u6.v);
> +  for (i = 0; i < 256; i++)
> +    if (u6.a[i] != i - 128)
> +      abort ();
> +  f8 (&u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != (i >= 2 ? -1 - i : i + 1))
> +      abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.dg/ext/builtin-convertvector-1.C.jj	2019-01-02 18:04:14.984350274 +0100
> +++ gcc/testsuite/g++.dg/ext/builtin-convertvector-1.C	2019-01-02 18:07:17.122375950 +0100
> @@ -0,0 +1,137 @@
> +// { dg-do run }
> +
> +extern "C" void abort ();
> +typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +typedef unsigned int v4usi __attribute__((vector_size (4 * sizeof (unsigned int))));
> +typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
> +typedef double v4df __attribute__((vector_size (4 * sizeof (double))));
> +typedef long long v256di __attribute__((vector_size (256 * sizeof (long long))));
> +typedef double v256df __attribute__((vector_size (256 * sizeof (double))));
> +
> +template <int N>
> +void
> +f1 (v4usi *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +template <typename T>
> +void
> +f2 (T *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +template <typename T>
> +void
> +f3 (v4si *x, T *y)
> +{
> +  *y = __builtin_convertvector (*x, T);
> +}
> +
> +template <int N>
> +void
> +f4 (v4df *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +template <typename T, typename U>
> +void
> +f5 (T *x, U *y)
> +{
> +  *y = __builtin_convertvector (*x, U);
> +}
> +
> +template <typename T>
> +void
> +f6 (v256df *x, T *y)
> +{
> +  *y = __builtin_convertvector (*x, T);
> +}
> +
> +template <int N>
> +void
> +f7 (v256di *x, v256df *y)
> +{
> +  *y = __builtin_convertvector (*x, v256df);
> +}
> +
> +template <int N>
> +void
> +f8 (v4df *x)
> +{
> +  v4si a = { 1, 2, -3, -4 };
> +  *x = __builtin_convertvector (a, v4df);
> +}
> +
> +int
> +main ()
> +{
> +  union U1 { v4si v; int a[4]; } u1;
> +  union U2 { v4usi v; unsigned int a[4]; } u2;
> +  union U3 { v4sf v; float a[4]; } u3;
> +  union U4 { v4df v; double a[4]; } u4;
> +  union U5 { v256di v; long long a[256]; } u5;
> +  union U6 { v256df v; double a[256]; } u6;
> +  int i;
> +  for (i = 0; i < 4; i++)
> +    u2.a[i] = i * 2;
> +  f1<0> (&u2.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i * 2)
> +      abort ();
> +    else
> +      u3.a[i] = i - 2.25f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u3.a[i] = i + 0.75f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f3 (&u1.v, &u3.v);
> +  for (i = 0; i < 4; i++)
> +    if (u3.a[i] != 7 * i - 5)
> +      abort ();
> +    else
> +      u4.a[i] = i - 2.25;
> +  f4<12> (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u4.a[i] = i + 0.75;
> +  f4<13> (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f5 (&u1.v, &u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != 7 * i - 5)
> +      abort ();
> +  for (i = 0; i < 256; i++)
> +    u6.a[i] = i - 128.25;
> +  f6 (&u6.v, &u5.v);
> +  for (i = 0; i < 256; i++)
> +    if (u5.a[i] != i - 128 - (i > 128))
> +      abort ();
> +    else
> +      u5.a[i] = i - 128;
> +  f7<-1> (&u5.v, &u6.v);
> +  for (i = 0; i < 256; i++)
> +    if (u6.a[i] != i - 128)
> +      abort ();
> +  f8<5> (&u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != (i >= 2 ? -1 - i : i + 1))
> +      abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.dg/cpp0x/constexpr-builtin4.C.jj	2019-01-02 18:39:12.767204801 +0100
> +++ gcc/testsuite/g++.dg/cpp0x/constexpr-builtin4.C	2019-01-02 18:42:30.749985890 +0100
> @@ -0,0 +1,17 @@
> +// { dg-do compile { target c++11 } }
> +// { dg-additional-options "-Wno-psabi" }
> +
> +typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
> +constexpr v4sf a = __builtin_convertvector (v4si { 1, 2, -3, -4 }, v4sf);
> +
> +constexpr v4sf
> +foo (v4si x)
> +{
> +  return __builtin_convertvector (x, v4sf);
> +}
> +
> +constexpr v4sf b = foo (v4si { 3, 4, -1, -2 });
> +
> +static_assert (a[0] == 1.0f && a[1] == 2.0f && a[2] == -3.0f && a[3] == -4.0f, "");
> +static_assert (b[0] == 3.0f && b[1] == 4.0f && b[2] == -1.0f && b[3] == -2.0f, "");
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


Thread overview: 10+ messages
2019-01-03 10:06 Jakub Jelinek
2019-01-03 10:48 ` Marc Glisse
2019-01-03 11:04   ` Jakub Jelinek
2019-01-03 17:32     ` Marc Glisse
2019-01-03 22:24       ` [PATCH] Add __builtin_convertvector support (PR c++/85052, take 2) Jakub Jelinek
2019-01-07  8:27         ` Richard Biener
2019-01-03 11:16 ` Richard Biener [this message]
2019-01-03 12:11   ` [PATCH] Add __builtin_convertvector support (PR c++/85052) Jakub Jelinek
2019-01-03 13:06 ` Richard Sandiford
2019-01-03 17:04 ` Martin Sebor
