Re: Implement SLP of internal functions

public inbox for gcc-patches@gcc.gnu.org

From: Richard Biener <richard.guenther@gmail.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>,
		Richard Sandiford <richard.sandiford@linaro.org>
Subject: Re: Implement SLP of internal functions
Date: Thu, 17 May 2018 11:42:00 -0000
Message-ID: <CAFiYyc0F27eqfNjtukMJRJudMxBOKG0r0dwCJmQG5YcZ5Homhw@mail.gmail.com>
In-Reply-To: <87muwzoqd3.fsf@linaro.org>

On Wed, May 16, 2018 at 12:18 PM Richard Sandiford <richard.sandiford@linaro.org> wrote:

> SLP of calls was previously restricted to built-in functions.
> This patch extends it to internal functions.

> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
> and x86_64-linux-gnu.  OK to install?

> Richard


> 2018-05-16  Richard Sandiford  <richard.sandiford@linaro.org>

> gcc/
>          * internal-fn.h (vectorizable_internal_fn_p): New function.
>          * tree-vect-slp.c (compatible_calls_p): Likewise.
>          (vect_build_slp_tree_1): Remove nops argument.  Handle calls
>          to internal functions.
>          (vect_build_slp_tree_2): Update call to vect_build_slp_tree_1.

> gcc/testsuite/
>          * gcc.target/aarch64/sve/cond_arith_4.c: New test.
>          * gcc.target/aarch64/sve/cond_arith_4_run.c: Likewise.
>          * gcc.target/aarch64/sve/cond_arith_5.c: Likewise.
>          * gcc.target/aarch64/sve/cond_arith_5_run.c: Likewise.
>          * gcc.target/aarch64/sve/slp_14.c: Likewise.
>          * gcc.target/aarch64/sve/slp_14_run.c: Likewise.

> Index: gcc/internal-fn.h
> ===================================================================
> --- gcc/internal-fn.h   2018-05-16 11:06:14.513574219 +0100
> +++ gcc/internal-fn.h   2018-05-16 11:12:11.872116220 +0100
> @@ -158,6 +158,17 @@ direct_internal_fn_p (internal_fn fn)
>     return direct_internal_fn_array[fn].type0 >= -1;
>   }

> +/* Return true if FN is a direct internal function that can be vectorized by
> +   converting the return type and all argument types to vectors of the same
> +   number of elements.  E.g. we can vectorize an IFN_SQRT on floats as an
> +   IFN_SQRT on vectors of N floats.  */
> +
> +inline bool
> +vectorizable_internal_fn_p (internal_fn fn)
> +{
> +  return direct_internal_fn_array[fn].vectorizable;
> +}
> +
>   /* Return optab information about internal function FN.  Only meaningful
>      if direct_internal_fn_p (FN).  */

> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2018-05-16 11:02:46.262494712 +0100
> +++ gcc/tree-vect-slp.c 2018-05-16 11:12:11.873116180 +0100
> @@ -564,6 +564,41 @@ vect_get_and_check_slp_defs (vec_info *v
>     return 0;
>   }

> +/* Return true if call statements CALL1 and CALL2 are similar enough
> +   to be combined into the same SLP group.  */
> +
> +static bool
> +compatible_calls_p (gcall *call1, gcall *call2)
> +{
> +  unsigned int nargs = gimple_call_num_args (call1);
> +  if (nargs != gimple_call_num_args (call2))
> +    return false;
> +
> +  if (gimple_call_combined_fn (call1) != gimple_call_combined_fn (call2))
> +    return false;
> +
> +  if (gimple_call_internal_p (call1))
> +    {
> +      if (TREE_TYPE (gimple_call_lhs (call1))
> +         != TREE_TYPE (gimple_call_lhs (call2)))
> +       return false;
> +      for (unsigned int i = 0; i < nargs; ++i)
> +       if (TREE_TYPE (gimple_call_arg (call1, i))
> +           != TREE_TYPE (gimple_call_arg (call2, i)))

Please use types_compatible_p in these two type comparisons.
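
To illustrate why the distinction matters, here is a toy sketch in plain C.
It is not GCC code: struct toy_type is a made-up stand-in for GCC's type
nodes, and toy_same_node_p and toy_compatible_p are hypothetical helper
names.  The real types_compatible_p applies GCC's own type-conversion rules,
which are richer than the field-by-field check below.  The sketch only shows
that two distinct nodes can describe interchangeable types, so a pointer
comparison can reject calls that could safely share an SLP group.

```c
#include <stdbool.h>

/* Toy stand-in for a type node (hypothetical; GCC's real nodes carry
   far more information than this).  */
struct toy_type
{
  int kind;            /* 0 = integer, 1 = floating point.  */
  unsigned precision;  /* Width in bits.  */
  bool is_unsigned;    /* Only meaningful for integers.  */
};

/* Identity check: true only if A and B are the very same node.
   This mirrors comparing two type pointers with == or !=.  */
bool
toy_same_node_p (const struct toy_type *a, const struct toy_type *b)
{
  return a == b;
}

/* Structural check: true if A and B describe interchangeable types,
   even when they are distinct nodes.  A simplified assumption about
   what "compatible" means, for illustration only.  */
bool
toy_compatible_p (const struct toy_type *a, const struct toy_type *b)
{
  if (a == b)
    return true;
  if (a->kind != b->kind || a->precision != b->precision)
    return false;
  /* Signedness only matters for integer types in this toy model.  */
  return a->kind != 0 || a->is_unsigned == b->is_unsigned;
}
```

With two separately created 32-bit signed integer nodes, toy_same_node_p
returns false while toy_compatible_p returns true; an unsigned node of the
same width is rejected by both.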

Can you please add a generic vect_call_sqrtf test to the main
vectorizer testsuite?  In fact I already see
gcc.dg/vect/fast-math-bb-slp-call-1.c.
Does that mean SQRT never appears as an internal function before
vectorization?

OK with those changes.
Richard.

> +         return false;
> +    }
> +  else
> +    {
> +      if (!operand_equal_p (gimple_call_fn (call1),
> +                           gimple_call_fn (call2), 0))
> +       return false;
> +
> +      if (gimple_call_fntype (call1) != gimple_call_fntype (call2))
> +       return false;
> +    }
> +  return true;
> +}
> +
>   /* A subroutine of vect_build_slp_tree for checking VECTYPE, which is the
>      caller's attempt to find the vector type in STMT with the narrowest
>      element type.  Return true if VECTYPE is nonnull and if it is valid
> @@ -625,8 +660,8 @@ vect_record_max_nunits (vec_info *vinfo,
>   static bool
>   vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
>                         vec<gimple *> stmts, unsigned int group_size,
> -                      unsigned nops, poly_uint64 *max_nunits,
> -                      bool *matches, bool *two_operators)
> +                      poly_uint64 *max_nunits, bool *matches,
> +                      bool *two_operators)
>   {
>     unsigned int i;
>     gimple *first_stmt = stmts[0], *stmt = stmts[0];
> @@ -698,7 +733,9 @@ vect_build_slp_tree_1 (vec_info *vinfo,
>         if (gcall *call_stmt = dyn_cast <gcall *> (stmt))
>          {
>            rhs_code = CALL_EXPR;
> -         if (gimple_call_internal_p (call_stmt)
> +         if ((gimple_call_internal_p (call_stmt)
> +              && (!vectorizable_internal_fn_p
> +                  (gimple_call_internal_fn (call_stmt))))
>                || gimple_call_tail_p (call_stmt)
>                || gimple_call_noreturn_p (call_stmt)
>                || !gimple_call_nothrow_p (call_stmt)
> @@ -833,11 +870,8 @@ vect_build_slp_tree_1 (vec_info *vinfo,
>            if (rhs_code == CALL_EXPR)
>              {
>                gimple *first_stmt = stmts[0];
> -             if (gimple_call_num_args (stmt) != nops
> -                 || !operand_equal_p (gimple_call_fn (first_stmt),
> -                                      gimple_call_fn (stmt), 0)
> -                 || gimple_call_fntype (first_stmt)
> -                    != gimple_call_fntype (stmt))
> +             if (!compatible_calls_p (as_a <gcall *> (first_stmt),
> +                                      as_a <gcall *> (stmt)))
>                  {
>                    if (dump_enabled_p ())
>                      {
> @@ -1166,8 +1200,7 @@ vect_build_slp_tree_2 (vec_info *vinfo,

>     bool two_operators = false;
>     unsigned char *swap = XALLOCAVEC (unsigned char, group_size);
> -  if (!vect_build_slp_tree_1 (vinfo, swap,
> -                             stmts, group_size, nops,
> +  if (!vect_build_slp_tree_1 (vinfo, swap, stmts, group_size,
>                                &this_max_nunits, matches, &two_operators))
>       return NULL;

> Index: gcc/testsuite/gcc.target/aarch64/sve/cond_arith_4.c
> ===================================================================
> --- /dev/null   2018-04-20 16:19:46.369131350 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/cond_arith_4.c 2018-05-16 11:12:11.872116220 +0100
> @@ -0,0 +1,62 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#include <stdint.h>
> +
> +#define TEST(TYPE, NAME, OP)                                           \
> +  void __attribute__ ((noinline, noclone))                             \
> +  test_##TYPE##_##NAME (TYPE *__restrict x,                            \
> +                       TYPE *__restrict y,                             \
> +                       TYPE z1, TYPE z2,                               \
> +                       TYPE *__restrict pred, int n)                   \
> +  {                                                                    \
> +    for (int i = 0; i < n; i += 2)                                     \
> +      {                                                                \
> +       x[i] = (pred[i] != 1 ? y[i] OP z1 : y[i]);                      \
> +       x[i + 1] = (pred[i + 1] != 1 ? y[i + 1] OP z2 : y[i + 1]);      \
> +      }                                                                \
> +  }
> +
> +#define TEST_INT_TYPE(TYPE) \
> +  TEST (TYPE, div, /)
> +
> +#define TEST_FP_TYPE(TYPE) \
> +  TEST (TYPE, add, +) \
> +  TEST (TYPE, sub, -) \
> +  TEST (TYPE, mul, *) \
> +  TEST (TYPE, div, /)
> +
> +#define TEST_ALL \
> +  TEST_INT_TYPE (int32_t) \
> +  TEST_INT_TYPE (uint32_t) \
> +  TEST_INT_TYPE (int64_t) \
> +  TEST_INT_TYPE (uint64_t) \
> +  TEST_FP_TYPE (float) \
> +  TEST_FP_TYPE (double)
> +
> +TEST_ALL
> +
> +/* { dg-final { scan-assembler-times {\tsdiv\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tudiv\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tsdiv\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tudiv\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z,} 12 } } */
> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7],} 6 } } */
> +
> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z,} 12 } } */
> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7],} 6 } } */
> +
> +/* { dg-final { scan-assembler-not {\tsel\t} } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/cond_arith_4_run.c
> ===================================================================
> --- /dev/null   2018-04-20 16:19:46.369131350 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/cond_arith_4_run.c 2018-05-16 11:12:11.872116220 +0100
> @@ -0,0 +1,32 @@
> +/* { dg-do run { target aarch64_sve_hw } } */
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#include "cond_arith_4.c"
> +
> +#define N 98
> +
> +#undef TEST
> +#define TEST(TYPE, NAME, OP)                                   \
> +  {                                                            \
> +    TYPE x[N], y[N], pred[N], z[2] = { 5, 7 };                 \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       y[i] = i * i;                                           \
> +       pred[i] = i % 3;                                        \
> +      }                                                                \
> +    test_##TYPE##_##NAME (x, y, z[0], z[1], pred, N);          \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       TYPE expected = i % 3 != 1 ? y[i] OP z[i & 1] : y[i];   \
> +       if (x[i] != expected)                                   \
> +         __builtin_abort ();                                   \
> +       asm volatile ("" ::: "memory");                         \
> +      }                                                                \
> +  }
> +
> +int
> +main (void)
> +{
> +  TEST_ALL
> +  return 0;
> +}
> Index: gcc/testsuite/gcc.target/aarch64/sve/cond_arith_5.c
> ===================================================================
> --- /dev/null   2018-04-20 16:19:46.369131350 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/cond_arith_5.c 2018-05-16 11:12:11.872116220 +0100
> @@ -0,0 +1,85 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model" } */
> +
> +#include <stdint.h>
> +
> +#define TEST(DATA_TYPE, OTHER_TYPE, NAME, OP)                          \
> +  void __attribute__ ((noinline, noclone))                             \
> +  test_##DATA_TYPE##_##OTHER_TYPE##_##NAME (DATA_TYPE *__restrict x,   \
> +                                           DATA_TYPE *__restrict y,    \
> +                                           DATA_TYPE z1, DATA_TYPE z2, \
> +                                           DATA_TYPE *__restrict pred, \
> +                                           OTHER_TYPE *__restrict foo, \
> +                                           int n)                      \
> +  {                                                                    \
> +    for (int i = 0; i < n; i += 2)                                     \
> +      {                                                                \
> +       x[i] = (pred[i] != 1 ? y[i] OP z1 : y[i]);                      \
> +       x[i + 1] = (pred[i + 1] != 1 ? y[i + 1] OP z2 : y[i + 1]);      \
> +       foo[i] += 1;                                                    \
> +       foo[i + 1] += 2;                                                \
> +      }                                                                \
> +  }
> +
> +#define TEST_INT_TYPE(DATA_TYPE, OTHER_TYPE) \
> +  TEST (DATA_TYPE, OTHER_TYPE, div, /)
> +
> +#define TEST_FP_TYPE(DATA_TYPE, OTHER_TYPE) \
> +  TEST (DATA_TYPE, OTHER_TYPE, add, +) \
> +  TEST (DATA_TYPE, OTHER_TYPE, sub, -) \
> +  TEST (DATA_TYPE, OTHER_TYPE, mul, *) \
> +  TEST (DATA_TYPE, OTHER_TYPE, div, /)
> +
> +#define TEST_ALL \
> +  TEST_INT_TYPE (int32_t, int8_t) \
> +  TEST_INT_TYPE (int32_t, int16_t) \
> +  TEST_INT_TYPE (uint32_t, int8_t) \
> +  TEST_INT_TYPE (uint32_t, int16_t) \
> +  TEST_INT_TYPE (int64_t, int8_t) \
> +  TEST_INT_TYPE (int64_t, int16_t) \
> +  TEST_INT_TYPE (int64_t, int32_t) \
> +  TEST_INT_TYPE (uint64_t, int8_t) \
> +  TEST_INT_TYPE (uint64_t, int16_t) \
> +  TEST_INT_TYPE (uint64_t, int32_t) \
> +  TEST_FP_TYPE (float, int8_t) \
> +  TEST_FP_TYPE (float, int16_t) \
> +  TEST_FP_TYPE (double, int8_t) \
> +  TEST_FP_TYPE (double, int16_t) \
> +  TEST_FP_TYPE (double, int32_t)
> +
> +TEST_ALL
> +
> +/* { dg-final { scan-assembler-times {\tsdiv\tz[0-9]+\.s, p[0-7]/m,} 6 } } */
> +/* { dg-final { scan-assembler-times {\tudiv\tz[0-9]+\.s, p[0-7]/m,} 6 } } */
> +/* { dg-final { scan-assembler-times {\tsdiv\tz[0-9]+\.d, p[0-7]/m,} 14 } } */
> +/* { dg-final { scan-assembler-times {\tudiv\tz[0-9]+\.d, p[0-7]/m,} 14 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m,} 6 } } */
> +/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m,} 14 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 6 } } */
> +/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 14 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 6 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 14 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.s, p[0-7]/m,} 6 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.d, p[0-7]/m,} 14 } } */
> +
> +/* The load XFAILs for fixed-length SVE account for extra loads from the
> +   constant pool.  */
> +/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z,} 12 { xfail { aarch64_sve && { ! vect_variable_length } } } } } */
> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7],} 12 } } */
> +
> +/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z,} 12 { xfail { aarch64_sve && { ! vect_variable_length } } } } } */
> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7],} 12 } } */
> +
> +/* 72 for x operations, 6 for foo operations.  */
> +/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z,} 78 { xfail { aarch64_sve && { ! vect_variable_length } } } } } */
> +/* 36 for x operations, 6 for foo operations.  */
> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7],} 42 } } */
> +
> +/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z,} 168 } } */
> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7],} 84 } } */
> +
> +/* { dg-final { scan-assembler-not {\tsel\t} } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/cond_arith_5_run.c
> ===================================================================
> --- /dev/null   2018-04-20 16:19:46.369131350 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/cond_arith_5_run.c 2018-05-16 11:12:11.873116180 +0100
> @@ -0,0 +1,35 @@
> +/* { dg-do run { target aarch64_sve_hw } } */
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#include "cond_arith_5.c"
> +
> +#define N 98
> +
> +#undef TEST
> +#define TEST(DATA_TYPE, OTHER_TYPE, NAME, OP)                          \
> +  {                                                                    \
> +    DATA_TYPE x[N], y[N], pred[N], z[2] = { 5, 7 };                    \
> +    OTHER_TYPE foo[N];                                                 \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       y[i] = i * i;                                                   \
> +       pred[i] = i % 3;                                                \
> +       foo[i] = i * 5;                                                 \
> +      }                                                                \
> +    test_##DATA_TYPE##_##OTHER_TYPE##_##NAME (x, y, z[0], z[1],        \
> +                                             pred, foo, N);            \
> +    for (int i = 0; i < N; ++i)                                        \
> +      {                                                                \
> +       DATA_TYPE expected = i % 3 != 1 ? y[i] OP z[i & 1] : y[i];      \
> +       if (x[i] != expected)                                           \
> +         __builtin_abort ();                                           \
> +       asm volatile ("" ::: "memory");                                 \
> +      }                                                                \
> +  }
> +
> +int
> +main (void)
> +{
> +  TEST_ALL
> +  return 0;
> +}
> Index: gcc/testsuite/gcc.target/aarch64/sve/slp_14.c
> ===================================================================
> --- /dev/null   2018-04-20 16:19:46.369131350 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/slp_14.c       2018-05-16 11:12:11.873116180 +0100
> @@ -0,0 +1,48 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#include <stdint.h>
> +
> +#define VEC_PERM(TYPE)                                         \
> +void __attribute__ ((weak))                                    \
> +vec_slp_##TYPE (TYPE *restrict a, TYPE *restrict b, int n)     \
> +{                                                              \
> +  for (int i = 0; i < n; ++i)                                  \
> +    {                                                          \
> +      TYPE a1 = a[i * 2];                                      \
> +      TYPE a2 = a[i * 2 + 1];                                  \
> +      TYPE b1 = b[i * 2];                                      \
> +      TYPE b2 = b[i * 2 + 1];                                  \
> +      a[i * 2] = b1 > 1 ? a1 / b1 : a1;                                \
> +      a[i * 2 + 1] = b2 > 2 ? a2 / b2 : a2;                    \
> +    }                                                          \
> +}
> +
> +#define TEST_ALL(T)                            \
> +  T (int32_t)                                  \
> +  T (uint32_t)                                 \
> +  T (int64_t)                                  \
> +  T (uint64_t)                                 \
> +  T (float)                                    \
> +  T (double)
> +
> +TEST_ALL (VEC_PERM)
> +
> +/* The loop should be fully-masked.  The load XFAILs for fixed-length
> +   SVE account for extra loads from the constant pool.  */
> +/* { dg-final { scan-assembler-times {\tld1w\t} 6 { xfail { aarch64_sve && { ! vect_variable_length } } } } } */
> +/* { dg-final { scan-assembler-times {\tst1w\t} 3 } } */
> +/* { dg-final { scan-assembler-times {\tld1d\t} 6 { xfail { aarch64_sve && { ! vect_variable_length } } } } } */
> +/* { dg-final { scan-assembler-times {\tst1d\t} 3 } } */
> +/* { dg-final { scan-assembler-not {\tldr} } } */
> +/* { dg-final { scan-assembler-not {\tstr} } } */
> +
> +/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s} 6 } } */
> +/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d} 6 } } */
> +
> +/* { dg-final { scan-assembler-times {\tsdiv\tz[0-9]+\.s} 1 } } */
> +/* { dg-final { scan-assembler-times {\tudiv\tz[0-9]+\.s} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.s} 1 } } */
> +/* { dg-final { scan-assembler-times {\tsdiv\tz[0-9]+\.d} 1 } } */
> +/* { dg-final { scan-assembler-times {\tudiv\tz[0-9]+\.d} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.d} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/slp_14_run.c
> ===================================================================
> --- /dev/null   2018-04-20 16:19:46.369131350 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/slp_14_run.c   2018-05-16 11:12:11.873116180 +0100
> @@ -0,0 +1,34 @@
> +/* { dg-do run { target aarch64_sve_hw } } */
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#include "slp_14.c"
> +
> +#define N1 (103 * 2)
> +#define N2 (111 * 2)
> +
> +#define HARNESS(TYPE)                                          \
> +  {                                                            \
> +    TYPE a[N2], b[N2];                                         \
> +    for (unsigned int i = 0; i < N2; ++i)                      \
> +      {                                                                \
> +       a[i] = i * 2 + i % 5;                                   \
> +       b[i] = i % 11;                                          \
> +      }                                                                \
> +    vec_slp_##TYPE (a, b, N1 / 2);                             \
> +    for (unsigned int i = 0; i < N2; ++i)                      \
> +      {                                                                \
> +       TYPE orig_a = i * 2 + i % 5;                            \
> +       TYPE orig_b = i % 11;                                   \
> +       TYPE expected_a = orig_a;                               \
> +       if (i < N1 && orig_b > (i & 1 ? 2 : 1))                 \
> +         expected_a /= orig_b;                                 \
> +       if (a[i] != expected_a || b[i] != orig_b)               \
> +         __builtin_abort ();                                   \
> +      }                                                                \
> +  }
> +
> +int
> +main (void)
> +{
> +  TEST_ALL (HARNESS)
> +}

Thread overview: 4+ messages
2018-05-16 10:21 Richard Sandiford
2018-05-17 11:42 ` Richard Biener [this message]
2018-05-25 10:49   ` Richard Sandiford
2018-05-25 11:09     ` Richard Biener