* Add support for bitwise reductions
@ 2017-11-17 10:10 Richard Sandiford
2017-11-22 18:28 ` Richard Sandiford
From: Richard Sandiford @ 2017-11-17 10:10 UTC (permalink / raw)
To: gcc-patches
This patch adds support for the SVE bitwise reduction instructions
(ANDV, ORV and EORV). It's a fairly mechanical extension of existing
REDUC_* operators.
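As background (an illustrative sketch, not part of the patch): the new codes describe loops of the following shape, which the vectorizer can now reduce with a single SVE instruction instead of a shift-based epilogue. The function name and values here are made up for illustration.

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical example: a bitwise-OR reduction loop of the kind that
   REDUC_IOR_EXPR describes and that the SVE ORV instruction can
   implement directly.  */
uint8_t
or_reduce (const uint8_t *a, int n)
{
  uint8_t r = 0;
  for (int i = 0; i < n; ++i)
    r |= a[i];
  return r;
}
```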
Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
and powerpc64le-linux-gnu.
Richard
2017-11-17 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree.def (REDUC_AND_EXPR, REDUC_IOR_EXPR, REDUC_XOR_EXPR): New
tree codes.
* doc/md.texi (reduc_and_scal_@var{m}, reduc_ior_scal_@var{m})
(reduc_xor_scal_@var{m}): Document.
* doc/sourcebuild.texi (vect_logical_reduc): Likewise.
* doc/generic.texi (REDUC_MAX_EXPR, REDUC_MIN_EXPR, REDUC_PLUS_EXPR)
(REDUC_AND_EXPR, REDUC_IOR_EXPR, REDUC_XOR_EXPR): Likewise.
* optabs.def (reduc_and_scal_optab, reduc_ior_scal_optab)
(reduc_xor_scal_optab): New optabs.
* cfgexpand.c (expand_debug_expr): Handle the new tree codes.
* expr.c (expand_expr_real_2): Likewise.
* fold-const.c (const_unop): Likewise.
* optabs-tree.c (optab_for_tree_code): Likewise.
* tree-cfg.c (verify_gimple_assign_unary): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise. Reuse
generic unary code for REDUC_MAX_EXPR, REDUC_MIN_EXPR and
REDUC_PLUS_EXPR.
* tree-vect-loop.c (reduction_code_for_scalar_code): Return the
new reduction codes for BIT_AND_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR.
* config/aarch64/aarch64-sve.md (reduc_<optab>_scal_<mode>)
(*reduc_<optab>_scal_<mode>): New patterns.
* config/aarch64/iterators.md (UNSPEC_ANDV, UNSPEC_IORV)
(UNSPEC_XORV): New unspecs.
(optab): Add entries for them.
(BITWISEV): New int iterator.
(bit_reduc_op): New int attribute.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_logical_reduc):
New proc.
* gcc.dg/vect/vect-reduc-or_1.c: Also run for vect_logical_reduc
and add an associated scan-dump test. Prevent vectorization
of the first two loops.
* gcc.dg/vect/vect-reduc-or_2.c: Likewise.
* gcc.target/aarch64/sve_reduc_1.c: Add AND, IOR and XOR reductions.
* gcc.target/aarch64/sve_reduc_2.c: Likewise.
* gcc.target/aarch64/sve_reduc_1_run.c: Likewise.
(INIT_VECTOR): Tweak initial value so that some bits are always set.
* gcc.target/aarch64/sve_reduc_2_run.c: Likewise.
Index: gcc/tree.def
===================================================================
--- gcc/tree.def 2017-11-17 09:40:43.533167007 +0000
+++ gcc/tree.def 2017-11-17 09:49:36.196354636 +0000
@@ -1298,6 +1298,9 @@ DEFTREECODE (TRANSACTION_EXPR, "transact
DEFTREECODE (REDUC_MAX_EXPR, "reduc_max_expr", tcc_unary, 1)
DEFTREECODE (REDUC_MIN_EXPR, "reduc_min_expr", tcc_unary, 1)
DEFTREECODE (REDUC_PLUS_EXPR, "reduc_plus_expr", tcc_unary, 1)
+DEFTREECODE (REDUC_AND_EXPR, "reduc_and_expr", tcc_unary, 1)
+DEFTREECODE (REDUC_IOR_EXPR, "reduc_ior_expr", tcc_unary, 1)
+DEFTREECODE (REDUC_XOR_EXPR, "reduc_xor_expr", tcc_unary, 1)
/* Widening dot-product.
The first two arguments are of type t1.
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi 2017-11-17 09:44:46.386606597 +0000
+++ gcc/doc/md.texi 2017-11-17 09:49:36.189354637 +0000
@@ -5244,6 +5244,17 @@ Compute the sum of the elements of a vec
operand 0 is the scalar result, with mode equal to the mode of the elements of
the input vector.
+@cindex @code{reduc_and_scal_@var{m}} instruction pattern
+@item @samp{reduc_and_scal_@var{m}}
+@cindex @code{reduc_ior_scal_@var{m}} instruction pattern
+@itemx @samp{reduc_ior_scal_@var{m}}
+@cindex @code{reduc_xor_scal_@var{m}} instruction pattern
+@itemx @samp{reduc_xor_scal_@var{m}}
+Compute the bitwise @code{AND}/@code{IOR}/@code{XOR} reduction of the elements
+of a vector of mode @var{m}. Operand 1 is the vector input and operand 0
+is the scalar result. The mode of the scalar result is the same as one
+element of @var{m}.
+
@cindex @code{sdot_prod@var{m}} instruction pattern
@item @samp{sdot_prod@var{m}}
@cindex @code{udot_prod@var{m}} instruction pattern
Index: gcc/doc/sourcebuild.texi
===================================================================
--- gcc/doc/sourcebuild.texi 2017-11-09 15:19:05.427168565 +0000
+++ gcc/doc/sourcebuild.texi 2017-11-17 09:49:36.190354637 +0000
@@ -1570,6 +1570,9 @@ Target supports 16- and 8-bytes vectors.
@item vect_sizes_32B_16B
Target supports 32- and 16-bytes vectors.
+
+@item vect_logical_reduc
+Target supports AND, IOR and XOR reduction on vectors.
@end table
@subsubsection Thread Local Storage attributes
Index: gcc/doc/generic.texi
===================================================================
--- gcc/doc/generic.texi 2017-11-17 09:40:43.510667010 +0000
+++ gcc/doc/generic.texi 2017-11-17 09:49:36.188354638 +0000
@@ -1740,6 +1740,12 @@ a value from @code{enum annot_expr_kind}
@tindex VEC_PACK_FIX_TRUNC_EXPR
@tindex VEC_COND_EXPR
@tindex SAD_EXPR
+@tindex REDUC_MAX_EXPR
+@tindex REDUC_MIN_EXPR
+@tindex REDUC_PLUS_EXPR
+@tindex REDUC_AND_EXPR
+@tindex REDUC_IOR_EXPR
+@tindex REDUC_XOR_EXPR
@table @code
@item VEC_DUPLICATE_EXPR
@@ -1841,6 +1847,20 @@ operand must be at lease twice of the si
first and second one. The SAD is calculated between the first and second
operands, added to the third operand, and returned.
+@item REDUC_MAX_EXPR
+@itemx REDUC_MIN_EXPR
+@itemx REDUC_PLUS_EXPR
+@itemx REDUC_AND_EXPR
+@itemx REDUC_IOR_EXPR
+@itemx REDUC_XOR_EXPR
+These nodes represent operations that take a vector input and repeatedly
+apply a binary operator on pairs of elements until only one scalar remains.
+For example, @samp{REDUC_PLUS_EXPR <@var{x}>} returns the sum of
+the elements in @var{x} and @samp{REDUC_MAX_EXPR <@var{x}>} returns
+the maximum element in @var{x}. The associativity of the operation
+is unspecified; for example, @samp{REDUC_PLUS_EXPR <@var{x}>} could
+sum floating-point @var{x} in forward order, in reverse order,
+using a tree, or in some other way.
@end table
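The "associativity is unspecified" wording above only becomes observable for non-associative operations such as floating-point addition; for the new bitwise codes every association gives the same answer, so the target is free to reduce pairwise. A standalone sketch (functions and values invented for illustration, not from the patch):

```c
#include <stdint.h>
#include <assert.h>

/* Left-to-right XOR reduction.  */
static uint32_t
reduce_xor_linear (const uint32_t *v, int n)
{
  uint32_t r = v[0];
  for (int i = 1; i < n; ++i)
    r ^= v[i];
  return r;
}

/* Tree-shaped XOR reduction, as a vector unit might perform it.
   Because XOR is associative and commutative, this always matches
   the linear order.  */
static uint32_t
reduce_xor_tree (const uint32_t *v, int n)
{
  if (n == 1)
    return v[0];
  int half = n / 2;
  return reduce_xor_tree (v, half) ^ reduce_xor_tree (v + half, n - half);
}
```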
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def 2017-11-17 09:44:46.386606597 +0000
+++ gcc/optabs.def 2017-11-17 09:49:36.192354637 +0000
@@ -292,6 +292,9 @@ OPTAB_D (reduc_smin_scal_optab, "reduc_s
OPTAB_D (reduc_plus_scal_optab, "reduc_plus_scal_$a")
OPTAB_D (reduc_umax_scal_optab, "reduc_umax_scal_$a")
OPTAB_D (reduc_umin_scal_optab, "reduc_umin_scal_$a")
+OPTAB_D (reduc_and_scal_optab, "reduc_and_scal_$a")
+OPTAB_D (reduc_ior_scal_optab, "reduc_ior_scal_$a")
+OPTAB_D (reduc_xor_scal_optab, "reduc_xor_scal_$a")
OPTAB_D (sdot_prod_optab, "sdot_prod$I$a")
OPTAB_D (ssum_widen_optab, "widen_ssum$I$a3")
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c 2017-11-17 09:40:43.509767010 +0000
+++ gcc/cfgexpand.c 2017-11-17 09:49:36.187354638 +0000
@@ -5069,6 +5069,9 @@ expand_debug_expr (tree exp)
case REDUC_MAX_EXPR:
case REDUC_MIN_EXPR:
case REDUC_PLUS_EXPR:
+ case REDUC_AND_EXPR:
+ case REDUC_IOR_EXPR:
+ case REDUC_XOR_EXPR:
case VEC_COND_EXPR:
case VEC_PACK_FIX_TRUNC_EXPR:
case VEC_PACK_SAT_EXPR:
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-11-17 09:06:05.552470755 +0000
+++ gcc/expr.c 2017-11-17 09:49:36.191354637 +0000
@@ -9438,6 +9438,9 @@ #define REDUCE_BIT_FIELD(expr) (reduce_b
case REDUC_MAX_EXPR:
case REDUC_MIN_EXPR:
case REDUC_PLUS_EXPR:
+ case REDUC_AND_EXPR:
+ case REDUC_IOR_EXPR:
+ case REDUC_XOR_EXPR:
{
op0 = expand_normal (treeop0);
this_optab = optab_for_tree_code (code, type, optab_default);
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c 2017-11-17 09:06:23.404260252 +0000
+++ gcc/fold-const.c 2017-11-17 09:49:36.192354637 +0000
@@ -1869,6 +1869,9 @@ const_unop (enum tree_code code, tree ty
case REDUC_MIN_EXPR:
case REDUC_MAX_EXPR:
case REDUC_PLUS_EXPR:
+ case REDUC_AND_EXPR:
+ case REDUC_IOR_EXPR:
+ case REDUC_XOR_EXPR:
{
unsigned int nelts, i;
enum tree_code subcode;
@@ -1882,6 +1885,9 @@ const_unop (enum tree_code code, tree ty
case REDUC_MIN_EXPR: subcode = MIN_EXPR; break;
case REDUC_MAX_EXPR: subcode = MAX_EXPR; break;
case REDUC_PLUS_EXPR: subcode = PLUS_EXPR; break;
+ case REDUC_AND_EXPR: subcode = BIT_AND_EXPR; break;
+ case REDUC_IOR_EXPR: subcode = BIT_IOR_EXPR; break;
+ case REDUC_XOR_EXPR: subcode = BIT_XOR_EXPR; break;
default: gcc_unreachable ();
}
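The hunk above maps each new reduction code to its element-wise subcode and folds a constant vector one element at a time. A standalone model of that fold for the AND case (a hypothetical helper, not the GCC API):

```c
#include <stdint.h>
#include <assert.h>

/* Model of const_unop's fold for REDUC_AND_EXPR: the subcode
   BIT_AND_EXPR is applied across every element of the constant
   vector, leaving one scalar.  */
static uint64_t
fold_reduc_and (const uint64_t *elts, unsigned int nelts)
{
  uint64_t r = elts[0];
  for (unsigned int i = 1; i < nelts; ++i)
    r &= elts[i];
  return r;
}
```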
Index: gcc/optabs-tree.c
===================================================================
--- gcc/optabs-tree.c 2017-11-17 09:40:43.523267008 +0000
+++ gcc/optabs-tree.c 2017-11-17 09:49:36.192354637 +0000
@@ -157,6 +157,15 @@ optab_for_tree_code (enum tree_code code
case REDUC_PLUS_EXPR:
return reduc_plus_scal_optab;
+ case REDUC_AND_EXPR:
+ return reduc_and_scal_optab;
+
+ case REDUC_IOR_EXPR:
+ return reduc_ior_scal_optab;
+
+ case REDUC_XOR_EXPR:
+ return reduc_xor_scal_optab;
+
case VEC_WIDEN_MULT_HI_EXPR:
return TYPE_UNSIGNED (type) ?
vec_widen_umult_hi_optab : vec_widen_smult_hi_optab;
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c 2017-11-17 09:05:59.899390175 +0000
+++ gcc/tree-cfg.c 2017-11-17 09:49:36.194354636 +0000
@@ -3773,6 +3773,9 @@ verify_gimple_assign_unary (gassign *stm
case REDUC_MAX_EXPR:
case REDUC_MIN_EXPR:
case REDUC_PLUS_EXPR:
+ case REDUC_AND_EXPR:
+ case REDUC_IOR_EXPR:
+ case REDUC_XOR_EXPR:
if (!VECTOR_TYPE_P (rhs1_type)
|| !useless_type_conversion_p (lhs_type, TREE_TYPE (rhs1_type)))
{
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c 2017-11-17 09:40:43.527767008 +0000
+++ gcc/tree-inline.c 2017-11-17 09:49:36.195354636 +0000
@@ -3878,6 +3878,9 @@ estimate_operator_cost (enum tree_code c
case REDUC_MAX_EXPR:
case REDUC_MIN_EXPR:
case REDUC_PLUS_EXPR:
+ case REDUC_AND_EXPR:
+ case REDUC_IOR_EXPR:
+ case REDUC_XOR_EXPR:
case WIDEN_SUM_EXPR:
case WIDEN_MULT_EXPR:
case DOT_PROD_EXPR:
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c 2017-11-17 08:57:27.159529444 +0000
+++ gcc/tree-pretty-print.c 2017-11-17 09:49:36.195354636 +0000
@@ -3231,24 +3231,6 @@ dump_generic_node (pretty_printer *pp, t
is_expr = false;
break;
- case REDUC_MAX_EXPR:
- pp_string (pp, " REDUC_MAX_EXPR < ");
- dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
- pp_string (pp, " > ");
- break;
-
- case REDUC_MIN_EXPR:
- pp_string (pp, " REDUC_MIN_EXPR < ");
- dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
- pp_string (pp, " > ");
- break;
-
- case REDUC_PLUS_EXPR:
- pp_string (pp, " REDUC_PLUS_EXPR < ");
- dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
- pp_string (pp, " > ");
- break;
-
case VEC_SERIES_EXPR:
case VEC_WIDEN_MULT_HI_EXPR:
case VEC_WIDEN_MULT_LO_EXPR:
@@ -3267,6 +3249,12 @@ dump_generic_node (pretty_printer *pp, t
break;
case VEC_DUPLICATE_EXPR:
+ case REDUC_MAX_EXPR:
+ case REDUC_MIN_EXPR:
+ case REDUC_PLUS_EXPR:
+ case REDUC_AND_EXPR:
+ case REDUC_IOR_EXPR:
+ case REDUC_XOR_EXPR:
pp_space (pp);
for (str = get_tree_code_name (code); *str; str++)
pp_character (pp, TOUPPER (*str));
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c 2017-11-17 09:44:46.389306597 +0000
+++ gcc/tree-vect-loop.c 2017-11-17 09:49:36.196354636 +0000
@@ -2437,11 +2437,20 @@ reduction_code_for_scalar_code (enum tre
*reduc_code = REDUC_PLUS_EXPR;
return true;
- case MULT_EXPR:
- case MINUS_EXPR:
+ case BIT_AND_EXPR:
+ *reduc_code = REDUC_AND_EXPR;
+ return true;
+
case BIT_IOR_EXPR:
+ *reduc_code = REDUC_IOR_EXPR;
+ return true;
+
case BIT_XOR_EXPR:
- case BIT_AND_EXPR:
+ *reduc_code = REDUC_XOR_EXPR;
+ return true;
+
+ case MULT_EXPR:
+ case MINUS_EXPR:
*reduc_code = ERROR_MARK;
return true;
Index: gcc/config/aarch64/aarch64-sve.md
===================================================================
--- gcc/config/aarch64/aarch64-sve.md 2017-11-17 09:44:46.385706597 +0000
+++ gcc/config/aarch64/aarch64-sve.md 2017-11-17 09:49:36.188354638 +0000
@@ -1513,6 +1513,26 @@ (define_insn "*reduc_<maxmin_uns>_scal_<
"<maxmin_uns_op>v\t%<Vetype>0, %1, %2.<Vetype>"
)
+(define_expand "reduc_<optab>_scal_<mode>"
+ [(set (match_operand:<VEL> 0 "register_operand")
+ (unspec:<VEL> [(match_dup 2)
+ (match_operand:SVE_I 1 "register_operand")]
+ BITWISEV))]
+ "TARGET_SVE"
+ {
+ operands[2] = force_reg (<VPRED>mode, CONSTM1_RTX (<VPRED>mode));
+ }
+)
+
+(define_insn "*reduc_<optab>_scal_<mode>"
+ [(set (match_operand:<VEL> 0 "register_operand" "=w")
+ (unspec:<VEL> [(match_operand:<VPRED> 1 "register_operand" "Upl")
+ (match_operand:SVE_I 2 "register_operand" "w")]
+ BITWISEV))]
+ "TARGET_SVE"
+ "<bit_reduc_op>\t%<Vetype>0, %1, %2.<Vetype>"
+)
+
;; Unpredicated floating-point addition.
(define_expand "add<mode>3"
[(set (match_operand:SVE_F 0 "register_operand")
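The expander above materialises an all-ones predicate (`CONSTM1_RTX` of the predicate mode), so every lane participates in the reduction. As a scalar model of the predicated operation (an illustrative assumption about the semantics, not code from the patch):

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

/* Scalar model of a predicated ANDV-style reduction: only lanes whose
   predicate bit is set take part.  The define_expand always passes an
   all-true predicate, so the optab reduces the whole vector.  */
static uint32_t
model_andv (const bool *pred, const uint32_t *z, int lanes)
{
  uint32_t r = ~0u;		/* AND identity, so inactive lanes are no-ops */
  for (int i = 0; i < lanes; ++i)
    if (pred[i])
      r &= z[i];
  return r;
}
```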
Index: gcc/config/aarch64/iterators.md
===================================================================
--- gcc/config/aarch64/iterators.md 2017-11-17 09:40:36.505067706 +0000
+++ gcc/config/aarch64/iterators.md 2017-11-17 09:49:36.188354638 +0000
@@ -405,6 +405,9 @@ (define_c_enum "unspec"
UNSPEC_SDOT ; Used in aarch64-simd.md.
UNSPEC_UDOT ; Used in aarch64-simd.md.
UNSPEC_SEL ; Used in aarch64-sve.md.
+ UNSPEC_ANDV ; Used in aarch64-sve.md.
+ UNSPEC_IORV ; Used in aarch64-sve.md.
+ UNSPEC_XORV ; Used in aarch64-sve.md.
UNSPEC_ANDF ; Used in aarch64-sve.md.
UNSPEC_IORF ; Used in aarch64-sve.md.
UNSPEC_XORF ; Used in aarch64-sve.md.
@@ -1298,6 +1301,8 @@ (define_int_iterator MAXMINV [UNSPEC_UMA
(define_int_iterator FMAXMINV [UNSPEC_FMAXV UNSPEC_FMINV
UNSPEC_FMAXNMV UNSPEC_FMINNMV])
+(define_int_iterator BITWISEV [UNSPEC_ANDV UNSPEC_IORV UNSPEC_XORV])
+
(define_int_iterator LOGICALF [UNSPEC_ANDF UNSPEC_IORF UNSPEC_XORF])
(define_int_iterator HADDSUB [UNSPEC_SHADD UNSPEC_UHADD
@@ -1417,7 +1422,10 @@ (define_int_attr atomic_ldop
;; name for consistency with the integer patterns.
(define_int_attr optab [(UNSPEC_ANDF "and")
(UNSPEC_IORF "ior")
- (UNSPEC_XORF "xor")])
+ (UNSPEC_XORF "xor")
+ (UNSPEC_ANDV "and")
+ (UNSPEC_IORV "ior")
+ (UNSPEC_XORV "xor")])
(define_int_attr maxmin_uns [(UNSPEC_UMAXV "umax")
(UNSPEC_UMINV "umin")
@@ -1445,6 +1453,10 @@ (define_int_attr maxmin_uns_op [(UNSPEC
(UNSPEC_FMAXNM "fmaxnm")
(UNSPEC_FMINNM "fminnm")])
+(define_int_attr bit_reduc_op [(UNSPEC_ANDV "andv")
+ (UNSPEC_IORV "orv")
+ (UNSPEC_XORV "eorv")])
+
;; The SVE logical instruction that implements an unspec.
(define_int_attr logicalf_op [(UNSPEC_ANDF "and")
(UNSPEC_IORF "orr")
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp 2017-11-17 09:06:28.516102419 +0000
+++ gcc/testsuite/lib/target-supports.exp 2017-11-17 09:49:36.194354636 +0000
@@ -7162,6 +7162,12 @@ proc check_effective_target_vect_call_ro
return $et_vect_call_roundf_saved($et_index)
}
+# Return 1 if the target supports AND, OR and XOR reduction.
+
+proc check_effective_target_vect_logical_reduc { } {
+ return [check_effective_target_aarch64_sve]
+}
+
# Return 1 if the target supports section-anchors
proc check_effective_target_section_anchors { } {
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c 2017-11-09 15:15:28.900668540 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c 2017-11-17 09:49:36.192354637 +0000
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target whole_vector_shift } */
+/* { dg-do run { target { whole_vector_shift || vect_logical_reduc } } } */
/* Write a reduction loop to be reduced using vector shifts. */
@@ -24,17 +24,17 @@ main (unsigned char argc, char **argv)
check_vect ();
for (i = 0; i < N; i++)
- in[i] = (i + i + 1) & 0xfd;
+ {
+ in[i] = (i + i + 1) & 0xfd;
+ asm volatile ("" ::: "memory");
+ }
for (i = 0; i < N; i++)
{
expected |= in[i];
- asm volatile ("");
+ asm volatile ("" ::: "memory");
}
- /* Prevent constant propagation of the entire loop below. */
- asm volatile ("" : : : "memory");
-
for (i = 0; i < N; i++)
sum |= in[i];
@@ -47,5 +47,5 @@ main (unsigned char argc, char **argv)
return 0;
}
-/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" } } */
-
+/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { ! vect_logical_reduc } } } } */
+/* { dg-final { scan-tree-dump "Reduce using direct vector reduction" "vect" { target vect_logical_reduc } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c 2017-11-09 15:15:28.900668540 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c 2017-11-17 09:49:36.192354637 +0000
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target whole_vector_shift } */
+/* { dg-do run { target { whole_vector_shift || vect_logical_reduc } } } */
/* Write a reduction loop to be reduced using vector shifts and folded. */
@@ -23,12 +23,15 @@ main (unsigned char argc, char **argv)
check_vect ();
for (i = 0; i < N; i++)
- in[i] = (i + i + 1) & 0xfd;
+ {
+ in[i] = (i + i + 1) & 0xfd;
+ asm volatile ("" ::: "memory");
+ }
for (i = 0; i < N; i++)
{
expected |= in[i];
- asm volatile ("");
+ asm volatile ("" ::: "memory");
}
for (i = 0; i < N; i++)
@@ -43,5 +46,5 @@ main (unsigned char argc, char **argv)
return 0;
}
-/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" } } */
-
+/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { ! vect_logical_reduc } } } } */
+/* { dg-final { scan-tree-dump "Reduce using direct vector reduction" "vect" { target vect_logical_reduc } } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_reduc_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_reduc_1.c 2017-11-17 09:06:21.395260303 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_reduc_1.c 2017-11-17 09:49:36.192354637 +0000
@@ -65,6 +65,46 @@ #define TEST_MAXMIN(T) \
TEST_MAXMIN (DEF_REDUC_MAXMIN)
+#define DEF_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
+TYPE __attribute__ ((noinline, noclone)) \
+reduc_##NAME##_##TYPE (TYPE *a, int n) \
+{ \
+ TYPE r = 13; \
+ for (int i = 0; i < n; ++i) \
+ r BIT_OP a[i]; \
+ return r; \
+}
+
+#define TEST_BITWISE(T) \
+ T (int8_t, and, &=) \
+ T (int16_t, and, &=) \
+ T (int32_t, and, &=) \
+ T (int64_t, and, &=) \
+ T (uint8_t, and, &=) \
+ T (uint16_t, and, &=) \
+ T (uint32_t, and, &=) \
+ T (uint64_t, and, &=) \
+ \
+ T (int8_t, ior, |=) \
+ T (int16_t, ior, |=) \
+ T (int32_t, ior, |=) \
+ T (int64_t, ior, |=) \
+ T (uint8_t, ior, |=) \
+ T (uint16_t, ior, |=) \
+ T (uint32_t, ior, |=) \
+ T (uint64_t, ior, |=) \
+ \
+ T (int8_t, xor, ^=) \
+ T (int16_t, xor, ^=) \
+ T (int32_t, xor, ^=) \
+ T (int64_t, xor, ^=) \
+ T (uint8_t, xor, ^=) \
+ T (uint16_t, xor, ^=) \
+ T (uint32_t, xor, ^=) \
+ T (uint64_t, xor, ^=)
+
+TEST_BITWISE (DEF_REDUC_BITWISE)
+
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
@@ -102,6 +142,12 @@ TEST_MAXMIN (DEF_REDUC_MAXMIN)
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 8 } } */
+
+/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 8 } } */
+
+/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 8 } } */
+
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
@@ -133,3 +179,18 @@ TEST_MAXMIN (DEF_REDUC_MAXMIN)
/* { dg-final { scan-assembler-times {\tfminnmv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tandv\tb[0-9]+, p[0-7], z[0-9]+\.b} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\th[0-9]+, p[0-7], z[0-9]+\.h} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\ts[0-9]+, p[0-7], z[0-9]+\.s} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\td[0-9]+, p[0-7], z[0-9]+\.d} 2 } } */
+
+/* { dg-final { scan-assembler-times {\torv\tb[0-9]+, p[0-7], z[0-9]+\.b} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\th[0-9]+, p[0-7], z[0-9]+\.h} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\ts[0-9]+, p[0-7], z[0-9]+\.s} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\td[0-9]+, p[0-7], z[0-9]+\.d} 2 } } */
+
+/* { dg-final { scan-assembler-times {\teorv\tb[0-9]+, p[0-7], z[0-9]+\.b} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\th[0-9]+, p[0-7], z[0-9]+\.h} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\ts[0-9]+, p[0-7], z[0-9]+\.s} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\td[0-9]+, p[0-7], z[0-9]+\.d} 2 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_reduc_2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_reduc_2.c 2017-11-17 09:06:21.395260303 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_reduc_2.c 2017-11-17 09:49:36.193354637 +0000
@@ -73,6 +73,49 @@ #define TEST_MAXMIN(T) \
TEST_MAXMIN (DEF_REDUC_MAXMIN)
+#define DEF_REDUC_BITWISE(TYPE,NAME,BIT_OP) \
+void __attribute__ ((noinline, noclone)) \
+reduc_##NAME##TYPE (TYPE (*restrict a)[NUM_ELEMS(TYPE)], \
+ TYPE *restrict r, int n) \
+{ \
+ for (int i = 0; i < n; i++) \
+ { \
+ r[i] = a[i][0]; \
+ for (int j = 0; j < NUM_ELEMS(TYPE); j++) \
+ r[i] BIT_OP a[i][j]; \
+ } \
+}
+
+#define TEST_BITWISE(T) \
+ T (int8_t, and, &=) \
+ T (int16_t, and, &=) \
+ T (int32_t, and, &=) \
+ T (int64_t, and, &=) \
+ T (uint8_t, and, &=) \
+ T (uint16_t, and, &=) \
+ T (uint32_t, and, &=) \
+ T (uint64_t, and, &=) \
+ \
+ T (int8_t, ior, |=) \
+ T (int16_t, ior, |=) \
+ T (int32_t, ior, |=) \
+ T (int64_t, ior, |=) \
+ T (uint8_t, ior, |=) \
+ T (uint16_t, ior, |=) \
+ T (uint32_t, ior, |=) \
+ T (uint64_t, ior, |=) \
+ \
+ T (int8_t, xor, ^=) \
+ T (int16_t, xor, ^=) \
+ T (int32_t, xor, ^=) \
+ T (int64_t, xor, ^=) \
+ T (uint8_t, xor, ^=) \
+ T (uint16_t, xor, ^=) \
+ T (uint32_t, xor, ^=) \
+ T (uint64_t, xor, ^=)
+
+TEST_BITWISE (DEF_REDUC_BITWISE)
+
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
@@ -104,3 +147,18 @@ TEST_MAXMIN (DEF_REDUC_MAXMIN)
/* { dg-final { scan-assembler-times {\tfminnmv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tandv\tb[0-9]+, p[0-7], z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\torv\tb[0-9]+, p[0-7], z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\teorv\tb[0-9]+, p[0-7], z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 2 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c 2017-11-17 09:06:21.395260303 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c 2017-11-17 09:49:36.193354637 +0000
@@ -9,7 +9,7 @@ #define INIT_VECTOR(TYPE) \
TYPE a[NUM_ELEMS (TYPE) + 1]; \
for (int i = 0; i < NUM_ELEMS (TYPE) + 1; i++) \
{ \
- a[i] = (i * 2) * (i & 1 ? 1 : -1); \
+ a[i] = ((i * 2) * (i & 1 ? 1 : -1) | 3); \
asm volatile ("" ::: "memory"); \
}
@@ -35,10 +35,22 @@ #define TEST_REDUC_MAXMIN(TYPE, NAME, CM
__builtin_abort (); \
}
+#define TEST_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
+ { \
+ INIT_VECTOR (TYPE); \
+ TYPE r1 = reduc_##NAME##_##TYPE (a, NUM_ELEMS (TYPE)); \
+ volatile TYPE r2 = 13; \
+ for (int i = 0; i < NUM_ELEMS (TYPE); ++i) \
+ r2 BIT_OP a[i]; \
+ if (r1 != r2) \
+ __builtin_abort (); \
+ }
+
int main ()
{
TEST_PLUS (TEST_REDUC_PLUS)
TEST_MAXMIN (TEST_REDUC_MAXMIN)
+ TEST_BITWISE (TEST_REDUC_BITWISE)
return 0;
}
Index: gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c 2017-11-17 09:06:21.395260303 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c 2017-11-17 09:49:36.193354637 +0000
@@ -56,6 +56,20 @@ #define TEST_REDUC_MAXMIN(TYPE, NAME, CM
} \
}
+#define TEST_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
+ { \
+ INIT_MATRIX (TYPE); \
+ reduc_##NAME##_##TYPE (mat, r, NROWS); \
+ for (int i = 0; i < NROWS; i++) \
+ { \
+ volatile TYPE r2 = mat[i][0]; \
+ for (int j = 0; j < NUM_ELEMS (TYPE); ++j) \
+ r2 BIT_OP mat[i][j]; \
+ if (r[i] != r2) \
+ __builtin_abort (); \
+ } \
+ }
+
int main ()
{
TEST_PLUS (TEST_REDUC_PLUS)
* Re: Add support for bitwise reductions
2017-11-17 10:10 Add support for bitwise reductions Richard Sandiford
@ 2017-11-22 18:28 ` Richard Sandiford
2017-12-14 0:37 ` Jeff Law
From: Richard Sandiford @ 2017-11-22 18:28 UTC (permalink / raw)
To: gcc-patches
Richard Sandiford <richard.sandiford@linaro.org> writes:
> This patch adds support for the SVE bitwise reduction instructions
> (ANDV, ORV and EORV). It's a fairly mechanical extension of existing
> REDUC_* operators.
>
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.
Here's an updated version that applies on top of the recent
removal of REDUC_*_EXPR. Tested as before.
Thanks,
Richard
2017-11-22 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* optabs.def (reduc_and_scal_optab, reduc_ior_scal_optab)
(reduc_xor_scal_optab): New optabs.
* doc/md.texi (reduc_and_scal_@var{m}, reduc_ior_scal_@var{m})
(reduc_xor_scal_@var{m}): Document.
* doc/sourcebuild.texi (vect_logical_reduc): Likewise.
* internal-fn.def (IFN_REDUC_AND, IFN_REDUC_IOR, IFN_REDUC_XOR): New
internal functions.
* fold-const-call.c (fold_const_call): Handle them.
* tree-vect-loop.c (reduction_fn_for_scalar_code): Return the new
internal functions for BIT_AND_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR.
* config/aarch64/aarch64-sve.md (reduc_<optab>_scal_<mode>)
(*reduc_<optab>_scal_<mode>): New patterns.
* config/aarch64/iterators.md (UNSPEC_ANDV, UNSPEC_IORV)
(UNSPEC_XORV): New unspecs.
(optab): Add entries for them.
(BITWISEV): New int iterator.
(bit_reduc_op): New int attribute.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_logical_reduc):
New proc.
* gcc.dg/vect/vect-reduc-or_1.c: Also run for vect_logical_reduc
and add an associated scan-dump test. Prevent vectorization
of the first two loops.
* gcc.dg/vect/vect-reduc-or_2.c: Likewise.
* gcc.target/aarch64/sve_reduc_1.c: Add AND, IOR and XOR reductions.
* gcc.target/aarch64/sve_reduc_2.c: Likewise.
* gcc.target/aarch64/sve_reduc_1_run.c: Likewise.
(INIT_VECTOR): Tweak initial value so that some bits are always set.
* gcc.target/aarch64/sve_reduc_2_run.c: Likewise.
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def 2017-11-22 18:05:58.624329338 +0000
+++ gcc/optabs.def 2017-11-22 18:06:54.516061226 +0000
@@ -292,6 +292,9 @@ OPTAB_D (reduc_smin_scal_optab, "reduc_s
OPTAB_D (reduc_plus_scal_optab, "reduc_plus_scal_$a")
OPTAB_D (reduc_umax_scal_optab, "reduc_umax_scal_$a")
OPTAB_D (reduc_umin_scal_optab, "reduc_umin_scal_$a")
+OPTAB_D (reduc_and_scal_optab, "reduc_and_scal_$a")
+OPTAB_D (reduc_ior_scal_optab, "reduc_ior_scal_$a")
+OPTAB_D (reduc_xor_scal_optab, "reduc_xor_scal_$a")
OPTAB_D (sdot_prod_optab, "sdot_prod$I$a")
OPTAB_D (ssum_widen_optab, "widen_ssum$I$a3")
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi 2017-11-22 18:05:58.620520950 +0000
+++ gcc/doc/md.texi 2017-11-22 18:06:54.515109580 +0000
@@ -5244,6 +5244,17 @@ Compute the sum of the elements of a vec
operand 0 is the scalar result, with mode equal to the mode of the elements of
the input vector.
+@cindex @code{reduc_and_scal_@var{m}} instruction pattern
+@item @samp{reduc_and_scal_@var{m}}
+@cindex @code{reduc_ior_scal_@var{m}} instruction pattern
+@itemx @samp{reduc_ior_scal_@var{m}}
+@cindex @code{reduc_xor_scal_@var{m}} instruction pattern
+@itemx @samp{reduc_xor_scal_@var{m}}
+Compute the bitwise @code{AND}/@code{IOR}/@code{XOR} reduction of the elements
+of a vector of mode @var{m}. Operand 1 is the vector input and operand 0
+is the scalar result. The mode of the scalar result is the same as one
+element of @var{m}.
+
@cindex @code{sdot_prod@var{m}} instruction pattern
@item @samp{sdot_prod@var{m}}
@cindex @code{udot_prod@var{m}} instruction pattern
Index: gcc/doc/sourcebuild.texi
===================================================================
--- gcc/doc/sourcebuild.texi 2017-11-22 18:05:58.621473047 +0000
+++ gcc/doc/sourcebuild.texi 2017-11-22 18:06:54.515109580 +0000
@@ -1570,6 +1570,9 @@ Target supports 16- and 8-bytes vectors.
@item vect_sizes_32B_16B
Target supports 32- and 16-bytes vectors.
+
+@item vect_logical_reduc
+Target supports AND, IOR and XOR reduction on vectors.
@end table
@subsubsection Thread Local Storage attributes
Index: gcc/internal-fn.def
===================================================================
--- gcc/internal-fn.def 2017-11-22 18:05:51.545487816 +0000
+++ gcc/internal-fn.def 2017-11-22 18:06:54.516061226 +0000
@@ -137,6 +137,12 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MAX,
reduc_smax_scal, reduc_umax_scal, unary)
DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MIN, ECF_CONST | ECF_NOTHROW, first,
reduc_smin_scal, reduc_umin_scal, unary)
+DEF_INTERNAL_OPTAB_FN (REDUC_AND, ECF_CONST | ECF_NOTHROW,
+ reduc_and_scal, unary)
+DEF_INTERNAL_OPTAB_FN (REDUC_IOR, ECF_CONST | ECF_NOTHROW,
+ reduc_ior_scal, unary)
+DEF_INTERNAL_OPTAB_FN (REDUC_XOR, ECF_CONST | ECF_NOTHROW,
+ reduc_xor_scal, unary)
/* Unary math functions. */
DEF_INTERNAL_FLT_FN (ACOS, ECF_CONST, acos, unary)
Index: gcc/fold-const-call.c
===================================================================
--- gcc/fold-const-call.c 2017-11-22 17:53:21.698058809 +0000
+++ gcc/fold-const-call.c 2017-11-22 18:06:54.516061226 +0000
@@ -1176,6 +1176,15 @@ fold_const_call (combined_fn fn, tree ty
case CFN_REDUC_MIN:
return fold_const_reduction (type, arg, MIN_EXPR);
+ case CFN_REDUC_AND:
+ return fold_const_reduction (type, arg, BIT_AND_EXPR);
+
+ case CFN_REDUC_IOR:
+ return fold_const_reduction (type, arg, BIT_IOR_EXPR);
+
+ case CFN_REDUC_XOR:
+ return fold_const_reduction (type, arg, BIT_XOR_EXPR);
+
default:
return fold_const_call_1 (fn, type, arg);
}
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c 2017-11-22 18:05:58.629089823 +0000
+++ gcc/tree-vect-loop.c 2017-11-22 18:06:54.517964518 +0000
@@ -2436,11 +2436,20 @@ reduction_fn_for_scalar_code (enum tree_
*reduc_fn = IFN_REDUC_PLUS;
return true;
- case MULT_EXPR:
- case MINUS_EXPR:
+ case BIT_AND_EXPR:
+ *reduc_fn = IFN_REDUC_AND;
+ return true;
+
case BIT_IOR_EXPR:
+ *reduc_fn = IFN_REDUC_IOR;
+ return true;
+
case BIT_XOR_EXPR:
- case BIT_AND_EXPR:
+ *reduc_fn = IFN_REDUC_XOR;
+ return true;
+
+ case MULT_EXPR:
+ case MINUS_EXPR:
*reduc_fn = IFN_LAST;
return true;
Index: gcc/config/aarch64/aarch64-sve.md
===================================================================
--- gcc/config/aarch64/aarch64-sve.md 2017-11-22 18:05:58.618616756 +0000
+++ gcc/config/aarch64/aarch64-sve.md 2017-11-22 18:06:54.514157934 +0000
@@ -1513,6 +1513,26 @@ (define_insn "*reduc_<maxmin_uns>_scal_<
"<maxmin_uns_op>v\t%<Vetype>0, %1, %2.<Vetype>"
)
+(define_expand "reduc_<optab>_scal_<mode>"
+ [(set (match_operand:<VEL> 0 "register_operand")
+ (unspec:<VEL> [(match_dup 2)
+ (match_operand:SVE_I 1 "register_operand")]
+ BITWISEV))]
+ "TARGET_SVE"
+ {
+ operands[2] = force_reg (<VPRED>mode, CONSTM1_RTX (<VPRED>mode));
+ }
+)
+
+(define_insn "*reduc_<optab>_scal_<mode>"
+ [(set (match_operand:<VEL> 0 "register_operand" "=w")
+ (unspec:<VEL> [(match_operand:<VPRED> 1 "register_operand" "Upl")
+ (match_operand:SVE_I 2 "register_operand" "w")]
+ BITWISEV))]
+ "TARGET_SVE"
+ "<bit_reduc_op>\t%<Vetype>0, %1, %2.<Vetype>"
+)
+
;; Unpredicated floating-point addition.
(define_expand "add<mode>3"
[(set (match_operand:SVE_F 0 "register_operand")
Index: gcc/config/aarch64/iterators.md
===================================================================
--- gcc/config/aarch64/iterators.md 2017-11-22 18:05:58.618616756 +0000
+++ gcc/config/aarch64/iterators.md 2017-11-22 18:06:54.514157934 +0000
@@ -409,6 +409,9 @@ (define_c_enum "unspec"
UNSPEC_SDOT ; Used in aarch64-simd.md.
UNSPEC_UDOT ; Used in aarch64-simd.md.
UNSPEC_SEL ; Used in aarch64-sve.md.
+ UNSPEC_ANDV ; Used in aarch64-sve.md.
+ UNSPEC_IORV ; Used in aarch64-sve.md.
+ UNSPEC_XORV ; Used in aarch64-sve.md.
UNSPEC_ANDF ; Used in aarch64-sve.md.
UNSPEC_IORF ; Used in aarch64-sve.md.
UNSPEC_XORF ; Used in aarch64-sve.md.
@@ -1318,6 +1321,8 @@ (define_int_iterator MAXMINV [UNSPEC_UMA
(define_int_iterator FMAXMINV [UNSPEC_FMAXV UNSPEC_FMINV
UNSPEC_FMAXNMV UNSPEC_FMINNMV])
+(define_int_iterator BITWISEV [UNSPEC_ANDV UNSPEC_IORV UNSPEC_XORV])
+
(define_int_iterator LOGICALF [UNSPEC_ANDF UNSPEC_IORF UNSPEC_XORF])
(define_int_iterator HADDSUB [UNSPEC_SHADD UNSPEC_UHADD
@@ -1437,7 +1442,10 @@ (define_int_attr atomic_ldop
;; name for consistency with the integer patterns.
(define_int_attr optab [(UNSPEC_ANDF "and")
(UNSPEC_IORF "ior")
- (UNSPEC_XORF "xor")])
+ (UNSPEC_XORF "xor")
+ (UNSPEC_ANDV "and")
+ (UNSPEC_IORV "ior")
+ (UNSPEC_XORV "xor")])
(define_int_attr maxmin_uns [(UNSPEC_UMAXV "umax")
(UNSPEC_UMINV "umin")
@@ -1465,6 +1473,10 @@ (define_int_attr maxmin_uns_op [(UNSPEC
(UNSPEC_FMAXNM "fmaxnm")
(UNSPEC_FMINNM "fminnm")])
+(define_int_attr bit_reduc_op [(UNSPEC_ANDV "andv")
+ (UNSPEC_IORV "orv")
+ (UNSPEC_XORV "eorv")])
+
;; The SVE logical instruction that implements an unspec.
(define_int_attr logicalf_op [(UNSPEC_ANDF "and")
(UNSPEC_IORF "orr")
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp 2017-11-22 18:05:58.626233532 +0000
+++ gcc/testsuite/lib/target-supports.exp 2017-11-22 18:06:54.517012872 +0000
@@ -7187,6 +7187,12 @@ proc check_effective_target_vect_call_ro
return $et_vect_call_roundf_saved($et_index)
}
+# Return 1 if the target supports AND, OR and XOR reduction.
+
+proc check_effective_target_vect_logical_reduc { } {
+ return [check_effective_target_aarch64_sve]
+}
+
# Return 1 if the target supports section-anchors
proc check_effective_target_section_anchors { } {
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c 2017-11-22 18:05:58.624329338 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c 2017-11-22 18:06:54.516061226 +0000
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target whole_vector_shift } */
+/* { dg-do run { target { whole_vector_shift || vect_logical_reduc } } } */
/* Write a reduction loop to be reduced using vector shifts. */
@@ -24,17 +24,17 @@ main (unsigned char argc, char **argv)
check_vect ();
for (i = 0; i < N; i++)
- in[i] = (i + i + 1) & 0xfd;
+ {
+ in[i] = (i + i + 1) & 0xfd;
+ asm volatile ("" ::: "memory");
+ }
for (i = 0; i < N; i++)
{
expected |= in[i];
- asm volatile ("");
+ asm volatile ("" ::: "memory");
}
- /* Prevent constant propagation of the entire loop below. */
- asm volatile ("" : : : "memory");
-
for (i = 0; i < N; i++)
sum |= in[i];
@@ -47,5 +47,5 @@ main (unsigned char argc, char **argv)
return 0;
}
-/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" } } */
-
+/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { ! vect_logical_reduc } } } } */
+/* { dg-final { scan-tree-dump "Reduce using direct vector reduction" "vect" { target vect_logical_reduc } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c 2017-11-22 18:05:58.625281435 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c 2017-11-22 18:06:54.516061226 +0000
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target whole_vector_shift } */
+/* { dg-do run { target { whole_vector_shift || vect_logical_reduc } } } */
/* Write a reduction loop to be reduced using vector shifts and folded. */
@@ -23,12 +23,15 @@ main (unsigned char argc, char **argv)
check_vect ();
for (i = 0; i < N; i++)
- in[i] = (i + i + 1) & 0xfd;
+ {
+ in[i] = (i + i + 1) & 0xfd;
+ asm volatile ("" ::: "memory");
+ }
for (i = 0; i < N; i++)
{
expected |= in[i];
- asm volatile ("");
+ asm volatile ("" ::: "memory");
}
for (i = 0; i < N; i++)
@@ -43,5 +46,5 @@ main (unsigned char argc, char **argv)
return 0;
}
-/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" } } */
-
+/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { ! vect_logical_reduc } } } } */
+/* { dg-final { scan-tree-dump "Reduce using direct vector reduction" "vect" { target vect_logical_reduc } } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_reduc_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_reduc_1.c 2017-11-22 18:05:58.625281435 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_reduc_1.c 2017-11-22 18:06:54.516061226 +0000
@@ -65,6 +65,46 @@ #define TEST_MAXMIN(T) \
TEST_MAXMIN (DEF_REDUC_MAXMIN)
+#define DEF_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
+TYPE __attribute__ ((noinline, noclone)) \
+reduc_##NAME##_##TYPE (TYPE *a, int n) \
+{ \
+ TYPE r = 13; \
+ for (int i = 0; i < n; ++i) \
+ r BIT_OP a[i]; \
+ return r; \
+}
+
+#define TEST_BITWISE(T) \
+ T (int8_t, and, &=) \
+ T (int16_t, and, &=) \
+ T (int32_t, and, &=) \
+ T (int64_t, and, &=) \
+ T (uint8_t, and, &=) \
+ T (uint16_t, and, &=) \
+ T (uint32_t, and, &=) \
+ T (uint64_t, and, &=) \
+ \
+ T (int8_t, ior, |=) \
+ T (int16_t, ior, |=) \
+ T (int32_t, ior, |=) \
+ T (int64_t, ior, |=) \
+ T (uint8_t, ior, |=) \
+ T (uint16_t, ior, |=) \
+ T (uint32_t, ior, |=) \
+ T (uint64_t, ior, |=) \
+ \
+ T (int8_t, xor, ^=) \
+ T (int16_t, xor, ^=) \
+ T (int32_t, xor, ^=) \
+ T (int64_t, xor, ^=) \
+ T (uint8_t, xor, ^=) \
+ T (uint16_t, xor, ^=) \
+ T (uint32_t, xor, ^=) \
+ T (uint64_t, xor, ^=)
+
+TEST_BITWISE (DEF_REDUC_BITWISE)
+
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
@@ -102,6 +142,12 @@ TEST_MAXMIN (DEF_REDUC_MAXMIN)
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 8 } } */
+
+/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 8 } } */
+
+/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 8 } } */
+
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
@@ -133,3 +179,18 @@ TEST_MAXMIN (DEF_REDUC_MAXMIN)
/* { dg-final { scan-assembler-times {\tfminnmv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tandv\tb[0-9]+, p[0-7], z[0-9]+\.b} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\th[0-9]+, p[0-7], z[0-9]+\.h} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\ts[0-9]+, p[0-7], z[0-9]+\.s} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\td[0-9]+, p[0-7], z[0-9]+\.d} 2 } } */
+
+/* { dg-final { scan-assembler-times {\torv\tb[0-9]+, p[0-7], z[0-9]+\.b} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\th[0-9]+, p[0-7], z[0-9]+\.h} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\ts[0-9]+, p[0-7], z[0-9]+\.s} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\td[0-9]+, p[0-7], z[0-9]+\.d} 2 } } */
+
+/* { dg-final { scan-assembler-times {\teorv\tb[0-9]+, p[0-7], z[0-9]+\.b} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\th[0-9]+, p[0-7], z[0-9]+\.h} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\ts[0-9]+, p[0-7], z[0-9]+\.s} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\td[0-9]+, p[0-7], z[0-9]+\.d} 2 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_reduc_2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_reduc_2.c 2017-11-22 18:05:58.625281435 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_reduc_2.c 2017-11-22 18:06:54.517012872 +0000
@@ -73,6 +73,49 @@ #define TEST_MAXMIN(T) \
TEST_MAXMIN (DEF_REDUC_MAXMIN)
+#define DEF_REDUC_BITWISE(TYPE,NAME,BIT_OP) \
+void __attribute__ ((noinline, noclone)) \
+reduc_##NAME##_##TYPE (TYPE (*restrict a)[NUM_ELEMS(TYPE)], \
+ TYPE *restrict r, int n) \
+{ \
+ for (int i = 0; i < n; i++) \
+ { \
+ r[i] = a[i][0]; \
+ for (int j = 0; j < NUM_ELEMS(TYPE); j++) \
+ r[i] BIT_OP a[i][j]; \
+ } \
+}
+
+#define TEST_BITWISE(T) \
+ T (int8_t, and, &=) \
+ T (int16_t, and, &=) \
+ T (int32_t, and, &=) \
+ T (int64_t, and, &=) \
+ T (uint8_t, and, &=) \
+ T (uint16_t, and, &=) \
+ T (uint32_t, and, &=) \
+ T (uint64_t, and, &=) \
+ \
+ T (int8_t, ior, |=) \
+ T (int16_t, ior, |=) \
+ T (int32_t, ior, |=) \
+ T (int64_t, ior, |=) \
+ T (uint8_t, ior, |=) \
+ T (uint16_t, ior, |=) \
+ T (uint32_t, ior, |=) \
+ T (uint64_t, ior, |=) \
+ \
+ T (int8_t, xor, ^=) \
+ T (int16_t, xor, ^=) \
+ T (int32_t, xor, ^=) \
+ T (int64_t, xor, ^=) \
+ T (uint8_t, xor, ^=) \
+ T (uint16_t, xor, ^=) \
+ T (uint32_t, xor, ^=) \
+ T (uint64_t, xor, ^=)
+
+TEST_BITWISE (DEF_REDUC_BITWISE)
+
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
@@ -104,3 +147,18 @@ TEST_MAXMIN (DEF_REDUC_MAXMIN)
/* { dg-final { scan-assembler-times {\tfminnmv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tandv\tb[0-9]+, p[0-7], z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tandv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\torv\tb[0-9]+, p[0-7], z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\torv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\teorv\tb[0-9]+, p[0-7], z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\teorv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 2 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c 2017-11-22 18:05:58.625281435 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c 2017-11-22 18:06:54.516061226 +0000
@@ -9,7 +9,7 @@ #define INIT_VECTOR(TYPE) \
TYPE a[NUM_ELEMS (TYPE) + 1]; \
for (int i = 0; i < NUM_ELEMS (TYPE) + 1; i++) \
{ \
- a[i] = (i * 2) * (i & 1 ? 1 : -1); \
+ a[i] = ((i * 2) * (i & 1 ? 1 : -1) | 3); \
asm volatile ("" ::: "memory"); \
}
@@ -35,10 +35,22 @@ #define TEST_REDUC_MAXMIN(TYPE, NAME, CM
__builtin_abort (); \
}
+#define TEST_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
+ { \
+ INIT_VECTOR (TYPE); \
+ TYPE r1 = reduc_##NAME##_##TYPE (a, NUM_ELEMS (TYPE)); \
+ volatile TYPE r2 = 13; \
+ for (int i = 0; i < NUM_ELEMS (TYPE); ++i) \
+ r2 BIT_OP a[i]; \
+ if (r1 != r2) \
+ __builtin_abort (); \
+ }
+
int main ()
{
TEST_PLUS (TEST_REDUC_PLUS)
TEST_MAXMIN (TEST_REDUC_MAXMIN)
+ TEST_BITWISE (TEST_REDUC_BITWISE)
return 0;
}
Index: gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c 2017-11-22 18:05:58.625281435 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c 2017-11-22 18:06:54.517012872 +0000
@@ -56,6 +56,20 @@ #define TEST_REDUC_MAXMIN(TYPE, NAME, CM
} \
}
+#define TEST_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
+ { \
+ INIT_MATRIX (TYPE); \
+ reduc_##NAME##_##TYPE (mat, r, NROWS); \
+ for (int i = 0; i < NROWS; i++) \
+ { \
+ volatile TYPE r2 = mat[i][0]; \
+ for (int j = 0; j < NUM_ELEMS (TYPE); ++j) \
+ r2 BIT_OP mat[i][j]; \
+ if (r[i] != r2) \
+ __builtin_abort (); \
+ } \
+ }
+
int main ()
{
TEST_PLUS (TEST_REDUC_PLUS)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Add support for bitwise reductions
2017-11-22 18:28 ` Richard Sandiford
@ 2017-12-14 0:37 ` Jeff Law
2018-01-07 17:03 ` James Greenhalgh
2018-01-24 21:25 ` Rainer Orth
0 siblings, 2 replies; 9+ messages in thread
From: Jeff Law @ 2017-12-14 0:37 UTC (permalink / raw)
To: gcc-patches, richard.sandiford
On 11/22/2017 11:12 AM, Richard Sandiford wrote:
> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> This patch adds support for the SVE bitwise reduction instructions
>> (ANDV, ORV and EORV). It's a fairly mechanical extension of existing
>> REDUC_* operators.
>>
>> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
>> and powerpc64le-linux-gnu.
>
> Here's an updated version that applies on top of the recent
> removal of REDUC_*_EXPR. Tested as before.
>
> Thanks,
> Richard
>
>
> 2017-11-22 Richard Sandiford <richard.sandiford@linaro.org>
> Alan Hayward <alan.hayward@arm.com>
> David Sherwood <david.sherwood@arm.com>
>
> gcc/
> * optabs.def (reduc_and_scal_optab, reduc_ior_scal_optab)
> (reduc_xor_scal_optab): New optabs.
> * doc/md.texi (reduc_and_scal_@var{m}, reduc_ior_scal_@var{m})
> (reduc_xor_scal_@var{m}): Document.
> * doc/sourcebuild.texi (vect_logical_reduc): Likewise.
> * internal-fn.def (IFN_REDUC_AND, IFN_REDUC_IOR, IFN_REDUC_XOR): New
> internal functions.
> * fold-const-call.c (fold_const_call): Handle them.
> * tree-vect-loop.c (reduction_fn_for_scalar_code): Return the new
> internal functions for BIT_AND_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR.
> * config/aarch64/aarch64-sve.md (reduc_<bit_reduc>_scal_<mode>):
> (*reduc_<bit_reduc>_scal_<mode>): New patterns.
> * config/aarch64/iterators.md (UNSPEC_ANDV, UNSPEC_ORV)
> (UNSPEC_XORV): New unspecs.
> (optab): Add entries for them.
> (BITWISEV): New int iterator.
> (bit_reduc_op): New int attributes.
>
> gcc/testsuite/
> * lib/target-supports.exp (check_effective_target_vect_logical_reduc):
> New proc.
> * gcc.dg/vect/vect-reduc-or_1.c: Also run for vect_logical_reduc
> and add an associated scan-dump test. Prevent vectorization
> of the first two loops.
> * gcc.dg/vect/vect-reduc-or_2.c: Likewise.
> * gcc.target/aarch64/sve_reduc_1.c: Add AND, IOR and XOR reductions.
> * gcc.target/aarch64/sve_reduc_2.c: Likewise.
> * gcc.target/aarch64/sve_reduc_1_run.c: Likewise.
> (INIT_VECTOR): Tweak initial value so that some bits are always set.
> * gcc.target/aarch64/sve_reduc_2_run.c: Likewise.
OK.
Jeff
* Re: Add support for bitwise reductions
2017-12-14 0:37 ` Jeff Law
@ 2018-01-07 17:03 ` James Greenhalgh
2018-01-24 21:25 ` Rainer Orth
1 sibling, 0 replies; 9+ messages in thread
From: James Greenhalgh @ 2018-01-07 17:03 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches, richard.sandiford, nd
On Thu, Dec 14, 2017 at 12:36:58AM +0000, Jeff Law wrote:
> On 11/22/2017 11:12 AM, Richard Sandiford wrote:
> > Richard Sandiford <richard.sandiford@linaro.org> writes:
> > [...]
> OK.
> Jeff
I'm also OK with the AArch64 parts.
James
* Re: Add support for bitwise reductions
2017-12-14 0:37 ` Jeff Law
2018-01-07 17:03 ` James Greenhalgh
@ 2018-01-24 21:25 ` Rainer Orth
2018-01-25 11:18 ` Richard Sandiford
1 sibling, 1 reply; 9+ messages in thread
From: Rainer Orth @ 2018-01-24 21:25 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches, richard.sandiford
Jeff Law <law@redhat.com> writes:
> On 11/22/2017 11:12 AM, Richard Sandiford wrote:
>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> [...]
> OK.
> Jeff
Two tests have regressed on sparc-sun-solaris2.*:
+FAIL: gcc.dg/vect/vect-reduc-or_1.c -flto -ffat-lto-objects scan-tree-dump vect "Reduce using vector shifts"
+FAIL: gcc.dg/vect/vect-reduc-or_1.c scan-tree-dump vect "Reduce using vector shifts"
+FAIL: gcc.dg/vect/vect-reduc-or_2.c -flto -ffat-lto-objects scan-tree-dump vect "Reduce using vector shifts"
+FAIL: gcc.dg/vect/vect-reduc-or_2.c scan-tree-dump vect "Reduce using vector shifts"
Rainer
--
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
* Re: Add support for bitwise reductions
2018-01-24 21:25 ` Rainer Orth
@ 2018-01-25 11:18 ` Richard Sandiford
2018-01-26 9:40 ` Christophe Lyon
0 siblings, 1 reply; 9+ messages in thread
From: Richard Sandiford @ 2018-01-25 11:18 UTC (permalink / raw)
To: Rainer Orth; +Cc: Jeff Law, gcc-patches
Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
> Jeff Law <law@redhat.com> writes:
>> On 11/22/2017 11:12 AM, Richard Sandiford wrote:
>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>> [...]
>> OK.
>> Jeff
>
> Two tests have regressed on sparc-sun-solaris2.*:
>
> +FAIL: gcc.dg/vect/vect-reduc-or_1.c -flto -ffat-lto-objects
> scan-tree-dump vect "Reduce using vector shifts"
> +FAIL: gcc.dg/vect/vect-reduc-or_1.c scan-tree-dump vect "Reduce using
> vector shifts"
> +FAIL: gcc.dg/vect/vect-reduc-or_2.c -flto -ffat-lto-objects
> scan-tree-dump vect "Reduce using vector shifts"
> +FAIL: gcc.dg/vect/vect-reduc-or_2.c scan-tree-dump vect "Reduce using
> vector shifts"
Bah, I think I broke this yesterday in:
2018-01-24 Richard Sandiford <richard.sandiford@linaro.org>
PR testsuite/83889
[...]
* gcc.dg/vect/vect-reduc-or_1.c: Remove conditional dg-do run.
* gcc.dg/vect/vect-reduc-or_2.c: Likewise.
(r257022), which removed:
/* { dg-do run { target { whole_vector_shift || vect_logical_reduc } } } */
I'd somehow thought that the dump lines in these two tests were already
guarded, but they weren't.
Tested on aarch64-linux-gnu and x86_64-linux-gnu and applied as obvious.
Sorry for the breakage.
Richard
2018-01-25 Richard Sandiford <richard.sandiford@linaro.org>
gcc/testsuite/
* gcc.dg/vect/vect-reduc-or_1.c: Require whole_vector_shift for
the shift dump line.
* gcc.dg/vect/vect-reduc-or_2.c: Likewise.
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c 2018-01-24 16:22:31.724089913 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c 2018-01-25 10:16:16.283500281 +0000
@@ -45,5 +45,5 @@ main (unsigned char argc, char **argv)
return 0;
}
-/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { ! vect_logical_reduc } } } } */
+/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { whole_vector_shift && { ! vect_logical_reduc } } } } } */
/* { dg-final { scan-tree-dump "Reduce using direct vector reduction" "vect" { target vect_logical_reduc } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c 2018-01-24 16:22:31.724089913 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c 2018-01-25 10:16:16.284500239 +0000
@@ -44,5 +44,5 @@ main (unsigned char argc, char **argv)
return 0;
}
-/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { ! vect_logical_reduc } } } } */
+/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { whole_vector_shift && { ! vect_logical_reduc } } } } } */
/* { dg-final { scan-tree-dump "Reduce using direct vector reduction" "vect" { target vect_logical_reduc } } } */
* Re: Add support for bitwise reductions
2018-01-25 11:18 ` Richard Sandiford
@ 2018-01-26 9:40 ` Christophe Lyon
2018-01-26 10:25 ` Richard Sandiford
0 siblings, 1 reply; 9+ messages in thread
From: Christophe Lyon @ 2018-01-26 9:40 UTC (permalink / raw)
To: Rainer Orth, Jeff Law, gcc Patches, Richard Sandiford
On 25 January 2018 at 11:24, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
>> Jeff Law <law@redhat.com> writes:
>>> On 11/22/2017 11:12 AM, Richard Sandiford wrote:
>>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>>> [...]
>>> OK.
>>> Jeff
>>
>> Two tests have regressed on sparc-sun-solaris2.*:
>>
>> +FAIL: gcc.dg/vect/vect-reduc-or_1.c -flto -ffat-lto-objects scan-tree-dump vect "Reduce using vector shifts"
>> +FAIL: gcc.dg/vect/vect-reduc-or_1.c scan-tree-dump vect "Reduce using vector shifts"
>> +FAIL: gcc.dg/vect/vect-reduc-or_2.c -flto -ffat-lto-objects scan-tree-dump vect "Reduce using vector shifts"
>> +FAIL: gcc.dg/vect/vect-reduc-or_2.c scan-tree-dump vect "Reduce using vector shifts"
>
> Bah, I think I broke this yesterday in:
>
> 2018-01-24 Richard Sandiford <richard.sandiford@linaro.org>
>
> PR testsuite/83889
> [...]
> * gcc.dg/vect/vect-reduc-or_1.c: Remove conditional dg-do run.
> * gcc.dg/vect/vect-reduc-or_2.c: Likewise.
>
> (r257022), which removed:
>
> /* { dg-do run { target { whole_vector_shift || vect_logical_reduc } } } */
>
> I'd somehow thought that the dump lines in these two tests were already
> guarded, but they weren't.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu and applied as obvious.
> Sorry for the breakage.
>
> Richard
>
>
Hi Richard,
While this fixes the regression on armeb (same as on sparc), the
effect on arm-none-linux-gnueabi and arm-none-eabi
is that the tests are now skipped, while they used to pass.
Is this expected? Or is the guard you added too restrictive?
Thanks,
Christophe
> 2018-01-25 Richard Sandiford <richard.sandiford@linaro.org>
>
> gcc/testsuite/
> * gcc.dg/vect/vect-reduc-or_1.c: Require whole_vector_shift for
> the shift dump line.
> * gcc.dg/vect/vect-reduc-or_2.c: Likewise.
>
> Index: gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c 2018-01-24 16:22:31.724089913 +0000
> +++ gcc/testsuite/gcc.dg/vect/vect-reduc-or_1.c 2018-01-25 10:16:16.283500281 +0000
> @@ -45,5 +45,5 @@ main (unsigned char argc, char **argv)
> return 0;
> }
>
> -/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { ! vect_logical_reduc } } } } */
> +/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { whole_vector_shift && { ! vect_logical_reduc } } } } } */
> /* { dg-final { scan-tree-dump "Reduce using direct vector reduction" "vect" { target vect_logical_reduc } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c 2018-01-24 16:22:31.724089913 +0000
> +++ gcc/testsuite/gcc.dg/vect/vect-reduc-or_2.c 2018-01-25 10:16:16.284500239 +0000
> @@ -44,5 +44,5 @@ main (unsigned char argc, char **argv)
> return 0;
> }
>
> -/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { ! vect_logical_reduc } } } } */
> +/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" { target { whole_vector_shift && { ! vect_logical_reduc } } } } } */
> /* { dg-final { scan-tree-dump "Reduce using direct vector reduction" "vect" { target vect_logical_reduc } } } */
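[Editor's note: for context, the vect-reduc-or tests discussed above exercise a bitwise OR reduction loop of roughly the following shape. This is a sketch, not the exact testsuite source; with the new reduc_ior_scal optab the vectorizer can reduce it with a single ORV-style instruction ("Reduce using direct vector reduction"), while targets without it fall back to the shift-based strategy, hence the whole_vector_shift guard.]

```c
#include <stddef.h>

/* Bitwise OR reduction of the kind vect-reduc-or_1.c exercises.
   On SVE the loop body vectorizes and the final reduction becomes
   one ORV instruction; on other targets the vectorizer reduces the
   accumulator vector using whole-vector shifts, if available.  */
unsigned char
or_reduce (const unsigned char *in, size_t n)
{
  unsigned char res = 0;
  for (size_t i = 0; i < n; i++)
    res |= in[i];
  return res;
}
```

The runtime tests then check the scalar result against this loop executed unvectorized.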
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Add support for bitwise reductions
2018-01-26 9:40 ` Christophe Lyon
@ 2018-01-26 10:25 ` Richard Sandiford
2018-01-26 10:46 ` Christophe Lyon
0 siblings, 1 reply; 9+ messages in thread
From: Richard Sandiford @ 2018-01-26 10:25 UTC (permalink / raw)
To: Christophe Lyon; +Cc: Rainer Orth, Jeff Law, gcc Patches
Christophe Lyon <christophe.lyon@linaro.org> writes:
> On 25 January 2018 at 11:24, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
>>> Jeff Law <law@redhat.com> writes:
>>>> On 11/22/2017 11:12 AM, Richard Sandiford wrote:
>>>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>>>> [...]
>>>> OK.
>>>> Jeff
>>>
>>> Two tests have regressed on sparc-sun-solaris2.*:
>>>
>>> +FAIL: gcc.dg/vect/vect-reduc-or_1.c -flto -ffat-lto-objects scan-tree-dump vect "Reduce using vector shifts"
>>> +FAIL: gcc.dg/vect/vect-reduc-or_1.c scan-tree-dump vect "Reduce using vector shifts"
>>> +FAIL: gcc.dg/vect/vect-reduc-or_2.c -flto -ffat-lto-objects scan-tree-dump vect "Reduce using vector shifts"
>>> +FAIL: gcc.dg/vect/vect-reduc-or_2.c scan-tree-dump vect "Reduce using vector shifts"
>>
>> Bah, I think I broke this yesterday in:
>>
>> 2018-01-24 Richard Sandiford <richard.sandiford@linaro.org>
>>
>> PR testsuite/83889
>> [...]
>> * gcc.dg/vect/vect-reduc-or_1.c: Remove conditional dg-do run.
>> * gcc.dg/vect/vect-reduc-or_2.c: Likewise.
>>
>> (r257022), which removed:
>>
>> /* { dg-do run { target { whole_vector_shift || vect_logical_reduc } } } */
>>
>> I'd somehow thought that the dump lines in these two tests were already
>> guarded, but they weren't.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu and applied as obvious.
>> Sorry for the breakage.
>>
>> Richard
>>
>>
>
> Hi Richard,
>
> While this fixes the regression on armeb (same as on sparc), the
> effect on arm-none-linux-gnueabi and arm-none-eabi
> is that the tests are now skipped, while they used to pass.
> Is this expected? Or is the guard you added too restrictive?
I think that means that the tests went from UNSUPPORTED to PASS on
the last two targets with r257022. Is that right?
It's expected in the sense that whole_vector_shift isn't true for
any arm*-*-* target, and historically this test was restricted to
whole_vector_shift (apart from the blip this week).
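[Editor's note: the "Reduce using vector shifts" strategy mentioned above repeatedly ORs the top half of the vector into the bottom half, halving the active width each step. A scalar illustration of that log2-step idea, using a 64-bit word to stand in for a vector of eight 8-bit lanes — not actual vectorizer output:]

```c
#include <stdint.h>

/* Log2-step reduction: fold the upper half of the "vector" into the
   lower half, halving the active width each round.  Real vectorizer
   code does this with whole-vector shifts, which is why the tests
   require the whole_vector_shift effective target.  */
uint8_t
or_reduce_lanes (uint64_t v)
{
  v |= v >> 32;  /* fold lanes 4-7 into lanes 0-3 */
  v |= v >> 16;  /* fold lanes 2-3 into lanes 0-1 */
  v |= v >> 8;   /* fold lane 1 into lane 0 */
  return (uint8_t) v;  /* lane 0 now holds the OR of all lanes */
}
```

A direct-reduction instruction such as SVE's ORV replaces this whole sequence with one operation.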
Thanks,
Richard
> Thanks,
>
> Christophe
>
>> [...]
* Re: Add support for bitwise reductions
2018-01-26 10:25 ` Richard Sandiford
@ 2018-01-26 10:46 ` Christophe Lyon
0 siblings, 0 replies; 9+ messages in thread
From: Christophe Lyon @ 2018-01-26 10:46 UTC (permalink / raw)
To: Christophe Lyon, Rainer Orth, Jeff Law, gcc Patches, Richard Sandiford
On 26 January 2018 at 10:33, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Christophe Lyon <christophe.lyon@linaro.org> writes:
>> On 25 January 2018 at 11:24, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
>>>> Jeff Law <law@redhat.com> writes:
>>>>> [...]
>>
>> Hi Richard,
>>
>> While this fixes the regression on armeb (same as on sparc), the
>> effect on arm-none-linux-gnueabi and arm-none-eabi
>> is that the tests are now skipped, while they used to pass.
>> Is this expected? Or is the guard you added too restrictive?
>
> I think that means that the tests went from UNSUPPORTED to PASS on
> the last two targets with r257022. Is that right?
>
Yes, that's what I meant.
> It's expected in the sense that whole_vector_shift isn't true for
> any arm*-*-* target, and historically this test was restricted to
> whole_vector_shift (apart from the blip this week).
>
OK, then. Just surprising to see PASS disappear.
Thanks,
Christophe
> Thanks,
> Richard
>
>> Thanks,
>>
>> Christophe
>>
>>> [...]
Thread overview: 9+ messages
2017-11-17 10:10 Add support for bitwise reductions Richard Sandiford
2017-11-22 18:28 ` Richard Sandiford
2017-12-14 0:37 ` Jeff Law
2018-01-07 17:03 ` James Greenhalgh
2018-01-24 21:25 ` Rainer Orth
2018-01-25 11:18 ` Richard Sandiford
2018-01-26 9:40 ` Christophe Lyon
2018-01-26 10:25 ` Richard Sandiford
2018-01-26 10:46 ` Christophe Lyon