From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: rguenther@suse.de
Subject: [PATCH] gimple-isel: Check whether IFN_VCONDEQ is supported [PR98560]
Date: Wed, 06 Jan 2021 15:53:29 +0000

This patch follows on from the previous one for the PR and makes sure that
we can handle == as well as <.  Previously we assumed without checking
that IFN_VCONDEQ was available if IFN_VCOND or IFN_VCONDU wasn't.

The patch also fixes the definition of the IFN_VCOND* functions.
The optabs are convert optabs in which the first mode is the
data mode and the second mode is the comparison or mask mode.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
	PR tree-optimization/98560
	* internal-fn.def (IFN_VCONDU, IFN_VCONDEQ): Use type vec_cond.
	* internal-fn.c (vec_cond_mask_direct): Get the data mode from
	argument 1.
	(vec_cond_direct): Likewise argument 2.
	(vec_condu_direct, vec_condeq_direct): Delete.
	(expand_vect_cond_optab_fn): Rename to...
	(expand_vec_cond_optab_fn): ...this, replacing old macro.
	(expand_vec_condu_optab_fn, expand_vec_condeq_optab_fn): Delete.
	(expand_vect_cond_mask_optab_fn): Rename to...
	(expand_vec_cond_mask_optab_fn): ...this, replacing old macro.
	(direct_vec_cond_mask_optab_supported_p): Treat the optab as a
	convert optab.
	(direct_vec_cond_optab_supported_p): Likewise.
	(direct_vec_condu_optab_supported_p): Delete.
	(direct_vec_condeq_optab_supported_p): Delete.
	* gimple-isel.cc: Include internal-fn.h.
	(gimple_expand_vec_cond_expr): Check that IFN_VCONDEQ is supported
	before using it.

gcc/testsuite/
	PR tree-optimization/98560
	* gcc.dg/vect/pr98560-2.c: New test.
---
 gcc/gimple-isel.cc                    |  6 +++++-
 gcc/internal-fn.c                     | 22 ++++++----------------
 gcc/internal-fn.def                   |  4 ++--
 gcc/testsuite/gcc.dg/vect/pr98560-2.c | 17 +++++++++++++++++
 4 files changed, 30 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr98560-2.c

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 9c07d79a86c..3ca29191c24 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "memmodel.h"
 #include "optabs.h"
 #include "gimple-fold.h"
+#include "internal-fn.h"
 
 /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
    internal function based on vector type of selected expansion.
@@ -246,7 +247,10 @@ gimple_expand_vec_cond_expr (gimple_stmt_iterator *gsi,
              Try changing it to NE_EXPR.  */
           tcode = NE_EXPR;
         }
-      if (tcode == EQ_EXPR || tcode == NE_EXPR)
+      if ((tcode == EQ_EXPR || tcode == NE_EXPR)
+          && direct_internal_fn_supported_p (IFN_VCONDEQ, TREE_TYPE (lhs),
+                                             TREE_TYPE (op0a),
+                                             OPTIMIZE_FOR_BOTH))
         {
           tree tcode_tree = build_int_cst (integer_type_node, tcode);
           return gimple_build_call_internal (IFN_VCONDEQ, 5, op0a, op0b, op1,
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 996f0fb6c67..dd7173126fb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -110,10 +110,8 @@ init_internal_fns ()
 #define mask_store_direct { 3, 2, false }
 #define store_lanes_direct { 0, 0, false }
 #define mask_store_lanes_direct { 0, 0, false }
-#define vec_cond_mask_direct { 0, 0, false }
-#define vec_cond_direct { 0, 0, false }
-#define vec_condu_direct { 0, 0, false }
-#define vec_condeq_direct { 0, 0, false }
+#define vec_cond_mask_direct { 1, 0, false }
+#define vec_cond_direct { 2, 0, false }
 #define scatter_store_direct { 3, 1, false }
 #define len_store_direct { 3, 3, false }
 #define vec_set_direct { 3, 3, false }
@@ -2766,7 +2764,7 @@ expand_partial_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
    The expansion of STMT happens based on OPTAB table associated.  */
 
 static void
-expand_vect_cond_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+expand_vec_cond_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   class expand_operand ops[6];
   insn_code icode;
@@ -2802,15 +2800,11 @@ expand_vect_cond_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   emit_move_insn (target, ops[0].value);
 }
 
-#define expand_vec_cond_optab_fn expand_vect_cond_optab_fn
-#define expand_vec_condu_optab_fn expand_vect_cond_optab_fn
-#define expand_vec_condeq_optab_fn expand_vect_cond_optab_fn
-
 /* Expand VCOND_MASK optab internal function.
    The expansion of STMT happens based on OPTAB table associated.  */
 
 static void
-expand_vect_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   class expand_operand ops[4];
 
@@ -2844,8 +2838,6 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   emit_move_insn (target, ops[0].value);
 }
 
-#define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn
-
 /* Expand VEC_SET internal functions.  */
 
 static void
@@ -3570,10 +3562,8 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_mask_store_optab_supported_p convert_optab_supported_p
 #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_store_lanes_optab_supported_p multi_vector_optab_supported_p
-#define direct_vec_cond_mask_optab_supported_p multi_vector_optab_supported_p
-#define direct_vec_cond_optab_supported_p multi_vector_optab_supported_p
-#define direct_vec_condu_optab_supported_p multi_vector_optab_supported_p
-#define direct_vec_condeq_optab_supported_p multi_vector_optab_supported_p
+#define direct_vec_cond_mask_optab_supported_p convert_optab_supported_p
+#define direct_vec_cond_optab_supported_p convert_optab_supported_p
 #define direct_scatter_store_optab_supported_p convert_optab_supported_p
 #define direct_len_store_optab_supported_p direct_optab_supported_p
 #define direct_while_optab_supported_p convert_optab_supported_p
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 9abf8043cca..19016ce109f 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -141,8 +141,8 @@ DEF_INTERNAL_OPTAB_FN (MASK_STORE_LANES, 0,
                        vec_mask_store_lanes, mask_store_lanes)
 
 DEF_INTERNAL_OPTAB_FN (VCOND, 0, vcond, vec_cond)
-DEF_INTERNAL_OPTAB_FN (VCONDU, 0, vcondu, vec_condu)
-DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_condeq)
+DEF_INTERNAL_OPTAB_FN (VCONDU, 0, vcondu, vec_cond)
+DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond)
 DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
 
 DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
diff --git a/gcc/testsuite/gcc.dg/vect/pr98560-2.c b/gcc/testsuite/gcc.dg/vect/pr98560-2.c
new file mode 100644
index 00000000000..7759a5e8202
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr98560-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -fno-tree-vrp -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fvect-cost-model=dynamic" } */
+/* { dg-additional-options "-msve-vector-bits=128" { target aarch64_sve } } */
+
+#include <stdint.h>
+
+void
+f (uint16_t *restrict dst, uint32_t *restrict src1, float *restrict src2)
+{
+  int i = 0;
+  for (int j = 0; j < 4; ++j)
+    {
+      uint16_t tmp = src1[i] >> 1;
+      dst[i] = (uint16_t) (src2[i] == 0 && i < 4 ? tmp : 1);
+      i += 1;
+    }
+}
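
As a footnote to the description at the top of this message, here is a minimal standalone C sketch of the "check before use" pattern that the gimple-isel.cc hunk adds. It is a toy model and not GCC code: every name in it (toy_mode, toy_fn, toy_target, toy_supported_p, toy_select, USE_FALLBACK) is hypothetical, and the fallback path is only an assumed placeholder for whatever the caller does when no conditional pattern is available. The support table is keyed on a pair of modes, data mode first and comparison mode second, to mirror the convert-optab ordering described above.

```c
#include <stdbool.h>

/* Toy stand-ins for machine modes and internal functions.
   These are illustrative names only, not GCC identifiers.  */
enum toy_mode { TOY_V4SI, TOY_V4SF, TOY_V8HI, TOY_NUM_MODES };
enum toy_fn { TOY_VCOND, TOY_VCONDU, TOY_VCONDEQ, TOY_NUM_FNS };
enum toy_choice { USE_VCOND, USE_VCONDU, USE_VCONDEQ, USE_FALLBACK };

/* Per-target support table, indexed as
   supported[fn][data_mode][cmp_mode]: the data mode comes first and
   the comparison mode second.  */
struct toy_target
{
  bool supported[TOY_NUM_FNS][TOY_NUM_MODES][TOY_NUM_MODES];
};

/* Return true if FN is available for this pair of modes.  */
static bool
toy_supported_p (const struct toy_target *t, enum toy_fn fn,
                 enum toy_mode data_mode, enum toy_mode cmp_mode)
{
  return t->supported[fn][data_mode][cmp_mode];
}

/* Pick a way to expand a vector conditional.  The equality variant is
   used only after it has been checked explicitly; it is never assumed
   to exist just because the ordered variants are missing.  */
static enum toy_choice
toy_select (const struct toy_target *t, enum toy_mode data_mode,
            enum toy_mode cmp_mode, bool unsigned_cmp, bool is_eq_or_ne)
{
  enum toy_fn ordered = unsigned_cmp ? TOY_VCONDU : TOY_VCOND;

  if (toy_supported_p (t, ordered, data_mode, cmp_mode))
    return unsigned_cmp ? USE_VCONDU : USE_VCOND;

  if (is_eq_or_ne
      && toy_supported_p (t, TOY_VCONDEQ, data_mode, cmp_mode))
    return USE_VCONDEQ;

  /* Assumed placeholder: some other strategy chosen by the caller.  */
  return USE_FALLBACK;
}
```

With a zero-initialized toy_target in which only the TOY_VCONDEQ entry for (TOY_V4SI, TOY_V4SI) is set, toy_select returns USE_VCONDEQ for an == comparison on that pair and USE_FALLBACK for any other pair, rather than blindly choosing the equality variant.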