From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 15871 invoked by alias); 21 Aug 2015 12:30:42 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 15854 invoked by uid 89); 21 Aug 2015 12:30:37 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.2 required=5.0 tests=AWL,BAYES_50,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-io0-f170.google.com Received: from mail-io0-f170.google.com (HELO mail-io0-f170.google.com) (209.85.223.170) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Fri, 21 Aug 2015 12:30:35 +0000 Received: by iodv127 with SMTP id v127so79432634iod.3 for ; Fri, 21 Aug 2015 05:30:33 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.107.26.213 with SMTP id a204mr8512313ioa.147.1440160233456; Fri, 21 Aug 2015 05:30:33 -0700 (PDT) Received: by 10.36.202.66 with HTTP; Fri, 21 Aug 2015 05:30:33 -0700 (PDT) In-Reply-To: <55D61F54.1090802@redhat.com> References: <20150817162247.GA12565@msticlxl57.ims.intel.com> <55D61F54.1090802@redhat.com> Date: Fri, 21 Aug 2015 12:49:00 -0000 Message-ID: Subject: Re: [RFC][Scalar masks 1/x] Introduce GEN_MASK_EXPR. From: Ilya Enkovich To: Jeff Law Cc: gcc-patches Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2015-08/txt/msg01302.txt.bz2 2015-08-20 21:41 GMT+03:00 Jeff Law : > On 08/17/2015 10:22 AM, Ilya Enkovich wrote: >> >> Hi, >> >> This patch starts a series introducing scalar masks support in the >> vectorizer. It was discussed on the recent Cauldron and changes overiew is >> available here: >> https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Vectorization+for+Intel+AVX-512.pdf. >> Here is shortly a list of changes introduced by this series: >> >> - Add new tree expr to produce scalar masks in a vectorized code >> - Fix-up if-conversion to use bool predicates instead of integer masks >> - Disable some bool patterns to avoid bool to int conversion where masks >> can be used >> - Support bool operands in vectorization factor computation >> - Support scalar masks in MASK_LOAD, MASK_STORE and VEC_COND_EXPR by >> adding new optabs >> - Support vectorization for statements which are now not transformed by >> bool patterns >> - Add target support (hooks, optabs, expands) >> >> This patch introduces GEN_MASK_EXPR code. Intitially I wanted to use a >> comparison as an operand for it directly mapping it into AVX-512 comparison >> instruction. But a feedback was to simplify new code's semantics and use it >> for converting vectors into scalar masks. Therefore if we want to compare >> two vectors into a scalar masks we use two statements: >> >> vect.18_87 = vect__5.13_81 > vect__6.16_86; >> mask__ifc__23.17_88 = GEN_MASK ; >> >> Trying it in practice I found it producing worse code. The problem is that >> on target first comparison is expanded into two instructions: cmp with mask >> result + masked move to get a vector. GEN_MASK is then expanded into another >> comparison with zero vector. Thus I get two comparisons + move instead of a >> single comparison and have to optimize this out on a target side (current >> optimizers can't handle it). That's actually what I wanted to avoid. For >> now I changed GEN_MASK_EXPR to get a vector value as an operand but didn't >> change expand pattern which has four opernads: two vectors to compare + cmp >> operator + result. On expand I try to detect GEN_MASK uses a result of >> comparison and thus avoid double comparison generation. >> >> Patch series is not actually fully finished yet. I still have several >> type conversion tests not being vectorized and it wasn't widely tested. >> That's what I'm working on now. >> >> Will be glad to any comments. >> >> Thanks, >> Ilya >> -- >> 2015-08-17 Ilya Enkovich >> >> * expr.c (expand_expr_real_2): Support GEN_MASK_EXPR. >> * gimple-pretty-print.c (dump_unary_rhs): Likewise. >> * gimple.c (get_gimple_rhs_num_ops): Likewise. >> * optabs.c: Include gimple.h. >> (vector_compare_rtx): Add OPNO arg. >> (get_gen_mask_icode): New. >> (expand_gen_mask_expr_p): New. >> (expand_gen_mask_expr): New. >> (expand_vec_cond_expr): Adjust vector_compare_rtx call. >> * optabs.def (gen_mask_optab): New. >> (gen_masku_optab): New. >> * optabs.h (expand_gen_mask_expr_p): New. >> (expand_gen_mask_expr): New. >> * tree-cfg.c (verify_gimple_assign_unary): Support GEN_MASK_EXPR. >> * tree-inline.c (estimate_operator_cost): Likewise. >> * tree-pretty-print.c (dump_generic_node): Likewise. >> * tree-ssa-operands.c (get_expr_operands): Likewise. >> * tree.def (GEN_MASK_EXPR): New. > > A general question, would any of this likely help Yuri's work to optimize > MASK_STORES? If I remember correctly his optimization is for Haswell which doesn't have scalar masks. But doing it with scalar masks would be simpler because zero mask check becomes trivial. > >> >> >> diff --git a/gcc/optabs.c b/gcc/optabs.c >> index a6ca706..bf466ca 100644 >> --- a/gcc/optabs.c >> +++ b/gcc/optabs.c >> @@ -51,6 +51,7 @@ along with GCC; see the file COPYING3. If not see >> #include "recog.h" >> #include "reload.h" >> #include "target.h" >> +#include "gimple.h" > > Hmm, part of me doesn't want to see optabs.c depending on gimple.h. > > How painful would it be to have this stuff live in expr.c? > Should be possible to refactor it somehow. > >> + >> +/* Generate insns for a GEN_MASK_EXPR, given its TYPE and operand. */ >> + >> +rtx >> +expand_gen_mask_expr (tree type, tree op0, rtx target) >> +{ >> + struct expand_operand ops[4]; >> + enum insn_code icode; >> + rtx comparison; >> + machine_mode mode = TYPE_MODE (type); >> + machine_mode cmp_op_mode; >> + bool unsignedp; >> + tree op0a, op0b; >> + enum tree_code tcode; >> + gimple def_stmt; >> + >> + /* Avoid double comparison. */ >> + if (TREE_CODE (op0) == SSA_NAME >> + && (def_stmt = SSA_NAME_DEF_STMT (op0)) >> + && is_gimple_assign (def_stmt) >> + && TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt)) == >> tcc_comparison) >> + { >> + op0a = gimple_assign_rhs1 (def_stmt); >> + op0b = gimple_assign_rhs2 (def_stmt); >> + tcode = gimple_assign_rhs_code (def_stmt); >> + } >> + else >> + { >> + op0a = op0; >> + op0b = build_zero_cst (TREE_TYPE (op0)); >> + tcode = NE_EXPR; >> + } >> + >> + unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a)); >> + cmp_op_mode = TYPE_MODE (TREE_TYPE (op0a)); >> + >> + gcc_assert (GET_MODE_BITSIZE (mode) >= GET_MODE_NUNITS (cmp_op_mode)); >> + >> + icode = get_gen_mask_icode (cmp_op_mode, unsignedp); >> + if (icode == CODE_FOR_nothing) >> + return 0; > > So if the target doesn't have suitable insns, what happens? I suspect the > answer is nothing useful. In which case the question becomes what prevents > the optimizers from generating a GEN_MASK_EXPR? Maybe that'll become clear > as I read the rest of the patches. I don't expect other optimizer than vectorizer to generate it. And as usually optab is checked before generating it. > > > >> >> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c >> index e1ceea4..052c055 100644 >> --- a/gcc/tree-inline.c >> +++ b/gcc/tree-inline.c >> @@ -3850,6 +3850,7 @@ estimate_operator_cost (enum tree_code code, >> eni_weights *weights, >> case COMPLEX_EXPR: >> case PAREN_EXPR: >> case VIEW_CONVERT_EXPR: >> + case GEN_MASK_EXPR: >> return 0; > > That seems wrong to me :-) Got it :) Thanks, Ilya > > Jeff