From: Prathamesh Kulkarni
Date: Wed, 25 Sep 2019 16:18:00 -0000
Subject: Re: [SVE] PR86753
To: Richard Sandiford
Cc: Richard Biener, gcc Patches

On Mon, 16 Sep 2019 at 08:54, Prathamesh Kulkarni wrote:
>
> On Mon, 9 Sep 2019 at 09:36, Prathamesh Kulkarni wrote:
> >
> > On Mon, 9 Sep 2019 at 16:45, Richard Sandiford wrote:
> > >
> > > Prathamesh Kulkarni writes:
> > > > With patch, the only following FAIL remains for aarch64-sve.exp:
> > > > FAIL: gcc.target/aarch64/sve/cond_unary_2.c -march=armv8.2-a+sve
> > > > scan-assembler-times \\tmovprfx\\t 6
> > > > which now contains 14.
> > > > Should I adjust the test, assuming the change isn't a regression ?
> > >
> > > Well, it is kind-of a regression, but it really just means that the
> > > integer code is now consistent with the floating-point code in having
> > > an unnecessary MOVPRFX.  So I think adjusting the count is fine.
> > > Presumably any future fix for the existing redundant MOVPRFXs will
> > > apply to the new ones as well.
> > >
> > > The patch looks good to me, just some very minor nits:
> > >
> > > > @@ -8309,11 +8309,12 @@ vect_double_mask_nunits (tree type)
> > > >
> > > >  /* Record that a fully-masked version of LOOP_VINFO would need MASKS to
> > > >     contain a sequence of NVECTORS masks that each control a vector of type
> > > > -   VECTYPE. */
> > > > +   VECTYPE. SCALAR_MASK if non-null, represents the mask used for corresponding
> > > > +   load/store stmt. */
> > >
> > > Should be two spaces between sentences.  Maybe:
> > >
> > >    VECTYPE.  If SCALAR_MASK is nonnull, the fully-masked loop would AND
> > >    these vector masks with the vector version of SCALAR_MASK.  */
> > >
> > > since the mask isn't necessarily for a load or store statement.
> > >
> > > > [...]
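To make the SCALAR_MASK discussion concrete, here is a minimal sketch
(illustrative only; the function and array names are made up and not taken
from the patch or the testsuite) of the kind of loop involved.  After
if-conversion the same scalar condition b[i] < c[i] guards both the
conditional statement and the masked load/store, so a fully-masked loop can
reuse that condition's predicate instead of computing a separate
unpredicated compare:

    /* Illustrative only: b[i] < c[i] is both the scalar mask of the
       conditional accesses and the condition of the conditional negate,
       so one predicate can serve both uses.  */
    void
    f (int *restrict a, int *restrict b, int *restrict c, int n)
    {
      for (int i = 0; i < n; ++i)
        if (b[i] < c[i])
          a[i] = -b[i];
    }
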
> > > > @@ -1879,7 +1879,8 @@ static tree permute_vec_elements (tree, tree, tree, stmt_vec_info,
> > > >     says how the load or store is going to be implemented and GROUP_SIZE
> > > >     is the number of load or store statements in the containing group.
> > > >     If the access is a gather load or scatter store, GS_INFO describes
> > > > -   its arguments.
> > > > +   its arguments. SCALAR_MASK is the scalar mask used for corresponding
> > > > +   load or store stmt.
> > >
> > > Maybe:
> > >
> > >    its arguments.  If the load or store is conditional, SCALAR_MASK is the
> > >    condition under which it occurs.
> > >
> > > since SCALAR_MASK can be null here too.
> > >
> > > > [...]
> > > > @@ -9975,6 +9978,31 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> > > >    /* Handle cond expr. */
> > > >    for (j = 0; j < ncopies; j++)
> > > >      {
> > > > +      tree loop_mask = NULL_TREE;
> > > > +      bool swap_cond_operands = false;
> > > > +
> > > > +      if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
> > > > +        {
> > > > +          scalar_cond_masked_key cond (cond_expr, ncopies);
> > > > +          if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> > > > +            {
> > > > +              vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> > > > +              loop_mask = vect_get_loop_mask (gsi, masks, ncopies, vectype, j);
> > > > +            }
> > > > +          else
> > > > +            {
> > > > +              cond.code = invert_tree_comparison (cond.code,
> > > > +                                                  HONOR_NANS (TREE_TYPE (cond.op0)));
> > >
> > > Long line.  Maybe just split it out into a separate assignment:
> > >
> > >   bool honor_nans = HONOR_NANS (TREE_TYPE (cond.op0));
> > >   cond.code = invert_tree_comparison (cond.code, honor_nans);
> > >
> > > > +              if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> > > > +                {
> > > > +                  vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> > > > +                  loop_mask = vect_get_loop_mask (gsi, masks, ncopies, vectype, j);
> > >
> > > Long line here too.
> > >
> > > > [...]
> > > > @@ -10090,6 +10121,26 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> > > >             }
> > > >         }
> > > >     }
> > > > +
> > > > +      if (loop_mask)
> > > > +        {
> > > > +          if (COMPARISON_CLASS_P (vec_compare))
> > > > +            {
> > > > +              tree tmp = make_ssa_name (vec_cmp_type);
> > > > +              gassign *g = gimple_build_assign (tmp,
> > > > +                                                TREE_CODE (vec_compare),
> > > > +                                                TREE_OPERAND (vec_compare, 0),
> > > > +                                                TREE_OPERAND (vec_compare, 1));
> > >
> > > Two long lines.
> > >
> > > > +              vect_finish_stmt_generation (stmt_info, g, gsi);
> > > > +              vec_compare = tmp;
> > > > +            }
> > > > +
> > > > +          tree tmp2 = make_ssa_name (vec_cmp_type);
> > > > +          gassign *g = gimple_build_assign (tmp2, BIT_AND_EXPR, vec_compare, loop_mask);
> > >
> > > Long line here too.
> > >
> > > > [...]
> > > > diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
> > > > index dc181524744..c4b2d8e8647 100644
> > > > --- a/gcc/tree-vectorizer.c
> > > > +++ b/gcc/tree-vectorizer.c
> > > > @@ -1513,3 +1513,39 @@ make_pass_ipa_increase_alignment (gcc::context *ctxt)
> > > >  {
> > > >    return new pass_ipa_increase_alignment (ctxt);
> > > >  }
> > > > +
> > > > +/* If code(T) is comparison op or def of comparison stmt,
> > > > +   extract it's operands.
> > > > +   Else return .  */
> > > > +
> > > > +void
> > > > +scalar_cond_masked_key::get_cond_ops_from_tree (tree t)
> > > > +{
> > > > +  if (TREE_CODE_CLASS (TREE_CODE (t)) == tcc_comparison)
> > > > +    {
> > > > +      this->code = TREE_CODE (t);
> > > > +      this->op0 = TREE_OPERAND (t, 0);
> > > > +      this->op1 = TREE_OPERAND (t, 1);
> > > > +      return;
> > > > +    }
> > > > +
> > > > +  if (TREE_CODE (t) == SSA_NAME)
> > > > +    {
> > > > +      gassign *stmt = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (t));
> > > > +      if (stmt)
> > > > +        {
> > >
> > > Might as well do this as:
> > >
> > >   if (TREE_CODE (t) == SSA_NAME)
> > >     if (gassign *stmt = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (t)))
> > >       {
> > >
> > > The patch (as hoped) introduces some XPASSes:
> > >
> > > XPASS: gcc.target/aarch64/sve/cond_cnot_2.c scan-assembler-not \\tsel\\t
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmge\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 21
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmge\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 42
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmge\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 15
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmge\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 30
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmgt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 21
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmgt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 42
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmgt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 15
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmgt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 30
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmle\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 21
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmle\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 42
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmle\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 15
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmle\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 30
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmlt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 21
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmlt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 42
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmlt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 15
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmlt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 30
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmuo\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 252
> > > XPASS: gcc.target/aarch64/sve/vcond_4.c scan-assembler-times \\tfcmuo\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 180
> > > XPASS: gcc.target/aarch64/sve/vcond_5.c scan-assembler-times \\tfcmge\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0 21
> > > XPASS: gcc.target/aarch64/sve/vcond_5.c scan-assembler-times \\tfcmge\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d 42
> > > XPASS: gcc.target/aarch64/sve/vcond_5.c scan-assembler-times \\tfcmge\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0 15
> > > XPASS: gcc.target/aarch64/sve/vcond_5.c scan-assembler-times \\tfcmge\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s 30
> > > XPASS: gcc.target/aarch64/sve/vcond_5.c scan-assembler-times \\tfcmle\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0 21
> > > XPASS: gcc.target/aarch64/sve/vcond_5.c scan-assembler-times \\tfcmle\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d 42
> > > XPASS: gcc.target/aarch64/sve/vcond_5.c scan-assembler-times \\tfcmle\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0 15
> > > XPASS: gcc.target/aarch64/sve/vcond_5.c scan-assembler-times \\tfcmle\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s 30
> > >
> > > Could you remove the associated xfails (and comments above them where
> > > appropriate)?
> > >
> > > OK with those changes from my POV, but please give Richi a day or so
> > > to object.
> > >
> > > Thanks for doing this.
> > Thanks for the suggestions, I have updated the patch accordingly.
> > Bootstrap+test in progress on x86_64-unknown-linux-gnu and aarch64-linux-gnu.
> > Richi, does the patch look OK to you ?
> ping https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00573.html
ping * 2: https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00573.html

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Prathamesh
> > >
> > > Richard
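As a follow-up to the xfail request above: removing an xfail from one of
these scans amounts to dropping the xfail selector from the dg-final
directive.  A rough sketch (illustrative; the exact directives and the
comments above them in cond_cnot_2.c and vcond_[45].c may differ):

    /* Before the patch the scan was expected to fail.  */
    /* { dg-final { scan-assembler-not {\tsel\t} { xfail *-*-* } } } */

    /* After the patch the redundant SEL is gone, so the plain scan passes.  */
    /* { dg-final { scan-assembler-not {\tsel\t} } } */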