From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 28611 invoked by alias); 29 Nov 2019 10:33:32 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 28603 invoked by uid 89); 29 Nov 2019 10:33:31 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-7.3 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,GIT_PATCH_2,GIT_PATCH_3,KAM_ASCII_DIVIDERS,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mail-lj1-f194.google.com Received: from mail-lj1-f194.google.com (HELO mail-lj1-f194.google.com) (209.85.208.194) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 29 Nov 2019 10:33:28 +0000 Received: by mail-lj1-f194.google.com with SMTP id e28so7254836ljo.9 for ; Fri, 29 Nov 2019 02:33:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=xMlSQoQ+8WEzgWq/9VW1VtHI7h4ZgtHn3rgN+a9Zn/Y=; b=ns/3nmZZ6Nck7PcwpIwukA7knoXa7AoSi0E563UQX1tpKgCG1RclM09ct7z/AN5Szc B5br/hGApzplZ+x+pq/Pctd1xwo3NOukLl96YL/kF7f/pXMo4QaqXmWKARoba0fiAlu+ fIR0CYpwpYAw4Q/kuSarRsZ73YuFpX9fXlDSKz7af62Sz8VKplrRWUKFwbqw0ovgJwc0 KeDJdDY9WP2WenR/infwVNpnjU+1FjwedIqXehEJ7zFXGhPK/v3qhynd3OaaoDrKEUbd Gm62KYDwrf9WiN7fjaJN7y+A0p7FmLSUMTMfEeoCaAVERzkXGwK1H7j70g1yEZ/nITnI CGMQ== MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Fri, 29 Nov 2019 10:34:00 -0000 Message-ID: Subject: Re: [4/5] Record the vector mask precision in stmt_vec_info To: GCC Patches , Richard Sandiford Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2019-11/txt/msg02629.txt.bz2 On Fri, Nov 29, 2019 at 11:14 AM Richard Sandiford wrote: > > search_type_for_mask uses a worklist to search a chain of boolean > operations for a natural vector mask type. This patch instead does > that in vect_determine_stmt_precisions, where we also look for > overpromoted integer operations. We then only need to compute > the precision once and can cache it in the stmt_vec_info. > > The new function vect_determine_mask_precision is supposed > to handle exactly the same cases as search_type_for_mask_1, > and in the same way. There's a lot we could improve here, > but that's not stage 3 material. > > I wondered about sharing mask_precision with other fields like > operation_precision, but in the end that seemed too dangerous. > We have patterns to convert between boolean and non-boolean > operations and it would be very easy to get mixed up about > which case the fields are describing. OK. > > 2019-11-29 Richard Sandiford > > gcc/ > * tree-vectorizer.h (stmt_vec_info::mask_precision): New field. > (vect_use_mask_type_p): New function. > * tree-vect-patterns.c (vect_init_pattern_stmt): Copy the > mask precision to the pattern statement. > (append_pattern_def_seq): Add a scalar_type_for_mask parameter > and use it to initialize the new stmt's mask precision. > (search_type_for_mask_1): Delete. > (search_type_for_mask): Replace with... > (integer_type_for_mask): ...this new function. Use the information > cached in the stmt_vec_info. > (vect_recog_bool_pattern): Update accordingly. > (build_mask_conversion): Pass the scalar type associated with the > mask type to append_pattern_def_seq. > (vect_recog_mask_conversion_pattern): Likewise. Call > integer_type_for_mask instead of search_type_for_mask. > (vect_convert_mask_for_vectype): Call integer_type_for_mask instead > of search_type_for_mask. > (possible_vector_mask_operation_p): New function. > (vect_determine_mask_precision): Likewise. > (vect_determine_stmt_precisions): Call it. > > Index: gcc/tree-vectorizer.h > =================================================================== > --- gcc/tree-vectorizer.h 2019-11-29 09:11:27.781086362 +0000 > +++ gcc/tree-vectorizer.h 2019-11-29 09:11:31.277062112 +0000 > @@ -1089,6 +1089,23 @@ typedef struct data_reference *dr_p; > unsigned int operation_precision; > signop operation_sign; > > + /* If the statement produces a boolean result, this value describes > + how we should choose the associated vector type. The possible > + values are: > + > + - an integer precision N if we should use the vector mask type > + associated with N-bit integers. This is only used if all relevant > + input booleans also want the vector mask type for N-bit integers, > + or if we can convert them into that form by pattern-matching. > + > + - ~0U if we considered choosing a vector mask type but decided > + to treat the boolean as a normal integer type instead. > + > + - 0 otherwise. This means either that the operation isn't one that > + could have a vector mask type (and so should have a normal vector > + type instead) or that we simply haven't made a choice either way. */ > + unsigned int mask_precision; > + > /* True if this is only suitable for SLP vectorization. */ > bool slp_vect_only_p; > }; > @@ -1245,6 +1262,15 @@ nested_in_vect_loop_p (class loop *loop, > && (loop->inner == (gimple_bb (stmt_info->stmt))->loop_father)); > } > > +/* Return true if STMT_INFO should produce a vector mask type rather than > + a normal nonmask type. */ > + > +static inline bool > +vect_use_mask_type_p (stmt_vec_info stmt_info) > +{ > + return stmt_info->mask_precision && stmt_info->mask_precision != ~0U; > +} > + > /* Return TRUE if a statement represented by STMT_INFO is a part of a > pattern. */ > > Index: gcc/tree-vect-patterns.c > =================================================================== > --- gcc/tree-vect-patterns.c 2019-11-29 09:11:21.389130702 +0000 > +++ gcc/tree-vect-patterns.c 2019-11-29 09:11:31.277062112 +0000 > @@ -112,7 +112,12 @@ vect_init_pattern_stmt (gimple *pattern_ > STMT_VINFO_DEF_TYPE (pattern_stmt_info) > = STMT_VINFO_DEF_TYPE (orig_stmt_info); > if (!STMT_VINFO_VECTYPE (pattern_stmt_info)) > - STMT_VINFO_VECTYPE (pattern_stmt_info) = vectype; > + { > + gcc_assert (VECTOR_BOOLEAN_TYPE_P (vectype) > + == vect_use_mask_type_p (orig_stmt_info)); > + STMT_VINFO_VECTYPE (pattern_stmt_info) = vectype; > + pattern_stmt_info->mask_precision = orig_stmt_info->mask_precision; > + } > return pattern_stmt_info; > } > > @@ -131,17 +136,25 @@ vect_set_pattern_stmt (gimple *pattern_s > > /* Add NEW_STMT to STMT_INFO's pattern definition statements. If VECTYPE > is nonnull, record that NEW_STMT's vector type is VECTYPE, which might > - be different from the vector type of the final pattern statement. */ > + be different from the vector type of the final pattern statement. > + If VECTYPE is a mask type, SCALAR_TYPE_FOR_MASK is the scalar type > + from which it was derived. */ > > static inline void > append_pattern_def_seq (stmt_vec_info stmt_info, gimple *new_stmt, > - tree vectype = NULL_TREE) > + tree vectype = NULL_TREE, > + tree scalar_type_for_mask = NULL_TREE) > { > + gcc_assert (!scalar_type_for_mask > + == (!vectype || !VECTOR_BOOLEAN_TYPE_P (vectype))); > vec_info *vinfo = stmt_info->vinfo; > if (vectype) > { > stmt_vec_info new_stmt_info = vinfo->add_stmt (new_stmt); > STMT_VINFO_VECTYPE (new_stmt_info) = vectype; > + if (scalar_type_for_mask) > + new_stmt_info->mask_precision > + = GET_MODE_BITSIZE (SCALAR_TYPE_MODE (scalar_type_for_mask)); > } > gimple_seq_add_stmt_without_update (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_info), > new_stmt); > @@ -3886,108 +3899,22 @@ adjust_bool_stmts (hash_set & > return gimple_assign_lhs (pattern_stmt); > } > > -/* Helper for search_type_for_mask. */ > +/* Return the proper type for converting bool VAR into > + an integer value or NULL_TREE if no such type exists. > + The type is chosen so that the converted value has the > + same number of elements as VAR's vector type. */ > > static tree > -search_type_for_mask_1 (tree var, vec_info *vinfo, > - hash_map &cache) > +integer_type_for_mask (tree var, vec_info *vinfo) > { > - tree rhs1; > - enum tree_code rhs_code; > - tree res = NULL_TREE, res2; > - > if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (var))) > return NULL_TREE; > > stmt_vec_info def_stmt_info = vect_get_internal_def (vinfo, var); > - if (!def_stmt_info) > + if (!def_stmt_info || !vect_use_mask_type_p (def_stmt_info)) > return NULL_TREE; > > - gassign *def_stmt = dyn_cast (def_stmt_info->stmt); > - if (!def_stmt) > - return NULL_TREE; > - > - tree *c = cache.get (def_stmt); > - if (c) > - return *c; > - > - rhs_code = gimple_assign_rhs_code (def_stmt); > - rhs1 = gimple_assign_rhs1 (def_stmt); > - > - switch (rhs_code) > - { > - case SSA_NAME: > - case BIT_NOT_EXPR: > - CASE_CONVERT: > - res = search_type_for_mask_1 (rhs1, vinfo, cache); > - break; > - > - case BIT_AND_EXPR: > - case BIT_IOR_EXPR: > - case BIT_XOR_EXPR: > - res = search_type_for_mask_1 (rhs1, vinfo, cache); > - res2 = search_type_for_mask_1 (gimple_assign_rhs2 (def_stmt), vinfo, > - cache); > - if (!res || (res2 && TYPE_PRECISION (res) > TYPE_PRECISION (res2))) > - res = res2; > - break; > - > - default: > - if (TREE_CODE_CLASS (rhs_code) == tcc_comparison) > - { > - tree comp_vectype, mask_type; > - > - if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs1))) > - { > - res = search_type_for_mask_1 (rhs1, vinfo, cache); > - res2 = search_type_for_mask_1 (gimple_assign_rhs2 (def_stmt), > - vinfo, cache); > - if (!res || (res2 && TYPE_PRECISION (res) > TYPE_PRECISION (res2))) > - res = res2; > - if (res) > - break; > - } > - > - comp_vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1)); > - if (comp_vectype == NULL_TREE) > - { > - res = NULL_TREE; > - break; > - } > - > - mask_type = get_mask_type_for_scalar_type (vinfo, TREE_TYPE (rhs1)); > - if (!mask_type > - || !expand_vec_cmp_expr_p (comp_vectype, mask_type, rhs_code)) > - { > - res = NULL_TREE; > - break; > - } > - > - if (TREE_CODE (TREE_TYPE (rhs1)) != INTEGER_TYPE > - || !TYPE_UNSIGNED (TREE_TYPE (rhs1))) > - { > - scalar_mode mode = SCALAR_TYPE_MODE (TREE_TYPE (rhs1)); > - res = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); > - } > - else > - res = TREE_TYPE (rhs1); > - } > - } > - > - cache.put (def_stmt, res); > - return res; > -} > - > -/* Return the proper type for converting bool VAR into > - an integer value or NULL_TREE if no such type exists. > - The type is chosen so that converted value has the > - same number of elements as VAR's vector type. */ > - > -static tree > -search_type_for_mask (tree var, vec_info *vinfo) > -{ > - hash_map cache; > - return search_type_for_mask_1 (var, vinfo, cache); > + return build_nonstandard_integer_type (def_stmt_info->mask_precision, 1); > } > > /* Function vect_recog_bool_pattern > @@ -4079,7 +4006,7 @@ vect_recog_bool_pattern (stmt_vec_info s > } > else > { > - tree type = search_type_for_mask (var, vinfo); > + tree type = integer_type_for_mask (var, vinfo); > tree cst0, cst1, tmp; > > if (!type) > @@ -4164,7 +4091,7 @@ vect_recog_bool_pattern (stmt_vec_info s > rhs = adjust_bool_stmts (bool_stmts, TREE_TYPE (vectype), stmt_vinfo); > else > { > - tree type = search_type_for_mask (var, vinfo); > + tree type = integer_type_for_mask (var, vinfo); > tree cst0, cst1, new_vectype; > > if (!type) > @@ -4219,7 +4146,7 @@ build_mask_conversion (tree mask, tree v > masktype = truth_type_for (vectype); > tmp = vect_recog_temp_ssa_var (TREE_TYPE (masktype), NULL); > stmt = gimple_build_assign (tmp, CONVERT_EXPR, mask); > - append_pattern_def_seq (stmt_vinfo, stmt, masktype); > + append_pattern_def_seq (stmt_vinfo, stmt, masktype, TREE_TYPE (vectype)); > > return tmp; > } > @@ -4285,7 +4212,7 @@ vect_recog_mask_conversion_pattern (stmt > } > > tree mask_arg = gimple_call_arg (last_stmt, mask_argno); > - tree mask_arg_type = search_type_for_mask (mask_arg, vinfo); > + tree mask_arg_type = integer_type_for_mask (mask_arg, vinfo); > if (!mask_arg_type) > return NULL; > vectype2 = get_mask_type_for_scalar_type (vinfo, mask_arg_type); > @@ -4338,7 +4265,7 @@ vect_recog_mask_conversion_pattern (stmt > > if (TREE_CODE (rhs1) == SSA_NAME) > { > - rhs1_type = search_type_for_mask (rhs1, vinfo); > + rhs1_type = integer_type_for_mask (rhs1, vinfo); > if (!rhs1_type) > return NULL; > } > @@ -4358,7 +4285,7 @@ vect_recog_mask_conversion_pattern (stmt > > it is better for b1 and b2 to use the mask type associated > with int elements rather bool (byte) elements. */ > - rhs1_type = search_type_for_mask (TREE_OPERAND (rhs1, 0), vinfo); > + rhs1_type = integer_type_for_mask (TREE_OPERAND (rhs1, 0), vinfo); > if (!rhs1_type) > rhs1_type = TREE_TYPE (TREE_OPERAND (rhs1, 0)); > } > @@ -4414,7 +4341,8 @@ vect_recog_mask_conversion_pattern (stmt > tmp = vect_recog_temp_ssa_var (TREE_TYPE (rhs1), NULL); > pattern_stmt = gimple_build_assign (tmp, rhs1); > rhs1 = tmp; > - append_pattern_def_seq (stmt_vinfo, pattern_stmt, vectype2); > + append_pattern_def_seq (stmt_vinfo, pattern_stmt, vectype2, > + rhs1_type); > } > > if (maybe_ne (TYPE_VECTOR_SUBPARTS (vectype1), > @@ -4447,8 +4375,8 @@ vect_recog_mask_conversion_pattern (stmt > > rhs2 = gimple_assign_rhs2 (last_stmt); > > - rhs1_type = search_type_for_mask (rhs1, vinfo); > - rhs2_type = search_type_for_mask (rhs2, vinfo); > + rhs1_type = integer_type_for_mask (rhs1, vinfo); > + rhs2_type = integer_type_for_mask (rhs2, vinfo); > > if (!rhs1_type || !rhs2_type > || TYPE_PRECISION (rhs1_type) == TYPE_PRECISION (rhs2_type)) > @@ -4509,7 +4437,7 @@ vect_get_load_store_mask (stmt_vec_info > vect_convert_mask_for_vectype (tree mask, tree vectype, > stmt_vec_info stmt_info, vec_info *vinfo) > { > - tree mask_type = search_type_for_mask (mask, vinfo); > + tree mask_type = integer_type_for_mask (mask, vinfo); > if (mask_type) > { > tree mask_vectype = get_mask_type_for_scalar_type (vinfo, mask_type); > @@ -4949,6 +4877,148 @@ vect_determine_precisions_from_users (st > vect_set_min_input_precision (stmt_info, type, min_input_precision); > } > > +/* Return true if the statement described by STMT_INFO sets a boolean > + SSA_NAME and if we know how to vectorize this kind of statement using > + vector mask types. */ > + > +static bool > +possible_vector_mask_operation_p (stmt_vec_info stmt_info) > +{ > + tree lhs = gimple_get_lhs (stmt_info->stmt); > + if (!lhs > + || TREE_CODE (lhs) != SSA_NAME > + || !VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (lhs))) > + return false; > + > + if (gassign *assign = dyn_cast (stmt_info->stmt)) > + { > + tree_code rhs_code = gimple_assign_rhs_code (assign); > + switch (rhs_code) > + { > + CASE_CONVERT: > + case SSA_NAME: > + case BIT_NOT_EXPR: > + case BIT_IOR_EXPR: > + case BIT_XOR_EXPR: > + case BIT_AND_EXPR: > + return true; > + > + default: > + return TREE_CODE_CLASS (rhs_code) == tcc_comparison; > + } > + } > + return false; > +} > + > +/* If STMT_INFO sets a boolean SSA_NAME, see whether we should use > + a vector mask type instead of a normal vector type. Record the > + result in STMT_INFO->mask_precision. */ > + > +static void > +vect_determine_mask_precision (stmt_vec_info stmt_info) > +{ > + vec_info *vinfo = stmt_info->vinfo; > + > + if (!possible_vector_mask_operation_p (stmt_info) > + || stmt_info->mask_precision) > + return; > + > + auto_vec worklist; > + worklist.quick_push (stmt_info); > + while (!worklist.is_empty ()) > + { > + stmt_info = worklist.last (); > + unsigned int orig_length = worklist.length (); > + > + /* If at least one boolean input uses a vector mask type, > + pick the mask type with the narrowest elements. > + > + ??? This is the traditional behavior. It should always produce > + the smallest number of operations, but isn't necessarily the > + optimal choice. For example, if we have: > + > + a = b & c > + > + where: > + > + - the user of a wants it to have a mask type for 16-bit elements (M16) > + - b also uses M16 > + - c uses a mask type for 8-bit elements (M8) > + > + then picking M8 gives: > + > + - 1 M16->M8 pack for b > + - 1 M8 AND for a > + - 2 M8->M16 unpacks for the user of a > + > + whereas picking M16 would have given: > + > + - 2 M8->M16 unpacks for c > + - 2 M16 ANDs for a > + > + The number of operations are equal, but M16 would have given > + a shorter dependency chain and allowed more ILP. */ > + unsigned int precision = ~0U; > + gassign *assign = as_a (stmt_info->stmt); > + unsigned int nops = gimple_num_ops (assign); > + for (unsigned int i = 1; i < nops; ++i) > + { > + tree rhs = gimple_op (assign, i); > + if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs))) > + continue; > + > + stmt_vec_info def_stmt_info = vinfo->lookup_def (rhs); > + if (!def_stmt_info) > + /* Don't let external or constant operands influence the choice. > + We can convert them to whichever vector type we pick. */ > + continue; > + > + if (def_stmt_info->mask_precision) > + { > + if (precision > def_stmt_info->mask_precision) > + precision = def_stmt_info->mask_precision; > + } > + else if (possible_vector_mask_operation_p (def_stmt_info)) > + worklist.safe_push (def_stmt_info); > + } > + > + /* Defer the choice if we need to visit operands first. */ > + if (orig_length != worklist.length ()) > + continue; > + > + /* If the statement compares two values that shouldn't use vector masks, > + try comparing the values as normal scalars instead. */ > + tree_code rhs_code = gimple_assign_rhs_code (assign); > + if (precision == ~0U > + && TREE_CODE_CLASS (rhs_code) == tcc_comparison) > + { > + tree rhs1_type = TREE_TYPE (gimple_assign_rhs1 (assign)); > + scalar_mode mode; > + tree vectype, mask_type; > + if (is_a (TYPE_MODE (rhs1_type), &mode) > + && (vectype = get_vectype_for_scalar_type (vinfo, rhs1_type)) > + && (mask_type = get_mask_type_for_scalar_type (vinfo, rhs1_type)) > + && expand_vec_cmp_expr_p (vectype, mask_type, rhs_code)) > + precision = GET_MODE_BITSIZE (mode); > + } > + > + if (dump_enabled_p ()) > + { > + if (precision == ~0U) > + dump_printf_loc (MSG_NOTE, vect_location, > + "using normal nonmask vectors for %G", > + stmt_info->stmt); > + else > + dump_printf_loc (MSG_NOTE, vect_location, > + "using boolean precision %d for %G", > + precision, stmt_info->stmt); > + } > + > + stmt_info->mask_precision = precision; > + worklist.pop (); > + } > +} > + > /* Handle vect_determine_precisions for STMT_INFO, given that we > have already done so for the users of its result. */ > > @@ -4961,6 +5031,7 @@ vect_determine_stmt_precisions (stmt_vec > vect_determine_precisions_from_range (stmt_info, stmt); > vect_determine_precisions_from_users (stmt_info, stmt); > } > + vect_determine_mask_precision (stmt_info); > } > > /* Walk backwards through the vectorizable region to determine the