From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 80011 invoked by alias); 21 Nov 2017 11:35:54 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 79779 invoked by uid 89); 21 Nov 2017 11:35:37 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-14.4 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_NUMSUBJECT,KB_WAM_FROM_NAME_SINGLEWORD,SPF_PASS autolearn=ham version=3.3.2 spammy=HDKIM-Filter:v2.9.2, H*F:D*be, 2.3 X-HELO: smtp5.sgsi.ucl.ac.be Received: from smtp.sgsi.ucl.ac.be (HELO smtp5.sgsi.ucl.ac.be) (130.104.5.67) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 21 Nov 2017 11:35:34 +0000 Received: from localhost (wifi-secure2-454.sri.ucl.ac.be [130.104.97.198]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) (Authenticated sender: verhetsel@smtp5.sgsi.ucl.ac.be) by smtp5.sgsi.ucl.ac.be (Postfix) with ESMTPSA id EB7E567DC13 for ; Tue, 21 Nov 2017 12:35:25 +0100 (CET) DKIM-Filter: OpenDKIM Filter v2.9.2 smtp5.sgsi.ucl.ac.be EB7E567DC13 From: Kilian Verhetsel To: gcc-patches@gcc.gnu.org Subject: [PATCH] Fix result for conditional reductions matching at index 0 Date: Tue, 21 Nov 2017 11:41:00 -0000 Message-ID: <87d14brhj6.fsf@uclouvain.be> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Sgsi-Spamcheck: SASL authenticated, X-SGSI-Information: X-SGSI-MailScanner-ID: EB7E567DC13.A532A X-SGSI-MailScanner: Found to be clean X-SGSI-From: kilian.verhetsel@uclouvain.be X-SGSI-Spam-Status: No X-IsSubscribed: yes X-SW-Source: 2017-11/txt/msg01878.txt.bz2 --=-=-= Content-Type: text/plain Content-length: 1214 Hi, When translating conditional reductions based on integer induction, the compiler uses the value zero to indicate the absence of any matches: if the index of the last match is still zero at the end of the loop, the default value is returned. The problem with this approach is that this default value is returned not only when there were no matches at all, but also when the last match occurred at index 0. This causes the test gcc.dg/vect/pr65947-14.c to fail. This patch corrects this by reusing the vector of indices used for COND_REDUCTION, which starts at 1. If the 1-based index of the last match is non-zero, 1 is subtracted from it, otherwise the initial value is returned. I tested this patch on x86_64-pc-linux-gnu (both with SSE and AVX2, causing both paths through the reduc_code != ERROR_MARK branch being taken). 2017-11-21 Kilian Verhetsel * tree-vect-loop.c (vect_create_epilog_for_reduction): Fix the returned value for INTEGER_INDUC_COND_REDUCTION whose last match occurred at index 0. (vectorizable_reduction): For INTEGER_INDUC_COND_REDUCTION, pass the PHI statement that sets the induction variable to the code generating the epilogue. --=-=-= Content-Type: text/x-patch Content-Disposition: inline; filename=gcc.patch Content-length: 13691 Index: gcc/tree-vect-loop.c =================================================================== --- gcc/tree-vect-loop.c (revision 254913) +++ gcc/tree-vect-loop.c (working copy) @@ -4316,7 +4316,7 @@ get_initial_defs_for_reduction (slp_tree slp_node, static void vect_create_epilog_for_reduction (vec vect_defs, gimple *stmt, - gimple *reduc_def_stmt, + gimple *reduc_def_stmt, gimple *induct_stmt, int ncopies, enum tree_code reduc_code, vec reduction_phis, bool double_reduc, @@ -4477,7 +4477,9 @@ vect_create_epilog_for_reduction (vec vect_d The first match will be a 1 to allow 0 to be used for non-matching indexes. If there are no matches at all then the vector will be all zeroes. */ - if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION) + if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION + || STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) + == INTEGER_INDUC_COND_REDUCTION) { tree indx_before_incr, indx_after_incr; int nunits_out = TYPE_VECTOR_SUBPARTS (vectype); @@ -4754,7 +4756,9 @@ vect_create_epilog_for_reduction (vec vect_d else new_phi_result = PHI_RESULT (new_phis[0]); - if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION + if ((STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION + || STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) + == INTEGER_INDUC_COND_REDUCTION) && reduc_code != ERROR_MARK) { /* For condition reductions, we have a vector (NEW_PHI_RESULT) containing @@ -4797,76 +4801,118 @@ vect_create_epilog_for_reduction (vec vect_d induction_index); gsi_insert_before (&exit_gsi, max_index_stmt, GSI_SAME_STMT); - /* Vector of {max_index, max_index, max_index,...}. */ - tree max_index_vec = make_ssa_name (index_vec_type); - tree max_index_vec_rhs = build_vector_from_val (index_vec_type, - max_index); - gimple *max_index_vec_stmt = gimple_build_assign (max_index_vec, - max_index_vec_rhs); - gsi_insert_before (&exit_gsi, max_index_vec_stmt, GSI_SAME_STMT); + if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION) + { + /* Vector of {max_index, max_index, max_index,...}. */ + tree max_index_vec = make_ssa_name (index_vec_type); + tree max_index_vec_rhs = build_vector_from_val (index_vec_type, + max_index); + gimple *max_index_vec_stmt = gimple_build_assign (max_index_vec, + max_index_vec_rhs); + gsi_insert_before (&exit_gsi, max_index_vec_stmt, GSI_SAME_STMT); - /* Next we compare the new vector (MAX_INDEX_VEC) full of max indexes - with the vector (INDUCTION_INDEX) of found indexes, choosing values - from the data vector (NEW_PHI_RESULT) for matches, 0 (ZERO_VEC) - otherwise. Only one value should match, resulting in a vector - (VEC_COND) with one data value and the rest zeros. - In the case where the loop never made any matches, every index will - match, resulting in a vector with all data values (which will all be - the default value). */ + /* Next we compare the new vector (MAX_INDEX_VEC) full of max indexes + with the vector (INDUCTION_INDEX) of found indexes, choosing values + from the data vector (NEW_PHI_RESULT) for matches, 0 (ZERO_VEC) + otherwise. Only one value should match, resulting in a vector + (VEC_COND) with one data value and the rest zeros. In the case + where the loop never made any matches, every index will match, + resulting in a vector with all data values (which will all be the + default value). */ - /* Compare the max index vector to the vector of found indexes to find - the position of the max value. */ - tree vec_compare = make_ssa_name (index_vec_cmp_type); - gimple *vec_compare_stmt = gimple_build_assign (vec_compare, EQ_EXPR, - induction_index, - max_index_vec); - gsi_insert_before (&exit_gsi, vec_compare_stmt, GSI_SAME_STMT); + /* Compare the max index vector to the vector of found indexes to find + the position of the max value. */ + tree vec_compare = make_ssa_name (index_vec_cmp_type); + gimple *vec_compare_stmt = gimple_build_assign (vec_compare, EQ_EXPR, + induction_index, + max_index_vec); + gsi_insert_before (&exit_gsi, vec_compare_stmt, GSI_SAME_STMT); - /* Use the compare to choose either values from the data vector or - zero. */ - tree vec_cond = make_ssa_name (vectype); - gimple *vec_cond_stmt = gimple_build_assign (vec_cond, VEC_COND_EXPR, - vec_compare, new_phi_result, - zero_vec); - gsi_insert_before (&exit_gsi, vec_cond_stmt, GSI_SAME_STMT); + /* Use the compare to choose either values from the data vector or + zero. */ + tree vec_cond = make_ssa_name (vectype); + gimple *vec_cond_stmt = gimple_build_assign (vec_cond, VEC_COND_EXPR, + vec_compare, + new_phi_result, + zero_vec); + gsi_insert_before (&exit_gsi, vec_cond_stmt, GSI_SAME_STMT); - /* Finally we need to extract the data value from the vector (VEC_COND) - into a scalar (MATCHED_DATA_REDUC). Logically we want to do a OR - reduction, but because this doesn't exist, we can use a MAX reduction - instead. The data value might be signed or a float so we need to cast - it first. - In the case where the loop never made any matches, the data values are - all identical, and so will reduce down correctly. */ + /* Finally we need to extract the data value from the vector + (VEC_COND) into a scalar (MATCHED_DATA_REDUC). Logically we want + to do a OR reduction, but because this doesn't exist, we can use a + MAX reduction instead. The data value might be signed or a float + so we need to cast it first. In the case where the loop never made + any matches, the data values are all identical, and so will reduce + down correctly. */ - /* Make the matched data values unsigned. */ - tree vec_cond_cast = make_ssa_name (vectype_unsigned); - tree vec_cond_cast_rhs = build1 (VIEW_CONVERT_EXPR, vectype_unsigned, - vec_cond); - gimple *vec_cond_cast_stmt = gimple_build_assign (vec_cond_cast, - VIEW_CONVERT_EXPR, - vec_cond_cast_rhs); - gsi_insert_before (&exit_gsi, vec_cond_cast_stmt, GSI_SAME_STMT); + /* Make the matched data values unsigned. */ + tree vec_cond_cast = make_ssa_name (vectype_unsigned); + tree vec_cond_cast_rhs = build1 (VIEW_CONVERT_EXPR, vectype_unsigned, + vec_cond); + gimple *vec_cond_cast_stmt = gimple_build_assign (vec_cond_cast, + VIEW_CONVERT_EXPR, + vec_cond_cast_rhs); + gsi_insert_before (&exit_gsi, vec_cond_cast_stmt, GSI_SAME_STMT); - /* Reduce down to a scalar value. */ - tree data_reduc = make_ssa_name (scalar_type_unsigned); - optab ot = optab_for_tree_code (REDUC_MAX_EXPR, vectype_unsigned, - optab_default); - gcc_assert (optab_handler (ot, TYPE_MODE (vectype_unsigned)) - != CODE_FOR_nothing); - gimple *data_reduc_stmt = gimple_build_assign (data_reduc, - REDUC_MAX_EXPR, - vec_cond_cast); - gsi_insert_before (&exit_gsi, data_reduc_stmt, GSI_SAME_STMT); + /* Reduce down to a scalar value. */ + tree data_reduc = make_ssa_name (scalar_type_unsigned); + optab ot = optab_for_tree_code (REDUC_MAX_EXPR, vectype_unsigned, + optab_default); + gcc_assert (optab_handler (ot, TYPE_MODE (vectype_unsigned)) + != CODE_FOR_nothing); + gimple *data_reduc_stmt = gimple_build_assign (data_reduc, + REDUC_MAX_EXPR, + vec_cond_cast); + gsi_insert_before (&exit_gsi, data_reduc_stmt, GSI_SAME_STMT); - /* Convert the reduced value back to the result type and set as the - result. */ - gimple_seq stmts = NULL; - new_temp = gimple_build (&stmts, VIEW_CONVERT_EXPR, scalar_type, - data_reduc); - gsi_insert_seq_before (&exit_gsi, stmts, GSI_SAME_STMT); + /* Convert the reduced value back to the result type and set as the + result. */ + gimple_seq stmts = NULL; + new_temp = gimple_build (&stmts, VIEW_CONVERT_EXPR, scalar_type, + data_reduc); + gsi_insert_seq_before (&exit_gsi, stmts, GSI_SAME_STMT); + } + else + { + /* Retrieve the index of the last match, and adjust the result value + if there were no matches. */ + gimple_seq stmts = NULL; + + stmt_vec_info induct_stmt_info = vinfo_for_stmt (induct_stmt); + + tree base = PHI_ARG_DEF_FROM_EDGE (induct_stmt, + loop_preheader_edge (loop)); + tree step = STMT_VINFO_LOOP_PHI_EVOLUTION_PART (induct_stmt_info); + + tree one = build_one_cst (TREE_TYPE (max_index)); + tree index = gimple_build (&stmts, MINUS_EXPR, TREE_TYPE (max_index), + max_index, one); + + index = gimple_convert (&stmts, scalar_type, index); + + tree offset = gimple_build (&stmts, MULT_EXPR, scalar_type, + step, index); + tree result = gimple_build (&stmts, PLUS_EXPR, scalar_type, + base, offset); + + tree zero = build_zero_cst (TREE_TYPE (max_index)); + tree zcompare = build2 (EQ_EXPR, boolean_type_node, + max_index, zero); + + gsi_insert_seq_before (&exit_gsi, stmts, GSI_SAME_STMT); + + new_temp = make_ssa_name (new_scalar_dest); + epilog_stmt = gimple_build_assign (new_temp, COND_EXPR, zcompare, + initial_def, result); + + gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); + } + scalar_results.safe_push (new_temp); } - else if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION + else if ((STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION + || STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) + == INTEGER_INDUC_COND_REDUCTION) && reduc_code == ERROR_MARK) { /* Condition redution without supported REDUC_MAX_EXPR. Generate @@ -4907,8 +4953,12 @@ vect_create_epilog_for_reduction (vec vect_d { tree new_idx_val = idx_val; tree new_val = val; - if (off != v_size - el_size) + if (off != v_size - el_size + || STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) + == INTEGER_INDUC_COND_REDUCTION) { + /* For integer induction, the index of the last match + must always be known. */ new_idx_val = make_ssa_name (idx_eltype); epilog_stmt = gimple_build_assign (new_idx_val, MAX_EXPR, idx_val, @@ -4928,12 +4978,52 @@ vect_create_epilog_for_reduction (vec vect_d val = new_val; } } - /* Convert the reduced value back to the result type and set as the - result. */ + gimple_seq stmts = NULL; - val = gimple_convert (&stmts, scalar_type, val); - gsi_insert_seq_before (&exit_gsi, stmts, GSI_SAME_STMT); - scalar_results.safe_push (val); + + if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION) + { + /* Convert the reduced value back to the result type and set as the + result. */ + val = gimple_convert (&stmts, scalar_type, val); + gsi_insert_seq_before (&exit_gsi, stmts, GSI_SAME_STMT); + scalar_results.safe_push (val); + } + else if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) + == INTEGER_INDUC_COND_REDUCTION) + { + /* Retrieve the index of the last match, and adjust the result value + if there were no matches. */ + stmt_vec_info induct_stmt_info = vinfo_for_stmt (induct_stmt); + + tree base = PHI_ARG_DEF_FROM_EDGE (induct_stmt, + loop_preheader_edge (loop)); + tree step = STMT_VINFO_LOOP_PHI_EVOLUTION_PART (induct_stmt_info); + + tree one = build_one_cst (idx_eltype); + tree index = gimple_build (&stmts, MINUS_EXPR, idx_eltype, + idx_val, one); + + index = gimple_convert (&stmts, scalar_type, index); + + tree offset = gimple_build (&stmts, MULT_EXPR, scalar_type, + step, index); + tree result = gimple_build (&stmts, PLUS_EXPR, scalar_type, + base, offset); + + gsi_insert_seq_before (&exit_gsi, stmts, GSI_SAME_STMT); + + tree zero = build_zero_cst (TREE_TYPE (idx_val)); + tree zcompare = build2 (EQ_EXPR, boolean_type_node, + idx_val, zero); + + new_temp = make_ssa_name (new_scalar_dest); + epilog_stmt = gimple_build_assign (new_temp, COND_EXPR, zcompare, + initial_def, result); + gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); + + scalar_results.safe_push (new_temp); + } } /* 2.3 Create the reduction code, using one of the three schemes described @@ -5625,6 +5715,7 @@ vectorizable_reduction (gimple *stmt, gimple_stmt_ bool first_p = true; tree cr_index_scalar_type = NULL_TREE, cr_index_vector_type = NULL_TREE; tree cond_reduc_val = NULL_TREE; + gimple *induct_stmt = NULL; /* Make sure it was already recognized as a reduction computation. */ if (STMT_VINFO_DEF_TYPE (vinfo_for_stmt (stmt)) != vect_reduction_def @@ -5869,7 +5960,10 @@ vectorizable_reduction (gimple *stmt, gimple_stmt_ } if (dt == vect_induction_def && def_stmt != NULL && is_nonwrapping_integer_induction (def_stmt, loop)) - cond_reduc_dt = dt; + { + cond_reduc_dt = dt; + induct_stmt = def_stmt; + } } } @@ -6461,7 +6555,7 @@ vectorizable_reduction (gimple *stmt, gimple_stmt_ vect_defs[0] = gimple_assign_lhs (*vec_stmt); vect_create_epilog_for_reduction (vect_defs, stmt, reduc_def_stmt, - epilog_copies, + induct_stmt, epilog_copies, epilog_reduc_code, phis, double_reduc, slp_node, slp_node_instance); --=-=-=--