From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by sourceware.org (Postfix) with ESMTPS id AC9663870853 for ; Thu, 18 Jun 2020 08:10:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org AC9663870853 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mliska@suse.cz X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id DD277AD3A; Thu, 18 Jun 2020 08:10:28 +0000 (UTC) Subject: Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions. To: Richard Biener Cc: Segher Boessenkool , GCC Patches , Richard Sandiford , David Edelsohn References: <20200530130801.GD31009@gate.crashing.org> <16e3957c-e390-5984-b14e-dd3c70c3bd1c@suse.cz> <20200603182734.GA31009@gate.crashing.org> <3932cb72-4529-6755-bb35-45e11f1c3382@suse.cz> <77faed30-2ed8-8319-f552-34cad171c927@suse.cz> From: =?UTF-8?Q?Martin_Li=c5=a1ka?= Message-ID: <77f733c7-5680-4d59-dc3d-f458211a2fa5@suse.cz> Date: Thu, 18 Jun 2020 10:10:28 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.9.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-14.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Jun 2020 08:10:32 -0000 On 6/17/20 3:15 PM, Richard Biener wrote: > On Wed, Jun 17, 2020 at 10:50 AM Richard Biener > wrote: >> >> On Mon, Jun 15, 2020 at 2:20 PM Martin Liška wrote: >>> >>> On 6/15/20 1:59 PM, Richard Biener wrote: >>>> On Mon, Jun 15, 2020 at 1:19 PM Martin Liška wrote: >>>>> >>>>> On 6/15/20 9:14 AM, Richard Biener wrote: >>>>>> On Fri, Jun 12, 2020 at 3:24 PM Martin Liška wrote: >>>>>>> >>>>>>> On 6/12/20 11:43 AM, Richard Biener wrote: >>>>>>>> So ... how far are you with enforcing a split VEC_COND_EXPR? >>>>>>>> Thus can we avoid the above completely (even as intermediate >>>>>>>> state)? >>>>>>> >>>>>>> Apparently, I'm quite close. Using the attached patch I see only 2 testsuite >>>>>>> failures: >>>>>>> >>>>>>> FAIL: gcc.dg/tree-ssa/pr68714.c scan-tree-dump-times reassoc1 " <= " 1 >>>>>>> FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3 >>>>>>> >>>>>>> The first one is about teaching reassoc about the SSA_NAMEs in VEC_COND_EXPR. I haven't >>>>>>> analyze the second failure. >>>>>>> >>>>>>> I'm also not sure about the gimlification change, I see a superfluous assignments: >>>>>>> vec_cond_cmp.5 = _1 == _2; >>>>>>> vec_cond_cmp.6 = vec_cond_cmp.5; >>>>>>> vec_cond_cmp.7 = vec_cond_cmp.6; >>>>>>> _3 = VEC_COND_EXPR ; >>>>>>> ? >>>>>>> >>>>>>> So with the suggested patch, the EH should be gone as you suggested. Right? >>>>>> >>>>>> Right, it should be on the comparison already from the start. >>>>>> >>>>>> @@ -14221,9 +14221,13 @@ gimplify_expr (tree *expr_p, gimple_seq >>>>>> *pre_p, gimple_seq *post_p, >>>>>> case VEC_COND_EXPR: >>>>>> { >>>>>> enum gimplify_status r0, r1, r2; >>>>>> - >>>>>> r0 = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, >>>>>> post_p, is_gimple_condexpr, fb_rvalue); >>>>>> + tree xop0 = TREE_OPERAND (*expr_p, 0); >>>>>> + tmp = create_tmp_var_raw (TREE_TYPE (xop0), "vec_cond_cmp"); >>>>>> + gimple_add_tmp_var (tmp); >>>>>> + gimplify_assign (tmp, xop0, pre_p); >>>>>> + TREE_OPERAND (*expr_p, 0) = tmp; >>>>>> r1 = gimplify_expr (&TREE_OPERAND (*expr_p, 1), pre_p, >>>>>> post_p, is_gimple_val, fb_rvalue); >>>>>> >>>>>> all of VEC_COND_EXPR can now be a simple goto expr_3; >>>>> >>>>> Works for me, thanks! >>>>> >>>>>> >>>>>> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c >>>>>> index 494c9e9c20b..090fb52a2f1 100644 >>>>>> --- a/gcc/tree-ssa-forwprop.c >>>>>> +++ b/gcc/tree-ssa-forwprop.c >>>>>> @@ -3136,6 +3136,10 @@ pass_forwprop::execute (function *fun) >>>>>> if (code == COND_EXPR >>>>>> || code == VEC_COND_EXPR) >>>>>> { >>>>>> + /* Do not propagate into VEC_COND_EXPRs. */ >>>>>> + if (code == VEC_COND_EXPR) >>>>>> + break; >>>>>> + >>>>>> >>>>>> err - remove the || code == VEC_COND_EXPR instead? >>>>> >>>>> Yep. >>>>> >>>>>> >>>>>> @@ -2221,24 +2226,12 @@ expand_vector_operations (void) >>>>>> { >>>>>> gimple_stmt_iterator gsi; >>>>>> basic_block bb; >>>>>> - bool cfg_changed = false; >>>>>> >>>>>> FOR_EACH_BB_FN (bb, cfun) >>>>>> - { >>>>>> - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) >>>>>> - { >>>>>> - expand_vector_operations_1 (&gsi); >>>>>> - /* ??? If we do not cleanup EH then we will ICE in >>>>>> - verification. But in reality we have created wrong-code >>>>>> - as we did not properly transition EH info and edges to >>>>>> - the piecewise computations. */ >>>>>> - if (maybe_clean_eh_stmt (gsi_stmt (gsi)) >>>>>> - && gimple_purge_dead_eh_edges (bb)) >>>>>> - cfg_changed = true; >>>>>> - } >>>>>> - } >>>>>> >>>>>> I'm not sure about this. Consider the C++ testcase where >>>>>> the ?: is replaced by a division. If veclower needs to replace >>>>>> that with four scalrar division statements then the above >>>>>> still applies - veclower does not correctly duplicate EH info >>>>>> and EH edges to the individual divisions (and we do not know >>>>>> which component might trap). >>>>>> >>>>>> So please leave the above in. You can try if using integer >>>>>> division makes it break and add such a testcase if there's >>>>>> no coverage for this in the testsuite. >>>>> >>>>> I'm leaving that above. Can you please explain how can a division test-case >>>>> be created? >>>> >>>> typedef long v2di __attribute__((vector_size(16))); >>>> >>>> v2di foo (v2di a, v2di b) >>>> { >>>> try >>>> { >>>> v2di res = a / b; >>>> return res; >>>> } >>>> catch (...) >>>> { >>>> return (v2di){}; >>>> } >>>> } >>>> >>>> with -fnon-call-exceptions I see in t.ii.090t.ehdisp (correctly): >>>> >>>> ;; basic block 2, loop depth 0 >>>> ;; pred: ENTRY >>>> [LP 1] _6 = a_4(D) / b_5(D); >>>> ;; succ: 5 >>>> ;; 3 >>>> >>>> while after t.ii.226t.veclower we have >>>> >>>> ;; basic block 2, loop depth 0 >>>> ;; pred: ENTRY >>>> _13 = BIT_FIELD_REF ; >>>> _14 = BIT_FIELD_REF ; >>>> _15 = _13 / _14; >>>> _16 = BIT_FIELD_REF ; >>>> _17 = BIT_FIELD_REF ; >>>> _18 = _16 / _17; >>>> _6 = {_15, _18}; >>>> res_7 = _6; >>>> _8 = res_7; >>>> ;; succ: 3 >>>> >>>> and all EH is gone and we'd ICE if you remove the above hunk. Hopefully. >>> >>> Yes, it ICEs then: >>> >>> >>> ./xg++ -B. ~/Programming/testcases/ice.c -c -fnon-call-exceptions -O3 >>> /home/marxin/Programming/testcases/ice.c: In function ‘v2di foo(v2di, v2di)’: >>> /home/marxin/Programming/testcases/ice.c:3:6: error: statement marked for throw, but doesn’t >>> 3 | v2di foo (v2di a, v2di b) >>> | ^~~ >>> _6 = {_12, _15}; >>> during GIMPLE pass: veclower2 >>> /home/marxin/Programming/testcases/ice.c:3:6: internal compiler error: verify_gimple failed >>> 0x10e308a verify_gimple_in_cfg(function*, bool) >>> /home/marxin/Programming/gcc/gcc/tree-cfg.c:5461 >>> 0xfc9caf execute_function_todo >>> /home/marxin/Programming/gcc/gcc/passes.c:1985 >>> 0xfcaafc do_per_function >>> /home/marxin/Programming/gcc/gcc/passes.c:1640 >>> 0xfcaafc execute_todo >>> /home/marxin/Programming/gcc/gcc/passes.c:2039 >>> Please submit a full bug report, >>> with preprocessed source if appropriate. >>> Please include the complete backtrace with any bug report. >>> See for instructions. >>> >>>> >>>> We still generate wrong-code obviously as we'd need to duplicate the >>>> EH info on each component division (and split blocks and generate >>>> extra EH edges). That's a pre-existing bug of course. I just wanted >>>> to avoid to create a new instance just because of the early instruction >>>> selection for VEC_COND_EXPR. >>> >>> Fine! >>> >>>> >>>>>> >>>>>> What's missing from the patch is adjusting >>>>>> verify_gimple_assign_ternary from >>>>>> >>>>>> if (((rhs_code == VEC_COND_EXPR || rhs_code == COND_EXPR) >>>>>> ? !is_gimple_condexpr (rhs1) : !is_gimple_val (rhs1)) >>>>>> || !is_gimple_val (rhs2) >>>>>> || !is_gimple_val (rhs3)) >>>>>> { >>>>>> error ("invalid operands in ternary operation"); >>>>>> return true; >>>>>> >>>>>> to the same with the rhs_code == VEC_COND_EXPR case removed. >>>>> >>>>> Hmm. I'm not sure I've got this comment. Why do we want to change it >>>>> and is it done wright in the patch? >>>> >>>> Ah, I missed the hunk you added. >>> >>> That explains the confusion I got. >>> >>>> But the check should be an inclusive >>>> one, not an exclusive one and earlier accepting a is_gimple_condexpr >>>> is superfluous when you later reject the tcc_comparison part. Just >>>> testing is_gimple_val is better. So yes, remove your tree-cfg.c hunk >>>> and just adjust the above test. >>> >>> I simplified that. >>> >>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. >> >> Please double-check the changelog >> >> (do_store_flag): >> >> + tree-vect-isel.o \ >> >> IMHO we want to move more of the pattern matching magic of RTL >> expansion here to obsolete TER. So please name it gimple-isel.cc >> (.cc!, not .c) >> >> + gassign *assign = dyn_cast (SSA_NAME_DEF_STMT (cond)); >> + if (stmt != NULL >> + && TREE_CODE_CLASS (gimple_assign_rhs_code (assign)) != tcc_comparison) >> + return ERROR_MARK; >> >> you want stmt == NULL || TREE_CODE_CLASS (...) >> >> in case the def stmt is a call. >> >> + gimple_seq seq; >> + tree exp = force_gimple_operand (comb, &seq, true, NULL_TREE); >> + if (seq) >> + { >> + gimple_stmt_iterator gsi = gsi_for_stmt (vcond0); >> + gsi_insert_before (&gsi, seq, GSI_SAME_STMT); >> + } >> >> use force_gimple_operand_gsi that makes the above simpler. >> >> if (invert) >> - std::swap (*gimple_assign_rhs2_ptr (stmt0), >> - *gimple_assign_rhs3_ptr (stmt0)); >> - update_stmt (stmt0); >> + std::swap (*gimple_assign_rhs2_ptr (vcond0), >> + *gimple_assign_rhs3_ptr (vcond0)); >> >> use swap_ssa_operands. >> >> + gimple_assign_set_rhs1 (vcond0, exp); >> + update_stmt (vcond0); >> >> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c >> index cf2d979fea1..710b17a7c5c 100644 >> --- a/gcc/tree-vect-stmts.c >> +++ b/gcc/tree-vect-stmts.c >> @@ -9937,8 +9937,8 @@ vectorizable_condition (vec_info *vinfo, >> { >> vec_cond_rhs = vec_oprnds1[i]; >> if (bitop1 == NOP_EXPR) >> - vec_compare = build2 (cond_code, vec_cmp_type, >> - vec_cond_lhs, vec_cond_rhs); >> + vec_compare = gimplify_build2 (gsi, cond_code, vec_cmp_type, >> + vec_cond_lhs, vec_cond_rhs); >> else >> { >> >> please don't introduce more uses of gimplify_buildN - I'd like to >> get rid of those. You can use >> >> gimple_seq stmts = NULL; >> vec_compare = gimple_build (&stmts, cond_code, ...); >> gsi_insert_seq_before/after (...); >> >> OK with those changes. > > Applying the patch caused > > Running target unix//-m32 > FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3 > -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer > -finline-functions > FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3 -g I can't reproduce that with current master. Can you? > > and > > FAIL: ext/random/simd_fast_mersenne_twister_engine/operators/inequal.cc > (test for excess errors) > UNRESOLVED: ext/random/simd_fast_mersenne_twister_engine/operators/inequal.cc > compilation failed to produce executable I've just fixed this one. Martin > > Richard. > >> Thanks, >> Richard. >> >> >>> Thanks, >>> Martin >>> >>>> >>>>>> >>>>>> You'll likely figure the vectorizer still creates some VEC_COND_EXPRs >>>>>> with embedded comparisons. >>>>> >>>>> I've fixed 2 failing test-cases I mentioned in the previous email. >>>>> >>>>> Martin >>>>> >>>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>> >>>>>>> Martin >>>>> >>>