On 6/15/20 1:59 PM, Richard Biener wrote: > On Mon, Jun 15, 2020 at 1:19 PM Martin Liška wrote: >> >> On 6/15/20 9:14 AM, Richard Biener wrote: >>> On Fri, Jun 12, 2020 at 3:24 PM Martin Liška wrote: >>>> >>>> On 6/12/20 11:43 AM, Richard Biener wrote: >>>>> So ... how far are you with enforcing a split VEC_COND_EXPR? >>>>> Thus can we avoid the above completely (even as intermediate >>>>> state)? >>>> >>>> Apparently, I'm quite close. Using the attached patch I see only 2 testsuite >>>> failures: >>>> >>>> FAIL: gcc.dg/tree-ssa/pr68714.c scan-tree-dump-times reassoc1 " <= " 1 >>>> FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3 >>>> >>>> The first one is about teaching reassoc about the SSA_NAMEs in VEC_COND_EXPR. I haven't >>>> analyze the second failure. >>>> >>>> I'm also not sure about the gimlification change, I see a superfluous assignments: >>>> vec_cond_cmp.5 = _1 == _2; >>>> vec_cond_cmp.6 = vec_cond_cmp.5; >>>> vec_cond_cmp.7 = vec_cond_cmp.6; >>>> _3 = VEC_COND_EXPR ; >>>> ? >>>> >>>> So with the suggested patch, the EH should be gone as you suggested. Right? >>> >>> Right, it should be on the comparison already from the start. >>> >>> @@ -14221,9 +14221,13 @@ gimplify_expr (tree *expr_p, gimple_seq >>> *pre_p, gimple_seq *post_p, >>> case VEC_COND_EXPR: >>> { >>> enum gimplify_status r0, r1, r2; >>> - >>> r0 = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, >>> post_p, is_gimple_condexpr, fb_rvalue); >>> + tree xop0 = TREE_OPERAND (*expr_p, 0); >>> + tmp = create_tmp_var_raw (TREE_TYPE (xop0), "vec_cond_cmp"); >>> + gimple_add_tmp_var (tmp); >>> + gimplify_assign (tmp, xop0, pre_p); >>> + TREE_OPERAND (*expr_p, 0) = tmp; >>> r1 = gimplify_expr (&TREE_OPERAND (*expr_p, 1), pre_p, >>> post_p, is_gimple_val, fb_rvalue); >>> >>> all of VEC_COND_EXPR can now be a simple goto expr_3; >> >> Works for me, thanks! >> >>> >>> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c >>> index 494c9e9c20b..090fb52a2f1 100644 >>> --- a/gcc/tree-ssa-forwprop.c >>> +++ b/gcc/tree-ssa-forwprop.c >>> @@ -3136,6 +3136,10 @@ pass_forwprop::execute (function *fun) >>> if (code == COND_EXPR >>> || code == VEC_COND_EXPR) >>> { >>> + /* Do not propagate into VEC_COND_EXPRs. */ >>> + if (code == VEC_COND_EXPR) >>> + break; >>> + >>> >>> err - remove the || code == VEC_COND_EXPR instead? >> >> Yep. >> >>> >>> @@ -2221,24 +2226,12 @@ expand_vector_operations (void) >>> { >>> gimple_stmt_iterator gsi; >>> basic_block bb; >>> - bool cfg_changed = false; >>> >>> FOR_EACH_BB_FN (bb, cfun) >>> - { >>> - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) >>> - { >>> - expand_vector_operations_1 (&gsi); >>> - /* ??? If we do not cleanup EH then we will ICE in >>> - verification. But in reality we have created wrong-code >>> - as we did not properly transition EH info and edges to >>> - the piecewise computations. */ >>> - if (maybe_clean_eh_stmt (gsi_stmt (gsi)) >>> - && gimple_purge_dead_eh_edges (bb)) >>> - cfg_changed = true; >>> - } >>> - } >>> >>> I'm not sure about this. Consider the C++ testcase where >>> the ?: is replaced by a division. If veclower needs to replace >>> that with four scalrar division statements then the above >>> still applies - veclower does not correctly duplicate EH info >>> and EH edges to the individual divisions (and we do not know >>> which component might trap). >>> >>> So please leave the above in. You can try if using integer >>> division makes it break and add such a testcase if there's >>> no coverage for this in the testsuite. >> >> I'm leaving that above. Can you please explain how can a division test-case >> be created? > > typedef long v2di __attribute__((vector_size(16))); > > v2di foo (v2di a, v2di b) > { > try > { > v2di res = a / b; > return res; > } > catch (...) > { > return (v2di){}; > } > } > > with -fnon-call-exceptions I see in t.ii.090t.ehdisp (correctly): > > ;; basic block 2, loop depth 0 > ;; pred: ENTRY > [LP 1] _6 = a_4(D) / b_5(D); > ;; succ: 5 > ;; 3 > > while after t.ii.226t.veclower we have > > ;; basic block 2, loop depth 0 > ;; pred: ENTRY > _13 = BIT_FIELD_REF ; > _14 = BIT_FIELD_REF ; > _15 = _13 / _14; > _16 = BIT_FIELD_REF ; > _17 = BIT_FIELD_REF ; > _18 = _16 / _17; > _6 = {_15, _18}; > res_7 = _6; > _8 = res_7; > ;; succ: 3 > > and all EH is gone and we'd ICE if you remove the above hunk. Hopefully. Yes, it ICEs then: ./xg++ -B. ~/Programming/testcases/ice.c -c -fnon-call-exceptions -O3 /home/marxin/Programming/testcases/ice.c: In function ‘v2di foo(v2di, v2di)’: /home/marxin/Programming/testcases/ice.c:3:6: error: statement marked for throw, but doesn’t 3 | v2di foo (v2di a, v2di b) | ^~~ _6 = {_12, _15}; during GIMPLE pass: veclower2 /home/marxin/Programming/testcases/ice.c:3:6: internal compiler error: verify_gimple failed 0x10e308a verify_gimple_in_cfg(function*, bool) /home/marxin/Programming/gcc/gcc/tree-cfg.c:5461 0xfc9caf execute_function_todo /home/marxin/Programming/gcc/gcc/passes.c:1985 0xfcaafc do_per_function /home/marxin/Programming/gcc/gcc/passes.c:1640 0xfcaafc execute_todo /home/marxin/Programming/gcc/gcc/passes.c:2039 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See for instructions. > > We still generate wrong-code obviously as we'd need to duplicate the > EH info on each component division (and split blocks and generate > extra EH edges). That's a pre-existing bug of course. I just wanted > to avoid to create a new instance just because of the early instruction > selection for VEC_COND_EXPR. Fine! > >>> >>> What's missing from the patch is adjusting >>> verify_gimple_assign_ternary from >>> >>> if (((rhs_code == VEC_COND_EXPR || rhs_code == COND_EXPR) >>> ? !is_gimple_condexpr (rhs1) : !is_gimple_val (rhs1)) >>> || !is_gimple_val (rhs2) >>> || !is_gimple_val (rhs3)) >>> { >>> error ("invalid operands in ternary operation"); >>> return true; >>> >>> to the same with the rhs_code == VEC_COND_EXPR case removed. >> >> Hmm. I'm not sure I've got this comment. Why do we want to change it >> and is it done wright in the patch? > > Ah, I missed the hunk you added. That explains the confusion I got. > But the check should be an inclusive > one, not an exclusive one and earlier accepting a is_gimple_condexpr > is superfluous when you later reject the tcc_comparison part. Just > testing is_gimple_val is better. So yes, remove your tree-cfg.c hunk > and just adjust the above test. I simplified that. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Thanks, Martin > >>> >>> You'll likely figure the vectorizer still creates some VEC_COND_EXPRs >>> with embedded comparisons. >> >> I've fixed 2 failing test-cases I mentioned in the previous email. >> >> Martin >> >>> >>> Thanks, >>> Richard. >>> >>> >>>> Martin >>