From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x631.google.com (mail-ej1-x631.google.com [IPv6:2a00:1450:4864:20::631]) by sourceware.org (Postfix) with ESMTPS id 1AF673939C22 for ; Thu, 18 Jun 2020 08:52:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1AF673939C22 Received: by mail-ej1-x631.google.com with SMTP id f7so5551474ejq.6 for ; Thu, 18 Jun 2020 01:52:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=aJNKqGHU4rDord949pDG1lHjLF6zRE8q1v5HrtrFhDc=; b=nic8ywsYzSvVXdjztnXdogO5xbNxsuK8MTk3BZk7WBb09j1kJcw/bagaP1TcvXQ3qd 5LdHdJgWugEzKU+H7CC3DL81WXU+ccxms3Tp5oh5eHMlbrR5/PyU6+NCb+UOrnAapSNT FYWOyzlPpPngvcSOavxNmjZgsr+IfYgpBTFLpQsu8124nFwGszq8Swg+fTF7fMW8A8OU mmcCrJJnbaJKHFjqMykhQlph8ugI6T3OVucOQxyKs8MidKZSpK/2meZKFCj0Gi5iQKVw ISpjzqx2n5vF9V/shm6DV2VOq8b1pRxHtfyb8NyTkeaar4xrdUmZB/MzQHmk7QsqotNo hPDA== X-Gm-Message-State: AOAM5331MhxYeAzd9YWqqX4ledmyzemAGlAHGlzkJIIqF3CB7kODXaTI mHRwO0ZBG67h4iX0X30cbkVIYj4HJaEM9M2jWrU= X-Google-Smtp-Source: ABdhPJzESViC+iSwrpz3WURj4kJWF2tgZ1gj8mwg7qCx+Ep2diIN9nvHRR/lnkyfIJeCxKwa9mRohSDtjVZHK6ZYhTo= X-Received: by 2002:a17:906:95d6:: with SMTP id n22mr3007979ejy.138.1592470341986; Thu, 18 Jun 2020 01:52:21 -0700 (PDT) MIME-Version: 1.0 References: <20200530130801.GD31009@gate.crashing.org> <16e3957c-e390-5984-b14e-dd3c70c3bd1c@suse.cz> <20200603182734.GA31009@gate.crashing.org> <3932cb72-4529-6755-bb35-45e11f1c3382@suse.cz> <77faed30-2ed8-8319-f552-34cad171c927@suse.cz> <77f733c7-5680-4d59-dc3d-f458211a2fa5@suse.cz> In-Reply-To: <77f733c7-5680-4d59-dc3d-f458211a2fa5@suse.cz> From: Richard Biener Date: Thu, 18 Jun 2020 10:52:10 +0200 Message-ID: Subject: Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions. To: =?UTF-8?Q?Martin_Li=C5=A1ka?= Cc: Segher Boessenkool , GCC Patches , Richard Sandiford , David Edelsohn Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-8.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Jun 2020 08:52:25 -0000 On Thu, Jun 18, 2020 at 10:10 AM Martin Li=C5=A1ka wrote: > > On 6/17/20 3:15 PM, Richard Biener wrote: > > On Wed, Jun 17, 2020 at 10:50 AM Richard Biener > > wrote: > >> > >> On Mon, Jun 15, 2020 at 2:20 PM Martin Li=C5=A1ka wro= te: > >>> > >>> On 6/15/20 1:59 PM, Richard Biener wrote: > >>>> On Mon, Jun 15, 2020 at 1:19 PM Martin Li=C5=A1ka w= rote: > >>>>> > >>>>> On 6/15/20 9:14 AM, Richard Biener wrote: > >>>>>> On Fri, Jun 12, 2020 at 3:24 PM Martin Li=C5=A1ka = wrote: > >>>>>>> > >>>>>>> On 6/12/20 11:43 AM, Richard Biener wrote: > >>>>>>>> So ... how far are you with enforcing a split VEC_COND_EXPR? > >>>>>>>> Thus can we avoid the above completely (even as intermediate > >>>>>>>> state)? > >>>>>>> > >>>>>>> Apparently, I'm quite close. Using the attached patch I see only = 2 testsuite > >>>>>>> failures: > >>>>>>> > >>>>>>> FAIL: gcc.dg/tree-ssa/pr68714.c scan-tree-dump-times reassoc1 " <= =3D " 1 > >>>>>>> FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3 > >>>>>>> > >>>>>>> The first one is about teaching reassoc about the SSA_NAMEs in VE= C_COND_EXPR. I haven't > >>>>>>> analyze the second failure. > >>>>>>> > >>>>>>> I'm also not sure about the gimlification change, I see a superfl= uous assignments: > >>>>>>> vec_cond_cmp.5 =3D _1 =3D=3D _2; > >>>>>>> vec_cond_cmp.6 =3D vec_cond_cmp.5; > >>>>>>> vec_cond_cmp.7 =3D vec_cond_cmp.6; > >>>>>>> _3 =3D VEC_COND_EXPR ; > >>>>>>> ? > >>>>>>> > >>>>>>> So with the suggested patch, the EH should be gone as you suggest= ed. Right? > >>>>>> > >>>>>> Right, it should be on the comparison already from the start. > >>>>>> > >>>>>> @@ -14221,9 +14221,13 @@ gimplify_expr (tree *expr_p, gimple_seq > >>>>>> *pre_p, gimple_seq *post_p, > >>>>>> case VEC_COND_EXPR: > >>>>>> { > >>>>>> enum gimplify_status r0, r1, r2; > >>>>>> - > >>>>>> r0 =3D gimplify_expr (&TREE_OPERAND (*expr_p, 0), p= re_p, > >>>>>> post_p, is_gimple_condexpr, fb_= rvalue); > >>>>>> + tree xop0 =3D TREE_OPERAND (*expr_p, 0); > >>>>>> + tmp =3D create_tmp_var_raw (TREE_TYPE (xop0), "vec_con= d_cmp"); > >>>>>> + gimple_add_tmp_var (tmp); > >>>>>> + gimplify_assign (tmp, xop0, pre_p); > >>>>>> + TREE_OPERAND (*expr_p, 0) =3D tmp; > >>>>>> r1 =3D gimplify_expr (&TREE_OPERAND (*expr_p, 1), p= re_p, > >>>>>> post_p, is_gimple_val, fb_rvalu= e); > >>>>>> > >>>>>> all of VEC_COND_EXPR can now be a simple goto expr_3; > >>>>> > >>>>> Works for me, thanks! > >>>>> > >>>>>> > >>>>>> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c > >>>>>> index 494c9e9c20b..090fb52a2f1 100644 > >>>>>> --- a/gcc/tree-ssa-forwprop.c > >>>>>> +++ b/gcc/tree-ssa-forwprop.c > >>>>>> @@ -3136,6 +3136,10 @@ pass_forwprop::execute (function *fun) > >>>>>> if (code =3D=3D COND_EXPR > >>>>>> || code =3D=3D VEC_COND_EXPR) > >>>>>> { > >>>>>> + /* Do not propagate into VEC_COND_EXPRs. = */ > >>>>>> + if (code =3D=3D VEC_COND_EXPR) > >>>>>> + break; > >>>>>> + > >>>>>> > >>>>>> err - remove the || code =3D=3D VEC_COND_EXPR instead? > >>>>> > >>>>> Yep. > >>>>> > >>>>>> > >>>>>> @@ -2221,24 +2226,12 @@ expand_vector_operations (void) > >>>>>> { > >>>>>> gimple_stmt_iterator gsi; > >>>>>> basic_block bb; > >>>>>> - bool cfg_changed =3D false; > >>>>>> > >>>>>> FOR_EACH_BB_FN (bb, cfun) > >>>>>> - { > >>>>>> - for (gsi =3D gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next = (&gsi)) > >>>>>> - { > >>>>>> - expand_vector_operations_1 (&gsi); > >>>>>> - /* ??? If we do not cleanup EH then we will ICE in > >>>>>> - verification. But in reality we have created wrong-c= ode > >>>>>> - as we did not properly transition EH info and edges t= o > >>>>>> - the piecewise computations. */ > >>>>>> - if (maybe_clean_eh_stmt (gsi_stmt (gsi)) > >>>>>> - && gimple_purge_dead_eh_edges (bb)) > >>>>>> - cfg_changed =3D true; > >>>>>> - } > >>>>>> - } > >>>>>> > >>>>>> I'm not sure about this. Consider the C++ testcase where > >>>>>> the ?: is replaced by a division. If veclower needs to replace > >>>>>> that with four scalrar division statements then the above > >>>>>> still applies - veclower does not correctly duplicate EH info > >>>>>> and EH edges to the individual divisions (and we do not know > >>>>>> which component might trap). > >>>>>> > >>>>>> So please leave the above in. You can try if using integer > >>>>>> division makes it break and add such a testcase if there's > >>>>>> no coverage for this in the testsuite. > >>>>> > >>>>> I'm leaving that above. Can you please explain how can a division t= est-case > >>>>> be created? > >>>> > >>>> typedef long v2di __attribute__((vector_size(16))); > >>>> > >>>> v2di foo (v2di a, v2di b) > >>>> { > >>>> try > >>>> { > >>>> v2di res =3D a / b; > >>>> return res; > >>>> } > >>>> catch (...) > >>>> { > >>>> return (v2di){}; > >>>> } > >>>> } > >>>> > >>>> with -fnon-call-exceptions I see in t.ii.090t.ehdisp (correctly): > >>>> > >>>> ;; basic block 2, loop depth 0 > >>>> ;; pred: ENTRY > >>>> [LP 1] _6 =3D a_4(D) / b_5(D); > >>>> ;; succ: 5 > >>>> ;; 3 > >>>> > >>>> while after t.ii.226t.veclower we have > >>>> > >>>> ;; basic block 2, loop depth 0 > >>>> ;; pred: ENTRY > >>>> _13 =3D BIT_FIELD_REF ; > >>>> _14 =3D BIT_FIELD_REF ; > >>>> _15 =3D _13 / _14; > >>>> _16 =3D BIT_FIELD_REF ; > >>>> _17 =3D BIT_FIELD_REF ; > >>>> _18 =3D _16 / _17; > >>>> _6 =3D {_15, _18}; > >>>> res_7 =3D _6; > >>>> _8 =3D res_7; > >>>> ;; succ: 3 > >>>> > >>>> and all EH is gone and we'd ICE if you remove the above hunk. Hopef= ully. > >>> > >>> Yes, it ICEs then: > >>> > >>> > >>> ./xg++ -B. ~/Programming/testcases/ice.c -c -fnon-call-exceptions -O3 > >>> /home/marxin/Programming/testcases/ice.c: In function =E2=80=98v2di f= oo(v2di, v2di)=E2=80=99: > >>> /home/marxin/Programming/testcases/ice.c:3:6: error: statement marked= for throw, but doesn=E2=80=99t > >>> 3 | v2di foo (v2di a, v2di b) > >>> | ^~~ > >>> _6 =3D {_12, _15}; > >>> during GIMPLE pass: veclower2 > >>> /home/marxin/Programming/testcases/ice.c:3:6: internal compiler error= : verify_gimple failed > >>> 0x10e308a verify_gimple_in_cfg(function*, bool) > >>> /home/marxin/Programming/gcc/gcc/tree-cfg.c:5461 > >>> 0xfc9caf execute_function_todo > >>> /home/marxin/Programming/gcc/gcc/passes.c:1985 > >>> 0xfcaafc do_per_function > >>> /home/marxin/Programming/gcc/gcc/passes.c:1640 > >>> 0xfcaafc execute_todo > >>> /home/marxin/Programming/gcc/gcc/passes.c:2039 > >>> Please submit a full bug report, > >>> with preprocessed source if appropriate. > >>> Please include the complete backtrace with any bug report. > >>> See for instructions. > >>> > >>>> > >>>> We still generate wrong-code obviously as we'd need to duplicate the > >>>> EH info on each component division (and split blocks and generate > >>>> extra EH edges). That's a pre-existing bug of course. I just wante= d > >>>> to avoid to create a new instance just because of the early instruct= ion > >>>> selection for VEC_COND_EXPR. > >>> > >>> Fine! > >>> > >>>> > >>>>>> > >>>>>> What's missing from the patch is adjusting > >>>>>> verify_gimple_assign_ternary from > >>>>>> > >>>>>> if (((rhs_code =3D=3D VEC_COND_EXPR || rhs_code =3D=3D COND_E= XPR) > >>>>>> ? !is_gimple_condexpr (rhs1) : !is_gimple_val (rhs1)) > >>>>>> || !is_gimple_val (rhs2) > >>>>>> || !is_gimple_val (rhs3)) > >>>>>> { > >>>>>> error ("invalid operands in ternary operation"); > >>>>>> return true; > >>>>>> > >>>>>> to the same with the rhs_code =3D=3D VEC_COND_EXPR case removed. > >>>>> > >>>>> Hmm. I'm not sure I've got this comment. Why do we want to change i= t > >>>>> and is it done wright in the patch? > >>>> > >>>> Ah, I missed the hunk you added. > >>> > >>> That explains the confusion I got. > >>> > >>>> But the check should be an inclusive > >>>> one, not an exclusive one and earlier accepting a is_gimple_condexpr > >>>> is superfluous when you later reject the tcc_comparison part. Just > >>>> testing is_gimple_val is better. So yes, remove your tree-cfg.c hun= k > >>>> and just adjust the above test. > >>> > >>> I simplified that. > >>> > >>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests= . > >> > >> Please double-check the changelog > >> > >> (do_store_flag): > >> > >> + tree-vect-isel.o \ > >> > >> IMHO we want to move more of the pattern matching magic of RTL > >> expansion here to obsolete TER. So please name it gimple-isel.cc > >> (.cc!, not .c) > >> > >> + gassign *assign =3D dyn_cast (SSA_NAME_DEF_STMT (cond)); > >> + if (stmt !=3D NULL > >> + && TREE_CODE_CLASS (gimple_assign_rhs_code (assign)) !=3D tcc_c= omparison) > >> + return ERROR_MARK; > >> > >> you want stmt =3D=3D NULL || TREE_CODE_CLASS (...) > >> > >> in case the def stmt is a call. > >> > >> + gimple_seq seq; > >> + tree exp =3D force_gimple_operand (comb, &seq, true, NULL_TR= EE); > >> + if (seq) > >> + { > >> + gimple_stmt_iterator gsi =3D gsi_for_stmt (vcond0); > >> + gsi_insert_before (&gsi, seq, GSI_SAME_STMT); > >> + } > >> > >> use force_gimple_operand_gsi that makes the above simpler. > >> > >> if (invert) > >> - std::swap (*gimple_assign_rhs2_ptr (stmt0), > >> - *gimple_assign_rhs3_ptr (stmt0)); > >> - update_stmt (stmt0); > >> + std::swap (*gimple_assign_rhs2_ptr (vcond0), > >> + *gimple_assign_rhs3_ptr (vcond0)); > >> > >> use swap_ssa_operands. > >> > >> + gimple_assign_set_rhs1 (vcond0, exp); > >> + update_stmt (vcond0); > >> > >> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c > >> index cf2d979fea1..710b17a7c5c 100644 > >> --- a/gcc/tree-vect-stmts.c > >> +++ b/gcc/tree-vect-stmts.c > >> @@ -9937,8 +9937,8 @@ vectorizable_condition (vec_info *vinfo, > >> { > >> vec_cond_rhs =3D vec_oprnds1[i]; > >> if (bitop1 =3D=3D NOP_EXPR) > >> - vec_compare =3D build2 (cond_code, vec_cmp_type, > >> - vec_cond_lhs, vec_cond_rhs); > >> + vec_compare =3D gimplify_build2 (gsi, cond_code, vec_cmp_t= ype, > >> + vec_cond_lhs, vec_cond_rhs)= ; > >> else > >> { > >> > >> please don't introduce more uses of gimplify_buildN - I'd like to > >> get rid of those. You can use > >> > >> gimple_seq stmts =3D NULL; > >> vec_compare =3D gimple_build (&stmts, cond_code, ...); > >> gsi_insert_seq_before/after (...); > >> > >> OK with those changes. > > > > Applying the patch caused > > > > Running target unix//-m32 > > FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3 > > -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer > > -finline-functions > > FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3 -g > > I can't reproduce that with current master. Can you? Yes. > make check-gcc RUNTESTFLAGS=3D"--target_board=3Dunix/-m32 ieee.exp=3Dpr50= 310.c" ... FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3 -g mind the -m32 > > > > and > > > > FAIL: ext/random/simd_fast_mersenne_twister_engine/operators/inequal.cc > > (test for excess errors) > > UNRESOLVED: ext/random/simd_fast_mersenne_twister_engine/operators/ineq= ual.cc > > compilation failed to produce executable > > I've just fixed this one. > > Martin > > > > > Richard. > > > >> Thanks, > >> Richard. > >> > >> > >>> Thanks, > >>> Martin > >>> > >>>> > >>>>>> > >>>>>> You'll likely figure the vectorizer still creates some VEC_COND_EX= PRs > >>>>>> with embedded comparisons. > >>>>> > >>>>> I've fixed 2 failing test-cases I mentioned in the previous email. > >>>>> > >>>>> Martin > >>>>> > >>>>>> > >>>>>> Thanks, > >>>>>> Richard. > >>>>>> > >>>>>> > >>>>>>> Martin > >>>>> > >>> >