From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109088] GCC does not always vectorize conditional reduction
Date: Wed, 27 Sep 2023 14:11:09 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: juzhe.zhong at rivai dot ai
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088

--- Comment #13 from JuzheZhong ---
Hi, Richi.

This is my draft approach to finding more potential conditional reductions.

diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index a8c915913ae..c25d2038f16 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -1790,8 +1790,72 @@ is_cond_scalar_reduction (gimple *phi, gimple **reduc, tree arg_0, tree arg_1,
           std::swap (r_op1, r_op2);
           std::swap (r_nop1, r_nop2);
         }
-      else if (r_nop1 != PHI_RESULT (header_phi))
-        return false;
+      else if (r_nop1 == PHI_RESULT (header_phi))
+        ;
+      else
+        {
+          /* Analyze the statement chain of STMT so that we can generate a
+             better if-conversion code sequence.  We are trying to catch the
+             following situation:
+
+               loop-header:
+               reduc_1 = PHI <..., reduc_2>
+               ...
+               if (...)
+               tmp1 = reduc_1 + rhs1;
+               tmp2 = tmp1 + rhs2;
+               tmp3 = tmp2 + rhs3;
+               ...
+               reduc_3 = tmpN-1 + rhsN-1;
+
+               reduc_2 = PHI
+
+             and convert to
+
+               reduc_2 = PHI <0, reduc_1>
+               tmp1 = rhs1 + rhs2;
+               tmp2 = tmp1 + rhs3;
+               tmp3 = tmp2 + rhs4;
+               ...
+               tmpN-1 = tmpN-2 + rhsN;
+               ifcvt = cond_expr ? tmpN-1 : 0
+               reduc_1 = tmpN-1 +/- ifcvt;  */
+          if (num_imm_uses (PHI_RESULT (header_phi)) != 2)
+            return false;
+          FOR_EACH_IMM_USE_FAST (use_p, imm_iter, PHI_RESULT (header_phi))
+            {
+              gimple *use_stmt = USE_STMT (use_p);
+              if (is_gimple_assign (use_stmt))
+                {
+                  if (gimple_assign_rhs_code (use_stmt) != reduction_op)
+                    return false;
+                  if (TREE_CODE (gimple_assign_lhs (use_stmt)) != SSA_NAME)
+                    return false;
+
+                  bool visited_p = false;
+                  while (!visited_p)
+                    {
+                      use_operand_p use;
+                      if (!single_imm_use (gimple_assign_lhs (use_stmt), &use,
+                                           &use_stmt)
+                          || gimple_bb (use_stmt) != gimple_bb (stmt)
+                          || !is_gimple_assign (use_stmt)
+                          || TREE_CODE (gimple_assign_lhs (use_stmt)) != SSA_NAME
+                          || gimple_assign_rhs_code (use_stmt) != reduction_op)
+                        return false;
+
+                      if (gimple_assign_lhs (use_stmt) == gimple_assign_lhs (stmt))
+                        {
+                          r_op2 = r_op1;
+                          r_op1 = PHI_RESULT (header_phi);
+                          visited_p = true;
+                        }
+                    }
+                }
+              else if (use_stmt != phi)
+                return false;
+            }
+        }

My approach is
to do the check as follows:

tmp1 = reduc_1 + rhs1;
tmp2 = tmp1 + rhs2;
tmp3 = tmp2 + rhs3;
...
reduc_3 = tmpN-1 + rhsN-1;

Start the check at "tmp1 = reduc_1 + rhs1;" and iterate until
"reduc_3 = tmpN-1 + rhsN-1;", making sure each statement in the chain uses
the reduction operation (PLUS_EXPR for a sum reduction).

Does it look reasonable? Vectorization succeeds with this patch.
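
For illustration only, here is a minimal source-level sketch of the kind of loop the chain check is meant to recognize, next to a hand-written branch-free form of the same computation. This is not taken from the PR testcase: the function names, the `Vec` alias and the three-addend chain are hypothetical, and unsigned arithmetic is assumed so that reassociating the additions is exact.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<unsigned>;  // hypothetical alias for the example

// Hypothetical source loop: inside the condition, the reduction variable
// flows through a chain of additions (tmp1, tmp2) before it is written back.
unsigned cond_sum_branchy (const Vec &cond, const Vec &a,
                           const Vec &b, const Vec &c)
{
  unsigned sum = 0;
  for (std::size_t i = 0; i < cond.size (); ++i)
    if (cond[i])
      {
        unsigned tmp1 = sum + a[i];
        unsigned tmp2 = tmp1 + b[i];
        sum = tmp2 + c[i];
      }
  return sum;
}

// Hand-written branch-free form: the addends are summed first, the result
// is selected against 0, and only then added to the reduction variable.
unsigned cond_sum_ifcvt (const Vec &cond, const Vec &a,
                         const Vec &b, const Vec &c)
{
  unsigned sum = 0;
  for (std::size_t i = 0; i < cond.size (); ++i)
    {
      unsigned tmp1 = a[i] + b[i];
      unsigned tmp2 = tmp1 + c[i];
      unsigned ifcvt = cond[i] ? tmp2 : 0;
      sum = sum + ifcvt;
    }
  return sum;
}

// Compares both forms on the same inputs (assumes equal-length vectors).
void check_equivalent (const Vec &cond, const Vec &a,
                       const Vec &b, const Vec &c)
{
  assert (cond_sum_branchy (cond, a, b, c) == cond_sum_ifcvt (cond, a, b, c));
}
```

Calling `check_equivalent` with equal-length vectors compares the two forms. The second form keeps a single unconditional update of `sum` per iteration, which is the shape a sum reduction needs in order to be vectorized.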