From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-x22e.google.com (mail-lj1-x22e.google.com [IPv6:2a00:1450:4864:20::22e]) by sourceware.org (Postfix) with ESMTPS id 7A9B53860758 for ; Fri, 30 Jun 2023 11:18:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7A9B53860758 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-lj1-x22e.google.com with SMTP id 38308e7fff4ca-2b6b98ac328so27205401fa.0 for ; Fri, 30 Jun 2023 04:18:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688123917; x=1690715917; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=ydGKe9nTnVh0wjhMOY0FoJwqnZsNSak4bJDo0qayygI=; b=KSNVil0v4SjUYKcA6LfwXRycAsinqpjfY/+E8p6xr3suDuzlskZRNTQkuCXlug5h6u 0KEER/ynPLub9VTlqgfr7sxLaJozkoE2+PdqqaeM0apMlNihpB0KbbeKSsKAg/PZhBjp gPj4UIdBwTGtHIWGU2d7TfYNHw5+ORiC+/Ghx51K0dtxiFkpiVz6HJKKHp9cQUd3JXZS oM5EwyKjdwG+E4IaayzBcB0p51m3sf0+ooJap7mIKuVZKGq3a1UNtDrbGmyIh7vaVBbw +4qD9lUZ+mwgOUsULE544aL809Np/JRLfry2VaNWXTGPKB0J08INM60KZI8RZC5i8kfG wzLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688123917; x=1690715917; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ydGKe9nTnVh0wjhMOY0FoJwqnZsNSak4bJDo0qayygI=; b=io6WwIrpgqlElYNj4cLwlaAqCl/Xgxz9+30TNPns0kNOqPo+wbhwOTwebdgrfIcjUO kznRErv4CKxqWwnwQT8/MwA2iJZJMCjD9jcS3JI6DqRh87vjM6z0h5FYpb4bm6hpciME UFrloZnRGeslh5sb6rtvSNVkbKu+1qQ/djL++GFOf1ruHOZ6rG6dXYZCZCD1VJQBG9Li +KB4KsgkVzWft9jucYSTtP1fULL0SCxs4Ldspep6jJOpCdmXBCLrF1Dr20j9djCjHoY6 DZfkPy/n9s1g4g//DOxzkPoomXpWi9jp3+LmYQ5JndEVhee5XSlWRQFE746F7fsctzAn L/Sg== X-Gm-Message-State: ABy/qLbY7TevN9Q30aaAE8fvftU8bbh+x5mIRRT+k1/0etvm/xFFoutp InbKXTQcwg/rSMyHH90OBOYpiVeQgbrucdAaoHE= X-Google-Smtp-Source: APBJJlG27d9XYvdDbbptDgaeb5oXv+FtkGnPJ2nzIAiU0gHtQQAE8X2U5pSb0AX/cs9Go/dFs2Md5L7RSMNX3/8seEo= X-Received: by 2002:a2e:97cc:0:b0:2b5:7fd2:ec36 with SMTP id m12-20020a2e97cc000000b002b57fd2ec36mr2028163ljj.21.1688123916578; Fri, 30 Jun 2023 04:18:36 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Fri, 30 Jun 2023 13:18:23 +0200 Message-ID: Subject: Re: [PATCH 3/9] vect: Adjust vectorizable_load costing on VMAT_INVARIANT To: Kewen Lin Cc: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com, segher@kernel.crashing.org, bergner@linux.ibm.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Tue, Jun 13, 2023 at 4:03=E2=80=AFAM Kewen Lin wro= te: > > This patch adjusts the cost handling on VMAT_INVARIANT in > function vectorizable_load. We don't call function > vect_model_load_cost for it any more. > > To make the costing on VMAT_INVARIANT better, this patch is > to query hoist_defs_of_uses for hoisting decision, and add > costs for different "where" based on it. Currently function > hoist_defs_of_uses would always hoist the defs of all SSA > uses, adding one argument HOIST_P aims to avoid the actual > hoisting during costing phase. > > gcc/ChangeLog: > > * tree-vect-stmts.cc (hoist_defs_of_uses): Add one argument HOIST= _P. > (vectorizable_load): Adjust the handling on VMAT_INVARIANT to res= pect > hoisting decision and without calling vect_model_load_cost. > (vect_model_load_cost): Assert it won't get VMAT_INVARIANT any mo= re > and remove VMAT_INVARIANT related handlings. > --- > gcc/tree-vect-stmts.cc | 61 +++++++++++++++++++++++++++--------------- > 1 file changed, 39 insertions(+), 22 deletions(-) > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > index 744cdf40e26..19c61d703c8 100644 > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -1135,7 +1135,8 @@ vect_model_load_cost (vec_info *vinfo, > slp_tree slp_node, > stmt_vector_for_cost *cost_vec) > { > - gcc_assert (memory_access_type !=3D VMAT_GATHER_SCATTER || !gs_info->d= ecl); > + gcc_assert ((memory_access_type !=3D VMAT_GATHER_SCATTER || !gs_info->= decl) > + && memory_access_type !=3D VMAT_INVARIANT); > > unsigned int inside_cost =3D 0, prologue_cost =3D 0; > bool grouped_access_p =3D STMT_VINFO_GROUPED_ACCESS (stmt_info); > @@ -1238,16 +1239,6 @@ vect_model_load_cost (vec_info *vinfo, > ncopies * assumed_nunits, > scalar_load, stmt_info, 0, vect_bo= dy); > } > - else if (memory_access_type =3D=3D VMAT_INVARIANT) > - { > - /* Invariant loads will ideally be hoisted and splat to a vector. = */ > - prologue_cost +=3D record_stmt_cost (cost_vec, 1, > - scalar_load, stmt_info, 0, > - vect_prologue); > - prologue_cost +=3D record_stmt_cost (cost_vec, 1, > - scalar_to_vec, stmt_info, 0, > - vect_prologue); > - } > else > vect_get_load_cost (vinfo, stmt_info, ncopies, > alignment_support_scheme, misalignment, first_stm= t_p, > @@ -9121,10 +9112,11 @@ permute_vec_elements (vec_info *vinfo, > /* Hoist the definitions of all SSA uses on STMT_INFO out of the loop LO= OP, > inserting them on the loops preheader edge. Returns true if we > were successful in doing so (and thus STMT_INFO can be moved then), > - otherwise returns false. */ > + otherwise returns false. HOIST_P indicates if we want to hoist the > + definitions of all SSA uses, it would be false when we are costing. = */ > > static bool > -hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop) > +hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop, bool hois= t_p) > { > ssa_op_iter i; > tree op; > @@ -9158,6 +9150,9 @@ hoist_defs_of_uses (stmt_vec_info stmt_info, class = loop *loop) > if (!any) > return true; > > + if (!hoist_p) > + return true; > + > FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE) > { > gimple *def_stmt =3D SSA_NAME_DEF_STMT (op); > @@ -9510,14 +9505,6 @@ vectorizable_load (vec_info *vinfo, > > if (memory_access_type =3D=3D VMAT_INVARIANT) > { > - if (costing_p) > - { > - vect_model_load_cost (vinfo, stmt_info, ncopies, vf, > - memory_access_type, alignment_support_sch= eme, > - misalignment, &gs_info, slp_node, cost_ve= c); > - return true; > - } > - > gcc_assert (!grouped_load && !mask && !bb_vinfo); > /* If we have versioned for aliasing or the loop doesn't > have any data dependencies that would preclude this, > @@ -9525,7 +9512,37 @@ vectorizable_load (vec_info *vinfo, > thus we can insert it on the preheader edge. */ > bool hoist_p =3D (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) > && !nested_in_vect_loop > - && hoist_defs_of_uses (stmt_info, loop)); > + && hoist_defs_of_uses (stmt_info, loop, !costing_p)= ); 'hoist_defs_of_uses' should ideally be computed once at analysis time and the result remembered. It's not so easy in this case so maybe just add a comment for this here. > + if (costing_p) > + { > + if (hoist_p) > + { > + unsigned int prologue_cost; > + prologue_cost =3D record_stmt_cost (cost_vec, 1, scalar_loa= d, > + stmt_info, 0, vect_prolog= ue); > + prologue_cost +=3D record_stmt_cost (cost_vec, 1, scalar_to= _vec, > + stmt_info, 0, vect_prolo= gue); > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "vect_model_load_cost: inside_cost =3D 0= , " > + "prologue_cost =3D %d .\n", > + prologue_cost); > + } > + else > + { > + unsigned int inside_cost; > + inside_cost =3D record_stmt_cost (cost_vec, 1, scalar_load, > + stmt_info, 0, vect_body); > + inside_cost +=3D record_stmt_cost (cost_vec, 1, scalar_to_v= ec, > + stmt_info, 0, vect_body); > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "vect_model_load_cost: inside_cost =3D %= d, " > + "prologue_cost =3D 0 .\n", > + inside_cost); > + } Please instead do enum vect_cost_model_location loc =3D hoist_p ? vect_prologue : vect_body; and merge the two branches which otherwise look identical to me. > + return true; > + } > if (hoist_p) > { > gassign *stmt =3D as_a (stmt_info->stmt); > -- > 2.31.1 >