From: Richard Biener
Date: Tue, 25 Oct 2022 13:08:55 +0200
Subject: Re: [PATCH 1/2] ivopts: Revert computation of address cost complexity.
To: Dimitrije Milosevic
Cc: gcc-patches@gcc.gnu.org, djordje.todorovic@syrmia.com
References: <20221021135203.626255-1-dimitrije.milosevic@syrmia.com> <20221021135203.626255-2-dimitrije.milosevic@syrmia.com>
In-Reply-To: <20221021135203.626255-2-dimitrije.milosevic@syrmia.com>

On Fri, Oct 21, 2022 at 3:56 PM Dimitrije Milosevic wrote:
>
> From: Dimitrije Milošević
>
> This patch reverts the computation of address cost complexity
> to the legacy one. After f9f69dd, complexity is calculated
> using the valid_mem_ref_p target hook. Architectures like
> Mips only allow BASE + OFFSET addressing modes, which in turn
> prevents the calculation of complexity for other addressing
> modes, resulting in non-optimal candidate selection.

I don't follow how only having BASE + OFFSET addressing prevents
calculation of complexity for other addressing modes? Can you explain?

Do you have a testcase that shows how both changes improve IV selection
for MIPS?

>
> gcc/ChangeLog:
>
>         * tree-ssa-address.cc (multiplier_allowed_in_address_p): Change
>         to non-static.
>         * tree-ssa-address.h (multiplier_allowed_in_address_p): Declare.
>         * tree-ssa-loop-ivopts.cc (compute_symbol_and_var_present): Reintroduce.
>         (compute_min_and_max_offset): Likewise.
>         (get_address_cost): Revert
>         complexity calculation.
>
> Signed-off-by: Dimitrije Milosevic
> ---
>  gcc/tree-ssa-address.cc     |   2 +-
>  gcc/tree-ssa-address.h      |   2 +
>  gcc/tree-ssa-loop-ivopts.cc | 214 ++++++++++++++++++++++++++++++++++--
>  3 files changed, 207 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/tree-ssa-address.cc b/gcc/tree-ssa-address.cc
> index ba7b7c93162..442f54f0165 100644
> --- a/gcc/tree-ssa-address.cc
> +++ b/gcc/tree-ssa-address.cc
> @@ -561,7 +561,7 @@ add_to_parts (struct mem_address *parts, tree elt)
>     validity for a memory reference accessing memory of mode MODE in address
>     space AS. */
>
> -static bool
> +bool
>  multiplier_allowed_in_address_p (HOST_WIDE_INT ratio, machine_mode mode,
>                                   addr_space_t as)
>  {
> diff --git a/gcc/tree-ssa-address.h b/gcc/tree-ssa-address.h
> index 95143a099b9..09f36ee2f19 100644
> --- a/gcc/tree-ssa-address.h
> +++ b/gcc/tree-ssa-address.h
> @@ -38,6 +38,8 @@ tree create_mem_ref (gimple_stmt_iterator *, tree,
>                       class aff_tree *, tree, tree, tree, bool);
>  extern void copy_ref_info (tree, tree);
>  tree maybe_fold_tmr (tree);
> +bool multiplier_allowed_in_address_p (HOST_WIDE_INT ratio, machine_mode mode,
> +                                      addr_space_t as);
>
>  extern unsigned int preferred_mem_scale_factor (tree base,
>                                                  machine_mode mem_mode,
> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> index a6f926a68ef..d53ba05a4f6 100644
> --- a/gcc/tree-ssa-loop-ivopts.cc
> +++ b/gcc/tree-ssa-loop-ivopts.cc
> @@ -4774,6 +4774,135 @@ get_address_cost_ainc (poly_int64 ainc_step, poly_int64 ainc_offset,
>    return infinite_cost;
>  }
>
> +static void
> +compute_symbol_and_var_present (tree e1, tree e2,
> +                                bool *symbol_present, bool *var_present)
> +{
> +  poly_uint64_pod off1, off2;
> +
> +  e1 = strip_offset (e1, &off1);
> +  e2 = strip_offset (e2, &off2);
> +
> +  STRIP_NOPS (e1);
> +  STRIP_NOPS (e2);
> +
> +  if (TREE_CODE (e1) == ADDR_EXPR)
> +    {
> +      poly_int64_pod diff;
> +      if (ptr_difference_const (e1, e2, &diff))
> +        {
> +          *symbol_present = false;
> +          *var_present = false;
> +          return;
> +        }
> +
> +      if (integer_zerop (e2))
> +        {
> +          tree core;
> +          poly_int64_pod bitsize;
> +          poly_int64_pod bitpos;
> +          widest_int mul;
> +          tree toffset;
> +          machine_mode mode;
> +          int unsignedp, reversep, volatilep;
> +
> +          core = get_inner_reference (TREE_OPERAND (e1, 0), &bitsize, &bitpos,
> +            &toffset, &mode, &unsignedp, &reversep, &volatilep);
> +
> +          if (toffset != 0
> +              || !constant_multiple_p (bitpos, BITS_PER_UNIT, &mul)
> +              || reversep
> +              || !VAR_P (core))
> +            {
> +              *symbol_present = false;
> +              *var_present = true;
> +              return;
> +            }
> +
> +          if (TREE_STATIC (core)
> +              || DECL_EXTERNAL (core))
> +            {
> +              *symbol_present = true;
> +              *var_present = false;
> +              return;
> +            }
> +
> +          *symbol_present = false;
> +          *var_present = true;
> +          return;
> +        }
> +
> +      *symbol_present = false;
> +      *var_present = true;
> +    }
> +  *symbol_present = false;
> +
> +  if (operand_equal_p (e1, e2, 0))
> +    {
> +      *var_present = false;
> +      return;
> +    }
> +
> +  *var_present = true;
> +}
> +
> +static void
> +compute_min_and_max_offset (addr_space_t as,
> +                            machine_mode mem_mode, poly_int64_pod *min_offset,
> +                            poly_int64_pod *max_offset)
> +{
> +  machine_mode address_mode = targetm.addr_space.address_mode (as);
> +  HOST_WIDE_INT i;
> +  poly_int64_pod off, width;
> +  rtx addr;
> +  rtx reg1;
> +
> +  reg1 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1);
> +
> +  width = GET_MODE_BITSIZE (address_mode) - 1;
> +  if (known_gt (width, HOST_BITS_PER_WIDE_INT - 1))
> +    width = HOST_BITS_PER_WIDE_INT - 1;
> +  gcc_assert (width.is_constant ());
> +  addr = gen_rtx_fmt_ee (PLUS, address_mode, reg1, NULL_RTX);
> +
> +  off = 0;
> +  for (i = width.to_constant (); i >= 0; i--)
> +    {
> +      off = -(HOST_WIDE_INT_1U << i);
> +      XEXP (addr, 1) = gen_int_mode (off, address_mode);
> +      if (memory_address_addr_space_p (mem_mode, addr, as))
> +        break;
> +    }
> +  if (i == -1)
> +    *min_offset = 0;
> +  else
> +    *min_offset = off;
> +  // *min_offset = (i == -1? 0 : off);
> +
> +  for (i = width.to_constant (); i >= 0; i--)
> +    {
> +      off = (HOST_WIDE_INT_1U << i) - 1;
> +      XEXP (addr, 1) = gen_int_mode (off, address_mode);
> +      if (memory_address_addr_space_p (mem_mode, addr, as))
> +        break;
> +      /* For some strict-alignment targets, the offset must be naturally
> +         aligned. Try an aligned offset if mem_mode is not QImode. */
> +      off = mem_mode != QImode
> +            ? (HOST_WIDE_INT_1U << i)
> +              - (GET_MODE_SIZE (mem_mode))
> +            : 0;
> +      if (known_gt (off, 0))
> +        {
> +          XEXP (addr, 1) = gen_int_mode (off, address_mode);
> +          if (memory_address_addr_space_p (mem_mode, addr, as))
> +            break;
> +        }
> +    }
> +  if (i == -1)
> +    off = 0;
> +  *max_offset = off;
> +}
> +
>  /* Return cost of computing USE's address expression by using CAND.
>     AFF_INV and AFF_VAR represent invariant and variant parts of the
>     address expression, respectively. If AFF_INV is simple, store
> @@ -4802,6 +4931,13 @@ get_address_cost (struct ivopts_data *data, struct iv_use *use,
>    /* Only true if ratio != 1. */
>    bool ok_with_ratio_p = false;
>    bool ok_without_ratio_p = false;
> +  tree ubase = use->iv->base;
> +  tree cbase = cand->iv->base, cstep = cand->iv->step;
> +  tree utype = TREE_TYPE (ubase), ctype;
> +  unsigned HOST_WIDE_INT cstepi;
> +  bool symbol_present = false, var_present = false, stmt_is_after_increment;
> +  poly_int64_pod min_offset, max_offset;
> +  bool offset_p, ratio_p;
>
>    if (!aff_combination_const_p (aff_inv))
>      {
> @@ -4915,16 +5051,74 @@ get_address_cost (struct ivopts_data *data, struct iv_use *use,
>    gcc_assert (memory_address_addr_space_p (mem_mode, addr, as));
>    cost += address_cost (addr, mem_mode, as, speed);
>
> -  if (parts.symbol != NULL_TREE)
> -    cost.complexity += 1;
> -  /* Don't increase the complexity of adding a scaled index if it's
> -     the only kind of index that the target allows. */
> -  if (parts.step != NULL_TREE && ok_without_ratio_p)
> -    cost.complexity += 1;
> -  if (parts.base != NULL_TREE && parts.index != NULL_TREE)
> -    cost.complexity += 1;
> -  if (parts.offset != NULL_TREE && !integer_zerop (parts.offset))
> -    cost.complexity += 1;
> +  if (cst_and_fits_in_hwi (cstep))
> +    cstepi = int_cst_value (cstep);
> +  else
> +    cstepi = 0;
> +
> +  STRIP_NOPS (cbase);
> +  ctype = TREE_TYPE (cbase);
> +
> +  stmt_is_after_increment = stmt_after_increment (data->current_loop, cand,
> +                                                  use->stmt);
> +
> +  if (cst_and_fits_in_hwi (cbase))
> +    compute_symbol_and_var_present (ubase, build_int_cst (utype, 0),
> +                                    &symbol_present, &var_present);
> +  else if (ratio == 1)
> +    {
> +      tree real_cbase = cbase;
> +
> +      /* Check to see if any adjustment is needed. */
> +      if (!cst_and_fits_in_hwi (cstep) && stmt_is_after_increment)
> +        {
> +          aff_tree real_cbase_aff;
> +          aff_tree cstep_aff;
> +
> +          tree_to_aff_combination (cbase, TREE_TYPE (real_cbase),
> +                                   &real_cbase_aff);
> +          tree_to_aff_combination (cstep, TREE_TYPE (cstep), &cstep_aff);
> +
> +          aff_combination_add (&real_cbase_aff, &cstep_aff);
> +          real_cbase = aff_combination_to_tree (&real_cbase_aff);
> +        }
> +      compute_symbol_and_var_present (ubase, real_cbase,
> +                                      &symbol_present, &var_present);
> +    }
> +  else if (!POINTER_TYPE_P (ctype)
> +           && multiplier_allowed_in_address_p
> +                (ratio, mem_mode,
> +                 TYPE_ADDR_SPACE (TREE_TYPE (utype))))
> +    {
> +      tree real_cbase = cbase;
> +
> +      if (cstepi == 0 && stmt_is_after_increment)
> +        {
> +          if (POINTER_TYPE_P (ctype))
> +            real_cbase = fold_build2 (POINTER_PLUS_EXPR, ctype, cbase, cstep);
> +          else
> +            real_cbase = fold_build2 (PLUS_EXPR, ctype, cbase, cstep);
> +        }
> +      real_cbase = fold_build2 (MULT_EXPR, ctype, real_cbase,
> +                                build_int_cst (ctype, ratio));
> +      compute_symbol_and_var_present (ubase, real_cbase,
> +                                      &symbol_present, &var_present);
> +    }
> +  else
> +    {
> +      compute_symbol_and_var_present (ubase, build_int_cst (utype, 0),
> +                                      &symbol_present, &var_present);
> +    }
> +
> +  compute_min_and_max_offset (as, mem_mode, &min_offset, &max_offset);
> +  offset_p = maybe_ne (aff_inv->offset, 0)
> +             && known_le (min_offset, aff_inv->offset)
> +             && known_le (aff_inv->offset, max_offset);
> +  ratio_p = (ratio != 1
> +             && multiplier_allowed_in_address_p (ratio, mem_mode, as));
> +
> +  cost.complexity = (symbol_present != 0) + (var_present != 0)
> +                    + offset_p + ratio_p;
>
>    return cost;
>  }
> --
> 2.25.1
>
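
To make the comparison concrete, here is a minimal sketch of the two
complexity formulas that appear in the last hunk of the quoted patch: the
parts-based one being removed and the legacy one being reintroduced. It
rests on simplifying assumptions. AddrParts, parts_complexity and
legacy_complexity are hypothetical names, plain booleans and 64-bit
integers stand in for GCC's trees and poly_ints, and the valid offset
range is passed in rather than probed from the target. It is not GCC code
and does not show which formula leads to better candidate selection.

```cpp
#include <cstdint>

// Hypothetical stand-in for the decomposed address; NOT GCC's mem_address.
struct AddrParts {
  bool has_symbol;      // models parts.symbol != NULL_TREE
  bool has_base;        // models parts.base != NULL_TREE
  bool has_index;       // models parts.index != NULL_TREE
  bool has_step;        // models parts.step != NULL_TREE
  std::int64_t offset;  // models parts.offset (0 means absent or zero)
};

// Mirrors the lines the patch removes: one point per address part used.
unsigned parts_complexity(const AddrParts &p, bool ok_without_ratio_p) {
  unsigned complexity = 0;
  if (p.has_symbol)
    complexity += 1;
  // A scaled index only counts if an unscaled one would also be accepted.
  if (p.has_step && ok_without_ratio_p)
    complexity += 1;
  if (p.has_base && p.has_index)
    complexity += 1;
  if (p.offset != 0)
    complexity += 1;
  return complexity;
}

// Mirrors the lines the patch adds: symbol/var presence plus two flags.
unsigned legacy_complexity(bool symbol_present, bool var_present,
                           std::int64_t offset, std::int64_t min_offset,
                           std::int64_t max_offset, std::int64_t ratio,
                           bool multiplier_allowed) {
  // The offset counts only when nonzero and inside the assumed valid range.
  const bool offset_p =
      offset != 0 && min_offset <= offset && offset <= max_offset;
  // The ratio counts only when it is a real, target-supported multiplier.
  const bool ratio_p = ratio != 1 && multiplier_allowed;
  unsigned complexity = 0;
  complexity += symbol_present;
  complexity += var_present;
  complexity += offset_p;
  complexity += ratio_p;
  return complexity;
}
```

For a base register plus offset 8, with var_present assumed true and an
assumed valid offset range of -32768..32767, parts_complexity returns 1
and legacy_complexity returns 2. The inputs are illustrative and were not
taken from a real compilation.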