From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id EE9383846405 for ; Wed, 3 Apr 2024 21:09:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EE9383846405 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org EE9383846405 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712178553; cv=none; b=WyxrTRyMqCQ8Lhh4dJeWzVDWEFf63VJXAGGO3ECJindWwdlLkOMd+4mtw31ZmeoVYD7C0NB1GpFVryKaRqHob8Hbg+MyKSNpjm6a681jC/eNZFNxQYjbNNuDfEBBVEGd/pQTu0KIsSm8XE4andiLu8fL6ilLMYnBJLYlwWd9DhY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712178553; c=relaxed/simple; bh=bdeygb8C/cKXBJ268sSdNbztVF9UYCHlAKMt6CCN7JU=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=aNigJODtgYZ5mtJC5huCJhImz2mBlwb195l9RlRRfFYHi4ALETUii0ryR+Bsdtpl3KhMpZBqcM7bQmClbKJKRbXwDtw4nU2oJakNIpUdb+fMZkySDHA5lhGfJfb84FG27x8WU7dyrq0WgTTRX2+UN+YHYZ3DB1nqIRq196XBSjA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 75D841007; Wed, 3 Apr 2024 14:09:40 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 5FAFB3F7B4; Wed, 3 Apr 2024 14:09:08 -0700 (PDT) From: Richard Sandiford To: Alex Coplan Mail-Followup-To: Alex Coplan ,Ajit Agarwal , "Kewen.Lin" , Michael Meissner , Segher Boessenkool , Peter Bergner , David Edelsohn , gcc-patches , richard.sandiford@arm.com Cc: Ajit Agarwal , "Kewen.Lin" , Michael Meissner , Segher Boessenkool , Peter Bergner , David Edelsohn , gcc-patches Subject: Re: [PATCH V3 0/2] aarch64: Place target independent and dependent changed code in one file. References: <6a013cf5-a5f2-44df-bdb3-a4985fbae06a@linux.ibm.com> Date: Wed, 03 Apr 2024 22:09:07 +0100 In-Reply-To: (Alex Coplan's message of "Wed, 3 Apr 2024 16:21:22 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-14.7 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Alex Coplan writes: > On 23/02/2024 16:41, Ajit Agarwal wrote: >> Hello Richard/Alex/Segher: > > Hi Ajit, > > Sorry for the delay and thanks for working on this. > > Generally this looks like the right sort of approach (IMO) but I've left > some comments below. > > I'll start with a meta comment: in the subject line you have marked this > as 0/2, but usually 0/n is reserved for the cover letter of a patch > series and wouldn't contain an actual patch. I think this might have > confused the Linaro CI suitably such that it didn't run regression tests > on the patch. Alex, thanks for the thorough and in-depth review. I agree with all the comments FWIW. Just to add a couple of things: > > @@ -138,8 +138,18 @@ struct alt_base > > poly_int64 offset; > > }; > > > > +// Virtual base class for load/store walkers used in alias analysis. > > +struct alias_walker > > +{ > > + virtual bool conflict_p (int &budget) const = 0; > > + virtual insn_info *insn () const = 0; > > + virtual bool valid () const = 0; > > + virtual void advance () = 0; > > +}; > > + > > + > > // State used by the pass for a given basic block. > > -struct ldp_bb_info > > +struct pair_fusion > > As a comment on the high-level design, I think we want a generic class > for the overall pass, not just for the BB-specific structure. > > That is because naturally we want the ldp_fusion_bb function itself to > be a member of such a class, so that it can access virtual functions to > query the target e.g. about the load/store pair policy, and whether to > try and promote writeback pairs. > > If we keep all of the virtual functions in such an outer class, then we > can keep the ldp_fusion_bb class generic (not needing an override for > each target) and that inner class can perhaps be given a pointer or > reference to the outer class when it is instantiated in ldp_fusion_bb. I agree that in general, the new virtual methods should belong to a pass class rather than the per-bb class. In principle, if we need to virtualise existing members of ldp_bb_info (or code contained within existing members of ldp_bb_info), and if that code accesses members of the bb info, then it might make sense to have target-specific derivatives of the bb info structure too, with a virtual function to create the bb info structure for a given bb. However, it looks like all but one of the virtual functions in the patch are self-contained (in the sense of depending only on their arguments and on globals). The one exception is transform_for_base, but Alex asked whether that needs to be virtualised. If it doesn't, then like Alex says, it seems that all virtuals could belong to the pass class rather than to the bb info. >> [...] >> + } >> }; >> >> +bool >> +store_modifies_mem_p (rtx mem, insn_info *store_insn, int &budget); >> +bool load_modified_by_store_p (insn_info *load, >> + insn_info *store, >> + int &budget); >> +extern insn_info * >> +try_repurpose_store (insn_info *first, >> + insn_info *second, >> + const insn_range_info &move_range); >> + >> +void reset_debug_use (use_info *use); >> + >> +extern void >> +fixup_debug_uses (obstack_watermark &attempt, >> + insn_info *insns[2], >> + rtx orig_rtl[2], >> + insn_info *pair_dst, >> + insn_info *trailing_add, >> + bool load_p, >> + int writeback, >> + rtx writeback_effect, >> + unsigned base_regno); >> + >> +void >> +fixup_debug_uses_trailing_add (obstack_watermark &attempt, >> + insn_info *pair_dst, >> + insn_info *trailing_add, >> + rtx writeback_effect); >> + >> + >> +extern void >> +fixup_debug_use (obstack_watermark &attempt, >> + use_info *use, >> + def_info *def, >> + rtx base, >> + poly_int64 wb_offset); >> + >> +extern insn_info * >> +find_trailing_add (insn_info *insns[2], >> + const insn_range_info &pair_range, >> + int initial_writeback, >> + rtx *writeback_effect, >> + def_info **add_def, >> + def_info *base_def, >> + poly_int64 initial_offset, >> + unsigned access_size); >> + >> +rtx drop_writeback (rtx mem); >> +rtx pair_mem_strip_offset (rtx mem, poly_int64 *offset); >> +bool any_pre_modify_p (rtx x); >> +bool any_post_modify_p (rtx x); >> +int encode_lfs (lfs_fields fields); >> +extern insn_info * latest_hazard_before (insn_info *insn, rtx *ignore, >> + insn_info *ignore_insn = nullptr); >> +insn_info * first_hazard_after (insn_info *insn, rtx *ignore); >> +bool ranges_overlap_p (const insn_range_info &r1, const insn_range_info &r2); >> +insn_range_info get_def_range (def_info *def); >> +insn_range_info def_downwards_move_range (def_info *def); >> +insn_range_info def_upwards_move_range (def_info *def); >> +rtx gen_tombstone (void); >> +rtx filter_notes (rtx note, rtx result, bool *eh_region, rtx *fr_expr); >> +rtx combine_reg_notes (insn_info *i1, insn_info *i2, bool load_p); >> +rtx extract_writebacks (bool load_p, rtx pats[2], int changed); >> +void do_alias_analysis (insn_info *alias_hazards[4], >> + alias_walker *walkers[4], >> + bool load_p); >> +int get_viable_bases (insn_info *insns[2], >> + vec &base_cands, >> + rtx cand_mems[2], >> + unsigned access_size, >> + bool reversed); >> +void dump_insn_list (FILE *f, const insn_list_t &l); > > Is there really a need to forward-declare all of these? > >> + >> splay_tree_node * >> -ldp_bb_info::node_alloc (access_record *access) >> +pair_fusion::node_alloc (access_record *access) >> { >> using T = splay_tree_node; >> void *addr = obstack_alloc (&m_obstack, sizeof (T)); >> @@ -224,7 +401,7 @@ ldp_bb_info::node_alloc (access_record *access) >> // Given a mem MEM, if the address has side effects, return a MEM that accesses >> // the same address but without the side effects. Otherwise, return >> // MEM unchanged. >> -static rtx >> +rtx > > Is there a reason to change the linkage here? Presumably when this (and > many other such functions) get moved to a generic file then they will > only be called from within that file, so they could stay having static > linkage. > > That should remove a fair bit of noise from this patch. Yeah, agreed. In general, I don't think we should create any more true globals as part of this series. If functions need to be exposed to targets, I think it should by via static or non-static member functions of the pass class that Alex described above. >> [...] >> @@ -487,7 +662,7 @@ ldp_bb_info::track_access (insn_info *insn, bool load_p, rtx mem) >> poly_int64 mem_off; >> rtx addr = XEXP (mem, 0); >> const bool autoinc_p = GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC; >> - rtx base = ldp_strip_offset (mem, &mem_off); >> + rtx base = pair_mem_strip_offset (mem, &mem_off); > > For review purposes: it might be easier if you split off patches that > just do a simple rename into a separate patch. So all of the ldp/stp -> > pair mem renaming could be done in a preparatory patch. That would > reduce the noise in reviewing any subsequent patches. Yeah, I agree this would help. Sorry Ajit for making the submission more complex in terms of number of patches (though hopefully less complex in terms of following the changes). >> [...] >> @@ -1684,6 +1859,9 @@ ldp_bb_info::fuse_pair (bool load_p, >> insn_change::DELETE); >> }; >> >> + if (*i1 > *i2) >> + return false; >> + > > I don't think we want this: we want to be able to handle the case > where the insns in offset order are not in program order. What goes > wrong in this case for you? Yeah. It's OK/good for this patch to move existing code verbatim into virtual member functions. But please don't add new tests or conditions as part of this patch. Those would be better as separate pre-patches or post-patches, with an explanation for why each change is needed. Thanks, Richard