From: Richard Sandiford
To: Michael Matz via Gcc-patches
Cc: Richard Biener, Michael Matz, hubicka@ucw.cz
Subject: Re: [PATCH][RFC] Introduce TREE_AOREFWRAP to cache ao_ref in the IL
Date: Mon, 18 Oct 2021 10:58:22 +0100
In-Reply-To: (Michael Matz via Gcc-patches's message of "Thu, 14 Oct 2021 13:29:55 +0000 (UTC)")

Michael Matz via Gcc-patches writes:
> Hello,
>
> On Thu, 14 Oct 2021, Richard Biener wrote:
>
>> > So, at _this_ write-through of the email I think I like the above idea
>> > best: make ao_ref be a tree (at least its storage, because it currently
>> > is a one-member-function class), make ao_ref.volatile_p be
>> > tree_base.volatile_flag (hence TREE_VOLATILE(ao_ref)) (this reduces
>> > sizeof(ao_ref) by 8), increase all nr-of-operand of each tcc_reference by
>> > 1, and make TREE_AO_REF(reftree) be "TREE_OPERAND(reftree,
>> > TREE_CODE_LENGTH(reftree) - 1)", i.e. the last operand of such
>> > tcc_reference tree.
>>
>> Hmm.  I'm not sure that's really something I like - it's especially
>> quite some heavy lifting while at the same time lacking true boldness
>> as to changing the representation of memory refs ;)
>
> Well, it would at least enable such changes later in an orderly fashion.
>
>> That said - I've prototyped the TREE_ASM_WRITTEN way now because it's
>> even simpler than the original TREE_AOREFWRAP approach, see below.
>>
>> Note that I'm not embedding it into the tree structure, I'm merely
>> using the same allocation to store two objects, the outermost ref
>> and the ao_ref associated with it.  Quote:
>>
>> +  size_t length = tree_code_size (TREE_CODE (lhs));
>> +  if (!TREE_ASM_WRITTEN (lhs))
>> +    {
>> +      tree alt_lhs
>> +        = ggc_alloc_cleared_tree_node_stat (length + sizeof (ao_ref));
>> +      memcpy (alt_lhs, lhs, length);
>> +      TREE_ASM_WRITTEN (alt_lhs) = 1;
>> +      *ref = new ((char *)alt_lhs + length) ao_ref;
>
> You need to ensure that alt_lhs+length is properly aligned for ao_ref, but
> yeah, for a hack that works.  If you really want to go that way you need
> good comments about this hack.  It's really somewhat worrisome that the
> size of the allocation depends on a bit in tree_base.
>
> (It's a really cute hack that works as a micro optimization, the question
> is, do we really need to go there already, are all other less hacky
> approaches not bringing similar improvements?  The cuter the hacks the
> less often they pay off in the long run of production software :) )

FWIW, having been guilty of adding a similar hack(?) to SYMBOL_REFs
for block_symbol, I like the approach of concatenating/combining structures
based on flags.  The main tree and rtl types have too much baggage and
so I think there are some things that are better represented outside
of them.

I suppose cselib VALUE rtxes are also similar, although they're more
of a special case, since cselib data doesn't survive between passes.

Thanks,
Richard
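
For readers unfamiliar with the pattern discussed in this thread, here is a minimal standalone sketch of "one allocation, two objects, selected by a flag", including the alignment rounding Michael asks for. It is not GCC code: `node`, `extra`, `has_extra`, `round_up`, `extra_offset`, `copy_with_extra` and `get_extra` are hypothetical stand-ins for the tree node, `ao_ref`, `TREE_ASM_WRITTEN` and the surrounding logic, and `calloc` stands in for the GC allocator.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-ins; these are NOT GCC's tree or ao_ref types.
struct node
{
  unsigned has_extra : 1;   // plays the role of the flag bit in tree_base
  int payload;
  int more;
};

struct extra
{
  void *base;
  long offset;
  long size;
};

// Round SIZE up to the next multiple of ALIGN (ALIGN must be a power of two).
static std::size_t
round_up (std::size_t size, std::size_t align)
{
  return (size + align - 1) & ~(align - 1);
}

// Byte offset of the trailing object for a leading node of NODE_SIZE bytes.
// Using NODE_SIZE directly would only be safe if it already happened to be
// a multiple of alignof (extra).
static std::size_t
extra_offset (std::size_t node_size)
{
  return round_up (node_size, alignof (extra));
}

// Copy the NODE_SIZE bytes of SRC into a fresh zeroed allocation that also
// has room for a trailing extra, set the flag on the copy, and construct
// the extra in place.  Returns null on allocation failure.
static node *
copy_with_extra (const node *src, std::size_t node_size, extra **out)
{
  std::size_t off = extra_offset (node_size);
  void *mem = std::calloc (1, off + sizeof (extra));
  if (!mem)
    return nullptr;
  std::memcpy (mem, src, node_size);
  node *copy = static_cast<node *> (mem);
  copy->has_extra = 1;
  *out = new (static_cast<char *> (mem) + off) extra ();
  return copy;
}

// Return the trailing extra of N, or null if N was allocated without one.
static extra *
get_extra (node *n, std::size_t node_size)
{
  if (!n->has_extra)
    return nullptr;
  char *p = reinterpret_cast<char *> (n) + extra_offset (node_size);
  return reinterpret_cast<extra *> (p);
}
```

The flag is what makes the lookup safe: only nodes produced by `copy_with_extra` carry the bit, so `get_extra` never reads past the end of an ordinary node. That is also the cost Michael points out, since the allocation size now depends on a bit stored inside the object.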