From: Richard Earnshaw
Date: Fri, 21 Oct 2022 11:17:39 +0100
Subject: Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
To: Andrew Pinski, Segher Boessenkool
Cc: "mfortune@gmail.com", "dave.anglin@bell.net", "rguenther@suse.de", "aoliva@gcc.gnu.org", "richard.sandiford@arm.com", "uweigand@de.ibm.com", "hubicka@ucw.cz", "marcus.shawcroft@arm.com", "olegendo@gcc.gnu.org", "gcc-patches@gcc.gnu.org", "linkw@gcc.gnu.org", "richard.earnshaw@arm.com", "ramana.radhakrishnan@arm.com", "davem@redhat.com", "gnu@amylaar.uk", "Liu, Hongtao", "claziss@synopsys.com", "dje.gcc@gmail.com"

On 20/10/2022 18:37, Andrew Pinski via Gcc-patches wrote:
> On Thu, Oct 20, 2022 at 10:28 AM Segher Boessenkool
> wrote:
>>
>> On Thu, Oct 20, 2022 at 01:44:15AM +0000, Jiang, Haochen wrote:
>>> Maybe the testcase change caused some misunderstanding and concern.
>>>
>>> Actually, the patch did not disrupt the previous builtins, as the
>>> builtin_prefetch uses vargs. I set the default value of the new
>>> parameter as data prefetch, which means that if we are not using the
>>> fourth parameter, just like how we use prefetch previously, it is
>>> still what it is.
>>
>> I still think it is a mistake to have one builtin do two very distinct
>> operations, only very superficially related. Instruction fetch and data
>> demand loads are almost entirely unrelated, and so is the prefetch
>> machinery for them, on all machines I am familiar with.
>
> On aarch64 (armv8), it is actually the same instruction: PRFM. It
> might be the only one which is that way though.
> It even allows specifying the level for the instruction prefetch too
> (which is actually useful for, say, OcteonTX2, which has an interesting
> cache hierarchy).
>

Just because the encodings are similar doesn't mean that the instructions
are the same, although it's true that once you reach unification in the
cache hierarchy the end behaviour /might/ be indistinguishable.
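
To make that concrete, a minimal sketch in GNU C for aarch64 (the helper
names below are invented for illustration; nothing here is taken from the
patch itself):

/* Data-side prefetch hint: PRFM with a PLD<level><policy> operand.  */
static inline void
prefetch_data (const void *p)
{
  __asm__ volatile ("prfm pldl1keep, [%0]" :: "r" (p));
}

/* Instruction-side prefetch hint: the same PRFM mnemonic and encoding
   space, but a PLI operand.  */
static inline void
prefetch_insn (const void *p)
{
  __asm__ volatile ("prfm plil1keep, [%0]" :: "r" (p));
}

The mnemonic is shared, but the <prfop> operand still has to be fixed at
the point the compiler emits the instruction.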
Really, Segher's point seems to be 'why overload the existing builtin for
this'?  It's not like the new parameter is something that users would
really need to pass in as a run-time choice; and that wouldn't work anyway
because in the end we do need distinct instructions.

R.

> Though I agree it is a mistake to have one builtin which handles both
> data and instruction prefetch.
>
> Thanks,
> Andrew
>
>
>> Which makes
>> sense anyway, since instruction prefetch and data prefetch have
>> completely different performance characteristics and considerations.
>> Maybe if you start with the mistake of having unified L1 caches it
>> seems natural, but thankfully most machines do not do that.
>>
>>
>> Segher
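
For completeness, here is roughly what the user-visible difference would
amount to, as I read Haochen's description upthread; the meaning of the
fourth argument is an assumption borrowed from LLVM's llvm.prefetch
(1 = data, the default; 0 = instruction) rather than something checked
against the patch:

extern char buf[];

void
example (void)
{
  /* Existing forms, unchanged by the proposal: data prefetch, for
     read (0), with maximal temporal locality (3).  */
  __builtin_prefetch (buf);
  __builtin_prefetch (buf, 0, 3);

  /* With the patch applied, a fourth argument would select the cache.
     The 0 = instruction / 1 = data encoding is my assumption from
     LLVM's llvm.prefetch; the patch may spell it differently.

     __builtin_prefetch (buf, 0, 3, 0);   -- instruction prefetch  */
}

Either way the choice has to be a compile-time constant, which is rather
the point about it not being a run-time parameter.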