Date: Wed, 31 Mar 2021 10:05:16 +0200
From: Jan Hubicka
To: "H.J. Lu"
Cc: Hongtao Liu, GCC Patches, Hongyu Wang
Subject: Re: [PATCH v2 1/3] x86: Update memcpy/memset inline strategies for Ice Lake
Message-ID: <20210331080516.GA51111@kam.mff.cuni.cz>
References: <20210322131636.58461-1-hjl.tools@gmail.com> <20210322131636.58461-2-hjl.tools@gmail.com> <20210322141027.GA11522@kam.mff.cuni.cz>

> > It looks like X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB is quite obviously
> > beneficial and independent of the rest of the changes.  I think we will
> > need to discuss the move ratio and the code size/uop cache pollution
> > issues a bit more - one option would be to use the increased limits for
> > -O3 only.
>
> My change only increases CLEAR_RATIO, not MOVE_RATIO.  We are checking
> code size impacts on SPEC CPU 2017 and EEMBC.
>
> > Can you break this out into an independent patch?  I also wonder if it
> > would
>
> X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB improves performance only when the
> memcpy/memset costs and MOVE_RATIO are updated at the same time, like:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567096.html
>
> Making it standalone means moving it from the Ice Lake patch to the
> Skylake patch.
>
> > not be more readable to special-case this just at the beginning of
> > decide_alg.
> > > @@ -6890,6 +6891,7 @@ decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT expected_size,
> > >    const struct processor_costs *cost;
> > >    int i;
> > >    bool any_alg_usable_p = false;
> > > +  bool known_size_p = expected_size != -1;
> >
> > expected_size is not -1 if we have profile feedback and have detected
> > the average size of a block from the histogram.  It seems to me from the
> > description that you want this to be an actual compile-time constant,
> > which would be min_size == max_size, I guess.
>
> You are right.  Here is the v2 patch with a min_size != max_size check
> for unknown size.

Patch is OK now.
I was wondering about using avx256 for moves of known size (per the
comment on MOVE_MAX_PIECES there is an issue with MAX_FIXED_MODE_SIZE,
but that does not seem hard to fix).  Did you look into it?

Honza
>
> Thanks.
>
> --
> H.J.
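
[Editor's note: a minimal sketch of the "known size" condition discussed
in the thread above, not the actual GCC patch.  HOST_WIDE_INT is replaced
by a stand-in typedef and block_size_is_constant_p is a hypothetical
helper name used only for illustration; the real change is made inside
decide_alg in GCC's i386 back end.]

  /* Sketch: "known size" should mean a genuine compile-time constant
     block size, i.e. min_size == max_size, rather than
     expected_size != -1, because expected_size can also be set from
     profile-feedback histograms.  */
  #include <stdbool.h>

  typedef long long HOST_WIDE_INT;  /* stand-in for GCC's HOST_WIDE_INT */

  static bool
  block_size_is_constant_p (HOST_WIDE_INT min_size, HOST_WIDE_INT max_size)
  {
    /* Only when the lower and upper bounds on the block size agree is
       the size a true compile-time constant; expected_size != -1 merely
       means some estimate (possibly from profiling) is available.  */
    return min_size == max_size;
  }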