From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 95439 invoked by alias); 12 Feb 2020 21:53:05 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 95430 invoked by uid 89); 12 Feb 2020 21:53:04 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 spammy=
X-HELO: gate.crashing.org
Received: from gate.crashing.org (HELO gate.crashing.org) (63.228.1.57) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 12 Feb 2020 21:53:02 +0000
Received: from gate.crashing.org (localhost.localdomain [127.0.0.1]) by gate.crashing.org (8.14.1/8.14.1) with ESMTP id 01CLqsfb032537; Wed, 12 Feb 2020 15:52:59 -0600
Received: (from segher@localhost) by gate.crashing.org (8.14.1/8.14.1/Submit) id 01CLqqkS032529; Wed, 12 Feb 2020 15:52:52 -0600
Date: Wed, 12 Feb 2020 21:53:00 -0000
From: Segher Boessenkool
To: Richard Biener
Cc: Roman Zhuykov, "Kewen.Lin", GCC Patches, Bill Schmidt, "bin.cheng"
Subject: Re: [PATCH 0/4 GCC11] IVOPTs consider step cost for different forms when unrolling
Message-ID: <20200212215251.GD22482@gate.crashing.org>
References: <20200120123332.GV3191@gate.crashing.org> <52c8eecc-3383-81ad-70ce-27c149d7a103@linux.ibm.com> <20200210212910.GL22482@gate.crashing.org> <20200211074859.GV22482@gate.crashing.org> <1ac98132-734e-0ee3-5ea2-7ec256ee92d2@ispras.ru> <20200211180049.GW22482@gate.crashing.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.4.2.3i
X-IsSubscribed: yes
X-SW-Source: 2020-02/txt/msg00752.txt.bz2

On Wed, Feb 12, 2020 at 09:07:27AM +0100, Richard Biener wrote:
> On Tue, 11 Feb 2020, Segher Boessenkool wrote:
>
> > On Tue, Feb 11, 2020 at 02:58:47PM +0100, Richard Biener wrote:
> > > On Tue, 11 Feb 2020, Roman Zhuykov wrote:
> > > > 11.02.2020 11:01, Richard Biener wrote:
> > > > Sound good, but IMHO modulo scheduler is not the best choice to be the
> > > > first step implementing such a concept.
> > >
> > > True ;) But since the context of this thread is unrolling ...
> > > Not sure how you'd figure the unroll factor to apply if you want
> > > to do unrolling within a classical scheduling framework? Maybe
> > > unroll as much as you can fill slots until the last instruction
> > > of the first iteration retires?
> >
> > That will be terrible on register-rich architectures: it *already* is
> > problematic how often some things are unrolled, blindly unrolling more
> > would make things worse. We need to unroll more where it helps, but
> > less where it does not. For that we need a good cost/benefit estimate.
>
> True. For x86 we tried but did not come up with a sensible estimate
> (probably the x86 uarchs are way too complicated to understand).

There are three main factors at work:

1) The cost for iterating the loop. This is related to what ivopts
   does, and also on most uarchs every loop iteration has some minimum
   cost (on many uarchs it needs a fetch redirect, or you can only do
   one branch per cycle, etc.; this is mainly important for "small"
   loops, where what is "small" differs per uarch). Unrolling brings
   this cost down.
2) Cost reduction by better scheduling.
3) The cost for using more registers, when unrolling. This is worse
   if you also use SMS or variable expansion, or simply because of the
   better scheduling. This can increase the cost a *lot*.

1) isn't all that hard to estimate. 2) and esp. 3) are though :-/


Segher
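
A toy sketch of how the three factors above might be folded into one
cost-per-iteration estimate. Everything in it is a hypothetical
illustration, not GCC's cost model: the LoopModel fields, the assumption
that loop overhead is amortized as 1/factor, the assumption that the
scheduling gain saturates as 1 - 1/factor, the linear spill penalty, and
all the numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class LoopModel:
    """Hypothetical per-loop inputs; none of these names come from GCC."""
    body_cost: float      # cycles for one iteration's work, before unrolling
    loop_overhead: float  # cycles per trip around the loop (factor 1)
    sched_gain: float     # most cycles per iteration scheduling can save (factor 2)
    regs_per_copy: int    # registers live per unrolled copy of the body
    regs_available: int   # registers the target can hand out
    spill_cost: float     # cycles per register that no longer fits (factor 3)


def cost_per_iteration(m: LoopModel, factor: int) -> float:
    """Estimated cycles per original iteration at the given unroll factor."""
    # 1) Loop overhead is paid once per unrolled trip, so it is amortized.
    overhead = m.loop_overhead / factor
    # 2) Assumed: scheduling gain grows with the factor but saturates.
    body = m.body_cost - m.sched_gain * (1 - 1 / factor)
    # 3) Assumed: each register beyond what the target has costs a fixed
    #    number of cycles per unrolled trip.
    excess = max(0, m.regs_per_copy * factor - m.regs_available)
    pressure = excess * m.spill_cost / factor
    return overhead + body + pressure


def best_unroll_factor(m: LoopModel, candidates=(1, 2, 4, 8)) -> int:
    """Pick the candidate factor with the lowest estimated cost."""
    return min(candidates, key=lambda f: cost_per_iteration(m, f))


# Same loop on two made-up targets that differ only in register count.
regs32 = LoopModel(4.0, 2.0, 1.0, 6, 32, 3.0)
regs16 = LoopModel(4.0, 2.0, 1.0, 6, 16, 3.0)
print(best_unroll_factor(regs32), best_unroll_factor(regs16))  # prints "4 2"
```

With these made-up inputs the overhead and scheduling terms keep shrinking
as the factor grows, but the register term takes over once the unrolled
body no longer fits, which is why the 16-register target stops at 2 while
the 32-register one goes to 4. The hard part, as said above, is getting
realistic values for the second and third terms.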