From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) by sourceware.org (Postfix) with ESMTP id A8E353858C2C for ; Thu, 9 Sep 2021 18:25:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A8E353858C2C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=kernel.crashing.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=kernel.crashing.org Received: from gate.crashing.org (localhost.localdomain [127.0.0.1]) by gate.crashing.org (8.14.1/8.14.1) with ESMTP id 189IOji0026777; Thu, 9 Sep 2021 13:24:45 -0500 Received: (from segher@localhost) by gate.crashing.org (8.14.1/8.14.1/Submit) id 189IOiwR026776; Thu, 9 Sep 2021 13:24:44 -0500 X-Authentication-Warning: gate.crashing.org: segher set sender to segher@kernel.crashing.org using -f Date: Thu, 9 Sep 2021 13:24:44 -0500 From: Segher Boessenkool To: Bill Schmidt Cc: "Kewen.Lin" , David Edelsohn , will schmidt , GCC Patches Subject: Re: [PATCH v4] rs6000: Add load density heuristic Message-ID: <20210909182444.GT1583@gate.crashing.org> References: <7b9f9bdf-1ed5-139b-de9c-511ee8454b85@linux.ibm.com> <3424a3d3-fa4e-16f9-89c6-0b07beec957d@linux.ibm.com> <77fe5ac1-200f-db69-a92a-5d349642f394@linux.ibm.com> <4f7c5da8-75d3-2d98-b728-e1a319392097@linux.ibm.com> <20210909161152.GR1583@gate.crashing.org> <894f01c3-6481-0757-751f-b4239a4f0232@linux.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <894f01c3-6481-0757-751f-b4239a4f0232@linux.ibm.com> User-Agent: Mutt/1.4.2.3i X-Spam-Status: No, score=-3.5 required=5.0 tests=BAYES_00, JMQ_SPF_NEUTRAL, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Sep 2021 18:25:48 -0000 Hi! On Thu, Sep 09, 2021 at 12:19:28PM -0500, Bill Schmidt wrote: > On 9/9/21 11:11 AM, Segher Boessenkool wrote: > >On Wed, Sep 08, 2021 at 02:57:14PM +0800, Kewen.Lin wrote: > >>+ /* If we have strided or elementwise loads into a vector, it's > >"strided" is not a word: it properly is "stridden", which does not read > >very well either. "Have loads by stride, or by element, ..."? Is that > >good English, and easier to understand? > > No, this is OK.  "Strided loads" is a term of art used by the > vectorizer; whether or not it was the Queen's English, it's what we > have... "Strided" is not in any dictionary I could find. I have no idea what the island queen has to do with anything :-) > (And I think you might only find "bestridden" in some 18th or > 19th century English poetry... :-) "Bestride" is not used in the US apparently, but it has nothing to do with this anyway. In any case, "strided" is used a lot in the vectoriser already (and nowhere else). So, fine with me of course. > >Does that text look good to you now Bill? It is still kinda complex, > >maybe you see a way to make it simpler. > > I think it's OK now.  The complexity at least matches the code now > instead of exceeding it. :-P  j/k... Ha :-) Ideally the comments will clarify complex code. But not making it worse is at least something, okay :-) > >>+ if (data->extra_ctor_cost > 0) > >>+ { > >>+ /* Threshold for load stmts percentage in all vectorized stmts. */ > >>+ const int DENSITY_LOAD_PCT_THRESHOLD = 45; > >Threshold for what? > > > >45% is awfully exact. Can you make this a param? > > > >>+ /* Threshold for total number of load stmts. */ > >>+ const int DENSITY_LOAD_NUM_THRESHOLD = 20; > >Same. > > We have similar magic constants in here already. And that is problematic. That is the problem. > Parameterizing is > possible, but I'm more interested in making sure the numbers are > appropriate for each processor. Of course. So a) there shouldn't be any magic constants, the constants should be meaningful. That way, tunings will need less maintenance for new hardware versions, but way more importantly: if anything else in the compiler changes. And b), having params makes it easier to do such tuning, or experiments for it. It is very easy to add params btw (just in rs6000.opt). > Perhaps a follow-up patch to add params for the magic constants would be > reasonable, but I'd personally consider it pretty low priority. It also is super easy to do. The only hard part is making a name for it, and if you cannot think of a good name, that says that it is not a good abstraction in the first place -- magic isn't good. Please do it. > >>+ /* As function rs6000_builtin_vectorization_cost shows, we have > >>+ priced much on V16QI/V8HI vector construction as their units, > >>+ if we penalize them with nunits * stmt_cost, it can result in > >>+ an unreliable body cost, eg: for V16QI on Power8, stmt_cost > >>+ is 20 and nunits is 16, the extra cost is 320 which looks > >>+ much exaggerated. So let's use one maximum bound for the > >>+ extra penalized cost for vector construction here. */ > >>+ const unsigned int MAX_PENALIZED_COST_FOR_CTOR = 12; > >>+ if (extra_cost > MAX_PENALIZED_COST_FOR_CTOR) > >>+ extra_cost = MAX_PENALIZED_COST_FOR_CTOR; > >That is a pretty gross hack. Can you think of any saner way to not have > >those out of scale costs in the first place? > > In Kewen's defense, the whole business of "finish_cost" for these > vectorized loops is to tweak things that don't work quite right with the > hooks currently provided to the vectorizer to add costs on a per-stmt > basis without looking at the overall set of statements.  It gives the > back end a chance to massage things and exercise veto power over > otherwise bad decisions.  By nature, that's going to be very much a > heuristic exercise.  Personally I think the heuristics used here are > pretty reasonable, and importantly they are designed to only be employed > in pretty rare circumstances.  It doesn't look easy to me to avoid the > need for a cap here without making the rest of the heuristics harder to > understand.  But sure, he can try! :) Yes. But we need to avoid magic. Magic does not scale; magic makes it hard or impossible to do any future changes. I did approve the patch already. But weaknesses like this need continuous attention, and that does not scale. Segher