From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id F16383865C2A for ; Tue, 29 Sep 2020 09:25:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org F16383865C2A Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=richard.sandiford@arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8259731B; Tue, 29 Sep 2020 02:25:45 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CAD5A3F73B; Tue, 29 Sep 2020 02:25:44 -0700 (PDT) From: Richard Sandiford To: Richard Biener Mail-Followup-To: Richard Biener , Tamar Christina , nd , "gcc-patches\@gcc.gnu.org" , richard.sandiford@arm.com Cc: Tamar Christina , nd , "gcc-patches\@gcc.gnu.org" Subject: Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding. References: <20200925142753.GA13692@arm.com> Date: Tue, 29 Sep 2020 10:25:43 +0100 In-Reply-To: (Richard Biener's message of "Tue, 29 Sep 2020 08:45:06 +0200 (CEST)") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00, INDUSTRIAL_BODY, INDUSTRIAL_SUBJECT, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 29 Sep 2020 09:25:47 -0000 Richard Biener writes: >> > > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo, >> > > &tree_size, bst_map); >> > > if (node != NULL) >> > > { >> > > + /* Temporarily allow add_stmt calls again. */ >> > > + vinfo->stmt_vec_info_ro = false; >> > > + >> > > + /* See if any patterns can be found in the constructed SLP tree >> > > + before we do any analysis on it. */ >> > > + vect_match_slp_patterns (node, vinfo, group_size, &max_nunits, >> > > + matches, &npermutes, &tree_size, >> > > + bst_map); >> > > + >> > > + /* After this no more add_stmt calls are allowed. */ >> > > + vinfo->stmt_vec_info_ro = true; >> > > + >> > > >> > > I think this is a bit early to match patterns - I'd defer it to the >> > > point where all entries into the same SLP subgraph are analyzed, thus >> > > somewhere at the end of vect_analyze_slp loop over all instances and >> > > match patterns? That way phases are more clearly separated. >> > >> > That would probably work, my only worry is that the SLP analysis itself may >> > fail and bail out at >> > >> > /* If the loads and stores can be handled with load/store-lane >> > instructions do not generate this SLP instance. */ >> > if (is_a (vinfo) >> > && loads_permuted >> > && dr && vect_store_lanes_supported (vectype, group_size, >> > false)) > > Ah, that piece of code. Yeah, I'm repeatedly running into it as well - > it's a bad hack that stands in the way all the time :/ At one point I was wondering about trying to drop the above, vectorise with and without SLP, and then compare their costs, like for VECT_COMPARE_COSTS. But that seemed like a dead end with the move to doing everything on the SLP representation. > I guess we should try moving this upward like to > vect_analyze_loop_2 right before > > /* Check the SLP opportunities in the loop, analyze and build SLP trees. > */ > ok = vect_analyze_slp (loop_vinfo, *n_stmts); > if (!ok) > return ok; > > and there check whether all grouped loads and stores can be handled > with store-/load-lanes (and there are no SLP reduction chains?) in > which case do not try to attempt SLP at all. Because the testcases > this check was supposed to change were all-load/store-lane or > all SLP so the mixed case is probably not worth special casing. > > Since load-/store-lanes is an arm speciality I tried to only touch > this fragile part with a ten-foot pole ;) CCing Richard, if he > acks the above I can produce a patch. Yeah, sounds good to me. Probably also sorth checking whether the likely_max iteration count is high enough to support group_size vectors, if we have enough information to guess that. We could also get the gen* machinery to emit a macro that is true if at least one load/store-lane pattern is defined, so that we can skip the code for non-Arm targets. I can do that as a follow-up. Thanks, Richard