From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 9311E385E00A for ; Mon, 30 Mar 2020 08:50:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 9311E385E00A Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=richard.sandiford@arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4795831B; Mon, 30 Mar 2020 01:50:07 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 90B2D3F52E; Mon, 30 Mar 2020 01:50:06 -0700 (PDT) From: Richard Sandiford To: Peter Bergner via Gcc-patches Mail-Followup-To: Peter Bergner via Gcc-patches , Peter Bergner , "ian\@airs.com" , Segher Boessenkool , richard.sandiford@arm.com Cc: Peter Bergner , "ian\@airs.com" , Segher Boessenkool Subject: Re: [PATCH] lower-subreg: PR94123, SVN r273240, causes gcc.target/powerpc/pr87507.c to fail References: <20ee8944-f0bf-cec1-e3d1-5dd5e9c6a4ef@linux.ibm.com> Date: Mon, 30 Mar 2020 09:50:05 +0100 In-Reply-To: <20ee8944-f0bf-cec1-e3d1-5dd5e9c6a4ef@linux.ibm.com> (Peter Bergner via Gcc-patches's message of "Fri, 27 Mar 2020 17:41:36 -0500") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-29.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Mar 2020 08:50:09 -0000 Peter Bergner via Gcc-patches writes: > The pr87507.c test case regressed due to Segher's commit that added > -fsplit-wide-types-early. The issue is that the lower-subreg pass only > decomposes the TImode code in the example if there is a pseudo reg to pseudo > reg copy. When the lower-subreg pass is called late (its old location), > then combine changes the generated code by adding a TImode pseudo reg to > pseudo reg copy and lower-subreg successfully decomposes it. > > When we run lower-subreg before combine, that copy isn't there so we > do not decompose our TImode uses. I'm not sure why we require a pseudo > to pseudo copy before we decompose things, but changing find_pseudo_copy > to allow pseudo and hard regs to be copied into a pseudo like below fixes > the issue. > > Ian, do you remember why we don't just decompose all wide types and instead > require a pseudo to pseudo copy to exist? > > This patch survived bootstrap and regtesting on powerpc64le-linux with > no regressions. > > * lower-subreg.c (find_pseudo_copy): Allow copies from hard registers > too. > > diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c > index 4c8bc835f93..c6816a34489 100644 > --- a/gcc/lower-subreg.c > +++ b/gcc/lower-subreg.c > @@ -419,7 +419,7 @@ find_pseudo_copy (rtx set) > > rd = REGNO (dest); > rs = REGNO (src); > - if (HARD_REGISTER_NUM_P (rd) || HARD_REGISTER_NUM_P (rs)) > + if (HARD_REGISTER_NUM_P (rd)) > return false; > > b = reg_copy_graph[rs]; I guess this would also work if we dropped the rd check instead. So how about s/||/&&/ instead, to avoid the assymetry? I agree something like this is a better fix long-term, since we shouldn't be relying on make_more_copies outside combine. > Given we're late in stage4, the above patch (assuming it's ok) probably > shouldn't go in until stage1, since it is changing lower-subreg's behaviour > slightly. > > A different (ie, safer) approach would be to just rerun lower-subreg at > its old location, regardless of whether we used -fsplit-wide-types-early. > This way, we are not changing lower-subreg's behaviour, just running it an > extra time (3 times instead of twice when using -fsplit-wide-types-early). > I don't think lower-subreg is too expensive to run an extra time and we'd > only do it when using -fsplit-wide-types-early. The only downside (if any) > is that we don't decompose these TImode uses early like the patch above does, > so combine, etc. can't see what they will eventually become. This does fix > the bug and also survives bootstrap and regtesting on powerpc64le-linux > with no regressions. > > * lower-subreg.c (pass_lower_subreg3::gate): Remove test for > flag_split_wide_types_early. > > diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c > index 4c8bc835f93..807ad398b64 100644 > --- a/gcc/lower-subreg.c > +++ b/gcc/lower-subreg.c > @@ -1844,8 +1844,7 @@ public: > {} > > /* opt_pass methods: */ > - virtual bool gate (function *) { return flag_split_wide_types > - && !flag_split_wide_types_early; } > + virtual bool gate (function *) { return flag_split_wide_types != 0; } > virtual unsigned int execute (function *) > { > decompose_multiword_subregs (true); Looks good to me with the s/ != 0// that Segher mentioned. With this change, the only remaining function of -fsplit-wide-types-early is to act as a double lock on one pass. IMO it'd make more sense to remove that double lock and make -fsplit-wide-types-early and -fsplit-wide-types act as independent options, a bit like -fschedule-insns{,2}. Thanks, Richard