From: Richard Sandiford
To: Hongtao Liu via Gcc-patches
Cc: Jakub Jelinek, Uros Bizjak, Hongtao Liu, "H. J. Lu"
Subject: Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]
References: <20210513095433.GH1179226@tucnak> <20210513113704.GI1179226@tucnak>
Date: Tue, 18 May 2021 16:18:52 +0100
In-Reply-To: (Hongtao Liu via Gcc-patches's message of "Tue, 18 May 2021 21:12:03 +0800")

Hongtao Liu via Gcc-patches writes:
> On Mon, May 17, 2021 at 5:56 PM Richard Sandiford
> wrote:
>> It looks like the rtx
>> “used” flag is unused for INSNs, so we could
>> use that as a CALL_INSN flag that indicates a fake call.  We would just
>> need to make:
>>
>>       /* For all other RTXes clear the used flag on the copy.  */
>>       RTX_FLAG (copy, used) = 0;
>>
>> conditional on !INSN_P.
>>
> I got another error in
>
> @@ -83,6 +83,9 @@ control_flow_insn_p (const rtx_insn *insn)
>        return true;
>
>      case CALL_INSN:
> +      /* CALL_INSN use "used" flag to indicate it's a fake call.  */
> +      if (RTX_FLAG (insn, used))
> +       break;

I guess this is because of the nonlocal_goto condition?  If so, that
could be fixed by adding a REG_EH_REGION note of INT_MIN.  Even if we
don't do that, I think the fix belongs in nonlocal_goto instead.

> and performance issue in
>
> modified   gcc/final.c
> @@ -4498,7 +4498,8 @@ leaf_function_p (void)
>    for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
>      {
>        if (CALL_P (insn)
> -         && ! SIBLING_CALL_P (insn))
> +         && ! SIBLING_CALL_P (insn)
> +         && !RTX_FLAG (insn, used))
>         return 0;
>        if (NONJUMP_INSN_P (insn)
>
> Also, I grepped for CALL_P and CALL_INSN in the GCC source code; there
> are many places that assume a CALL_P/CALL_INSN is a real call.
> Considering that vzeroupper is used a lot in the i386 backend, I'm a
> bit worried that this implementation will be a bottomless pit.

Maybe, but I think the same is true for CLOBBER_HIGH.  If we have
a third alternative then we should consider it, but I think the
call approach is still going to be less problematic than CLOBBER_HIGH.

The main advantage of the call approach is that the CALL_P handling
is (mostly) conservatively correct and performance problems are just
a one-line change.  The CLOBBER_HIGH approach instead requires
changes to the way that passes track liveness information for
non-call instructions (so is much more than a one-line change).
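
To make the "conservatively correct, then a one-line change" point concrete, here is a minimal sketch in plain C.  It is not GCC code: struct toy_insn, the toy_* functions and the fake_call field are hypothetical stand-ins for rtx_insn, leaf_function_p and RTX_FLAG (insn, used), chosen only for illustration.  The unmodified check treats a fake call as a real one, which loses the leaf-function optimization but never gives a wrong answer; the extra flag test recovers the optimization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GCC's insn codes; not the real rtl.h definitions.  */
enum toy_code { TOY_INSN, TOY_CALL_INSN };

/* Hypothetical miniature insn with only the fields this sketch needs.  */
struct toy_insn
{
  enum toy_code code;
  bool sibling_call;      /* plays the role of SIBLING_CALL_P  */
  bool fake_call;         /* plays the role of RTX_FLAG (insn, used)  */
  struct toy_insn *next;  /* plays the role of NEXT_INSN  */
};

/* Mark INSN as a fake call.  In this sketch only call insns may carry
   the flag, mirroring the idea of reusing a flag that is otherwise
   unused on insns.  */
void
toy_mark_fake_call (struct toy_insn *insn)
{
  assert (insn != NULL && insn->code == TOY_CALL_INSN);
  insn->fake_call = true;
}

/* Unmodified check: any non-sibling call makes the function non-leaf.
   A fake call is counted as well, which is pessimistic but safe.  */
bool
toy_leaf_function_p_conservative (const struct toy_insn *insn)
{
  for (; insn; insn = insn->next)
    if (insn->code == TOY_CALL_INSN && !insn->sibling_call)
      return false;
  return true;
}

/* Same loop with the one extra condition: fake calls are skipped, so
   they no longer prevent the function from being treated as a leaf.  */
bool
toy_leaf_function_p (const struct toy_insn *insn)
{
  for (; insn; insn = insn->next)
    if (insn->code == TOY_CALL_INSN
        && !insn->sibling_call
        && !insn->fake_call)
      return false;
  return true;
}
```

In this model, a chain containing only an ordinary insn and a flagged fake call is reported as non-leaf by the conservative version and as leaf by the adjusted one, while a chain with a real call is non-leaf under both.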
Also, treating a CLOBBER_HIGH like a CLOBBER isn't conservatively
correct, because other code might be relying on part of the register
being preserved.

Thanks,
Richard