From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id 0FA913857025 for ; Thu, 13 May 2021 09:54:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 0FA913857025 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-284-3rf56nXHMAG-sVk5vvDEVQ-1; Thu, 13 May 2021 05:54:36 -0400 X-MC-Unique: 3rf56nXHMAG-sVk5vvDEVQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5887C800D55; Thu, 13 May 2021 09:54:35 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-114-59.ams2.redhat.com [10.36.114.59]) by smtp.corp.redhat.com (Postfix) with ESMTPS id D79162BFC7; Thu, 13 May 2021 09:54:34 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.16.1/8.16.1) with ESMTPS id 14D9sXth3307190 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 13 May 2021 11:54:33 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.16.1/8.16.1/Submit) id 14D9sXLB3307189; Thu, 13 May 2021 11:54:33 +0200 Date: Thu, 13 May 2021 11:54:33 +0200 From: Jakub Jelinek To: Uros Bizjak , Richard Sandiford Cc: Hongtao Liu , GCC Patches , "H. J. Lu" Subject: Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735] Message-ID: <20210513095433.GH1179226@tucnak> Reply-To: Jakub Jelinek References: MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-6.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 May 2021 09:54:40 -0000 On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote: > > > Bootstrapped and regtested on X86_64-linux-gnu{-m32,} > > > Ok for trunk? > > > > Some time ago a support for CLOBBER_HIGH RTX was added (and later > > removed for some reason). Perhaps we could resurrect the patch for the > > purpose of ferrying 128bit modes via vzeroupper RTX? > > https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html is where it got removed, CCing Richard. > > +(define_split > > + [(match_parallel 0 "vzeroupper_pattern" > > + [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)])] > > + "TARGET_AVX && ix86_pre_reload_split ()" > > + [(match_dup 0)] > > +{ > > + /* When vzeroupper is explictly used, for LRA purpose, make it clear > > + the instruction kills sse registers. */ > > + gcc_assert (cfun->machine->has_explicit_vzeroupper); > > + unsigned int nregs = TARGET_64BIT ? 16 : 8; > > + rtvec vec = rtvec_alloc (nregs + 1); > > + RTVEC_ELT (vec, 0) = gen_rtx_UNSPEC_VOLATILE (VOIDmode, > > + gen_rtvec (1, const1_rtx), > > + UNSPECV_VZEROUPPER); > > + for (unsigned int i = 0; i < nregs; ++i) > > + { > > + unsigned int regno = GET_SSE_REGNO (i); > > + rtx reg = gen_rtx_REG (V2DImode, regno); > > + RTVEC_ELT (vec, i + 1) = gen_rtx_CLOBBER (VOIDmode, reg); > > + } > > + operands[0] = gen_rtx_PARALLEL (VOIDmode, vec); > > +}) > > > > Wouldn't this also kill lower 128bit values that are not touched by > > vzeroupper? A CLOBBER_HIGH would be more appropriate here. Yes, it would. But normally the only xmm* hard regs live across the explicit user vzeroupper would be local and global register variables, I think the 1st scheduler etc. shouldn't extend lifetime of the xmm hard regs across UNSPEC_VOLATILE. Jakub