From: "H.J. Lu"
Date: Thu, 17 Feb 2022 05:57:06 -0800
To: Richard Biener
Cc: Uros Bizjak, GCC Patches, liuhongt
Subject: Re: [PATCH] x86: Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER
References: <20220217042628.133306-1-hjl.tools@gmail.com>

On Thu, Feb 17, 2022 at 10:49:48AM +0100, Richard Biener via Gcc-patches wrote:
> On Thu, Feb 17, 2022 at 8:52 AM Uros Bizjak via Gcc-patches
> wrote:
> >
> > On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches
> > wrote:
> > >
> > > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
> > > wrote:
> > > >
> > > > Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
> > > > Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> > > > transition penalty. Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> > > > generate vzeroupper instruction after loading all-zero YMM/ZMM registers
> > > > and enable it by default.
> > > Shouldn't TARGET_READ_ZERO_YMM_ZMM_NONEED_VZEROUPPER sound a bit smoother?
> > > Because originally we needed to add vzeroupper to all avx<->sse cases,
> > > now it's a tune to indicate that we don't need to add it in some
> >
> > Perhaps we should go from the other side and use
> > X86_TUNE_OPTIMIZE_AVX_READ for new processors?
>
> Btw, do you have a micro-benchmark to test this on AMD archs?
>

I don't believe AMD CPUs need vzeroupper.

H.J.