From: Noah Goldstein
Date: Fri, 3 Jun 2022 18:33:45 -0500
Subject: Re: [PATCH v2 2/8] x86: Add COND_VZEROUPPER that can replace vzeroupper if no `ret`
To: "H.J. Lu"
Cc: GNU C Library, "Carlos O'Donell"
References: <20220603044229.2180216-2-goldstein.w.n@gmail.com> <20220603200429.379547-1-goldstein.w.n@gmail.com> <20220603200429.379547-2-goldstein.w.n@gmail.com>

On Fri, Jun 3, 2022 at 6:12 PM H.J. Lu wrote:
>
> On Fri, Jun 3, 2022 at 1:04 PM Noah Goldstein wrote:
> >
> > The RTM vzeroupper mitigation has no way of replacing inline
> > vzeroupper not before a return.
> >
> > This code does not change any existing functionality.
> >
> > There is no difference in the objdump of libc.so before and after this
> > patch.
> > ---
> >  sysdeps/x86_64/multiarch/avx-rtm-vecs.h  |  1 +
> >  sysdeps/x86_64/multiarch/avx2-rtm-vecs.h |  1 +
> >  sysdeps/x86_64/sysdep.h                  | 16 ++++++++++++++++
> >  3 files changed, 18 insertions(+)
> >
> > diff --git a/sysdeps/x86_64/multiarch/avx-rtm-vecs.h b/sysdeps/x86_64/multiarch/avx-rtm-vecs.h
> > index c00b83ea0e..e954b8e1b0 100644
> > --- a/sysdeps/x86_64/multiarch/avx-rtm-vecs.h
> > +++ b/sysdeps/x86_64/multiarch/avx-rtm-vecs.h
> > @@ -20,6 +20,7 @@
> >  #ifndef _AVX_RTM_VECS_H
> >  #define _AVX_RTM_VECS_H 1
> >
> > +#define COND_VZEROUPPER COND_VZEROUPPER_XTEST
> >  #define ZERO_UPPER_VEC_REGISTERS_RETURN \
> >         ZERO_UPPER_VEC_REGISTERS_RETURN_XTEST
> >
> > diff --git a/sysdeps/x86_64/multiarch/avx2-rtm-vecs.h b/sysdeps/x86_64/multiarch/avx2-rtm-vecs.h
> > index a5d46e8c66..e20c3635a0 100644
> > --- a/sysdeps/x86_64/multiarch/avx2-rtm-vecs.h
> > +++ b/sysdeps/x86_64/multiarch/avx2-rtm-vecs.h
> > @@ -20,6 +20,7 @@
> >  #ifndef _AVX2_RTM_VECS_H
> >  #define _AVX2_RTM_VECS_H 1
> >
> > +#define COND_VZEROUPPER COND_VZEROUPPER_XTEST
> >  #define ZERO_UPPER_VEC_REGISTERS_RETURN \
> >         ZERO_UPPER_VEC_REGISTERS_RETURN_XTEST
> >
> > diff --git a/sysdeps/x86_64/sysdep.h b/sysdeps/x86_64/sysdep.h
> > index f14d50786d..2cb31a558b 100644
> > --- a/sysdeps/x86_64/sysdep.h
> > +++ b/sysdeps/x86_64/sysdep.h
> > @@ -106,6 +106,22 @@ lose: \
> >         vzeroupper; \
> >         ret
> >
> > +/* Can be used to replace vzeroupper that is not directly before a
> > +   return. */
> > +#define COND_VZEROUPPER_XTEST \
> > +    xtest; \
> > +    jz 1f; \
> > +    vzeroall; \
> > +    jmp 2f; \
> > +1: \
> > +    vzeroupper; \
> > +2:
>
> Will "ret" always be after "2:"?

At some point but not immediately afterwards. For example:

L(zero):
    xorl %eax, %eax
    VZEROUPPER_RETURN
L(check):
    tzcntl %eax, %eax
    cmpl %eax, %edx
    jle L(zero)
    addq %rdi, %rax
    VZEROUPPER_RETURN

Can become:

L(zero):
    xorl %eax, %eax
    ret
L(check):
    tzcntl %eax, %eax
    COND_VZEROUPPER
    cmpl %eax, %edx
    jle L(zero)
    addq %rdi, %rax
    ret

Which saves code size.

>
> > +/* In RTM define this as COND_VZEROUPPER_XTEST. */
> > +#ifndef COND_VZEROUPPER
> > +# define COND_VZEROUPPER vzeroupper
> > +#endif
> > +
> >  /* Zero upper vector registers and return. */
> >  #ifndef ZERO_UPPER_VEC_REGISTERS_RETURN
> >  # define ZERO_UPPER_VEC_REGISTERS_RETURN \
> > --
> > 2.34.1
> >
>
>
> --
> H.J.
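
For anyone following the macro plumbing, here is a rough sketch of how the
override-then-default pattern in the patch resolves. It is not glibc code: the
USE_RTM_VARIANT switch and the two helper functions are hypothetical stand-ins
(in the patch the override comes from including avx-rtm-vecs.h or
avx2-rtm-vecs.h), and the macro body is only stringified, never assembled or
executed.

```c
#include <string.h>

/* Hypothetical switch standing in for including avx-rtm-vecs.h or
   avx2-rtm-vecs.h ahead of sysdep.h; it is not a real glibc macro.  */
#ifdef USE_RTM_VARIANT
# define COND_VZEROUPPER COND_VZEROUPPER_XTEST
#endif

/* Body copied from the patch.  It is only turned into a string here,
   never assembled or executed.  */
#define COND_VZEROUPPER_XTEST \
	xtest; \
	jz 1f; \
	vzeroall; \
	jmp 2f; \
1: \
	vzeroupper; \
2:

/* Default from the patch: plain vzeroupper unless overridden above.  */
#ifndef COND_VZEROUPPER
# define COND_VZEROUPPER vzeroupper
#endif

/* Two-level stringification so the argument is macro-expanded first.  */
#define STR_(...) #__VA_ARGS__
#define STR(...) STR_(__VA_ARGS__)

/* Hypothetical helper: returns the text COND_VZEROUPPER expands to.  */
const char *
cond_vzeroupper_expansion (void)
{
  return STR (COND_VZEROUPPER);
}

/* Hypothetical helper: nonzero if the xtest-guarded sequence was chosen.  */
int
cond_vzeroupper_uses_xtest (void)
{
  return strstr (cond_vzeroupper_expansion (), "xtest") != NULL;
}
```

Built as is, cond_vzeroupper_expansion () should return the plain vzeroupper
text. Built with -DUSE_RTM_VARIANT, it should return the xtest/vzeroall
sequence instead, which mirrors why the non-RTM objdump is unchanged.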