From: "H.J. Lu"
Date: Mon, 7 Feb 2022 11:49:08 -0800
Subject: Re: [PATCH v2] x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only)
To: Noah Goldstein
Cc: GNU C Library, "Carlos O'Donell"
References: <20220207063854.3324172-1-goldstein.w.n@gmail.com> <20220207193906.2111349-1-goldstein.w.n@gmail.com>
In-Reply-To: <20220207193906.2111349-1-goldstein.w.n@gmail.com>

On Mon, Feb 7, 2022 at 11:39 AM Noah Goldstein wrote:
>
> commit b62ace2740a106222e124cc86956448fa07abf4d
> Author: Noah Goldstein
> Date: Sun Feb 6 00:54:18 2022 -0600
>
> x86: Improve vec generation in memset-vec-unaligned-erms.S
>
> Revert usage of 'pshufb' in broadcast logic as it is an SSSE3
> instruction and memset.S is restricted to only SSE2 instructions.
> ---
> sysdeps/x86_64/memset.S | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/sysdeps/x86_64/memset.S b/sysdeps/x86_64/memset.S
> index ccf036be53..3f0517bbfc 100644
> --- a/sysdeps/x86_64/memset.S
> +++ b/sysdeps/x86_64/memset.S
> @@ -30,9 +30,10 @@
>
> # define MEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
> movd d, %xmm0; \
> - pxor %xmm1, %xmm1; \
> - pshufb %xmm1, %xmm0; \
> - movq r, %rax
> + movq r, %rax; \
> + punpcklbw %xmm0, %xmm0; \
> + punpcklwd %xmm0, %xmm0; \
> + pshufd $0, %xmm0, %xmm0
>
> # define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
> movd d, %xmm0; \
> --
> 2.25.1
>

LGTM.

Reviewed-by: H.J. Lu

Thanks.

--
H.J.
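
For readers following the thread, here is a rough sketch of why the three-instruction SSE2 sequence in the patch yields the same byte broadcast as the 'pshufb' version. It is a plain C model of the lane behaviour, not glibc code: the type xmm_t and every model_* function are hypothetical names made up for this illustration, and the register is modelled as 16 bytes with lane 0 first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit XMM register: 16 bytes, lane 0 first. */
typedef struct { uint8_t b[16]; } xmm_t;

/* movd d, %xmm0: low dword = d (little-endian), upper 12 bytes zeroed. */
static xmm_t model_movd(uint32_t d)
{
    xmm_t x;
    memset(x.b, 0, sizeof x.b);
    for (int i = 0; i < 4; i++)
        x.b[i] = (uint8_t)(d >> (8 * i));
    return x;
}

/* punpcklbw: interleave the low 8 bytes of dst and src (d0 s0 d1 s1 ...). */
static xmm_t model_punpcklbw(xmm_t dst, xmm_t src)
{
    xmm_t r;
    for (int i = 0; i < 8; i++) {
        r.b[2 * i] = dst.b[i];
        r.b[2 * i + 1] = src.b[i];
    }
    return r;
}

/* punpcklwd: interleave the low four 16-bit words of dst and src. */
static xmm_t model_punpcklwd(xmm_t dst, xmm_t src)
{
    xmm_t r;
    for (int i = 0; i < 4; i++) {
        memcpy(&r.b[4 * i], &dst.b[2 * i], 2);
        memcpy(&r.b[4 * i + 2], &src.b[2 * i], 2);
    }
    return r;
}

/* pshufd $imm: destination dword i = source dword ((imm >> 2*i) & 3). */
static xmm_t model_pshufd(xmm_t src, uint8_t imm)
{
    xmm_t r;
    for (int i = 0; i < 4; i++) {
        int sel = (imm >> (2 * i)) & 3;
        memcpy(&r.b[4 * i], &src.b[4 * sel], 4);
    }
    return r;
}

/* The sequence from the patch, with source and destination both %xmm0. */
static xmm_t model_broadcast_sse2(uint32_t d)
{
    xmm_t x = model_movd(d);
    x = model_punpcklbw(x, x);   /* byte 0 now fills bytes 0..1 */
    x = model_punpcklwd(x, x);   /* word 0 now fills bytes 0..3 */
    return model_pshufd(x, 0);   /* dword 0 copied to all four dwords */
}

/* Self-check: only the low byte of d should end up in every lane. */
static void model_self_check(void)
{
    xmm_t v = model_broadcast_sse2(0x1234ab5au);
    for (int i = 0; i < 16; i++)
        assert(v.b[i] == 0x5a);
}
```

Each step doubles the width of the replicated low byte (1 to 2 to 4 bytes), and 'pshufd $0' then copies that dword across the register. The upper three bytes of d never reach lane 0 of the final shuffle, which matches memset using only the low byte of its int argument.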