From: Noah Goldstein
Date: Mon, 28 Mar 2022 21:57:03 -0500
Subject: Re: [PATCH v1 3/6] x86: Remove mem{move|cpy}-ssse3
To: Mayshao-oc
Cc: "H.J. Lu", GNU C Library, Florian Weimer, "Carlos O'Donell", "Louis Qi(BJ-RD)"
References: <89bb3f1942814671ae858dcef4b3b870@zhaoxin.com> <09816f3ba25043339d57121bbae3d991@zhaoxin.com>
In-Reply-To: <09816f3ba25043339d57121bbae3d991@zhaoxin.com>

On Mon, Mar 28, 2022 at 9:51 PM Mayshao-oc wrote:
>
> On Mon, Mar 28, 2022 at 9:07 PM H.J. Lu wrote:
> >
> > On Mon, Mar 28, 2022 at 1:10 AM Mayshao-oc wrote:
> > >
> > > On Fri, Mar 25, 2022 at 6:36 PM Noah Goldstein wrote:
> > >
> > > > With SSE2, SSE4.1, AVX2, and EVEX versions very few targets prefer
> > > > SSSE3. As a result, it's no longer worth the code size cost.
> > > > ---
> > > >  sysdeps/x86_64/multiarch/Makefile          |    2 -
> > > >  sysdeps/x86_64/multiarch/ifunc-impl-list.c |   15 -
> > > >  sysdeps/x86_64/multiarch/ifunc-memmove.h   |   18 +-
> > > >  sysdeps/x86_64/multiarch/memcpy-ssse3.S    | 3151 --------------------
> > > >  sysdeps/x86_64/multiarch/memmove-ssse3.S   |    4 -
> > > >  5 files changed, 7 insertions(+), 3183 deletions(-)
> > > >  delete mode 100644 sysdeps/x86_64/multiarch/memcpy-ssse3.S
> > > >  delete mode 100644 sysdeps/x86_64/multiarch/memmove-ssse3.S
> > > >
> > > On some platforms, such as Zhaoxin, the memcpy performance of SSSE3
> > > is better than that of AVX2, and the current computer system has sufficient
> > > disk capacity and memory capacity.
> >
> > How does the SSSE3 version compare against the SSE2 version?
>
> On some Zhaoxin processors, the overall performance of SSSE3 is about
> 10% higher than that of SSE2.
>
>
> Best Regards,
> May Shao

Any chance you can post the results from running `bench-memset` or some
equivalent benchmark? I'm curious where the regressions are. Ideally we
would fix the SSE2 version so it's optimal.

>
> > > It is strongly recommended to keep the SSSE3 version.
> > >
> >
> >
> > --
> > H.J.
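
If building the glibc benchtests is inconvenient, a rough stand-in for "some equivalent benchmark" could look like the sketch below. It is not part of glibc, and `time_memcpy_ns` is a made-up helper name. It only times whichever memcpy the dynamic loader selects on the machine, so it cannot compare the SSSE3 and SSE2 variants directly; it just gives per-size numbers that can be collected before and after the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not from glibc's benchtests): average time in
   nanoseconds of one memcpy call of LEN bytes, measured over ITERS
   calls.  Returns a negative value if ITERS is zero or allocation
   fails.  */
static double
time_memcpy_ns (size_t len, size_t iters)
{
  /* Call through a volatile function pointer so the compiler can
     neither inline nor discard the copy; the call goes to the memcpy
     implementation that libc selected for this CPU.  */
  void *(*volatile copy_fn) (void *, const void *, size_t) = memcpy;
  unsigned char *src = malloc (len ? len : 1);
  unsigned char *dst = malloc (len ? len : 1);
  struct timespec start, end;
  double avg_ns = -1.0;

  if (src != NULL && dst != NULL && iters != 0)
    {
      memset (src, 0xa5, len);
      copy_fn (dst, src, len);	/* Warm-up: touch both buffers once.  */

      clock_gettime (CLOCK_MONOTONIC, &start);
      for (size_t i = 0; i < iters; i++)
	copy_fn (dst, src, len);
      clock_gettime (CLOCK_MONOTONIC, &end);

      avg_ns = (double) (end.tv_sec - start.tv_sec) * 1e9
	       + (double) (end.tv_nsec - start.tv_nsec);
      avg_ns /= (double) iters;
    }

  free (src);
  free (dst);
  return avg_ns;
}
```

Calling `time_memcpy_ns` for a range of lengths (say, powers of two from 16 bytes up to a few MiB) and printing the results would give a table that is easy to diff between builds. The buffers here are freshly allocated and only default-aligned, so misaligned and overlapping cases, which the real benchtests do cover, are not exercised.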