From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] aarch64: revert memcpy optimze for kunpeng to avoid performance degradation
To: "Zhangxuelei (Derek)", "wangshuo (AF)", "libc-alpha@sourceware.org"
Cc: Hushiyuan, "liqingqing (C)"
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Message-ID: <0030e23a-db7d-b8a7-149a-f6bf616941ec@linaro.org>
Date: Thu, 21 Jan 2021 13:41:42 -0300
List-Id: Libc-alpha mailing list

On 20/01/2021 22:55, Zhangxuelei (Derek)
wrote:
> Hi,
>
> They are my colleagues and we have certified these results together. It would be better to revert to the original selection, given the performance regression on a specific product. We will continue to look for a better or more balanced version of memcpy on Kunpeng.
>
> Thank you~

This is ok for 2.33, please commit.

>
> -----Original Message-----
> From: Adhemerval Zanella [mailto:adhemerval.zanella@linaro.org]
> Sent: January 20, 2021 21:09
> To: wangshuo (AF); Zhangxuelei (Derek); libc-alpha@sourceware.org
> Cc: Hushiyuan; liqingqing (C)
> Subject: Re: [PATCH] aarch64: revert memcpy optimze for kunpeng to avoid performance degradation
>
> Hi,
>
> Since I don't have access to this specific hardware, it would be good if the original author of the change, Xuelei Zhang, could certify that this reversion is ok.
>
> It should be ok during the freeze, since it is just a selection of an already tested implementation for a specific chip implementation.
>
> On 20/01/2021 04:20, Shuo Wang wrote:
>> In commit 863d775c481704baaa41855fc93e5a1ca2dc6bf6, kunpeng920 was
>> added to the default memcpy version; however, performance degrades when the copy size is large, e.g. 100k.
>> This is the result, tested in glibc-2.28:
>>              before backport   after backport   Performance improvement
>> memcpy_1k    0.005             0.005               0.00%
>> memcpy_10k   0.032             0.029              10.34%
>> memcpy_100k  0.356             0.429             -17.02%
>> memcpy_1m    7.470             11.153            -33.02%
>>
>> This is the demo
>> #include "stdio.h"
>> #include "string.h"
>> #include "stdlib.h"
>>
>> char a[1024*1024] = {12};
>> char b[1024*1024] = {13};
>> int main(int argc, char *argv[])
>> {
>>     int i = atoi(argv[1]);
>>     int j;
>>     int size = atoi(argv[2]);
>>
>>     for (j = 0; j < i; j++)
>>         memcpy(b, a, size*1024);
>>     return 0;
>> }
>>
>> # gcc -g -O0 memcpy.c -o memcpy
>> # time taskset -c 10 ./memcpy 100000 1024
>>
>> Co-authored-by: liqingqing
>>
>> ---
>>  sysdeps/aarch64/multiarch/memcpy.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/sysdeps/aarch64/multiarch/memcpy.c
>> b/sysdeps/aarch64/multiarch/memcpy.c
>> index 27259d3386..0e0a5cbcfb 100644
>> --- a/sysdeps/aarch64/multiarch/memcpy.c
>> +++ b/sysdeps/aarch64/multiarch/memcpy.c
>> @@ -37,7 +37,7 @@ extern __typeof (__redirect_memcpy) __memcpy_falkor
>> attribute_hidden; libc_ifunc (__libc_memcpy,
>>  (IS_THUNDERX (midr)
>>  ? __memcpy_thunderx
>> - : (IS_FALKOR (midr) || IS_PHECDA (midr) || IS_KUNPENG920 (midr)
>> + : (IS_FALKOR (midr) || IS_PHECDA (midr)
>>  ? __memcpy_falkor
>>  : (IS_THUNDERX2 (midr) || IS_THUNDERX2PA (midr)
>>  ? __memcpy_thunderx2
>>
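
For anyone who wants to repeat the measurement without relying on `time`, here is a minimal sketch, not part of the patch, that times the copy loop in-process and computes the table's last column. The names `improvement_percent` and `time_memcpy_seconds` are hypothetical, and the formula (before / after - 1) is only inferred from the figures in the table above, so treat it as an assumption. The helpers are meant to be called from a small driver.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime / CLOCK_MONOTONIC */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Relative change in the sense the table's last column appears to use
   (assumption): positive when `after` is smaller (faster) than `before`. */
double improvement_percent(double before, double after)
{
    return (before / after - 1.0) * 100.0;
}

/* Copy `size` bytes `iters` times and return the elapsed wall-clock time
   in seconds, or -1.0 if the buffers cannot be allocated. */
double time_memcpy_seconds(size_t size, long iters)
{
    assert(size > 0);

    char *src = malloc(size);
    char *dst = malloc(size);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    /* Touch both buffers so page faults are not counted in the timing. */
    memset(src, 12, size);
    memset(dst, 13, size);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long j = 0; j < iters; j++) {
        memcpy(dst, src, size);
        /* Compiler barrier (GCC/Clang extension) so the copy is kept
           even when building with optimization. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                     + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    free(src);
    free(dst);
    return elapsed;
}
```

A driver would call `time_memcpy_seconds(1024 * 1024, 100000)` once per glibc build, pinned to one core with `taskset` as in the demo, and pass the two timings to `improvement_percent`.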