From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk1-x734.google.com (mail-qk1-x734.google.com [IPv6:2607:f8b0:4864:20::734]) by sourceware.org (Postfix) with ESMTPS id 95BD63955C34 for ; Wed, 20 Jan 2021 13:09:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 95BD63955C34 Received: by mail-qk1-x734.google.com with SMTP id b64so25174336qkc.12 for ; Wed, 20 Jan 2021 05:09:13 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:autocrypt :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=YTEnycbH5zvTwBrnPW0cO+Qnx7Qt3w2/5I22PUoTLIE=; b=MhWsrwdWIpEbhI1UrTZDsPtq9JrNNP/R3KPggT79TYZgEUwhv6QDOwczMaYSdbZF+U ++/NVNCwSNAhYuDaTTxOTOw2Zu/UQjmTjhe9c2iLq4dXcP+MgFsWBWXm83cAInoZ8nkO w+btBV279HPa6NcUxbObXt4TtlcrkXCJUVjcLq1FwViML8j9VMSYtx5MWlPtzVH+Lse2 F2sNFNF324UbgHkTrdPGsgOTEextjWxf+SVPGIE7q5xYDXPH44SdpooW+VaPx09r+ZYv qOyVnvs0ctZmX+WXumc4/7TJFTzUs6eClxVnyfvjz/HFGgcWhIqMFyYBZ82MZ4qS9uCU 1ujA== X-Gm-Message-State: AOAM533guErPTNEB9dSnckKtsroKLzaaAshNSPcEsT/H6k2q2nxXVsuY JEPBjCgfUV9k9gub+X/fL8STcRjoJr+NCQ== X-Google-Smtp-Source: ABdhPJzwYsgdjtOFG8yb1Yk8ZxJ/nz6xEMSCzhGaBxMpgjHSWssRabY+BJCE7CRiwnRtRdaFAEcdWw== X-Received: by 2002:a37:7981:: with SMTP id u123mr9172435qkc.360.1611148153119; Wed, 20 Jan 2021 05:09:13 -0800 (PST) Received: from [192.168.1.4] ([177.194.48.209]) by smtp.googlemail.com with ESMTPSA id b7sm1078992qtp.54.2021.01.20.05.09.10 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 20 Jan 2021 05:09:12 -0800 (PST) Subject: Re: [PATCH] aarch64: revert memcpy optimze for kunpeng to avoid performance degradation To: Shuo Wang , zhangxuelei4@huawei.com, libc-alpha@sourceware.org Cc: hushiyuan@huawei.com, liqingqing3@huawei.com References: <20210120072044.23228-1-wangshuo47@huawei.com> From: Adhemerval Zanella Autocrypt: addr=adhemerval.zanella@linaro.org; prefer-encrypt=mutual; keydata= mQINBFcVGkoBEADiQU2x/cBBmAVf5C2d1xgz6zCnlCefbqaflUBw4hB/bEME40QsrVzWZ5Nq 8kxkEczZzAOKkkvv4pRVLlLn/zDtFXhlcvQRJ3yFMGqzBjofucOrmdYkOGo0uCaoJKPT186L NWp53SACXguFJpnw4ODI64ziInzXQs/rUJqrFoVIlrPDmNv/LUv1OVPKz20ETjgfpg8MNwG6 iMizMefCl+RbtXbIEZ3TE/IaDT/jcOirjv96lBKrc/pAL0h/O71Kwbbp43fimW80GhjiaN2y WGByepnkAVP7FyNarhdDpJhoDmUk9yfwNuIuESaCQtfd3vgKKuo6grcKZ8bHy7IXX1XJj2X/ BgRVhVgMHAnDPFIkXtP+SiarkUaLjGzCz7XkUn4XAGDskBNfbizFqYUQCaL2FdbW3DeZqNIa nSzKAZK7Dm9+0VVSRZXP89w71Y7JUV56xL/PlOE+YKKFdEw+gQjQi0e+DZILAtFjJLoCrkEX w4LluMhYX/X8XP6/C3xW0yOZhvHYyn72sV4yJ1uyc/qz3OY32CRy+bwPzAMAkhdwcORA3JPb kPTlimhQqVgvca8m+MQ/JFZ6D+K7QPyvEv7bQ7M+IzFmTkOCwCJ3xqOD6GjX3aphk8Sr0dq3 4Awlf5xFDAG8dn8Uuutb7naGBd/fEv6t8dfkNyzj6yvc4jpVxwARAQABtElBZGhlbWVydmFs IFphbmVsbGEgTmV0dG8gKExpbmFybyBWUE4gS2V5KSA8YWRoZW1lcnZhbC56YW5lbGxhQGxp bmFyby5vcmc+iQI3BBMBCAAhBQJXFRpKAhsDBQsJCAcDBRUKCQgLBRYCAwEAAh4BAheAAAoJ EKqx7BSnlIjv0e8P/1YOYoNkvJ+AJcNUaM5a2SA9oAKjSJ/M/EN4Id5Ow41ZJS4lUA0apSXW NjQg3VeVc2RiHab2LIB4MxdJhaWTuzfLkYnBeoy4u6njYcaoSwf3g9dSsvsl3mhtuzm6aXFH /Qsauav77enJh99tI4T+58rp0EuLhDsQbnBic/ukYNv7sQV8dy9KxA54yLnYUFqH6pfH8Lly sTVAMyi5Fg5O5/hVV+Z0Kpr+ZocC1YFJkTsNLAW5EIYSP9ftniqaVsim7MNmodv/zqK0IyDB GLLH1kjhvb5+6ySGlWbMTomt/or/uvMgulz0bRS+LUyOmlfXDdT+t38VPKBBVwFMarNuREU2 69M3a3jdTfScboDd2ck1u7l+QbaGoHZQ8ZNUrzgObltjohiIsazqkgYDQzXIMrD9H19E+8fw kCNUlXxjEgH/Kg8DlpoYJXSJCX0fjMWfXywL6ZXc2xyG/hbl5hvsLNmqDpLpc1CfKcA0BkK+ k8R57fr91mTCppSwwKJYO9T+8J+o4ho/CJnK/jBy1pWKMYJPvvrpdBCWq3MfzVpXYdahRKHI ypk8m4QlRlbOXWJ3TDd/SKNfSSrWgwRSg7XCjSlR7PNzNFXTULLB34sZhjrN6Q8NQZsZnMNs TX8nlGOVrKolnQPjKCLwCyu8PhllU8OwbSMKskcD1PSkG6h3r0AquQINBFcVGkoBEACgAdbR Ck+fsfOVwT8zowMiL3l9a2DP3Eeak23ifdZG+8Avb/SImpv0UMSbRfnw/N81IWwlbjkjbGTu oT37iZHLRwYUFmA8fZX0wNDNKQUUTjN6XalJmvhdz9l71H3WnE0wneEM5ahu5V1L1utUWTyh VUwzX1lwJeV3vyrNgI1kYOaeuNVvq7npNR6t6XxEpqPsNc6O77I12XELic2+36YibyqlTJIQ V1SZEbIy26AbC2zH9WqaKyGyQnr/IPbTJ2Lv0dM3RaXoVf+CeK7gB2B+w1hZummD21c1Laua +VIMPCUQ+EM8W9EtX+0iJXxI+wsztLT6vltQcm+5Q7tY+HFUucizJkAOAz98YFucwKefbkTp eKvCfCwiM1bGatZEFFKIlvJ2QNMQNiUrqJBlW9nZp/k7pbG3oStOjvawD9ZbP9e0fnlWJIsj 6c7pX354Yi7kxIk/6gREidHLLqEb/otuwt1aoMPg97iUgDV5mlNef77lWE8vxmlY0FBWIXuZ yv0XYxf1WF6dRizwFFbxvUZzIJp3spAao7jLsQj1DbD2s5+S1BW09A0mI/1DjB6EhNN+4bDB SJCOv/ReK3tFJXuj/HbyDrOdoMt8aIFbe7YFLEExHpSk+HgN05Lg5TyTro8oW7TSMTk+8a5M kzaH4UGXTTBDP/g5cfL3RFPl79ubXwARAQABiQIfBBgBCAAJBQJXFRpKAhsMAAoJEKqx7BSn lIjvI/8P/jg0jl4Tbvg3B5kT6PxJOXHYu9OoyaHLcay6Cd+ZrOd1VQQCbOcgLFbf4Yr+rE9l mYsY67AUgq2QKmVVbn9pjvGsEaz8UmfDnz5epUhDxC6yRRvY4hreMXZhPZ1pbMa6A0a/WOSt AgFj5V6Z4dXGTM/lNManr0HjXxbUYv2WfbNt3/07Db9T+GZkpUotC6iknsTA4rJi6u2ls0W9 1UIvW4o01vb4nZRCj4rni0g6eWoQCGoVDk/xFfy7ZliR5B+3Z3EWRJcQskip/QAHjbLa3pml xAZ484fVxgeESOoaeC9TiBIp0NfH8akWOI0HpBCiBD5xaCTvR7ujUWMvhsX2n881r/hNlR9g fcE6q00qHSPAEgGr1bnFv74/1vbKtjeXLCcRKk3Ulw0bY1OoDxWQr86T2fZGJ/HIZuVVBf3+ gaYJF92GXFynHnea14nFFuFgOni0Mi1zDxYH/8yGGBXvo14KWd8JOW0NJPaCDFJkdS5hu0VY 7vJwKcyHJGxsCLU+Et0mryX8qZwqibJIzu7kUJQdQDljbRPDFd/xmGUFCQiQAncSilYOcxNU EMVCXPAQTteqkvA+gNqSaK1NM9tY0eQ4iJpo+aoX8HAcn4sZzt2pfUB9vQMTBJ2d4+m/qO6+ cFTAceXmIoFsN8+gFN3i8Is3u12u8xGudcBPvpoy4OoG Message-ID: <8b21765f-347d-ab00-5806-9d8a0ef17754@linaro.org> Date: Wed, 20 Jan 2021 10:09:09 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <20210120072044.23228-1-wangshuo47@huawei.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-13.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jan 2021 13:09:16 -0000 Hi, Since I don't have access to this specific hardware, it would be good if the original author, Xuelei Zhang, of the change could certify this reversion is ok. It should be ok during the freeze since it just a selection of an already tested implementation for an specific chip implementation. On 20/01/2021 04:20, Shuo Wang wrote: > In commit 863d775c481704baaa41855fc93e5a1ca2dc6bf6, kunpeng920 is added to default memcpy version, > however, there is performance degradation when the copy size is some large bytes, eg: 100k. > This is the result, tested in glibc-2.28: > before backport after backport Performance improvement > memcpy_1k 0.005 0.005 0.00% > memcpy_10k 0.032 0.029 10.34% > memcpy_100k 0.356 0.429 -17.02% > memcpy_1m 7.470 11.153 -33.02% > > This is the demo > #include "stdio.h" > #include "string.h" > #include "stdlib.h" > > char a[1024*1024] = {12}; > char b[1024*1024] = {13}; > int main(int argc, char *argv[]) > { > int i = atoi(argv[1]); > int j; > int size = atoi(argv[2]); > > for (j = 0; j < i; j++) > memcpy(b, a, size*1024); > return 0; > } > > # gcc -g -O0 memcpy.c -o memcpy > # time taskset -c 10 ./memcpy 100000 1024 > > Co-authored-by: liqingqing > > --- > sysdeps/aarch64/multiarch/memcpy.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/sysdeps/aarch64/multiarch/memcpy.c b/sysdeps/aarch64/multiarch/memcpy.c > index 27259d3386..0e0a5cbcfb 100644 > --- a/sysdeps/aarch64/multiarch/memcpy.c > +++ b/sysdeps/aarch64/multiarch/memcpy.c > @@ -37,7 +37,7 @@ extern __typeof (__redirect_memcpy) __memcpy_falkor attribute_hidden; > libc_ifunc (__libc_memcpy, > (IS_THUNDERX (midr) > ? __memcpy_thunderx > - : (IS_FALKOR (midr) || IS_PHECDA (midr) || IS_KUNPENG920 (midr) > + : (IS_FALKOR (midr) || IS_PHECDA (midr) > ? __memcpy_falkor > : (IS_THUNDERX2 (midr) || IS_THUNDERX2PA (midr) > ? __memcpy_thunderx2 >