From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <58D8ED38.20203@arm.com>
Date: Mon, 27 Mar 2017 10:45:00 -0000
From: Szabolcs Nagy
To: Steve Ellcey, libc-alpha
CC: Siddhesh Poyarekar, Adhemerval Zanella
Subject: Re: [Patch] aarch64: Thunderx specific memcpy and memmove
References: <1490397926.19074.73.camel@caviumnetworks.com>
In-Reply-To: <1490397926.19074.73.camel@caviumnetworks.com>
X-SW-Source: 2017-03/txt/msg00609.txt.bz2

On 24/03/17 23:25, Steve Ellcey wrote:
> Now that the IFUNC infrastructure for aarch64 is in place, here is a
> patch to use it to create ThunderX specific versions of memcpy and
> memmove.
>
> This was part of my original patch before it was split in two and a
> couple of issues were raised at that time.
>
> Siddhesh Poyarekar wanted to separate the generic and thunderx copies
> of memcpy/memmove instead of using ifdefs in a combined source file.
> I prefer the ifdef version as a cleaner implementation with less code
> duplication but I can change it if that is the consensus.
>

both are fine with me.

> Also Adhemerval Zanella did some benchmarking that showed the
> prefetching done in the thunderx version might be appropriate for the
> generic version.  However if you look at the prefetching we only do it
> every other time through the loop.  This is because the loop copies 64
> bytes and the ThunderX cache line size is 128 bytes.  If other aarch64
> chips have a 64 byte cache line they might want a different prefetching
> setup.
>
> If people think we should use the ThunderX version of memcpy for all
> aarch64 systems I am happy to drop this patch and create one that just
> changes memcpy.S to do the ThunderX style prefetches for all aarch64
> systems.
>

adding prefetches to the generic code is preferable
if it can make both thunderx and generic users happy.

we need to find what's the best way to add the
prefetches, the new memcpy benchmarks may help here.

> Steve Ellcey
> sellcey@cavium.com
>
>
> 2017-03-24  Steve Ellcey
>
> 	* sysdeps/aarch64/memcpy.S (MEMMOVE, MEMCPY): New macros.
> 	(memmove): Use MEMMOVE for name.
> 	(memcpy): Use MEMCPY for name.  Add loop with prefetching
> 	under USE_THUNDERX macro.
> 	* sysdeps/aarch64/multiarch/Makefile: New file.
> 	* sysdeps/aarch64/multiarch/ifunc-impl-list.c: Likewise.
> 	* sysdeps/aarch64/multiarch/init-arch.h: Likewise.
> 	* sysdeps/aarch64/multiarch/memcpy.c: Likewise.
> 	* sysdeps/aarch64/multiarch/memcpy_generic.S: Likewise.
> 	* sysdeps/aarch64/multiarch/memcpy_thunderx.S: Likewise.
> 	* sysdeps/aarch64/multiarch/memmove.c: Likewise.
>
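
For illustration only (this is not code from the patch, which is in
aarch64 assembly): a minimal C sketch of the prefetch cadence discussed
above, where the loop copies 64 bytes per iteration and one prefetch is
issued per 128-byte cache line, i.e. every other time through the loop.
The helper name copy_with_prefetch, the PREFETCH_AHEAD distance and the
returned hint count are assumptions made up for this sketch; only the 64
and 128 byte figures come from the thread.  __builtin_prefetch is the
GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK          64   /* bytes copied per loop iteration (from the thread) */
#define CACHE_LINE     128  /* ThunderX cache line size (from the thread) */
#define PREFETCH_AHEAD (2 * CACHE_LINE)  /* assumed distance, illustrative only */

/* Hypothetical helper: copy n bytes in CHUNK-sized steps and issue one
   prefetch hint per CACHE_LINE bytes of source, which with these values
   means every other iteration.  Returns the number of hints issued so
   the cadence can be checked.  The cadence is keyed on the offset from
   the start of the buffer, not on the real address alignment; an actual
   implementation would have to account for alignment.  */
static size_t
copy_with_prefetch (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  size_t hints = 0;
  size_t i = 0;

  assert (dst != NULL && src != NULL);

  for (; i + CHUNK <= n; i += CHUNK)
    {
      /* Only hint at the start of a new line, and only for addresses
         that are still inside the source buffer.  */
      if (i % CACHE_LINE == 0 && i + PREFETCH_AHEAD < n)
        {
          __builtin_prefetch (s + i + PREFETCH_AHEAD, 0, 3);
          hints++;
        }
      memcpy (d + i, s + i, CHUNK);
    }

  /* Copy the tail that does not fill a whole chunk.  */
  memcpy (d + i, s + i, n - i);
  return hints;
}
```

With CACHE_LINE set to 64 instead, the condition would hold on every
iteration, which is the sort of difference between cores that the
generic version would have to handle.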