On Tue, 2017-05-09 at 08:45 +0530, Siddhesh Poyarekar wrote:
> On Wednesday 03 May 2017 07:31 PM, Szabolcs Nagy wrote:
> > 
> > if it turns out that a single generic memcpy does not work
> > it makes more sense to me to organize the code differently:
> > if we expect the generic memcpy to diverge from the thunderx
> > one then it's better not to use the same code with ifdefs, but
> > keep them separate, so the thunderx variant can be maintained
> > independently by whoever cares about thunderx.
> 
> If that is the case then I think Steve might be better off posting a
> patch with the thunderx implementation being independent of the stock
> aarch64 implementation while Wilco does his investigation.  That way we
> don't scramble for a patch late in the 2.26 cycle - there's about a
> month and a half left.
> 
> Siddhesh

That sounds reasonable to me.  Here is a patch that contains a separate
memcpy_thunderx implementation.

I still have some (minor) changes to the generic memcpy.S file.  One
change is to use macros for the function names so that the generic
multiarch memcpy can include the standard non-multiarch version.  The
other is to change a couple of internal labels to external labels.  This
change isn't absolutely necessary but it is helpful in the thunderx
memcpy where the branches are slightly different and I would like to
keep the thunderx memcpy and the generic memcpy as similar as possible
so that when a change happens in one or the other it is easy to compare
the two versions.  I don't believe using different label types affects
the generated code at all and personally, I find named labels easier to
read than the internal numbered labels.

Being able to compare the two memcpy's is also why I kept the THUNDERX
ifdef in memcpy_thunderx.S even though it is always defined there, so
that the intended differences are explicit when comparing the two
versions of memcpy.

Tested on the top-of-tree sources with no regressions.

Steve Ellcey
sellcey@cavium.com


2017-05-09  Steve Ellcey

	* sysdeps/aarch64/memcpy.S (MEMMOVE, MEMCPY): New macros.
	(memmove): Use MEMMOVE for name.
	(memcpy): Use MEMCPY for name.  Change internal labels to external
	labels.
	* sysdeps/aarch64/multiarch/Makefile: New file.
	* sysdeps/aarch64/multiarch/ifunc-impl-list.c: Likewise.
	* sysdeps/aarch64/multiarch/init-arch.h: Likewise.
	* sysdeps/aarch64/multiarch/memcpy.c: Likewise.
	* sysdeps/aarch64/multiarch/memcpy_generic.S: Likewise.
	* sysdeps/aarch64/multiarch/memcpy_thunderx.S: Likewise.
	* sysdeps/aarch64/multiarch/memmove.c: Likewise.
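
For anyone unfamiliar with the name-macro arrangement mentioned above, here
is a minimal sketch of the idea in plain C rather than AArch64 assembly.  It
is not taken from the patch: every identifier (sketch_memcpy,
sketch_memcpy_generic, sketch_memcpy_thunderx, sketch_select_memcpy) is
hypothetical, the byte loop only stands in for the real copy code, and the
selector is an ordinary function standing in for whatever runtime selection
the multiarch build performs.  In a real tree the shared implementation
would presumably be a separate file that the wrapper includes; both parts
sit in one file here only so the example compiles on its own.

```c
#include <stddef.h>

/* Hypothetical names throughout; this is an illustration, not glibc code.  */

/* A multiarch wrapper chooses the symbol name before pulling in the
   shared implementation.  */
#define MEMCPY sketch_memcpy_generic

/* ---- begin: what the shared implementation file would contain ---- */
#ifndef MEMCPY
/* Default name, used when nobody has overridden it (non-multiarch build).  */
# define MEMCPY sketch_memcpy
#endif

void *
MEMCPY (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  /* Trivial byte loop standing in for the real copy routine.  */
  while (n--)
    *d++ = *s++;
  return dst;
}
/* ---- end of shared implementation ---- */

/* A separately maintained variant under its own name.  */
void *
sketch_memcpy_thunderx (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  while (n--)
    *d++ = *s++;
  return dst;
}

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Plain-C stand-in for the runtime choice between the two variants.  */
memcpy_fn
sketch_select_memcpy (int is_thunderx)
{
  return is_thunderx ? sketch_memcpy_thunderx : sketch_memcpy_generic;
}
```

Because the wrapper defines MEMCPY first, the shared code is emitted under
the "generic" name; a build that does not define it gets the default name
instead, so the same source can serve both configurations.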