On Fri, Apr 1, 2016 at 12:38 PM, H.J. Lu wrote:
> Tested on Haswell, Ivy Bridge, Westmere and Penryn.  Also tested with
> --disable-multi-arch.  Any comments, feedback?
>
>
> H.J.
> ---
> Since the new SSE2/AVX2 memcpy/memmove are faster than the previous ones,
> we can remove the previous SSE2/AVX2 memcpy/memmove and replace them with
> the new ones.
>
> No change in IFUNC selection if SSE2 and AVX2 memcpy/memmove weren't used
> before.  If SSE2 or AVX2 memcpy/memmove were used, the new SSE2 or AVX2
> memcpy/memmove optimized with Enhanced REP MOVSB will be used for
> processors with ERMS.  The new AVX512 memcpy/memmove will be used for
> processors with AVX512 which prefer vzeroupper.
>
> Since the new SSE2 memcpy/memmove are faster than the previous default
> memcpy/memmove used in libc.a and ld.so, we also remove the previous
> default memcpy/memmove and make them the default memcpy/memmove.
>
> Together, it reduces the size of libc.so by about 6 KB and the size of
> ld.so by about 2 KB.
>

Here is the updated patch against master.

The current memcpy performance data is at

https://sourceware.org/bugzilla/attachment.cgi?id=9184

The current memmove performance data is at

https://sourceware.org/bugzilla/attachment.cgi?id=9185

Any comments, feedback, or objections?

--
H.J.
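
For readers skimming the thread, the selection rule described in the commit
message can be sketched roughly as below.  This is only an illustration, not
the code in the patch: the enum labels, struct cpu_info and
select_copy_variant are hypothetical names made up for this sketch, and the
order in which AVX512 and ERMS are checked is an assumption, since the
message does not state it.

```c
#include <stdbool.h>

/* Hypothetical labels for this sketch; these are not glibc symbol names.  */
enum copy_variant
{
  VARIANT_UNCHANGED,   /* Keep whatever IFUNC selected before.  */
  VARIANT_SSE2,        /* New SSE2 memcpy/memmove.  */
  VARIANT_SSE2_ERMS,   /* New SSE2 version using Enhanced REP MOVSB.  */
  VARIANT_AVX2,        /* New AVX2 memcpy/memmove.  */
  VARIANT_AVX2_ERMS,   /* New AVX2 version using Enhanced REP MOVSB.  */
  VARIANT_AVX512       /* New AVX512 memcpy/memmove.  */
};

/* Which implementation the old selection would have picked.  */
enum previous_choice { PREV_OTHER, PREV_SSE2, PREV_AVX2 };

/* Hypothetical, simplified view of the CPU properties that matter here.  */
struct cpu_info
{
  bool avx512_prefers_vzeroupper; /* AVX512 present, vzeroupper preferred.  */
  bool erms;                      /* Enhanced REP MOVSB available.  */
};

static enum copy_variant
select_copy_variant (enum previous_choice prev, const struct cpu_info *cpu)
{
  /* No change if SSE2 and AVX2 memcpy/memmove weren't used before.  */
  if (prev == PREV_OTHER)
    return VARIANT_UNCHANGED;

  /* Assumption: the AVX512 check takes priority over the ERMS check.  */
  if (cpu->avx512_prefers_vzeroupper)
    return VARIANT_AVX512;

  /* Otherwise keep the same ISA level and add ERMS where available.  */
  if (prev == PREV_AVX2)
    return cpu->erms ? VARIANT_AVX2_ERMS : VARIANT_AVX2;
  return cpu->erms ? VARIANT_SSE2_ERMS : VARIANT_SSE2;
}
```

Under these assumptions, a processor that previously got the AVX2 version
and reports ERMS would get VARIANT_AVX2_ERMS, while one that previously got
some other implementation is left alone.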