I've improved the builtin memcmp expansion so it avoids a couple of
things that p7 and previous processors don't like. Performance on
p7 is now never worse than glibc memcmp().

Bootstrap/regtest in progress on power7 ppc64 BE. OK for trunk if
testing passes?

gcc/ChangeLog:

2016-10-06  Aaron Sawdey

	* config/rs6000/rs6000.h (TARGET_EFFICIENT_OVERLAPPING_UNALIGNED):
	Add macro to say we can efficiently handle overlapping unaligned
	loads.
	* config/rs6000/rs6000.c (expand_block_compare): Avoid generating
	poor code for processors older than p8.

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
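
For readers unfamiliar with the term, here is a minimal sketch of what "overlapping unaligned loads" means in a block compare. It is an illustration only and not code from the patch: the names load_be64 and compare_8_to_16 are hypothetical, and the actual expansion is generated by expand_block_compare rather than written in C. The idea is that a length between 8 and 16 bytes can be covered by two 8-byte loads, the second one backed up so that it ends exactly at the last byte, instead of finishing with a series of smaller loads.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble 8 bytes into a big-endian 64-bit value,
   so that an unsigned integer compare orders the same way memcmp does.
   Works for any alignment of p.  */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical sketch: compare n bytes, assuming 8 <= n <= 16.
   The second load starts at offset n - 8, so it overlaps the first
   load whenever n < 16.  Returns <0, 0 or >0 like memcmp.  */
int compare_8_to_16(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    uint64_t x = load_be64(pa);
    uint64_t y = load_be64(pb);
    if (x != y)
        return x < y ? -1 : 1;

    /* The first 8 bytes are equal, so any bytes this load shares with
       the first one are equal too; a difference found here must lie
       in the tail bytes, and the sign of the result is still correct.  */
    x = load_be64(pa + n - 8);
    y = load_be64(pb + n - 8);
    if (x != y)
        return x < y ? -1 : 1;

    return 0;
}
```

A call such as compare_8_to_16(s1, s2, 11) reads bytes 0..7 and then bytes 3..10 of each buffer. Whether that second, overlapping and usually unaligned load is cheap depends on the processor, which is the kind of property a macro like TARGET_EFFICIENT_OVERLAPPING_UNALIGNED would describe.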