Here is an updated version of my new MIPS memset.S routine. I fixed the formatting of the comments and the ifdef indentation, and I ran 'make check' and 'make bench' on little-endian and big-endian systems with the o32, n32, and n64 ABIs. The testing found a bug that my original testing missed, and I have fixed it (it involved a negative value as the constant being set). Other than that, the only failures I saw were the expected check-localplt and check-execstack errors.

I don't know if you want to see all the performance results from bench-memset.out, since it has a lot of output, but looking at the average time for 131072-byte memsets, the original libc in o32 little-endian mode averaged 43732 (seconds, I guess) and the new one averaged 27365. n32 went from 21886 to 21881 and n64 went from 21882 to 21877. So the 64-bit numbers only improved a little, but the 32-bit version shows a very nice improvement (about 37% less time).

Steve Ellcey
sellcey@mips.com

2013-09-18  Steve Ellcey

	* sysdeps/mips/memset.S: Change prefetching and add loop unrolling.
	* sysdeps/mips/mips64/memset.S: Remove.