As I was auditing rs6000.md for power9 changes, I noticed that changes I had
made in 2010 for power7 weren't as effective on power8.

The FCTIWZ/FCTIWUZ instructions convert a scalar floating point value to a
32-bit signed/unsigned integer in bits 32-63 of the floating point or vector
register. Unfortunately, the hardware does not guarantee that bits 0-31 are
copies of the sign bit, so the result cannot be used directly as a valid
64-bit integer. There is also no instruction that converts a 32-bit integer
to floating point; the conversion back needs a 64-bit integer.

This meant that in the power7 days, if you wanted to round a floating point
value to a 32-bit integer, you would need to do:

	convert to 32-bit integer
	store 32-bit value on the stack
	load 32-bit value to a GPR
	sign/zero extend it
	store 32-bit value to the stack
	load 32-bit value to a FPR/vector register.

The 2010 optimization does a single store/load to sign/zero extend the value,
rather than going through the GPRs.

On power8, we have a direct move instruction that copies the value between
the register sets, and the compiler will generate it if the above
optimization is turned off (which is what this patch does). There are other
ways to sign/zero extend a value in the vector registers without doing a
move, using multiple instructions, but in practice direct move seems to be as
fast as those sequences.

I bootstrapped the compiler and there were no regressions with this patch.

I rebuilt the Spec 2006 benchmark suite, and there were 7 benchmarks that
used this sequence somewhere in the code. I ran those benchmarks with this
patch and compared them to the original runs. In 6 of the benchmarks, the run
time was almost precisely the same. The 416.gamess benchmark was about 2%
faster, and there were no regressions.

Is this patch ok to apply to the trunk? I would like to apply it to the gcc 5
branch as well. Is this ok also?

[gcc]
2016-03-11  Michael Meissner

	PR target/70131
	* config/rs6000/rs6000.md (round322_fprs): Do not do the
	optimization if we have direct move.
	(roundu322_fprs): Likewise.
[gcc/testsuite]
2016-03-11  Michael Meissner

	PR target/70131
	* gcc.target/powerpc/ppc-round2.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
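
For illustration, here is a minimal sketch of the kind of C source that
exercises the float -> 32-bit integer -> float sequence described above. This
is not the ppc-round2.c test from the patch; the function names are
hypothetical, and which instructions the compiler actually picks depends on
the target and options, so treat the comments about code generation as
assumptions rather than guarantees.

```c
/* Hypothetical examples, not taken from the patch.

   Each function truncates a double to a 32-bit integer and converts
   the result straight back to double.  On PowerPC this is the shape
   of code where the converted 32-bit value has to be sign or zero
   extended to 64 bits before it can be converted back.  */

/* Signed case: expected to involve FCTIWZ plus a sign extension.  */
double
trunc_via_int (double x)
{
  return (double) (int) x;
}

/* Unsigned case: expected to involve FCTIWUZ plus a zero extension.
   The input must be in range for unsigned int, otherwise the C
   conversion is undefined.  */
double
trunc_via_uint (double x)
{
  return (double) (unsigned int) x;
}
```

Compiling a file like this with something like "gcc -O2 -mcpu=power8 -S" and
reading the assembly is one way to see whether the value goes through a
store/load on the stack or through a direct move.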