Hi Richard, on 2023/1/27 19:08, Richard Sandiford via Gcc-patches wrote: > PR96373 points out that a predicated SVE loop currently converts > trapping unconditional ops into unpredicated vector ops. Doing > the operation on inactive lanes can then raise an exception. > > As discussed in the PR trail, we aren't 100% consistent about > whether we preserve traps or not. But the direction of travel > is clearly to improve that rather than live with it. This patch > tries to do that for the SVE case. > > Doing this regresses gcc.target/aarch64/sve/fabd_1.c. I've added > -fno-trapping-math for now and filed PR108571 to track it. > A similar problem applies to fsubr_1.d. > > I think this is likely to regress Power 10, since conditional > operations are only available for masked loops. I think we'll > need to add -fno-trapping-math to any affected testcases, > but I don't have a Power 10 system to test on. Kewen, would you > mind giving this a spin and seeing how bad the fallout is? > Sorry for the late reply, I'm just back from vacation. Thank you for fixing this and caring about Power10! I tested your proposed patch on one Power10 machine (ppc64le), it's bootstrapped but some test failures got exposed as below. < FAIL: gcc.target/powerpc/p9-vec-length-epil-1.c scan-assembler-times \\\\mlxvl\\\\M 14 < FAIL: gcc.target/powerpc/p9-vec-length-epil-1.c scan-assembler-times \\\\mstxvl\\\\M 7 < FAIL: gcc.target/powerpc/p9-vec-length-epil-2.c scan-assembler-times \\\\mlxvl\\\\M 20 < FAIL: gcc.target/powerpc/p9-vec-length-epil-2.c scan-assembler-times \\\\mstxvl\\\\M 10 < FAIL: gcc.target/powerpc/p9-vec-length-epil-3.c scan-assembler-times \\\\mlxvl\\\\M 14 < FAIL: gcc.target/powerpc/p9-vec-length-epil-3.c scan-assembler-times \\\\mstxvl\\\\M 7 < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times \\\\mlxvl\\\\M 70 < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times \\\\mlxvx?\\\\M 120 < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times \\\\mstxvl\\\\M 70 < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times \\\\mstxvx?\\\\M 70 < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times \\\\mlxvl\\\\M 21 < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times \\\\mstxvl\\\\M 21 < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times \\\\mstxvx?\\\\M 21 < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times \\\\mlxvl\\\\M 10 < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times \\\\mlxvx?\\\\M 42 < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times \\\\mstxvl\\\\M 10 < FAIL: gcc.target/powerpc/p9-vec-length-epil-8.c scan-assembler-times \\\\mlxvl\\\\M 16 < FAIL: gcc.target/powerpc/p9-vec-length-epil-8.c scan-assembler-times \\\\mstxvl\\\\M 7 < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-not \\\\mlxvx\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-not \\\\mstxvx\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-times \\\\mlxvl\\\\M 20 < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-times \\\\mstxvl\\\\M 10 < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-not \\\\mlxvx\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-not \\\\mstxvx\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-times \\\\mlxvl\\\\M 20 < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-times \\\\mstxvl\\\\M 10 < FAIL: gcc.target/powerpc/p9-vec-length-full-3.c scan-assembler-times \\\\mlxvl\\\\M 14 < FAIL: gcc.target/powerpc/p9-vec-length-full-3.c scan-assembler-times \\\\mstxvl\\\\M 7 < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not \\\\mlxvx\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not \\\\mstxv\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not \\\\mstxvx\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-times \\\\mlxvl\\\\M 70 < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-times \\\\mstxvl\\\\M 70 < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not \\\\mlxvx\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not \\\\mstxv\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not \\\\mstxvx\\\\M < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-times \\\\mlxvl\\\\M 21 < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-times \\\\mstxvl\\\\M 21 < FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times \\\\mlxvl\\\\M 10 < FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times \\\\mstxvl\\\\M 10 < FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times \\\\mstxvx?\\\\M 6 < FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times \\\\mlxvl\\\\M 30 < FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times \\\\mstxvl\\\\M 10 By checking several of them, it's due to that we don't vectorize some loop having float type involved with partial vector any more. As you suggested above, I fixed them with an extra option "-fno-trapping-math" and verified all of them can pass again. I also noticed that the original test case in PR96373 fails on Power10 too, so I added one constructed case pr96373.c into sub bucket gcc.target/powerpc for testing coverage on Power. One re-spin with the attached adjustment shows there is no regression failure any more, and the new test case works well on both ppc64 (P8) and ppc64le (P10) Linux. BR, Kewen -----