Hi:
  Add define_peephole2 to perform optimizations like the ones below:

+/* Optimize for TARGET_AVX512F
+   vpsubusw op1, op2, dst1;
+   vxorps xmm, xmm, dst2;       ---->  vpcmpleuw op1, op2, dst3
+   vpcmpeqw dst1, dst2, dst3  */

and

+/* Optimize for target above TARGET_SSE4_1
+   vpsubusw op1, op2, dst1;            vpminuw op1, op2, dst1
+   vpxor xmm, xmm, dst2;       ---->   vpcmpeqw op1, dst1, dst3
+   vpcmpeqw dst1, dst2, dst3  */

Bootstrap is ok, and regression testing is ok for the i386/x86-64 backend.
Ok for trunk?

gcc/ChangeLog:

	PR target/96906
	* config/i386/sse.md (VI12_128_256): New mode iterator.
	(define_peephole2): Optimize comparison between the result of
	us_minus and 0; it can be optimized to "vpcmplequ" for AVX512
	or "pminu + cmpeq" for targets above TARGET_SSE4_1.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx2-pr96906-1.c: New test.
	* gcc.target/i386/avx512f-pr96906-1.c: New test.
	* gcc.target/i386/sse2-pr96906.c: New test.
	* gcc.target/i386/sse4_1-pr96906-1.c: New test.

-- 
BR,
Hongtao
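
For reference, a minimal scalar sketch of why both rewrites should be valid: for unsigned elements, a saturating subtraction a - b is zero exactly when a <= b, which is also exactly when min (a, b) == a. The helper names below (sat_sub_u8, min_u8, count_mismatches) are hypothetical and are not taken from the patch or its tests; the check is per element on 8-bit values only, as a stand-in for one lane of the vector instructions, and it does not model the AT&T operand order used in the comments above.

```c
#include <assert.h>   /* for callers that want to assert on the result */
#include <stdint.h>

/* Unsigned saturating subtraction: one lane of us_minus (psubusb).  */
static uint8_t
sat_sub_u8 (uint8_t a, uint8_t b)
{
  return a > b ? (uint8_t) (a - b) : 0;
}

/* Unsigned minimum: one lane of pminub.  */
static uint8_t
min_u8 (uint8_t a, uint8_t b)
{
  return a < b ? a : b;
}

/* Exhaustively compare the three formulations over all 8-bit pairs:
     original:       sat_sub (a, b) == 0
     AVX512 form:    a <= b              (unsigned compare)
     SSE4.1 form:    min (a, b) == a
   Returns the number of pairs on which they disagree.  */
static unsigned
count_mismatches (void)
{
  unsigned mismatches = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      {
	int orig = sat_sub_u8 ((uint8_t) a, (uint8_t) b) == 0;
	int cmp_le = a <= b;
	int min_eq = min_u8 ((uint8_t) a, (uint8_t) b) == (uint8_t) a;
	if (orig != cmp_le || orig != min_eq)
	  mismatches++;
      }
  return mismatches;
}
```

Calling count_mismatches () from a small driver is enough to exercise the identity; the same argument carries over to 16-bit elements, which are simply too many pairs to enumerate cheaply here.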