On Thu, Jan 14, 2021 at 7:16 PM Hongtao Liu wrote:
>
> ping.
>
> On Thu, Jan 7, 2021 at 1:22 PM Hongtao Liu wrote:
> >
> > On Wed, Jan 6, 2021 at 10:39 PM Jakub Jelinek wrote:
> > >
> > > On Wed, Jan 06, 2021 at 02:49:13PM +0800, Hongtao Liu wrote:
> > > > ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn
> > > > for vector comparison to vector mask, but ix86_expand_sse_cmp(which is
> > > > called in upper 2 functions.) may return integer mask whenever integer
> > > > mask is available, so convert integer mask back to vector mask if
> > > > needed.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         PR target/98537
> > > >         * config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
> > > >         When cmp is integer mask, convert it to vector.
> > > >         (ix86_expand_int_vec_cmp): Ditto.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         PR target/98537
> > > >         * g++.target/i386/avx512bw-pr98537-1.C: New test.
> > > >         * g++.target/i386/avx512vl-pr98537-1.C: New test.
> > > >         * g++.target/i386/avx512vl-pr98537-2.C: New test.
> > >
> > > Do we optimize it then to an AVX/AVX2 comparison if possible?

A new patch is proposed to fix a series of performance and correctness
regressions introduced by r10-5250.

Integer mask comparison will now be used only for 512-bit vectors and
for 128/256-bit vcondmn (excluding the case where op_true/op_false are
all 1s or 0s, which is actually a vec_cmpmn).

In ix86_expand_sse_cmp/ix86_expand_int_sse_cmp:

-  if (ix86_valid_mask_cmp_mode (cmp_ops_mode))
+  if (GET_MODE_SIZE (mode) == 64
+      || (ix86_valid_mask_cmp_mode (cmp_ops_mode)
+          /* When op_true and op_false is NULL, vector dest is required.  */
+          && op_true && op_false
+          /* Gimple sometimes transforms vec_cmpmn to vcondmn with
+             op_true/op_false as constm1_rtx/const0_rtx.
+             Don't generate integer mask comparison then.  */
+          && !((vector_all_ones_operand (op_true, GET_MODE (op_true))
+                && CONST0_RTX (GET_MODE (op_false)) == op_false)
+               || (vector_all_ones_operand (op_false, GET_MODE (op_false))
+                   && CONST0_RTX (GET_MODE (op_true)) == op_true))))

Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
OK for trunk? OK to backport to GCC 10?

gcc/ChangeLog:

        PR target/98537
        * config/i386/i386-expand.c (ix86_expand_sse_cmp): Don't
        generate integer mask comparison for 128/256-bit vectors when
        op_true/op_false is NULL_RTX or CONSTM1_RTX/CONST0_RTX.  Also
        delete redundant !maskcmp condition.
        (ix86_expand_int_vec_cmp): Ditto but no redundant deletion
        here.
        (ix86_expand_sse_movcc): Delete definition of maskcmp, put its
        condition directly into if (maskcmp), and add an extra check
        that cmpmode is MODE_INT.
        (ix86_expand_fp_vec_cmp): Pass NULL to ix86_expand_sse_cmp's
        parameters op_true/op_false.

gcc/testsuite/ChangeLog:

        PR target/98537
        * g++.target/i386/avx512bw-pr98537-1.C: New test.
        * g++.target/i386/avx512vl-pr98537-1.C: New test.
        * g++.target/i386/avx512vl-pr98537-2.C: New test.
        * gcc.target/i386/avx512vl-pr88547-1.c: Adjust testcase, integer
        mask comparison should not be generated.
        * gcc.target/i386/avx512vl-pr92686-vpcmp-1.c: This test guards
        code generation of integer mask comparison, but for vector
        comparison to a vector dest, integer mask comparison is
        undesirable, so delete this useless test.
        * gcc.target/i386/avx512vl-pr92686-vpcmp-2.c: Ditto.
        * gcc.target/i386/avx512vl-pr92686-vpcmp-intelasm-1.c: Ditto.

--
BR,
Hongtao
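
For readers without i386-expand.c at hand, the new condition can be
modelled roughly as the standalone predicate below.  This is only a
sketch: enum operand_kind, use_integer_mask_cmp and check_examples are
hypothetical names invented here, not GCC code.  The operand kinds stand
in for the NULL check, vector_all_ones_operand and the CONST0_RTX
comparison, and mode_size stands in for GET_MODE_SIZE (mode) in bytes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for what the condition tests on op_true/op_false.  */
enum operand_kind { OP_NULL, OP_ALL_ONES, OP_ZERO, OP_OTHER };

/* Model of the rewritten condition: true means an integer mask
   comparison would be chosen, false means a vector comparison.  */
bool
use_integer_mask_cmp (int mode_size, bool valid_mask_cmp_mode,
                      enum operand_kind op_true, enum operand_kind op_false)
{
  /* 512-bit (64-byte) vectors always take the integer mask path.  */
  if (mode_size == 64)
    return true;

  if (!valid_mask_cmp_mode)
    return false;

  /* No op_true/op_false: the caller needs a vector destination.  */
  if (op_true == OP_NULL || op_false == OP_NULL)
    return false;

  /* An all-ones/zero pair is really a vec_cmpmn written as vcondmn.  */
  if ((op_true == OP_ALL_ONES && op_false == OP_ZERO)
      || (op_false == OP_ALL_ONES && op_true == OP_ZERO))
    return false;

  return true;
}

/* A few cases matching the description in the mail.  */
void
check_examples (void)
{
  assert (use_integer_mask_cmp (64, false, OP_NULL, OP_NULL));
  assert (use_integer_mask_cmp (32, true, OP_OTHER, OP_OTHER));
  assert (!use_integer_mask_cmp (32, true, OP_NULL, OP_NULL));
  assert (!use_integer_mask_cmp (16, true, OP_ALL_ONES, OP_ZERO));
}
```

The 64-byte test comes first on purpose: as in the patch, it applies
regardless of the operands, while the operand checks only narrow the
128/256-bit case.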