From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-x229.google.com (mail-lj1-x229.google.com [IPv6:2a00:1450:4864:20::229]) by sourceware.org (Postfix) with ESMTPS id 3B0643882100 for ; Fri, 14 Jun 2024 06:13:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3B0643882100 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 3B0643882100 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::229 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1718345587; cv=none; b=b5Jy53wSwwoxcNw2Y35Uc1e9SxGxNhSwK29G1zUg1VMylW72OY3peXvMtZY92YMemgWZK9lreMuwuNC/GPzI1xc+4AALu8CbRY53Lt/uswq8VO86eDx3Y7KZDTlQQZAuK+YgamLiogMfdmWVfrglGAtICZbwQJmzGfrHtQ+gT/A= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1718345587; c=relaxed/simple; bh=EeMD7QyLcJ51+6g9r/4SBwN4jaUO+Hzu4/8y+PxzHL8=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=U/dKZk5LN/OwRzWpr3JplEcgfgpTIuTe1IjIKgYuonT+WuxHMgH+JbO8b/T7VecwAO1WyBkQ33LdsA1BnvFFnBXAARQv5HVc7zH0iq06btj/yF+yLrgGhjXLJukpQYtBjqRhmeFjwxIg62nK7RDvBoA8JJjW+tzPKIF4mik4HKk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lj1-x229.google.com with SMTP id 38308e7fff4ca-2eadaac1d28so17473691fa.3 for ; Thu, 13 Jun 2024 23:13:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718345581; x=1718950381; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=dtVPEtB7o83Kr+VNMz2rUZqurkdDT99mCRWD9f+b3SU=; b=N/3AEDPhJ7xfT8tVRNQviSxODAkPNTb5lTyZOZD+bquKamQlunCp0Tv3nrm2V8x6GV pc9Y4AYHyEL3SCjkTTp2JbjEz7B543JNe0YZdJg8B+i61e7UIJf8nvLz2b4EKrJE5o2r 7GgpQGPfhAkq2hWVxeONuLWNiqlpEoaB4p7RK6qEqMkSxQPzH5FD+V6BF+KTgat68xBC mF6sXAiUlZkW3vKFeM/f5pgYwrx/BP9ycpfVfNqT5uDSOXPlTxnEsM+1faIoJRk/8y5O IUaX3aBnngQVW8hsS1bLMaOeEDu8ui/qJq7HTKPX+9K5JJcPGv+5Q+VUcjrkyXhg+/Y6 u0pw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718345581; x=1718950381; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dtVPEtB7o83Kr+VNMz2rUZqurkdDT99mCRWD9f+b3SU=; b=CAHPy/0u6PN2VQPFn6xyx9trDuAFhdsVKw0Pl2rUzVZTqPqCkpu6YKXcYTpV6++E83 wLEdG6SaIBusWFroedlUi33AMP+bjVrG1uHchMUM8vZZd9Upx8tpVm5XBSWanA8MxaR7 HxXjQbpXf8wDR2RGR43qQRmZvTWRDGnf6mv8JuOn/bsjtQkACtPjhgWAFtnqLVpMmIRU 6ji1wvelXpk0ZRD7vBeKwMbZBZi7Dd21hZnpKZE2jNK+ZhM48VDATCdK0nGSiY43nVJb YnkjsO+35rM+suyIPeOzY9j29hoi4mDalp7pSPX8flsI0YUFpOWZrpUkOFP9rtM1Nf63 Lm2Q== X-Gm-Message-State: AOJu0YxHNimT66utbMDjQytqDbrY/eMoi9W1OPtydb1EOmMdaGaEA+94 qBw1UnpBIG987K+uxeX9OXat+Wff3Qf7f5HePDKOxe29IDpTm/t9EpBVPxo+KxxYIDPIdYo+bsz yJY9aZIEzlztgoObIXj6QuIYjeUM= X-Google-Smtp-Source: AGHT+IHKyBopu+mo0Bl/S59KbX+vEi5+wWvr3Pmc+KMp2sMKBOCZfCMulG1j4lRLainGexw92ZVk+6R2KFlOonjwv/0= X-Received: by 2002:a2e:851a:0:b0:2eb:e3a7:8b with SMTP id 38308e7fff4ca-2ec0e47a0bdmr11898411fa.26.1718345580110; Thu, 13 Jun 2024 23:13:00 -0700 (PDT) MIME-Version: 1.0 References: <20240614005632.4088419-1-lingling.kong@intel.com> <20240614005632.4088419-2-lingling.kong@intel.com> In-Reply-To: From: Richard Biener Date: Fri, 14 Jun 2024 08:12:49 +0200 Message-ID: Subject: Re: [PATCH 2/2] [APX CFCMOV] Support APX CFCMOV To: "Kong, Lingling" , Richard Sandiford Cc: "gcc-patches@gcc.gnu.org" , "Liu, Hongtao" , Uros Bizjak Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, Jun 14, 2024 at 3:39=E2=80=AFAM Kong, Lingling wrote: > > From: konglin1 > > > > APX CFCMOV feature implements conditionally faulting which means that all > > memory faults are suppressed when the condition code evaluates to false a= nd > > load or store a memory operand. Now we could load or store a memory opera= nd > > may trap or fault for conditional move. > > > > To enable CFCMOV, we add a target HOOK TARGET_HAVE_CONDITIONAL_MOVE_MEM_N= OTRAP > > in if-conversion pass to allow convert to cmov. > > > > Bootstrapped & regtested on x86-64-pc-linux-gnu with binutils 2.42 branch= . > > OK for trunk? How does if-conversion end up modifying the IL? I have the gut feeling that your hook changes semantics of RTL and you should instead have an optab for a "masked" load/store? Richard - do you already have plans how to represent the first-fault loads? (are there first-fault stores?) Richard. > > > gcc/ChangeLog: > > > > * config/i386/i386-expand.cc (ix86_can_cfcmov_p): New func= tion that > > test if the cfcmov can be generated. > > (ix86_expand_int_movcc): Expand to cfcmov pattern if ix86_= can_cfcmov_p > > return ture. > > * config/i386/i386-opts.h (enum apx_features): Add apx_cfc= mov. > > * config/i386/i386.cc (ix86_have_conditional_move_mem_notr= ap): New > > function to hook TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP > > (TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP): Target hook def= ine. > > (ix86_rtx_costs): Add UNSPEC_APX_CFCMOV cost; > > * config/i386/i386.h (TARGET_APX_CFCMOV): Define. > > * config/i386/i386.md (*cfcmov_1): New define_insn t= o support > > cfcmov. > > (*cfcmov_2): Ditto. > > (UNSPEC_APX_CFCMOV): New unspec for cfcmov. > > * config/i386/i386.opt: Add enum value for cfcmov. > > * ifcvt.cc (noce_try_cmove_load_mem_notrap): Use target ho= ok to allow > > convert to cfcmov for conditional load. > > (noce_try_cmove_store_mem_notrap): Convert to conditional = store. > > (noce_process_if_block): Ditto. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/i386/apx-cfcmov-1.c: New test. > > * gcc.target/i386/apx-cfcmov-2.c: Ditto. > > --- > > gcc/config/i386/i386-expand.cc | 63 +++++ > > gcc/config/i386/i386-opts.h | 4 +- > > gcc/config/i386/i386.cc | 33 ++- > > gcc/config/i386/i386.h | 1 + > > gcc/config/i386/i386.md | 53 +++- > > gcc/config/i386/i386.opt | 3 + > > gcc/config/i386/predicates.md | 7 + > > gcc/ifcvt.cc | 247 ++++++++++++++++++- > > gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c | 73 ++++++ > > gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c | 40 +++ > > 10 files changed, 511 insertions(+), 13 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c > > create mode 100644 gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c > > > > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand= .cc > > index 312329e550b..c02a4bcbec3 100644 > > --- a/gcc/config/i386/i386-expand.cc > > +++ b/gcc/config/i386/i386-expand.cc > > @@ -3336,6 +3336,30 @@ ix86_expand_int_addcc (rtx operands[]) > > return true; > > } > > > > +/* Return TRUE if we could convert "if (test) x =3D a; else x =3D b;" to= cfcmov, > > + especially when load a or b or x store may cause memmory faults. */ > > +bool > > +ix86_can_cfcmov_p (rtx x, rtx a, rtx b) > > +{ > > + machine_mode mode =3D GET_MODE (x); > > + if (TARGET_APX_CFCMOV > > + && (mode =3D=3D DImode || mode =3D=3D SImode || mode =3D=3D HImode= )) > > + { > > + /* C load (r m r), (r m C), (r r m). For r m m could use > > + two cfcmov. */ > > + if (register_operand (x, mode) > > + && ((MEM_P (a) && register_operand (b, mode)) > > + || (MEM_P (a) && b =3D=3D const0_rtx) > > + || (register_operand (a, mode) && MEM_P (b)) > > + || (MEM_P (a) && MEM_P (b)))) > > + return true; > > + /* C store (m r 0). */ > > + else if (MEM_P (x) && x =3D=3D b && register_operand (a, mode)) > > + return true; > > + } > > + return false; > > +} > > + > > bool > > ix86_expand_int_movcc (rtx operands[]) > > { > > @@ -3366,6 +3390,45 @@ ix86_expand_int_movcc (rtx operands[]) > > > > compare_code =3D GET_CODE (compare_op); > > > > + if (MEM_P (operands[0]) > > + && !ix86_can_cfcmov_p (operands[0], op2, op3)) > > + return false; > > + > > + if (may_trap_or_fault_p (op2) || may_trap_or_fault_p (op3)) > > + { > > + if (ix86_can_cfcmov_p (operands[0], op2, op3)) > > + { > > + if (may_trap_or_fault_p (op2)) > > + op2 =3D gen_rtx_UNSPEC (mode, gen_rtvec (1, operands[= 2]), > > + UNSPEC_APX= _CFCMOV); > > + if (may_trap_or_fault_p (op3)) > > + op3 =3D gen_rtx_UNSPEC (mode, gen_rtvec (1, operands[= 3]), > > + UNSPEC_APX= _CFCMOV); > > + emit_insn (compare_seq); > > + > > + if (may_trap_or_fault_p (op2) && may_trap_or_fault_p (o= p3)) > > + { > > + emit_insn (gen_rtx_SET (operands[0], > > + = gen_rtx_IF_THEN_ELSE (mode, > > + = compare_op, > > + = op2, > > + = operands[0]))); > > + emit_insn (gen_rtx_SET (operands[0], > > + = gen_rtx_IF_THEN_ELSE (mode, > > + = compare_op, > > + = operands[0], > > + = op3))); > > + } > > + else > > + emit_insn (gen_rtx_SET (operands[0], > > + gen_rtx_= IF_THEN_ELSE (mode, > > + = compare_op, > > + = op2, op3))); > > + return true; > > + } > > + return false; > > + } > > + > > if ((op1 =3D=3D const0_rtx && (code =3D=3D GE || code =3D=3D LT)) > > || (op1 =3D=3D constm1_rtx && (code =3D=3D GT || code =3D=3D LE))) > > sign_bit_compare_p =3D true; > > diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h > > index c7ec0d9fd39..711519ffb53 100644 > > --- a/gcc/config/i386/i386-opts.h > > +++ b/gcc/config/i386/i386-opts.h > > @@ -143,8 +143,10 @@ enum apx_features { > > apx_nf =3D 1 << 4, > > apx_ccmp =3D 1 << 5, > > apx_zu =3D 1 << 6, > > + apx_cfcmov =3D 1 << 7, > > apx_all =3D apx_egpr | apx_push2pop2 | apx_ndd > > - | apx_ppx | apx_nf | apx_ccmp | apx_zu, > > + | apx_ppx | apx_nf | apx_ccmp | apx_zu > > + | apx_cfcmov, > > }; > > > > #endif > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > index 173db213d14..b14c0a3d9f2 100644 > > --- a/gcc/config/i386/i386.cc > > +++ b/gcc/config/i386/i386.cc > > @@ -22349,10 +22349,18 @@ ix86_rtx_costs (rtx x, machine_mode mode, int o= uter_code_i, int opno, > > *total =3D COSTS_N_INSNS (1); > > if (!COMPARISON_P (XEXP (x, 0)) && !REG_P (XEXP (x, 0))) > > *total +=3D rtx_cost (XEXP (x, 0), mode, code, 0, spee= d); > > - if (!REG_P (XEXP (x, 1))) > > - *total +=3D rtx_cost (XEXP (x, 1), mode, code, 1, spee= d); > > - if (!REG_P (XEXP (x, 2))) > > - *total +=3D rtx_cost (XEXP (x, 2), mode, code, 2, spee= d); > > + rtx op1, op2; > > + op1 =3D XEXP (x, 1); > > + op2 =3D XEXP (x, 2); > > + /* Handle UNSPEC_APX_CFCMOV for cfcmov. */ > > + if (GET_CODE (op1) =3D=3D UNSPEC && XINT (op1, 1) =3D=3D = UNSPEC_APX_CFCMOV) > > + op1 =3D XVECEXP (op1, 0, 0); > > + if (GET_CODE (op2) =3D=3D UNSPEC && XINT (op2, 1) =3D=3D = UNSPEC_APX_CFCMOV) > > + op2 =3D XVECEXP (op2, 0, 0); > > + if (!REG_P (op1)) > > + *total +=3D rtx_cost (op1, mode, code, 1, speed); > > + if (!REG_P (op2)) > > + *total +=3D rtx_cost (op2, mode, code, 2, speed); > > return true; > > } > > return false; > > @@ -24998,6 +25006,19 @@ ix86_noce_conversion_profitable_p (rtx_insn *seq= , struct noce_if_info *if_info) > > return default_noce_conversion_profitable_p (seq, if_info); > > } > > > > + > > +/* Implement targetm.have_conditional_move_mem_notrap hook. */ > > +static bool > > +ix86_have_conditional_move_mem_notrap (rtx x) > > +{ > > + machine_mode mode =3D GET_MODE (x); > > + if (TARGET_APX_CFCMOV > > + && (mode =3D=3D DImode || mode =3D=3D SImode || mode =3D=3D HImod= e) > > + && MEM_P (x)) > > + return true; > > + return false; > > +} > > + > > /* x86-specific vector costs. */ > > class ix86_vector_costs : public vector_costs > > { > > @@ -26975,6 +26996,10 @@ ix86_libgcc_floating_mode_supported_p > > #undef TARGET_NOCE_CONVERSION_PROFITABLE_P > > #define TARGET_NOCE_CONVERSION_PROFITABLE_P ix86_noce_conversion_profitab= le_p > > > > +#undef TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP > > +#define TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP \ > > + ix86_have_conditional_move_mem_notrap > > + > > #undef TARGET_HARD_REGNO_NREGS > > #define TARGET_HARD_REGNO_NREGS ix86_hard_regno_nregs > > #undef TARGET_HARD_REGNO_MODE_OK > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > > index dc1a1f44320..6a20fa678c8 100644 > > --- a/gcc/config/i386/i386.h > > +++ b/gcc/config/i386/i386.h > > @@ -58,6 +58,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively= . If not, see > > #define TARGET_APX_NF (ix86_apx_features & apx_nf) > > #define TARGET_APX_CCMP (ix86_apx_features & apx_ccmp) > > #define TARGET_APX_ZU (ix86_apx_features & apx_zu) > > +#define TARGET_APX_CFCMOV (ix86_apx_features & apx_cfcmov) > > > > #include "config/vxworks-dummy.h" > > > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > > index fd48e764469..57448c07828 100644 > > --- a/gcc/config/i386/i386.md > > +++ b/gcc/config/i386/i386.md > > @@ -221,6 +221,9 @@ > > ;; For APX CCMP support > > ;; DFV =3D default flag value > > UNSPEC_APX_DFV > > + > > + ;; For APX CFCMOV support > > + UNSPEC_APX_CFCMOV > > ]) > > > > (define_c_enum "unspecv" [ > > @@ -579,7 +582,7 @@ > > noavx512dq,fma_or_avx512vl,avx512vl,noa= vx512vl,avxvnni, > > avx512vnnivl,avx512fp16,avxifma,avx512i= fmavl,avxneconvert, > > avx512bf16vl,vpclmulqdqvl,avx_noavx512f= ,avx_noavx512vl, > > - vaes_avx512vl,noapx_nf" > > + vaes_avx512vl,noapx_nf,apx_cfcmov" > > (const_string "base")) > > > > ;; The (bounding maximum) length of an instruction immediate. > > @@ -986,6 +989,7 @@ > > (eq_attr "mmx_isa" "avx") > > (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX") > > (eq_attr "isa" "noapx_nf") (symbol_ref "!TARGET_APX_NF") > > + (eq_attr "isa" "apx_cfcmov") (symbol_ref "TARGET_APX_CFCMOV= ") > > ] > > (const_int 1))) > > > > @@ -24995,7 +24999,7 @@ > > ;; Conditional move instructions. > > > > (define_expand "movcc" > > - [(set (match_operand:SWIM 0 "register_operand") > > + [(set (match_operand:SWIM 0 "register_or_cfc_mem_operand") > > (if_then_else:SWIM (match_operand 1 "comparison_operator") > > (match_operand:SWIM 2 "") > > (match_operand:SWIM 3 "")))] > > @@ -25103,19 +25107,54 @@ > > (set (match_dup 0) > > (neg:SWI (ltu:SWI (reg:CCC FLAGS_REG) (const_int 0))))]) > > > > +(define_insn "*cfcmov_1" > > + [(set (match_operand:SWI248 0 "register_operand" "=3Dr,r") > > + (if_then_else:SWI248 (match_operator 1 "ix86_comparison_operator= " > > + [(reg FLAGS_REG) (const_int 0)]) > > + (unspec:SWI248 > > + [(match_operand:SWI248 2 "memory_operand" "m,m")] > > + UNSPEC_APX_CFCMOV) > > + (match_operand:SWI248 3 "reg_or_0_operand" "C,r")))] > > + "TARGET_CMOVE && TARGET_APX_CFCMOV" > > + "@ > > + cfcmov%O2%C1\t{%2, %0|%0, %2} > > + cfcmov%O2%C1\t{%2, %3, %0|%0, %3, %2}" > > + [(set_attr "isa" "*,apx_ndd") > > + (set_attr "type" "icmov") > > + (set_attr "prefix" "evex") > > + (set_attr "mode" "")]) > > + > > +(define_insn "*cfcmov_2" > > + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=3Dr,m") > > + (if_then_else:SWI248 (match_operator 1 "ix86_comparison_operator= " > > + [(reg FLAGS_REG) (const_int 0)]) > > + (match_operand:SWI248 2 "register_operand" "r,r") > > + (unspec:SWI248 > > + [(match_operand:SWI248 3 "memory_operand" "m,0")] > > + UNSPEC_APX_CFCMOV)))] > > + "TARGET_CMOVE && TARGET_APX_CFCMOV" > > + "@ > > + cfcmov%O2%c1\t{%3, %2, %0|%0, %2, %3} > > + cfcmov%O2%C1\t{%2, %0|%0, %2}" > > + [(set_attr "isa" "apx_ndd,*") > > + (set_attr "type" "icmov") > > + (set_attr "prefix" "evex") > > + (set_attr "mode" "")]) > > + > > (define_insn "*movcc_noc" > > - [(set (match_operand:SWI248 0 "register_operand" "=3Dr,r,r,r") > > + [(set (match_operand:SWI248 0 "register_operand" "=3Dr,r,r,r,r") > > (if_then_else:SWI248 (match_operator 1 "ix86_comparison_op= erator" > > [(reg FLAGS_REG) (con= st_int 0)]) > > - (match_operand:SWI248 2 "nonimmediate_operand" "rm,0,rm,= r") > > - (match_operand:SWI248 3 "nonimmediate_operand" "0,rm,r,r= m")))] > > + (match_operand:SWI248 2 "nonimmediate_operand" "rm,0,rm,r= ,r") > > + (match_operand:SWI248 3 "nonimm_or_0_operand" "0,rm,r,rm,= C")))] > > "TARGET_CMOVE && !(MEM_P (operands[2]) && MEM_P (operands[3]))" > > "@ > > cmov%O2%C1\t{%2, %0|%0, %2} > > cmov%O2%c1\t{%3, %0|%0, %3} > > cmov%O2%C1\t{%2, %3, %0|%0, %3, %2} > > - cmov%O2%c1\t{%3, %2, %0|%0, %2, %3}" > > - [(set_attr "isa" "*,*,apx_ndd,apx_ndd") > > + cmov%O2%c1\t{%3, %2, %0|%0, %2, %3} > > + cfcmov%O2%C1\t{%2, %0|%0, %2}" > > + [(set_attr "isa" "*,*,apx_ndd,apx_ndd,apx_cfcmov") > > (set_attr "type" "icmov") > > (set_attr "mode" "")]) > > > > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt > > index 353fffb2343..7d63d9abd95 100644 > > --- a/gcc/config/i386/i386.opt > > +++ b/gcc/config/i386/i386.opt > > @@ -1345,6 +1345,9 @@ Enum(apx_features) String(ccmp) Value(apx_ccmp) Set= (7) > > EnumValue > > Enum(apx_features) String(zu) Value(apx_zu) Set(8) > > > > +EnumValue > > +Enum(apx_features) String(cfcmov) Value(apx_cfcmov) Set(9) > > + > > EnumValue > > Enum(apx_features) String(all) Value(apx_all) Set(1) > > > > diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.m= d > > index 7afe3100cb7..d562e10ab41 100644 > > --- a/gcc/config/i386/predicates.md > > +++ b/gcc/config/i386/predicates.md > > @@ -2322,3 +2322,10 @@ > > > > return true; > > }) > > + > > +;; Return true if OP is a register operand or memory_operand is only > > +;; supported under TARGET_APX_CFCMOV. > > +(define_predicate "register_or_cfc_mem_operand" > > + (ior (match_operand 0 "register_operand") > > + (and (match_code "mem") > > + (match_test "TARGET_APX_CFCMOV")))) > > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc > > index 58ed42673e5..6e3e48af810 100644 > > --- a/gcc/ifcvt.cc > > +++ b/gcc/ifcvt.cc > > @@ -783,6 +783,8 @@ static rtx noce_emit_cmove (struct noce_if_info *, rt= x, enum rtx_code, rtx, > > rtx, rtx, rtx, rtx =3D N= ULL, rtx =3D NULL); > > static bool noce_try_cmove (struct noce_if_info *); > > static bool noce_try_cmove_arith (struct noce_if_info *); > > +static bool noce_try_cmove_load_mem_notrap (struct noce_if_info *); > > +static bool noce_try_cmove_store_mem_notrap (struct noce_if_info *, rtx = *, rtx); > > static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn *= *); > > static bool noce_try_minmax (struct noce_if_info *); > > static bool noce_try_abs (struct noce_if_info *); > > @@ -2401,6 +2403,237 @@ noce_try_cmove_arith (struct noce_if_info *if_inf= o) > > return false; > > } > > > > +/* When target support suppress memory fault, try more complex cases inv= olving > > + conditional_move's source or dest may trap or fault. */ > > + > > +static bool > > +noce_try_cmove_load_mem_notrap (struct noce_if_info *if_info) > > +{ > > + rtx a =3D if_info->a; > > + rtx b =3D if_info->b; > > + rtx x =3D if_info->x; > > + > > + if (MEM_P (x)) > > + return false; > > + /* Just handle a conditional move from one trap MEM + other non_trap, > > + non mem cases. */ > > + if (!(MEM_P (a) ^ MEM_P (b))) > > + return false; > > + bool a_trap =3D may_trap_or_fault_p (a); > > + bool b_trap =3D may_trap_or_fault_p (b); > > + > > + if (!(a_trap ^ b_trap)) > > + return false; > > + if (a_trap && (!MEM_P (a) || !targetm.have_conditional_move_mem_notrap= (a))) > > + return false; > > + if (b_trap && (!MEM_P (b) || !targetm.have_conditional_move_mem_notrap= (b))) > > + return false; > > + > > + rtx orig_b; > > + rtx_insn *insn_a, *insn_b; > > + bool a_simple =3D if_info->then_simple; > > + bool b_simple =3D if_info->else_simple; > > + basic_block then_bb =3D if_info->then_bb; > > + basic_block else_bb =3D if_info->else_bb; > > + rtx target; > > + enum rtx_code code; > > + rtx cond =3D if_info->cond; > > + rtx_insn *ifcvt_seq; > > + > > + /* if (test) x =3D *a; else x =3D c - d; > > + =3D> x =3D c - d; > > + if (test) > > + x =3D *a; > > + */ > > + > > + code =3D GET_CODE (cond); > > + insn_a =3D if_info->insn_a; > > + insn_b =3D if_info->insn_b; > > + > > + machine_mode x_mode =3D GET_MODE (x); > > + > > + if (!can_conditionally_move_p (x_mode)) > > + return false; > > + > > + /* Because we only handle one trap MEM + other non_trap, non mem cases= , > > + just move one trap MEM always in then_bb. */ > > + if (noce_reversed_cond_code (if_info) !=3D UNKNOWN) > > + { > > + bool reversep =3D false; > > + if (b_trap) > > + reversep =3D true; > > + > > + if (reversep) > > + { > > + if (if_info->rev_cond) > > + { > > + cond =3D if_info->rev_cond; > > + code =3D GET_CODE (cond); > > + } > > + else > > + code =3D reversed_comparison_code (cond, if_info->jump)= ; > > + std::swap (a, b); > > + std::swap (insn_a, insn_b); > > + std::swap (a_simple, b_simple); > > + std::swap (then_bb, else_bb); > > + } > > + } > > + > > + if (then_bb && else_bb > > + && (!bbs_ok_for_cmove_arith (then_bb, else_bb, if_info->orig_x) > > + || !bbs_ok_for_cmove_arith (else_bb, then_bb, if_info->o= rig_x))) > > + return false; > > + > > + start_sequence (); > > + > > + /* If one of the blocks is empty then the corresponding B or A value > > + came from the test block. The non-empty complex block that we will > > + emit might clobber the register used by B or A, so move it to a pse= udo > > + first. */ > > + > > + rtx tmp_b =3D NULL_RTX; > > + > > + /* Don't move trap mem to a pseudo. */ > > + if (!may_trap_or_fault_p (b) && (b_simple || !else_bb)) > > + tmp_b =3D gen_reg_rtx (x_mode); > > + > > + orig_b =3D b; > > + > > + rtx emit_a =3D NULL_RTX; > > + rtx emit_b =3D NULL_RTX; > > + rtx_insn *tmp_insn =3D NULL; > > + bool modified_in_a =3D false; > > + bool modified_in_b =3D false; > > + /* If either operand is complex, load it into a register first. > > + The best way to do this is to copy the original insn. In this > > + way we preserve any clobbers etc that the insn may have had. > > + This is of course not possible in the IS_MEM case. */ > > + > > + if (! general_operand (b, GET_MODE (b)) || tmp_b) > > + { > > + if (insn_b) > > + { > > + b =3D tmp_b ? tmp_b : gen_reg_rtx (GET_MODE (b)); > > + rtx_insn *copy_of_b =3D as_a (copy_rtx (= insn_b)); > > + rtx set =3D single_set (copy_of_b); > > + > > + SET_DEST (set) =3D b; > > + emit_b =3D PATTERN (copy_of_b); > > + } > > + else > > + { > > + rtx tmp_reg =3D tmp_b ? tmp_b : gen_reg_rtx (GET_MODE= (b)); > > + emit_b =3D gen_rtx_SET (tmp_reg, b); > > + b =3D tmp_reg; > > + } > > + } > > + > > + if (tmp_b && then_bb) > > + { > > + FOR_BB_INSNS (then_bb, tmp_insn) > > + /* Don't check inside insn_a. We will have changed it to e= mit_a > > + with a destination that doesn't conflict. */ > > + if (!(insn_a && tmp_insn =3D=3D insn_a) > > + && modified_in_p (orig_b, tmp_insn)) > > + { > > + modified_in_a =3D true; > > + break; > > + } > > + > > + } > > + > > + modified_in_b =3D emit_b !=3D NULL_RTX && modified_in_p (a, emit_b); > > + /* If insn to set up A clobbers any registers B depends on, try to > > + swap insn that sets up A with the one that sets up B. If even > > + that doesn't help, punt. */ > > + if (modified_in_a && !modified_in_b) > > + { > > + if (!noce_emit_bb (emit_b, else_bb, b_simple)) > > + goto end_seq_and_fail; > > + > > + if (!noce_emit_bb (emit_a, then_bb, a_simple)) > > + goto end_seq_and_fail; > > + } > > + else if (!modified_in_a) > > + { > > + if (!noce_emit_bb (emit_b, else_bb, b_simple)) > > + goto end_seq_and_fail; > > + > > + if (!noce_emit_bb (emit_a, then_bb, a_simple)) > > + goto end_seq_and_fail; > > + } > > + else > > + goto end_seq_and_fail; > > + > > + target =3D noce_emit_cmove (if_info, x, code, XEXP (cond, 0), XEXP (co= nd, 1), > > + a, b); > > + > > + if (! target) > > + goto end_seq_and_fail; > > + > > + if (target !=3D x) > > + noce_emit_move_insn (x, target); > > + > > + ifcvt_seq =3D end_ifcvt_sequence (if_info); > > + if (!ifcvt_seq || !targetm.noce_conversion_profitable_p (ifcvt_seq, if= _info)) > > + return false; > > + > > + emit_insn_before_setloc (ifcvt_seq, if_info->jump, > > + INSN_LOCATION (if_info->in= sn_a)); > > + if_info->transform_name =3D "noce_try_cmove_load_mem_notrap"; > > + return true; > > + > > + end_seq_and_fail: > > + end_sequence (); > > + return false; > > +} > > + > > +static bool > > +noce_try_cmove_store_mem_notrap (struct noce_if_info *if_info, rtx *x_pt= r, rtx orig_x) > > +{ > > + rtx a =3D if_info->a; > > + rtx b =3D if_info->b; > > + rtx x =3D orig_x; > > + machine_mode x_mode =3D GET_MODE (x); > > + > > + if (!MEM_P (x) || !rtx_equal_p (x, b)) > > + return false; > > + if (!may_trap_or_fault_p (x) || !targetm.have_conditional_move_mem_not= rap (x)) > > + return false; > > + if (!if_info->then_simple || !register_operand (a, x_mode)) > > + return false; > > + > > + rtx cond =3D if_info->cond; > > + enum rtx_code code =3D GET_CODE (cond); > > + rtx_insn *ifcvt_seq; > > + > > + start_sequence (); > > + > > + rtx target =3D noce_emit_cmove (if_info, x, code, XEXP (cond, 0), XEXP= (cond, 1), > > + a, b); > > + > > + if (! target) > > + goto end_seq_and_fail; > > + > > + if (target !=3D x) > > + noce_emit_move_insn (x, target); > > + > > + ifcvt_seq =3D end_ifcvt_sequence (if_info); > > + if (!ifcvt_seq || !targetm.noce_conversion_profitable_p (ifcvt_seq, if= _info)) > > + return false; > > + > > + emit_insn_before_setloc (ifcvt_seq, if_info->jump, > > + INSN_LOCATION (if_info->in= sn_a)); > > + if_info->transform_name =3D "noce_try_cmove_load_mem_notrap"; > > + if_info->x =3D orig_x; > > + *x_ptr =3D orig_x; > > + return true; > > + > > + end_seq_and_fail: > > + end_sequence (); > > + return false; > > +} > > + > > /* For most cases, the simplified condition we found is the best > > choice, but this is not the case for the min/max/abs transforms. > > For these we wish to know that it is A or B in the condition. */ > > @@ -4121,12 +4354,21 @@ noce_process_if_block (struct noce_if_info *if_in= fo) > > } > > > > if (!set_b && MEM_P (orig_x)) > > + { > > + /* Conditional_move_suppress_fault for condition mem store would n= ot > > + move any arithmetic calculations. */ > > + if (targetm.have_conditional_move_mem_notrap (orig_x) > > + && HAVE_conditional_move > > + && noce_try_cmove_store_mem_notrap (if_info, &x, orig_x)) > > + goto success; > > + else > > /* We want to avoid store speculation to avoid cases like > > if (pthread_mutex_trylock(mutex)) > > ++global_variable; > > Rather than go to much effort here, we rely on the SSA optimizers= , > > which do a good enough job these days. */ > > - return false; > > + return false; > > + } > > > > if (noce_try_move (if_info)) > > goto success; > > @@ -4160,6 +4402,9 @@ noce_process_if_block (struct noce_if_info *if_info= ) > > if (HAVE_conditional_move > > && noce_try_cmove_arith (if_info)) > > goto success; > > + if (HAVE_conditional_move > > + && noce_try_cmove_load_mem_notrap (if_info)) > > + goto success; > > if (noce_try_sign_mask (if_info)) > > goto success; > > } > > diff --git a/gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c b/gcc/testsuite= /gcc.target/i386/apx-cfcmov-1.c > > new file mode 100644 > > index 00000000000..4a1fb91b24c > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c > > @@ -0,0 +1,73 @@ > > +/* { dg-do compile { target { ! ia32 } } } */ > > +/* { dg-options "-O3 -mapxf" } */ > > + > > +/* { dg-final { scan-assembler-times "cfcmovne" 1 } } */ > > +/* { dg-final { scan-assembler-times "cfcmovg" 2} } */ > > +/* { dg-final { scan-assembler-times "cfcmove" 1 } } */ > > +/* { dg-final { scan-assembler-times "cfcmovl" 2 } } */ > > +/* { dg-final { scan-assembler-times "cfcmovle" 1 } } */ > > + > > +__attribute__((noinline, noclone, target("apxf"))) > > +int cfc_store (int a, int b, int c, int d, int *arr) > > +{ > > + if (a !=3D b) > > + *arr =3D c; > > + return d; > > + > > +} > > + > > +__attribute__((noinline, noclone, target("apxf"))) > > +int cfc_load_ndd (int a, int b, int c, int *p) > > +{ > > + if (a > b) > > + return *p; > > + return c; > > +} > > + > > +__attribute__((noinline, noclone, target("apxf"))) > > +int cfc_load_2_trap (int a, int b, int *c, int *p) > > +{ > > + if (a > b) > > + return *p; > > + return *c; > > +} > > + > > +__attribute__((noinline, noclone, target("apxf"))) > > +int cfc_load_zero (int a, int b, int c) > > +{ > > + int sum =3D 0; > > + if (a =3D=3D b) > > + return c; > > + return sum; > > +} > > + > > +__attribute__((noinline, noclone, target("apxf"))) > > +int cfc_load_mem (int a, int b, int *p) > > +{ > > + int sum =3D 0; > > + if (a < b ) > > + sum =3D *p; > > + return sum; > > +} > > + > > +__attribute__((noinline, noclone, target("apxf"))) > > +int cfc_load_arith_1 (int a, int b, int c, int *p) > > +{ > > + int sum =3D 0; > > + if (a > b) > > + sum =3D *p; > > + else > > + sum =3D a + c; > > + return sum + 1; > > +} > > + > > +__attribute__((noinline, noclone, target("apxf"))) > > +int cfc_load_arith_2 (int a, int b, int c, int *p) > > +{ > > + int sum =3D 0; > > + if (a > b) > > + sum =3D a + c; > > + else > > + sum =3D *p; > > + return sum + 1; > > +} > > diff --git a/gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c b/gcc/testsuite= /gcc.target/i386/apx-cfcmov-2.c > > new file mode 100644 > > index 00000000000..2b1660f64fa > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c > > @@ -0,0 +1,40 @@ > > +/* { dg-do run { target { ! ia32 } } } */ > > +/* { dg-require-effective-target apxf } */ > > +/* { dg-options "-mapxf -march=3Dx86-64 -O3" } */ > > + > > +#include "apx-cfcmov-1.c" > > + > > +extern void abort (void); > > + > > +int main () > > +{ > > + if (!__builtin_cpu_supports ("apxf")) > > + return 0; > > + > > + int arr =3D 6; > > + int arr1 =3D 5; > > + int res =3D cfc_store (1, 2, 3, 4, &arr); > > + if (arr !=3D 3 && res !=3D 4) > > + abort (); > > + res =3D cfc_load_ndd (2, 1, 2, &arr); > > + if (res !=3D 3) > > + abort (); > > + res =3D cfc_load_2_trap (1, 2, &arr1, &arr); > > + if (res !=3D 5) > > + abort (); > > + res =3D cfc_load_zero (1, 2, 3); > > + res =3D cfc_load_zero (1, 2, 3); > > + if (res !=3D 0) > > + abort (); > > + res =3D cfc_load_mem (2, 1, &arr); > > + if (res !=3D 0) > > + abort (); > > + res =3D cfc_load_arith_1 (1, 2, 3, &arr); > > + if (res !=3D 5) > > + abort(); > > + res =3D cfc_load_arith_2 (2, 1, 3,&arr); > > + if (res !=3D 6) > > + abort(); > > + return 0; > > +} > > + > > -- > > 2.31.1 > >