From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Sayle"
To: gcc-patches@gcc.gnu.org
Subject: [GCC 13 PATCH] PR target/109973: CCZmode and CCCmode variants of [v]ptest.
Date: Sat, 10 Jun 2023 23:54:36 +0100
Message-ID: <03bd01d99bee$888f3a70$99adaf50$@nextmovesoftware.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_NextPart_000_03BE_01D99BF6.EA53A270"

This is a multipart message in MIME format.

------=_NextPart_000_03BE_01D99BF6.EA53A270
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

This is a backport of the fixes for PR target/109973 and PR target/110083.

This backport to the releases/gcc-13 branch has been tested on
x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and
without --target_board=unix{-m32}, with no new failures.  Ok for gcc-13,
or should we just close PR 109973 in Bugzilla?


2023-06-10  Roger Sayle
            Uros Bizjak

gcc/ChangeLog
        PR target/109973
        PR target/110083
        * config/i386/i386-builtin.def (__builtin_ia32_ptestz128): Use new
        CODE_FOR_sse4_1_ptestzv2di.
        (__builtin_ia32_ptestc128): Use new CODE_FOR_sse4_1_ptestcv2di.
        (__builtin_ia32_ptestz256): Use new CODE_FOR_avx_ptestzv4di.
        (__builtin_ia32_ptestc256): Use new CODE_FOR_avx_ptestcv4di.
        * config/i386/i386-expand.cc (ix86_expand_branch): Use CCZmode
        when expanding UNSPEC_PTEST to compare against zero.
        * config/i386/i386-features.cc (scalar_chain::convert_compare):
        Likewise generate CCZmode UNSPEC_PTESTs when converting comparisons.
        Update or delete REG_EQUAL notes, converting CONST_INT and
        CONST_WIDE_INT immediate operands to a suitable CONST_VECTOR.
        (general_scalar_chain::convert_insn): Use CCZmode for COMPARE result.
        (timode_scalar_chain::convert_insn): Use CCZmode for COMPARE result.
        * config/i386/i386-protos.h (ix86_match_ptest_ccmode): Prototype.
        * config/i386/i386.cc (ix86_match_ptest_ccmode): New predicate to
        check for suitable matching modes for the UNSPEC_PTEST pattern.
        * config/i386/sse.md (define_split): When splitting UNSPEC_MOVMSK
        to UNSPEC_PTEST, preserve the FLAGS_REG mode as CCZ.
        (*_ptest): Add asterisk to hide define_insn.  Remove ":CC" mode of
        FLAGS_REG, instead use ix86_match_ptest_ccmode.
        (_ptestz): New define_expand to specify CCZ.
        (_ptestc): New define_expand to specify CCC.
        (_ptest): A define_expand using CC to preserve the current behavior.
        (*ptest_and): Specify CCZ to only perform this optimization when
        only the Z flag is required.

gcc/testsuite/ChangeLog
        PR target/109973
        PR target/110083
        * gcc.target/i386/pr109973-1.c: New test case.
        * gcc.target/i386/pr109973-2.c: Likewise.
        * gcc.target/i386/pr110083.c: Likewise.

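As an illustration of the last sse.md entry above: PTEST sets ZF when
(DEST AND SRC) is all zeros, and CF when ((NOT DEST) AND SRC) is all zeros.
The sketch below is a scalar model only, with one 64-bit lane standing in
for the whole vector.  The struct and function names are hypothetical and
are not part of the patch.  It is meant to show why folding
ptest (x & y, x & y) into ptest (x, y) appears safe for the Z flag but not
for the C flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar model of the two flags that PTEST defines.
   One 64-bit lane stands in for the full vector (a simplification).  */
struct ptest_flags
{
  bool zf;
  bool cf;
};

static struct ptest_flags
model_ptest (uint64_t dest, uint64_t src)
{
  struct ptest_flags f;
  f.zf = (dest & src) == 0;   /* ZF: DEST AND SRC is all zeros.  */
  f.cf = (~dest & src) == 0;  /* CF: (NOT DEST) AND SRC is all zeros.  */
  return f;
}

/* True if folding ptest (x & y, x & y) into ptest (x, y) keeps ZF.
   Both sides reduce to (x & y) == 0, so this holds for every input.  */
static bool
fold_preserves_zf (uint64_t x, uint64_t y)
{
  return model_ptest (x & y, x & y).zf == model_ptest (x, y).zf;
}

/* True if the same fold keeps CF.  ptest (a, a) always sets CF, while
   ptest (x, y) sets it only when y has no bits outside x, so this can
   be false.  */
static bool
fold_preserves_cf (uint64_t x, uint64_t y)
{
  return model_ptest (x & y, x & y).cf == model_ptest (x, y).cf;
}
```

With x = 0xF0 and y = 0x0F the Z flag agrees on both sides while the C flag
does not, which is the situation the new pr109973 tests exercise through
__builtin_ia32_ptestc128 and __builtin_ia32_ptestc256.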
Thanks,
Roger
--

------=_NextPart_000_03BE_01D99BF6.EA53A270
Content-Type: text/plain; name="patchpt.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="patchpt.txt"

diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 6dae697..37df018 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -1004,8 +1004,8 @@ BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_roundps_sfix, "__builtin_ia32_
 BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_roundv4sf2, "__builtin_ia32_roundps_az", IX86_BUILTIN_ROUNDPS_AZ, UNKNOWN, (int) V4SF_FTYPE_V4SF)
 BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_roundv4sf2_sfix, "__builtin_ia32_roundps_az_sfix", IX86_BUILTIN_ROUNDPS_AZ_SFIX, UNKNOWN, (int) V4SI_FTYPE_V4SF)
 
-BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestv2di, "__builtin_ia32_ptestz128", IX86_BUILTIN_PTESTZ, EQ, (int) INT_FTYPE_V2DI_V2DI_PTEST)
-BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestv2di, "__builtin_ia32_ptestc128", IX86_BUILTIN_PTESTC, LTU, (int) INT_FTYPE_V2DI_V2DI_PTEST)
+BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestzv2di, "__builtin_ia32_ptestz128", IX86_BUILTIN_PTESTZ, EQ, (int) INT_FTYPE_V2DI_V2DI_PTEST)
+BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestcv2di, "__builtin_ia32_ptestc128", IX86_BUILTIN_PTESTC, LTU, (int) INT_FTYPE_V2DI_V2DI_PTEST)
 BDESC (OPTION_MASK_ISA_SSE4_1, 0, CODE_FOR_sse4_1_ptestv2di, "__builtin_ia32_ptestnzc128", IX86_BUILTIN_PTESTNZC, GTU, (int) INT_FTYPE_V2DI_V2DI_PTEST)
 
 /* SSE4.2 */
@@ -1164,8 +1164,8 @@ BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_vtestpd256, "__builtin_ia32_vtestnzc
 BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_vtestps256, "__builtin_ia32_vtestzps256", IX86_BUILTIN_VTESTZPS256, EQ, (int) INT_FTYPE_V8SF_V8SF_PTEST)
 BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_vtestps256, "__builtin_ia32_vtestcps256", IX86_BUILTIN_VTESTCPS256, LTU, (int) INT_FTYPE_V8SF_V8SF_PTEST)
 BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_vtestps256, "__builtin_ia32_vtestnzcps256", IX86_BUILTIN_VTESTNZCPS256, GTU, (int) INT_FTYPE_V8SF_V8SF_PTEST)
-BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestv4di, "__builtin_ia32_ptestz256", IX86_BUILTIN_PTESTZ256, EQ, (int) INT_FTYPE_V4DI_V4DI_PTEST)
-BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestv4di, "__builtin_ia32_ptestc256", IX86_BUILTIN_PTESTC256, LTU, (int) INT_FTYPE_V4DI_V4DI_PTEST)
+BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestzv4di, "__builtin_ia32_ptestz256", IX86_BUILTIN_PTESTZ256, EQ, (int) INT_FTYPE_V4DI_V4DI_PTEST)
+BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestcv4di, "__builtin_ia32_ptestc256", IX86_BUILTIN_PTESTC256, LTU, (int) INT_FTYPE_V4DI_V4DI_PTEST)
 BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_ptestv4di, "__builtin_ia32_ptestnzc256", IX86_BUILTIN_PTESTNZC256, GTU, (int) INT_FTYPE_V4DI_V4DI_PTEST)
 
 BDESC (OPTION_MASK_ISA_AVX, 0, CODE_FOR_avx_movmskpd256, "__builtin_ia32_movmskpd256", IX86_BUILTIN_MOVMSKPD256, UNKNOWN, (int) INT_FTYPE_V4DF )
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 0d817fc..7719449 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -2370,8 +2370,8 @@ ix86_expand_branch (enum rtx_code code, rtx op0, rtx op1, rtx label)
       tmp = gen_reg_rtx (mode);
       emit_insn (gen_rtx_SET (tmp, gen_rtx_XOR (mode, op0, op1)));
       tmp = gen_lowpart (p_mode, tmp);
-      emit_insn (gen_rtx_SET (gen_rtx_REG (CCmode, FLAGS_REG),
-                              gen_rtx_UNSPEC (CCmode,
+      emit_insn (gen_rtx_SET (gen_rtx_REG (CCZmode, FLAGS_REG),
+                              gen_rtx_UNSPEC (CCZmode,
                                               gen_rtvec (2, tmp, tmp),
                                               UNSPEC_PTEST)));
       tmp = gen_rtx_fmt_ee (code, VOIDmode, flag, const0_rtx);
diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index a0a7348..4a3b07a 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -974,12 +974,45 @@ general_scalar_chain::convert_op (rtx *op, rtx_insn *insn)
     }
 }
 
-/* Convert COMPARE to vector mode.  */
+/* Convert CCZmode COMPARE to vector mode.  */
 
 rtx
 scalar_chain::convert_compare (rtx op1, rtx op2, rtx_insn *insn)
 {
   rtx src, tmp;
+
+  /* Handle any REG_EQUAL notes.  */
+  tmp = find_reg_equal_equiv_note (insn);
+  if (tmp)
+    {
+      if (GET_CODE (XEXP (tmp, 0)) == COMPARE
+          && GET_MODE (XEXP (tmp, 0)) == CCZmode
+          && REG_P (XEXP (XEXP (tmp, 0), 0)))
+        {
+          rtx *op = &XEXP (XEXP (tmp, 0), 1);
+          if (CONST_SCALAR_INT_P (*op))
+            {
+              if (constm1_operand (*op, GET_MODE (*op)))
+                *op = CONSTM1_RTX (vmode);
+              else
+                {
+                  unsigned n = GET_MODE_NUNITS (vmode);
+                  rtx *v = XALLOCAVEC (rtx, n);
+                  v[0] = *op;
+                  for (unsigned i = 1; i < n; ++i)
+                    v[i] = const0_rtx;
+                  *op = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (n, v));
+                }
+              tmp = NULL_RTX;
+            }
+          else if (REG_P (*op))
+            tmp = NULL_RTX;
+        }
+
+      if (tmp)
+        remove_note (insn, tmp);
+    }
+
   /* Comparison against anything other than zero, requires an XOR.  */
   if (op2 != const0_rtx)
     {
@@ -1023,7 +1056,7 @@ scalar_chain::convert_compare (rtx op1, rtx op2, rtx_insn *insn)
               emit_insn_before (gen_rtx_SET (tmp, op11), insn);
               op11 = tmp;
             }
-          return gen_rtx_UNSPEC (CCmode, gen_rtvec (2, op11, op12),
+          return gen_rtx_UNSPEC (CCZmode, gen_rtvec (2, op11, op12),
                                  UNSPEC_PTEST);
         }
       else
@@ -1052,7 +1085,7 @@ scalar_chain::convert_compare (rtx op1, rtx op2, rtx_insn *insn)
       src = tmp;
     }
 
-  return gen_rtx_UNSPEC (CCmode, gen_rtvec (2, src, src), UNSPEC_PTEST);
+  return gen_rtx_UNSPEC (CCZmode, gen_rtvec (2, src, src), UNSPEC_PTEST);
 }
 
 /* Helper function for converting INSN to vector mode.  */
@@ -1219,7 +1252,7 @@ general_scalar_chain::convert_insn (rtx_insn *insn)
       break;
 
     case COMPARE:
-      dst = gen_rtx_REG (CCmode, FLAGS_REG);
+      dst = gen_rtx_REG (CCZmode, FLAGS_REG);
       src = convert_compare (XEXP (src, 0), XEXP (src, 1), insn);
       break;
 
@@ -1726,7 +1759,7 @@ timode_scalar_chain::convert_insn (rtx_insn *insn)
       break;
 
     case COMPARE:
-      dst = gen_rtx_REG (CCmode, FLAGS_REG);
+      dst = gen_rtx_REG (CCZmode, FLAGS_REG);
       src = convert_compare (XEXP (src, 0), XEXP (src, 1), insn);
       break;
 
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 71ae95f..b00756b 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -140,6 +140,7 @@ extern void ix86_expand_copysign (rtx []);
 extern void ix86_expand_xorsign (rtx []);
 extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[2]);
 extern bool ix86_match_ccmode (rtx, machine_mode);
+extern bool ix86_match_ptest_ccmode (rtx);
 extern void ix86_expand_branch (enum rtx_code, rtx, rtx, rtx);
 extern void ix86_expand_setcc (rtx, enum rtx_code, rtx, rtx);
 extern bool ix86_expand_int_movcc (rtx[]);
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index fbd33a6..30fc552 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -15985,6 +15985,29 @@ ix86_cc_mode (enum rtx_code code, rtx op0, rtx op1)
     }
 }
 
+/* Return TRUE or FALSE depending on whether the ptest instruction
+   INSN has source and destination with suitable matching CC modes.  */
+
+bool
+ix86_match_ptest_ccmode (rtx insn)
+{
+  rtx set, src;
+  machine_mode set_mode;
+
+  set = PATTERN (insn);
+  gcc_assert (GET_CODE (set) == SET);
+  src = SET_SRC (set);
+  gcc_assert (GET_CODE (src) == UNSPEC
+              && XINT (src, 1) == UNSPEC_PTEST);
+
+  set_mode = GET_MODE (src);
+  if (set_mode != CCZmode
+      && set_mode != CCCmode
+      && set_mode != CCmode)
+    return false;
+  return GET_MODE (SET_DEST (set)) == set_mode;
+}
+
 /* Return the fixed registers used for condition codes.  */
 
 static bool
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 513960e..e8d50a1 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -20441,10 +20441,10 @@
                    UNSPEC_MOVMSK)
                  (match_operand 2 "const_int_operand")))]
   "TARGET_SSE4_1 && (INTVAL (operands[2]) == (int) ())"
-  [(set (reg:CC FLAGS_REG)
-        (unspec:CC [(match_dup 0)
-                    (match_dup 0)]
-                   UNSPEC_PTEST))])
+  [(set (reg:CCZ FLAGS_REG)
+        (unspec:CCZ [(match_dup 0)
+                     (match_dup 0)]
+                    UNSPEC_PTEST))])
 
 (define_expand "sse2_maskmovdqu"
   [(set (match_operand:V16QI 0 "memory_operand")
@@ -23096,13 +23096,13 @@
    (set_attr "mode" "")])
 
 ;; ptest is very similar to comiss and ucomiss when setting FLAGS_REG.
-;; But it is not a really compare instruction.
-(define_insn "_ptest"
-  [(set (reg:CC FLAGS_REG)
-        (unspec:CC [(match_operand:V_AVX 0 "register_operand" "Yr, *x, x")
-                    (match_operand:V_AVX 1 "vector_operand" "YrBm, *xBm, xm")]
-                   UNSPEC_PTEST))]
-  "TARGET_SSE4_1"
+;; But it is not really a compare instruction.
+(define_insn "*_ptest"
+  [(set (reg FLAGS_REG)
+        (unspec [(match_operand:V_AVX 0 "register_operand" "Yr, *x, x")
+                 (match_operand:V_AVX 1 "vector_operand" "YrBm, *xBm, xm")]
+                UNSPEC_PTEST))]
+  "TARGET_SSE4_1 && ix86_match_ptest_ccmode (insn)"
   "%vptest\t{%1, %0|%0, %1}"
   [(set_attr "isa" "noavx,noavx,avx")
    (set_attr "type" "ssecomi")
@@ -23115,6 +23115,30 @@
        (const_string "*")))
    (set_attr "mode" "")])
 
+;; Expand a ptest to set the Z flag.
+(define_expand "_ptestz"
+  [(set (reg:CCZ FLAGS_REG)
+        (unspec:CCZ [(match_operand:V_AVX 0 "register_operand")
+                     (match_operand:V_AVX 1 "vector_operand")]
+                    UNSPEC_PTEST))]
+  "TARGET_SSE4_1")
+
+;; Expand a ptest to set the C flag
+(define_expand "_ptestc"
+  [(set (reg:CCC FLAGS_REG)
+        (unspec:CCC [(match_operand:V_AVX 0 "register_operand")
+                     (match_operand:V_AVX 1 "vector_operand")]
+                    UNSPEC_PTEST))]
+  "TARGET_SSE4_1")
+
+;; Expand a ptest to set both the Z and C flags
+(define_expand "_ptest"
+  [(set (reg:CC FLAGS_REG)
+        (unspec:CC [(match_operand:V_AVX 0 "register_operand")
+                    (match_operand:V_AVX 1 "vector_operand")]
+                   UNSPEC_PTEST))]
+  "TARGET_SSE4_1")
+
 (define_insn "ptesttf2"
   [(set (reg:CC FLAGS_REG)
         (unspec:CC [(match_operand:TF 0 "register_operand" "Yr, *x, x")
@@ -23129,17 +23153,17 @@
    (set_attr "mode" "TI")])
 
 (define_insn_and_split "*ptest_and"
-  [(set (reg:CC FLAGS_REG)
-        (unspec:CC [(and:V_AVX (match_operand:V_AVX 0 "register_operand")
-                               (match_operand:V_AVX 1 "vector_operand"))
-                    (and:V_AVX (match_dup 0) (match_dup 1))]
+  [(set (reg:CCZ FLAGS_REG)
+        (unspec:CCZ [(and:V_AVX (match_operand:V_AVX 0 "register_operand")
+                                (match_operand:V_AVX 1 "vector_operand"))
+                     (and:V_AVX (match_dup 0) (match_dup 1))]
           UNSPEC_PTEST))]
   "TARGET_SSE4_1
    && ix86_pre_reload_split ()"
   "#"
   "&& 1"
-  [(set (reg:CC FLAGS_REG)
-        (unspec:CC [(match_dup 0) (match_dup 1)] UNSPEC_PTEST))])
+  [(set (reg:CCZ FLAGS_REG)
+        (unspec:CCZ [(match_dup 0) (match_dup 1)] UNSPEC_PTEST))])
 
 (define_expand "nearbyint2"
   [(set (match_operand:VFH 0 "register_operand")
diff --git a/gcc/testsuite/gcc.target/i386/pr109973-1.c b/gcc/testsuite/gcc.target/i386/pr109973-1.c
new file mode 100644
index 0000000..a1b6136b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr109973-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx2" } */
+
+typedef long long __m256i __attribute__ ((__vector_size__ (32)));
+
+int
+foo (__m256i x, __m256i y)
+{
+  __m256i a = x & y;
+  return __builtin_ia32_ptestc256 (a, a);
+}
+
+/* { dg-final { scan-assembler "vpand" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr109973-2.c b/gcc/testsuite/gcc.target/i386/pr109973-2.c
new file mode 100644
index 0000000..167f6ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr109973-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.1" } */
+
+typedef long long __m128i __attribute__ ((__vector_size__ (16)));
+
+int
+foo (__m128i x, __m128i y)
+{
+  __m128i a = x & y;
+  return __builtin_ia32_ptestc128 (a, a);
+}
+
+/* { dg-final { scan-assembler "pand" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr110083.c b/gcc/testsuite/gcc.target/i386/pr110083.c
new file mode 100644
index 0000000..4b38ca8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110083.c
@@ -0,0 +1,26 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -msse4 -mstv -mno-stackrealign" } */
+typedef int TItype __attribute__ ((mode (TI)));
+typedef unsigned int UTItype __attribute__ ((mode (TI)));
+
+void foo (void)
+{
+  static volatile TItype ivin, ivout;
+  static volatile float fv1, fv2;
+  ivin = ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1));
+  fv1 = ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1));
+  fv2 = ivin;
+  ivout = fv2;
+  if (ivin != ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1))
+      || ((((128) > sizeof (TItype) * 8 - 1)) && ivout != ivin)
+      || ((((128) > sizeof (TItype) * 8 - 1))
+          && ivout !=
+          ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1)))
+      || fv1 !=
+      (float) ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1))
+      || fv2 !=
+      (float) ((TItype) (UTItype) ~ (((UTItype) ~ (UTItype) 0) >> 1))
+      || fv1 != fv2)
+    __builtin_abort ();
+}
+

------=_NextPart_000_03BE_01D99BF6.EA53A270--